Latest Content
Commentary
Generative AI is increasingly able to perform entry-level work in white-collar jobs, and this will impact workforces far into the future.
Across the white-collar economy, entry-level jobs are suddenly vulnerable to automation because they involve low-stakes assignments of the sort that generative AI is best at. AI could therefore sever the career ladder of industries like finance and law, forcing many would-be bankers and lawyers to look elsewhere for work.
Having used some of the more recent AI meeting tools as a meeting scribe, I can attest that they are increasingly able to perform routine tasks with a good degree of accuracy. Compared to the early days when we’d have a good laugh at the software’s attempts to summarise, it’s now a useful tool for taking notes and extracting action items.
Consider the legal field. Law is among the industries most exposed to generative AI’s capabilities because of its orientation toward language. Traditionally, the first few years of a newly accredited lawyer’s career are spent working under the tutelage of more senior lawyers, engaged in routine tasks: document review, basic research, drafting client communications, taking notes, and preparing briefs and other legal documents. Advances in AI-powered legal software have the potential to create vast efficiencies in these tasks, enabling their completion in a fraction of the time—and a fraction of the billable hours—that it has historically taken junior lawyers and paralegals to complete them.
If we don’t need to train up junior lawyers, how do we grow the legal workforce? Or do we need to rethink the role of a lawyer?
Stripe APIs are adding payments and metering capabilities to LLM agentic workflows:
If you want to have an agent perform purchases:
Agentic workflows need not have exclusively virtual outcomes. Imagine a travel agent that can book flights for your company. Using LLMs and function calling we can assemble a set of agents that can search for flights online, return options, and ultimately identify a booking URL. With Stripe, you can embed financial services and enable the automation of the purchase flow as well. Using Stripe Issuing, you can generate single-use virtual cards that agents can use for business purchases. This enables your agents to spend funds. The Issuing APIs allow you to approve or decline authorizations programmatically, ensuring your purchase intent matches the authorization. Spending controls allow you to set budgets and limit spending for your agents.
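A minimal sketch of what that pattern could look like. The `spending_controls` payload follows Stripe’s Issuing card-creation API; the intent record, the `should_approve` helper, and all IDs are hypothetical stand-ins for how a webhook handler might match an authorization against what the agent was actually asked to buy.

```python
# Hypothetical sketch: parameters for a virtual card with a spending limit,
# plus the approve/decline decision an Issuing webhook handler could make.
# Field names follow Stripe's Issuing API; the intent record is an assumption.

def agent_card_params(cardholder_id: str, daily_limit_cents: int) -> dict:
    # In a real integration this dict would be passed to
    # stripe.issuing.Card.create(**params).
    return {
        "cardholder": cardholder_id,
        "currency": "usd",
        "type": "virtual",
        "spending_controls": {
            "spending_limits": [
                {"amount": daily_limit_cents, "interval": "daily"},
            ],
        },
    }

def should_approve(auth_amount_cents: int, intent: dict) -> bool:
    """Approve an authorization only when it matches a still-pending
    purchase intent and stays within the amount the agent was granted."""
    return (
        intent.get("status") == "pending"
        and auth_amount_cents <= intent.get("max_amount_cents", 0)
    )

# Example: an agent asked to book a flight costing at most $400.
intent = {"status": "pending", "max_amount_cents": 40_000}
print(should_approve(35_000, intent))  # → True
print(should_approve(95_000, intent))  # → False
```

The useful property is that the purchase intent lives outside the agent: even a confused agent can only spend what the card’s spending controls and the authorization check allow.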
Additionally, it can be used for metering and billing:
Conducting agentic workflows has material cost – typically measured by token use or time. With usage-based billing, you can charge based on a customer’s usage of your product. The toolkit provides middleware to easily track prompt and completion token counts and send billing events for that customer.
The Issuing API sounds particularly useful in stopping an LLM agent from buying travel tickets to Yellowstone National Park, or worse. From the Claude announcement on computer use:
In one, Claude accidentally clicked to stop a long-running screen recording, causing all footage to be lost. In another, Claude suddenly took a break from our coding demo and began to peruse photos of Yellowstone National Park.
Ben Thompson on Shopify’s questionable self-awareness when they ventured into logistics:
Here’s the thing: logistics were and are a big challenge for merchants. It is a problem that needs to be solved. In this case, however, Shopify was too laser-focused on merchants; logistics was a situation where they needed to not just understand the problems being faced by their merchants, but also understand themselves and what problems they were actually capable of solving.
Specifically, Shopify is a software company, and that puts one hard constraint on your business: you need to go horizontal, not vertical. Software requires huge amounts of R&D but it also scales infinitely; software businesses profit by maximizing leverage on their investments, which means serving more customers with the same capabilities. Logistics, though, means the physical world, which means variable costs and a limited addressable market; this limits the leverage you get from software, without decreasing the need for R&D.
Self-awareness of one’s core strengths is as crucial as customer focus, and Shopify’s failure to stay within the bounds of what it could effectively manage as a software company resulted in significant setbacks.
Generative AI excels in codebases that are relatively clean, well-structured, and adhere to best practices. In these environments, LLMs can easily follow patterns, understand context, and generate useful suggestions or boilerplate code. Companies with younger, high-quality codebases will likely see the biggest productivity gains, as AI tools can navigate their code with precision and speed.
The team at Gauge.sh see the same thing:
There is an emerging belief that AI will make tech debt less relevant. Since it’s getting easier to write code, and easier to clean up code, wouldn’t it make sense that the typical company can handle a little more debt?
The opposite is true - AI has significantly increased the real cost of carrying tech debt. The key impact to notice is that generative AI dramatically widens the gap in velocity between ‘low-debt’ coding and ‘high-debt’ coding.
Companies with relatively young, high-quality codebases benefit the most from generative AI tools, while companies with gnarly, legacy codebases will struggle to adopt them. In other words, the penalty for having a ‘high-debt’ codebase is now larger than ever.
High-debt codebases are often a patchwork of custom solutions, undocumented hacks, and interdependent modules. This complexity is kryptonite for AI models, which struggle with code that deviates from standard patterns. LLMs are trained on large datasets filled with best practices and typical coding paradigms, not on the idiosyncrasies of a company’s specific, legacy system. As a result, AI-generated suggestions are more likely to make faulty assumptions, or even worsen existing issues in high-debt environments. In practical terms, this means the productivity boost that AI offers in low-debt environments simply isn’t likely to translate to high-debt codebases.
This widening gap has turned tech debt into an arguably more urgent and strategic problem. Back in 2003, Nicholas Carr's provocative Harvard Business Review article “IT Doesn’t Matter” argued that as IT became ubiquitous, its strategic value diminished. Carr's point was that once a technology is available to everyone, it ceases to be a source of competitive advantage.
While Carr was correct about IT’s ubiquity, he couldn’t have predicted how this ubiquity would lead to layers of accumulated complexity in codebases. Today, many companies, especially in industries like finance, are shackled by these legacy systems. For decades, banks and investment firms poured billions into proprietary trading systems, risk management platforms, and customer-facing applications. These were the “crown jewels,” meant to give them a competitive edge.
But ironically, the very systems that once differentiated them have now become corporate concrete shoes. They are mired in layers of custom code that can’t easily be modernized or replaced. Rather than enabling innovation, these systems prevent it. Companies find themselves allocating massive resources just to maintain the status quo, with little bandwidth left for new projects or innovation.
The problem isn’t just that tech debt exists; it’s that the cost of carrying it has escalated. Generative AI tools are a force multiplier, but only for those who are already well-positioned to take advantage of them. For companies with modern, well-maintained codebases, AI is a powerful accelerator. For those with tangled, legacy systems, AI is not a shortcut but a spotlight, highlighting the inefficiencies and fragility of their code.
In this new reality, tech debt is no longer just a drag on velocity—it’s a strategic risk. Companies that fail to address their tech debt may find themselves falling further behind, not just in their ability to deliver software but in their capacity to leverage the next wave of AI-driven innovation.
Existing benchmarks are becoming saturated with new AI models, highlighting the need for new benchmarks.
Companies conduct “evaluations” of AI models by teams of staff and outside researchers. These are standardised tests, known as benchmarks, that assess models' abilities and the performance of different groups' systems or older versions. However, recent advances in AI technology have meant many of the newest models have been able to get close to or above 90 per cent accuracy on existing tests, highlighting the need for new benchmarks. “The pace of the industry is extremely fast. We are now starting to saturate our ability to measure some of these systems [and as an industry] it is becoming more and more difficult to evaluate [them],” said Ahmad Al-Dahle, generative AI lead at Meta.
See also LiveCodeBench.