For a long time, the limiting factor in knowledge work was not skill alone. It was elapsed time. People had to gather the right documents, loop in the right people, draft from scratch, wait for replies, revise, and repeat. Generative AI hits that structure first. In early workplace studies, the gains are no longer theoretical: customer-support agents resolved more issues per hour with AI assistance, professionals doing writing tasks finished faster with better results, consultants working on tasks inside AI’s competence got quicker and better, and software developers using coding assistants often completed more work. Stanford’s 2025 AI Index, drawing across multiple studies, describes early measured gains in the rough range of 10% to 45%, depending on the task and setting.
That does not mean every job has turned into a prompt and a button. It does mean the calendar has changed. A model can produce a first draft before a team has finished agreeing on the outline. It can keep iterating while people sleep. And because inference costs have dropped sharply while model performance has climbed, the old gap between idea and usable output has narrowed fast. Stanford reports that the cost of querying a model at GPT-3.5 level fell by more than 280-fold between late 2022 and late 2024, while benchmark performance on hard coding and reasoning tests improved at a pace that would have looked absurd just a year earlier.
A deadline that used to belong to teams
When people say AI now does in hours what once took teams weeks, they are usually talking about the wrong layer of work. The final decision still belongs to humans in most serious settings. The part that has collapsed is everything that sits before the decision: drafting, summarizing, restructuring, comparing, rewriting, synthesizing, and generating alternatives. Those steps used to eat entire workweeks because they involved blank-page effort and coordination drag. They were not glamorous, but they were expensive. McKinsey’s work on generative AI points straight at this mechanism: the biggest value sits in customer operations, marketing and sales, software engineering, and R&D, and the reason is simple. Those functions contain a lot of language-heavy, pattern-rich work that can be accelerated without waiting for physical processes or capital equipment to catch up.
That shift is bigger than a “faster tool” story. Knowledge work has always been slowed by handoffs. One person researches, another drafts, another edits, another checks tone, another pulls data, another turns the whole thing into slides, and by then the window for acting has partly closed. AI compresses those handoffs because one person can now do the first pass of several roles alone. A marketer can ask for three campaign angles, a competitive comparison, five email variants, and a first landing-page draft in one session. A product manager can turn a messy call transcript into action items, requirements, objections, and risk flags without waiting for a coordinator to clean it up later. A developer can generate boilerplate, tests, migration sketches, and documentation while still holding the architecture in mind. That does not erase teamwork. It changes where teamwork begins.
The economic background matters here. McKinsey estimates that generative AI could add $2.6 trillion to $4.4 trillion annually across the use cases it examined, with even higher total productivity effects once broader knowledge-worker gains are counted. That headline gets repeated so often it risks sounding abstract, but the underlying logic is concrete. Firms are not only buying output. They are buying shorter cycle times. If a team can move from rough idea to testable version in one day instead of two weeks, it can run more experiments, make fewer late guesses, and waste less management time coordinating the obvious. The real gain is not just that AI does work. It is that AI removes waiting from parts of work that used to be built around waiting.
There is another reason the change feels abrupt. Earlier software waves improved record-keeping, search, communication, or workflow routing. Generative AI goes after the point where a human used to stare at a blank screen and start assembling meaning from fragments. That is why its effect lands so hard in writing, analysis, support, coding, reporting, and planning. It attacks the start-up cost of thinking through text. Once that cost falls, the pace of the whole system changes.
Nonstop output changes more than speed
People often say AI works nonstop as if that were a minor detail. It is not minor. It is the core operational difference. Human work is bounded by sleep, shifts, meetings, context loss, and fatigue. Machine work is bounded by compute, guardrails, and error rates. Those are very different ceilings. A person can produce excellent work for a few focused hours. A deployed AI system can keep generating drafts, classifications, summaries, and code suggestions all night, then pick up again in the morning without needing recovery time.
That matters because the modern office was already running out of attention before generative AI arrived. Microsoft’s 2025 work research describes a workforce drowning in communication overhead: people interrupted every two minutes during core hours, 117 emails a day on average, 153 Teams messages on a typical weekday, and a steady spill of work into evenings and weekends. In the same 2025 Work Trend Index survey, 80% of workers and leaders said they lacked enough time or energy to do their work, while leaders increasingly looked to digital labor to expand capacity. That is the backdrop against which nonstop AI makes sense to management. AI is not entering a calm, well-designed system. It is entering an overloaded one.
What firms want from nonstop AI is not just more output. They want asynchronous capacity. They want the ability to keep work moving when nobody is in the room, when one time zone hands off to another, when support demand spikes, when a proposal has to be ready before the next meeting, or when a team needs ten options before choosing one. That is why the language around “digital labor” has gotten so direct. Microsoft’s 2025 research reports that 82% of leaders expect to use digital labor to expand workforce capacity in the next 12 to 18 months, and 46% say their companies are already using agents to fully automate workflows or processes. The phrase may sound inflated, but the management logic is plain: if intelligence becomes available on demand, firms stop treating every extra unit of thinking work as scarce headcount.
There is also a human side to this. When AI is used well, it can reduce the amount of spillover work humans drag into late evenings. In a field experiment across 66 firms and 7,137 knowledge workers, NBER researchers found that treated workers who actually used Microsoft 365 Copilot spent two fewer hours on email each week in the second half of the study and reduced time spent working outside regular hours. A separate Microsoft study on more than 6,000 workers found that regular users spent about half an hour less reading email each week and completed documents 12% faster. That is a modest result compared with the louder headlines, but it points to something practical: nonstop AI is most useful when it removes repetitive low-value drag from human schedules, not when it floods people with even more machine-made work to review.
So the important question is not whether AI can run around the clock. It obviously can. The harder question is whether organizations use that round-the-clock capacity to clear bottlenecks or to spray more unfinished output into already crowded workflows. Nonstop generation without judgment becomes noise. Nonstop generation tied to clear standards, deadlines, and review loops becomes a different production model altogether.
The work already collapsing into hours
The most useful way to look at AI’s speed is not through slogans but through task settings where someone actually measured it. The evidence is early and uneven, but strong enough to matter. Customer support, writing, consulting, and software development are already showing what timeline compression looks like when AI is folded into real work rather than demo scripts.
Early workplace results at a glance
| Setting | Measured effect | What changed |
|---|---|---|
| Customer support | 15% higher productivity on average | Faster handling, more chats per hour, small lift in resolution rates |
| Professional writing | Faster completion and better quality | Less time spent rough-drafting, more time editing and shaping |
| Consulting tasks inside AI’s competence | 25.1% faster, 12.2% more tasks completed, 40%+ higher quality | Better performance on realistic knowledge tasks that fit current model strengths |
| Software development | 26.08% more completed tasks in three field experiments; 55.8% faster in a controlled Copilot task | Faster code completion, more pull requests, more compilation activity |
The table pulls from the cited workplace and experimental studies rather than vendor marketing. The pattern is hard to miss: AI is strongest where the work is language-heavy, repetitive in structure, and still benefits from a human checking the result. It is much less predictable where the task demands deep local context, hidden constraints, or hard-to-verify reasoning.
The customer-support result is especially revealing because it shows more than speed. Brynjolfsson, Li, and Raymond found that AI assistance raised productivity by about 15% on average, but the bigger story sat inside the distribution. Less experienced and lower-skilled workers saw the strongest gains, and the tool appeared to help them move down the experience curve faster. In their data, agents with around two months of tenure and AI access performed as well as or better than agents with more than six months of tenure without AI. That is not just automation. It is partial skill transfer at scale.
The consulting result from the Harvard-BCG study shows both the upside and the warning label. On 18 realistic consulting tasks that fell within AI’s competence, consultants using GPT-4 completed more tasks, worked 25.1% faster, and produced much better output. Yet on a task deliberately placed outside the model’s frontier, AI users were 19 percentage points less likely to get the right answer. That is why the paper’s “jagged frontier” idea matters. AI does not improve work in one smooth line. It creates areas of sharp acceleration and areas of sharp failure inside the same workflow.
Software development gives the clearest picture of why people feel the pace of work changing so quickly. Controlled GitHub Copilot research found task completion 55.8% faster in a specific experiment. Later field experiments across Microsoft, Accenture, and another Fortune 100 company found a 26.08% increase in completed tasks among tool users. A BIS field experiment on coding reported that code output rose by more than 50%, though the statistically clear gains were concentrated among junior staff. The consistent message is not that every engineer gets the same boost. It is that a large share of routine implementation work now moves much faster than it did a short time ago.
Cheaper models made the acceleration unavoidable
Capability alone would not have changed work this quickly. The other half of the story is cost. A technology does not remake workflows just because it is impressive. It does so when it becomes cheap enough to use repeatedly, casually, and at scale. That threshold has now been crossed in many knowledge-work settings.
Stanford’s 2025 AI Index put a hard number on the shift. The cost of querying a model that performs at roughly GPT-3.5 level on MMLU fell from $20 per million tokens in November 2022 to $0.07 by October 2024. The same report says hardware costs have been dropping by about 30% a year, while energy efficiency has improved around 40% a year. Open-weight models also closed much of the performance gap with closed systems over the same period. That combination matters because it changes who can use AI, how often, and for what volume of work. A costly specialist tool becomes a background utility.
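The headline reduction is easy to sanity-check with back-of-the-envelope arithmetic. A minimal sketch using only the two prices Stanford reports; the per-task figure below assumes a hypothetical 10,000-token workload:

```python
# Token prices reported in Stanford's 2025 AI Index for GPT-3.5-level performance.
price_nov_2022 = 20.00   # USD per million tokens, November 2022
price_oct_2024 = 0.07    # USD per million tokens, October 2024

reduction_factor = price_nov_2022 / price_oct_2024
print(f"Cost reduction: ~{reduction_factor:.0f}x")  # ~286x, i.e. "more than 280-fold"

# At the 2024 price, a hypothetical 10,000-token draft-and-revise loop costs:
tokens_per_task = 10_000
cost_per_task = price_oct_2024 * tokens_per_task / 1_000_000
print(f"Cost per task: ${cost_per_task:.5f}")  # $0.00070 -- effectively free per use
```

At that price point, the marginal cost of one more draft, one more comparison, or one more rewrite rounds to zero, which is exactly the condition under which a tool shifts from occasional use to background utility.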
At the same time, the models themselves got much better at exactly the kind of work firms care about. Stanford reports that on SWE-bench, a benchmark built from real GitHub issues, top-model performance went from 4.4% in 2023 to 71.7% by 2024. That does not mean real software jobs are 71.7% automated. It does mean the systems now clear competence thresholds that used to block business adoption. Once a tool is both cheap and often good enough, organizations stop treating it as an experiment and start building it into ordinary workflows.
The adoption numbers reflect that shift, even if they should be read carefully. Eurostat reported that 20.0% of EU enterprises with 10 or more employees used AI technologies in 2025, up from 13.5% a year earlier, with large firms far ahead of smaller ones. Microsoft and LinkedIn’s 2024 Work Trend research found that 75% of global knowledge workers said they were already using generative AI at work, though self-reported use is broader and looser than firm-level deployment. Put together, those figures say two things at once: AI use is spreading fast, and deep organizational integration is still uneven. The race is no longer about awareness. It is about embedment.
That helps explain why the last two years have felt so compressed. Earlier enterprise AI often required custom models, narrow use cases, expensive data work, and slow deployment. Generative AI arrived with a different shape. A general model could already summarize, rewrite, classify, generate, compare, translate, and draft code. Then those abilities got cheaper almost immediately. Once that happened, the old economic objection—“great demo, but too expensive for daily use”—started to weaken. Cheap intelligence changes behavior faster than rare intelligence.
Team structure is starting to bend around AI
One reason AI feels so disruptive is that it changes not only how fast individuals work but how many people a piece of work seems to require. A lot of white-collar output was built around specialization because the cost of collecting and reshaping information was high. Someone gathered research, someone drafted, someone cleaned slides, someone reviewed tone, someone coordinated versions, someone took notes, someone turned decisions into follow-up. When AI collapses several of those steps into one workflow, the old team shape starts to look heavier than it did before.
Stanford’s 2025 AI Index points to this directly. Summarizing research on workplace effects, it notes that Microsoft studies found lower perceived mental demand and strong gains in common tasks, while a Harvard study reported sharply reduced collaborative overhead, with projects requiring 79.3% fewer collaborators on average after AI adoption. That does not mean large teams disappear. It means the threshold for when you truly need a large team rises. More work can now reach “good enough to review” before the second or third person gets involved.
Software development shows the structural shift in a more granular way. In a natural experiment on GitHub Copilot involving 187,489 developers, Hoffmann and co-authors found that access to Copilot increased the relative share of coding work and reduced the relative share of project-management activity, with stronger effects among lower-ability developers. That is a subtle but important change. AI does not simply make developers faster. It changes the mix of what they spend their time on. Less energy goes to coordination and routine management; more goes to core production and exploration.
You can see the same logic in management language. Microsoft’s 2025 Work Trend Index describes firms moving toward hybrid human-agent teams and argues that the important new skill is not typing faster but delegating, steering, and checking agents. In the survey behind that report, leaders were notably ahead of employees in agent familiarity, trust, and expectations that managing agents would become part of the job. That may sound futuristic, but the operational change is immediate. More employees now act less like solo contributors producing every intermediate artifact and more like small-unit directors of machine-assisted workflows.
That is why the phrase “AI replaces teams” is both true and false. It is false if it suggests whole departments vanish overnight. It is true if it points to the way AI strips out some of the coordination layers that once justified larger groups. A person with a strong prompt, domain knowledge, and authority to decide can now do the early work of several adjacent roles. The result is not always fewer people on payroll right away. Often it is leaner project formation, faster first drafts, and less reason to create extra meetings just to get work into shape.
The jagged frontier is still very real
All of this speed comes with a trap. Because AI is so good at producing plausible output, it is easy to mistake fast response for reliable performance. That is the central discipline problem of the AI era. A system can give you an answer instantly and still be the wrong worker for the task.
The Harvard-BCG consulting study remains one of the cleanest demonstrations of that problem. On tasks within the model’s competence, AI users did much better. On a task outside the frontier, they got worse. The authors call this a jagged technological frontier because adjacent tasks can sit on opposite sides of current model capability. A model may write a sharp market memo and then fumble a reasoning step that a competent junior analyst would catch. It may generate working code for a standard pattern and then drift badly when hidden constraints pile up. The danger is not only error. It is misplaced confidence.
Real-world software evidence makes the point even harder. In a 2025 randomized trial by METR, experienced open-source developers using early-2025 AI tools on repositories they already knew well took 19% longer to finish their tasks. The developers themselves expected the opposite. Even after doing the work, they still thought AI had sped them up. That is a remarkable result because it shows how easily subjective feeling and actual throughput can diverge. AI often makes work feel easier or more pleasant before it makes it truly faster. Sometimes it does both. Sometimes it does only the first.
METR’s broader work on long tasks helps explain the mismatch. Their measurements suggest current frontier systems do very well on short tasks and deteriorate as task length and dependency chains grow. On their task-length analysis, current models had nearly 100% success on tasks taking humans less than four minutes, but under 10% success on tasks taking more than around four hours. Claude 3.7 Sonnet, in that framework, had a 50% success “time horizon” of about one hour, and the frontier had been doubling roughly every seven months. That is impressive progress and a clear limit. It tells you why AI can be brilliant in bursts while still struggling with long, messy, exception-heavy projects.
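To see why that doubling rate matters, a crude extrapolation helps. This is a sketch under stated assumptions, not METR's own forecast: it simply starts from the reported one-hour, 50%-success horizon and doubles it every seven months:

```python
# Naive projection of METR's 50%-success "time horizon" under steady doubling.
# Assumptions (illustrative only): start at 1 hour, double every 7 months.
start_horizon_hours = 1.0
doubling_period_months = 7

for months_ahead in range(0, 43, doubling_period_months):
    horizon = start_horizon_hours * 2 ** (months_ahead / doubling_period_months)
    print(f"+{months_ahead:2d} months: ~{horizon:.0f} hour(s)")
# After 42 months (six doublings) the projected horizon is 64 hours --
# roughly a week and a half of human task time, if the trend were to hold.
```

Whether the trend holds is an open question; the point of the arithmetic is that an exponential horizon turns today's "brilliant in bursts" pattern into a moving boundary, not a fixed one.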
There is another useful reality check in Anthropic’s labor-market work. Comparing theoretical task coverage with actual usage-based coverage, Anthropic found that AI use in professional settings remains well below what the models could in theory touch. In computer and math work, where one might expect the strongest penetration, Anthropic’s usage-based measure showed only 33% of tasks covered in observed use, even though theoretical exposure was far higher. In other words, capability does not automatically become deployment. Firms still face trust, verification, workflow, and governance problems, and those frictions slow real substitution.
So yes, AI can turn weeks into days in some settings. No, it does not follow that every long project is ready for end-to-end machine execution. The frontier is moving fast, but it is still jagged. The organizations that benefit most are not the ones that assume AI is universal. They are the ones that learn where it is dependable, where it needs scaffolding, and where it should be kept on a short leash.
Judgment becomes the scarce layer
Once AI compresses drafting and routine synthesis, the scarce part of work shifts. It shifts away from producing a first version and toward deciding what counts as correct, useful, safe, timely, and worth acting on. Judgment becomes the bottleneck. That is one reason so many studies find bigger gains for junior or lower-performing workers than for top experts. The tool is often best at raising the floor, not replacing the ceiling.
The customer-support study showed this clearly. Less experienced agents benefited far more than elite agents, and AI seemed to help them absorb the patterns of stronger performers faster. The software studies point the same way. The Microsoft-Accenture field experiments found stronger gains among recent hires and more junior staff. The BIS coding experiment also found the clearest statistically significant lift among entry-level employees. What AI often does first is not replace mastery. It shrinks the time it takes to reach competent output.
That changes labor markets in a messy, uneven way. The ILO’s updated occupational-exposure work says clerical occupations still face the highest exposure to generative AI, while exposure is also expanding in strongly digitized professional and technical roles. OECD analysis makes a related point from another angle: white-collar, tertiary-educated workers are likely to feel disruption because AI can automate non-routine cognitive tasks that were once treated as relatively protected. This is a break from the old assumption that automation mainly comes first for routine manual or low-skilled work. The new exposure map reaches into the office, not only the factory.
Yet the wage story is not one-directional doom. PwC’s 2025 AI Jobs Barometer argues that industries more exposed to AI have seen stronger growth in revenue per employee and faster wage growth, including in roles that people often describe as highly automatable. That does not prove every worker benefits. It does suggest that the first-order effect of AI in many settings is not simple devaluation of labor, but recomposition of labor. Some tasks get cheaper. Some skills become more important. Some workers become more productive and therefore more expensive. Others find that the part of their job that used to justify their role is now much easier to copy.
This is why the most durable skill in an AI-heavy workplace is likely to be neither raw typing speed nor generic familiarity with chatbots. It is disciplined judgment under acceleration. Can you frame the problem well? Can you tell the model what matters? Can you spot weak reasoning, missing evidence, false certainty, security risks, or legal trouble? Can you decide when the right move is to use AI, when to constrain it, and when to ignore it entirely? Microsoft’s own 2025 workplace research puts that shift in plain terms: workers need to know when to delegate to AI, how to iterate with it, and when to push back.
In that sense, AI does not remove the need for skilled people. It changes where skilled people earn their keep. The premium moves toward problem selection, system oversight, verification, exception handling, and final accountability. The faster AI gets, the more expensive bad judgment becomes.
A new deal between human work and machine time
The deepest change here is not that AI writes faster than people. It is that machine time is starting to sit inside ordinary business time. That sounds abstract, but its consequences are not. Planning cycles get shorter. Fewer ideas die in note form. More first drafts exist. More comparisons can be run before a decision. More routine analysis can happen between meetings instead of after them. The organizations that adapt will feel less like traditional offices with AI bolted on and more like human systems that continuously call on machine labor for the parts of work that used to create lag.
That is why the winning organizations will not be the ones that merely “adopt AI.” Plenty of companies now have access. The difference will come from workflow redesign. McKinsey, Microsoft, Stanford, and the better field studies all point in roughly the same direction: the biggest gains appear when AI is tied to real work, used repeatedly, and built into the rhythm of execution rather than parked as a novelty tool. If a company keeps the old approvals, the old meeting load, the old ownership confusion, and the old fear of acting, AI will mostly generate extra drafts. If it clears bottlenecks, sets standards, and decides where human review truly matters, AI starts to compress the whole system.
For workers, the adjustment is just as sharp. The safe assumption is no longer that your value lies in producing every intermediate artifact yourself. Your value increasingly sits in directing the process, setting the bar, and being the person who knows what good looks like. People who learn to treat AI as a rough but tireless collaborator will outrun those who refuse it completely and those who trust it blindly. The refusers lose speed. The blind trusters lose reliability. The group that matters, the disciplined collaborators, gains judgment under speed.
The old timeline for knowledge work is not fully gone. Big decisions still need human accountability, and many hard projects still resist automation. But the center of gravity has shifted. What used to require weeks of coordinated effort can often reach reviewable form in a day. That alone is enough to reorder budgets, hiring, team design, and competitive pressure. AI does not need to replace the whole team to change the whole market. It only needs to erase enough delay that everyone else is now moving on a shorter clock.
FAQ
Can AI really do in hours what once took teams weeks?
In some parts of knowledge work, yes, especially the early and middle layers such as research synthesis, drafting, variant generation, coding boilerplate, summarization, and routine support responses. The strongest evidence so far points to time compression, not universal full replacement. AI often gets work to a reviewable state much faster, while humans still handle approval, edge cases, and accountability.
Where are the clearest gains showing up?
The clearest early gains show up in customer support, writing, consulting tasks that fit current model strengths, and software development. Those domains share a pattern: they are rich in language, repetition, and structure, and they benefit from rapid first drafts or suggestions that a human can inspect.
Why do less experienced workers often gain the most?
Because AI often acts like a fast competence booster. It reduces the cost of getting to acceptable output and helps less experienced workers borrow patterns that stronger workers already know. Several studies found larger gains for newer, lower-skilled, or more junior workers than for experts.
Does this mean companies will need far fewer people?
No. It does mean companies can often do more with smaller project teams and less coordination overhead. Some tasks shrink or disappear, but new work also appears around supervision, system design, verification, exception handling, and AI governance. The labor-market effect is a reallocation story before it is a simple headcount story.
Why do some teams get slower with AI?
Because speed depends on setting, task shape, tool fit, and quality standards. In METR’s 2025 study, experienced open-source developers working on codebases they already knew were slower with AI assistance. Reviewing machine output, correcting drift, and handling subtle local constraints can erase the headline speed gains.
What is AI still bad at?
Long, messy, multi-step work with hidden dependencies is still difficult. METR’s task-length research shows strong performance on short tasks and much weaker performance on tasks that take humans hours. The Harvard-BCG study also showed that on tasks outside AI’s frontier, users can do worse, not better.
What skill matters most now?
Judgment. The scarce layer shifts toward defining the problem, giving context, checking evidence, spotting failure, and deciding when AI should or should not be trusted. The people who do best are usually not the ones who worship or reject AI, but the ones who can direct it with discipline.
Author:
Jan Bielik
CEO & Founder of Webiano Digital & Marketing Agency

This article is an original analysis supported by the sources cited below.
The 2025 AI Index Report
Stanford HAI’s annual report, used here for model costs, frontier performance, and the broader economic context around workplace AI.
Chapter 1 of the AI Index Report 2025
The chapter covering affordability, hardware cost trends, and energy efficiency in current AI systems.
Chapter 2 of the AI Index Report 2025
The technical-performance chapter used for benchmark gains, including the jump on SWE-bench.
Chapter 4 of the AI Index Report 2025
The economy chapter, used for workplace productivity findings and changes in collaboration patterns.
The economic potential of generative AI
McKinsey’s analysis of where generative AI creates economic value and which business functions are most affected.
Generative AI at Work
The Quarterly Journal of Economics paper on customer-support agents, widely cited for measured productivity gains and faster learning for novices.
Experimental Evidence on the Productivity Effects of Generative Artificial Intelligence
The MIT working paper on professional writing tasks, used here for evidence that AI can improve both speed and output quality.
The Impact of AI on Developer Productivity: Evidence from GitHub Copilot
The controlled Copilot experiment showing strong speed gains on a coding task.
The Effects of Generative AI on High-Skilled Work: Evidence from Three Field Experiments with Software Developers
Microsoft Research’s landing page for the multi-company field experiments on software developers.
The Effects of Generative AI on High-Skilled Work: Evidence from Three Field Experiments with Software Developers
The full paper used for the exact field-experiment estimates on completed software tasks.
Navigating the Jagged Technological Frontier
Harvard’s summary of the BCG consulting experiment, used for the idea of the jagged frontier and the task-level upside.
Navigating the Jagged Technological Frontier
The full working paper behind the Harvard-BCG study, used for exact measures on speed, quality, and failure outside the model frontier.
Generative AI in Real-World Workplaces
Microsoft Research’s synthesis of workplace studies on productivity gains from generative AI tools.
Early Impacts of M365 Copilot
The paper on measured time savings and document-speed gains among workers using Microsoft 365 Copilot.
Shifting Work Patterns with Generative AI
The NBER paper on cross-industry field evidence, used here for email-time reductions and lower after-hours work.
2025: The year the Frontier Firm is born
Microsoft’s large survey-based work report, used for leader expectations around agents and digital labor.
Breaking down the infinite workday
Microsoft’s telemetry-based special report, used for overload, interruptions, and after-hours work patterns.
20% of EU enterprises use AI technologies
Eurostat’s official update on business AI adoption in the European Union.
Generative AI and jobs: A refined global index of occupational exposure
ILO research on which occupations face the highest exposure to generative AI.
Generative AI and jobs: A 2025 update
ILO’s summary brief used here for the updated methodology and task-level occupational analysis.
Who will be the workers most affected by AI?
OECD analysis used for the point that AI disruption reaches into white-collar cognitive work.
AI Jobs Barometer
PwC’s 2025 analysis of AI exposure, wages, and revenue per employee across industries.
Generative AI and labour productivity: A field experiment on coding
The BIS field experiment on coding productivity, used for evidence of strong gains among junior staff.
Measuring AI Ability to Complete Long Tasks
METR’s research on task-length horizons, used for the limits of current autonomous AI on longer tasks.
Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity
METR’s study showing that experienced open-source developers in one setting were slower with AI assistance.
Labor market impacts of AI: A new measure and early evidence
Anthropic’s research note comparing theoretical task exposure with observed usage-based coverage in professional work.



