Generative AI ROI: Why Most Enterprise Projects Look Like Failures on Paper


In July 2025, MIT dropped a number that made CFOs uncomfortable and skeptics feel vindicated. After studying over 300 enterprise AI initiatives, MIT’s Project NANDA concluded that despite $30–40 billion in corporate investment, a staggering 95% of organizations saw zero measurable return from generative AI. The study even gave the divide a name: The GenAI Divide.

And MIT wasn’t alone. Other highly respected institutions reached similarly sobering conclusions around the same time.

 

These are not numbers from tech blogs or vendor surveys. They come from MIT, RAND, Gartner, and McKinsey — institutions that businesses trust to tell the truth. So the question isn’t whether the data is real. It is. The real question is: what does it actually mean?

“Just 5% of integrated AI pilots are extracting millions in value, while the vast majority remain stuck with no measurable P&L impact.”

— MIT Project NANDA, The GenAI Divide: State of AI in Business 2025

Before organizations start pulling the plug on their AI budgets, it is worth asking a harder question, one that these reports, for all their rigor, may not fully answer: Are businesses failing at AI, or are they failing at measuring AI?

Generative AI has only been available in enterprise-grade form since late 2022. That is less than three years. The internet existed for nearly a decade before businesses figured out how to profit from it. Is it fair to grade a three-year-old technology by the same standards used to measure a mature capital investment? That is exactly the debate this blog unpacks.

 

But wait, what are we actually measuring?

Before drawing conclusions from the data, it helps to ask one simple question: how did researchers decide what counts as “success”?

MIT’s study defined success very narrowly: deployment beyond the pilot phase, with measurable KPIs, and an ROI impact measured just six months after the pilot ends. That is a short window for any technology. And for something as transformational as generative AI, it may be the wrong window entirely.

From the MIT NANDA report (July 2025): Success was defined as “deployment beyond pilot phase with measurable KPIs” and “ROI impact measured 6 months post-pilot.” The report itself acknowledges this framing may not capture long-term or indirect value creation.

Every major technology looked “broken” at first

The six-month window in which researchers declared generative AI a failure is shorter than the average ERP implementation timeline.

The real problem: industrial-era metrics applied to cognitive-era change

Traditional ROI calculations work well for capital equipment. Buy a machine, measure how many units it produces, divide by cost. That model breaks down when the technology being measured changes how people think, not how many widgets they make.

 

“The obsession with traditional ROI in AI implementations reflects the same flawed thinking that has plagued every major technological transformation: the expectation that a new technology will accrue immediate financial benefits.”

— UC Berkeley Professional Education, September 2025

This does not mean the failure data should be dismissed. It means the picture is incomplete — and that the more urgent question is not whether AI delivers value, but where and how that value actually shows up.

 

Where the real ROI hides

The failure statistics are real. But they are not the full picture. Even within the same MIT report that declared 95% of companies see no return, the data quietly tells a different story — one about where value is being created, and where it is being ignored.

Three patterns stand out across the MIT, Gartner, and McKinsey findings. Understanding them does not just explain the failure rate; it points directly to where the wins are hiding.

1. Efficiency gains that never show up in earnings reports

Time compression: the compounding ROI nobody measures

When a legal team cuts contract review time by 60%, or a marketing team produces in two hours what used to take two days, that value is real, but it does not appear in a quarterly P&L statement. It shows up in capacity: the same team handles more work, moves faster, and makes fewer errors. Traditional ROI frameworks have no column for this. Yet over time, these efficiency gains compound. A 30% productivity lift across a 50-person team is, effectively, 15 additional full-time employees without the headcount cost.
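The headcount arithmetic above can be sketched in a few lines. This is only an illustration, not a measurement method from any of the cited reports: the function name `fte_equivalent` is hypothetical, and it assumes the productivity lift applies uniformly to every person on the team.

```python
def fte_equivalent(team_size: int, productivity_lift: float) -> float:
    """Extra capacity, in full-time-employee terms, from a productivity lift.

    Assumes every team member gets the same lift (a simplification).
    """
    if team_size < 0 or productivity_lift < 0:
        raise ValueError("team_size and productivity_lift must be non-negative")
    return team_size * productivity_lift


# The example from the text: a 30% lift across a 50-person team.
added_capacity = fte_equivalent(50, 0.30)
print(f"Roughly {added_capacity:.0f} additional FTEs of capacity")
```

In practice the lift is rarely uniform, so a per-role or per-task breakdown would likely give a more defensible number than a single team-wide percentage.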

2. The budget is going to the wrong place

Where companies invest vs. where the returns actually are

MIT’s own data reveals a striking misalignment. Most enterprise GenAI budgets flow into sales and marketing, the most visible, easy-to-pitch functions. But the highest ROI is found in the least glamorous place: back-office automation. Replacing business process outsourcing, cutting external agency spend, streamlining operations: these are the areas where AI is quietly delivering real returns. Companies are investing in the wrong rooms and then wondering why the lights aren’t on.

3. Workforce capability: the ROI that retention reports catch

One person doing what used to take a team

Generative AI does not just make people faster; it expands what they can do at all. A single analyst can now conduct research that previously required a consulting firm. A small content team can produce output at the scale of a much larger agency. This capability expansion is a form of workforce leverage that does not show up in P&L, but it surfaces clearly in employee satisfaction, retention metrics, and the ability to grow revenue without growing headcount proportionally. UC Berkeley’s research found that personal and small-team AI implementations consistently outperform enterprise-wide initiatives precisely because individuals can experiment, iterate, and experience this value directly.

“The real returns lie in functions that are often overlooked. Back-office automation produces the highest returns by streamlining processes, reducing outsourcing, and cutting costs.”

— MIT Project NANDA, The GenAI Divide: State of AI in Business 2025

In other words, the ROI is not missing. It is being created in the wrong functions, measured with the wrong tools, and recorded on the wrong spreadsheet. The companies finding real returns are not necessarily smarter; they are just looking in the right places.

 

Why most companies are getting it wrong

The ROI gap is real. But it is not random. Across the MIT, Gartner, McKinsey, and IBM findings, the same three structural problems appear again and again. They are not about the AI being bad; they are about the decisions made before the AI was ever turned on.

Buying the tool before defining the job

Strategy-free implementation: the most common failure mode

Most organizations started with the technology and worked backwards to find a use for it. That is the opposite of how successful adoption works. When there is no clear strategy, teams run pilots that are interesting but not connected to any business outcome. Six months later, the pilot ends and nobody can explain what it changed.

Gallup (2024): only 15% of US employees say their workplace communicated a clear AI strategy

McKinsey: fewer than 30% of companies report their CEO directly sponsors the AI agenda

Dirty data: the silent project killer

Poor data foundations make even great models useless

Generative AI is only as good as the data it is built on. Organizations that skip the foundational work (data governance, quality controls, pipeline integrity) end up deploying systems that give unreliable outputs. Users lose trust fast, and the project quietly dies. This is not a new problem, but AI amplifies it. A flawed data set does not just produce wrong answers; it produces confidently wrong answers.

IBM: 42% of organizations cannot properly customize AI models due to poor-quality data

Gartner: predicts 60% of AI projects lacking “AI-ready data” will be abandoned by end of 2026

Building in-house when partnering works better

The DIY trap: especially costly in regulated industries

Many enterprises, especially in financial services, healthcare, and other regulated sectors, chose to build proprietary AI systems from scratch. The reasoning made sense: control, compliance, customization. But the execution cost them dearly. Building in-house requires specialized talent, long development cycles, and deep integration expertise that most organizations simply do not have. MIT’s data is clear on the outcome.

The pattern is consistent across all three root causes: the failure is not in the technology. It is in the decisions made around the technology: what problem to solve, what data to use, and whether to build or buy. These are strategy and integration problems. And that is exactly why the organizations that fix them first are the ones that end up on the right side of the GenAI Divide.

 

What the successful 5% are actually doing differently

The GenAI Divide is not just a story of failure. MIT’s report is equally clear about the other side: the small group of organizations that are extracting real, measurable value from generative AI right now. They are not necessarily bigger, richer, or more technically advanced. What separates them comes down to four consistent traits.

Deep workflow integration

AI is embedded into specific, high-value processes, not deployed as a standalone chatbot layer on top of existing systems. It works inside the workflow, not alongside it.

Continuous learning loops

Systems are built to get smarter over time. Feedback is captured, outputs are reviewed, and models adapt to the specific context of the organization, not just generic inputs.

Business outcomes, not tech benchmarks

Success is defined by cost reduction, time saved, or revenue protected, not by model accuracy scores or deployment velocity. The question asked is always: “what changed in the business?”

Right functions, not visible ones

Investment is concentrated in back-office operations, customer support, and process automation, not in marketing pilots. The highest ROI functions are often the least exciting ones.

 

A better way to think about GenAI ROI

The goal is not to throw out ROI as a concept. It is to expand what gets measured. Generative AI creates value in at least four ways that traditional financial metrics miss entirely. Each one is real, trackable, and defensible; they just require a different kind of scorecard.

Return on efficiency

Time saved × volume = compounding capacity. A team handling 30% more work without headcount growth is a financial gain even if it never appears in a revenue line.
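One hedged way to put numbers on that formula is sketched below. The inputs (minutes saved per task, monthly task volume, loaded hourly cost) and the function name `monthly_capacity_gain` are illustrative assumptions, not figures or methods taken from the cited reports.

```python
def monthly_capacity_gain(minutes_saved_per_task: float,
                          tasks_per_month: int,
                          loaded_hourly_cost: float) -> tuple[float, float]:
    """Return (hours freed per month, dollar value of those hours)."""
    hours_freed = minutes_saved_per_task * tasks_per_month / 60
    return hours_freed, hours_freed * loaded_hourly_cost


# Hypothetical inputs: 45 minutes saved on each of 400 tasks a month,
# at an assumed loaded cost of $80 per hour.
hours, value = monthly_capacity_gain(45, 400, 80.0)
print(f"{hours:.0f} hours freed, worth about ${value:,.0f} per month")
```

The dollar figure only becomes a real gain if the freed hours are redeployed to other work, which is why it belongs on a capacity scorecard rather than a revenue line.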

Decision quality

Better analysis, fewer errors, faster responses. Every avoided mistake and every faster customer resolution has a dollar value; it just needs to be calculated deliberately.

Capability score

What can the team do now that was impossible 12 months ago? Tasks that previously required outsourcing, specialist agencies, or larger headcount are now done internally.

Scalability unlocked

How much more revenue or output can the organization handle without proportional cost growth? This is the most strategically valuable metric and the hardest to see in six months.

None of these metrics require abandoning financial discipline. They require adding to it. A CFO who only tracks P&L impact will consistently undervalue AI investments and defund the ones that are quietly compounding the most value.

One important note: expanding the measurement framework is not a license to avoid accountability. Every metric above can be tracked with numbers. The point is not to replace financial rigor; it is to make sure the scorecard is actually measuring what the technology does.

 

The ROI is not missing; it is being looked for in the wrong places

The MIT, Gartner, RAND, and McKinsey data is real. Most companies investing in generative AI are not seeing returns. But the failure is not the technology’s fault. It is the result of strategy-free implementation, poor data foundations, misdirected budgets, and measurement frameworks borrowed from a different era of technology.

Generative AI has been enterprise-viable for less than three years. The organizations that will look back on this period as a turning point are not waiting for better models. They are the ones fixing the strategy layer right now: deciding where AI actually creates value, making sure the data is ready, and measuring what actually changes in the business.

The GenAI Divide is real. But it is not fixed. It is a decision, and the window to make the right one is narrowing as enterprise contracts lock in across the market.

Getting AI to deliver real business value is not a technology problem. It is a strategy and integration problem. If the organization is sitting on pilots that haven’t scaled, or investments that haven’t returned, the path forward starts with an honest assessment of where and how AI is being deployed and what success is actually being measured against. 

JanBask helps enterprises audit their current AI roadmap, secure their data foundations, and target high-yield back-office automation. Partner with JanBask AI Consulting Services to turn stalled pilots into measurable enterprise returns.
