Is the generative AI hype bubble about to burst?

March 10, 2025

Two years after the launch of ChatGPT, generative AI has yet to produce a game-changing use case. Professor Paul Watta breaks down whether the trillion-dollar bet on generative AI will pay off or go bust.

Electrical and Computer Engineering Professor Paul Watta. Photo by Annie Barker

The release of ChatGPT in November 2022 sparked some of the broadest societal discussions about the promise and perils of artificial intelligence in recent memory. In the year after its debut, it was easy to find stories about the potential for large language models, the AI technology underlying ChatGPT and similar products, to totally restructure certain industries. Some looked out even further and worried that AI could eventually threaten human civilization.

In retrospect, those expectations were perhaps too lofty for what was, at the time, a fascinatingly good chatbot that was still prone to lying and inexplicable hallucinations. Even with these limitations, investors have been bullish about the technology. But now, two years on, with truly compelling use cases yet to materialize, some are starting to wonder whether the industry could be dangerously overvalued and overhyped. After all, the most commonly deployed uses — customer service chatbots, AI-enhanced search (which has inspired an internet backlash), AI summaries of product reviews and help writing emails — aren’t life-changing. Even in the field of coding, where LLMs have arguably shown the most practical promise, applications are still limited. Moreover, there are huge concerns that ChatGPT and similar technologies are actually doing damage to society by helping students cheat, erasing recent reductions in energy use and greenhouse gas emissions, and capitalizing on creators’ work without their permission.

Electrical and Computer Engineering Professor Paul Watta says these are all valid concerns. His own take is that it’s “going to be tough” for big bets on generative AI to pay off, and it’s completely possible the industry is heading for a reality check soon. (More on this below.) But he’s also not writing off the potential for some game-changing use cases to still emerge. His main reason for tempered optimism is that LLMs have undergone quite a lot of meaningful technical evolution over the past two years, a story that often gets lost in the media’s coverage of generative AI. He describes the initial releases of ChatGPT and similar products as “chat tools,” whereas the new generations are “reasoning models.” Watta says the former were basically extremely powerful predictive text machines: Based on a text-based prompt, the model would use its knowledge of the patterns of human language to string together words that felt like an appropriate answer. Sometimes the outputs were really great. Other times, particularly when the technology failed to capture the full meaning of the prompt, its answers were frustratingly unhelpful. This is, for example, why early generations of LLMs generally failed as customer service chatbots. They simply couldn’t accurately interpret the nuances of people’s troubleshooting questions, let alone integrate with things like databases of relevant customer information.
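To make the “predictive text machine” idea concrete, here is a toy sketch of next-word generation in Python. The vocabulary and probabilities below are invented purely for illustration; real LLMs predict tokens with neural networks trained on vast text corpora, not a hand-written lookup table.

```python
import random

# Toy "language model": for each word, some plausible next words with
# made-up probabilities. Real models learn such patterns from huge corpora.
NEXT_WORD = {
    "the":     [("order", 0.5), ("refund", 0.3), ("package", 0.2)],
    "order":   [("was", 0.6), ("arrived", 0.4)],
    "package": [("was", 0.5), ("arrived", 0.5)],
    "was":     [("shipped", 0.7), ("cancelled", 0.3)],
    "arrived": [("damaged", 0.5), ("late", 0.5)],
}

def generate(start_word: str, max_words: int = 5) -> str:
    """String words together by repeatedly sampling a likely next word."""
    words = [start_word]
    for _ in range(max_words):
        options = NEXT_WORD.get(words[-1])
        if not options:  # no learned continuation, so stop
            break
        choices, weights = zip(*options)
        words.append(random.choices(choices, weights=weights)[0])
    return " ".join(words)

print(generate("the"))  # e.g., "the order was shipped"
```

Each word is chosen only from what tends to follow the previous one, which is why output can read fluently yet drift away from what the prompt actually asked for.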

Watta says reasoning models emerged to overcome these shortcomings. The main difference is that a reasoning model essentially thinks before it speaks. Rather than generating a quick text-based output based on a single computational line of thought, a reasoning model breaks a prompt down into its component parts to try to better understand the context of the task being asked of it. It then considers multiple options for its response and chooses an answer based on refined numerical parameters established during the model’s training that steer it toward more desirable results. The newer models can also quickly ingest and respond to new information. “That’s really one of the best use cases for it right now. It can take in all kinds of documents — like business prospectuses from a bunch of companies — and generate a summary report for you that’s really quite good,” Watta says. “So that’s something that might have taken an intern 30 hours to do, and now you can do it in minutes.” Because of this fundamentally different architecture, reasoning models also do a reasonably good job of showing how they arrived at a conclusion. Original LLMs were black boxes — not even their designers had a clue how they were coming up with their responses.
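The loop Watta describes, breaking a prompt into parts, drafting multiple candidate answers and choosing the best, can be caricatured in a few lines of code. The sketch below is a loose analogy under those assumptions: the decompose, draft and scoring functions here are invented stand-ins, whereas in a real reasoning model the trained network itself plays all three roles.

```python
import random

def decompose(prompt: str) -> list[str]:
    """Stand-in: split a request into sub-questions. Real models learn this."""
    return [part.strip() for part in prompt.split(" and ")]

def draft_answer(subtasks: list[str]) -> str:
    """Stand-in for one quick candidate response, which (like a single
    fast generation) may cover only part of the request."""
    covered = random.sample(subtasks, k=random.randint(1, len(subtasks)))
    return "Answer covering: " + "; ".join(covered)

def score(answer: str, subtasks: list[str]) -> int:
    """Stand-in for the trained parameters that steer the model toward
    desirable outputs: here, simply reward covering more of the request."""
    return sum(1 for task in subtasks if task in answer)

def respond(prompt: str, n_candidates: int = 4) -> str:
    subtasks = decompose(prompt)                      # break the prompt down
    drafts = [draft_answer(subtasks)                  # consider several options
              for _ in range(n_candidates)]
    return max(drafts, key=lambda a: score(a, subtasks))  # then answer

print(respond("reset my password and update my billing address"))
```

The extra drafting and scoring is also why reasoning models are slower and more expensive per query than the original chat tools: the model does more work before committing to an answer.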

The other reason Watta is still not counting out LLMs is that the technology is evolving very rapidly. Watta says that Sam Altman, CEO of OpenAI (maker of ChatGPT), boasts that every version the company releases is a 10-fold improvement, and that the newest version, expected any day now, could surpass that pace. That’s fast even for the tech sector. There is at least some concrete evidence to demonstrate that progress. For example, Watta keeps an eye on competitive programming challenges hosted on platforms where companies post real-world unsolved programming problems for the world’s best developers to tackle. The earliest releases of LLMs were essentially non-competitors as coders. But OpenAI’s internal benchmarks for its most recent model put it in the Top 50. Watta wouldn’t be shocked if the pending releases of ChatGPT or other models end up in the Top 10, or even land at No. 1.

Watta pulls up a website showing what ChatGPT actually looks like: huge data centers that use massive amounts of electricity. Watta says electricity consumption by AI-based technologies could become a hot-button policy issue in the future. Photo by Annie Barker

Even more interesting than the pace of development is that there now appears to be legitimate competition in the LLM space. Earlier this year, the Chinese startup DeepSeek released an LLM that caused instant disruption. The big revelation wasn’t that DeepSeek was better than ChatGPT — though Watta says, by some benchmarks, DeepSeek slightly outperforms the best models out there. It was that the startup had managed to build an almost-just-as-good reasoning model using far less powerful hardware, a constraint imposed by a U.S. trade policy that banned the highest-powered graphics processing units from being sold to Chinese companies. These chips, made by the U.S. company NVIDIA, were assumed to be essential to creating high-quality LLMs. “They appear to have proved that idea wrong,” Watta says. “What the DeepSeek team did is the classic startup story that Silicon Valley used to do. Startups never have enough resources, so you have to optimize what you have. The big players become lazy, they don’t look for new ideas, and so they overspend to solve a problem. A startup can’t do that. That’s how the little guy beats the big guy.” Moreover, Watta says it’s notable that DeepSeek made key features open source and published technical details about how it arrived at its performance breakthroughs. U.S. companies have typically published benchmark numbers without sharing the details of how they got there.

Watta says, for him, the biggest takeaway from DeepSeek’s release wasn’t technological; it was its impact on the markets. The day after DeepSeek’s debut, NVIDIA’s stock fell by 17%. “The market lost half a trillion dollars. From one release. That’s scary,” he says. He argues this indicates we may be entering a period where venture capitalists become more discerning about their investments in LLM development. Moreover, that’s just one of many possible “bottlenecks” Watta and others foresee. There’s still the nagging question of how much more these models can be improved, given that they thrive on consuming human-produced data and the supply of fresh, high-quality training data is finite. Newer models still lie and hallucinate, though not as much. In addition, some are growing increasingly worried about the electricity consumption of LLMs and AI-based technologies more generally. Currently, AI accounts for about 3% of global electricity use, but that number is expected to grow in the coming years. (By one estimate, a single ChatGPT query uses about 10 times as much electricity as a standard Google search.) Moreover, this growth in electricity demand from AI comes at a time when the push toward electrification in the transportation and heating sectors is already expected to strain the electric grid. Watta says if AI is seen as the reason for spikes in electricity costs, brownouts or setbacks in the fight against climate change, the public could sour on the industry.
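For a rough sense of scale, here is a back-of-envelope sketch built on that 10-to-1 comparison. The per-query wattage and the daily query volume below are illustrative assumptions, not measured figures.

```python
# Back-of-envelope energy math. The per-query figures are rough, commonly
# cited estimates, and the query volume is hypothetical, not a measurement.
GOOGLE_SEARCH_WH = 0.3                # assumed watt-hours per standard search
LLM_QUERY_WH = GOOGLE_SEARCH_WH * 10  # ~10x, per the estimate above

QUERIES_PER_DAY = 200_000_000         # assumed daily query volume

as_search_mwh = QUERIES_PER_DAY * GOOGLE_SEARCH_WH / 1e6
as_llm_mwh = QUERIES_PER_DAY * LLM_QUERY_WH / 1e6
print(f"Served as searches:    {as_search_mwh:,.0f} MWh/day")  # 60 MWh/day
print(f"Served as LLM queries: {as_llm_mwh:,.0f} MWh/day")     # 600 MWh/day
```

Under these assumptions, routing that traffic through an LLM adds hundreds of megawatt-hours of demand per day, which is why per-query efficiency has become a live concern as usage scales.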

Similarly, he says LLMs could run into hurdles with existing privacy law. Watta says one of the more hopeful applications for these new reasoning models could lie in solving complex medical problems. “But when you’re talking about people’s medical information, we have strict regulatory frameworks, like HIPAA, which are designed to protect people’s privacy. With something like that, there is no room for error,” he says. “People have already gotten a little uncomfortable with these technologies making decisions that impact their safety or their financial lives. Now what if someone uses the technology to make a virus that kills millions of people? That could create a major backlash. Even if it wasn’t something that catastrophic, if people’s views go negative, that could constrain development and shake the markets. Because, at least for now, the markets are still betting on a big payday.”

###

Story by Lou Blouin