DeepSeek: Swim in Different Waters
On DeepSeek, the latest market reaction, and other implications.
Just wrapped up an interview with our friend Chery Kang at CNBC about the recent market turbulence in AI stocks. [Update: video just became available, attached below]
The market's knee-jerk reaction to DeepSeek's efficiency breakthrough reminds me of the 2000 dot-com era: when Cisco (the Nvidia of its day) crashed, many thought it signaled the internet's death. Instead, it marked the beginning of the internet's golden age.
Today's story rhymes: while markets may react nervously to efficiency gains in the short term, the long-term story is about expansion, not contraction. Here is why…
Broader Context on DeepSeek and Others:
Our 2024 State of AI Report predicted something counterintuitive:
We predicted: Architectural ingenuity (emerging paradigms) > brute-force scaling
What happened: DeepSeek's Multi-head Latent Attention (MLA) and mixture-of-experts (MoE) design delivers GPT-4 performance at 70% lower cost (a toy MoE routing sketch follows at the end of this section)
The old playbook—throwing more chips and data at the problem—is NOT the only option.
DeepSeek proves constraint breeds creativity.
Analogy: old method = cramming books (data) + study hours (compute resources and chips). AI is no longer a student cramming textbooks. With DeepSeek, it's a savant on a tailored curriculum, with "reflection" and "trial and error" built in through reinforcement learning, which is much closer to how humans actually learn!
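To make the mixture-of-experts idea concrete, here is a minimal top-k routing layer in PyTorch. This is an illustrative toy sketch of the general MoE pattern, with made-up sizes, names, and a naive routing loop; it is not DeepSeek's actual implementation, which adds further refinements. The point it shows is simply that each token activates only a small subset of the network's experts, which is where the compute savings come from.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoELayer(nn.Module):
    """Toy top-k mixture-of-experts layer, for illustration only (not DeepSeek's code)."""

    def __init__(self, d_model=64, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)          # gating network scores each expert
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):                                    # x: (n_tokens, d_model)
        scores = self.router(x)                              # (n_tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)       # each token keeps only its top-k experts
        weights = F.softmax(weights, dim=-1)                 # normalize the kept scores
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                     # tokens whose slot-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out                                           # only k of n_experts ran per token

tokens = torch.randn(16, 64)                                 # 16 token embeddings
print(TinyMoELayer()(tokens).shape)                          # torch.Size([16, 64])
```

Because only top_k of n_experts feed-forward blocks run for any given token, total parameter count can grow much faster than the compute spent per token.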
DeepSeek's open-source nature is a pretty big deal: by open-sourcing efficient architectures, it democratizes access to state-of-the-art AI. From Silicon Valley giants to developers in South Africa, everyone can build on and improve the technology.
Let's be clear: DeepSeek isn't an overnight success story. It stands on the shoulders of giants—building upon ChatGPT's foundations and the broader open-source ecosystem. This is coevolution at work:
DeepSeek's breakthroughs likely leverage GPT-4's architectural insights and model distillation (see the distillation sketch after this list)
Big Tech companies will host DeepSeek's R1 or incorporate its open-source distillation techniques into future models (some already have!)
The cycle continues: this isn't an "either/or" crossroads, it's coevolution
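To unpack the "distillation" point above, here is a minimal sketch of a generic knowledge-distillation loss in PyTorch: a smaller student model is trained to match a larger teacher's softened output distribution while still fitting the true labels. The function name, temperature, and mixing weight here are illustrative assumptions, not DeepSeek's published recipe.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Generic soft-target distillation: the student mimics the teacher's
    softened output distribution while also fitting the hard labels."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),   # student's softened log-probs
        F.softmax(teacher_logits / T, dim=-1),       # teacher's softened probs (the "soft targets")
        reduction="batchmean",
    ) * (T * T)                                      # standard rescaling for the temperature
    hard = F.cross_entropy(student_logits, labels)   # usual supervised loss on true labels
    return alpha * soft + (1 - alpha) * hard

# Toy usage: 4 examples with a 10-class vocabulary stand-in.
student = torch.randn(4, 10, requires_grad=True)     # small model's raw logits
teacher = torch.randn(4, 10)                         # large model's raw logits
labels = torch.randint(0, 10, (4,))
print(distillation_loss(student, teacher, labels))
```

The intuition: the teacher's softened probabilities carry richer information than one-hot labels (how plausible each alternative is), which is why a much smaller student can recover a surprising share of the teacher's capability.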
Closed-source moats (data, compute) are leaking. The future belongs to hybrids: specialized models for the masses, giants for the moonshots.
Think of DeepSeek's impact like the mobile revolution in computing:
PCs weren't replaced—the ecosystem expanded
Mobile democratized access, creating new markets
Similarly, DeepSeek isn't replacing large models; it's making AI accessible to the masses, just as smartphones enabled new applications that were impossible on PCs.
Efficient AI will unlock novel use cases -> a golden age for developers and application companies
Furthermore, while this architectural breakthrough is significant, it may not be the final answer for future AI development, particularly as we move toward agent-based systems and AGI. We'll likely hit new bottlenecks that require fresh innovations.
But here's what makes AI progress beautiful: it's inherently non-linear. Each breakthrough creates new bottlenecks, which spark fresh innovations - the disruption isn't a bug, it's a feature.
Public Markets Misunderstand Efficiency
I love the saying "Markets panic at efficiency, but innovation thrives on it," because that is exactly what's happening now.
Time horizon matters. When Cisco crashed in 2000, it wasn't the end of the internet - it was the start of its golden age. The past few days' sell-off in AI hardware stocks is similar: a knee-jerk reaction to efficiency gains.
DeepSeek’s $5M training cost excludes prior R&D expenses, and its success doesn’t negate the need for continued infrastructure investment
They are not necessarily in conflict: efficiency gains (e.g., cheaper training and inference) could increase demand for AI computing as adoption widens.
In addition, it is worth differentiating between hardware and software. If chips become increasingly commoditized (i.e., turned into "infrastructure"), the more lucrative path shifts to software and application-layer companies.
Key Takeaways and Implications
At the foundation model layer: DeepSeek proves that architectural innovation (MLA, MoE, FP8, distillation) can outperform brute-force scaling. Investors will favor teams with expertise in algorithmic efficiency, not just access to GPUs
Closed-source AI’s moats might be leaking. Why pay premium prices for bloated models when open-source rivals match their performance at 10% of the cost?
Vertical AI will thrive: the next deca-unicorns will specialize in healthcare, finance, and other large-TAM verticals, likely built by completely new AI-native companies with agents and agent networks ("Agent-net") built in from day one.
Domain expertise + algorithmic efficiency = new market leaders
Moat 2.0: We need to redefine what a moat is as the software form factor evolves from passive, static SaaS to "proactive, action-driven" AI-native businesses.
Traditional SaaS moats relied on data volume or network effects. AI-native businesses are different.
Unlike previous SaaS or mobile software businesses, an AI-native business's durability lies in (as we stated in our latest "2025 State of AI" report and our recent observations):
Agentic workflows that improve with each interaction
Institutional learning loops that compound intelligence
Multi-agent networks that create emergent capabilities
Remember: information wants to be free, but intelligence wants to compound.
Feel free to reach out with thoughts or questions. The conversation around AI efficiency is just beginning.