DeepSeek & Qwen Outsmart Silicon Valley on a Shoestring Budget

Along with: The AI Disruption Era: Speed, Efficiency & Power Plays

Hey there, 

What times we are living in! So much happened in a week.

China’s DeepSeek is shaking up the AI world, proving that cutting-edge models don’t need billion-dollar budgets. With rapid innovation and cost-efficient techniques, it’s challenging industry giants and redefining what’s possible in AI development. As the landscape shifts, one thing is clear: efficiency may be the new advantage in the AI race.

On the other hand, OpenAI has claimed that DeepSeek was distilled from OpenAI’s own models. Only time will tell what is true. But as I said, exciting times indeed!

What would be the format? Every week, we will break the newsletter into the following sections:

  • The Input - All about recent developments in AI

  • The Tools - Interesting finds and launches

  • The Algorithm - Resources for learning

  • The Output - Our reflection

DeepSeek just pulled off something that Silicon Valley’s giants didn’t see coming. It built an AI model faster, cheaper, and with less powerful chips, and that model is outperforming America’s best. It reportedly did so in just two months for under $6 million using Nvidia’s H800s. That’s a fraction of what Big Tech spends, and the results are making waves.

Independent benchmarks show DeepSeek-V3 beating Meta’s Llama 3.1, OpenAI’s GPT-4o, and Anthropic’s Claude 3.5 Sonnet in math, coding, and complex reasoning.

Then, just this Monday, they dropped DeepSeek-R1, a reasoning model that matched or beat OpenAI’s o1 on several benchmarks. Even Microsoft CEO Satya Nadella, speaking at Davos, called it “super impressive” and warned that this shift “must be taken very seriously.”

And then? The market felt it. Nasdaq plunged 3.1%, with Nvidia taking the biggest hit: its stock crashed nearly 17%, marking a record one-day market-cap loss for a Wall Street stock. In fact, Nvidia’s wipeout was more than double the previous record, which, ironically, it set just last September.

The kicker here is that DeepSeek pulled this off without access to Nvidia’s powerhouse H100 chips (thanks to U.S. export restrictions). Instead, they used cost-efficient techniques like model distillation to make smaller models punch way above their weight.
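For readers new to the term, distillation trains a small "student" model to imitate the output distribution of a larger "teacher" model. The sketch below shows the classic temperature-softened loss in plain Python. It is a generic textbook illustration, not DeepSeek's actual training code, and the function names, logits, and temperature value are all assumptions made up for this example.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions.

    Scaling by temperature squared is the usual convention so the loss
    stays comparable across temperatures.
    """
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    kl = sum(
        p * math.log(p / q)
        for p, q in zip(teacher_probs, student_probs)
        if p > 0
    )
    return kl * temperature ** 2


# Hypothetical logits over three candidate tokens.
teacher = [4.0, 1.0, 0.5]
student = [2.0, 1.5, 1.0]
print(distillation_loss(teacher, student))   # positive: student disagrees
print(distillation_loss(teacher, teacher))   # zero: perfect imitation
```

A loss of zero means the student already matches the teacher. Training nudges the student's weights to shrink this number, which is how a small model can pick up much of a big model's behavior.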

So, who’s behind DeepSeek? That’s still a mystery. The lab came out of High-Flyer, a Chinese hedge fund managing $8 billion, and its enigmatic founder Liang Wenfeng remains largely unknown. But they’re not alone. Other Chinese firms, like Kai-Fu Lee’s 01.ai and ByteDance, are proving that you don’t need bottomless budgets to build game-changing AI. ByteDance even claims its latest model outperforms OpenAI’s o1.

“They had to innovate under constraints,” said Perplexity CEO Aravind Srinivas. “And in doing so, they built something far more efficient.”

Could this be the moment China leapfrogs the West in AI? Buckle up. This battle is just getting started. (source)

DeepSeek isn’t stopping at language and reasoning. With Janus-Pro, they’re going straight for OpenAI’s DALL-E 3, proving that their cost-efficient, compute-optimized approach isn’t just for text. It’s reshaping multimodal AI too.

The strategy? The same one that helped them beat GPT-4o and Llama 3.1. But this time, it’s about pushing the limits of AI-generated images without relying on top-tier chips.

Despite U.S. chip restrictions, DeepSeek built Janus-Pro using model distillation, packing serious power into a 7-billion-parameter model. And it’s already leading benchmarks like GenEval and DPG-Bench.

Meanwhile, the chatbot from DeepSeek, which is backed by High-Flyer Capital Management, has skyrocketed to #1 on the Apple App Store. And Wall Street is taking notice. DeepSeek’s compute-efficient AI models could shake up the entire AI chip market, raising a big question:

Can the U.S. hold onto its AI lead? And what happens if high-powered chips become... less necessary? (source)

Alibaba just made its boldest AI move yet. With Qwen2.5-Max, it’s directly challenging DeepSeek-V3, GPT-4o, and Meta’s Llama 3.1-405B, an aggressive play that signals China’s AI arms race is in full swing. And the timing? Day one of Lunar New Year, a clear flex to show that Alibaba isn’t backing down.

DeepSeek’s rapid rise has already shaken both Silicon Valley and China’s AI landscape, forcing giants like Alibaba, ByteDance, and Baidu to level up. Lower inference costs, stronger reasoning, and more efficient architectures: it’s a full-scale battle for AI dominance, and Qwen2.5-Max is Alibaba’s answer.

Qwen2.5-1M: The Long-Context Powerhouse

But Alibaba isn’t stopping at Qwen2.5-Max. It has also released Qwen2.5-1M, a model family built for long-context reasoning, with a massive 1 million-token context window. That means it can retain and analyze information over long sequences, which is ideal for legal analysis, deep research, and multi-step reasoning.

The release includes two open-source models:

  • Qwen2.5-7B-Instruct-1M

  • Qwen2.5-14B-Instruct-1M

And here’s the kicker: a high-speed inference framework that processes massive inputs 3x to 7x faster than conventional approaches. And yes, it’s all open-source, available on Hugging Face and ModelScope for developers to experiment with.

Why Qwen2.5-1M Matters

  • 1M Token Context Window: AI that actually remembers what you said 50 pages ago. Perfect for in-depth research.

  • Dual Chunk Attention (DCA) & Sparse Attention: Long-context performance without lag.

  • 29+ Language Support: Making AI accessible across global applications.

  • Up to 7x Faster Inference: Free for developers, optimized for speed.

With Qwen2.5-Max and Qwen2.5-1M, Alibaba is taking the fight directly to DeepSeek, OpenAI, and Meta. It’s clear: the AI wars are just heating up. (source) (source)
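To get a feel for how roomy a 1 million-token window is, here is a back-of-the-envelope check in Python. The 1.3 tokens-per-word ratio and the 500 words-per-page figure are rough assumptions for English prose, not numbers from Alibaba. Real tokenizers vary by language and content, and the helper names are invented for this sketch.

```python
def estimate_tokens(word_count, tokens_per_word=1.3):
    """Rough token estimate; the 1.3 ratio is an assumed rule of thumb."""
    return round(word_count * tokens_per_word)


def fits_in_window(word_count, window_tokens=1_000_000):
    """True if the estimated token count fits inside the context window."""
    return estimate_tokens(word_count) <= window_tokens


pages = 50
words_per_page = 500  # assumed page length
words = pages * words_per_page

print(estimate_tokens(words), fits_in_window(words))
```

Under these assumptions, a 50-page document uses only a small slice of the window, which is why whole contracts or research archives become practical inputs.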

DeepSeek-R1 shocked the AI world by showing that powerful reasoning models can be trained largely through reinforcement learning, with minimal supervised fine-tuning. It delivered performance on par with OpenAI’s o1, all while keeping costs low.

But here’s the catch: DeepSeek hasn’t released its training code or datasets. No breadcrumbs, no behind-the-scenes look. That’s got the AI community buzzing with one big question: How did they pull this off?

If DeepSeek won’t share, the community will reverse-engineer it. That’s where Open-R1 steps in. The project aims to reconstruct DeepSeek-R1’s training pipeline, validate its claims, and push open-source reasoning models forward. The team is tackling the missing pieces: curating high-quality reasoning datasets, replicating DeepSeek’s RL techniques, and refining the training process for even better results. (source)

OpenAI has rolled out ChatGPT Gov, a secure AI assistant designed specifically for federal, state, and local agencies in the U.S. Unlike standard ChatGPT, this version is built to help agencies meet strict government cybersecurity requirements such as FedRAMP High, IL5, CJIS, and ITAR, so it can handle sensitive, non-public data securely.

Agencies can self-host ChatGPT Gov on Microsoft Azure’s commercial or government cloud, giving them full control over data security and privacy. OpenAI says this setup enhances efficiency while maintaining compliance, making AI more accessible for government use.

Some agencies are already putting it to work:
🔹 Air Force Research Laboratory – Using ChatGPT for administration, coding, and AI training.
🔹 Los Alamos National Lab – Exploring GPT-4o for research in bioscience.
🔹 Minnesota’s Translation Office – Delivering faster, more cost-effective multilingual services.
🔹 Pennsylvania’s AI pilot – Saving employees 105 minutes per day on routine tasks. (source)

Donald Trump is back, and so is his bold approach to policy change. His latest executive order on AI wipes out Biden-era regulations, shifting the focus from risk assessments and public safeguards to economic growth, national security, and "human flourishing."

One of the biggest changes? Tech companies will no longer be required to disclose AI model details before release, rolling back Biden’s stricter oversight. Critics warn that weakening anti-discrimination safeguards could lead to biased AI systems, while supporters argue that cutting red tape will keep the U.S. competitive in the global AI race.

Trump has also introduced a new AI czar—David Sacks, now serving as the Special Advisor for AI and Crypto, tasked with delivering an AI action plan in 180 days.

While some see this as a game-changer for AI innovation, others worry it could unleash AI risks without enough regulation. One thing’s certain—AI policy in the U.S. is taking a sharp turn. (source)

Meta is going all-in on AI, with plans to pour $65 billion into infrastructure in 2025, one of the largest AI investments we’ve seen yet.

What’s the money going toward? A 2-gigawatt data center, massive hiring sprees, and a GPU arsenal that will top 1.3 million by year’s end. In short, Meta is gearing up for an AI-dominated future.

“This will be a defining year for AI,” Zuckerberg said, making it clear that AI won’t just complement Meta’s business. It will be its core.

The move comes as Microsoft, Amazon, and OpenAI ramp up their own AI spending, with some budgets surpassing $80 billion. And then there’s the jaw-dropping $500 billion ‘Stargate’ venture, a mega-partnership between SoftBank, Oracle, and OpenAI that’s turning the AI arms race into a full-blown war.

Meta’s strategy? Go big on open-source AI while doubling down on consumer AI products like Ray-Ban smart glasses and its AI assistant, which Zuckerberg says will hit 1 billion users in 2025.

Meta’s move is a clear signal: Zuckerberg has no intention of finishing second in the AI battle. (source)

Meta AI now remembers key details from 1:1 chats on WhatsApp and Messenger, enabling context-aware responses. If you mention being vegan, future food suggestions will reflect that. Users can delete stored memories anytime.

Additionally, Meta AI is enhancing personalized recommendations across Facebook, Messenger, and Instagram by considering user activity, location, and past interactions, suggesting events, dining spots, and more.

Meta aims to make Meta AI one of the most personalized AI experiences, rolling out new features that tailor information and recommendations to each user.

Currently rolling out in the U.S. and Canada, these updates make Meta AI more intuitive, seamless, and truly personal. (source)

OpenAI has finally entered the agentic era with Operator, an AI agent designed to automate web-based tasks. Operator can help you with online workflows such as booking tickets, ordering groceries, filling out forms, and even creating memes.

We are excited to get our hands on it, although it is currently in research preview for ChatGPT Pro users in the U.S. only.

Operator uses OpenAI’s Computer-Using Agent (CUA) model, which combines GPT-4o’s vision capabilities with advanced reasoning trained through reinforcement learning. It interacts with websites by scrolling, clicking, and navigating menus, much like a human user. If it encounters an issue, Operator can adjust its actions or defer to the user when needed.

Key Features

  • Automates Online Tasks: Takes over repetitive actions to save time.

  • Manages Multiple Workflows: Handles several tasks simultaneously, like booking flights while shopping online.

  • Customizable: Allows users to tailor instructions for specific websites or tasks.

As AI tools like Operator evolve, they raise questions about how agents might reshape the way we approach our digital tasks.

Will tools like these become essential productivity aids, or will they face challenges in adoption? Time will tell.

Project Stargate has the internet buzzing, featuring big names like Donald Trump, Elon Musk, Sam Altman, and more. 

On the second day of his second term, President Trump announced what he called "the largest AI infrastructure project in history." Major tech companies like OpenAI, Oracle, and SoftBank are leading the initiative, with plans to build state-of-the-art data centers in the U.S.

Stargate focuses on creating AI infrastructure across the U.S., starting with $100 billion in funding and growing to $500 billion over four years.

Key Highlights of the Project

  • Data Centers: Initial plans include building at least 10 data centers, starting in Texas, with a potential expansion to 20 nationwide.

  • Tech Collaborations: Partnerships with Microsoft, Nvidia, and other tech giants will provide the computing power needed for advanced AI development.

  • Sectors of Impact: Applications are expected in areas like healthcare, including AI-driven early cancer detection and vaccine development.

Who’s Leading It?

  • SoftBank: Managing financial responsibilities.

  • OpenAI: Overseeing operations.

  • Key Figures: Masayoshi Son (SoftBank CEO) as chairman, with Sam Altman (OpenAI) and Larry Ellison (Oracle) playing critical roles.

Reactions and Skepticism

While Trump emphasized the project’s potential to maintain America’s competitive edge against countries like China, not everyone is convinced. Elon Musk raised doubts about SoftBank’s ability to secure the necessary funding for such an enormous initiative, calling the project’s feasibility into question. (source)

Runway is back with another game-changer. Frames, its latest AI image generation model, takes artistic control to the next level, giving creators unmatched stylistic precision while keeping visuals sharp and consistent.

Why does this matter? Because keeping a cohesive aesthetic across AI-generated content has been a challenge. Frames solves that, letting artists experiment freely without losing their unique visual identity.

The rollout has already started: Gen-3 Alpha users and API integrations are getting first dibs. This means smoother workflows, better creative control, and a powerful new tool for building detailed, immersive worlds. (source)

Sam Altman has confirmed that the final version of o3-mini is ready, with rollout expected in a few weeks. In response to user feedback, OpenAI will launch it in the API and in ChatGPT simultaneously. Altman assures: "It's very good." (source)

Want to create a compelling video but don’t have editing skills? Tellers is an AI-powered automatic video editing tool that transforms text, voice, or even songs into dynamic videos in seconds. Simply upload a text, voice recording, or song, and let AI generate a professional-quality video for you.

How to Access:

  • Create an Account: Sign up on Tellers and log in.

  • Upload Your Content: Add a text script, voice recording, or even a song.

  • AI Video Generation: Tellers automatically processes your input and generates a video.

  • Edit & Customize: Use built-in editing tools to fine-tune visuals, effects, and transitions.

  • Export & Share: Download your video or share it directly on social media.

Big news! We just launched two free courses on DeepSeek this week: “Getting Started with DeepSeek” and “DeepSeek from Scratch”. The courses dive into the AI model shaking up the industry. Wondering how DeepSeek was built and how it stacks up against GPT-4o, Claude 3.5 Sonnet, and o1? These courses are all you need, giving you a step-by-step guide to its architecture, strengths, and real-world applications.

This week, we also celebrate a milestone: the 50th edition of ‘AI Emergence’.

It’s been an incredible journey tracking AI’s rapid evolution, and we couldn’t have done it without you. 

On this occasion, I would love to hear from you: What have you enjoyed most? What would you like to see more of?

Let me know!
