
Qwen 32B better than R1 and o3-mini for 1/10th the cost

Along with - OpenAI’s AI Agents: Worth More Than a PhD Salary?

Hey there 👋

Intelligence has a price and it’s dropping. 

Alibaba’s Qwen just launched a 32B AI model that scores between R1 and o3-mini on LiveBench - at roughly 1/10th the cost. Sounds exciting, doesn’t it? Cheaper AI means more accessible applications. But here’s the flip side - sending data to China is raising concerns.

Meanwhile, in the US, OpenAI is playing a different game. Last week, Sam Altman dropped GPT-4.5 - a model that’s not necessarily smarter, just smoother, with better conversations and fewer hallucinations.

Two very different strategies. Which one wins in the long run? We’ll have to wait and see.

For now, let’s jump into this week’s updates.

What’s the format? Every week, we break the newsletter into the following sections:

  • The Input - All about recent developments in AI

  • The Tools - Interesting finds and launches

  • The Algorithm - Resources for learning

  • The Output - Our reflection


Alibaba just introduced QwQ-32B, an AI model that’s turning heads—not because it’s the biggest, but because it’s ridiculously efficient. Despite having only 32 billion parameters, it delivers performance on par with DeepSeek R1’s massive 671B-parameter model in coding, math, and reasoning tasks.

What you need to know:

  • Efficiency That Competes with Giants – This model pulls off big-model performance with fewer parameters, making it more resource-friendly while keeping up with the best in math, coding, and general reasoning.

  • Reinforcement Learning for Smarter AI – Instead of just predicting the next word, QwQ-32B learns and improves through trial and error, making it better at solving complex problems over time.

  • Open-Source and Ready to Use – It’s available under Apache 2.0 on Hugging Face and ModelScope, meaning developers can tweak, integrate, and build on it without restrictions.

How the Market Reacted

  • Alibaba’s Stock Shot Up – Investors took notice, sending Alibaba’s Hong Kong-listed shares up 7% and U.S. stock up 8.61% after the announcement. Clearly, the market sees this as a major step forward in Alibaba’s AI game.

  • Taking on DeepSeek and OpenAI – With QwQ-32B performing at DeepSeek R1’s level (despite being 20x smaller), Alibaba is making a statement: China’s AI race is heating up, and efficiency is the new battleground. (Source)

OpenAI is rolling out a new line of AI agents, with the top-tier models priced at $20,000 per month. These agents are built for specialized tasks—ranging from sales automation to software development and even PhD-level research.

How These AI Agents Stack Up

  • PhD-Level Research Agent ($20,000/month) – Designed for complex academic research, handling data analysis, hypothesis generation, and advanced problem-solving.

  • Software Developer Agent ($10,000/month) – Aimed at automating coding, debugging, and software design tasks.

  • High-Income Knowledge Worker Agent ($2,000/month) – Supports professionals like consultants and analysts by streamlining workflow-heavy tasks.

The Big Question: Is This Pricing Justified?

At these rates, OpenAI is targeting deep-pocketed enterprises willing to bet big on AI automation. But will businesses see enough value to justify these costs? That remains to be seen. (Source)

OpenAI just launched GPT-4.5, and while expectations were sky-high, the reaction has been… divided. Instead of a leap in reasoning power, OpenAI has focused on making the model feel more human, with improved conversational flow and emotional intelligence.

Sam Altman called it “the first model that feels like talking to a thoughtful person.” Some compare it to Midjourney’s impact on visuals—AI-generated writing that’s more expressive and natural. But does that actually add value?

What’s New in GPT-4.5?

  • Better Conversations – Feels more natural and engaging.

  • Fewer Hallucinations – Down to 37% from GPT-4o’s 61%.

  • Enhanced Emotional Intelligence – Picks up on tone and intent more effectively.

  • Broader Knowledge Base – More up-to-date and versatile.

Sounds great, right? But here’s the catch:

The GPU Bottleneck & the Cost Problem

GPT-4.5 is massive—so much so that OpenAI maxed out its GPUs just trying to roll it out. This forced them to stagger access, prioritizing paying users.

And then there’s the price. OpenAI is charging $75 per million input tokens and $150 per million output tokens—making it 10-25x more expensive than competitors like Grok 3, Gemini 2 Flash, or DeepSeek R1. If you’re running an AI-powered business, costs could skyrocket into the millions. Even OpenAI admits they’re unsure if they’ll offer it via API long-term because of how compute-heavy it is.
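To see how quickly those per-token rates add up, here’s a back-of-the-envelope calculation. The rates come from the article; the monthly usage figures are made-up assumptions for illustration:

```python
# Rough GPT-4.5 API cost estimate. Rates are from the article;
# the usage numbers below are hypothetical.
INPUT_RATE = 75 / 1_000_000    # dollars per input token
OUTPUT_RATE = 150 / 1_000_000  # dollars per output token

def monthly_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated monthly API bill in dollars."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# A modest AI-powered product: 1B input tokens, 200M output tokens per month.
print(f"${monthly_cost(1_000_000_000, 200_000_000):,.0f}")  # prints "$105,000"
```

At that hypothetical volume, a single month already costs six figures - which is why the article's "costs could skyrocket into the millions" warning isn't hyperbole for high-traffic applications.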

So the big question: If OpenAI isn’t confident in GPT-4.5’s long-term viability, why should anyone else be?

Did GPT-4.5 Shake the Markets?

Right around its release, NVIDIA’s stock took a hit. Coincidence? Maybe. But AI’s reckless spending habits are finally being questioned. Investors are realizing that even the most impressive AI models need to be financially sustainable, and right now, GPT-4.5 isn’t.

For years, the industry has been chasing bigger and more expensive models. But with GPU shortages, rising costs, and increasing scrutiny on profitability—are we reaching the tipping point? (Source)

How do you measure a salesperson’s productivity? Simple - by how much time they spend with clients, not buried in operational tasks. The last thing you want is for them to waste time hunting for information instead of selling. Microsoft is stepping in to fix that.

Microsoft just rolled out two new AI-powered sales agents - Sales Agent and Sales Chat - designed to make sales teams more efficient by seamlessly integrating into their workflows.

Sales Agent: The Autonomous Sales Assistant

Sales Agent works in the background, researching leads, scheduling meetings, and even engaging with potential customers. It pulls data from CRM systems, Microsoft 365, and the web to personalize interactions. 

It can qualify leads and even close smaller deals independently, keeping sales teams focused on high-value opportunities. 

With built-in integration into Dynamics 365 and Salesforce, it eliminates the need for manual data entry and tracking.

Sales Chat: Instant Sales Insights, On Demand

Sales Chat acts as a real-time sales assistant, pulling insights from CRM data, pitch decks, emails, and meeting notes. It helps reps prep for calls, spot at-risk deals, and stay on top of their pipeline - all through simple, natural language queries.

With this move, Microsoft is taking on competitors like Salesforce, which has been aggressively pushing its own AI-driven sales tools. But Microsoft’s edge? Deep integration with Teams, Outlook, and the entire Microsoft 365 suite, making these AI agents a natural fit for businesses already in the Microsoft ecosystem. (Source)

Salesforce is expanding AI automation with AgentExchange, a new marketplace for its Agentforce platform. 

Think of it as the AppExchange for AI agents—where businesses can build, buy, and monetize AI-powered automation tools, tapping into the $6 trillion digital labor market.

What you need to know:

  • AI Agent Marketplace – Developers and partners can sell and integrate AI agents just like traditional Salesforce apps.

  • Pre-Built AI Solutions – Launching with 200+ partners and vetted AI actions, templates, and workflows for faster deployment.

  • Custom AI Development – Businesses can build AI components and integrate them seamlessly into Salesforce, Slack, and CRMs.

What do you think? Can it become the Agentic “App store” / “Play Store” for the enterprise? (Source)

OpenAI is making a major move into academic research with NextGenAI, a $50 million initiative aimed at tackling high-impact AI challenges. 

This program includes 15 top research institutions—including Harvard, MIT, Oxford, and Duke—offering grants, cloud compute credits, and API access to fuel AI experimentation at scale.

Sam Altman described it as a way to “accelerate breakthroughs that wouldn’t happen as quickly in isolation.” 

The response so far? Mostly positive. Many see this as OpenAI bridging the big tech vs academia gap and addressing concerns that universities are falling behind in the AI race.

But not everyone is convinced. Critics argue that these partnerships need to be balanced—will this truly democratize AI research, or is it just a talent and IP pipeline for OpenAI? The real question: Could the next big AI breakthrough come from a university, thanks to this push? (Source)

Amazon is entering the GenAI game with Nova - a new reasoning model set to launch in June 2025. 

Unlike models that just spit out fast answers or take forever to “think,” Nova seems to be designed to do both - quick responses when needed and deeper reasoning for complex tasks.

What Makes Nova Stand Out?

  • Hybrid Reasoning – Nova switches between fast replies and detailed problem-solving, making it flexible for different use cases.

  • Taking on the Big Players – Amazon wants Nova to rank among the top 5 AI models, competing with OpenAI’s o3-mini, Claude 3.7, and Google’s Gemini.

  • Cost-Effective AI – Unlike some of its pricier competitors, Nova is being built with cost-efficiency in mind, making it an attractive option for businesses.

  • Enterprise-Focused – Amazon is targeting business use cases, positioning Nova as a tool for better decision-making and problem-solving at scale.

This is Amazon’s biggest move in AI yet, and they’re clearly going after OpenAI and Google. With hybrid reasoning and a lower price point, Nova could be a strong contender - especially for companies looking for a balance between speed, accuracy, and affordability. (Source)

Inception Labs just introduced Mercury, a diffusion-based large language model (dLLM) that takes a completely different approach from traditional LLMs like GPT-4o or Claude. Instead of generating text word by word, Mercury refines an initial rough draft through multiple passes—similar to how AI image generators work. This could be a game-changer for AI speed, efficiency, and accuracy.

What Makes Mercury Different?

  • Diffusion Architecture – Unlike typical LLMs that generate text sequentially, Mercury iteratively improves its output, which could lead to more coherent and polished responses.

  • 10x Faster Processing – Mercury is reportedly 10x faster than state-of-the-art LLMs, processing over 1,000 tokens per second on NVIDIA H100 chips—without requiring specialized hardware.

  • AI-Powered Coding – A dedicated Mercury Coder model has been released, outperforming GPT-4o Mini and Claude 3.5 Haiku in code generation tasks.

  • Enterprise-Ready – Mercury is available via API and on-premise deployments, making it accessible to businesses that want AI capabilities without relying on cloud services.

Andrej Karpathy called Mercury’s approach "an intriguing departure" from traditional text generation. He pointed out that while diffusion models have revolutionized image and video generation, text generation has largely resisted this approach - until now.

Karpathy urged AI enthusiasts to try out Mercury, noting that it could expose both new strengths and unexpected weaknesses in AI text generation. His reaction reflects a growing curiosity in the AI community - could diffusion models be the key to faster, more efficient, and higher-quality text generation?

Is This the Beginning of the End for Traditional LLMs?

If Mercury delivers on its promise, it could reshape how AI models are built and deployed—offering a new path that moves away from the compute-heavy, auto-regressive models we’ve relied on for years. But the big question remains: can it scale up and match the depth and reasoning abilities of traditional models?

For now, Mercury is an exciting experiment that could change the game—or prove why diffusion hasn't worked for text until now. Either way, it’s one of the most fascinating AI developments to watch. (Source)

Google just dropped Gemini Code Assist, an AI-powered coding assistant that’s free for individual developers, including students, freelancers, and startups. 

With 180,000 free code completions per month (woah!), it’s one of the most generous AI coding tools available, far surpassing GitHub Copilot’s free-tier limits.

What you need to know:

  • AI-Powered Coding Help – Built on Google’s Gemini 2.0, it provides code completion, generation, and debugging across multiple languages.

  • Seamless IDE Integration – Works with VS Code, JetBrains, GitHub, and Google Cloud, making it easy to plug into existing workflows.

  • High Free Usage Limits – Offers 180,000 completions/month, significantly higher than GitHub Copilot’s 2,000 completions/month on its free tier.

  • Automated Code Reviews – Includes AI-driven code suggestions, security checks, and refactoring recommendations to improve developer efficiency.

How Does It Compare to GitHub Copilot, and Why Does It Matter?

Google is making a bold move by offering more free AI coding support than Microsoft or GitHub, clearly targeting developers who want powerful AI without a price tag. While GitHub Copilot still benefits from Microsoft’s deep GitHub integration, Gemini Code Assist's wider language support, built-in code review, and generous free tier make it a serious contender.

The AI coding tools race is already on, and this move forces competitors to either drop prices or add new features to keep up. If you’re a developer, now’s the time to explore your options - Gemini Code Assist might just be the most cost-effective AI coding assistant yet.

What do you think about it? (Source)

Vibe coding is becoming the norm, so let’s look at a tool that can level up your UI/UX game.

UXPilot lets you create wireframes or high-fidelity designs just by using prompts—no designer needed. Perfect for anyone looking to build a website without diving into Figma.

We tried it out and built a newsletter wireframe in minutes. Check it out here!

  • Snap’s Set-And-Sequence – AI can now animate characters from a single video clip, letting them perform in new scenes. A game-changer for storytelling and meme-making.

  • Google’s LearnLM – A next-gen AI tutor that adapts to learning styles, explains concepts step by step, and creates lesson plans—making AI-powered education more interactive.

  • Zoom’s Chain of Draft (CoD) – A new LLM prompting method that speeds up responses by keeping reasoning concise, making AI outputs faster and cheaper without losing accuracy.
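The Chain of Draft idea above boils down to a change in the prompt: instead of asking the model for verbose step-by-step reasoning, you ask it to keep each intermediate step to a terse draft. A minimal sketch of what such a prompt might look like (the exact instruction wording here is our assumption, not Zoom’s published template):

```python
# Illustrative Chain-of-Draft-style prompting, based on the description above.
# The instruction text is an assumption for demonstration purposes.

COT_INSTRUCTION = "Think step by step, explaining each step in detail."
COD_INSTRUCTION = (
    "Think step by step, but keep each reasoning step to a minimal draft "
    "of at most five words. Return the final answer after ####."
)

def build_prompt(question: str, concise: bool = True) -> str:
    """Wrap a question in a concise (draft-style) or verbose instruction."""
    instruction = COD_INSTRUCTION if concise else COT_INSTRUCTION
    return f"{instruction}\n\nQ: {question}\nA:"

print(build_prompt("If 3 pens cost $4.50, what do 7 pens cost?"))
```

Because the model emits far fewer reasoning tokens per step, responses come back faster and cheaper - the accuracy claim is the part worth testing on your own workloads.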

This week has been all about AI agents taking center stage—more than just new models, we’re seeing a clear shift toward agent-driven applications.

Do you have a go-to agent you use regularly? Let me know in the comments!

Until next time!
