Episode #231

The $5.5 Million Breakthrough: DeepSeek’s AI Disruption

Discover how DeepSeek-V3 is disrupting the AI market with massive cost savings and technical innovations like Multi-Head Latent Attention.

Episode Details

Duration: 17:41
Pipeline: V4

AI-Generated Content: This podcast is created using AI personas. Please verify any important information independently.

Episode Overview

In this episode of My Weird Prompts, Herman and Corn dive deep into the seismic shift occurring in the artificial intelligence landscape as Eastern labs like DeepSeek and Z.ai challenge the status quo. While Western giants like OpenAI and Anthropic spend hundreds of millions on training, DeepSeek reports having trained a world-class model for roughly $5.5 million. The duo explores the technical "wizardry" behind this efficiency, including Multi-Head Latent Attention (MLA) and FP8 mixed precision training, which allow these models to run on less expensive hardware without sacrificing power. They also tackle the strategic implications of open-sourcing these models under MIT licenses, the impact of hardware export bans on innovation, and how Western developers are increasingly turning to these cost-effective alternatives to build the next generation of apps. Is AI intelligence becoming a cheap commodity like electricity? Join Herman and Corn as they unpack the economic and technical forces turning the AI world upside down.

In the rapidly evolving world of artificial intelligence, the "moat" surrounding industry titans is often measured in dollars—specifically, the hundreds of millions required to train the world’s most powerful models. However, as Herman and Corn discuss in the latest episode of My Weird Prompts, a new wave of models from the East is proving that intelligence might not have to be quite so expensive. The discussion centers on the emergence of DeepSeek and Z.ai, companies that are delivering high-tier performance at a fraction of the cost of their Western counterparts.

The $5.5 Million Question

The episode kicks off with a startling comparison of training costs. While flagship Western models like OpenAI's GPT-4 are rumored to have cost over $100 million to train (with future models projected to reach the billions), DeepSeek recently released its V3 model with a reported training price tag of just $5.5 million. Herman explains that this isn't just a minor discount; it is a fundamental shift in the economics of AI. For the average user, these savings translate into lower per-token costs, allowing developers to build complex applications without the prohibitive overhead traditionally associated with high-end LLMs.
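To make that gap concrete, here is a back-of-the-envelope comparison in Python. The per-million-token prices below are hypothetical placeholders chosen only to illustrate the shape of the math, not quoted rates from any provider.

```python
# Hypothetical per-million-token output prices, for illustration only.
# Real prices vary by provider, model version, and date.
PRICE_PER_M_TOKENS = {
    "frontier_western_model": 30.00,   # placeholder, USD per 1M output tokens
    "deepseek_v3_class_model": 1.00,   # placeholder, USD per 1M output tokens
}

monthly_output_tokens = 500_000_000  # a mid-sized app's monthly generation volume

for model, price in PRICE_PER_M_TOKENS.items():
    cost = monthly_output_tokens / 1_000_000 * price
    print(f"{model}: ${cost:,.2f}/month")

# With these placeholder numbers the bill drops from $15,000 to $500 a month,
# the kind of gap that decides what a small startup can afford to build.
```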

Architectural Wizardry: MLA and FP8

A significant portion of the conversation focuses on how DeepSeek achieved these efficiencies. Herman breaks down two key technical innovations: Multi-Head Latent Attention (MLA) and FP8 mixed precision training.

MLA is described as a sophisticated compression system for the model's "attention" mechanism. In standard models, maintaining context means caching a large key-value (KV) entry for every token, which consumes massive amounts of memory. MLA allows the model to retain that same context in a much smaller memory footprint, enabling faster processing on less expensive hardware.
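For readers who want to see the idea in code, here is a minimal single-head sketch of latent KV compression in PyTorch. It is a toy illustration of the caching trick, not DeepSeek's actual MLA (which adds multiple heads, decoupled rotary embeddings, and other details); every name here is invented for the example.

```python
import torch
import torch.nn as nn

class LatentKVAttention(nn.Module):
    """Toy single-head sketch of latent KV-cache compression (not DeepSeek's MLA)."""

    def __init__(self, d_model: int, d_latent: int):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        # Down-project hidden states into a small latent; THIS is what gets
        # cached, instead of the full-width keys and values.
        self.kv_down = nn.Linear(d_model, d_latent)
        # Up-project the cached latent back into keys and values at attention time.
        self.k_up = nn.Linear(d_latent, d_model)
        self.v_up = nn.Linear(d_latent, d_model)

    def forward(self, x, latent_cache=None):
        # x: (batch, seq, d_model); causal masking omitted to keep the sketch short.
        latent = self.kv_down(x)                      # (batch, seq, d_latent)
        if latent_cache is not None:
            latent = torch.cat([latent_cache, latent], dim=1)
        q = self.q_proj(x)
        k, v = self.k_up(latent), self.v_up(latent)
        scores = q @ k.transpose(-2, -1) / (x.shape[-1] ** 0.5)
        return torch.softmax(scores, dim=-1) @ v, latent

# Per token, the cache holds d_latent floats instead of 2 * d_model:
# here 64 numbers instead of 1,024, a 16x smaller context footprint.
attn = LatentKVAttention(d_model=512, d_latent=64)
out, cache = attn(torch.randn(1, 10, 512))
print(out.shape, cache.shape)  # torch.Size([1, 10, 512]) torch.Size([1, 10, 64])
```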

Complementing this is FP8 training. Herman uses the analogy of measuring wood: while some models try to measure to the nearest nanometer (using extreme mathematical precision that drains computing power), DeepSeek utilized FP8 to perform calculations with "just enough" precision to be accurate. This "mixed precision" approach drastically reduced computational time and energy consumption, proving that smarter math can sometimes replace raw power.
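A quick way to see the trade-off is to round-trip the same weights through progressively narrower number formats. This snippet only demonstrates the storage-precision side of the idea; actual mixed-precision training also keeps higher-precision copies for sensitive steps such as accumulation. It assumes PyTorch 2.1 or later for the float8 dtypes.

```python
import torch

w = torch.randn(1024)  # stand-in for a slice of model weights

for dtype in (torch.float32, torch.bfloat16, torch.float8_e4m3fn):
    # Cast down to the narrower format and back up to measure what was lost.
    roundtrip = w.to(dtype).to(torch.float32)
    err = (w - roundtrip).abs().max().item()
    bits = torch.finfo(dtype).bits
    print(f"{str(dtype):>22}: {bits:>2} bits/value, max abs error {err:.5f}")

# FP8 stores each number in a quarter of the space of FP32. The bet behind
# mixed precision is that this "nearest millimeter" accuracy is good enough
# for bulk matrix math, reserving full precision for where errors compound.
```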

Innovation Born of Necessity

Corn and Herman also touch upon the geopolitical and economic factors driving these breakthroughs. With export bans limiting access to the latest high-end chips in China, companies like DeepSeek have been forced to innovate within constraints. This "necessity is the mother of invention" scenario has led to software that extracts maximum utility from every processor cycle.

Furthermore, the hosts discuss the strategic use of open-source licensing. By releasing models like DeepSeek-R1 under the MIT license, these companies are encouraging rapid global adoption. Herman notes that it is difficult for paid services to compete with a product that is "free and very good," especially when the open-source nature allows developers to host the models on their own servers, ensuring data privacy and security.
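Because the weights are openly released, "host it on your own servers" is largely an off-the-shelf exercise. The sketch below assumes vLLM's OpenAI-compatible server and the official openai Python client; the model identifier, port, and serve command are illustrative defaults, so check the current documentation before relying on them.

```python
# First, serve the open weights locally (in a separate terminal):
#   vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
#
# vLLM exposes an OpenAI-compatible endpoint on localhost:8000, so the
# standard client works unchanged and no data ever leaves your machine.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # your own server, not a third-party API
    api_key="not-needed-locally",         # vLLM accepts any key unless one is configured
)

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-7B",
    messages=[{"role": "user", "content": "Summarize our Q3 roadmap."}],
)
print(response.choices[0].message.content)
```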

The Shift Toward Sovereign AI

A common concern regarding Eastern AI models involves security and data sovereignty. However, Herman argues that the open-source nature of these models actually provides a solution. Because developers can "air-gap" these models—running them on private, disconnected servers—they can utilize high-level intelligence without sending proprietary data to a third-party provider in another country. This has led to a surge in adoption among Western startups and enterprises on platforms like OpenRouter, where the price-to-performance ratio of DeepSeek is becoming impossible to ignore.
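Part of why that ratio is so visible is that switching costs are near zero: OpenRouter speaks the same OpenAI-compatible protocol, so trying a DeepSeek model is often a two-line change from the self-hosted sketch above. The model slug here is illustrative; consult OpenRouter's catalog for current names and prices.

```python
from openai import OpenAI

# Same client as before; only the base URL, key, and model name change.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_OPENROUTER_KEY",  # placeholder
)

response = client.chat.completions.create(
    model="deepseek/deepseek-chat",  # illustrative slug for a DeepSeek-V3-class model
    messages=[{"role": "user", "content": "Draft a polite follow-up email."}],
)
print(response.choices[0].message.content)
```

That interchangeability is exactly what lets developers benchmark price-to-performance across providers and route traffic to whichever model wins.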

The Future: AI as a Commodity

As the episode concludes, Herman and Corn reflect on what this means for the future of the industry. The competitive advantage of being the "biggest" is shrinking, forcing Western companies to pivot toward "agentic" AI—models that don't just talk, but can actively use tools and perform tasks like booking flights or managing calendars.

The ultimate takeaway is that high-quality intelligence is rapidly becoming a commodity, much like electricity or water. As costs plummet and efficiency rises, the specific model powering an application will become less important than the utility it provides. In this new era of AI, the winners are the developers and users who now have access to world-class brainpower at a price point that was unthinkable only a year ago.

Downloads

Episode Audio
Download the full episode as an MP3 file

Transcript (TXT)
Plain text transcript file

Transcript (PDF)
Formatted PDF with styling

Episode #231: The $5.5 Million Breakthrough: DeepSeek’s AI Disruption

Corn
Hey everyone, welcome back to My Weird Prompts! I am Corn, and as always, I am hanging out here in Jerusalem with my brother.
Herman
Herman Poppleberry, at your service. It is a beautiful day to talk about some seriously heavy-duty technology.
Corn
It really is. And we have got a great one today. Our housemate Daniel sent us a voice note about something that has been making waves in the tech world lately. He was asking about these artificial intelligence models coming out of the East, specifically companies like DeepSeek and Z dot A I.
Herman
Yeah, Daniel is always keeping his ear to the ground with this stuff. He noticed that these models are hitting the market with price tags that are just a fraction of what companies like OpenAI and Anthropic are charging. We are talking about a massive gap in cost.
Corn
It is wild. I mean, I have been seeing the headlines, but when you actually look at the numbers, it feels like there is a glitch in the matrix. How can one company charge thirty times less than another for something that performs almost as well, or sometimes even better?
Herman
That is the multi-million dollar question, Corn. Or in the case of DeepSeek, maybe the five point five million dollar question.
Corn
Wait, five point five million? That sounds like a lot of money to me, but in the world of training these massive models, that is actually peanuts, right?
Herman
Exactly. To put it in perspective, the industry standard for training a top-tier model like GPT-four was rumored to be well over one hundred million dollars. Some estimates for the next generation of Western models are heading into the billions. And then DeepSeek comes along and says, hey, we built DeepSeek-V-three for about five point five million dollars in training costs.
Corn
Okay, hold on. We need to back up. If I am a regular person just using these tools to write emails or code, why should I care about the training cost? And how on earth did they get it that low?
Herman
Well, you care because those savings get passed directly to you. When the training and inference costs are lower, the price per token, which is basically how these companies measure the text they generate, drops through the floor. It makes it possible for developers to build much more complex apps without going bankrupt.
Corn
That makes sense. So, is it just that they are using cheaper labor or something? Or is there actual wizardry happening under the hood?
Herman
It is a bit of both, but the technical wizardry is the real story here. DeepSeek-V-three uses some incredibly clever architectural tricks. One of the big ones is something called Multi-Head Latent Attention, or M L A.
Corn
Okay, Herman, you know the drill. Break that down for a sloth like me. What is Multi-Head Latent Attention when it is at home?
Herman
Think of the model’s attention mechanism as its ability to look back at what has already been said to understand the context. In older models, that process takes up a huge amount of memory. It is like having a giant filing cabinet where every single word has its own massive folder. M L A is like a super-efficient compression system. It allows the model to keep all that context in a much smaller space, which means it can process information much faster and on less expensive hardware.
Corn
So it is like they found a way to pack the same amount of brainpower into a smaller suitcase?
Herman
Precisely. And they also used something called F P eight mixed precision training.
Corn
You are doing it again with the letters and numbers, Herman.
Herman
Sorry! Basically, when computers do math, they can use different levels of precision. Think of it like measuring a piece of wood. You could measure it to the nearest millimeter, or the nearest nanometer. Using nanometer precision is much harder and takes more computing power. F P eight is a way of doing the math with just enough precision to get the right answer without wasting energy on unnecessary detail. DeepSeek figured out how to use this throughout their entire training process, which saved them a massive amount of computational time.
Corn
That is fascinating. It sounds like they are just being more efficient with the actual code and the way the math is handled. But Daniel also asked if this was a strategic pricing move. Like, are they taking a loss just to get people to use their stuff?
Herman
That is definitely part of the conversation. There is a strategy in the business world called blitzscaling, where you price things super low to capture the market. But with DeepSeek and Z dot A I, the efficiency seems real. They aren't just subsidizing the cost; they have actually reduced the cost of production. It is like the difference between a luxury car brand and a company that figures out a way to mass-produce high-quality engines for a tenth of the price.
Corn
It feels a bit like the early days of any industry where someone comes in and just disrupts the whole cost structure. But I wonder about the economic factors too. I mean, they are based in China, right? Does that play a role?
Herman
It does. There are a few layers to that. First, yes, the cost of highly skilled engineering talent in China can be lower than in Silicon Valley, though that gap is closing for top-tier A I researchers. But more importantly, there is a huge focus on efficiency because they have had to work around hardware limitations.
Corn
Oh, you mean the export bans on high-end chips?
Herman
Exactly. When you cannot just throw ten thousand of the latest and greatest chips at a problem, you have to get creative with how you use the hardware you do have. That necessity has driven a lot of this innovation in efficiency. They are getting more out of every single cycle of the processor.
Corn
That is a classic "necessity is the mother of invention" situation. They had to be better because they couldn't just be bigger.
Herman
Right. And then there is the open-source element. DeepSeek-R-one, for example, is fully open-source. They released the weights under an M I T license. This means anyone can take it, modify it, and run it on their own servers.
Corn
Wait, so they are giving away the secret sauce for free?
Herman
Pretty much. And that creates a massive amount of adoption very quickly. It is hard to compete with "free and very good."
Corn
I can see why that would shake things up. Let's take a quick break to hear from our sponsors, and then I want to get into how people in the West are actually using these models.

Larry: Are you tired of your garden looking like a boring collection of plants? Do you wish your petunias had more... personality? Introducing Larry’s Bio-Luminescent Garden Gnomes! These aren't your grandfather’s lawn ornaments. Each gnome is infused with a proprietary blend of deep-sea algae and glow-in-the-dark isotopes. They don't just sit there; they emit a faint, soothing hum that may or may not stimulate plant growth. Are they safe for pets? We haven't seen any evidence to the contrary! Do they require batteries? No, they run entirely on ambient moonlight and your own sense of wonder. Transform your backyard into a neon wonderland that can be seen from low earth orbit. Larry’s Bio-Luminescent Garden Gnomes - because the night is too dark and your lawn is too quiet. BUY NOW!
Corn
...Alright, thanks Larry. I am not sure I want my lawn visible from space, but I appreciate the enthusiasm. Anyway, Herman, back to the A I stuff.
Herman
Yeah, let's leave the glowing gnomes aside for a moment. We were talking about adoption in the West.
Corn
Right. Daniel was asking if there is any data on people in the U S or Europe actually using these Eastern models. Because there is a lot of talk about "sovereign A I" and security concerns, right?
Herman
There absolutely is. But the data shows that despite those concerns, the adoption is skyrocketing, especially among developers and startups. If you look at platforms like OpenRouter, which is a service that lets developers access dozens of different A I models through a single interface, you can see the trends clearly.
Corn
And what are the trends saying?
Herman
They are saying that people follow the value. DeepSeek models have been surging in popularity on those platforms. For a lot of developers, especially those building software as a service, or S a a S, the price-to-performance ratio is just too good to ignore. If you are a small startup and you can get ninety-five percent of the performance of a Western model for three percent of the cost, you are going to take that deal almost every time.
Corn
I mean, I would. If I am trying to build a new app on a budget, that is a huge difference. It is the difference between being able to afford to run your business and not.
Herman
Exactly. And it is not just about the price. Because these models are often open-source, developers feel like they have more control. They can host the models themselves, which actually solves some of those privacy and security concerns Daniel was hinting at. If you run the model on your own servers, your data isn't being sent off to a third party.
Corn
Oh, that is a good point! I hadn't thought of it that way. I always assumed "open" meant "less secure," but it is actually the opposite if you have the technical skills to manage it yourself.
Herman
Right. You can "air-gap" it, meaning you keep it completely disconnected from the internet if you want to. That is a huge selling point for enterprise customers who are worried about their trade secrets leaking into a public A I training set.
Corn
So, we have got technical efficiency, we have got strategic open-sourcing, and we have got this massive cost advantage. Does this mean the Western companies like OpenAI and Anthropic are in trouble?
Herman
I wouldn't say they are in trouble, but the game has definitely changed. The "moat," or the competitive advantage, that they had by just being the first and the biggest is shrinking. They are being forced to justify their higher prices. You see them pivoting more towards "agentic" capabilities—basically, A I that can actually do tasks and use tools, rather than just generating text.
Corn
Like A I that can actually book a flight for you or manage your calendar?
Herman
Exactly. That requires a different kind of reliability and reasoning that the Western models are still leading in, at least for now. But the Eastern models are catching up fast. DeepSeek-V-three and R-one have shown incredible reasoning capabilities, especially in math and coding.
Corn
It is like a race where one group is focused on being the most powerful, and the other group is focused on being the most efficient and accessible.
Herman
That is a great way to put it. And the interesting thing is that this competition is actually pushing the Western companies to be more efficient too. We are seeing a lot more "small" or "medium" models coming out of OpenAI and Google that are much cheaper to run than their flagship versions.
Corn
So, in the end, the user wins?
Herman
In terms of cost and access, absolutely. We are entering an era where high-quality intelligence is becoming a commodity. It is becoming like electricity or water. You don't think about the cost of every single light switch you flip; you just use it. A I is heading in that direction.
Corn
That is a big shift. I remember when using these things felt like this precious, expensive resource. Now you are saying it is going to be everywhere.
Herman
It already is. And the adoption in the West is only going to grow as more companies integrate these cheaper models into their back-end systems. Most people using an app won't even know which model is powering it. They will just notice that the app is faster and maybe cheaper or has more features.
Corn
It is kind of like how most people don't know what kind of database their favorite website uses. It just works.
Herman
Exactly. And that is the ultimate goal of any technology—to become invisible.
Corn
So, to wrap up Daniel’s question... the price difference is a mix of genuine technical innovation in how the models are built, a strategic move to gain market share through open-source, and a focus on efficiency born out of necessity. And the adoption in the West is very real, especially among the people actually building the tools we use every day.
Herman
Spot on, Corn. You have been paying attention!
Corn
I try, Herman. I might be a sloth, but I can keep up when the topic is this interesting. I think the takeaway for everyone listening is that the A I landscape is much bigger than just the names we hear in the news every day. There is a whole world of innovation happening, and it is making these tools more accessible to everyone.
Herman
It really is an exciting time. I can't wait to see what Daniel sends us next week. He always finds the most interesting threads to pull on.
Corn
He really does. Well, that is all for today’s episode. Thank you so much for joining us on this deep dive.
Herman
It was a blast. Remember to keep asking those weird questions!
Corn
You can find us on Spotify and at our website, myweirdprompts dot com. We have got an R S S feed there and a contact form if you want to send us your own prompts.
Herman
We love hearing from you. Until next time!
Corn
This has been My Weird Prompts. Thanks for listening!
Herman
Goodbye everyone!
Corn
See ya!

This episode was generated with AI assistance. Hosts Herman and Corn are AI personalities.