#1855: AI Is Turning Your Photos Into 3D Models

From blocky polygons to photorealistic assets, AI is transforming how 3D models are made.

Episode Details
Episode ID
MWP-2010
Published
Duration
20:41
Audio
Direct link
Pipeline
V5
TTS Engine
chatterbox-regular
Script Writing Agent
Gemini 3 Flash

AI-Generated Content: This podcast is created using AI personas. Please verify any important information independently.

The Evolution of 3D Modeling: From Manual Labor to AI Generation

For decades, creating 3D assets was a painstaking process of manual labor. Artists spent hours clicking and dragging individual vertices to build shapes, a method that defined the industry from the blocky polygons of 90s games to the detailed characters of modern AAA titles. Today, that process is undergoing a radical transformation. Generative AI is now taking the wheel, turning text prompts and simple photos into fully realized 3D models, fundamentally changing how digital worlds are built.

The Technical Leap: From Meshes to Splats
At the heart of this shift is the move away from traditional mesh modeling—where objects are built from a net of triangles—toward more fluid, AI-driven techniques. One of the most significant advancements is Gaussian Splatting. Instead of calculating light hitting solid surfaces, this method uses millions of tiny 3D blobs of color and transparency that "snap" into a sharp image when viewed. It is significantly faster to render and allows for complex visual fidelity without the heavy computational load of traditional rendering.
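The compositing idea behind splatting can be sketched in a few lines. This is a toy 2D illustration, not a real 3D Gaussian Splatting renderer: each splat is a Gaussian blob with a position, spread, color, opacity, and depth, and the image is formed by alpha-compositing the blobs from nearest to farthest.

```python
import numpy as np

def render_splats(splats, height, width):
    """Composite Gaussian splats into an RGB image, nearest first."""
    ys, xs = np.mgrid[0:height, 0:width]
    image = np.zeros((height, width, 3))
    transmittance = np.ones((height, width))  # how much light still passes through

    # Sort by depth so nearer splats are composited first
    for cx, cy, sigma, color, opacity, depth in sorted(splats, key=lambda s: s[5]):
        # Gaussian falloff around the splat centre
        falloff = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))
        alpha = opacity * falloff
        image += (transmittance * alpha)[..., None] * np.asarray(color)
        transmittance *= 1.0 - alpha
    return image

# Two overlapping blobs: a red one in front of a blue one
splats = [
    (16, 16, 4.0, (1.0, 0.0, 0.0), 0.9, 0.5),  # (x, y, sigma, rgb, opacity, depth)
    (20, 16, 6.0, (0.0, 0.0, 1.0), 0.9, 1.0),
]
img = render_splats(splats, 32, 32)
```

Because no light-surface intersection is computed, rendering reduces to sorting and blending, which is where the speed advantage over mesh rasterization with full shading comes from.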

However, spatial consistency remains the biggest challenge. Unlike generating a 2D image, where a weirdly shaped paw might go unnoticed, a 3D model must look correct from every angle. Tools like Meshy and Tripo AI address this through multi-view synthesis. They generate a series of images from different perspectives—front, back, side, top—and use a process called "Score Distillation Sampling" to optimize a 3D representation until it matches those views.
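The optimization loop behind these multi-view approaches can be illustrated with a drastically simplified toy: recover a 3D point from its 2D projections in a few orthographic views by gradient descent. Real Score Distillation Sampling replaces the squared-error loss below with gradients from a 2D diffusion model, but the structure is the same: render from several viewpoints, compare, and nudge the 3D representation until all views agree.

```python
import numpy as np

# Orthographic "cameras": each view drops one axis.
# front keeps (x, y), side keeps (z, y), top keeps (x, z).
VIEWS = {
    "front": lambda p: p[[0, 1]],
    "side":  lambda p: p[[2, 1]],
    "top":   lambda p: p[[0, 2]],
}

def fit_point(observations, steps=500, lr=0.1):
    """Find the 3D point whose projections best match the observed 2D points."""
    p = np.zeros(3)
    for _ in range(steps):
        grad = np.zeros(3)
        for name, project in VIEWS.items():
            residual = project(p) - observations[name]
            # Accumulate the gradient of 0.5 * ||projection - target||^2
            if name == "front":
                grad[[0, 1]] += residual
            elif name == "side":
                grad[[2, 1]] += residual
            else:
                grad[[0, 2]] += residual
        p -= lr * grad
    return p

true_point = np.array([1.0, 2.0, 3.0])
obs = {name: project(true_point) for name, project in VIEWS.items()}
recovered = fit_point(obs)  # converges toward [1, 2, 3]
```

The point here is only that a single 3D state is pushed to satisfy every 2D view simultaneously, which is exactly the consistency constraint a flat image generator never has to meet.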

The Rise of Clean Topology and Auto-Rigging
A major historical limitation of AI-generated 3D models was "digital spaghetti"—a messy tangle of thousands of tiny, useless triangles that crashed game engines. The latest tools, however, output "clean topology," meaning the model has an organized structure of polygons that behaves predictably. For example, a table leg is generated as a sturdy cylinder rather than a hollow, jagged shell.
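What "clean topology" means can be made concrete with a sketch: a cylinder wall built as an ordered grid of quads, where every vertex sits in a predictable ring and is shared with its neighbors. The function below is illustrative, not taken from any particular tool; the contrast is with a triangle soup, where vertices are duplicated and faces have no regular structure.

```python
import numpy as np

def clean_cylinder(radius=1.0, height=2.0, segments=16, rings=4):
    """Return (vertices, quads) for a cylinder side wall with grid topology."""
    angles = np.linspace(0, 2 * np.pi, segments, endpoint=False)
    heights = np.linspace(0, height, rings + 1)
    vertices = np.array([
        (radius * np.cos(a), radius * np.sin(a), h)
        for h in heights
        for a in angles
    ])
    quads = []
    for r in range(rings):
        for s in range(segments):
            # Four corners of one quad, wrapping around the ring
            a = r * segments + s
            b = r * segments + (s + 1) % segments
            quads.append((a, b, b + segments, a + segments))
    return vertices, quads

verts, quads = clean_cylinder()
# 5 rings of 16 shared vertices, 64 quads; edge loops follow the rings
```

Regular edge loops like these are what allow a mesh to subdivide, deform, and animate predictably, which is why engines prefer them over an equivalent-looking soup of disconnected triangles.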

Furthermore, the process of animating these models is also being automated. "Rigging"—the creation of a digital skeleton inside a model so it can move—used to take specialized artists days. New AI pipelines can now identify joints and "paint" weight to the mesh automatically, allowing a static 3D statue to become a moving puppet in minutes. This "physics-aware" generation ensures that models are not just visually accurate but structurally sound.
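A minimal sketch of the weight-painting step, under the simplifying assumptions of inverse-distance weights and translation-only bones (real auto-riggers use learned models and full bone transforms, but the output format, a vertices-by-bones weight matrix whose rows sum to 1, is the same idea):

```python
import numpy as np

def auto_weights(vertices, bone_positions, falloff=2.0):
    """Inverse-distance skinning weights; each row sums to 1."""
    # Distance from every vertex to every bone (V x B)
    d = np.linalg.norm(vertices[:, None, :] - bone_positions[None, :, :], axis=2)
    w = 1.0 / (d ** falloff + 1e-8)
    return w / w.sum(axis=1, keepdims=True)

def skin(vertices, weights, bone_offsets):
    """Linear blend skinning, simplified to translation-only bones."""
    # Each vertex moves by the weighted average of its bones' offsets
    return vertices + weights @ bone_offsets

# A straight "arm": three vertices along x, one bone at each end
verts = np.array([[0.0, 0, 0], [1.0, 0, 0], [2.0, 0, 0]])
bones = np.array([[0.0, 0, 0], [2.0, 0, 0]])
W = auto_weights(verts, bones)
# Lift only the far bone: the far vertex follows fully, the middle one halfway
posed = skin(verts, W, np.array([[0.0, 0, 0], [0.0, 1, 0]]))
```

The hard part a human rigger used to do by hand is exactly this weight assignment: deciding how strongly each patch of "skin" follows each bone so joints bend smoothly instead of collapsing.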

Democratization and the "Asset Flip" Fear
This technology is democratizing game development. An individual in a bedroom can now generate hundreds of high-quality props in two weeks, a task that previously took a small team six months. This has led to a flood of indie games on platforms like Steam, raising concerns about homogenization. Because many developers use similar AI training sets, there is a growing fear of a "generic look" where worlds feel technically perfect but lack specific, human intentionality—a digital version of a hotel room that looks nice but feels empty.

The debate over the "soul" of art is intensifying. While AAA studios have used procedural generation for years without backlash, indie developers face a "purity test." The controversy surrounding games like Clair Obscur: Expedition 33, which was disqualified from some awards for using AI assets, highlights this double standard. However, the most successful workflows emerging today are hybrid. Artists use AI to "block out" scenes and generate base shapes, then use traditional tools like Blender to add the specific details—a coffee stain on a desk, a chip in a teacup—that give a world personality.

The Future of the Artist
The consensus on the job market is that while entry-level roles for creating basic assets like crates and barrels are at risk, the role of the artist is shifting rather than disappearing. The "quality floor" has been raised, allowing creators to focus less on technical grunt work and more on narrative, design, and refinement. As the tools handle the heavy lifting of measuring and structuring, human artists are freed to imbue digital worlds with the imperfections and stories that make them feel lived in.


#1855: AI Is Turning Your Photos Into 3D Models

Corn
I was looking at some old screenshots of Tomb Raider from the nineties the other day, and it struck me how much we’ve collectively hallucinated that those graphics were good. Lara Croft’s nose was basically a single sharp triangle. If she turned her head too fast, she could have poked an eye out. It’s wild. Today’s prompt from Daniel is about exactly that evolution, specifically how AI is now taking the wheel to transform 3D modeling and game assets from those blocky polygons into something almost indistinguishable from reality.
Herman
It is a massive shift, Corn. I’m Herman Poppleberry, by the way, and I’ve been diving into the technical side of this all week. We’re moving from a world where every single vertex had to be manually placed by a human artist—literally clicking and dragging points in a virtual space—to a world where you can simply describe an object and have it appear in three dimensions. Today’s episode is actually powered by Google Gemini three Flash, which is fitting since we’re talking about the cutting edge of generative tech.

Dorothy
See, that sounds like magic to someone like me. I always tell you guys I’m technically inept, but I do play the occasional game to relax. I remember playing Pong where the ball was just a square, and the "paddles" were just slightly longer rectangles. Now I see my grandkids playing games where you can see the individual threads on a character's sweater or the way light refracts through a glass of water. It’s beautiful, but hearing that a computer is just dreaming these things up makes me wonder if we’re losing the soul of the art.
Corn
That’s the big question, Dorothy. We’ve gone from "pixel art" to "polygons" and now to "prompting." Daniel’s prompt specifically mentions tools like Meshy and the rise of image-to-three-D generation. It’s not just for big studios anymore; it’s hitting the point where an individual can spin up a whole world from their bedroom. But Herman, before we get into the "soul" of it, can you break down what’s actually happening under the hood when someone uses something like Meshy? Because "text to three-D" sounds a lot more complicated than just making a flat image.
Herman
It is significantly more complex because of the spatial consistency requirement. When you generate a two-D image of a cat with something like Midjourney, the AI only has to worry about that one perspective. If the paws look a bit weird, you might not even notice. But with three-D, if I rotate the model, the cat’s tail needs to be in the same place relative to its ears from every single angle. The way tools like Meshy or Tripo AI handle this is through something called multi-view synthesis. They essentially generate a series of images from different perspectives—front, back, side, top—and then use a process called "Score Distillation Sampling" to optimize a three-D representation until it matches those images from every direction.
Corn
Wait, I’ve heard you mention Gaussian Splatting before. That’s the one that looks like a cloud of colored fuzzy dots that snap into a sharp image, right?
Herman
That’s a good way to visualize it. Traditional modeling uses a mesh—a net of triangles. Splatting uses millions of tiny three-D Gaussians, which are basically little blobs of color and transparency. It’s much faster to render than older methods because the computer doesn't have to calculate the physics of light hitting a solid surface in the same way. But the real breakthrough Daniel pointed out for 2026 is that these tools are now outputting "clean topology." In the past, AI-generated models were a mess of "digital spaghetti"—thousands of tiny, useless triangles that would make a game engine crawl to a halt or crash.
Corn
So it’s the difference between a sculptor carving a statue out of one piece of marble versus someone just gluing a billion grains of sand together into a person-shape?
Herman
Precisely. The "sand" version looks fine until you try to move it. Meshy’s January update actually introduced physics-aware mesh generation. It understands that a table leg needs to be a sturdy cylinder, not a hollow, jagged shell. It understands structural integrity.

Dorothy
So, if I wanted to make a digital version of my favorite antique teapot, I wouldn’t have to learn how to draw every little curve on a computer screen? I could just take a photo of it?
Herman
Not exactly, but close. You would use a process called photogrammetry or the newer AI-driven image-to-three-D. Traditional photogrammetry required you to take maybe a hundred photos from every possible angle, often using a tripod and controlled lighting. It was a chore. Now, with the models we have in March twenty twenty-six, you can often provide just three or four photos, and the AI "hallucinates" the parts you missed based on its understanding of how teapots generally look. If it sees the front of a handle, it can infer what the back of that handle looks like.
Corn
It’s the "hallucination" part that gets people nervous, though. If the AI is filling in the gaps, it’s making creative decisions. Does it choose to add a chip in the porcelain? Does it decide the bottom should be flat or rounded? Dorothy, when you’re playing a game, do you care if the rocks and trees in the background were made by a person or a math equation?

Dorothy
That’s a tough one. If I’m walking through a digital forest and it feels real, maybe I don’t care in the moment. But there’s something about knowing a human artist spent hours carving the bark on a specific "hero tree" that makes the world feel intentional. If everything is just generated by a prompt like "make generic forest," doesn't it all start to feel a bit... I don't know, empty? Like a hotel room that’s decorated nicely but has no personality? You can tell no one actually lives there.
Herman
That’s a very real concern in the industry right now, often called the "asset flip" fear. We’re seeing a flood of indie games on Steam where the environments look amazing because the developers used AI to generate five hundred high-quality props in two weeks—something that would have taken a small team six months or more a few years ago. But because they’re all using similar training sets from tools like Meshy or Rodin, the games are starting to have a "look." It’s a high-quality look, but it’s a homogenized one.
Corn
But isn't that just the nature of tools? When everyone started using the Unreal Engine, people complained that every game looked "gray and oily." Now we just accept it as a baseline.
Herman
It’s like the "Uncanny Valley" but for world-building. Everything is technically perfect, which makes the lack of specific, weird human choices stand out more. A human artist might put a random coffee stain on a desk because they imagine the character who works there is messy. The AI might just make a perfect, clean desk.
Corn
But let’s look at the flip side. Herman, you mentioned the "Expedition thirty-three" incident in the notes Daniel sent over. That was a huge flashpoint for this debate, wasn't it?
Herman
It was a massive controversy for the indie scene. Clair Obscur: Expedition thirty-three is this gorgeous game with an incredible art style, and when the developers admitted they used generative AI for parts of their workflow—specifically for some background assets and textures—there was a huge backlash. They were even disqualified from some awards. The community is currently enforcing a "purity test" on indie devs.

Dorothy
But why is it only the small creators getting in trouble? That doesn't seem fair.
Herman
That is the irony, Dorothy. AAA studios—the giants like Ubisoft or EA—have been using procedural generation and AI-assisted tools for a decade to build their massive open worlds. They use algorithms to place every tree in a forest or to generate the layout of a city. Nobody bats an eye because it's hidden behind a big brand. But because indie games are seen as "artisan," the use of AI is viewed by some as "cheating."
Corn
It feels like a double standard. If a solo dev uses AI to make a rock so they can spend more time on the story and the characters, isn't that a net win for the player? I mean, how many ways can you manually sculpt a rock before it becomes a waste of human potential? Is there really a "soul" in a 3D rock?
Herman
That’s the argument for the "quality floor." AI raises the floor so that a single person can produce something that looks like a big-budget production. But the "quality ceiling" is still held by human artists who can take those AI-generated "blocks" and refine them. The most successful workflows I’m seeing right now are hybrid. An artist uses Meshy to "block out" a scene—getting the scale and the general shapes right in seconds—and then they go in with Blender or ZBrush to add the specific details that give the piece "soul," as Dorothy put it.

Dorothy
I suppose it’s like using a cake mix. You can just bake the mix and it’s fine, but the best bakers use the mix as a base and then add their own fresh eggs, some high-quality chocolate, and their own frosting. It’s still "homemade" in a way, but the heavy lifting of measuring the flour and sugar was done for them.
Corn
I love that analogy, Dorothy. It really captures the efficiency gain. But Herman, let's talk about the technical barrier that’s actually falling. You mentioned "rigging" and "topology" earlier. For a non-technical listener, why is that such a big deal? Because to me, a 3D model is just a shape. Why does it matter how the "wires" are arranged inside it?
Herman
This is where the rubber meets the road for game design. Imagine a 3D model of a character. If you want that character to walk, you have to give it a "skeleton"—this is called rigging. Each "bone" in that skeleton is assigned to certain parts of the mesh. If the mesh—the "skin" of the model—is a disorganized mess of triangles, when the character bends its elbow, the skin will collapse or stretch in weird, jagged ways. It looks like the character’s arm is being put through a paper shredder. Historically, AI could make a "statue," but it couldn't make a "puppet."
Corn
And that changed recently? I assume that’s why we’re seeing more AI characters that actually move now.
Herman
It’s changing fast. The latest iterations of tools like Tripo and Meshy have "auto-rigging" pipelines. They use machine learning to identify where the joints should be—shoulders, elbows, knees—and then they "paint" the weights so the mesh deforms smoothly. This is a process that used to take a specialized technical artist hours or days per character. Think about the labor involved in making sure a cape doesn't clip through a character's legs when they run. Now, you can upload a drawing of a monster, get a 3D mesh in sixty seconds, and have it walking and jumping in a game engine five minutes later.

Dorothy
That is incredible. I remember when making a movie like Toy Story took years and years of computers crunching numbers, and hundreds of people just to make the hair move right. Now you’re saying I could potentially make my own little animated character just by talking to my computer?
Herman
You really could, Dorothy. And that brings up the democratization aspect. It’s not just for games. People are using these tools for 3D printing, for virtual reality social spaces, and even for historical preservation. We’ve talked about AI restoration before in the context of film, but this is taking it to the next dimension.
Corn
Let’s pivot to the "Asset Flip" concern again, because I think it’s important. If everyone has access to these tools, are we going to see a "Death of the Artist" or just a shift in what an artist does? Herman, you’re always reading these industry papers—what’s the consensus on the long-term job market for 3D modelers? Is it all doom and gloom?
Herman
The consensus is that "junior" roles are in trouble. The entry-level work of making crates, barrels, and generic "background filler" is being automated away. If you’re a 3D artist whose only skill is making high-quality props, your value is dropping. However, the demand for "Technical Artists" and "Art Directors" is skyrocketing. Someone needs to know how to prompt the AI, how to fix the errors it inevitably makes—like when it gives a character six fingers—and how to ensure that the five thousand AI assets all fit a cohesive visual style. It’s moving from "crafting" to "curating."
Corn
It’s the difference between being the guy who lays the bricks and being the architect. The architect still needs to understand how bricks work, but they aren't the ones getting their hands dirty every day. But what about the copyright side of this? Daniel mentioned "Ethically Sourced" models. That’s a huge part of the conversation in 2026. If I use an AI to make a character, who owns that character?
Herman
Ethical sourcing is the only way forward for professional studios. You can’t build a hundred-million-dollar game on a foundation of "stolen" data. We’re seeing a split in the market. There are the "wild west" tools that train on everything they can scrape from the internet, which are popular with hobbyists. Then there are tools like Adobe’s Firefly for 3D or specific enterprise versions of Meshy that only train on licensed datasets or public domain work. For a professional, the "legal safety" of the asset is just as important as the poly count. If a studio uses an AI that was trained on a specific artist's work without permission, they could face a massive lawsuit down the line.

Dorothy
That makes me feel a bit better. If the artists are being compensated for the data being used to train the computer, it feels less like stealing and more like... well, like a very advanced library. But I still worry about the "sameness." If I’m playing a game, I want to be surprised. I want to see something I’ve never seen before. Can an AI truly be "weird" or "surprising," or is it always just an average of everything it’s already seen?
Corn
That is the million-dollar question. AI is, by definition, a "regression to the mean." It’s looking for the most likely pixel or the most likely vertex based on its training. To get something truly "weird," you usually need a human to push the parameters or to combine things in a way that the AI wouldn't think of on its own. Like, "make me a toaster that is also a sentient jellyfish." The AI can do that, but a human had to have the weird idea first.
Herman
And that’s where the "soul" stays in the process, Dorothy. The "weirdness" comes from the prompt and the intent. One thing that really blew my mind recently was seeing how these tools are being used for "Digital Twins." People are taking these AI photogrammetry tools and creating perfect 3D replicas of their own homes or neighborhoods to use in VR. It’s a level of personal creativity that was impossible for a non-techie five years ago.
Corn
Wait, how does that work? Could I take my old family photo albums and recreate my grandmother's living room?
Herman
We're getting there. If you have enough photos of a space, the AI can reconstruct the geometry. Imagine being able to walk through your childhood home in VR, perfectly recreated from just a few old polaroids and some AI "hallucination" to fill in the corners the camera didn't catch. That’s a powerful application that has nothing to do with "asset flipping" or "cheapening art." It’s about memory and connection.
Corn
It’s also about iteration speed. In the professional world, "blocking" is the stage where you figure out if a game level is fun to play. You use simple gray boxes to represent buildings and obstacles. Now, you can use "AI blocking." Instead of gray boxes, you have rough, textured versions of the actual intended buildings. You can feel the "vibe" of the level on day one instead of month six. That leads to better games because you have more time to experiment with the actual gameplay.

Dorothy
So it’s like a sketch? An artist wouldn't just paint a masterpiece; they’d do a bunch of quick pencil drawings first to see where the light falls. This AI is like a pencil that can draw in three dimensions?
Herman
It’s a high-fidelity sketching tool. And as we look toward the rest of twenty twenty-six, the quality is only going up. We’re expecting another two-times jump in texture resolution and mesh complexity by the end of the year. The gap between "AI generated" and "Hand sculpted" is closing so fast that soon, you won't be able to tell the difference just by looking at the final product. You’ll only know by looking at the "credits" of the game.
Corn
Which brings us back to the "purity test." Will players eventually stop caring? I mean, we used to have "hand-painted" backgrounds in movies, and now it’s all CGI. People complained at first, calling it "fake," but now we just care if the movie is good.

Dorothy
I think we’ll always care about the story. If the AI helps a person tell a better story, then I’m all for it. But if the AI is used to just fill space because the creators were lazy... well, players can sense that. We’re smarter than people give us credit for. We know when something was made with love and when it was made with a button press. It's the difference between a handwritten letter and a form letter.
Corn
I think that’s the perfect takeaway. AI is a tool, not a replacement for intent. For our listeners who are curious about this, where should they start? Herman, what’s the "gateway drug" for 3D AI?
Herman
If you’re a hobbyist, I’d say check out the free tiers for Meshy or Luma AI. You can literally just upload a photo of your dog or a cool rock you found and see it turn into a 3D model in real time. It’s a "magic moment" that really helps you understand the scale of what we’re talking about. For the professionals listening, the practical takeaway is to start integrating these into your "blocking" and "prototyping" workflows. Don't try to replace your final assets yet, but use it to speed up your ideation. The iteration speed is where the real competitive advantage lies.
Corn
And keep an eye on those "ethically trained" models. If you’re building something you want to sell, you don't want a "copyright time bomb" ticking in your asset folder. You don't want to find out three years later that your main character's boots were trained on a stolen design.

Dorothy
And don't forget to play the games! At the end of the day, all this fancy technology is just there to help us have a little fun and escape into another world for a bit. Whether it’s a triangle nose or a photorealistic face, it’s the joy of the game that matters.
Corn
Well said, Dorothy. From "Pong" to "Prompting," it’s been a wild ride. This has been a fascinating look at the "new dimension" of generative AI. I’m definitely going to go play around with some text-to-three-D tools this afternoon and see if I can make a three-D version of my favorite coffee mug.
Herman
Just don't try to drink out of the digital one, Corn. It’s not quite that realistic yet. You'll just end up with a wet chin and a broken monitor.
Corn
Give it until twenty twenty-seven, Herman. Give it until then. Maybe we'll have haptic feedback that lets me feel the ceramic.
Herman
We should probably wrap this up. Huge thanks to our producer, Hilbert Flumingtop, for keeping the gears turning behind the scenes. And a big thanks to Modal for providing the GPU credits that power this show—without those, we’d be standing in the digital dark.
Corn
If you enjoyed this deep dive into the world of three-D assets, do us a favor and leave a review on your favorite podcast app. It really helps other people find the show and helps us keep making these. Tell us what you want to hear about next—maybe AI in music or virtual fashion?
Herman
This has been My Weird Prompts. We’re on Spotify, Apple Podcasts, and all the usual spots. You can also find us at my weird prompts dot com for the full archive and RSS feed.
Corn
Thanks for listening, and we’ll catch you in the next dimension.
Herman
Goodbye, everyone.
Corn
See ya.

This episode was generated with AI assistance. Hosts Herman and Corn are AI personalities.