#3994: Panorama vs Photosphere: The Secret Metadata That Makes Photos Spin

A single line of metadata separates a static wide photo from a rotatable 360° view. Here's how it works.

Episode Details

Episode ID: MWP-4173
Published: Jun 30
Duration: 38:02
Pipeline: V5
TTS Engine: chatterbox-regular
Script Writing Agent: deepseek-v4-pro
Topics: metadata-analysis image-generation data-integrity

AI-Generated Content: This podcast is created using AI personas. Please verify any important information independently.

A standard panorama and a photosphere look nearly identical in your thumbnail gallery. But inside the file, they're fundamentally different formats — and a few lines of embedded metadata make all the difference.

A panorama stitches a single row of images into a wide, flat JPEG. No special tags, no spherical instructions. Your gallery app treats it like any other wide photo: static, pinch-to-zoom, no rotation.

A photosphere (also called a spherical panorama or 360 photo) captures multiple rows of overlapping shots, typically nine to twelve frames across three rows. The phone's gyroscope guides you through a grid, ensuring enough overlap for seam-free stitching. The result is an equirectangular projection: a sphere flattened onto a rectangle, just like a world map. The critical difference is the XMP metadata embedded in the JPEG, which tells compatible viewers to wrap the image onto the inside of a virtual sphere.

The key tag is GPano:ProjectionType=equirectangular. Additional tags define the full canvas dimensions and the cropped area that actually contains image data — which is why partial photospheres (less than a full 360 degrees) work so naturally. The viewer renders only the populated region and creates soft boundaries where the image ends.
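As a rough illustration of what that metadata looks like, the sketch below builds a GPano XMP packet as a string and reads the projection tag back out. The helper name build_gpano_xmp and the 8000 by 4000 canvas are assumptions for illustration only; the tag names are the ones mentioned in this episode, and embedding the packet into a real JPEG is left out.

```python
import xml.etree.ElementTree as ET

GPANO_NS = "http://ns.google.com/photos/1.0/panorama/"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def build_gpano_xmp(full_w, full_h, crop_w, crop_h, crop_left, crop_top):
    """Return an XMP packet saying where the image sits on the full canvas.

    Hypothetical helper for illustration; it only produces the text.
    """
    attrs = {
        "ProjectionType": "equirectangular",
        "FullPanoWidthPixels": full_w,
        "FullPanoHeightPixels": full_h,
        "CroppedAreaImageWidthPixels": crop_w,
        "CroppedAreaImageHeightPixels": crop_h,
        "CroppedAreaLeftPixels": crop_left,
        "CroppedAreaTopPixels": crop_top,
    }
    attr_text = " ".join(f'GPano:{key}="{value}"' for key, value in attrs.items())
    return (
        '<x:xmpmeta xmlns:x="adobe:ns:meta/">'
        f'<rdf:RDF xmlns:rdf="{RDF_NS}">'
        f'<rdf:Description rdf:about="" xmlns:GPano="{GPANO_NS}" {attr_text}/>'
        "</rdf:RDF></x:xmpmeta>"
    )


# A full sphere: the cropped area is the whole canvas.
packet = build_gpano_xmp(8000, 4000, 8000, 4000, 0, 0)
description = ET.fromstring(packet).find(f".//{{{RDF_NS}}}Description")
projection = description.get(f"{{{GPANO_NS}}}ProjectionType")
print(projection)  # equirectangular
```

A tool such as exiftool is the usual way to write a packet like this into a file; the sketch only shows the shape of the data.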

Most phone manufacturers skip multi-row capture because it demands more processing, storage, and user patience. A single-row panorama finishes in under a second; a photosphere takes ten to fifteen seconds of guided capture plus processing time. The resulting file is also larger — 15-25 MB versus 5-10 MB for a standard panorama — partly because equirectangular projections stretch detail at the poles, requiring higher base resolution.

Platforms like Google Photos detect the GPano metadata and add a gyroscope-driven spherical viewer. Third-party tools like Kuula or Pannellum do the same in browsers using WebGL. Understanding this metadata layer transforms the fluke of a spinning photo into a repeatable, controllable outcome.

Daniel sent us this one, and I think just about everyone who's owned an Android phone has tripped over this exact thing without knowing what to call it. You swipe through the panorama mode, you get a long flat strip, perfectly fine. Then one day you use a different app, or you borrow someone's Pixel, and suddenly you're tilting your phone around and the image moves with you, like you're peering through a window into the place you were standing. Daniel's been getting these by fluke. His OnePlus native camera won't do it, but a dedicated app will. So he's asking: what are these actually called, why does one app produce them and another doesn't, how do you stitch something even wider manually, what's the viewer technology that makes the rotation work, and where on the internet can you share these so they actually look good?

That fluke moment is the exact right entry point, because it reveals the whole invisible architecture. You think you're just taking a wide photo, but something fundamentally different has happened under the hood, and the phone knows it. The image file itself carries instructions.

Instructions for what? The gallery app?

The file is saying, "I am not a rectangle. Wrap me around a sphere." And if the app knows how to read that instruction, you get the rotatable view. If it doesn't, you get what looks like a funhouse mirror stretched across your screen. The difference between those two outcomes is about four lines of text buried in the file's metadata.

It's not a different kind of image file. It's the same JPEG with a secret handshake attached.

That's the perfect way to put it. The secret handshake is a set of XMP tags, specifically in what's called the Google Photo Sphere namespace. The image itself is an equirectangular projection — taking a sphere and flattening it onto a rectangle, the same way a map of the earth gets stretched at the poles. The metadata tag that matters most is GPano colon ProjectionType equals equirectangular. That one line is the difference between a panorama and a photosphere.

That's the name Daniel's looking for.

That's the Google branding, and it's become the generic term even though technically it's a spherical panorama or a three-sixty photo. The key thing is that a photosphere doesn't have to be a full sphere. Daniel mentioned he's getting partial rotations, and that's completely normal. The metadata can describe a partial sphere, a cylinder, even just a wide arc. The viewer reads the crop boundaries and the field of view from the XMP tags and maps only the part of the sphere that actually has image data.

When Daniel's OnePlus native panorama mode sweeps across a scene, what's it actually producing?

A standard wide JPEG with no spherical metadata at all. It's stitching a single row of images into a rectilinear or cylindrical projection and saving it flat. The gallery app sees it and thinks, "This is a very wide photo," and displays it as a static strip you can pinch to zoom. There's no instruction to map it onto a three-dimensional surface, so the rotate option never appears.

Which makes the OnePlus camera sound broken, but it's not. It's just optimizing for something else.

A single-row panorama sweep is fast. You pan left to right, the phone stitches in near real time, and you get a result in under a second. Multi-row capture, which is what a photosphere requires, is a completely different beast. You're guiding the user through a grid of overlapping shots, maybe nine to twelve images across three rows, and the stitching algorithm has to match features across both horizontal and vertical overlaps. That takes more compute, more storage, more battery, and more patience. Most manufacturers just don't want to put that in the default camera app.

The dedicated app Daniel's using is doing the multi-row thing.

And if I had to guess which app, it's probably Google Camera's Photo Sphere mode, or a port of it. The way it works is clever. You point the phone at a starting position, and a dot appears on screen. You move the phone to center the dot over a target circle, and it captures a frame. Then the dot moves to the next position, and you follow it, building up a grid. The app uses the phone's gyroscope to track orientation and ensure sufficient overlap between frames. Once you've covered the grid, it stitches everything into an equirectangular projection and embeds the XMP metadata. The whole process takes maybe ten to fifteen seconds of capture and another ten to fifteen of processing.

The resulting file is bigger.

A standard panorama might run five to ten megabytes. A photosphere of the same scene can easily hit fifteen to twenty-five, partly because you're capturing more total pixel data across the grid, and partly because the equirectangular projection is inherently less efficient at storing detail. The top and bottom of the sphere get stretched way out, so you need a higher base resolution to maintain quality in the center. A good photosphere is often eight thousand by four thousand pixels or larger.
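To put a number on that stretching: every row of an equirectangular image has the same pixel width, but the circle of latitude it represents shrinks with the cosine of the latitude. The sketch below computes the resulting horizontal stretch factor. The function name is made up for illustration, and the model ignores JPEG compression entirely.

```python
import math


def horizontal_stretch(latitude_deg):
    """How many times wider a row is drawn than its true size on the sphere."""
    return 1.0 / math.cos(math.radians(latitude_deg))


for latitude in (0, 30, 60, 80):
    print(latitude, round(horizontal_stretch(latitude), 2))

# At 60 degrees above the horizon, content is drawn twice as wide as at the horizon.
stretch_at_60 = horizontal_stretch(60)
```

The factor grows without bound toward the poles, which is why the very top and bottom rows of a photosphere look smeared when viewed flat.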

That stretching at the poles is the same reason Greenland looks the size of Africa on a Mercator map.

It's closely related. Mercator and equirectangular are different projections, but both stretch the regions near the poles. And it's why, if you open a photosphere in a regular image viewer that doesn't understand the metadata, it looks bizarre. The top third is smeared sky, the bottom third is stretched ground, and the middle looks sort of normal but weirdly curved. You'd think the file was corrupted.

Which is probably what happens when Daniel's OnePlus panorama ends up somewhere that does expect a photosphere. The handshake fails in the other direction.

If you take a standard panorama and manually inject photosphere metadata, the viewer will try to wrap it onto a sphere, and it'll look like you're standing inside a funhouse. The geometry has to match the metadata. You can't just lie to the viewer and expect it to work.

The viewer is the other half of this equation. Daniel asked what it's called.

Spherical viewer, or three-sixty-degree viewer. On Android, the most common one is built right into Google Photos. When Google Photos detects the GPano metadata in a JPEG, it adds a small icon, usually a circle with arrows or a little globe, and tapping it switches to a gyroscope-driven mode where you can tilt and rotate the phone to look around. It's using the same sensor fusion that powers AR features. Third-party viewers like Kuula or the open-source Pannellum library do the same thing in a web browser, rendering the equirectangular image onto the inside of a WebGL sphere and letting you drag or tilt to look around.

This is where the timing of Daniel's question lands well. Android fifteen and Google's spatial media push.

It's a quiet convergence. Google has been adding spatial photo and video support to Pixel devices, and the underlying CameraX API keeps getting better hooks for multi-frame capture and gyroscope-assisted alignment. The tools that used to require a dedicated app are slowly trickling into the default experience. But the terminology gap is still huge. Most people don't know the word photosphere, don't know what equirectangular means, and don't know that the magic is in the metadata. They just know that sometimes the photo spins and sometimes it doesn't.

Which is exactly the fluke Daniel described. He's been getting these images by accident because he stumbled into an app that does the multi-row capture and metadata injection, while his default camera does the fast single-row thing and calls it a day.

That distinction, between a panorama and a photosphere, is the entire episode in two words. Everything else we're going to dig into — the stitching algorithms, the XMP namespace, the manual desktop workflow, the sharing platforms — all of it flows from understanding that those are two fundamentally different formats that happen to look similar when you glance at a thumbnail.

The thumbnail is the great deceiver.

A photosphere thumbnail just looks like a slightly curved wide shot. You don't know what you've got until you open it and try to move.

Let's lay out the vocabulary before we go deeper. A standard panorama is a wide image stitched from a single row of photos, saved as a flat JPEG with no spherical metadata. A photosphere, also called a spherical panorama or a three-sixty photo, is an equirectangular projection stitched from a multi-row grid, with XMP metadata that tells the viewer to wrap it onto a sphere. The viewer is a spherical viewer, most commonly Google Photos on Android, and it uses the phone's gyroscope to let you rotate the view. And the reason Daniel's OnePlus doesn't do this natively is that its panorama mode is optimized for speed and simplicity, not for spherical capture.

That's the map. And I want to underline one thing about the metadata, because it's the part that surprises people the most. The XMP tags aren't just a binary flag that says "sphere on" or "sphere off." They describe the exact geometry of the capture. There's GPano colon FullPanoWidthPixels and FullPanoHeightPixels, which define the full equirectangular canvas. Then there's CroppedAreaImageWidthPixels and CroppedAreaImageHeightPixels, and CroppedAreaLeftPixels and CroppedAreaTopPixels, which define the rectangle within that canvas that actually contains image data. So if you captured a partial sphere, maybe just a hundred and eighty degrees horizontally and ninety degrees vertically, the metadata tells the viewer exactly where the image sits on the virtual sphere and leaves the rest blank. The viewer then only renders the populated region and lets you rotate within those bounds.
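The arithmetic a viewer might do with those crop tags can be sketched as follows. This is a minimal illustration, assuming the full canvas spans 360 degrees of yaw and 180 degrees of pitch with the horizon at mid-height; the function name and the 8000 by 4000 canvas are hypothetical.

```python
def angular_coverage(full_w, full_h, crop_w, crop_h, crop_left, crop_top):
    """Convert GPano crop values into yaw and pitch ranges in degrees.

    Yaw runs from -180 to 180 across the canvas, pitch from 90 (top) to -90.
    """
    yaw_min = crop_left * 360.0 / full_w - 180.0
    yaw_max = (crop_left + crop_w) * 360.0 / full_w - 180.0
    pitch_max = 90.0 - crop_top * 180.0 / full_h
    pitch_min = 90.0 - (crop_top + crop_h) * 180.0 / full_h
    return (yaw_min, yaw_max), (pitch_min, pitch_max)


# The partial sphere from the conversation: 180 degrees wide, 90 degrees tall,
# centered on an 8000 x 4000 canvas.
yaw_range, pitch_range = angular_coverage(8000, 4000, 4000, 2000, 2000, 1000)
print(yaw_range)    # (-90.0, 90.0)
print(pitch_range)  # (-45.0, 45.0)
```

A viewer can clamp rotation to those two ranges, which is the invisible fence around the captured area.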

It's not just "this is a sphere." It's "this is a sphere, here's where the image starts and stops, don't let the user spin into empty blackness."

And that's why a partial photosphere feels so natural. You can rotate freely within the captured area, but you hit a soft boundary at the edges. The metadata is drawing an invisible fence around your photo.

Which means if you wanted to fake a full sphere from a partial capture, you'd need to edit those crop boundaries in the metadata and fill the empty canvas with something, which is a whole other art form.

It's doable. You can take a partial photosphere, expand the canvas in Photoshop or GIMP, use content-aware fill or manual painting to extend the sky and ground, update the crop metadata to match the new full canvas, and suddenly you've got a complete three-sixty sphere. The results range from convincing to nightmarish depending on the complexity of the scene and your patience.

Nightmarish photospheres sounds like a genre I'd like to explore.

There's probably a subreddit for it. But the point for Daniel is that the metadata is doing heavy lifting at every stage. It's not a footnote. It's the mechanism that makes the whole thing work, and understanding it is the difference between getting the fluke and controlling the outcome.

The fluke is what brought him here. He's been accidentally making photospheres with a third-party app, wondering why his native camera won't cooperate, and now he knows. The native camera is making panoramas. The third-party app is making photospheres. Same phone, same sensor, completely different output because of what gets written into the file after the shutter closes.

Or rather, after the shutters close. A photosphere is multiple exposures, each one a separate shutter actuation, and the real work happens in the stitching. Which is where we should go next.

Yeah, let's pop the hood on that stitching process. How does the phone take nine or twelve individual photos and turn them into one seamless sphere without visible seams or ghosting?

The short answer is feature matching, and the algorithm family that dominates this space is SIFT, Scale-Invariant Feature Transform, or its faster cousin ORB. Both scan each image for distinctive points — corners, edges, texture patterns — and describe those points in a way that's robust to changes in scale, rotation, and lighting. Then they compare the feature sets between overlapping images and find correspondences, essentially saying "this cluster of pixels in image A is the same physical feature as this cluster in image B." Once you have enough correspondences, you can compute a geometric transform that warps image B to align with image A.
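The correspondence step can be sketched in a few lines. ORB produces binary descriptors that are compared by Hamming distance, and a common filter (Lowe's ratio test, introduced with SIFT) keeps a match only when the best candidate is clearly better than the runner-up. The 8-bit descriptors below are toy values invented for illustration; real ORB descriptors are 256 bits, and nothing here is taken from any particular camera app.

```python
def hamming(a, b):
    """Number of differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def match_descriptors(desc_a, desc_b, ratio=0.75):
    """Return (index_a, index_b, distance) for matches that pass the ratio test."""
    matches = []
    for i, da in enumerate(desc_a):
        ranked = sorted((hamming(da, db), j) for j, db in enumerate(desc_b))
        if len(ranked) < 2:
            continue  # the ratio test needs a runner-up to compare against
        (best, j), (second, _) = ranked[0], ranked[1]
        if best < ratio * second:
            matches.append((i, j, best))
    return matches


# Toy descriptors for two overlapping frames.
frame_a = [0b10110010, 0b01001101]
frame_b = [0b01001111, 0b10110011, 0b11110000]
matches = match_descriptors(frame_a, frame_b)
print(matches)  # [(0, 1, 1), (1, 0, 1)]
```

In a real pipeline, the matched point pairs would then feed a robust estimator that computes the warp between the two frames.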

This is happening for every pair of overlapping images in the grid.

Every pair, and the grid makes it harder than a single-row panorama because each image overlaps with multiple neighbors. In a three-by-three grid, the center image overlaps with eight others. The algorithm has to solve for all of those alignments simultaneously, which is called a bundle adjustment. It's optimizing the position and orientation of every image in three-dimensional space to minimize the total alignment error across all the overlaps. That's computationally expensive, which is why photosphere stitching takes longer than panorama stitching.
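A heavily simplified, one-dimensional toy of that "solve everything at once" idea: each frame gets a single yaw angle, pairwise measurements say how far apart two frames should be, and a least-squares solve spreads any disagreement across all of them. Real bundle adjustment optimizes full three-dimensional rotations and often lens parameters as well, so treat this only as an illustration of the principle. The function name and the measurements are invented.

```python
def align_yaws(n_frames, measurements, sweeps=100):
    """Least-squares yaw per frame, with frame 0 fixed at 0 degrees.

    measurements: list of (i, j, delta) meaning yaw[j] - yaw[i] is about delta.
    Uses coordinate descent: each frame moves to the average of the positions
    its neighbours imply for it, repeated until the values settle.
    """
    yaw = [0.0] * n_frames
    for _ in range(sweeps):
        for k in range(1, n_frames):
            implied = []
            for i, j, delta in measurements:
                if j == k:
                    implied.append(yaw[i] + delta)
                elif i == k:
                    implied.append(yaw[j] - delta)
            if implied:
                yaw[k] = sum(implied) / len(implied)
    return yaw


# Three frames whose pairwise estimates disagree by 3 degrees in total:
# 0->1 says 30, 1->2 says 30, but 0->2 says 63 instead of 60.
yaws = align_yaws(3, [(0, 1, 30.0), (1, 2, 30.0), (0, 2, 63.0)])
print([round(y, 3) for y in yaws])  # [0.0, 31.0, 62.0]
```

Aligning the frames one pair at a time would dump the whole 3 degree error on the last frame; solving jointly leaves each measurement off by only 1 degree.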

This all runs on the phone's processor. Not in the cloud.

On-device, which is impressive. A modern phone chip has dedicated image signal processors and neural engines that can accelerate feature matching, but it's still a heavy lift. Google Camera's implementation is particularly well-optimized. It uses the gyroscope data to get a rough initial estimate of where each image sits relative to the others, which dramatically reduces the search space for feature matching. Instead of comparing every image to every other image blindly, it knows roughly which images overlap and by how much, so it only has to refine the alignment rather than discover it from scratch.
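The pruning idea can be illustrated with a short sketch: given a rough yaw and pitch per frame from the gyroscope, only pairs whose viewing directions are close enough to share image content are kept for feature matching. The 40 degree grid spacing and the 60 degree cutoff are assumptions chosen for the example, not values from any real camera app.

```python
import math
from itertools import combinations


def view_direction(yaw_deg, pitch_deg):
    """Unit vector the camera is pointing along."""
    yaw, pitch = math.radians(yaw_deg), math.radians(pitch_deg)
    return (
        math.cos(pitch) * math.sin(yaw),
        math.sin(pitch),
        math.cos(pitch) * math.cos(yaw),
    )


def angle_between(a, b):
    """Angle in degrees between two unit vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot))))


def candidate_pairs(orientations, max_angle_deg=60.0):
    """Frame index pairs whose views are close enough to plausibly overlap."""
    dirs = [view_direction(yaw, pitch) for yaw, pitch in orientations]
    return [
        (i, j)
        for i, j in combinations(range(len(dirs)), 2)
        if angle_between(dirs[i], dirs[j]) < max_angle_deg
    ]


# A 3 x 3 capture grid; index 4 is the center frame.
grid = [(yaw, pitch) for pitch in (40, 0, -40) for yaw in (-40, 0, 40)]
pairs = candidate_pairs(grid)
center_neighbors = [pair for pair in pairs if 4 in pair]
print(len(center_neighbors))  # 8
```

Opposite corners of the grid never make the list, so the matcher skips those comparisons instead of testing all 36 possible pairs.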

That gyroscope trick is clever. It turns a computer vision problem into a sensor fusion problem and makes the whole thing faster.

It's the same principle that makes AR work. The phone always knows its orientation in space, and that prior knowledge prunes the computation. Occipital's old three-sixty Panorama app was built entirely on that idea, using the gyroscope for alignment with almost no feature matching at all, which was fast but less accurate for scenes with close foreground objects where parallax becomes an issue.

Parallax being the thing where objects at different distances shift relative to each other when you move the camera.

Right, and it's the Achilles' heel of any stitching algorithm. If you rotate the camera around the lens's optical center, parallax is minimal and everything aligns beautifully. But if you're holding the phone at arm's length and pivoting around your body, the lens is moving through space, and nearby objects shift against the background. The stitching algorithm has to warp and blend those misalignments, and sometimes it fails. That's when you get ghosting, double images, or those weird stretched artifacts where a person's face gets smeared across two frames.

The photosphere fail. I've seen those. Someone's arm becomes a translucent fan.

And it's worse with multi-row capture because you're moving the phone in two axes. Single-row panoramas mostly deal with horizontal parallax. Photospheres have to handle vertical parallax too, which is why the capture process guides you to rotate the phone around a fixed point rather than waving it around. The dot on the screen is trying to keep your pivot consistent.

The capture UI is doing half the work of the stitching algorithm just by constraining how you move.

The interface is part of the engineering. Google Camera's Photo Sphere mode doesn't just capture images. It's a guidance system that enforces the geometric constraints the stitcher needs to produce a clean result. The dot, the target circle, the grid pattern — all of it is designed to produce a set of images with predictable overlap and minimal parallax. It's a beautiful piece of design that most users experience as just "follow the dot," without realizing they're participating in a fairly sophisticated computer vision pipeline.

That pipeline is completely absent from the OnePlus native camera, which is optimized for the exact opposite thing. Speed over geometry.

OnePlus made a choice. Their panorama mode is about getting a result in under a second with minimal user effort. It's a single sweep, real-time stitching, no gyroscope guidance beyond keeping the arrow on the line. The output is a flat JPEG with no spherical metadata because the capture geometry doesn't support spherical mapping. It's not a bug. It's a product decision.

A product decision that leaves Daniel with two options. Use an app that does the full photosphere pipeline, or build one himself with manual stitching on a desktop.

Both are completely viable paths. You can get a great result straight from the phone with the right app, or you can go full control-freak with a DSLR, a tripod, and an open-source stitcher, and the output format is exactly the same. The viewer doesn't care how the equirectangular JPEG was made, as long as the metadata is correct.

Which brings us to the question of what you do with these things once you've made them. Daniel asked about sharing platforms, and that's a whole landscape of its own.

It really is, and it's changed a lot in the last couple of years. The viewer technology is the same everywhere — a WebGL sphere with the equirectangular image mapped onto it — but the platform has to explicitly choose to render it that way when it detects the metadata.

Even if the image file contains the handshake, the platform has to be looking for it.

Has to be looking for it, and has to care. Twitter, for example, does not. You upload a photosphere to Twitter, it strips the metadata or ignores it, and you get a distorted rectangle. Facebook, on the other hand, has supported three-sixty photos since twenty sixteen, and as of twenty twenty-five it still does. It detects the ProjectionType equals equirectangular tag and automatically enables the interactive viewer. LinkedIn added support in twenty twenty-five as well, which surprised a lot of people. It's not a platform you associate with immersive photography, but it works.

LinkedIn being quietly competent at something is a recurring theme.

It's the dark horse of social media. But the platform I'd actually recommend for quality is Kuula. It's a dedicated three-sixty hosting service, built specifically for spherical photos and virtual tours. It doesn't compress the image as aggressively as Facebook, it gives you an embed code for your own website, and it supports features like custom hotspots and VR headset viewing. For someone who cares about presentation, it's the best option short of self-hosting.

Self-hosting means using something like Pannellum.

Pannellum is the go-to. It's an open-source JavaScript library, absurdly lightweight, and it handles equirectangular images natively. You drop the library into a webpage, point it at your JPEG, and it renders a fully interactive spherical viewer with mouse drag, touch, and gyroscope support. It reads the XMP metadata automatically, so if your file is tagged correctly, it just works. Three.js can do it too with a sphere geometry and a texture map, but Pannellum is purpose-built and much simpler to set up.
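For self-hosting, a viewer configuration can be generated from a script. The sketch below writes a JSON config of the kind Pannellum's viewer accepts. The keys used here (type, panorama, autoLoad, haov, vaov, vOffset) are recalled from Pannellum's documented options and should be checked against its current documentation; the image path is a placeholder.

```python
import json


def pannellum_config(image_url, haov=360.0, vaov=180.0, v_offset=0.0):
    """Return a JSON config string for an equirectangular image.

    haov and vaov are the horizontal and vertical angles the image covers;
    they are only written out when the image is a partial sphere.
    """
    config = {
        "type": "equirectangular",
        "panorama": image_url,  # placeholder path, replace with a real file
        "autoLoad": True,
    }
    if haov < 360.0 or vaov < 180.0:
        config.update({"haov": haov, "vaov": vaov, "vOffset": v_offset})
    return json.dumps(config, indent=2)


full_sphere = pannellum_config("photosphere.jpg")
partial_sphere = pannellum_config("partial.jpg", haov=180.0, vaov=90.0)
print(partial_sphere)
```

If the JPEG already carries correct GPano tags, the explicit angles should be unnecessary, since the viewer can read the coverage from the file.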

The sharing stack, from easiest to most control, is something like Facebook or LinkedIn for quick social posting, Kuula for quality hosting with embeds, and Pannellum on your own site if you want full control over presentation and compression.

That's the stack. And I'd add Google Maps via the Street View app as a special case. If you capture a full three-sixty photosphere and publish it through Street View, it becomes part of Google's public map data. Anyone browsing Street View in that location can stumble onto your photo. It's not a sharing platform in the social media sense, but it's a fascinating way to contribute to the collective geographic record.

The idea that your photosphere of a park bench becomes part of Google's canonical map of the world is both impressive and slightly unsettling.

It's the Wikipedia model applied to physical space. Anyone can contribute, and the aggregate becomes more useful than any single capture. But you do lose control over how it's presented and used.

Which loops back to Daniel's original question about control. He's been getting photospheres by fluke. The whole arc of this conversation is about moving from fluke to intention, understanding what's happening under the hood so he can decide exactly how wide, how tall, how high-resolution, and where it ends up.

The good news is that the barrier to intention is low. The terminology is the hardest part, and we've covered that. Photosphere, equirectangular projection, XMP metadata, spherical viewer. Once you have those four concepts, the rest is just choosing your tools and your platform.

The tools range from "follow the dot on your phone" to "stitch a grid of raw files on a desktop," and the platforms range from LinkedIn to your own website. The common thread is that four-line handshake in the metadata.

Four lines of XML that turn a flat image into a window. It's one of those quiet pieces of engineering that millions of people interact with every day without ever knowing it exists.

Which is exactly the kind of thing this show exists to dig into.

Let's get into the mechanics. How does the stitching algorithm actually work when you're building one of these from scratch?

Let's start with the name, because that's the thing Daniel tripped over first. The rotatable images are called photospheres. That's Google's term, and it's stuck as the generic label even though technically they're spherical panoramas or three-sixty photos. The format underlying them is equirectangular projection, which is the same way you'd flatten a globe onto a rectangular map. The poles get stretched, the equator stays natural, and the whole thing looks wrong until the right software wraps it back onto a sphere.

A standard panorama and a photosphere might both be JPEG files, but they're fundamentally different shapes underneath.

Completely different geometries. A standard panorama is usually a cylindrical or rectilinear projection stitched from a single horizontal sweep. It's wide, but it's flat. You can pan left and right by scrolling, but you can't tilt up into the sky or down at your feet. A photosphere is a spherical projection, which means it captures a full or partial sphere of image data around the camera position. The viewer maps that data onto the inside of a virtual sphere, and you can rotate freely in any direction.

The thing that tells the viewer which geometry to use is the metadata.

The metadata is the whole game. A photosphere embeds XMP tags in a namespace called GPano, short for Google Panorama. The critical tag is GPano colon ProjectionType equals equirectangular. That one line tells any compatible viewer, "I am not a flat image, wrap me around a sphere." Without it, the viewer treats the file as a regular JPEG, and you get that bizarre stretched-rectangle look. With it, the viewer knows to render the image onto spherical geometry and let you look around.
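The detection step can be approximated in a few lines: walk the JPEG's header segments, find the APP1 segment that carries XMP, and look for the projection tag. This is a minimal sketch under simplifying assumptions (standard XMP only, no extended XMP, a simple text search rather than a full XML parse), and the fake_jpeg helper builds a stand-in byte string for the demo, not a decodable image.

```python
import re
import struct

XMP_HEADER = b"http://ns.adobe.com/xap/1.0/\x00"
PROJECTION_RE = re.compile(r'GPano:ProjectionType\s*(?:=\s*["\']|>)\s*equirectangular')


def extract_xmp(jpeg_bytes):
    """Return the XMP packet text from a JPEG's APP1 segment, or None."""
    if jpeg_bytes[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing start-of-image marker")
    pos = 2
    while pos + 4 <= len(jpeg_bytes) and jpeg_bytes[pos] == 0xFF:
        marker = jpeg_bytes[pos + 1]
        if marker in (0xD9, 0xDA):  # end of image, or start of pixel data
            break
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        payload = jpeg_bytes[pos + 4:pos + 2 + length]  # length counts its own 2 bytes
        if marker == 0xE1 and payload.startswith(XMP_HEADER):
            return payload[len(XMP_HEADER):].decode("utf-8", errors="replace")
        pos += 2 + length
    return None


def is_photosphere(jpeg_bytes):
    """True when the file carries the equirectangular projection tag."""
    xmp = extract_xmp(jpeg_bytes)
    return bool(xmp and PROJECTION_RE.search(xmp))


def fake_jpeg(xmp_text=None):
    """Stand-in file for the demo: markers only, no image data."""
    data = b"\xff\xd8"
    if xmp_text is not None:
        payload = XMP_HEADER + xmp_text.encode("utf-8")
        data += b"\xff\xe1" + struct.pack(">H", len(payload) + 2) + payload
    return data + b"\xff\xd9"


tagged = fake_jpeg('<rdf:Description GPano:ProjectionType="equirectangular"/>')
untagged = fake_jpeg()
print(is_photosphere(tagged), is_photosphere(untagged))  # True False
```

On a real photo, the same functions would be called on the bytes read from the file in binary mode.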

Daniel's OnePlus native camera is producing panoramas that simply don't include that tag.

And it's not just the missing tag. The capture geometry itself is different. OnePlus's panorama mode does a single-row sweep, left to right, and stitches those frames into a wide cylindrical projection. There's no vertical coverage beyond what the lens sees in that one row, so even if you injected the equirectangular metadata manually, the image wouldn't work as a photosphere. It would wrap onto a sphere and look like a thin horizontal ribbon with black above and below. The capture has to match the metadata.
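The ribbon effect follows from simple arithmetic. The sketch below works out the crop tags a single-row strip would need if it were declared as part of a sphere. It assumes the strip covers a full 360 degrees horizontally and is centered on the horizon, and it ignores the difference between cylindrical and equirectangular vertical scaling, so it is an approximation at best. The function name and the 12000 by 1500 strip are hypothetical.

```python
def strip_to_gpano(strip_w, strip_h):
    """GPano values that place a 360-degree strip on a full spherical canvas."""
    full_w = strip_w
    full_h = strip_w // 2  # an equirectangular canvas is twice as wide as tall
    if strip_h > full_h:
        raise ValueError("strip is taller than the spherical canvas")
    return {
        "FullPanoWidthPixels": full_w,
        "FullPanoHeightPixels": full_h,
        "CroppedAreaImageWidthPixels": strip_w,
        "CroppedAreaImageHeightPixels": strip_h,
        "CroppedAreaLeftPixels": 0,
        "CroppedAreaTopPixels": (full_h - strip_h) // 2,
    }


tags = strip_to_gpano(12000, 1500)
vertical_coverage_deg = 180.0 * tags["CroppedAreaImageHeightPixels"] / tags["FullPanoHeightPixels"]
print(tags["CroppedAreaTopPixels"], vertical_coverage_deg)  # 2250 45.0
```

A 1500 pixel strip fills only 45 of the sphere's 180 vertical degrees, which is the thin band with empty space above and below.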

Whereas the dedicated app Daniel's using is doing a multi-row grid capture, covering both horizontal and vertical angles, and then embedding the correct metadata.

That's the photosphere workflow. Google Camera's Photo Sphere mode is the canonical example. You're not sweeping in one direction. You're tilting the phone up, down, left, right, following a dot that guides you through a grid of overlapping frames. The app captures maybe nine to twelve individual photos across three or four rows, then stitches them into a single equirectangular image and writes the GPano metadata. The result is a file that knows it's a sphere, and any compatible viewer will treat it accordingly.

That's why Daniel's getting these by fluke. He's using an app that does the full pipeline, while his default camera app does the fast single-row thing and calls it a day.

The fluke is actually the system working exactly as designed. He just didn't know there were two different systems.

The stitching algorithm itself is where things get genuinely impressive. When Google Camera captures those nine or twelve frames, each one overlaps its neighbors by about thirty to forty percent. The algorithm needs to find the exact points where those overlaps match up, and it does that with feature detection. The two workhorse algorithms are SIFT and ORB. They scan every image for distinctive points — corners, edges, high-contrast texture patches — and build a mathematical description of each point that stays recognizable even if the image is rotated, scaled, or shot at a slightly different exposure.

Once it has hundreds or thousands of those correspondences between any two overlapping images, it can compute a geometric transform, essentially a warping function that says "if I stretch and rotate image B by this exact amount, it will line up perfectly with image A." Then it does that for every overlapping pair in the grid. The center image in a three-by-three grid overlaps with eight neighbors, so the algorithm has to solve all of those alignments simultaneously. That's called bundle adjustment, and it's optimizing the three-dimensional position and orientation of every frame to minimize the total error across all overlaps at once.

Which is computationally expensive.

But Google Camera has a trick. It uses the phone's gyroscope data to get a rough initial estimate of where each frame sits in space relative to the others. Instead of blindly comparing every image to every other image, the algorithm already knows roughly which frames overlap and by how much. The feature matching only has to refine the alignment, not discover it from scratch. That prunes the search space dramatically and makes on-device stitching feasible in ten or fifteen seconds instead of minutes.

The gyroscope is doing half the work before the computer vision even starts.

It's sensor fusion in the best sense. The phone knows its orientation in space at all times, and that prior knowledge turns a hard computer vision problem into a much easier refinement problem. Occipital's old three-sixty Panorama app took that idea to its extreme, relying almost entirely on gyroscope alignment with minimal feature matching. It was fast but struggled with parallax errors when foreground objects were close to the camera.

Parallax being the thing where nearby objects shift position against the background when you move the lens.

It's the Achilles' heel of any stitching system. If you're holding the phone at arm's length and pivoting around your body, the lens is moving through space, not rotating around its optical center. A tree branch three feet away shifts dramatically against the sky, while mountains in the distance barely move at all. The stitching algorithm has to warp and blend those misalignments, and when it fails you get ghosting, double images, or that classic photosphere fail where someone's arm becomes a translucent smear across two frames.
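A back-of-the-envelope calculation shows why near objects cause trouble. If the lens moves sideways by some distance between frames, an object's apparent direction shifts by roughly the arctangent of that movement over the object's distance. The 0.3 meter lens movement, the 5 kilometer mountain distance, and the 8000 pixel panorama width below are assumptions chosen for illustration.

```python
import math


def parallax_shift_deg(lens_translation_m, object_distance_m):
    """Apparent angular shift of an object when the lens moves sideways."""
    return math.degrees(math.atan2(lens_translation_m, object_distance_m))


def shift_in_pixels(shift_deg, pano_width_px=8000):
    """The same shift expressed in pixels of a 360-degree panorama."""
    return shift_deg / 360.0 * pano_width_px


lens_movement = 0.3  # metres, assumed sideways travel when pivoting at arm's length
branch_shift = parallax_shift_deg(lens_movement, 0.9)       # branch about three feet away
mountain_shift = parallax_shift_deg(lens_movement, 5000.0)  # distant mountains

print(round(branch_shift, 1), round(shift_in_pixels(branch_shift)))
print(round(mountain_shift, 4), round(shift_in_pixels(mountain_shift), 2))
```

Under these assumptions the branch moves by roughly 18 degrees, hundreds of pixels, while the mountains move by a fraction of a pixel, so no single warp can line up both.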

The capture interface is designed to minimize exactly that. The dot on the screen, the target circle, the way it makes you hold still before each capture.

That guidance system is part of the engineering. It's trying to keep your pivot point as consistent as possible, rotating the phone around a fixed point in space rather than waving it around. The less translation, the less parallax, the cleaner the stitch. Most users experience it as just following a dot, but they're actually participating in a fairly sophisticated geometric constraint system that makes the algorithm's job tractable.

None of this exists in the OnePlus native camera because they made a different trade-off entirely.

OnePlus optimized for speed and simplicity. Single-row sweep, near-real-time stitching, minimal processing, small file size. The output is a flat JPEG with no spherical metadata because the capture geometry doesn't support spherical mapping. It's not a bug, it's a product decision. Multi-row photosphere capture requires more time, more storage, more battery, and a more complex user interface. Most manufacturers have decided that's not what the average user wants from their default camera app.

Which brings us to those third-party apps. Google Camera's Photo Sphere mode is the gold standard, but it's not the only option.

The Street View app from Google is actually a separate thing that also captures full photospheres, specifically for publishing to Google Maps. It uses the same underlying stitching pipeline but optimized for full three-sixty coverage. Then there's three-sixty Panorama by Occipital, and a handful of others. But as of mid twenty twenty-six, Google Camera's Photo Sphere mode remains the best on-device option for Android. It's available natively on Pixel devices, and there are ported versions floating around for some other phones, though your mileage varies depending on the specific hardware and Android version.

The file that comes out the other end is noticeably larger than a standard panorama.

A standard single-row panorama might run five to ten megabytes. A photosphere of the same scene can easily hit fifteen to twenty-five megabytes. Part of that is simply more pixel data — you're capturing a multi-row grid instead of a single sweep. But part of it is inherent to the equirectangular projection. The top and bottom of the sphere get stretched way out, so you need a higher base resolution to maintain acceptable detail in the center of the view. A good photosphere is often eight thousand by four thousand pixels. The metadata itself is tiny, just a few hundred bytes of XML, but the pixel payload is substantial.
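The stretching can be made concrete with a little geometry. In an equirectangular image every row has the same number of pixels, but a row near the pole represents a much smaller circle on the sphere, so features there are drawn wider by a factor of one over the cosine of the latitude. A small sketch, using the eight thousand by four thousand figure from above (the function names are made up for this example):

```python
import math

def horizontal_stretch(latitude_deg: float) -> float:
    """How many times wider a feature is drawn at this latitude
    compared with the same feature on the horizon (latitude 0)."""
    return 1.0 / math.cos(math.radians(latitude_deg))

def pixels_per_degree(width_px: int) -> float:
    """Angular resolution along the horizon of a full 360 degree image."""
    return width_px / 360.0

width, height = 8000, 4000
print(f"horizon resolution: {pixels_per_degree(width):.1f} px per degree")
for lat in (0, 45, 60, 80):
    print(f"latitude {lat}: stretched {horizontal_stretch(lat):.2f}x")
```

The rows near the top and bottom spend the same pixel budget on far less scene, so the only way to get more detail at the horizon is to raise the resolution of the whole image.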

That metadata, those few hundred bytes, is the difference between a window and a smeared rectangle.

It's one of the highest-leverage bits of data in consumer photography. Four lines of XML (GPano:ProjectionType, FullPanoWidthPixels, CroppedAreaImageWidthPixels, CroppedAreaImageHeightPixels) and suddenly a flat JPEG becomes an immersive environment. The viewer reads those tags, constructs a virtual sphere, maps the image onto it, and hands control to the gyroscope. Without those tags, the same pixels are just a weirdly distorted photo that looks like a mistake.
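For a sense of what those few hundred bytes look like, here is a hedged sketch that builds a GPano XMP packet as a string. The four tag names mentioned above are included. The namespace URI and the three extra tags (FullPanoHeightPixels, CroppedAreaLeftPixels, CroppedAreaTopPixels) are filled in from Google's Photo Sphere XMP documentation as best recalled, so treat them as assumptions to verify. The function name is invented for this example, and it only produces the packet text. Embedding it in a JPEG is a separate step.

```python
GPANO_NS = "http://ns.google.com/photos/1.0/panorama/"  # assumed namespace URI
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def build_gpano_xmp(width: int, height: int) -> str:
    """Return an XMP packet for a full, uncropped equirectangular image."""
    fields = {
        "ProjectionType": "equirectangular",
        "FullPanoWidthPixels": width,
        "FullPanoHeightPixels": height,
        # Uncropped case: the image fills the whole canvas, so offsets are zero.
        "CroppedAreaImageWidthPixels": width,
        "CroppedAreaImageHeightPixels": height,
        "CroppedAreaLeftPixels": 0,
        "CroppedAreaTopPixels": 0,
    }
    attrs = "\n".join(
        f'      GPano:{name}="{value}"' for name, value in fields.items()
    )
    return (
        '<x:xmpmeta xmlns:x="adobe:ns:meta/">\n'
        f'  <rdf:RDF xmlns:rdf="{RDF_NS}">\n'
        '    <rdf:Description rdf:about=""\n'
        f'      xmlns:GPano="{GPANO_NS}"\n'
        f'{attrs}/>\n'
        '  </rdf:RDF>\n'
        '</x:xmpmeta>'
    )

print(build_gpano_xmp(8000, 4000))
```

For a partial photosphere, the cropped-area values would be smaller than the full canvas and the offsets non-zero. That case is left out here.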

That's how you capture a photosphere. But what if you want to go beyond what your phone can do, or share your creation with the world? Daniel asked specifically about the viewer technology, and about manual stitching for maximum quality. Let's tackle the viewer first.

The technical name is a spherical viewer or a three-sixty-degree viewer. On Android, the one most people already have is built into Google Photos. When Photos detects the GPano metadata in a JPEG, it adds a small globe icon, and tapping it switches to a gyroscope-driven mode. You tilt the phone, the view rotates. It's using the same sensor fusion that powers ARCore. For third-party options, there's an open-source Android app called Panorama Viewer that handles equirectangular images natively, and Kuula has a mobile app that adds VR headset support and custom hotspots.
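How Google Photos actually detects the metadata isn't public, so the following is only a rough heuristic, not their method. XMP is stored as plain text inside the JPEG, which means a crude check can scan the raw bytes for the projection tag. A real viewer would parse the XMP segment properly. The function name here is hypothetical.

```python
import re

# Matches the attribute form  GPano:ProjectionType="equirectangular"
# and the element form        <GPano:ProjectionType>equirectangular</...>
_PROJECTION = re.compile(
    rb"GPano:ProjectionType\s*(?:=\s*[\"']|>)\s*equirectangular"
)

def looks_like_photosphere(jpeg_bytes: bytes) -> bool:
    """Crude test: does the file carry an equirectangular GPano tag?"""
    return _PROJECTION.search(jpeg_bytes) is not None

# Usage with a file on disk (the path is a placeholder):
# with open("photo.jpg", "rb") as f:
#     print(looks_like_photosphere(f.read()))
```

A standard panorama from a single-row sweep would return False here, which matches the behaviour described: no tag, no globe icon.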

Hotspots being clickable points you can embed in the sphere that link to other views or information.

It turns a single photosphere into a navigable tour. But the viewer Daniel's probably already using without realizing it is just Google Photos. It's the default for a reason.

And if he wants to go bigger than what a phone can capture, and stitch something from a real camera?

That's where the desktop workflow comes in, and it's the path to stunning results. The process is: capture a grid of overlapping photos manually, stitch them into an equirectangular projection on a computer, then inject the XMP metadata so viewers recognize it. The gold standard for stitching is an open-source program called Hugin. It's free, it's been around forever, and it handles multi-row stitching natively. You load your images, it detects feature points across all the overlaps, and you manually place a few control points to guide the alignment. The interface looks like it's from two thousand and four, but the math underneath is superb.

The alternative is PTGui.

PTGui is the commercial option, and it's more polished. Faster control point detection, better default blending, and a much cleaner interface. It costs about a hundred and fifty dollars for a personal license. For most people, Hugin is more than enough. The workflow with either one is roughly the same. You shoot a grid, maybe four columns by three rows with a DSLR on a tripod, making sure each frame overlaps its neighbors by about thirty percent. Load them into the stitcher, align, blend, and export as an equirectangular JPEG at the highest resolution your computer can handle. Ten thousand by five thousand pixels is a realistic target.
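The grid size and the overlap are tied together by the lens. As a back-of-the-envelope sketch, assuming the columns have to close a full three-sixty ring and treating angles as if they simply add up (which ignores projection effects), each frame only contributes its non-overlapping share of the circle. The function names and the seventy-four degree lens are illustrative assumptions.

```python
import math

def ring_fov_deg(frames: int, overlap: float) -> float:
    """Smallest horizontal field of view per frame so that `frames` shots
    close a 360 degree ring with the given overlap fraction."""
    return 360.0 / (frames * (1.0 - overlap))

def frames_for_ring(fov_deg: float, overlap: float) -> int:
    """Number of columns needed to close the ring with this lens."""
    return math.ceil(360.0 / (fov_deg * (1.0 - overlap)))

# Four columns at thirty percent overlap need a very wide lens.
print(f"{ring_fov_deg(4, 0.30):.1f} deg per frame")

# Assumed lens with about 74 degrees of horizontal coverage.
print(frames_for_ring(74.0, 0.30), "columns")
```

Under those assumptions, four columns only close the ring with something close to a fisheye. With a more ordinary wide-angle lens you would either shoot more columns or accept a partial photosphere.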

That's just a big flat image at that point. It won't rotate anywhere.

The metadata is the last step. You take that equirectangular JPEG and run it through a tool called the Photo Sphere Metadata Injector. It's a web app, dead simple. You upload the image, it writes the GPano tags — ProjectionType equals equirectangular, the full panorama dimensions, the crop area — and you download the tagged file. Alternatively, you can use ExifTool from the command line if you want to script it. The result is a file that any spherical viewer will recognize as a photosphere, even though it was stitched on a desktop from DSLR frames.
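For the scripted route, here is a minimal sketch that assembles an ExifTool command as an argument list. It assumes ExifTool is installed and that the tags live in its XMP-GPano group. The file name is a placeholder, the helper name is invented, and the tag set covers the full, uncropped case only.

```python
def exiftool_gpano_args(path: str, width: int, height: int) -> list[str]:
    """Build an ExifTool command that tags a full equirectangular JPEG."""
    tags = {
        "ProjectionType": "equirectangular",
        "FullPanoWidthPixels": width,
        "FullPanoHeightPixels": height,
        "CroppedAreaImageWidthPixels": width,
        "CroppedAreaImageHeightPixels": height,
        "CroppedAreaLeftPixels": 0,
        "CroppedAreaTopPixels": 0,
    }
    # -overwrite_original stops ExifTool from leaving a *_original backup copy.
    args = ["exiftool", "-overwrite_original"]
    args += [f"-XMP-GPano:{name}={value}" for name, value in tags.items()]
    args.append(path)
    return args

cmd = exiftool_gpano_args("stitched.jpg", 10000, 5000)
print(" ".join(cmd))
```

Passing `cmd` to `subprocess.run(cmd, check=True)` would execute it. Printing it first is a cheap way to check the tags before touching the file.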

The pipeline is: capture a grid with overlap, stitch in Hugin or PTGui, inject metadata with the web app, and you've got a photosphere that looks dramatically better than anything a phone can produce.

A full-frame DSLR with a wide-angle lens captures way more detail per frame than a phone sensor, and the stitching software gives you control over seam placement and blending that a phone app just automates. The downside is time. Aligning control points in Hugin takes five or ten minutes of manual work, and the final render can take a while depending on resolution. But for something you want to print or present professionally, it's the only way to go.

Once you've got that file, the question becomes where to put it so people can actually spin it around. Daniel asked about sharing platforms.

The landscape is uneven. Facebook has supported three-sixty photos since twenty sixteen, and it's still one of the best places for quick sharing. It detects the ProjectionType equals equirectangular tag on upload and automatically enables the interactive viewer. The image appears in the feed as a normal post, but with a small compass icon, and people can drag or tilt to look around. LinkedIn added the same support in twenty twenty-five, which caught a lot of people off guard. It's not a platform you associate with immersive media, but it works.

Twitter does not.

Twitter does not. Upload a photosphere to Twitter and it appears as a distorted, stretched rectangle. The platform either strips the metadata or simply doesn't look for it. Same with Instagram, interestingly. For all its focus on visual media, Instagram doesn't support spherical photos natively. You can upload them, but they're flat.

Which makes Facebook the default social option.

For social sharing, yes. But the platform I'd actually recommend for quality is Kuula. It's a dedicated three-sixty hosting service built specifically for spherical photos and virtual tours. The compression is much less aggressive than Facebook's, it gives you an embed code you can drop into any website, and it supports features like custom hotspots, VR headset viewing, and even basic analytics. For someone who cares about how their photosphere looks, it's the best option short of self-hosting.

Self-hosting means Pannellum.

Pannellum is the go-to open-source library. It's absurdly lightweight, a single JavaScript file and a CSS file, and it reads the XMP metadata automatically. You drop it into a webpage, point it at your photosphere JPEG, and it renders a fully interactive spherical viewer with mouse drag, touch, and gyroscope support. Three.js can do the same thing with a sphere geometry and a texture map, but Pannellum is purpose-built and much simpler to set up. If you know enough HTML to embed a YouTube video, you can embed a Pannellum viewer.
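Underneath any of these viewers is the same lookup: given the direction you're facing, which pixel of the flat image should appear there? The sketch below shows that mapping in its simplest form. The conventions are assumptions made for this example (yaw zero at the image centre, pitch positive looking up), not a description of how Pannellum or any other library works internally.

```python
def direction_to_pixel(yaw_deg: float, pitch_deg: float,
                       width: int, height: int) -> tuple[int, int]:
    """Map a view direction to the equirectangular pixel it samples.

    yaw_deg:   -180..180, 0 is the horizontal centre of the image
    pitch_deg: -90 (straight down) .. 90 (straight up)
    """
    # Longitude maps linearly to x and wraps around at the seam.
    u = ((yaw_deg + 180.0) % 360.0) / 360.0
    # Latitude maps linearly to y, with the top row at pitch +90.
    v = (90.0 - pitch_deg) / 180.0
    x = min(int(u * width), width - 1)
    y = min(int(v * height), height - 1)
    return x, y

print(direction_to_pixel(0, 0, 8000, 4000))     # looking at the horizon, centre
print(direction_to_pixel(-180, 90, 8000, 4000)) # straight up, at the seam
```

Because both axes are linear in angle, the lookup is cheap, and it is also why the full image has a two-to-one aspect ratio: 360 degrees across, 180 degrees top to bottom.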

There's also Google Maps via the Street View app, which is a different kind of sharing entirely.

That's the public archive route. You capture a full three-sixty photosphere with the Street View app, publish it, and it becomes part of Google's map data. Anyone browsing Street View in that location can stumble onto your photo. It's not a sharing platform in the social media sense, it's contributing to a

This episode was generated with AI assistance. Hosts Herman and Corn are AI personalities.
