#3061: How Polygraphs Actually Work (And Why They Fail)

The DOE is phasing out polygraphs. Here's why the "lie detector" has never actually worked.

Episode Details

Episode ID: MWP-3231
Published: May 24
Duration: 29:22
Pipeline: V5
TTS Engine: chatterbox-regular
Script Writing Agent: deepseek-v4-pro
Topics: national-security espionage human-intelligence

AI-Generated Content: This podcast is created using AI personas. Please verify any important information independently.

The polygraph, commonly called a lie detector, measures only four things: heart rate, blood pressure, respiration, and galvanic skin response. It does not detect deception—it detects arousal. The machine has no way of knowing why someone's heart is racing, yet the entire industry rests on the unproven assumption that lying produces a unique physiological signature called the "Pinocchio response." After a century of searching, no such marker has been found.

The technology's origin story is a cautionary tale. John Augustus Larson built the first continuous polygraph in 1921 and quickly became horrified by what he'd created, calling it a Frankenstein's monster. William Moulton Marston—who later created Wonder Woman and her Lasso of Truth—claimed 96% accuracy based on anecdotes, not controlled studies. That number still gets cited today. The modern Control Question Test builds confirmation bias directly into its procedure, with examiners who know the case and have already formed opinions deciding what counts as a "significant" reaction.

The real-world failures are devastating. Aldrich Ames passed two polygraphs while actively spying for the Soviet Union. Wen Ho Lee was flagged as deceptive, yet 58 of the 59 charges against him were later dropped. A 2003 National Academies review found that in real-world screening contexts, where the base rate of actual spies is vanishingly low, accuracy drops to near-chance levels. The DOE's April 2026 decision to phase out polygraphs for 15,000 employees may finally signal the beginning of the end for a technology that has caused immense damage with no scientific foundation.

Daniel sent us this one — do polygraph tests actually work for detecting truthfulness, and how far has the technology actually advanced? It's a great question, because the gap between what people think these machines do and what they actually measure is about the size of the Grand Canyon.

It really is. And the timing on this is remarkable. Just last month, in April, the Department of Energy announced it's going to phase out polygraph screening for all fifteen thousand of its employees by twenty twenty-eight. That's huge. They're citing a twenty twenty-five National Academies report that found no evidence polygraphs can detect deception above chance levels. This is an agency that's been using these things for decades, and they're walking away.

Which raises the obvious question — if the Department of Energy is finally saying "we're done," why are the FBI, CIA, and half the Fortune 500 still using them? What is actually happening when someone gets hooked up to one of these machines?

Let's start with what the polygraph actually measures, because this is where most people get it wrong. A polygraph records four things: heart rate, blood pressure, respiration rate, and galvanic skin response — that's how much your sweat glands are working. That's it. It's not a lie detector. It's an arousal detector. The machine has no way of knowing why your heart is beating faster. It could be deception, or it could be that you're terrified of being falsely accused, or you're angry about the question, or you just remembered you left the oven on.

It's basically a fancy mood ring with a printout.

That's not far off. Think of it this way — if I hooked you up to a polygraph and asked you whether you'd ever cheated on a test in high school, your heart rate might spike. Is that because you're lying about never having cheated? Or because you're embarrassed that you did? Or because the question reminded you of a completely unrelated stressful memory from high school? The machine can't tell the difference. It just sees a spike and records it.

The entire premise rests on what — the idea that lying causes a unique physiological signature?

The field calls it the Pinocchio response — the notion that deception produces an involuntary, measurable stress reaction that's distinct from every other kind of stress. And that's the core assumption that's never been proven. It conflates anxiety about being caught with anxiety about being accused. And it completely misses two categories of people: those who can lie without stress — psychopaths, trained intelligence officers — and those who stress while telling the truth, which is basically any innocent person in a high-stakes interrogation.

The machine can't tell the difference between "I'm lying" and "I'm terrified you won't believe me." Those look identical on the readout.

And that's not a bug in the design — it's a fundamental limitation of measuring peripheral nervous system activity. There is no known physiological marker that uniquely corresponds to deception. We've been looking for over a century, and we haven't found it. It would be like trying to diagnose a specific disease by only measuring whether someone has a fever. Yes, a fever tells you something is happening, but it doesn't tell you whether it's the flu, COVID, or a bacterial infection. The polygraph gives you a fever reading for the nervous system and then claims it can tell you the specific cause. It can't.

Which brings us to the origin story, because this whole enterprise was built on shaky ground from day one. Walk me through John Augustus Larson.

Larson was a Berkeley police officer in the early nineteen twenties — he had a PhD in physiology, which was unusual for a cop. In nineteen twenty-one, he built the first continuous polygraph by combining a blood pressure cuff with a pneumograph for measuring respiration. The blood pressure cuff idea came from William Moulton Marston — we'll get to him in a second. Larson's machine was first used in nineteen twenty-two to help exonerate a suspect in a theft case at a Berkeley sorority house. And for a moment, it looked like science had solved the problem of deception.

Within a year, the same machine was being used to coerce confessions from innocent people.

Larson himself became horrified by what he'd created. He later called the polygraph a Frankenstein's monster and spent years trying to warn people about its limitations. But by then, the genie was out of the bottle. And what's striking about Larson is that he was actually a careful scientist by the standards of his day. He understood the limitations. He just couldn't control what other people did with his invention. It's a classic story in the history of technology — the inventor loses control of the thing they built, and it takes on a life of its own.

That's because of Marston.

William Moulton Marston. This guy is one of the most fascinating and problematic figures in the whole story. He claimed his systolic blood pressure test was ninety-six percent accurate at detecting lies. Where did that number come from? He had no controlled studies — just anecdotes, mostly from criminal cases where he'd already decided who was guilty. But Marston was a master self-promoter. He testified in court cases, he wrote articles for popular magazines, he appeared in advertisements for Gillette razors claiming the polygraph proved their blades were superior.

This is also the guy who created Wonder Woman, correct?

And Wonder Woman's Lasso of Truth — which compels anyone bound by it to tell the truth — that's Marston's polygraph fantasy made literal. He genuinely believed he'd invented a truth serum, and he spent his entire career selling that myth to the public. The ninety-six percent figure still gets cited today by polygraph examiners defending their profession. It's a marketing number from the nineteen twenties that's never been replicated in a controlled study. Let that sink in. The foundational accuracy claim for an entire industry is a century-old advertising slogan.

The public perception of infallibility was manufactured before the technology was even refined.

Manufactured and then reinforced by decades of institutional use. Here's how a modern polygraph exam actually works, and this is where it gets troubling from a scientific standpoint. The standard method is called the Control Question Test, or CQT. The examiner asks two types of questions. Relevant questions: "Did you steal the money?" "Did you leak the classified document?" And control questions: "Have you ever lied to a friend?" "Have you ever taken something that didn't belong to you?"

The control questions are designed to make everyone uncomfortable.

And that's the mechanism. The assumption is that a truthful person will react more strongly to the control questions — because they're worried about being exposed as someone who's done something wrong in their past — while a deceptive person will react more strongly to the relevant questions, because those are the ones that threaten them. The examiner compares the physiological responses and makes a judgment about which set of questions triggered more arousal.
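The comparison logic just described can be sketched in a few lines. This is a toy illustration, not an actual scoring protocol: the arousal numbers, the `cqt_verdict` function, and the `margin` threshold are all hypothetical, and real examiners score charts by judgment rather than by formula. It only shows that the verdict depends on which set of questions produced more arousal, not on why.

```python
from statistics import mean

def cqt_verdict(relevant, control, margin=0.5):
    """Toy Control Question Test comparison (hypothetical scoring rule).

    relevant / control: made-up arousal scores for each question type.
    margin: hypothetical threshold below which the result is inconclusive.
    """
    diff = mean(relevant) - mean(control)
    if diff > margin:
        return "deceptive"      # stronger reaction to relevant questions
    if diff < -margin:
        return "truthful"       # stronger reaction to control questions
    return "inconclusive"

# An innocent person who is terrified of the accusation itself.
anxious_innocent = cqt_verdict(relevant=[7.5, 8.0, 7.8], control=[5.0, 5.5, 5.2])

# A practiced liar who stays calm on the relevant questions.
calm_liar = cqt_verdict(relevant=[3.0, 3.2, 3.1], control=[4.5, 4.0, 4.4])

print("anxious innocent:", anxious_innocent)
print("calm liar:", calm_liar)
```

With these made-up inputs, the frightened truthful subject is labeled deceptive and the calm liar is labeled truthful, which is the failure mode in both directions that the assumption invites.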

The control questions are completely subjective. The examiner decides what counts as a "significant" reaction.

The examiner knows which questions are which. They know the case. They've usually already formed an opinion about the subject's guilt. Imagine a study where the person grading the test already knows which subjects are supposed to pass and which are supposed to fail, and they're the ones deciding what counts as a meaningful score. No scientific journal would accept that methodology. It's confirmation bias built into the procedure. The Wen Ho Lee case is the textbook example of how this goes wrong. Lee was a Taiwanese-American scientist at Los Alamos National Laboratory. In nineteen ninety-nine, he was accused of espionage, of stealing nuclear secrets for China. He was given a polygraph, and the examiner determined he'd failed. But here's the thing: the examiner already believed Lee was guilty before administering the test. The questioning was adversarial from the start. Lee eventually pleaded guilty to a single count of mishandling restricted data, and the other fifty-eight charges were dropped. The polygraph results were never admitted in court, because they wouldn't have survived a Daubert hearing, but the damage was done. His reputation was destroyed, and he spent nine months in solitary confinement.

That's a false positive — an innocent person flagged as deceptive. But you've got the opposite problem too.

Aldrich Ames is the case that should have ended polygraph screening for national security forever. Ames was a CIA officer who spied for the Soviet Union, and later Russia, for nine years. During that period, he passed two polygraph exams, one in nineteen eighty-six and one in nineteen ninety-one, while actively betraying agents. At least ten Soviet assets working for the US were executed because of information Ames provided. And the CIA's own post-mortem found that the polygraph gave Ames what they called "a false sense of security." He knew he could beat it, and he was right.

How did he beat it?

Ames later said he just stayed calm. He was a trained intelligence officer who understood that the machine measures arousal, not deception. He didn't need to do anything elaborate — he just needed to not be anxious while lying. And that's the fundamental asymmetry here. The people you most want to catch — trained spies, psychopaths, sophisticated criminals — are exactly the people who can most easily defeat the test. Meanwhile, anxious innocent people get flagged.

The polygraph fails in both directions. It catches innocent people and misses guilty ones. That's not a detection tool — that's a randomizer.

We have the data to back that up. In two thousand three, the National Academy of Sciences conducted the most comprehensive review of polygraph research ever done. They looked at fifty-seven studies. Their conclusion: polygraph testing discriminates between truth and deception at rates, and I'm quoting, "well above chance, though well below perfection." But here's the critical caveat: in real-world settings, where you don't know the base rate of deception in the population being tested, accuracy drops to near-chance levels.

Unpack that for me. Why does the base rate matter?

Because if you're screening ten thousand employees and only one of them is an actual spy — which is roughly the real-world scenario for the DOE or the CIA — even a test that's eighty percent accurate will generate so many false positives that you'll spend all your time investigating innocent people. If your test has a twenty percent false positive rate and you're screening ten thousand people with one actual spy, you'll flag two thousand innocent people and you might — might — catch the spy. That's not a screening tool. That's an institutional harassment machine. And those two thousand false positives? Each one of those is a person whose career gets derailed, whose colleagues start whispering, whose security clearance gets suspended while they're being investigated. The human cost is enormous even before you get to the question of whether you caught the spy.
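The base-rate arithmetic above can be checked with a short calculation. This is a minimal sketch using the figures from the discussion (ten thousand people screened, one actual spy, a twenty percent false positive rate) plus one assumption the episode does not state: that the test catches a real spy eighty percent of the time. The function name and parameter names are illustrative.

```python
def screening_outcomes(population, spies, sensitivity, false_positive_rate):
    """Expected results of screening a population with a known base rate.

    sensitivity: chance a real spy is flagged (0.80 is an assumption here).
    false_positive_rate: chance an innocent person is flagged.
    """
    innocents = population - spies
    true_positives = spies * sensitivity
    false_positives = innocents * false_positive_rate
    flagged = true_positives + false_positives
    return {
        "true_positives": true_positives,
        "false_positives": false_positives,
        "flagged": flagged,
        # Probability that a flagged person is actually a spy.
        "ppv": true_positives / flagged,
    }

result = screening_outcomes(
    population=10_000, spies=1, sensitivity=0.80, false_positive_rate=0.20
)

print(f"Innocent people flagged: {result['false_positives']:.0f}")
print(f"Expected spies flagged:  {result['true_positives']:.1f}")
print(f"Chance a flagged person is a spy: {result['ppv']:.4%}")
```

Under these assumptions roughly two thousand innocent people are flagged, and only about one flagged person in 2,500 is the spy. Raising the assumed sensitivity to 100 percent barely changes that ratio, because the false positives dominate.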

The NAS specifically warned against using polygraphs for national security screening.

They said, explicitly, that the polygraph should not be used for employee screening in national security contexts because the error rates are too high and the consequences of false positives — careers destroyed, lives ruined — are too severe. And the government basically ignored that report for twenty-three years.

Until the DOE phase-out last month.

Until the DOE phase-out. And the numbers that drove that decision are staggering. The DOE was spending forty million dollars a year on polygraph screening. Between twenty ten and twenty twenty-five, exactly zero espionage cases were initiated based on polygraph results. Forty million dollars a year, fifteen thousand employees, zero spies caught. That's the definition of security theater. It's like installing an expensive security camera system that records nothing but static, and then continuing to pay the maintenance contract for fifteen years because nobody wants to admit they bought a dud.

We just saw another case that proves the point.

Jack Teixeira is the Massachusetts Air National Guardsman who leaked classified documents on Discord, one of the most significant intelligence leaks in recent years. He passed a polygraph in twenty twenty-two as part of his security clearance renewal. The leak wasn't discovered through polygraph screening. It was discovered through social media monitoring. The polygraph gave the military a false sense of security ("we checked, he's clean") while the actual threat was posting classified documents in a chat room.

We've established the polygraph doesn't work. The natural follow-up is: what about the newer technology? fMRI lie detection, voice stress analysis, thermal imaging. Are any of these actually better?

They all face the same fundamental problem. There is no unique physiological or neurological signature for deception. Let's go through them. fMRI lie detection — functional magnetic resonance imaging — emerged in the early two thousands. Companies like No Lie MRI, founded in two thousand six, and Cephos, founded in two thousand four, claimed they could detect deception with over ninety percent accuracy by looking at brain regions associated with cognitive control — the prefrontal cortex, the anterior cingulate. The idea is that lying requires more cognitive effort than telling the truth, so you'd see increased activation in those areas.

Which sounds more scientific than a blood pressure cuff.

It looks more scientific. You get beautiful brain scans with colored blobs. But a twenty sixteen study by the MacArthur Foundation Research Network on Law and Neuroscience found that fMRI lie detection fails completely when subjects use countermeasures. Simple things like mental arithmetic — counting backward by sevens — or biting your tongue during control questions. These countermeasures create enough neural noise to mask any deception-related activity. As of May twenty twenty-six, no fMRI lie detection evidence has been admitted in any US court. Not one case.

It's the same problem in a more expensive box.

A much more expensive box. An fMRI machine costs millions of dollars. A polygraph costs a few thousand. You've upgraded the hardware but not the concept. It's like building a million-dollar device to measure whether someone has a fever and then claiming it can diagnose specific lies. The pretty brain scans give it an aura of scientific legitimacy, but underneath the colored blobs, it's the same flawed assumption — that deception leaves a unique trace in the body.

What about voice stress analysis? I've heard of insurance companies using this.

Voice stress analysis, or VSA, is sometimes called the "truth phone." The claim is that lying produces micro-tremors in the vocal cords that can be detected by software analyzing voice recordings. The US Government Accountability Office tested this in two thousand four. They found it performed at chance levels. A coin flip. Despite that, VSA is still used by some police departments and insurance companies for fraud screening. It's cheaper than a polygraph — you just need a phone call — but it's just as scientifically bankrupt. And here's the insidious thing about VSA — because it can be done over the phone without the subject even knowing, it's sometimes used in customer service calls or insurance claim interviews where the person on the other end has no idea they're being "tested." You're being judged by a coin flip and you don't even know it's happening.

I remember reading about a "Pinocchio effect" around the eyes.

In two thousand two, researchers at the Mayo Clinic found that blood flow increases around the eyes during deception — they called it the Pinocchio effect, yes. The idea is that the cognitive effort of lying triggers increased blood flow to the periorbital region. But a twenty nineteen replication study found the effect disappears when subjects are trained in countermeasures. And this technology has migrated into airport security. The TSA's SPOT program — Screening Passengers by Observation Techniques — uses behavioral cues, including thermal imaging in some implementations, to identify suspicious passengers.

How well does SPOT work?

A twenty thirteen GAO report found that SPOT officers referred passengers for secondary screening at rates no better than random chance. The GAO recommended that TSA stop funding the program. TSA ignored them. As of twenty twenty-six, SPOT is still operational at major airports, still consuming hundreds of millions of dollars, still producing results indistinguishable from guessing. Think about what that means in practice. Every day, thousands of people get pulled aside for additional screening based on a system that's literally no better than flipping a coin. That's not security. That's an expensive inconvenience generator.

There's a pattern here. Every few decades, we repackage the same failed premise in new technology, give it a scientific-sounding name, and sell it to institutions that are desperate for a technological solution to the problem of trust.

The institutions keep buying it because the alternative is admitting that you can't reliably detect deception with a machine. That's a hard thing for a security agency to accept. It means you have to do actual investigation — interviews, evidence gathering, corroboration. Messy, slow, expensive human work.

Which brings us to what the DOE is doing instead.

The DOE isn't just dropping polygraphs and doing nothing. They're replacing them with continuous evaluation programs — monitoring financial records, travel patterns, social media activity. The idea is that instead of hooking someone up to a machine once every five years and asking them if they're a spy, you continuously monitor for behavioral indicators that something might be wrong. It's not perfect either — there are privacy implications, obviously — but it's based on actual behavior, not physiological arousal.

That's how they caught Teixeira. Not with a polygraph, but by noticing that classified documents were appearing on Discord and tracing them back to him.

The polygraph gave him a clean bill of health. The behavioral monitoring caught him. That's the future of security screening, for better or worse. And this is the key distinction — the polygraph asks "are you a threat?" and hopes your body betrays you. Continuous evaluation asks "are you doing things that threats do?" and looks at actual behavior. One is mind-reading, the other is pattern recognition. Only one of them has any empirical grounding.

If someone listening is asked to take a polygraph — for a job application, for a security clearance — what should they know?

First, know your rights. The nineteen eighty-eight Employee Polygraph Protection Act prohibits most private employers from requiring polygraph tests. There are exceptions — security firms, pharmaceutical companies handling controlled substances, and a few others — but in general, if you're applying for a job at a private company and they ask for a polygraph, you can refuse and they can't legally retaliate. Federal employees have fewer protections. The FBI, CIA, NSA, and other agencies can still require polygraphs as a condition of employment or clearance.

Though the DOE phase-out suggests that's changing.

The DOE decision is the first major crack in what's been an eighty-year institutional consensus. When an agency that size says "we're done with this," it creates a precedent. Other agencies will face pressure to justify why they're still using a technology that their peer agency abandoned after finding zero evidence of effectiveness. It's going to be harder for the FBI to defend its polygraph program in congressional hearings when someone can point across the table and say, "The Department of Energy ran the numbers, found zero spies caught in fifteen years, and cut the program. What do your numbers show?"

The second thing people should understand is that the real danger of polygraphs isn't the false positives — it's the false sense of security. You mentioned this with Ames and Teixeira. The polygraph creates a "we checked, they're clean" illusion that lets actual threats operate undetected.

That's the organizational psychology dimension. If you're in a security role, or you're advising an organization on security practices, the message is: push for continuous evaluation over periodic polygraphing. A polygraph once every five years is a snapshot of someone's nervous system on a particular Tuesday afternoon. It tells you nothing about what they might do on Wednesday. Continuous evaluation — financial monitoring, travel tracking, social media analysis — that's ongoing and behavior-based. It's not perfect, and it raises its own civil liberties concerns, but at least it's measuring something real.

The civil liberties piece is worth flagging. Continuous evaluation means the government is watching your bank account, your travel, your online activity. There's a real tension between security and privacy here.

And that's a conversation we should have. But the polygraph isn't the privacy-friendly alternative — it's just the ineffective one. It invades your privacy too — polygraph examiners routinely ask about sexual behavior, drug use, political beliefs — and it doesn't even produce useful information. You get all the intrusion with none of the security benefit. At least with continuous evaluation, the intrusion is tied to things that actually correlate with espionage risk. Polygraphs just dig through your personal life because they need to find something to make you nervous about.

If someone wants to dig deeper into the science, where should they start?

The two thousand three National Academies report, "The Polygraph and Lie Detection," is available free online. It's comprehensive, readable, and devastating. The twenty twenty-five follow-up report is behind a paywall, but the executive summary is public and it essentially reaffirms and strengthens the two thousand three conclusions. For the history, Ken Alder's book "The Lie Detectors" from two thousand seven is the definitive account; it covers Larson, Marston, and the whole strange century of trying to turn anxiety into evidence.

I want to circle back to something you said earlier about there being no unique signature for deception. Because that's the through-line here. The polygraph, fMRI, voice stress analysis, thermal imaging — they're all looking for a physiological tell that doesn't exist. And the reason it doesn't exist is that lying isn't one thing. It's a complex cognitive behavior that looks different in different people, in different contexts, for different reasons.

That's the lesson that should worry us as we look at the next generation of deception detection technology. AI-powered behavioral analysis — facial micro-expressions, voice pattern analysis, writing stylometry. These are all statistical pattern-matching exercises. They find correlations between certain behaviors and certain outcomes in training data. But correlation isn't causation, and the training data is always limited. We risk repeating the polygraph mistake — assuming that a statistical pattern is a truth serum. You train an AI on videos of people lying and telling the truth in a lab setting, and it learns the patterns of lab subjects performing for researchers. That doesn't mean it'll work on a real spy in a real interrogation.

The polygraph industry sold the idea that anxiety equals deception. The fMRI industry sold the idea that cognitive effort equals deception. The AI industry is going to sell the idea that some pattern in your face or voice or word choices equals deception. But it's the same category error in a new package.

Institutions will keep buying it because the alternative — admitting that you can't automate trust — is uncomfortable. It means investigations are hard. It means some spies will go undetected. It means security is probabilistic, not binary. That's a harder sell than "this machine can tell if you're lying." People want a magic button. They want to push it and get a green light or a red light. The messy truth is that security doesn't work that way and never has.

The DOE phase-out matters because it's an institution finally saying: we tried the machine, it didn't work, we're going to do the harder thing instead. That's rare in government.

It really is. And the forty million dollars a year they're saving isn't nothing. But the bigger win is that they're dismantling a system that created a false sense of security while harassing innocent employees. That's the polygraph's legacy, and it's been the legacy for a hundred years. Every few years, some agency does an internal review, finds the polygraph doesn't work, and then keeps using it anyway because the institutional cost of admitting failure is too high. The DOE actually followed through. That's the exception that proves the rule.

To answer the original question directly — do polygraph tests actually work for detecting truthfulness? They measure physiological arousal, which correlates with deception no better than a coin flip in real-world settings. And how far has the technology advanced? Not very far at all. The fMRI and AI-based successors face the same fundamental problem — there's no unique signature for deception — and are vulnerable to the same countermeasures. We've spent a century trying to build a truth machine, and we're no closer than John Augustus Larson was in nineteen twenty-one.

The only thing that's advanced is our understanding of why it doesn't work. And our willingness, finally, in some corners of government, to act on that understanding.

Now: Hilbert's daily fun fact.

Hilbert: In nineteen twelve, a buzkashi tournament in Lhasa was delayed for three days because the carcass used for play refracted light so intensely under the high-altitude Tibetan sun that riders could not distinguish it from the surrounding dust until dusk each evening.

...right. I have so many questions. Buzkashi is the Central Asian sport where riders on horseback compete to drag a goat carcass across a goal line, correct?

Hilbert: The carcass is typically beheaded and disemboweled, then soaked in water overnight to toughen the hide. The Lhasa incident involved a particularly large specimen that had been prepared with an unusual amount of salt in the soaking solution, which created a crystalline surface layer that acted as a prism under direct sunlight.

The goat carcass became a disco ball.

Hilbert: A disco ball that weighed approximately forty kilograms and was being dragged at high speed by mounted riders, yes.

I'm impressed that you know the salt content of a goat carcass from nineteen twelve Tibet.

Hilbert: The tournament records were kept by a British diplomatic attaché who had been sent to Lhasa to negotiate trade agreements. He attended the tournament as a cultural observer and wrote a seventeen-page report on the delay, which he attributed to "atmospheric conditions of a most peculiar character." The report is in the British Library. I can provide the shelf mark.

Of course it is. Thank you, Hilbert.

I'm going to be thinking about prismatic goat carcasses for the rest of the day.

Where does this leave us? The polygraph persists because of institutional inertia — agencies have used it for eighty years, and replacing it means admitting billions of dollars were wasted. The DOE phase-out is the first major crack in that wall, and it probably won't be the last. But the deeper lesson is that no technology can replace a well-designed investigation. The polygraph promised to turn security into a binary — truthful or deceptive, clean or dirty — and that promise was always a fantasy.

The future of deception detection isn't a better machine. It's better investigation, continuous behavioral monitoring where appropriate, and the humility to admit that some threats will evade detection. That's less satisfying than the fantasy of a truth serum, but it has the advantage of being real. And after a hundred years of chasing the fantasy, real is starting to look pretty good.

This has been My Weird Prompts. Thanks to our producer Hilbert Flumingtop. If you enjoyed this episode, rate us five stars and tell a friend — we're trying to beat the algorithm. Find us at myweirdprompts dot com.

I'm Herman Poppleberry.

I'm Corn. We'll catch you next time.

This episode was generated with AI assistance. Hosts Herman and Corn are AI personalities.
