#2685: Plugin Data Storage for AI Agents

How to separate user data from plugin code across Linux, macOS, and Windows in agentic AI environments.

Featuring

Daniel

Corn

Herman

Episode Details

Episode ID: MWP-2846
Published: May 7
Duration: 37:19
Audio: Direct link
Pipeline: V5
TTS Engine: chatterbox-regular
Script Writing Agent: deepseek-v4-pro
Topics: data-storage ai-agents cross-platform

AI-Generated Content: This podcast is created using AI personas. Please verify any important information independently.

Daniel set out to build a suite of open-source Claude Code plugins and quickly hit a fundamental software design problem: how do you cleanly separate plugin code from user data, secrets, and preferences in a way that works across Linux, macOS, and Windows? Left to its own devices, the agent was dumping user data into the installation directory — the classic mistake of confusing application code with user content. The solution requires understanding decades-old operating system conventions and adapting them to the agentic context.

On Linux, the XDG Base Directory specification defines XDG_DATA_HOME, which defaults to ~/.local/share when unset. macOS uses ~/Library/Application Support. Windows splits between AppData\Roaming for preferences and AppData\Local for caches. But agents don't know these conventions unless explicitly instructed. The recommended pattern is a single environment variable (like PLUGIN_DATA_STORE) as the root, with each plugin namespacing itself underneath: the backup plugin writes to PLUGIN_DATA_STORE/backup-plugin/logs, and the Contentful plugin writes to PLUGIN_DATA_STORE/contentful-plugin. An onboarding skill detects the OS, proposes the correct default path, and persists the configuration to both the shell profile and a fallback config file.

For secrets, the principle of least knowledge applies. Plugins should never know how secrets are stored — they only request credentials by name. The plugin's instructions declare required secrets (CONTENTFUL_CMA_TOKEN, GITHUB_PAT) and their purpose, while the agent resolves them from whatever backend the user has configured, whether dotenv, Doppler, HashiCorp Vault, or the OS credential store.

Daniel sent us this one — and I have to say, this is the kind of prompt that makes you realize how fast the tooling around these agents is maturing. He went on what he called a Claude Code plugin development binge, and the core question he landed on is actually a fundamental software design problem: when you're shipping agent skills as public, version-controlled plugins, how do you cleanly separate the plugin itself from user data, user secrets, and user preferences, in a way that works across Linux, Windows, and macOS, and flexes to whatever secret management system the end user already has? And he wants us to get into the specifics.

Oh, this is a great one. And by the way, today's script is coming to us courtesy of DeepSeek V four Pro. So if anything sounds especially brilliant, that's why.

We'll see if it holds up when we get into the weeds.

The thing that jumps out at me immediately is that Daniel basically rediscovered, inside the agentic AI context, a problem that desktop application developers have been solving for decades. The XDG Base Directory specification on Linux, the Application Support and AppData conventions on macOS and Windows respectively — these are well-trodden paths. But the twist here is that he's not building a traditional application. He's building a plugin that runs inside an agent's context, and the agent is the one making decisions about where to put things. That changes the game.

Because normally, if I'm writing a desktop app, I compile it, I ship a binary, and I program it to call some OS-specific API that returns the correct data directory. End of story. But here, the plugin is essentially a set of instructions the agent reads and interprets. The agent is the runtime. And the agent, left to its own devices, was dumping stuff into dot Claude inside the installation folder, which is exactly the wrong place.

It's the classic mistake of someone who doesn't understand Unix conventions. The installation directory is for the application itself. It should be treated as read-only after install. User data goes in the user's home directory, in a location the operating system has designated for that purpose. The agent doesn't know this unless you tell it.

Which brings us to Daniel's specific question. He's trying to define a pattern, not just for one plugin, but for an ecosystem of plugins he's building and open-sourcing. He wants a single, consistent approach to user data storage that any of his plugins can follow, regardless of the operating system, and regardless of the user's secret management setup.

Let me lay out what the actual standards are, because this is the foundation everything else builds on. On Linux, you've got the XDG Base Directory specification, which defines XDG_DATA_HOME. If that environment variable is set, you use it. If it's not set, the default is dot local slash share inside the user's home directory. That's where user-specific data files are supposed to live. On macOS, the convention is to use the Application Support directory inside the user's Library folder, and the proper way to locate that programmatically is through the NSSearchPathForDirectoriesInDomains API, but since an agent can't call Objective-C APIs directly, the practical path is tilde slash Library slash Application Support. On Windows, you're looking at AppData slash Roaming or AppData slash Local, depending on whether the data should roam with the user's profile across machines. Roaming for preferences and configuration, Local for caches and logs that don't need to sync.

Daniel mentioned that on Linux, there's actually a system-level variable for this, which Claude pointed out to him. That's XDG_DATA_HOME. But here's the thing that I think trips people up: XDG_DATA_HOME is not always set. It's an optional override. Most Linux distributions don't set it by default because the default fallback works fine. So if your plugin says "use XDG_DATA_HOME," you also need a fallback rule.

And the fallback is the one I just mentioned: dot local slash share. But even that needs to be fully qualified — it's relative to the user's home directory. So the full path on a typical Linux system would be something like slash home slash username slash dot local slash share slash plugin-name. And you'd construct that by checking if XDG_DATA_HOME is set, and if not, defaulting to tilde slash dot local slash share.
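The fallback rule just described can be sketched in a few lines of Python. This is a minimal illustration, not code from Daniel's plugins: the function names `xdg_data_home` and `plugin_data_dir` are hypothetical, and the `env` and `home` parameters exist only so the logic can be exercised without touching the real environment.

```python
import os
from pathlib import PurePosixPath


def xdg_data_home(env=None, home=None):
    """Return XDG_DATA_HOME when it is set to an absolute path, else ~/.local/share."""
    env = os.environ if env is None else env
    home = os.path.expanduser("~") if home is None else home
    value = env.get("XDG_DATA_HOME", "")
    # The XDG spec treats relative paths as invalid, so they are ignored here.
    if value and PurePosixPath(value).is_absolute():
        return PurePosixPath(value)
    return PurePosixPath(home) / ".local" / "share"


def plugin_data_dir(plugin_name, env=None, home=None):
    """Per-plugin directory under the resolved data home."""
    return xdg_data_home(env, home) / plugin_name
```

Calling `plugin_data_dir("backup-plugin")` with no overrides uses the current user's environment and home directory.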

Daniel's specific worry was that Claude kept defaulting to dot Claude inside the installation directory. That tells me the system prompt or the skill instructions weren't explicit enough about where user data should go. The agent was pattern-matching to "I need a place to store data related to Claude" and picked the Claude directory, which is logical in the dumbest possible way.

It's the kind of mistake that makes perfect sense if you remember that the agent doesn't have the benefit of years of learning where things go on a Unix system. It's reasoning from first principles every time. You have to be explicit. And this is where I think Daniel's insight about workspaces is actually quite clever, because a workspace gives you a bounded context. The agent knows "this is my working directory." But the problem is that the workspace is also where the plugin lives, and he doesn't want user data commingled with plugin code.

Let's talk about the solution he landed on and whether it holds up. He's using an environment variable to point to a data store location. The plugin defines the variable name, and the user — or the onboarding process — sets it. On Linux, he's suggesting the agent should check XDG_DATA_HOME first and use that as a default. That's a solid approach, but I see a few edge cases.

Let's walk through them. The first edge case is: what if the user runs multiple plugins from Daniel, and they all need to store data? His approach of using a plugin-name subdirectory inside the data store is correct. You don't want plugin A's logs ending up in plugin B's directory. But the question is whether each plugin defines its own environment variable, or whether there's a single variable that all plugins share, with each plugin namespacing itself underneath.

The single variable approach is cleaner. Something like CLAUDE_PLUGIN_DATA_HOME, and then each plugin creates its own subdirectory. If every plugin defines its own variable, you end up with environment variable sprawl, and users have to configure a half-dozen things instead of one. Daniel seems to be leaning toward a consistent pattern, which suggests he's thinking about the single-variable approach.

That's what I'd recommend. Define one environment variable — call it something like PLUGIN_DATA_STORE or CLAUDE_USER_DATA — and have every plugin use that as the root, then append its own name. So the backup plugin writes to whatever that variable points to, slash backup-plugin, slash logs, slash whatever. The Contentful blog manager writes to the same root, slash contentful-plugin. Clean separation, one configuration point.
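A rough sketch of the single-root, per-plugin namespace idea follows. The variable name PLUGIN_DATA_STORE comes from the discussion; the helper `plugin_path` and its name check are illustrative assumptions, added so that a malformed plugin name cannot escape the shared root.

```python
import os
from pathlib import Path

ROOT_VAR = "PLUGIN_DATA_STORE"  # name suggested in the discussion, not a standard


def plugin_path(plugin_name, *parts, env=None):
    """Build <root>/<plugin_name>/<parts...> under the shared data root."""
    env = os.environ if env is None else env
    root = env.get(ROOT_VAR)
    if not root:
        # Refuse to guess a location; the caller should run onboarding instead.
        raise RuntimeError(f"{ROOT_VAR} is not set; run the onboarding skill first")
    # Hypothetical safety check: reject names that could escape the root.
    if plugin_name in ("", ".", "..") or "/" in plugin_name or "\\" in plugin_name:
        raise ValueError(f"invalid plugin name: {plugin_name!r}")
    return Path(root).joinpath(plugin_name, *parts)
```

Raising when the variable is unset, rather than falling back silently, matches the later suggestion that the agent should refuse to write data until configuration exists.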

Now we hit the cross-platform question Daniel raised. He knows Linux. He's less sure about the conventions on macOS and Windows. And he wants the plugin to work for users on those systems without them having to manually figure out where things should go.

This is where the onboarding skill becomes critical. Daniel mentioned that each plugin should have an onboarding conversation with the agent, where the agent asks the user about their environment and configures things accordingly. That's actually a really elegant pattern. Instead of trying to write conditional logic that covers every possible system, you have the agent do discovery at setup time. The agent can run commands to detect the operating system, check for existing environment variables, look at what secret managers are available, and then set the appropriate configuration.

The onboarding skill would do something like: run uname to detect the OS, then on Linux check for XDG_DATA_HOME, on macOS suggest tilde slash Library slash Application Support slash plugin-name, on Windows check for APPDATA and suggest a path under AppData slash Roaming. It presents the user with the proposed location, asks for confirmation or a custom path, and then writes that to the environment variable or a config file that subsequent sessions will read.
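As an illustration of the detection step, here is one way the proposed default could be computed. It is a sketch under assumptions: `platform.system()` stands in for running uname, the function name `propose_data_root` is hypothetical, and the Windows branch falls back to a path under the home directory when APPDATA is missing.

```python
import os
import platform
from pathlib import PurePosixPath, PureWindowsPath


def propose_data_root(system=None, env=None, home=None):
    """Suggest an OS-appropriate data root for the user to confirm or override."""
    system = platform.system() if system is None else system
    env = os.environ if env is None else env
    home = os.path.expanduser("~") if home is None else home
    if system == "Windows":
        # APPDATA normally points at AppData\Roaming for the current user.
        appdata = env.get("APPDATA")
        if appdata:
            return str(PureWindowsPath(appdata))
        return str(PureWindowsPath(home) / "AppData" / "Roaming")
    if system == "Darwin":
        return str(PurePosixPath(home) / "Library" / "Application Support")
    # Linux and other Unix-like systems: XDG_DATA_HOME, else ~/.local/share.
    xdg = env.get("XDG_DATA_HOME", "")
    if xdg and PurePosixPath(xdg).is_absolute():
        return xdg
    return str(PurePosixPath(home) / ".local" / "share")
```

The result is only a proposal. The onboarding conversation would still show it to the user and accept a custom path.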

This is where I want to get specific about implementation, because Daniel asked for specifics. The way I'd structure this is a two-layer system. Layer one is the plugin itself, which ships with a set of skills, MCP configurations, and a system prompt or instructions file. None of these contain hardcoded paths. They all reference a variable. In the instructions, you'd say something like: "Your user data directory is specified by the environment variable PLUGIN_DATA_HOME. If this variable is not set, run the onboarding skill to configure it. Store all persistent data in subdirectories of this path, organized by function: logs, cache, state, and so on."

That's the declarative approach. The agent reads the instruction and knows to check for the variable before doing anything that writes user data. But there's a subtlety here that I think a lot of people miss. Environment variables are session-scoped. If the user runs Claude Code in a new terminal window, the variable might not be set unless it's been added to their shell profile. So the onboarding skill also needs to handle persistence.

That's an excellent point. On Linux and macOS, you'd want to append the export command to dot bashrc or dot zshrc, or better yet, to a dedicated file like dot claude-plugin-config that gets sourced by the shell profile. On Windows, you'd set it as a user environment variable through the system settings or through a PowerShell profile. The onboarding skill should detect the user's shell and write the appropriate configuration. And it should probably also create a fallback config file in a known location — maybe dot config slash claude-plugins slash config dot json — so that if the environment variable isn't set in a particular session, the agent can still find the data directory.
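The persistence step for POSIX shells might look roughly like this. The function names and the `data_root` JSON key are assumptions made for the example; the Windows side (a user environment variable or a PowerShell profile) is left out.

```python
import json
import shlex
from pathlib import Path


def export_line(var, value):
    """Shell line that sets the variable, quoted so paths with spaces survive."""
    return f"export {var}={shlex.quote(str(value))}"


def persist_data_root(var, value, profile_path, config_path):
    """Append the export line once and write the fallback JSON config."""
    line = export_line(var, value)
    profile = Path(profile_path)
    existing = profile.read_text(encoding="utf-8") if profile.exists() else ""
    # Skip the append when the exact line is already present (idempotent reruns).
    if line not in existing.splitlines():
        with profile.open("a", encoding="utf-8") as fh:
            if existing and not existing.endswith("\n"):
                fh.write("\n")
            fh.write(line + "\n")
    config = Path(config_path)
    config.parent.mkdir(parents=True, exist_ok=True)
    # "data_root" is a hypothetical key name for this sketch.
    config.write_text(json.dumps({"data_root": str(value)}, indent=2) + "\n", encoding="utf-8")
```

Writing to a dedicated file that the shell profile sources, as suggested above, would work the same way: pass that file as `profile_path`.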

I like the config file as a fallback. Environment variables are great for flexibility, but they're fragile. A config file in a well-known location is more robust. The agent checks the environment variable first, and if it's not set, it looks for the config file. If neither exists, it runs the onboarding skill. That's a resilient pattern.
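That lookup order (environment variable, then config file, then onboarding) can be expressed as a short function. This is a sketch: the config location follows the path floated in the conversation, while the `data_root` key and the convention of returning `None` to signal "run onboarding" are assumptions.

```python
import json
import os
from pathlib import Path

ENV_VAR = "PLUGIN_DATA_HOME"


def resolve_data_root(env=None, config_path=None):
    """Return the data root as a Path, or None when onboarding is still needed."""
    env = os.environ if env is None else env
    if config_path is None:
        config_path = Path.home() / ".config" / "claude-plugins" / "config.json"
    # 1. The environment variable wins when it is set and non-empty.
    value = env.get(ENV_VAR)
    if value:
        return Path(value)
    # 2. Otherwise fall back to the config file, tolerating a missing or broken file.
    try:
        with open(config_path, encoding="utf-8") as fh:
            data = json.load(fh)
    except (FileNotFoundError, json.JSONDecodeError):
        return None
    root = data.get("data_root") if isinstance(data, dict) else None
    # 3. None tells the caller to run the onboarding skill.
    return Path(root) if root else None
```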

Now, let's talk about secrets, because that's the other half of Daniel's question, and it's where things get genuinely tricky. The Contentful CMA token example is a good one. That's a secret that should never, ever end up in a log file, in version control, or in a plugin's public repository. And Daniel's concern is that different users will have different secret management setups. Some will use dot env files. Some will use Doppler or HashiCorp Vault or AWS Secrets Manager. Some will use their operating system's credential store. The plugin needs to work for all of them.

The plugin author can't possibly anticipate every secret backend a user might have. This is where the principle of least knowledge comes in. The plugin should not know how secrets are stored. It should only know how to ask for a secret by name. The actual resolution — where the secret comes from — is the user's responsibility to configure, and the agent's responsibility to implement at runtime.

How do you actually do that? I think the cleanest pattern is to define, in the plugin's instructions, a list of required secrets by name and purpose. Something like: "This plugin requires a Contentful CMA token, referenced as CONTENTFUL_CMA_TOKEN. It requires a GitHub personal access token, referenced as GITHUB_PAT. These secrets will be used for the following operations..." And then the onboarding skill handles the actual collection and storage.

Collection is the easy part. The onboarding skill asks the user for each secret, one at a time. The hard part is storage and retrieval. If the user says "I use Doppler," the onboarding skill needs to know how to configure Doppler access. If the user says "just put it in a dot env file," the skill needs to know where to put that file and how to ensure it's gitignored.

This is where I think the agent-centric approach actually has an advantage over traditional application development. In a traditional app, you'd have to write separate integrations for every secret manager you want to support. With an agent, you can give it a general instruction: "Ask the user how they manage secrets. If they use a tool you're familiar with, configure it. If they want a dot env file, create one in the user data directory, not in the plugin directory, and ensure it's excluded from version control. Store secrets in a way that survives across sessions."

The key phrase there is "not in the plugin directory." That's the whole problem Daniel is trying to solve. If the dot env file ends up in the plugin directory, it's one git commit away from being public. The secrets have to live in the user data directory, which is outside the plugin's version control entirely.

Let me propose a concrete directory structure, because I think that'll help make this tangible. In the user data directory — wherever PLUGIN_DATA_HOME points — you'd have something like this. For the backup plugin: a directory called backup-plugin, and inside that, a config directory, a logs directory, a state directory, and a secrets directory. The secrets directory contains a dot env file that's gitignored at the plugin level, but also isn't inside the plugin repository at all because the entire user data tree is outside the plugin directory.
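To make that layout concrete, here is a small sketch that creates it. The subdirectory names come from the description above; the function name `ensure_layout` and the owner-only permission on the secrets directory are additions for illustration, and the permission change has limited effect on Windows.

```python
from pathlib import Path

SUBDIRS = ("config", "logs", "state", "secrets")


def ensure_layout(data_root, plugin_name):
    """Create <data_root>/<plugin_name>/{config,logs,state,secrets} if missing."""
    base = Path(data_root) / plugin_name
    for name in SUBDIRS:
        (base / name).mkdir(parents=True, exist_ok=True)
    # Assumption for this sketch: restrict the secrets directory to its owner.
    (base / "secrets").chmod(0o700)
    return base
```

Running it twice is harmless, which matters if onboarding gets repeated.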

The plugin's instructions would say: "Your secrets are stored in the secrets subdirectory of your user data directory. Load them at the start of each session. Do not write secrets to any other location. Do not include secrets in logs or error messages." That last part is critical. Agents love to be helpful and verbose, and if an API call fails, the agent might echo back the token it used in the error message. You have to explicitly tell it not to.

That's a real footgun. I've seen agents do exactly that: "I tried to connect using the token abc123xyz and got a 401 error." Now that token is in the chat history and potentially in logs. The instructions need to be explicit: redact secrets from all output.
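Where output passes through code rather than only through the agent's own wording, a redaction helper is one possible backstop. This is a minimal sketch; the function name `redact` and the `[REDACTED]` placeholder are arbitrary choices, and it only catches exact matches of known secret values.

```python
def redact(message, secrets):
    """Replace every known secret value in message with a placeholder."""
    # Longest first, so a secret that contains another is replaced whole.
    for value in sorted((s for s in secrets if s), key=len, reverse=True):
        message = message.replace(value, "[REDACTED]")
    return message
```

It does not replace the instruction to the agent. It only limits the damage when a wrapper script or log writer handles the text.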

We've covered data directories and secrets. Let's talk about the broader architectural question Daniel is really asking. He's building a plugin ecosystem. He wants a consistent pattern that works across all his plugins, across all operating systems, and across all user configurations. Is the environment-variable-plus-config-file approach sufficient, or is there a better abstraction?

I think it's sufficient, but I'd add one more layer. A plugin manifest. Each plugin ships with a small JSON or YAML file that declares its requirements: what data directories it needs, what secrets it expects, what environment variables it uses, what operating systems it supports. The agent reads this manifest at the start of a session and uses it to validate that everything is configured correctly. If something's missing, it runs the onboarding skill. This is much cleaner than having the agent discover problems through trial and error.

I like that. It's declarative infrastructure for agent plugins. The manifest says "I need these things," and the agent's job is to ensure those things exist before proceeding. It separates the what from the how, which is exactly the right abstraction level for agentic systems.

It solves the cross-platform problem Daniel was worried about. The manifest doesn't specify paths. It specifies needs. "I need a data directory for logs. I need a data directory for state. I need a secret called CONTENTFUL_CMA_TOKEN." The agent, equipped with the onboarding logic and the OS detection we talked about, figures out the appropriate paths based on the user's platform and preferences.
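A manifest along these lines, and the check an agent-side helper might run against it, could look like the sketch below. No such manifest format exists in the source: every field name (`name`, `data_dirs`, `secrets`, `platforms`) and the `missing_requirements` helper are hypothetical.

```python
import json
import os
from pathlib import Path

# Hypothetical manifest: it lists needs, not paths.
EXAMPLE_MANIFEST = """
{
  "name": "contentful-plugin",
  "data_dirs": ["logs", "state"],
  "secrets": ["CONTENTFUL_CMA_TOKEN"],
  "platforms": ["linux", "darwin", "windows"]
}
"""


def missing_requirements(manifest, plugin_dir, env=None):
    """List what the manifest asks for that is not present yet."""
    env = os.environ if env is None else env
    missing = []
    for sub in manifest.get("data_dirs", []):
        if not (Path(plugin_dir) / sub).is_dir():
            missing.append(f"dir:{sub}")
    for name in manifest.get("secrets", []):
        # This sketch only checks environment variables for secrets.
        if not env.get(name):
            missing.append(f"secret:{name}")
    return missing
```

An empty list means the session can proceed. Anything else is a cue to run onboarding before real work starts.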

The manifest also helps with the team use case Daniel mentioned. If a team is using Doppler or HashiCorp Vault, the manifest tells the agent "this plugin needs these secrets," and the agent can be configured to fetch them from the team's secret manager rather than asking each developer to set them up individually. The plugin author doesn't need to know about Doppler. The manifest just declares the dependency, and the team's agent configuration handles the resolution.

Which is exactly the separation of concerns Daniel is aiming for. He wants to focus on shipping plugin updates — new skills, improved instructions, better workflows. He doesn't want to be in the business of managing user secrets or debugging path issues on Windows. The plugin is the plugin. The user's environment is the user's environment. The agent bridges the gap.

There's one more thing I want to dig into, because Daniel mentioned it and I think it's actually the most interesting part of his whole setup. He said Claude Code "miniaturizes Agentic AI." That's a really sharp observation. What he's describing is that the workspace format, combined with skills and MCP, creates a bounded, focused instance of an agent that's much more capable than a generic chatbot but much more constrained than a fully autonomous system. It's a Goldilocks zone.

It's the constraint that makes it powerful. A fully autonomous agent with access to your entire file system and all your APIs is terrifying and error-prone. But an agent that operates inside a workspace, with a specific system prompt, a specific set of skills, and a specific data directory — that's a tool. You know what it can do, you know where it stores things, and you can inspect its work.

Daniel's backup workspace is a perfect example. The system prompt says "your task is backing up this computer." The skills define how to do backups. The MCP gives access to the tools needed. The workspace provides a place for logs and state. Everything is scoped. If the agent goes off the rails, it can only go off the rails within that scope.

The plugin pattern he's building takes that a step further. He's packaging these scoped, purpose-built agent configurations and sharing them. Someone else can take his backup plugin, run the onboarding skill, and have a backup agent tailored to their system, with their data directory, their secrets, their preferences. The plugin provides the structure. The agent provides the adaptation. The user provides the specifics.

It's almost like a Docker container for agent behavior. The plugin defines the image. The user's environment provides the volumes and environment variables. The agent is the runtime. And just like Docker, the art is in getting the separation right so that the image is portable and the data is persistent.

It highlights why the data directory problem Daniel ran into is so fundamental. In Docker, if you don't mount a volume, your data disappears when the container stops. In agent plugins, if you don't specify a data directory outside the plugin, your data ends up in version control or in the installation directory. Same class of problem, different domain.

Let's synthesize this into a concrete implementation pattern, because Daniel asked for specifics. I'd structure it as four components. One: a plugin manifest that declares data directory needs, secret requirements, and supported platforms. Two: an environment variable, PLUGIN_DATA_HOME, that points to the root of the user's plugin data directory, with OS-appropriate defaults if not set. Three: a config file fallback in dot config slash claude-plugins, so sessions without the environment variable still work. Four: an onboarding skill that detects the user's OS, shell, and secret management preferences, and configures everything accordingly.

I'd add a fifth component: a secrets resolution protocol. The plugin's instructions should say: "When you need a secret, check the following sources in order. First, environment variables. Second, the secrets directory in the user data directory. Third, ask the user." This gives the agent a clear, predictable order of operations, and it lets users layer their own secret management on top. If they use Doppler, they can configure Doppler to inject environment variables, and the agent will pick them up from step one without knowing anything about Doppler.
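The three-step order can be sketched as follows. The dot env parser here is deliberately simplified (no multiline values, escapes, or `export` prefixes), and the function names and the source labels in the return value are assumptions for the example.

```python
import os
from pathlib import Path


def parse_dotenv(text):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip surrounding quote characters; real parsers handle more cases.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def resolve_secret(name, secrets_dir, env=None):
    """Return (value, source), checking env first, then secrets_dir/.env."""
    env = os.environ if env is None else env
    if env.get(name):
        return env[name], "environment"
    dotenv = Path(secrets_dir) / ".env"
    if dotenv.is_file():
        value = parse_dotenv(dotenv.read_text(encoding="utf-8")).get(name)
        if value:
            return value, "dotenv"
    # Nothing found: the caller should ask the user.
    return None, "ask-user"
```

Because the environment is checked first, a tool that injects environment variables is picked up without any tool-specific code.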

For the dot env approach, which is probably what most individual users will go with, the onboarding skill creates a dot env file in the user data directory's secrets subdirectory, and the agent loads it at the start of each session. The key is that this dot env is not in the plugin directory. It's in the user data directory, which is outside version control entirely.

There's one edge case worth mentioning. What about secrets that are needed across multiple plugins? If Daniel's backup plugin and his Contentful plugin both need a GitHub token, does the user have to configure it twice?

That's where the single PLUGIN_DATA_HOME variable pays off. If the secrets are stored in a shared location — say, PLUGIN_DATA_HOME slash secrets slash dot env — and each plugin loads from that shared file, then a GitHub token configured once is available to all plugins. But you have to be careful about namespace collisions. If two plugins both expect a variable called API_KEY, they'll step on each other.

The manifest helps here too. Each plugin declares its expected secret names. The onboarding skill can check for collisions and warn the user. Or you can namespace the secrets by plugin: BACKUP_PLUGIN_GITHUB_TOKEN versus CONTENTFUL_PLUGIN_GITHUB_TOKEN. More verbose, but safer.
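Both options, warning about shared names and prefixing by plugin, are easy to sketch. The manifest shape (a `name` plus a `secrets` list) is the same hypothetical one used for illustration, and the helper names are made up for this example.

```python
from collections import defaultdict


def find_collisions(manifests):
    """Map each secret name declared by more than one plugin to those plugins."""
    owners = defaultdict(list)
    for manifest in manifests:
        for secret in manifest.get("secrets", []):
            owners[secret].append(manifest["name"])
    return {secret: names for secret, names in owners.items() if len(names) > 1}


def namespaced(plugin_name, secret):
    """Prefix a secret name with the plugin name, e.g. BACKUP_PLUGIN_GITHUB_TOKEN."""
    prefix = plugin_name.upper().replace("-", "_")
    return f"{prefix}_{secret}"
```

A reported collision is not always a problem. A shared GitHub token may be intentional, so the onboarding skill would ask rather than reject.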

Daniel's instinct to find a consistent pattern is exactly right. The worst outcome would be each plugin doing its own thing, with data scattered across dot Claude, dot config, the desktop, and who knows where else. A single, well-documented convention — even if it's not perfect — is vastly better than fragmentation.

I think the convention he's converging on is solid. Use an environment variable as the primary pointer to user data. Default to OS-appropriate locations. Keep secrets out of the plugin directory. Use an onboarding conversation to handle the initial setup. Ship a manifest so the agent knows what it needs before it starts working. That's a pattern that scales from a single user on Linux to a team using enterprise secret management.

The only thing I'd caution about is over-engineering the onboarding. Daniel mentioned that the agent can "flex to the user's needs" based on that first interview. That's a great goal, but agents are still flaky enough that you want the happy path to be very short. Detect the OS, propose a default data directory, ask if that's okay, ask about secrets, write the config. Five minutes, tops. If the onboarding turns into a twenty-minute conversation about the user's entire computing philosophy, people are going to bounce.

That's the tension with agentic interfaces in general. The flexibility is the feature. The unpredictability is the bug. The skill of plugin design is in constraining the agent enough that it does the right thing ninety-nine percent of the time, while still giving it enough flexibility to handle the one percent of cases where the user has a weird setup.

Daniel's approach of shipping these as open source is interesting in that context, because it means the community can contribute the weird edge cases. Someone on Windows with a non-standard AppData location can submit a fix. Someone using Doppler can add the Doppler resolution logic to the onboarding skill. The plugin gets smarter without the author having to personally support every configuration.

It's the same dynamic that made open source software work in the first place. The maintainer handles the core logic and the common cases. The community fills in the long tail of platform-specific quirks and integrations. The difference is that instead of code contributions, a lot of these contributions might be prompt engineering: "here's the instruction that makes the agent handle this edge case correctly."

Which is a whole new kind of software development, really. You're not writing deterministic code that handles every branch explicitly. You're writing instructions that guide an agent's reasoning, and you're relying on the agent's general intelligence to fill in the gaps. The plugin is part code, part prompt, part convention. The art is knowing which parts to specify rigidly and which to leave flexible.

The data directory problem is a perfect example of something that should be specified rigidly. The convention — data goes here, secrets go there, never mix user data with plugin code — needs to be ironclad. The agent can be flexible about how it implements the convention on a given platform, but the convention itself shouldn't be negotiable.

Because the cost of getting it wrong is high. Daniel mentioned the worry about personal data propagating to everyone who has the plugin. That's not a hypothetical. One bad git add, one misplaced dot env file, and suddenly your API keys are in a public repository. The pattern has to make that mistake hard to make.

This is why I like the approach of keeping user data entirely outside the plugin directory. Not in a subdirectory, not in a dot file that's gitignored, but completely outside the repository. If the plugin lives in slash home slash daniel slash projects slash backup-plugin, the user data lives in slash home slash daniel slash dot local slash share slash backup-plugin. No amount of sloppy git commands can accidentally commit data from a completely different directory tree.
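A guard for that rule could be as small as the sketch below, which checks that a proposed data directory is neither the plugin directory nor anything beneath it. The function name `is_outside` is hypothetical.

```python
from pathlib import Path


def is_outside(data_dir, plugin_dir):
    """True when data_dir is not plugin_dir and not nested inside it."""
    # resolve() normalizes ".." segments and symlinks before comparing.
    data = Path(data_dir).resolve()
    plugin = Path(plugin_dir).resolve()
    return data != plugin and plugin not in data.parents
```

An onboarding skill might run this on whatever path the user proposes and refuse locations inside the plugin repository.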

Unless the agent gets confused and writes to the wrong place. Which brings us back to the instructions needing to be explicit and the convention needing to be enforced by the plugin's system prompt, not just by user discipline.

Daniel should test the onboarding on a clean machine — or a VM — and verify that the agent actually writes to the correct directory, that secrets don't leak into logs, and that the config survives a new terminal session. The gap between "the instructions say the right thing" and "the agent actually does the right thing" is where bugs live.

Alright, let me try to pull this together into something actionable. If I were implementing this today, here's what I'd do. Step one: define the environment variable. I'd call it CLAUDE_PLUGIN_DATA and document that it should point to a directory outside any plugin repository. Step two: write a plugin manifest for each plugin that lists its data subdirectories and secret requirements. Step three: write an onboarding skill that detects the OS, proposes a default path, asks about secrets, and writes the configuration. Step four: in the plugin's main instructions, make it mandatory that the agent checks CLAUDE_PLUGIN_DATA before doing anything that writes data, and refuses to proceed if it's not set. Step five: test on all three platforms.

I'd add step six: write a post-install verification skill. After onboarding, the agent runs a quick check — can it write to the data directory, can it read the secrets, does the directory structure look right. If anything's off, it flags it immediately rather than waiting for a real task to fail.
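One piece of such a verification step, the write check, might be sketched like this. The probe file name and the function name `verify_data_dir` are assumptions; checks for secrets and directory structure would sit alongside it.

```python
from pathlib import Path


def verify_data_dir(plugin_dir):
    """Return a list of problems found; an empty list means the check passed."""
    base = Path(plugin_dir)
    if not base.is_dir():
        return [f"data directory missing: {base}"]
    problems = []
    probe = base / ".write-probe"  # hypothetical temporary file name
    try:
        # Write a small file and read it back to prove the directory is usable.
        probe.write_text("ok", encoding="utf-8")
        if probe.read_text(encoding="utf-8") != "ok":
            problems.append("read-back mismatch in data directory")
    except OSError as exc:
        problems.append(f"cannot write to data directory: {exc}")
    finally:
        if probe.exists():
            probe.unlink()
    return problems
```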

That's a good call. Failing fast is always better than failing mysteriously three hours into a backup job. And it gives the user confidence that the plugin is set up correctly.

You know what strikes me about this whole discussion? Daniel is basically doing systems administration, but through the medium of prompt engineering. Twenty years ago, setting up a backup system meant editing config files and writing shell scripts. Now it means writing a system prompt and an onboarding conversation. The underlying principles — separation of concerns, least privilege, fail fast, OS portability — are exactly the same.

The tools change. The principles don't. And the people who understand the principles are the ones who build things that last, even when the tools are evolving this fast. Daniel's question is about where to put files, but it's really about how to design a plugin architecture that's going to hold up as the agentic ecosystem matures.

That's what makes it a good prompt. It's not just "how do I fix this bug." It's "what's the right pattern, and why, and how do I make it work for everyone?" Those are the questions worth spending thirty minutes on.

And on that note, we should probably wrap before we start designing a package manager.

I'm already thinking about dependency resolution between plugins.

Of course you are. And now: Hilbert's daily fun fact.

Hilbert: In the nineteen fifties, Soviet researchers in Tajikistan discovered that a rare mineral called tugtupite, when ground into pigment, produced a vivid crimson that shifted to a soft pink under ultraviolet light — a photochromic effect caused by trace sulfur impurities in the crystal lattice.

...right.

If you're ever in Tajikistan and your paint changes color, blame the sulfur.

I'll file that under things I didn't expect to learn today. This has been My Weird Prompts. Thanks to our producer Hilbert Flumingtop, and thanks to Daniel for a prompt that's going to save someone from accidentally open-sourcing their API keys.

If you enjoyed this, do us a favor and leave a review wherever you listen. Find more at myweirdprompts dot com.

We'll be back next time. Until then, keep your secrets out of your repos.

This episode was generated with AI assistance. Hosts Herman and Corn are AI personalities.
