Hey everyone, welcome back to the show. We are sitting here in Jerusalem, the sun is just starting to dip low over the Old City walls, casting these long, amber shadows across the stone. It is a beautiful evening, and I have been sitting here thinking all day about how much the internet has changed in just the last couple of years. It feels like a lifetime ago that we would go to a search engine, type in some keywords, and hope the blue links had what we needed. We used to be hunters, scavenging through pages of results. But now, as we sit here in February of twenty twenty-six, more often than not, we are talking to agents. We are asking questions to models that have already digested the internet for us, and we expect a synthesized, perfect answer in seconds.
It is a massive shift, Corn. Herman Poppleberry here, by the way. And you are right, the era of the human browsing through pages is being supplemented, and in some cases, entirely replaced by the era of the agentic crawler. We are moving from search engine optimization to what people are starting to call A B O, or Agentic Behavior Optimization. It is not just about being found anymore; it is about being understood, processed, and then utilized by an autonomous system.
Exactly. And Daniel sent over a prompt today that really gets into the weeds of this. He wants to know about the practical, evergreen steps someone can take to make their website and their content AI agent-friendly. Beyond just the new standards like L L Ms dot text, which we have seen gain a lot of traction over the last year, what can a webmaster actually do to ensure their site is positioned well to be ingested into training data and, more importantly, cited as a resource that actually creates lead opportunities? Daniel is looking for the "how-to" guide for the agentic web.
I love this topic because it forces us to look at the web not as a visual gallery for humans, but as a structured data source for intelligence. Daniel is hitting on something crucial here. If you are a business or a creator and you are not thinking about how a large language model sees your site, you are essentially becoming invisible to the next generation of users. We are talking about a world where the majority of your "visitors" might not even have eyes. They have tokens and context windows.
It feels like we are back in the early days of the web where things were a bit more technical before everything got smoothed over by easy website builders and drag-and-drop interfaces. So, Herman, let's start with the basics. If I am building a site today, or if I am auditing an existing one, and I want an AI agent to love it, what is the first thing I need to think about?
The first thing is clean, semantic structure. For years, we have used things like div tags and complex C S S to make things look pretty for humans. We built these elaborate visual experiences that were often a nightmare for a machine to parse. But an AI agent, especially one that is crawling the web to summarize information or feed a R A G—that is Retrieval-Augmented Generation—system, cares deeply about the underlying H T M L five tags. Are you using header tags correctly? Is your main content wrapped in an article tag? Is your navigation clearly marked with a nav tag? This sounds like basic web development, but you would be surprised how many modern sites are just a mess of nested containers that make it hard for a machine to figure out what the actual point of the page is.
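To make Herman's point concrete, here is a minimal sketch, using only Python's standard library, of the kind of structural audit a crawler or a webmaster might run. The example markup and the ratio metric are illustrative assumptions, not any real crawler's logic:

```python
from html.parser import HTMLParser
from collections import Counter

# Tags that tell a machine what a region of the page *is*.
SEMANTIC = {"article", "nav", "header", "footer", "main",
            "section", "aside", "h1", "h2", "h3"}

class TagCensus(HTMLParser):
    """Counts every start tag encountered while parsing."""
    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        self.counts[tag] += 1

def semantic_ratio(html: str) -> float:
    """Fraction of structural tags that carry semantic meaning."""
    census = TagCensus()
    census.feed(html)
    semantic = sum(n for t, n in census.counts.items() if t in SEMANTIC)
    generic = census.counts["div"] + census.counts["span"]
    total = semantic + generic
    return semantic / total if total else 0.0

clean = ("<main><article><h1>Guide</h1><section><h2>Step 1</h2>"
         "<p>...</p></section></article></main>")
soup = ("<div><div><span>Guide</span><div><div>Step 1</div>"
        "<div>...</div></div></div></div>")

print(semantic_ratio(clean))  # every structural tag is semantic -> 1.0
print(semantic_ratio(soup))   # pure div soup -> 0.0
```

The same content scores very differently depending on markup alone, which is exactly the "token tax" idea: the second page forces the machine to infer structure the first page states outright.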
So it is about reducing the noise. If the AI has to spend all its tokens just trying to parse where the navigation ends and the article begins, it is going to be less efficient at understanding the value of your content. It is almost like a "token tax" on messy websites.
That is exactly what it is. Think of it like a clean desk versus a cluttered one. If the AI can instantly identify the title, the author, the date, and the main body of text, it can index that information with much higher confidence. And confidence is key in the world of probabilistic models. When an agent like Perplexity, or the latest version of Gemini, or even a specialized vertical agent is looking for a source to cite, it is going to prefer the one where the data is unambiguous. If the model is ninety-nine percent sure that a specific paragraph is the core answer to a user's question because it is wrapped in a section tag with a clear H two heading, it is going to pick that over a site where the answer is buried in a generic div.
That makes sense. It is about making the "handshake" between the crawler and the content as firm as possible. But what about the more advanced stuff? Daniel mentioned things like schema markup. I know we have talked about that in the past in the context of Google's rich snippets, but how does it play into the world of AI agents specifically in twenty twenty-six?
Oh, schema dot org is more important now than it ever was for traditional search. For those who do not know, schema is basically a way of labeling your data so machines know exactly what it is. If you have a price on your page, you can label it as a price. If you have a review, you can label it as a review. For AI agents, this is like giving them a direct A P I to your content without having to build one. We are seeing models now that can execute "function calls" based on the schema they find on a page.
So if I am a service provider, and I want an AI agent to recommend me when someone asks, "Who is the best plumber in Jerusalem?" having my business information in a clear J S O N L D schema format is basically making it easy for the agent to find me and verify my details.
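For the plumber example, a J S O N L D block like the following is what Herman is describing. Schema dot org does define a Plumber type; the business name, phone number, and rating figures here are placeholders, not real data:

```python
import json

# A hypothetical local business; every value below is a placeholder.
business = {
    "@context": "https://schema.org",
    "@type": "Plumber",  # a schema.org LocalBusiness subtype
    "name": "Example Plumbing Co.",
    "areaServed": {"@type": "City", "name": "Jerusalem"},
    "telephone": "+972-0-000-0000",
    "url": "https://example.com",
    "aggregateRating": {
        "@type": "AggregateRating",
        "ratingValue": "4.8",
        "reviewCount": "212",
    },
}

# This is emitted inside a <script type="application/ld+json"> tag
# in the page head, where crawlers expect to find it.
snippet = '<script type="application/ld+json">\n{}\n</script>'.format(
    json.dumps(business, indent=2)
)
print(snippet)
```

An agent answering "who is the best plumber in Jerusalem" can read the type, the service area, and the rating directly, with no guessing.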
Precisely. And it goes deeper than just contact info. You can use schema to define relationships between entities. You can say, "This article was written by this person, who is an expert in this field, and they work for this organization, which has been in business for twenty years." This builds what we call the knowledge graph. AI models are essentially giant maps of relationships. If you can clearly define your place in that map using structured data, you are much more likely to be seen as an authority. In twenty twenty-five, we saw the rise of "Entity-Based SEO," and in twenty twenty-six, that has evolved into "Knowledge Graph Integration." You want the model to know that you are not just a string of text, but a verified entity with a specific set of attributes.
I want to push on that idea of authority for a second. Daniel mentioned getting into training data. That feels like a different beast than just being found by a real-time crawler. How do you make your content so good or so structured that it becomes a permanent part of the model's weights in the next training run? I mean, we know these models are trained on massive datasets like Common Crawl, but they are getting pickier about what they keep.
That is the holy grail, isn't it? To be part of the foundational knowledge of the model. To do that, you need two things: high information density and uniqueness. Models are trained on trillions of tokens, but the companies behind them—OpenAI, Anthropic, Google—are increasingly using "quality filters" to prune the training data. If you are just churning out generic, AI-generated content yourself, you are never going to make it into the core training set of the next big model because you are just noise. You are a copy of a copy. But if you are publishing original research, unique data sets, or deeply insightful commentary that other people link to, you become a signal.
So it is the old-school principle of being a primary source. If you are the person who actually did the study or the person who has the unique perspective, you are the one the models want to ingest. It is almost like the internet is returning to its academic roots.
It really is. And you have to make that ingestion easy. This is where we get into things like the L L Ms dot text file that Daniel mentioned. This is a standard that really took off over the last eighteen months. It is a simple text file you host at your root directory—your site dot com slash L L Ms dot text. It provides a simplified, text-only version of your site specifically for models to read. It strips away the ads, the pop-ups, the cookie banners, and the complex layout, and just gives the raw, high-quality information. It is like a robots dot text file but for the AI age. It tells the crawler, "Here is the essence of my site, please ingest this first."
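The llms dot text format Herman describes is, in the published proposal, just markdown: an H one title, a one-line blockquote summary, then H two sections of annotated links. A sketch of a generator, with invented site names and U R Ls:

```python
# Sections map a heading to (name, url, description) link entries.
# All names and URLs here are invented for illustration.
sections = {
    "Guides": [
        ("Agent-Friendly HTML", "https://example.com/guides/html.md",
         "Semantic structure for machine readers"),
    ],
    "Data": [
        ("2026 Survey", "https://example.com/data/survey.md",
         "Original dataset on agentic interface adoption"),
    ],
}

def build_llms_txt(title, summary, sections):
    """Assemble an llms.txt body in the proposed markdown shape."""
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        for name, url, desc in links:
            lines.append(f"- [{name}]({url}): {desc}")
        lines.append("")
    return "\n".join(lines)

print(build_llms_txt(
    "Example Site",
    "Practical guides and original data on optimizing for AI agents.",
    sections,
))
```

The point is the simplicity: no layout, no scripts, just the essence of the site in a form a model can ingest in one pass.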
I can see why that is helpful. It saves the AI companies money on compute because they do not have to process all the junk. But there is a tension here, right? We have seen companies like Cloudflare release tools to block AI bots, and we have seen the "Big Three" publishers suing AI companies for scraping. If I am a webmaster, why would I want to make it easier for them to take my data if they might not even send me any traffic?
That is the big strategic debate of twenty twenty-six. If you block the bots, you are protecting your intellectual property in the short term. You are saying, "You cannot have my data for free." But the risk is that you are opting out of the future. If an AI agent cannot crawl your site, it cannot learn about your business. It cannot recommend you to users. You might save your content from being scraped, but you are also ensuring that you will never be the answer to a user's question in an AI interface. It is a "visibility versus protection" trade-off.
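In practice this trade-off does not have to be all-or-nothing, because the major AI crawlers identify themselves with distinct user-agent tokens. A sketch of a selective robots dot text, using tokens these vendors have published; treat them as assumptions to verify against each vendor's current documentation:

```text
# robots.txt -- a selective middle ground rather than all-or-nothing.

# Allow a retrieval/citation crawler that can send you referrals.
User-agent: PerplexityBot
Allow: /

# Opt out of training-data collection specifically.
User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

# Keep private sections away from everyone.
User-agent: *
Disallow: /internal/
```

This lets a site stay visible to answer engines while withholding content from training pipelines, which is one resolution of the visibility-versus-protection debate.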
It is like refusing to be listed in the phone book because you do not want the phone book company to make money off your name. Sure, you kept your name private, but now nobody can call you.
That is a perfect analogy. And for most businesses, the value of being the cited source in a Chat G P T response or a Claude summary is far higher than the risk of being scraped. Think about lead generation. If a user asks, "What is the best way to secure my cloud infrastructure?" and the AI gives a detailed answer and then says, "For more information, see this guide by such-and-such company," that is a high-intent lead. That user is looking for a solution and the AI just gave you a gold-plated endorsement. We are seeing that "referral traffic" from AI agents is often much higher quality than generic search traffic because the agent has already done the qualification for you.
So the goal is to be the citation. How do we ensure that happens? Is there a specific way to write content that encourages a model to cite you rather than just paraphrasing you without credit? Because that is the fear—that the AI will just "eat" your content and never mention your name.
This is where "Citation Engineering" comes in. Models tend to cite when they find a specific, unique claim or a piece of data that they cannot find elsewhere. If you say, "The sky is blue," the model doesn't need to cite you. Everyone knows that. But if you say, "Our twenty twenty-six study found that forty-seven percent of users prefer agentic interfaces over traditional search," that is a specific data point. The model is much more likely to say, "According to a study by Corn and Herman, forty-seven percent of users..." and then provide a link. Specificity is the antidote to being swallowed by the model.
So, specificity is the key to citations. If you want to be a resource, you have to provide something that is not just general knowledge. You have to provide the evidence, the numbers, and the unique insights. You have to be "cite-able."
Exactly. And you should also make it very clear how you want to be cited. On your pages, you can actually include a section that says, "How to cite this research." Provide the exact text and the link. AI agents are getting better at following these kinds of instructions. If you make it easy for the agent to give you credit, it is more likely to do so. Also, use the "Inverted Pyramid" style of writing—put the most important, cite-able facts at the very top of the article. Don't make the agent hunt for the "meat" of the story.
Let's talk about the evergreen side of this. Technology changes so fast. Schema might change, L L Ms dot text might be replaced by something else by twenty twenty-eight. What are the principles that will stay true no matter what the specific standard is?
I think the biggest evergreen principle is what I call "Semantic Saliency." This means that your content should be organized around clear, unambiguous concepts. Use consistent terminology. If you are talking about AI agents, do not call them bots in one paragraph, assistants in another, and autonomous entities in a third. Pick a term and stick to it. This helps the model build a strong association between your brand and that specific concept. It reduces the "noise" in the model's internal representation of your site.
That makes a lot of sense. It is about building a clear mental model for the AI. If your site is a muddle of different ideas and conflicting terms, the AI's internal representation of your site is going to be fuzzy. And fuzzy data doesn't get cited.
Right. Another evergreen principle is "Depth over Breadth." In the old S E O world, people would try to rank for as many keywords as possible by creating thousands of thin, low-quality pages. In the AI world, that backfires. A model would much rather ingest one ten-thousand-word definitive guide that covers a topic from every possible angle than fifty short posts that only scratch the surface. The deep guide provides more context, more relationships, and more value for the model's training. It is about being the "definitive source" for a specific niche.
That is a huge shift. It means the return on quality is much higher than the return on quantity. I think that is actually a good thing for the internet. It encourages people to actually write things that are worth reading. It feels like the "Content Farm" era is finally dying.
I agree. It is a move toward a more substantive web. And it also means that your site's reputation matters more than ever. Models are increasingly being trained to recognize authoritative sources. They look at who is linking to you, but they also look at your history. If you have been publishing high-quality, accurate information for ten years, you are going to have a much higher weight in the model than a site that popped up yesterday. This is what we call "Digital Provenance." Can the model trace this information back to a reliable, long-standing source?
So, longevity and consistency are evergreen. You cannot just hack your way into an AI's brain overnight. You have to earn it by being a reliable source over time. It is about building "Trust Equity" with the models.
Precisely. Now, there is one more technical thing I want to mention, which is the idea of an agent-specific landing page. Some forward-thinking webmasters are creating a page on their site, maybe at slash agents, that is specifically designed for AI crawlers. It is not meant for humans to ever see. It contains a high-level summary of the site, a list of all the key entities and concepts the site covers, links to the most important data sets, and clear instructions on how the site's content can be used and cited. It is like a "Media Kit" for AI.
That is clever. It is like a welcome mat for the bots. You are basically saying, "Hey, I know you are here, here is exactly what you need to know about me so you don't have to go hunting for it." It is the ultimate form of hospitality in the agentic age.
Exactly. It reduces the friction for the AI. And if you include your contact information and your lead generation forms in a structured way on that page—using things like J S O N schema for forms—the agent can even help facilitate a connection between the user and your business. We are moving toward "Actionable Agents."
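There is no formal standard yet for such an agents page, so the shape below is entirely an assumption: a hypothetical "media kit for AI" served as J S O N, with placeholder field names, U R Ls, and contact details:

```python
import json

# Hypothetical machine-readable media kit, e.g. served at /agents.
# No standard exists for this yet; every field name is an assumption.
agent_kit = {
    "site": "https://example.com",
    "summary": "Original research and guides on agent-friendly web design.",
    "key_entities": ["agentic crawlers", "llms.txt", "structured data"],
    "datasets": [
        {"name": "2026 interface survey",
         "url": "https://example.com/data/survey.csv"},
    ],
    "citation_policy": "Cite as 'Example Site (2026)' with a link "
                       "to the source page.",
    "contact": {
        "email": "leads@example.com",
        "quote_form": {
            "url": "https://example.com/quote",
            "method": "POST",
            "fields": ["name", "email", "project_description"],
        },
    },
}

print(json.dumps(agent_kit, indent=2))
```

An agent that finds this page gets the summary, the citable assets, and a structured path to making contact, all without crawling the whole site.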
Wait, really? You think an agent could actually fill out a form or initiate a contact on behalf of a user? I mean, we have seen demos of that, but is it really happening in the wild?
Oh, absolutely. We are already seeing the widespread use of agentic frameworks like Browser-use and the newer versions of Auto G P T. A user might say to their AI, "I need a lawyer in Jerusalem who specializes in international copyright, find the best one and set up a consultation." If your lawyer site has a clear, agent-friendly path to your booking system, that AI can actually do the work for the user. It can check your availability, fill out the intake form, and put the meeting on the user's calendar. But if your site is a maze of JavaScript and legacy Flash-style animations, the AI is going to give up and go to your competitor who has an agent-friendly interface.
This is a whole new level of lead generation. It is not just about getting the user to your site; it is about making your site accessible to the user's agent. The agent is the one making the "buying decision" or at least the "shortlisting decision."
Right. The agent is the new gatekeeper. In the old world, Google was the gatekeeper. You optimized for the algorithm. In the new world, the user's personal agent is the gatekeeper. You have to optimize for the agent's ability to take action on your behalf. This means having clear, machine-readable calls to action. It means having a simple, logical flow to your conversion funnel that doesn't rely on visual cues alone.
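One existing way to express a machine-readable call to action is schema dot org's potentialAction vocabulary, which the lawyer example from earlier could use. The business name and U R Ls below are placeholders:

```python
import json

# A machine-readable call to action via schema.org's potentialAction.
# Business details and URLs are placeholders for illustration.
page = {
    "@context": "https://schema.org",
    "@type": "LegalService",
    "name": "Example Law Office",
    "potentialAction": {
        "@type": "ReserveAction",
        "target": {
            "@type": "EntryPoint",
            "urlTemplate": "https://example.com/book?service=consultation",
            "actionPlatform": "https://schema.org/DesktopWebPlatform",
        },
        "result": {"@type": "Reservation", "name": "Initial consultation"},
    },
}

print(json.dumps(page, indent=2))
```

Instead of inferring the conversion funnel from visual cues, an agent reads the action, its entry point, and its outcome directly from the markup.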
So, to summarize what we have talked about so far: we need clean H T M L five structure, robust schema markup, high information density, unique and specific data points to encourage citations, and potentially an agent-specific landing page to facilitate action. It is a lot, but it all feels very logical when you look at it through the lens of machine learning.
That is a great list. And don't forget the strategic decision about blocking. You really have to weigh the cost of being scraped against the cost of being invisible. For most people, invisibility is the bigger threat. Also, keep an eye on your "Information Gain." This is a concept that Google has been talking about for years, but it is vital for AI. If your page doesn't add any new information to what the model already knows, it has no reason to index you or cite you. You have to provide that "extra" bit of value.
I want to go back to the idea of being cited as a resource. Daniel specifically asked about creating lead opportunities. Beyond just the AI mentioning your name, how do we make sure that turns into actual business? Is it just about the link, or is there more to it? How do we make the citation "sticky"?
It is about the context of the citation. If an AI says, "You should use this product because it has these three specific features that no one else has," that is a much stronger lead than just saying, "Here is a link to a company that does this." To get that kind of context, you need to make sure your unique value proposition—your U V P—is clearly articulated in a way that an AI can understand and repeat. Avoid the "corporate speak."
So, clear, punchy benefits. Not just jargon. If I say, "We provide synergistic solutions for enterprise growth," the AI has no idea what that means. It can't sell that. But if I say, "We reduce cloud costs by an average of twenty-two percent in the first ninety days," that is something the AI can use to answer a user's specific problem.
Exactly! AI models love facts and figures. They struggle with vague marketing fluff. If you want an AI to sell for you, you have to give it the tools to do so. Give it the stats, the case studies, and the clear differentiators. Use lists. Use tables. AI models are very good at parsing tables. If you have a comparison table of your product versus the competition, the AI can easily extract that and use it to answer a user's question about which one is better. Tables are like a "cheat code" for agentic optimization.
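A comparison table is easy to emit programmatically, and the resulting H T M L is exactly the kind of dense, structured fact block an agent can lift wholesale. The products and numbers below are invented:

```python
# Build a product comparison table as plain HTML.
# Every product name and figure here is invented for illustration.
headers = ["Feature", "Our Product", "Competitor A"]
rows = [
    ["Avg. cloud cost reduction", "22% in 90 days", "8% in 90 days"],
    ["Setup time", "1 day", "2 weeks"],
    ["SOC 2 certified", "Yes", "No"],
]

def to_html_table(headers, rows):
    """Render headers and rows as a minimal, machine-parsable table."""
    head = "<tr>" + "".join(f"<th>{h}</th>" for h in headers) + "</tr>"
    body = "".join(
        "<tr>" + "".join(f"<td>{cell}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return f"<table>{head}{body}</table>"

print(to_html_table(headers, rows))
```

Each cell is an unambiguous claim attached to a labeled feature, which is far easier for a model to extract and repeat than the same facts buried in paragraphs.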
That is a great tip. Tables are often overlooked because they can be hard to make look good on mobile, but for an AI, they are pure gold. It is a dense, structured representation of facts. I am going to start putting more tables on our site immediately.
And one more thing on the lead generation front: make sure your call to action is clear even in a text-only environment. If your only way to get a lead is through a fancy interactive widget or a pop-up, the AI might not be able to interact with it. But if you have a simple, clear instruction like, "To get a quote, email us at this address or visit this specific U R L," the AI can pass that instruction directly to the user. You want to provide a "Text-Based Path to Conversion."
This is all very practical. It feels like we are moving toward a more honest web, in a way. You cannot hide behind a pretty design anymore. Your content actually has to be good, and your data actually has to be accessible. The "smoke and mirrors" of the old web are being cleared away by the cold, hard logic of the agents.
It is a return to the foundational principles of the internet. The web was originally designed as a way for researchers to share information. It was built on text and links. Over time, we layered a lot of visual fluff and advertising-driven noise on top of it. AI is stripping that fluff away and looking at the core again. If you embrace that, you are going to be ahead of ninety-nine percent of the people who are still trying to play the old S E O game. You are building for the next twenty years, not the last ten.
I think about all those people who spent years learning how to trick Google's algorithm with keyword stuffing and backlink schemes. All of that is becoming irrelevant. You cannot trick a model that has a deep semantic understanding of your content. You actually have to be an authority. You have to be the real deal.
It is a much harder game, but a much more rewarding one. And it is much more evergreen. If you focus on being a high-quality source of information, you don't have to worry about the next algorithm update. You are the data the algorithm wants. You are the fuel for the engine.
That is a very powerful way to put it. "You are the data the algorithm wants." So, if you are a webmaster listening to this, don't just think about how to rank. Think about how to be useful to the intelligence that is crawling your site. Think about how to be the most reliable, most structured, and most cite-able version of yourself.
And keep an eye on the emerging standards. While standards like L L Ms dot text are still maturing, they show the direction the industry is moving. We are seeing new proposals for things like "Agentic Robots dot text" and "Verified Source Headers." Being an early adopter of these standards sends a signal to the AI companies that you are a friendly, high-quality source. They might even give you preferential treatment in their crawlers or their citation engines because you are making their jobs easier.
It is a brave new world, Herman. I am curious to see how our own site, my weird prompts dot com, stacks up. We should probably go back and make sure our schema is up to date and that we are being as clear as possible about our own content. I don't want to be invisible to the agents of twenty twenty-six.
Oh, I have already started on that, Corn. You know me. I spent three hours last night tweaking our J S O N L D. We are now officially categorized as a "High-Quality, Brother-Led Intellectual Discourse Entity." I even added a schema for our "Weirdness Factor."
Of course you did. Herman Poppleberry, never one to miss a chance to optimize. I bet you even optimized the meta-description for the bots' sense of humor.
Hey, if we want the AI agents of twenty twenty-six to know who we are, we have to speak their language. And their language is structure, data, and clear intent.
Fair point. Well, I think we have covered a lot of ground here. We have talked about structure, schema, information density, the strategy of blocking bots, and how to encourage citations and leads. It is a lot to take in, but it all points back to that one core idea: be a high-quality, clear, and authoritative source. If you do that, the agents will find you.
And be patient. Building a reputation in a model's weights takes time. It is not something that happens in a week. It is a long-term investment in the future of your digital presence. But it is the most important investment you can make right now.
Well said. This has been a fascinating deep dive. I hope it gave Daniel and our listeners some really concrete steps they can take. It is easy to get overwhelmed by all the AI hype, but when you break it down to these evergreen principles, it becomes much more manageable. It's just about being a good citizen of the new web.
Exactly. It is just the next evolution. We have been through this before—from the directory era to the search era, and now to the agentic era. The medium changes, but the value of good information remains constant.
On that note, I think it is about time we wrap this up. The sun has almost disappeared now, and the lights are starting to twinkle across the Jerusalem hills. But before we go, I want to say a huge thank you to everyone who listens to the show. We have been doing this for seven hundred and thirty-eight episodes now, and it is your curiosity and your prompts that keep us going. We couldn't do this without you.
It really is a privilege. We love digging into these topics with you. And if you are enjoying what we are doing here, we would really appreciate it if you could leave us a review on your podcast app or on Spotify. It genuinely helps other people find the show and helps us grow this community. We are trying to reach as many curious minds as possible.
Yeah, a quick rating or a few words about what you like really makes a difference. We read all of them, and it means a lot to us. It's our own version of "Trust Equity."
And remember, you can find us at my weird prompts dot com. We have the full archive there, plus an R S S feed for subscribers and a contact form if you want to get in touch. We are also working on our own agentic interface for the site, so stay tuned for that.
You can also reach us directly at show at my weird prompts dot com. We love hearing your thoughts, your questions, and of course, your weird prompts. Daniel, thanks again for the great question today.
This has been My Weird Prompts. Thanks for joining us in Jerusalem today. It's been a pleasure.
Until next time, keep asking those questions, keep being a primary source, and keep optimizing for the future.
Goodbye everyone!
Bye bye!