Daniel sent us this one — he's been thinking about building a personal procurement assistant for shopping in Israel. The idea is an AI agent that works from a curated whitelist of trusted local stores, handles the geographic constraints of where things actually ship, navigates Hebrew-language e-commerce sites, and does intelligent purchasing research on your behalf. He's asking what the real technical challenges are, the browser navigation piece, and where the human handoff points need to be. There's a lot to unpack here.
Oh, this is right in the sweet spot. And by the way, today's episode is powered by DeepSeek V four Pro, which is doing the script for us.
Good to know. So before we dive in — Herman, you've actually dealt with Israeli e-commerce from a technical angle. What are we looking at here?
The landscape is genuinely weird, and I mean that as a compliment. Most Israeli online stores run on a handful of platforms — Wix, which is Israeli, and a huge number of smaller retailers use their e-commerce module. Then you've got Shopify penetration, some Magento, and a lot of custom-built sites that are basically glorified spreadsheets with a credit card form attached. The Hebrew piece alone is a real challenge — right-to-left rendering, mixed Hebrew and English product descriptions, and prices that sometimes display in shekels, sometimes in dollars, sometimes both on the same page.
If you're building an agent to navigate these sites, you're not dealing with a clean, standardized checkout flow.
Not even close. Every site has its own idea of what a shopping cart should look like. Some break completely if you have the wrong browser locale set. I've seen cases where the checkout button renders off-screen because the Hebrew text pushes the layout in ways the developer didn't test. The agent has to be resilient to that.
Let me pull on the geographic thread, because that's the part of Daniel's prompt that jumped out at me. Israel is small — you can drive end to end in about six hours — but shipping is wildly inconsistent. Some stores only deliver within a certain radius of Tel Aviv. Some won't ship to Jerusalem at all, or charge a premium. Some deliver to the settlements but not to certain Arab towns, and vice versa. The constraints aren't just about distance — they're about infrastructure, politics, and which delivery contractor a store happens to use.
It's not always documented. You'll get a store that says "we ship nationwide," but when you enter a Jerusalem postal code, suddenly the delivery options vanish or the price triples. The agent has to actually test the checkout flow to verify shipping availability, not just scrape the FAQ page.
The whitelist approach Daniel mentioned becomes critical. You can't just crawl the open web and hope for the best. You need a curated list of stores where you've already verified they deliver to your area, their site works, and they're not going to pull a bait-and-switch on pricing.
The whitelist is the foundation. And it needs to be maintained actively. Israeli retailers tend to revamp their websites without warning — one day it's a working WooCommerce setup, the next day it's something custom-built that breaks every scraper you've written.
Let's talk about the browser navigation piece, because this is where it gets technically interesting. Daniel mentioned browser-use agents — AI systems that actually control a browser, clicking buttons, filling forms, reading the rendered page, rather than just making API calls. What's the state of that technology right now?
It's moving fast. The basic paradigm is you give a language model access to a browser via something like Playwright or Puppeteer, and it issues commands: click this element, type into that field, scroll down, extract text. The model sees screenshots or DOM snapshots and decides what to do next. There are open-source frameworks for this. Browser Use is one, and there's also WebVoyager from the research world. The core challenge is reliability, especially on sites the model hasn't seen before.
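A minimal sketch of that observe, decide, act loop in Python. The `Browser` interface, the `decide` callback, and the `"done"` sentinel are all hypothetical stand-ins; a real setup would wrap Playwright or a browser-use framework behind `observe` and `act`, and a language model behind `decide`.

```python
from typing import Callable, Protocol


class Browser(Protocol):
    """Hypothetical interface; a real one would wrap a browser driver."""

    def observe(self) -> str: ...          # screenshot description or DOM snapshot
    def act(self, command: str) -> None: ...  # e.g. "click #add-to-cart"


def run_agent(browser: Browser, decide: Callable[[str], str], max_steps: int = 15) -> int:
    """Loop until the model says "done"; return how many actions were taken."""
    for step in range(max_steps):
        command = decide(browser.observe())
        if command == "done":
            return step
        browser.act(command)
    # A step budget keeps a confused agent from clicking forever.
    raise RuntimeError("step budget exhausted before the task finished")
```

The step budget is a design choice rather than a requirement: on an unfamiliar site it is usually better for the agent to stop and report than to keep guessing.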
Israeli e-commerce sites are going to be mostly out-of-distribution for these models, since they've been trained overwhelmingly on English-language web layouts.
The model might know what a checkout flow looks like on Amazon dot com, but when it hits a Hebrew site where the "add to cart" button says "הוסף לסל" and is positioned somewhere unexpected, it can get confused. There's been interesting work on using vision-language models for this — instead of relying on DOM structure, you take screenshots and have the model identify buttons and form fields visually. That approach tends to be more robust across languages and layouts.
That makes intuitive sense. A human shopping on a foreign-language site figures it out by visual cues — the big colored button is probably the one you want, regardless of what it says.
That's exactly the insight. But screenshots are slower to process than DOM snapshots, they use more tokens, and you can't interact with hidden elements as easily. There's a latency cost — a few seconds per action, which adds up when a checkout flow has ten or fifteen steps.
For a procurement assistant, though, latency might be acceptable if the alternative is doing it manually. The agent runs in the background while you're doing something else. You come back and it says "here's what I found, here are the prices including shipping, do you want me to pull the trigger?"
That's the human-in-the-loop pattern, and it's absolutely the right architecture. You never want the agent autonomously spending your money. The handoff point is at purchase confirmation — the agent fills the cart, calculates total cost with shipping, presents a summary, and you approve or reject.
Where do the handoff points need to be, though? I can imagine failures at multiple stages — the agent might not find the product, might find it at a clearly wrong price, might get stuck on a CAPTCHA, might complete the purchase with the wrong shipping address.
I'd design it with explicit checkpoints. First is product discovery — the agent searches across whitelisted stores, compiles options, and presents them for review. You pick which to pursue. Second is cart assembly — once it's added items and applied any coupon codes or loyalty discounts, you verify the cart contents and price. Third is the final checkout review, including shipping cost, delivery time, and total. Only after you approve all three does payment go through.
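The three checkpoints can be sketched as a small ordered gate in Python. The class and attribute names are illustrative, and the rule that checkpoints must be approved strictly in order is an assumption layered on top of the description above.

```python
from enum import Enum


class Checkpoint(Enum):
    DISCOVERY = "discovery"  # user picks which options to pursue
    CART = "cart"            # user verifies cart contents and price
    CHECKOUT = "checkout"    # user reviews shipping, delivery time, total


ORDER = [Checkpoint.DISCOVERY, Checkpoint.CART, Checkpoint.CHECKOUT]


class PurchaseFlow:
    """Tracks approvals and refuses to skip a checkpoint."""

    def __init__(self) -> None:
        self._next = 0  # index into ORDER of the checkpoint awaiting approval

    def approve(self, checkpoint: Checkpoint) -> None:
        if self._next >= len(ORDER) or checkpoint is not ORDER[self._next]:
            raise ValueError(f"cannot approve {checkpoint.name} at this stage")
        self._next += 1

    @property
    def may_pay(self) -> bool:
        # Payment is only unlocked once all three approvals are recorded.
        return self._next == len(ORDER)
```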
The coupon code thing is interesting. Israeli stores are notorious for having discount codes only advertised on Instagram or in WhatsApp groups. An agent that only scrapes the store's own site will miss those. You'd need some kind of external deal-discovery layer.
That's a whole sub-problem. There are Israeli deal aggregators — Zap, various Telegram channels and Facebook groups — but integrating those is messy. The deals are often time-limited, store-specific, and not machine-readable. You'd almost need a separate agent just for deal discovery, then feed valid codes into the procurement agent's knowledge base.
Or you accept the agent won't always get the best possible price, and that's fine because the value proposition isn't primarily about saving money — it's about saving time and cognitive load. Shopping in Israel involves a lot of friction: multiple tabs open, comparing shipping policies, translating product descriptions, wondering if a store is legitimate. If the agent handles eighty percent of that, even missing a ten-shekel coupon, it's still a win.
That's the right framing. The agent doesn't need to be perfect — it needs to be better than the manual alternative. And the manual alternative is surprisingly bad. Israeli e-commerce conversion rates are lower than the global average, partly because people abandon carts when they hit unexpected shipping costs or confusing checkout flows. An agent that surfaces those issues early, before you've invested time building a cart, is useful.
Let's talk about the Hebrew language challenge specifically. If you ask the agent "find me a good wireless keyboard," it needs to search Hebrew terms — "מקלדת אלחוטית" — and understand the results. How well do current models handle Hebrew e-commerce queries?
Better than two years ago, but still not great. The big models like Claude, GPT, and Gemini have decent Hebrew comprehension now, but product search is a specific skill. You need to handle transliteration issues, because a lot of Israeli product listings mix English and Hebrew. A keyboard might be listed as "מקלדת wireless" or "keyboard אלחוטית." You need fuzzy matching across scripts. And Israeli stores don't use standardized taxonomies: one store calls it "ציוד היקפי," another "אביזרים למחשב," a third just dumps everything under "מוצרים."
The agent needs a translation and normalization layer before it even starts searching.
And that layer should be Hebrew-native from the start, not an afterthought bolted onto an English-language system. Search queries should be generated in Hebrew, results parsed in Hebrew, and summaries presented in whatever language the user prefers — in Daniel's case maybe English, but the underlying processing is Hebrew all the way down.
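One way to picture the normalization layer is a small glossary that maps each concept to the spellings a listing might use. The sketch below is Python; the `GLOSSARY` contents and function names are hypothetical, and exact substring matching is a simple stand-in for real fuzzy matching.

```python
from itertools import product

# Hypothetical glossary: each concept maps to spellings seen in listings.
GLOSSARY = {
    "keyboard": ["מקלדת", "keyboard"],
    "wireless": ["אלחוטית", "wireless"],
}


def query_variants(concepts):
    """Return every Hebrew/English combination to try as a search query."""
    options = [GLOSSARY[c] for c in concepts]
    return [" ".join(words) for words in product(*options)]


def matches(listing_title, concepts):
    """True if the title contains some spelling of every requested concept."""
    title = listing_title.casefold()
    return all(
        any(spelling.casefold() in title for spelling in GLOSSARY[c])
        for c in concepts
    )
```

Calling `query_variants(["keyboard", "wireless"])` yields four queries, starting with the all-Hebrew one. Hebrew gender and plural forms would need more than substring checks, which is where a Hebrew-capable model or a morphological library would come in.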
There's also a trust and verification angle. When you're building a whitelist of trusted stores, what does "trusted" actually mean? Is it about the store not being a scam? Honoring return policies?
All of the above. And in the Israeli market, there's an additional dimension — some stores are technically legitimate but have terrible customer service, or sell gray-market imports without local warranties, or list products they don't actually have in stock and take weeks to source them. A good whitelist needs to encode that reputation data.
You could pull from existing review aggregators — Zap has store ratings, there are Google Maps reviews, consumer protection forums where people report bad experiences. The agent doesn't need to scrape all of that in real time, but the whitelist curation process should incorporate it.
The whitelist should be personal. What I trust might not be what you trust. Maybe I'll buy from a store with mixed reviews if they're the only ones carrying a specific product. Maybe I have higher tolerance for slow shipping if the price is right. The agent should reflect the user's preferences, not some universal trust score.
That's where the "personal" in personal procurement assistant kicks in. It's learning your risk tolerance, your shipping preferences, your preferred stores. Over time, it gets better at predicting what you'll approve and reject.
Which brings us to architecture. Is this a standalone agent, or integrated into something larger? Daniel's background is in AI and automation, so I'm guessing he's thinking modular — an agent that plugs into existing infrastructure, maybe using something like Home Assistant.
If I were building this, I'd want it as self-contained as possible. The agent runs on a schedule or on demand, checks prices, compiles options, and sends a notification — Telegram, email, whatever. You review and approve from your phone. The actual browser automation happens on a server somewhere, not on your personal device.
That server location matters. If you're shopping on Israeli sites, you want the browser agent to appear to be in Israel. Some stores geoblock or adjust prices based on IP location. You might need a VPS in a Tel Aviv data center, or a residential proxy from an Israeli ISP.
That adds another layer of complexity — now you're managing network infrastructure, not just browser automation. For a personal project that might be overkill, but if the price discrepancies are significant enough, it could pay for itself.
Let me pull on a thread I think is underexplored — the payment problem. Israeli e-commerce payment processing is fragmented. Some stores use processors like Credorax or Yaad Sarig. Some use PayPal. Some use Bit, an Israeli peer-to-peer payment app. Some still ask for bank transfers. An agent that handles all of those is a much harder problem than one that just fills a cart.
That's where the human handoff is most essential. You don't want the agent storing your credit card details and autonomously entering them into random checkout forms. The final payment step should always be manual — the agent gets right up to the "click to pay" button and hands control to you.
There's an emerging pattern called "approval mode" — the agent does everything up to a certain point, then pauses and asks for human confirmation. Some frameworks have this built in. You configure it so certain actions always require approval — submitting a form, clicking a payment button, entering sensitive information. Everything else runs autonomously.
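A rough sketch of approval mode in Python. The action names and the `ask_user` callback are made up for illustration and do not correspond to any particular framework's API.

```python
from typing import Callable

# Hypothetical action names; a real framework would define its own.
REQUIRES_APPROVAL = {"submit_form", "click_pay", "enter_sensitive"}


def run_action(
    action: str,
    execute: Callable[[], str],
    ask_user: Callable[[str], bool],
) -> str:
    """Run an action, pausing for confirmation if it is on the approval list."""
    if action in REQUIRES_APPROVAL and not ask_user(f"Allow '{action}'?"):
        return "blocked"
    return execute()
```

Everything outside `REQUIRES_APPROVAL` runs without interruption, so the list itself becomes the security policy.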
That feels like the right security model. But it also means the agent needs to be good at explaining what it's about to do. "I'm going to buy this keyboard for one hundred eighty-nine shekels including shipping from K." That summary has to be accurate, because if the agent misrepresents the total cost and you approve based on bad information, you're going to be unhappy.
Summarization quality has improved dramatically in the last year. Models are much better at extracting structured information from messy web pages. You can ask for a table with store name, product name, price, shipping cost, delivery time, and total — and the model will populate it correctly most of the time. The failures tend to be edge cases — sale prices not reflected until checkout, shipping costs that depend on cart value, loyalty discounts requiring a membership number.
Those edge cases are exactly where the agent needs to be conservative. If it can't determine the final price with high confidence, it should flag that uncertainty rather than guessing. "The listed price is one hundred fifty shekels, but shipping appears to vary by location — I couldn't confirm the exact cost for your address." That's useful even if incomplete.
That gets to a broader design principle: the agent should be transparent about its confidence, not in a technical "confidence score point seven three" way, but in natural language. "I'm fairly sure about this," versus "this part is unclear, you might want to check."
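A small Python sketch of flagging uncertainty in plain language. The `Quote` fields are hypothetical, and using `None` to mean "could not confirm" is one possible convention among several.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Quote:
    store: str
    product: str
    price_ils: float
    shipping_ils: Optional[float]  # None means the agent could not confirm it


def summarize(q: Quote) -> str:
    """Describe the quote, admitting when the shipping cost is unknown."""
    if q.shipping_ils is None:
        return (
            f"{q.product} at {q.store}: listed at {q.price_ils:.0f} shekels, "
            "but I couldn't confirm shipping for your address."
        )
    total = q.price_ils + q.shipping_ils
    return f"{q.product} at {q.store}: {total:.0f} shekels including shipping."
```

The point of the `None` branch is that the agent never adds a guessed shipping figure into a total the user might approve.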
Let's step back and talk about the procurement research angle, because Daniel specifically mentioned "intelligent purchasing research." That's more than price comparison — it's understanding product quality, reading reviews, comparing specifications across models. Can an AI agent actually do that well?
For certain categories, yes. Electronics is probably the easiest — specs are standardized, reviews are abundant, and there's lots of training data. The agent can pull spec sheets, aggregate review scores, and flag common complaints. For furniture or clothing, it's harder — quality is subjective, sizing is inconsistent, and reviews are often in Hebrew with a lot of slang and cultural context.
Israeli product reviews are their own genre. You'll get a five-star review that says "arrived on time, haven't opened the box yet." You'll get a one-star review that's actually about the shipping company, not the product. You'll get reviews that are just emojis. Filtering signal from noise in that environment is hard.
That's where a whitelist of trusted stores partially solves the problem. If you're buying from a store you know carries quality products and honors returns, the individual product review matters less. You can rely on the store's curation rather than trying to parse three hundred Hebrew reviews of varying quality.
There's also the question of what the agent does when it can't find what you're looking for. If you ask for a specific keyboard model and none of the whitelisted stores carry it, does the agent suggest alternatives? Expand the search to non-whitelisted stores with a warning? Tell you to try Amazon and pay international shipping?
I'd want it to do all three, in order. First, suggest similar products from whitelisted stores. "I couldn't find the Logitech M-X Keys, but K. has the M-X Keys Mini for three hundred fifty shekels, and Ivory has the full-size version in a different color." Then, if you're not satisfied, offer to search beyond the whitelist with a caveat about trust. Finally, show international options with total cost including shipping and import duties.
Import duties are a huge factor for Israeli shoppers. Anything over seventy-five dollars in declared value gets hit with VAT plus customs processing fees, which can add thirty to forty percent. An agent that calculates the all-in cost, including those duties, is providing real value over just showing the sticker price on Amazon.
The duty calculation is non-trivial. It depends on product category — books are exempt, electronics have a specific rate, clothing differs from shoes. The agent would need to classify the product correctly and look up the current customs schedule. That's a perfect use case for an AI agent — a structured but tedious lookup that a human would find annoying but a machine can do in seconds.
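The all-in calculation might look roughly like this in Python. Every rate is passed in by the caller because the real values depend on the product category and the current customs schedule; the function name, the choice to compare the threshold against item price alone, and the choice to charge VAT on price plus shipping plus duty are all assumptions of this sketch.

```python
def landed_cost_usd(
    price: float,
    shipping: float,
    *,
    threshold: float,
    vat_rate: float,
    duty_rate: float = 0.0,
    handling_fee: float = 0.0,
) -> float:
    """Estimate the all-in cost of an import. No rate here is official."""
    if price <= threshold:
        # Under the exemption threshold: just the sticker price plus shipping.
        return round(price + shipping, 2)
    duty = price * duty_rate
    vat = (price + shipping + duty) * vat_rate
    return round(price + shipping + duty + vat + handling_fee, 2)
```

For example, `landed_cost_usd(100, 20, threshold=75, vat_rate=0.18)` adds 21.60 in VAT to the 120 base, using illustrative numbers only.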
We've covered the technical challenges: browser navigation, geographic constraints, Hebrew language processing, payment handoff, and review analysis. What's the actual path to building this? If Daniel wants to start this weekend, what does he reach for?
I'd start with one of the browser-use frameworks — the open-source options are mature enough to be useful. Pick one whitelisted store, maybe K. since their site is relatively well-structured, and build a minimal agent that can search for a product, extract prices, and present options. Don't try to handle checkout yet — just the research phase. Get that working end-to-end, then expand to more stores and add the cart-building and checkout stages.
For the Hebrew piece, use a model demonstrated to work well with Hebrew e-commerce. Test it on a variety of product searches before committing to the architecture. The model choice matters more than the framework choice at this stage.
The other thing I'd do early is set up the geographic constraint handling. Build a simple lookup table — which stores ship to which postal codes, what the shipping costs are, what delivery times look like. That's tedious to compile but it's pure data, no AI required, and it'll save a lot of headaches later.
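That lookup table could start as a plain nested dictionary. In this Python sketch the store names, postal-code prefixes, costs, and delivery times are all invented placeholders, and prefix matching on postal codes is an assumed simplification.

```python
# Hypothetical data: stores, postal-code prefixes, and terms are made up.
SHIPPING = {
    "store_a": {
        "91": {"cost_ils": 35, "days": 3},
        "61": {"cost_ils": 20, "days": 1},
    },
    "store_b": {
        "61": {"cost_ils": 0, "days": 2},
    },
}


def shipping_options(postal_code: str) -> dict:
    """Return {store: terms} for every store whose table covers this code."""
    result = {}
    for store, zones in SHIPPING.items():
        for prefix, terms in zones.items():
            if postal_code.startswith(prefix):
                result[store] = terms
    return result
```

An empty result is itself useful: it tells the agent not to bother opening any of those stores for that address.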
Keep the whitelist small to start. Five stores you actually trust, that you've ordered from before, that you know deliver to your area. Don't try to build a comprehensive catalog of Israeli e-commerce — that's a quagmire.
The nice thing about starting small is you can be opinionated about quality. Every store on the whitelist earns its place. If a store starts slipping — late deliveries, bad customer service — it comes off the list. The whitelist is a living document.
There's an interesting knock-on effect here. If personal procurement agents become common, what does that do to Israeli e-commerce? Stores with clean, well-structured websites and reliable shipping get more business because agents can navigate them easily. Stores with broken checkout flows and inconsistent pricing get filtered out.
It creates selection pressure toward better web infrastructure, which Israeli e-commerce honestly needs. Too many stores treat their website as an afterthought — they have a physical location that does most of their business, and the online store is a grudging concession. An agent-driven market rewards the stores that invest in their digital presence.
It changes the power dynamic between consumers and retailers. Right now, comparing prices across five Israeli stores for a specific product is a pain. You have to open five tabs, navigate five different search interfaces, calculate five different shipping costs. Most people check one or two stores and call it a day. An agent that does the comparison automatically gives the consumer more information and more leverage.
The flip side is that stores might start trying to block agents. We're already seeing this with travel booking sites and ticket vendors — they deploy anti-bot measures that make automated browsing difficult. If procurement agents become widespread, Israeli retailers might follow suit, and you'd end up in an arms race.
Which is why the whitelist approach is actually a defense against that. If you're only shopping at stores you have a relationship with, and you're not hammering their servers, they have less incentive to block you. The agent can be a good citizen — rate-limited, respectful of robots dot txt, only checking prices once a day or on demand.
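Being a good citizen can be sketched with Python's standard `urllib.robotparser` plus a minimum interval between requests. The `PoliteFetcher` class, its parameters, and the choice of one shared interval per store are assumptions of this sketch; it only decides whether a request is allowed and leaves the actual fetching to the caller.

```python
import time
from typing import Optional
from urllib.robotparser import RobotFileParser


class PoliteFetcher:
    """Decides whether a request is allowed; does not perform the request."""

    def __init__(self, robots_txt: str, user_agent: str, min_interval_s: float):
        self._robots = RobotFileParser()
        self._robots.parse(robots_txt.splitlines())
        self._agent = user_agent
        self._min_interval = min_interval_s
        self._last: Optional[float] = None  # time of the last allowed request

    def allowed(self, url: str, now: Optional[float] = None) -> bool:
        """True if robots.txt permits the URL and enough time has passed."""
        now = time.monotonic() if now is None else now
        if not self._robots.can_fetch(self._agent, url):
            return False
        if self._last is not None and now - self._last < self._min_interval:
            return False
        self._last = now
        return True
```

Passing `now` explicitly makes the rate limit easy to test; in normal use it can be left out.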
There's a world where stores eventually embrace this. Imagine a store providing a lightweight API for procurement agents — just a product availability and price endpoint that agents can query without scraping the full page. It reduces server load for the store and gives the agent more reliable data.
That's optimistic, but I can see it happening with forward-thinking retailers. already has a pretty good API for their mobile app. Other stores might follow if there's demand.
The demand would come from exactly the kind of system Daniel's describing. If enough technically sophisticated shoppers start using agents, and those agents drive purchase decisions, stores will adapt.
We should talk about failure modes, because this kind of system has some interesting ones. What happens when the agent confidently buys the wrong thing? You asked for a wireless keyboard and it bought a wireless mouse because the Hebrew product description was ambiguous. Who's responsible?
The user is responsible, ultimately. The agent is a tool, not a decision-maker. But a well-designed agent minimizes those failures through confirmation checkpoints and clear summarization. If the agent shows you a picture of the product and the price before you approve, and you approve without looking carefully, that's on you.
What about price changes between cart assembly and checkout? Israeli stores sometimes adjust prices in real time. The agent builds a cart at one price, you approve, and by the time it goes to checkout, the price has gone up.
That's a real problem, and it's why the checkout review checkpoint is essential. The agent should capture the final total immediately before asking for payment approval, not rely on the price it saw during cart assembly. If the price changed, it should flag that: "the total is now two hundred ten shekels, up from one hundred eighty-nine when I built the cart."
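A minimal Python sketch of that re-check. The function name is hypothetical, and it assumes both totals are already in shekels and were read by the agent at the two points described.

```python
from typing import Optional


def price_change_notice(cart_total: float, checkout_total: float) -> Optional[str]:
    """Return a warning if the total changed since cart assembly, else None."""
    if checkout_total == cart_total:
        return None
    direction = "up" if checkout_total > cart_total else "down"
    return (
        f"The total is now {checkout_total:.0f} shekels, "
        f"{direction} from {cart_total:.0f} when I built the cart."
    )
```

Returning `None` when nothing changed lets the caller skip the warning entirely and go straight to the approval prompt.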
There are weirder failure modes. What if the agent accidentally buys something from a store on a boycott list? Israel has a politically charged consumer landscape — some people won't shop at certain chains for political or religious reasons. The whitelist needs to respect those preferences.
That's where personalization really matters. The agent isn't a universal shopping tool — it's your shopping tool. It reflects your values, your preferences, your red lines. If you never want to buy from a particular chain, the agent should never suggest it. If you only want products with a certain kosher certification, the agent should filter for that.
This is getting into territory where the agent needs to understand context not explicitly stated on the product page. Kosher certification symbols are usually on the physical packaging, not in the online listing. The agent might need to cross-reference external databases or ask the user to verify.
Or it can flag uncertainty. "This product might be kosher-certified, but the listing doesn't specify. Check the packaging when it arrives." That's still useful — it surfaces the issue rather than ignoring it.
Let's bring this back to something practical. Daniel's in Jerusalem, which has its own shipping quirks. Some stores treat Jerusalem as a standard delivery zone, some charge extra, some won't deliver certain items because of access restrictions in older neighborhoods. An agent tuned to Jerusalem-specific delivery constraints is going to be more useful than a generic Israeli shopping agent.
The Jerusalem piece is interesting because it's not just about postal codes. Some couriers won't enter certain neighborhoods. Some buildings don't have elevators and the delivery person won't carry a refrigerator up four flights of stairs. These are things a local knows but an agent wouldn't unless you tell it.
The agent needs a configuration layer where you specify those constraints. "My building has an elevator, so large deliveries are fine." "My street is accessible to delivery trucks." "I'm usually home between two and four." The more context you give it, the better it can filter options.
That configuration is a one-time investment. You set it up once, and the agent uses it for every purchase. Over time, it learns — if you consistently reject options from a store that uses a particular courier, the agent infers you don't like that courier and stops suggesting it.
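The configuration layer could be as small as two dataclasses and a filter. In this Python sketch the field names, the `bulky` flag, and the rule that bulky items need both an elevator and truck access are illustrative assumptions, not a complete model of delivery constraints.

```python
from dataclasses import dataclass, field


@dataclass
class DeliveryProfile:
    has_elevator: bool
    truck_accessible: bool
    # Couriers the user has ruled out, or that the agent inferred from rejections.
    blocked_couriers: set = field(default_factory=set)


@dataclass
class Offer:
    store: str
    courier: str
    bulky: bool  # e.g. a refrigerator


def acceptable(offer: Offer, profile: DeliveryProfile) -> bool:
    """Filter out offers that conflict with the user's delivery constraints."""
    if offer.courier in profile.blocked_couriers:
        return False
    if offer.bulky and not (profile.has_elevator and profile.truck_accessible):
        return False
    return True
```

Learning over time then reduces to adding entries to `blocked_couriers` once the user has rejected the same courier often enough.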
We've been talking about this as a consumer tool, but there's a business angle too. Daniel works in AI and automation — is there a product here? A procurement agent tailored to the Israeli market?
The Israeli market is probably too small to support a venture-scale business around this specific use case. But as a feature within a larger personal AI assistant, it makes a lot of sense. The same agent that manages your shopping could manage your calendar, your email, your home automation. Shopping is just one capability among many.
The Israel-specific lessons — Hebrew language handling, right-to-left layout navigation, geographic constraint management — those generalize to other markets with similar challenges. Arabic-speaking markets, for example. Or any country where e-commerce infrastructure is fragmented and language support is uneven.
That's the open-source argument. Build it, document it, share it. Other people in similar situations can adapt it. The core architecture — whitelisted stores, browser-use agent, human-in-the-loop approval, geographic constraints — is applicable anywhere.
Now: Hilbert's daily fun fact.
The average cumulus cloud weighs about one point one million pounds — roughly the same as one hundred elephants — and yet it floats effortlessly because the weight is spread across millions of tiny water droplets over a vast area.
What should a listener actually do if they want to build something like this? Start with the whitelist. Sit down and make a list of the five to ten stores you actually trust, that ship to your address, that carry the kinds of products you regularly buy. That list is the foundation everything else builds on.
Second, pick one browser-use framework and get it working with a single store. Don't try to boil the ocean. Search for a product, extract the price, present it. That's your minimum viable agent. Once that works, add the second store, then the third. You'll learn where the failures happen and what needs to be hardened.
Third, design your handoff points before you write any code. Where does the agent pause and ask for approval? What information does it present at each checkpoint? Getting the human-in-the-loop design right is more important than optimizing the automation.
Fourth, be realistic about the Hebrew challenge. Test your chosen model on Hebrew product searches early. If it's struggling, you might need to add a translation layer or use a different model for the Hebrew parts of the pipeline.
The broader question this raises is what happens when procurement agents become commonplace. Right now, online shopping is designed for human attention — flashy banners, urgency tactics, dark patterns that nudge you toward purchases you didn't plan to make. An agent shopping on your behalf is immune to all of that. It sees prices and specifications, not marketing.
That's transformative. The entire edifice of e-commerce marketing — the scarcity timers, the "only two left in stock," the recommended products, the loyalty discounts that expire in twenty minutes — all of it is designed to manipulate human psychology. An agent just ignores it and does the math.
Which means the stores that win in an agent-driven market are the ones with good products and fair prices, not the ones with the most aggressive conversion optimization. That's a market I'd rather shop in.
Thanks to our producer Hilbert Flumingtop for the cloud fact and for keeping this operation running. This has been My Weird Prompts. You can find every episode at myweirdprompts dot com or wherever you get your podcasts.
If you're building something like this, we'd love to hear about it. Until next time.