Kris McDermott of Spark Foundry got my vote for line of the day: "We 'did AI' to what end? That's not a verb."
But that’s what every marketer and retailer is now hearing from solution providers: We’ll just “AI” it.
As Kris said, AI exposes the cracks in your organization; it does not heal them.
That was the throughline at Xnurta's Signal to Scale Summit on Tuesday this week. Every vendor across the ecosystem is selling agentic workflows of some kind. And brands are motivated to adopt them. I heard from one attendee that, for better or worse, their leadership is asking for every new initiative to include how AI will be used.
That mandate means that brands generally want to get smarter on agentic workflows. And this event helped to put some of that theory into practice.
Real use cases for agentic media buying
The clearest use cases came from Sha Atakhanov, VP, Marketing & Ecommerce at Boiron, the homeopathy and OTC brand. He runs retail media with two FTEs across roughly 50,000 keywords. "They can't possibly optimize all 50,000 keywords even if they work 24/7," he said. AI isn't a productivity gain there — it's the only way the work gets done at all.

The second use case is creative. Boiron is using an AI tool to score product and lifestyle images for conversion potential. On one cold product for babies, they kept iterating images against the model's score until they hit 90%. The conversion outcome:
"The sales increase 29% week over week." The funny detail was that the model kept downgrading an image with a mom holding the baby. They tried it without her, and the score jumped. Sales followed.
Sha also gave the cleanest decision framework for picking AI tools. He checks each tool against three criteria: does it (1) contribute to growth, (2) save time or increase productivity, and (3) reduce cost? Anything that doesn't do at least two of those, you don't buy. And critically, guardrails are part of the spec: "With AI you have to have good compliance, good boundaries. You can customize the AI and kind of set your goals and specific objectives. Because if you don't, the AI could do optimization on its own and might not be contributing to your profitability."
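For readers who think in code, here is a minimal sketch of that two-of-three rule. The class name, field names, and example tool are hypothetical, made up for illustration; they are not anything Boiron shared.

```python
from dataclasses import dataclass


@dataclass
class ToolAssessment:
    """One AI tool under review. All field names are illustrative assumptions."""
    name: str
    contributes_to_growth: bool
    saves_time_or_boosts_productivity: bool
    reduces_cost: bool


def worth_buying(tool: ToolAssessment) -> bool:
    """Apply the two-of-three rule: buy only if at least two criteria are met."""
    criteria_met = sum([
        tool.contributes_to_growth,
        tool.saves_time_or_boosts_productivity,
        tool.reduces_cost,
    ])
    return criteria_met >= 2


# Hypothetical example: a tool that saves time but neither grows sales nor cuts cost.
image_scorer = ToolAssessment(
    name="example-image-scorer",
    contributes_to_growth=False,
    saves_time_or_boosts_productivity=True,
    reduces_cost=False,
)
print(worth_buying(image_scorer))
```

The rule is deliberately blunt. It says nothing about how much growth or how much saving, only whether a tool clears two of the three bars.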

Yi Chiang at Better Being Co was the other brand voice that stood out, particularly on measurement. He said that ROAS alone is an efficiency metric, not the result he's trying to achieve. He talked about share-of-voice and search query performance as their most underutilized levers, and about cutting through dashboard noise by sorting metrics into inputs versus outputs. Not a flashy answer. A practitioner's answer.
And Alex Lans at Freebird is one of the brand-side practitioners actually using agentic ad buying in production today — not piloting, running. His mental model is worth stealing: think of it as the difference between an automatic car and a stick shift. The car drives either way. But if something goes wrong, the person who knows how gears work is the only one who can drive.
The Amazon backdrop
None of this is happening in a vacuum. Liz Joyce, Technical Product Marketing Lead at Amazon, gave a quick tour of the substrate. The Amazon Ads MCP server is live. The "skills" Amazon released in March give agencies a way to handle more complex scenarios. And the same infrastructure that powers Amazon's own ads agent now lets brands plug their own in. But she also cautioned that "Too many signals are not better."
Her advice: start by setting up AI evals (more on this later), then map your workflows and figure out what's high-frequency and repeatable versus what always needs a human.
Agent evals as a source of truth
This is where the day really got interesting. Xnurta launched Insight Agents — a redesigned agent that handles open-ended queries against Amazon Ads accounts (data pulls, audits, root-cause diagnosis, recommendations).
But the real banger was a 100+ question “eval framework.” For those unfamiliar, an eval (or evaluation) is a structured way to test whether an AI system is actually performing well against the outcomes you care about, using defined tasks, scoring criteria, and benchmarks.
Xnurta shared how its agent compared against three frontier AI models connected directly to Amazon Ads via the MCP server. On data accuracy, the frontier models scored in the 50% range. Xnurta's purpose-built agent scored 80%+.
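To make the idea of a data-accuracy eval concrete, here is a minimal sketch of the general shape: a fixed set of questions with known answers, a tolerance for what counts as correct, and a score. This is not Xnurta's framework. The questions, numbers, and the stub agent are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    """A question with a known-correct numeric answer (all values hypothetical)."""
    question: str
    expected: float
    rel_tolerance: float = 0.01  # answer must land within 1% of expected


def run_eval(agent: Callable[[str], float], cases: list[EvalCase]) -> float:
    """Return the share of cases the agent answers within tolerance."""
    passed = 0
    for case in cases:
        answer = agent(case.question)
        if abs(answer - case.expected) <= case.rel_tolerance * abs(case.expected):
            passed += 1
    return passed / len(cases)


# Hypothetical ground truth, e.g. figures pulled by hand from an ads console.
CASES = [
    EvalCase("What was total ad spend last week?", 1200.0),
    EvalCase("How many clicks did campaign A get yesterday?", 340.0),
    EvalCase("What was ACOS for campaign B last month?", 0.25),
]

# Stand-in for a real agent call: a lookup table with one wrong answer.
STUB_ANSWERS = {
    "What was total ad spend last week?": 1200.0,
    "How many clicks did campaign A get yesterday?": 355.0,  # off by more than 1%
    "What was ACOS for campaign B last month?": 0.25,
}


def stub_agent(question: str) -> float:
    return STUB_ANSWERS[question]


accuracy = run_eval(stub_agent, CASES)
print(f"data accuracy: {accuracy:.0%}")
```

A real framework would also need scoring for open-ended output like audits and recommendations, which is much harder to grade than a number. Swapping `stub_agent` for a function that calls an actual agent is what lets the same question set compare different systems.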
But Xnurta says it doesn’t just want to grade its own homework. It is open-sourcing the eval framework and launching the Agentic Retail Media Council — a panel of industry experts (including competitors, explicitly invited) who will help evolve it. CEO Kashif Zafar said this is version one. “We built it because somebody had to start. Version two will be built with the industry."
This matters because Liz Joyce from Amazon gave that specific advice earlier: "set up your evals." But there's a chicken-and-egg problem. Most brands don't have the engineering muscle to build agent evals from scratch. An open, industry-built framework is the closest thing to a shared yardstick I've seen proposed. If the rest of the category shows up to it, this is a real piece of infrastructure. If they don't, it's a marketing wrapper. Worth watching closely.
If you're interested in joining the Agentic Retail Media Council, Xnurta is recruiting members. Register your interest here.
I was a paid speaker at Xnurta’s Signal to Scale Summit. Opinions are my own.

