Landscape · Jun 28, 2026 · 9 min read

What AI agents can and can't do inside your ESP (yet)

Brevo and ActiveCampaign both wired up MCP servers this year, so you can finally point an AI agent at your email account. It will send, tag, and segment all day. Ask it why last quarter worked and it goes quiet, because behind the ESP door is the same thin slice of data the dashboard always showed you. An agent is only as good as the data it can reach.

Thierry from Sendlens, Founder

Image: A person sitting beside a homemade cardboard robot with a worried face, a nod to AI agents that can act on command but can't yet answer the harder questions

This year, two of the platforms we connect to did something more interesting than ship a feature. They opened a door.

Brevo and ActiveCampaign both released MCP servers. If you haven't hit that acronym yet, the Model Context Protocol is the thing that lets an AI agent (Claude, ChatGPT, Cursor, whatever you use) reach into a tool and act on your behalf. Anthropic introduced it in late 2024. By the end of 2025 it had been handed to a Linux Foundation body backed by Anthropic, OpenAI, Google, and Microsoft, with more than ten thousand public servers in the wild. It has gone from a research curiosity to the default way software lets agents in.

So your ESP opened a door for your agent. The honest question is what's on the other side of it.

What the door actually opens onto

Connect Claude to Brevo's MCP server and it can do a lot. Create and update contacts, build and clean lists, manage segments, draft and schedule email, SMS, and WhatsApp campaigns, move deals through a pipeline, manage senders and domains, kick off a bulk import. The tools are generated straight from Brevo's API, so the agent inherits roughly the surface the API has always had: more than two dozen modules, nearly all of them operational. There is one module for analytics. It returns the same aggregated campaign stats the dashboard already shows you.

ActiveCampaign's MCP server has the same shape. Forty-odd tools: read contacts, add tags, add and remove people from automations, create and update deals, pull a contact's opens and clicks. ActiveCampaign even claims theirs is the only one that can move a contact in and out of a live automation. What it does not have is a single reporting or analytics tool. The closest thing is a raw feed of engagement rows that the agent would have to add up itself.
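To make "add up itself" concrete, here is a minimal sketch of the tallying an agent would have to do over a raw engagement feed. The row shape and field names (`contact_id`, `campaign_id`, `event`) are assumptions for illustration, not ActiveCampaign's actual response format.

```python
from collections import Counter

# Hypothetical engagement rows; the field names are illustrative assumptions.
rows = [
    {"contact_id": 1, "campaign_id": "c1", "event": "open"},
    {"contact_id": 1, "campaign_id": "c1", "event": "click"},
    {"contact_id": 2, "campaign_id": "c1", "event": "open"},
    {"contact_id": 2, "campaign_id": "c2", "event": "open"},
]


def totals_by_campaign(rows):
    """Count events per (campaign_id, event type) pair."""
    counts = Counter((r["campaign_id"], r["event"]) for r in rows)
    return dict(counts)


totals = totals_by_campaign(rows)
```

The counting is trivial. The hard part is that the agent only ever sees whatever rows the feed still returns, so the totals are only as complete as the feed.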

Neither company is the villain here; this is the pattern everywhere. The MCP servers that Stripe, Slack, Atlassian, and GitHub shipped look the same: create an invoice, send a message, open an issue, update a record. Across the industry, the door agents get is the operational one. Buttons, not memory.

Ask it the one question that matters

Watch the agent work and the split shows up inside a minute.

Operational requests, it nails. "Add everyone who clicked the spring preview to the launch segment." Done. "Schedule the newsletter for Tuesday at nine and hold the promo until the sale goes live." Done. This is genuinely useful, and if that's all you wanted, the MCP server delivers.

Now ask it something an operator actually asks in a Monday review:

Across the last six months of newsletters, which subject pattern is quietly decaying, and is the drop real or just seasonal?
Our welcome flow converts at four percent. Is step two pulling its weight, or is step five doing all the work?
Of the forty campaigns we sent this quarter, which five beat our own trailing average by a meaningful margin, and what did they share?

The agent stalls. The intelligence is all there; what's missing is anything behind the door for it to reach. The aggregated stats it can pull tell it what the open rate was, not which campaigns shared a fingerprint, not how a pattern moved across two quarters, not where in a sequence the revenue actually comes from. The model is fine. The data underneath it is thin.
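The third question is a good example of how small the computation is once the history exists. A rough sketch, assuming you already hold one click rate per campaign in send order; the window size, the ten percent margin, and the sample rates are all made-up values for illustration.

```python
from statistics import mean


def beats_trailing_average(rates, window=5, margin=0.10):
    """Return indexes of campaigns whose rate exceeds the mean of the
    previous `window` campaigns by more than `margin` (relative)."""
    winners = []
    for i in range(window, len(rates)):
        baseline = mean(rates[i - window:i])
        if rates[i] > baseline * (1 + margin):
            winners.append(i)
    return winners


# Illustrative click rates, oldest campaign first.
click_rates = [0.020, 0.022, 0.021, 0.019, 0.020, 0.030, 0.021, 0.020]
winners = beats_trailing_average(click_rates)
```

Nothing here is hard for a model. It just needs the per-campaign series to exist, and a single blended open rate does not contain it.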

An agent is only as good as the data it can reach

This is the part the demos skip. When an agent fails at a real question, the instinct is to reach for a better model. It almost never helps. The bottleneck is the data the model can reach, and a smarter model reaches the same thin slice.

A frontier model with access to three months of pre-totaled metrics will give you a confident, well-written answer about three months of pre-totaled metrics. It cannot tell you about the campaign you sent last September, because that data isn't there to reach. It cannot re-segment a number that was already collapsed into an average, because aggregation is a one-way door: once a figure is rolled up, you can't pull it back apart to ask a different question.
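A tiny illustration of the one-way door, using invented numbers: two very different histories that collapse to the same blended click rate. Given only the blended figure, there is no way to tell which one you had.

```python
def blended_rate(segments):
    """Overall click rate from a list of (sends, clicks) pairs."""
    sends = sum(s for s, _ in segments)
    clicks = sum(c for _, c in segments)
    return clicks / sends


# Two made-up histories, each a list of (sends, clicks) per segment.
history_a = [(1000, 50), (1000, 10)]  # one strong segment, one weak one
history_b = [(1000, 30), (1000, 30)]  # two identical segments

rate_a = blended_rate(history_a)
rate_b = blended_rate(history_b)
```

Both come out at three percent. The segment-level story is gone from the total.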

Worse, a pre-rolled number can be quietly wrong about direction. Simpson's paradox is the classic case: a trend that holds inside every segment can reverse when you blend the segments together. An agent handed only the blended figure can state the exact opposite of what's true, in the same calm, decimal-precise voice it uses for everything else. It has no way to know, because it never saw the grain.
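Here is a worked instance with invented numbers. Subject pattern A beats pattern B inside both the engaged and the cold segment, yet B looks better once the segments are blended, because B was mostly sent to engaged readers. The segment names and counts are assumptions chosen to show the effect.

```python
# (sends, opens) per subject pattern and audience segment; invented numbers.
data = {
    "A": {"engaged": (100, 30), "cold": (900, 90)},
    "B": {"engaged": (900, 225), "cold": (100, 5)},
}


def rate(sends, opens):
    """Open rate for one (sends, opens) pair."""
    return opens / sends


def blended(pattern):
    """Open rate for a pattern with all segments pooled together."""
    sends = sum(s for s, _ in data[pattern].values())
    opens = sum(o for _, o in data[pattern].values())
    return opens / sends


# A wins in each segment: 30% vs 25% engaged, 10% vs 5% cold.
a_wins_every_segment = all(
    rate(*data["A"][seg]) > rate(*data["B"][seg]) for seg in data["A"]
)
# Pooled, B wins: 23% vs 12%.
b_wins_blended = blended("B") > blended("A")
```

An agent that only receives the two pooled rates would recommend B. An agent that can see the segments would recommend A.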

The data underneath an ESP was never built to be reached this way

Why is the data thin? Not, mostly, because anyone is hiding it. Because of what an ESP is.

An ESP's job is the send. Everything is organized around getting an email into an inbox at the right time, and the best ones are very good at it. Storage follows the job. The platform keeps enough history to operate and to show you this month's report, and then it lets the rest go.

You can see the shape of it in the APIs. Brevo's event endpoints will hand you genuine per-recipient events, but only inside a rolling ninety-day window, with no way to backfill what came before; at scale, events older than two years are deleted outright. ActiveCampaign defaults its campaign and automation reports to the last week or two, splits campaign totals awkwardly across two API versions, and caps you at five requests a second while you try to pull anything large. Both will stream new events to a webhook if you set one up, but a webhook only ever points forward. None of this is unusual. It's what storage looks like when a product is built to deliver mail rather than remember it.
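For a feel of what pulling anything large looks like under a cap of five requests a second, here is a minimal throttled pager. `fetch_page` is a hypothetical callable standing in for whatever API client you use; it is not a real Brevo or ActiveCampaign function, and the fixed pause is a deliberately conservative way to stay under the limit.

```python
import time


def pull_all(fetch_page, max_per_second=5, sleep=time.sleep):
    """Call fetch_page(offset) until it returns an empty list, pausing
    after each call so the request rate stays at or below max_per_second."""
    min_interval = 1.0 / max_per_second
    events, offset = [], 0
    while True:
        page = fetch_page(offset)  # hypothetical: returns a list of events
        if not page:
            return events
        events.extend(page)
        offset += len(page)
        sleep(min_interval)
```

A quick way to try it without a network is to page over a local list, for example `pull_all(lambda off: data[off:off + 100])`. Note what the loop cannot do: it can only walk whatever the endpoint still holds, so anything outside the retention window never shows up, however patiently you page.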

Are they protecting their turf? Sometimes. But that's not the main story.

It's tempting to read all of this as defensiveness, and the suspicion isn't baseless. There is a real industry pattern of platforms tightening access right as agents arrive. Salesforce changed Slack's API terms in 2025 to block bulk export and ban training models on Slack data. SAP published a policy in 2026 barring third-party AI agents from orchestrating calls against your own SAP data. When the value starts moving to the agent layer, the platforms that hold the data have an obvious incentive to control the gate.

For email specifically, though, the simpler explanation holds. Most ESPs do expose your raw events. The frictions are narrower and more mundane: short date windows, low rate limits, analytics gated to the upper plans, history that ages out. You don't need a conspiracy to explain a sending platform that stores like a sending platform. It's by design, not by accident, and the design was never analytics.

Chat-with-your-data doesn't close the gap

Both platforms have also bolted a conversational layer on top. Brevo's Aura includes an AI Data Analyst; ActiveCampaign has a reporting co-pilot. You ask in plain language, "which subject lines drove the most clicks last quarter," and it answers without you building a report.

These are nice, and they make asking easier. The answers stay shallow, though, because they're reading the same aggregated tables the dashboard reads, and the good versions are reserved for the upper plans. Ask a question the underlying report can't express and you get a polite version of nothing, or worse, a confident number with no grain behind it.

And the obvious next thought, "fine, point the agent at the raw database and let it write its own queries," is harder than it sounds. On clean teaching schemas, today's models turn plain English into SQL with around ninety percent accuracy. On enterprise-realistic schemas with hundreds of columns and ambiguous names, accuracy on the same kind of benchmark falls to roughly twenty percent. Raw access is only the start. The data still has to be modeled, named, and structured before a machine can reason over it without making things up.

The fix is a substrate, not a smarter bot

Put those two facts together. An agent is only as good as the data it can reach, and reaching well means the data has to be both granular and structured. That is a description of a data warehouse, and a dashboard cannot be retrofitted into one.

This is the part of Sendlens that took us a while to say out loud, because it sounds less exciting than it is. Sendlens gives every marketer their own marketing data warehouse. We connect to your ActiveCampaign, your Brevo, whatever you send with, and we capture your events as they happen and keep them, at the lowest grain, for as long as you want them, normalized so a campaign in one tool is comparable to a campaign in another. We add the structured fingerprint on top, so every send is described consistently and can be grouped, filtered, and compared.
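To show what "normalized" means in practice, here is a small sketch of mapping two differently shaped payloads onto one event record. This is not Sendlens's actual schema: the `Event` fields, the `tool_a` and `tool_b` payload shapes, and the alias table are all hypothetical and exist only to illustrate the idea.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Event:
    """One normalized engagement event (hypothetical schema)."""
    source: str
    campaign_id: str
    recipient: str
    kind: str  # "open" or "click"
    at: datetime


# Map each tool's vocabulary onto one shared set of event kinds.
KIND_ALIASES = {"opened": "open", "open": "open",
                "clicked": "click", "click": "click"}


def from_tool_a(raw: dict) -> Event:
    """Hypothetical tool A: flat payload, epoch-seconds timestamp."""
    return Event(
        source="tool_a",
        campaign_id=str(raw["camp_id"]),
        recipient=raw["email"].lower(),
        kind=KIND_ALIASES[raw["event"]],
        at=datetime.fromtimestamp(raw["ts"], tz=timezone.utc),
    )


def from_tool_b(raw: dict) -> Event:
    """Hypothetical tool B: nested contact, ISO 8601 timestamp."""
    return Event(
        source="tool_b",
        campaign_id=str(raw["campaign"]),
        recipient=raw["contact"]["email"].lower(),
        kind=KIND_ALIASES[raw["type"]],
        at=datetime.fromisoformat(raw["date_time"]),
    )


a = from_tool_a({"event": "opened", "email": "Ana@example.com",
                 "camp_id": 12, "ts": 1750000000})
b = from_tool_b({"type": "open", "contact": {"email": "ana@example.com"},
                 "campaign": "12", "date_time": "2025-06-15T15:06:40+00:00"})
```

Once both tools land in the same shape, "an open" means the same thing everywhere, and a comparison across tools is a filter instead of a reconciliation project.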

That changes what a question costs. "Which subject pattern is decaying across six months" stops being a project and becomes a query, because the six months are actually there and the patterns are actually labeled. "Compare this newsletter to our own newsletters, not to a flash sale" stops being a spreadsheet and becomes a cluster. And the recommendation that comes back is computed from your real history instead of narrated from this month's averages.
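As a rough picture of "becomes a query", here is a sketch against an in-memory SQLite table. The table name, columns, pattern labels, and numbers are invented for illustration, and the table is simplified to one row per pattern per month; the point is only that once history is kept and labeled, the decay check is a GROUP BY plus a comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sends (month TEXT, subject_pattern TEXT, "
    "delivered INTEGER, opens INTEGER)"
)
# Invented history: three months, two labeled subject patterns.
conn.executemany("INSERT INTO sends VALUES (?, ?, ?, ?)", [
    ("2026-01", "question", 1000, 300),
    ("2026-02", "question", 1000, 250),
    ("2026-03", "question", 1000, 200),
    ("2026-01", "number", 1000, 220),
    ("2026-02", "number", 1000, 225),
    ("2026-03", "number", 1000, 230),
])

rows = conn.execute("""
    SELECT subject_pattern, month,
           1.0 * SUM(opens) / SUM(delivered) AS open_rate
    FROM sends
    GROUP BY subject_pattern, month
    ORDER BY subject_pattern, month
""").fetchall()


def decaying(rows):
    """Patterns whose open rate falls in every consecutive month."""
    by_pattern = {}
    for pattern, _month, open_rate in rows:
        by_pattern.setdefault(pattern, []).append(open_rate)
    return [p for p, r in by_pattern.items()
            if all(later < earlier for earlier, later in zip(r, r[1:]))]


decaying_patterns = decaying(rows)
```

This deliberately skips the seasonal half of the question, which would need at least a year of the same data to compare against. That is an argument for the long horizon, not against the query.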

It's also, not by coincidence, the right thing to point an agent at. The same granular, labeled, long-horizon store that lets a person answer a hard question is exactly what an agent needs to answer it without guessing. The ESP gave your agent the buttons. A warehouse is where the memory lives.

The agents are coming into email whether or not the analytics is ready for them. That much is settled. The open question is what you let them reach into. Point them at the ESP's MCP server and they'll run your sends beautifully and go quiet the moment you ask why. Point them at a store that actually remembers, and the why becomes answerable.

You don't rip out your ESP to get there. Keep the sender you have. Add the lens, and the memory behind it.

Frequently asked questions

Does Brevo have an MCP server?

Yes. Brevo released an official, EU-hosted MCP server that lets AI agents like Claude act on your account. Its tools are generated from Brevo's API and are almost entirely operational: managing contacts, lists, campaigns, and deals. One module exposes campaign analytics, and it returns the same aggregated stats the Brevo dashboard already shows.

Does ActiveCampaign have an MCP server?

Yes. ActiveCampaign ships an official remote MCP server that connects to Claude, ChatGPT, and Cursor. It exposes around forty operational tools for contacts, tags, automations, and deals. It has no dedicated analytics or reporting tool. The closest thing is a raw feed of opens and clicks that the agent has to aggregate itself.

Can an AI agent do my email analytics through my ESP?

For operational work, yes. For analytics, not really. The agent can only reach what the ESP exposes, which is mostly real-time actions plus aggregated, short-window stats. Deeper questions about trends, attribution, and per-step performance need granular, long-horizon, structured data the ESP doesn't keep around.

What does Sendlens do differently?

Sendlens stores your full marketing history at event grain, across ESPs, for as long as you want it, with a structured fingerprint on every send. That granular, labeled store is the substrate real analytics needs, and the right thing to point an AI agent at when you want answers it can actually compute.