Your company's financial data is not a consumer product. Build like it.

Two architecture decisions that determine whether your AI finance tool is actually enterprise-ready — and why most products aren't making them.

Rob Owen
Pipeline architecture diagram showing system prompt isolation, on-demand tool calls, and inference boundary with tenant isolation and audit trail

There's a version of the AI-in-finance conversation that stays on the surface. "Will they train on my data?" "Is my data secure?" "What does the privacy policy say?"

These are fine questions. They're not the important ones.

The important question is what actually happens to your company's financial data — your revenue, your payroll, your margins, your runway — at the moment an AI model processes it. Not in policy. In the architecture.

That question has a specific technical answer. Most products don't surface it. A few are designing around it intentionally. The difference matters a lot if your customers are businesses with real compliance requirements.


Why this matters more than most people admit

Your company's financial data is not like your personal data. It's not your email history or your social media activity.

It's the information that tells your investors whether to extend your runway. It tells your bank whether to approve your credit line. It tells your competitors — if they ever got it — exactly where you're vulnerable. Revenue by customer. Payroll by department. Cash position at the end of every month.

When you start moving this data through AI systems, the governance question isn't theoretical. It's the thing an enterprise customer's security team asks before signing. It's the gap that surfaces in due diligence. It's what your lawyers should be reviewing before your product goes to market.

Two announcements this month made this concrete: OpenAI launched a personal finance feature connecting to 12,000+ financial institutions, and Anthropic released 10 pre-built finance agent templates. Both good products. Both got reactions that missed the more important question.


The consumer/enterprise confusion

OpenAI's personal finance feature is a consumer product. This should be obvious, but the reaction to it suggests it isn't: a meaningful number of business operators read that announcement and started thinking about their company's finances.

Consumer products and enterprise products are not the same thing. Not in the liability model. Not in the data handling terms. Not in the audit trail. Not in the governance structure.

A consumer product connects to your bank account under consumer terms of service. There's no data processing agreement governing what happens to your company's financial data. There's no dedicated tenant isolation — you're one of millions of accounts. Training opt-outs may apply, or may not, depending on which plan tier you're on and what the ToS says this month.

None of this is appropriate for company financial data. Not because OpenAI is careless — they're not — but because it's the wrong product category. Connecting your personal checking account to a budgeting AI and connecting your company's QuickBooks to a financial analytics platform are different decisions with different requirements.

If you're evaluating AI for your business finances, the product category question comes first. Consumer or enterprise. The technical questions follow from there.


The tool-use architecture question everyone is skipping

Here's what actually happens when an AI model answers a financial question. Understanding this is prerequisite to understanding the data custody problem.

Modern AI finance tools use a "tool-use" or "function-calling" architecture. The model doesn't know your company's finances by default — it has no access to your data at startup. Instead, it reasons about what data it needs and calls tools to retrieve it. The tool queries your database, gets the relevant data, and returns it to the model as context. The model reads that result and generates an answer.

This is a good architecture. It means the model only touches data it actually needs, for the specific query at hand. It's more efficient and more controlled than dumping your entire transaction history into a prompt.

But here's the thing that matters for data custody: at the moment that tool result comes back to the model — when your Q1 revenue, your March payroll run, your AR aging — becomes part of the model's context, that data is in the inference pipeline. Whatever infrastructure is running inference now has access to it.

Where inference happens is therefore the data custody question. Not "will they train on it." Where does it physically run, under whose infrastructure agreement, and what's the audit trail.


The spectrum

There's a real range here, and it determines your compliance story.

Consumer AI products sit at one end. Inference runs on the provider's consumer infrastructure. Your data is processed under consumer terms. No data processing agreement. No enterprise audit trail.

Direct API sits in the middle. You call an AI provider's API directly from your application. Your data transits their servers as query context. If you have an enterprise agreement, there's a data processing agreement and a training opt-out. But inference still happens on their infrastructure, under their terms.

Managed cloud service sits at the other end. Instead of calling an AI provider's API directly, you invoke the same model through a managed cloud service — AWS Bedrock, for example. Inference happens within that cloud's infrastructure. The data handling agreement is with the cloud provider, not directly with the model company. You can deploy within your own VPC. The data doesn't leave your cloud account's network boundary to reach the model.

Most AI finance tools on the market today are in the middle tier. A few are moving toward the third.


What we built, and what we're building toward

At MyRunwayHealth, Merlin is our AI financial analyst. We've been thinking about this problem since before we wrote the first line of code.

System prompt with zero financial data. Merlin's system prompt — the instructions that define its behavior — contains no financial information. The model starts every session knowing nothing about your company's books. This is a deliberate design choice. It means no financial data transits the inference pipeline unless a specific query requires it.

On-demand data retrieval. When you ask Merlin a question, it retrieves only the data needed to answer that specific question, at the moment it's asked. Not your full transaction history as a baseline. The data minimization principle applied at the architecture layer, not the policy layer.

Tenant isolation enforced in code. Every tool call validates that it's accessing the right company's data before executing. Not a config flag. Not a row-level filter. A hard check in the infrastructure layer that returns an access denied response if the tenant doesn't match. Your data is fully separated from every other company's data at the database level.

Full audit trail. Every tool call Merlin makes, every piece of data returned, every request to the AI model — logged. The audit trail answers the question "what did the model actually see?" Not just for debugging. Because that's a real compliance requirement for business financial data.

The Bedrock direction. We're moving Merlin's AI inference to AWS Bedrock. Same Claude model we use today. Completely different data custody story.

With direct Anthropic API (where Merlin runs today), your query context — the tool results containing your financial data — transits Anthropic's infrastructure. Enterprise sub-processor agreement, no training, audited. But Anthropic's servers.

With Bedrock, inference runs within AWS's managed infrastructure. The agent runs inside a VPC with private subnets and no inbound internet traffic. Audit logs go to CloudWatch. The data handling agreement is with AWS. Your financial data doesn't leave your AWS account's network boundary to reach the model.

That's not a minor distinction for a business dealing with sensitive financial data. It's the difference between "we have a data processing agreement with our AI provider" and "your data stays inside your cloud infrastructure." The second answer is a lot easier to explain to an enterprise customer's security team. It's also the right architecture if you're serious about financial data governance.


The questions worth asking

Before connecting any AI to your business's financial data — whether you're buying a product or building one — these are the questions that actually matter:

Is this a consumer product or an enterprise product? The answer determines everything downstream: data handling terms, audit trail, governance structure, liability model.

Where does inference happen? The AI provider's consumer infrastructure? Their enterprise API? A managed cloud service within your own cloud environment? The answer tells you whose data handling agreement you're actually operating under.

What data reaches the model, and when? Full transaction history front-loaded? Or only query-relevant data retrieved on demand? This is an architecture question, not a policy question. Ask to see the architecture.

Is there an audit trail? Not "we log things." A specific answer: every tool call, every piece of data returned to the model, every API request. Queryable. Exportable. Your evidence that the system behaved as designed.

Is tenant isolation enforced in code or in config? Config can be misconfigured. Code-level enforcement at the infrastructure layer cannot.

The AI finance market is moving fast. Most of the energy is going into capabilities: accuracy, speed, integrations, UI. The data governance infrastructure tends to come later — usually when an enterprise deal stalls in security review, or when you realize your ToS doesn't cover how the product actually works.

Design for it now. Retrofit is expensive.

Field notes from the cap table

Practical posts on runway, burn, and the boring-but-critical math of bootstrapping. Sent occasionally — never spammy.

Email delivery via Substack. Unsubscribe any time.