AI Privacy for SMBs: What Data Is Your Business Actually Sharing with OpenAI and Anthropic?

AI Privacy for SMBs: What Data Is Your Business Actually Sharing with OpenAI and Anthropic?

AI privacy for small and medium-sized businesses is no longer a theoretical question. It's the first thing every director asks before a single prompt gets typed: does my customer data stay private? And rightly so. Because the default settings of ChatGPT, Claude and Gemini weren't designed with your customer base in mind. This article gives you a concrete overview of what each platform stores, what gets used for model training, how to turn that off, and when you need a completely different approach. No legal jargon, just usable answers.

What happens to your data by default

Most business owners start with the free or paid consumer version of ChatGPT, Claude or Gemini. That's understandable, because they're easy to get started with. But it's exactly those versions that have the broadest data processing terms.

With ChatGPT (OpenAI) in the standard consumer version, meaning via chatgpt.com without an enterprise subscription, your conversations are stored by default and can be used to improve the model. OpenAI keeps this data for up to 30 days for safety checks, but the contents of conversations can be kept longer if you don't actively turn that off. You can disable "Improve the model for everyone" in the settings, but even then your conversation history is kept unless you manually turn that off too.

With Claude (Anthropic) via the consumer interface on claude.ai, it's a similar story. Anthropic stores conversations and uses them for product development, unless you have a paid subscription and activate the right settings. Anthropic's privacy page is more transparent than many of its competitors', but the default settings aren't built for business-sensitive data.

Gemini (Google) via the free interface integrates with your Google account. That means Google can, in principle, link your conversation data to other data they hold about you. For business use without a Google Workspace Enterprise license, this is particularly unwise if you work with customer data.

The difference between the API and the consumer version

This is the crucial distinction many business owners miss. As soon as you use the API of OpenAI or Anthropic, the rules of the game change fundamentally.

OpenAI states explicitly in its API terms of use that data sent through the API is not used for model training, unless you explicitly opt in. The same goes for Anthropic. Through the Claude API, your prompts and outputs aren't used to train new models by default. OpenAI keeps API data for a maximum of 30 days for abuse detection, after which it's deleted.

For an SMB, this means concretely: if you call the API through a tool like n8n, Make or a custom integration instead of using the chat interface, you have considerably more control over what happens to your data. That's not a marginal difference, it's the difference between data that could potentially end up in a training set and data that won't.

When is the API enough protection?

For most SMBs in services or e-commerce, the API approach is enough, provided you meet a few basic conditions. You don't process special categories of personal data such as medical records or citizen service numbers through the API without additional safeguards. You've signed a data processing agreement with the provider, because OpenAI and Anthropic offer those for business use. And you don't send entire customer databases as context, but work with anonymized or minimized data.

Using ChatGPT safely with your data: the practical steps

If you do work with the ChatGPT interface, or your employees do, there are concrete steps you can take today.

Turn off conversation history via Settings, Data Controls, "Improve the model for everyone". This prevents your input from being used as training data. Use ChatGPT Team or ChatGPT Enterprise if you have a team: with those subscriptions, conversations aren't used for training by default and you get admin rights to enforce policy. With Enterprise, your data also doesn't leave your organization for training processes.

For Claude: use Claude for Work or the API with a business account. Anthropic offers a Data Processing Agreement for business users, which is necessary if you process personal data and fall under the GDPR.

When do you need a self-hosted or EU-only setup?

There are situations where the API of OpenAI or Anthropic simply isn't the right solution. That's the case if you work in a sector with strict data localization requirements, such as healthcare, financial services or the legal sector. It's also the case if your clients have explicitly stipulated that their data may not leave the EU, or if your own ISO or NEN certification requires it.

In those situations, there are two routes. The first is n8n self-hosted: you run the automation layer on your own server or in an EU data center, and connect it to a locally hosted language model. Think of open-source models you run on your own infrastructure, combined with n8n as the orchestration tool. No data leaves your environment.

The second route is using Azure OpenAI Service or Google Vertex AI, where the models from OpenAI and Google are offered from EU data centers under European contract terms. Microsoft Azure, for example, lets you run GPT-4o from a data center in the Netherlands or Ireland, with a GDPR-compliant data processing agreement under which your data isn't used for model improvement.

What does the EU AI Act mean for you in practice?

The EU AI Act, which takes effect in phases, requires providers of AI systems to be transparent about how models work and which data was used. For SMBs that use AI as a user, not as a developer, the direct obligations are limited. But the law does strengthen your position as a customer: providers have to be clearer about data use, and you have more grounds to demand data processing agreements. Practically speaking, this means that for every AI tool you use, you need to be able to show that you know which data is being processed and on what basis.

AI data protection for your business: a workable approach

You don't need to be an IT lawyer to get this right. A workable approach for an SMB looks like this. Start with an inventory: which employees use which AI tools, and what data do they type into them? Then you switch from consumer versions to business subscriptions or the API, sign data processing agreements with OpenAI or Anthropic, and set out in a short internal guideline what is and isn't allowed in a prompt. For most businesses, that's sorted within a few weeks.

If you work in a sector with stricter requirements, you look at Azure OpenAI from an EU data center or a self-hosted setup with n8n. Which route fits your business depends on your data, your clients and your certifications. Want concrete advice on that? Book a free discovery call, and we'll map it out together.

Ready to win back your time?

Book a free discovery call. We look at your business together and show you how much capacity you can win back with an AIOS.

Book a free call →