DLP vs. Browser Extensions: Blocking PII in AI Chatbots
Employees at organisations of every size are pasting client names, account numbers, and medical records into ChatGPT, Claude, and Gemini. Most companies know this is happening. Fewer know what to actually do about it.
The options fall into two camps: traditional data loss prevention (DLP) tools that operate at the network level, and browser-based extensions that detect and mask personally identifiable information (PII) before it leaves the user's device. Neither is a complete solution on its own. Understanding what each approach actually does, where it falls short, and how the two fit together is the first step toward a real answer.
Why traditional DLP struggles with AI chatbots
DLP has been an enterprise staple for over a decade. It was designed to prevent sensitive data from leaving an organisation through email attachments, file uploads to cloud storage, USB transfers, and endpoint exfiltration. The pattern it catches is data in transit over inspectable channels, and it does that job well.
AI chatbots break this model in several ways.
The inspection problem
ChatGPT, Claude, and Gemini operate entirely inside the browser over encrypted HTTPS. When a user types or pastes text into a chat input field and presses send, the data leaves as a standard HTTPS POST request to the provider's API. From the network's perspective, this looks identical to any other legitimate web traffic.
Traditional DLP tools that rely on network-level inspection have limited options here. They can perform TLS interception (decrypting and re-encrypting HTTPS traffic at a proxy), but this is invasive, breaks certificate pinning on some sites, and raises its own security and privacy concerns. They can route traffic through a reverse proxy or CASB, but this requires IT deployment and ongoing configuration per platform. Or they can deploy a browser agent, which is effectively what browser extensions already do, just at enterprise scale and cost.
The data format problem
Most DLP rules are built around file types and structured data patterns: credit card numbers in spreadsheets, Social Security numbers in PDFs, patient records in database exports. AI chatbot input is different. It is free-form text, often pasted from documents, emails, or internal systems, and it can contain any combination of PII types in any format. A single paste might include a client's name, their phone number, a partial address, and a medical condition, all embedded in a paragraph of natural language.
Pattern matching catches the structured items (card numbers, SSNs). But names, addresses, and contextual identifiers require natural language understanding, which most network-level DLP tools are not designed to perform on live chat input.
The retrofit gap
Several enterprise DLP vendors now claim coverage for AI chatbot usage. Netskope, Palo Alto Networks, Zscaler, and others have added AI-specific policies in recent releases. However, the depth of coverage varies significantly. Some offer URL-based blocking (allowing or denying access to ChatGPT entirely), some offer keyword-based content inspection, and a few offer genuine named entity recognition (NER) based PII detection on browser input. The gap between "we support AI chatbot DLP" in the marketing and "we inspect every paste event in real time" in practice is often wide.
For organisations evaluating DLP for AI chatbot coverage, the key question to ask the vendor is specific: does your tool actually block PII in browser input fields on ChatGPT, Claude, and Gemini in real time before the request is sent? If the answer involves network-level inspection or post-hoc log analysis, the data has already left the browser by the time it is flagged.
How browser-level PII detection works
Browser extensions take a fundamentally different approach. Instead of inspecting traffic at the network edge, they operate inside the browser itself, reading the content of input fields before the message is submitted to the AI provider.
Detection before transmission
The extension monitors the chat input field on supported platforms. When the user types or pastes content, the extension scans the text using a combination of regex pattern matching (for structured PII like credit card numbers, SSNs, API keys, and phone numbers) and named entity recognition (for unstructured PII like person names, addresses, and organisations).
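The two detection passes can be sketched as a single scan function. The patterns below are simplified illustrations, not any vendor's actual rules, and the NER hook is a stub standing in for a local model:

```typescript
// Hedged sketch: simplified regex detectors plus a placeholder NER pass.

type Detection = { type: string; value: string; start: number; end: number };

// Illustrative structured-PII patterns (real rules are stricter).
const PATTERNS: Record<string, RegExp> = {
  ssn: /\b\d{3}-\d{2}-\d{4}\b/g,
  email: /\b[\w.+-]+@[\w-]+\.[\w.]+\b/g,
  phone: /\b\+?\d[\d\s().-]{8,14}\d\b/g,
};

// Stub for named entity recognition; a real extension would run a
// local NER model here to find names, addresses, and organisations.
function nerEntities(text: string): Detection[] {
  return []; // assumption: NER integration omitted in this sketch
}

function scanText(text: string): Detection[] {
  const hits: Detection[] = [];
  for (const [type, re] of Object.entries(PATTERNS)) {
    for (const m of text.matchAll(re)) {
      hits.push({ type, value: m[0], start: m.index!, end: m.index! + m[0].length });
    }
  }
  // Merge regex hits with NER hits, ordered by position in the text.
  return hits.concat(nerEntities(text)).sort((a, b) => a.start - b.start);
}
```

In a content script, this scan would run on every input or paste event in the chat field, before the send button's handler fires.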
When PII is detected, it is replaced with safe placeholders before the message is sent. The AI provider receives [PERSON_A] instead of a real name. The mapping between placeholder and real value is stored locally in the browser, encrypted, and never transmitted anywhere.
When the AI responds using the placeholder, the extension swaps the real value back in so the user sees a natural conversation. The AI provider never had access to the original PII.
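A minimal sketch of that round trip, assuming the extension has already detected a list of names. The placeholder naming follows the [PERSON_A] convention above; the function names are illustrative, not any vendor's API:

```typescript
// Hedged sketch: mask detected names before send, unmask on response.
// The mapping lives only in local browser storage in a real extension.

function maskText(
  text: string,
  names: string[]
): { masked: string; mapping: Map<string, string> } {
  const mapping = new Map<string, string>(); // kept local, never transmitted
  let masked = text;
  names.forEach((name, i) => {
    const placeholder = `[PERSON_${String.fromCharCode(65 + i)}]`; // A, B, C…
    mapping.set(placeholder, name);
    masked = masked.split(name).join(placeholder); // replace every occurrence
  });
  return { masked, mapping };
}

function unmaskText(text: string, mapping: Map<string, string>): string {
  let out = text;
  for (const [placeholder, real] of mapping) {
    out = out.split(placeholder).join(real); // swap real values back in
  }
  return out;
}
```

The AI provider only ever sees the masked string; the unmask step restores real values in the response so the user reads a natural conversation.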
PiiBlocker, the extension we built, works this way. Everything runs locally in the browser with zero data collection. But the architectural pattern is not unique to PiiBlocker. Several tools take this approach, and the mechanism is the same regardless of vendor: intercept before send, mask PII, store the mapping locally, unmask on response. For a deeper explanation, see what PII masking means in practice.
What browser extensions catch reliably
Structured PII is the strength. Credit card numbers, Social Security numbers, National Insurance numbers, API keys, email addresses, phone numbers, and dates of birth all follow predictable formats that regex handles well. NER-based detection extends coverage to person names, addresses, employer names, and medical conditions with reasonable accuracy.
The combination of regex and NER means a browser extension can catch 15 or more PII types in real time, including items embedded in pasted text that the user didn't consciously register as sensitive.
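Card numbers illustrate why detectors usually pair the regex with a checksum: any 13-16 digit run matches the pattern, so a Luhn check (a standard validation step, not any specific vendor's rule) filters out most false positives. A minimal sketch:

```typescript
// Hedged sketch: candidate card numbers found by regex, confirmed by the
// Luhn checksum before being flagged as PII.

function luhnValid(digits: string): boolean {
  let sum = 0;
  let double = false;
  // Walk right to left, doubling every second digit.
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48;
    if (double) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    double = !double;
  }
  return digits.length >= 13 && sum % 10 === 0;
}

function looksLikeCard(text: string): string[] {
  const hits: string[] = [];
  // 13-16 digits, optionally separated by spaces or hyphens.
  for (const m of text.matchAll(/\b(?:\d[ -]?){13,16}\b/g)) {
    const digits = m[0].replace(/[ -]/g, "");
    if (luhnValid(digits)) hits.push(digits); // checksum passes → flag it
  }
  return hits;
}
```

A digit run that fails the checksum (an order reference, a tracking number) is left alone, which keeps the false-positive rate manageable.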
What browser extensions cannot catch
Unstructured business secrets that are not PII fall outside the detection scope. Internal project codenames, M&A details, pricing strategy, proprietary algorithms, and trade secrets are not patterns that PII detection is designed to find. A browser extension will catch the name of the person involved in the deal, but not the deal itself.
This is the honest gap in browser-level protection. For organisations where the primary risk is proprietary business information rather than personal data, DLP with custom content policies is the better fit.
DLP vs. browser extensions: a practical comparison
The two approaches serve different needs. This table summarises the practical differences.
| | DLP (network/endpoint) | Browser extension |
|---|---|---|
| Deployment | IT-managed, enterprise rollout | User-installed, no IT required |
| Detection scope | Structured PII + custom content policies (trade secrets, keywords, file types) | Structured PII + NER-based name/address detection |
| Inspection point | Network edge, endpoint agent, or CASB proxy | Browser input field, before HTTP request |
| Blast radius | Whole organisation, policy-enforced | Individual user or small team |
| Audit trail | Centralised logging and compliance reporting | Local only (by design, for privacy) |
| Cost | $10-50+ per user/month, enterprise contracts | Often free or low-cost |
| Best for | Large organisations with IT teams and compliance requirements | Individuals, small teams, and as an endpoint layer within larger deployments |
What to choose based on your situation
Individuals and small teams
If you don't have an IT department, a DLP solution is not realistic. Enterprise DLP requires deployment, configuration, and ongoing management that assumes dedicated security staff.
Browser-level PII masking is the practical choice. Install an extension, and the protection is immediate. No accounts, no configuration beyond the initial install, no ongoing maintenance.
For a comparison of the available browser-based PII masking tools, including detection scope, local vs. server processing, and pricing, see our separate review.
Regulated industries
Legal, healthcare, and finance organisations face a harder problem. Compliance frameworks (GDPR, HIPAA, SOX) require not just prevention but documentation: proof that controls exist, logs showing what was blocked, and audit trails for regulators.
For these organisations, the answer is usually both. DLP provides the audit trail and policy enforcement layer that compliance teams need to show regulators. Browser-level masking provides the real-time protection at the point of use, catching PII before it reaches the AI provider.
The browser layer is what prevents the leak. The DLP layer is what proves to auditors that prevention is happening. For a deeper look at the GDPR compliance framework for AI chatbots, including DPIA requirements and lawful basis assessment, see our compliance guide.
Enterprise with existing DLP
If your organisation already runs a DLP solution, the question is not whether to replace it. The question is whether it actually covers AI chatbot paste events.
Ask your DLP vendor: does the tool inspect text pasted into ChatGPT, Claude, and Gemini input fields in real time, before the request is sent? If the answer is no, or if the answer is "we log it after the fact," a browser-level extension fills that specific gap without replacing or conflicting with your existing DLP investment.
Many enterprise security teams are deploying browser-level masking as an additional endpoint control alongside their existing DLP stack. It adds a layer that catches what network-level inspection misses.
The policy-only approach does not work
Many organisations currently rely on acceptable use policies: written guidelines instructing employees not to paste sensitive data into AI chatbots. This is the cheapest approach and the least effective.
Research consistently shows that employees paste sensitive data into AI tools regardless of policy. A 2024 study found that 73% of employees share sensitive data with AI chatbots without realising the risk. The data includes names, email addresses, financial information, and proprietary business details.
Samsung provides a concrete case study. In March 2023, Samsung semiconductor engineers pasted proprietary source code, internal meeting notes, and chip testing sequences into ChatGPT within three weeks of the company lifting its internal ban. The company had both a policy and security training. Neither prevented the leaks. For a full account of this and four other real-world data leak incidents through AI chatbots, see our incident report.
Policies are necessary but not sufficient. Technical controls turn the policy from "we told people not to do this" into "we prevented it from happening."
A realistic implementation checklist
For teams ready to move from awareness to action, here is a practical starting point.
- Audit current AI chatbot usage. Survey which platforms employees use (ChatGPT, Claude, Gemini, Perplexity, Copilot) and which departments handle personal data regularly. HR, legal, healthcare, finance, and customer service are typical hotspots.
- Map where PII enters AI tools. The riskiest moments are not typed prompts. They are pasted content: cover letters, customer support drafts, code with comments containing customer data, meeting notes, and email threads. Identify the paste patterns specific to your organisation.
- Choose your deployment model. Browser-level only (individuals and small teams), DLP only (enterprises with existing infrastructure and no browser-level gap), or layered (DLP for audit trail and policy enforcement, browser extension for real-time detection).
- Pilot with one team. Deploy the chosen tool to a single department for two weeks. Measure what it actually catches in your environment, not what the vendor claims it catches in a demo.
- Measure real detection rates. After the pilot, review what was detected: how many PII items, what types, how many false positives. This data is what justifies broader rollout to leadership.
- Update your acceptable use policy. Reference the technical control explicitly. "Employees must use [tool name] when interacting with AI chatbots" is enforceable. "Employees should be careful" is not.
- Review quarterly. New AI platforms appear regularly. Perplexity, Mistral, and Copilot are gaining adoption quickly. Whatever tool you deploy today needs to keep pace with the platforms your employees adopt tomorrow.
Bottom line
DLP and browser-level PII masking are not competing solutions. They solve different parts of the same problem.
DLP covers the enterprise perimeter: policy enforcement, audit logging, and broad content inspection including non-PII business secrets. Browser-level masking covers the last mile: real-time detection and replacement of PII at the exact moment it would otherwise leave the user's device.
For individuals and small teams, a browser extension is the practical answer. For regulated industries, both layers together give you prevention and proof. For enterprises with existing DLP, the honest question is whether your current tool actually inspects browser-based AI chat input in real time. If it doesn't, the browser layer fills a gap your DLP was never designed to cover.