Privacy-first · 100% in your browser

Clean your document
before feeding it to AI

Automatically find and remove names, emails, phone numbers, addresses, and financial data from PDFs, Word, Excel, PowerPoint, images, and text — before uploading to ChatGPT, Claude, Gemini, or any AI tool.

Your files never leave your device. Zero uploads. Zero tracking.

Drop your file here

PDF · Word · Excel · PowerPoint · Images · Text · Markdown · HTML · CSV · LaTeX

Works offline once loaded — your file is never uploaded. (Scanned PDFs need internet on first use to fetch the OCR engine; cached after.)

or paste text directly

Auto-detect

Names, emails, phone numbers, addresses, IBANs, credit cards, dates of birth, API keys, and more — flagged automatically.

Manual removal

Click any word or drag to select a region to remove anything the auto-detector missed.

Truly private

Everything runs in your browser. Your document is never uploaded to any server — not even ours.

PDF + text

Works on PDFs and plain text. Download a redacted PDF or copy the cleaned text straight into your AI prompt.

FAQ

Common questions

Why should I clean my document before using ChatGPT or Claude?
When you paste or upload a document to any AI service, that content is processed on their servers. Even with "memory off", the data is transmitted and processed. Redacting names, account numbers, and other sensitive info before prompting means you keep all the benefits of AI assistance without exposing private data.
Is my document really never uploaded?
Yes. CleanForAI runs entirely in your browser using JavaScript. Your file is read locally, processed locally, and the redacted output is generated locally. No data is sent to any server at any point. You can verify this by turning off your internet connection after the page loads — it still works.
What types of sensitive data does it detect?
Email addresses, phone numbers (international formats), credit card numbers, IBANs, Social Security / National Insurance numbers, dates of birth, physical addresses, URLs containing credentials, and API keys / tokens. Name detection uses pattern matching — it works well on common formats but you can always manually select anything it misses.
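
To make "pattern matching" concrete, here is a minimal sketch of how this kind of detection could work. It is not CleanForAI's actual code: the patterns, the function names (findSensitive, passesLuhn, passesIbanCheck), and the choice of checksum filters are all assumptions made for illustration, and real-world formats vary more than these cover.

```javascript
// Hypothetical patterns for illustration; not taken from CleanForAI.
const PATTERNS = {
  email: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g,
  // 13 to 19 digits, optionally separated by single spaces or hyphens.
  card: /\b(?:\d[ -]?){12,18}\d\b/g,
  // Country code, two check digits, then 11 to 30 alphanumerics.
  // Greedy: an uppercase word right after the IBAN would be swallowed too.
  iban: /\b[A-Z]{2}\d{2}(?: ?[A-Z0-9]){11,30}\b/g,
};

// Luhn checksum: filters out digit runs that only look like card numbers.
function passesLuhn(candidate) {
  const digits = candidate.replace(/\D/g, "");
  let sum = 0;
  for (let i = 0; i < digits.length; i++) {
    let d = Number(digits[digits.length - 1 - i]);
    if (i % 2 === 1) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
  }
  return digits.length >= 13 && sum % 10 === 0;
}

// IBAN mod-97 check: move the first four characters to the end,
// map letters to numbers (A=10 ... Z=35), and expect a remainder of 1.
function passesIbanCheck(candidate) {
  const s = candidate.replace(/\s/g, "");
  const rearranged = s.slice(4) + s.slice(0, 4);
  let remainder = 0;
  for (const ch of rearranged) {
    const value = /\d/.test(ch) ? ch : String(ch.charCodeAt(0) - 55);
    for (const digit of value) {
      remainder = (remainder * 10 + Number(digit)) % 97;
    }
  }
  return remainder === 1;
}

// Returns every match with its type and character offsets, in document order.
function findSensitive(text) {
  const hits = [];
  for (const [type, pattern] of Object.entries(PATTERNS)) {
    for (const m of text.matchAll(pattern)) {
      if (type === "card" && !passesLuhn(m[0])) continue;
      if (type === "iban" && !passesIbanCheck(m[0])) continue;
      hits.push({ type, value: m[0], start: m.index, end: m.index + m[0].length });
    }
  }
  return hits.sort((a, b) => a.start - b.start);
}
```

The checksum steps are a design choice: a bare digit pattern flags order numbers and IDs as credit cards, while the Luhn and mod-97 checks keep only candidates that are arithmetically valid. Names have no such checksum, which is one reason manual selection remains useful for them.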
Does redaction actually delete the data or just hide it?
Real redaction — the original text bytes are gone from the output. CleanForAI renders each page to a canvas, draws black boxes over redacted regions, then re-encodes the result. The underlying text is not present in the downloaded file.
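
The box-drawing step described above can be sketched as follows. This is an illustration under stated assumptions, not CleanForAI's implementation: toCanvasRect and drawRedactions are hypothetical names, boxes are assumed to be given in PDF points with a bottom-left origin, and scale is assumed to be the factor used when the page was rendered to the canvas.

```javascript
// Convert a box in PDF points (origin bottom-left) to canvas pixels
// (origin top-left). Edges are rounded outward so the black box never
// leaves a sliver of the original glyphs visible.
function toCanvasRect(box, pageHeightPt, scale) {
  const left = Math.floor(box.x * scale);
  const top = Math.floor((pageHeightPt - box.y - box.height) * scale);
  const right = Math.ceil((box.x + box.width) * scale);
  const bottom = Math.ceil((pageHeightPt - box.y) * scale);
  return { x: left, y: top, width: right - left, height: bottom - top };
}

// Paint opaque black rectangles over every redacted region.
// ctx is a CanvasRenderingContext2D, or anything with the same methods.
function drawRedactions(ctx, boxes, pageHeightPt, scale) {
  ctx.fillStyle = "#000";
  for (const box of boxes) {
    const r = toCanvasRect(box, pageHeightPt, scale);
    ctx.fillRect(r.x, r.y, r.width, r.height);
  }
}
```

In a browser, ctx would come from canvas.getContext("2d") after the page has been rendered onto that canvas. Because the exported page is then only pixels, the text that was under a box cannot be selected, copied, or recovered by removing an overlay.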
How can I verify my data isn't being sent anywhere?
Two ways to check:
  1. Offline test: load the page once with internet so the libraries cache, then disconnect your Wi-Fi or Ethernet. Drop in a file and process it. It works exactly the same, and nothing can reach a server without a connection. (For scanned PDFs you need internet on the very first run so the OCR engine can download; after that, even OCR works offline.)
  2. DevTools Network tab: open your browser's developer tools (F12), go to the Network tab, and filter by Fetch/XHR. Process a file. You'll see zero outbound requests carrying your data. The only network activity is the one-time CDN load of the libraries themselves: PDF.js, pdf-lib, and, for scanned PDFs, the Tesseract.js OCR engine plus its English language model (about 12 MB in total, which your browser then caches). Everything else, including file parsing, sensitive-data detection, OCR, redaction, and download, happens locally in your browser, so nothing about your document is ever transmitted.
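
If you prefer the console to the Network tab, the same check can be approximated with the browser's Resource Timing data. The helper below is a rough sketch and requestsSince is a hypothetical name. It reports only that a request happened and to which URL, not what it carried, so the Network tab remains the more complete view.

```javascript
// List resource requests that started at or after a given timestamp.
// entries: objects shaped like PerformanceResourceTiming
//          (name = URL, initiatorType, startTime in milliseconds).
function requestsSince(entries, sinceMs) {
  return entries
    .filter((e) => e.startTime >= sinceMs)
    .map((e) => ({
      url: e.name,
      initiator: e.initiatorType,
      startedAt: Math.round(e.startTime),
    }));
}
```

To use it, paste the function into the DevTools console, run const t0 = performance.now(), process a file, then call requestsSince(performance.getEntriesByType("resource"), t0). Once the libraries are cached, an empty array is the expected result.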
Can I use it on scanned PDFs?
Yes. For image-based PDFs, CleanForAI uses OCR to extract the text, then lets you redact before copying the cleaned text to your AI tool. The downloaded PDF contains the page images with the redactions applied.
Is CleanForAI free?
Yes, completely free, no account required, no ads. If it saves you time, you can optionally leave a tip via Ko-fi to help keep it running.