Why a Black Box Isn't Redaction — and How to Actually Remove the Text

01 // The failures

"Redacted" documents leak constantly.

In 2019, lawyers for Paul Manafort filed a court document with passages blacked out. The blackout was only a visual layer — the underlying text was still in the PDF. Reporters simply copied the "redacted" passages and pasted them into a text editor, revealing sealed details the filing was meant to hide. It became one of the most-cited redaction failures of the decade, precisely because it was so ordinary a mistake.

It is far from unique. Over the years, government agencies, courts, and corporations have released documents — including high-profile releases such as portions of the Epstein-related court files — where text under the black boxes was recoverable, or where redactions could be defeated by selecting all the text, removing the overlay object, or extracting the file's hidden text layer. The pattern is decades old and it keeps repeating for one reason: the tool drew a box but never deleted what was underneath.

The lesson isn't "those people were careless." Most used real, expensive software. The lesson is that covering text and removing text look identical on screen — and only one of them is safe to send.

02 // Why it happens

A black box is just paint on glass.

A PDF is not a picture — it's a structured document with a text layer, drawing objects, hidden layers, and metadata all stacked together. When you draw a filled black rectangle (or "highlight" text in black) over a passage, you add one more object on top. Everything you were trying to hide is still in the file, unchanged, directly beneath the new shape.

That leftover content comes back out in three easy ways. Select-all and copy grabs the text layer regardless of what's drawn over it, so the words paste straight out. "Remove object" or "delete annotation" in a PDF editor lifts the black rectangle off and exposes the page. And text extraction, the same routine search engines and screen readers use, reads the underlying characters directly, box or no box.
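A minimal sketch in Python of why the box changes nothing, assuming a hypothetical, uncompressed page content stream (real files usually compress their streams, and the sample text and the names `CONTENT_STREAM`, `shown_strings`, and `has_filled_rectangle` are made up for illustration). The stream draws a line of text, then paints a filled black rectangle over the same area. Both sets of instructions sit side by side in the page data.

```python
import re

# Hypothetical page content stream, written out uncompressed for readability.
# Text is drawn first; a filled black rectangle is painted over it afterwards.
CONTENT_STREAM = b"""
BT
/F1 12 Tf
72 700 Td
(Account number: 12345678) Tj
ET
0 0 0 rg
70 695 200 16 re
f
"""


def shown_strings(stream: bytes) -> list[str]:
    """Return literal strings passed to the Tj (show text) operator.

    Simplified on purpose: ignores escape sequences, hex strings,
    TJ arrays, and compressed streams.
    """
    matches = re.findall(rb"\(([^()\\]*)\)\s*Tj", stream)
    return [m.decode("latin-1") for m in matches]


def has_filled_rectangle(stream: bytes) -> bool:
    """True if the stream defines a rectangle path (re) and then fills it (f)."""
    return re.search(rb"\bre\s+f\b", stream) is not None


print(shown_strings(CONTENT_STREAM))
print(has_filled_rectangle(CONTENT_STREAM))
```

The rectangle is present and so is the text it was meant to hide: drawing the box added instructions to the page without touching the ones already there.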

Metadata makes it worse: author names, edit history, earlier draft text, comments, and the document's original filename often ride along invisibly even when the visible page looks clean. A black box does nothing to any of it.
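As a rough illustration in Python, the sketch below scans raw bytes for a few standard document information keys. The sample dictionary `SAMPLE_INFO` and the helper name `info_fields` are hypothetical, and the approach is deliberately naive: real files may keep this data in compressed objects, in XMP (XML) metadata, or in differently encoded strings, which a simple pattern match like this would miss.

```python
import re

# Hypothetical document information dictionary, as it might appear in a PDF.
SAMPLE_INFO = (
    b"<< /Author (J. Smith) /Title (draft_v3_unredacted.docx) "
    b"/Producer (ExampleWriter 1.0) >>"
)

# A few of the standard keys of the document information dictionary.
INFO_KEYS = ("Author", "Title", "Subject", "Keywords", "Creator", "Producer")


def info_fields(raw: bytes) -> dict[str, str]:
    """Return the information fields found as plain literal strings in raw bytes."""
    found = {}
    for key in INFO_KEYS:
        pattern = rb"/" + key.encode("ascii") + rb"\s*\(([^()\\]*)\)"
        match = re.search(pattern, raw)
        if match:
            found[key] = match.group(1).decode("latin-1")
    return found


print(info_fields(SAMPLE_INFO))
```

None of these fields is visible on the page, and none of them is affected by anything drawn on it.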

03 // What works

True redaction destroys, it doesn't cover.

True (destructive) redaction removes the content itself (the characters, the objects, and the metadata), so there is nothing left under the mark to recover. Two approaches do this reliably:

A · DESKTOP BENCHMARK

Adobe Acrobat Pro

Acrobat Pro's Redact tool genuinely removes the marked content rather than masking it, and its separate Sanitize Document step strips hidden metadata, layers, and scripts. It's the long-standing benchmark for redaction done right, and the right choice if you also need OCR, text editing, or form authoring.

The trade-off is cost and setup: a desktop install, an Adobe account, and a subscription around $19.99/month.

B · IN YOUR BROWSER

BlackoutPDF

BlackoutPDF takes a different route to the same guarantee: it re-renders each page to flat pixels, burns the black boxes into those pixels, and rebuilds a brand-new PDF from the images. Because the output is built from scratch, there is no text layer left under the black and the original metadata doesn't survive — nothing to select, copy, or extract.
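To make the "burn into pixels" step concrete, here is a minimal sketch in Python. It is not BlackoutPDF's code: the page is modelled as a hypothetical grid of grayscale values, and `burn_box` is an illustrative name. Rendering a real page and rebuilding a PDF from the images are out of scope here. The sketch only shows that the pixel values inside the box are overwritten, so nothing sits underneath to recover.

```python
def burn_box(pixels, box):
    """Return a copy of a grayscale page with `box` overwritten by black (0).

    `pixels` is a list of rows of 0-255 values; `box` is
    (left, top, right, bottom) with right and bottom exclusive.
    """
    left, top, right, bottom = box
    out = [row[:] for row in pixels]  # copy so the input page is left untouched
    for y in range(max(top, 0), min(bottom, len(out))):
        for x in range(max(left, 0), min(right, len(out[y]))):
            out[y][x] = 0  # the old value is replaced, not covered
    return out


# A tiny white "page" with a few dark pixels standing in for rendered glyphs.
page = [[255] * 8 for _ in range(4)]
page[1][2:6] = [40, 90, 40, 90]

redacted = burn_box(page, (2, 1, 6, 2))
print(redacted[1])
```

After the call, the row that held the glyph pixels contains only black inside the box. The output has one value per pixel and no second layer, which is the property a rebuilt, image-only PDF relies on.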

It runs 100% in your browser (your file is never uploaded) and it's free for short documents. The honest trade-off: rasterized output is not text-selectable. For a redacted document that is usually exactly what you want, but it's the wrong choice if the recipient needs to search the file.

04 // Verify it

How to check your redaction actually held.

Never trust that a redaction worked because the page looks right. Spend thirty seconds proving it before you send anything. If any of these tests returns the hidden content, it was not redacted — go back and do it destructively.

01 · Try to select and copy under the box

Open the finished file, drag-select across a blacked-out passage (or hit Select All), and paste into a plain text editor. If any "hidden" words appear, the text layer survived — the redaction failed.

02 · Open the file's properties / metadata

Check Document Properties (or "Get Info") for author, title, and history fields. Leftover names, original filenames, or edit history mean metadata wasn't stripped — a separate leak even if the visible page is clean.

03 · Run the strict test: extract the text

Use any "extract text" or "save as text" feature, or a command-line tool, to pull every character out of the file. A truly redacted page yields no recoverable text where the black boxes are. This is the test that catches what the eye and the copy-paste miss.
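One way to script the strict test, sketched in Python under the assumption that Poppler's `pdftotext` command-line tool is installed and on your PATH. The helper names `extract_text` and `leaked_terms`, and the file name in the usage note, are hypothetical. The idea is to extract everything, then search the result for the exact terms you meant to remove.

```python
import subprocess


def extract_text(pdf_path: str) -> str:
    """Extract all text from a PDF with Poppler's pdftotext (assumed installed).

    Passing "-" as the output file makes pdftotext write to stdout.
    """
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


def leaked_terms(extracted: str, secrets: list[str]) -> list[str]:
    """Return the secrets that still appear in the extracted text.

    Matching is case-insensitive and collapses whitespace, so a term
    split across a line break is still caught.
    """
    haystack = " ".join(extracted.split()).lower()
    return [s for s in secrets if " ".join(s.split()).lower() in haystack]
```

A call such as `leaked_terms(extract_text("redacted.pdf"), ["Jane Doe", "12345678"])` should return an empty list for a file that was redacted destructively. Any term it returns is still in the file.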

Redact a PDF the right way — free, nothing uploads.

BlackoutPDF removes the content instead of covering it: pages are re-rendered to pixels and rebuilt into a fresh PDF, so there's no text layer left under the black and metadata is stripped. It all happens in your browser — your document never leaves your machine. Free for short files; verify the result with the three tests above before you send.

Related, while you're here: "Redact a PDF without uploading it" explains the no-upload architecture in depth, and "Remove metadata from a PDF" covers stripping the hidden author, history, and layer data that black boxes leave behind.


05 // Questions

Straight answers about redaction.

Can you recover text under a black box in a PDF?

Usually yes, if the box was simply drawn over the text. The original characters stay in the file's text layer, so anyone can select and copy them out, delete the black rectangle as an object, or run text extraction to read what's underneath. This is exactly how "redacted" court filings and government releases have leaked. The only way to prevent it is true redaction, which removes the content rather than covering it.

What is true (destructive) redaction?

True redaction permanently removes the marked content — the characters, the underlying objects, and the associated metadata — so there is nothing left in the file to recover. It's the opposite of masking, where a black shape is layered on top while the text survives beneath. Adobe Acrobat Pro's Redact plus Sanitize Document does this on the desktop; BlackoutPDF does it by re-rendering pages to pixels and rebuilding a new PDF entirely in your browser.

Is rasterizing a PDF a secure way to redact?

Yes, when it's done by flattening the page to an image and building a new file from that image. Once the page is pixels and the original PDF structure is discarded, there is no text layer, no hidden object, and no metadata to extract from beneath the black — the redacted content is genuinely gone. The trade-off is that the resulting page is an image, so its text is no longer selectable or searchable. For a document you're redacting, that's typically the desired outcome.

How do I check whether my redaction actually worked?

Run three quick tests on the finished file. First, try to select and copy across the blacked-out areas and paste into a text editor; nothing hidden should appear. Second, open Document Properties and confirm that no author, history, or filename metadata remains. Third, for the strict check, extract all text from the file; a true redaction yields no recoverable text where the black boxes are. If any test surfaces the hidden content, the redaction failed and must be redone destructively.

A black box isn't redaction.