
How to Redact a PDF Without Leaking the Hidden Text

By ZimaPDF Team

Many people think redacting a PDF means drawing a black rectangle over the private part and calling it a day.

That is not redaction. That is decoration with legal consequences.

If the underlying text is still sitting in the file, someone can often copy it, search it, extract it, or reveal it just by opening the file in a different viewer. If the document contains names, account numbers, addresses, pricing terms, or legal details, that is a serious problem.

What proper PDF redaction actually does

Real redaction removes the sensitive content from the file itself, so there is nothing left to recover by selecting, searching, or extracting.

That means:

  • The text should not be selectable.
  • The hidden words should not appear in search.
  • Copy and paste should not bring the private content back.
  • The visual cover should be part of the final saved file, not just an overlay.

If your current method does not meet those conditions, it is not safe enough.

How to redact a PDF safely

Here is the straightforward way to do it:

  1. Open the Redact PDF tool.
  2. Upload the file that contains sensitive information.
  3. Mark the text or page areas that need to be removed.
  4. Apply the redaction so the content is permanently stripped from the final output.
  5. Download the redacted PDF and review it once before sharing.

That last check matters. Always open the final file and try to search or copy the removed text.
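
If you have many files or many terms to check, the same test can be scripted. Here is a minimal sketch, assuming you have already extracted the text of each page into a list of strings with whatever PDF library you use (pypdf's `extract_text()` is one option). The names `find_leaks` and `normalize` are made up for this example.

```python
import re

def normalize(text: str) -> str:
    """Lowercase and drop whitespace so line breaks or odd spacing cannot hide a match."""
    return re.sub(r"\s+", "", text).lower()

def find_leaks(pages: list[str], terms: list[str]) -> list[tuple[int, str]]:
    """Return (page_number, term) pairs for every redacted term still present."""
    leaks = []
    for number, text in enumerate(pages, start=1):
        haystack = normalize(text)
        for term in terms:
            needle = normalize(term)
            if needle and needle in haystack:
                leaks.append((number, term))
    return leaks

# Hypothetical extracted text from a two-page file after redaction.
pages = [
    "Agreement between the parties",
    "Payment to account 1234 5678 due on signing",
]
print(find_leaks(pages, ["12345678", "Jane Example"]))  # [(2, '12345678')]
```

An empty result only means the terms were not found in the extractable text. It says nothing about text inside scanned images, so the manual look at the final file is still worth doing.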

What should be redacted?

Common examples include:

  • Personal data such as phone numbers, addresses, birth dates, and ID numbers.
  • Financial details such as bank accounts, tax information, and invoice totals.
  • Legal clauses that are not meant for every recipient.
  • Client names or internal company references in shared drafts.

If the recipient does not need it, it should not be in the file.

Why covering text with shapes is risky

This is where people get caught out. In many editors, the black box is just another layer on top of the PDF. The original content remains untouched underneath.

So yes, the page looks safe. But the data is still there.

That is why real redaction tools matter. They are built to remove the information, not just hide it visually.

Redaction vs deleting pages

If a whole page is sensitive, you do not always need redaction. Sometimes it is cleaner to use Remove PDF Pages and delete the page entirely.

Use redaction when you need to keep the rest of the page visible. Use page removal when the entire page should disappear.

One careful habit is enough

Before sending any contract, report, or case file outside your team, ask one question:

Could someone learn something from this PDF that they should not know?

If the answer is yes, do not draw boxes and hope for the best. Use the Redact PDF tool and remove the information properly before the file leaves your hands.

Why Black Box Redaction Fails: A Technical Explanation

To understand why "just draw a box over it" is insufficient, it helps to understand how a PDF stores information.

A PDF file is not a flat image. It is a structured document in which each page is drawn by a content stream: a sequence of instructions that holds the actual text characters, their coordinates on the page, and how they are rendered. When you draw a black rectangle over text in a basic PDF editor, you are usually adding a new visual element (an annotation or an extra drawing instruction) that is painted on top of the existing content.

The original text stream is untouched. It is still there in the file's internal data, just visually obscured on screen.

Anyone with access to a PDF extraction tool — or even just the "Select All" function in their PDF reader — can copy the underlying text, including the parts hidden under black boxes.

True redaction tools remove the text from the content streams entirely, then replace the visual area with a solid block that is part of the content itself.
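
The difference is easy to see on a toy example. The sketch below uses a hand-written, uncompressed content stream and a deliberately naive extractor. Real files are usually compressed and use font-specific encodings, and real redaction tools do far more than this, so treat the function names and the sample data as illustrative assumptions only.

```python
import re

# A toy page content stream: two text objects, each shown with the Tj operator.
PAGE = (
    b"BT /F1 12 Tf 72 700 Td (Invoice for ) Tj ET\n"
    b"BT /F1 12 Tf 150 700 Td (Account 12345678) Tj ET\n"
)

# Matches a simple literal string followed by Tj, e.g. "(some text) Tj".
TEXT_OP = re.compile(rb"\(((?:\\.|[^\\()])*)\)\s*Tj")

def extract_text(stream: bytes) -> str:
    """Collect the strings shown by Tj operators, roughly as a text extractor would."""
    return "".join(m.group(1).decode("latin-1") for m in TEXT_OP.finditer(stream))

def cover_with_box(stream: bytes, x: int, y: int, w: int, h: int) -> bytes:
    """Cosmetic 'redaction': paint a filled black rectangle after the text."""
    box = f"0 0 0 rg {x} {y} {w} {h} re f\n".encode("ascii")
    return stream + box

def redact(stream: bytes, secret: bytes, x: int, y: int, w: int, h: int) -> bytes:
    """Redaction in miniature: drop the text objects containing the secret, then paint the box."""
    kept = [line for line in stream.splitlines(keepends=True) if secret not in line]
    return cover_with_box(b"".join(kept), x, y, w, h)

covered = cover_with_box(PAGE, 148, 696, 120, 14)
redacted = redact(PAGE, b"12345678", 148, 696, 120, 14)

print(repr(extract_text(covered)))   # the account number is still extractable
print(repr(extract_text(redacted)))  # only the non-sensitive text remains
```

Both versions would look identical on screen, because both end with the same black rectangle. Only the second one no longer contains the account number.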

High-Profile Redaction Failures

Poorly redacted PDFs have caused real-world legal and reputational damage for organisations of all sizes:

  • Government agencies have submitted court documents with "redacted" names and addresses that were instantly copyable.
  • Law firms have filed exhibits with sensitive case details hidden under shapes but fully extractable.
  • Journalists have revealed confidential business information from company filings where the redaction was purely cosmetic.

In every case, the author assumed that a black rectangle meant removal. It does not.

What You Should Redact

The most commonly redacted categories of information:

Personal Data (GDPR/Privacy Law)

  • Full names in combination with other identifying details
  • Email addresses and phone numbers
  • Home or business addresses
  • National identification numbers, passport numbers, driving licence numbers
  • Dates of birth

Financial Information

  • Bank account numbers and sort codes
  • Credit card numbers
  • Salary and payroll data
  • Invoice totals in sensitive commercial negotiations

Legal and Contractual Details

  • Proprietary clauses not meant for all recipients
  • Settlement amounts and terms
  • Internal counsel opinions

Medical Information

  • Patient names and diagnosis details
  • Prescription information
  • Insurance claim details
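
Some of these categories follow recognisable patterns, so a quick scan of the extracted text can help you spot candidates before you start marking. The sketch below covers only email addresses and card-like numbers. The patterns and the name `find_candidates` are assumptions made for illustration, and names, addresses, and legal clauses still need a human read.

```python
import re

EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")

def luhn_ok(digits: str) -> bool:
    """Luhn checksum, used to filter out digit runs that cannot be card numbers."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_candidates(text: str) -> dict[str, list[str]]:
    """Return email addresses and Luhn-valid card-like numbers found in the text."""
    cards = []
    for match in CARD.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_ok(digits):
            cards.append(match.group())
    return {"emails": EMAIL.findall(text), "cards": cards}

# Hypothetical text; 4111 1111 1111 1111 is a well-known test card number.
sample = "Contact jane@example.com, card 4111 1111 1111 1111, ref 1234 5678 9012 3456."
print(find_candidates(sample))
```

A scan like this is a starting point for review, not a replacement for it. Anything it misses is still your responsibility to find.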

Before and After Redaction: A Review Checklist

Before you share a redacted PDF, run through this checklist:

  1. Open the redacted file in a PDF reader and try to select the redacted text. If you can highlight or copy it, the redaction did not work properly.
  2. Search for a redacted term using Ctrl+F / Cmd+F. If the search finds it, the text is still in the file.
  3. Check page headers and footers. These are often overlooked and may contain names or document IDs that should also be redacted.
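  4. Review document metadata and annotations. Author names, comments, and leftover revision data may contain information you did not intend to include. Use a metadata removal or sanitize tool to strip these before sharing. Flattening alone merges annotations and form fields into the page; it does not clear the metadata.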
  5. Check all pages, not just the ones you think contain sensitive data. References to redacted content sometimes appear in later summary tables or index sections.
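
For the metadata step, a rough first look is possible without any PDF library. The sketch below scans the raw bytes of a file for the standard document information keys (Title, Author, and so on) written as plain literal strings. That is a big assumption: metadata can also be stored as hex strings, inside compressed object streams, or in XMP, none of which this catches. The function name `scan_info_entries` is invented for the example.

```python
import re

# Standard Info dictionary keys followed by a simple literal string value.
INFO_ENTRY = re.compile(
    rb"/(Author|Title|Subject|Keywords|Creator|Producer)\s*\(((?:\\.|[^\\()])*)\)"
)

def scan_info_entries(pdf_bytes: bytes) -> dict[str, str]:
    """Return any plainly stored document information entries found in the bytes."""
    found = {}
    for key, value in INFO_ENTRY.findall(pdf_bytes):
        found[key.decode("ascii")] = value.decode("latin-1")
    return found

# Hypothetical fragment of an uncompressed Info dictionary.
sample = (
    b"1 0 obj\n"
    b"<< /Title (Settlement draft) /Author (Jane Example) /Producer (SomeTool) >>\n"
    b"endobj\n"
)
print(scan_info_entries(sample))
```

On a real file you would pass in the bytes read from disk. If anything shows up, the file needs cleaning. If nothing shows up, that is not proof the metadata is clean, so check the document properties in your PDF reader as well.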