AI-Redact

How to Redact Medical Records: HIPAA Compliant Guide

Medical records contain some of the most sensitive personal information that exists — diagnoses, treatment plans, mental health notes, substance abuse history, genetic information, and sexual health data. When these records need to be shared for purposes beyond direct patient care, the Protected Health Information (PHI) within them must be redacted.

HIPAA does not just suggest this — it requires it. And the penalties for getting it wrong are severe. This guide walks through the step-by-step process of redacting medical records in compliance with HIPAA, from identifying what needs to be redacted to verifying the final output.

When Medical Records Need Redaction

Medical records do not always need redaction. HIPAA permits the use and disclosure of PHI for treatment, payment, and healthcare operations without patient authorization. However, redaction is required in several common scenarios.

Records Requests from Attorneys

When an attorney requests medical records — whether for personal injury litigation, workers' compensation, disability claims, or family law matters — the records often need to be redacted to remove information unrelated to the specific matter. A request for orthopedic treatment records does not entitle the attorney to see the patient's psychiatric notes.

Research Use

Medical researchers frequently need access to patient data for clinical studies, quality improvement projects, and epidemiological research. In most cases, the data must be de-identified before researchers can use it without individual patient authorization or an IRB waiver.

Insurance and Billing

When sharing records with insurance companies or other payers, the disclosure should be limited to the minimum necessary information. Treatment details unrelated to the billed service may need to be redacted.

Subpoenas and Court Orders

Responding to subpoenas and court orders for medical records often requires careful review. Not everything in the chart is necessarily responsive, and some information may be subject to additional protections (substance abuse treatment records under 42 CFR Part 2, psychotherapy notes under HIPAA).

Business Associate Sharing

When sharing records with business associates — billing companies, IT vendors, consultants — the minimum necessary standard applies. Redact information that the business associate does not need to perform its function.

Public Health Reporting

When data is shared with public health authorities in de-identified form for disease surveillance, outbreak investigation, or population health studies, individual identifiers must be redacted first.

Training and Education

Using real patient cases for medical education, residency training, or staff development requires de-identification unless the patient has provided specific authorization.

PHI to Redact: The 18 HIPAA Identifiers

HIPAA's Safe Harbor method requires removal of all 18 categories of identifiers. In the context of medical records, here is what to look for in each category.

1. Patient Names

Full names, maiden names, aliases, and nicknames. Also redact names of relatives, employers, and household members if they appear in the record. Names appear throughout medical records — in headers, in clinical notes, in referral letters, and in correspondence.

2. Geographic Data

Street addresses, city, county, ZIP code, and any geographic unit smaller than a state. In medical records, addresses appear on face sheets, insurance forms, and referral documents. Note: the first three digits of a ZIP code may be retained if the geographic unit formed by all ZIP codes sharing those three digits contains more than 20,000 people; otherwise the three digits are replaced with 000.
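The ZIP code rule can be sketched in Python as follows. This is a minimal illustration, not an official implementation: the function name generalize_zip and the RESTRICTED_PREFIXES set are assumptions, and the entries shown are placeholders that must be replaced with a list built from current Census data.

```python
import re

# Placeholder entries for illustration only. Populate this set with the
# three-digit ZIP prefixes whose combined population is 20,000 or fewer,
# based on current Census Bureau data.
RESTRICTED_PREFIXES = frozenset({"036", "059", "102"})


def generalize_zip(zip_code: str, restricted: frozenset = RESTRICTED_PREFIXES) -> str:
    """Reduce a ZIP or ZIP+4 code to its three-digit Safe Harbor form."""
    digits = re.sub(r"\D", "", zip_code)
    if len(digits) < 5:
        raise ValueError(f"not a valid ZIP code: {zip_code!r}")
    prefix = digits[:3]
    # Sparsely populated prefixes are replaced with 000.
    return "000" if prefix in restricted else prefix
```

With the placeholder set above, generalize_zip("90210-1234") keeps the prefix "902", while any ZIP code starting with a restricted prefix comes back as "000".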

3. Dates

All dates directly related to the individual except year. This includes:

  • Dates of birth
  • Admission and discharge dates
  • Dates of service
  • Surgery dates
  • Lab result dates
  • Appointment dates
  • Dates of death

Only the year may be retained. Ages over 89, along with any date elements (including the year) that would reveal such an age, must be aggregated into a single category of "90 or older."
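As a rough sketch of the date and age rules, the two helpers below reduce a date to its year and collapse high ages into one category. The function names are assumptions made for this example, and the sketch does not decide which dates in a record relate to the individual.

```python
from datetime import date


def year_only(d: date) -> str:
    """Keep only the year of a date tied to the individual."""
    return str(d.year)


def safe_harbor_age(age_years: int) -> str:
    """Report ages above 89 as a single aggregated category."""
    if age_years < 0:
        raise ValueError("age cannot be negative")
    return "90 or older" if age_years >= 90 else str(age_years)
```

For a patient in the "90 or older" category, a retained year of birth would reveal the age, so the year itself should also be withheld in that case.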

4. Phone Numbers

Home, work, mobile, and emergency contact phone numbers. These appear on registration forms, face sheets, and contact notes.

5. Fax Numbers

Fax numbers for the patient, referring providers, and other contacts listed in the record.

6. Email Addresses

Patient email addresses, including those used for patient portal communications.

7. Social Security Numbers

Full or partial SSNs. These commonly appear on registration forms, insurance documents, and billing records within the medical chart.

8. Medical Record Numbers (MRN)

The internal medical record number assigned by the healthcare organization. This is a critical identifier unique to the patient within the facility.

9. Health Plan Beneficiary Numbers

Insurance member IDs, Medicaid numbers, Medicare beneficiary identifiers, and other health plan numbers.

10. Account Numbers

Patient billing account numbers, financial responsibility numbers, and guarantor account numbers.

11. Certificate/License Numbers

Professional license numbers of treating providers (if they could be linked back to the patient through the provider-patient relationship in small practices), and any certificate numbers belonging to the patient.

12. Vehicle Identifiers

Vehicle identification numbers or license plate numbers mentioned in the record (common in accident-related records).

13. Device Identifiers

Serial numbers of implanted medical devices (pacemakers, joint replacements, insulin pumps) and unique device identifiers (UDIs).

14. Web URLs

Patient portal URLs, MyChart links, or any web addresses associated with the patient.

15. IP Addresses

IP addresses from telehealth sessions or patient portal access logs.

16. Biometric Identifiers

Fingerprints, voiceprints, retinal scans. These are less common in standard medical records but may appear in certain facility types.

17. Full-Face Photographs

Clinical photographs showing the patient's face, photo IDs in the chart, and any comparable images.

18. Other Unique Identifiers

Any other unique identifying number, characteristic, or code. In medical records, this includes:

  • Encounter numbers
  • Accession numbers (lab and radiology)
  • Internal tracking numbers
  • Any codes created by the organization that could be used to re-identify the patient

Step-by-Step Redaction Process

Step 1: Determine the Purpose and Scope

Before touching the records, clarify:

  • Why are the records being shared? (litigation, research, billing, etc.)
  • What information does the recipient need? (Apply the minimum necessary standard)
  • Which de-identification method will you use? (Safe Harbor or Expert Determination)
  • Are there additional protections beyond standard HIPAA? (substance abuse records under 42 CFR Part 2, psychotherapy notes, state-specific protections)

The answers to these questions determine the scope of redaction needed.

Step 2: Compile the Complete Record

Gather all components of the medical record that are responsive to the request:

  • Clinical notes (progress notes, history and physical, discharge summaries)
  • Operative reports
  • Laboratory results
  • Radiology reports and images
  • Medication lists
  • Nursing notes
  • Consultation reports
  • Referral letters
  • Registration and demographic forms
  • Insurance and billing documents
  • Consent forms
  • Correspondence

Step 3: Convert to a Workable Format

If records are in an EHR system, export them as PDF. If they are paper records, scan them to PDF with OCR enabled. Having all records in a consistent digital format allows for efficient processing.

For scanned documents, OCR (Optical Character Recognition) is essential — it converts the scanned images into searchable, selectable text that redaction tools can process.

Step 4: Automated Detection

Use an AI-powered tool to perform the first pass of PHI detection. AI-Redact can automatically detect all 18 HIPAA identifier types across the entire record set.

Upload the records, run detection, and review the results. The AI will highlight:

  • Patient names throughout the record
  • All dates (except year)
  • SSNs and account numbers
  • Phone numbers and email addresses
  • Addresses
  • Medical record numbers
  • And other identifiers

This automated first pass catches the vast majority of PHI and establishes a consistent baseline across all documents.
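To make the idea of a pattern-based first pass concrete, here is a minimal Python sketch. It is not how AI-Redact or any other product works internally. The patterns are simplified assumptions for common US formats, and the names PATTERNS, Finding, and detect are invented for this example. Names and addresses cannot be found reliably this way.

```python
import re
from typing import NamedTuple

# Simplified, assumed patterns for common US formats. Real records need
# broader coverage than this.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
}


class Finding(NamedTuple):
    kind: str
    start: int
    end: int
    text: str


def detect(text: str) -> list[Finding]:
    """Return all pattern matches, sorted by position in the text."""
    findings = [
        Finding(kind, m.start(), m.end(), m.group())
        for kind, pattern in PATTERNS.items()
        for m in pattern.finditer(text)
    ]
    return sorted(findings, key=lambda f: f.start)
```

Each finding carries its character offsets, so a later step can review it and remove the exact span.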

Step 5: Manual Review

After automated detection, a trained reviewer should examine the records for:

  • PHI the AI may have missed: Unusual formatting, handwritten notes in scanned documents, or identifiers in unexpected locations
  • Context-dependent identifiers: Information that is identifying only in context (e.g., "the patient's neighbor John" — "John" is identifying in this context but might not be flagged by pattern-based detection)
  • Identifier #18: Unique codes, encounter numbers, and organization-specific identifiers that may not match standard patterns
  • Clinical content: Depending on the purpose, some clinical content may also need to be removed (e.g., psychiatric notes when only orthopedic records were requested)
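One hedged way to help a reviewer with context-dependent identifiers is to queue up capitalized words that follow a relationship term, as in the "neighbor John" example. The term list and the function name below are assumptions. The sketch only suggests candidates for a human to check and will miss many cases.

```python
import re

# Assumed, non-exhaustive list of relationship terms.
RELATIONSHIP_TERMS = (
    "neighbor", "brother", "sister", "mother", "father",
    "husband", "wife", "son", "daughter", "friend", "employer",
)

_CONTEXT_NAME = re.compile(
    r"\b(?:" + "|".join(RELATIONSHIP_TERMS) + r"),?\s+([A-Z][a-z]+)"
)


def flag_contextual_names(text: str) -> list[str]:
    """Return capitalized words that directly follow a relationship term."""
    return [m.group(1) for m in _CONTEXT_NAME.finditer(text)]
```

Running this on "Per the patient's neighbor John, she fell twice." yields a single candidate, "John", for the reviewer to confirm.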

Step 6: Apply Redaction

Once all items are identified and reviewed, apply the redaction. Make sure you are using a tool that permanently removes the data — not one that draws visual overlays.

The redaction should:

  • Remove the text from the document structure (not just cover it)
  • Remove the text from any searchable index within the PDF
  • Scrub document metadata
  • Handle images (redact text in clinical photos or scanned forms)
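The difference between removing text and covering it can be illustrated on plain extracted text. The sketch below assumes you already have character spans from the detection and review steps. The function name and the marker string are choices made for this example. Removal inside a PDF, metadata scrubbing, and image handling depend on your PDF tooling and are not shown.

```python
def apply_redactions(text: str, spans: list[tuple[int, int]],
                     marker: str = "[REDACTED]") -> str:
    """Replace each (start, end) span with a marker so the original characters are gone."""
    merged: list[tuple[int, int]] = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1]:
            # Overlapping or touching spans collapse into one redaction.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))

    pieces, cursor = [], 0
    for start, end in merged:
        pieces.append(text[cursor:start])
        pieces.append(marker)
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)
```

Because the output string is rebuilt without the original characters, nothing remains underneath the marker to search, select, or copy.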

Step 7: Verify the Output

Before releasing the redacted records, verify that the redaction is complete and effective.

Text search test: Search the redacted document for known PHI (patient name, SSN, MRN). No matches should be found.

Selection test: Try to select text in redacted areas. If you can select or copy hidden text, the redaction is not permanent.

Metadata check: Inspect the document properties for residual PHI in metadata fields (author, title, comments, revision history).

Visual review: Page through the document to verify that redactions are visually present where expected and that no PHI is visible.
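The text search test above can be scripted once the text of the redacted file has been extracted. The extraction step depends on your PDF tooling and is not shown. The following is a minimal sketch: the function names are assumptions, and the normalization (lowercasing and dropping punctuation and spaces) is one possible choice for catching reformatted values.

```python
import re


def _normalize(value: str) -> str:
    """Lowercase and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", value.lower())


def find_residual_phi(redacted_text: str, known_phi: list[str]) -> list[str]:
    """Return the known PHI values that still appear in the redacted text."""
    haystack = _normalize(redacted_text)
    return [
        value for value in known_phi
        if _normalize(value) and _normalize(value) in haystack
    ]
```

An empty result means none of the listed values were found. A non-empty result lists the values to investigate before release. Very short values can produce false positives under this normalization, so treat a hit as a prompt for manual review.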

Step 8: Document the Process

Maintain records of:

  • Who performed the redaction
  • What tool was used
  • What de-identification method was applied (Safe Harbor or Expert Determination)
  • Date of redaction
  • Quality assurance steps performed
  • Any items flagged for special consideration

This documentation supports compliance and provides a defensible record if the redaction is ever questioned.
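One way to keep these records consistent is a small structured log entry per redaction job. The sketch below is illustrative only. The class name, field names, and JSON layout are assumptions, not a required format.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import date


@dataclass
class RedactionLogEntry:
    performed_by: str
    tool: str
    method: str  # "Safe Harbor" or "Expert Determination"
    redaction_date: date
    qa_steps: list[str] = field(default_factory=list)
    flagged_items: list[str] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize the entry, writing the date in ISO 8601 form."""
        record = asdict(self)
        record["redaction_date"] = self.redaction_date.isoformat()
        return json.dumps(record, indent=2)
```

Storing one such entry alongside each released record set gives you a dated trail of who did what and which checks were performed.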

Special Considerations for Medical Records

Psychotherapy Notes

HIPAA provides additional protection for psychotherapy notes — the personal notes of a mental health professional documenting or analyzing the contents of a counseling session. These notes are maintained separately from the medical record and generally cannot be disclosed without specific patient authorization, even for treatment purposes.

When redacting medical records, psychotherapy notes should typically be excluded entirely rather than redacted and included.

Substance Abuse Treatment Records

Records of substance use disorder treatment programs that receive federal funding are protected under 42 CFR Part 2, which is more restrictive than HIPAA. These records generally cannot be disclosed without specific written patient consent, and the redaction standards are stricter.

If your medical records include substance abuse treatment information, consult with compliance counsel about the applicable requirements.

Genetic Information

The Genetic Information Nondiscrimination Act (GINA) provides additional protections for genetic information. When redacting medical records that include genetic test results, family health history, or genetic counseling notes, consider whether GINA's protections apply.

State-Specific Protections

Many states have privacy laws that are more protective than HIPAA for certain types of medical information:

  • HIV/AIDS status (many states require separate consent for disclosure)
  • Mental health records (some states have stricter rules than HIPAA)
  • Reproductive health information (state laws vary significantly)
  • Minor patients' records (consent and disclosure rules vary by state)

Always check applicable state law in addition to HIPAA requirements.

Tools for Medical Record Redaction

AI-Redact

AI-Redact is designed for healthcare document redaction:

  • Detects all 18 HIPAA identifiers automatically
  • Handles scanned medical records with OCR
  • Processes records in batch for large record sets
  • Generates audit trails for compliance documentation
  • HIPAA compliant with BAA available
  • SOC 2 Type II certified
  • Zero data retention

EHR-Integrated Tools

Some EHR systems include built-in redaction or de-identification modules. Epic, Cerner, and other major platforms offer export options that can filter certain PHI. However, these tools often have limited redaction customization and may not catch all identifiers.

Manual Redaction

Manual review and redaction using Adobe Acrobat Pro or similar tools. This is slow but may be appropriate for small-volume, high-sensitivity situations where every page requires careful clinical judgment.

Quality Assurance Checks

Quality assurance is not optional for medical record redaction. A single missed identifier can constitute a HIPAA breach.

Statistical Sampling

For large record sets, review a randomly selected sample of the redacted documents rather than relying on ad hoc spot checks. A common approach is to review 10% of documents or a minimum of 30 documents, whichever is greater.
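The "10% or 30, whichever is greater" rule is simple arithmetic, sketched below along with a random draw. The function names and the seed parameter are assumptions for this example. The sample size is capped at the size of the set, so a set of 20 documents is reviewed in full.

```python
import random
from typing import Optional


def qa_sample_size(total: int, percent: int = 10, minimum: int = 30) -> int:
    """Return the larger of percent% of the set and the minimum, capped at the set size."""
    if total < 0:
        raise ValueError("total must be non-negative")
    by_percent = -(-total * percent // 100)  # integer ceiling of total * percent / 100
    return min(total, max(by_percent, minimum))


def pick_qa_sample(document_ids: list[str], seed: Optional[int] = None) -> list[str]:
    """Draw a random sample of document IDs for manual QA review."""
    rng = random.Random(seed)
    return rng.sample(document_ids, qa_sample_size(len(document_ids)))
```

For example, a set of 1,000 documents yields a sample of 100, while a set of 120 documents yields 30. Passing a fixed seed makes the selection reproducible for the audit trail.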

Targeted Checks

In addition to random sampling, perform targeted checks for:

  • First and last pages of each document (registration forms and signature pages often contain the most identifiers)
  • Headers and footers (patient name and MRN often appear in headers on every page)
  • Lab results (accession numbers, ordering provider information)
  • Correspondence (letters typically contain full names and addresses)

Automated Verification

After manual verification, run the redacted documents through the AI detection tool a second time. Any remaining detected PHI indicates a redaction gap.

Common Mistakes

Missing Headers and Footers

Many medical records have the patient's name and MRN in the header or footer of every page. These are easy to overlook but appear dozens or hundreds of times in a lengthy record.
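A quick way to surface header and footer text is to look for lines that repeat on most pages. The sketch below assumes you already have the extracted text of each page as a string. The function name and the 80% threshold are arbitrary choices for illustration.

```python
from collections import Counter


def repeated_lines(pages: list[str], min_fraction: float = 0.8) -> list[str]:
    """Return lines that appear on at least min_fraction of the pages."""
    if not pages:
        return []
    counts = Counter()
    for page in pages:
        # Count each distinct non-empty line once per page.
        counts.update({line.strip() for line in page.splitlines() if line.strip()})
    threshold = min_fraction * len(pages)
    return sorted(line for line, n in counts.items() if n >= threshold)
```

Any line this returns that contains a name or MRN should be redacted on every page where it appears, not just the first.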

Incomplete Date Redaction

A frequent error is redacting the date of birth but leaving dates of service unredacted. Under Safe Harbor, all dates (except year) related to the individual must be removed.

Leaving Device Serial Numbers

Implanted device serial numbers are unique identifiers under HIPAA. They appear in operative reports and device registration forms and are often overlooked.

Not Redacting Referring Provider Names in Context

In small communities or specialized care settings, the name of a referring provider combined with a clinical specialty and treatment dates may be sufficient to identify a patient. Consider whether provider names in context are identifying.

Forgetting About Images

Clinical photographs, wound images, and scanned copies of photo IDs within the medical record need visual redaction. AI tools can detect text in images via OCR, but the actual photograph may need area-based redaction.

Frequently Asked Questions

Can patients request their own unredacted records?

Yes. Under HIPAA's Right of Access, patients have the right to access their own medical records, including information that would be redacted for other requestors. Limited exceptions apply, such as psychotherapy notes and information compiled for legal proceedings.

How long does medical record redaction take?

It depends on the volume and complexity. With AI-powered tools, a 50-page record can be processed in 5-10 minutes (including review). Manual-only redaction of the same record might take 1-2 hours. For large record sets (thousands of pages), the time difference between AI-assisted and manual approaches is even more dramatic.

Is it sufficient to remove the patient's name and SSN?

No. The Safe Harbor method requires removal of all 18 identifier categories, not just names and SSNs. Removing only a few identifiers does not achieve HIPAA-compliant de-identification.

Do we need a BAA with our redaction tool provider?

If the tool processes PHI (i.e., you upload medical records to the tool for processing), the tool provider is a business associate under HIPAA, and a BAA is required. AI-Redact offers BAAs for healthcare organizations.

What about de-identified data sets for research?

For research use, you can either use the Safe Harbor method (remove all 18 identifiers) or the Expert Determination method (have a qualified expert, such as a statistician, determine and document that the re-identification risk is very small). The Expert Determination method allows retention of more data elements, which may be valuable for research purposes.

Conclusion

Redacting medical records before disclosure is a HIPAA obligation that demands thoroughness, consistency, and proper tools. The 18 HIPAA identifiers define a broad scope of information that must be removed, and the consequences of failure (civil fines that can reach tens of thousands of dollars per violation, plus potential criminal penalties) make cutting corners unacceptable.

The most effective approach combines AI-powered automated detection with human review. AI handles the high-volume, pattern-based identification of PHI across the entire record set, while human reviewers address context-dependent identifiers and clinical judgment calls. Quality assurance verifies the output before release.

If your organization processes medical records for sharing, invest in proper tools and documented procedures. AI-Redact is built for healthcare document redaction — HIPAA compliant, BAA available, and capable of detecting all 18 identifier types automatically. Try it free.
