PII Detection & Redaction

Detect and protect personally identifiable information in prompts with multilingual ML-based scanning. Automatically redact PII before it reaches your LLM provider.

What is PII Detection?

PII (Personally Identifiable Information) Detection is an opt-in feature that scans prompts for sensitive personal data before they reach your LLM provider. Using an advanced ML-based model, it identifies 21 types of personal information - from names and emails to credit card numbers and Social Security numbers - and gives you three ways to handle detected PII.

Key benefits:

  • Multilingual support - detects PII across multiple languages automatically, not just English
  • 21 entity types detected including names, financial data, government IDs, and more
  • Three flexible actions - warn, block, or automatically strip PII from prompts
  • Strip mode ensures your LLM never sees actual personal information
  • Privacy compliance support for GDPR, HIPAA, CCPA, and other regulations
  • Works in both scan endpoint and proxy mode
  • Fail-open design - detection issues never block your requests
  • Combines seamlessly with threat detection, custom policies, and prompt compression

Multilingual Capabilities

LockLLM's PII detection model is built on a multilingual ML architecture, so it works across multiple languages out of the box. No configuration is needed - multilingual detection is always active.

What this means for you:

  • Names, addresses, phone numbers, and other entities are detected regardless of the language they are written in
  • International formats are recognized (e.g., European phone numbers, non-US address formats, names in various scripts)
  • Mixed-language prompts are handled naturally - PII is detected even when the prompt switches between languages
  • Ideal for global applications serving users in different countries and languages

Supported languages include (but are not limited to): English, Spanish, French, German, Italian, Portuguese, Dutch, and other major languages. The ML model handles variations in formatting, abbreviations, and regional conventions.

Supported Entity Types

LockLLM detects 21 types of personally identifiable information, grouped by category:

Identity

| Entity Type | Examples |
|---|---|
| First Name | John, Maria, Wei |
| Last Name | Smith, Garcia, Chen |
| Date of Birth | 01/15/1990, January 15, 1990 |
| Username | john_doe, user123 |

Contact Information

| Entity Type | Examples |
|---|---|
| Email | john@example.com |
| Phone Number | (555) 123-4567, +44 20 7946 0958 |
| Street Address | 123 Main Street, Apt 4B |
| City | New York, London, Tokyo |
| State | California, CA |
| Zip Code | 90210, SW1A 1AA |
| Building Number | Suite 400, Floor 12 |
| Secondary Address | P.O. Box 1234 |

Financial

| Entity Type | Examples |
|---|---|
| Credit Card | 4111-1111-1111-1111 |
| Account Number | 1234567890 |
| Tax ID | 12-3456789 |

Government IDs

| Entity Type | Examples |
|---|---|
| Social Security Number | 123-45-6789 |
| Driver's License | D1234567 |
| ID Card Number | AB1234567 |

Security & Network

| Entity Type | Examples |
|---|---|
| Password | P@ssw0rd123 |
| IP Address | 192.168.1.1, 2001:db8::1 |
| URL | https://example.com/profile |

Actions

PII detection is opt-in and controlled via the X-LockLLM-PII-Action header. Three actions are available:

allow_with_warning

Detect PII and include the results in the response, but forward the original (unmodified) request to your LLM provider.

Use when: You want visibility into what PII is being sent but don't want to modify or block requests. Good for monitoring and auditing.

block

Reject any request that contains PII with a 403 Forbidden error. The request is never forwarded to your LLM provider.

Use when: You have strict compliance requirements and personal data must never reach your AI provider under any circumstances.

strip

Automatically detect PII and replace each entity with a [TYPE] placeholder before forwarding the request. Your LLM receives the redacted text and never sees the actual personal information.

Use when: You want the best of both worlds - your LLM can still understand the context of the request while actual personal data stays protected.

Strip Mode Deep Dive

Strip mode is the recommended action for most privacy-conscious applications. Here's how it works:

Before and After

Original prompt (what the user sends):

My name is John Smith and I live at 742 Evergreen Terrace, Springfield.
You can reach me at john.smith@example.com or call (555) 123-4567.
My SSN is 123-45-6789.

Redacted prompt (what your LLM receives):

My name is [GIVENNAME] [SURNAME] and I live at [STREETADDRESS], [CITY].
You can reach me at [EMAIL] or call [TELEPHONENUM].
My SSN is [SOCIALNUM].

The LLM can still understand that the user is sharing contact information and asking about something related to their personal details, but it never sees the actual values. This is especially valuable for:

  • Customer support chatbots that process user inquiries
  • Healthcare applications where patient information must be protected
  • Financial services where personal data appears in user queries
  • Any application where users might accidentally share sensitive information
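The redaction step can be sketched with plain regular expressions. This is illustrative only - LockLLM's detection is ML-based and multilingual, while these hand-written patterns only catch a few rigidly formatted entity types:

```python
import re

# Illustrative strip-mode sketch. These regexes are NOT how LockLLM
# detects PII (it uses an ML model); they only demonstrate the
# placeholder-substitution behavior for pattern-based entities.
PATTERNS = {
    "[EMAIL]": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "[SOCIALNUM]": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "[TELEPHONENUM]": re.compile(r"\(\d{3}\) \d{3}-\d{4}"),
}

def strip_pii(text: str) -> str:
    """Replace pattern-matched PII with [TYPE] placeholders."""
    for placeholder, pattern in PATTERNS.items():
        text = pattern.sub(placeholder, text)
    return text

prompt = "Reach me at john.smith@example.com or call (555) 123-4567. SSN: 123-45-6789."
print(strip_pii(prompt))
# → "Reach me at [EMAIL] or call [TELEPHONENUM]. SSN: [SOCIALNUM]."
```

The semantic structure of the prompt survives the substitution, which is why the LLM can still act on the redacted text.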

Placeholder Types

Each detected entity is replaced with a descriptive placeholder that preserves context:

| Detected Entity | Placeholder |
|---|---|
| First Name | [GIVENNAME] |
| Last Name | [SURNAME] |
| Email | [EMAIL] |
| Phone Number | [TELEPHONENUM] |
| Street Address | [STREETADDRESS] |
| City | [CITY] |
| Credit Card | [CREDITCARDNUMBER] |
| Social Security Number | [SOCIALNUM] |
| Date of Birth | [DATEOFBIRTH] |
| Driver's License | [DRIVERSLICENSE] |
| Password | [PASSWORD] |
| IP Address | [IPADDRESS] |
| Account Number | [ACCOUNTNUM] |
| Tax ID | [TAXID] |
| ID Card Number | [IDCARDNUM] |
| Username | [USERNAME] |
| Zip Code | [ZIPCODE] |
| Building Number | [BUILDINGNUM] |
| Secondary Address | [SECONDARYADDRESS] |
| State | [STATE] |
| URL | [URL] |

Configuration

Headers

| Header | Values | Default | Description |
|---|---|---|---|
| X-LockLLM-PII-Action | allow_with_warning, block, strip | Not set (disabled) | How to handle detected PII |

Scan Endpoint

curl -X POST https://api.lockllm.com/v1/scan \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-LockLLM-PII-Action: strip" \
  -d '{
    "input": "Contact John Smith at john@example.com or 555-123-4567"
  }'

Proxy Mode - JavaScript/TypeScript

import OpenAI from 'openai'

const openai = new OpenAI({
  apiKey: process.env.LOCKLLM_API_KEY,
  baseURL: 'https://api.lockllm.com/v1/proxy/openai',
  defaultHeaders: {
    'X-LockLLM-PII-Action': 'strip'
  }
})

// User sends: "Contact John Smith at john@example.com"
// LLM receives: "Contact [GIVENNAME] [SURNAME] at [EMAIL]"
const response = await openai.chat.completions.create({
  model: 'gpt-4',
  messages: [{ role: 'user', content: userPrompt }]
})

Proxy Mode - Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get('LOCKLLM_API_KEY'),
    base_url='https://api.lockllm.com/v1/proxy/openai',
    default_headers={
        'X-LockLLM-PII-Action': 'strip'
    }
)

response = client.chat.completions.create(
    model='gpt-4',
    messages=[{'role': 'user', 'content': user_prompt}]
)

LockLLM SDK - JavaScript/TypeScript

import { createOpenAI } from '@lockllm/sdk/wrappers'

const openai = createOpenAI({
  apiKey: process.env.LOCKLLM_API_KEY,
  proxyOptions: {
    piiAction: 'strip'
  }
})

LockLLM SDK - Python

import os

from lockllm import create_openai, ProxyOptions

openai = create_openai(
    api_key=os.getenv('LOCKLLM_API_KEY'),
    proxy_options=ProxyOptions(pii_action='strip')
)

Response Format

Scan Endpoint Response

When PII is detected, the response includes a pii_result object:

{
  "request_id": "req_abc123",
  "safe": true,
  "confidence": 95,
  "injection": 3,
  "pii_result": {
    "detected": true,
    "entity_types": ["First Name", "Last Name", "Email", "Phone Number"],
    "entity_count": 4,
    "redacted_input": "Contact [GIVENNAME] [SURNAME] at [EMAIL] or [TELEPHONENUM]"
  }
}

The redacted_input field is only included when the action is strip.
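A minimal sketch of consuming this response in Python, using the example payload above. The prompt_to_forward helper is hypothetical application code, not part of any LockLLM SDK:

```python
# Example /v1/scan response body, copied from the documentation above.
scan_response = {
    "request_id": "req_abc123",
    "safe": True,
    "confidence": 95,
    "injection": 3,
    "pii_result": {
        "detected": True,
        "entity_types": ["First Name", "Last Name", "Email", "Phone Number"],
        "entity_count": 4,
        "redacted_input": "Contact [GIVENNAME] [SURNAME] at [EMAIL] or [TELEPHONENUM]",
    },
}

def prompt_to_forward(response: dict, original: str) -> str:
    """Prefer the redacted prompt when strip produced one, else keep the original."""
    pii = response.get("pii_result") or {}
    if pii.get("detected") and "redacted_input" in pii:
        return pii["redacted_input"]
    return original

forwarded = prompt_to_forward(scan_response, "Contact John Smith at john@example.com or 555-123-4567")
print(forwarded)
```

Because redacted_input only appears under the strip action, falling back to the original prompt keeps the same code path working for allow_with_warning.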

Proxy Mode Response Headers

| Header | Description |
|---|---|
| X-LockLLM-PII-Detected | "true" or "false" |
| X-LockLLM-PII-Types | Comma-separated entity types found (e.g., "Email,Phone Number") |
| X-LockLLM-PII-Count | Number of PII entities detected |
| X-LockLLM-PII-Action | The action that was applied |
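A small sketch of reading these headers in application code. The header names come from the table above; the parsing helper itself is hypothetical:

```python
# Parse LockLLM's proxy-mode PII metadata headers into a plain dict.
# Header names are documented; this helper is illustrative application code.
def parse_pii_headers(headers: dict) -> dict:
    types = headers.get("X-LockLLM-PII-Types", "")
    return {
        "detected": headers.get("X-LockLLM-PII-Detected") == "true",
        "entity_types": [t.strip() for t in types.split(",") if t.strip()],
        "entity_count": int(headers.get("X-LockLLM-PII-Count", 0)),
        "action": headers.get("X-LockLLM-PII-Action"),
    }

meta = parse_pii_headers({
    "X-LockLLM-PII-Detected": "true",
    "X-LockLLM-PII-Types": "Email,Phone Number",
    "X-LockLLM-PII-Count": "2",
    "X-LockLLM-PII-Action": "strip",
})
```

This is useful for logging PII metadata (types and counts only, never values) alongside your normal request telemetry.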

Block Mode Error Response

When X-LockLLM-PII-Action: block and PII is detected:

{
  "error": {
    "message": "Request blocked due to personal information detected",
    "type": "lockllm_pii_error",
    "code": "pii_detected",
    "pii_details": {
      "entity_types": ["Email", "Phone Number", "Social Security Number"],
      "entity_count": 3
    },
    "request_id": "req_abc123"
  }
}

Combining with Other Features

PII detection integrates with all other LockLLM features. The processing order in proxy mode is:

Security scan -> PII detection/redaction -> Prompt compression -> Forward to provider

Important interactions:

  • Prompt compression: When PII stripping is enabled, compression is applied to the redacted text. This means compressed prompts never contain original PII values.
  • Threat detection: Security scanning runs on the original text before PII processing, ensuring no attacks are missed.
  • Custom policies: Policy checks run on the original text alongside threat detection.
  • Smart routing: Routing decisions are made independently of PII detection.

Use Cases

Healthcare (HIPAA Compliance)

Protect patient information in medical AI applications:

import OpenAI from 'openai'

const openai = new OpenAI({
  apiKey: process.env.LOCKLLM_API_KEY,
  baseURL: 'https://api.lockllm.com/v1/proxy/openai',
  defaultHeaders: {
    'X-LockLLM-PII-Action': 'strip',     // Redact patient PII
    'X-LockLLM-Scan-Action': 'block',     // Block injection attacks
    'X-LockLLM-Policy-Action': 'block'    // Block policy violations
  }
})

Financial Services

Prevent credit card numbers, account numbers, and tax IDs from reaching your LLM:

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get('LOCKLLM_API_KEY'),
    base_url='https://api.lockllm.com/v1/proxy/openai',
    default_headers={
        'X-LockLLM-PII-Action': 'block'   # Reject requests with financial PII
    }
)

Customer Support

Allow customer inquiries through while protecting personal information:

// User: "My name is Jane Doe, order #12345, email jane@example.com"
// LLM receives: "My name is [GIVENNAME] [SURNAME], order #12345, email [EMAIL]"
// The LLM can help with the order without knowing the customer's real identity

Education

Protect student information in educational AI tools while preserving the learning context.

Privacy Compliance

PII detection helps you meet the requirements of major privacy regulations by preventing personal data from reaching third-party AI providers.

GDPR (EU)

The General Data Protection Regulation requires data minimization (Article 5) and data protection by design (Article 25). PII detection directly supports both principles:

  • Strip mode ensures only the minimum necessary data reaches your AI provider - personal identifiers are replaced with placeholders before the request leaves LockLLM
  • Block mode prevents any request containing personal data from being processed, supporting strict data handling policies
  • LockLLM does not store prompt content or detected PII values - data is processed in memory and immediately discarded, supporting your data retention compliance

HIPAA (US Healthcare)

Protected Health Information (PHI) includes patient names, dates, contact information, and identification numbers. PII detection catches names, dates of birth, Social Security numbers, phone numbers, emails, and addresses. Combine with custom policies to add healthcare-specific restrictions (e.g., blocking medical diagnosis requests).

CCPA (California)

The California Consumer Privacy Act defines "personal information" broadly to include identifiers, contact details, financial data, and more. PII detection covers the key categories: names, emails, phone numbers, addresses, Social Security numbers, driver's licenses, and financial account information.

Key Compliance Benefit

Across all regulations, a critical advantage of PII detection is that LockLLM never stores the personal data it detects. The detection runs in memory, results are returned immediately, and the original data is discarded. This minimizes your data processing footprint and reduces compliance risk.

Detection Accuracy by Entity Type

Detection accuracy varies by entity type based on how structured the data is. Understanding these differences helps you choose the right action mode for your use case.

| Accuracy Level | Entity Types | Why |
|---|---|---|
| Very high | Email, Phone Number, Credit Card, Social Security Number, IP Address, URL | These follow strict, well-defined patterns that are reliably detected regardless of context |
| High | Date of Birth, Zip Code, Tax ID, Driver's License, Account Number, ID Card Number | Structured data with some format variation across regions and conventions |
| Context-dependent | First Name, Last Name, City, State, Street Address, Building Number, Secondary Address, Username, Password | Detection relies on surrounding text and context to distinguish these from ordinary words |

For maximum protection with context-dependent entities, use block mode and handle any edge cases in your application. For pattern-based entities (very high and high accuracy), strip mode works reliably with minimal false positives.
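That guidance can be encoded as a small helper. This is illustrative only - the tier set mirrors the accuracy table above, and the recommendation logic is a suggested heuristic, not a LockLLM API:

```python
# Pattern-based entity types (the "very high" and "high" accuracy tiers
# from the table above). The recommendation heuristic below is an
# example, not an official LockLLM rule.
PATTERN_BASED = {
    "Email", "Phone Number", "Credit Card", "Social Security Number",
    "IP Address", "URL", "Date of Birth", "Zip Code", "Tax ID",
    "Driver's License", "Account Number", "ID Card Number",
}

def recommended_action(required_entities: set[str]) -> str:
    """Suggest strip when every required entity is pattern-based, else block."""
    return "strip" if required_entities <= PATTERN_BASED else "block"
```

For example, an app that only needs to catch emails and card numbers can rely on strip, while one that must also catch names would get block recommended.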

Strip Mode in Practice

Strip mode replaces detected PII with descriptive placeholders while preserving the meaning of the request. Here are examples across different use cases:

Customer Support

Original:

Hi, I'm Sarah Chen. My order #45678 was shipped to 42 Oak Lane, Portland, OR 97201.
Can you check the status? My phone is 503-555-0147.

What your LLM receives:

Hi, I'm [GIVENNAME] [SURNAME]. My order #45678 was shipped to [STREETADDRESS], [CITY], [STATE] [ZIPCODE].
Can you check the status? My phone is [TELEPHONENUM].

The order number passes through because it is not PII. The LLM can still help with the order inquiry without knowing the customer's real identity.

Financial Query

Original:

Transfer $500 from account 1234567890 to John Doe, routing number 021000021.

What your LLM receives:

Transfer $500 from account [ACCOUNTNUM] to [GIVENNAME] [SURNAME], routing number [ACCOUNTNUM].

Dollar amounts and transaction types pass through. Account numbers and names are redacted.

Healthcare Intake

Original:

Patient: Maria Garcia, DOB: 03/15/1985, SSN: 456-78-9012.
Complaint: persistent headache for 3 days, no prior history of migraines.

What your LLM receives:

Patient: [GIVENNAME] [SURNAME], DOB: [DATEOFBIRTH], SSN: [SOCIALNUM].
Complaint: persistent headache for 3 days, no prior history of migraines.

Medical symptoms and clinical details pass through because they are not personally identifiable on their own. Patient identifiers are redacted.

Multilingual Detection Examples

PII detection works across multiple languages automatically. No configuration changes are needed - the ML model handles different languages, scripts, and regional formats out of the box.

Spanish:

Me llamo Carlos Rodriguez, mi correo es carlos@ejemplo.com

Detected: First Name, Last Name, Email

French:

Mon numéro de téléphone est +33 1 23 45 67 89, j'habite à Paris

Detected: Phone Number, City

German:

Meine Adresse ist Berliner Str. 42, 10115 Berlin

Detected: Street Address, Zip Code, City

Mixed-language prompts are also handled naturally. If a prompt switches between English and another language, PII is detected in both languages within the same request.

Pricing

| Scenario | Cost |
|---|---|
| No PII detected | FREE |
| PII detected (any action) | $0.0001 per detection |

  • PII detection is opt-in (disabled by default)
  • You are only charged when PII is actually found
  • The fee applies regardless of which action you choose (warn, block, or strip)
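A quick back-of-envelope estimate based on this pricing (hypothetical helper, not an official calculator):

```python
# Per the pricing table: scans with no PII are free; each scan where
# PII is found costs $0.0001, whatever action is configured.
FEE_PER_DETECTION = 0.0001

def monthly_pii_cost(scans_with_pii: int) -> float:
    """Estimated monthly PII-detection fees in USD."""
    return scans_with_pii * FEE_PER_DETECTION

# e.g. 100,000 prompts containing PII per month:
print(f"${monthly_pii_cost(100_000):.2f}")  # prints $10.00
```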

Limitations

  • Detection accuracy may vary by entity type - common entities like emails and phone numbers are detected with very high accuracy, while more ambiguous entities depend on surrounding context
  • Works best with clearly structured text; heavily obfuscated or encoded PII may not be detected
  • The ML model is continuously improved to expand language coverage and detection accuracy

FAQ

Does PII detection work with non-English text?

Yes. LockLLM's PII detection model is multilingual and detects personal information across multiple languages automatically. No additional configuration is needed - simply enable PII detection and it works regardless of the input language.

Can I choose which entity types to detect?

Currently, PII detection scans for all 21 supported entity types in every request. You cannot selectively enable or disable individual entity types. However, you can use the allow_with_warning action and filter the results in your application based on the entity_types reported in the response.
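A sketch of that filtering approach (hypothetical application code; the sensitive-type set is an example):

```python
# With allow_with_warning, the request is forwarded unmodified but the
# response reports what was found. Filter on entity_types in your app
# to react only to the types you care about.
SENSITIVE = {"Social Security Number", "Credit Card", "Account Number"}

def needs_review(pii_result: dict) -> bool:
    """True if any detected entity type is in our sensitive set."""
    if not pii_result.get("detected"):
        return False
    return any(t in SENSITIVE for t in pii_result.get("entity_types", []))

result = {"detected": True, "entity_types": ["Email", "Credit Card"], "entity_count": 2}
print(needs_review(result))  # prints True
```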

What happens if PII detection is temporarily unavailable?

PII detection uses a fail-open design. If the detection service is temporarily unreachable, your request proceeds normally without PII scanning. This ensures your application's availability is never impacted. The response headers will indicate that PII detection was not applied.

Does stripping PII affect response quality?

In most cases, the LLM can understand the context of the request even with PII replaced by placeholders. For example, "Help [GIVENNAME] [SURNAME] update their [EMAIL]" is clear enough for the model to provide a helpful response. The placeholders preserve the semantic structure of the original text.

Can I use PII detection in the scan endpoint?

Yes. PII detection works in both the scan endpoint (/v1/scan) and proxy mode (/v1/proxy). In the scan endpoint, the pii_result object is included in the response body. In proxy mode, PII metadata is provided via response headers.

Is detected PII stored or logged?

No. LockLLM does not store prompt content or detected PII values. Only metadata is logged (entity types, counts, and whether PII was detected). The actual personal information is processed in memory and immediately discarded.

Does PII detection work with all providers?

Yes. PII detection works with all 17+ supported providers in proxy mode, and independently in the scan endpoint. It is applied before the request reaches any provider, so it works universally.
