Proxy Mode
Automatically scan all LLM requests with custom policies, smart routing, PII redaction, prompt compression, and injection and abuse detection. Zero code changes required.
What is Proxy Mode?
Proxy Mode is LockLLM's automatic security solution that scans all your LLM requests without changing your code. Simply change your base URL and either use your own API keys (BYOK) or LockLLM credits - that's it!
Available in two modes:
- BYOK Mode: Use your own provider API keys (OpenAI, Anthropic, etc.) - you only pay LockLLM for security detections and routing optimization
- Universal Mode: Use LockLLM credits for everything via OpenRouter (200+ models) - no provider keys needed
Key benefits:
- Zero code changes required
- Works with official SDKs (OpenAI, Anthropic, etc.)
- Automatic scanning of all requests
- Supports 17+ LLM providers with custom endpoint support
- Your API keys are securely stored
- Pay only for security detections and smart routing
- Free tier with monthly credits
How It Works
BYOK Mode (recommended for production):
- Add your provider API key (OpenAI, Anthropic, etc.) to the LockLLM dashboard
- Change your base URL to https://api.lockllm.com/v1/proxy/{provider}
- All requests are automatically scanned before being forwarded to the provider
- By default, threats are detected with warnings but NOT blocked (use headers to enable blocking)
- You pay the provider directly for LLM usage; LockLLM charges only for security
Universal Mode (no provider keys needed):
- Change your base URL to https://api.lockllm.com/v1/proxy/chat/completions
- All requests are scanned and forwarded via OpenRouter (200+ models)
- Everything billed via LockLLM credits
Your App → LockLLM Proxy (scan + route) → Provider (OpenAI/Anthropic/etc)
✓ Safe: Forward
✗ Malicious (default): Forward with warning
✗ Malicious (block mode): Block request
Pricing
LockLLM uses a transparent, usage-based pricing model. You only pay for security detections and smart routing when they detect actual threats or optimize your costs.
What You Pay For
1. Security Detection Fees (charged only when threats are found):
- Safe prompts: FREE - No charge when prompt passes security checks
- Unsafe core scan (prompt injection detected): $0.0001 per detection
- Policy violation (custom policy triggered): $0.0001 per detection
- Both unsafe (injection + policy violation): $0.0002 per detection
- PII detected (personal information found): $0.0001 per detection
- Prompt compression (TOON): FREE
- Prompt compression (Compact): $0.0001 per use
- Maximum per request (injection + policy + PII + compact compression): $0.0004
2. Smart Routing Fees (optional, charged only when saving you money):
- When routing to a cheaper model: 5% of cost savings
- When routing to a more expensive or same-cost model: FREE
- When routing is disabled: FREE
3. LLM API Usage:
- BYOK (Bring Your Own Key): FREE - You pay your provider directly
- Non-BYOK (Universal Endpoint): Variable cost based on model usage via LockLLM credits
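The fee schedule above can be sketched as a small calculator. This is illustrative only: the flat $0.0001-per-detection rates and the 5%-of-savings routing fee come from the lists above, while the function names are ours.

```javascript
// Illustrative sketch of the per-request security detection fees listed above.
// Safe prompts and TOON compression are free; each detection costs a flat $0.0001.
function detectionFee({ injection = false, policyViolation = false, pii = false, compactCompression = false } = {}) {
  const RATE = 0.0001 // USD per detection
  return [injection, policyViolation, pii, compactCompression].filter(Boolean).length * RATE
}

// Smart routing fee: 5% of cost savings when routed to a cheaper model, otherwise free.
function routingFee(originalCost, routedCost) {
  const savings = originalCost - routedCost
  return savings > 0 ? 0.05 * savings : 0
}
```

A request flagged for injection, a policy violation, and PII, plus compact compression, hits the documented per-request maximum of $0.0004.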
BYOK vs Non-BYOK
BYOK Mode (Provider-Specific Endpoints):
- Use your own API keys for OpenAI, Anthropic, Gemini, etc.
- You pay your provider directly for LLM usage
- LockLLM charges only for security detections and routing optimization
- Example: https://api.lockllm.com/v1/proxy/openai
Non-BYOK Mode (Universal Endpoint):
- Use LockLLM credits for LLM usage (200+ models)
- Billed for security detections, routing optimization, and LLM usage
- No need to configure provider API keys
- Example: https://api.lockllm.com/v1/proxy/chat/completions
Free Monthly Credits
All users receive free monthly credits based on their tier (1-10). Higher tiers unlock more free credits and higher rate limits. Learn more about the tier system.
Example Cost Breakdown
Scenario 1: BYOK user with 100 safe requests
- Security scanning: $0 (all prompts safe)
- Routing fees: $0 (not using routing)
- LLM usage: $5 (paid directly to OpenAI)
- Total LockLLM cost: $0
Scenario 2: BYOK user with 2 unsafe prompts detected
- Security scanning: $0.0002 (2 × $0.0001)
- Routing fees: $0
- LLM usage: $0 (block mode enabled, so unsafe prompts were not forwarded)
- Total LockLLM cost: $0.0002
Scenario 3: BYOK user with routing enabled (saved $10 in costs)
- Security scanning: $0.0001 (1 unsafe prompt)
- Routing fees: $0.50 (5% of $10 savings)
- LLM usage: $40 (paid directly to provider after routing)
- Total LockLLM cost: $0.5001 (but you saved $9.50 overall!)
Scenario 4: Non-BYOK user (universal endpoint)
- Security scanning: $0.0001
- Routing fees: $0.50
- LLM usage: $15 (via LockLLM credits)
- Total LockLLM cost: $15.5001
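Scenario 3's total can be reproduced directly from the fee rules (illustrative arithmetic only; variable names are ours):

```javascript
// Reproduce Scenario 3 above: one unsafe prompt detected, plus routing that
// saved $10 in provider costs. Rates come from the pricing lists above.
const securityFee = 1 * 0.0001           // one unsafe detection
const routingFee = 0.05 * 10             // 5% of the $10 savings
const totalLockLLMCost = securityFee + routingFee
console.log(totalLockLLMCost.toFixed(4)) // matches the $0.5001 in the scenario
```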
Supported Providers
LockLLM proxy mode supports 17 providers, each with a dedicated proxy URL:
| Provider | Proxy URL |
|---|---|
| OpenAI | https://api.lockllm.com/v1/proxy/openai |
| Anthropic | https://api.lockllm.com/v1/proxy/anthropic |
| Google Gemini | https://api.lockllm.com/v1/proxy/gemini |
| Cohere | https://api.lockllm.com/v1/proxy/cohere |
| OpenRouter | https://api.lockllm.com/v1/proxy/openrouter |
| Perplexity | https://api.lockllm.com/v1/proxy/perplexity |
| Mistral AI | https://api.lockllm.com/v1/proxy/mistral |
| Groq | https://api.lockllm.com/v1/proxy/groq |
| DeepSeek | https://api.lockllm.com/v1/proxy/deepseek |
| Together AI | https://api.lockllm.com/v1/proxy/together |
| xAI (Grok) | https://api.lockllm.com/v1/proxy/xai |
| Fireworks AI | https://api.lockllm.com/v1/proxy/fireworks |
| Anyscale | https://api.lockllm.com/v1/proxy/anyscale |
| Hugging Face | https://api.lockllm.com/v1/proxy/huggingface |
| Azure OpenAI | https://api.lockllm.com/v1/proxy/azure |
| AWS Bedrock | https://api.lockllm.com/v1/proxy/bedrock |
| Google Vertex AI | https://api.lockllm.com/v1/proxy/vertex-ai |
Custom Endpoints
All providers support custom endpoint URLs for self-hosted models, alternative endpoints, or proxy/gateway services. When adding your API key in the dashboard, you can optionally specify a custom endpoint URL to override the default.
Universal Endpoint (No BYOK Required)
Don't want to configure provider API keys? Use the universal endpoint with LockLLM credits:
Endpoint: https://api.lockllm.com/v1/proxy/chat/completions
Features:
- Access 200+ models
- No provider API keys needed
- No BYOK configuration required
- Everything billed via LockLLM credits
- Same security scanning and routing features
You can browse all supported models and their IDs in the Model List page in your dashboard. When making requests, you must use the exact model ID shown there (e.g., openai/gpt-4).
Example:
const OpenAI = require('openai')
const client = new OpenAI({
apiKey: process.env.LOCKLLM_API_KEY, // Your LockLLM API key only
baseURL: 'https://api.lockllm.com/v1/proxy/chat/completions'
})
// Use the model ID from the Model List page
const response = await client.chat.completions.create({
model: 'openai/gpt-4', // Must match the model ID exactly
messages: [{ role: 'user', content: userPrompt }]
})
When to use:
- Testing without provider API keys
- Access to multiple providers without separate keys
- Simplified billing (one invoice from LockLLM)
- Rapid prototyping
When to use BYOK instead:
- Production applications (lower costs)
- You already have provider API keys
- Want direct provider billing
- Need provider-specific features
Quick Start
Step 1: Add Your Provider API Key to Dashboard
- Sign in to your LockLLM dashboard
- Navigate to Proxy Settings (or API Keys)
- Click Add API Key
- Select your provider (e.g., OpenAI, Anthropic, Azure)
- Enter your provider API key (from OpenAI, Anthropic, etc.)
- Give it a nickname (optional, e.g., "Production Key")
- Click Add API Key
Step 2: Get Your LockLLM API Key
- In the LockLLM dashboard, go to API Keys section
- Copy your LockLLM API key
- You'll use this to authenticate proxy requests (not your provider key!)
Step 3: Update Your Code
Change your SDK configuration to use LockLLM's proxy. Important: Pass your LockLLM API key (not your provider key) for authentication.
OpenAI SDK (JavaScript/TypeScript)
const OpenAI = require('openai')
const openai = new OpenAI({
apiKey: process.env.LOCKLLM_API_KEY, // Your LockLLM API key (...)
baseURL: 'https://api.lockllm.com/v1/proxy/openai'
})
// All requests automatically scanned with default allow_with_warning behavior
// Threats are detected and warnings added to response, but requests are NOT blocked
const response = await openai.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: userPrompt }]
})
Anthropic SDK (JavaScript/TypeScript)
const Anthropic = require('@anthropic-ai/sdk')
const anthropic = new Anthropic({
apiKey: process.env.LOCKLLM_API_KEY, // Your LockLLM API key
baseURL: 'https://api.lockllm.com/v1/proxy/anthropic'
})
// Automatically scanned!
const response = await anthropic.messages.create({
model: 'claude-3-opus-20240229',
max_tokens: 1024,
messages: [{ role: 'user', content: userPrompt }]
})
Python OpenAI
import os
from openai import OpenAI
client = OpenAI(
api_key=os.environ.get('LOCKLLM_API_KEY'), # Your LockLLM API key
base_url='https://api.lockllm.com/v1/proxy/openai'
)
# Automatically scanned!
response = client.chat.completions.create(
model='gpt-4',
messages=[{'role': 'user', 'content': user_prompt}]
)
Python Anthropic
import os
from anthropic import Anthropic
client = Anthropic(
api_key=os.environ.get('LOCKLLM_API_KEY'), # Your LockLLM API key (...)
base_url='https://api.lockllm.com/v1/proxy/anthropic'
)
# Automatically scanned!
response = client.messages.create(
model='claude-3-opus-20240229',
max_tokens=1024,
messages=[{'role': 'user', 'content': user_prompt}]
)
How Authentication Works
Important: Understanding the two types of API keys:
- Provider API Key (OpenAI, Anthropic, etc.):
  - You add this to the LockLLM dashboard once
  - Stored securely and encrypted
  - Never put this in your code when using the proxy
- LockLLM API Key:
  - You pass this in your SDK configuration (apiKey parameter)
  - Authenticates your requests to the LockLLM proxy
  - This is what goes in your code
Provider-Specific Configuration
Azure OpenAI
Azure requires additional configuration:
Dashboard Setup:
- Select Azure OpenAI as provider
- Enter your Azure OpenAI API key
- Enter your Endpoint URL (e.g., https://your-resource.openai.azure.com)
- Enter your Deployment Name (e.g., gpt-4)
- Enter API Version (e.g., 2024-10-21) - optional, defaults to latest
Code:
const OpenAI = require('openai')
const client = new OpenAI({
apiKey: process.env.LOCKLLM_API_KEY, // Your LockLLM API key (NOT Azure key)
baseURL: 'https://api.lockllm.com/v1/proxy/azure'
})
// Use Azure OpenAI models
const response = await client.chat.completions.create({
model: 'gpt-4', // Uses your configured deployment
messages: [{ role: 'user', content: userPrompt }]
})
Azure API Format Support:
LockLLM supports both Azure OpenAI API formats:
- Legacy format (deployment-based): /openai/deployments/{deployment}/chat/completions?api-version=2024-10-21
- v1 API format (preview): /openai/v1/chat/completions?api-version=2024-10-21 with the deployment specified in a header
The proxy automatically handles both formats. When you configure your deployment name in the dashboard, it's used for all requests.
AWS Bedrock
Bedrock requires AWS credentials:
Dashboard Setup:
- Select AWS Bedrock as provider
- Enter your AWS credentials JSON:
{
"accessKeyId": "your-access-key",
"secretAccessKey": "your-secret-key",
"region": "us-east-1"
}
Code:
// Use with Bedrock-compatible SDK or direct fetch
const response = await fetch('https://api.lockllm.com/v1/proxy/bedrock/model/anthropic.claude-3-sonnet-20240229-v1:0/invoke', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': 'Bearer YOUR_LOCKLLM_API_KEY' // Your LockLLM API key
},
body: JSON.stringify({
anthropic_version: 'bedrock-2023-05-31',
max_tokens: 1024,
messages: [{ role: 'user', content: userPrompt }]
})
})
Google Vertex AI
Vertex AI requires service account credentials:
Dashboard Setup:
- Select Vertex AI as provider
- Enter your Google Cloud project ID
- Enter your service account JSON key
Code:
// Use with Vertex AI-compatible SDK or direct fetch
const response = await fetch('https://api.lockllm.com/v1/proxy/vertex-ai/v1/projects/YOUR_PROJECT/locations/us-central1/publishers/anthropic/models/claude-3-opus@20240229:streamRawPredict', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': 'Bearer YOUR_LOCKLLM_API_KEY' // Your LockLLM API key
},
body: JSON.stringify({
anthropic_version: 'vertex-2023-10-16',
max_tokens: 1024,
messages: [{ role: 'user', content: userPrompt }]
})
})
Multiple Keys for Same Provider
You can add multiple API keys for the same provider with different nicknames:
Dashboard Setup:
- Add multiple keys for the same provider (e.g., "OpenAI - Production" and "OpenAI - Development")
- Give each a unique nickname
- Only one can be enabled at a time for each provider
Code:
// The proxy uses the enabled key for the provider
const openai = new OpenAI({
apiKey: process.env.LOCKLLM_API_KEY, // Your LockLLM API key
baseURL: 'https://api.lockllm.com/v1/proxy/openai'
})
// Whichever OpenAI key is enabled in your dashboard will be used
Note: The proxy automatically selects the enabled key for each provider. To switch between keys, enable/disable them in the dashboard.
Managing Provider Keys
View Your Keys
Visit the Proxy Settings page in your dashboard to see all configured provider keys:
- Provider name
- Nickname
- Last used timestamp
- Enable/disable toggle
- Delete option
Enable/Disable Keys
Toggle keys on/off without deleting them:
- Go to Proxy Settings
- Find your key
- Click the toggle switch
- Disabled keys won't be used for proxying
Delete Keys
Permanently remove provider keys:
- Go to Proxy Settings
- Find your key
- Click Delete
- Confirm deletion
Security
Your Provider API Keys Are Secure
- Provider API keys (OpenAI, Anthropic, etc.) are encrypted at rest using industry-standard encryption
- Keys are stored securely in our database and never exposed in API responses
- Only you can view or manage your keys through the dashboard
- Keys are never logged or included in error messages
Request Authentication
Standard authentication (recommended):
const openai = new OpenAI({
apiKey: process.env.LOCKLLM_API_KEY, // Your LockLLM API key (...)
baseURL: 'https://api.lockllm.com/v1/proxy/openai'
})
Alternative: Using Authorization header directly:
If you need to pass the LockLLM API key separately (e.g., for custom implementations):
const openai = new OpenAI({
apiKey: 'dummy-key', // SDK requires a value, but proxy ignores this
baseURL: 'https://api.lockllm.com/v1/proxy/openai',
defaultHeaders: {
'Authorization': 'Bearer YOUR_LOCKLLM_API_KEY'
}
})
The proxy checks both the apiKey parameter and the Authorization header for your LockLLM API key.
Blocked Requests
Default Behavior: By default, the proxy uses allow_with_warning for security scans and policy checks. This means malicious prompts are allowed but flagged with warnings in the response. Requests are only blocked if you explicitly set the action headers to block.
When blocking is enabled (via X-LockLLM-Scan-Action: block), the proxy returns a 400 Bad Request error with the following structure:
{
"error": {
"message": "Malicious prompt detected by LockLLM",
"type": "lockllm_security_error",
"code": "prompt_injection_detected",
"scan_result": {
"safe": false,
"label": 1,
"confidence": 95,
"injection": 95,
"sensitivity": "medium"
},
"request_id": "req_abc123"
}
}
Response Headers:
- X-Request-Id: Unique request identifier
- X-LockLLM-Blocked: "true" (indicates the request was blocked)
Handle blocked requests in your application:
try {
const response = await openai.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: userPrompt }]
})
} catch (error) {
// Check if error is from LockLLM security block
if (error.response?.status === 400 && error.response?.data?.error?.code === 'prompt_injection_detected') {
console.log('Malicious prompt blocked by LockLLM')
const scanResult = error.response.data.error.scan_result
console.log('Injection confidence:', scanResult.injection)
console.log('Request ID:', error.response.data.error.request_id)
// Handle security incident (log, alert, etc.)
// You can find this request in your LockLLM dashboard logs
} else {
// Handle other errors
throw error
}
}
When PII blocking is enabled (via X-LockLLM-PII-Action: block), the proxy returns a 403 Forbidden error:
{
"error": {
"message": "Request blocked due to personal information detected",
"type": "lockllm_pii_error",
"code": "pii_detected",
"pii_details": {
"entity_types": ["Email", "Phone Number"],
"entity_count": 3
},
"request_id": "req_abc123"
}
}
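The different block responses can be told apart by status code and error code. A minimal classifier sketch, using the status codes and error codes shown in the JSON examples above (the helper name is ours, not part of any SDK):

```javascript
// Map a LockLLM proxy error response onto the block types documented in this
// section: 400 for injection blocks, 403 for policy and PII blocks.
function classifyLockLLMBlock(status, body) {
  const code = body?.error?.code
  if (status === 400 && code === 'prompt_injection_detected') return 'injection'
  if (status === 403 && code === 'policy_violation') return 'policy'
  if (status === 403 && code === 'pii_detected') return 'pii'
  return 'other'
}
```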
Successful requests return the normal provider response (OpenAI, Anthropic, etc.) with additional LockLLM headers:
Standard Headers (always included):
| Header | Description |
|---|---|
| X-Request-Id | Unique request identifier |
| X-LockLLM-Scanned | "true" - confirms request was scanned |
| X-LockLLM-Safe | "true" or "false" - scan result |
| X-Scan-Mode | Scan mode used for this request |
| X-LockLLM-Model | Model used for the request |
| X-LockLLM-Provider | Provider used |
| X-LockLLM-Credits-Mode | "lockllm_credits" or "byok" |
| X-LockLLM-Sensitivity | Sensitivity level used for scanning |
Credits Mode Headers (included when using LockLLM credits/universal endpoint):
| Header | Description |
|---|---|
| X-LockLLM-Credits-Reserved | Amount of credits reserved for this request (in USD). Actual usage may differ; unused credits are refunded automatically. |
Warning Headers (included when threats detected with allow_with_warning):
| Header | Description |
|---|---|
| X-LockLLM-Scan-Warning | "true" if core security threat detected |
| X-LockLLM-Injection-Score | Injection score (0-100) |
| X-LockLLM-Confidence | Detection confidence (0-100) |
| X-LockLLM-Label | "0" for safe, "1" for malicious |
| X-LockLLM-Scan-Detail | Base64-encoded JSON with full scan detail (message, injection score, confidence, label, sensitivity) |
| X-LockLLM-Policy-Warnings | "true" if policy violations found |
| X-LockLLM-Warning-Count | Number of policy warnings |
| X-LockLLM-Warning-Detail | Base64-encoded JSON with first policy warning detail (policy name, violated categories, violation details) |
| X-LockLLM-Policy-Confidence | Policy check confidence score (0-100), included for combined and policy_only scan modes |
| X-LockLLM-Abuse-Detected | "true" if abuse was detected (only when x-lockllm-abuse-action: allow_with_warning) |
| X-LockLLM-Abuse-Confidence | Abuse confidence score (0-100) |
| X-LockLLM-Abuse-Types | Comma-separated list of detected abuse types (e.g., "bot_generated,rapid_requests") |
| X-LockLLM-Abuse-Detail | Base64-encoded JSON with full abuse detail (confidence, abuse types, indicators, recommendation) |
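The *-Detail headers above carry Base64-encoded JSON. A minimal Node.js decoder sketch (the helper name is ours; the sample payload below is synthetic, mirroring the scan-detail fields documented here):

```javascript
// Decode a Base64-encoded JSON detail header (e.g. X-LockLLM-Scan-Detail,
// X-LockLLM-Warning-Detail, X-LockLLM-Abuse-Detail). Returns null if absent.
function decodeDetailHeader(value) {
  if (!value) return null
  return JSON.parse(Buffer.from(value, 'base64').toString('utf8'))
}

// Synthetic example mirroring an X-LockLLM-Scan-Detail payload:
const sample = Buffer.from(JSON.stringify({ injection: 95, confidence: 95, label: 1 })).toString('base64')
const detail = decodeDetailHeader(sample)
```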
Routing Headers (included when routing is enabled):
| Header | Description |
|---|---|
| X-LockLLM-Route-Enabled | "true" if routing was active |
| X-LockLLM-Task-Type | Detected task classification |
| X-LockLLM-Complexity | Complexity score (0.0-1.0) |
| X-LockLLM-Selected-Model | Model chosen by router |
| X-LockLLM-Routing-Reason | Explanation for model selection |
| X-LockLLM-Original-Model | Original model before routing (only if changed) |
| X-LockLLM-Original-Provider | Original provider before routing (only if changed) |
| X-LockLLM-Estimated-Original-Cost | Estimated cost with original model |
| X-LockLLM-Estimated-Routed-Cost | Estimated cost with routed model |
| X-LockLLM-Estimated-Savings | Cost savings from routing |
| X-LockLLM-Estimated-Input-Tokens | Estimated input token count used for cost calculation |
| X-LockLLM-Estimated-Output-Tokens | Estimated output token count used for cost calculation |
| X-LockLLM-Routing-Fee-Reserved | Routing fee reserved upfront (in USD, 6 decimal places) |
| X-LockLLM-Routing-Fee-Reason | Reason no routing fee was charged (e.g., "routing_to_more_expensive_model") |
Cache Headers (included for response caching):
| Header | Description |
|---|---|
| X-LockLLM-Cache-Status | "HIT" or "MISS" |
| X-LockLLM-Cache-Age | Cache entry age in seconds |
| X-LockLLM-Tokens-Saved | Tokens saved from cache hit |
| X-LockLLM-Cost-Saved | Cost saved from cache hit |
PII Headers (included when PII detection is enabled):
| Header | Description |
|---|---|
| X-LockLLM-PII-Detected | "true" or "false" - always included when x-lockllm-pii-action is set |
| X-LockLLM-PII-Types | Comma-separated list of detected entity types (e.g., "Email,Phone Number") - only included when PII is found |
| X-LockLLM-PII-Count | Number of PII entities found - only included when PII is found |
| X-LockLLM-PII-Action | The PII action that was applied (strip, block, or allow_with_warning) - only included when PII is found |
Compression Headers (included when prompt compression is enabled):
| Header | Description |
|---|---|
| X-LockLLM-Compression-Method | Compression method used: "toon", "compact", or "combined" |
| X-LockLLM-Compression-Applied | "true" or "false" - whether compression was successfully applied |
| X-LockLLM-Compression-Ratio | Compression ratio (only included when compression was applied) |
Detection Settings
Scan Results
Every request is scanned and returns a confidence score indicating whether the prompt is safe or potentially malicious.
Scan result format:
{
"safe": true,
"label": 0, // 0 = safe, 1 = malicious
"confidence": 92, // Confidence in the prediction (0-100)
"injection": 8, // Injection score (0-100, higher = more likely malicious)
"sensitivity": "medium"
}
The proxy scans all requests and provides confidence scores. By default, threats are flagged with warnings but requests are still forwarded. Enable blocking via the x-lockllm-scan-action: block header.
Learn more about Threat Detection
Scan Modes
Control what gets scanned using the x-lockllm-scan-mode header:
1. Normal Mode
- Scans only for core security threats (prompt injection, jailbreaks, etc.)
- No custom policy checks
- Use when you only need basic security protection
2. Policy-Only Mode
- Skips core security scan
- Checks only custom content policies
- Use when you want to enforce content guidelines without injection detection
3. Combined Mode (default)
- Scans for both core security threats AND custom policy violations
- Most comprehensive protection
- Use for production applications with strict content requirements
Example:
const openai = new OpenAI({
apiKey: process.env.LOCKLLM_API_KEY,
baseURL: 'https://api.lockllm.com/v1/proxy/openai',
defaultHeaders: {
'x-lockllm-scan-mode': 'combined' // Enable both core scan + custom policies
}
})
const response = await openai.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: userPrompt }]
})
Action Headers
Fine-grained control over how threats are handled using request headers.
Default Behavior (No Headers): When you don't configure any headers, the proxy uses safe defaults:
- Core security scan: allow_with_warning (scans but doesn't block)
- Policy violations: allow_with_warning (detects but doesn't block)
- Abuse detection: disabled (opt-in only)
- PII detection: disabled (opt-in only)
- Prompt compression: disabled (opt-in only)
- Routing: disabled (uses your original model choice)
This means by default, the proxy scans for threats and adds warnings to responses without blocking requests.
Available Headers:
x-lockllm-scan-mode (controls what gets scanned):
- normal: Core security threats only
- policy_only: Custom policies only
- combined (default): Both core threats and policies
x-lockllm-sensitivity (controls detection strictness):
- low: Fewer false positives, more permissive
- medium (default): Balanced approach
- high: Maximum security, most aggressive detection
x-lockllm-scan-action (controls core injection scan behavior):
- allow_with_warning (default): Allow request, add warning to response
- block: Block malicious requests with a 400 error
x-lockllm-policy-action (controls custom policy violation behavior):
- allow_with_warning (default): Allow request, add policy warnings to response
- block: Block policy violations with a 403 error
x-lockllm-abuse-action (controls AI abuse detection):
- Not set (default): Skip abuse detection entirely
- allow_with_warning: Detect abuse, add warnings to response
- block: Block abusive requests with a 400 error
x-lockllm-route-action (controls smart routing):
- disabled (default): No routing, use original model
- auto: Automatic routing based on AI task classification
- custom: Use user-defined routing rules from dashboard
x-lockllm-pii-action (controls PII detection and redaction):
- Not set (default): Skip PII detection entirely
- allow_with_warning: Detect PII, add results to response headers
- block: Block requests containing personal information (403 error)
- strip: Replace detected PII with [TYPE] placeholders before forwarding to provider
x-lockllm-compression (controls prompt compression):
- Not set (default): Skip compression entirely
- toon: Compress JSON data using TOON format (free, JSON-only, instant)
- compact: Compress any text using ML-based compression ($0.0001 per use)
- combined: Apply TOON first, then Compact, for maximum compression ($0.0001 per use)
x-lockllm-compression-rate (controls compact/combined compression aggressiveness):
- Default: 0.5 (balanced)
- Range: 0.3 (aggressive) to 0.7 (conservative)
- Only applies when using the compact or combined method
x-lockllm-cache-response (controls response caching):
- true (default): Enable response caching for identical requests
- false: Disable response caching (always get fresh responses)
x-lockllm-cache-ttl (controls response cache duration, in seconds):
- Default: 3600 (1 hour)
- Maximum: 86400 (24 hours)
- Only applies when response caching is enabled
Example:
const openai = new OpenAI({
apiKey: process.env.LOCKLLM_API_KEY,
baseURL: 'https://api.lockllm.com/v1/proxy/openai',
defaultHeaders: {
'x-lockllm-scan-mode': 'combined', // Scan both core + policies
'x-lockllm-sensitivity': 'high', // Maximum security
'x-lockllm-scan-action': 'block', // Block injection attacks
'x-lockllm-policy-action': 'block', // Block policy violations
'x-lockllm-abuse-action': 'allow_with_warning', // Detect abuse, don't block
'x-lockllm-route-action': 'auto', // Enable smart routing
'x-lockllm-pii-action': 'strip', // Redact personal information
'x-lockllm-compression': 'compact' // Compress prompts for token savings
}
})
Custom Content Policies
Create your own content policies beyond the built-in security categories. Custom policies let you enforce brand-specific guidelines, compliance requirements, or content restrictions.
Setting Up Custom Policies
- Go to Dashboard → Policies
- Click Create Policy
- Enter a policy name (e.g., "No Medical Advice")
- Write a description (up to 10,000 characters) defining what should be blocked
- Enable the policy
- Set the x-lockllm-scan-mode header to combined or policy_only in your requests
Example:
// Dashboard policy: "No Financial Advice"
// Description: "Block requests asking for investment advice, stock tips, or financial planning guidance"
// In your code:
const openai = new OpenAI({
apiKey: process.env.LOCKLLM_API_KEY,
baseURL: 'https://api.lockllm.com/v1/proxy/openai',
defaultHeaders: {
'x-lockllm-scan-mode': 'combined' // Checks both core security + custom policies
}
})
const response = await openai.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: userPrompt }]
})
Policy Violations Response
When a custom policy is violated:
{
"error": {
"message": "Request blocked by custom policy",
"type": "lockllm_policy_error",
"code": "policy_violation",
"violated_policies": [
{
"policy_name": "No Financial Advice",
"violated_categories": [
{ "name": "Investment Guidance" }
],
"violation_details": "User requested stock recommendations"
}
],
"request_id": "req_abc123"
}
}
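Given an error body in the shape above, the violated policy names can be pulled out with a small helper (an illustrative sketch; the function name is ours):

```javascript
// Extract the violated policy names from a lockllm_policy_error body,
// matching the JSON example above. Returns [] when no policies are listed.
function violatedPolicyNames(errorBody) {
  return (errorBody?.error?.violated_policies ?? []).map(p => p.policy_name)
}

// Example with the documented response shape:
const body = {
  error: {
    code: 'policy_violation',
    violated_policies: [{ policy_name: 'No Financial Advice', violated_categories: [{ name: 'Investment Guidance' }] }]
  }
}
```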
Learn more about Custom Policies
Smart Routing
Automatically route requests to the optimal model based on task complexity and type. Save costs by using cheaper models for simple tasks while maintaining quality for complex ones.
How Routing Works
- Task Classification: AI analyzes your prompt and determines the task type (e.g., "Code Generation", "Summarization", "Open QA")
- Complexity Analysis: Assigns a complexity score (0-1) and tier (low/medium/high)
- Model Selection: Chooses the best model based on task requirements and your routing rules
- Execution: Routes request to selected model (uses your BYOK key or LockLLM credits)
Routing Modes
Auto Routing (X-LockLLM-Route-Action: auto):
- AI-powered task classification and complexity analysis
- Automatically selects optimal model based on predefined routing logic
- Routes high-complexity tasks to advanced models (Claude Sonnet, GPT-4)
- Routes low-complexity tasks to efficient models for cost savings
Custom Routing (X-LockLLM-Route-Action: custom):
- Uses your own routing rules defined in the dashboard
- Configure rules by task type and complexity tier
- Specify target model and whether to use BYOK
- Falls back to auto routing if no matching rule found
Setting Up Routing
Enable Auto Routing:
const openai = new OpenAI({
apiKey: process.env.LOCKLLM_API_KEY,
baseURL: 'https://api.lockllm.com/v1/proxy/openai',
defaultHeaders: {
'x-lockllm-route-action': 'auto'
}
})
// Example: User requests GPT-4, but prompt is simple
// Router detects low complexity and routes to GPT-3.5
// You save money, get fast response, maintain quality
Custom Routing Rules (Dashboard):
- Go to Dashboard → Routing
- Click Create Rule
- Select task type (Code Generation, Summarization, etc.)
- Select complexity tier (low, medium, high)
- Choose target model
- Enable BYOK or use LockLLM credits
- Save rule
Example Rule:
- Task Type: Code Generation
- Complexity: High
- Target Model: claude-3-7-sonnet
- Use BYOK: Yes (uses your Anthropic key)
Routing Fees
- Routing to cheaper model: 5% of cost savings
- Routing to more expensive or same-cost model: FREE
- When routing is disabled: FREE
You only pay routing fees when the router actually saves you money!
Routing Metadata
All routed requests include metadata in response headers:
- X-LockLLM-Task-Type: Detected task classification
- X-LockLLM-Complexity: Complexity score (0.0-1.0)
- X-LockLLM-Selected-Model: Model chosen by router
- X-LockLLM-Routing-Reason: Explanation for selection
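These headers can be collected from any fetch-style Headers object. A minimal sketch (the helper name is ours; a Map stands in for real response headers here, though actual fetch Headers are case-insensitive):

```javascript
// Collect the routing metadata headers listed above from anything that
// exposes a .get(name) method (fetch Headers, or a Map in tests).
function routingMetadata(headers) {
  return {
    taskType: headers.get('X-LockLLM-Task-Type'),
    complexity: parseFloat(headers.get('X-LockLLM-Complexity') ?? ''),
    selectedModel: headers.get('X-LockLLM-Selected-Model'),
    reason: headers.get('X-LockLLM-Routing-Reason'),
  }
}

// Example with synthetic header values:
const meta = routingMetadata(new Map([
  ['X-LockLLM-Task-Type', 'Code Generation'],
  ['X-LockLLM-Complexity', '0.82'],
  ['X-LockLLM-Selected-Model', 'claude-3-7-sonnet'],
  ['X-LockLLM-Routing-Reason', 'high complexity'],
]))
```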
Learn more about Smart Routing
AI Abuse Detection
Protect your application from end-users abusing your AI endpoints with automated requests, bot-generated prompts, or resource exhaustion attacks.
What Gets Detected
Content Analysis:
- Bot-generated content: Template-like structures, excessive special characters
- Excessive repetition: Character, word, and phrase-level repetition
- Resource exhaustion: Extremely long prompts, deep nesting, oversized inputs
Pattern Analysis (behavioral):
- Rapid requests: Unusual request frequency from single API key
- Duplicate prompts: Identical prompts within short time window
- Burst detection: >50% of requests concentrated in last 30 seconds
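The burst heuristic above (more than 50% of requests concentrated in the last 30 seconds) can be sketched as a toy Python function. Illustrative only, not LockLLM's actual detector:

```python
import time

def looks_like_burst(timestamps, window=30.0, threshold=0.5, now=None):
    """Toy version of the burst heuristic described above: flag a key when
    more than `threshold` of its recent requests landed within the last
    `window` seconds."""
    if not timestamps:
        return False
    now = time.time() if now is None else now
    recent = sum(1 for t in timestamps if now - t <= window)
    return recent / len(timestamps) > threshold

# 8 of 10 requests in the last 30 seconds -> burst
now = 1_000_000.0
ts = [now - 300, now - 200] + [now - i for i in range(8)]
print(looks_like_burst(ts, now=now))  # True
```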
Enabling Abuse Detection
Add the X-LockLLM-Abuse-Action header to enable abuse detection:
const openai = new OpenAI({
apiKey: process.env.LOCKLLM_API_KEY,
baseURL: 'https://api.lockllm.com/v1/proxy/openai',
defaultHeaders: {
'x-lockllm-abuse-action': 'block' // or 'allow_with_warning'
}
})
Abuse Detection Response
When abuse is detected with block action:
{
"error": {
"message": "Request blocked due to abuse detection",
"type": "lockllm_abuse_error",
"code": "abuse_detected",
"abuse_details": {
"confidence": 87,
"abuse_types": ["bot_generated", "rapid_requests"],
"indicators": {
"bot_score": 95,
"repetition_score": 45,
"resource_score": 30,
"pattern_score": 80
},
"details": {
"recommendation": "Implement rate limiting or CAPTCHA for this user"
}
},
"request_id": "req_abc123"
}
}
With allow_with_warning action, request proceeds with abuse warnings added to response.
Learn more about Abuse Detection
PII Detection & Redaction
Automatically detect and protect personal information in prompts before they reach your LLM provider. When enabled, LockLLM scans for 17 entity types, including names, email addresses, phone numbers, Social Security numbers, and credit card numbers.
Supported Entity Types
LockLLM detects 17 types of personal information:
- Identity: First Name, Last Name, Date of Birth, Username
- Contact: Email, Phone Number, Street Address, City, Zip Code, Building Number
- Financial: Credit Card, Account Number, Tax ID
- Government IDs: Social Security Number, Driver's License, ID Card Number
- Security: Password
Enabling PII Detection
Add the X-LockLLM-PII-Action header to enable PII detection:
const openai = new OpenAI({
apiKey: process.env.LOCKLLM_API_KEY,
baseURL: 'https://api.lockllm.com/v1/proxy/openai',
defaultHeaders: {
'x-lockllm-pii-action': 'strip' // or 'block' or 'allow_with_warning'
}
})
PII Actions
allow_with_warning - Detect PII and add metadata to response headers, but forward the original request:
- Response includes X-LockLLM-PII-Detected, X-LockLLM-PII-Types, and X-LockLLM-PII-Count headers
- Your application can read these headers to decide how to handle PII
block - Block requests containing personal information:
- Returns 403 error with details of detected entity types
- Request is NOT forwarded to your LLM provider
- Prevents personal data from reaching the model
strip (recommended for privacy) - Automatically redact PII before forwarding:
- Detected entities are replaced with [TYPE] placeholders (e.g., [EMAIL], [GIVENNAME])
- The redacted request is forwarded to your LLM provider
- Your LLM never sees the actual personal information
- Response headers indicate what was redacted
Example: Stripping PII
const openai = new OpenAI({
apiKey: process.env.LOCKLLM_API_KEY,
baseURL: 'https://api.lockllm.com/v1/proxy/openai',
defaultHeaders: {
'x-lockllm-pii-action': 'strip'
}
})
// User sends: "Contact John Smith at [email protected] or 555-123-4567"
// LLM receives: "Contact [GIVENNAME] [SURNAME] at [EMAIL] or [TELEPHONENUM]"
const response = await openai.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: userPrompt }]
})
// Check PII headers in raw response:
// X-LockLLM-PII-Detected: true
// X-LockLLM-PII-Types: First Name,Last Name,Email,Phone Number
// X-LockLLM-PII-Count: 4
// X-LockLLM-PII-Action: strip
PII Detection Pricing
- PII not detected: FREE
- PII detected: $0.0001 per detection
- PII detection is opt-in and disabled by default
Learn more about PII Detection
Prompt Compression
Reduce token usage and AI costs by compressing prompts before they are forwarded to your LLM provider. Compression happens after security scanning, so your prompts are always scanned in their original form.
Compression Methods
TOON (Token-Oriented Object Notation)
- Converts JSON data to a compact, token-efficient format
- Removes redundant syntax (braces, quotes) while remaining LLM-readable
- Works only on valid JSON input (returns original text for non-JSON)
- FREE - no additional cost
- Instant, local transformation
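To see why this kind of transformation saves tokens, here is a toy Python sketch of the general idea: flattening uniform JSON arrays into a header row plus value rows, dropping repeated keys, braces, and quotes. This is NOT the actual TOON specification, only an illustration:

```python
import json

def tabularize(json_text: str) -> str:
    """Toy illustration of the idea behind TOON (not the real format):
    uniform arrays of objects become a header plus value rows. Non-JSON
    input is returned unchanged, mirroring the behavior described above."""
    try:
        data = json.loads(json_text)
    except (ValueError, TypeError):
        return json_text
    if not isinstance(data, dict):
        return json_text
    lines = []
    for key, rows in data.items():
        if isinstance(rows, list) and rows and all(isinstance(r, dict) for r in rows):
            cols = list(rows[0])
            lines.append(f"{key}[{len(rows)}]{{{','.join(cols)}}}:")
            lines.extend(",".join(str(r[c]) for c in cols) for r in rows)
        else:
            lines.append(f"{key}: {rows}")
    return "\n".join(lines)

print(tabularize('{"users": [{"name": "Alice", "age": 30}]}'))
# users[1]{name,age}:
# Alice,30
```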
Compact (ML-based)
- Advanced token classification that removes unnecessary tokens from any text
- Works on natural language, code, structured data, and mixed content
- Configurable compression rate (0.3-0.7)
- Costs $0.0001 per use
- 5-second timeout with fail-open behavior
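The fail-open behavior above can be sketched as a wrapper that forwards the original text whenever compression errors out or exceeds the timeout. `compressor` is a hypothetical callable standing in for the ML service, not a LockLLM API:

```python
from concurrent.futures import ThreadPoolExecutor

def compress_fail_open(text, compressor, timeout=5.0):
    """Sketch of fail-open compression: if the compressor raises or exceeds
    the timeout, forward the original text unchanged. A real implementation
    would also cancel or detach the hung worker."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        try:
            return pool.submit(compressor, text).result(timeout=timeout)
        except Exception:
            # Timeout or compressor failure: fail open, keep the original prompt
            return text

def broken(_):
    raise RuntimeError("compression model unavailable")

print(compress_fail_open("a long prompt ...", broken))    # original text, unchanged
print(compress_fail_open("a long prompt ...", str.upper))
```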
Enabling Prompt Compression
Add the X-LockLLM-Compression header to enable compression:
const openai = new OpenAI({
apiKey: process.env.LOCKLLM_API_KEY,
baseURL: 'https://api.lockllm.com/v1/proxy/openai',
defaultHeaders: {
'x-lockllm-compression': 'compact',
'x-lockllm-compression-rate': '0.5'
}
})
// Prompts are compressed before reaching OpenAI
// You save tokens on upstream API calls
const response = await openai.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: longDocument }]
})
// Check compression headers in raw response:
// X-LockLLM-Compression-Method: compact
// X-LockLLM-Compression-Applied: true
// X-LockLLM-Compression-Ratio: 0.4500
Example: TOON for JSON Data
const openai = new OpenAI({
apiKey: process.env.LOCKLLM_API_KEY,
baseURL: 'https://api.lockllm.com/v1/proxy/openai',
defaultHeaders: {
'x-lockllm-compression': 'toon' // Free JSON compression
}
})
// JSON data in prompt is automatically compressed to TOON format
// Example: {"users": [{"name": "Alice", "age": 30}]} becomes compact notation
const response = await openai.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: JSON.stringify(jsonData) }]
})
Example: Compact with Python
import os
from openai import OpenAI
client = OpenAI(
api_key=os.environ.get('LOCKLLM_API_KEY'),
base_url='https://api.lockllm.com/v1/proxy/openai',
default_headers={
'X-LockLLM-Compression': 'compact',
'X-LockLLM-Compression-Rate': '0.5'
}
)
response = client.chat.completions.create(
model='gpt-4',
messages=[{'role': 'user', 'content': long_document}]
)
Prompt Compression Pricing
- TOON: FREE
- Compact: $0.0001 per use
- Compression is opt-in and disabled by default
Learn more about Prompt Compression in the dedicated guide
Monitoring
View Proxy Logs
All proxy requests are logged in your dashboard:
- Go to Activity Logs
- Filter by Proxy Requests
- See scan results, providers, models used
- View blocked requests
Webhook Notifications
Get notified when malicious prompts are detected:
- Go to Webhooks
- Add a webhook URL
- Choose format (raw, Slack, Discord)
- Receive alerts for blocked requests
Performance
Does Proxy Mode Add Latency?
Minimal latency is added:
- Scanning: ~100-200ms
- Network overhead: ~50ms
- Total: ~150-250ms additional latency
For most applications, this is negligible compared to LLM response times (1-10 seconds).
Scan Result Caching
Proxy mode automatically caches scan results for identical prompts (30-minute TTL), reducing latency on repeated security scans.
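A cache like this can be pictured as a map from prompt hash to scan result with a 30-minute TTL. A minimal Python sketch of the concept, not LockLLM's internals:

```python
import hashlib
import time

class ScanCache:
    """Minimal sketch of scan-result caching keyed by a prompt hash, with a
    30-minute TTL as described above."""
    def __init__(self, ttl=1800):
        self.ttl = ttl
        self._store = {}

    def _key(self, prompt: str) -> str:
        return hashlib.sha256(prompt.encode()).hexdigest()

    def get(self, prompt, now=None):
        now = time.time() if now is None else now
        entry = self._store.get(self._key(prompt))
        if entry and now - entry[0] <= self.ttl:
            return entry[1]
        return None  # miss or expired

    def put(self, prompt, result, now=None):
        now = time.time() if now is None else now
        self._store[self._key(prompt)] = (now, result)

cache = ScanCache()
cache.put("What is 2+2?", {"safe": True}, now=0)
print(cache.get("What is 2+2?", now=100))   # {'safe': True} - hit within TTL
print(cache.get("What is 2+2?", now=4000))  # None - expired after 1800s
```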
Response Caching
LockLLM automatically caches AI responses. When you make the same request multiple times, you get instant responses without paying for duplicate API calls.
Benefits:
- Save money on duplicate requests
- Faster responses from cache
- Works automatically, no setup needed
Example:
const response = await openai.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: 'What is 2+2?' }]
})
// First time: Charged normal rate
// Ask again: Instant response, no charge
If you ask the same question 10 times, you only pay for the first request. The other 9 are free and instant.
Note: Response caching is automatically disabled for streaming requests (stream: true). Streaming responses are always served fresh from the provider. Non-streaming requests are cached by default.
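One way to picture how "the same request" is recognized is a deterministic cache key over the model and messages. A Python sketch under that assumption; the actual key LockLLM uses is not documented here:

```python
import hashlib
import json

def cache_key(model: str, messages: list) -> str:
    """Illustrative: hash a canonical JSON serialization of the request so
    that logically identical requests map to the same cache entry."""
    canonical = json.dumps({"model": model, "messages": messages},
                           sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

a = cache_key("gpt-4", [{"role": "user", "content": "What is 2+2?"}])
b = cache_key("gpt-4", [{"content": "What is 2+2?", "role": "user"}])
print(a == b)  # True - key order inside messages doesn't matter
```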
Custom cache duration:
const openai = new OpenAI({
apiKey: process.env.LOCKLLM_API_KEY,
baseURL: 'https://api.lockllm.com/v1/proxy/openai',
defaultHeaders: {
'x-lockllm-cache-ttl': '7200' // Cache for 2 hours (default: 3600, max: 86400)
}
})
Disable caching if you always need fresh responses:
const openai = new OpenAI({
apiKey: process.env.LOCKLLM_API_KEY,
baseURL: 'https://api.lockllm.com/v1/proxy/openai',
defaultHeaders: {
'x-lockllm-cache-response': 'false'
}
})
Troubleshooting
"Unauthorized" Error (401)
Problem: Your LockLLM API key is missing or invalid.
Solution:
- Verify you're passing your LockLLM API key (not a provider key) in the apiKey parameter
- Check that your LockLLM API key is valid
- Ensure you haven't revoked or deleted the API key in your dashboard
"No [provider] API key configured" Error (400)
Problem: You haven't added the provider's API key to the dashboard, or it's disabled.
Error message example:
{
"error": {
"message": "No openai API key configured. Please add your API key at the dashboard.",
"type": "lockllm_config_error",
"code": "no_upstream_key"
}
}
Solution:
- Go to your LockLLM dashboard → Proxy Settings
- Click "Add API Key" and select your provider (e.g., OpenAI)
- Enter your provider API key and save
- Ensure the key is enabled (toggle should be on)
"Could not extract prompt from request" Error (400)
Problem: The request body format is not recognized.
Solution:
- Ensure you're using a supported API format for your provider
- Check that your request has the correct structure (e.g., a messages array for OpenAI/Anthropic)
- Verify the SDK version is compatible
Requests Not Being Scanned
Problem: Requests bypass the proxy or fail silently.
Solution:
- Verify you're using the correct base URL: https://api.lockllm.com/v1/proxy/{provider}
- Check that you added the provider key to the dashboard
- Ensure the provider key is enabled (not disabled)
- Confirm you're passing your LockLLM API key for authentication
Azure-Specific Errors
Problem: Azure requests fail with "azure_config_error".
Solution:
- Verify endpoint URL format: https://your-resource.openai.azure.com (no trailing slash)
- Check deployment name matches your Azure deployment exactly
- Ensure API version is compatible (default: 2024-10-21)
- Confirm you added all required fields in the dashboard:
- Azure API key
- Endpoint URL
- Deployment name
Rate Limit Exceeded Error (429)
Problem: Too many requests in a short time.
Error message:
{
"error": {
"message": "Rate limit exceeded. Please try again later.",
"type": "rate_limit_error"
}
}
Solution:
- Rate limits are tier-based (Tier 1: 300 req/min, up to Tier 10: 200,000 req/min)
- Implement exponential backoff in your application
- Upgrade your tier for higher limits - view tier benefits
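Exponential backoff, as suggested above, can be implemented generically. A self-contained Python sketch; `RateLimitError` is a stand-in for whatever your SDK raises on HTTP 429:

```python
import random
import time

class RateLimitError(Exception):
    """Stand-in for whatever your SDK raises on HTTP 429."""

def with_backoff(call, max_attempts=5, base=0.5, cap=30.0):
    """Exponential backoff with jitter: retry on RateLimitError, doubling
    the delay each attempt, and re-raise once attempts are exhausted."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise
            delay = min(cap, base * (2 ** attempt))
            time.sleep(delay + random.uniform(0, delay / 2))

# Demo: fails twice with 429, succeeds on the third try
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RateLimitError("429")
    return "ok"

result = with_backoff(flaky, base=0.01)  # small base so the demo is quick
print(result)  # ok
```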
Insufficient Credits Error (402)
Problem: Your LockLLM credit balance is too low.
Error message:
{
"error": {
"message": "Insufficient credit balance",
"type": "lockllm_balance_error",
"code": "insufficient_balance"
}
}
Solution:
- Add credits in your LockLLM dashboard
- Switch to BYOK mode to avoid credit requirements for LLM usage
- Free monthly tier credits may cover detection fees for most users
Credits Service Unavailable (503)
Problem: The LockLLM credits service is temporarily unavailable.
Error message:
{
"error": {
"message": "LockLLM credits are temporarily unavailable. Please try again later or use a provider-specific endpoint with your own API key.",
"type": "lockllm_service_error",
"code": "credits_unavailable"
}
}
Solution:
- This is a temporary service issue - retry after a short delay
- Switch to BYOK mode (provider-specific endpoints) as a fallback
- Implement retry logic with exponential backoff
- If the issue persists, contact [email protected]
Invalid Provider for Credits Mode (400)
Problem: You are trying to use a non-OpenRouter provider with the universal endpoint (LockLLM credits mode).
Error message:
{
"error": {
"message": "LockLLM credits mode only supports OpenRouter-compatible models. Please use the provider-specific endpoint with a BYOK key instead.",
"type": "lockllm_config_error",
"code": "invalid_provider_for_credits_mode"
}
}
Solution:
- If using the universal endpoint (/v1/proxy/chat/completions), use OpenRouter-compatible model names (e.g., openai/gpt-4, anthropic/claude-3-opus)
- For other providers, use the provider-specific endpoint (e.g., /v1/proxy/openai) with a BYOK key configured in the dashboard
- Add your provider API key in the dashboard if you haven't already
Upstream Provider Error (502)
Problem: The upstream AI provider (OpenAI, Anthropic, etc.) failed to process the request.
Error message:
{
"error": {
"message": "Failed to forward request to provider",
"type": "upstream_error"
}
}
Solution:
- Check the upstream provider's status page for outages
- Verify your provider API key is valid and has sufficient quota
- Retry the request after a short delay
- If using Azure, verify your endpoint URL and deployment name are correct
- If the issue persists, try a different model or provider
FAQ
Does the proxy block requests by default?
No. By default, the proxy uses allow_with_warning mode for all security scans. This means:
- Threats are detected and flagged with warnings in the response
- Requests are NOT blocked - they proceed to your LLM provider
- You receive threat information in response headers and body
- No disruption to your application's normal operation
To enable blocking, explicitly set headers:
defaultHeaders: {
'x-lockllm-scan-action': 'block', // Block prompt injection
'x-lockllm-policy-action': 'block' // Block policy violations
}
This design lets you test and monitor security threats in production without breaking user experience, then enable blocking when you're ready.
Is the proxy free?
Partially. LockLLM uses a usage-based pricing model where you only pay for security detections and routing optimization:
- Safe prompts: FREE (no charge when passing security checks)
- Security detections: $0.0001-$0.0002 per detection (only when threats found)
- Routing fees: 5% of cost savings (only when routing saves you money)
- LLM usage (BYOK): FREE (you pay your provider directly)
- LLM usage (non-BYOK): Variable via LockLLM credits
All users receive free monthly credits based on their tier (1-10). You may never pay anything if your usage stays within free tier limits and all prompts are safe.
What is BYOK (Bring Your Own Key)?
BYOK means you use your own API keys from OpenAI, Anthropic, etc. You add them to the LockLLM dashboard, and LockLLM proxies requests using your keys. You maintain full control over your keys, and billing stays with your provider account.
With BYOK:
- LLM usage: FREE (you pay provider directly)
- Security detections: $0.0001-$0.0002 per detection
- Routing fees: 5% of savings (when enabled)
Without BYOK (universal endpoint):
- Everything billed via LockLLM credits
- Use OpenRouter for 200+ models
- No provider API keys needed
How does authentication work?
There are two types of API keys:
- Provider API Key (OpenAI, Anthropic, etc.): You add this to the LockLLM dashboard once. It's stored encrypted and never put in your code.
- LockLLM API Key: You pass this in your SDK configuration. It authenticates your requests and tells the proxy which provider keys to use.
In your code, you only use your LockLLM API key. The proxy handles retrieving and using your provider keys securely.
Are my provider API keys secure?
Yes! Your API keys are:
- Encrypted at rest using industry-grade encryption
- Stored securely in our database
- Never exposed in API responses or logs
- Never transmitted to your application
- Completely inaccessible to anyone outside the proxy service
Which providers are supported?
17 providers with full support:
- OpenAI, Anthropic, Google Gemini, Cohere
- Azure OpenAI, AWS Bedrock, Google Vertex AI
- OpenRouter, Perplexity, Mistral AI
- Groq, DeepSeek, Together AI
- xAI (Grok), Fireworks AI, Anyscale
- Hugging Face
All providers support custom endpoint URLs for self-hosted or alternative endpoints.
How do I configure Azure OpenAI?
Azure OpenAI requires additional configuration:
- In the dashboard, select Azure OpenAI as the provider
- Enter:
  - API key: Your Azure OpenAI key (or Microsoft Entra ID token)
  - Endpoint URL: https://your-resource.openai.azure.com
  - Deployment name: Your Azure deployment name (e.g., gpt-4)
  - API version (optional): Defaults to 2024-10-21
- In your code:
const openai = new OpenAI({
apiKey: process.env.LOCKLLM_API_KEY,
baseURL: 'https://api.lockllm.com/v1/proxy/azure'
})
LockLLM supports both Azure API formats (legacy deployment-based and new v1 API).
Does this add latency to my requests?
Yes, approximately 150-250ms for scanning:
- Scanning: ~100-200ms
- Network overhead: ~50ms
This is minimal compared to typical LLM response times (1-10+ seconds) and provides critical security protection. The proxy caches scan results for identical prompts to reduce latency on repeated requests.
Will this work with official SDKs?
Yes! Proxy mode works seamlessly with official SDKs:
- OpenAI SDK (Node.js, Python): Just change the baseURL parameter
- Anthropic SDK (Node.js, Python): Just change the baseURL parameter
- Other provider SDKs: Works with any SDK that supports custom base URLs
The proxy is fully compatible with all SDK features including streaming, function calling, and multi-modal inputs.
Can I use multiple keys for the same provider?
Yes! You can add multiple keys for the same provider with different nicknames (e.g., "Production" and "Development"). However, only one key per provider can be enabled at a time.
To switch between keys, enable/disable them in the dashboard. The proxy automatically uses the enabled key for each provider.
What happens when a malicious prompt is detected?
It depends on your configuration:
Default behavior (allow_with_warning):
- The threat is detected and flagged with warning headers (X-LockLLM-Scan-Warning, X-LockLLM-Injection-Score, etc.)
- The request is still forwarded to your LLM provider
- Your application can read the warning headers to decide how to handle it
- The event is logged in your dashboard
With blocking enabled (x-lockllm-scan-action: block):
- The request is blocked immediately (not forwarded to provider)
- Returns a 400 Bad Request error with scan details and request ID
- The event is logged in your dashboard
- Can trigger webhook notifications if configured
You can catch blocked requests in your code and handle them appropriately (e.g., alert security team, log incident, show user error message).
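Catching a blocked request amounts to recognizing the structured error body the proxy returns. A Python sketch that parses it; the field names follow the error examples in this document, and the handler logic is illustrative:

```python
import json

def handle_proxy_error(status: int, body: str):
    """Sketch: recognize a LockLLM block response (HTTP 400 with a
    structured error body) and extract what an incident handler needs."""
    if status != 400:
        return None
    err = json.loads(body).get("error", {})
    if err.get("type", "").startswith("lockllm"):
        # e.g. alert the security team, log the incident, show a user message
        return {"request_id": err.get("request_id"),
                "reason": err.get("message")}
    return None

blocked = handle_proxy_error(400, json.dumps({
    "error": {"message": "Request blocked due to abuse detection",
              "type": "lockllm_abuse_error",
              "request_id": "req_abc123"}}))
print(blocked["request_id"])  # req_abc123
```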
Do you log or store my prompts?
No. We do not store prompt content. We only log:
- Metadata (timestamp, model, provider, request ID)
- Scan results (safe/malicious, confidence scores)
- Prompt length (character count)
Prompt content is scanned in memory and immediately discarded. This ensures privacy while providing security monitoring.
How does the detection work?
The proxy scans every request and assigns confidence scores:
- Injection score: 0 (definitely safe) to 100 (definitely malicious)
- By default, threats are flagged with warnings but requests are still forwarded (allow_with_warning)
- Enable x-lockllm-scan-action: block to block malicious prompts instead of forwarding them
- Safe prompts are always forwarded to your provider
The detection system is tuned to balance security and minimize false positives. Contact [email protected] if you need custom detection settings for your use case.
Can I test the proxy without adding my real API keys?
Yes! You can test with:
- Universal endpoint: Use /v1/proxy/chat/completions with LockLLM credits (no provider keys needed)
- OpenAI-compatible test endpoints: Use a local LLM server or test API
- Custom endpoints: Point to your staging/test environments
- Provider test keys: Use provider-issued test/sandbox keys
Just add the test configuration in the dashboard and point your SDK to the proxy.
What are custom content policies?
Custom policies let you enforce your own content restrictions beyond built-in security. For example:
- Block medical or legal advice
- Prevent competitor mentions
- Enforce brand guidelines
- Meet compliance requirements (HIPAA, GDPR, etc.)
Create policies in the dashboard, then set the x-lockllm-scan-mode header to combined to check both security threats and custom policies. Each policy can be up to 10,000 characters and describe exactly what should be blocked.
How does smart routing work?
Routing analyzes your prompt to determine:
- Task type: What the user is asking for (code generation, summarization, chatbot, etc.)
- Complexity: How difficult the task is (low/medium/high)
- Optimal model: Which model provides best quality/cost ratio
For example, a simple "What is 2+2?" might route from GPT-4 to GPT-3.5 (faster, cheaper, same quality). A complex code generation task stays on GPT-4 (quality matters).
You only pay routing fees (5% of savings) when the router actually saves you money by selecting a cheaper model.
What is AI abuse detection?
Abuse detection protects you from malicious end-users trying to:
- Send bot-generated or automated requests
- Spam your endpoints with repetitive prompts
- Exhaust resources with oversized inputs
- Overwhelm your quota with burst requests
Enable it by adding X-LockLLM-Abuse-Action header. It's optional (opt-in) and designed for production applications where end-users directly interact with your AI.
What is PII detection?
PII (Personally Identifiable Information) detection scans prompts for sensitive personal data before forwarding to your LLM provider. It detects 17 entity types including names, emails, phone numbers, SSNs, credit cards, and addresses.
Enable it with the X-LockLLM-PII-Action header:
- allow_with_warning: Detect PII, include in response headers, forward request
- block: Reject requests containing personal information (403 error)
- strip: Replace PII with [TYPE] placeholders before forwarding
PII detection is opt-in (disabled by default) and costs $0.0001 per detection (only when PII is found). It works alongside all other features (scanning, routing, abuse detection).
Can I prevent personal data from reaching my LLM?
Yes! Use x-lockllm-pii-action: strip to automatically redact personal information before it reaches your LLM provider. Detected entities are replaced with type placeholders (e.g., [EMAIL], [GIVENNAME]). This ensures your LLM never sees actual personal data while still understanding the context of the request.
What is prompt compression?
Prompt compression reduces the token count of your prompts before they reach your LLM provider, helping you save on API costs. Three methods are available:
- TOON (Token-Oriented Object Notation): Converts JSON data to a compact, token-efficient format. Free, instant, JSON-only.
- Compact: Uses advanced ML-based token classification to compress any text. $0.0001 per use, up to 5 seconds.
- Combined: Applies TOON first, then Compact on the result for maximum compression. $0.0001 per use. Best for maximum token reduction.
Enable compression with the X-LockLLM-Compression header set to toon, compact, or combined. Compression is disabled by default and applied only after all security checks pass. Learn more about Prompt Compression.
Does compression affect response quality?
TOON produces a compact notation that LLMs understand effectively - the data structure and values are fully preserved. Compact uses ML-based classification to preserve meaning while removing redundant tokens. If quality is a concern, use a higher compression rate (closer to 0.7) with the Compact method via the X-LockLLM-Compression-Rate header.
Do you charge for every request?
No! We only charge when providing value:
- Safe prompts: FREE (no security detection fee)
- Unsafe prompts: $0.0001-$0.0002 (detected threat)
- PII detected: $0.0001 (personal information found)
- Prompt compression (TOON): FREE
- Prompt compression (Compact): $0.0001 per use
- Routing to cheaper model: 5% of savings (saved you money)
- Routing to same/more expensive model: FREE (no savings)
- BYOK LLM usage: FREE (you pay provider directly)
If all your prompts are safe, contain no PII, compression is disabled, and routing is disabled, you pay nothing for security scanning.
How do I control my costs?
1. Use BYOK (Bring Your Own Key):
- No LLM usage charges from LockLLM
- Only pay for detections and routing
2. Monitor your dashboard:
- Track detection rates
- View routing savings
3. Take advantage of tier credits:
- Higher tiers get more free credits
- Increase monthly spending to unlock higher tiers
4. Disable optional features if not needed:
- Set x-lockllm-route-action: disabled to skip routing fees (if you don't want cost optimization)
- Use x-lockllm-scan-mode: normal instead of combined if you don't need custom policy checks
- Only add the x-lockllm-pii-action header for inputs that may contain personal data (PII detection is disabled by default)
- Use TOON compression (x-lockllm-compression: toon) for free token savings on JSON data
- Use Compact compression selectively, as it costs $0.0001 per use
Note: Abuse detection (x-lockllm-abuse-action) is FREE and has no cost impact. PII detection (x-lockllm-pii-action) costs $0.0001 only when PII is actually found - no charge when prompts contain no personal information. TOON compression is FREE with no cost impact.
What happens if I run out of credits?
If you're using LockLLM credits (non-BYOK mode) and run out:
- Proxy returns 402 Payment Required error
- Add credits in the dashboard to resume service
- Consider switching to BYOK to avoid credit requirements for LLM usage
If you're using BYOK mode:
- LLM usage never consumes LockLLM credits (you pay provider directly)
- You only need credits for security detections and routing
- Free monthly tier credits often cover detection fees for most users