Proxy Mode
Automatically scan all LLM requests with custom policies, smart routing, PII redaction, prompt compression, and injection and abuse detection. Zero code changes required.
What is Proxy Mode?
Proxy Mode is LockLLM's automatic security solution that scans all your LLM requests without changing your code. Simply change your base URL and either use your own API keys (BYOK) or LockLLM credits - that's it!
Available in two modes:
- BYOK Mode: Use your own provider API keys (OpenAI, Anthropic, etc.) - you only pay LockLLM for security detections and routing optimization
- Universal Mode: Use LockLLM credits for everything via OpenRouter (200+ models) - no provider keys needed
Key benefits:
- Zero code changes required
- Works with official SDKs (OpenAI, Anthropic, etc.)
- Automatic scanning of all requests
- Supports 17+ LLM providers with custom endpoint support
- Your API keys are securely stored
- Pay only for security detections and smart routing
- Free tier with monthly credits
How It Works
BYOK Mode (recommended for production):
- Add your provider API key (OpenAI, Anthropic, etc.) to the LockLLM dashboard
- Change your base URL to https://api.lockllm.com/v1/proxy/{provider}
- All requests are automatically scanned before being forwarded to the provider
- By default, threats are detected with warnings but NOT blocked (use headers to enable blocking)
- You pay the provider directly for LLM usage; LockLLM charges only for security
Universal Mode (no provider keys needed):
- Change your base URL to https://api.lockllm.com/v1/proxy/chat/completions
- All requests are scanned and forwarded via OpenRouter (200+ models)
- Everything billed via LockLLM credits
Your App → LockLLM Proxy (scan + route) → Provider (OpenAI/Anthropic/etc)
✓ Safe: Forward
✗ Malicious (default): Forward with warning
✗ Malicious (block mode): Block request
Pricing
LockLLM uses a transparent, usage-based pricing model. You only pay for security detections and smart routing when they detect actual threats or optimize your costs.
What You Pay For
1. Security Detection Fees (charged only when threats are found):
- Safe prompts: FREE - No charge when prompt passes security checks
- Unsafe core scan (prompt injection detected): $0.0001 per detection
- Policy violation (custom policy triggered): $0.0001 per detection
- Both unsafe (injection + policy violation): $0.0002 per detection
- PII detected (personal information found): $0.0001 per detection
- Prompt compression (TOON): FREE
- Prompt compression (Compact): $0.0001 per use
- Maximum per request (injection + policy + PII + compact compression): $0.0004
2. Smart Routing Fees (optional, charged only when saving you money):
- When routing to a cheaper model: 5% of cost savings
- When routing to a more expensive or same-cost model: FREE
- When routing is disabled: FREE
3. LLM API Usage:
- BYOK (Bring Your Own Key): FREE - You pay your provider directly
- Non-BYOK (Universal Endpoint): Variable cost based on model usage via LockLLM credits
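The fee schedule above can be sketched as a small calculator. This is illustrative only: the flat $0.0001-per-detection rates and the 5%-of-savings routing fee come from the lists above, while the function names are ours.

```javascript
// Illustrative sketch of the per-request security detection fees listed above.
// Safe prompts and TOON compression are free; each detection costs a flat $0.0001.
function detectionFee({ injection = false, policyViolation = false, pii = false, compactCompression = false } = {}) {
  const RATE = 0.0001 // USD per detection
  return [injection, policyViolation, pii, compactCompression].filter(Boolean).length * RATE
}

// Smart routing fee: 5% of cost savings when routed to a cheaper model, otherwise free.
function routingFee(originalCost, routedCost) {
  const savings = originalCost - routedCost
  return savings > 0 ? 0.05 * savings : 0
}
```

A request flagged for injection, a policy violation, and PII, plus compact compression, hits the documented per-request maximum of $0.0004.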
BYOK vs Non-BYOK
BYOK Mode (Provider-Specific Endpoints):
- Use your own API keys for OpenAI, Anthropic, Gemini, etc.
- You pay your provider directly for LLM usage
- LockLLM charges only for security detections and routing optimization
- Example: https://api.lockllm.com/v1/proxy/openai
Non-BYOK Mode (Universal Endpoint):
- Use LockLLM credits for LLM usage (200+ models)
- Billed for security detections, routing optimization, and LLM usage
- No need to configure provider API keys
- Example: https://api.lockllm.com/v1/proxy/chat/completions
Free Monthly Credits
All users receive free monthly credits based on their tier (1-10). Higher tiers unlock more free credits and higher rate limits. Learn more about the tier system.
Example Cost Breakdown
Scenario 1: BYOK user with 100 safe requests
- Security scanning: $0 (all prompts safe)
- Routing fees: $0 (not using routing)
- LLM usage: $5 (paid directly to OpenAI)
- Total LockLLM cost: $0
Scenario 2: BYOK user with 2 unsafe prompts detected
- Security scanning: $0.0002 (2 × $0.0001)
- Routing fees: $0
- LLM usage: $0 (block mode enabled, so unsafe prompts were not forwarded)
- Total LockLLM cost: $0.0002
Scenario 3: BYOK user with routing enabled (saved $10 in costs)
- Security scanning: $0.0001 (1 unsafe prompt)
- Routing fees: $0.50 (5% of $10 savings)
- LLM usage: $40 (paid directly to provider after routing)
- Total LockLLM cost: $0.5001 (but you saved $9.50 overall!)
Scenario 4: Non-BYOK user (universal endpoint)
- Security scanning: $0.0001
- Routing fees: $0.50
- LLM usage: $15 (via LockLLM credits)
- Total LockLLM cost: $15.5001
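Scenario 3's total can be reproduced directly from the fee rules (illustrative arithmetic only; variable names are ours):

```javascript
// Reproduce Scenario 3 above: one unsafe prompt detected, plus routing that
// saved $10 in provider costs. Rates come from the pricing lists above.
const securityFee = 1 * 0.0001           // one unsafe detection
const routingFee = 0.05 * 10             // 5% of the $10 savings
const totalLockLLMCost = securityFee + routingFee
console.log(totalLockLLMCost.toFixed(4)) // matches the $0.5001 in the scenario
```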
Supported Providers
LockLLM proxy mode supports 17 providers, each with a dedicated proxy URL:
| Provider | Proxy URL |
|---|---|
| OpenAI | https://api.lockllm.com/v1/proxy/openai |
| Anthropic | https://api.lockllm.com/v1/proxy/anthropic |
| Google Gemini | https://api.lockllm.com/v1/proxy/gemini |
| Cohere | https://api.lockllm.com/v1/proxy/cohere |
| OpenRouter | https://api.lockllm.com/v1/proxy/openrouter |
| Perplexity | https://api.lockllm.com/v1/proxy/perplexity |
| Mistral AI | https://api.lockllm.com/v1/proxy/mistral |
| Groq | https://api.lockllm.com/v1/proxy/groq |
| DeepSeek | https://api.lockllm.com/v1/proxy/deepseek |
| Together AI | https://api.lockllm.com/v1/proxy/together |
| xAI (Grok) | https://api.lockllm.com/v1/proxy/xai |
| Fireworks AI | https://api.lockllm.com/v1/proxy/fireworks |
| Anyscale | https://api.lockllm.com/v1/proxy/anyscale |
| Hugging Face | https://api.lockllm.com/v1/proxy/huggingface |
| Azure OpenAI | https://api.lockllm.com/v1/proxy/azure |
| AWS Bedrock | https://api.lockllm.com/v1/proxy/bedrock |
| Google Vertex AI | https://api.lockllm.com/v1/proxy/vertex-ai |
Custom Endpoints
All providers support custom endpoint URLs for self-hosted models, alternative endpoints, or proxy/gateway services. When adding your API key in the dashboard, you can optionally specify a custom endpoint URL to override the default.
Universal Endpoint (No BYOK Required)
Don't want to configure provider API keys? Use the universal endpoint with LockLLM credits:
Endpoint: https://api.lockllm.com/v1/proxy/chat/completions
Features:
- Access 200+ models
- No provider API keys needed
- No BYOK configuration required
- Everything billed via LockLLM credits
- Same security scanning and routing features
You can browse all supported models and their IDs in the Model List page in your dashboard. When making requests, you must use the exact model ID shown there (e.g., openai/gpt-4).
Example:
const OpenAI = require('openai')
const client = new OpenAI({
apiKey: process.env.LOCKLLM_API_KEY, // Your LockLLM API key only
baseURL: 'https://api.lockllm.com/v1/proxy/chat/completions'
})
// Use the model ID from the Model List page
const response = await client.chat.completions.create({
model: 'openai/gpt-4', // Must match the model ID exactly
messages: [{ role: 'user', content: userPrompt }]
})
When to use:
- Testing without provider API keys
- Access to multiple providers without separate keys
- Simplified billing (one invoice from LockLLM)
- Rapid prototyping
When to use BYOK instead:
- Production applications (lower costs)
- You already have provider API keys
- Want direct provider billing
- Need provider-specific features
Quick Start
Step 1: Add Your Provider API Key to Dashboard
- Sign in to your LockLLM dashboard
- Navigate to Proxy Settings (or API Keys)
- Click Add API Key
- Select your provider (e.g., OpenAI, Anthropic, Azure)
- Enter your provider API key (from OpenAI, Anthropic, etc.)
- Give it a nickname (optional, e.g., "Production Key")
- Click Add API Key
Step 2: Get Your LockLLM API Key
- In the LockLLM dashboard, go to API Keys section
- Copy your LockLLM API key
- You'll use this to authenticate proxy requests (not your provider key!)
Step 3: Update Your Code
Change your SDK configuration to use LockLLM's proxy. Important: Pass your LockLLM API key (not your provider key) for authentication.
OpenAI SDK (JavaScript/TypeScript)
const OpenAI = require('openai')
const openai = new OpenAI({
apiKey: process.env.LOCKLLM_API_KEY, // Your LockLLM API key (...)
baseURL: 'https://api.lockllm.com/v1/proxy/openai'
})
// All requests automatically scanned with default allow_with_warning behavior
// Threats are detected and warnings added to response, but requests are NOT blocked
const response = await openai.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: userPrompt }]
})
Anthropic SDK (JavaScript/TypeScript)
const Anthropic = require('@anthropic-ai/sdk')
const anthropic = new Anthropic({
apiKey: process.env.LOCKLLM_API_KEY, // Your LockLLM API key
baseURL: 'https://api.lockllm.com/v1/proxy/anthropic'
})
// Automatically scanned!
const response = await anthropic.messages.create({
model: 'claude-3-opus-20240229',
max_tokens: 1024,
messages: [{ role: 'user', content: userPrompt }]
})
Python OpenAI
import os
from openai import OpenAI
client = OpenAI(
api_key=os.environ.get('LOCKLLM_API_KEY'), # Your LockLLM API key
base_url='https://api.lockllm.com/v1/proxy/openai'
)
# Automatically scanned!
response = client.chat.completions.create(
model='gpt-4',
messages=[{'role': 'user', 'content': user_prompt}]
)
Python Anthropic
import os
from anthropic import Anthropic
client = Anthropic(
api_key=os.environ.get('LOCKLLM_API_KEY'), # Your LockLLM API key (...)
base_url='https://api.lockllm.com/v1/proxy/anthropic'
)
# Automatically scanned!
response = client.messages.create(
model='claude-3-opus-20240229',
max_tokens=1024,
messages=[{'role': 'user', 'content': user_prompt}]
)
How Authentication Works
Important: Understanding the two types of API keys:
- Provider API Key (OpenAI, Anthropic, etc.):
  - You add this to the LockLLM dashboard once
  - Stored securely and encrypted
  - Never put this in your code when using the proxy
- LockLLM API Key:
  - You pass this in your SDK configuration (apiKey parameter)
  - Authenticates your requests to the LockLLM proxy
  - This is what goes in your code
Provider-Specific Configuration
Azure OpenAI
Azure requires additional configuration:
Dashboard Setup:
- Select Azure OpenAI as provider
- Enter your Azure OpenAI API key
- Enter your Endpoint URL (e.g., https://your-resource.openai.azure.com)
- Enter your Deployment Name (e.g., gpt-4)
- Enter API Version (e.g., 2024-10-21) - optional, defaults to latest
Code:
const OpenAI = require('openai')
const client = new OpenAI({
apiKey: process.env.LOCKLLM_API_KEY, // Your LockLLM API key (NOT Azure key)
baseURL: 'https://api.lockllm.com/v1/proxy/azure'
})
// Use Azure OpenAI models
const response = await client.chat.completions.create({
model: 'gpt-4', // Uses your configured deployment
messages: [{ role: 'user', content: userPrompt }]
})
Azure API Format Support:
LockLLM supports both Azure OpenAI API formats:
- Legacy format (deployment-based): /openai/deployments/{deployment}/chat/completions?api-version=2024-10-21
- v1 API format (preview): /openai/v1/chat/completions?api-version=2024-10-21 with the deployment specified in a header
The proxy automatically handles both formats. When you configure your deployment name in the dashboard, it's used for all requests.
AWS Bedrock
Bedrock requires AWS credentials:
Dashboard Setup:
- Select AWS Bedrock as provider
- Enter your AWS credentials JSON:
{
"accessKeyId": "your-access-key",
"secretAccessKey": "your-secret-key",
"region": "us-east-1"
}
Code:
// Use with Bedrock-compatible SDK or direct fetch
const response = await fetch('https://api.lockllm.com/v1/proxy/bedrock/model/anthropic.claude-3-sonnet-20240229-v1:0/invoke', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': 'Bearer YOUR_LOCKLLM_API_KEY' // Your LockLLM API key
},
body: JSON.stringify({
anthropic_version: 'bedrock-2023-05-31',
max_tokens: 1024,
messages: [{ role: 'user', content: userPrompt }]
})
})
Google Vertex AI
Vertex AI requires service account credentials:
Dashboard Setup:
- Select Vertex AI as provider
- Enter your Google Cloud project ID
- Enter your service account JSON key
Code:
// Use with Vertex AI-compatible SDK or direct fetch
const response = await fetch('https://api.lockllm.com/v1/proxy/vertex-ai/v1/projects/YOUR_PROJECT/locations/us-central1/publishers/anthropic/models/claude-3-opus@20240229:streamRawPredict', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': 'Bearer YOUR_LOCKLLM_API_KEY' // Your LockLLM API key
},
body: JSON.stringify({
anthropic_version: 'vertex-2023-10-16',
max_tokens: 1024,
messages: [{ role: 'user', content: userPrompt }]
})
})
Multiple Keys for Same Provider
You can add multiple API keys for the same provider with different nicknames:
Dashboard Setup:
- Add multiple keys for the same provider (e.g., "OpenAI - Production" and "OpenAI - Development")
- Give each a unique nickname
- Only one can be enabled at a time for each provider
Code:
// The proxy uses the enabled key for the provider
const openai = new OpenAI({
apiKey: process.env.LOCKLLM_API_KEY, // Your LockLLM API key
baseURL: 'https://api.lockllm.com/v1/proxy/openai'
})
// Whichever OpenAI key is enabled in your dashboard will be used
Note: The proxy automatically selects the enabled key for each provider. To switch between keys, enable/disable them in the dashboard.
Managing Provider Keys
View Your Keys
Visit the Proxy Settings page in your dashboard to see all configured provider keys:
- Provider name
- Nickname
- Last used timestamp
- Enable/disable toggle
- Delete option
Enable/Disable Keys
Toggle keys on/off without deleting them:
- Go to Proxy Settings
- Find your key
- Click the toggle switch
- Disabled keys won't be used for proxying
Delete Keys
Permanently remove provider keys:
- Go to Proxy Settings
- Find your key
- Click Delete
- Confirm deletion
Security
Your Provider API Keys Are Secure
- Provider API keys (OpenAI, Anthropic, etc.) are encrypted at rest using industry-standard encryption
- Keys are stored securely in our database and never exposed in API responses
- Only you can view or manage your keys through the dashboard
- Keys are never logged or included in error messages
Request Authentication
Standard authentication (recommended):
const openai = new OpenAI({
apiKey: process.env.LOCKLLM_API_KEY, // Your LockLLM API key (...)
baseURL: 'https://api.lockllm.com/v1/proxy/openai'
})
Alternative: Using Authorization header directly:
If you need to pass the LockLLM API key separately (e.g., for custom implementations):
const openai = new OpenAI({
apiKey: 'dummy-key', // SDK requires a value, but proxy ignores this
baseURL: 'https://api.lockllm.com/v1/proxy/openai',
defaultHeaders: {
'Authorization': 'Bearer YOUR_LOCKLLM_API_KEY'
}
})
The proxy checks both the apiKey parameter and the Authorization header for your LockLLM API key.
Blocked Requests
Default Behavior: By default, the proxy uses allow_with_warning for security scans and policy checks. This means malicious prompts are allowed but flagged with warnings in the response. Requests are only blocked if you explicitly set the action headers to block.
When blocking is enabled (via X-LockLLM-Scan-Action: block), the proxy returns a 400 Bad Request error with the following structure:
{
"error": {
"message": "Malicious prompt detected by LockLLM",
"type": "lockllm_security_error",
"code": "prompt_injection_detected",
"scan_result": {
"safe": false,
"label": 1,
"confidence": 95,
"injection": 95,
"sensitivity": "medium"
},
"request_id": "req_abc123"
}
}
Response Headers:
- X-Request-Id: Unique request identifier
- X-LockLLM-Blocked: "true" (indicates the request was blocked)
Handle blocked requests in your application:
try {
const response = await openai.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: userPrompt }]
})
} catch (error) {
// Check if error is from LockLLM security block
if (error.response?.status === 400 && error.response?.data?.error?.code === 'prompt_injection_detected') {
console.log('Malicious prompt blocked by LockLLM')
const scanResult = error.response.data.error.scan_result
console.log('Injection confidence:', scanResult.injection)
console.log('Request ID:', error.response.data.error.request_id)
// Handle security incident (log, alert, etc.)
// You can find this request in your LockLLM dashboard logs
} else {
// Handle other errors
throw error
}
}
When PII blocking is enabled (via X-LockLLM-PII-Action: block), the proxy returns a 403 Forbidden error:
{
"error": {
"message": "Request blocked due to personal information detected",
"type": "lockllm_pii_error",
"code": "pii_detected",
"pii_details": {
"entity_types": ["Email", "Phone Number"],
"entity_count": 3
},
"request_id": "req_abc123"
}
}
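The different block responses can be told apart by status code and error code. A minimal classifier sketch, using the status codes and error codes shown in the JSON examples above (the helper name is ours, not part of any SDK):

```javascript
// Map a LockLLM proxy error response onto the block types documented in this
// section: 400 for injection blocks, 403 for policy and PII blocks.
function classifyLockLLMBlock(status, body) {
  const code = body?.error?.code
  if (status === 400 && code === 'prompt_injection_detected') return 'injection'
  if (status === 403 && code === 'policy_violation') return 'policy'
  if (status === 403 && code === 'pii_detected') return 'pii'
  return 'other'
}
```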
Successful requests return the normal provider response (OpenAI, Anthropic, etc.) with additional LockLLM headers:
Standard Headers (always included):
| Header | Description |
|---|---|
| X-Request-Id | Unique request identifier |
| X-LockLLM-Scanned | "true" - confirms request was scanned |
| X-LockLLM-Safe | "true" or "false" - scan result |
| X-Scan-Mode | Scan mode used for this request |
| X-LockLLM-Model | Model used for the request |
| X-LockLLM-Provider | Provider used |
| X-LockLLM-Credits-Mode | "lockllm_credits" or "byok" |
| X-LockLLM-Sensitivity | Sensitivity level used for scanning |
Credits Mode Headers (included when using LockLLM credits/universal endpoint):
| Header | Description |
|---|---|
| X-LockLLM-Credits-Reserved | Amount of credits reserved for this request (in USD). Actual usage may differ; unused credits are refunded automatically. |
Warning Headers (included when threats detected with allow_with_warning):
| Header | Description |
|---|---|
| X-LockLLM-Scan-Warning | "true" if core security threat detected |
| X-LockLLM-Injection-Score | Injection score (0-100) |
| X-LockLLM-Confidence | Detection confidence (0-100) |
| X-LockLLM-Label | "0" for safe, "1" for malicious |
| X-LockLLM-Scan-Detail | Base64-encoded JSON with full scan detail (message, injection score, confidence, label, sensitivity) |
| X-LockLLM-Policy-Warnings | "true" if policy violations found |
| X-LockLLM-Warning-Count | Number of policy warnings |
| X-LockLLM-Warning-Detail | Base64-encoded JSON with first policy warning detail (policy name, violated categories, violation details) |
| X-LockLLM-Policy-Confidence | Policy check confidence score (0-100), included for combined and policy_only scan modes |
| X-LockLLM-Abuse-Detected | "true" if abuse was detected (only when x-lockllm-abuse-action: allow_with_warning) |
| X-LockLLM-Abuse-Confidence | Abuse confidence score (0-100) |
| X-LockLLM-Abuse-Types | Comma-separated list of detected abuse types (e.g., "bot_generated,rapid_requests") |
| X-LockLLM-Abuse-Detail | Base64-encoded JSON with full abuse detail (confidence, abuse types, indicators, recommendation) |
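The *-Detail headers above carry Base64-encoded JSON. A minimal Node.js decoder sketch (the helper name is ours; the sample payload below is synthetic, mirroring the scan-detail fields documented here):

```javascript
// Decode a Base64-encoded JSON detail header (e.g. X-LockLLM-Scan-Detail,
// X-LockLLM-Warning-Detail, X-LockLLM-Abuse-Detail). Returns null if absent.
function decodeDetailHeader(value) {
  if (!value) return null
  return JSON.parse(Buffer.from(value, 'base64').toString('utf8'))
}

// Synthetic example mirroring an X-LockLLM-Scan-Detail payload:
const sample = Buffer.from(JSON.stringify({ injection: 95, confidence: 95, label: 1 })).toString('base64')
const detail = decodeDetailHeader(sample)
```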
Routing Headers (included when routing is enabled):
| Header | Description |
|---|---|
| X-LockLLM-Route-Enabled | "true" if routing was active |
| X-LockLLM-Task-Type | Detected task classification |
| X-LockLLM-Complexity | Complexity score (0.0-1.0) |
| X-LockLLM-Selected-Model | Model chosen by router |
| X-LockLLM-Routing-Reason | Explanation for model selection |
| X-LockLLM-Original-Model | Original model before routing (only if changed) |
| X-LockLLM-Original-Provider | Original provider before routing (only if changed) |
| X-LockLLM-Estimated-Original-Cost | Estimated cost with original model |
| X-LockLLM-Estimated-Routed-Cost | Estimated cost with routed model |
| X-LockLLM-Estimated-Savings | Cost savings from routing |
| X-LockLLM-Estimated-Input-Tokens | Estimated input token count used for cost calculation |
| X-LockLLM-Estimated-Output-Tokens | Estimated output token count used for cost calculation |
| X-LockLLM-Routing-Fee-Reserved | Routing fee reserved upfront (in USD, 6 decimal places) |
| X-LockLLM-Routing-Fee-Reason | Reason no routing fee was charged (e.g., "routing_to_more_expensive_model") |
Cache Headers (included for response caching):
| Header | Description |
|---|---|
| X-LockLLM-Cache-Status | "HIT" or "MISS" |
| X-LockLLM-Cache-Age | Cache entry age in seconds |
| X-LockLLM-Tokens-Saved | Tokens saved from cache hit |
| X-LockLLM-Cost-Saved | Cost saved from cache hit |
PII Headers (included when PII detection is enabled):
| Header | Description |
|---|---|
| X-LockLLM-PII-Detected | "true" or "false" - always included when x-lockllm-pii-action is set |
| X-LockLLM-PII-Types | Comma-separated list of detected entity types (e.g., "Email,Phone Number") - only included when PII is found |
| X-LockLLM-PII-Count | Number of PII entities found - only included when PII is found |
| X-LockLLM-PII-Action | The PII action that was applied (strip, block, or allow_with_warning) - only included when PII is found |
Compression Headers (included when prompt compression is enabled):
| Header | Description |
|---|---|
| X-LockLLM-Compression-Method | Compression method used: "toon", "compact", or "combined" |
| X-LockLLM-Compression-Applied | "true" or "false" - whether compression was successfully applied |
| X-LockLLM-Compression-Ratio | Compression ratio (only included when compression was applied) |
Detection Settings
Scan Results
Every request is scanned and returns a confidence score indicating whether the prompt is safe or potentially malicious.
Scan result format:
{
"safe": true,
"label": 0, // 0 = safe, 1 = malicious
"confidence": 92, // Confidence in the prediction (0-100)
"injection": 8, // Injection score (0-100, higher = more likely malicious)
"sensitivity": "medium"
}
The proxy scans all requests and provides confidence scores. By default, threats are flagged with warnings but requests are still forwarded. Enable blocking via the x-lockllm-scan-action: block header.
Learn more about Threat Detection
Scan Modes
Control what gets scanned using the x-lockllm-scan-mode header:
1. Normal Mode
- Scans only for core security threats (prompt injection, jailbreaks, etc.)
- No custom policy checks
- Use when you only need basic security protection
2. Policy-Only Mode
- Skips core security scan
- Checks only custom content policies
- Use when you want to enforce content guidelines without injection detection
3. Combined Mode (default)
- Scans for both core security threats AND custom policy violations
- Most comprehensive protection
- Use for production applications with strict content requirements
Example:
const openai = new OpenAI({
apiKey: process.env.LOCKLLM_API_KEY,
baseURL: 'https://api.lockllm.com/v1/proxy/openai',
defaultHeaders: {
'x-lockllm-scan-mode': 'combined' // Enable both core scan + custom policies
}
})
const response = await openai.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: userPrompt }]
})
Action Headers
Fine-grained control over how threats are handled using request headers.
Default Behavior (No Headers): When you don't configure any headers, the proxy uses safe defaults:
- Core security scan: allow_with_warning (scans but doesn't block)
- Policy violations: allow_with_warning (detects but doesn't block)
- Abuse detection: disabled (opt-in only)
- PII detection: disabled (opt-in only)
- Prompt compression: disabled (opt-in only)
- Routing: disabled (uses your original model choice)
This means by default, the proxy scans for threats and adds warnings to responses without blocking requests.
Available Headers:
x-lockllm-scan-mode (controls what gets scanned):
- normal: Core security threats only
- policy_only: Custom policies only
- combined (default): Both core threats and policies
x-lockllm-sensitivity (controls detection strictness):
- low: Fewer false positives, more permissive
- medium (default): Balanced approach
- high: Maximum security, most aggressive detection
x-lockllm-scan-action (controls core injection scan behavior):
- allow_with_warning (default): Allow request, add warning to response
- block: Block malicious requests with a 400 error
x-lockllm-policy-action (controls custom policy violation behavior):
- allow_with_warning (default): Allow request, add policy warnings to response
- block: Block policy violations with a 403 error
x-lockllm-abuse-action (controls AI abuse detection):
- Not set (default): Skip abuse detection entirely
- allow_with_warning: Detect abuse, add warnings to response
- block: Block abusive requests with a 400 error
x-lockllm-route-action (controls smart routing):
- disabled (default): No routing, use original model
- auto: Automatic routing based on AI task classification
- custom: Use user-defined routing rules from dashboard
x-lockllm-pii-action (controls PII detection and redaction):
- Not set (default): Skip PII detection entirely
- allow_with_warning: Detect PII, add results to response headers
- block: Block requests containing personal information (403 error)
- strip: Replace detected PII with [TYPE] placeholders before forwarding to provider
x-lockllm-compression (controls prompt compression):
- Not set (default): Skip compression entirely
- toon: Compress JSON data using TOON format (free, JSON-only, instant)
- compact: Compress any text using ML-based compression ($0.0001 per use)
- combined: Apply TOON first, then Compact, for maximum compression ($0.0001 per use)
x-lockllm-compression-rate (controls compact/combined compression aggressiveness):
- Default: 0.5 (balanced)
- Range: 0.3 (aggressive) to 0.7 (conservative)
- Only applies when using the compact or combined method
x-lockllm-cache-response (controls response caching):
- true (default): Enable response caching for identical requests
- false: Disable response caching (always get fresh responses)
x-lockllm-cache-ttl (controls response cache duration, in seconds):
- Default: 3600 (1 hour)
- Maximum: 86400 (24 hours)
- Only applies when response caching is enabled
Example:
const openai = new OpenAI({
apiKey: process.env.LOCKLLM_API_KEY,
baseURL: 'https://api.lockllm.com/v1/proxy/openai',
defaultHeaders: {
'x-lockllm-scan-mode': 'combined', // Scan both core + policies
'x-lockllm-sensitivity': 'high', // Maximum security
'x-lockllm-scan-action': 'block', // Block injection attacks
'x-lockllm-policy-action': 'block', // Block policy violations
'x-lockllm-abuse-action': 'allow_with_warning', // Detect abuse, don't block
'x-lockllm-route-action': 'auto', // Enable smart routing
'x-lockllm-pii-action': 'strip', // Redact personal information
'x-lockllm-compression': 'compact' // Compress prompts for token savings
}
})
Custom Content Policies
Create your own content policies beyond the built-in security categories. Custom policies let you enforce brand-specific guidelines, compliance requirements, or content restrictions.
Setting Up Custom Policies
- Go to Dashboard → Policies
- Click Create Policy
- Enter a policy name (e.g., "No Medical Advice")
- Write a description (up to 10,000 characters) defining what should be blocked
- Enable the policy
- Set the x-lockllm-scan-mode header to combined or policy_only in your requests
Example:
// Dashboard policy: "No Financial Advice"
// Description: "Block requests asking for investment advice, stock tips, or financial planning guidance"
// In your code:
const openai = new OpenAI({
apiKey: process.env.LOCKLLM_API_KEY,
baseURL: 'https://api.lockllm.com/v1/proxy/openai',
defaultHeaders: {
'x-lockllm-scan-mode': 'combined' // Checks both core security + custom policies
}
})
const response = await openai.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: userPrompt }]
})
Policy Violations Response
When a custom policy is violated:
{
"error": {
"message": "Request blocked by custom policy",
"type": "lockllm_policy_error",
"code": "policy_violation",
"violated_policies": [
{
"policy_name": "No Financial Advice",
"violated_categories": [
{ "name": "Investment Guidance" }
],
"violation_details": "User requested stock recommendations"
}
],
"request_id": "req_abc123"
}
}
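Given an error body in the shape above, the violated policy names can be pulled out with a small helper (an illustrative sketch; the function name is ours):

```javascript
// Extract the violated policy names from a lockllm_policy_error body,
// matching the JSON example above. Returns [] when no policies are listed.
function violatedPolicyNames(errorBody) {
  return (errorBody?.error?.violated_policies ?? []).map(p => p.policy_name)
}

// Example with the documented response shape:
const body = {
  error: {
    code: 'policy_violation',
    violated_policies: [{ policy_name: 'No Financial Advice', violated_categories: [{ name: 'Investment Guidance' }] }]
  }
}
```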
Learn more about Custom Policies
Smart Routing
Automatically route requests to the optimal model based on task complexity and type. Save costs by using cheaper models for simple tasks while maintaining quality for complex ones.
How Routing Works
- Task Classification: AI analyzes your prompt and determines the task type (e.g., "Code Generation", "Summarization", "Open QA")
- Complexity Analysis: Assigns a complexity score (0-1) and tier (low/medium/high)
- Model Selection: Chooses the best model based on task requirements and your routing rules
- Execution: Routes request to selected model (uses your BYOK key or LockLLM credits)
Routing Modes
Auto Routing (X-LockLLM-Route-Action: auto):
- AI-powered task classification and complexity analysis
- Automatically selects optimal model based on predefined routing logic
- Routes high-complexity tasks to advanced models (Claude Sonnet, GPT-4)
- Routes low-complexity tasks to efficient models for cost savings
Custom Routing (X-LockLLM-Route-Action: custom):
- Uses your own routing rules defined in the dashboard
- Configure rules by task type and complexity tier
- Specify target model and whether to use BYOK
- Falls back to auto routing if no matching rule found
Setting Up Routing
Enable Auto Routing:
const openai = new OpenAI({
apiKey: process.env.LOCKLLM_API_KEY,
baseURL: 'https://api.lockllm.com/v1/proxy/openai',
defaultHeaders: {
'x-lockllm-route-action': 'auto'
}
})
// Example: User requests GPT-4, but prompt is simple
// Router detects low complexity and routes to GPT-3.5
// You save money, get fast response, maintain quality
Custom Routing Rules (Dashboard):
- Go to Dashboard → Routing
- Click Create Rule
- Select task type (Code Generation, Summarization, etc.)
- Select complexity tier (low, medium, high)
- Choose target model
- Enable BYOK or use LockLLM credits
- Save rule
Example Rule:
- Task Type: Code Generation
- Complexity: High
- Target Model: claude-3-7-sonnet
- Use BYOK: Yes (uses your Anthropic key)
Routing Fees
- Routing to cheaper model: 5% of cost savings
- Routing to more expensive or same-cost model: FREE
- When routing is disabled: FREE
You only pay routing fees when the router actually saves you money!
Routing Metadata
All routed requests include metadata in response headers:
- X-LockLLM-Task-Type: Detected task classification
- X-LockLLM-Complexity: Complexity score (0.0-1.0)
- X-LockLLM-Selected-Model: Model chosen by router
- X-LockLLM-Routing-Reason: Explanation for selection
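These headers can be collected from any fetch-style Headers object. A minimal sketch (the helper name is ours; a Map stands in for real response headers here, though actual fetch Headers are case-insensitive):

```javascript
// Collect the routing metadata headers listed above from anything that
// exposes a .get(name) method (fetch Headers, or a Map in tests).
function routingMetadata(headers) {
  return {
    taskType: headers.get('X-LockLLM-Task-Type'),
    complexity: parseFloat(headers.get('X-LockLLM-Complexity') ?? ''),
    selectedModel: headers.get('X-LockLLM-Selected-Model'),
    reason: headers.get('X-LockLLM-Routing-Reason'),
  }
}

// Example with synthetic header values:
const meta = routingMetadata(new Map([
  ['X-LockLLM-Task-Type', 'Code Generation'],
  ['X-LockLLM-Complexity', '0.82'],
  ['X-LockLLM-Selected-Model', 'claude-3-7-sonnet'],
  ['X-LockLLM-Routing-Reason', 'high complexity'],
]))
```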
Learn more about Smart Routing
AI Abuse Detection
Protect your application from end-users abusing your AI endpoints with automated requests, bot-generated prompts, or resource exhaustion attacks.
What Gets Detected
Content Analysis:
- Bot-generated content: Template-like structures, excessive special characters
- Excessive repetition: Character, word, and phrase-level repetition
- Resource exhaustion: Extremely long prompts, deep nesting, oversized inputs
Pattern Analysis (behavioral):
- Rapid requests: Unusual request frequency from single API key
- Duplicate prompts: Identical prompts within short time window
- Burst detection: >50% of requests concentrated in last 30 seconds
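The burst heuristic above (more than 50% of requests concentrated in the last 30 seconds) can be sketched as a toy Python function. Illustrative only, not LockLLM's actual detector:

```python
import time

def looks_like_burst(timestamps, window=30.0, threshold=0.5, now=None):
    """Toy version of the burst heuristic described above: flag a key when
    more than `threshold` of its recent requests landed within the last
    `window` seconds."""
    if not timestamps:
        return False
    now = time.time() if now is None else now
    recent = sum(1 for t in timestamps if now - t <= window)
    return recent / len(timestamps) > threshold

# 8 of 10 requests in the last 30 seconds -> burst
now = 1_000_000.0
ts = [now - 300, now - 200] + [now - i for i in range(8)]
print(looks_like_burst(ts, now=now))  # True
```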
Enabling Abuse Detection
Add the X-LockLLM-Abuse-Action header to enable abuse detection:
const openai = new OpenAI({
apiKey: process.env.LOCKLLM_API_KEY,
baseURL: 'https://api.lockllm.com/v1/proxy/openai',
defaultHeaders: {
'x-lockllm-abuse-action': 'block' // or 'allow_with_warning'
}
})
Abuse Detection Response
When abuse is detected with block action:
{
"error": {
"message": "Request blocked due to abuse detection",
"type": "lockllm_abuse_error",
"code": "abuse_detected",
"abuse_details": {
"confidence": 87,
"abuse_types": ["bot_generated", "rapid_requests"],
"indicators": {
"bot_score": 95,
"repetition_score": 45,
"resource_score": 30,
"pattern_score": 80
},
"details": {
"recommendation": "Implement rate limiting or CAPTCHA for this user"
}
},
"request_id": "req_abc123"
}
}
With allow_with_warning action, request proceeds with abuse warnings added to response.
Learn more about Abuse Detection
PII Detection & Redaction
Automatically detect and protect personal information in prompts before they reach your LLM provider. When enabled, LockLLM scans for 17 entity types, including names, email addresses, phone numbers, Social Security numbers, and credit card numbers.
Supported Entity Types
LockLLM detects 17 types of personal information:
- Identity: First Name, Last Name, Date of Birth, Username
- Contact: Email, Phone Number, Street Address, City, Zip Code, Building Number
- Financial: Credit Card, Account Number, Tax ID
- Government IDs: Social Security Number, Driver's License, ID Card Number
- Security: Password
Enabling PII Detection
Add the X-LockLLM-PII-Action header to enable PII detection:
const openai = new OpenAI({
apiKey: process.env.LOCKLLM_API_KEY,
baseURL: 'https://api.lockllm.com/v1/proxy/openai',
defaultHeaders: {
'x-lockllm-pii-action': 'strip' // or 'block' or 'allow_with_warning'
}
})
PII Actions
allow_with_warning - Detect PII and add metadata to response headers, but forward the original request:
- Response includes X-LockLLM-PII-Detected, X-LockLLM-PII-Types, and X-LockLLM-PII-Count headers
- Your application can read these headers to decide how to handle PII
block - Block requests containing personal information:
- Returns 403 error with details of detected entity types
- Request is NOT forwarded to your LLM provider
- Prevents personal data from reaching the model
strip (recommended for privacy) - Automatically redact PII before forwarding:
- Detected entities are replaced with [TYPE] placeholders (e.g., [EMAIL], [GIVENNAME])
- The redacted request is forwarded to your LLM provider
- Your LLM never sees the actual personal information
- Response headers indicate what was redacted
Example: Stripping PII
const openai = new OpenAI({
apiKey: process.env.LOCKLLM_API_KEY,
baseURL: 'https://api.lockllm.com/v1/proxy/openai',
defaultHeaders: {
'x-lockllm-pii-action': 'strip'
}
})
// User sends: "Contact John Smith at [email protected] or 555-123-4567"
// LLM receives: "Contact [GIVENNAME] [SURNAME] at [EMAIL] or [TELEPHONENUM]"
const response = await openai.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: userPrompt }]
})
// Check PII headers in raw response:
// X-LockLLM-PII-Detected: true
// X-LockLLM-PII-Types: First Name,Last Name,Email,Phone Number
// X-LockLLM-PII-Count: 4
// X-LockLLM-PII-Action: strip
PII Detection Pricing
- PII not detected: FREE
- PII detected: $0.0001 per detection
- PII detection is opt-in and disabled by default
Learn more about PII Detection
Prompt Compression
Reduce token usage and AI costs by compressing prompts before they are forwarded to your LLM provider. Compression happens after security scanning, so your prompts are always scanned in their original form.
Compression Methods
TOON (Token-Oriented Object Notation)
- Converts JSON data to a compact, token-efficient format
- Removes redundant syntax (braces, quotes) while remaining LLM-readable
- Works only on valid JSON input (returns original text for non-JSON)
- FREE - no additional cost
- Instant, local transformation
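To see why this kind of transformation saves tokens, here is a toy Python sketch of the general idea: flattening uniform JSON arrays into a header row plus value rows, dropping repeated keys, braces, and quotes. This is NOT the actual TOON specification, only an illustration:

```python
import json

def tabularize(json_text: str) -> str:
    """Toy illustration of the idea behind TOON (not the real format):
    uniform arrays of objects become a header plus value rows. Non-JSON
    input is returned unchanged, mirroring the behavior described above."""
    try:
        data = json.loads(json_text)
    except (ValueError, TypeError):
        return json_text
    if not isinstance(data, dict):
        return json_text
    lines = []
    for key, rows in data.items():
        if isinstance(rows, list) and rows and all(isinstance(r, dict) for r in rows):
            cols = list(rows[0])
            lines.append(f"{key}[{len(rows)}]{{{','.join(cols)}}}:")
            lines.extend(",".join(str(r[c]) for c in cols) for r in rows)
        else:
            lines.append(f"{key}: {rows}")
    return "\n".join(lines)

print(tabularize('{"users": [{"name": "Alice", "age": 30}]}'))
# users[1]{name,age}:
# Alice,30
```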
Compact (ML-based)
- Advanced token classification that removes unnecessary tokens from any text
- Works on natural language, code, structured data, and mixed content
- Configurable compression rate (0.3-0.7)
- Costs $0.0001 per use
- 5-second timeout with fail-open behavior
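The fail-open behavior above can be sketched as a wrapper that forwards the original text whenever compression errors out or exceeds the timeout. `compressor` is a hypothetical callable standing in for the ML service, not a LockLLM API:

```python
from concurrent.futures import ThreadPoolExecutor

def compress_fail_open(text, compressor, timeout=5.0):
    """Sketch of fail-open compression: if the compressor raises or exceeds
    the timeout, forward the original text unchanged. A real implementation
    would also cancel or detach the hung worker."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        try:
            return pool.submit(compressor, text).result(timeout=timeout)
        except Exception:
            # Timeout or compressor failure: fail open, keep the original prompt
            return text

def broken(_):
    raise RuntimeError("compression model unavailable")

print(compress_fail_open("a long prompt ...", broken))    # original text, unchanged
print(compress_fail_open("a long prompt ...", str.upper))
```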
Enabling Prompt Compression
Add the X-LockLLM-Compression header to enable compression:
const openai = new OpenAI({
apiKey: process.env.LOCKLLM_API_KEY,
baseURL: 'https://api.lockllm.com/v1/proxy/openai',
defaultHeaders: {
'x-lockllm-compression': 'compact',
'x-lockllm-compression-rate': '0.5'
}
})
// Prompts are compressed before reaching OpenAI
// You save tokens on upstream API calls
const response = await openai.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: longDocument }]
})
// Check compression headers in raw response:
// X-LockLLM-Compression-Method: compact
// X-LockLLM-Compression-Applied: true
// X-LockLLM-Compression-Ratio: 0.4500
Example: TOON for JSON Data
const openai = new OpenAI({
apiKey: process.env.LOCKLLM_API_KEY,
baseURL: 'https://api.lockllm.com/v1/proxy/openai',
defaultHeaders: {
'x-lockllm-compression': 'toon' // Free JSON compression
}
})
// JSON data in prompt is automatically compressed to TOON format
// Example: {"users": [{"name": "Alice", "age": 30}]} becomes compact notation
const response = await openai.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: JSON.stringify(jsonData) }]
})
Example: Compact with Python
import os
from openai import OpenAI
client = OpenAI(
api_key=os.environ.get('LOCKLLM_API_KEY'),
base_url='https://api.lockllm.com/v1/proxy/openai',
default_headers={
'X-LockLLM-Compression': 'compact',
'X-LockLLM-Compression-Rate': '0.5'
}
)
response = client.chat.completions.create(
model='gpt-4',
messages=[{'role': 'user', 'content': long_document}]
)
Prompt Compression Pricing
- TOON: FREE
- Compact: $0.0001 per use
- Compression is opt-in and disabled by default
Learn more about Prompt Compression in the dedicated guide
Monitoring
View Proxy Logs
All proxy requests are logged in your dashboard:
- Go to Activity Logs
- Filter by Proxy Requests
- See scan results, providers, models used
- View blocked requests
Webhook Notifications
Get notified when malicious prompts are detected:
- Go to Webhooks
- Add a webhook URL
- Choose format (raw, Slack, Discord)
- Receive alerts for blocked requests
Performance
Does Proxy Mode Add Latency?
Minimal latency is added:
- Scanning: ~100-200ms
- Network overhead: ~50ms
- Total: ~150-250ms additional latency
For most applications, this is negligible compared to LLM response times (1-10 seconds).
Scan Result Caching
Proxy mode automatically caches scan results for identical prompts (30-minute TTL), reducing latency on repeated security scans.
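A cache like this can be pictured as a map from prompt hash to scan result with a 30-minute TTL. A minimal Python sketch of the concept, not LockLLM's internals:

```python
import hashlib
import time

class ScanCache:
    """Minimal sketch of scan-result caching keyed by a prompt hash, with a
    30-minute TTL as described above."""
    def __init__(self, ttl=1800):
        self.ttl = ttl
        self._store = {}

    def _key(self, prompt: str) -> str:
        return hashlib.sha256(prompt.encode()).hexdigest()

    def get(self, prompt, now=None):
        now = time.time() if now is None else now
        entry = self._store.get(self._key(prompt))
        if entry and now - entry[0] <= self.ttl:
            return entry[1]
        return None  # miss or expired

    def put(self, prompt, result, now=None):
        now = time.time() if now is None else now
        self._store[self._key(prompt)] = (now, result)

cache = ScanCache()
cache.put("What is 2+2?", {"safe": True}, now=0)
print(cache.get("What is 2+2?", now=100))   # {'safe': True} - hit within TTL
print(cache.get("What is 2+2?", now=4000))  # None - expired after 1800s
```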
Response Caching
LockLLM automatically caches AI responses. When you make the same request multiple times, you get instant responses without paying for duplicate API calls.
Benefits:
- Save money on duplicate requests
- Faster responses from cache
- Works automatically, no setup needed
Example:
const response = await openai.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: 'What is 2+2?' }]
})
// First time: Charged normal rate
// Ask again: Instant response, no charge
If you ask the same question 10 times, you only pay for the first request. The other 9 are free and instant.
Note: Response caching is automatically disabled for streaming requests (stream: true). Streaming responses are always served fresh from the provider. Non-streaming requests are cached by default.
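One way to picture how "the same request" is recognized is a deterministic cache key over the model and messages. A Python sketch under that assumption; the actual key LockLLM uses is not documented here:

```python
import hashlib
import json

def cache_key(model: str, messages: list) -> str:
    """Illustrative: hash a canonical JSON serialization of the request so
    that logically identical requests map to the same cache entry."""
    canonical = json.dumps({"model": model, "messages": messages},
                           sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

a = cache_key("gpt-4", [{"role": "user", "content": "What is 2+2?"}])
b = cache_key("gpt-4", [{"content": "What is 2+2?", "role": "user"}])
print(a == b)  # True - key order inside messages doesn't matter
```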
Custom cache duration:
const openai = new OpenAI({
apiKey: process.env.LOCKLLM_API_KEY,
baseURL: 'https://api.lockllm.com/v1/proxy/openai',
defaultHeaders: {
'x-lockllm-cache-ttl': '7200' // Cache for 2 hours (default: 3600, max: 86400)
}
})
Disable caching if you always need fresh responses:
const openai = new OpenAI({
apiKey: process.env.LOCKLLM_API_KEY,
baseURL: 'https://api.lockllm.com/v1/proxy/openai',
defaultHeaders: {
'x-lockllm-cache-response': 'false'
}
})
Troubleshooting
"Unauthorized" Error (401)
Problem: Your LockLLM API key is missing or invalid.
Solution:
- Verify you're passing your LockLLM API key (not a provider key) in the apiKey parameter
- Check that your LockLLM API key is valid
- Ensure you haven't revoked or deleted the API key in your dashboard
"No [provider] API key configured" Error (400)
Problem: You haven't added the provider's API key to the dashboard, or it's disabled.
Error message example:
{
"error": {
"message": "No openai API key configured. Please add your API key at the dashboard.",
"type": "lockllm_config_error",
"code": "no_upstream_key"
}
}
Solution:
- Go to your LockLLM dashboard → Proxy Settings
- Click "Add API Key" and select your provider (e.g., OpenAI)
- Enter your provider API key and save
- Ensure the key is enabled (toggle should be on)
"Could not extract prompt from request" Error (400)
Problem: The request body format is not recognized.
Solution:
- Ensure you're using a supported API format for your provider
- Check that your request has the correct structure (e.g., a messages array for OpenAI/Anthropic)
- Verify the SDK version is compatible
Requests Not Being Scanned
Problem: Requests bypass the proxy or fail silently.
Solution:
- Verify you're using the correct base URL: https://api.lockllm.com/v1/proxy/{provider}
- Check that you added the provider key to the dashboard
- Ensure the provider key is enabled (not disabled)
- Confirm you're passing your LockLLM API key for authentication
Azure-Specific Errors
Problem: Azure requests fail with "azure_config_error".
Solution:
- Verify endpoint URL format: https://your-resource.openai.azure.com (no trailing slash)
- Check deployment name matches your Azure deployment exactly
- Ensure API version is compatible (default: 2024-10-21)
- Confirm you added all required fields in the dashboard:
- Azure API key
- Endpoint URL
- Deployment name
Rate Limit Exceeded Error (429)
Problem: Too many requests in a short time.
Error message:
{
"error": {
"message": "Rate limit exceeded. Please try again later.",
"type": "rate_limit_error"
}
}
Solution:
- Rate limits are tier-based (Tier 1: 300 req/min, up to Tier 10: 200,000 req/min)
- Implement exponential backoff in your application
- Upgrade your tier for higher limits - view tier benefits
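Exponential backoff, as suggested above, can be implemented generically. A self-contained Python sketch; `RateLimitError` is a stand-in for whatever your SDK raises on HTTP 429:

```python
import random
import time

class RateLimitError(Exception):
    """Stand-in for whatever your SDK raises on HTTP 429."""

def with_backoff(call, max_attempts=5, base=0.5, cap=30.0):
    """Exponential backoff with jitter: retry on RateLimitError, doubling
    the delay each attempt, and re-raise once attempts are exhausted."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise
            delay = min(cap, base * (2 ** attempt))
            time.sleep(delay + random.uniform(0, delay / 2))

# Demo: fails twice with 429, succeeds on the third try
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RateLimitError("429")
    return "ok"

result = with_backoff(flaky, base=0.01)  # small base so the demo is quick
print(result)  # ok
```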
Insufficient Credits Error (402)
Problem: Your LockLLM credit balance is too low.
Error message:
{
"error": {
"message": "Insufficient credit balance",
"type": "lockllm_balance_error",
"code": "insufficient_balance"
}
}
Solution:
- Add credits in your LockLLM dashboard
- Switch to BYOK mode to avoid credit requirements for LLM usage
- Free monthly tier credits may cover detection fees for most users
Credits Service Unavailable (503)
Problem: The LockLLM credits service is temporarily unavailable.
Error message:
{
"error": {
"message": "LockLLM credits are temporarily unavailable. Please try again later or use a provider-specific endpoint with your own API key.",
"type": "lockllm_service_error",
"code": "credits_unavailable"
}
}
Solution:
- This is a temporary service issue - retry after a short delay
- Switch to BYOK mode (provider-specific endpoints) as a fallback
- Implement retry logic with exponential backoff
- If the issue persists, contact [email protected]
Invalid Provider for Credits Mode (400)
Problem: You are trying to use a non-OpenRouter provider with the universal endpoint (LockLLM credits mode).
Error message:
{
"error": {
"message": "LockLLM credits mode only supports OpenRouter-compatible models. Please use the provider-specific endpoint with a BYOK key instead.",
"type": "lockllm_config_error",
"code": "invalid_provider_for_credits_mode"
}
}
Solution:
- If using the universal endpoint (/v1/proxy/chat/completions), use OpenRouter-compatible model names (e.g., openai/gpt-4, anthropic/claude-3-opus)
- For other providers, use the provider-specific endpoint (e.g., /v1/proxy/openai) with a BYOK key configured in the dashboard
- Add your provider API key in the dashboard if you haven't already
Upstream Provider Error (502)
Problem: The upstream AI provider (OpenAI, Anthropic, etc.) failed to process the request.
Error message:
{
"error": {
"message": "Failed to forward request to provider",
"type": "upstream_error"
}
}
Solution:
- Check the upstream provider's status page for outages
- Verify your provider API key is valid and has sufficient quota
- Retry the request after a short delay
- If using Azure, verify your endpoint URL and deployment name are correct
- If the issue persists, try a different model or provider
FAQ
Does the proxy block requests by default?
No. By default, the proxy uses allow_with_warning mode for all security scans. This means:
- Threats are detected and flagged with warnings in the response
- Requests are NOT blocked - they proceed to your LLM provider
- You receive threat information in response headers and body
- No disruption to your application's normal operation
To enable blocking, explicitly set headers:
defaultHeaders: {
'x-lockllm-scan-action': 'block', // Block prompt injection
'x-lockllm-policy-action': 'block' // Block policy violations
}
This design lets you test and monitor security threats in production without breaking user experience, then enable blocking when you're ready.
Is the proxy free?
Partially. LockLLM uses a usage-based pricing model where you only pay for security detections and routing optimization:
- Safe prompts: FREE (no charge when passing security checks)
- Security detections: $0.0001-$0.0002 per detection (only when threats found)
- Routing fees: 5% of cost savings (only when routing saves you money)
- LLM usage (BYOK): FREE (you pay your provider directly)
- LLM usage (non-BYOK): Variable via LockLLM credits
All users receive free monthly credits based on their tier (1-10). You may never pay anything if your usage stays within free tier limits and all prompts are safe.
What is BYOK (Bring Your Own Key)?
BYOK means you use your own API keys from OpenAI, Anthropic, etc. You add them to the LockLLM dashboard, and LockLLM proxies requests using your keys. You maintain full control over your keys, and billing stays with your provider account.
With BYOK:
- LLM usage: FREE (you pay provider directly)
- Security detections: $0.0001-$0.0002 per detection
- Routing fees: 5% of savings (when enabled)
Without BYOK (universal endpoint):
- Everything billed via LockLLM credits
- Use OpenRouter for 200+ models
- No provider API keys needed
How does authentication work?
There are two types of API keys:
- Provider API Key (OpenAI, Anthropic, etc.): You add this to the LockLLM dashboard once. It's stored encrypted and never put in your code.
- LockLLM API Key: You pass this in your SDK configuration. It authenticates your requests and tells the proxy which provider keys to use.
In your code, you only use your LockLLM API key. The proxy handles retrieving and using your provider keys securely.
Are my provider API keys secure?
Yes! Your API keys are:
- Encrypted at rest using industry-grade encryption
- Stored securely in our database
- Never exposed in API responses or logs
- Never transmitted to your application
- Completely inaccessible to anyone outside the proxy service
Which providers are supported?
17 providers with full support:
- OpenAI, Anthropic, Google Gemini, Cohere
- Azure OpenAI, AWS Bedrock, Google Vertex AI
- OpenRouter, Perplexity, Mistral AI
- Groq, DeepSeek, Together AI
- xAI (Grok), Fireworks AI, Anyscale
- Hugging Face
All providers support custom endpoint URLs for self-hosted or alternative endpoints.
How do I configure Azure OpenAI?
Azure OpenAI requires additional configuration:
- In the dashboard, select Azure OpenAI as the provider
- Enter:
  - API key: Your Azure OpenAI key (or Microsoft Entra ID token)
  - Endpoint URL: https://your-resource.openai.azure.com
  - Deployment name: Your Azure deployment name (e.g., gpt-4)
  - API version (optional): Defaults to 2024-10-21
- In your code:
const openai = new OpenAI({
apiKey: process.env.LOCKLLM_API_KEY,
baseURL: 'https://api.lockllm.com/v1/proxy/azure'
})
LockLLM supports both Azure API formats (legacy deployment-based and new v1 API).
Does this add latency to my requests?
Yes, approximately 150-250ms for scanning:
- Scanning: ~100-200ms
- Network overhead: ~50ms
This is minimal compared to typical LLM response times (1-10+ seconds) and provides critical security protection. The proxy caches scan results for identical prompts to reduce latency on repeated requests.
Will this work with official SDKs?
Yes! Proxy mode works seamlessly with official SDKs:
- OpenAI SDK (Node.js, Python): Just change the baseURL parameter
- Anthropic SDK (Node.js, Python): Just change the baseURL parameter
- Other provider SDKs: Works with any SDK that supports custom base URLs
The proxy is fully compatible with all SDK features including streaming, function calling, and multi-modal inputs.
Can I use multiple keys for the same provider?
Yes! You can add multiple keys for the same provider with different nicknames (e.g., "Production" and "Development"). However, only one key per provider can be enabled at a time.
To switch between keys, enable/disable them in the dashboard. The proxy automatically uses the enabled key for each provider.
What happens when a malicious prompt is detected?
It depends on your configuration:
Default behavior (allow_with_warning):
- The threat is detected and flagged with warning headers (X-LockLLM-Scan-Warning, X-LockLLM-Injection-Score, etc.)
- The request is still forwarded to your LLM provider
- Your application can read the warning headers to decide how to handle it
- The event is logged in your dashboard
With blocking enabled (x-lockllm-scan-action: block):
- The request is blocked immediately (not forwarded to provider)
- Returns a 400 Bad Request error with scan details and request ID
- The event is logged in your dashboard
- Can trigger webhook notifications if configured
You can catch blocked requests in your code and handle them appropriately (e.g., alert security team, log incident, show user error message).
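Catching a blocked request amounts to recognizing the structured error body the proxy returns. A Python sketch that parses it; the field names follow the error examples in this document, and the handler logic is illustrative:

```python
import json

def handle_proxy_error(status: int, body: str):
    """Sketch: recognize a LockLLM block response (HTTP 400 with a
    structured error body) and extract what an incident handler needs."""
    if status != 400:
        return None
    err = json.loads(body).get("error", {})
    if err.get("type", "").startswith("lockllm"):
        # e.g. alert the security team, log the incident, show a user message
        return {"request_id": err.get("request_id"),
                "reason": err.get("message")}
    return None

blocked = handle_proxy_error(400, json.dumps({
    "error": {"message": "Request blocked due to abuse detection",
              "type": "lockllm_abuse_error",
              "request_id": "req_abc123"}}))
print(blocked["request_id"])  # req_abc123
```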
Do you log or store my prompts?
No. We do not store prompt content. We only log:
- Metadata (timestamp, model, provider, request ID)
- Scan results (safe/malicious, confidence scores)
- Prompt length (character count)
Prompt content is scanned in memory and immediately discarded. This ensures privacy while providing security monitoring.
How does the detection work?
The proxy scans every request and assigns confidence scores:
- Injection score: 0 (definitely safe) to 100 (definitely malicious)
- By default, threats are flagged with warnings but requests are still forwarded (allow_with_warning)
- Enable x-lockllm-scan-action: block to block malicious prompts instead of forwarding them
- Safe prompts are always forwarded to your provider
The detection system is tuned to balance security and minimize false positives. Contact [email protected] if you need custom detection settings for your use case.
Can I test the proxy without adding my real API keys?
Yes! You can test with:
- Universal endpoint: Use /v1/proxy/chat/completions with LockLLM credits (no provider keys needed)
- OpenAI-compatible test endpoints: Use a local LLM server or test API
- Custom endpoints: Point to your staging/test environments
- Provider test keys: Use provider-issued test/sandbox keys
Just add the test configuration in the dashboard and point your SDK to the proxy.
What are custom content policies?
Custom policies let you enforce your own content restrictions beyond built-in security. For example:
- Block medical or legal advice
- Prevent competitor mentions
- Enforce brand guidelines
- Meet compliance requirements (HIPAA, GDPR, etc.)
Create policies in the dashboard, then set the x-lockllm-scan-mode header to combined to check both security threats and custom policies. Each policy can be up to 10,000 characters and describe exactly what should be blocked.
How does smart routing work?
Routing analyzes your prompt to determine:
- Task type: What the user is asking for (code generation, summarization, chatbot, etc.)
- Complexity: How difficult the task is (low/medium/high)
- Optimal model: Which model provides best quality/cost ratio
For example, a simple "What is 2+2?" might route from GPT-4 to GPT-3.5 (faster, cheaper, same quality). A complex code generation task stays on GPT-4 (quality matters).
You only pay routing fees (5% of savings) when the router actually saves you money by selecting a cheaper model.
What is AI abuse detection?
Abuse detection protects you from malicious end-users trying to:
- Send bot-generated or automated requests
- Spam your endpoints with repetitive prompts
- Exhaust resources with oversized inputs
- Overwhelm your quota with burst requests
Enable it by adding X-LockLLM-Abuse-Action header. It's optional (opt-in) and designed for production applications where end-users directly interact with your AI.
What is PII detection?
PII (Personally Identifiable Information) detection scans prompts for sensitive personal data before forwarding to your LLM provider. It detects 17 entity types including names, emails, phone numbers, SSNs, credit cards, and addresses.
Enable it with the X-LockLLM-PII-Action header:
- allow_with_warning: Detect PII, include in response headers, forward request
- block: Reject requests containing personal information (403 error)
- strip: Replace PII with [TYPE] placeholders before forwarding
PII detection is opt-in (disabled by default) and costs $0.0001 per detection (only when PII is found). It works alongside all other features (scanning, routing, abuse detection).
Can I prevent personal data from reaching my LLM?
Yes! Use x-lockllm-pii-action: strip to automatically redact personal information before it reaches your LLM provider. Detected entities are replaced with type placeholders (e.g., [EMAIL], [GIVENNAME]). This ensures your LLM never sees actual personal data while still understanding the context of the request.
What is prompt compression?
Prompt compression reduces the token count of your prompts before they reach your LLM provider, helping you save on API costs. Three methods are available:
- TOON (Token-Oriented Object Notation): Converts JSON data to a compact, token-efficient format. Free, instant, JSON-only.
- Compact: Uses advanced ML-based token classification to compress any text. $0.0001 per use, up to 5 seconds.
- Combined: Applies TOON first, then Compact on the result for maximum compression. $0.0001 per use. Best for maximum token reduction.
Enable compression with the X-LockLLM-Compression header set to toon, compact, or combined. Compression is disabled by default and applied only after all security checks pass. Learn more about Prompt Compression.
Does compression affect response quality?
TOON produces a compact notation that LLMs understand effectively - the data structure and values are fully preserved. Compact uses ML-based classification to preserve meaning while removing redundant tokens. If quality is a concern, use a higher compression rate (closer to 0.7) with the Compact method via the X-LockLLM-Compression-Rate header.
Do you charge for every request?
No! We only charge when providing value:
- Safe prompts: FREE (no security detection fee)
- Unsafe prompts: $0.0001-$0.0002 (detected threat)
- PII detected: $0.0001 (personal information found)
- Prompt compression (TOON): FREE
- Prompt compression (Compact): $0.0001 per use
- Routing to cheaper model: 5% of savings (saved you money)
- Routing to same/more expensive model: FREE (no savings)
- BYOK LLM usage: FREE (you pay provider directly)
If all your prompts are safe, contain no PII, compression is disabled, and routing is disabled, you pay nothing for security scanning.
How do I control my costs?
1. Use BYOK (Bring Your Own Key):
- No LLM usage charges from LockLLM
- Only pay for detections and routing
2. Monitor your dashboard:
- Track detection rates
- View routing savings
3. Take advantage of tier credits:
- Higher tiers get more free credits
- Increase monthly spending to unlock higher tiers
4. Disable optional features if not needed:
- Set x-lockllm-route-action: disabled to skip routing fees (if you don't want cost optimization)
- Use x-lockllm-scan-mode: normal instead of combined if you don't need custom policy checks
- Only add the x-lockllm-pii-action header for inputs that may contain personal data (PII detection is disabled by default)
- Use TOON compression (x-lockllm-compression: toon) for free token savings on JSON data
- Use Compact compression selectively, as it costs $0.0001 per use
Note: Abuse detection (x-lockllm-abuse-action) is FREE and has no cost impact. PII detection (x-lockllm-pii-action) costs $0.0001 only when PII is actually found - no charge when prompts contain no personal information. TOON compression is FREE with no cost impact.
What happens if I run out of credits?
If you're using LockLLM credits (non-BYOK mode) and run out:
- Proxy returns 402 Payment Required error
- Add credits in the dashboard to resume service
- Consider switching to BYOK to avoid credit requirements for LLM usage
If you're using BYOK mode:
- LLM usage never consumes LockLLM credits (you pay provider directly)
- You only need credits for security detections and routing
- Free monthly tier credits often cover detection fees for most users