Proxy Mode

Automatically scan every LLM request with custom policies, injection and abuse detection, PII redaction, prompt compression, and smart routing. Zero code changes required.

What is Proxy Mode?

Proxy Mode is LockLLM's automatic security solution that scans all your LLM requests without changing your code. Simply change your base URL and either use your own API keys (BYOK) or LockLLM credits - that's it!

Available in two modes:

  • BYOK Mode: Use your own provider API keys (OpenAI, Anthropic, etc.) - you only pay LockLLM for security detections and routing optimization
  • Universal Mode: Use LockLLM credits for everything via OpenRouter (200+ models) - no provider keys needed

Key benefits:

  • Zero code changes required
  • Works with official SDKs (OpenAI, Anthropic, etc.)
  • Automatic scanning of all requests
  • Supports 17+ LLM providers with custom endpoint support
  • Your API keys are securely stored
  • Pay only for security detections and smart routing
  • Free tier with monthly credits

How It Works

BYOK Mode (recommended for production):

  1. Add your provider API key (OpenAI, Anthropic, etc.) to the LockLLM dashboard
  2. Change your base URL to https://api.lockllm.com/v1/proxy/{provider}
  3. All requests are automatically scanned before being forwarded to the provider
  4. By default, threats are detected with warnings but NOT blocked (use headers to enable blocking)
  5. You pay the provider directly for LLM usage; LockLLM charges only for security

Universal Mode (no provider keys needed):

  1. Change your base URL to https://api.lockllm.com/v1/proxy/chat/completions
  2. All requests are scanned and forwarded via OpenRouter (200+ models)
  3. Everything billed via LockLLM credits
Your App → LockLLM Proxy (scan + route) → Provider (OpenAI/Anthropic/etc.)

  ✓ Safe: Forward
  ⚠ Malicious (default): Forward with warning
  ✗ Malicious (block mode): Block request

Pricing

LockLLM uses a transparent, usage-based pricing model. You pay only when a scan detects an actual threat or when smart routing saves you money.

What You Pay For

1. Security Detection Fees (charged only when threats are found):

  • Safe prompts: FREE - No charge when prompt passes security checks
  • Unsafe core scan (prompt injection detected): $0.0001 per detection
  • Policy violation (custom policy triggered): $0.0001 per detection
  • Both unsafe (injection + policy violation): $0.0002 per detection
  • PII detected (personal information found): $0.0001 per detection
  • Prompt compression (TOON): FREE
  • Prompt compression (Compact): $0.0001 per use
  • Maximum per request (injection + policy + PII + compact compression): $0.0004

2. Smart Routing Fees (optional, charged only when saving you money):

  • When routing to a cheaper model: 5% of cost savings
  • When routing to a more expensive or same-cost model: FREE
  • When routing is disabled: FREE

3. LLM API Usage:

  • BYOK (Bring Your Own Key): FREE - You pay your provider directly
  • Non-BYOK (Universal Endpoint): Variable cost based on model usage via LockLLM credits
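The fee schedule above can be sketched as a small calculator. This is illustrative only, not a LockLLM API; the dollar amounts simply mirror the published price list.

```javascript
// Illustrative fee calculator based on the price list above.
// Not an official LockLLM API; the numbers mirror the docs.
const FEES = {
  injection: 0.0001,        // unsafe core scan
  policy: 0.0001,           // custom policy violation
  pii: 0.0001,              // PII detected
  compactCompression: 0.0001, // Compact compression use (TOON is free)
}

function lockllmCost({ injection = false, policy = false, pii = false,
                       compact = false, routingSavings = 0 } = {}) {
  let cost = 0
  if (injection) cost += FEES.injection
  if (policy) cost += FEES.policy
  if (pii) cost += FEES.pii
  if (compact) cost += FEES.compactCompression
  // Smart routing fee: 5% of savings, charged only when the router
  // actually saved you money (savings > 0); otherwise free.
  if (routingSavings > 0) cost += 0.05 * routingSavings
  return cost
}

// A safe prompt with routing disabled costs $0.
// The documented worst case (injection + policy + PII + compact) is $0.0004.
```

For example, `lockllmCost({ injection: true, routingSavings: 10 })` reproduces Scenario 3 below: one unsafe prompt plus 5% of $10 in savings.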

BYOK vs Non-BYOK

BYOK Mode (Provider-Specific Endpoints):

  • Use your own API keys for OpenAI, Anthropic, Gemini, etc.
  • You pay your provider directly for LLM usage
  • LockLLM charges only for security detections and routing optimization
  • Example: https://api.lockllm.com/v1/proxy/openai

Non-BYOK Mode (Universal Endpoint):

  • Use LockLLM credits for LLM usage (200+ models)
  • Billed for security detections, routing optimization, and LLM usage
  • No need to configure provider API keys
  • Example: https://api.lockllm.com/v1/proxy/chat/completions

Free Monthly Credits

All users receive free monthly credits based on their tier (1-10). Higher tiers unlock more free credits and higher rate limits. Learn more about the tier system.

Example Cost Breakdown

Scenario 1: BYOK user with 100 safe requests

  • Security scanning: $0 (all prompts safe)
  • Routing fees: $0 (not using routing)
  • LLM usage: $5 (paid directly to OpenAI)
  • Total LockLLM cost: $0

Scenario 2: BYOK user with 2 unsafe prompts detected

  • Security scanning: $0.0002 (2 × $0.0001)
  • Routing fees: $0
  • LLM usage: Blocked (assumes block mode is enabled; by default unsafe prompts are forwarded with a warning)
  • Total LockLLM cost: $0.0002

Scenario 3: BYOK user with routing enabled (saved $10 in costs)

  • Security scanning: $0.0001 (1 unsafe prompt)
  • Routing fees: $0.50 (5% of $10 savings)
  • LLM usage: $40 (paid directly to provider after routing)
  • Total LockLLM cost: $0.5001 (but you saved $9.50 overall!)

Scenario 4: Non-BYOK user (universal endpoint)

  • Security scanning: $0.0001
  • Routing fees: $0.50
  • LLM usage: $15 (via LockLLM credits)
  • Total LockLLM cost: $15.5001

Supported Providers

LockLLM proxy mode supports 17 providers, each with its own proxy URL:

  • OpenAI: https://api.lockllm.com/v1/proxy/openai
  • Anthropic: https://api.lockllm.com/v1/proxy/anthropic
  • Google Gemini: https://api.lockllm.com/v1/proxy/gemini
  • Cohere: https://api.lockllm.com/v1/proxy/cohere
  • OpenRouter: https://api.lockllm.com/v1/proxy/openrouter
  • Perplexity: https://api.lockllm.com/v1/proxy/perplexity
  • Mistral AI: https://api.lockllm.com/v1/proxy/mistral
  • Groq: https://api.lockllm.com/v1/proxy/groq
  • DeepSeek: https://api.lockllm.com/v1/proxy/deepseek
  • Together AI: https://api.lockllm.com/v1/proxy/together
  • xAI (Grok): https://api.lockllm.com/v1/proxy/xai
  • Fireworks AI: https://api.lockllm.com/v1/proxy/fireworks
  • Anyscale: https://api.lockllm.com/v1/proxy/anyscale
  • Hugging Face: https://api.lockllm.com/v1/proxy/huggingface
  • Azure OpenAI: https://api.lockllm.com/v1/proxy/azure
  • AWS Bedrock: https://api.lockllm.com/v1/proxy/bedrock
  • Google Vertex AI: https://api.lockllm.com/v1/proxy/vertex-ai

Custom Endpoints

All providers support custom endpoint URLs for self-hosted models, alternative endpoints, or proxy/gateway services. When adding your API key in the dashboard, you can optionally specify a custom endpoint URL to override the default.

Universal Endpoint (No BYOK Required)

Don't want to configure provider API keys? Use the universal endpoint with LockLLM credits:

Endpoint: https://api.lockllm.com/v1/proxy/chat/completions

Features:

  • Access 200+ models
  • No provider API keys needed
  • No BYOK configuration required
  • Everything billed via LockLLM credits
  • Same security scanning and routing features

You can browse all supported models and their IDs in the Model List page in your dashboard. When making requests, you must use the exact model ID shown there (e.g., openai/gpt-4).

Example:

const OpenAI = require('openai')

const client = new OpenAI({
  apiKey: process.env.LOCKLLM_API_KEY,  // Your LockLLM API key only
  baseURL: 'https://api.lockllm.com/v1/proxy/chat/completions'
})

// Use the model ID from the Model List page
const response = await client.chat.completions.create({
  model: 'openai/gpt-4',  // Must match the model ID exactly
  messages: [{ role: 'user', content: userPrompt }]
})

When to use:

  • Testing without provider API keys
  • Access to multiple providers without separate keys
  • Simplified billing (one invoice from LockLLM)
  • Rapid prototyping

When to use BYOK instead:

  • Production applications (lower costs)
  • You already have provider API keys
  • Want direct provider billing
  • Need provider-specific features

Quick Start

Step 1: Add Your Provider API Key to Dashboard

  1. Sign in to your LockLLM dashboard
  2. Navigate to Proxy Settings (or API Keys)
  3. Click Add API Key
  4. Select your provider (e.g., OpenAI, Anthropic, Azure)
  5. Enter your provider API key (from OpenAI, Anthropic, etc.)
  6. Give it a nickname (optional, e.g., "Production Key")
  7. Click Add API Key

Step 2: Get Your LockLLM API Key

  1. In the LockLLM dashboard, go to API Keys section
  2. Copy your LockLLM API key
  3. You'll use this to authenticate proxy requests (not your provider key!)

Step 3: Update Your Code

Change your SDK configuration to use LockLLM's proxy. Important: Pass your LockLLM API key (not your provider key) for authentication.

OpenAI SDK (JavaScript/TypeScript)

const OpenAI = require('openai')

const openai = new OpenAI({
  apiKey: process.env.LOCKLLM_API_KEY,  // Your LockLLM API key (...)
  baseURL: 'https://api.lockllm.com/v1/proxy/openai'
})

// All requests automatically scanned with default allow_with_warning behavior
// Threats are detected and warnings added to response, but requests are NOT blocked
const response = await openai.chat.completions.create({
  model: 'gpt-4',
  messages: [{ role: 'user', content: userPrompt }]
})

Anthropic SDK (JavaScript/TypeScript)

const Anthropic = require('@anthropic-ai/sdk')

const anthropic = new Anthropic({
  apiKey: process.env.LOCKLLM_API_KEY,  // Your LockLLM API key
  baseURL: 'https://api.lockllm.com/v1/proxy/anthropic'
})

// Automatically scanned!
const response = await anthropic.messages.create({
  model: 'claude-3-opus-20240229',
  max_tokens: 1024,
  messages: [{ role: 'user', content: userPrompt }]
})

Python OpenAI

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get('LOCKLLM_API_KEY'),  # Your LockLLM API key
    base_url='https://api.lockllm.com/v1/proxy/openai'
)

# Automatically scanned!
response = client.chat.completions.create(
    model='gpt-4',
    messages=[{'role': 'user', 'content': user_prompt}]
)

Python Anthropic

import os
from anthropic import Anthropic

client = Anthropic(
    api_key=os.environ.get('LOCKLLM_API_KEY'),  # Your LockLLM API key (...)
    base_url='https://api.lockllm.com/v1/proxy/anthropic'
)

# Automatically scanned!
response = client.messages.create(
    model='claude-3-opus-20240229',
    max_tokens=1024,
    messages=[{'role': 'user', 'content': user_prompt}]
)

How Authentication Works

Important: Understanding the two types of API keys:

  1. Provider API Key (OpenAI, Anthropic, etc.):

    • You add this to the LockLLM dashboard once
    • Stored securely and encrypted
    • Never put this in your code when using the proxy
  2. LockLLM API Key:

    • You pass this in your SDK configuration (apiKey parameter)
    • Authenticates your requests to the LockLLM proxy
    • This is what goes in your code

Provider-Specific Configuration

Azure OpenAI

Azure requires additional configuration:

Dashboard Setup:

  1. Select Azure OpenAI as provider
  2. Enter your Azure OpenAI API key
  3. Enter your Endpoint URL (e.g., https://your-resource.openai.azure.com)
  4. Enter your Deployment Name (e.g., gpt-4)
  5. Enter API Version (e.g., 2024-10-21) - optional, defaults to latest

Code:

const OpenAI = require('openai')

const client = new OpenAI({
  apiKey: process.env.LOCKLLM_API_KEY,  // Your LockLLM API key (NOT Azure key)
  baseURL: 'https://api.lockllm.com/v1/proxy/azure'
})

// Use Azure OpenAI models
const response = await client.chat.completions.create({
  model: 'gpt-4',  // Uses your configured deployment
  messages: [{ role: 'user', content: userPrompt }]
})

Azure API Format Support:

LockLLM supports both Azure OpenAI API formats:

  1. Legacy format (deployment-based): /openai/deployments/{deployment}/chat/completions?api-version=2024-10-21
  2. v1 API format (preview): /openai/v1/chat/completions?api-version=2024-10-21 with deployment in header

The proxy automatically handles both formats. When you configure your deployment name in the dashboard, it's used for all requests.
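For reference, the two path shapes can be written out explicitly. How the proxy joins these paths onto its /v1/proxy/azure base is an assumption in this sketch; in practice the SDK configuration shown above is all you need.

```javascript
// Illustrative sketch of the two Azure OpenAI path formats described above.
// The helpers only build the path suffixes; exact proxy routing is handled
// by LockLLM automatically.
const API_VERSION = '2024-10-21' // example version from the docs

// 1. Legacy format: deployment name embedded in the path
function legacyAzurePath(deployment) {
  return `/openai/deployments/${deployment}/chat/completions?api-version=${API_VERSION}`
}

// 2. v1 API format (preview): deployment supplied in a header instead
function v1AzurePath() {
  return `/openai/v1/chat/completions?api-version=${API_VERSION}`
}
```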

AWS Bedrock

Bedrock requires AWS credentials:

Dashboard Setup:

  1. Select AWS Bedrock as provider
  2. Enter your AWS credentials JSON:
{
  "accessKeyId": "your-access-key",
  "secretAccessKey": "your-secret-key",
  "region": "us-east-1"
}

Code:

// Use with Bedrock-compatible SDK or direct fetch
const response = await fetch('https://api.lockllm.com/v1/proxy/bedrock/model/anthropic.claude-3-sonnet-20240229-v1:0/invoke', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer YOUR_LOCKLLM_API_KEY'  // Your LockLLM API key
  },
  body: JSON.stringify({
    anthropic_version: 'bedrock-2023-05-31',
    max_tokens: 1024,
    messages: [{ role: 'user', content: userPrompt }]
  })
})

Google Vertex AI

Vertex AI requires service account credentials:

Dashboard Setup:

  1. Select Vertex AI as provider
  2. Enter your Google Cloud project ID
  3. Enter your service account JSON key

Code:

// Use with Vertex AI-compatible SDK or direct fetch
const response = await fetch('https://api.lockllm.com/v1/proxy/vertex-ai/v1/projects/YOUR_PROJECT/locations/us-central1/publishers/anthropic/models/claude-3-opus@20240229:streamRawPredict', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer YOUR_LOCKLLM_API_KEY'  // Your LockLLM API key
  },
  body: JSON.stringify({
    anthropic_version: 'vertex-2023-10-16',
    max_tokens: 1024,
    messages: [{ role: 'user', content: userPrompt }]
  })
})

Multiple Keys for Same Provider

You can add multiple API keys for the same provider with different nicknames:

Dashboard Setup:

  1. Add multiple keys for the same provider (e.g., "OpenAI - Production" and "OpenAI - Development")
  2. Give each a unique nickname
  3. Only one can be enabled at a time for each provider

Code:

// The proxy uses the enabled key for the provider
const openai = new OpenAI({
  apiKey: process.env.LOCKLLM_API_KEY,  // Your LockLLM API key
  baseURL: 'https://api.lockllm.com/v1/proxy/openai'
})

// Whichever OpenAI key is enabled in your dashboard will be used

Note: The proxy automatically selects the enabled key for each provider. To switch between keys, enable/disable them in the dashboard.

Managing Provider Keys

View Your Keys

Visit the Proxy Settings page in your dashboard to see all configured provider keys:

  • Provider name
  • Nickname
  • Last used timestamp
  • Enable/disable toggle
  • Delete option

Enable/Disable Keys

Toggle keys on/off without deleting them:

  1. Go to Proxy Settings
  2. Find your key
  3. Click the toggle switch
  4. Disabled keys won't be used for proxying

Delete Keys

Permanently remove provider keys:

  1. Go to Proxy Settings
  2. Find your key
  3. Click Delete
  4. Confirm deletion

Security

Your Provider API Keys Are Secure

  • Provider API keys (OpenAI, Anthropic, etc.) are encrypted at rest using industry-standard encryption
  • Keys are stored securely in our database and never exposed in API responses
  • Only you can view or manage your keys through the dashboard
  • Keys are never logged or included in error messages

Request Authentication

Standard authentication (recommended):

const openai = new OpenAI({
  apiKey: process.env.LOCKLLM_API_KEY,  // Your LockLLM API key (...)
  baseURL: 'https://api.lockllm.com/v1/proxy/openai'
})

Alternative: Using Authorization header directly:

If you need to pass the LockLLM API key separately (e.g., for custom implementations):

const openai = new OpenAI({
  apiKey: 'dummy-key',  // SDK requires a value, but proxy ignores this
  baseURL: 'https://api.lockllm.com/v1/proxy/openai',
  defaultHeaders: {
    'Authorization': 'Bearer YOUR_LOCKLLM_API_KEY'
  }
})

The proxy checks both the apiKey parameter and the Authorization header for your LockLLM API key.

Blocked Requests

Default Behavior: By default, the proxy uses allow_with_warning for security scans and policy checks. This means malicious prompts are allowed but flagged with warnings in the response. Requests are only blocked if you explicitly set the action headers to block.

When blocking is enabled (via X-LockLLM-Scan-Action: block), the proxy returns a 400 Bad Request error with the following structure:

{
  "error": {
    "message": "Malicious prompt detected by LockLLM",
    "type": "lockllm_security_error",
    "code": "prompt_injection_detected",
    "scan_result": {
      "safe": false,
      "label": 1,
      "confidence": 95,
      "injection": 95,
      "sensitivity": "medium"
    },
    "request_id": "req_abc123"
  }
}

Response Headers:

  • X-Request-Id: Unique request identifier
  • X-LockLLM-Blocked: "true" (indicates request was blocked)

Handle blocked requests in your application:

try {
  const response = await openai.chat.completions.create({
    model: 'gpt-4',
    messages: [{ role: 'user', content: userPrompt }]
  })
} catch (error) {
  // Check if error is from LockLLM security block
  if (error.response?.status === 400 && error.response?.data?.error?.code === 'prompt_injection_detected') {
    console.log('Malicious prompt blocked by LockLLM')
    const scanResult = error.response.data.error.scan_result
    console.log('Injection confidence:', scanResult.injection)
    console.log('Request ID:', error.response.data.error.request_id)

    // Handle security incident (log, alert, etc.)
    // You can find this request in your LockLLM dashboard logs
  } else {
    // Handle other errors
    throw error
  }
}

When PII blocking is enabled (via X-LockLLM-PII-Action: block), the proxy returns a 403 Forbidden error:

{
  "error": {
    "message": "Request blocked due to personal information detected",
    "type": "lockllm_pii_error",
    "code": "pii_detected",
    "pii_details": {
      "entity_types": ["Email", "Phone Number"],
      "entity_count": 3
    },
    "request_id": "req_abc123"
  }
}
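Both block responses can be handled with the same pattern. The helper below is hypothetical (not part of any SDK); it simply classifies an error by the status codes and `code` fields documented above, so your catch block can branch on the result.

```javascript
// Hypothetical helper: classify a blocked request using the response
// shapes documented above (400 injection block, 403 PII block).
function classifyLockLLMBlock(status, body) {
  const code = body?.error?.code
  if (status === 400 && code === 'prompt_injection_detected') {
    return { kind: 'injection', scan: body.error.scan_result }
  }
  if (status === 403 && code === 'pii_detected') {
    return { kind: 'pii', entities: body.error.pii_details?.entity_types ?? [] }
  }
  return { kind: 'other' } // not a LockLLM block; rethrow upstream
}
```

In a catch block you would feed it `error.response?.status` and `error.response?.data`, then log the `request_id` or alert on the returned `kind`.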

Successful requests return the normal provider response (OpenAI, Anthropic, etc.) with additional LockLLM headers:

Standard Headers (always included):

  • X-Request-Id: Unique request identifier
  • X-LockLLM-Scanned: "true" - confirms request was scanned
  • X-LockLLM-Safe: "true" or "false" - scan result
  • X-Scan-Mode: Scan mode used for this request
  • X-LockLLM-Model: Model used for the request
  • X-LockLLM-Provider: Provider used
  • X-LockLLM-Credits-Mode: "lockllm_credits" or "byok"
  • X-LockLLM-Sensitivity: Sensitivity level used for scanning

Credits Mode Headers (included when using LockLLM credits/universal endpoint):

  • X-LockLLM-Credits-Reserved: Amount of credits reserved for this request (in USD). Actual usage may differ; unused credits are refunded automatically.

Warning Headers (included when threats detected with allow_with_warning):

  • X-LockLLM-Scan-Warning: "true" if core security threat detected
  • X-LockLLM-Injection-Score: Injection score (0-100)
  • X-LockLLM-Confidence: Detection confidence (0-100)
  • X-LockLLM-Label: "0" for safe, "1" for malicious
  • X-LockLLM-Scan-Detail: Base64-encoded JSON with full scan detail (message, injection score, confidence, label, sensitivity)
  • X-LockLLM-Policy-Warnings: "true" if policy violations found
  • X-LockLLM-Warning-Count: Number of policy warnings
  • X-LockLLM-Warning-Detail: Base64-encoded JSON with first policy warning detail (policy name, violated categories, violation details)
  • X-LockLLM-Policy-Confidence: Policy check confidence score (0-100), included for combined and policy_only scan modes
  • X-LockLLM-Abuse-Detected: "true" if abuse was detected (only when x-lockllm-abuse-action: allow_with_warning)
  • X-LockLLM-Abuse-Confidence: Abuse confidence score (0-100)
  • X-LockLLM-Abuse-Types: Comma-separated list of detected abuse types (e.g., "bot_generated,rapid_requests")
  • X-LockLLM-Abuse-Detail: Base64-encoded JSON with full abuse detail (confidence, abuse types, indicators, recommendation)

Routing Headers (included when routing is enabled):

  • X-LockLLM-Route-Enabled: "true" if routing was active
  • X-LockLLM-Task-Type: Detected task classification
  • X-LockLLM-Complexity: Complexity score (0.0-1.0)
  • X-LockLLM-Selected-Model: Model chosen by router
  • X-LockLLM-Routing-Reason: Explanation for model selection
  • X-LockLLM-Original-Model: Original model before routing (only if changed)
  • X-LockLLM-Original-Provider: Original provider before routing (only if changed)
  • X-LockLLM-Estimated-Original-Cost: Estimated cost with original model
  • X-LockLLM-Estimated-Routed-Cost: Estimated cost with routed model
  • X-LockLLM-Estimated-Savings: Cost savings from routing
  • X-LockLLM-Estimated-Input-Tokens: Estimated input token count used for cost calculation
  • X-LockLLM-Estimated-Output-Tokens: Estimated output token count used for cost calculation
  • X-LockLLM-Routing-Fee-Reserved: Routing fee reserved upfront (in USD, 6 decimal places)
  • X-LockLLM-Routing-Fee-Reason: Reason no routing fee was charged (e.g., "routing_to_more_expensive_model")

Cache Headers (included for response caching):

  • X-LockLLM-Cache-Status: "HIT" or "MISS"
  • X-LockLLM-Cache-Age: Cache entry age in seconds
  • X-LockLLM-Tokens-Saved: Tokens saved from cache hit
  • X-LockLLM-Cost-Saved: Cost saved from cache hit

PII Headers (included when PII detection is enabled):

  • X-LockLLM-PII-Detected: "true" or "false" - always included when x-lockllm-pii-action is set
  • X-LockLLM-PII-Types: Comma-separated list of detected entity types (e.g., "Email,Phone Number") - only included when PII is found
  • X-LockLLM-PII-Count: Number of PII entities found - only included when PII is found
  • X-LockLLM-PII-Action: The PII action that was applied (strip, block, or allow_with_warning) - only included when PII is found

Compression Headers (included when prompt compression is enabled):

  • X-LockLLM-Compression-Method: Compression method used: "toon", "compact", or "combined"
  • X-LockLLM-Compression-Applied: "true" or "false" - whether compression was successfully applied
  • X-LockLLM-Compression-Ratio: Compression ratio (only included when compression was applied)
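When you call the proxy with fetch (or read raw responses from your HTTP client), these headers can be summarized directly. The helper below is a sketch, not part of any SDK; the header names come from the tables above.

```javascript
// Sketch: summarize LockLLM response headers from a fetch() Response.
// Header names are taken from the tables above; the helper itself is
// illustrative, not part of any SDK.
function lockllmSummary(headers) {
  const get = (name) => headers.get(name) // Headers.get is case-insensitive
  return {
    requestId: get('X-Request-Id'),
    scanned: get('X-LockLLM-Scanned') === 'true',
    safe: get('X-LockLLM-Safe') === 'true',
    warning: get('X-LockLLM-Scan-Warning') === 'true',
    injectionScore: get('X-LockLLM-Injection-Score')
      ? Number(get('X-LockLLM-Injection-Score'))
      : null,
    cache: get('X-LockLLM-Cache-Status'), // "HIT", "MISS", or null if absent
  }
}
```

Usage: `const res = await fetch(proxyUrl, opts); const info = lockllmSummary(res.headers)`.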

Detection Settings

Scan Results

Every request is scanned and returns a confidence score indicating whether the prompt is safe or potentially malicious.

Scan result format:

{
  "safe": true,
  "label": 0,         // 0 = safe, 1 = malicious
  "confidence": 92,   // Confidence in the prediction (0-100)
  "injection": 8,     // Injection score (0-100, higher = more likely malicious)
  "sensitivity": "medium"
}

The proxy scans all requests and provides confidence scores. By default, threats are flagged with warnings but requests are still forwarded. Enable blocking via the x-lockllm-scan-action: block header.
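The warning headers carry this same scan detail as base64-encoded JSON (see X-LockLLM-Scan-Detail above). A minimal decoder, assuming the decoded fields match the scan result format shown here:

```javascript
// Decode the base64-encoded JSON carried in X-LockLLM-Scan-Detail.
// Assumes the decoded fields match the scan result format shown above.
function decodeScanDetail(headerValue) {
  if (!headerValue) return null // header absent: no warning was attached
  return JSON.parse(Buffer.from(headerValue, 'base64').toString('utf8'))
}
```

Usage: `const detail = decodeScanDetail(res.headers.get('X-LockLLM-Scan-Detail'))`.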

Learn more about Threat Detection

Scan Modes

Control what gets scanned using the x-lockllm-scan-mode header:

1. Normal Mode

  • Scans only for core security threats (prompt injection, jailbreaks, etc.)
  • No custom policy checks
  • Use when you only need basic security protection

2. Policy-Only Mode

  • Skips core security scan
  • Checks only custom content policies
  • Use when you want to enforce content guidelines without injection detection

3. Combined Mode (default)

  • Scans for both core security threats AND custom policy violations
  • Most comprehensive protection
  • Use for production applications with strict content requirements

Example:

const openai = new OpenAI({
  apiKey: process.env.LOCKLLM_API_KEY,
  baseURL: 'https://api.lockllm.com/v1/proxy/openai',
  defaultHeaders: {
    'x-lockllm-scan-mode': 'combined'  // Enable both core scan + custom policies
  }
})

const response = await openai.chat.completions.create({
  model: 'gpt-4',
  messages: [{ role: 'user', content: userPrompt }]
})

Action Headers

Fine-grained control over how threats are handled using request headers.

Default Behavior (No Headers): When you don't configure any headers, the proxy uses safe defaults:

  • Core security scan: allow_with_warning (scans but doesn't block)
  • Policy violations: allow_with_warning (detects but doesn't block)
  • Abuse detection: disabled (opt-in only)
  • PII detection: disabled (opt-in only)
  • Prompt compression: disabled (opt-in only)
  • Routing: disabled (uses your original model choice)

This means by default, the proxy scans for threats and adds warnings to responses without blocking requests.

Available Headers:

x-lockllm-scan-mode (controls what gets scanned):

  • normal: Core security threats only
  • policy_only: Custom policies only
  • combined (default): Both core threats and policies

x-lockllm-sensitivity (controls detection strictness):

  • low: Fewer false positives, more permissive
  • medium (default): Balanced approach
  • high: Maximum security, most aggressive detection

x-lockllm-scan-action (controls core injection scan behavior):

  • allow_with_warning (default): Allow request, add warning to response
  • block: Block malicious requests with 400 error

x-lockllm-policy-action (controls custom policy violation behavior):

  • allow_with_warning (default): Allow request, add policy warnings to response
  • block: Block policy violations with 403 error

x-lockllm-abuse-action (controls AI abuse detection):

  • Not set (default): Skip abuse detection entirely
  • allow_with_warning: Detect abuse, add warnings to response
  • block: Block abusive requests with 400 error

x-lockllm-route-action (controls smart routing):

  • disabled (default): No routing, use original model
  • auto: Automatic routing based on AI task classification
  • custom: Use user-defined routing rules from dashboard

x-lockllm-pii-action (controls PII detection and redaction):

  • Not set (default): Skip PII detection entirely
  • allow_with_warning: Detect PII, add results to response headers
  • block: Block requests containing personal information (403 error)
  • strip: Replace detected PII with [TYPE] placeholders before forwarding to provider

x-lockllm-compression (controls prompt compression):

  • Not set (default): Skip compression entirely
  • toon: Compress JSON data using TOON format (free, JSON-only, instant)
  • compact: Compress any text using ML-based compression ($0.0001 per use)
  • combined: Apply TOON first then Compact for maximum compression ($0.0001 per use)

x-lockllm-compression-rate (controls compact/combined compression aggressiveness):

  • Default: 0.5 (balanced)
  • Range: 0.3 (aggressive) to 0.7 (conservative)
  • Only applies when using compact or combined method
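One reasonable way to pick a method per request, given the pricing above (TOON is free but JSON-only; Compact costs $0.0001 and handles any text). This heuristic is an assumption for illustration, not LockLLM behavior:

```javascript
// Sketch: choose compression headers per request.
// TOON is free but only compresses JSON; Compact handles any text
// at $0.0001 per use. This selection logic is illustrative only.
function compressionHeaders(prompt, { allowPaid = false } = {}) {
  let isJson = false
  try { JSON.parse(prompt); isJson = true } catch {}
  if (isJson) {
    return { 'x-lockllm-compression': 'toon' } // free, instant
  }
  if (allowPaid) {
    return {
      'x-lockllm-compression': 'compact',
      'x-lockllm-compression-rate': '0.5', // balanced (0.3 aggressive to 0.7 conservative)
    }
  }
  return {} // skip compression entirely (the default)
}
```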

x-lockllm-cache-response (controls response caching):

  • true (default): Enable response caching for identical requests
  • false: Disable response caching (always get fresh responses)

x-lockllm-cache-ttl (controls response cache duration, in seconds):

  • Default: 3600 (1 hour)
  • Maximum: 86400 (24 hours)
  • Only applies when response caching is enabled
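The cache headers above can be built from a desired TTL, clamping to the documented 86400-second maximum. A small sketch (the clamping behavior on the server side is an assumption; out-of-range values may simply be rejected):

```javascript
// Sketch: build response-cache headers from a desired TTL, clamping to
// the documented maximum of 86400 seconds (24 hours). Server-side
// handling of out-of-range values is an assumption here.
function cacheHeaders(ttlSeconds) {
  const clamped = Math.min(Math.max(ttlSeconds, 0), 86400)
  return {
    'x-lockllm-cache-response': 'true',
    'x-lockllm-cache-ttl': String(clamped),
  }
}
```

Usage: spread the result into `defaultHeaders`, e.g. `defaultHeaders: { ...cacheHeaders(7200) }`.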

Example:

const openai = new OpenAI({
  apiKey: process.env.LOCKLLM_API_KEY,
  baseURL: 'https://api.lockllm.com/v1/proxy/openai',
  defaultHeaders: {
    'x-lockllm-scan-mode': 'combined',              // Scan both core + policies
    'x-lockllm-sensitivity': 'high',                // Maximum security
    'x-lockllm-scan-action': 'block',               // Block injection attacks
    'x-lockllm-policy-action': 'block',             // Block policy violations
    'x-lockllm-abuse-action': 'allow_with_warning', // Detect abuse, don't block
    'x-lockllm-route-action': 'auto',               // Enable smart routing
    'x-lockllm-pii-action': 'strip',                // Redact personal information
    'x-lockllm-compression': 'compact'               // Compress prompts for token savings
  }
})

Custom Content Policies

Create your own content policies beyond the built-in security categories. Custom policies let you enforce brand-specific guidelines, compliance requirements, or content restrictions.

Setting Up Custom Policies

  1. Go to Dashboard → Policies
  2. Click Create Policy
  3. Enter a policy name (e.g., "No Medical Advice")
  4. Write a description (up to 10,000 characters) defining what should be blocked
  5. Enable the policy
  6. Set the x-lockllm-scan-mode header to combined or policy_only in your requests

Example:

// Dashboard policy: "No Financial Advice"
// Description: "Block requests asking for investment advice, stock tips, or financial planning guidance"

// In your code:
const openai = new OpenAI({
  apiKey: process.env.LOCKLLM_API_KEY,
  baseURL: 'https://api.lockllm.com/v1/proxy/openai',
  defaultHeaders: {
    'x-lockllm-scan-mode': 'combined'  // Checks both core security + custom policies
  }
})

const response = await openai.chat.completions.create({
  model: 'gpt-4',
  messages: [{ role: 'user', content: userPrompt }]
})

Policy Violations Response

When a custom policy is violated:

{
  "error": {
    "message": "Request blocked by custom policy",
    "type": "lockllm_policy_error",
    "code": "policy_violation",
    "violated_policies": [
      {
        "policy_name": "No Financial Advice",
        "violated_categories": [
          { "name": "Investment Guidance" }
        ],
        "violation_details": "User requested stock recommendations"
      }
    ],
    "request_id": "req_abc123"
  }
}
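To act on this response in application code, you can pull out the violated policy names. The helper below is hypothetical; it only relies on the `policy_violation` body shape shown above.

```javascript
// Hypothetical helper: extract violated policy names from the
// lockllm_policy_error body shown above.
function violatedPolicies(body) {
  if (body?.error?.code !== 'policy_violation') return []
  return (body.error.violated_policies ?? []).map((p) => p.policy_name)
}
```

For example, feeding it the response above yields `['No Financial Advice']`, which you might log alongside the `request_id` for auditing.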

Learn more about Custom Policies

Smart Routing

Automatically route requests to the optimal model based on task complexity and type. Save costs by using cheaper models for simple tasks while maintaining quality for complex ones.

How Routing Works

  1. Task Classification: AI analyzes your prompt and determines the task type (e.g., "Code Generation", "Summarization", "Open QA")
  2. Complexity Analysis: Assigns a complexity score (0-1) and tier (low/medium/high)
  3. Model Selection: Chooses the best model based on task requirements and your routing rules
  4. Execution: Routes request to selected model (uses your BYOK key or LockLLM credits)

Routing Modes

Auto Routing (X-LockLLM-Route-Action: auto):

  • AI-powered task classification and complexity analysis
  • Automatically selects optimal model based on predefined routing logic
  • Routes high-complexity tasks to advanced models (Claude Sonnet, GPT-4)
  • Routes low-complexity tasks to efficient models for cost savings

Custom Routing (X-LockLLM-Route-Action: custom):

  • Uses your own routing rules defined in the dashboard
  • Configure rules by task type and complexity tier
  • Specify target model and whether to use BYOK
  • Falls back to auto routing if no matching rule found

Setting Up Routing

Enable Auto Routing:

const openai = new OpenAI({
  apiKey: process.env.LOCKLLM_API_KEY,
  baseURL: 'https://api.lockllm.com/v1/proxy/openai',
  defaultHeaders: {
    'x-lockllm-route-action': 'auto'
  }
})

// Example: User requests GPT-4, but prompt is simple
// Router detects low complexity and routes to GPT-3.5
// You save money, get fast response, maintain quality

Custom Routing Rules (Dashboard):

  1. Go to Dashboard → Routing
  2. Click Create Rule
  3. Select task type (Code Generation, Summarization, etc.)
  4. Select complexity tier (low, medium, high)
  5. Choose target model
  6. Enable BYOK or use LockLLM credits
  7. Save rule

Example Rule:

  • Task Type: Code Generation
  • Complexity: High
  • Target Model: claude-3-7-sonnet
  • Use BYOK: Yes (uses your Anthropic key)

Link to section: Routing FeesRouting Fees

  • Routing to cheaper model: 5% of cost savings
  • Routing to more expensive or same-cost model: FREE
  • When routing is disabled: FREE

You only pay routing fees when the router actually saves you money!
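
The fee rule above can be sketched as a small function (costs in dollars; the helper is illustrative — actual fees are computed server-side by LockLLM):

```javascript
// Routing fee per the rules above: 5% of savings, only when the
// routed model is cheaper than the originally requested one.
function routingFee(originalCost, routedCost) {
  const savings = originalCost - routedCost;
  return savings > 0 ? 0.05 * savings : 0;
}

// routingFee(0.03, 0.01) → 0.001  (you keep $0.019 of the $0.02 saved)
// routingFee(0.01, 0.03) → 0      (routing to a pricier model is free)
```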

Link to section: Routing MetadataRouting Metadata

All routed requests include metadata in response headers:

  • X-LockLLM-Task-Type: Detected task classification
  • X-LockLLM-Complexity: Complexity score (0.0-1.0)
  • X-LockLLM-Selected-Model: Model chosen by router
  • X-LockLLM-Routing-Reason: Explanation for selection
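
These headers can be read from the raw response. A sketch, assuming the OpenAI Node SDK's `.withResponse()` accessor for raw headers; the `readRoutingMetadata` helper is ours, not part of LockLLM:

```javascript
// Collect the routing metadata headers listed above into one object.
// Accepts either a Fetch Headers instance or a plain lowercase-keyed object.
function readRoutingMetadata(headers) {
  const get = (name) => (headers.get ? headers.get(name) : headers[name.toLowerCase()]);
  return {
    taskType: get('X-LockLLM-Task-Type'),
    complexity: parseFloat(get('X-LockLLM-Complexity') ?? 'NaN'),
    selectedModel: get('X-LockLLM-Selected-Model'),
    reason: get('X-LockLLM-Routing-Reason'),
  };
}

// Usage with the OpenAI Node SDK:
// const { data, response } = await openai.chat.completions
//   .create({ model: 'gpt-4', messages }).withResponse();
// const meta = readRoutingMetadata(response.headers);
```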

Learn more about Smart Routing

Link to section: AI Abuse DetectionAI Abuse Detection

Protect your application from end-users abusing your AI endpoints with automated requests, bot-generated prompts, or resource exhaustion attacks.

Link to section: What Gets DetectedWhat Gets Detected

Content Analysis:

  • Bot-generated content: Template-like structures, excessive special characters
  • Excessive repetition: Character, word, and phrase-level repetition
  • Resource exhaustion: Extremely long prompts, deep nesting, oversized inputs

Pattern Analysis (behavioral):

  • Rapid requests: Unusual request frequency from single API key
  • Duplicate prompts: Identical prompts within short time window
  • Burst detection: >50% of requests concentrated in last 30 seconds

Link to section: Enabling Abuse DetectionEnabling Abuse Detection

Add the X-LockLLM-Abuse-Action header to enable abuse detection:

const openai = new OpenAI({
  apiKey: process.env.LOCKLLM_API_KEY,
  baseURL: 'https://api.lockllm.com/v1/proxy/openai',
  defaultHeaders: {
    'x-lockllm-abuse-action': 'block'  // or 'allow_with_warning'
  }
})

Link to section: Abuse Detection ResponseAbuse Detection Response

When abuse is detected with block action:

{
  "error": {
    "message": "Request blocked due to abuse detection",
    "type": "lockllm_abuse_error",
    "code": "abuse_detected",
    "abuse_details": {
      "confidence": 87,
      "abuse_types": ["bot_generated", "rapid_requests"],
      "indicators": {
        "bot_score": 95,
        "repetition_score": 45,
        "resource_score": 30,
        "pattern_score": 80
      },
      "details": {
        "recommendation": "Implement rate limiting or CAPTCHA for this user"
      }
    },
    "request_id": "req_abc123"
  }
}

With the allow_with_warning action, the request proceeds and abuse warnings are added to the response.
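
The error body above can be unpacked with a small helper once your SDK surfaces it (the OpenAI Node SDK exposes the JSON error body on thrown APIError instances; `summarizeAbuseBlock` is an illustrative name, not a LockLLM API):

```javascript
// Extract the useful fields from the abuse-block error body shown above.
// Returns null for any other kind of error body.
function summarizeAbuseBlock(body) {
  const e = body?.error;
  if (e?.code !== 'abuse_detected') return null;
  return {
    confidence: e.abuse_details.confidence,
    types: e.abuse_details.abuse_types,
    recommendation: e.abuse_details.details?.recommendation ?? null,
  };
}

// try {
//   await openai.chat.completions.create({ model: 'gpt-4', messages })
// } catch (err) {
//   const summary = summarizeAbuseBlock({ error: err.error });
//   if (summary) console.warn(`Abuse blocked (${summary.confidence}%):`, summary.types);
// }
```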

Learn more about Abuse Detection

Link to section: PII Detection & RedactionPII Detection & Redaction

Automatically detect and protect personal information in prompts before they reach your LLM provider. When enabled, LockLLM scans for names, email addresses, phone numbers, Social Security numbers, credit card numbers, and 12 other entity types.

Link to section: Supported Entity TypesSupported Entity Types

LockLLM detects 17 types of personal information:

  • Identity: First Name, Last Name, Date of Birth, Username
  • Contact: Email, Phone Number, Street Address, City, Zip Code, Building Number
  • Financial: Credit Card, Account Number, Tax ID
  • Government IDs: Social Security Number, Driver's License, ID Card Number
  • Security: Password

Link to section: Enabling PII DetectionEnabling PII Detection

Add the X-LockLLM-PII-Action header to enable PII detection:

const openai = new OpenAI({
  apiKey: process.env.LOCKLLM_API_KEY,
  baseURL: 'https://api.lockllm.com/v1/proxy/openai',
  defaultHeaders: {
    'x-lockllm-pii-action': 'strip'  // or 'block' or 'allow_with_warning'
  }
})

Link to section: PII ActionsPII Actions

allow_with_warning - Detect PII and add metadata to response headers, but forward the original request:

  • Response includes X-LockLLM-PII-Detected, X-LockLLM-PII-Types, X-LockLLM-PII-Count headers
  • Your application can read these headers to decide how to handle PII

block - Block requests containing personal information:

  • Returns 403 error with details of detected entity types
  • Request is NOT forwarded to your LLM provider
  • Prevents personal data from reaching the model

strip (recommended for privacy) - Automatically redact PII before forwarding:

  • Detected entities are replaced with [TYPE] placeholders (e.g., [EMAIL], [GIVENNAME])
  • The redacted request is forwarded to your LLM provider
  • Your LLM never sees the actual personal information
  • Response headers indicate what was redacted
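
Because the PII action is just a header, it can also be set per request instead of once per client. A sketch, assuming the OpenAI Node SDK's request-scoped `{ headers }` option; the end-user/internal split in `piiActionFor` is purely illustrative:

```javascript
// Illustrative policy: stricter PII handling for untrusted, user-typed input,
// warning-only mode for trusted internal prompts.
function piiActionFor(source) {
  return source === 'end_user' ? 'strip' : 'allow_with_warning';
}

// Usage with the OpenAI Node SDK (request-scoped headers):
// await openai.chat.completions.create(
//   { model: 'gpt-4', messages: [{ role: 'user', content: userPrompt }] },
//   { headers: { 'x-lockllm-pii-action': piiActionFor('end_user') } }
// )
```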

Link to section: Example: Stripping PIIExample: Stripping PII

const openai = new OpenAI({
  apiKey: process.env.LOCKLLM_API_KEY,
  baseURL: 'https://api.lockllm.com/v1/proxy/openai',
  defaultHeaders: {
    'x-lockllm-pii-action': 'strip'
  }
})

// User sends: "Contact John Smith at [email protected] or 555-123-4567"
// LLM receives: "Contact [GIVENNAME] [SURNAME] at [EMAIL] or [TELEPHONENUM]"
const response = await openai.chat.completions.create({
  model: 'gpt-4',
  messages: [{ role: 'user', content: userPrompt }]
})

// Check PII headers in raw response:
// X-LockLLM-PII-Detected: true
// X-LockLLM-PII-Types: First Name,Last Name,Email,Phone Number
// X-LockLLM-PII-Count: 4
// X-LockLLM-PII-Action: strip

Link to section: PII Detection PricingPII Detection Pricing

  • PII not detected: FREE
  • PII detected: $0.0001 per detection
  • PII detection is opt-in and disabled by default

Learn more about PII Detection

Link to section: Prompt CompressionPrompt Compression

Reduce token usage and AI costs by compressing prompts before they are forwarded to your LLM provider. Compression happens after security scanning, so your prompts are always scanned in their original form.

Link to section: Compression MethodsCompression Methods

TOON (Token-Oriented Object Notation)

  • Converts JSON data to a compact, token-efficient format
  • Removes redundant syntax (braces, quotes) while remaining LLM-readable
  • Works only on valid JSON input (returns original text for non-JSON)
  • FREE - no additional cost
  • Instant, local transformation

Compact (ML-based)

  • Advanced token classification that removes unnecessary tokens from any text
  • Works on natural language, code, structured data, and mixed content
  • Configurable compression rate (0.3-0.7)
  • Costs $0.0001 per use
  • 5-second timeout with fail-open behavior

Link to section: Enabling Prompt CompressionEnabling Prompt Compression

Add the X-LockLLM-Compression header to enable compression:

const openai = new OpenAI({
  apiKey: process.env.LOCKLLM_API_KEY,
  baseURL: 'https://api.lockllm.com/v1/proxy/openai',
  defaultHeaders: {
    'x-lockllm-compression': 'compact',
    'x-lockllm-compression-rate': '0.5'
  }
})

// Prompts are compressed before reaching OpenAI
// You save tokens on upstream API calls
const response = await openai.chat.completions.create({
  model: 'gpt-4',
  messages: [{ role: 'user', content: longDocument }]
})

// Check compression headers in raw response:
// X-LockLLM-Compression-Method: compact
// X-LockLLM-Compression-Applied: true
// X-LockLLM-Compression-Ratio: 0.4500

Link to section: Example: TOON for JSON DataExample: TOON for JSON Data

const openai = new OpenAI({
  apiKey: process.env.LOCKLLM_API_KEY,
  baseURL: 'https://api.lockllm.com/v1/proxy/openai',
  defaultHeaders: {
    'x-lockllm-compression': 'toon'  // Free JSON compression
  }
})

// JSON data in prompt is automatically compressed to TOON format
// Example: {"users": [{"name": "Alice", "age": 30}]} becomes compact notation
const response = await openai.chat.completions.create({
  model: 'gpt-4',
  messages: [{ role: 'user', content: JSON.stringify(jsonData) }]
})

Link to section: Example: Compact with PythonExample: Compact with Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get('LOCKLLM_API_KEY'),
    base_url='https://api.lockllm.com/v1/proxy/openai',
    default_headers={
        'X-LockLLM-Compression': 'compact',
        'X-LockLLM-Compression-Rate': '0.5'
    }
)

response = client.chat.completions.create(
    model='gpt-4',
    messages=[{'role': 'user', 'content': long_document}]
)

Link to section: Prompt Compression PricingPrompt Compression Pricing

  • TOON: FREE
  • Compact: $0.0001 per use
  • Compression is opt-in and disabled by default

Learn more about Prompt Compression in the dedicated guide

Link to section: MonitoringMonitoring

Link to section: View Proxy LogsView Proxy Logs

All proxy requests are logged in your dashboard:

  1. Go to Activity Logs
  2. Filter by Proxy Requests
  3. See scan results, providers, models used
  4. View blocked requests

Link to section: Webhook NotificationsWebhook Notifications

Get notified when malicious prompts are detected:

  1. Go to Webhooks
  2. Add a webhook URL
  3. Choose format (raw, Slack, Discord)
  4. Receive alerts for blocked requests

Learn more about Webhooks

Link to section: PerformancePerformance

Link to section: Does Proxy Mode Add Latency?Does Proxy Mode Add Latency?

Minimal latency is added:

  • Scanning: ~100-200ms
  • Network overhead: ~50ms
  • Total: ~150-250ms additional latency

For most applications, this is negligible compared to LLM response times (1-10 seconds).

Link to section: Scan Result CachingScan Result Caching

Proxy mode automatically caches scan results for identical prompts (30-minute TTL), reducing latency on repeated security scans.

Link to section: Response CachingResponse Caching

LockLLM automatically caches AI responses. When you make the same request multiple times, you get instant responses without paying for duplicate API calls.

Benefits:

  • Save money on duplicate requests
  • Faster responses from cache
  • Works automatically, no setup needed

Example:

const response = await openai.chat.completions.create({
  model: 'gpt-4',
  messages: [{ role: 'user', content: 'What is 2+2?' }]
})

// First time: Charged normal rate
// Ask again: Instant response, no charge

If you ask the same question 10 times, you only pay for the first request. The other 9 are free and instant.

Note: Response caching is automatically disabled for streaming requests (stream: true). Streaming responses are always served fresh from the provider. Non-streaming requests are cached by default.

Custom cache duration:

const openai = new OpenAI({
  apiKey: process.env.LOCKLLM_API_KEY,
  baseURL: 'https://api.lockllm.com/v1/proxy/openai',
  defaultHeaders: {
    'x-lockllm-cache-ttl': '7200'  // Cache for 2 hours (default: 3600, max: 86400)
  }
})

Disable caching if you always need fresh responses:

const openai = new OpenAI({
  apiKey: process.env.LOCKLLM_API_KEY,
  baseURL: 'https://api.lockllm.com/v1/proxy/openai',
  defaultHeaders: {
    'x-lockllm-cache-response': 'false'
  }
})

Link to section: TroubleshootingTroubleshooting

Link to section: "Unauthorized" Error (401)"Unauthorized" Error (401)

Problem: Your LockLLM API key is missing or invalid.

Solution:

  1. Verify you're passing your LockLLM API key (not a provider key) in the apiKey parameter
  2. Check that your LockLLM API key is valid
  3. Ensure you haven't revoked or deleted the API key in your dashboard

Link to section: "No [provider] API key configured" Error (400)"No [provider] API key configured" Error (400)

Problem: You haven't added the provider's API key to the dashboard, or it's disabled.

Error message example:

{
  "error": {
    "message": "No openai API key configured. Please add your API key at the dashboard.",
    "type": "lockllm_config_error",
    "code": "no_upstream_key"
  }
}

Solution:

  1. Go to your LockLLM dashboard → Proxy Settings
  2. Click "Add API Key" and select your provider (e.g., OpenAI)
  3. Enter your provider API key and save
  4. Ensure the key is enabled (toggle should be on)

Link to section: "Could not extract prompt from request" Error (400)"Could not extract prompt from request" Error (400)

Problem: The request body format is not recognized.

Solution:

  1. Ensure you're using a supported API format for your provider
  2. Check that your request has the correct structure (e.g., messages array for OpenAI/Anthropic)
  3. Verify the SDK version is compatible

Link to section: Requests Not Being ScannedRequests Not Being Scanned

Problem: Requests bypass the proxy or fail silently.

Solution:

  1. Verify you're using the correct base URL: https://api.lockllm.com/v1/proxy/{provider}
  2. Check that you added the provider key to dashboard
  3. Ensure the provider key is enabled (not disabled)
  4. Confirm you're passing your LockLLM API key for authentication

Link to section: Azure-Specific ErrorsAzure-Specific Errors

Problem: Azure requests fail with "azure_config_error".

Solution:

  1. Verify endpoint URL format: https://your-resource.openai.azure.com (no trailing slash)
  2. Check deployment name matches your Azure deployment exactly
  3. Ensure API version is compatible (default: 2024-10-21)
  4. Confirm you added all required fields in the dashboard:
    • Azure API key
    • Endpoint URL
    • Deployment name

Link to section: Rate Limit Exceeded Error (429)Rate Limit Exceeded Error (429)

Problem: Too many requests in a short time.

Error message:

{
  "error": {
    "message": "Rate limit exceeded. Please try again later.",
    "type": "rate_limit_error"
  }
}

Solution:

  1. Rate limits are tier-based (Tier 1: 300 req/min, up to Tier 10: 200,000 req/min)
  2. Implement exponential backoff in your application
  3. Upgrade your tier for higher limits - view tier benefits
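
A minimal backoff sketch for step 2 (the base delay, cap, and jitter values are our assumptions, not LockLLM recommendations):

```javascript
// Exponential backoff: base * 2^attempt, capped.
function backoffDelay(attempt, baseMs = 500, capMs = 8000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Retry only on 429; rethrow anything else, or after the last attempt.
async function withRetry(fn, { maxAttempts = 5, baseMs = 500 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (err?.status !== 429 || attempt >= maxAttempts - 1) throw err;
      const jitter = Math.random() * 100; // spread retries across clients
      await new Promise((r) => setTimeout(r, backoffDelay(attempt, baseMs) + jitter));
    }
  }
}

// Usage:
// const response = await withRetry(() =>
//   openai.chat.completions.create({ model: 'gpt-4', messages })
// )
```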

Link to section: Insufficient Credits Error (402)Insufficient Credits Error (402)

Problem: Your LockLLM credit balance is too low.

Error message:

{
  "error": {
    "message": "Insufficient credit balance",
    "type": "lockllm_balance_error",
    "code": "insufficient_balance"
  }
}

Solution:

  1. Add credits in your LockLLM dashboard
  2. Switch to BYOK mode to avoid credit requirements for LLM usage
  3. Free monthly tier credits may cover detection fees for most users

Link to section: Credits Service Unavailable (503)Credits Service Unavailable (503)

Problem: The LockLLM credits service is temporarily unavailable.

Error message:

{
  "error": {
    "message": "LockLLM credits are temporarily unavailable. Please try again later or use a provider-specific endpoint with your own API key.",
    "type": "lockllm_service_error",
    "code": "credits_unavailable"
  }
}

Solution:

  1. This is a temporary service issue - retry after a short delay
  2. Switch to BYOK mode (provider-specific endpoints) as a fallback
  3. Implement retry logic with exponential backoff
  4. If the issue persists, contact [email protected]

Link to section: Invalid Provider for Credits Mode (400)Invalid Provider for Credits Mode (400)

Problem: You are trying to use a non-OpenRouter provider with the universal endpoint (LockLLM credits mode).

Error message:

{
  "error": {
    "message": "LockLLM credits mode only supports OpenRouter-compatible models. Please use the provider-specific endpoint with a BYOK key instead.",
    "type": "lockllm_config_error",
    "code": "invalid_provider_for_credits_mode"
  }
}

Solution:

  1. If using the universal endpoint (/v1/proxy/chat/completions), use OpenRouter-compatible model names (e.g., openai/gpt-4, anthropic/claude-3-opus)
  2. For other providers, use the provider-specific endpoint (e.g., /v1/proxy/openai) with a BYOK key configured in the dashboard
  3. Add your provider API key in the dashboard if you haven't already

Link to section: Upstream Provider Error (502)Upstream Provider Error (502)

Problem: The upstream AI provider (OpenAI, Anthropic, etc.) failed to process the request.

Error message:

{
  "error": {
    "message": "Failed to forward request to provider",
    "type": "upstream_error"
  }
}

Solution:

  1. Check the upstream provider's status page for outages
  2. Verify your provider API key is valid and has sufficient quota
  3. Retry the request after a short delay
  4. If using Azure, verify your endpoint URL and deployment name are correct
  5. If the issue persists, try a different model or provider

Link to section: FAQFAQ

Link to section: Does the proxy block requests by default?Does the proxy block requests by default?

No. By default, the proxy uses allow_with_warning mode for all security scans. This means:

  • Threats are detected and flagged with warnings in the response
  • Requests are NOT blocked - they proceed to your LLM provider
  • You receive threat information in response headers and body
  • No disruption to your application's normal operation

To enable blocking, explicitly set headers:

defaultHeaders: {
  'x-lockllm-scan-action': 'block',        // Block prompt injection
  'x-lockllm-policy-action': 'block'       // Block policy violations
}

This design lets you test and monitor security threats in production without breaking user experience, then enable blocking when you're ready.

Link to section: Is the proxy free?Is the proxy free?

Partially. LockLLM uses a usage-based pricing model where you only pay for security detections and routing optimization:

  • Safe prompts: FREE (no charge when passing security checks)
  • Security detections: $0.0001-$0.0002 per detection (only when threats found)
  • Routing fees: 5% of cost savings (only when routing saves you money)
  • LLM usage (BYOK): FREE (you pay your provider directly)
  • LLM usage (non-BYOK): Variable via LockLLM credits

All users receive free monthly credits based on their tier (1-10). You may never pay anything if your usage stays within free tier limits and all prompts are safe.
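
For a back-of-the-envelope estimate, the detection fee scales only with flagged traffic (the request volume and flag rate below are made-up inputs, not benchmarks):

```javascript
// Safe prompts are free; only flagged prompts incur the detection fee.
function monthlyDetectionCost(requests, flaggedRate, feePerDetection = 0.0001) {
  return requests * flaggedRate * feePerDetection;
}

// 1,000,000 requests/month with 1% flagged at $0.0001 each → about $1.00
```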

Link to section: What is BYOK (Bring Your Own Key)?What is BYOK (Bring Your Own Key)?

BYOK means you use your own API keys from OpenAI, Anthropic, etc. You add them to the LockLLM dashboard, and LockLLM proxies requests using your keys. You maintain full control over your keys, and billing stays with your provider account.

With BYOK:

  • LLM usage: FREE (you pay provider directly)
  • Security detections: $0.0001-$0.0002 per detection
  • Routing fees: 5% of savings (when enabled)

Without BYOK (universal endpoint):

  • Everything billed via LockLLM credits
  • Use OpenRouter for 200+ models
  • No provider API keys needed

Link to section: How does authentication work?How does authentication work?

There are two types of API keys:

  1. Provider API Key (OpenAI, Anthropic, etc.): You add this to the LockLLM dashboard once. It's stored encrypted and never appears in your code.

  2. LockLLM API Key: You pass this in your SDK configuration. It authenticates your requests and tells the proxy which provider keys to use.

In your code, you only use your LockLLM API key. The proxy handles retrieving and using your provider keys securely.

Link to section: Are my provider API keys secure?Are my provider API keys secure?

Yes! Your API keys are:

  • Encrypted at rest using industry-grade encryption
  • Stored securely in our database
  • Never exposed in API responses or logs
  • Never transmitted to your application
  • Completely inaccessible once stored

Link to section: Which providers are supported?Which providers are supported?

17 providers with full support:

  • OpenAI, Anthropic, Google Gemini, Cohere
  • Azure OpenAI, AWS Bedrock, Google Vertex AI
  • OpenRouter, Perplexity, Mistral AI
  • Groq, DeepSeek, Together AI
  • xAI (Grok), Fireworks AI, Anyscale
  • Hugging Face

All providers support custom endpoint URLs for self-hosted or alternative endpoints.

Link to section: How do I configure Azure OpenAI?How do I configure Azure OpenAI?

Azure OpenAI requires additional configuration:

  1. In the dashboard, select Azure OpenAI as provider

  2. Enter:

    • API key: Your Azure OpenAI key (or Microsoft Entra ID token)
    • Endpoint URL: https://your-resource.openai.azure.com
    • Deployment name: Your Azure deployment name (e.g., gpt-4)
    • API version (optional): Defaults to 2024-10-21
  3. In your code:

const openai = new OpenAI({
  apiKey: process.env.LOCKLLM_API_KEY,
  baseURL: 'https://api.lockllm.com/v1/proxy/azure'
})

LockLLM supports both Azure API formats (legacy deployment-based and new v1 API).

Link to section: Does this add latency to my requests?Does this add latency to my requests?

Yes, approximately 150-250ms for scanning:

  • Scanning: ~100-200ms
  • Network overhead: ~50ms

This is minimal compared to typical LLM response times (1-10+ seconds) and provides critical security protection. The proxy caches scan results for identical prompts to reduce latency on repeated requests.

Link to section: Will this work with official SDKs?Will this work with official SDKs?

Yes! Proxy mode works seamlessly with official SDKs:

  • OpenAI SDK (Node.js, Python): Just change baseURL parameter
  • Anthropic SDK (Node.js, Python): Just change baseURL parameter
  • Other provider SDKs: Works with any SDK that supports custom base URLs

The proxy is fully compatible with all SDK features including streaming, function calling, and multi-modal inputs.

Link to section: Can I use multiple keys for the same provider?Can I use multiple keys for the same provider?

Yes! You can add multiple keys for the same provider with different nicknames (e.g., "Production" and "Development"). However, only one key per provider can be enabled at a time.

To switch between keys, enable/disable them in the dashboard. The proxy automatically uses the enabled key for each provider.

Link to section: What happens when a malicious prompt is detected?What happens when a malicious prompt is detected?

It depends on your configuration:

Default behavior (allow_with_warning):

  1. The threat is detected and flagged with warning headers (X-LockLLM-Scan-Warning, X-LockLLM-Injection-Score, etc.)
  2. The request is still forwarded to your LLM provider
  3. Your application can read the warning headers to decide how to handle it
  4. The event is logged in your dashboard

With blocking enabled (x-lockllm-scan-action: block):

  1. The request is blocked immediately (not forwarded to provider)
  2. Returns a 400 Bad Request error with scan details and request ID
  3. The event is logged in your dashboard
  4. Can trigger webhook notifications if configured

You can catch blocked requests in your code and handle them appropriately (e.g., alert security team, log incident, show user error message).
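
One way to catch blocked requests is to branch on the proxy's documented HTTP statuses (the status-to-outcome mapping follows the Troubleshooting section above; `classifyProxyError` is an illustrative helper, not part of any SDK):

```javascript
// Map proxy HTTP statuses documented in this guide to app-level outcomes.
// A coarse sketch; real handling should also inspect the error body's `code` field.
function classifyProxyError(err) {
  switch (err?.status) {
    case 400: return 'blocked_or_invalid';    // scan block (block mode) or config error
    case 401: return 'bad_lockllm_key';
    case 402: return 'insufficient_credits';
    case 403: return 'pii_blocked';
    case 429: return 'rate_limited';
    case 502: return 'upstream_error';
    case 503: return 'credits_unavailable';
    default:  return 'other';
  }
}

// try {
//   await openai.chat.completions.create({ model: 'gpt-4', messages })
// } catch (err) {
//   if (classifyProxyError(err) === 'blocked_or_invalid') {
//     // alert your security team, log the incident, show a generic user error
//   }
//   throw err
// }
```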

Link to section: Do you log or store my prompts?Do you log or store my prompts?

No. We do not store prompt content. We only log:

  • Metadata (timestamp, model, provider, request ID)
  • Scan results (safe/malicious, confidence scores)
  • Prompt length (character count)

Prompt content is scanned in memory and immediately discarded. This ensures privacy while providing security monitoring.

Link to section: How does the detection work?How does the detection work?

The proxy scans every request and assigns confidence scores:

  • Injection score: 0 (definitely safe) to 100 (definitely malicious)
  • By default, threats are flagged with warnings but requests are still forwarded (allow_with_warning)
  • Enable x-lockllm-scan-action: block to block malicious prompts instead of forwarding them
  • Safe prompts are always forwarded to your provider

The detection system is tuned to balance security and minimize false positives. Contact [email protected] if you need custom detection settings for your use case.

Link to section: Can I test the proxy without adding my real API keys?Can I test the proxy without adding my real API keys?

Yes! You can test with:

  1. Universal endpoint: Use /v1/proxy/chat/completions with LockLLM credits (no provider keys needed)
  2. OpenAI-compatible test endpoints: Use a local LLM server or test API
  3. Custom endpoints: Point to your staging/test environments
  4. Provider test keys: Use provider-issued test/sandbox keys

Just add the test configuration in the dashboard and point your SDK to the proxy.

Link to section: What are custom content policies?What are custom content policies?

Custom policies let you enforce your own content restrictions beyond built-in security. For example:

  • Block medical or legal advice
  • Prevent competitor mentions
  • Enforce brand guidelines
  • Meet compliance requirements (HIPAA, GDPR, etc.)

Create policies in the dashboard, then set the x-lockllm-scan-mode header to combined to check both security threats and custom policies. Each policy can be up to 10,000 characters and describe exactly what should be blocked.
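
Putting the headers together, a client configured for combined scanning might look like this (a sketch; the policies themselves are created in the dashboard, and blocking is optional):

```javascript
import OpenAI from 'openai'

const openai = new OpenAI({
  apiKey: process.env.LOCKLLM_API_KEY,
  baseURL: 'https://api.lockllm.com/v1/proxy/openai',
  defaultHeaders: {
    'x-lockllm-scan-mode': 'combined',     // check security threats + custom policies
    'x-lockllm-policy-action': 'block'     // block policy violations (default is warn-only)
  }
})
```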

Link to section: How does smart routing work?How does smart routing work?

Routing analyzes your prompt to determine:

  1. Task type: What the user is asking for (code generation, summarization, chatbot, etc.)
  2. Complexity: How difficult the task is (low/medium/high)
  3. Optimal model: Which model provides best quality/cost ratio

For example, a simple "What is 2+2?" might route from GPT-4 to GPT-3.5 (faster, cheaper, same quality). A complex code generation task stays on GPT-4 (quality matters).

You only pay routing fees (5% of savings) when the router actually saves you money by selecting a cheaper model.

Link to section: What is AI abuse detection?What is AI abuse detection?

Abuse detection protects you from malicious end-users trying to:

  • Send bot-generated or automated requests
  • Spam your endpoints with repetitive prompts
  • Exhaust resources with oversized inputs
  • Overwhelm your quota with burst requests

Enable it by adding X-LockLLM-Abuse-Action header. It's optional (opt-in) and designed for production applications where end-users directly interact with your AI.

Link to section: What is PII detection?What is PII detection?

PII (Personally Identifiable Information) detection scans prompts for sensitive personal data before forwarding to your LLM provider. It detects 17 entity types including names, emails, phone numbers, SSNs, credit cards, and addresses.

Enable it with the X-LockLLM-PII-Action header:

  • allow_with_warning: Detect PII, include in response headers, forward request
  • block: Reject requests containing personal information (403 error)
  • strip: Replace PII with [TYPE] placeholders before forwarding

PII detection is opt-in (disabled by default) and costs $0.0001 per detection (only when PII is found). It works alongside all other features (scanning, routing, abuse detection).

Link to section: Can I prevent personal data from reaching my LLM?Can I prevent personal data from reaching my LLM?

Yes! Use x-lockllm-pii-action: strip to automatically redact personal information before it reaches your LLM provider. Detected entities are replaced with type placeholders (e.g., [EMAIL], [GIVENNAME]). This ensures your LLM never sees actual personal data while still understanding the context of the request.

Link to section: What is prompt compression?What is prompt compression?

Prompt compression reduces the token count of your prompts before they reach your LLM provider, helping you save on API costs. Three methods are available:

  • TOON (Token-Oriented Object Notation): Converts JSON data to a compact, token-efficient format. Free, instant, JSON-only.
  • Compact: Uses advanced ML-based token classification to compress any text. $0.0001 per use, up to 5 seconds.
  • Combined: Applies TOON first, then Compact on the result for maximum compression. $0.0001 per use. Best for maximum token reduction.

Enable compression with the X-LockLLM-Compression header set to toon, compact, or combined. Compression is disabled by default and applied only after all security checks pass. Learn more about Prompt Compression.

Link to section: Does compression affect response quality?Does compression affect response quality?

TOON produces a compact notation that LLMs understand effectively - the data structure and values are fully preserved. Compact uses ML-based classification to preserve meaning while removing redundant tokens. If quality is a concern, use a higher compression rate (closer to 0.7) with the Compact method via the X-LockLLM-Compression-Rate header.

Link to section: Do you charge for every request?Do you charge for every request?

No! We only charge when providing value:

  • Safe prompts: FREE (no security detection fee)
  • Unsafe prompts: $0.0001-$0.0002 (detected threat)
  • PII detected: $0.0001 (personal information found)
  • Prompt compression (TOON): FREE
  • Prompt compression (Compact): $0.0001 per use
  • Routing to cheaper model: 5% of savings (saved you money)
  • Routing to same/more expensive model: FREE (no savings)
  • BYOK LLM usage: FREE (you pay provider directly)

If all your prompts are safe, contain no PII, compression is disabled, and routing is disabled, you pay nothing for security scanning.

Link to section: How do I control my costs?How do I control my costs?

1. Use BYOK (Bring Your Own Key):

  • No LLM usage charges from LockLLM
  • Only pay for detections and routing

2. Monitor your dashboard:

  • Track detection rates
  • View routing savings

3. Adjust sensitivity:

  • Higher tiers get more free credits
  • Increase monthly spending to unlock higher tiers

4. Disable optional features if not needed:

  • Set x-lockllm-route-action: disabled to skip routing fees (if you don't want cost optimization)
  • Use x-lockllm-scan-mode: normal instead of combined if you don't need custom policy checks
  • Only add x-lockllm-pii-action header for inputs that may contain personal data (PII detection is disabled by default)
  • Use TOON compression (x-lockllm-compression: toon) for free token savings on JSON data
  • Use Compact compression selectively, as it costs $0.0001 per use

Note: Abuse detection (x-lockllm-abuse-action) is FREE and has no cost impact. PII detection (x-lockllm-pii-action) costs $0.0001 only when PII is actually found - no charge when prompts contain no personal information. TOON compression is FREE with no cost impact.

Link to section: What happens if I run out of credits?What happens if I run out of credits?

If you're using LockLLM credits (non-BYOK mode) and run out:

  • Proxy returns 402 Payment Required error
  • Add credits in the dashboard to resume service
  • Consider switching to BYOK to avoid credit requirements for LLM usage

If you're using BYOK mode:

  • LLM usage never consumes LockLLM credits (you pay provider directly)
  • You only need credits for security detections and routing
  • Free monthly tier credits often cover detection fees for most users
Updated 8 days ago