Python SDK
Native Python SDK for prompt security with both sync and async support. Drop-in replacements for OpenAI, Anthropic, and 17+ other providers including custom endpoints. Supports scan modes, custom policy enforcement, abuse detection, smart routing, PII detection and redaction, and response caching.
Introduction
The LockLLM Python SDK is a production-ready library that provides comprehensive AI security for your LLM applications. Built with Python type hints and designed for modern Python development, it offers both synchronous and asynchronous APIs, along with drop-in replacements for popular AI provider SDKs that add automatic prompt injection detection and jailbreak prevention.
Key features:
- Real-time security scanning with minimal latency (<250ms)
- Dual sync/async API for maximum flexibility
- Drop-in replacements for 17+ AI providers (custom endpoint support for each)
- Configurable scan modes with custom policy enforcement
- AI abuse detection (opt-in) to protect against automated misuse
- Smart routing for automatic model selection by task type and complexity
- Response caching for cost optimization (enabled by default in proxy mode)
- Universal proxy mode supporting 200+ models without provider API keys
- PII detection and redaction (names, emails, phone numbers, SSNs, credit cards, and more)
- Full type hints with mypy support
- Works with Python 3.8 through 3.12
- Streaming-compatible with all providers
- Context manager support
- Free tier available with generous limits
Use cases:
- Production LLM applications requiring security
- AI agents and autonomous systems
- Chatbots and conversational interfaces
- RAG (Retrieval Augmented Generation) systems
- Multi-tenant AI applications
- Enterprise AI deployments
- FastAPI, Django, and Flask applications
Installation
Install the SDK using your preferred package manager:
# pip
pip install lockllm
# pip3
pip3 install lockllm
# poetry
poetry add lockllm
# pipenv
pipenv install lockllm
Requirements:
- Python 3.8 or higher (supports 3.8, 3.9, 3.10, 3.11, 3.12)
- requests and httpx (installed automatically as dependencies)
- typing-extensions (installed automatically for Python < 3.11)
Optional Dependencies
For provider wrapper functions, install the relevant official SDKs:
# For OpenAI and OpenAI-compatible providers (16 providers)
pip install openai
# For Anthropic Claude
pip install anthropic
Provider SDK mapping:
- openai - OpenAI, Groq, DeepSeek, Mistral, Perplexity, OpenRouter, Together AI, xAI, Fireworks, Anyscale, Hugging Face, Gemini, Cohere, Azure, Bedrock, Vertex AI
- anthropic - Anthropic Claude
These SDKs are only required if you use the wrapper functions for those providers.
Verify Installation
from lockllm import __version__
print(__version__) # Prints the installed SDK version
Quick Start
Step 1: Get Your API Keys
- Visit lockllm.com and create a free account
- Navigate to API Keys section and copy your LockLLM API key
- Go to Proxy Settings and add your provider API keys (OpenAI, Anthropic, etc.)
Your provider keys are encrypted and stored securely. You'll only need your LockLLM API key in your code.
Step 2: Basic Usage
Choose from four integration methods: wrapper functions (easiest), direct scan API, official SDKs with custom base URL, or the universal proxy wrapper for non-BYOK usage.
Wrapper Functions (Recommended)
The simplest way to add security is to replace your SDK initialization:
Synchronous:
from lockllm import create_openai
import os
# Before:
# from openai import OpenAI
# openai = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
# After (one line change):
openai = create_openai(api_key=os.getenv("LOCKLLM_API_KEY"))
# Everything else works exactly the same
response = openai.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": user_input}]
)
print(response.choices[0].message.content)
Asynchronous:
from lockllm import create_async_openai
import os
import asyncio
async def main():
    openai = create_async_openai(api_key=os.getenv("LOCKLLM_API_KEY"))
    response = await openai.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": user_input}]
    )
    print(response.choices[0].message.content)
asyncio.run(main())
Direct Scan API
For manual control and custom workflows:
Synchronous:
from lockllm import LockLLM
import os
lockllm = LockLLM(api_key=os.getenv("LOCKLLM_API_KEY"))
# Scan user input before processing
result = lockllm.scan(
    input=user_prompt,
    sensitivity="medium"  # "low" | "medium" | "high"
)
if not result.safe:
    print("Malicious input detected!")
    print(f"Injection score: {result.injection}%")
    print(f"Confidence: {result.confidence}%")
    print(f"Request ID: {result.request_id}")
    # Handle the security incident
    return {"error": "Invalid input detected"}
# Safe to proceed
response = your_llm_call(user_prompt)
Asynchronous:
from lockllm import AsyncLockLLM
import os
import asyncio
async def main():
    lockllm = AsyncLockLLM(api_key=os.getenv("LOCKLLM_API_KEY"))
    result = await lockllm.scan(
        input=user_prompt,
        sensitivity="medium"
    )
    if not result.safe:
        print(f"Malicious prompt detected: {result.injection}%")
        return
    # Safe to proceed
    response = await your_llm_call(user_prompt)
asyncio.run(main())
See Scan Modes below for advanced scanning options including custom policy checks, abuse detection, PII detection, and smart routing.
Official SDKs with Proxy
Use official SDKs with LockLLM's proxy:
from openai import OpenAI
from lockllm import get_proxy_url
import os
client = OpenAI(
    api_key=os.getenv("LOCKLLM_API_KEY"),
    base_url=get_proxy_url('openai')
)
# Works exactly like the official SDK
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}]
)
Universal Proxy
If you haven't configured provider API keys, use the universal proxy endpoint which uses LockLLM credits. You can browse all supported models and their IDs in the Model List page in your dashboard. When making requests, you must use the exact model ID shown there (e.g., openai/gpt-4).
Using the OpenAI SDK:
from openai import OpenAI
from lockllm import get_universal_proxy_url
import os
client = OpenAI(
    api_key=os.getenv("LOCKLLM_API_KEY"),
    base_url=get_universal_proxy_url()
)
response = client.chat.completions.create(
    model="openai/gpt-4",  # Use the model ID from the Model List page
    messages=[{"role": "user", "content": "Hello!"}]
)
Using LockLLM Wrapper (Recommended):
The SDK provides dedicated wrapper functions for the universal proxy endpoint. These are the simplest way to get started without configuring provider API keys:
Synchronous:
from lockllm import create_client, ProxyOptions
import os
# No BYOK required - uses LockLLM credits
client = create_client(
    api_key=os.getenv("LOCKLLM_API_KEY"),
    proxy_options=ProxyOptions(scan_action="block")
)
response = client.chat.completions.create(
    model="openai/gpt-4",
    messages=[{"role": "user", "content": user_input}]
)
print(response.choices[0].message.content)
Asynchronous:
from lockllm import create_async_client, ProxyOptions
import os
import asyncio
async def main():
    client = create_async_client(
        api_key=os.getenv("LOCKLLM_API_KEY"),
        proxy_options=ProxyOptions(scan_action="block")
    )
    response = await client.chat.completions.create(
        model="openai/gpt-4",
        messages=[{"role": "user", "content": user_input}]
    )
    print(response.choices[0].message.content)
asyncio.run(main())
create_client() and create_async_client() default to the universal proxy URL (https://api.lockllm.com/v1/proxy). You can override this with the base_url parameter.
Custom OpenAI-Compatible Endpoint Wrapper
For custom endpoints that follow the OpenAI API format but are not one of the 17 built-in providers:
Synchronous:
from lockllm import create_openai_compatible, ProxyOptions
import os
# base_url is required for custom endpoints
client = create_openai_compatible(
    api_key=os.getenv("LOCKLLM_API_KEY"),
    base_url="https://api.lockllm.com/v1/proxy/custom",
    proxy_options=ProxyOptions(scan_action="block")
)
response = client.chat.completions.create(
    model="your-custom-model",
    messages=[{"role": "user", "content": user_input}]
)
Asynchronous:
from lockllm import create_async_openai_compatible
import os
import asyncio
async def main():
    client = create_async_openai_compatible(
        api_key=os.getenv("LOCKLLM_API_KEY"),
        base_url="https://api.lockllm.com/v1/proxy/custom"
    )
    response = await client.chat.completions.create(
        model="your-custom-model",
        messages=[{"role": "user", "content": user_input}]
    )
asyncio.run(main())
Unlike provider-specific wrappers, create_openai_compatible() requires a base_url parameter since there is no default endpoint for custom providers.
Provider Wrappers
LockLLM provides drop-in replacements for 17+ AI providers with custom endpoint support. All wrappers work identically to the official SDKs with automatic security scanning.
OpenAI (Sync)
from lockllm import create_openai
import os
openai = create_openai(api_key=os.getenv("LOCKLLM_API_KEY"))
# Chat completions
response = openai.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": user_input}
    ],
    temperature=0.7,
    max_tokens=1000
)
print(response.choices[0].message.content)
# Streaming
stream = openai.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Count from 1 to 10"}],
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end='')
# Function calling
response = openai.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "What's the weather in Boston?"}],
    functions=[{
        "name": "get_weather",
        "description": "Get the current weather in a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City name"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["location"]
        }
    }]
)
OpenAI (Async)
from lockllm import create_async_openai
import os
import asyncio
async def main():
    openai = create_async_openai(api_key=os.getenv("LOCKLLM_API_KEY"))
    # Chat completions
    response = await openai.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": user_input}]
    )
    # Async streaming
    stream = await openai.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Write a story"}],
        stream=True
    )
    async for chunk in stream:
        if chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end='')
asyncio.run(main())
Anthropic Claude (Sync)
from lockllm import create_anthropic
import os
anthropic = create_anthropic(api_key=os.getenv("LOCKLLM_API_KEY"))
# Messages API
message = anthropic.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": user_input}
    ]
)
print(message.content[0].text)
# Streaming
with anthropic.messages.stream(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a poem"}]
) as stream:
    for text in stream.text_stream:
        print(text, end='', flush=True)
Anthropic Claude (Async)
from lockllm import create_async_anthropic
import os
import asyncio
async def main():
    anthropic = create_async_anthropic(api_key=os.getenv("LOCKLLM_API_KEY"))
    # Async messages
    message = await anthropic.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,
        messages=[{"role": "user", "content": user_input}]
    )
    # Async streaming
    async with anthropic.messages.stream(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Write a poem"}]
    ) as stream:
        async for text in stream.text_stream:
            print(text, end='', flush=True)
asyncio.run(main())
Groq, DeepSeek, Perplexity
from lockllm import create_groq, create_deepseek, create_perplexity
import os
# Groq - Fast inference with Llama models
groq = create_groq(api_key=os.getenv("LOCKLLM_API_KEY"))
response = groq.chat.completions.create(
    model='llama-3.1-70b-versatile',
    messages=[{'role': 'user', 'content': user_input}]
)
# DeepSeek - Advanced reasoning models
deepseek = create_deepseek(api_key=os.getenv("LOCKLLM_API_KEY"))
response = deepseek.chat.completions.create(
    model='deepseek-chat',
    messages=[{'role': 'user', 'content': user_input}]
)
# Perplexity - Models with internet access
perplexity = create_perplexity(api_key=os.getenv("LOCKLLM_API_KEY"))
response = perplexity.chat.completions.create(
    model='llama-3.1-sonar-huge-128k-online',
    messages=[{'role': 'user', 'content': user_input}]
)
All Supported Providers
LockLLM supports 17+ providers with ready-to-use wrappers. All providers support custom endpoint URLs configured via the dashboard.
Import any wrapper function:
Synchronous wrappers:
from lockllm import (
    create_openai,       # OpenAI GPT models
    create_anthropic,    # Anthropic Claude
    create_groq,         # Groq LPU inference
    create_deepseek,     # DeepSeek models
    create_perplexity,   # Perplexity (with internet)
    create_mistral,      # Mistral AI
    create_openrouter,   # OpenRouter (multi-provider)
    create_together,     # Together AI
    create_xai,          # xAI Grok
    create_fireworks,    # Fireworks AI
    create_anyscale,     # Anyscale Endpoints
    create_huggingface,  # Hugging Face Inference
    create_gemini,       # Google Gemini
    create_cohere,       # Cohere
    create_azure,        # Azure OpenAI
    create_bedrock,      # AWS Bedrock
    create_vertex_ai     # Google Vertex AI
)
Asynchronous wrappers:
from lockllm import (
    create_async_openai,
    create_async_anthropic,
    create_async_groq,
    create_async_deepseek,
    create_async_perplexity,
    create_async_mistral,
    create_async_openrouter,
    create_async_together,
    create_async_xai,
    create_async_fireworks,
    create_async_anyscale,
    create_async_huggingface,
    create_async_gemini,
    create_async_cohere,
    create_async_azure,
    create_async_bedrock,
    create_async_vertex_ai
)
Provider compatibility:
- 16 providers use the OpenAI-compatible API (require the openai package)
- Anthropic uses its own SDK (requires anthropic)
- All providers support custom endpoint URLs via the dashboard
Configuring Proxy Options
All wrapper functions accept a proxy_options parameter to configure scanning, routing, and caching behavior for every request made through that client:
from lockllm import create_openai, ProxyOptions
import os
# Configure scanning and routing for all requests through this client
openai = create_openai(
    api_key=os.getenv("LOCKLLM_API_KEY"),
    proxy_options=ProxyOptions(
        scan_action="block",    # Block malicious requests
        policy_action="block",  # Block policy violations
        route_action="auto",    # Enable smart routing
        sensitivity="high"      # Maximum protection
    )
)
# All requests through this client use the configured options
response = openai.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": user_input}]
)
This works with all sync and async wrappers. See Proxy Options for the full list of configurable fields.
Configuration
LockLLM Client Configuration
Synchronous:
from lockllm import LockLLM, LockLLMConfig
import os
config = LockLLMConfig(
    api_key=os.getenv("LOCKLLM_API_KEY"),  # Required
    base_url="https://api.lockllm.com",    # Optional: custom endpoint
    timeout=60.0,                          # Optional: request timeout (seconds)
    max_retries=3                          # Optional: max retry attempts
)
lockllm = LockLLM(
    api_key=config.api_key,
    base_url=config.base_url,
    timeout=config.timeout,
    max_retries=config.max_retries
)
Asynchronous:
from lockllm import AsyncLockLLM
import os
lockllm = AsyncLockLLM(
    api_key=os.getenv("LOCKLLM_API_KEY"),
    base_url="https://api.lockllm.com",
    timeout=60.0,
    max_retries=3
)
Sensitivity Levels
Control detection strictness with the sensitivity parameter:
from lockllm import LockLLM
import os
lockllm = LockLLM(api_key=os.getenv("LOCKLLM_API_KEY"))
# Low sensitivity - fewer false positives
# Use for: creative applications, exploratory use cases
low_result = lockllm.scan(input=user_prompt, sensitivity="low")
# Medium sensitivity - balanced detection - DEFAULT
# Use for: general user inputs, standard applications
medium_result = lockllm.scan(input=user_prompt, sensitivity="medium")
# High sensitivity - maximum protection
# Use for: sensitive operations, admin panels, data exports
high_result = lockllm.scan(input=user_prompt, sensitivity="high")
Choosing sensitivity:
- High: Critical systems (admin, payments, sensitive data)
- Medium: General applications (default, recommended)
- Low: Creative tools (writing assistants, brainstorming)
Scan Modes
Control which security checks are performed with the scan_mode parameter:
from lockllm import LockLLM
import os
lockllm = LockLLM(api_key=os.getenv("LOCKLLM_API_KEY"))
# Normal mode - core injection detection only
result = lockllm.scan(input=user_prompt, scan_mode="normal")
# Policy-only mode - custom content policies only (skips core injection scan)
result = lockllm.scan(input=user_prompt, scan_mode="policy_only")
# Combined mode - both core injection and custom policies (default, maximum security)
result = lockllm.scan(input=user_prompt, scan_mode="combined")
Available modes:
"normal"- Core injection detection only (prompt injection, jailbreak, instruction override, etc.)"policy_only"- Custom content policies only (checks your policies configured in the dashboard)"combined"- Both core injection and custom policies (default, recommended for maximum protection)
Scan Actions
Control what happens when threats or violations are detected:
# Block mode - raises an error, request is stopped
result = lockllm.scan(
    input=user_prompt,
    scan_action="block",   # Block on core injection
    policy_action="block"  # Block on policy violations
)
# Allow with warning mode - request proceeds, warnings included in response (default)
result = lockllm.scan(
    input=user_prompt,
    scan_action="allow_with_warning",
    policy_action="allow_with_warning"
)
Available actions:
"block"- RaisesPromptInjectionErrororPolicyViolationErrorwhen detected"allow_with_warning"- Request proceeds, warning details included in the response (default)
Abuse Detection
Opt-in abuse detection protects against automated misuse and resource exhaustion:
# Enable abuse detection
result = lockllm.scan(
    input=user_prompt,
    abuse_action="block"  # Block abusive requests
)
# Or allow with warnings
result = lockllm.scan(
    input=user_prompt,
    abuse_action="allow_with_warning"  # Allow but flag abusive requests
)
Abuse detection is disabled by default. Set abuse_action to enable it.
PII Detection
Opt-in PII (Personally Identifiable Information) detection scans prompts for sensitive data before processing. When enabled, it identifies entity types such as names, email addresses, phone numbers, Social Security numbers, credit card numbers, street addresses, dates of birth, driver's license numbers, and more.
Scan API usage:
from lockllm import LockLLM
import os
lockllm = LockLLM(api_key=os.getenv("LOCKLLM_API_KEY"))
# Strip mode - replaces PII with placeholders
result = lockllm.scan(
    input="My name is John Smith and my email is [email protected]",
    pii_action="strip"
)
if result.pii_result and result.pii_result.detected:
    print(f"PII detected: {result.pii_result.entity_count} entities")
    print(f"Entity types: {', '.join(result.pii_result.entity_types)}")
    print(f"Redacted text: {result.pii_result.redacted_input}")
# Block mode - raises PIIDetectedError if PII is found
result = lockllm.scan(
    input="Call me at 555-0123",
    pii_action="block"
)
# Allow with warning mode - request proceeds, PII info included in response
result = lockllm.scan(
    input="My SSN is 123-45-6789",
    pii_action="allow_with_warning"
)
if result.pii_result and result.pii_result.detected:
    print(f"Warning: {result.pii_result.entity_count} PII entities found")
    print(f"Types: {', '.join(result.pii_result.entity_types)}")
Async usage:
from lockllm import AsyncLockLLM
import os
import asyncio
async def main():
    lockllm = AsyncLockLLM(api_key=os.getenv("LOCKLLM_API_KEY"))
    result = await lockllm.scan(
        input="My credit card is 4111-1111-1111-1111",
        pii_action="strip"
    )
    if result.pii_result and result.pii_result.detected:
        print(f"Redacted: {result.pii_result.redacted_input}")
asyncio.run(main())
Proxy mode usage:
from lockllm import create_openai, ProxyOptions
import os
openai = create_openai(
    api_key=os.getenv("LOCKLLM_API_KEY"),
    proxy_options=ProxyOptions(
        scan_action="block",
        pii_action="strip"  # Strip PII before sending to provider
    )
)
# PII is automatically stripped from the prompt before reaching the AI provider
response = openai.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": user_input}]
)
Available PII actions:
"strip"- Detects PII and replaces identified entities with[TYPE]placeholders (e.g.,[Email],[Phone Number]) before forwarding to the AI provider (proxy mode) or returns redacted text in scan response"block"- Blocks the request entirely if PII is detected, raisingPIIDetectedError"allow_with_warning"- Allows the request through but includes PII detection results in the responseNone(default) - PII detection is disabled
Supported entity types:
- Account Number
- Building Number
- City
- Credit Card Number
- Date of Birth
- Driver's License Number
- Email Address
- First Name
- Last Name
- ID Card Number
- Password
- Phone Number
- Social Security Number
- Street Address
- Tax ID Number
- Username
- Zip Code
PII detection is disabled by default. Set pii_action to enable it.
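To make the strip behavior concrete, here is a purely local illustration of the placeholder format using naive regexes. This is not the service's detector, which performs much more robust entity recognition server-side; it only mimics the [Type] placeholder convention described above.

```python
import re

# Local illustration of strip-style redaction with naive regexes.
# The real PII detector runs server-side and is far more robust;
# this only mimics the [Type] placeholder output format.
PATTERNS = {
    r"[\w.+-]+@[\w-]+\.[\w.]+": "[Email]",
    r"\b\d{3}-\d{2}-\d{4}\b": "[Social Security Number]",
    r"\b\d{3}-\d{4}\b": "[Phone Number]",
}

def naive_strip(text: str) -> str:
    # Apply each pattern in turn, replacing matches with its placeholder
    for pattern, placeholder in PATTERNS.items():
        text = re.sub(pattern, placeholder, text)
    return text

print(naive_strip("Email me at alice@example.com or call 555-0123"))
# → Email me at [Email] or call [Phone Number]
```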
Prompt Compression
Opt-in prompt compression reduces token count before sending prompts to AI providers. Three compression methods are available: TOON for structured JSON data, Compact for general text, and Combined for maximum compression.
Scan API usage:
from lockllm import LockLLM
import os
lockllm = LockLLM(api_key=os.getenv("LOCKLLM_API_KEY"))
# TOON - converts JSON to compact notation (free)
result = lockllm.scan(
    input='{"users": [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]}',
    compression="toon"
)
if result.compression_result:
    print(f"Method: {result.compression_result.method}")
    print(f"Original: {result.compression_result.original_length} chars")
    print(f"Compressed: {result.compression_result.compressed_length} chars")
    print(f"Ratio: {result.compression_result.compression_ratio:.2f}")
    print(f"Compressed text: {result.compression_result.compressed_input}")
# Compact - ML-based compression for any text ($0.0001/use)
result = lockllm.scan(
    input="A long prompt with detailed instructions that could be compressed...",
    compression="compact",
    compression_rate=0.5  # Optional: 0.3-0.7 (default 0.5, lower = more aggressive)
)
if result.compression_result:
    print(f"Compressed to {result.compression_result.compression_ratio:.0%} of original")
    print(f"Compressed text: {result.compression_result.compressed_input}")
# Combined - TOON then ML-based compression ($0.0001/use, maximum compression)
# Best for JSON data: applies TOON first, then ML compression on the result
result = lockllm.scan(
    input='{"data": [{"id": 1, "value": "long text content here..."}, {"id": 2, "value": "more content"}]}',
    compression="combined",
    compression_rate=0.5  # Optional: controls the ML compression stage
)
if result.compression_result:
    print(f"Compressed to {result.compression_result.compression_ratio:.0%} of original")
    print(f"Compressed text: {result.compression_result.compressed_input}")
Async usage:
from lockllm import AsyncLockLLM
import os
import asyncio
async def main():
    lockllm = AsyncLockLLM(api_key=os.getenv("LOCKLLM_API_KEY"))
    result = await lockllm.scan(
        input='{"data": [{"key": "value1"}, {"key": "value2"}]}',
        compression="toon"
    )
    if result.compression_result:
        print(f"Compressed: {result.compression_result.compressed_input}")
asyncio.run(main())
Proxy mode usage:
from lockllm import create_openai, ProxyOptions
import os
openai = create_openai(
    api_key=os.getenv("LOCKLLM_API_KEY"),
    proxy_options=ProxyOptions(
        scan_action="block",
        compression="toon"  # Compress JSON prompts before sending to provider
    )
)
# JSON content in prompts is automatically compressed before reaching the AI provider
response = openai.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": json_prompt}]
)
# For ML-based compression with a custom rate
openai_compact = create_openai(
    api_key=os.getenv("LOCKLLM_API_KEY"),
    proxy_options=ProxyOptions(
        compression="compact",
        compression_rate=0.4  # More aggressive compression
    )
)
Available compression methods:
"toon"- JSON-to-compact notation (free). Converts structured JSON to a token-efficient format with 30-60% token savings. Non-JSON input is returned unchanged - it will not error or crash on free text."compact"- ML-based compression ($0.0001/use). Works on any text type. Uses token-level classification to remove non-essential tokens while preserving meaning. Configurable compression rate (0.3-0.7, default 0.5)."combined"- Maximum compression ($0.0001/use). Applies TOON first, then runs ML-based Compact on the result. Non-JSON input skips the TOON stage and goes directly to ML compression. Best when you want maximum token reduction.None(default) - Compression is disabled
Compression rate (compact and combined methods):
- 0.3 - Most aggressive compression (removes more tokens)
- 0.5 - Balanced compression (default)
- 0.7 - Conservative compression (preserves more tokens)
Compression is disabled by default. Set compression to enable it.
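As the "{:.0%} of original" formatting in the examples above suggests, compression_ratio is compressed length divided by original length, so a ratio of 0.40 means the compressed prompt is 40% of the original size (a 60% saving). A quick illustration of the arithmetic, with no API call involved:

```python
# Illustration of how a compression ratio relates to savings.
# Assumes ratio = compressed / original, matching the
# "Compressed to X% of original" output in the examples above.
def compression_ratio(original_length: int, compressed_length: int) -> float:
    return compressed_length / original_length

ratio = compression_ratio(1000, 400)
print(f"Ratio: {ratio:.2f}")        # Ratio: 0.40
print(f"Savings: {1 - ratio:.0%}")  # Savings: 60%
```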
Chunked Scanning
For long prompts, enable chunked scanning to process input in segments:
result = lockllm.scan(
    input=long_prompt,
    chunk=True  # Enable chunked scanning for long inputs
)
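Chunking itself happens server-side, but the idea is that long input is split into overlapping windows that are scanned independently, so an injection straddling a window boundary is still seen. A rough local illustration of the concept — this is not the service's actual algorithm, and the window size and overlap here are made up:

```python
from typing import List

# Conceptual illustration of chunked scanning: split long input into
# overlapping windows. The real service-side chunk size and overlap are
# not documented here; these numbers are illustrative only.
def make_chunks(text: str, size: int = 100, overlap: int = 20) -> List[str]:
    step = size - overlap  # size must exceed overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

chunks = make_chunks("x" * 250, size=100, overlap=20)
print(len(chunks))      # → 3
print(len(chunks[0]))   # → 100 (each window is at most 100 chars)
```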
Reusable Scan Configuration
Use ScanOptions to create reusable scan configurations:
from lockllm import LockLLM, ScanOptions
import os
lockllm = LockLLM(api_key=os.getenv("LOCKLLM_API_KEY"))
# Create reusable scan configuration
opts = ScanOptions(
    scan_mode="combined",
    scan_action="block",
    policy_action="block",
    abuse_action="allow_with_warning",
    pii_action="strip"
)
# Use the same options for multiple scans
result1 = lockllm.scan(input=prompt1, scan_options=opts)
result2 = lockllm.scan(input=prompt2, scan_options=opts)
# Override individual options when needed (takes precedence over ScanOptions)
result3 = lockllm.scan(input=prompt3, scan_options=opts, sensitivity="high")
Proxy Options
When using wrapper functions, configure proxy behavior with ProxyOptions:
from lockllm import ProxyOptions
options = ProxyOptions(
    scan_mode="combined",   # "normal" | "policy_only" | "combined"
    scan_action="block",    # "block" | "allow_with_warning"
    policy_action="block",  # "block" | "allow_with_warning"
    abuse_action="block",   # "block" | "allow_with_warning" | None (disabled)
    route_action="auto",    # "disabled" | "auto" | "custom"
    sensitivity="medium",   # "low" | "medium" | "high"
    cache_response=True,    # Enable/disable response caching
    cache_ttl=3600,         # Cache TTL in seconds (max 86400)
    chunk=None,             # Enable/disable chunked scanning
    pii_action="strip"      # "strip" | "block" | "allow_with_warning" | None (disabled)
)
Fields:
- scan_mode - Which security checks to run (see Scan Modes)
- scan_action - Action on core injection detection
- policy_action - Action on policy violations
- abuse_action - Action on abuse detection (disabled if None)
- route_action - Smart routing mode (see below)
- sensitivity - Detection sensitivity level
- cache_response - Enable response caching to reduce costs and latency (enabled by default; streaming requests are not cached)
- cache_ttl - Cache time-to-live in seconds, max 86400 (24 hours)
- chunk - Enable chunked scanning for long inputs
- pii_action - PII detection behavior (disabled if None). See PII Detection for details and supported entity types
Smart routing: Automatically selects the best AI model based on task type and complexity to optimize cost and quality. Available only in proxy mode.
"disabled"- No routing, use the model you specified (default)"auto"- Automatic routing based on task type and complexity analysis"custom"- Use your custom routing rules configured in the dashboard
Note: ProxyOptions is for wrapper functions (create_openai, etc.) and direct proxy usage. For the scan API, use ScanOptions or individual parameters instead.
Custom Endpoints
All providers support custom endpoint URLs for:
- Self-hosted LLM deployments (OpenAI-compatible APIs)
- Azure OpenAI resources with custom endpoints
- Alternative API gateways and reverse proxies
- Private cloud or air-gapped deployments
- Development and staging environments
How it works: Configure custom endpoints in the LockLLM dashboard when adding any provider API key. The SDK wrappers automatically use your custom endpoint URL.
# The wrapper automatically uses your custom endpoint
azure = create_azure(api_key=os.getenv("LOCKLLM_API_KEY"))
# Your custom Azure endpoint is configured in the dashboard:
# - Endpoint: https://your-resource.openai.azure.com
# - Deployment: gpt-4
# - API Version: 2024-10-21
Example - Self-hosted model: If you have a self-hosted model with an OpenAI-compatible API, configure it in the dashboard using one of the OpenAI-compatible provider wrappers (e.g., OpenAI, Groq) with your custom endpoint URL.
# Use OpenAI wrapper with custom endpoint configured in dashboard
openai = create_openai(api_key=os.getenv("LOCKLLM_API_KEY"))
# Dashboard configuration:
# - Provider: OpenAI
# - Custom Endpoint: https://your-self-hosted-llm.com/v1
# - API Key: your-model-api-key
Request Options
Override configuration per-request:
# Per-request timeout
result = lockllm.scan(
    input=user_prompt,
    sensitivity="high",
    timeout=30.0  # 30 second timeout for this request
)
Advanced Features
Streaming Responses
All provider wrappers support streaming:
Synchronous streaming:
openai = create_openai(api_key=os.getenv("LOCKLLM_API_KEY"))
stream = openai.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Write a story"}],
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end='')
Asynchronous streaming:
import asyncio
async def main():
    openai = create_async_openai(api_key=os.getenv("LOCKLLM_API_KEY"))
    stream = await openai.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Write a story"}],
        stream=True
    )
    async for chunk in stream:
        if chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end='')
asyncio.run(main())
Function Calling
OpenAI function calling works seamlessly:
openai = create_openai(api_key=os.getenv("LOCKLLM_API_KEY"))
response = openai.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "What's the weather in Boston?"}],
    functions=[{
        "name": "get_weather",
        "description": "Get the current weather in a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City name"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["location"]
        }
    }],
    function_call="auto"
)
if response.choices[0].message.function_call:
    function_call = response.choices[0].message.function_call
    import json
    args = json.loads(function_call.arguments)
    # Call your function with the parsed arguments
    weather = get_weather(args['location'], args.get('unit', 'celsius'))
    # Send the function result back to the LLM
    final_response = openai.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "user", "content": "What's the weather in Boston?"},
            response.choices[0].message,
            {"role": "function", "name": "get_weather", "content": json.dumps(weather)}
        ]
    )
Link to section: Multi-Turn ConversationsMulti-Turn Conversations
Maintain conversation context with message history:
openai = create_openai(api_key=os.getenv("LOCKLLM_API_KEY"))
messages = [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "What is the capital of France?"}
]
response = openai.chat.completions.create(
model="gpt-4",
messages=messages
)
# Add assistant response to history
messages.append({
"role": "assistant",
"content": response.choices[0].message.content
})
# Continue conversation
messages.append({
"role": "user",
"content": "What is its population?"
})
response = openai.chat.completions.create(
model="gpt-4",
messages=messages
)
Link to section: Response Metadata (Proxy Mode)Response Metadata (Proxy Mode)
When using wrapper functions or the proxy directly, you can extract detailed scan and routing metadata from response headers:
from lockllm import parse_proxy_metadata, decode_detail_field
# After a proxy request using official SDKs with proxy base URL,
# parse metadata from response headers
metadata = parse_proxy_metadata(dict(response.headers))
# Core info
print(f"Request ID: {metadata.request_id}")
print(f"Scanned: {metadata.scanned}")
print(f"Safe: {metadata.safe}")
print(f"Provider: {metadata.provider}")
print(f"Model: {metadata.model}")
print(f"Sensitivity: {metadata.sensitivity}")
# Check if request was blocked
if metadata.blocked:
print("Request was blocked by LockLLM")
# Check for scan warnings
if metadata.scan_warning:
print(f"Injection score: {metadata.scan_warning.injection_score}")
print(f"Confidence: {metadata.scan_warning.confidence}")
# Decode detailed scan info
detail = decode_detail_field(metadata.scan_warning.detail)
if detail:
print(f"Scan detail: {detail}")
# Check for policy warnings
if metadata.policy_warnings:
print(f"Policy violations: {metadata.policy_warnings.count}")
print(f"Policy confidence: {metadata.policy_warnings.confidence}")
# Decode detailed policy violation info
detail = decode_detail_field(metadata.policy_warnings.detail)
if detail:
print(f"Policy detail: {detail}")
# Check abuse detection
if metadata.abuse_detected:
print(f"Abuse confidence: {metadata.abuse_detected.confidence}")
print(f"Abuse types: {metadata.abuse_detected.types}")
# Check routing info
if metadata.routing:
print(f"Routed to: {metadata.routing.selected_model}")
print(f"Original model: {metadata.routing.original_model}")
print(f"Task type: {metadata.routing.task_type}")
print(f"Complexity: {metadata.routing.complexity}")
print(f"Estimated savings: ${metadata.routing.estimated_savings}")
# Check PII detection results
if metadata.pii_detected:
print(f"PII detected: {metadata.pii_detected.detected}")
print(f"Entity types: {metadata.pii_detected.entity_types}")
print(f"Entity count: {metadata.pii_detected.entity_count}")
print(f"Action taken: {metadata.pii_detected.action}")
# Check credit usage
if metadata.credits_deducted is not None:
print(f"Credits charged: ${metadata.credits_deducted}")
print(f"Balance remaining: ${metadata.balance_after}")
# Check cache status
if metadata.cache_status == "HIT":
print(f"Cache hit - saved {metadata.tokens_saved} tokens")
print(f"Cost saved: ${metadata.cost_saved}")
Link to section: FastAPI IntegrationFastAPI Integration
Integrate with FastAPI for automatic request scanning:
from fastapi import FastAPI, HTTPException, Depends
from lockllm import AsyncLockLLM, create_async_openai
from pydantic import BaseModel
import os
app = FastAPI()
lockllm = AsyncLockLLM(api_key=os.getenv("LOCKLLM_API_KEY"))
class ChatRequest(BaseModel):
prompt: str
async def scan_prompt(request: ChatRequest):
"""Dependency to scan prompts"""
result = await lockllm.scan(
input=request.prompt,
sensitivity="medium"
)
if not result.safe:
raise HTTPException(
status_code=400,
detail={
"error": "Malicious input detected",
"injection": result.injection,
"confidence": result.confidence,
"request_id": result.request_id
}
)
return result
@app.post("/chat")
async def chat(
request: ChatRequest,
scan_result = Depends(scan_prompt)
):
# Request already scanned by dependency
openai = create_async_openai(api_key=os.getenv("LOCKLLM_API_KEY"))
response = await openai.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": request.prompt}]
)
return {
"response": response.choices[0].message.content,
"scan_result": {
"safe": scan_result.safe,
"request_id": scan_result.request_id
}
}
Link to section: Django IntegrationDjango Integration
Integrate with Django middleware:
# middleware.py
from lockllm import LockLLM
import os
import json
class LockLLMMiddleware:
def __init__(self, get_response):
self.get_response = get_response
self.lockllm = LockLLM(api_key=os.getenv("LOCKLLM_API_KEY"))
def __call__(self, request):
if request.method == 'POST' and request.content_type == 'application/json':
try:
body = json.loads(request.body)
if 'prompt' in body:
result = self.lockllm.scan(
input=body['prompt'],
sensitivity="medium"
)
if not result.safe:
from django.http import JsonResponse
return JsonResponse({
'error': 'Malicious input detected',
'details': {
'injection': result.injection,
'confidence': result.confidence,
'request_id': result.request_id
}
}, status=400)
# Attach scan result to request
request.scan_result = result
            except Exception:
                # Fail open: malformed JSON or a scan error should not break the request
                pass
response = self.get_response(request)
return response
# settings.py
MIDDLEWARE = [
# ... other middleware
'yourapp.middleware.LockLLMMiddleware',
]
Link to section: Flask IntegrationFlask Integration
from flask import Flask, request, jsonify
from lockllm import LockLLM, create_openai
import os
app = Flask(__name__)
lockllm = LockLLM(api_key=os.getenv("LOCKLLM_API_KEY"))
@app.before_request
def scan_request():
if request.method == 'POST' and request.is_json:
data = request.get_json()
if 'prompt' in data:
result = lockllm.scan(
input=data['prompt'],
sensitivity="medium"
)
if not result.safe:
return jsonify({
'error': 'Malicious input detected',
'details': {
'injection': result.injection,
'confidence': result.confidence,
'request_id': result.request_id
}
}), 400
# Store scan result for the request
request.scan_result = result
@app.route('/chat', methods=['POST'])
def chat():
# Request already scanned by before_request
data = request.get_json()
openai = create_openai(api_key=os.getenv("LOCKLLM_API_KEY"))
response = openai.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": data['prompt']}]
)
return jsonify({
'response': response.choices[0].message.content
})
if __name__ == '__main__':
app.run()
Link to section: Batch ProcessingBatch Processing
Process multiple requests concurrently with asyncio:
import asyncio
from lockllm import create_async_openai
import os
async def process_prompt(openai, prompt):
"""Process a single prompt"""
try:
response = await openai.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": prompt}]
)
return response.choices[0].message.content
except Exception as e:
return f"Error: {str(e)}"
async def batch_process(prompts):
"""Process multiple prompts concurrently"""
openai = create_async_openai(api_key=os.getenv("LOCKLLM_API_KEY"))
# Process all prompts concurrently
tasks = [process_prompt(openai, prompt) for prompt in prompts]
results = await asyncio.gather(*tasks)
return results
# Usage
prompts = [
"What is AI?",
"Explain machine learning",
"What is deep learning?"
]
results = asyncio.run(batch_process(prompts))
for i, result in enumerate(results):
    print(f"Prompt {i+1}: {result}\n")
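If you batch many prompts at once, an unbounded asyncio.gather can trip provider rate limits. One way to cap concurrency is a semaphore; the sketch below uses a stand-in coroutine in place of process_prompt, but the same wrapper applies unchanged:

```python
import asyncio

async def bounded_gather(fn, items, limit=5):
    """Run fn over items concurrently, at most `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def run(item):
        async with semaphore:
            return await fn(item)

    # gather preserves input order regardless of completion order
    return await asyncio.gather(*(run(item) for item in items))

# Demo with a stand-in coroutine; in practice fn would wrap
# process_prompt(openai, prompt) from the example above
async def fake_process(prompt):
    await asyncio.sleep(0)  # simulate I/O
    return prompt.upper()

results = asyncio.run(bounded_gather(fake_process, ["a", "b", "c"], limit=2))
```

Tune `limit` to your provider's rate limits; the gather still returns results in the same order as the input prompts.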
Link to section: Context Manager SupportContext Manager Support
Use context managers for automatic resource cleanup:
Synchronous:
from lockllm import LockLLM
import os
# Context manager ensures proper cleanup
with LockLLM(api_key=os.getenv("LOCKLLM_API_KEY")) as client:
result = client.scan(input="test prompt")
print(f"Safe: {result.safe}")
Asynchronous:
from lockllm import AsyncLockLLM
import os
import asyncio
async def main():
# Async context manager
async with AsyncLockLLM(api_key=os.getenv("LOCKLLM_API_KEY")) as client:
result = await client.scan(input="test prompt")
print(f"Safe: {result.safe}")
asyncio.run(main())
Link to section: Custom Policy EnforcementCustom Policy Enforcement
LockLLM lets you upload custom content policies through the dashboard and enforce them at runtime via policy_action. This allows you to block or flag prompts that violate your application-specific rules in addition to the built-in safety checks.
Scan API - enforce custom policies with scan mode:
Synchronous:
from lockllm import LockLLM, PolicyViolationError
import os
lockllm = LockLLM(api_key=os.getenv("LOCKLLM_API_KEY"))
try:
result = lockllm.scan(
input=user_prompt,
scan_mode="combined", # Run both injection and policy checks
scan_action="block", # Block detected injection attempts
policy_action="block", # Block custom policy violations
sensitivity="medium"
)
if result.safe:
# No issues found - safe to proceed
response = your_llm_call(user_prompt)
    else:
        # With scan_action="allow_with_warning", unsafe input reaches this
        # branch instead of raising an exception
        print("Warning: potential threat detected")
except PolicyViolationError as e:
# Raised when policy_action="block" and a violation is detected
print(f"Custom policy violated: {e.message}")
if e.violated_policies:
for policy in e.violated_policies:
print(f" Policy: {policy.get('policy_name')}")
for category in policy.get('violated_categories', []):
print(f" Category: {category.get('name')}")
    # In a request handler, return a user-facing error here, e.g.
    # {"error": "Request blocked by content policy"}
Asynchronous:
from lockllm import AsyncLockLLM, PolicyViolationError
import os
import asyncio
async def main():
lockllm = AsyncLockLLM(api_key=os.getenv("LOCKLLM_API_KEY"))
try:
result = await lockllm.scan(
input=user_prompt,
scan_mode="combined",
scan_action="block",
policy_action="block",
sensitivity="medium"
)
response = await your_llm_call(user_prompt)
except PolicyViolationError as e:
print(f"Policy violation: {e.message}")
asyncio.run(main())
Proxy mode - enforce policies on all requests transparently:
from lockllm import create_openai, ProxyOptions
import os
# All requests through this client are checked against your custom policies
openai = create_openai(
api_key=os.getenv("LOCKLLM_API_KEY"),
proxy_options=ProxyOptions(
scan_mode="combined",
scan_action="block",
policy_action="block",
sensitivity="medium"
)
)
# PolicyViolationError raised automatically if a violation is detected
from lockllm import PolicyViolationError
try:
response = openai.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": user_input}]
)
print(response.choices[0].message.content)
except PolicyViolationError as e:
print(f"Blocked by policy: {e.message}")
You can also use scan_mode="policy_only" if you only want to check custom policies without running the core injection scan.
Link to section: Smart RoutingSmart Routing
Smart routing automatically selects the optimal AI model for each request based on the detected task type and prompt complexity. This helps reduce costs by routing simple tasks to lighter models while ensuring complex tasks use more capable ones.
Routing is only available in proxy mode and is configured via route_action in ProxyOptions.
Auto routing (recommended):
Synchronous:
from lockllm import create_openai, ProxyOptions
import os
openai = create_openai(
api_key=os.getenv("LOCKLLM_API_KEY"),
proxy_options=ProxyOptions(
route_action="auto", # Automatic task-based model selection
scan_action="block"
)
)
response = openai.chat.completions.create(
model="gpt-4", # Starting point; router may select a different model
messages=[{"role": "user", "content": user_input}]
)
print(response.choices[0].message.content)
Asynchronous:
from lockllm import create_async_openai, ProxyOptions
import os
import asyncio
async def main():
openai = create_async_openai(
api_key=os.getenv("LOCKLLM_API_KEY"),
proxy_options=ProxyOptions(route_action="auto")
)
response = await openai.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": user_input}]
)
print(response.choices[0].message.content)
asyncio.run(main())
Custom routing rules:
Set route_action="custom" to apply your own routing rules configured in the dashboard. Custom rules let you map specific task types and complexity tiers to models of your choosing. If no matching rule is found, the router falls back to auto mode.
from lockllm import create_openai, ProxyOptions
import os
openai = create_openai(
api_key=os.getenv("LOCKLLM_API_KEY"),
proxy_options=ProxyOptions(route_action="custom")
)
Accessing routing metadata from the scan API:
When you use scan() directly, the ScanResponse.routing field contains routing metadata if routing was applied:
from lockllm import LockLLM
import os
lockllm = LockLLM(api_key=os.getenv("LOCKLLM_API_KEY"))
result = lockllm.scan(
input=user_prompt,
sensitivity="medium"
)
if result.routing and result.routing.enabled:
print(f"Task type: {result.routing.task_type}")
print(f"Complexity: {result.routing.complexity:.2f}")
print(f"Selected model: {result.routing.selected_model}")
print(f"Routing reason: {result.routing.reasoning}")
print(f"Estimated cost: ${result.routing.estimated_cost}")
For proxy mode, use parse_proxy_metadata() to read routing information from response headers (see Response Metadata).
Link to section: Response CachingResponse Caching
Response caching is enabled by default in proxy mode. When an identical request is received within the cache window, LockLLM returns the cached response directly, saving tokens and reducing cost. Streaming requests are never cached.
Default behavior (caching on):
from lockllm import create_openai
import os
# Caching is enabled by default - no configuration needed
openai = create_openai(api_key=os.getenv("LOCKLLM_API_KEY"))
response = openai.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": user_input}]
)
Custom cache TTL:
from lockllm import create_openai, ProxyOptions
import os
openai = create_openai(
api_key=os.getenv("LOCKLLM_API_KEY"),
proxy_options=ProxyOptions(
cache_response=True,
cache_ttl=1800 # Cache responses for 30 minutes (max: 86400 seconds)
)
)
Disable caching:
from lockllm import create_openai, ProxyOptions
import os
openai = create_openai(
api_key=os.getenv("LOCKLLM_API_KEY"),
proxy_options=ProxyOptions(cache_response=False)
)
Checking cache status from response headers:
from lockllm import create_openai, parse_proxy_metadata
import os
openai = create_openai(api_key=os.getenv("LOCKLLM_API_KEY"))
response = openai.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": user_input}]
)
metadata = parse_proxy_metadata(dict(response.headers))
if metadata.cache_status == "HIT":
    print("Cache hit - response served from cache")
print(f"Cache age: {metadata.cache_age}s")
print(f"Tokens saved: {metadata.tokens_saved}")
print(f"Cost saved: ${metadata.cost_saved:.6f}")
else:
print("Cache miss - response generated fresh")
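Conceptually, the proxy cache behaves like a TTL-bounded map keyed by the normalized request body: identical payloads hash to the same key, and entries expire after the configured TTL. This is an illustrative sketch of that behavior, not LockLLM's actual implementation:

```python
import hashlib
import json
import time

class TTLCache:
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.store = {}  # key -> (expires_at, response)

    def _key(self, payload):
        # Identical payloads always hash to the same key
        raw = json.dumps(payload, sort_keys=True).encode()
        return hashlib.sha256(raw).hexdigest()

    def get(self, payload, now=None):
        now = time.time() if now is None else now
        entry = self.store.get(self._key(payload))
        if entry and entry[0] > now:
            return entry[1]  # cache HIT
        return None  # cache MISS (absent or expired)

    def put(self, payload, response, now=None):
        now = time.time() if now is None else now
        self.store[self._key(payload)] = (now + self.ttl, response)

cache = TTLCache(ttl_seconds=1800)
payload = {"model": "gpt-4", "messages": [{"role": "user", "content": "hi"}]}
cache.put(payload, "cached answer", now=0)
hit = cache.get(payload, now=100)       # within TTL -> HIT
miss = cache.get(payload, now=10_000)   # past TTL -> MISS
```

This also explains why streaming requests are never cached: a stream has no single response body to key and replay.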
Link to section: Error HandlingError Handling
LockLLM provides typed exceptions for comprehensive error handling:
Link to section: Error TypesError Types
from lockllm import (
LockLLMError, # Base error class
AuthenticationError, # 401 - Invalid API key
RateLimitError, # 429 - Rate limit exceeded
PromptInjectionError, # 400 - Malicious input detected
PolicyViolationError, # 403 - Custom policy violation
AbuseDetectedError, # 400 - Abuse pattern detected
PIIDetectedError, # 403 - PII detected (when pii_action is "block")
InsufficientCreditsError, # 402 - Insufficient credits
UpstreamError, # 502 - Provider API error
ConfigurationError, # 400 - Invalid configuration
NetworkError # 0 - Network/connection error
)
Link to section: Complete Error HandlingComplete Error Handling
from lockllm import create_openai
from lockllm import (
PromptInjectionError,
PolicyViolationError,
AbuseDetectedError,
PIIDetectedError,
InsufficientCreditsError,
AuthenticationError,
RateLimitError,
UpstreamError,
ConfigurationError,
NetworkError
)
import os
openai = create_openai(api_key=os.getenv("LOCKLLM_API_KEY"))
try:
response = openai.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": user_input}]
)
print(response.choices[0].message.content)
except PromptInjectionError as error:
# Security threat detected
print("Malicious input detected!")
print(f"Injection score: {error.scan_result.injection}%")
print(f"Confidence: {error.scan_result.confidence}%")
print(f"Request ID: {error.request_id}")
# Log security incident
import logging
logging.warning(f"Prompt injection blocked: {error.request_id}")
    # Return a user-friendly error from your request handler, e.g.
    # {"error": "Your input could not be processed for security reasons."}
except PolicyViolationError as error:
# Custom content policy violation
print("Content policy violation detected!")
print(f"Request ID: {error.request_id}")
if error.violated_policies:
for policy in error.violated_policies:
print(f" Policy: {policy.get('policy_name')}")
except AbuseDetectedError as error:
# Abuse pattern detected (bot content, repetition, resource exhaustion)
print("Abuse detected!")
print(f"Request ID: {error.request_id}")
if error.abuse_details:
print(f" Confidence: {error.abuse_details.get('confidence')}%")
except PIIDetectedError as error:
# Personally identifiable information detected (when pii_action is "block")
print("PII detected in input!")
print(f"Request ID: {error.request_id}")
if error.entity_types:
print(f" Entity types: {', '.join(error.entity_types)}")
print(f" Entity count: {error.entity_count}")
    # Return a user-friendly error from your request handler, e.g.
    # {"error": "Your input contains personal information that cannot be processed."}
except InsufficientCreditsError as error:
# Not enough credits for the request
print("Insufficient credits!")
if error.current_balance is not None:
print(f" Current balance: ${error.current_balance}")
if error.estimated_cost is not None:
print(f" Estimated cost: ${error.estimated_cost}")
print("Top up at: https://www.lockllm.com/billing")
except AuthenticationError as error:
print("Invalid API key")
# Check your LOCKLLM_API_KEY environment variable
except RateLimitError as error:
print("Rate limit exceeded")
print(f"Retry after (ms): {error.retry_after}")
# Wait and retry
if error.retry_after:
import time
time.sleep(error.retry_after / 1000)
# Retry request...
except UpstreamError as error:
print("Provider API error")
print(f"Provider: {error.provider}")
print(f"Status: {error.upstream_status}")
print(f"Message: {error.message}")
# Handle provider-specific errors
if error.provider == 'openai' and error.upstream_status == 429:
# OpenAI rate limit
pass
except ConfigurationError as error:
print(f"Configuration error: {error.message}")
# Check provider key is added in dashboard
except NetworkError as error:
print(f"Network error: {error.message}")
# Check internet connection, firewall, etc.
Link to section: Exponential BackoffExponential Backoff
Implement exponential backoff for transient errors:
import time
import os
from lockllm import create_openai, RateLimitError, NetworkError
def call_with_backoff(fn, max_retries=5):
"""Call function with exponential backoff"""
for attempt in range(max_retries):
try:
return fn()
except (RateLimitError, NetworkError) as error:
if attempt == max_retries - 1:
raise
delay = min(1.0 * (2 ** attempt), 30.0)
print(f"Retry attempt {attempt + 1} after {delay}s")
time.sleep(delay)
raise Exception('Max retries exceeded')
# Usage
openai = create_openai(api_key=os.getenv("LOCKLLM_API_KEY"))
response = call_with_backoff(lambda: openai.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": user_input}]
))
Link to section: Type HintsType Hints
Full type hint support with mypy:
Link to section: Type AnnotationsType Annotations
from lockllm import (
LockLLM,
LockLLMConfig,
RequestOptions,
ScanRequest,
ScanResponse,
ScanResult,
ScanOptions,
ScanMode,
ScanAction,
RouteAction,
PIIAction,
PIIResult,
ProxyOptions,
ProxyResponseMetadata,
ProxyScanWarning,
ProxyPolicyWarnings,
ProxyAbuseDetected,
ProxyPIIDetected,
ProxyRoutingMetadata,
Sensitivity,
ProviderName,
TaskType,
ComplexityTier,
Usage,
Debug,
)
from typing import Optional
# Configuration types
def create_client(api_key: str, base_url: Optional[str] = None) -> LockLLM:
return LockLLM(api_key=api_key, base_url=base_url)
# Scan function with type hints
def scan_prompt(text: str, level: Sensitivity = "medium") -> ScanResponse:
lockllm: LockLLM = LockLLM(api_key="...")
result: ScanResponse = lockllm.scan(input=text, sensitivity=level)
return result
# Response types
response: ScanResponse = scan_prompt("test")
is_safe: bool = response.safe
score: float = response.injection
request_id: str = response.request_id
Link to section: mypy Supportmypy Support
The SDK includes a py.typed marker for mypy:
# Run mypy type checking
mypy your_code.py
# Example output:
# your_code.py:10: error: Argument "sensitivity" has incompatible type "str"; expected "Literal['low', 'medium', 'high']"
Link to section: IDE AutocompleteIDE Autocomplete
Full IDE autocomplete with type stubs:
from lockllm import LockLLM
lockllm = LockLLM(api_key="...")
# IDE suggests: scan(...)
lockllm.s # <-- autocomplete suggestions
# IDE suggests: input, sensitivity, scan_mode, scan_action, policy_action, ...
lockllm.scan(i # <-- autocomplete for parameters
# IDE suggests: "low", "medium", "high"
lockllm.scan(input="test", sensitivity="m # <-- autocomplete for values
Link to section: API ReferenceAPI Reference
Link to section: Type AliasesType Aliases
from lockllm import (
Sensitivity, # Literal["low", "medium", "high"]
ScanMode, # Literal["normal", "policy_only", "combined"]
ScanAction, # Literal["block", "allow_with_warning"]
RouteAction, # Literal["disabled", "auto", "custom"]
PIIAction, # Literal["strip", "block", "allow_with_warning"]
ProviderName, # Literal["openai", "anthropic", "gemini", "cohere", ...]
TaskType, # Literal["Open QA", "Closed QA", "Summarization", ...]
ComplexityTier, # Literal["low", "medium", "high"]
)
Sensitivity - Detection sensitivity threshold level
Sensitivity = Literal["low", "medium", "high"]
ScanMode - Which security checks to perform
ScanMode = Literal["normal", "policy_only", "combined"]
ScanAction - Behavior when threats or violations are detected
ScanAction = Literal["block", "allow_with_warning"]
RouteAction - Smart routing mode
RouteAction = Literal["disabled", "auto", "custom"]
PIIAction - PII detection behavior
PIIAction = Literal["strip", "block", "allow_with_warning"]
ProviderName - All supported AI provider identifiers
ProviderName = Literal[
"openai", "anthropic", "gemini", "cohere", "openrouter",
"perplexity", "mistral", "groq", "deepseek", "together",
"xai", "fireworks", "anyscale", "huggingface", "azure",
"bedrock", "vertex-ai",
]
TaskType - Supported task types for smart routing
TaskType = Literal[
"Open QA", "Closed QA", "Summarization", "Text Generation",
"Code Generation", "Chatbot", "Classification", "Rewrite",
"Brainstorming", "Extraction", "Other",
]
ComplexityTier - Complexity tiers for routing
ComplexityTier = Literal["low", "medium", "high"]
# Mapped from complexity score: low (0-0.4), medium (0.4-0.7), high (0.7-1.0)
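The score-to-tier mapping in the comment above can be written out directly. Boundary handling at exactly 0.4 and 0.7 is an assumption here, since only the ranges are documented:

```python
def complexity_tier(score: float) -> str:
    """Map a 0-1 complexity score to a routing tier."""
    if score < 0.4:
        return "low"
    if score < 0.7:
        return "medium"
    return "high"

# e.g. classify the complexity score from RoutingInfo
tier = complexity_tier(0.55)
```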
Link to section: ConstantsConstants
from lockllm import PROVIDER_BASE_URLS, UNIVERSAL_PROXY_URL
# Provider-specific proxy URLs (dict mapping ProviderName -> URL)
# e.g., PROVIDER_BASE_URLS["openai"] -> "https://api.lockllm.com/v1/proxy/openai"
# Universal proxy URL for non-BYOK users
# UNIVERSAL_PROXY_URL -> "https://api.lockllm.com/v1/proxy"
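Judging from the two example values above, the per-provider URLs follow a simple pattern. A hedged sketch of that pattern is below; the real mapping ships with the SDK, so prefer importing PROVIDER_BASE_URLS over rebuilding it:

```python
UNIVERSAL_PROXY_URL = "https://api.lockllm.com/v1/proxy"

def provider_proxy_url(provider: str) -> str:
    """Derive a per-provider proxy URL from the universal base URL."""
    return f"{UNIVERSAL_PROXY_URL}/{provider}"

# Matches the documented PROVIDER_BASE_URLS["openai"] value
openai_url = provider_proxy_url("openai")
```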
Link to section: ClassesClasses
Link to section: LockLLMLockLLM
Synchronous client class for scanning prompts.
class LockLLM:
def __init__(
self,
api_key: str,
base_url: Optional[str] = None,
timeout: Optional[float] = None,
max_retries: Optional[int] = None
)
def scan(
self,
input: str,
sensitivity: Literal["low", "medium", "high"] = "medium",
scan_mode: Optional[Literal["normal", "policy_only", "combined"]] = None,
scan_action: Optional[Literal["block", "allow_with_warning"]] = None,
policy_action: Optional[Literal["block", "allow_with_warning"]] = None,
abuse_action: Optional[Literal["block", "allow_with_warning"]] = None,
pii_action: Optional[Literal["strip", "block", "allow_with_warning"]] = None,
compression: Optional[Literal["toon", "compact", "combined"]] = None,
compression_rate: Optional[float] = None,
chunk: Optional[bool] = None,
scan_options: Optional[ScanOptions] = None,
**options
) -> ScanResponse
@property
def config(self) -> LockLLMConfig
def close(self) -> None
Link to section: AsyncLockLLMAsyncLockLLM
Asynchronous client class for scanning prompts.
class AsyncLockLLM:
def __init__(
self,
api_key: str,
base_url: Optional[str] = None,
timeout: Optional[float] = None,
max_retries: Optional[int] = None
)
async def scan(
self,
input: str,
sensitivity: Literal["low", "medium", "high"] = "medium",
scan_mode: Optional[Literal["normal", "policy_only", "combined"]] = None,
scan_action: Optional[Literal["block", "allow_with_warning"]] = None,
policy_action: Optional[Literal["block", "allow_with_warning"]] = None,
abuse_action: Optional[Literal["block", "allow_with_warning"]] = None,
pii_action: Optional[Literal["strip", "block", "allow_with_warning"]] = None,
compression: Optional[Literal["toon", "compact", "combined"]] = None,
compression_rate: Optional[float] = None,
chunk: Optional[bool] = None,
scan_options: Optional[ScanOptions] = None,
**options
) -> ScanResponse
@property
def config(self) -> LockLLMConfig
async def close(self) -> None
Link to section: Data ClassesData Classes
Link to section: LockLLMConfigLockLLMConfig
@dataclass
class LockLLMConfig:
api_key: str # Your LockLLM API key (required)
base_url: str # API endpoint (default: "https://api.lockllm.com")
timeout: float # Request timeout in seconds (default: 60.0)
max_retries: int # Max retry attempts (default: 3)
Link to section: RequestOptionsRequestOptions
Per-request configuration overrides:
@dataclass
class RequestOptions:
headers: Optional[Dict[str, str]] = None # Additional HTTP headers to include
timeout: Optional[float] = None # Override default timeout for this request
Link to section: ScanResultScanResult
Base class for scan responses. Contains the core detection fields shared by ScanResponse and PromptInjectionError.scan_result:
@dataclass
class ScanResult:
safe: bool # true if safe, false if malicious
label: Literal[0, 1] # 0=safe, 1=malicious
confidence: Optional[float] # Confidence score 0-100 (None in policy_only mode)
injection: Optional[float] # Injection risk score 0-100 (None in policy_only mode)
sensitivity: str # Sensitivity level used for this scan
Link to section: ScanResponseScanResponse
@dataclass
class ScanResponse(ScanResult):
safe: bool # true if safe, false if malicious
label: Literal[0, 1] # 0=safe, 1=malicious
confidence: Optional[float] # Confidence score 0-100
injection: Optional[float] # Injection risk score 0-100
sensitivity: str # Sensitivity level used
request_id: str # Unique request identifier
usage: Usage # Usage statistics
debug: Optional[Debug] # Debug info (optional)
policy_confidence: Optional[float] # Policy check confidence (0-100)
policy_warnings: Optional[List[PolicyViolation]] # Policy violations (if any)
scan_warning: Optional[ScanWarning] # Injection warning details
abuse_warnings: Optional[AbuseWarning] # Abuse detection results
routing: Optional[RoutingInfo] # Routing metadata
pii_result: Optional[PIIResult] # PII detection result (when enabled)
Link to section: ScanRequestScanRequest
A single scan input with an optional sensitivity override. Used when constructing scan payloads programmatically:
@dataclass
class ScanRequest:
input: str # The text prompt to scan (required)
sensitivity: Sensitivity = "medium" # Detection sensitivity level
Link to section: ScanOptionsScanOptions
Reusable scan configuration for the scan API:
@dataclass
class ScanOptions:
scan_mode: Optional[ScanMode] = None # "normal" | "policy_only" | "combined"
scan_action: Optional[ScanAction] = None # "block" | "allow_with_warning"
policy_action: Optional[ScanAction] = None # "block" | "allow_with_warning"
abuse_action: Optional[ScanAction] = None # "block" | "allow_with_warning"
chunk: Optional[bool] = None # Enable/disable chunked scanning
pii_action: Optional[PIIAction] = None # "strip" | "block" | "allow_with_warning"
compression: Optional[CompressionAction] = None # "toon" | "compact" | "combined"
compression_rate: Optional[float] = None # 0.3-0.7 (compact/combined only)
Link to section: ProxyOptionsProxyOptions
Configuration for wrapper functions and proxy requests:
@dataclass
class ProxyOptions:
scan_mode: Optional[str] = None # "normal" | "policy_only" | "combined"
scan_action: Optional[str] = None # "block" | "allow_with_warning"
policy_action: Optional[str] = None # "block" | "allow_with_warning"
abuse_action: Optional[str] = None # "block" | "allow_with_warning"
route_action: Optional[str] = None # "disabled" | "auto" | "custom"
sensitivity: Optional[str] = None # "low" | "medium" | "high"
cache_response: Optional[bool] = None # Enable/disable response caching
cache_ttl: Optional[int] = None # Cache TTL in seconds (max 86400)
chunk: Optional[bool] = None # Enable/disable chunked scanning
pii_action: Optional[str] = None # "strip" | "block" | "allow_with_warning"
compression: Optional[str] = None # "toon" | "compact" | "combined"
compression_rate: Optional[float] = None # 0.3-0.7 (compact/combined only)
Link to section: UsageUsage
@dataclass
class Usage:
requests: int # Number of upstream inference requests used
input_chars: int # Number of characters in the input
Link to section: DebugDebug
Debug information (available on certain plans):
@dataclass
class Debug:
duration_ms: int # Total processing time in milliseconds
inference_ms: int # ML inference time in milliseconds
mode: Literal["single", "chunked"] # Processing mode used
Link to section: PolicyViolationPolicyViolation
@dataclass
class PolicyViolation:
policy_name: str # Name of the violated policy
violated_categories: List[ViolatedCategory] # List of violated categories
violation_details: Optional[str] # Text that triggered the violation
Link to section: ViolatedCategoryViolatedCategory
@dataclass
class ViolatedCategory:
name: str # Category name
description: Optional[str] # Category description
Link to section: ScanWarningScanWarning
@dataclass
class ScanWarning:
message: str # Warning description
injection_score: float # Injection risk score (0-100)
confidence: float # Detection confidence (0-100)
label: int # 0=safe, 1=unsafe
Link to section: AbuseWarningAbuseWarning
@dataclass
class AbuseWarning:
detected: bool # Whether abuse was detected
confidence: float # Abuse confidence (0-100)
abuse_types: List[str] # Types of abuse detected
indicators: Dict[str, float] # Scores per abuse category
recommendation: Optional[str] # Suggested action
Link to section: RoutingInfoRoutingInfo
@dataclass
class RoutingInfo:
enabled: bool # Whether routing was applied
task_type: str # Detected task classification
complexity: float # Complexity score (0-1)
selected_model: Optional[str] # Model chosen by router
reasoning: Optional[str] # Why this model was selected
estimated_cost: Optional[float] # Estimated cost
Link to section: PIIResultPIIResult
@dataclass
class PIIResult:
detected: bool # Whether PII was detected
entity_types: List[str] # Types of PII entities found
entity_count: int # Number of PII entities found
redacted_input: Optional[str] # Redacted text (only when pii_action is "strip")
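For intuition, the strip action replaces each detected entity with a placeholder and reports what was found. The toy sketch below covers emails only, with a hypothetical regex and placeholder format; the real service detects many more entity types (names, phone numbers, SSNs, credit cards, and so on):

```python
import re

# Simplified email pattern for illustration only
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def strip_emails(text: str):
    """Redact email addresses and report how many were found."""
    entities = EMAIL_RE.findall(text)
    redacted = EMAIL_RE.sub("[EMAIL]", text)
    return redacted, len(entities)

redacted, count = strip_emails("Contact alice@example.com or bob@example.org")
```

The real `redacted_input` and `entity_count` fields play the same roles as `redacted` and `count` here.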
Link to section: CompressionResultCompressionResult
@dataclass
class CompressionResult:
method: str # "toon", "compact", or "combined"
compressed_input: str # The compressed text
original_length: int # Original text length in characters
compressed_length: int # Compressed text length in characters
compression_ratio: float # Ratio of compressed/original (lower = better)
Link to section: ProxyScanWarningProxyScanWarning
Scan warning metadata parsed from proxy response headers:
@dataclass
class ProxyScanWarning:
injection_score: float # Injection score from scan (0-100)
confidence: float # Confidence level of the detection (0-100)
detail: str # Base64-encoded JSON with detailed scan info
Use decode_detail_field(warning.detail) to decode the detail field into a Python dict.
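Conceptually, decoding works like this. The sketch below is a stand-in for decode_detail_field, assuming only what the docs state: the detail field is base64-encoded JSON (decode_detail_sketch is a hypothetical name, not the SDK function):

```python
import base64
import json

def decode_detail_sketch(detail):
    """Rough stand-in for decode_detail_field: base64 -> JSON -> dict."""
    try:
        return json.loads(base64.b64decode(detail))
    except (ValueError, TypeError):
        return None  # malformed or missing detail

# Simulate a detail field as the proxy would encode it
encoded = base64.b64encode(json.dumps({"injection_score": 87.5}).encode()).decode()
decoded = decode_detail_sketch(encoded)
print(decoded)  # {'injection_score': 87.5}
```

In application code, prefer the SDK's decode_detail_field, which handles edge cases for you.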
Link to section: ProxyPolicyWarningsProxyPolicyWarnings
Policy warning metadata parsed from proxy response headers:
@dataclass
class ProxyPolicyWarnings:
count: int # Number of policy violations detected
confidence: float # Confidence level of the detection (0-100)
detail: str # Base64-encoded JSON with violation details
Use decode_detail_field(warnings.detail) to decode the detail field into a Python dict.
Link to section: ProxyAbuseDetectedProxyAbuseDetected
Abuse detection metadata parsed from proxy response headers:
@dataclass
class ProxyAbuseDetected:
confidence: float # Confidence level of abuse detection (0-100)
types: str # Comma-separated abuse types detected
detail: str # Base64-encoded JSON with abuse details
Use decode_detail_field(abuse.detail) to decode the detail field into a Python dict.
Link to section: ProxyPIIDetectedProxyPIIDetected
PII detection metadata parsed from proxy response headers:
@dataclass
class ProxyPIIDetected:
detected: bool # Whether PII was detected
entity_types: str # Comma-separated PII entity types (e.g., "Email,Phone Number")
entity_count: int # Number of PII entities found
action: str # PII action taken ("strip", "block", "allow_with_warning")
Link to section: ProxyCompressionMetadataProxyCompressionMetadata
Compression metadata parsed from proxy response headers:
@dataclass
class ProxyCompressionMetadata:
method: str # "toon", "compact", or "combined"
applied: bool # Whether compression was actually applied
ratio: Optional[float] # Compression ratio (only when applied)
Link to section: ProxyRoutingMetadataProxyRoutingMetadata
Detailed routing metadata parsed from proxy response headers. Contains more information than RoutingInfo (which is returned from the scan API):
@dataclass
class ProxyRoutingMetadata:
enabled: bool # Whether routing was applied
task_type: str # Detected task classification
complexity: float # Prompt complexity score (0-1)
selected_model: str # Model chosen by router
routing_reason: str # Explanation for model selection
original_provider: str # Original provider requested
original_model: str # Original model requested
estimated_savings: float # Estimated cost savings
estimated_original_cost: Optional[float] # Estimated cost with original model
estimated_routed_cost: Optional[float] # Estimated cost with routed model
estimated_input_tokens: Optional[int] # Estimated input tokens for routing
estimated_output_tokens: Optional[int] # Estimated output tokens for routing
routing_fee_reason: Optional[str] # Reason for routing fee or waiver
Link to section: ProxyResponseMetadataProxyResponseMetadata
Comprehensive metadata extracted from proxy response headers. Use parse_proxy_metadata() to parse response headers into this typed object:
@dataclass
class ProxyResponseMetadata:
# Core
request_id: str # Unique request identifier
scanned: bool # Whether request was scanned
safe: bool # Whether request was safe
scan_mode: str # Scan mode used
credits_mode: str # "lockllm_credits" | "byok"
provider: str # AI provider name
model: Optional[str] # Model identifier
sensitivity: Optional[str] # Detection sensitivity level used
label: Optional[int] # Binary classification (0=safe, 1=unsafe)
policy_confidence: Optional[float] # Policy check confidence (0-100)
blocked: Optional[bool] # Whether the request was blocked
# Warnings
scan_warning: Optional[ProxyScanWarning] # Injection warning details
policy_warnings: Optional[ProxyPolicyWarnings] # Policy violation details
abuse_detected: Optional[ProxyAbuseDetected] # Abuse detection details
pii_detected: Optional[ProxyPIIDetected] # PII detection details (when enabled)
# Routing
routing: Optional[ProxyRoutingMetadata] # Routing metadata
# Credits
credits_reserved: Optional[float] # Credits reserved for request
routing_fee_reserved: Optional[float] # Routing fee reserved
routing_fee_reason: Optional[str] # Reason for routing fee or waiver
credits_deducted: Optional[float] # Credits actually deducted
balance_after: Optional[float] # Balance after request
# Cost estimates
estimated_original_cost: Optional[float] # Estimated cost with original model
estimated_routed_cost: Optional[float] # Estimated cost with routed model
estimated_input_tokens: Optional[int] # Estimated input tokens
estimated_output_tokens: Optional[int] # Estimated output tokens
# Cache
cache_status: Optional[str] # "HIT" | "MISS"
cache_age: Optional[int] # Cache age in seconds
tokens_saved: Optional[int] # Tokens saved by cache hit
cost_saved: Optional[float] # Cost saved by cache hit
# Decoded detail fields
scan_detail: Optional[Any] # Decoded scan detail from header
policy_detail: Optional[Any] # Decoded policy warning detail from header
abuse_detail: Optional[Any] # Decoded abuse detail from header
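In practice you pass raw response headers to parse_proxy_metadata() and get this object back. As a rough illustration of the idea only, here is a toy parser; the header names below are placeholders, not the proxy's exact header names:

```python
# Toy parser: pull a few typed values out of a header dict.
# Header names here are illustrative placeholders only.
def parse_headers_sketch(headers):
    h = {k.lower(): v for k, v in headers.items()}
    return {
        "scanned": h.get("x-lockllm-scanned") == "true",
        "cache_status": h.get("x-lockllm-cache-status"),
        "tokens_saved": int(h["x-lockllm-tokens-saved"]) if "x-lockllm-tokens-saved" in h else None,
    }

meta = parse_headers_sketch({
    "X-LockLLM-Scanned": "true",
    "X-LockLLM-Cache-Status": "HIT",
    "X-LockLLM-Tokens-Saved": "412",
})
```

The real parse_proxy_metadata() does this for every field of ProxyResponseMetadata, including decoding the base64 detail fields.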
Link to section: FunctionsFunctions
Link to section: Wrapper Functions (Sync)Wrapper Functions (Sync)
def create_client(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> OpenAI
def create_openai_compatible(api_key: str, base_url: str, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> OpenAI
def create_openai(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> OpenAI
def create_anthropic(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> Anthropic
def create_groq(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> OpenAI
def create_deepseek(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> OpenAI
def create_perplexity(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> OpenAI
def create_mistral(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> OpenAI
def create_openrouter(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> OpenAI
def create_together(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> OpenAI
def create_xai(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> OpenAI
def create_fireworks(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> OpenAI
def create_anyscale(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> OpenAI
def create_huggingface(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> OpenAI
def create_gemini(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> OpenAI
def create_cohere(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> OpenAI
def create_azure(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> OpenAI
def create_bedrock(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> OpenAI
def create_vertex_ai(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> OpenAI
Link to section: Wrapper Functions (Async)Wrapper Functions (Async)
def create_async_client(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> AsyncOpenAI
def create_async_openai_compatible(api_key: str, base_url: str, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> AsyncOpenAI
def create_async_openai(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> AsyncOpenAI
def create_async_anthropic(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> AsyncAnthropic
def create_async_groq(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> AsyncOpenAI
def create_async_deepseek(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> AsyncOpenAI
def create_async_perplexity(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> AsyncOpenAI
def create_async_mistral(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> AsyncOpenAI
def create_async_openrouter(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> AsyncOpenAI
def create_async_together(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> AsyncOpenAI
def create_async_xai(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> AsyncOpenAI
def create_async_fireworks(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> AsyncOpenAI
def create_async_anyscale(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> AsyncOpenAI
def create_async_huggingface(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> AsyncOpenAI
def create_async_gemini(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> AsyncOpenAI
def create_async_cohere(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> AsyncOpenAI
def create_async_azure(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> AsyncOpenAI
def create_async_bedrock(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> AsyncOpenAI
def create_async_vertex_ai(api_key: str, base_url: Optional[str] = None, proxy_options: Optional[ProxyOptions] = None, **kwargs) -> AsyncOpenAI
Link to section: Utility FunctionsUtility Functions
def get_proxy_url(provider: ProviderName) -> str
# Get proxy URL for a specific provider
# Example: get_proxy_url('openai') -> 'https://api.lockllm.com/v1/proxy/openai'
def get_all_proxy_urls() -> Dict[ProviderName, str]
# Get all provider proxy URLs as a dict
def get_universal_proxy_url() -> str
# Get universal proxy URL for non-BYOK users
# Returns: 'https://api.lockllm.com/v1/proxy'
def build_lockllm_headers(options: ProxyOptions) -> Dict[str, str]
# Convert ProxyOptions to X-LockLLM-* HTTP headers
# Useful when making raw HTTP requests to the proxy
def parse_proxy_metadata(headers: Dict[str, str]) -> ProxyResponseMetadata
# Parse response headers into typed metadata object
def decode_detail_field(detail: str) -> Optional[Any]
# Decode base64-encoded detail fields from proxy response headers
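build_lockllm_headers is useful when you bypass the wrappers and call the proxy with a raw HTTP client. As a sketch of the shape it produces, assuming a simple option-name-to-header-name mapping (the exact mapping is the SDK's, not what this toy shows):

```python
def build_headers_sketch(options):
    """Toy version: turn snake_case option keys into X-LockLLM-* headers."""
    headers = {}
    for key, value in options.items():
        header = "X-LockLLM-" + key.replace("_", "-").title()
        # Booleans become lowercase strings; everything else is stringified
        headers[header] = str(value).lower() if isinstance(value, bool) else str(value)
    return headers

headers = build_headers_sketch({"scan_action": "block", "cache_response": True})
```

In real code, pass a ProxyOptions instance to build_lockllm_headers and merge the result into your request headers.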
Link to section: Exception ClassesException Classes
class LockLLMError(Exception):
message: str
type: str
code: Optional[str]
status: Optional[int]
request_id: Optional[str]
class AuthenticationError(LockLLMError): pass # 401
class RateLimitError(LockLLMError): # 429
retry_after: Optional[int] # milliseconds
class PromptInjectionError(LockLLMError): # 400
scan_result: ScanResult
class PolicyViolationError(LockLLMError): # 403
violated_policies: Optional[List[Dict[str, Any]]]
class AbuseDetectedError(LockLLMError): # 400
abuse_details: Optional[Dict[str, Any]]
class PIIDetectedError(LockLLMError): # 403
entity_types: List[str] # PII entity types detected
entity_count: int # Number of PII entities found
class InsufficientCreditsError(LockLLMError): # 402
current_balance: Optional[float]
estimated_cost: Optional[float]
class UpstreamError(LockLLMError): # 502
provider: Optional[str]
upstream_status: Optional[int]
class ConfigurationError(LockLLMError): pass # 400
class NetworkError(LockLLMError): # 0
cause: Optional[Exception]
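Because every exception inherits from LockLLMError, catch specific types first and the base class last. A self-contained sketch of the pattern, using stand-in classes that mirror the hierarchy above (not the SDK's actual imports):

```python
# Stand-ins mirroring the SDK hierarchy, for illustration only.
class LockLLMError(Exception):
    pass

class RateLimitError(LockLLMError):
    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        self.retry_after = retry_after  # milliseconds

def handle(call):
    try:
        return call()
    except RateLimitError as e:
        return f"retry in {e.retry_after} ms"  # specific type first
    except LockLLMError as e:
        return f"blocked: {e}"                 # base class last

def raises_rate_limit():
    raise RateLimitError(retry_after=1500)

outcome = handle(raises_rate_limit)
```

With the real SDK, import the exception classes from lockllm and apply the same catch ordering.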
Link to section: Best PracticesBest Practices
Link to section: SecuritySecurity
- Never hardcode API keys - Use environment variables or secret managers
- Log security incidents - Track blocked requests with request IDs
- Set appropriate sensitivity - Balance security needs with false positives
- Use block mode for critical paths - Set scan_action="block" and policy_action="block" for sensitive operations
- Handle errors gracefully - Provide user-friendly error messages
- Monitor request patterns - Watch for attack trends in dashboard
- Rotate keys regularly - Update API keys periodically
- Use HTTPS only - Never send API keys over unencrypted connections
Link to section: PerformancePerformance
- Use wrapper functions - Most efficient integration method
- Use async for I/O-bound workloads - Better concurrency with AsyncLockLLM
- Use response caching - Enable response caching via ProxyOptions(cache_response=True) to reduce costs and latency for repeated queries
- Set reasonable timeouts - Balance user experience with reliability
- Connection pooling - The SDK handles this automatically
- Batch when possible - Group similar requests with asyncio.gather()
Link to section: Async ProgrammingAsync Programming
- Use async context managers - Ensure proper resource cleanup
- Avoid blocking calls in async - Don't mix sync and async code
- Handle event loop properly - Use asyncio.run() or existing loop
- Be cautious with global state - Async can expose race conditions
- Use asyncio.gather() for concurrency - Process multiple requests in parallel
Link to section: Production DeploymentProduction Deployment
- Test sensitivity levels - Validate with real user data
- Implement monitoring - Track blocked requests and false positives
- Set up alerting - Get notified of security incidents
- Review logs regularly - Analyze attack patterns
- Keep SDK updated - Benefit from latest improvements (pip install -U lockllm)
- Document incidents - Maintain security incident log
- Load test - Verify performance under expected load
Link to section: Migration GuidesMigration Guides
Link to section: From Direct API IntegrationFrom Direct API Integration
If you're currently calling LLM APIs directly:
# Before: Direct OpenAI API call
from openai import OpenAI
import os
openai = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
response = openai.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": user_input}]
)
# After: With LockLLM security (one line change)
from lockllm import create_openai
import os
openai = create_openai(api_key=os.getenv("LOCKLLM_API_KEY")) # Only change
# Everything else stays the same
response = openai.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": user_input}]
)
Link to section: From OpenAI LibraryFrom OpenAI Library
Minimal changes required:
# Before
from openai import OpenAI
import os
openai = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
# After
from lockllm import create_openai
openai = create_openai(api_key=os.getenv("LOCKLLM_API_KEY")) # Use LockLLM key
# All other code remains unchanged
Link to section: From Anthropic LibraryFrom Anthropic Library
Minimal changes required:
# Before
from anthropic import Anthropic
import os
anthropic = Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))
# After
from lockllm import create_anthropic
anthropic = create_anthropic(api_key=os.getenv("LOCKLLM_API_KEY")) # Use LockLLM key
# All other code remains unchanged
Link to section: Async Programming GuideAsync Programming Guide
Link to section: Event Loop ManagementEvent Loop Management
import asyncio
from lockllm import AsyncLockLLM
# Method 1: Using asyncio.run() (recommended for scripts)
async def main():
lockllm = AsyncLockLLM(api_key="...")
result = await lockllm.scan(input="test")
print(result.safe)
asyncio.run(main())
# Method 2: Using an existing event loop (for frameworks)
# Note: asyncio.get_event_loop() is deprecated outside a running loop
# in Python 3.10+; prefer asyncio.run() where possible
lockllm = AsyncLockLLM(api_key="...")
loop = asyncio.get_event_loop()
result = loop.run_until_complete(
    lockllm.scan(input="test")
)
# Method 3: In Jupyter notebooks
await lockllm.scan(input="test") # Works directly
Link to section: Concurrency PatternsConcurrency Patterns
import asyncio
from lockllm import AsyncLockLLM
async def concurrent_scans():
lockllm = AsyncLockLLM(api_key="...")
# Scan multiple prompts concurrently
prompts = ["prompt1", "prompt2", "prompt3"]
# Method 1: asyncio.gather()
results = await asyncio.gather(*[
lockllm.scan(input=prompt)
for prompt in prompts
])
# Method 2: asyncio.create_task()
tasks = [
asyncio.create_task(lockllm.scan(input=prompt))
for prompt in prompts
]
results = await asyncio.gather(*tasks)
return results
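When scanning large batches, an unbounded asyncio.gather() can burst past rate limits. Bound the number of in-flight scans with a semaphore; this sketch uses fake_scan as a stand-in for lockllm.scan so it is self-contained:

```python
import asyncio

async def fake_scan(prompt):
    await asyncio.sleep(0)  # stand-in for: await lockllm.scan(input=prompt)
    return f"scanned:{prompt}"

async def scan_all(prompts, limit=5):
    sem = asyncio.Semaphore(limit)  # at most `limit` scans in flight

    async def bounded(prompt):
        async with sem:
            return await fake_scan(prompt)

    return await asyncio.gather(*(bounded(p) for p in prompts))

results = asyncio.run(scan_all([f"p{i}" for i in range(10)]))
```

Swap fake_scan for the real client call and tune `limit` to your tier's rate limit.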
Link to section: Resource CleanupResource Cleanup
from lockllm import AsyncLockLLM
import asyncio
# Always use async context managers
async def main():
async with AsyncLockLLM(api_key="...") as client:
result = await client.scan(input="test")
# Client is automatically closed when exiting the block
return result
asyncio.run(main())
Link to section: TroubleshootingTroubleshooting
Link to section: Common IssuesCommon Issues
"Invalid API key" error (401)
- Verify your LockLLM API key is correct
- Check the key hasn't been revoked in the dashboard
- Ensure you're using your LockLLM key, not your provider key
"No provider API key configured" error (400)
- Add your provider API key (OpenAI, Anthropic, etc.) in the dashboard
- Navigate to Proxy Settings and configure provider keys
- Ensure the provider key is enabled (toggle switch on)
"Could not extract prompt from request" error (400)
- Verify request body format matches provider API spec
- Check you're using the correct SDK version
- Ensure messages array is properly formatted
"Custom policy violation" error (403)
- Check your custom policies in the dashboard
- If using policy_action="block", violations will raise PolicyViolationError
- Use scan_mode="combined" or scan_mode="policy_only" to enable policy checks
"Insufficient credits" error (402)
- Check your credit balance in the dashboard billing page
- Top up credits at https://www.lockllm.com/billing
- The error includes current_balance and estimated_cost for reference
"Abuse detected" error (400)
- Your request matched abuse patterns (bot content, repetition, resource exhaustion)
- Review the abuse_details in the error for specifics
- Abuse detection is opt-in - remove the abuse_action parameter to disable it
"PII detected" error (403)
- Your input contains personally identifiable information and pii_action is set to "block"
- Review the entity_types and entity_count in the error for specifics
- Use pii_action="strip" to automatically redact PII instead of blocking
- Use pii_action="allow_with_warning" to allow the request while flagging PII
- PII detection is opt-in - remove the pii_action parameter to disable it
High latency
- Check your network connection
- Verify LockLLM API status
- Consider adjusting timeout settings
- Review provider API latency
mypy errors
- Ensure Python 3.8+ is installed
- Check peer dependencies are installed (openai, anthropic)
- Run pip install types-requests
- Verify SDK is installed: pip show lockllm
Async event loop issues
- Don't mix sync and async code
- Use asyncio.run() for scripts
- Use existing event loop in frameworks
- Close resources properly with async context managers
Link to section: Debugging TipsDebugging Tips
Enable detailed logging:
import logging
import os
from lockllm import LockLLM
# Enable debug logging
logging.basicConfig(level=logging.DEBUG)
lockllm = LockLLM(api_key=os.getenv("LOCKLLM_API_KEY"))
try:
result = lockllm.scan(input=user_prompt)
print(f"Scan result: safe={result.safe}, injection={result.injection}%, request_id={result.request_id}")
except Exception as error:
print(f"Error: {type(error).__name__}: {error}")
if hasattr(error, 'request_id'):
print(f"Request ID: {error.request_id}")
Link to section: Getting HelpGetting Help
- Documentation: https://www.lockllm.com/docs
- GitHub Issues: https://github.com/lockllm/lockllm-pip/issues
- Email Support: [email protected]
- Status Page: https://status.lockllm.com
Link to section: FAQFAQ
Link to section: How do I install the SDK?How do I install the SDK?
Install using pip, poetry, or pipenv:
pip install lockllm
poetry add lockllm
pipenv install lockllm
The SDK requires Python 3.8+ and works with Python 3.8, 3.9, 3.10, 3.11, and 3.12.
Link to section: Does the SDK work as a drop-in replacement for OpenAI and Anthropic?Does the SDK work as a drop-in replacement for OpenAI and Anthropic?
Yes. Use create_openai() or create_anthropic() to get wrapped clients that work exactly like the official SDKs. All methods, streaming, and function calling are supported. Prompts are automatically scanned before being sent to the provider.
Link to section: What's the difference between sync and async?What's the difference between sync and async?
The SDK provides both synchronous (LockLLM, create_openai) and asynchronous (AsyncLockLLM, create_async_openai) APIs. Use async for I/O-bound workloads and better concurrency. Use sync for simple scripts and synchronous applications.
Link to section: What type hint support is available?What type hint support is available?
The SDK includes full type hints with a py.typed marker for mypy. It supports mypy strict mode, provides IDE autocomplete, and includes type stubs for all APIs. Python 3.8+ type hints are used throughout.
Link to section: Which AI providers are supported?Which AI providers are supported?
17+ providers are supported with both sync and async variants: OpenAI, Anthropic, Groq, DeepSeek, Perplexity, Mistral, OpenRouter, Together AI, xAI (Grok), Fireworks AI, Anyscale, Hugging Face, Google Gemini, Cohere, Azure OpenAI, AWS Bedrock, Google Vertex AI. All providers support custom endpoint URLs for self-hosted and private deployments.
Link to section: How do I handle errors?How do I handle errors?
The SDK provides 11 typed exception classes: AuthenticationError, RateLimitError, PromptInjectionError, PolicyViolationError, AbuseDetectedError, PIIDetectedError, InsufficientCreditsError, UpstreamError, ConfigurationError, NetworkError, and base LockLLMError. Use try-except blocks with specific exception types for proper error handling.
Link to section: Does the SDK support streaming?Does the SDK support streaming?
Yes. All provider wrappers fully support streaming responses in both sync and async modes. Use stream=True in your requests and iterate with for (sync) or async for (async) loops.
Link to section: What are scan modes?What are scan modes?
Scan modes control which security checks are performed on your requests. Use "normal" for core injection detection only, "policy_only" for custom content policies only, or "combined" (default) for maximum protection with both. Set the mode via scan_mode parameter on the scan API or via ProxyOptions for wrapper functions.
Link to section: How does smart routing work?How does smart routing work?
Smart routing automatically selects the best AI model based on task type and complexity to optimize cost and quality. Enable it with route_action="auto" in ProxyOptions. You can also configure custom routing rules in the dashboard and use route_action="custom". Routing is available only in proxy mode.
Link to section: Can I cache LLM responses?Can I cache LLM responses?
Yes. Response caching is enabled by default in proxy mode to reduce costs and latency for repeated queries. Use ProxyOptions(cache_response=False) to disable it, or set a custom TTL with cache_ttl (max 86400 seconds / 24 hours). Streaming requests are not cached.
Link to section: How does PII detection work?How does PII detection work?
PII (Personally Identifiable Information) detection is an opt-in feature that scans prompts for sensitive data such as names, email addresses, phone numbers, Social Security numbers, credit card numbers, and other personal information. Enable it by setting pii_action to "strip" (replaces PII with placeholders), "block" (blocks the request), or "allow_with_warning" (allows with PII metadata in response). PII detection works in both the scan API and proxy mode. See PII Detection for details and supported entity types.
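For intuition, "strip" replaces each detected entity with a typed placeholder. The toy regex below only illustrates the shape of the output; the SDK's detection is model-based and covers far more entity types than these two hypothetical patterns:

```python
import re

def redact_sketch(text):
    # Toy patterns for illustration; the real detector is far more comprehensive
    text = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "[EMAIL]", text)
    text = re.sub(r"\b\d{3}-\d{2}-\d{4}\b", "[SSN]", text)
    return text

redacted = redact_sketch("Contact jane@example.com, SSN 123-45-6789")
```

With pii_action="strip", the SDK returns similarly redacted text in redacted_input while preserving the rest of the prompt.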
Link to section: Is there a free tier?Is there a free tier?
Yes. LockLLM offers a free tier with scanning included and a rate limit of 300 requests per minute. Higher tiers with increased rate limits and free monthly credits are available based on usage. See pricing for details.
Link to section: GitHub RepositoryGitHub Repository
View the source code, report issues, and contribute:
Repository: https://github.com/lockllm/lockllm-pip
PyPI Package: https://pypi.org/project/lockllm/
Link to section: Next StepsNext Steps
- View Python SDK integration page for more examples
- Read API Reference for REST API details
- Explore Best Practices for production deployments
- Check out Proxy Mode for alternative integration
- Configure Webhooks for security alerts
- Browse Dashboard documentation