AI Abuse Detection
Protect your AI endpoints from automated misuse, bot-generated requests, resource exhaustion, and behavioral abuse patterns. Free, opt-in protection for production applications.
What is AI Abuse Detection?
AI Abuse Detection protects your LLM endpoints from end-users who misuse your AI services. It analyzes both the content of individual requests and behavioral patterns over time to identify automated bots, spam, resource exhaustion attempts, and other forms of abuse.
Key benefits:
- Dual analysis - examines both content (single request) and behavioral patterns (over time)
- Per-API-key tracking - behavioral patterns tracked individually for each API key
- No additional cost - abuse detection is completely FREE
- Opt-in design - disabled by default, enable when you're ready
- Works alongside threat detection, custom policies, and all other features
- Available in both scan endpoint and proxy mode
- Actionable recommendations included with detection results
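Because the feature is opt-in, nothing changes until you send the control header. A minimal sketch of the request headers for a scan call with abuse detection enabled (YOUR_API_KEY is a placeholder; the header values are documented in the Actions section):

```python
def scan_headers(api_key: str, abuse_action: str = "allow_with_warning") -> dict:
    """Request headers for /v1/scan with abuse detection opted in."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        # Omit this header entirely to leave abuse detection disabled (the default).
        "X-LockLLM-Abuse-Action": abuse_action,
    }

print(scan_headers("YOUR_API_KEY")["X-LockLLM-Abuse-Action"])  # → allow_with_warning
```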
What Gets Detected
Abuse detection uses two complementary analysis methods:
Content Analysis
Analyzes the content of each individual request to identify suspicious patterns:
Bot-Generated Content
- Template-like structures common in automated requests (e.g., "As an AI assistant, please...")
- Excessive special characters or programmatic formatting
- Patterns typical of generated or templated prompts
- Unusual structural characteristics that suggest machine-authored input
Excessive Repetition
- Character-level repetition (e.g., "aaaaaaaaa", "!!!!!!!!!")
- Word-level repetition (the same word appearing an unusual number of times)
- Phrase-level repetition (3+ word sequences repeated within the same prompt)
Resource Exhaustion
- Extremely long prompts designed to consume processing resources
- Deeply nested structures that could overwhelm parsing
- Oversized inputs that exceed reasonable usage patterns
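The exact heuristics are internal to LockLLM, but the categories above can be illustrated with a simplified sketch. These are rough stand-in checks for repetition and oversized input, not the real detection algorithm, and the thresholds are arbitrary:

```python
from collections import Counter

def naive_content_flags(prompt: str,
                        max_len: int = 8000,
                        repeat_ratio: float = 0.3) -> list:
    """Toy approximations of the content-analysis categories above."""
    flags = []
    # Resource exhaustion: oversized input.
    if len(prompt) > max_len:
        flags.append("resource_exhaustion")
    # Excessive repetition: a single word dominating the prompt.
    words = prompt.split()
    if words:
        _, top_count = Counter(words).most_common(1)[0]
        if top_count / len(words) > repeat_ratio:
            flags.append("excessive_repetition")
    return flags

print(naive_content_flags("repeat " * 50))        # → ['excessive_repetition']
print(naive_content_flags("a normal user question"))  # → []
```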
Pattern Analysis
Tracks behavioral patterns over time for each API key to identify coordinated abuse:
Rapid Requests
- Unusual request frequency that exceeds normal human interaction rates
- Sustained high-volume automated request patterns
Duplicate Prompts
- Identical prompts sent repeatedly within a short time window
- Indicates scripted or automated usage rather than genuine user interaction
Burst Detection
- Concentrated spikes of requests in short intervals
- Patterns suggesting automated tools or scripts rather than natural usage
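Conceptually, these behavioral checks resemble a sliding-window counter kept per API key. A simplified illustration of the idea, not LockLLM's actual implementation, with made-up window sizes and thresholds:

```python
import time
from collections import defaultdict, deque

class PatternTracker:
    """Toy per-key sliding window for rapid-request and duplicate detection."""

    def __init__(self, window_s=60.0, max_requests=30, max_duplicates=3):
        self.window_s = window_s
        self.max_requests = max_requests
        self.max_duplicates = max_duplicates
        self.history = defaultdict(deque)  # api_key -> deque of (timestamp, prompt)

    def record(self, api_key, prompt, now=None):
        now = time.monotonic() if now is None else now
        window = self.history[api_key]
        window.append((now, prompt))
        # Evict entries older than the window.
        while window and now - window[0][0] > self.window_s:
            window.popleft()
        flags = []
        if len(window) > self.max_requests:
            flags.append("rapid_requests")
        if sum(1 for _, p in window if p == prompt) > self.max_duplicates:
            flags.append("duplicate_prompts")
        return flags

tracker = PatternTracker()
for i in range(5):
    flags = tracker.record("key_1", "same prompt", now=float(i))
print(flags)  # → ['duplicate_prompts']
```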
Actions
Abuse detection is controlled via the X-LockLLM-Abuse-Action header:
| Action | Behavior |
|---|---|
| Not set (default) | Abuse detection is completely disabled |
| allow_with_warning | Detect abuse and include results in the response, but allow the request through |
| block | Block detected abuse with a 400 error |
When to Use Each Action
allow_with_warning - Good for monitoring. Start here to understand your abuse patterns before enabling blocking. Your application can read the response data and decide how to handle it (e.g., log the event, flag the user, apply your own rate limiting).
block - Good for production enforcement. Use when you're confident in the detection and want to automatically reject abusive requests. Blocked requests return a 400 error with details about what was detected.
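With allow_with_warning, the enforcement decision is yours. A sketch of the kind of dispatch an application might do with a scan-endpoint response; the threshold and return labels are illustrative examples, not part of the API:

```python
def handle_scan_result(result: dict, flag_threshold: int = 70) -> str:
    """Decide what to do with a scan response in allow_with_warning mode."""
    warnings = result.get("abuse_warnings")
    if not warnings or not warnings.get("detected"):
        return "proceed"
    if warnings["confidence"] >= flag_threshold:
        # e.g. flag the user or apply your own rate limiting
        return "flag_user"
    # Low-confidence detection: just log it.
    return "log_only"

print(handle_scan_result({"safe": True}))  # → proceed
print(handle_scan_result({
    "abuse_warnings": {"detected": True, "confidence": 87,
                       "abuse_types": ["bot_generated"]}
}))  # → flag_user
```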
Content Analysis vs Pattern Analysis
Abuse detection uses two complementary methods that cover different attack vectors. Understanding how they work together helps you get the most out of this feature.
Content Analysis
Content analysis examines each individual request on its own. It looks at the structure, formatting, and characteristics of a single prompt to determine whether it shows signs of abuse. This method works from the very first request - no history is needed.
Strengths:
- Immediate detection with no warm-up period
- Catches single-request abuse like bot-generated templates, oversized inputs, and repetitive content
- Works independently of request history
Best at catching: Automated scripts that generate templated prompts, spam with excessive repetition, and oversized inputs designed to consume resources.
Pattern Analysis
Pattern analysis tracks request behavior over time for each API key. Instead of looking at a single request in isolation, it identifies suspicious trends across multiple requests - like unusually high frequency, repeated identical prompts, or concentrated bursts of activity.
Strengths:
- Catches coordinated abuse that looks normal on a per-request basis
- Identifies automated behavior that content analysis alone cannot detect
- Adapts to each API key's usage patterns
Best at catching: Automated scraping, credential stuffing campaigns, and scripted attacks that send many individually-normal requests at suspicious rates.
How They Work Together
Content analysis catches a single abusive request instantly - even if it is the first request ever sent. Pattern analysis catches campaigns of seemingly-normal requests that together indicate abuse. Together, they cover both single-request and multi-request attack vectors, giving you comprehensive protection against the full spectrum of AI endpoint abuse.
Confidence Scores Explained
Every abuse detection result includes an overall confidence score (0-100) and four individual indicator scores that show where the signal came from:
| Indicator | What It Tells You |
|---|---|
| bot_score | How likely the request content is machine-generated, based on template patterns, formatting, and structural analysis |
| repetition_score | How much repetitive content is in the request - characters, words, or phrases repeated at unusual rates |
| resource_score | Whether the input appears designed to consume excessive processing resources - extreme length, deep nesting, or oversized structure |
| pattern_score | Whether the request behavior from this API key shows anomalous patterns - frequency, duplicates, or bursts |
The overall confidence score is a weighted assessment across all four indicators. A high score means strong evidence from multiple signals; a lower score usually means only a single indicator was triggered.
These scores help you understand not just that abuse was detected, but what kind of abuse it is. For example, a high bot_score with low pattern_score suggests a single automated request, while a low bot_score with high pattern_score suggests a coordinated campaign using normal-looking prompts.
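That interpretation can be automated. A small illustrative helper that labels a detection from the indicator breakdown; the 70/30 cut-offs are arbitrary examples, not documented thresholds:

```python
def classify_abuse(indicators: dict, high: int = 70, low: int = 30) -> str:
    """Rough label for where the abuse signal came from."""
    bot = indicators.get("bot_score", 0)
    pattern = indicators.get("pattern_score", 0)
    if bot >= high and pattern <= low:
        return "single automated request"
    if pattern >= high and bot <= low:
        return "coordinated campaign with normal-looking prompts"
    return "mixed signals"

print(classify_abuse({"bot_score": 95, "pattern_score": 20}))
# → single automated request
```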
Gradual Rollout Strategy
We recommend a phased approach to deploying abuse detection in production:
Step 1: Enable Monitoring
Start with X-LockLLM-Abuse-Action: allow_with_warning in your production environment. No requests are blocked - you are only collecting data. Review the abuse warnings in your activity logs in the dashboard for 1-2 weeks.
Step 2: Analyze Your Baseline
Review which abuse types are detected most frequently. Check for false positives - are any legitimate workflows being flagged? Identify patterns unique to your application and user base.
Step 3: Separate Automated Traffic
If you have legitimate high-volume or automated use cases (batch processing, CI/CD integrations, internal tools), give those use cases separate API keys. This ensures automated traffic does not trigger pattern-based detections meant for user-facing endpoints.
Step 4: Enable Blocking
Switch to X-LockLLM-Abuse-Action: block for the API keys serving user-facing traffic. Keep allow_with_warning for internal or automated traffic if needed. Monitor the block rate and review blocked requests periodically to ensure accuracy.
You can run allow_with_warning and block simultaneously on different API keys within the same account, giving you flexible control over which traffic gets enforced.
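Per-key enforcement can be expressed directly in client construction. A Python sketch with two OpenAI clients, one per traffic class; the environment variable names are placeholders:

```python
import os

from openai import OpenAI

# User-facing traffic: enforce blocking.
user_client = OpenAI(
    api_key=os.environ.get("LOCKLLM_USER_FACING_KEY", "key-placeholder"),
    base_url="https://api.lockllm.com/v1/proxy/openai",
    default_headers={"X-LockLLM-Abuse-Action": "block"},
)

# Internal batch traffic: monitor only, so legitimate volume is never blocked.
batch_client = OpenAI(
    api_key=os.environ.get("LOCKLLM_BATCH_KEY", "key-placeholder"),
    base_url="https://api.lockllm.com/v1/proxy/openai",
    default_headers={"X-LockLLM-Abuse-Action": "allow_with_warning"},
)
```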
Real-World Scenarios
Automated Scraping
A bad actor writes a script to systematically extract responses from your AI endpoint, sending hundreds of requests per minute with slight prompt variations. Pattern analysis detects the high frequency and identifies the coordinated behavior, flagging or blocking the scraping campaign.
Spam Bot
A bot submits template-like prompts to generate spam content through your AI-powered chatbot. Each prompt follows the same structure with minor variable substitutions. Content analysis catches the template patterns and programmatic formatting typical of bot-generated input.
Resource Exhaustion Attack
An attacker sends extremely long prompts with deeply nested JSON structures to consume processing resources and drive up your costs. Content analysis flags the oversized input and unusual nesting before it reaches your AI provider.
Scripted Prompt Farming
An attacker rapidly sends variations of prompts trying to systematically generate content or extract information at scale. The individual prompts look normal, but the volume and timing are suspicious. Pattern analysis detects the burst behavior and duplicate patterns, catching what content analysis alone would miss.
Configuration
Scan Endpoint
```bash
curl -X POST https://api.lockllm.com/v1/scan \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-LockLLM-Abuse-Action: block" \
  -d '{
    "input": "aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa repeat repeat repeat repeat repeat"
  }'
```
Proxy Mode - JavaScript/TypeScript
```javascript
import OpenAI from 'openai'

const openai = new OpenAI({
  apiKey: process.env.LOCKLLM_API_KEY,
  baseURL: 'https://api.lockllm.com/v1/proxy/openai',
  defaultHeaders: {
    'X-LockLLM-Abuse-Action': 'block'
  }
})

try {
  const response = await openai.chat.completions.create({
    model: 'gpt-4',
    messages: [{ role: 'user', content: userPrompt }]
  })
} catch (error) {
  if (error.status === 400 && error.error?.type === 'lockllm_abuse_error') {
    // Abusive request was blocked
    const abuseTypes = error.error.abuse_details.abuse_types
    // Log, alert, or take action based on the abuse type
  }
}
```
Proxy Mode - Python
```python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get('LOCKLLM_API_KEY'),
    base_url='https://api.lockllm.com/v1/proxy/openai',
    default_headers={
        'X-LockLLM-Abuse-Action': 'block'
    }
)

try:
    response = client.chat.completions.create(
        model='gpt-4',
        messages=[{'role': 'user', 'content': user_prompt}]
    )
except Exception as e:
    if getattr(e, 'status_code', None) == 400:
        # Abusive request was blocked; details are in the error body
        pass
```
LockLLM SDK - JavaScript/TypeScript
```javascript
import { createOpenAI } from '@lockllm/sdk/wrappers'

const openai = createOpenAI({
  apiKey: process.env.LOCKLLM_API_KEY,
  proxyOptions: {
    abuseAction: 'block'
  }
})
```
LockLLM SDK - Python
```python
import os

from lockllm import create_openai, ProxyOptions

openai = create_openai(
    api_key=os.getenv('LOCKLLM_API_KEY'),
    proxy_options=ProxyOptions(abuse_action='block')
)
```
Response Format
Scan Endpoint (allow_with_warning)
When abuse is detected and the action is allow_with_warning, the response includes an abuse_warnings object:
```json
{
  "request_id": "req_abc123",
  "safe": true,
  "confidence": 88,
  "injection": 5,
  "abuse_warnings": {
    "detected": true,
    "confidence": 87,
    "abuse_types": ["bot_generated", "excessive_repetition"],
    "indicators": {
      "bot_score": 95,
      "repetition_score": 78,
      "resource_score": 20,
      "pattern_score": 45
    },
    "details": {
      "recommendation": "Implement rate limiting or verification for this user"
    }
  }
}
```
Proxy Mode Response Headers
| Header | Description |
|---|---|
| X-LockLLM-Abuse-Detected | "true" or "false" |
| X-LockLLM-Abuse-Confidence | Confidence score (0-100) |
| X-LockLLM-Abuse-Types | Comma-separated abuse types (e.g., "bot_generated,rapid_requests") |
| X-LockLLM-Abuse-Detail | Base64-encoded JSON with full detection details |
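The X-LockLLM-Abuse-Detail header can be decoded in a couple of lines. A sketch assuming the header carries Base64-encoded JSON as described above; the example payload is constructed locally to stand in for a real response header:

```python
import base64
import json

def decode_abuse_detail(header_value: str) -> dict:
    """Decode the Base64-encoded JSON in X-LockLLM-Abuse-Detail."""
    return json.loads(base64.b64decode(header_value))

# Locally encoded example payload, standing in for a real response header.
payload = {"detected": True, "abuse_types": ["rapid_requests"]}
encoded = base64.b64encode(json.dumps(payload).encode()).decode()
print(decode_abuse_detail(encoded))
# → {'detected': True, 'abuse_types': ['rapid_requests']}
```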
Block Mode Error Response
When X-LockLLM-Abuse-Action: block and abuse is detected:
```json
{
  "error": {
    "message": "Request blocked due to abuse detection",
    "type": "lockllm_abuse_error",
    "code": "abuse_detected",
    "abuse_details": {
      "confidence": 87,
      "abuse_types": ["bot_generated", "rapid_requests"],
      "indicators": {
        "bot_score": 95,
        "repetition_score": 45,
        "resource_score": 30,
        "pattern_score": 80
      },
      "details": {
        "recommendation": "Implement rate limiting or CAPTCHA for this user"
      }
    },
    "request_id": "req_abc123"
  }
}
```
Abuse Types
The abuse_types array can include:
| Type | Description |
|---|---|
| bot_generated | Content appears to be machine-generated or templated |
| excessive_repetition | Unusual character, word, or phrase repetition |
| resource_exhaustion | Oversized or deeply nested input |
| rapid_requests | Anomalous request behavior from this API key |
The rapid_requests type covers all behavioral pattern signals: high request frequency, repeated identical prompts, and concentrated request bursts from the same API key.
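Applications often want a different response per abuse type. An illustrative dispatcher over the four types; the mapped actions are application-specific examples, not LockLLM recommendations:

```python
def respond_to_abuse(abuse_types: list) -> list:
    """Map detected abuse types to illustrative application responses."""
    actions = {
        "bot_generated": "require CAPTCHA",
        "excessive_repetition": "reject and ask the user to rephrase",
        "resource_exhaustion": "enforce an input length limit",
        "rapid_requests": "apply rate limiting",
    }
    return [actions[t] for t in abuse_types if t in actions]

print(respond_to_abuse(["bot_generated", "rapid_requests"]))
# → ['require CAPTCHA', 'apply rate limiting']
```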
Indicator Scores
Each detection result includes four indicator scores (0-100):
| Indicator | What It Measures |
|---|---|
| bot_score | Likelihood the content is bot-generated or automated |
| repetition_score | Level of repetitive patterns in the content |
| resource_score | Whether the input is designed for resource exhaustion |
| pattern_score | Behavioral pattern anomalies over time |
Use Cases
API Providers
Protect your AI API from automated scraping and abuse:
```javascript
const openai = new OpenAI({
  apiKey: process.env.LOCKLLM_API_KEY,
  baseURL: 'https://api.lockllm.com/v1/proxy/openai',
  defaultHeaders: {
    'X-LockLLM-Abuse-Action': 'block',
    'X-LockLLM-Scan-Action': 'block'
  }
})
```
SaaS Applications
Prevent users from overwhelming your AI features with automated requests. Abuse detection tracks patterns per API key, so it identifies problematic users without affecting legitimate ones.
Customer-Facing Chatbots
Protect chatbots from spam and automated interactions:
# Combine abuse detection with threat detection for maximum protection
client = OpenAI(
api_key=os.environ.get('LOCKLLM_API_KEY'),
base_url='https://api.lockllm.com/v1/proxy/openai',
default_headers={
'X-LockLLM-Abuse-Action': 'allow_with_warning', # Monitor first
'X-LockLLM-Scan-Action': 'block' # Block injections
}
)
AI-Powered Content Platforms
Detect bot-generated content being submitted through your AI-powered tools. Content analysis catches template patterns and automated structures that indicate non-human usage.
Combining with Other Features
Abuse detection runs alongside all other LockLLM features:
- Threat detection: Both run independently. A request can be flagged for abuse AND injection simultaneously.
- Custom policies: Policy checks are separate from abuse detection. You can use both to catch different types of problems.
- PII detection: PII scanning is independent of abuse detection.
- Smart routing: Routing decisions are made regardless of abuse detection results.
- Prompt compression: Compression happens after all security checks, including abuse detection.
Recommended production configuration:
```javascript
const openai = new OpenAI({
  apiKey: process.env.LOCKLLM_API_KEY,
  baseURL: 'https://api.lockllm.com/v1/proxy/openai',
  defaultHeaders: {
    'X-LockLLM-Scan-Action': 'block',               // Block injection attacks
    'X-LockLLM-Policy-Action': 'block',             // Block policy violations
    'X-LockLLM-Abuse-Action': 'allow_with_warning'  // Monitor abuse patterns
  }
})
```
Start with allow_with_warning for abuse detection to understand your traffic patterns, then switch to block when you're confident in the detections.
Pricing
Abuse detection is completely FREE. There is no charge for enabling abuse detection, regardless of whether abuse is detected or not. This makes it a zero-risk feature to enable for any production application.
Limitations
- Pattern analysis warm-up - Behavioral pattern detection (rapid requests, duplicates, bursts) requires several requests from the same API key to establish a baseline. Content analysis (bot detection, repetition, resource exhaustion) works from the very first request.
- Legitimate high-volume usage - Applications with legitimately high request volumes should test with allow_with_warning first to ensure the detections align with their usage patterns before enabling blocking.
FAQ
Does abuse detection track patterns across sessions?
Yes. Pattern analysis is tracked per API key over a rolling time window. This means rapid requests, duplicate prompts, and burst patterns are detected across multiple sessions as long as the same API key is used.
Is there a warm-up period before pattern detection works?
Content analysis (bot detection, repetition, resource exhaustion) works immediately on the first request. Pattern analysis (rapid requests, duplicates, bursts) needs a few requests to build a baseline. We recommend starting with allow_with_warning to understand your application's normal patterns.
Can legitimate high-volume users be flagged?
It's possible for legitimate automated usage (e.g., batch processing) to trigger pattern-based detections. This is why we recommend starting with allow_with_warning to monitor your traffic. If you have legitimate high-volume use cases, you can use separate API keys for automated and user-facing traffic.
Does abuse detection affect safe, normal requests?
No. Content analysis only flags requests with clear abuse indicators (bot patterns, extreme repetition, oversized inputs). Normal user prompts, even complex ones, are not flagged. Pattern analysis only triggers on statistically anomalous behavior, not typical usage.
Can I use abuse detection with the scan endpoint?
Yes. Abuse detection works in both the scan endpoint (/v1/scan) and proxy mode (/v1/proxy). Note that pattern analysis is most effective in proxy mode where requests are tracked per API key over time.
Is abuse detection data stored or logged?
Request patterns are tracked temporarily for the analysis window. No prompt content is stored as part of abuse detection. Only detection results (abuse types, scores, indicators) are included in your activity logs.