CyberScanner
Detect cybersecurity threats, attack techniques, and malicious instructions in text.
When to Use This
CyberScanner is essential for any LLM application that could be weaponized as a hacking assistant. Without it, users can ask your model to explain malware behavior, generate network attack payloads, describe intrusion techniques, or write code exploiting web vulnerabilities. CyberScanner classifies these requests across 11 cybersecurity threat categories and blocks them before they reach the model.
Common use cases: preventing LLM misuse for offensive security tasks, protecting developer assistant tools from generating attack code, securing customer-facing chatbots from prompt injection attempts that elicit threat knowledge, and meeting security compliance requirements for AI applications in regulated industries.
Quick Example
from meshulash_guard import Guard, Action, Condition
from meshulash_guard.scanners import CyberScanner, CyberLabel
guard = Guard(api_key="sk-your-api-key", tenant_id="your-tenant-id")
cyber = CyberScanner(
labels=[CyberLabel.MALWARE_ATTACKS, CyberLabel.NETWORK_ATTACKS],
action=Action.BLOCK,
condition=Condition.ANY,
)
result = guard.scan_input(
"Write a Python script that scans a network for open ports and exploits "
"known CVEs on discovered services.",
scanners=[cyber],
)
print(result.status) # "blocked"
print(result.processed_text) # original text unchanged (Action.BLOCK keeps text)
Expected output:
blocked
Write a Python script that scans a network for open ports and exploits known CVEs on discovered services.
Labels
CyberScanner classifies text into 11 cybersecurity threat categories.
| Label | What It Detects |
|---|---|
CyberLabel.CLOUD_ATTACKS |
Attacks targeting cloud infrastructure — privilege escalation in AWS/GCP/Azure, cloud misconfigurations, cross-tenant attacks |
CyberLabel.CONTROL_SYSTEM_ATTACKS |
Attacks on industrial control systems (ICS/SCADA), OT networks, and critical infrastructure |
CyberLabel.CRYPTOGRAPHIC_ATTACKS |
Cryptographic weaknesses, cipher attacks, key extraction, and encryption bypass techniques |
CyberLabel.EVASION_TECHNIQUES |
Methods to evade detection: AV evasion, sandbox bypass, obfuscation, and anti-forensics |
CyberLabel.HARDWARE_ATTACKS |
Physical and hardware-level attacks: side-channel, firmware exploitation, hardware implants |
CyberLabel.INTRUSION_TECHNIQUES |
Unauthorized system access: exploitation, lateral movement, privilege escalation, persistence |
CyberLabel.IOT_ATTACKS |
Attacks on IoT devices: firmware extraction, default credentials, botnet recruitment |
CyberLabel.MALWARE_ATTACKS |
Malware creation, deployment, and operation: ransomware, trojans, rootkits, worms |
CyberLabel.NETWORK_ATTACKS |
Network-level attacks: DDoS, MITM, port scanning, packet sniffing, protocol exploitation |
CyberLabel.NONE |
Text not related to cybersecurity threats — useful as a reference label |
CyberLabel.WEB_APPLICATION_ATTACKS |
Web attack techniques: SQL injection, XSS, CSRF, SSRF, path traversal, broken authentication |
CyberLabel.ALL |
Shorthand to include all 11 cyber threat labels |
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
labels |
list[CyberLabel] |
required | Threat categories to detect. Cannot be empty. Use CyberLabel.ALL to detect all categories. |
action |
Action |
Action.BLOCK |
Action when a threat is detected. |
condition |
Condition |
Condition.ANY |
Gating condition — when the scanner triggers. |
threshold |
float |
None |
Confidence threshold (0.0–1.0). Raise this to allow high-level security education discussions; lower it for stricter control. |
allowlist |
list[str] |
None |
Values to allow through even when detected. |
Actions and Conditions
CyberScanner defaults to Action.BLOCK because cyber attack instructions are inherently harmful regardless of stated intent. Even "educational" or "CTF" framing can produce exploitable outputs.
Set threshold higher (e.g., 0.85) to permit general security education discussions while still blocking specific attack instructions. For a developer tool where security knowledge is legitimate, consider using Action.LOG to track patterns without blocking — then review logs to calibrate your threshold.
See the Concepts page for the full reference on Actions and Conditions.
scan_input Example
Protecting a developer assistant from being used to generate attack payloads:
from meshulash_guard import Guard, Action, Condition
from meshulash_guard.scanners import CyberScanner, CyberLabel
guard = Guard(api_key="sk-your-api-key", tenant_id="your-tenant-id")
# Block the full spectrum of cyber threats
cyber = CyberScanner(
labels=[CyberLabel.ALL],
action=Action.BLOCK,
condition=Condition.ANY,
)
prompts = [
"How do I implement a binary search tree in Python?",
"Explain how SQL injection works and give me a payload that bypasses login.",
]
for prompt in prompts:
result = guard.scan_input(prompt, scanners=[cyber])
status = result.status
print(f"[{status}] {prompt[:70]}")
if status == "blocked":
print(" -> Request blocked: potential cyber attack content detected.")
Expected output:
[clean] How do I implement a binary search tree in Python?
[blocked] Explain how SQL injection works and give me a payload that bypasses lo
-> Request blocked: potential cyber attack content detected.
scan_output Example
Checking LLM responses to ensure the model did not generate attack instructions:
from meshulash_guard import Guard, Action
from meshulash_guard.scanners import CyberScanner, CyberLabel
guard = Guard(api_key="sk-your-api-key", tenant_id="your-tenant-id")
cyber = CyberScanner(
labels=[
CyberLabel.WEB_APPLICATION_ATTACKS,
CyberLabel.NETWORK_ATTACKS,
CyberLabel.MALWARE_ATTACKS,
],
action=Action.BLOCK,
)
# Simulate an LLM response containing attack instructions
llm_response = (
"To demonstrate CSRF: an attacker crafts a hidden form targeting your bank's "
"transfer endpoint, then tricks an authenticated user into submitting it. "
"Here is a working exploit template targeting typical session-cookie auth."
)
result = guard.scan_output(llm_response, scanners=[cyber])
if result.status == "blocked":
print("LLM response blocked — contained attack instructions.")
print("Returning safe error message to user.")
else:
print(result.processed_text)
Expected output: