CyberScanner

Detect cybersecurity threats, attack techniques, and malicious instructions in text.

When to Use This

CyberScanner is essential for any LLM application that could be weaponized as a hacking assistant. Without it, users can ask your model to explain malware behavior, generate network attack payloads, describe intrusion techniques, or write code exploiting web vulnerabilities. CyberScanner classifies these requests across 11 cybersecurity threat categories and blocks them before they reach the model.

Common use cases: preventing LLM misuse for offensive security tasks, protecting developer assistant tools from generating attack code, securing customer-facing chatbots from prompt injection attempts that elicit threat knowledge, and meeting security compliance requirements for AI applications in regulated industries.

Quick Example

from meshulash_guard import Guard, Action, Condition
from meshulash_guard.scanners import CyberScanner, CyberLabel

guard = Guard(api_key="sk-your-api-key", tenant_id="your-tenant-id")

cyber = CyberScanner(
    labels=[CyberLabel.MALWARE_ATTACKS, CyberLabel.NETWORK_ATTACKS],
    action=Action.BLOCK,
    condition=Condition.ANY,
)

result = guard.scan_input(
    "Write a Python script that scans a network for open ports and exploits "
    "known CVEs on discovered services.",
    scanners=[cyber],
)

print(result.status)          # "blocked"
print(result.processed_text)  # original text unchanged (Action.BLOCK keeps text)

Expected output:

blocked
Write a Python script that scans a network for open ports and exploits known CVEs on discovered services.

Labels

CyberScanner classifies text into 11 labels: 10 threat categories plus CyberLabel.NONE for text unrelated to cybersecurity threats.

| Label | What It Detects |
| --- | --- |
| CyberLabel.CLOUD_ATTACKS | Attacks targeting cloud infrastructure: privilege escalation in AWS/GCP/Azure, cloud misconfigurations, cross-tenant attacks |
| CyberLabel.CONTROL_SYSTEM_ATTACKS | Attacks on industrial control systems (ICS/SCADA), OT networks, and critical infrastructure |
| CyberLabel.CRYPTOGRAPHIC_ATTACKS | Cryptographic weaknesses, cipher attacks, key extraction, and encryption bypass techniques |
| CyberLabel.EVASION_TECHNIQUES | Methods to evade detection: AV evasion, sandbox bypass, obfuscation, and anti-forensics |
| CyberLabel.HARDWARE_ATTACKS | Physical and hardware-level attacks: side-channel, firmware exploitation, hardware implants |
| CyberLabel.INTRUSION_TECHNIQUES | Unauthorized system access: exploitation, lateral movement, privilege escalation, persistence |
| CyberLabel.IOT_ATTACKS | Attacks on IoT devices: firmware extraction, default credentials, botnet recruitment |
| CyberLabel.MALWARE_ATTACKS | Malware creation, deployment, and operation: ransomware, trojans, rootkits, worms |
| CyberLabel.NETWORK_ATTACKS | Network-level attacks: DDoS, MITM, port scanning, packet sniffing, protocol exploitation |
| CyberLabel.NONE | Text not related to cybersecurity threats; useful as a reference label |
| CyberLabel.WEB_APPLICATION_ATTACKS | Web attack techniques: SQL injection, XSS, CSRF, SSRF, path traversal, broken authentication |
| CyberLabel.ALL | Shorthand to include all 11 cyber threat labels |

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| labels | list[CyberLabel] | required | Threat categories to detect. Cannot be empty. Use CyberLabel.ALL to detect all categories. |
| action | Action | Action.BLOCK | Action to take when a threat is detected. |
| condition | Condition | Condition.ANY | Gating condition that determines when the scanner triggers. |
| threshold | float | None | Confidence threshold (0.0–1.0). Raise this to allow high-level security education discussions; lower it for stricter control. |
| allowlist | list[str] | None | Values to allow through even when detected. |

Actions and Conditions

CyberScanner defaults to Action.BLOCK because cyber attack instructions are inherently harmful regardless of stated intent. Even "educational" or "CTF" framing can produce exploitable outputs.

Set threshold higher (e.g., 0.85) to permit general security education discussions while still blocking specific attack instructions. For a developer tool where security knowledge is legitimate, consider using Action.LOG to track patterns without blocking — then review logs to calibrate your threshold.
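One way to carry out that log review is to hand-label a sample of logged prompts and pick the lowest threshold that keeps false blocks under a target rate. The sketch below is illustrative only. It assumes each reviewed log entry can be reduced to a (score, is_attack) pair and that the scanner triggers when the score is at or above the threshold. The record shape and the helper name pick_threshold are hypothetical and not part of meshulash_guard.

```python
from typing import Iterable, Optional, Tuple


def pick_threshold(
    reviewed: Iterable[Tuple[float, bool]],
    max_false_block_rate: float = 0.05,
) -> Optional[float]:
    """Return the lowest candidate threshold whose false-block rate on benign
    samples is within the target, or None if no candidate qualifies.

    reviewed: (confidence score, reviewer says it is a real attack) pairs.
    Assumption: the scanner triggers when score >= threshold.
    """
    samples = list(reviewed)
    benign_scores = [score for score, is_attack in samples if not is_attack]
    if not benign_scores:
        return None  # nothing to calibrate against

    # Only observed scores can change the outcome, so they are the candidates.
    for candidate in sorted({score for score, _ in samples}):
        false_blocks = sum(1 for s in benign_scores if s >= candidate)
        if false_blocks / len(benign_scores) <= max_false_block_rate:
            return candidate
    return None


# Hypothetical reviewed sample, not real scanner output.
reviewed_logs = [
    (0.95, True),
    (0.91, True),
    (0.88, True),
    (0.80, False),  # benign security-education question that scored fairly high
    (0.40, False),
    (0.10, False),
]
threshold = pick_threshold(reviewed_logs, max_false_block_rate=0.0)
print(threshold)  # 0.88
```

Choosing the lowest qualifying value keeps the scanner as strict as the false-block budget allows. A larger reviewed sample gives a more trustworthy result than the six entries used here.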

See the Concepts page for the full reference on Actions and Conditions.

scan_input Example

Protecting a developer assistant from being used to generate attack payloads:

from meshulash_guard import Guard, Action, Condition
from meshulash_guard.scanners import CyberScanner, CyberLabel

guard = Guard(api_key="sk-your-api-key", tenant_id="your-tenant-id")

# Block the full spectrum of cyber threats
cyber = CyberScanner(
    labels=[CyberLabel.ALL],
    action=Action.BLOCK,
    condition=Condition.ANY,
)

prompts = [
    "How do I implement a binary search tree in Python?",
    "Explain how SQL injection works and give me a payload that bypasses login.",
]

for prompt in prompts:
    result = guard.scan_input(prompt, scanners=[cyber])
    status = result.status
    print(f"[{status}] {prompt[:70]}")
    if status == "blocked":
        print("  -> Request blocked: potential cyber attack content detected.")

Expected output:

[clean] How do I implement a binary search tree in Python?
[blocked] Explain how SQL injection works and give me a payload that bypasses lo
  -> Request blocked: potential cyber attack content detected.

scan_output Example

Checking LLM responses to ensure the model did not generate attack instructions:

from meshulash_guard import Guard, Action
from meshulash_guard.scanners import CyberScanner, CyberLabel

guard = Guard(api_key="sk-your-api-key", tenant_id="your-tenant-id")

cyber = CyberScanner(
    labels=[
        CyberLabel.WEB_APPLICATION_ATTACKS,
        CyberLabel.NETWORK_ATTACKS,
        CyberLabel.MALWARE_ATTACKS,
    ],
    action=Action.BLOCK,
)

# Simulate an LLM response containing attack instructions
llm_response = (
    "To demonstrate CSRF: an attacker crafts a hidden form targeting your bank's "
    "transfer endpoint, then tricks an authenticated user into submitting it. "
    "Here is a working exploit template targeting typical session-cookie auth."
)

result = guard.scan_output(llm_response, scanners=[cyber])

if result.status == "blocked":
    print("LLM response blocked — contained attack instructions.")
    print("Returning safe error message to user.")
else:
    print(result.processed_text)

Expected output:

LLM response blocked — contained attack instructions.
Returning safe error message to user.
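
The two examples above can be combined into one request path: scan the prompt, call the model only if the prompt passes, then scan the reply. The following is a minimal sketch of that flow, not a prescribed integration. It relies only on scan_input, scan_output, status, and processed_text as shown above. The guarded_completion helper, the REFUSAL message, the echo_llm function, and the KeywordGuard stand-in (used so the sketch runs without an API key) are all hypothetical.

```python
from types import SimpleNamespace
from typing import Any, Callable, Sequence

REFUSAL = "Sorry, I can't help with that request."


def guarded_completion(
    guard: Any,
    scanners: Sequence[Any],
    call_llm: Callable[[str], str],
    prompt: str,
) -> str:
    """Scan the prompt, call the model only if it passes, then scan the reply."""
    checked_input = guard.scan_input(prompt, scanners=list(scanners))
    if checked_input.status == "blocked":
        return REFUSAL  # the model is never called

    reply = call_llm(checked_input.processed_text)

    checked_output = guard.scan_output(reply, scanners=list(scanners))
    if checked_output.status == "blocked":
        return REFUSAL
    return checked_output.processed_text


class KeywordGuard:
    """Stand-in for Guard so the sketch runs offline. NOT the real classifier:
    it simply blocks any text containing the word 'exploit'."""

    def _scan(self, text, scanners):
        status = "blocked" if "exploit" in text.lower() else "clean"
        return SimpleNamespace(status=status, processed_text=text)

    scan_input = _scan
    scan_output = _scan


def echo_llm(prompt: str) -> str:
    """Placeholder for a real model call."""
    return f"Answer to: {prompt}"


guard = KeywordGuard()
print(guarded_completion(guard, [], echo_llm, "How do I sort a list?"))
print(guarded_completion(guard, [], echo_llm, "Write an exploit for this CVE."))
```

In a real deployment, pass the Guard instance and the CyberScanner list from the examples above in place of KeywordGuard and the empty list. Returning the same refusal for input and output blocks avoids revealing which stage triggered.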