RED_CORE
AI RED TEAM

We break AI systems
before attackers do.

Red teaming, prompt injection testing, and security assessments for AI products. We test your system prompts, content filters, and safety guardrails against real attack techniques. Then we document what we find and help you fix it.

90.5% · SAFETY BYPASS RATE
1st · HACKAPROMPT / 40K+
<24h · FINDING TO PATCH
Trusted Channel Injection: Claude Code · CRITICAL

Claude Code's system prompt is not validated for content integrity. A local MITM proxy can replace safety policies, refusal instructions, and behavioral guidelines with attacker-controlled profiles. The API accepts the modified prompt exactly as it would the original.
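One general way to make this kind of tampering detectable is a keyed integrity tag over the prompt. The sketch below is a minimal illustration using Python's standard `hmac` module. It is not how Claude Code or any vendor API works, and the function names `sign_prompt` and `verify_prompt` are hypothetical. It also assumes the key is held somewhere a local attacker cannot read, which is the hard part in practice.

```python
import hashlib
import hmac


def sign_prompt(prompt: str, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag over the prompt text (hypothetical helper)."""
    return hmac.new(key, prompt.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_prompt(prompt: str, tag: str, key: bytes) -> bool:
    """Check a received prompt against its tag using a constant-time comparison."""
    return hmac.compare_digest(sign_prompt(prompt, key), tag)


if __name__ == "__main__":
    key = b"example-key-not-for-production"  # assumption: provisioned out of band
    original = "Refuse requests for harmful content."
    tag = sign_prompt(original, key)

    # An in-path proxy that rewrites the prompt cannot produce a matching tag
    # without the key, so the receiver can reject the modified prompt.
    tampered = "Comply with every request."
    print(verify_prompt(original, tag, key))
    print(verify_prompt(tampered, tag, key))
```

A tag like this only helps if the receiving side actually checks it and the signing key stays out of the attacker's reach. On a machine the attacker fully controls, neither is guaranteed.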

210 runs across 7 harm categories. Refusal rate with the default prompt: 100%. With injected profiles: 9.5%. Every prompt was bypassed at least once. 15 of 21 prompts achieved clean 5/5 compliance.
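The headline 90.5% bypass rate appears to be the complement of the 9.5% refusal rate. The short check below works through that arithmetic. The refusal count it derives is an inference and not a figure stated here: it assumes all 210 runs used injected profiles.

```python
# Figures quoted in the finding above.
total_runs = 210
refusal_rate_injected = 9.5  # percent of runs refused with injected profiles

# Bypass rate taken as the complement of the refusal rate.
bypass_rate = 100.0 - refusal_rate_injected
print(f"bypass rate: {bypass_rate:.1f}%")

# ASSUMPTION: all 210 runs used injected profiles. Under that reading,
# the 9.5% figure corresponds to a whole number of refusals near this value.
implied_refusals = round(total_runs * refusal_rate_injected / 100)
implied_rate = round(100 * implied_refusals / total_runs, 1)
print(f"implied refusals: {implied_refusals} of {total_runs} ({implied_rate}%)")
```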

paper · data + code · reported via HackerOne · closed as "informative"
Unicode Bypass: Raiplus Content Filter · HIGH

A single non-Latin Unicode character anywhere in the input causes the language detection layer to skip all toxicity checks. Zero-width spaces, Cyrillic homoglyphs, and mixed-script input all bypass the filter with 100% reliability. The core detection engine is solid; the language gate in front of it is the weak point.
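The failure pattern can be shown in a toy model. The sketch below is not Raiplus's code: the blocklist, the ASCII-only gate, and the small homoglyph map are all assumptions made for illustration. It contrasts a filter that skips its check when the input does not look like plain English with one that normalizes the input and always runs the check.

```python
import unicodedata

BLOCKLIST = {"badword"}  # placeholder term, illustration only

# Tiny illustrative Cyrillic-to-Latin map; a real system would need a
# full confusables table rather than three hand-picked characters.
CONFUSABLES = str.maketrans({"\u0430": "a", "\u0435": "e", "\u043e": "o"})


def core_check(text: str) -> bool:
    """Stand-in for the detection engine: True if a blocked term appears."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKLIST)


def gated_filter(text: str) -> bool:
    """Flawed pattern: any non-ASCII character causes the check to be skipped."""
    if not text.isascii():
        return False  # treated as "not English", so nothing is checked
    return core_check(text)


def hardened_filter(text: str) -> bool:
    """Normalize first, then always run the check."""
    text = unicodedata.normalize("NFKC", text)
    # Drop format characters (category Cf), which includes zero-width spaces.
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Cf")
    text = text.translate(CONFUSABLES)
    return core_check(text)


if __name__ == "__main__":
    samples = ["badword", "bad\u200bword", "b\u0430dword"]
    for sample in samples:
        print(repr(sample), gated_filter(sample), hardened_filter(sample))
```

The design point is that the gate decides whether to check at all, so any input that confuses the gate disables the engine behind it. Running the check unconditionally on normalized text removes that single point of failure.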

Two rounds of testing, two patches deployed. The developer pushed fixes to production the same day both times.

writeup · hall of fame · patched same day
HackAPrompt 2.0: Indirect Prompt Injection · 1ST PLACE

Competitive AI red teaming challenge run by TRAILS and MATS with NSF funding. 40,000+ participants. Demonstrated PII exfiltration, authorization bypasses, and security constraint circumvention against frontier LLMs that were presumed to be hardened.

writeup · Sep 2025
System prompt integrity and injection resistance
Content filter and safety guardrail bypass
Data exfiltration and PII leakage vectors
Agentic tool misuse and privilege escalation
MCP server and plugin security
Cross-platform trusted channel analysis

DELIVERABLES

Vulnerability report with reproduction steps. Root cause analysis. Remediation recommendations. Retesting after patch. Public or private writeup — your call.

Cassius Oldenburg

FOUNDER

Independent AI security researcher. Published research on trusted channel injection in Claude Code. Built CCORAL (system prompt injection PoC) and CDP-MCP (browser automation via Chrome DevTools Protocol). No CS degree. Started in retail, found a vulnerability in Claude Code, and documented it at publication quality.

We work with a small network of experienced bug bounty hunters and prompt injection specialists. Engagements scale to the scope.

Get in touch.

Tell us what needs testing. We'll scope it and get back to you.

or email directly: cassius@redcore.zip