THE CROSSING — field notes
About Crosswalk Vault Field notes crows-nest.tech ↗
est. 2026 · solo curated ● live free to read · always

Still in the old
world. Crossing anyway.

Active duty, 10+ years of traditional red team, one AI task order under the LLC, OSAI in progress, and 12 months left before I can go all-in. This site documents the crossing in real time — not after the fact.

§ The Crossing

Mid-crossing. Eyes open.

StatusActive duty · 12 mo out
BackgroundTraditional red team, 10+ yrs
AI work1 task order · OSAI in progress
VoiceFirst-person, opinionated
StanceVendor-neutral
CadenceAs I learn

I'm still active duty. Traditional red team is still my day job — physical and logical, planning and execution, the full ops cycle. The AI work is real but early: one task order through the LLC, OSAI underway, about 12 months before I can shift the balance.

That's exactly why this site exists now, not later. The crossing is more useful to document in progress than in retrospect. The friction points — where traditional tradecraft transfers cleanly, where it misleads you, where AI systems have no equivalent — are sharpest when you're living in both worlds at once.

Most AI security content comes from people who've already crossed. What's missing is the view from the bridge itself. That's what this is. If you're somewhere in the same transition — strong traditional background, starting to take AI systems seriously as targets — take what's useful.

All views are the author's own and do not represent any current or past employer. Content is published in a personal capacity.

§ 01

The Crosswalk.

trad ↔ ai · 16 of 30+ mapped

Same kill chain. Different substrate. The mental models that built strong network red teamers translate directly to attacking AI systems — once you know which knob maps to which. This is the living index. Each entry expands into its own deep-dive over time.

Traditional
AI Equivalent
01
Reconnaissancein progress
Port scans, OSINT, banner grabbing
Model fingerprinting
Probing for base model, system prompt leakage, embedding model ID
02
Exploitationin progress
Buffer overflows, SQLi, RCE
Prompt injection
Direct & indirect injection, jailbreaks, context window smuggling
03
Local exploits, token theft, sudo abuse
Tool / role escalation
Coercing agents into restricted tool calls or system roles
04
Lateral movementin progress
Pass-the-hash, RDP pivoting
Agent-to-agent pivoting
Compromising one agent to reach another via shared tools or memory
05
Persistencein progress
Backdoors, scheduled tasks, rootkits
Memory & weight poisoning
Long-term memory injection, training-data backdoors, RAG corpus seeding
06
DNS tunneling, covert channels
Model & data extraction
Training data leakage, model stealing, embedding inversion
07
Phishing, pretexting
Persona & roleplay attacks
DAN-style framing, authority spoofing, fictional-context bypass
08
SYN floods, resource exhaustion
Sponge & cost attacks
Token-burning prompts, infinite loops in agentic systems
09
Defense evasionin progress
Log tampering, obfuscation, living-off-the-land
Guardrail bypass
Adversarial suffixes, token smuggling, encoding tricks, multilingual pivots
10
Mimikatz, keylogging, hash dumping
System prompt extraction
Leaking instructions, API keys, and config embedded in context
11
Dependency confusion, malicious packages, poisoned build pipelines
Model & plugin supply chain
Poisoned HuggingFace models, malicious MCP servers, backdoored fine-tune datasets
12
Deliveryin progress
Phishing attachments, drive-by downloads, malicious docs
Indirect injection delivery
Payloads embedded in documents, emails, or web pages that get ingested by a RAG pipeline
13
Collectionin progress
Keylogging, screen capture, file staging
Context window harvesting
Extracting conversation history, injected data, or inferred KB contents from model responses
14
Beaconing, C2 frameworks, covert channels
Covert LLM channels
Using an LLM as a C2 relay; steganographic output encoding; exfil via model responses
15
Compromising a trusted third party to reach the target
Tool & integration abuse
Abusing trusted tool calls, MCP integrations, or orchestrator permissions an agent inherits
16
Automated input mutation, crash analysis, coverage-guided fuzzing
Automated adversarial probing
LLM-assisted jailbreak generation, systematic guardrail enumeration, red-team-as-code
§ 02

The Vault.

six shelves · curated, not aggregated
// 01

Frameworks, side-by-side.

The standards I actually reference, mapped against each other so the seams show. What each one covers, what it misses, and when to reach for it.

  • MITRE ATT&CK ↔ ATLAS
  • OWASP Top 10 ↔ LLM Top 10
  • PTES ↔ NIST AI 100-1
  • NIST AI RMF — governance layer
  • EU AI Act — risk tiers & scope
  • OWASP Agentic Security — in progress
  • BSIMM AI — maturity benchmarking
// 02

Tooling, opinionated.

What earns a place in my toolkit and what doesn't. Both worlds, no vendor pitches.

  • Intercept: Burp Suite Pro, Caido
  • LLM scanning: Garak, PyRIT, Promptfoo
  • Evals: Inspect (UK AISI), HarmBench, CyberSecEval
  • Guardrail testing: NeMo Guardrails, LLM Guard
  • Recon: Nuclei + custom LLM templates
  • Agentic / RAG: roll-your-own — no mature tooling yet
// 03

Labs & ranges.

Where to actually break things. Free first, difficulty honest, time-to-flag noted. Self-hosted beats guided every time for building real muscle memory.

  • Gandalf (Lakera) — prompt injection ladder, free
  • HackTheBox AI tracks — flag-based, pairs with CAISA
  • OWASP AI Goat — LLM Top 10 scenarios, self-hosted
  • Crucible (Dreadnode) — CTF-style, real model endpoints
  • DEF CON AI Village CTF — archive worth running off-season
  • Damn Vulnerable LLM Agent — agent-specific attack surface
  • Self-hosted RAG stack — ChromaDB + Ollama + LangChain
  • Vulnerable MCP server — indirect injection via tool responses
// 04

Reading list.

Papers, distilled for operators. What matters this week, what mattered last year.

  • arXiv cs.CR — weekly firehose; prompt injection, agent attacks, LLM tradecraft before it hits blogs
// 05

Crossing guides.

For traditional pentesters going AI, and ML folks learning to think adversarially. Written from the middle of the crossing, not the other side.

  • OSAI (OffSec) — practitioner-grade, in progress
  • HTB CAISA — lab-heavy, lower cost than OffSec
  • SANS SEC595 / GAISC — defensive angle, reads on contracts
  • TCM Security AI — practical, no cert, fast-updated
  • Skip: EC-Council AI — marketing, not practitioner-grade
  • OSCP transfers: methodology, report discipline, proof of exploitation
  • No cert covers yet: agentic attacks, RAG poisoning, embedding recon
  • Reading order: ATLAS → OWASP LLM Top 10 → arXiv cs.CR → Garak → lab
// 06

Field writeups.

Sanitized observations and methodology notes. Technique over name-dropping.

  • Engagement patterns
  • Novel technique notes
  • Tool builds & teardowns
  • What broke. Why it broke.
§ 03

Field notes.

latest writing
essay 5 min read NEW

Why AI red teaming isn't pentesting (and why the muscle memory still helps).

The reflexes that make a strong network red teamer don't transfer cleanly. They transfer well — once you know what to recalibrate. A first attempt at naming the deltas.

Read note →
teardown coming soon

Garak v0.10 — what's actually new for working operators.

Skipping the changelog summary. What I tried, what worked, what's marketing, and where it still has gaps for engagement-grade probing.

In progress
mapping coming soon

ATT&CK persistence → memory poisoning: the analogy and where it breaks.

Persistence in classic ops is about staying in. In agentic systems, it's about staying influential. Same instinct, different substrate. Worked example with a vector store.

In progress
paper coming soon

Indirect prompt injection in MCP servers — the operator's reading.

Distilled for people who have to actually exploit or defend this in the next 30 days, not the next conference cycle.

In progress
§ The operator behind this site

Crow's Nest.

The writing here is personal. The formal work — engagements, task orders, assessments — lives at Crow's Nest, an LLC focused on offensive security across traditional and AI systems. Not actively taking new clients right now, but the work is real.

crows-nest.tech ↗
crowsnest.tech
Offensive security. Traditional and AI.
  • Traditional red team
  • AI & agentic system red teaming
  • 1 AI task order completed
  • Accepting work post-transition
the crossing — field notes one operator · both sides always free to read vendor-neutral · opinion-strong crows-nest.tech the crossing — field notes one operator · both sides always free to read vendor-neutral · opinion-strong crows-nest.tech