I didn't expect to fall in love with technology again. After more than twenty years — ground control software for DoD satellites, IoT platforms, big data architecture, cybersecurity programs at a global bank, threat intelligence — you develop a certain fluency with the work. You know how to break down a problem. You know which tools to reach for. You know roughly how long things take. That fluency is valuable, but if you're honest with yourself, it can also become a kind of ceiling. There are only so many hours in a day, and a lot of them go toward work that is necessary but not particularly interesting: running the same recon tools, parsing output, writing the boilerplate sections of reports, building one-off scripts for problems you've solved three times before.
Then I started working with Claude Code. And something unexpected happened: I started tackling problems I would have previously set aside. Not because they weren't worth solving, but because I didn't have the bandwidth to get to them. That has changed.
This article is my honest account of what that shift looks like in practice — specifically in the context of penetration testing and security engineering work. Not a feature list. Not a pitch. Just what I've actually experienced, including the limitations, and why I think the role of experienced human judgment in this new workflow matters more than ever, not less.
"I've accomplished more in the past month than I would have been able to on my own in several months before. That's not hyperbole — it's a straightforward observation about output and the quality of what I'm able to focus on."
What Claude Code actually does in a security engagement
Let me be specific, because vague claims about "AI-assisted security" have become noise. Here's where I've found concrete, measurable value.
Reconnaissance and enumeration
A significant portion of recon work is repeatable: DNS enumeration, subdomain discovery, certificate transparency log parsing, WHOIS correlation, port scanning, service fingerprinting. Claude Code doesn't just write the scripts to automate this — it runs them, reads the output, identifies what's interesting, and iterates. That last part is the key distinction. I'm not copy-pasting output from one tool into a prompt; I'm watching an autonomous agent work through a recon checklist, surface anomalies, and tell me what warrants a closer look. The time compression on initial recon for an authorized engagement is substantial.
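To make one of those steps concrete, here is a minimal sketch of the kind of helper that falls out of this loop: deduplicating in-scope hostnames from certificate transparency entries (the JSON shape crt.sh returns, with newline-separated names in `name_value`). The sample data is invented, and scope filtering against the authorized root domain is the point of the exercise.

```python
import json

def extract_subdomains(ct_entries: str, root: str) -> set[str]:
    """Pull unique, in-scope hostnames out of certificate transparency
    log entries (a JSON list of records with a newline-separated
    'name_value' field, as crt.sh returns)."""
    hosts: set[str] = set()
    for entry in json.loads(ct_entries):
        for name in entry.get("name_value", "").splitlines():
            # Normalize and drop wildcard prefixes like "*."
            name = name.strip().lower().lstrip("*.")
            # Keep only hostnames inside the authorized scope.
            if name == root or name.endswith("." + root):
                hosts.add(name)
    return hosts

sample = json.dumps([
    {"name_value": "www.example.com\napi.example.com"},
    {"name_value": "*.staging.example.com"},
    {"name_value": "evil-lookalike.net"},  # out of scope, dropped
])
print(sorted(extract_subdomains(sample, "example.com")))
```

The interesting part in practice isn't this script; it's that the agent writes it, runs it against the live CT data, and flags which of the resulting hosts deserve attention.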
Parsing and correlating tool output
Every experienced pen tester has a graveyard of Nessus exports and nmap XML files that were partially reviewed. The volume of output from modern security tooling consistently outpaces the time available to process it carefully. Claude Code reads that output fluently — across multiple tool formats — and surfaces patterns, prioritizes findings by exploitability, and identifies correlations across sources that I might have missed or taken significantly longer to connect manually. This alone has materially changed how thorough my coverage is on complex engagements.
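As a sketch of the parsing involved, here is a minimal example that pulls open services out of an nmap XML report (`-oX` output) and floats the typically riskier services to the top. The high-interest service list and the sample scan are illustrative assumptions, not a real engagement's output.

```python
import xml.etree.ElementTree as ET

# Services that usually warrant the earliest look; tune per engagement.
HIGH_INTEREST = {"telnet", "ftp", "vnc", "ms-wbt-server", "microsoft-ds"}

def open_services(nmap_xml: str) -> list[tuple[str, int, str]]:
    """Return (host, port, service) for every open port in an
    nmap -oX report, high-interest services sorted first."""
    findings = []
    root = ET.fromstring(nmap_xml)
    for host in root.iter("host"):
        addr = host.find("address").get("addr")
        for port in host.iter("port"):
            state = port.find("state")
            if state is None or state.get("state") != "open":
                continue
            svc = port.find("service")
            name = svc.get("name") if svc is not None else "unknown"
            findings.append((addr, int(port.get("portid")), name))
    # Risky services before routine ones, then by port number.
    return sorted(findings, key=lambda f: (f[2] not in HIGH_INTEREST, f[1]))

sample = """<nmaprun><host><address addr="10.0.0.5" addrtype="ipv4"/>
<ports>
  <port protocol="tcp" portid="80"><state state="open"/><service name="http"/></port>
  <port protocol="tcp" portid="23"><state state="open"/><service name="telnet"/></port>
  <port protocol="tcp" portid="443"><state state="filtered"/><service name="https"/></port>
</ports></host></nmaprun>"""
print(open_services(sample))
```

The multiplier isn't the parser itself; it's that this kind of correlation happens across every export from every tool, consistently, instead of only on the files I had time to open.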
Custom tool and script development
Engagements frequently need tooling that doesn't exist off the shelf — a script to fuzz a specific parameter in a proprietary API, a Burp Suite extension for a particular authentication pattern, automation to chain a specific sequence of requests to test a business logic vulnerability. Before, building that tooling was a legitimate time cost that I had to weigh against scope. Now I describe what I need, Claude Code builds it, we iterate on it in minutes rather than hours, and I get to spend my time on the actual test rather than the plumbing. The quality of the resulting tools is good — not perfect out of the gate, but good, and improving quickly through iteration with a collaborator who doesn't get frustrated when you change requirements.
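A sketch of the shape this one-off tooling often takes: substituting payload classes into a single suspect parameter of a JSON API request. The payload lists and the request template are hypothetical; a real engagement's lists would be tuned to the target's stack.

```python
import json

# Hypothetical payload classes for one suspect parameter.
PAYLOADS = {
    "sqli": ["' OR 1=1--", '" OR ""="'],
    "traversal": ["../../etc/passwd", "..%2f..%2fetc%2fpasswd"],
    "type-confusion": [0, True, None, ["a"], {"$gt": ""}],
}

def fuzz_cases(template: dict, param: str):
    """Yield (label, json_body) pairs, substituting each payload
    into the target parameter of a JSON request template."""
    for label, values in PAYLOADS.items():
        for value in values:
            body = dict(template)
            body[param] = value
            yield label, json.dumps(body)

cases = list(fuzz_cases({"user": "tester", "filter": "name"}, "filter"))
print(len(cases))  # prints 9
```

Sending the bodies and diffing the responses is the part that used to eat an afternoon; now the iteration loop on both the payloads and the response analysis runs in minutes.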
Code review and custom test tooling
When source is in scope, Claude Code reads through application code systematically and flags injection vulnerabilities, authentication weaknesses, insecure cryptographic implementations, hardcoded credentials, and dangerous function calls — across multiple files, multiple languages, in one pass. When source isn't in scope, it builds the test tooling directly. The example below is representative: I described what I needed against an authorized target's API, Claude Code produced a working JWT security test suite, ran it, and interpreted the output. What would have taken me an hour to write, test, and document was done in a few minutes of iteration — and the findings it surfaced were real.
Report generation
I'll be direct: report writing is where experienced pen testers lose a disproportionate amount of engagement time. The gap between "we found it" and "the client has a deliverable" is real, and it's not glamorous work. Claude Code generates structured finding reports — with CVSS scoring, proof-of-concept documentation, and remediation guidance at the right technical level for the audience — in a fraction of the time. I still review and edit everything, because the context and nuance of a specific finding require human judgment. But the scaffolding is there and it's good, which means I spend my time on the parts that actually require me rather than on document formatting.
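To give a sense of the scaffolding, here is a minimal sketch of a finding structure rendered to markdown, using the standard CVSS v3.x qualitative ratings. The field names and the example finding are my own illustration, not a template the tool produces.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    title: str
    cvss_score: float
    cvss_vector: str
    description: str
    remediation: str

    @property
    def severity(self) -> str:
        """CVSS v3.x qualitative rating bands."""
        s = self.cvss_score
        if s >= 9.0:
            return "Critical"
        if s >= 7.0:
            return "High"
        if s >= 4.0:
            return "Medium"
        if s > 0.0:
            return "Low"
        return "Informational"

    def to_markdown(self) -> str:
        return (
            f"## {self.title}\n\n"
            f"**Severity:** {self.severity} "
            f"(CVSS {self.cvss_score}, {self.cvss_vector})\n\n"
            f"{self.description}\n\n"
            f"**Remediation:** {self.remediation}\n"
        )

f = Finding(
    title="SQL injection in /login",
    cvss_score=9.8,
    cvss_vector="CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H",
    description="The username parameter is concatenated into a query.",
    remediation="Use parameterized queries for all user-supplied input.",
)
print(f.to_markdown())
```

The structure is trivial; the time savings come from the AI filling the description, proof-of-concept, and remediation sections from the engagement's own evidence, leaving me to edit rather than draft.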
Threat modeling and architecture review
Describe a system architecture and Claude Code can walk through a structured threat model — identifying attack surfaces, enumerating trust boundaries, mapping STRIDE categories, flagging missing controls — in a systematic way that a solo practitioner under time pressure might compress too aggressively. It doesn't replace the domain knowledge required to understand which threats are realistic for a given environment, but it provides a rigorous first pass that makes the human review more efficient and less likely to miss something.
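The mechanical first pass is essentially a cross product of components and threat categories, which a human then prunes to what's realistic for the environment. A minimal sketch, with the component list invented for illustration:

```python
# STRIDE categories and the question each asks of a component.
STRIDE = {
    "Spoofing": "Can a caller pretend to be someone else here?",
    "Tampering": "Can data be modified in transit or at rest?",
    "Repudiation": "Could an actor deny an action for lack of logging?",
    "Information disclosure": "What leaks if this component is readable?",
    "Denial of service": "What happens under resource exhaustion?",
    "Elevation of privilege": "Can a caller gain rights it shouldn't have?",
}

def first_pass(components: list[str]) -> list[tuple[str, str, str]]:
    """Cross every component with every STRIDE category to produce the
    exhaustive checklist a human reviewer then prunes to real threats."""
    return [(c, cat, q) for c in components for cat, q in STRIDE.items()]

rows = first_pass(["API gateway", "auth service", "postgres"])
print(len(rows))  # 3 components x 6 categories = 18 rows
```

The value isn't the enumeration, which is trivial; it's that the AI fills each cell with environment-specific observations instead of leaving it as an empty checklist, and the human review starts from that draft.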
The shift that matters: where expertise now goes
Here's what I think is the most important thing to understand about working with Claude Code in a security context, and it's not about any specific capability: my skills are not obsolete. The application of those skills has fundamentally shifted.
In the past, a large fraction of my time went toward work that required expertise to set up and execute but was ultimately repetitive — running the same tools on each engagement, parsing the same output formats, writing the same report sections, building the same one-off automation. That work required me because I was the person who knew what to run, how to interpret the results, and what to do next. But it wasn't where the interesting problems lived.
"The interesting problems — novel attack chains, adversarial creativity, understanding the actual business risk behind a technical finding, knowing when to push on something that looks like a dead end — those still require a human. They require experience, judgment, and domain knowledge that can't be automated away."
What Claude Code has done is compress the repeatable work dramatically, which means I spend more of my time on the interesting problems. I'm doing more strategic thinking, more creative adversarial reasoning, more high-judgment work — and less grep-and-pivot. That's a better use of twenty years of experience. I'd argue it makes the engagements I deliver meaningfully better, not just faster.
The honest limitations
This is the section that a lot of AI marketing pieces skip. I won't.
Authorization is everything. Claude Code operates on your behalf, and that means it's only authorized to do what you're authorized to do. Using an AI-assisted workflow doesn't change the legal and ethical framework of a penetration test engagement. Scope, rules of engagement, written authorization — these aren't bureaucratic boxes to check. They're what separates a professional security test from criminal activity. That framework is the responsibility of the human practitioner, full stop.
Novel attack chains still require human creativity. The most interesting findings in a pen test engagement often come from connecting dots in ways that aren't obvious — business logic vulnerabilities that require understanding the application's intended behavior, multi-step chains where each individual step is benign, attacks that work because of how people use the system rather than how it was built. Claude Code is an excellent collaborator for executing and validating ideas, but the adversarial creativity that surfaces those findings is still human work.
Context and judgment at the finding level. A vulnerability exists in a context. A SQL injection in a publicly accessible authentication endpoint is a critical finding; the same class of issue in an internal admin tool with compensating access controls is a different risk conversation. The AI can find the technical issue. Calibrating its actual severity in the specific environment, and communicating that to a client in a way that drives the right remediation priority, requires human judgment and experience.
Output requires review. Claude Code is not an oracle. It makes mistakes, it can miss things, and it occasionally produces tool output that looks reasonable but isn't. Everything it generates gets reviewed. The goal is to spend that review time on the parts that actually require expertise, rather than on reformatting or rebuilding things that the AI could handle.
What this means for security teams
The productivity multiplier I've experienced isn't just a personal benefit. It has real implications for how security teams are structured and what they can accomplish.
A boutique security firm working with Claude Code can now execute engagements at a depth and pace that previously required a larger team. That's not a threat to large teams — it's a rebalancing. The competitive advantage in security work has always been expertise and judgment. The tools that amplify how efficiently that expertise gets applied raise the ceiling for everyone, but they raise it most for practitioners who have the deep domain knowledge to direct the work effectively. You need to know what to look for before the AI can help you find it efficiently.
For engineering teams building and operating software — the audience I spend most of my time working with — the implication is worth sitting with: the adversaries testing your systems may well be using AI-assisted workflows that compress their time-to-exploit. The asymmetry in security has always favored attackers; AI can sharpen that edge further. The defensive response isn't panic, but it is urgency around the fundamentals: attack surface reduction, continuous monitoring, anomaly detection, and fast response capability.
A note on what reignited my enthusiasm
I started this piece by saying I fell back in love with technology. That's worth unpacking slightly, because I think it's more than a personal observation.
For most of my career, the constraint on what I could accomplish was time. There were always more interesting problems than hours to work on them. The work I got to was limited by how long it took to execute each piece. What has changed with Claude Code is not that problems got easier — it's that I can now reach problems I couldn't get to before. The ceiling on interesting work has gone up. The fraction of my time spent on the creative, strategic, and judgment-intensive parts of security engineering has increased. And the quality of the output, precisely because more of my attention is on those parts, has gotten better.
Twenty years into a career in technology, that's an unexpected and genuinely exciting place to be.
Matt is a technologist and engineering leader with 20+ years of experience across space systems, IoT, big data, and cybersecurity. He founded Twin Tech Labs to build Arca — an AI-first infrastructure monitoring platform — and to deliver senior-level security services to organizations that don't have enterprise-scale security budgets. Previously CTO of LifeRaft, acquired by Securitas in 2026.