There is always a moment before the mistake.
It is the meeting where someone says the model works.
The dashboard looks clean.
The demo lands.
The room nods.
People start talking about speed, efficiency, transformation.
Then comes the dangerous sentence.
Let’s connect it.
That is where the real story begins.
In defense environments, artificial intelligence is not risky because it is futuristic. It is risky because it is useful. Useful things get connected. Connected things touch data. Data touches missions. Missions touch consequences. And in the Department of Defense, consequences do not stay on slides very long.
This is why AI cannot be handled like a convenience tool dropped into the workflow and “secured later.” Not on NIPRNet. Not on SIPRNet. Not on JWICS. Not anywhere that information, access, trust, and operational decisions carry weight. In these environments, AI has to be treated as a mission system from the start, one that must be bounded, monitored, authorized, and continuously governed. That is exactly where zero trust architecture, the Risk Management Framework, and AI-specific cybersecurity guidance begin to matter in practical, operational terms (Department of Defense Chief Information Officer [DoD CIO], 2022; Joint Task Force, 2018; DoD CIO, 2025).
The old security model was built around a comforting fiction: once inside the perimeter, trust had already been earned. But AI breaks that fiction apart. AI does not simply sit in one place and do one thing. It pulls from multiple sources, responds to unpredictable human prompts, connects to tools, changes through model updates, and creates outputs people may trust too quickly. It turns one bad assumption into ten bad actions unless the architecture around it is disciplined enough to say no, log the event, constrain the retrieval, enforce the label, and verify the identity again.
That is the central point of this article. Zero trust is not some extra security layer bolted onto AI after procurement. It is the operating model that makes AI survivable inside defense networks.
The First Lie: Thinking the Model Is the System
The first mistake in defense AI implementation is also the most common. People look at the model and think they are looking at the system.
They are not.
The model is only the visible part, the sharp end of the spear, the polished object in the demonstration. The actual AI system includes the user interface, identity services, orchestration logic, retrieval layer, connected repositories, APIs, logging architecture, service accounts, policy enforcement points, tuning pipeline, update pathway, and administrative controls. The DoD AI Cybersecurity Risk Management Tailoring Guide makes this clear by addressing AI risk across the life cycle rather than isolating the model as if it exists on its own (DoD CIO, 2025).
That distinction matters because most real failures will not come from the model alone. They will come from everything wrapped around it.
What this looks like technically
Take a notional maintenance assistant deployed on NIPRNet. The visible function is simple: ask a question about a technical order or maintenance issue and receive a summarized answer. But underneath that simplicity is a web of dependencies. The assistant may pull from technical manuals, historical work orders, local unit guidance, engineering notes, equipment databases, and workflow tools. It may rely on a service account to retrieve content, on role-based access control to separate users, on document tags to filter sensitive material, and on logs to reconstruct how an answer was produced.
If one service account has overly broad read permissions, the system can expose information a user should not have seen. If the retrieval layer ignores policy tags, the model may generate an answer from mixed-permission content. If administrators can modify sources or system prompts without approval, trust in the system becomes trust in whoever changed it last.
What this means for leaders
For executives and commanders, the lesson is direct: never approve “the AI tool” as though it were a single object. Ask what data it touches, what repositories it searches, what actions it can take, what identities it uses, what logs it creates, and how it changes over time. If nobody can explain that clearly, then the organization is not evaluating a capability. It is gambling on a black box.
And black boxes are seductive. They make everything feel easy right before it becomes expensive.
Start With the Mission, or the Mission Will Punish You Later
The second mistake is starting with the technology instead of the mission.
That sounds harmless. It rarely is.
The wrong opening question is, “How do we deploy an LLM in our unit?”
The right opening question is, “What mission problem are we solving, for whom, on which network, using what data, with what level of human oversight?”
Risk Management Framework (RMF) begins with preparation and categorization for this exact reason. Systems cannot be secured intelligently until the organization understands their purpose, environment, and impact (Joint Task Force, 2018). NIST’s AI RMF reinforces this by emphasizing governance and context, since AI risk depends heavily on how the system is used rather than on the model alone (National Institute of Standards and Technology [NIST], 2023).
A practical example
Consider a cyber defense squadron that wants to use AI to help analysts triage alerts faster. One version of that project is shallow and dangerous: “Deploy AI to summarize security alerts.” Another version is disciplined: “Deploy AI in the unclassified enclave to summarize approved SIEM alert data, correlate it against adjudicated historical tickets, and retrieve only approved response playbooks for analyst review. The AI may suggest likely severity, but final classification remains with the analyst.”
Those two statements do not just sound different. They produce different architectures, different access boundaries, different testing requirements, and different authorization outcomes.
The executive view
Leaders should insist on precision at this stage because precision is what turns security into something concrete. A vague use case creates vague boundaries. A precise use case creates a design target. It gives the cybersecurity team, the mission owner, and the authorizing chain something real to engineer toward.
Mission clarity is not paperwork. It is defensive architecture in sentence form.
Draw the Boundary Before You Plug in the Power
This is where serious work begins. Before the AI connects to anything, the organization must define two boundaries: the system boundary and the data boundary.
That sounds administrative. It is actually strategic.
The system boundary determines what components fall inside the authorized capability. The data boundary determines what information the system is allowed to ingest, retrieve, process, store, and output. DoD zero trust guidance emphasizes that security must be built around data-centric access decisions and conditional trust, not around broad location-based assumptions (DoD CIO, 2022; DoD CIO, 2022b).
The technical reality
A defense AI implementation should answer questions like these before a single connector is enabled:
- What repositories may the system query?
- Which identities are used for retrieval?
- How are documents tagged and filtered?
- What tools can the model invoke?
- What outputs are logged?
- What admins can change model settings or add new data sources?
- What model versions are approved on which enclave?
- How does the system prove that a given answer came from approved sources?
This is where zero trust stops being a slogan and becomes a wiring diagram.
A realistic operational scenario
Imagine an intelligence support organization wants a retrieval assistant on SIPRNet to help analysts search approved summaries and internal analytical references. The quick-and-dirty design is to give the service account broad read access to the document store and let the model retrieve “whatever seems relevant.” The disciplined design is different. Retrieval passes through a policy enforcement layer that evaluates document tags, user role, organizational access attributes, and mission need before content is ever packaged into the model’s context window.
The first design is convenient.
The second design is defensible.
What leaders need to understand
If the AI can search “everything,” then security is already broken at the design level. Leaders should demand a clear explanation of what the system may touch and what it may never touch. If that answer is fuzzy, then the architecture is fuzzy. And fuzzy architecture is how otherwise smart organizations end up explaining obvious failures after the fact.
RMF Is the Skeleton That Keeps AI From Collapsing Into Hype
In some circles, RMF is treated like a bureaucratic tax. In reality, RMF is one of the few disciplines standing between a useful pilot and an operational embarrassment.
NIST SP 800-37 Rev. 2 defines RMF as a system life cycle process for managing security and privacy risk through preparation, categorization, control selection, implementation, assessment, authorization, and continuous monitoring (Joint Task Force, 2018). The DoD AI Cybersecurity Risk Management Tailoring Guide does not replace RMF. It sharpens it for AI, recognizing that AI systems introduce unique dependencies and dynamic behaviors that demand additional scrutiny (DoD CIO, 2025).
Why this matters technically
RMF forces units to confront uncomfortable but necessary questions:
- What is the impact if the output is wrong but plausible?
- What happens if the retrieval corpus is poisoned or stale?
- What if a privileged user modifies the system prompt?
- What if the model changes behavior after an update?
- What if the AI uses a service account with more access than the user asking the question?
- What if the audit logs do not explain how a sensitive answer was produced?
These are not future problems. These are present-tense engineering realities.
Why this matters for leadership
For leaders, RMF is not the thing slowing innovation down. It is the thing separating innovation from irresponsibility. It creates a disciplined record of what the system is, what it depends on, what risks it carries, how those risks are treated, and what level of trust is justified.
Without that structure, organizations do not really have adoption. They have improvisation.
Control Tailoring: Where Secure AI Stops Sounding Good and Starts Becoming Real
General policy statements are easy. Tailored control implementation is where the real work lives.
The DoD AI Cybersecurity Risk Management Tailoring Guide exists because AI systems do not fail exactly like traditional software. They depend on training data, model versions, retrieval behavior, connected tools, and changing contexts. That means traditional baseline controls remain necessary, but AI systems also require deeper control tailoring around access, data flow, model management, evaluation, and monitoring (DoD CIO, 2025).
Technical example: a knowledge assistant with action capability
Suppose a headquarters staff deploys an internal AI assistant that can answer policy questions, draft correspondence, and open workflow tickets. On paper, this sounds efficient. But now the system does more than retrieve. It acts.
That shifts the risk profile immediately.
A secure design would separate read-only retrieval functions from action-based functions. It would require explicit authorization checks for tool use, enforce role-based permissions per function, log every invocation, and potentially require human approval before executing high-impact actions. It would also segment administrative privileges so that no single person can quietly expand the assistant’s authority without oversight.
Executive explanation
This is the point leaders need to hear clearly: an AI that can search is one category of risk. An AI that can search, decide, and initiate action is another category entirely. Governance must scale with capability. Otherwise, convenience becomes delegation without accountability.
Data Tagging: The Boring Thing That Saves You
Nobody gets excited about data labeling meetings. Nobody writes glowing memos about metadata hygiene. But in defense AI, tagging is one of the places where trust is either built or destroyed.
DoD zero trust guidance emphasizes data-centric protection, policy enforcement, and the need to make access decisions based on well-governed information rather than loose assumptions (DoD CIO, 2022; DoD CIO, 2022b). For AI systems, this matters because the retrieval layer is often the bridge between the user and the information store.
The technical problem
In a retrieval-augmented system, the sequence is straightforward:
- the user submits a query,
- the retrieval layer searches approved sources,
- selected content is inserted into the prompt context,
- the model generates the answer.
If the source material is mislabeled, mixed with content of different access levels, or retrieved by a broadly privileged service account, the model may receive content the user should never have seen. At that point, the security failure has already occurred upstream of the answer.
A grounded example
Picture a wing-level AI assistant meant to answer local policy and process questions. Over time, the system is connected to shared drives, collaboration sites, locally maintained folders, and archived documents. Some documents are properly labeled. Some are not. Some are outdated. Some include draft guidance never meant for broad distribution. The retrieval service account is granted broad access because narrowing the scope is inconvenient.
Then someone asks a simple question. The answer looks polished. It sounds authoritative. It cites content that should never have been in scope.
That is not a model problem. That is a governance failure wearing a model’s face.
Executive explanation
For leaders, the lesson is blunt: data tagging is not busywork. It is one of the security mechanisms that decides whether AI operates within policy or erodes it quietly.
Tool Access: The Moment AI Stops Being a Chatbot and Becomes a Force Multiplier
This is where many organizations underestimate the risk. They focus on what the AI says, not what the AI can do.
That is backwards.
Modern AI systems increasingly connect to tools, APIs, data stores, workflow engines, and automation functions. In defense environments, those connections can transform an assistant into a genuine operational multiplier. They can also transform it into a problem multiplier if permissions, approvals, and logging are weak.
Technical explanation
A retrieval assistant with read-only access to a constrained knowledge base presents one set of risks. An agent that can query dashboards, create tickets, call scripts, update records, or trigger workflows presents another. Under zero trust, each tool invocation must be governed through explicit authorization, scoped privileges, and auditable control points rather than inherited trust from the application itself (DoD CIO, 2022).
A realistic example
Imagine a network operations center deploying an AI assistant to help with incident workflows. At first, it only summarizes reports and recommends playbooks. Later, for efficiency, the team gives it the ability to draft tickets. Then it is allowed to query operational metrics. Then to submit change requests. None of these decisions feels dramatic in isolation.
But eventually the system is no longer “just assisting.” It has become an action surface.
Executive explanation
Leaders should think about AI permissions the way they think about delegation of authority. If the system can influence or initiate downstream activity, then governance must cover not only what it knows, but what it is permitted to trigger. Authority without controls is not innovation. It is drift.
Testing AI Means Testing Behavior Under Pressure
Traditional vulnerability scanning, configuration review, and patch management remain necessary. They are not sufficient.
NIST’s AI RMF emphasizes that AI risk must be measured and managed across context, behavior, trustworthiness, and resilience, not merely through conventional software assurance practices (NIST, 2023). This is especially important in defense systems, where outputs may influence mission decisions even when the system is not the final decision-maker.
A useful way to think about testing
A strong defense AI testing program should work in three lanes:
Infrastructure testing asks whether the environment is hardened, segmented, patched, and protected.
Behavior testing asks how the model responds to adversarial prompts, malformed inputs, prompt injection attempts, retrieval manipulation, unsafe instructions, or ambiguous contexts.
Mission testing asks whether the system behaves acceptably in realistic operational workflows with stressed users, incomplete information, conflicting data, and degraded upstream dependencies.
Example
A cyber analysis assistant performs well in controlled demos, summarizing suspicious activity quickly and cleanly. Then evaluators intentionally craft prompts that try to override formatting rules, suppress caveats, or push the system into false confidence. The system starts sounding decisive where it should sound cautious.
That is a real operational issue. Analysts under pressure are especially vulnerable to polished overconfidence. So the mitigation may include stricter output formats, mandatory confidence language, better prompt hardening, and user training that teaches analysts to challenge machine certainty rather than absorb it.
Executive explanation
For leaders, the headline is simple: a successful demo proves almost nothing by itself. The question is not whether the AI works when conditions are ideal. The question is what it does when the environment is messy, adversarial, rushed, ambiguous, or wrong.
That is where trust is earned.
Continuous Monito ring: Because the System You Approved Will Not Stay the Same
AI systems change. Quietly. Gradually. Then all at once.
A new model version is swapped in. A retrieval source is added. A prompt is revised. A connector is expanded. A team broadens the user base. An admin makes a reasonable-seeming adjustment that changes how the system behaves at scale. NIST RMF treats continuous monitoring as essential for ongoing risk management, and DoD’s AI-specific guidance reinforces the need for reassessment as systems evolve (Joint Task Force, 2018; DoD CIO, 2025).
Technical explanation
For AI, monitoring should capture more than network and authentication telemetry. It should include:
- model version changes,
- connector changes,
- data source changes,
- prompt policy changes,
- tool invocation anomalies,
- retrieval patterns,
- admin activity,
- output policy violations,
- and signs of performance drift.
This is because AI can change materially without looking dramatically different on the surface.
Real-world style example
A support organization fields an AI assistant for internal knowledge retrieval. Six months later, it still “looks the same” to the end user. But under the hood, the model version has changed, new repositories have been connected, new admins have access, and the original logging configuration has drifted from baseline. The capability now operates under different assumptions than the one originally reviewed.
That is not a small issue. That is a different system wearing the same name badge.
Executive explanation
For leaders, continuous monitoring is what turns authorization from a static document into a living risk decision. If the system changes and the organization cannot see those changes clearly, then confidence in the original authorization becomes a memory rather than a control.
Responsible AI and Secure AI Are the Same Fight
The Department of Defense has adopted ethical AI principles that call for systems to be responsible, equitable, traceable, reliable, and governable (U.S. Department of Defense, 2020). Later guidance stresses implementing those principles in practical ways (Department of Defense, 2021).
This matters because those principles map directly to security and operational control.
A traceable system must have logs, provenance, and version accountability.
A reliable system must be tested, monitored, and bounded.
A governable system must be interruptible, reviewable, and subject to human authority.
That is not ethics as decoration. That is implementation discipline.
Executive explanation
For senior leaders, responsible AI is not a separate conversation that happens after the cyber review. It is part of the same architecture of trust. In defense environments, if a system cannot be understood, bounded, overridden, and audited, then it is neither responsible nor secure enough for mission reliance.
What Squadrons and Units Should Actually Do Next
The path forward does not require solving enterprise AI all at once. It requires a disciplined first move, followed by a disciplined second move, and so on until the capability is real enough to trust.
Start with one mission-bounded use case.
Define the system and data boundaries.
Apply RMF early.
Tailor controls to the AI life cycle.
Enforce zero trust across identities, data access, connectors, and tool usage.
Test infrastructure, behavior, and mission performance.
Authorize the capability as an operational system.
Then monitor it continuously and reassess when it changes (DoD CIO, 2022; Joint Task Force, 2018; NIST, 2023; DoD CIO, 2025).
It will feel slower than a commercial pilot project. It is also how you avoid explaining later why a “helpful assistant” became an ungoverned path into sensitive workflows.
In defense, speed matters. But survivability matters more.
Conclusion: The Future of Defense AI Will Belong to the Disciplined
There is always pressure to move faster.
Someone will always say the tool is good enough.
Someone will always want the connector turned on.
Someone will always want the pilot expanded before the governance catches up.
That pressure is not going away. Neither is AI.
So the question for defense organizations is no longer whether AI will enter sensitive environments. It already is. The question is whether it will enter those environments through disciplined architecture, bounded permissions, enforceable policy, clear authorization, and persistent monitoring, or whether it will arrive the way too many technologies arrive: fast, useful, under-governed, and trusted beyond the evidence.
That is what zero trust answers.
That is what RMF organizes.
That is what secure implementation demands.
Because in defense networks, the real danger is not that AI is too powerful.
It is that it is powerful enough to be useful before it is governed enough to be safe.
References
Department of Defense. (2021). Implementing responsible artificial intelligence in the Department of Defense. https://media.defense.gov/2021/May/27/2002730593/-1/-1/0/IMPLEMENTING-RESPONSIBLE-ARTIFICIAL-INTELLIGENCE-IN-THE-DEPARTMENT-OF-DEFENSE.PDF
Department of Defense Chief Information Officer. (2022). DoD zero trust strategy. https://dodcio.defense.gov/Portals/0/Documents/Library/DoD-ZTStrategy.pdf
Department of Defense Chief Information Officer. (2022b). Department of Defense zero trust reference architecture (Version 2.0). https://dodcio.defense.gov/Portals/0/Documents/Library/%28U%29ZT_RA_v2.0%28U%29_Sep22.pdf
Department of Defense Chief Information Officer. (2025). AI cybersecurity risk management tailoring guide. https://dodcio.defense.gov/Portals/0/Documents/Library/AI-CybersecurityRMTailoringGuide.pdf
Joint Task Force. (2018). Risk management framework for information systems and organizations: A system life cycle approach for security and privacy (NIST Special Publication 800-37, Rev. 2). National Institute of Standards and Technology. https://csrc.nist.gov/pubs/sp/800/37/r2/final
National Institute of Standards and Technology. (2023). Artificial intelligence risk management framework (AI RMF 1.0) (NIST AI 100-1). https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.100-1.pdf
U.S. Department of Defense. (2020, February 25). DOD adopts 5 principles of artificial intelligence ethics. https://www.defense.gov/News/News-Stories/article/article/2094085/dod-adopts-5-principles-of-artificial-intelligence-ethics/
About the Author
Joe Guerra, M.S.-Computer Science, M.S.-Software Engineering, CASP+, CCSP, is a technology and cybersecurity professional committed to advancing secure digital transformation across government and defense missions. His background in software engineering, cybersecurity, artificial intelligence, and technical leadership positions him to contribute to the development of secure, mission-aligned solutions that meet the operational realities of today’s government environment. Through his work with FEDITC, LLC, Joe is part of an organization that supports critical missions worldwide and delivers specialized capabilities in cybersecurity, cloud services, engineering, software, health IT, and infrastructure. FEDITC distinguishes itself through its focus on secure operational execution, including enterprise cybersecurity program support, RMF-aligned implementation, vulnerability management, DevSecOps, mission application development, and continuous improvement practices designed to help units and squadrons field resilient, compliant, and effective technology solutions.
(FEDITC: https://feditc.com/) EMAIL: [email protected]

Leave A Comment