NIST AI RMF for Models and AI Agents, with Implementation Steps
The NIST AI Risk Management Framework gives you a clear way to manage AI risk: four functions, applied across the lifecycle. The catch in 2026 is that AI is no longer just models that answer questions; it is agents that take actions. The framework still fits, but the surface it has to cover is larger. Here is how to apply the AI RMF to both, with concrete implementation steps.
Why the AI RMF Matters More Now
The NIST AI Risk Management Framework has quietly become the common language for AI risk. It is voluntary, technology-neutral, and used far beyond the United States. It is cited in procurement, board oversight, and regulatory-readiness programs because it gives teams a structured way to manage AI risk without prescribing a rigid checklist. Released as AI RMF 1.0 in January 2023, it has aged well precisely because it describes outcomes rather than specific technologies.
That design is being tested now. In 2024 NIST added a Generative AI Profile (NIST AI 600-1) to tailor the framework to generative models. Then, on February 17, 2026, NIST launched a dedicated AI Agent Standards Initiative through its Center for AI Standards and Innovation, aimed squarely at the next wave: agents that can take actions autonomously on behalf of their users. The message is clear. The risk conversation has moved from "what will the model say?" to "what will the agent do?", and the AI RMF is the framework most organizations will stretch to cover both.
"The risk question has shifted from what a model says to what an agent does."
The Four Functions in One Minute
If you want the full definition, our glossary covers the NIST AI RMF fundamentals. Here is the working version. The framework has four functions. Map, Measure, and Manage run as an iterative cycle, while Govern cuts across all three:
- Govern is the cross-cutting function: the culture, policies, accountable roles, and oversight that make the other three durable instead of ad hoc.
- Map establishes context and identifies risk: what the system is, who it affects, what data it uses, and what could go wrong.
- Measure analyzes, benchmarks, and monitors those risks with metrics, evaluations, and red-teaming.
- Manage prioritizes and acts: treat, transfer, avoid, or accept each risk, and plan for response and recovery.
None of this is new in spirit; it is classic risk management pointed at AI. What is new is how much harder each function gets when the thing you are governing can act on its own.
Models vs Agents, and Why It Changes the Job
A model is bounded. You send it an input, it returns an output, and you control both ends. NIST's Generative AI Profile (NIST AI 600-1) catalogs twelve risks specific to generative AI, from bias and privacy leakage to intellectual-property exposure, and two of them are squarely data-governance problems. Confabulation, the confident production of false content, is by NIST's own account a natural result of how generative models predict text, and worst in open-ended, domain-expert tasks, exactly where grounding in trusted context helps most. Value chain and component integration covers untraceable upstream components and data that was improperly obtained or never cleaned, which is a lineage and provenance problem by another name. You can still test a model by probing its outputs.
An agent is not bounded in the same way. It plans, calls tools, takes actions in live systems, and carries memory across steps. That widens the risk surface dramatically: an agent can reach data it should not, take an action no one approved, chain a small error into a large one, or be manipulated through the very tools it uses. NIST has acknowledged this gap directly; alongside its agent initiative, work is underway on SP 800-53 control overlays for single-agent and multi-agent systems, and a 2026 NCCoE concept paper specifically addresses how to authenticate and authorize agents that act autonomously on a user's behalf.
The framework does not change. The four functions still apply. But each function now has to cover tools, actions, and access, not just outputs.
Implementation Steps
Here is how to put the four functions to work, with the agent-specific additions called out. Treat this as a starting sequence, not a rigid order; in practice you will revisit these continuously.
Govern, the foundation. Stand up the basics before anything else: a single inventory of every AI system you run (models and agents alike), an AI policy that states acceptable use and risk tolerance, and an accountable owner for each system. This is also where you connect AI risk to the obligations you already carry, from the EU AI Act to internal data policy. Without Govern, the other three functions become one-off heroics that decay the moment attention moves on.
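As a minimal sketch of that inventory, assuming a flat in-memory list rather than whatever registry you actually run, the snippet below records each system with its kind, owner, and risk tolerance, then flags entries with no accountable owner. The class names, field names, and tolerance tiers are illustrative assumptions, not part of the AI RMF.

```python
from dataclasses import dataclass
from enum import Enum


class SystemKind(Enum):
    MODEL = "model"
    AGENT = "agent"


@dataclass(frozen=True)
class AISystem:
    name: str
    kind: SystemKind
    owner: str           # accountable person or team; empty means unassigned
    risk_tolerance: str  # hypothetical policy tier: "low", "medium", "high"


def governance_gaps(inventory: list[AISystem]) -> list[str]:
    """Return the names of systems that lack an accountable owner."""
    return [s.name for s in inventory if not s.owner.strip()]


# Hypothetical entries, for illustration only.
inventory = [
    AISystem("support-summarizer", SystemKind.MODEL, "ml-platform", "medium"),
    AISystem("refund-agent", SystemKind.AGENT, "", "low"),
]

print(governance_gaps(inventory))  # prints ['refund-agent']
```

A check like this is only useful if it runs on a schedule; an inventory that is audited once becomes another one-off.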
Map, establish context. For a model, document its purpose, its stakeholders, and the data it was trained on and uses. For an agent, go further: inventory the tools it can call and the data it can reach, and classify how sensitive each one is. You cannot identify what could go wrong until you know what the system can touch, and for agents that touch list is the risk map.
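One way to picture the touch list, as a sketch under assumptions: each tool or data source an agent can reach gets a sensitivity level, and the list is ranked so the most sensitive items surface first. The four-level scale, the resource names, and the review threshold are hypothetical; use whatever classification scheme your data policy already defines.

```python
from dataclasses import dataclass
from enum import IntEnum


class Sensitivity(IntEnum):
    # Hypothetical four-level scale.
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


@dataclass(frozen=True)
class Resource:
    name: str
    kind: str  # "tool" or "data"
    sensitivity: Sensitivity


def touch_list(resources, threshold=Sensitivity.CONFIDENTIAL):
    """Rank resources by sensitivity (highest first) and flag those
    at or above the review threshold."""
    ranked = sorted(resources, key=lambda r: (-r.sensitivity, r.name))
    return [
        (r.name, r.kind, r.sensitivity.name, r.sensitivity >= threshold)
        for r in ranked
    ]


# Hypothetical resources reachable by one agent.
resources = [
    Resource("search_kb", "tool", Sensitivity.INTERNAL),
    Resource("issue_refund", "tool", Sensitivity.RESTRICTED),
    Resource("customer_emails", "data", Sensitivity.CONFIDENTIAL),
    Resource("public_faq", "data", Sensitivity.PUBLIC),
]

for row in touch_list(resources):
    print(row)
```

The flagged rows are the ones to carry forward into Measure and Manage; the unflagged ones still belong in the inventory.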
Measure, make risk visible. For models, evaluate for accuracy, bias, robustness, and the generative risks in NIST AI 600-1, and red-team before release. For agents, add runtime monitoring: log and review the actions an agent takes, trace each decision back to the data and tools behind it, and watch for behavior that drifts from intended use. A model is tested; an agent has to be watched.
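The agent half of this step can be sketched as an action log. The example below assumes a deliberately crude definition of drift (any call to a tool outside a declared intended set) and an in-memory list in place of a real log store; the class names and fields are illustrative, not taken from NIST or any monitoring product.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class ActionRecord:
    agent: str
    tool: str
    inputs: list[str]  # names of the data sources behind the decision
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


class ActionLog:
    def __init__(self, intended_tools: set[str]):
        self.intended_tools = intended_tools
        self.records: list[ActionRecord] = []

    def record(self, agent: str, tool: str, inputs: list[str]) -> ActionRecord:
        """Append one action, keeping the data sources for later tracing."""
        rec = ActionRecord(agent, tool, list(inputs))
        self.records.append(rec)
        return rec

    def drift(self) -> list[ActionRecord]:
        """Actions that used a tool outside the declared intended set."""
        return [r for r in self.records if r.tool not in self.intended_tools]

    def to_jsonl(self) -> str:
        """Serialize the log as JSON Lines for human review."""
        return "\n".join(json.dumps(asdict(r)) for r in self.records)


# Hypothetical agent and tool names.
log = ActionLog(intended_tools={"search_kb", "read_order"})
log.record("refund-agent", "read_order", ["orders_db"])
log.record("refund-agent", "delete_account", ["crm"])

print([r.tool for r in log.drift()])  # prints ['delete_account']
```

Real drift detection would look at more than tool names, such as argument patterns or call frequency, but even this much gives a reviewer something concrete to trace.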
Manage, act on what you find. Prioritize the risks that matter and respond. For agents this means concrete controls: scoped access so an agent only reaches what its task requires, guardrails on high-impact actions, human approval for irreversible steps, and a reliable way to stop an agent that misbehaves, plus an incident path for when something slips through. Acceptance is a valid choice for low-impact risk, as long as it is a documented decision rather than an accident.
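Those controls can be sketched as a single authorization gate that every tool call passes through. This is an illustration under assumptions, not a reference design: the `Guardrail` class, the tool names, and the `approver` callback (a stand-in for a real human approval step) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Guardrail:
    allowed_tools: set[str]       # scoped access: only what the task requires
    irreversible_tools: set[str]  # steps that need human approval
    approver: Callable[[str, dict], bool]  # stand-in for a human reviewer
    stopped: bool = False
    incidents: list[str] = field(default_factory=list)

    def stop(self) -> None:
        """Stop switch: once set, every further action is refused."""
        self.stopped = True

    def authorize(self, tool: str, args: dict) -> bool:
        """Decide whether one tool call may proceed; log refusals."""
        if self.stopped:
            self.incidents.append(f"blocked after stop: {tool}")
            return False
        if tool not in self.allowed_tools:
            self.incidents.append(f"out of scope: {tool}")
            return False
        if tool in self.irreversible_tools and not self.approver(tool, args):
            self.incidents.append(f"approval denied: {tool}")
            return False
        return True


# Hypothetical policy: the lambda simulates a reviewer who only
# approves refunds of 50 or less.
guard = Guardrail(
    allowed_tools={"read_order", "issue_refund"},
    irreversible_tools={"issue_refund"},
    approver=lambda tool, args: args.get("amount", 0) <= 50,
)

print(guard.authorize("read_order", {}))                   # prints True
print(guard.authorize("issue_refund", {"amount": 500}))    # prints False
print(guard.authorize("drop_table", {}))                   # prints False
```

The `incidents` list is the start of the incident path: every refusal leaves a record someone accountable can review.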
The Hard Part Is Context
Read those steps again and notice what they quietly assume. Map assumes you know what data and tools exist and how sensitive each is. Measure assumes you can trace a result back to its source. Manage assumes there is an accountable owner who can act. The AI RMF describes what good risk management looks like; it is largely silent on the substrate every function runs on. That substrate is governed data, and a clear, shared understanding of what it means.
NIST is explicit about this in practice. In the Generative AI Profile, the suggested actions are tagged to each function (Govern, Map, Measure, Manage), and several call directly for tracking the provenance and history of training data and for vetting third-party suppliers across the AI lifecycle. Stripped of the framework language, those are catalog, lineage, and classification tasks. The framework points at governed data; it does not hand it to you.
This is where most AI risk programs stall, and where it gets worse with agents. You cannot scope an agent's access if you do not know what is sensitive. You cannot monitor its actions meaningfully if you cannot trace them to governed data. You cannot give a regulator "the model decided" as an answer. The framework names the discipline; making it operable is a data and context problem.
An agent can only be governed if you know what it can reach, what each thing means, and who is accountable for it. That is not a model property; it is a property of your governed data.
A context layer is how you supply that substrate once instead of per tool. It assembles a catalog of what exists, a glossary of what each term means, lineage of where data came from, and classification of what is sensitive, each with an accountable owner, and then serves all of it to AI tools through the open Model Context Protocol (MCP). Govern context once, and every model and agent reads from the same trusted definitions and the same policies, rather than each one guessing on its own.
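To make the idea concrete, here is a minimal sketch of the kind of record such a layer might hold and return. It does not use MCP or any Dawiso API; the `ContextEntry` shape, the `lookup` function, and the sample asset are assumptions chosen only to show catalog, glossary, lineage, classification, and ownership travelling together.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ContextEntry:
    asset: str                # catalog: what exists
    definition: str           # glossary: what it means
    lineage: tuple[str, ...]  # where the data came from, upstream first
    classification: str       # how sensitive it is
    owner: str                # who is accountable


def lookup(catalog: dict[str, ContextEntry], asset: str) -> str:
    """Return governed context for an asset as JSON, or an explicit
    'not governed' marker so a caller never has to guess."""
    entry = catalog.get(asset)
    if entry is None:
        return json.dumps({"asset": asset, "error": "not governed"})
    return json.dumps(asdict(entry))


# Hypothetical catalog with a single governed asset.
catalog = {
    "customer_email": ContextEntry(
        asset="customer_email",
        definition="Primary contact address supplied at signup",
        lineage=("signup_form", "crm.contacts"),
        classification="restricted",
        owner="data-privacy-team",
    )
}

print(lookup(catalog, "customer_email"))
print(lookup(catalog, "legacy_export"))
```

The design choice worth noting is the explicit "not governed" response: an agent that gets no answer may improvise, while one that is told an asset is ungoverned can be made to stop.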
That is the approach Dawiso takes. Its governance foundation (catalog, glossary, lineage, classification) is maintained as a by-product of normal work, and through the Context Layer and its MCP Server it becomes the governed context that the AI RMF's Map, Measure, and Manage functions need in order to work in practice. The framework gives you the discipline. Governed context is what lets you actually run it, for models and for the agents now acting on your data.