NIST AI RMF for Models and AI Agents, with Implementation Steps

Samuel Nagy
VP of Strategic Growth

The NIST AI Risk Management Framework gives you a clear way to manage AI risk: four functions, applied across the lifecycle. The catch in 2026 is that AI is no longer just models that answer questions; it is agents that take actions. The framework still fits, but the surface it has to cover is larger. Here is how to apply the AI RMF to both, with concrete implementation steps.

Why the AI RMF Matters More Now

The NIST AI Risk Management Framework has quietly become the common language for AI risk. It is voluntary, technology-neutral, and used far beyond the United States. Procurement, board oversight, and regulatory-readiness programs cite it because it gives teams a structured way to manage AI risk without prescribing a rigid checklist. Released as AI RMF 1.0 in January 2023, it has aged well precisely because it describes outcomes rather than specific technologies.

That design is being tested now. In 2024 NIST added a Generative AI Profile (NIST AI 600-1) to tailor the framework to generative models. Then, on February 17, 2026, NIST launched a dedicated AI Agent Standards Initiative through its Center for AI Standards and Innovation, aimed squarely at the next wave: agents that can take actions autonomously on behalf of their users. The message is clear. The risk conversation has moved from "what will the model say?" to "what will the agent do?", and the AI RMF is the framework most organizations will stretch to cover both.

"The risk question has shifted from what a model says to what an agent does."

The Four Functions in One Minute

If you want the full definition, our glossary covers the NIST AI RMF fundamentals. Here is the working version. The framework has four functions, and the trick is that three run as an iterative cycle while the fourth surrounds them:

  • Govern is the cross-cutting function: the culture, policies, accountable roles, and oversight that make the other three durable instead of ad hoc.
  • Map establishes context and identifies risk: what the system is, who it affects, what data it uses, and what could go wrong.
  • Measure analyzes, benchmarks, and monitors those risks with metrics, evaluations, and red-teaming.
  • Manage prioritizes and acts: treat, transfer, avoid, or accept each risk, and plan for response and recovery.

None of this is new in spirit; it is classic risk management pointed at AI. What is new is how much harder each function gets when the thing you are governing can act on its own.

Models vs Agents, and Why It Changes the Job

A model is bounded. You send it an input, it returns an output, and you control both ends. NIST's Generative AI Profile (NIST AI 600-1) catalogs twelve risks specific to generative AI, from bias and privacy leakage to intellectual-property exposure, and two of them are squarely data-governance problems. Confabulation, the confident production of false content, is by NIST's own account a natural result of how generative models predict text, and worst in open-ended, domain-expert tasks, exactly where grounding in trusted context helps most. Value chain and component integration covers untraceable upstream components and data that was improperly obtained or never cleaned, which is a lineage and provenance problem by another name. You can still test a model by probing its outputs.

An agent is not bounded in the same way. It plans, calls tools, takes actions in live systems, and carries memory across steps. That widens the risk surface dramatically: an agent can reach data it should not, take an action no one approved, chain a small error into a large one, or be manipulated through the very tools it uses. NIST has acknowledged this gap directly; alongside its agent initiative, work is underway on SP 800-53 control overlays for single-agent and multi-agent systems, and a 2026 NCCoE concept paper specifically addresses how to authenticate and authorize agents that act autonomously on a user's behalf.

The framework does not change. The four functions still apply. But each function now has to cover tools, actions, and access, not just outputs.

[Figure: Models vs Agents, the widening risk surface. Same four functions, a larger surface. Model (bounded: you control input and output): risks to manage are bias, hallucination, privacy leakage, harmful content, and IP exposure; test by probing outputs. Agent (unbounded: plans and acts in live systems through planning, tools, actions, and memory): adds unauthorized actions, data exfiltration, chained errors, and tool manipulation on top of model risks; monitor actions, access, and traces. Map, Measure, and Manage must now cover the data and tools an agent can reach.]

Implementation Steps

Here is how to put the four functions to work, with the agent-specific additions called out. Treat this as a starting sequence, not a rigid order; in practice you will revisit these continuously.

Govern, the foundation. Stand up the basics before anything else: a single inventory of every AI system you run (models and agents alike), an AI policy that states acceptable use and risk tolerance, and an accountable owner for each system. This is also where you connect AI risk to the obligations you already carry, from the EU AI Act to internal data policy. Without Govern, the other three functions become one-off heroics that decay the moment attention moves on.
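As a rough illustration of the inventory idea, the sketch below keeps one record per AI system and flags entries that lack an accountable owner or a recognized risk tier. Everything here is an assumption made for the example: the field names, the tier labels, and the sample systems are hypothetical and are not prescribed by the AI RMF.

```python
from dataclasses import dataclass
from enum import Enum


class SystemKind(Enum):
    MODEL = "model"
    AGENT = "agent"


@dataclass(frozen=True)
class AISystem:
    name: str
    kind: SystemKind
    owner: str       # accountable person or team; empty means unassigned
    risk_tier: str   # hypothetical tier labels, e.g. "low", "medium", "high"


def governance_gaps(inventory: list[AISystem], allowed_tiers: set[str]) -> list[str]:
    """Return a finding for every entry missing an owner or a valid risk tier."""
    findings = []
    for system in inventory:
        if not system.owner.strip():
            findings.append(f"{system.name}: no accountable owner")
        if system.risk_tier not in allowed_tiers:
            findings.append(f"{system.name}: unknown risk tier '{system.risk_tier}'")
    return findings


# Hypothetical inventory covering one model and one agent.
inventory = [
    AISystem("support-summarizer", SystemKind.MODEL, "cx-platform", "low"),
    AISystem("refund-agent", SystemKind.AGENT, "", "high"),
]

gaps = governance_gaps(inventory, {"low", "medium", "high"})
```

Running a check like this on a schedule is one way to keep the inventory from decaying; where the records actually live (a catalog, a spreadsheet, a database) is a separate choice.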

Map, establish context. For a model, document its purpose, its stakeholders, and the data it was trained on and uses. For an agent, go further: inventory the tools it can call and the data it can reach, and classify how sensitive each one is. You cannot identify what could go wrong until you know what the system can touch, and for agents that touch list is the risk map.
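One way to picture the agent "touch list" is as a small classified inventory that can be filtered by sensitivity. The sketch below is a minimal example under stated assumptions: the four sensitivity levels, the Reachable record, and the sample tools and datasets are all hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum


class Sensitivity(IntEnum):
    # Hypothetical classification scale; substitute your own policy's levels.
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


@dataclass(frozen=True)
class Reachable:
    name: str
    kind: str  # "tool" or "dataset"
    sensitivity: Sensitivity


def risk_map(reach: list[Reachable], threshold: Sensitivity) -> list[Reachable]:
    """Return everything at or above the threshold, most sensitive first."""
    flagged = [item for item in reach if item.sensitivity >= threshold]
    return sorted(flagged, key=lambda item: (-item.sensitivity, item.name))


# Hypothetical reach of a single agent.
agent_reach = [
    Reachable("crm.customers", "dataset", Sensitivity.CONFIDENTIAL),
    Reachable("send_email", "tool", Sensitivity.INTERNAL),
    Reachable("issue_refund", "tool", Sensitivity.RESTRICTED),
    Reachable("docs.faq", "dataset", Sensitivity.PUBLIC),
]

high_risk = risk_map(agent_reach, Sensitivity.CONFIDENTIAL)
```

Here high_risk lists issue_refund first and crm.customers second, which is the short list a reviewer would look at before the agent goes live.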

Measure, make risk visible. For models, evaluate for accuracy, bias, robustness, and the generative risks in NIST AI 600-1, and red-team before release. For agents, add runtime monitoring: log and review the actions an agent takes, trace each decision back to the data and tools behind it, and watch for behavior that drifts from intended use. A model is tested; an agent has to be watched.
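Runtime monitoring for an agent can start as a structured action log. The sketch below records each tool call with the data identifiers it read, then reports calls to tools outside the intended set as a crude drift signal. The class names, fields, and sample tool names are hypothetical, and real drift detection would usually look at far more than tool names.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class ActionRecord:
    agent: str
    tool: str
    inputs: list[str]  # identifiers of the data this step read, for tracing
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


class ActionLog:
    def __init__(self, intended_tools: set[str]) -> None:
        self.intended_tools = set(intended_tools)
        self.records: list[ActionRecord] = []

    def record(self, agent: str, tool: str, inputs: list[str]) -> ActionRecord:
        """Append one action to the log and return the stored record."""
        entry = ActionRecord(agent, tool, list(inputs))
        self.records.append(entry)
        return entry

    def drift(self) -> list[ActionRecord]:
        """Return actions that used a tool outside the intended set."""
        return [r for r in self.records if r.tool not in self.intended_tools]

    def to_jsonl(self) -> str:
        """Serialize the log as JSON Lines for review or shipping elsewhere."""
        return "\n".join(json.dumps(asdict(r)) for r in self.records)


# Hypothetical agent that is only meant to look up orders.
log = ActionLog(intended_tools={"search_orders"})
log.record("refund-agent", "search_orders", ["orders.2026"])
log.record("refund-agent", "export_customers", ["crm.customers"])

unexpected = log.drift()
```

Because each record carries the data identifiers it touched, a reviewer can go from a suspicious action straight to the datasets involved.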

Manage, act on what you find. Prioritize the risks that matter and respond. For agents this means concrete controls: scoped access so an agent only reaches what its task requires, guardrails on high-impact actions, human approval for irreversible steps, and a reliable way to stop an agent that misbehaves, plus an incident path for when something slips through. Acceptance is a valid choice for low-impact risk, as long as it is a documented decision rather than an accident.
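The controls named above (scoped access, approval for irreversible steps, a stop control, and an incident trail) can be combined in a thin wrapper around tool execution. This is a minimal sketch, not a production design: the Guardrail class, the approve callback, and the tool names are hypothetical, and a real approval step would involve a person rather than a lambda.

```python
from dataclasses import dataclass, field
from typing import Callable


class ActionBlocked(Exception):
    """Raised when a guardrail refuses to run an action."""


@dataclass
class Guardrail:
    allowed_tools: set[str]                # scoped access: what the task requires
    irreversible_tools: set[str]           # steps that need human approval
    approve: Callable[[str, dict], bool]   # hypothetical human-approval hook
    stopped: bool = False
    incidents: list[str] = field(default_factory=list)

    def stop(self) -> None:
        """Stop control: refuse every later action until reset by an operator."""
        self.stopped = True

    def _block(self, reason: str) -> None:
        # Every refusal is recorded so it can feed the incident path.
        self.incidents.append(reason)
        raise ActionBlocked(reason)

    def execute(self, tool: str, args: dict, run: Callable[[dict], object]) -> object:
        """Run the tool only if it passes the stop, scope, and approval checks."""
        if self.stopped:
            self._block(f"{tool}: agent is stopped")
        if tool not in self.allowed_tools:
            self._block(f"{tool}: outside the agent's scope")
        if tool in self.irreversible_tools and not self.approve(tool, args):
            self._block(f"{tool}: human approval denied")
        return run(args)


# Hypothetical setup: lookups are free, refunds need approval (denied here).
guard = Guardrail(
    allowed_tools={"lookup_order", "issue_refund"},
    irreversible_tools={"issue_refund"},
    approve=lambda tool, args: False,
)

order_id = guard.execute("lookup_order", {"id": 42}, lambda args: args["id"])
```

Putting the checks in one place means the agent's planner never has to be trusted to police itself; the wrapper decides, and the incident list shows what was refused and why.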

[Figure: Implementation steps across the four functions, for models and agents. Govern: AI system inventory, policy and risk tolerance, accountable owners, oversight. Map: for models, purpose, stakeholders, and data used; for agents, add an inventory of the tools and data the agent can reach. Measure: for models, evaluate bias, accuracy, and robustness, and red-team; for agents, add monitoring of actions, tracing of decisions, and watching for drift. Manage: for models, mitigate and document accepted risk; for agents, add scoped access, guardrails, a stop control, and an incident path. Foundation, governed context: data catalog, business glossary, lineage, classification, and ownership, served to any MCP-compatible tool so every model and agent reads the same trusted context.]

The Hard Part Is Context

Read those steps again and notice what they quietly assume. Map assumes you know what data and tools exist and how sensitive each is. Measure assumes you can trace a result back to its source. Manage assumes there is an accountable owner who can act. The AI RMF describes what good risk management looks like; it is largely silent on the substrate every function runs on. That substrate is governed data, and a clear, shared understanding of what it means.

NIST is explicit about this in practice. In the Generative AI Profile, the suggested actions are tagged to each function (Govern, Map, Measure, Manage), and several call directly for tracking the provenance and history of training data and for vetting third-party suppliers across the AI lifecycle. Stripped of the framework language, those are catalog, lineage, and classification tasks. The framework points at governed data; it does not hand it to you.
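To make the lineage point concrete, here is a small sketch of a provenance lookup: given a map from each asset to the assets it is derived from, it returns everything upstream of a result. The dictionary layout and the asset names are hypothetical; a real catalog would expose lineage through its own interface.

```python
def upstream_sources(lineage: dict[str, list[str]], asset: str) -> set[str]:
    """Return every asset that `asset` derives from, directly or transitively."""
    seen: set[str] = set()
    stack = list(lineage.get(asset, []))
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # already visited; also guards against cycles
        seen.add(node)
        stack.extend(lineage.get(node, []))
    return seen


# Hypothetical lineage: a report built on features built on two source tables.
lineage = {
    "churn_report": ["customer_features"],
    "customer_features": ["crm.customers", "billing.invoices"],
}

sources = upstream_sources(lineage, "churn_report")
```

With a lookup like this, the answer to "where did this output come from?" is a set of named, owned assets rather than a guess.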

This is where most AI risk programs stall, and where it gets worse with agents. You cannot scope an agent's access if you do not know what is sensitive. You cannot monitor its actions meaningfully if you cannot trace them to governed data. You cannot give a regulator "the model decided" as an answer. The framework names the discipline; making it operable is a data and context problem.

An agent can only be governed if you know what it can reach, what each thing means, and who is accountable for it. That is not a model property; it is a property of your governed data.

A context layer is how you supply that substrate once instead of per tool. It assembles a catalog of what exists, a glossary of what each term means, lineage of where data came from, and classification of what is sensitive, each with an accountable owner, and then serves all of it to AI tools through the open Model Context Protocol (MCP). Govern context once, and every model and agent reads from the same trusted definitions and the same policies, rather than each one guessing on its own.

That is the approach Dawiso takes. Its governance foundation (catalog, glossary, lineage, classification) is maintained as a by-product of normal work, and through the Context Layer and its MCP Server it becomes exactly the governed context the AI RMF's Map, Measure, and Manage functions need to be real. The framework gives you the discipline. Governed context is what lets you actually run it, for models and for the agents now acting on your data.

FAQ

What is the NIST AI Risk Management Framework?
The NIST AI Risk Management Framework (AI RMF) is a voluntary, technology-neutral framework published by the U.S. National Institute of Standards and Technology for managing the risks of AI across its lifecycle. Released as AI RMF 1.0 in January 2023, it is organized around four functions: Govern, Map, Measure, and Manage. It is not a law and carries no penalties; organizations adopt it because it gives them a structured, repeatable practice for building trustworthy AI. For a full definition see our glossary article on the NIST AI RMF.
How does the AI RMF apply to AI agents specifically?
The four functions are technology-neutral, so they apply to agents as much as to models. What changes is the surface they have to cover. An agent does not just produce an output; it plans, calls tools, takes actions in live systems, and keeps memory across steps. So Map has to inventory the data and tools an agent can reach, Measure has to monitor its actions and traces rather than just test its outputs, and Manage has to include access controls, guardrails, and the ability to stop an agent that misbehaves. NIST signaled the importance of this in February 2026 when it launched a dedicated AI Agent Standards Initiative.
Is the NIST AI RMF mandatory?
No. The AI RMF is voluntary and is not enforced by any regulator. That is the key difference from the EU AI Act, which is binding law with penalties. In practice the two are complementary: an AI RMF practice prepares you for the AI Act, because many of the Act's obligations (risk management, data governance, documentation, human oversight) map closely onto the four functions. Many organizations use the AI RMF as their conceptual backbone and then formalize it through a certifiable standard like ISO 42001.
How does a context layer help with the NIST AI RMF?
Every function of the AI RMF depends on knowing and trusting the data and tools behind your AI, which the framework itself does not provide. A governed context layer does: a catalog of what data and assets exist, a glossary of what each term means, lineage of where data came from, and classification of what is sensitive, with an accountable owner for each. That governed context is what Map, Measure, and Manage need to be real rather than aspirational, and it is what makes an agent governable at all. Dawiso assembles that layer and serves it to any MCP-compatible AI tool through its Context Layer and MCP Server.
