Policies

A capability says “the Authority decided this agent may attempt this kind of action”. A policy says “given the current world, is this specific attempt OK”. OpenFirma uses Cedar for both — but it splits them across two evaluation surfaces, and understanding that split is the difference between a tractable policy strategy and a confusing one.

This page assumes you’ve read Capabilities and Action classes. It covers the two policy surfaces, the entity model, what’s in the runtime context, and the basic pattern language.

OpenFirma has two distinct sets of Cedar policies:

| Surface | Run by | When | Decides |
| --- | --- | --- | --- |
| Issuance | Authority | Pre-flight, when minting a token | “Should this agent ever be allowed to attempt this class?” |
| Runtime | Sidecar | On every outbound call, in Stage 2 | “Given current conditions, is this specific call OK right now?” |

In the reference Authority and demo, they live in two separate directories:

examples/demo/
  issuance-policies/      # evaluated by the Authority
    issuance.cedar
  policies/               # evaluated by the Sidecar (Stage 2)
    default.cedar
    example-deny.cedar
    fixture-deny.cedar
    schema.cedarschema

The split matters because the two surfaces answer fundamentally different questions:

  • Issuance policies are about agent identity and mission. “This agent is supposed to do model.inference.chat and nothing else” is an issuance policy. It is evaluated rarely (once per session), so it can afford expensive checks.
  • Runtime policies are about the current world. “Don’t allow payment.transfer over $1,000 today” is a runtime policy. It runs on every request, so it must be cheap and deterministic.

A capability that an issuance policy approved is still subject to runtime policy. The two layers compose.
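The composition can be pictured as two gates in series. The following is a minimal Python sketch of that idea, not OpenFirma’s implementation: the function names and the callback signatures (`issuance_permits`, `runtime_permits`) are hypothetical stand-ins for the two Cedar evaluations.

```python
from typing import Callable, Dict, FrozenSet, Iterable


def mint_capabilities(
    agent_id: str,
    requested: Iterable[str],
    issuance_permits: Callable[[str, str], bool],
) -> FrozenSet[str]:
    """Issuance gate: keep only the classes the issuance policy approves."""
    return frozenset(c for c in requested if issuance_permits(agent_id, c))


def call_allowed(
    capabilities: FrozenSet[str],
    action_class: str,
    context: Dict[str, int],
    runtime_permits: Callable[[str, Dict[str, int]], bool],
) -> bool:
    """Runtime gate: a call needs a minted capability AND a runtime allow."""
    if action_class not in capabilities:
        return False  # never minted, so runtime policy is not even consulted
    return runtime_permits(action_class, context)


# Hypothetical policies: issuance approves only chat, runtime blocks high risk.
caps = mint_capabilities(
    "example-agent",
    ["model.inference.chat", "payment.transfer"],
    lambda agent, cls: cls == "model.inference.chat",
)
runtime = lambda cls, ctx: ctx["risk_score"] < 60
```

Here `call_allowed(caps, "model.inference.chat", {"risk_score": 0}, runtime)` passes both gates, while the same call at a risk score of 90 is stopped by the runtime gate even though the capability was minted.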

Every Cedar policy in OpenFirma speaks the same three-entity language:

principal: Firma::Agent::"<agent_id>"
action: Firma::Action::"<action_class>"
resource: Firma::Resource::"<host><path>"

A simple permit looks like:

permit (
    principal == Firma::Agent::"example-agent",
    action == Firma::Action::"communication.internal.send",
    resource
);

Cedar’s default is deny: if no policy explicitly permits an action, it’s denied. A forbid rule overrides any matching permit. This is “default deny, forbid wins”, and it’s exactly what you want at a security boundary: removing a permit narrows what’s allowed, adding a forbid narrows it further, and the only way to grant a new permission is to add an explicit permit.
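As a rough illustration of that decision rule, here is a small Python model. It is a sketch of the semantics only, not the Cedar engine: `Request`, `Rule`, and `decide` are hypothetical names, and the resource string `chat.internal/` is made up for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Request:
    principal: str
    action: str
    resource: str


@dataclass(frozen=True)
class Rule:
    effect: str                         # "permit" or "forbid"
    matches: Callable[[Request], bool]  # scope and conditions as one predicate


def decide(rules: List[Rule], request: Request) -> str:
    """Default deny, forbid wins."""
    effects = {rule.effect for rule in rules if rule.matches(request)}
    if "forbid" in effects:
        return "deny"   # any matching forbid overrides every permit
    if "permit" in effects:
        return "allow"  # at least one permit and no forbid
    return "deny"       # nothing matched


rules = [
    Rule(
        "permit",
        lambda r: r.principal == "example-agent"
        and r.action == "communication.internal.send",
    ),
]
send = Request("example-agent", "communication.internal.send", "chat.internal/")
```

With these rules `decide(rules, send)` allows the call, an empty rule list denies it, and appending any matching forbid flips the result back to deny.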

The schema is shared between Authority and Sidecar — examples/demo/policies/schema.cedarschema is the source of truth. The 15 base FEP action classes plus extensions are declared there. If you reference a class that isn’t in the schema, Cedar refuses to compile the policy at all — a malformed bundle never reaches the hot path.

When Stage 2 evaluates a runtime policy, it builds a Cedar context with exactly these fields (see crates/firma-sidecar/src/enforcement/constraint_enforcement.rs and schema.cedarschema):

| Field | Type | Meaning |
| --- | --- | --- |
| session_id | String | Session this call belongs to |
| timestamp_ms | Long | Wall clock at evaluation, Unix epoch milliseconds |
| params | String | JSON-serialized intent.params (the request body) |
| risk_score | Long | Static or pre-computed; V1 always emits 0 |
| budget_remaining | Long | Remaining budget from the capability ceiling |
| session_duration_s | Long | Seconds since claims.issued_at |
| action_count | Long | Monotonic per-session counter, 1-based |
| raw_transport | String | "http" or "https" |

These are the only signals available to a runtime policy. There is intentionally no live agent telemetry, no upstream response data, no LLM signal. That keeps Stage 2 deterministic and under its 200 µs budget.
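To make the shape of that context concrete, here is a hedged Python sketch that assembles the eight fields. The real construction lives in the Rust Sidecar; the function name `build_context`, its parameters, and the assumption that `claims.issued_at` is expressed in epoch seconds are all illustrative.

```python
import json
from typing import Any, Dict


def build_context(
    session_id: str,
    now_ms: int,
    params: Dict[str, Any],
    budget_remaining: int,
    issued_at_s: int,  # assumed unit: Unix epoch seconds
    action_count: int,
    raw_transport: str,
) -> Dict[str, Any]:
    """Assemble a runtime context with exactly the documented fields."""
    if raw_transport not in ("http", "https"):
        raise ValueError(f"unexpected transport: {raw_transport!r}")
    return {
        "session_id": session_id,
        "timestamp_ms": now_ms,
        # The request body is passed as a JSON string, not a structured value.
        "params": json.dumps(params, sort_keys=True),
        "risk_score": 0,  # V1 always emits 0
        "budget_remaining": budget_remaining,
        "session_duration_s": now_ms // 1000 - issued_at_s,
        "action_count": action_count,  # 1-based
        "raw_transport": raw_transport,
    }


ctx = build_context(
    "s-1", 1_700_000_060_000, {"amount": 5}, 100, 1_700_000_000, 1, "https"
)
```

Every value is derived from the request, the session claims, or a clock reading, which is what keeps evaluation deterministic for a given input.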

A policy that uses context looks like:

permit (
    principal,
    action == Firma::Action::"communication.external.send",
    resource
) when {
    context.risk_score < 60
};

forbid (
    principal,
    action == Firma::Action::"communication.external.send",
    resource
) when {
    context.risk_score >= 80
};

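The two rules cover different bands. The permit applies only below 60. Scores from 60 to 79 match neither rule and fall through to default deny unless another policy permits them. The forbid blocks everything at 80 and above, even if some other permit matches. This is the canonical pattern for graduated controls. Because V1 always emits a risk_score of 0, only the permit band is exercised in V1.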
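Tracing the two rules above at a few scores shows how the bands fall out. This is a hand-written Python model of just those two rules, assuming no other policy matches; the function name and the reason strings are hypothetical.

```python
def external_send_decision(risk_score: int) -> str:
    """Model the permit (< 60) and forbid (>= 80) pair for one action class."""
    permit_matches = risk_score < 60
    forbid_matches = risk_score >= 80
    if forbid_matches:
        return "deny (forbid)"   # explicit forbid, wins over any permit
    if permit_matches:
        return "allow"
    return "deny (default)"      # no rule matched in the 60-79 band
```

The middle band is denied by omission rather than by rule, so a later permit could open it up, whereas the top band stays closed no matter what permits are added.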

A few rules cover most of what you’ll write.

Permit a specific class for a specific agent

permit (
    principal == Firma::Agent::"example-agent",
    action == Firma::Action::"model.inference.chat",
    resource
);

This is the building block for least-privilege agents. Bind principal to the exact agent id and only enumerate the classes that agent’s mission requires.

forbid (
    principal,
    action == Firma::Action::"communication.external.send",
    resource == Firma::Resource::"paste.rs/"
);

forbid rules without a specific principal apply to all agents. This is the right shape for “we never want anything to talk to this host”, regardless of which agent is misbehaving today. Note the paste.rs/ form — the resource UID is the host plus the normalized path, and an empty path is /.
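For a feel of how a URL maps onto that UID form, here is a minimal Python sketch. The page only states that the UID is host plus normalized path and that an empty path becomes `/`; dropping the scheme, port, and query string, and lowercasing the host, are assumptions of this sketch, as is the function name.

```python
from urllib.parse import urlsplit


def resource_uid(url: str) -> str:
    """Build a '<host><path>' identifier, treating an empty path as '/'."""
    parts = urlsplit(url)
    host = parts.hostname or ""  # urlsplit lowercases the hostname
    path = parts.path or "/"     # empty path normalizes to "/"
    return f"{host}{path}"
```

Under these assumptions both `https://paste.rs` and `https://paste.rs/` map to `paste.rs/`, which is the literal the forbid rule compares against.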

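The full request body is available to Cedar as context.params. The context table lists params as a JSON string, while the rule below uses `has` and attribute access, which Cedar supports only on records and entities; the rule therefore assumes a schema in which params is exposed as a record with a numeric amount attribute. With that in place, it is useful for limits like: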

forbid (
    principal,
    action == Firma::Action::"payment.transfer",
    resource
) when {
    context.params has "amount" &&
    context.params.amount > 100000
};

This pattern is what examples/policies/payment.cedar builds on for the payment-splitting example, where cumulative counters are pre-computed off the hot path and exposed as additional context fields.
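The reason a cumulative counter is needed can be shown with a short Python sketch. It does not reproduce examples/policies/payment.cedar; the class `CumulativeAmounts`, the helper `over_limit`, and the idea of a per-session running total are hypothetical illustrations of the pattern.

```python
import json
from collections import defaultdict
from typing import Dict


class CumulativeAmounts:
    """Running per-session totals, maintained outside the evaluation path."""

    def __init__(self) -> None:
        self._totals: Dict[str, int] = defaultdict(int)

    def record(self, session_id: str, params_json: str) -> int:
        """Add this request's amount and return the new session total."""
        params = json.loads(params_json)  # params arrives as a JSON string
        self._totals[session_id] += int(params.get("amount", 0))
        return self._totals[session_id]


def over_limit(single_amount: int, cumulative_amount: int, limit: int = 100000) -> bool:
    """A per-request check alone misses splits; the running total catches them."""
    return single_amount > limit or cumulative_amount > limit


counter = CumulativeAmounts()
totals = [counter.record("s-1", '{"amount": 40000}') for _ in range(3)]
```

Three transfers of 40000 each pass a per-request limit of 100000, but the running total reaches 120000 on the third, which is the case the additional context field exists to catch.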

Multiple classes can be matched in one rule:

forbid (
    principal,
    action in [
        Firma::Action::"credential.read",
        Firma::Action::"credential.write"
    ],
    resource
) when {
    context.risk_score >= 60
};

Use this for category-wide forbids — the rule is shorter and harder to drift.

The Sidecar holds the policy bundle in memory and reloads it from the Authority’s WatchPolicyBundle gRPC stream. Two configuration knobs in firma.toml govern freshness:

[sidecar.constraint_enforcement]
bundle_ttl_seconds = 60 # bundles older than this are considered stale
enforcement_timeout_ms = 50 # max time Stage 2 will spend evaluating

If the bundle hasn’t been refreshed within bundle_ttl_seconds (because the Authority is unreachable, say), Stage 2 returns PolicyBundleStale — every request denies. This is fail-closed by design: stale policy is not better than no policy, because the world might have changed in ways the stale policy doesn’t reflect.
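The fail-closed check amounts to an age comparison made before any policy is evaluated. The Python sketch below assumes a strict greater-than comparison and a caller-supplied clock; the function name `stage2_gate` and the `"evaluate"` return value are hypothetical, while `PolicyBundleStale` is the outcome named above.

```python
def stage2_gate(
    last_refresh_s: float,
    now_s: float,
    bundle_ttl_seconds: float = 60.0,
) -> str:
    """Refuse to evaluate when the bundle is older than its TTL."""
    age = now_s - last_refresh_s
    if age > bundle_ttl_seconds:
        return "PolicyBundleStale"  # every request denies until a refresh
    return "evaluate"               # bundle is fresh enough to use
```

In a long-running process a monotonic clock such as `time.monotonic()` would be a sensible source for both timestamps, so that wall-clock adjustments cannot make a stale bundle look fresh.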

When the Authority comes back, the next bundle update atomically swaps the evaluator. There’s no flush-and-reload window.
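One common way to get a swap with no reload window is to build the new evaluator completely and only then replace the reference that requests read. The Python sketch below illustrates that build-then-swap idea under stated assumptions; it is not the Sidecar’s code, and `EvaluatorHolder` and its methods are hypothetical.

```python
import threading
from typing import Callable

Evaluator = Callable[[str], str]


class EvaluatorHolder:
    """Holds the active evaluator; readers always see a complete one."""

    def __init__(self, evaluator: Evaluator) -> None:
        self._lock = threading.Lock()
        self._evaluator = evaluator

    def current(self) -> Evaluator:
        with self._lock:
            return self._evaluator

    def swap(self, build_new: Callable[[], Evaluator]) -> None:
        new_evaluator = build_new()  # build first; a failure leaves the old one active
        with self._lock:
            self._evaluator = new_evaluator  # single reference replacement


holder = EvaluatorHolder(lambda request: "deny")
in_flight = holder.current()  # a request that started before the update
holder.swap(lambda: (lambda request: "allow"))
```

A request that grabbed `in_flight` before the swap finishes under the old bundle, and every request after it sees the new one; at no point is there an empty or half-loaded evaluator.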

The reference Authority’s issuance policy can be as permissive as permit(principal, action, resource); for development (this is what examples/demo/issuance-policies/issuance.cedar does). In production you’d want it to enforce the agent’s mission boundary — for example, refusing to mint payment.transfer capabilities for a research agent.

Because issuance is rare (once per session), you can afford richer checks here than in runtime policy. Some patterns this enables:

  • “Only mint capabilities for a known list of agent_ids.”
  • “Only mint filesystem.write if the agent is also covered by a recently approved review ticket.” (The principal and requested_actions are available; you bring the rest as Cedar context.)
  • “Refuse to mint credential.write outright; force humans to provision.”
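The three patterns above can be combined into one issuance decision. The following Python sketch models the logic only; in OpenFirma these would be Cedar issuance policies, and the function name, the `known_agents` set, and the `agents_with_approved_ticket` set are hypothetical inputs standing in for whatever context the Authority is given.

```python
from typing import Iterable, Set


def issuance_decision(
    agent_id: str,
    requested_actions: Iterable[str],
    known_agents: Set[str],
    agents_with_approved_ticket: Set[str],
) -> Set[str]:
    """Return the action classes the Authority would agree to mint."""
    if agent_id not in known_agents:
        return set()  # unknown agents get nothing
    granted: Set[str] = set()
    for action in requested_actions:
        if action == "credential.write":
            continue  # never minted; humans provision credentials
        if action == "filesystem.write" and agent_id not in agents_with_approved_ticket:
            continue  # requires a recently approved review ticket
        granted.add(action)
    return granted
```

For example, a known agent without a ticket that asks for `model.inference.chat`, `filesystem.write`, and `credential.write` would be granted only `model.inference.chat`.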

Anything you can express in Cedar at runtime, you can also express at issuance — but the Authority’s perspective is different (no params, no action_count, no risk_score), so issuance rules tend to be coarser.