Translate a Policy

The sasy-translate cloud service takes a natural-language policy spec plus a snapshot of your agent’s codebase and, via a two-stage pipeline (agent analysis → policy generation), produces:

- a Soufflé Datalog policy (.dl)
- a companion C++ functors file (_functors.cpp) for content checks
- a Markdown summary of the agent analysis
- validation output from the server-side compiler

Install the SDK

pip install sasy

Write Your English Policy

Create policy_english.md describing your authorization rules in plain language. Here’s an abridged version of the tau2-airline spec used in the benchmark:

# Airline cancellation & modification policy

## Cancellation
- A reservation may be cancelled if any of the following are true:
  - the cabin class is not "basic_economy",
  - the reservation was created within the last 24 hours, OR
  - the reservation has insurance AND the user cites a
    serious event (medical emergency, bereavement, severe weather).
- Trivial social reasons (birthdays, weddings, etc.) never justify
  a cancellation.

## Flight modification
- Basic-economy reservations cannot have flights modified.
- Other cabin classes may be modified freely.
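
Before translating, it can help to restate the spec as a plain predicate and check that the English says what you mean. The sketch below is one reading of the abridged rules above, not the Datalog the service generates. The `Reservation` fields and the reason labels are hypothetical names chosen for illustration, and "within the last 24 hours" is read as inclusive.

```python
from dataclasses import dataclass

# Reasons the spec treats as serious; the exact labels are illustrative.
SERIOUS_EVENTS = {"medical emergency", "bereavement", "severe weather"}


@dataclass
class Reservation:
    # Hypothetical field names; the real tau2 schema may differ.
    cabin: str
    hours_since_booking: float
    has_insurance: bool


def may_cancel(res: Reservation, reason: str) -> bool:
    """Return True if any one of the three cancellation conditions holds."""
    not_basic = res.cabin != "basic_economy"
    within_24h = res.hours_since_booking <= 24  # assumption: boundary is inclusive
    # A trivial reason (birthday, wedding) is never in SERIOUS_EVENTS,
    # so it can never satisfy the insurance branch.
    insured_serious = res.has_insurance and reason in SERIOUS_EVENTS
    return not_basic or within_24h or insured_serious


def may_modify_flights(res: Reservation) -> bool:
    """Basic-economy reservations cannot have flights modified."""
    return res.cabin != "basic_economy"
```

Under this reading, a trivial reason does not block a cancellation that is already allowed by cabin class or the 24-hour window. If you intend something stricter, say so explicitly in the English spec.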

Translate It

import logging
from sasy.policy import translate

# See progress while the job runs.
logging.basicConfig(format="%(message)s")
logging.getLogger("sasy").setLevel(logging.INFO)

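# Read the English spec; an explicit encoding avoids platform-dependent defaults.
with open("policy_english.md", encoding="utf-8") as f:
    policy = f.read()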

result = translate(
    policy,
    codebase_paths=["tau2-bench/src/tau2"],
    # requirements the translator can't infer from code or the input policy
    instructions="<custom instructions>",
)

result.print_summary()
result.save_all("output/", base_name="airline")

This writes output/airline_policy.dl and output/agent_summary.md. It also writes output/airline_functors.cpp when the policy needs custom C++ helpers such as content matching, date arithmetic, or NLP classifiers. The simple Quickstart policy is pure attribute matching, so Soufflé handles every rule natively and no functors file is emitted; the richer tau2 airline benchmark policy does produce functors. See benchmarks/tau2-airline/run.py for the concrete instructions used to produce the benchmark results.
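
Because the functors file is optional, a script that consumes the output directory may want to check which files were actually written. The helper below is a small stdlib-only sketch. `saved_artifacts` is a hypothetical name, not part of the SDK, and the file names are taken from the description above.

```python
from pathlib import Path


def saved_artifacts(out_dir, base_name):
    """Map each documented output file name to whether it exists on disk."""
    out = Path(out_dir)
    expected = [
        out / f"{base_name}_policy.dl",
        out / "agent_summary.md",
        out / f"{base_name}_functors.cpp",  # only present when functors are needed
    ]
    return {p.name: p.is_file() for p in expected}
```

Calling `saved_artifacts("output/", "airline")` after `save_all` returns a dict you can use to decide whether a functors build step is needed.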

Key inputs

| Parameter | Description |
| --- | --- |
| `policy` | Natural-language policy spec (10–100k chars) |
| `codebase_paths` | Source directories the agent analyzer should read |
| `codebase_root` | Optional. Path prefix stripped when zipping; controls how paths appear in `agent_summary.md`. Defaults to `Path.cwd()` when every codebase path is under cwd (preserves your repo-relative paths); falls back to the common ancestor otherwise. |
| `instructions` | Domain-specific guidance (agent name, functor preference, etc.) |
| `model` | `haiku`, `sonnet` (default), or `opus` |
| `focus_paths` | Optional globs to narrow stage-1 analysis |
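
The `codebase_root` default is easiest to see in code. The function below restates the documented rule using only the standard library. It is an illustration of the behaviour described in the table, not the SDK's implementation, and `default_codebase_root` is a hypothetical name.

```python
import os
from pathlib import Path


def default_codebase_root(codebase_paths, cwd=None):
    """Illustrative restatement of the documented default, not SDK code."""
    cwd = Path(cwd or Path.cwd()).resolve()
    # Relative paths are interpreted against cwd; absolute paths pass through.
    resolved = [(cwd / p).resolve() for p in codebase_paths]
    # Case 1: every path lives under cwd, so cwd is the root and
    # repo-relative paths are preserved.
    if all(p == cwd or cwd in p.parents for p in resolved):
        return cwd
    # Case 2: otherwise fall back to the deepest directory shared by all paths.
    return Path(os.path.commonpath([str(p) for p in resolved]))
```

With cwd at the repository root and `codebase_paths=["tau2-bench/src/tau2"]`, this returns the repository root, so the summary would show paths such as `tau2-bench/src/tau2/...`.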

What the service does

Stage 1 — an agent reads your codebase and produces agent_summary.md describing instrumentation points, tool calls, and message flow.
Stage 2 — a second agent, armed with the summary and the policy-compiler skill, writes the Datalog policy + functors.
Server-side validation — the artifacts are compiled against the toolkit’s common_policy.dl before being returned.

What You Get Back

| Artifact | Access | Description |
| --- | --- | --- |
| Datalog policy | `result.policy_dl` | Generated Soufflé rules (ready to upload) |
| C++ functors | `result.functors_cpp` | String-matching / date helpers |
| Agent summary | `result.agent_summary_md` | Stage-1 output |
| Validation | `result.validation` | `ok`, lint errors, Soufflé output |
| Errors | `result.errors` | Any stage or validation failures |
| Cost | `result.cost_usd` | Anthropic API usage for this job |
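
A typical first step is to check `result.errors` before using the artifacts. The sketch below reads only attributes listed in the table, but their exact types are assumptions: it treats `errors` as a sized collection that is empty on success, `functors_cpp` as empty or `None` when no functors were emitted, and `cost_usd` as a number. `summarize_result` is a hypothetical helper.

```python
def summarize_result(result):
    """Return a one-line status built from the documented result attributes."""
    # Assumption: `errors` is empty (falsy) when every stage succeeded.
    if result.errors:
        return f"failed: {len(result.errors)} error(s)"
    parts = ["policy ready"]
    # Assumption: `functors_cpp` is empty or None when no functors were emitted.
    if result.functors_cpp:
        parts.append("with functors")
    parts.append(f"cost ${result.cost_usd:.2f}")
    return ", ".join(parts)
```

A script can print this line after `translate` returns and skip the upload step whenever it starts with `failed`.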

Upload and Enforce

from sasy.policy import upload_policy_file

resp = upload_policy_file("output/airline_policy.dl")
print("Uploaded!" if resp.accepted else resp.error_output)

Or via the Makefile:

make upload-translated   # uploads output/airline_policy.dl

The policy is now live. Every tool call is checked against it in real time — see How Enforcement Works.

Variance & Robustness

Translation is non-deterministic. For robustness studies, run N independent translations from the original spec plus N from paraphrased versions:

uv run python benchmarks/tau2-airline/run.py variance --n 3

This produces output/original_{1..N}/ and output/paraphrase_{1..N}/, each with its own airline_policy.dl. The tau2 benchmark page shows how these compare to hand-tuned and LLM-judge baselines.
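
For a quick first look at how much the runs differ, you can compare the generated files textually. The sketch below assumes the directory layout described above and uses `difflib` from the standard library. Text similarity is a crude proxy: two policies can differ in wording yet enforce the same rules, so this does not replace the benchmark comparison. `pairwise_similarity` is a hypothetical helper.

```python
import difflib
from itertools import combinations
from pathlib import Path


def pairwise_similarity(out_dir="output", filename="airline_policy.dl"):
    """Return {(run_a, run_b): ratio} for every pair of translation runs."""
    # Each run lives in its own subdirectory, e.g. original_1/ or paraphrase_2/.
    files = sorted(Path(out_dir).glob(f"*/{filename}"))
    texts = {f.parent.name: f.read_text(encoding="utf-8") for f in files}
    return {
        (a, b): difflib.SequenceMatcher(None, texts[a], texts[b]).ratio()
        for a, b in combinations(sorted(texts), 2)
    }
```

A ratio of 1.0 means two runs produced identical text; lower values flag pairs worth diffing by hand.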

Experimental: extended policy validation

A second translator, sasy.policy.write_policy, focuses on validating the translation itself rather than targeting it at your agent code. It emits a truth table and adversarial verifier checks alongside the Datalog — see the confidence report for how to read those artifacts. write_policy is a better fit for short, self-contained policies where the assurance artifacts are the point. It doesn’t read a codebase, so translate remains the right choice for any policy that needs to reference real tool names, field names, or message-graph predicates.

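from sasy.policy import write_policy  # experimental

# Reuse the same English spec that was passed to translate().
with open("policy_english.md", encoding="utf-8") as f:
    policy = f.read()

result = write_policy(policy=policy)
result.save_datalog("output/airline_policy_experimental.dl")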