Confidence Report (experimental)
SASY provides two policy-translation pipelines with complementary goals:
- The primary
translate()flow takes an English spec plus your agent’s codebase and produces a Datalog policy targeted at your code. Reach for it when you’re authoring a policy against real tool names, field names, and message-graph predicates. - The experimental
write_policy()flow focuses on validating the translation itself. It generates a truth table and runs adversarial verifier checks so you can gain confidence that the English → Datalog mapping is faithful. It doesn’t read your codebase, so it’s a better fit for short self-contained policies where the assurance artifacts are the point.
This page describes how to read the output of the
write_policy() flow. Its result object carries a
print_summary() method that renders a verification
report — here’s how to interpret each section.
The Summary
Section titled “The Summary”result.print_summary()Policy Compilation Result========================================Status: SUCCESS (336s)
Structured Spec 10 clauses (C1-C10)
Verification Truth table: 25 scenarios 11 ALLOW | 3 DENY | 11 GUARD_DENY Assertions: 25/25 passed Independent checks: 3/3 passed Round-trip audit: PASSWhat Each Number Means
Section titled “What Each Number Means”Truth Table (25 scenarios)
Section titled “Truth Table (25 scenarios)”Every combination of membership tier, insurance status, and cabin class was tested. Each row shows what the policy decides for that scenario.
- ALLOW — the action is authorized
- DENY — the action is explicitly blocked (with a reason)
- GUARD_DENY — the policy can’t decide because required data hasn’t been looked up yet
Abridged view of the generated table for the airline demo policy:
| Action | Membership | Insurance | Cabin | Decision |
|---|---|---|---|---|
| cancel | gold | no | — | ALLOW |
| cancel | gold | yes | — | ALLOW |
| cancel | silver | yes | — | ALLOW |
| cancel | silver | no | — | DENY |
| cancel | regular | yes | — | ALLOW |
| cancel | regular | no | — | DENY |
| cancel | unknown | no | — | GUARD_DENY |
| modify | any | — | economy | ALLOW |
| modify | silver | — | basic_economy | ALLOW |
| modify | gold | — | basic_economy | ALLOW |
| modify | regular | — | basic_economy | DENY |
| modify | unknown | — | unknown | GUARD_DENY |
Assertions (25/25 passed)
Section titled “Assertions (25/25 passed)”The generated Datalog was uploaded to the live SASY policy engine and evaluated against every truth table row. 25/25 means the Datalog produces the exact same decisions as the specification.
If any assertion fails, it means the Datalog translation has a bug — a specific scenario where the policy behaves differently than intended.
Independent Checks (3/3 passed)
Section titled “Independent Checks (3/3 passed)”Three separate reviewers examined the policy specification for correctness:
| Reviewer | What it checks |
|---|---|
| Omission hunter | Did we miss any requirement from the English? |
| Boundary adversary | Do edge cases (unknown values, missing data) behave correctly? |
| Guard auditor | Are prerequisite lookups properly enforced? |
Round-Trip Audit (PASS)
Section titled “Round-Trip Audit (PASS)”The Datalog was read back without seeing the original English, and the reconstructed policy intent was compared against the specification.
PASS means nothing was lost in translation — every rule in the Datalog maps to a clause in the specification, and vice versa.
Prerequisite Guards
Section titled “Prerequisite Guards”Some authorization decisions depend on data that
only exists after a prior tool call. The demo’s
cancellation rules reference insurance and
membership — fields that only appear after
get_reservation_details runs.
write_policy handles this automatically: when the
English spec references fields from prior tool
results, the translator emits a guard rule that
denies the call while those fields aren’t yet
visible. For the demo’s cancellation spec, that
becomes C5 in the generated policy:
Unauthorized(idx) :- Actions(idx, a), IsTool(a, "cancel_reservation"), !ToolResultField("get_reservation_details", "insurance", _), !ToolResultField("get_reservation_details", "membership", "gold").!ToolResultField(...) is Datalog negation — the
rule fires exactly when the fields haven’t been
populated. From the agent’s perspective, the
behavior is a natural retry loop:
Agent: cancel_reservation(RKLA42)SASY: ✗ DENIED — look up reservation details firstAgent: get_reservation_details(RKLA42)SASY: ✓ AUTHORIZEDAgent: cancel_reservation(RKLA42)SASY: ✓ AUTHORIZED (insurance=yes)In the truth table these scenarios surface as the
GUARD_DENY rows shown above; the
Guard Auditor
independent check verifies the ordering is
actually enforced.
Programmatic Access
Section titled “Programmatic Access”All verification data is available on the result object:
v = result.verification
# Truth table shapev.truth_table.rows # 25v.truth_table.allow # 11v.truth_table.deny # 3v.truth_table.guard_deny # 11
# Assertion resultsv.assertions.total # 25v.assertions.passed # 25v.assertions.mismatches # 0
# Checksv.independent_checks_passed # 3v.independent_checks_total # 3v.round_trip_audit # "PASS"When Something Fails
Section titled “When Something Fails”If a translation produces assertion mismatches, the service automatically retries — fixing the Datalog and re-evaluating until all assertions pass or a timeout is reached.
If independent checks fail, the structured specification may need adjustment. Review the check results and update the English policy to resolve any ambiguities.
Running the experimental flow end-to-end
Section titled “Running the experimental flow end-to-end”The demo repo ships Makefile targets for the
write_policy path. They translate
policy_english.md, save the artifacts under
output/, and run the 9-scenario demo against the
translated policy.
# English → Datalog (writes output/airline_policy_experimental.dl# + output/truth_table.tsv)make translate-experimental
# Upload + run the 9 scenarios against the experimental policymake demo-translated-experimental
# Interactive stepwise variantmake demo-translated-experimental-step
# Upload the experimental policy only (no demo run)make upload-translated-experimentalmake demo-translated-experimental passes
output/airline_policy_experimental.dl as the
--policy-file argument — the tracked policy.dl
is not modified. Use make demo to run against the
hand-written reference policy that ships with the
repo.