Your first compliant system

🟡 Partial — The arc runs on the repo fixtures; the Annex IV is later assembled by the cloud and is still partial.

This tutorial walks you through the golden path of Venturalítica’s git-native specialization of Risk-Driven Development (RDD). You build loan, a high-risk credit classifier under EU AI Act Annex III §5, from configuration files to the signed evidence bundle. By the end you will have completed a full ISO 23894 §6.4–§6.5 cycle with versioned evidence and dual conformance (prEN 18228 + ISO 23894). The Annex IV (Art. 11) is then assembled by the cloud from that signed bundle.

The files you will use are in crates/seigarrena-cli/tests/resources/loan/. This is the same scenario the integration test runs; here you walk it manually to understand what each step does.

What you will learn:

- Declare risks, measures, and risk appetite in the Art. 9 risk programme (the risk: section of sei.yaml)
- Compile the programme to OSCAL and reproduce the pipeline with sei run
- Read the red/green gate and understand what UNDERPOWERED means
- Commit the treatment as a normative act that closes the ISO 23894 loop
- Project the signed bundle to two standards with sei conformance
- Register management approval and reconstruct the cycle with sei reconstruct

Project context

loan is a binary logistic classifier trained on the German Credit dataset (1,000 applicants) to approve or deny consumer loans. The system:

- Is high-risk (EU AI Act Annex III §5: creditworthiness assessment systems)
- Is operated by a financial entity, so DORA (EU Reg. 2022/2554) also applies
- Has an AssuranceProgram (the risk: section of sei.yaml) with 5 risks and 16 measures → 10 ex-ante controls
- Uses a multi-stage DVC pipeline (featurize → evaluate) with an agnostic evaluator in compliance_eval.py

The main bias metric is demographic parity difference by gender (demographic_parity_diff). Model V1 (unmitigated) fails the blocking control; treatment V2 (fairlearn ExponentiatedGradient + DemographicParity) closes it.
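To make the gate metric concrete, here is a minimal Python sketch of demographic parity difference: the largest gap in selection rate (share of positive predictions) between groups. The function name mirrors the metric key, but the implementation and the toy data are illustrative assumptions, not the code that compliance_eval.py or the venturalitica SDK actually runs.

```python
from collections import defaultdict


def demographic_parity_diff(y_pred, sensitive):
    """Largest gap in selection rate (share of positive predictions) across groups."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(y_pred, sensitive):
        totals[group] += 1
        positives[group] += int(pred == 1)
    rates = {g: positives[g] / totals[g] for g in totals}
    return max(rates.values()) - min(rates.values())


# Toy predictions (1 = approve) for ten hypothetical applicants.
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 1, 0]
gender = ["m", "m", "m", "m", "m", "f", "f", "f", "f", "f"]

dpd = demographic_parity_diff(y_pred, gender)
print(f"demographic_parity_diff = {dpd:.2f}")
```

A value of 0 means every group is approved at the same rate; the blocking control in this scenario requires the value to stay below 0.03.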

Step 1 — Set up the environment

Verify that sei is compiled and available:
```
sei --help
```
All 16 subcommands should be listed (init, run, status, verify, compile, reconstruct, assess, soa, conformance, impact, request, approve, review, reject, retire, pubkey). If not, follow the installation guide.

This tutorial uses the example resources from the engine repository, which currently requires contributor access. Once the code is published (open-core strategy), these resources will be available to everyone.

Create a working directory and copy the scenario files:
```
mkdir /tmp/loan && cd /tmp/loan
cp -r /path/to/seigarrena/crates/seigarrena-cli/tests/resources/loan/. .
```

Initialize git and DVC:

```
git init
git add sei.yaml compliance_eval.py train.py \
        featurize.py evaluate.py dvc.yaml params.yaml
dvc init
dvc add data/german_credit.csv
git add data/german_credit.csv.dvc .gitignore .dvc/
```

Create the Python environment with uv:
```
uv venv .venv
uv pip install \
  "venturalitica==0.6.11" \
  "mlcroissant>=1.0" \
  "dvc>=3" \
  "pyarrow" \
  "scikit-learn==1.8.0" \
  "fairlearn==0.13.0"
```
The versions of scikit-learn and fairlearn are pinned: metric values must be deterministic because sei.lock anchors them in the signed evidence.

Step 2 — Project files

Before running anything, examine the structure you just copied:

```
loan/
├── sei.yaml                           # System manifest (includes the Art. 9 risk programme)
├── compliance_eval.py                 # Agnostic evaluator (venturalitica-sdk + Croissant)
├── train.py  → train_unaware.py       # Active treatment (V1 at start)
├── featurize.py                       # DVC stage: CSV → features.parquet
├── evaluate.py                        # DVC stage: train + eval + model.pkl
├── dvc.yaml                           # DAG pipeline featurize → evaluate
├── params.yaml                        # seed: 42
└── data/
    ├── german_credit.csv              # German Credit dataset (1,000 applicants)
    └── german_credit.croissant.json   # Data governance §2 (how data is loaded)
```

sei.yaml — the manifest

Below is an illustrative, abbreviated version of the manifest (the risks: block is summarised; the scenario file ships the full one). It is meant to convey the shape of a sei.yaml:

```
apiVersion: seigarrena.dev/v1alpha1
kind: AISystem
system:
  name: loan-scoring
  intended_purpose: "Credit scoring for consumer loan approval."
task:
  modality: tabular
  type: classification
eval: { script: train.py }
pipeline: { tool: dvc, metrics: metrics.json }
oscal: { assessment_plan: shared_data/policies/assessment_plan.oscal.yaml }
dataset: { croissant: data/german_credit.croissant.json }
artifacts:
  model: { kind: logreg, seed: 42 }
risk:
  appetite: { individual: MEDIUM, society: MEDIUM, organization: HIGH }
  criteria: { scale: "5x5" }
  overall_residual_criterion: HIGH
  risks:
    # 5 risks with their measures — see the blocking control excerpt below
```

Key fields: eval.script points to the active treatment (the only thing that changes V1↔V2); pipeline.tool: dvc declares the reproduction seam; risk.appetite sets the per-risk appetite for ISO 23894 §6.4.4 evaluation. The risk: block is the Art. 9 risk programme (the AssuranceProgram): it lives here, in sei.yaml, not in a separate file.

See the sei.yaml reference for the full schema.

The `risk:` section — the Art. 9 risk programme

The AssuranceProgram (the risk: section of sei.yaml) declares 5 risks and 16 measures (from which sei compile derives 10 ex-ante controls). The blocking control is:

```
- id: unfair-credit-exclusion
  metric: demographic_parity_diff
  constraint: "< 0.03"
  severity: high
  enforcement: gate
  lifecycle: [validation]
  article: "15"
  frameworks: [eu/dora@2022#art-6]
  standard_clauses: ["eu/pren-18228@2026#9.2", "eu/pren-18228@2026#7", "iso/23894@2023#6.5"]
```

enforcement: gate means that if the measured metric does not satisfy the constraint, sei run returns exit ≠ 0. The DORA Art. 6 framework is added because the model is an ICT asset of a financial entity.
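The gate semantics described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the engine's implementation: the constraint grammar (a comparison operator followed by a number), the helper names satisfies and gate_exit_code, and the example-audit control with its threshold are all hypothetical.

```python
import operator
import re

# Assumed constraint grammar: a comparison operator followed by a number.
_OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge}


def satisfies(constraint, value):
    """Evaluate a constraint string such as '< 0.03' against a measured value."""
    match = re.fullmatch(r"\s*(<=|>=|<|>)\s*([0-9.]+)\s*", constraint)
    if match is None:
        raise ValueError(f"unsupported constraint: {constraint!r}")
    op, threshold = match.groups()
    return _OPS[op](value, float(threshold))


def gate_exit_code(controls, metrics):
    """Return 1 if any control with enforcement 'gate' fails, otherwise 0."""
    for control in controls:
        passed = satisfies(control["constraint"], metrics[control["metric"]])
        if not passed and control["enforcement"] == "gate":
            return 1
    return 0


controls = [
    {"id": "unfair-credit-exclusion", "metric": "demographic_parity_diff",
     "constraint": "< 0.03", "enforcement": "gate"},
    # Hypothetical audit control: reported when it fails, but never blocking.
    {"id": "example-audit", "metric": "accuracy_score",
     "constraint": ">= 0.85", "enforcement": "audit"},
]

exit_v1 = gate_exit_code(controls, {"demographic_parity_diff": 0.06, "accuracy_score": 0.82})
exit_v2 = gate_exit_code(controls, {"demographic_parity_diff": 0.015, "accuracy_score": 0.80})
print(exit_v1, exit_v2)
```

With the approximate V1 and V2 values used later in this tutorial, only the gate control decides the exit code; the failing audit control does not block.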

The remaining risks are risk.data-governance (4 data measures), risk.opacity (performance + oversight), risk.model-robustness-security (robustness + integrity), and risk.insufficient-human-oversight. They contribute enforcement: audit and declared-oversight measures, which are measured and reported but do not block the gate the way unfair-credit-exclusion does.

Step 3 — `sei compile`: AssuranceProgram → OSCAL

```
sei compile
```

sei compile reads the risk: section of sei.yaml, builds the AssuranceProgram in memory, and generates the OSCAL assessment plan at the path declared in sei.yaml (shared_data/policies/assessment_plan.oscal.yaml). This file is a dependency of the evaluate stage in the DVC DAG, so its digest enters dvc.lock.

Make the initial commit:

```
git add shared_data/policies/assessment_plan.oscal.yaml
git commit -m "init loan: AssuranceProgram compiled to OSCAL"
```

See the sei CLI reference for full subcommand details.

Step 4 — T0: `sei run` with V1 → RED gate

```
sei run
```

sei run reproduces the DVC pipeline (dvc repro) over the featurize → evaluate DAG:

- featurize: loads German Credit via Croissant (§2) and materialises data/features.parquet (a cached, content-addressed output)
- evaluate: trains model V1, evaluates it with compliance_eval.py (10 ex-ante controls via vl.enforce), and writes metrics.json and model.pkl (cached outputs)

The engine judges results against the OSCAL assessment plan and applies the gate.

Expected result: sei run returns exit ≠ 0. The blocking control fails:

```
control: unfair-credit-exclusion
  metric: demographic_parity_diff
  measured value: ~0.06
  threshold: 0.03
  status: FAILS (enforcement: gate)
  frameworks: eu/dora@2022#art-6
```

Risk analysis shows:

- risk.unfair-credit-exclusion: inherent HIGH, cycle OPEN
- The bundle .sei/bundle.json is still signed and anchored (evidence of failure is evidence)

Verify the signature and check status:

```
sei verify      # OK — ECDSA-P256+DSSE+in-toto signature valid
sei status      # RED — gate failing on unfair-credit-exclusion
```

Commit the T0 evidence:

```
git add sei.lock .sei/bundle.json .sei/bundle.json.sig metrics.json
git commit -m "T0: V1 evidence (gate RED; dvc.lock + metrics.json anchored)"
```

Read more about the red/green loop and statistical reliability in The red/green loop.

Step 5 — Explore the treatment candidate with `dvc exp`

Before committing the treatment, confirm that V2 closes the gate. Copy the mitigated file without committing:

```
cp train_mitigated.py train.py
dvc exp run
dvc exp show
```

dvc exp run executes V2 as a DVC experiment (git-stashed, not entering the history). dvc exp show compares the unfair-credit-exclusion metric of the candidate against the V1 baseline:

```
Experiment     unfair-credit-exclusion   accuracy_score
baseline V1    0.060                     0.82
candidate V2   0.015                     0.80
```

The candidate brings demographic parity difference below 0.03 → it would close the gate. This is the search for the treatment, not the treatment itself: the experiment is not yet committed to git.

What V2 does (train_mitigated.py): applies ExponentiatedGradient(DemographicParity(), eps=0.01) from fairlearn on the sensitive feature gender. The reduction learns an ensemble whose selection rate is approximately independent of gender (in-processing), producing a realistic residual demographic parity difference of ~0.015.

Step 6 — Commit the treatment: the normative act

The exploration confirmed that V2 closes the gate. Now promote the candidate to treatment:

```
git add train.py
git commit -m "T1: treatment — promotes V2 (mitigated train.py, candidate that closes the gate)"
```

This commit is the ISO 23894 §6.5 treatment: a versioned change to which the FAIL→PASS arc can be attributed. sei reconstruct will locate it via git log -S unfair-credit-exclusion and reconstruct the cycle from it.

Only train.py changed. featurize.py, the data, and the risk classification are reused — this is a class-B drift (Art. 15): the model changes, but the purpose and dataset do not. DVC staleness reflects exactly this.

See The three treatment modalities (modality 1 = code change) and git closes the loop.

Step 7 — `sei status`: typed drift after the treatment

```
sei status
```

The engine detects that train.py changed relative to the last anchor. It reports typed drift:

| Section | Status | Why |
| --- | --- | --- |
| measurement-model | stale (class B, Art. 15) | `train.py` is a dependency of the model phase; its digest changed |
| measurement-data | reused | Data and Croissant did not change |
| data_governance | reused | Croissant (§2) is the same |
| classification | reused | Purpose and context are the same |

sei status returns exit ≠ 0 because the model section is stale: the pipeline needs to recompute.
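The idea behind this report, comparing the digest of each dependency against the digest recorded at the last anchor, can be sketched as follows. The SECTION_DEPS mapping, the use of SHA-256, and the in-memory byte strings standing in for file contents are assumptions made for illustration; the engine's actual dependency graph and hash function are not shown in this tutorial.

```python
import hashlib


def digest(content):
    """Content digest of a dependency (SHA-256 is an assumption here)."""
    return hashlib.sha256(content).hexdigest()


# Hypothetical mapping from report section to the files it depends on.
SECTION_DEPS = {
    "measurement-model": ["train.py"],
    "measurement-data": ["data/german_credit.csv"],
    "data_governance": ["data/german_credit.croissant.json"],
}


def typed_drift(anchored, current):
    """Mark a section stale when any of its dependencies changed digest."""
    report = {}
    for section, deps in SECTION_DEPS.items():
        changed = any(anchored.get(dep) != current.get(dep) for dep in deps)
        report[section] = "stale" if changed else "reused"
    return report


# Digests recorded at the last anchor (stand-in contents, not the real files).
anchored = {
    "train.py": digest(b"V1: unmitigated training script"),
    "data/german_credit.csv": digest(b"dataset bytes"),
    "data/german_credit.croissant.json": digest(b"croissant metadata"),
}
# After the treatment commit only train.py differs.
current = dict(anchored)
current["train.py"] = digest(b"V2: mitigated training script")

report = typed_drift(anchored, current)
for section, status in report.items():
    print(f"{section}: {status}")
```

Because only the digest of train.py differs, only the model section is reported as stale, which matches the class B row in the table above.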

Read about drift classification in Typed drift.

Step 8 — T1: `sei run` with V2 → GREEN gate

```
sei run
```

dvc repro detects that train.py (a dep of the evaluate stage) changed and recomputes only that stage. The featurize stage remains cached — it does not re-execute because the data did not change. This is DVC selective staleness, which maps exactly to the engine’s typed drift.

Expected result: sei run returns exit 0. The blocking control passes:

```
control: unfair-credit-exclusion
  measured value: ~0.015
  threshold: 0.03
  status: PASSES
```

The engine records a treatment event anchored to the commit: unfair-credit-exclusion FAILS→PASSES. Risk analysis shows:

risk.unfair-credit-exclusion: inherent HIGH → residual MEDIUM, cycle CLOSED

Although the fairness control passes and the risk.unfair-credit-exclusion cycle closes, the overall system residual remains CRITICAL. The risk risk.opacity (ALMOST_CERTAIN × HIGH) has no control that mitigates it sufficiently, and the overall_residual_criterion: HIGH declared in sei.yaml — the prEN 18228 cl. 10 threshold — is exceeded. The bundle reports overall_residual: CRITICAL / exceeds, with risk.opacity as the main contributor.

This is correct behavior: the per-risk gate can be green while the system-level residual exceeds the criterion. These are two distinct evaluation levels, as established by prEN 18228 cl. 10 versus the per-risk appetite of ISO 23894 §6.4.4. This tension requires an additional treatment of risk.opacity (explainability improvement) to resolve.
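A small Python sketch may help keep the two levels apart. It assumes a four-step ordinal scale, a worst-case aggregation rule for the overall residual, and a CRITICAL residual for risk.opacity; none of these are confirmed here as the engine's actual scale or aggregation, so read it only as an illustration of why the two checks can disagree.

```python
# Assumed ordinal scale, from least to most severe.
LEVELS = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]


def exceeds(level, limit):
    """True when `level` is strictly more severe than `limit`."""
    return LEVELS.index(level) > LEVELS.index(limit)


# Residual levels after the V2 treatment (the risk.opacity level is an assumption).
residuals = {
    "risk.unfair-credit-exclusion": "MEDIUM",
    "risk.opacity": "CRITICAL",
}

# Level 1: per-risk evaluation against the appetite (ISO 23894 §6.4.4).
risk_within_appetite = not exceeds(residuals["risk.unfair-credit-exclusion"], "MEDIUM")

# Level 2: system-level residual against overall_residual_criterion (prEN 18228 cl. 10).
# Worst-case aggregation is an assumption made for this sketch.
overall = max(residuals.values(), key=LEVELS.index)
overall_exceeds = exceeds(overall, "HIGH")
main_contributor = max(residuals, key=lambda risk: LEVELS.index(residuals[risk]))

print(f"per-risk within appetite: {risk_within_appetite}")
print(f"overall_residual: {overall} / exceeds={overall_exceeds} ({main_contributor})")
```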

Step 9 — `sei conformance`: dual standard projection

The signed bundle from T1 can be projected onto two standards without re-annotating the scenario. Venturalítica implements conformance by projection: one AssuranceProgram → N reports.

```
sei conformance --standard eu/pren-18228@2026 --out
```

Projection onto prEN 18228 (the harmonised risk management standard, Art. 9 EU AI Act):

| Clause | Result | Why |
| --- | --- | --- |
| cl. 9.2 (control measures) | Covered | `unfair-credit-exclusion` is a blocking control measure aligned with cl. 9.2 |
| cl. 10 (overall residual) | Gap | The CRITICAL aggregate residual exceeds the `HIGH` criterion; `opacity` contributes |

```
sei conformance --standard iso/23894@2023
```

Projection onto ISO 23894 (AI risk management):

| Clause | Result |
| --- | --- |
| cl. 6.5 (treatment and residual) | Covered: `risk.unfair-credit-exclusion` cycle CLOSED with FAIL→PASS arc |

The --out flag writes .sei/conformance/eu_pren-18228_2026.json + .sig to the repository. The same signed bundle produces both reports with no additional annotation.

Read more about the two gates (DORA / MDR) in Two gates and the CLI in sei CLI reference.

Step 10 — `sei approve`: management approval

```
sei approve --by "Jane Roe <jane@acme.example>"
```

Creates an empty git commit with the trailer Sei-Approved-by: Jane Roe <jane@acme.example>. This act satisfies ISO/IEC 42001 §6.1.3 (management approval of the treatment plan and acceptance of the residual): it is attributable, dated, and separate from the evaluator’s act.

sei reconstruct picks up this commit as the approval when reconstructing the cycle.
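Because the approval is an ordinary git trailer, it can be read back from a commit message with plain string handling. The sketch below is a simplified illustration, not how sei reconstruct is implemented: the approvals helper and the commit subject line are hypothetical, and only the Sei-Approved-by trailer key comes from this tutorial.

```python
def approvals(commit_message, key="Sei-Approved-by"):
    """Return the values of `key` trailers found in the last paragraph of a commit message."""
    paragraphs = [p for p in commit_message.strip().split("\n\n") if p.strip()]
    if len(paragraphs) < 2:
        # Simplification: trailers are expected after the subject, in their own paragraph.
        return []
    found = []
    for line in paragraphs[-1].splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() == key:
            found.append(value.strip())
    return found


# Hypothetical subject line; the trailer matches the one created in this step.
message = (
    "approve treatment plan for loan\n"
    "\n"
    "Sei-Approved-by: Jane Roe <jane@acme.example>\n"
)

approvers = approvals(message)
print(approvers)
```

Keeping the approval in a trailer means it travels with the git history and stays attributable to a specific, dated commit.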

Step 11 — `sei reconstruct`: replay of the ISO 23894 cycle

```
sei reconstruct --out
```

sei reconstruct traverses the git history of the bundle (.sei/bundle.json) and reconstructs the ISO 23894 cycle per risk, deterministically and without an LLM:

| Phase | What it reconstructs |
| --- | --- |
| ① Identification | The commit that introduced `risk.unfair-credit-exclusion` in the AssuranceProgram |
| ② Analysis | Likelihood × impact → inherent level HIGH (5×5 matrix ISO 23894 §6.4.3) |
| ③ Evaluation | Inherent HIGH vs appetite MEDIUM → EXCEEDS, treatment required |
| ④ Treatment | The V2 commit: `train.py` FAILS→PASSES (empirical arc from the bundle) |
| ⑤ Residual | Inherent HIGH → residual MEDIUM, cycle CLOSED (§6.5) |

The output also shows Jane Roe’s approval and the UNDERPOWERED warning with its CI:

```
unfair-credit-exclusion: 0.0151 < 0.03 PASSES
  bootstrap CI [0.001, 0.072] — n=1000 — UNDERPOWERED
  insufficient power, more samples needed (Art. 10(5))
```

The artifact .sei/reconstruct.json + .sig is signed. See git closes the loop for the normative interpretation.
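To see why a passing point estimate can still be flagged UNDERPOWERED, the following Python sketch computes a percentile bootstrap confidence interval for demographic parity difference on synthetic data and flags the result when the interval straddles the threshold. The synthetic predictions, the number of resamples, the 95% level, and the flagging rule are assumptions for illustration; the engine's exact procedure is not specified in this tutorial.

```python
import random


def selection_rate_gap(preds, groups):
    """Demographic parity difference: max minus min selection rate across groups."""
    rates = []
    for g in set(groups):
        members = [p for p, gg in zip(preds, groups) if gg == g]
        rates.append(sum(members) / len(members))
    return max(rates) - min(rates)


def bootstrap_ci(preds, groups, n_boot=500, alpha=0.05, seed=42):
    """Percentile bootstrap interval for the selection rate gap."""
    rng = random.Random(seed)
    n = len(preds)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample applicants with replacement
        stats.append(selection_rate_gap([preds[i] for i in idx], [groups[i] for i in idx]))
    stats.sort()
    lo = stats[int(alpha / 2 * len(stats))]
    hi = stats[int((1 - alpha / 2) * len(stats)) - 1]
    return lo, hi


# Synthetic stand-in for 1,000 applicants: approval rates of 0.700 and 0.686.
preds = [1] * 350 + [0] * 150 + [1] * 343 + [0] * 157
groups = ["a"] * 500 + ["b"] * 500

threshold = 0.03
point = selection_rate_gap(preds, groups)
lo, hi = bootstrap_ci(preds, groups)
# Assumed rule: the point estimate passes, but the interval crosses the threshold.
underpowered = point < threshold and lo < threshold < hi

print(f"point={point:.3f} CI=[{lo:.3f}, {hi:.3f}] underpowered={underpowered}")
```

With about 500 applicants per group, the sampling noise of each selection rate is of the same order as the threshold itself, so the interval is wide even though the point estimate sits below 0.03.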

Step 12 — The Annex IV (Art. 11) is assembled by the cloud

The engine does not emit the Annex IV: it only leaves the signed evidence (.sei/bundle.json and friends). The Annex IV (EU AI Act Art. 11) is assembled and rendered by the cloud (control plane) from that signed bundle.json, with provenance per field, and delivered as a PDF. There is no sei annex-iv subcommand and no annex-iv.json artifact to commit.

When the cloud reads the signed bundle from the repository, it tags the provenance of each Annex IV field:

- Declared: fields from sei.yaml (name, purpose, system type)
- Derived from bundle: model digest, control results, dvc.lock
- Derived from git: commit history, management approval

What you built

By completing this tutorial you have run a full RDD cycle:

- An AssuranceProgram (Art. 9) with 5 risks, 16 measures (10 ex-ante controls), and a blocking control aligned with prEN 18228 cl. 9.2 and ISO 23894 §6.5
- A reproducible DVC pipeline (featurize → evaluate) with selective staleness reflecting typed drift
- A verifiable red/green loop: T0 gate RED → treatment V2 (commit) → T1 gate GREEN
- A signed evidence bundle (ECDSA-P256+DSSE+in-toto) anchoring code, model, and data
- Dual conformance projected: prEN 18228 + ISO 23894 from the same bundle, no re-annotation
- A reconstructed ISO 23894 cycle via git replay, deterministic, with management approval
- The signed evidence from which the cloud assembles the Annex IV with per-field provenance (still partial: in this scenario §3 and §9 stay PENDING, while §7/§8 are never pending)

What remains pending

| Gap | Why it matters |
| --- | --- |
| `risk.opacity` without effective treatment | Overall residual exceeds `overall_residual_criterion: HIGH` (prEN 18228 cl. 10) |
| `unfair-credit-exclusion` UNDERPOWERED | n=1,000 is insufficient; the CI [0.001, 0.072] crosses the 0.03 threshold (Art. 10(5)) |
| Annex IV §3/§9 PENDING in this scenario | §2/§3/§4/§9 can be pending when their input is missing. Here, with data and controls present, only §3 (no Art. 14 means or residuals in the bundle) and §9 (no post-market measures) are pending; §7/§8 are never pending |
| Keyless identity anchoring (Sigstore) | The ECDSA-P256+DSSE signature is done; the external trust root is not |

See State and incompletions for the full gap map and What is Risk-Driven Development for the normative background.
