Your first compliant system
🟡 Partial — The arc runs on the repo fixtures; the Annex IV is later assembled by the cloud and is still partial.
This tutorial walks you through the golden path of Venturalítica’s git-native specialization of Risk-Driven Development (RDD). You build loan, a high-risk credit classifier under EU AI Act Annex III §5, from configuration files to the signed evidence bundle. By the end you will have completed a full ISO 23894 §6.4–§6.5 cycle with versioned evidence and dual conformance (prEN 18228 + ISO 23894). The Annex IV (Art. 11) is then assembled by the cloud from that signed bundle.
The files you will use are in crates/seigarrena-cli/tests/resources/loan/. This is the same scenario the integration test runs; here you walk it manually to understand what each step does.
What you will learn:
- Declare risks, measures, and risk appetite in the Art. 9 risk programme (the risk: section of sei.yaml)
- Compile the program to OSCAL and reproduce the pipeline with sei run
- Read the red/green gate and understand what UNDERPOWERED means
- Commit the treatment as a normative act that closes the ISO 23894 loop
- Project the signed bundle to two standards with sei conformance
- Register management approval and reconstruct the cycle with sei reconstruct
Project context
loan is a binary logistic classifier trained on the German Credit dataset (1,000 applicants) to approve or deny consumer loans. The system:
- Is high-risk (EU AI Act Annex III §5, creditworthiness assessment systems)
- Is operated by a financial entity, so DORA (EU Reg. 2022/2554) also applies
- Has an AssuranceProgram (the risk: section of sei.yaml) with 5 risks and 16 measures → 10 ex-ante controls
- Uses a multi-stage DVC pipeline (featurize → evaluate) with an agnostic evaluator in compliance_eval.py
The main bias metric is demographic parity difference by gender (demographic_parity_diff). Model V1 (unmitigated) fails the blocking control; treatment V2 (fairlearn ExponentiatedGradient + DemographicParity) closes it.
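To make the metric concrete, here is a minimal sketch of demographic parity difference, computed as the gap between the highest and lowest selection rate across groups. The helper name and the toy data are illustrative assumptions; the scenario computes the metric through the venturalitica SDK, whose implementation may differ.

```python
from collections import defaultdict


def demographic_parity_diff(y_pred, sensitive):
    """Largest gap in selection rate (share of positive predictions) across groups."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(y_pred, sensitive):
        totals[group] += 1
        positives[group] += int(pred == 1)
    rates = {g: positives[g] / totals[g] for g in totals}
    return max(rates.values()) - min(rates.values())


# Toy data (hypothetical): 1 = approve, 0 = deny
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 1, 0]
gender = ["m", "m", "m", "m", "m", "f", "f", "f", "f", "f"]
dpd = demographic_parity_diff(y_pred, gender)
print(round(dpd, 2))
```

In this toy sample the approval rate is 4/5 for one group and 1/5 for the other, so the gap is far above the 0.03 limit that the blocking control enforces.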
Step 1 — Set up the environment
- Verify that sei is compiled and available:

      sei --help

  All 16 subcommands should be listed (init, run, status, verify, compile, reconstruct, assess, soa, conformance, impact, request, approve, review, reject, retire, pubkey). If not, follow the installation guide.

- Create a working directory and copy the scenario files:

      mkdir /tmp/loan && cd /tmp/loan
      cp -r /path/to/seigarrena/crates/seigarrena-cli/tests/resources/loan/. .

- Initialize git and DVC:

      git init
      git add sei.yaml compliance_eval.py train.py \
        featurize.py evaluate.py dvc.yaml params.yaml
      dvc init
      dvc add data/german_credit.csv
      git add data/german_credit.csv.dvc .gitignore .dvc/

- Create the Python environment with uv:

      uv venv .venv
      uv pip install \
        "venturalitica==0.6.11" \
        "mlcroissant>=1.0" \
        "dvc>=3" \
        "pyarrow" \
        "scikit-learn==1.8.0" \
        "fairlearn==0.13.0"

  The versions of scikit-learn and fairlearn are pinned: metric values must be deterministic because sei.lock anchors them in the signed evidence.
Step 2 — Project files
Before running anything, examine the structure you just copied:

    loan/
    ├── sei.yaml                          # System manifest (includes the Art. 9 risk programme)
    ├── compliance_eval.py                # Agnostic evaluator (venturalitica-sdk + Croissant)
    ├── train.py → train_unaware.py       # Active treatment (V1 at start)
    ├── featurize.py                      # DVC stage: CSV → features.parquet
    ├── evaluate.py                       # DVC stage: train + eval + model.pkl
    ├── dvc.yaml                          # DAG pipeline featurize → evaluate
    ├── params.yaml                       # seed: 42
    └── data/
        ├── german_credit.csv             # German Credit dataset (1,000 applicants)
        └── german_credit.croissant.json  # Data governance §2 (how data is loaded)

sei.yaml — the manifest
Below is an illustrative, abbreviated version of the manifest (the risks: block is summarised; the scenario file ships the full one). It is meant to convey the shape of a sei.yaml:

    apiVersion: seigarrena.dev/v1alpha1
    kind: AISystem
    system:
      name: loan-scoring
      intended_purpose: "Credit scoring for consumer loan approval."
    task:
      modality: tabular
      type: classification
    eval: { script: train.py }
    pipeline: { tool: dvc, metrics: metrics.json }
    oscal: { assessment_plan: shared_data/policies/assessment_plan.oscal.yaml }
    dataset: { croissant: data/german_credit.croissant.json }
    artifacts:
      model: { kind: logreg, seed: 42 }
    risk:
      appetite: { individual: MEDIUM, society: MEDIUM, organization: HIGH }
      criteria: { scale: "5x5" }
      overall_residual_criterion: HIGH
      risks:
        # 5 risks with their measures; see the blocking control excerpt below

Key fields: eval.script points to the active treatment (the only thing that changes V1↔V2); pipeline.tool: dvc declares the reproduction seam; risk.appetite sets the per-risk appetite for ISO 23894 §6.4.4 evaluation. The risk: block is the Art. 9 risk programme (the AssuranceProgram): it lives here, in sei.yaml, not in a separate file.
See the sei.yaml reference for the full schema.
The risk: section — the Art. 9 risk programme
The AssuranceProgram (the risk: section of sei.yaml) declares 5 risks and 16 measures (from which sei compile derives 10 ex-ante controls). The blocking control is:

    - id: unfair-credit-exclusion
      metric: demographic_parity_diff
      constraint: "< 0.03"
      severity: high
      enforcement: gate
      lifecycle: [validation]
      article: "15"
      frameworks: [eu/dora@2022#art-6]
      standard_clauses: ["eu/pren-18228@2026#9.2", "eu/pren-18228@2026#7", "iso/23894@2023#6.5"]

enforcement: gate means that if the measured metric does not satisfy the constraint, sei run returns exit ≠ 0. The DORA Art. 6 framework is added because the model is an ICT asset of a financial entity.
The remaining risks are risk.data-governance (4 data measures), risk.opacity (performance + oversight), risk.model-robustness-security (robustness + integrity), and risk.insufficient-human-oversight. They contribute enforcement: audit and declared-oversight measures: these are measured and reported, but do not block the gate the way unfair-credit-exclusion does.
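The difference between gate and audit enforcement can be sketched as follows. This is an illustration only: check_constraint and gate_exit_code are hypothetical helpers, the constraint grammar is assumed from the "< 0.03" example, and the real engine judges results against the compiled OSCAL assessment plan.

```python
import operator
import re

# Comparison operators a constraint string may start with (assumed syntax).
_OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge}


def check_constraint(constraint: str, value: float) -> bool:
    """Return True when `value` satisfies a constraint such as "< 0.03"."""
    match = re.fullmatch(r"\s*(<=|>=|<|>)\s*([0-9.]+)\s*", constraint)
    if match is None:
        raise ValueError(f"unsupported constraint: {constraint!r}")
    op, threshold = match.groups()
    return _OPS[op](value, float(threshold))


def gate_exit_code(controls, metrics) -> int:
    """Exit 1 if any control with enforcement 'gate' fails; audit controls never block."""
    for control in controls:
        passed = check_constraint(control["constraint"], metrics[control["metric"]])
        if not passed and control["enforcement"] == "gate":
            return 1
    return 0


control = {
    "id": "unfair-credit-exclusion",
    "metric": "demographic_parity_diff",
    "constraint": "< 0.03",
    "enforcement": "gate",
}
print(gate_exit_code([control], {"demographic_parity_diff": 0.06}))   # V1 value
print(gate_exit_code([control], {"demographic_parity_diff": 0.015}))  # V2 value
```

A failing control whose enforcement is audit would still be reported, but under this sketch it leaves the exit code at 0.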
Step 3 — sei compile: AssuranceProgram → OSCAL
Run:

    sei compile

sei compile reads the risk: section of sei.yaml, builds the AssuranceProgram in memory, and generates the OSCAL assessment plan at the path declared in sei.yaml (shared_data/policies/assessment_plan.oscal.yaml). This file is a dependency of the evaluate stage in the DVC DAG, so its digest enters dvc.lock.
Make the initial commit:

    git add shared_data/policies/assessment_plan.oscal.yaml
    git commit -m "init loan: AssuranceProgram compiled to OSCAL"

See the sei CLI reference for full subcommand details.
Step 4 — T0: sei run with V1 → RED gate
Run:

    sei run

sei run reproduces the DVC pipeline (dvc repro) over the featurize → evaluate DAG:
- featurize: loads German Credit via Croissant (§2) and materialises data/features.parquet (cached output, content-addressed)
- evaluate: trains model V1 with compliance_eval.py (10 ex-ante controls via vl.enforce) and writes metrics.json and model.pkl (cached output)
The engine judges results against the OSCAL assessment plan and applies the gate.
Expected result: sei run returns exit ≠ 0. The blocking control fails:

    control: unfair-credit-exclusion
    metric: demographic_parity_diff
    measured value: ~0.06
    threshold: 0.03
    status: FAILS (enforcement: gate)
    frameworks: eu/dora@2022#art-6

Risk analysis shows:
- risk.unfair-credit-exclusion: inherent HIGH, cycle OPEN
- The bundle .sei/bundle.json is still signed and anchored (evidence of failure is evidence)
Verify the signature and check status:

    sei verify   # OK: ECDSA-P256+DSSE+in-toto signature valid
    sei status   # RED: gate failing on unfair-credit-exclusion

Commit the T0 evidence:

    git add sei.lock .sei/bundle.json .sei/bundle.json.sig metrics.json
    git commit -m "T0: V1 evidence (gate RED; dvc.lock + metrics.json anchored)"

Read more about the red/green loop and statistical reliability in The red/green loop.
Step 5 — Explore the treatment candidate with dvc exp
Before committing the treatment, confirm that V2 closes the gate. Copy the mitigated file without committing:

    cp train_mitigated.py train.py
    dvc exp run
    dvc exp show

dvc exp run executes V2 as a DVC experiment (git-stashed, not entering the history). dvc exp show compares the unfair-credit-exclusion metric of the candidate against the V1 baseline:

    Experiment     unfair-credit-exclusion   accuracy_score
    baseline V1    0.060                     0.82
    candidate V2   0.015                     0.80

The candidate brings demographic parity difference below 0.03, so it would close the gate. This is the search for the treatment, not the treatment itself: the experiment is not yet committed to git.
What V2 does (train_mitigated.py): it applies fairlearn’s ExponentiatedGradient reduction with a DemographicParity() constraint and eps=0.01, using gender as the sensitive feature. The reduction is an in-processing mitigation: it learns an ensemble of predictors whose selection rate is approximately independent of gender, producing a realistic residual demographic parity difference of ~0.015.
Step 6 — Commit the treatment: the normative act
The exploration confirmed that V2 closes the gate. Now promote the candidate to treatment:

    git add train.py
    git commit -m "T1: treatment — promotes V2 (mitigated train.py, candidate that closes the gate)"

This commit is the ISO 23894 §6.5 treatment: a versioned change to which the FAIL→PASS arc can be attributed. sei reconstruct will locate it via git log -S unfair-credit-exclusion and reconstruct the cycle from it.
Only train.py changed. featurize.py, the data, and the risk classification are reused — this is a class-B drift (Art. 15): the model changes, but the purpose and dataset do not. DVC staleness reflects exactly this.
See The three treatment modalities (modality 1 = code change) and git closes the loop.
Step 7 — sei status: typed drift after the treatment
Run:

    sei status

The engine detects that train.py changed relative to the last anchor. It reports typed drift:
| Section | Status | Why |
|---|---|---|
| measurement-model | stale (class B, Art. 15) | train.py is a dep of the model phase; its digest changed |
| measurement-data | reused | Data and Croissant did not change |
| data_governance | reused | Croissant (§2) is the same |
| classification | reused | Purpose and context are the same |
sei status returns exit ≠ 0 because the model section is stale: the pipeline needs to recompute.
Read about drift classification in Typed drift.
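One way to picture typed drift is as a digest comparison per section: a section is stale when any of its dependencies no longer matches the anchored digest. The sketch below is a simplified model of that idea, not the engine’s implementation; the section-to-dependency mapping and the byte contents are hypothetical.

```python
import hashlib


def digest(content: bytes) -> str:
    """Content address of a dependency (SHA-256 hex digest)."""
    return hashlib.sha256(content).hexdigest()


def typed_drift(anchored: dict, current: dict, sections: dict) -> dict:
    """Mark a section 'stale' if any of its dependencies changed digest, else 'reused'."""
    report = {}
    for section, deps in sections.items():
        changed = any(anchored.get(dep) != current.get(dep) for dep in deps)
        report[section] = "stale" if changed else "reused"
    return report


# Hypothetical section -> dependency mapping, for illustration only.
sections = {
    "measurement-model": ["train.py"],
    "measurement-data": ["data/german_credit.csv"],
    "data_governance": ["data/german_credit.croissant.json"],
}
anchored = {
    "train.py": digest(b"V1 unaware"),
    "data/german_credit.csv": digest(b"csv"),
    "data/german_credit.croissant.json": digest(b"croissant"),
}
# Only train.py changed between the anchor and the working tree.
current = dict(anchored, **{"train.py": digest(b"V2 mitigated")})

report = typed_drift(anchored, current, sections)
exit_code = 1 if "stale" in report.values() else 0
print(report, exit_code)
```

Under this model, replacing train.py flips only the model section to stale, which matches the table: data and data governance are reused.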
Step 8 — T1: sei run with V2 → GREEN gate
Run:

    sei run

dvc repro detects that train.py (a dependency of the evaluate stage) changed and recomputes only that stage. The featurize stage remains cached: it does not re-execute because the data did not change. This is DVC selective staleness, which maps exactly to the engine’s typed drift.
Expected result: sei run returns exit 0. The blocking control passes:

    control: unfair-credit-exclusion
    measured value: ~0.015
    threshold: 0.03
    status: PASSES

The engine records a treatment event anchored to the commit: unfair-credit-exclusion FAILS→PASSES. Risk analysis shows:
- risk.unfair-credit-exclusion: inherent HIGH → residual MEDIUM, cycle CLOSED
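The OPEN and CLOSED states can be read as an ordinal comparison between risk levels and the declared appetite. The sketch below assumes a four-level scale ordered LOW < MEDIUM < HIGH < CRITICAL and assumes that a cycle closes once the gate passes and the residual is within appetite; both are illustrative assumptions, and the function names are hypothetical.

```python
LEVELS = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]  # assumed ordering, lowest to highest


def exceeds(level: str, limit: str) -> bool:
    """True when `level` is strictly above `limit` on the ordinal scale."""
    return LEVELS.index(level) > LEVELS.index(limit)


def evaluate_risk(inherent: str, residual: str, appetite: str, gate_passes: bool) -> dict:
    """Assumed rule: the cycle closes once the gate passes and the residual is within appetite."""
    closed = gate_passes and not exceeds(residual, appetite)
    return {
        "treatment_required": exceeds(inherent, appetite),
        "cycle": "CLOSED" if closed else "OPEN",
    }


# T0: no treatment yet, so the residual is taken to equal the inherent level (assumption).
t0 = evaluate_risk("HIGH", "HIGH", "MEDIUM", gate_passes=False)
# T1: V2 lowers the residual to MEDIUM and the gate passes.
t1 = evaluate_risk("HIGH", "MEDIUM", "MEDIUM", gate_passes=True)
print(t0)
print(t1)
```

The same comparison explains why an aggregate residual of CRITICAL would exceed an overall criterion of HIGH.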
Step 9 — sei conformance: dual standard projection
The signed bundle from T1 can be projected onto two standards without re-annotating the scenario. Venturalítica implements conformance by projection: one AssuranceProgram → N reports.

    sei conformance --standard eu/pren-18228@2026 --out

Projection onto prEN 18228 (the harmonised risk management standard, Art. 9 EU AI Act):
| Clause | Result | Why |
|---|---|---|
| cl. 9.2 (control measures) | Covered | unfair-credit-exclusion is a blocking control measure aligned with cl. 9.2 |
| cl. 10 (overall residual) | Gap | The CRITICAL aggregate residual exceeds the HIGH criterion; opacity contributes |
Run the second projection:

    sei conformance --standard iso/23894@2023

Projection onto ISO 23894 (AI risk management):
| Clause | Result |
|---|---|
| cl. 6.5 (treatment and residual) | Covered — risk.unfair-credit-exclusion cycle CLOSED with FAIL→PASS arc |
The --out flag writes .sei/conformance/eu_pren-18228_2026.json + .sig to the repository. The same signed bundle produces both reports with no additional annotation.
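Conformance by projection can be sketched as grouping control results by the clause references each control already carries. The reference format (standard, then #, then clause) is taken from the standard_clauses field of the blocking control; the project helper and the status field are hypothetical and only illustrate the idea.

```python
from collections import defaultdict


def project(controls, standard):
    """Group control results by the clauses of one standard (e.g. 'iso/23894@2023')."""
    by_clause = defaultdict(list)
    for control in controls:
        for ref in control["standard_clauses"]:
            std, _, clause = ref.partition("#")
            if std == standard:
                by_clause[clause].append((control["id"], control["status"]))
    return dict(by_clause)


controls = [{
    "id": "unfair-credit-exclusion",
    "status": "PASSES",  # hypothetical field holding the gate result
    "standard_clauses": [
        "eu/pren-18228@2026#9.2",
        "eu/pren-18228@2026#7",
        "iso/23894@2023#6.5",
    ],
}]

pren = project(controls, "eu/pren-18228@2026")
iso = project(controls, "iso/23894@2023")
print(pren)
print(iso)
```

Because the clause references live on the control, adding a third standard in this model means adding references, not re-running the evaluation.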
Read more about the two gates (DORA / MDR) in Two gates and the CLI in sei CLI reference.
Step 10 — sei approve: management approval
Run:

    sei approve --by "Jane Roe <jane@acme.example>"

This creates an empty git commit with the trailer Sei-Approved-by: Jane Roe <jane@acme.example>. This act satisfies ISO/IEC 42001 §6.1.3 (management approval of the treatment plan and acceptance of the residual): it is attributable, dated, and separate from the evaluator’s act.
sei reconstruct picks up this commit as the approval when reconstructing the cycle.
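As an illustration of how an approval trailer can be recovered from a commit message, the sketch below extracts the name and email with a regular expression. The find_approval helper and the commit subject line are hypothetical; how sei reconstruct actually reads the trailer is not specified here.

```python
import re

# Matches a trailer line such as: Sei-Approved-by: Jane Roe <jane@acme.example>
TRAILER = re.compile(
    r"^Sei-Approved-by:\s*(?P<name>.+?)\s*<(?P<email>[^<>]+)>\s*$",
    re.MULTILINE,
)


def find_approval(commit_message: str):
    """Return (name, email) from a Sei-Approved-by trailer, or None if absent."""
    match = TRAILER.search(commit_message)
    return (match["name"], match["email"]) if match else None


# Hypothetical commit message; only the trailer format comes from the tutorial.
message = "approve treatment\n\nSei-Approved-by: Jane Roe <jane@acme.example>\n"
print(find_approval(message))
```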
Step 11 — sei reconstruct: replay of the ISO 23894 cycle
Run:

    sei reconstruct --out

sei reconstruct traverses the git history of the bundle (.sei/bundle.json) and reconstructs the ISO 23894 cycle per risk, deterministically and without an LLM:
| Phase | What it reconstructs |
|---|---|
| ① Identification | The commit that introduced risk.unfair-credit-exclusion in the AssuranceProgram |
| ② Analysis | Likelihood × impact → inherent level HIGH (5×5 matrix ISO 23894 §6.4.3) |
| ③ Evaluation | Inherent HIGH vs appetite MEDIUM → EXCEEDS, treatment required |
| ④ Treatment | The V2 commit: train.py FAILS→PASSES (empirical arc from the bundle) |
| ⑤ Residual | Inherent HIGH → residual MEDIUM, cycle CLOSED (§6.5) |
The output also shows Jane Roe’s approval and the UNDERPOWERED warning with its CI:

    unfair-credit-exclusion: 0.0151 < 0.03 PASSES
    bootstrap CI [0.001, 0.072] — n=1000 — UNDERPOWERED
    insufficient power, more samples needed (Art. 10(5))

The artifact .sei/reconstruct.json + .sig is signed. See git closes the loop for the normative interpretation.
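To see why a passing point estimate can still be flagged, consider a percentile bootstrap of the selection-rate gap. The sketch below uses synthetic data and assumes that UNDERPOWERED means the confidence interval straddles the threshold; the engine’s actual bootstrap procedure and power criterion may differ, and all helper names are hypothetical.

```python
import random


def selection_rate_gap(rows):
    """Absolute gap in approval rate between the two groups in `rows`."""
    rates = []
    for group in ("f", "m"):
        preds = [pred for pred, g in rows if g == group]
        rates.append(sum(preds) / len(preds) if preds else 0.0)
    return abs(rates[0] - rates[1])


def bootstrap_ci(rows, n_boot=200, alpha=0.05, seed=42):
    """Percentile bootstrap CI: resample rows with replacement, recompute the gap."""
    rng = random.Random(seed)
    stats = sorted(
        selection_rate_gap([rng.choice(rows) for _ in rows]) for _ in range(n_boot)
    )
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


def verdict(point, ci, threshold):
    """Status from the point estimate; underpowered if the CI straddles the threshold."""
    status = "PASSES" if point < threshold else "FAILS"
    underpowered = ci[0] < threshold <= ci[1]
    return status, underpowered


# Synthetic predictions (1 = approve): 500 rows per group, approval rates 0.700 and 0.714.
rows = [(1, "f")] * 350 + [(0, "f")] * 150 + [(1, "m")] * 357 + [(0, "m")] * 143
point = selection_rate_gap(rows)
ci = bootstrap_ci(rows)
print(round(point, 3), ci, verdict(point, ci, threshold=0.03))
```

With 1,000 rows the resampled gap varies by a few percentage points, so the interval is wide relative to a 0.03 threshold even though the point estimate sits below it.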
Step 12 — The Annex IV (Art. 11) is assembled by the cloud
The engine does not emit the Annex IV: it only leaves the signed evidence (.sei/bundle.json and friends). The Annex IV (EU AI Act Art. 11) is assembled and rendered by the cloud (control plane) from that signed bundle.json, with provenance per field, and delivered as a PDF. There is no sei annex-iv subcommand and no annex-iv.json artifact to commit.
When the cloud reads the signed bundle from the repository, it tags the provenance of each Annex IV field:
- Declared: fields from sei.yaml (name, purpose, system type)
- Derived from bundle: model digest, control results, dvc.lock
- Derived from git: commit history, management approval
What you built
By completing this tutorial you have run a full RDD cycle:
- An AssuranceProgram (Art. 9) with 5 risks, 16 measures (10 ex-ante controls), and a blocking control aligned with prEN 18228 cl. 9.2 and ISO 23894 §6.5
- A reproducible DVC pipeline (featurize → evaluate) with selective staleness reflecting typed drift
- A verifiable red/green loop: T0 gate RED → treatment V2 (commit) → T1 gate GREEN
- A signed evidence bundle (ECDSA-P256+DSSE+in-toto) anchoring code, model, and data
- Dual conformance projected: prEN 18228 + ISO 23894 from the same bundle, no re-annotation
- A reconstructed ISO 23894 cycle via git replay, deterministic, with management approval
- The signed evidence from which the cloud assembles the Annex IV with per-field provenance (still partial: in this scenario only §3 and §9 stay PENDING; §7 and §8 are never pending)
What remains pending
| Gap | Why it matters |
|---|---|
| risk.opacity without effective treatment | Overall residual exceeds overall_residual_criterion: HIGH (prEN 18228 cl. 10) |
| unfair-credit-exclusion UNDERPOWERED | n=1,000 insufficient; CI [0.001, 0.072] crosses threshold (Art. 10(5)) |
| Annex IV §3/§9 PENDING in this scenario | §2, §3, §4, and §9 can be pending when their input is missing. Here, with data and controls present, only §3 (no Art. 14 means or residuals in the bundle) and §9 (no post-market measures) are pending; §7 and §8 are never pending |
| Keyless identity anchoring (Sigstore) | ECDSA-P256+DSSE signature is done; external trust root is not |
See State and incompletions for the full gap map and What is Risk-Driven Development for the normative background.