
Your first compliant system

🟡 Partial — The arc runs on the repo fixtures; the Annex IV is later assembled by the cloud and is still partial.

This tutorial walks you through the golden path of Venturalítica’s git-native specialization of Risk-Driven Development (RDD): you build loan, a high-risk credit classifier under EU AI Act Annex III §5, from configuration files to the signed evidence bundle. By the end you will have completed a full ISO 23894 §6.4–§6.5 cycle with versioned evidence and dual conformance (prEN 18228 + ISO 23894). The cloud then assembles the Annex IV (Art. 11) from that signed bundle.

The files you will use are in crates/seigarrena-cli/tests/resources/loan/. This is the same scenario the integration test runs; here you walk it manually to understand what each step does.

What you will learn:

  • Declare risks, measures, and risk appetite in the Art. 9 risk programme (the risk: section of sei.yaml)
  • Compile the risk programme to OSCAL and reproduce the pipeline with sei run
  • Read the red/green gate and understand what UNDERPOWERED means
  • Commit the treatment as a normative act that closes the ISO 23894 loop
  • Project the signed bundle to two standards with sei conformance
  • Register management approval and reconstruct the cycle with sei reconstruct

loan is a binary logistic classifier trained on the German Credit dataset (1,000 applicants) to approve or deny consumer loans. The system:

  • Is high-risk (EU AI Act Annex III §5 — creditworthiness assessment systems)
  • Is operated by a financial entity, so DORA (EU Reg. 2022/2554) also applies
  • Has an AssuranceProgram (the risk: section of sei.yaml) with 5 risks and 16 measures → 10 ex-ante controls
  • Uses a multi-stage DVC pipeline (featurize → evaluate) with an agnostic evaluator in compliance_eval.py

The main bias metric is demographic parity difference by gender (demographic_parity_diff). Model V1 (unmitigated) fails the blocking control; treatment V2 (fairlearn ExponentiatedGradient + DemographicParity) closes it.
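To make the metric concrete, here is a minimal sketch of a demographic parity difference: the gap between the highest and lowest positive-prediction (selection) rate across groups. It follows the usual definition of the metric; whether the scenario's evaluator computes it in exactly this way is an assumption, and the toy data below is made up for illustration.

```python
from collections import defaultdict


def demographic_parity_diff(y_pred, sensitive):
    """Gap between the highest and lowest selection rate across groups."""
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for pred, group in zip(y_pred, sensitive):
        counts[group][0] += int(pred == 1)
        counts[group][1] += 1
    rates = [positives / total for positives, total in counts.values()]
    return max(rates) - min(rates)


# Toy predictions (illustrative only): 1 = approve, 0 = deny.
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
gender = ["m", "m", "m", "m", "f", "f", "f", "f"]

dpd = demographic_parity_diff(y_pred, gender)
print(f"demographic_parity_diff = {dpd:.2f}")
```

A value of 0 means every group is approved at the same rate; the blocking control in this tutorial requires the value to stay below 0.03.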


  1. Verify that sei is compiled and available:

    Terminal
    sei --help

    All 16 subcommands should be listed (init, run, status, verify, compile, reconstruct, assess, soa, conformance, impact, request, approve, review, reject, retire, pubkey). If not, follow the installation guide.

  2. Create a working directory and copy the scenario files:

    Terminal
    mkdir /tmp/loan && cd /tmp/loan
    cp -r /path/to/seigarrena/crates/seigarrena-cli/tests/resources/loan/. .

  3. Initialize git and DVC:

    Terminal
    git init
    git add sei.yaml compliance_eval.py train.py \
    featurize.py evaluate.py dvc.yaml params.yaml
    dvc init
    dvc add data/german_credit.csv
    git add data/german_credit.csv.dvc .gitignore .dvc/

  4. Create the Python environment with uv:

    Terminal
    uv venv .venv
    uv pip install \
    "venturalitica==0.6.11" \
    "mlcroissant>=1.0" \
    "dvc>=3" \
    "pyarrow" \
    "scikit-learn==1.8.0" \
    "fairlearn==0.13.0"

    The versions of scikit-learn and fairlearn are pinned: metric values must be deterministic because sei.lock anchors them in the signed evidence.


Before running anything, examine the structure you just copied:

loan/
├── sei.yaml                          # System manifest (includes the Art. 9 risk programme)
├── compliance_eval.py                # Agnostic evaluator (venturalitica-sdk + Croissant)
├── train.py → train_unaware.py       # Active treatment (V1 at start)
├── featurize.py                      # DVC stage: CSV → features.parquet
├── evaluate.py                       # DVC stage: train + eval + model.pkl
├── dvc.yaml                          # DAG pipeline featurize → evaluate
├── params.yaml                       # seed: 42
└── data/
    ├── german_credit.csv             # German Credit dataset (1,000 applicants)
    └── german_credit.croissant.json  # Data governance §2 (how data is loaded)

Below is an illustrative, abbreviated version of the manifest (the risks: block is summarised; the scenario file ships the full one). It is meant to convey the shape of a sei.yaml:

sei.yaml (abbreviated, illustrative)
apiVersion: seigarrena.dev/v1alpha1
kind: AISystem
system:
  name: loan-scoring
  intended_purpose: "Credit scoring for consumer loan approval."
task:
  modality: tabular
  type: classification
eval: { script: train.py }
pipeline: { tool: dvc, metrics: metrics.json }
oscal: { assessment_plan: shared_data/policies/assessment_plan.oscal.yaml }
dataset: { croissant: data/german_credit.croissant.json }
artifacts:
  model: { kind: logreg, seed: 42 }
risk:
  appetite: { individual: MEDIUM, society: MEDIUM, organization: HIGH }
  criteria: { scale: "5x5" }
  overall_residual_criterion: HIGH
  risks:
    # 5 risks with their measures; see the blocking control excerpt below

Key fields: eval.script points to the active treatment (the only thing that changes V1↔V2); pipeline.tool: dvc declares the reproduction seam; risk.appetite sets the per-risk appetite for ISO 23894 §6.4.4 evaluation. The risk: block is the Art. 9 risk programme (the AssuranceProgram): it lives here, in sei.yaml, not in a separate file.

See the sei.yaml reference for the full schema.

The risk: section — the Art. 9 risk programme

The AssuranceProgram (the risk: section of sei.yaml) declares 5 risks and 16 measures (from which sei compile derives 10 ex-ante controls). The blocking control is:

sei.yaml — blocking control excerpt (risk.risks[].treat[].measures[])
- id: unfair-credit-exclusion
  metric: demographic_parity_diff
  constraint: "< 0.03"
  severity: high
  enforcement: gate
  lifecycle: [validation]
  article: "15"
  frameworks: [eu/dora@2022#art-6]
  standard_clauses: ["eu/pren-18228@2026#9.2", "eu/pren-18228@2026#7", "iso/23894@2023#6.5"]

enforcement: gate means that if the measured metric does not satisfy the constraint, sei run returns exit ≠ 0. The DORA Art. 6 framework is added because the model is an ICT asset of a financial entity.
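The gate semantics can be sketched in a few lines. This is an illustration of the behaviour described above, not the engine's implementation: the helper names (`satisfies`, `gate_exit_code`) and the second, audit-only control are hypothetical, and only simple comparison constraints such as "< 0.03" are handled.

```python
import operator
import re

_OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge}


def satisfies(constraint: str, value: float) -> bool:
    """Evaluate a constraint string such as '< 0.03' against a measured value."""
    match = re.fullmatch(r"\s*(<=|>=|<|>)\s*([0-9.]+)\s*", constraint)
    if match is None:
        raise ValueError(f"unsupported constraint: {constraint!r}")
    op, bound = match.groups()
    return _OPS[op](value, float(bound))


def gate_exit_code(controls, measured) -> int:
    """Return 1 if any 'enforcement: gate' control fails, else 0."""
    failed = [
        c["id"]
        for c in controls
        if c["enforcement"] == "gate"
        and not satisfies(c["constraint"], measured[c["metric"]])
    ]
    return 1 if failed else 0


controls = [
    {"id": "unfair-credit-exclusion", "metric": "demographic_parity_diff",
     "constraint": "< 0.03", "enforcement": "gate"},
    # Hypothetical audit-only control: reported, never blocking.
    {"id": "example-audit-measure", "metric": "accuracy_score",
     "constraint": ">= 0.90", "enforcement": "audit"},
]

print(gate_exit_code(controls, {"demographic_parity_diff": 0.06, "accuracy_score": 0.82}))
print(gate_exit_code(controls, {"demographic_parity_diff": 0.015, "accuracy_score": 0.80}))
```

The first call models the unmitigated V1 measurement and yields a non-zero code; the second models V2 and yields 0, even though the hypothetical audit control is not met.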

The remaining risks, risk.data-governance (4 data measures), risk.opacity (performance + oversight), risk.model-robustness-security (robustness + integrity), and risk.insufficient-human-oversight, contribute enforcement: audit and declared-oversight measures. These are measured and reported, but they do not block the gate the way unfair-credit-exclusion does.


Step 3 — sei compile: AssuranceProgram → OSCAL

Terminal
sei compile

sei compile reads the risk: section of sei.yaml, builds the AssuranceProgram in memory, and generates the OSCAL assessment plan at the path declared in sei.yaml (shared_data/policies/assessment_plan.oscal.yaml). This file is a dependency of the evaluate stage in the DVC DAG, so its digest is recorded in dvc.lock.
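Why the digest matters: a content digest changes whenever the file changes, so a stage that depends on the compiled plan can be detected as out of date. The sketch below hashes a file with MD5, the algorithm DVC records for files in dvc.lock by default. The file content is a made-up stand-in, not real OSCAL, and DVC's own hashing may differ in detail between versions.

```python
import hashlib
import tempfile
from pathlib import Path


def file_md5(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through MD5 and return the hex digest."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    plan = Path(tmp) / "assessment_plan.oscal.yaml"
    plan.write_text("constraint: '< 0.03'\n")  # stand-in content
    before = file_md5(plan)
    plan.write_text("constraint: '< 0.05'\n")  # someone loosens the threshold
    after = file_md5(plan)

print("digest changed:", before != after)
```

Loosening a threshold in sei.yaml and recompiling therefore changes the plan's digest, which marks the evaluate stage as needing a re-run.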

Make the initial commit:

Terminal
git add shared_data/policies/assessment_plan.oscal.yaml
git commit -m "init loan: AssuranceProgram compiled to OSCAL"

See the sei CLI reference for full subcommand details.


Step 4 — T0: sei run with V1 → RED gate

Terminal
sei run

sei run reproduces the DVC pipeline (dvc repro) over the featurize → evaluate DAG:

  1. featurize — loads German Credit via Croissant (§2), materialises data/features.parquet (cached out, content-addressed)
  2. evaluate — trains model V1 with compliance_eval.py (10 ex-ante controls via vl.enforce) and writes metrics.json and model.pkl (cached out)

The engine judges results against the OSCAL assessment plan and applies the gate.

Expected result: sei run returns exit ≠ 0. The blocking control fails:

control: unfair-credit-exclusion
metric: demographic_parity_diff
measured value: ~0.06
threshold: 0.03
status: FAILS (enforcement: gate)
frameworks: eu/dora@2022#art-6

Risk analysis shows:

  • risk.unfair-credit-exclusion: inherent HIGH, cycle OPEN
  • The bundle .sei/bundle.json is still signed and anchored (evidence of failure is evidence)

Verify the signature and check status:

Terminal
sei verify # OK — ECDSA-P256+DSSE+in-toto signature valid
sei status # RED — gate failing on unfair-credit-exclusion
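In a DSSE envelope the signature does not cover the raw payload bytes; it covers the pre-authentication encoding (PAE) of the payload type and the payload. The sketch below implements PAE as defined in the DSSE specification. How sei lays out its bundle and signature files is not shown in this tutorial, so treat the usage lines as an assumption about a generic in-toto statement rather than the actual bundle format.

```python
def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: the byte string that gets signed."""
    type_bytes = payload_type.encode("utf-8")
    return b" ".join([
        b"DSSEv1",
        str(len(type_bytes)).encode("ascii"),
        type_bytes,
        str(len(payload)).encode("ascii"),
        payload,
    ])


# Illustrative payload only; a real in-toto statement carries subjects and a predicate.
statement = b'{"_type": "https://in-toto.io/Statement/v1"}'
to_sign = pae("application/vnd.in-toto+json", statement)
print(to_sign[:40])
```

Binding the payload type into the signed bytes prevents a valid signature from being replayed against the same bytes interpreted as a different document type.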

Commit the T0 evidence:

Terminal
git add sei.lock .sei/bundle.json .sei/bundle.json.sig metrics.json
git commit -m "T0: V1 evidence (gate RED; dvc.lock + metrics.json anchored)"

Read more about the red/green loop and statistical reliability in The red/green loop.


Step 5 — Explore the treatment candidate with dvc exp

Before committing the treatment, confirm that V2 closes the gate. Copy the mitigated file without committing:

Terminal
cp train_mitigated.py train.py
dvc exp run
dvc exp show

dvc exp run executes V2 as a DVC experiment (git-stashed, not entering the history). dvc exp show compares the unfair-credit-exclusion metric of the candidate against the V1 baseline:

Experiment      unfair-credit-exclusion   accuracy_score
baseline V1     0.060                     0.82
candidate V2    0.015                     0.80

The candidate brings demographic parity difference below 0.03 → it would close the gate. This is the search for the treatment, not the treatment itself: the experiment is not yet committed to git.

What V2 does (train_mitigated.py): applies ExponentiatedGradient(DemographicParity(), eps=0.01) from fairlearn on the sensitive feature gender. The reduction learns an ensemble whose selection rate is approximately independent of gender (in-processing), producing a realistic residual demographic parity difference of ~0.015.


Step 6 — Commit the treatment: the normative act

The exploration confirmed that V2 closes the gate. Now promote the candidate to treatment:

Terminal
git add train.py
git commit -m "T1: treatment — promotes V2 (mitigated train.py, candidate that closes the gate)"

This commit is the ISO 23894 §6.5 treatment: a versioned change to which the FAIL→PASS arc can be attributed. sei reconstruct will locate it via git log -S unfair-credit-exclusion and reconstruct the cycle from it.

Only train.py changed. featurize.py, the data, and the risk classification are reused — this is a class-B drift (Art. 15): the model changes, but the purpose and dataset do not. DVC staleness reflects exactly this.

See The three treatment modalities (modality 1 = code change) and git closes the loop.


Step 7 — sei status: typed drift after the treatment

Terminal
sei status

The engine detects that train.py changed relative to the last anchor. It reports typed drift:

| Section | Status | Why |
| --- | --- | --- |
| measurement-model | stale (class B, Art. 15) | train.py is a dependency of the model phase; its digest changed |
| measurement-data | reused | Data and Croissant did not change |
| data_governance | reused | Croissant (§2) is the same |
| classification | reused | Purpose and context are the same |

sei status returns exit ≠ 0 because the model section is stale: the pipeline needs to recompute.

Read about drift classification in Typed drift.


Step 8 — T1: sei run with V2 → GREEN gate

Terminal
sei run

dvc repro detects that train.py (a dependency of the evaluate stage) changed and recomputes only that stage. The featurize stage remains cached: it does not re-execute because the data did not change. This is DVC selective staleness, which maps exactly to the engine’s typed drift.
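Selective staleness can be sketched as a walk over the stage graph: a stage is re-run when one of its dependencies changed, and its outputs are then treated as changed for downstream stages. The dependency lists below are an assumed reading of this scenario's dvc.yaml (the file itself is not reproduced in this tutorial), and the function is a simplification: real DVC compares digests and can skip downstream stages when a re-run produces identical outputs.

```python
# Assumed stage graph for illustration; the real dvc.yaml may list other deps.
STAGES = {
    "featurize": {
        "deps": ["featurize.py", "data/german_credit.csv"],
        "outs": ["data/features.parquet"],
    },
    "evaluate": {
        "deps": [
            "evaluate.py",
            "train.py",
            "data/features.parquet",
            "shared_data/policies/assessment_plan.oscal.yaml",
        ],
        "outs": ["metrics.json", "model.pkl"],
    },
}


def stale_stages(stages, changed):
    """Return the stages to re-run, assuming dict order is topological."""
    dirty = set(changed)
    stale = []
    for name, stage in stages.items():
        if dirty & set(stage["deps"]):
            stale.append(name)
            dirty |= set(stage["outs"])  # conservatively mark outputs as changed
    return stale


print(stale_stages(STAGES, {"train.py"}))
print(stale_stages(STAGES, {"data/german_credit.csv"}))
```

Changing only train.py leaves featurize untouched, while changing the CSV invalidates both stages.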

Expected result: sei run returns exit 0. The blocking control passes:

control: unfair-credit-exclusion
measured value: ~0.015
threshold: 0.03
status: PASSES

The engine records a treatment event anchored to the commit: unfair-credit-exclusion FAILS→PASSES. Risk analysis shows:

  • risk.unfair-credit-exclusion: inherent HIGH → residual MEDIUM, cycle CLOSED

Step 9 — sei conformance: dual standard projection

The signed bundle from T1 can be projected onto two standards without re-annotating the scenario. Venturalítica implements conformance by projection: one AssuranceProgram → N reports.

Terminal
sei conformance --standard eu/pren-18228@2026 --out

Projection onto prEN 18228 (the harmonised risk management standard, Art. 9 EU AI Act):

| Clause | Result | Why |
| --- | --- | --- |
| cl. 9.2 (control measures) | Covered | unfair-credit-exclusion is a blocking control measure aligned with cl. 9.2 |
| cl. 10 (overall residual) | Gap | The CRITICAL aggregate residual exceeds the HIGH criterion; opacity contributes |

Terminal
sei conformance --standard iso/23894@2023

Projection onto ISO 23894 (AI risk management):

| Clause | Result |
| --- | --- |
| cl. 6.5 (treatment and residual) | Covered: risk.unfair-credit-exclusion cycle CLOSED with FAIL→PASS arc |

The --out flag writes .sei/conformance/eu_pren-18228_2026.json and its .sig to the repository. The same signed bundle produces both reports with no additional annotation.
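Conformance by projection can be pictured as grouping the standard_clauses references of each measure by standard. The sketch below uses the references from the blocking control excerpt; the function names are hypothetical, and the file-name rule is inferred from the single output path shown above, so it is an assumption rather than documented behaviour.

```python
from collections import defaultdict


def project(measures, standard):
    """Map clause -> measure ids for one standard (refs look like 'standard#clause')."""
    covered = defaultdict(list)
    for measure in measures:
        for ref in measure["standard_clauses"]:
            std, _, clause = ref.partition("#")
            if std == standard:
                covered[clause].append(measure["id"])
    return dict(covered)


def report_stem(standard):
    """Assumed naming rule: '/' and '@' become '_'."""
    return standard.replace("/", "_").replace("@", "_")


measures = [
    {
        "id": "unfair-credit-exclusion",
        "standard_clauses": [
            "eu/pren-18228@2026#9.2",
            "eu/pren-18228@2026#7",
            "iso/23894@2023#6.5",
        ],
    },
]

for standard in ("eu/pren-18228@2026", "iso/23894@2023"):
    print(report_stem(standard), project(measures, standard))
```

One annotated programme yields one view per standard, which is why no re-annotation is needed when a second standard is requested.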

Read more about the two gates (DORA / MDR) in Two gates and the CLI in sei CLI reference.


Step 10 — sei approve: management approval

Terminal
sei approve --by "Jane Roe <jane@acme.example>"

This creates an empty git commit with the trailer Sei-Approved-by: Jane Roe <jane@acme.example>. The act satisfies ISO/IEC 42001 §6.1.3 (management approval of the treatment plan and acceptance of the residual risk): it is attributable, dated, and separate from the evaluator’s act.

sei reconstruct picks up this commit as the approval when reconstructing the cycle.
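Because the approval is an ordinary commit trailer, any tool can read it back from the history. Below is a simplified trailer parser to show the idea; it is not how sei reconstruct is implemented, it ignores edge cases that git's own trailer handling covers, and the commit subject line is made up.

```python
def parse_trailers(message: str) -> dict[str, list[str]]:
    """Parse 'Key: value' trailers from the last paragraph of a commit message."""
    paragraphs = [p for p in message.strip().split("\n\n") if p.strip()]
    trailers: dict[str, list[str]] = {}
    if len(paragraphs) < 2:  # a lone subject line carries no trailer block
        return trailers
    for line in paragraphs[-1].splitlines():
        key, sep, value = line.partition(": ")
        if sep and key and " " not in key:
            trailers.setdefault(key, []).append(value.strip())
    return trailers


# Hypothetical commit message; only the trailer line comes from the tutorial.
message = (
    "Approve treatment plan and residual risk\n"
    "\n"
    "Sei-Approved-by: Jane Roe <jane@acme.example>\n"
)

print(parse_trailers(message))
```

Keeping the approval in git means it inherits the commit's author, date, and position in history, which is what makes it attributable and dated.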


Step 11 — sei reconstruct: replay of the ISO 23894 cycle

Terminal
sei reconstruct --out

sei reconstruct traverses the git history of the bundle (.sei/bundle.json) and reconstructs the ISO 23894 cycle per risk, deterministically and without an LLM:

| Phase | What it reconstructs |
| --- | --- |
| ① Identification | The commit that introduced risk.unfair-credit-exclusion in the AssuranceProgram |
| ② Analysis | Likelihood × impact → inherent level HIGH (5×5 matrix ISO 23894 §6.4.3) |
| ③ Evaluation | Inherent HIGH vs appetite MEDIUM → EXCEEDS, treatment required |
| ④ Treatment | The V2 commit: train.py FAILS→PASSES (empirical arc from the bundle) |
| ⑤ Residual | Inherent HIGH → residual MEDIUM, cycle CLOSED (§6.5) |
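The analysis and evaluation phases can be sketched as a 5×5 scoring followed by a comparison against the appetite. The band cut-offs and the likelihood/impact values below are illustrative assumptions; the tutorial does not state the scores or banding that the engine actually uses.

```python
from enum import IntEnum


class Level(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def level_from_score(likelihood: int, impact: int) -> Level:
    """Band a 5x5 likelihood x impact score (cut-offs are assumptions)."""
    if not (1 <= likelihood <= 5 and 1 <= impact <= 5):
        raise ValueError("likelihood and impact must be in 1..5")
    score = likelihood * impact
    if score >= 20:
        return Level.CRITICAL
    if score >= 10:
        return Level.HIGH
    if score >= 5:
        return Level.MEDIUM
    return Level.LOW


def exceeds(level: Level, appetite: Level) -> bool:
    """A risk needs treatment when its level is above the appetite."""
    return level > appetite


appetite = Level.MEDIUM
inherent = level_from_score(likelihood=4, impact=4)  # hypothetical scores
residual = level_from_score(likelihood=2, impact=4)  # after treatment, hypothetical

print(inherent.name, "exceeds appetite:", exceeds(inherent, appetite))
print(residual.name, "exceeds appetite:", exceeds(residual, appetite))
```

With these assumed scores the inherent level is HIGH and exceeds the MEDIUM appetite, while the residual level is MEDIUM and does not, which mirrors phases ③ and ⑤ of the table.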

The output also shows Jane Roe’s approval and the UNDERPOWERED warning with its CI:

unfair-credit-exclusion: 0.0151 < 0.03 PASSES
bootstrap CI [0.001, 0.072] — n=1000 — UNDERPOWERED
insufficient power, more samples needed (Art. 10(5))
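To see why a passing point estimate can still be flagged, consider a percentile bootstrap over synthetic predictions. The sketch assumes a plain percentile bootstrap and a rule that flags UNDERPOWERED when the upper bound of the interval reaches the threshold; the engine's actual resampling scheme and rule are not specified here, and the data is synthetic (1,000 rows with a gap of 0.015), not the German Credit predictions.

```python
import random


def selection_rate_gap(rows):
    """rows: (prediction, group) pairs; gap between max and min selection rate."""
    totals, positives = {}, {}
    for pred, group in rows:
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + pred
    rates = [positives[g] / totals[g] for g in totals]
    return max(rates) - min(rates)


def bootstrap_ci(rows, n_boot=500, alpha=0.05, seed=42):
    """Percentile bootstrap confidence interval for the selection-rate gap."""
    rng = random.Random(seed)
    stats = sorted(
        selection_rate_gap(rng.choices(rows, k=len(rows))) for _ in range(n_boot)
    )
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


def verdict(point, ci, threshold):
    """PASS needs point < threshold; flag it when the CI reaches the threshold."""
    _, hi = ci
    if point >= threshold:
        return "FAILS"
    return "PASSES" if hi < threshold else "PASSES (UNDERPOWERED)"


# Synthetic sample: group a approves 420/600 (0.700), group b 274/400 (0.685).
rows = [(1, "a")] * 420 + [(0, "a")] * 180 + [(1, "b")] * 274 + [(0, "b")] * 126

point = selection_rate_gap(rows)
ci = bootstrap_ci(rows)
print(f"point={point:.4f} ci=[{ci[0]:.3f}, {ci[1]:.3f}] -> {verdict(point, ci, 0.03)}")
```

At this sample size the sampling noise of the gap is of the same order as the threshold itself, so the interval is wide even though the point estimate passes.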

The artifact .sei/reconstruct.json + .sig is signed. See git closes the loop for the normative interpretation.


Step 12 — The Annex IV (Art. 11) is assembled by the cloud

The engine does not emit the Annex IV: it only leaves the signed evidence (.sei/bundle.json and friends). The Annex IV (EU AI Act Art. 11) is assembled and rendered by the cloud (control plane) from that signed bundle.json, with provenance per field, and delivered as a PDF. There is no sei annex-iv subcommand and no annex-iv.json artifact to commit.

When the cloud reads the signed bundle from the repository, it tags the provenance of each Annex IV field:

  • Declared — fields from sei.yaml (name, purpose, system type)
  • Derived from bundle — model digest, control results, dvc.lock
  • Derived from git — commit history, management approval

By completing this tutorial you have run a full RDD cycle:

  1. An AssuranceProgram (Art. 9) with 5 risks, 16 measures (10 ex-ante controls), and a blocking control aligned with prEN 18228 cl. 9.2 and ISO 23894 §6.5
  2. A reproducible DVC pipeline (featurize → evaluate) with selective staleness reflecting typed drift
  3. A verifiable red/green loop: T0 gate RED → treatment V2 (commit) → T1 gate GREEN
  4. A signed evidence bundle (ECDSA-P256+DSSE+in-toto) anchoring code, model, and data
  5. Dual conformance projected: prEN 18228 + ISO 23894 from the same bundle, no re-annotation
  6. A reconstructed ISO 23894 cycle via git replay, deterministic, with management approval
  7. The signed evidence from which the cloud assembles the Annex IV with per-field provenance (still partial: in this scenario §3 and §9 stay PENDING; §7 and §8 are never pending)

| Gap | Why it matters |
| --- | --- |
| risk.opacity without effective treatment | Overall residual exceeds overall_residual_criterion: HIGH (prEN 18228 cl. 10) |
| unfair-credit-exclusion UNDERPOWERED | n=1,000 is insufficient; the CI [0.001, 0.072] crosses the 0.03 threshold (Art. 10(5)) |
| Annex IV §3/§9 PENDING in this scenario | §2, §3, §4, and §9 can be pending when their input is missing. Here the data and controls are present, so only §3 (no Art. 14 means or residuals in the bundle) and §9 (no post-market measures) are pending. §7 and §8 are never pending |
| Keyless identity anchoring (Sigstore) | The ECDSA-P256+DSSE signature is implemented; the external trust root is not |

See State and incompletions for the full gap map and What is Risk-Driven Development for the normative background.