Version 1.0 · June 2026
DALCO
Data Acquisition, Lifecycle & Consumption Operations
The regulation-agnostic framework that makes AI pipelines trustworthy.
By Patryk Okreglicki · dalco.dev · CC-BY 4.0
The Problem
95% of enterprise AI pilots fail.
The models work. The data governance beneath them does not.
No data provenance
Teams cannot explain what data trained their AI model, or whether that data was permitted for AI use under GDPR or the EU AI Act.
Inconsistent feature definitions
The 'customer churn' feature means something different in every model. Two models that use the same feature name train on different populations. Both are wrong.
Silent model drift
The data environment changes. Model performance degrades. Nobody detects it until a business outcome fails or a regulator asks questions.
Compliance impossible to demonstrate
EU AI Act Art. 10 requires training data documentation. NIST AI RMF requires provenance. Neither the documentation nor the provenance record exists, so the audit becomes a crisis.
Data scientists bottlenecked
60-80% of a data scientist's time is spent finding and cleaning data rather than building models. Governance is a talent problem too.
The Framework
Six DALCO layers.
Each is a technical discipline and a cultural commitment.
Foundation
Layer 01 · Taxonomy & Ontology
Shared language enforced as code. Version-controlled dictionary, schema registry, ontology graph. Defined before any pipeline or model is built.
Foundation
Layer 02 · Data Capture & Provenance
How data is born. Capture standards, provenance metadata, schema versioning. Every record carries who created it, when, and under what version.
Operations
Layer 03 · Data Lake Governance
Bronze - Silver - Gold with named owners and SLAs. Formal promotion criteria. AI training draws only from Gold-tier data.
Compliance
Layer 04 · Security & Compliance
Risk embedded at every layer. Data classification at capture, AI risk tagging, GDPR/CCPA automation, global inference audit trail.
AI-Critical
Layer 05 · AI & Analyst Experience
The trust layer. Feature store, experiment tracking, model registry, human oversight interface. The layer 95% of AI pilots fail without.
AI-Critical
Layer 06 · Observability & Feedback
Data drift detection, model performance monitoring, feedback loop governance. Prevents recursive bias in AI systems.
The Gap
Why existing frameworks are not enough.
| Framework | What it covers | What it misses | AI-ready? |
|---|---|---|---|
| DevOps | Software delivery, CI/CD, infrastructure as code | Data lifecycle entirely out of scope | No |
| DataOps | Pipeline automation, data quality, engineering velocity | Taxonomy, capture standards, scientist experience, AI governance | No |
| DataSecOps | Security embedded in data governance | Never a complete lifecycle framework | No |
| Data Mesh | Domain ownership topology, data as a product | How data is governed within domains - practices, not structure | No |
| DALCO | Complete lifecycle: taxonomy through AI observability | Nothing within the data lifecycle - this is the gap DALCO fills | Yes - Layers 05 & 06 |
Maturity Model
Where are you today? Where do you need to be for AI?
Level 1 · Ad hoc
AI deployment: impossible
No taxonomy. Data swamp. No provenance. Security reactive. 95% of organisations start here.
Level 2 · Repeatable
AI deployment: prototypes only
Basic dictionary (not enforced). Some capture standards. Informal zone ownership. No model observability.
Level 3 · Defined
AI deployment: controlled production
Taxonomy-as-code enforced. Full provenance. Bronze/Silver/Gold SLAs. EU AI Act Art. 10 demonstrably satisfied.
Level 4 · Managed
AI deployment: scaled and measured
Full lineage source to AI training set. Multi-jurisdiction compliance automated. Feature store live. Drift detection running.
Level 5 · Optimising
AI deployment: strategic advantage
Self-healing pipelines. Feedback loop governance. AI decisions fully traceable. Regulatory audits completed in hours.
Regulatory Coverage
One framework. All jurisdictions.
EU AI Act, NIST AI RMF, ISO 42001, and US state laws independently arrived at the same requirements. Different language - same underlying capabilities. DALCO provides those capabilities once.
EU AI Act
2024 · enforcement 2026
NIST AI RMF
US de facto AI standard
ISO/IEC 42001
International AI mgmt standard
US State Laws
Colorado, CA, IL, NY · 2026
Real Examples
What DALCO looks like in practice.
Three industries. Three compliance challenges. One framework.
Healthcare · NHS Trust: AI diagnostics
Challenge
The trust wants to deploy AI that flags high-risk patients. Nobody can confirm whether training data was de-identified consistently, whether it covered all demographic groups, or whether feature definitions match live data.
DALCO solution
Layer 01 standardises the clinical data dictionary across all source systems. Layer 02 stamps provenance at the point of capture. Layer 04 validates de-identification at ingest. Layer 05 tracks the exact training dataset version and demographic composition. Layer 06 monitors for demographic performance drift. MHRA can see the complete audit trail from raw EHR to deployed decision in under 2 hours.
Financial services · UK bank: credit decisioning AI
Challenge
Every rejected applicant has a right to a meaningful explanation under GDPR Art. 22 and FCA expectations. The bank cannot provide one because nobody recorded which data version or model version made the decision 18 months ago.
DALCO solution
Layer 05 links every production credit decision to the exact model version, training dataset snapshot, and feature definitions used at inference time. Layer 04 logs every inference with timestamp, input data hash, model version, confidence score, and decision output. Layer 06 alerts when the live applicant population drifts from the training distribution.
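As an illustration of the Layer 04 inference log described above, here is a minimal sketch of a single audit entry. The five fields come from the text (timestamp, input data hash, model version, confidence score, decision output). The function names, the key names, and the choice of SHA-256 over canonical JSON are assumptions made for this example, not part of the DALCO specification.

```python
import hashlib
import json
from datetime import datetime, timezone


def hash_input(features: dict) -> str:
    """SHA-256 of the input features, serialised with sorted keys so key order does not matter."""
    canonical = json.dumps(features, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def audit_record(features, model_version, confidence, decision, now=None):
    """Build one audit entry for a single inference (hypothetical schema)."""
    timestamp = now if now is not None else datetime.now(timezone.utc)
    return {
        "timestamp": timestamp.isoformat(),
        "input_sha256": hash_input(features),
        "model_version": model_version,
        "confidence": confidence,
        "decision": decision,
    }


# Hypothetical credit application, for illustration only.
example = audit_record(
    features={"income": 42000, "term_months": 36},
    model_version="credit-risk-2.3.1",
    confidence=0.81,
    decision="reject",
)
```

Storing a hash rather than the raw input keeps personal data out of the log while still letting an auditor confirm that a retained input matches the one the model saw. Whether a hash alone is sufficient for a given regulator is a question for the compliance team.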
SaaS · US HR platform: AI CV screening
Challenge
The startup must comply with NYC Local Law 144 (bias audit), Illinois AI Video Interview Act (consent and retention), and Colorado SB 24-205 (high-risk AI documentation) - three different laws, one small engineering team.
DALCO solution
Layer 01 defines canonical screening features with documented definitions preventing proxy discrimination. Layer 04 tags all candidate data with consent basis and state-specific retention rules, automated per jurisdiction. Layer 05 provides the bias audit log required by NYC LL 144 - demographic breakdown of model decisions by race, gender, age. One DALCO implementation satisfies three compliance requirements.
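The demographic breakdown mentioned above can be sketched as a selection rate per group plus an impact ratio (each group's rate divided by the highest group's rate). This is one common way to summarise a bias audit and is an assumption of this example. The exact categories and calculations an audit must report should be taken from the applicable rules, not from this sketch. All names and data below are hypothetical.

```python
from collections import defaultdict


def selection_rates(decisions):
    """decisions: iterable of (group, was_selected) pairs. Returns {group: selected / total}."""
    totals = defaultdict(int)
    selected = defaultdict(int)
    for group, was_selected in decisions:
        totals[group] += 1
        if was_selected:
            selected[group] += 1
    return {group: selected[group] / totals[group] for group in totals}


def impact_ratios(rates):
    """Divide each group's selection rate by the highest selection rate."""
    best = max(rates.values())
    if best == 0:
        # Nobody was selected in any group, so no ratio is meaningful.
        return {group: 0.0 for group in rates}
    return {group: rate / best for group, rate in rates.items()}


# Hypothetical screening outcomes: (group label, advanced to interview?)
decisions = [
    ("A", True), ("A", True), ("A", False), ("A", False),
    ("B", True), ("B", False), ("B", False), ("B", False),
]
rates = selection_rates(decisions)
ratios = impact_ratios(rates)
```

With this sample data, group A is selected half the time and group B a quarter of the time, so B's impact ratio is 0.5 relative to A.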
Framework Relationships
DALCO complements existing frameworks.
It does not replace them.
TOGAF
TOGAF Phase C defines the target data architecture conceptually. DALCO begins where Phase C ends - it is the operational execution layer that turns TOGAF's architectural intent into governed production practice.
DataOps
DALCO extends DataOps upstream (taxonomy and capture standards, which DataOps assumes exist) and downstream (AI experience and observability, which DataOps does not address). DataOps is one component of Layer 03 in DALCO terms.
Data Mesh
Data Mesh defines who owns data domains. DALCO defines how data is governed within those domains. They are complementary: Mesh is the organisational structure, DALCO is the governance practice within each domain.
NIST AI RMF
NIST AI RMF defines what AI risk management should achieve across four functions: Govern, Map, Measure, Manage. DALCO provides the data infrastructure through which those functions are operationalised in practice.
ISO 42001
Organisations implementing DALCO build the foundational controls required for ISO 42001 AI management system certification as a natural by-product of normal data governance operations.
Adoption
How to implement DALCO.
You do not need to implement all six layers at once. Start where the pain is greatest. Each step delivers value independently.
Week 1-2 · 2 days, key stakeholders
Taxonomy sprint
Agree canonical definitions for your 10 most-used data entities. One owner per definition. Put them in a Git repository. This single action resolves the most common cause of AI failure.
dbt docs · DataHub · Confluent Schema Registry · YAML in Git
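To make "taxonomy as code" concrete, the sketch below shows a dictionary kept in a repository alongside a check that could run in CI. It assumes a very small schema (name, definition, one owner, version). The class, the function, and the sample entries are hypothetical and only illustrate the idea of one owned, versioned definition per entity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityDefinition:
    name: str        # canonical entity name, e.g. "customer_churn"
    definition: str  # the agreed business definition
    owner: str       # exactly one named owner
    version: str     # bumped whenever the definition changes


def validate_dictionary(entries):
    """Return a list of problems; an empty list means the dictionary passes."""
    problems = []
    seen = set()
    for entry in entries:
        if entry.name in seen:
            problems.append(f"{entry.name}: defined more than once")
        seen.add(entry.name)
        if not entry.owner.strip():
            problems.append(f"{entry.name}: no owner")
        if not entry.definition.strip():
            problems.append(f"{entry.name}: empty definition")
    return problems


# Hypothetical entries, for illustration only.
DICTIONARY = [
    EntityDefinition("customer", "A party with at least one active contract.", "jane.doe", "1.0.0"),
    EntityDefinition("customer_churn", "A customer with no active contract for 90 consecutive days.", "jane.doe", "1.2.0"),
]
```

Running `validate_dictionary(DICTIONARY)` as a required check on every pull request is one way to turn the dictionary from documentation into an enforced contract.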
Month 1-3 · 4-6 weeks, data engineers
Capture standards
For each critical data source (top 5 by AI usage), document the capture schema, add provenance metadata fields (source_system, captured_at, schema_version, consent_basis), and register the schema.
Apache Avro · Protobuf · Great Expectations · dbt tests
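The four provenance fields named above (source_system, captured_at, schema_version, consent_basis) can be attached at the point of capture roughly as follows. This is a sketch under the assumption that records are plain dictionaries. The function names and sample values are hypothetical, and a real pipeline would more likely enforce the fields through the registered schema.

```python
from datetime import datetime, timezone

# Field names taken from the capture-standards step above.
PROVENANCE_FIELDS = ("source_system", "captured_at", "schema_version", "consent_basis")


def stamp_provenance(record, source_system, schema_version, consent_basis, captured_at=None):
    """Return a copy of the record with the provenance fields added."""
    when = captured_at if captured_at is not None else datetime.now(timezone.utc)
    stamped = dict(record)  # copy, so the caller's record is not mutated
    stamped.update(
        {
            "source_system": source_system,
            "captured_at": when.isoformat(),
            "schema_version": schema_version,
            "consent_basis": consent_basis,
        }
    )
    return stamped


def missing_provenance(record):
    """List the provenance fields that are absent or empty on a record."""
    return [name for name in PROVENANCE_FIELDS if not record.get(name)]


# Hypothetical record, for illustration only.
raw = {"patient_id": "p-001", "heart_rate": 72}
stamped = stamp_provenance(raw, source_system="ehr-main", schema_version="2.1.0", consent_basis="explicit_consent")
```

A check such as `missing_provenance` can then reject any record that reaches ingest without its full provenance stamp.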
Month 2-4 · 4-8 weeks, data engineering
Lake zones
Formally designate Bronze, Silver, and Gold zones. Assign a named owner to each. Write the promotion criteria. Set automated quality checks as the gate between Silver and Gold. AI trains only from Gold.
Delta Lake · Apache Iceberg · S3/ADLS/GCS with access controls
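The automated gate between Silver and Gold could look something like the sketch below. The specific criteria (a minimum row count and a maximum null rate per required column) and their thresholds are assumptions chosen for illustration. The actual promotion criteria are whatever the zone owners write down.

```python
def null_rate(rows, column):
    """Fraction of rows where the column is missing or None. Empty input counts as fully null."""
    if not rows:
        return 1.0
    missing = sum(1 for row in rows if row.get(column) is None)
    return missing / len(rows)


def can_promote_to_gold(rows, required_columns, max_null_rate=0.01, min_rows=1):
    """Return (ok, failures). Thresholds are hypothetical defaults, not DALCO-mandated values."""
    failures = []
    if len(rows) < min_rows:
        failures.append(f"row count {len(rows)} is below the minimum of {min_rows}")
    for column in required_columns:
        rate = null_rate(rows, column)
        if rate > max_null_rate:
            failures.append(f"{column}: null rate {rate:.2%} exceeds {max_null_rate:.2%}")
    return (not failures, failures)


# Hypothetical Silver-zone batches, for illustration only.
clean_batch = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": 5.5}]
dirty_batch = [{"id": 1, "amount": None}, {"id": 2, "amount": 5.5}]
```

Returning the list of failures alongside the verdict gives the zone owner a record of why a batch was held back, which is useful evidence when promotion decisions are audited.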
Month 3-6 · 2-3 engineer-weeks before first AI goes to production
AI experience layer
Deploy experiment tracking and a basic model registry before any AI model goes to production. Without this you cannot satisfy EU AI Act Art. 12 or NIST AI RMF MAP 2.1, and you cannot provide model explanations.
MLflow (open source, free) · Weights & Biases · Feast feature store
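The sketch below shows the minimum a model registry needs to record so that a decision can later be traced to a model version, a training dataset, and feature definitions. It is a stdlib-only stand-in written for this example and is not the API of MLflow, Weights & Biases, or Feast. Every class, function, and field name here is hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass


def dataset_fingerprint(rows):
    """SHA-256 over a canonical JSON dump of the rows. Row order matters; key order does not."""
    canonical = json.dumps(rows, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


@dataclass(frozen=True)
class ModelRecord:
    model_name: str
    model_version: str
    training_data_sha256: str  # ties the model to one exact training snapshot
    feature_versions: dict     # feature name -> taxonomy version used in training
    metrics: dict              # evaluation metrics recorded at training time


class ModelRegistry:
    """In-memory registry; a real deployment would persist these records."""

    def __init__(self):
        self._records = {}

    def register(self, record):
        key = (record.model_name, record.model_version)
        if key in self._records:
            # Versions are immutable: re-registering would silently rewrite history.
            raise ValueError(f"{record.model_name} {record.model_version} is already registered")
        self._records[key] = record

    def get(self, model_name, model_version):
        return self._records[(model_name, model_version)]
```

The design choice worth noting is that a registered version can never be overwritten. That immutability is what makes it possible to answer, months later, which data and definitions produced a given model.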
Month 5+ · ongoing, deploy alongside production AI
Observability
Add data drift detection to your top 3 production AI systems. Alert when input distributions diverge from training baselines by more than 2 standard deviations. Deploy alongside the model, not after the first incident.
Evidently AI (open source) · Monte Carlo · Datafold
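One simple reading of the "2 standard deviations" rule above is to compare the mean of a live feature against the training mean, measured in units of the training standard deviation. The sketch below implements that reading only. It is an assumption of this example, and dedicated drift tools generally compare whole distributions, not just means. The function names and sample values are hypothetical.

```python
from statistics import mean, stdev


def drift_score(baseline, live):
    """Distance between the live mean and the baseline mean, in baseline standard deviations.

    Requires at least two baseline values and at least one live value.
    """
    base_mean = mean(baseline)
    base_std = stdev(baseline)
    live_mean = mean(live)
    if base_std == 0:
        # A constant baseline: any change at all counts as unbounded drift.
        return 0.0 if live_mean == base_mean else float("inf")
    return abs(live_mean - base_mean) / base_std


def has_drifted(baseline, live, threshold=2.0):
    """True when the live mean is more than `threshold` baseline standard deviations away."""
    return drift_score(baseline, live) > threshold


# Hypothetical values of one numeric input feature, for illustration only.
training_values = [10, 12, 11, 13, 9, 10, 12, 11]
stable_window = [11, 12, 10, 11]
shifted_window = [20, 21, 19]
```

Here `has_drifted(training_values, stable_window)` is false and `has_drifted(training_values, shifted_window)` is true. A check of this kind would run per feature on a schedule, with alerts routed to the model owner.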
Common Questions
Questions you will be asked — and the answers.
Is DALCO just DataOps with a new name?
No. DataOps covers pipeline automation and begins after data is captured. DALCO covers the full lifecycle: taxonomy before any pipeline, capture standards, lake governance, AI pipeline governance, and observability. DataOps is one component of Layer 03 in DALCO terms.
How is this different from Data Mesh?
Data Mesh is an organisational topology - who should own data domains. DALCO is a lifecycle governance framework - how data is governed within those domains. They are complementary. Data Mesh without DALCO has the right structure but no governance practices.
Does DALCO require specific tools?
No. DALCO is tool-agnostic. Layer 01 can be implemented with a YAML dictionary in Git or with Collibra. Layer 05 can use MLflow (free, open source) or Weights & Biases. The framework defines what capabilities are required - not which vendor provides them.
How long does it take to implement?
A first taxonomy sprint (Layer 01) takes 2 days. Basic capture standards for 5 sources (Layer 02) take 4-6 weeks. Full DALCO Level 3 maturity typically takes 3-6 months for a focused data team. Each step delivers independent value.
Can a small team of 5-10 implement this?
Yes. DALCO scales to team size. A 5-person team implementing Level 2 needs: one owner per taxonomy entity, schema validation on critical pipelines, Bronze/Silver zones, and basic experiment tracking. Achievable in 6-8 weeks.
What does DALCO not cover?
DALCO does not cover software engineering practice (DevOps), network and infrastructure security (NIST CSF or ISO 27001), organisational design for data ownership (Data Mesh or DAMA), or specific AI model development methodology. It governs data from creation through AI consumption.
Licensing
Open framework. Commercial ecosystem.
DALCO is published under Creative Commons Attribution 4.0 (CC-BY 4.0).
Free to use commercially. Free to modify. Free to distribute. Free to build products on. The only requirement: credit the original — 'DALCO by Patryk Okreglicki, dalco.dev, CC-BY 4.0'
You can — no permission needed
- Use in client consulting engagements
- Teach in training courses
- Build products that implement DALCO
- Publish derivative frameworks
- Include in proprietary methodologies
- Reference in academic research
You must do this
- Credit: 'DALCO by Patryk Okreglicki'
- Include the CC-BY 4.0 licence reference
- Link to dalco.dev where practical
- State significant changes if you adapt it
Separate commercial products
- DALCO certification examinations
- The 'DALCO Certified' credential
- Official DALCO partner programme
- DALCO-compatible tool badges
- These products are NOT covered by CC-BY 4.0