Project Carbonite: Full Submission

This page provides a clean, comprehensive version of the project submission for printing and analysis. All interactive elements have been removed and all content has been expanded.


Executive Summary: A Unified Intelligence Platform for Strategic Sourcing

Lidl & Kaufland Asia, supported by the Singapore Fashion Council (SFC), requires a unified intelligence platform to overcome the critical challenge of fragmented supply chain data. Project Carbonite is that solution. It is a causal intelligence platform designed to consolidate Lidl's internal sourcing records with external market data, providing a comprehensive, real-time understanding of sourcing trends and market movements.

By transforming scattered data points into a clear, actionable dashboard, Project Carbonite empowers Lidl's sourcing team to make more informed, data-driven decisions, directly addressing the core problem statement of the challenge. This will enhance efficiency, mitigate risk, and establish a strategic advantage in the "Fashion and Lifestyle" and "Home Textiles" categories.

What is a 'Unified Intelligence Platform for Strategic Sourcing'?

A Unified Intelligence Platform for Strategic Sourcing is a system that solves the core problem of data fragmentation in global supply chains. For a global enterprise, data is scattered across dozens of systems: internal ERPs, supplier databases, shipping manifests, and external market reports. Project Carbonite is our answer to this challenge. It acts as a non-disruptive Causal Governance Layer that overlays these existing systems. Instead of costly integration projects, it uses a Hypergraph Model to find and connect the 'digital threads' between all data points. This creates a single source of truth, allowing sourcing teams to see the full picture—from the cost of raw materials in one country to the geopolitical risk in another—enabling them to make strategic, proactive decisions instead of reactive, fragmented ones.

Phase 1: The Three-Month Causal Viability Study (The Task)

The S$80,000 budget funds the CLEV Causal Viability Study (CCVS), a three-month validation of our architecture on the Textile/Lifestyle category. We will use pre-trained models for rapid deployment, delivering the first actionable pieces of financial and ESG intelligence, which will be validated against the real-world processes of Lidl’s team.

Three-Month Achievable Milestones

| Timeline | Milestone | Deliverable |
| --- | --- | --- |
| Weeks 1-2 | Structural Integrity | ERP-to-Index Harmonization and Unit Standardization Agent (POC). |
| Weeks 3-5 | Causal Modeling | Hyper Edge Matrix Boundaries defined; Factor Beta Regression engine calibrated. |
| Weeks 6-8 | Risk Quantification | Supply Chain Beta (β_SC) calculation verified. |
| Weeks 9-12 | Asset Quantification | Carbon Credit Beta (β_CC) model calibrated and final report delivered. |

High-Level Budget Allocation (CCVS)

| Category | Allocation | Justification |
| --- | --- | --- |
| Lead AI/ML Engineering (3 Months) | S$45,000 | Dedicated expert time for model calibration, architecture validation, and hypergraph implementation. |
| Cloud Computing & API Costs | S$20,000 | Covers GPU time for model training/fine-tuning (Google Cloud) and GenAI API usage for the agentic workflows. |
| Data Acquisition & Licensing | S$10,000 | Licensing for essential external market data feeds (e.g., commodity prices, trade indices) for the study period. |
| Project Management & Contingency | S$5,000 | Overseeing deliverables and providing a buffer for unforeseen technical challenges. |

Proposed Project Team & Time Commitment (CCVS)

| Role | Source | Estimated Hours (3 Months) |
| --- | --- | --- |
| Lead AI/ML Engineer | CLEVresearch | Full-Time |
| Project Manager / Lidl Liaison | Lidl (e.g., Raj) | 4-6 hours/week |
| Sourcing Data SME | Lidl (e.g., Anika) | 5-8 hours/week |
| Risk Assessment SME | Lidl (e.g., Mei) | 3-5 hours/week |

Solution Architecture: The CLEV Causal Intelligence Layer (The Action)

The Causal Hypergraph (The Foundation)

The system's core architecture uses a Hypergraph Model to map multi-point causality. It organizes all fragmented data—from Lidl's modern ERP and market indices—into Matrix Boundaries linked by Hyper Edges (Foreign Keys). This creates a universal, semantic layer that automatically links data without human intervention.

The Agentic Harmonization Layer (The Unifier)

An Agentic Workflow solves data fragmentation at every level, including legacy data harmonization and logistical unit standardization (e.g., automatically converting kg to lbs).
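A minimal sketch of such a Unit Standardization Agent, assuming kg as the canonical unit; the conversion table, function, and record fields below are illustrative assumptions, not the production implementation:

```python
# Illustrative Unit Standardization Agent: normalizes shipment weights
# to a canonical unit (kg). All names and values are assumptions.

CANONICAL_UNIT = "kg"

# Conversion factors into the canonical unit.
TO_KG = {
    "kg": 1.0,
    "lb": 0.45359237,  # international avoirdupois pound
    "t": 1000.0,       # metric tonne
    "g": 0.001,
}

def standardize(record: dict) -> dict:
    """Rewrite a shipment record so its weight is expressed in kg."""
    unit = record["unit"].lower()
    if unit not in TO_KG:
        raise ValueError(f"Unknown unit: {unit!r}")
    return {
        **record,
        "weight": record["weight"] * TO_KG[unit],
        "unit": CANONICAL_UNIT,
    }

# Example: a legacy record in pounds becomes a canonical kg record.
legacy = {"shipment_id": "S-001", "weight": 2200.0, "unit": "lb"}
print(standardize(legacy))  # weight ≈ 997.9 kg
```

In practice an agent like this would sit in front of ingestion, so downstream models only ever see one unit system.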

The Beta Factor (The Universal Language of Risk)

The system is governed by a universal metric of risk and volatility, the Factor Beta (β), which measures the sensitivity of any entity (e.g., a supplier, a shipping lane) to systemic market changes. This allows us to quantify and compare risk across completely different systems and data types without needing to merge them, creating a single language for decision-making.

Key Benefits: Project Carbonite as the SFC's Mandate Accelerator

| Typical System Limitation | The Causal Governance Layer |
| --- | --- |
| Schema & Data Fragmentation. Non-standardized schemas from multiple sources introduce significant lags and inaccuracies, making truly auditable ML impossible. | Agentic Harmonization & CV Oversight: Our Unit Standardization Agent automates cleansing (e.g., kg/lbs). Our Oversight Factor (Computer Vision) generates new, clean ground-truth data, bypassing schema issues for auditable traceability. |
| "Small File" & "Batch Upload" Time Gaps. Near real-time streams create "small file problems" and "orphan files" that require constant, expensive maintenance (compaction), introducing time gaps that halt a real-time ML system. | Dynamic Tolerance Governance (ReLU Activation): Our Factor Beta (β) framework acts as a ReLU-like threshold to intelligently prioritize data. It governs existing maintenance jobs, telling them what to compact (high-risk) and what to ignore (low-risk), eliminating unnecessary system halts. |
| Process & Processing Redundancy. The conflict between legacy data (Purchase Price) and modern analytics (Retail Price) forces teams into "multiple processing of same data" with no unified language for risk. | The Factor Beta (β) Framework: Our β framework is the "seamless integration." It creates a universal language of risk that translates both legacy and modern data into a single, unified metric, eliminating redundant processing. |
| Weak Governance & Access Control. Standard stacks often have weak governance, providing only role-based access control and lacking the row-level and column-level access control needed for a secure, multi-stakeholder enterprise. | The CLEV Causal Hypergraph: Our Hypergraph is the missing governance layer. It creates an indelible, auditable link ("Hyper Edge") for every action. This structure natively enables the granular, row-level security that standard stacks are missing. |
| Metadata Bloat. Large metadata from millions of files and snapshots becomes slow and expensive to manage. | Hyper Edge Abstraction: Instead of storing massive, redundant metadata, the Causal Hypergraph stores a single, lightweight Hyper Edge that causally links disparate data. This drastically reduces the metadata footprint while preserving relational integrity. |
| High Maintenance Costs. Constant compaction and data cleaning jobs are resource-intensive and expensive. | Dynamic Tolerance Governance: By using the Beta Factor to identify and prioritize only high-risk data for maintenance, we reduce the frequency and scope of these expensive jobs, optimizing cost and compute resources. |
| Reliability. Failed write operations and system halts from maintenance jobs create an unreliable data environment. | Resilient by Design: Our system is non-disruptive. Because the Causal Hypergraph operates as an abstraction layer, it is resilient to underlying data pipeline failures and reduces system downtime by minimizing unnecessary maintenance halts. |
| Self-Service. Data modelers and analysts often wait for data engineering to build complex views, slowing down insights. | Agentic Workflow & Unified Views: The Agentic Harmonization Layer provides clean, standardized data on demand. The Hypergraph allows analysts to instantly query causally-linked data as if it were a single table, enabling true self-service. |
| ACL (Access Control). Providing granular access to specific rows or columns of data is complex and often impossible in fragmented systems. | Native Granular Control: The Hypergraph's structure allows for precise, auditable access control. Permissions (ACLs) can be applied directly to a Hyper Edge, granting a user access to a specific causal link (e.g., one supplier's data) without exposing the entire dataset. |
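The "Dynamic Tolerance Governance (ReLU Activation)" idea described above can be sketched in a few lines: a ReLU-style gate zeroes out low-risk partitions so maintenance jobs only ever see high-β work. The threshold, partition names, and β values below are illustrative assumptions:

```python
# Illustrative Dynamic Tolerance Governance: a ReLU-like gate that
# queues only high-beta (high-risk) partitions for compaction.

BETA_THRESHOLD = 1.0  # partitions at or below market volatility are ignored

def maintenance_priority(beta: float, threshold: float = BETA_THRESHOLD) -> float:
    """ReLU-style gate: zero at or below the tolerance, linear above it."""
    return max(0.0, beta - threshold)

# Hypothetical beta scores per data partition.
partitions = {"supplier_a": 1.8, "supplier_b": 0.6, "lane_sg_hk": 1.2}

# Only partitions with positive priority are queued, highest risk first;
# everything else is left alone, avoiding unnecessary maintenance halts.
queue = sorted(
    ((name, maintenance_priority(b)) for name, b in partitions.items()
     if maintenance_priority(b) > 0),
    key=lambda kv: kv[1],
    reverse=True,
)
print(queue)  # supplier_a first; supplier_b is skipped entirely
```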

Strategic Advantage & Results: The Predictive Rebalancing Framework

The entire system is governed by Factor Beta (β), the universal language of distributed control. This framework allows us to computationally solve the challenge using Inverse Design: we start with the desired outcome (e.g., β_SC ≈ 0) and compute the optimal sourcing structure to achieve it.

The Dual Beta Model (Risk vs. Asset)

| Metric | Description | Example Persona Use Case |
| --- | --- | --- |
| Supply Chain Beta (β_SC) — The Instability Metric | This Factor Beta measures sourcing sensitivity to external factors (commodities, tariffs, FX risk). | Mei (Risk Officer): A high β_SC on a supplier triggers a risk alert. She can then investigate and recommend hedging strategies. |
| Carbon Credit Beta (β_CC) — The Stability Metric | This Factor Beta quantifies a sourcing decision's future asset value through verifiable Carbon Insetting. | Raj (Manager): Uses a high β_CC to justify strategic partnerships with sustainable suppliers, knowing it will yield tangible ESG assets. |
| Combined View | The system uses both Betas to create a complete picture of a supplier's true cost and value. | Anika (Analyst): Sees a supplier with a low (good) β_SC and a high (good) β_CC, identifying them as a highly stable, high-value partner for the preferred program. |

The Causal Viability Framework

Repurposing financial waterfall models to visualize adaptive transformations across related variables, revealing causality in complex systems.

1. Core Variables & Components

The study must transition from simple arithmetic addition/subtraction to Adaptive Transformation ($\hat{T}$).

| Waterfall Component | Causal Study Concept | Description |
| --- | --- | --- |
| Initial Value | Initial Coherence State ($C_0$) | The starting value of the system (e.g., a stock's initial price, the coherence of a qubit, or an investor's portfolio value). This is the initial "bar." |
| Increase Bar | Positive Adaptive Transformation ($\hat{T}_{+}$) | An event/variable that increases the Causal State. The transformation is defined by a non-linear function based on system state and external input. |
| Decrease Bar | Negative Adaptive Transformation ($\hat{T}_{-}$) | An event/variable that decreases the Causal State. Represents decay, friction, or detrimental influence. |
| Final Value | Net Causal State ($C_N$) | The final value after all transformations. This is the new, stabilized value (the final "bar"). |
| Non-Additive Sum | Adaptive Function ($f_A$) over Time | The critical difference: the sum $\sum_{i=1}^{N} \text{Bar}_i$ is replaced by the sequential application of functions: $C_N = \hat{T}_N(\dots \hat{T}_2(\hat{T}_1(C_0)) \dots)$. |

2. Incorporating Probabilistic/Adaptive Variables

The "bars" in the study do not represent fixed values, but rather Adaptive Functions ($\hat{T}$) whose magnitude (the visual height of the bar) is determined by an underlying probability or system-coupling strength.

| Causal Variable | Conceptual Analogy | Mathematical Representation |
| --- | --- | --- |
| Adaptive Weight ($\alpha_i$) | The "magnitude" of the event (the bar's height). | Determined stochastically or via system coupling: $\alpha_i = g(C_{i-1}, \mathbf{X}_i)$ |
| Causality Sequence ($S$) | The order of the transformations matters. | Transformations are non-commutative: $\hat{T}_1(\hat{T}_2(C_0)) \neq \hat{T}_2(\hat{T}_1(C_0))$ |
| Non-Linear Impact | The bar's effect is conditional on the system's state before the event. | $\Delta C_i = \hat{T}_i(C_{i-1}) - C_{i-1}$ |
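The non-commutativity of the causality sequence can be verified numerically with two toy transformations; both functions below are illustrative assumptions, not calibrated models:

```python
# Demonstrates that adaptive transformations are non-commutative:
# a multiplicative shock followed by a capped recovery differs from
# the reverse order. Both functions are illustrative assumptions.

def demand_spike(c: float) -> float:
    """A shock that removes 30% of the remaining coherence."""
    return c * 0.7

def inventory_buffer(c: float) -> float:
    """An additive recovery, capped at the nominal state of 1.0."""
    return min(1.0, c + 0.2)

c0 = 1.0
spike_then_buffer = inventory_buffer(demand_spike(c0))  # 0.7, then +0.2
buffer_then_spike = demand_spike(inventory_buffer(c0))  # capped at 1.0, then *0.7

print(spike_then_buffer, buffer_then_spike)  # ≈ 0.9 vs 0.7: order matters
```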

3. Visualization and Interpretation

The visual chart must convey sequential transformation and the probabilistic nature of the impact.

| Visual Feature | Interpretation of Causality |
| --- | --- |
| Bar Color/Shade | Represents the Confidence Interval (CI) or the stability of the transformation $\hat{T}$. A solid bar is highly deterministic (high CI); a gradient/fuzzier bar is highly probabilistic (low CI). |
| Bar Height/Magnitude | Represents the Expected Value of the change ($\mathbb{E}[\Delta C_i]$). |
| Horizontal Position | The Flow of Time/Causality. Each adaptive transformation is applied sequentially, moving from $C_0$ to $C_N$. |
| Net Result Bar ($C_N$) | The Final Coherence State. Represents the stable state that results from the cumulative cascade of adaptive transformations. |

Application Example (Financial/Systemic):

Suppose $C_0$ is the stability of a supply chain, $\hat{T}_1$ is "Sudden Demand Spike," $\hat{T}_2$ is "Input Cost Increase," and $\hat{T}_3$ is "Adaptive Inventory Response." The chart shows how the expected net stability $C_N$ is achieved, where the magnitude of the "Adaptive Inventory Response" ($\hat{T}_3$) is entirely dependent on the stability reached after the first two transformations, $C_2$.
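A numeric sketch of this scenario, using illustrative (uncalibrated) transformation functions, shows how the magnitude of the adaptive response depends on the intermediate state C2:

```python
# Worked sketch of the application example. The magnitude of the
# "Adaptive Inventory Response" is a function of the state C2 reached
# after the first two shocks. All numbers are illustrative assumptions.

def demand_spike(c):         # T1: sudden demand spike
    return c * 0.85

def input_cost_increase(c):  # T2: input cost increase
    return c * 0.90

def adaptive_inventory_response(c):  # T3: recovery scaled by remaining stability
    alpha = 0.5 * (1.0 - c)          # a weaker system triggers a larger response
    return c + alpha * c

c0 = 1.0
c1 = demand_spike(c0)                 # 0.85
c2 = input_cost_increase(c1)          # 0.765
c3 = adaptive_inventory_response(c2)  # magnitude depends on c2, not on c0
print(round(c2, 4), round(c3, 4))
```

Because T3's weight is computed from C2, changing the severity of either earlier shock changes not just the intermediate bars but the height of the final recovery bar as well.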

The Project Carbonite Advantage

Causal Linking vs. Brute-Force Data Merging


The Problem: High-Maintenance, Brittle Data

Traditional data pipelines force non-standardized schemas from disparate sources into a single table. This process is fragile, requires constant maintenance for data deduplication, and is a major source of the time gaps and failures that halt modern machine learning pipelines.

The Result: Resilient, Linked Intelligence

We use a Causal Hypergraph to form Hyper Edges that semantically link disparate IDs into one queryable entity. Our Factor Beta (β) model doesn't need the raw data to be merged or deduplicated, only causally linked. It operates at a higher level of abstraction, making the entire brittle deduplication layer obsolete.

Key Concepts

Hypergraph Model

A data structure used in advanced mathematics and computer science that goes beyond a simple graph. While a normal graph connects two nodes with an edge, a hypergraph connects multiple nodes with a single "hyperedge." This is essential for modeling complex, multi-party relationships, like those in a supply chain where an event (e.g., a port closure) simultaneously affects multiple suppliers, shippers, and buyers.
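A minimal sketch of the idea, showing how a single hyperedge (e.g., a port closure) touches many nodes at once; the class design, event names, and data are illustrative assumptions, not the production model:

```python
# Minimal hypergraph sketch: one hyperedge links many nodes at once,
# unlike a graph edge, which links exactly two.

from collections import defaultdict

class Hypergraph:
    def __init__(self):
        self.edges = {}                     # edge name -> set of nodes
        self.membership = defaultdict(set)  # node -> set of edge names

    def add_hyperedge(self, name, nodes):
        """Register one event/relationship connecting many entities."""
        self.edges[name] = set(nodes)
        for n in nodes:
            self.membership[n].add(name)

    def affected_by(self, event):
        """All nodes touched by a single event (hyperedge)."""
        return self.edges.get(event, set())

hg = Hypergraph()
# A single port-closure event simultaneously affects multiple parties.
hg.add_hyperedge(
    "port_closure_sg",
    {"supplier_A", "supplier_B", "shipper_X", "buyer_lidl"},
)
print(hg.affected_by("port_closure_sg"))
```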

Factor Beta (β)

Borrowed from financial engineering (Capital Asset Pricing Model), Beta (β) is a measure of volatility or systemic risk. We adapt this concept to supply chains. A Beta of 1 means a supplier's cost moves exactly with the market. A Beta > 1 is more volatile. A Beta < 1 is more stable. Our goal is to use this to quantify and minimize supply chain risk.
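The CAPM-style beta described above is computed as Cov(asset, market) / Var(market); a stdlib-only sketch with illustrative return series (the data is made up for demonstration):

```python
# CAPM-style beta adapted to supply chains: regress a supplier's cost
# movements against a market index. Illustrative data only.

from statistics import mean

def beta(asset_returns, market_returns):
    """beta = Cov(asset, market) / Var(market)."""
    ma, mm = mean(asset_returns), mean(market_returns)
    cov = mean((a - ma) * (m - mm)
               for a, m in zip(asset_returns, market_returns))
    var = mean((m - mm) ** 2 for m in market_returns)
    return cov / var

market = [0.01, -0.02, 0.03, 0.00]
volatile_supplier = [1.5 * m for m in market]  # amplifies every market move

print(beta(volatile_supplier, market))  # ≈ 1.5: more volatile than the market
```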

Inverse Design

A problem-solving approach where you define the desired outcome first and then work backward to find the input parameters that produce it. Instead of asking "What happens if we switch to this supplier?", we ask "What is the optimal supplier mix to achieve a target cost and risk profile (a target Beta)?"
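A toy instance of this backward solve, assuming a two-supplier mix whose portfolio beta is the weighted average of the supplier betas (the betas and target are illustrative assumptions):

```python
# Inverse Design sketch: rather than asking "what beta does this mix
# give?", solve backward for the mix that hits a target beta.
# Assumes portfolio beta = w * beta_a + (1 - w) * beta_b.

def mix_for_target_beta(beta_a, beta_b, target):
    """Weight on supplier A such that the blended beta equals `target`."""
    if beta_a == beta_b:
        raise ValueError("Betas are equal; any mix yields the same beta.")
    w = (target - beta_b) / (beta_a - beta_b)
    if not 0.0 <= w <= 1.0:
        raise ValueError("Target beta is unreachable with these suppliers.")
    return w

# Stable supplier (beta 0.4) vs volatile supplier (beta 1.6), target 0.7.
w = mix_for_target_beta(0.4, 1.6, target=0.7)
print(w)  # ≈ 0.75: source 75% from the stable supplier
```

The real system would solve this over many suppliers and constraints, but the direction of the question is the same: fix the outcome, compute the structure.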

Matrix Boundaries

In our system, this is a conceptual container for a set of related data. Think of it as a clearly defined "box" in the hypergraph that holds all information related to one specific entity, like a single supplier or a single shipment. This allows the system to treat complex groups of data as a single, manageable unit.

Foreign Keys

A standard database term for the "digital threads" that connect different data tables or boundaries. For example, a "Supplier ID" acts as a foreign key that links a specific supplier in one table to all of their corresponding shipments in another table, creating a relational link.
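A toy illustration of following that link, with plain Python dicts standing in for database tables (all names and values are invented for the example):

```python
# Foreign-key sketch: "supplier_id" links each shipment row back to
# exactly one supplier row. Illustrative data only.

suppliers = {
    "SUP-01": {"name": "Acme Textiles", "country": "VN"},
    "SUP-02": {"name": "Delta Weaves", "country": "IN"},
}

shipments = [
    {"shipment_id": "SHP-1", "supplier_id": "SUP-01", "kg": 500},
    {"shipment_id": "SHP-2", "supplier_id": "SUP-01", "kg": 750},
    {"shipment_id": "SHP-3", "supplier_id": "SUP-02", "kg": 300},
]

def shipments_for(supplier_id):
    """Follow the foreign key from a supplier to all of its shipments."""
    return [s for s in shipments if s["supplier_id"] == supplier_id]

print([s["shipment_id"] for s in shipments_for("SUP-01")])  # ['SHP-1', 'SHP-2']
```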

Agentic Workflow

A series of automated, intelligent processes (agents) that perform specific tasks to support human experts. For example:

- For Anika (Analyst): An agent automatically pre-processes and scores new supplier data based on established criteria, presenting her with a prioritized list for review.
- For Raj (Manager): An agent continuously monitors market data and competitor movements, automatically generating a weekly "Strategic Opportunities" brief for his attention.
- For Mei (Risk Officer): An agent monitors geopolitical and climate news, flagging events that could impact high-risk suppliers and automatically triggering a re-calculation of their Beta score.
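A minimal sketch of the first of these agents (supplier scoring and prioritization); the criteria, weights, and records are pure assumptions for illustration:

```python
# Illustrative scoring agent: ranks incoming supplier records against
# fixed criteria so the analyst reviews a prioritized list, not raw data.

CRITERIA = {"on_time_rate": 0.5, "esg_score": 0.3, "cost_index": 0.2}

def score(record):
    """Weighted score; cost_index is inverted so lower cost scores higher."""
    return (
        CRITERIA["on_time_rate"] * record["on_time_rate"]
        + CRITERIA["esg_score"] * record["esg_score"]
        + CRITERIA["cost_index"] * (1.0 - record["cost_index"])
    )

new_suppliers = [
    {"id": "SUP-10", "on_time_rate": 0.95, "esg_score": 0.80, "cost_index": 0.60},
    {"id": "SUP-11", "on_time_rate": 0.70, "esg_score": 0.90, "cost_index": 0.40},
]

# The agent hands the analyst a prioritized list rather than raw records.
prioritized = sorted(new_suppliers, key=score, reverse=True)
print([s["id"] for s in prioritized])  # ['SUP-10', 'SUP-11']
```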