Financial Services
Data solutions for banks, insurers, and asset managers. From Consumer Duty compliance to regulatory reporting automation, powered by AI agents for continuous quality and governance.
Data Solutions for Financial Services
Technical approaches with architecture patterns and code examples.
Consumer Duty Compliance
FCA Consumer Duty & Fair Value Assessment
Automate the identification, monitoring, and reporting of consumer outcomes across products and channels.
The Challenge
The FCA Consumer Duty requires firms to demonstrate they are delivering good outcomes for retail customers. This means aggregating data across products, channels, complaints, and customer journeys to prove fair value and appropriate treatment.
- Fragmented customer interaction data across legacy systems
- No unified view of customer outcomes by product
- Manual evidence gathering for regulatory audits
- Inconsistent data quality across business lines
Agent-Powered Solution
Deploy Claude Agents to continuously profile customer data, identify outcome patterns, and generate compliance evidence.
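As a minimal sketch of what one such profiling pass might look like (the function, sample data, and 95% threshold are illustrative assumptions, not the production agent), a completeness check over a customer extract could flag under-populated fields for review:

```python
COMPLETENESS_THRESHOLD = 0.95  # illustrative; a real agent would tune thresholds per field

def profile_completeness(rows: list[dict]) -> dict:
    """Per-column completeness (share of non-null values) with a review flag."""
    columns = rows[0].keys()
    report = {}
    for col in columns:
        non_null = sum(1 for r in rows if r.get(col) is not None)
        completeness = non_null / len(rows)
        report[col] = {
            "completeness": completeness,
            "flagged": completeness < COMPLETENESS_THRESHOLD,
        }
    return report

# Hypothetical Silver-layer extract: vulnerability_flag is only 50% populated
customers = [
    {"customer_id": 1, "vulnerability_flag": False},
    {"customer_id": 2, "vulnerability_flag": None},
    {"customer_id": 3, "vulnerability_flag": None},
    {"customer_id": 4, "vulnerability_flag": True},
]
report = profile_completeness(customers)
```

A flagged `vulnerability_flag` column would feed straight into the evidence trail, since vulnerable-customer identification is itself a Consumer Duty obligation.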
Architecture
┌─────────────────────────────────────────────────────────────────┐
│ CONSUMER DUTY DATA PLATFORM │
└─────────────────────────────────────────────────────────────────┘
│
┌───────────────────────────┼───────────────────────────┐
▼ ▼ ▼
┌─────────┐ ┌─────────────┐ ┌─────────┐
│ CRM │ │ Product │ │Complaints│
│ Systems │ │ Systems │ │ & NPS │
└────┬────┘ └──────┬──────┘ └────┬────┘
│ │ │
└──────────────────────────┼──────────────────────────┘
▼
┌───────────────────────┐
│ Bronze Layer │
│ (Raw Ingestion) │
└───────────┬───────────┘
│
┌───────────▼───────────┐
│ DQ Profiler Agent │
│ • Profile quality │
│ • Detect anomalies │
└───────────┬───────────┘
│
┌───────────▼───────────┐
│ Silver Layer │
│ (Customer 360) │
└───────────┬───────────┘
│
┌───────────▼───────────┐
│ Governance Agent │
│ • Consumer Duty map │
│ • Evidence tagging │
└───────────┬───────────┘
│
┌───────────▼───────────┐
│ Gold Layer │
│ (Outcome Metrics) │
└───────────────────────┘
Implementation Example
-- Consumer Duty Outcome Metrics (dbt model)
-- models/gold/consumer_duty_outcomes.sql
{{
  config(
    materialized='incremental',
    unique_key='customer_product_key',
    tags=['consumer_duty', 'regulatory']
  )
}}

WITH customer_products AS (
    SELECT
        c.customer_id,
        p.product_id,
        p.product_type,
        c.vulnerability_flag,
        c.channel_preference
    FROM {{ ref('silver_customers') }} c
    JOIN {{ ref('silver_product_holdings') }} p
        ON c.customer_id = p.customer_id
),

outcome_metrics AS (
    SELECT
        cp.customer_id,
        cp.product_id,
        -- Fair Value Metrics
        AVG(t.fees_paid) AS avg_fees,
        SUM(t.value_received) AS total_value,
        -- Suitability Metrics
        COUNT(CASE WHEN comp.outcome = 'upheld' THEN 1 END) AS upheld_complaints,
        -- Communication Metrics
        AVG(comm.clarity_score) AS avg_clarity_score,
        -- Support Metrics
        AVG(support.resolution_time_hours) AS avg_resolution_time
    FROM customer_products cp
    LEFT JOIN {{ ref('silver_transactions') }} t ON cp.product_id = t.product_id
    LEFT JOIN {{ ref('silver_complaints') }} comp ON cp.customer_id = comp.customer_id
    LEFT JOIN {{ ref('silver_communications') }} comm ON cp.customer_id = comm.customer_id
    LEFT JOIN {{ ref('silver_support_tickets') }} support ON cp.customer_id = support.customer_id
    GROUP BY cp.customer_id, cp.product_id
)

SELECT
    {{ dbt_utils.generate_surrogate_key(['customer_id', 'product_id']) }} AS customer_product_key,
    om.*,
    -- Consumer Duty Outcome Flags
    CASE
        WHEN om.avg_fees > (SELECT percentile_cont(0.95) WITHIN GROUP (ORDER BY avg_fees) FROM outcome_metrics)
            THEN 'REVIEW_REQUIRED'
        ELSE 'ACCEPTABLE'
    END AS fair_value_flag,
    CASE
        WHEN om.upheld_complaints > 2 THEN 'HIGH_RISK'
        WHEN om.upheld_complaints > 0 THEN 'MEDIUM_RISK'
        ELSE 'LOW_RISK'
    END AS suitability_flag,
    CURRENT_TIMESTAMP() AS _loaded_at
FROM outcome_metrics om
Expected Outcomes
Regulatory Reporting Automation
Basel, MiFID II, CCAR & Beyond
Automate data lineage, quality checks, and report generation for complex regulatory submissions.
The Challenge
Financial institutions face an ever-growing burden of regulatory reports, each with strict data quality requirements, complex aggregation rules, and tight deadlines. Manual processes are error-prone and expensive.
- Multiple overlapping regulatory requirements
- Complex data lineage requirements for auditability
- Strict data quality thresholds with penalties
- Tight submission deadlines with no margin for error
Agent-Powered Solution
Implement agent-driven data pipelines that automatically validate, transform, and certify data for regulatory submission.
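One piece of that certification step can be sketched as a severity gate: block a submission if any critical-severity rule has failed. The `RuleResult` shape is an illustrative assumption, not a real API; the rule IDs follow the convention used in the expectation suite elsewhere on this page:

```python
from dataclasses import dataclass

@dataclass
class RuleResult:
    rule_id: str
    severity: str   # "critical" | "high" | "medium"
    passed: bool

def submission_gate(results: list[RuleResult]) -> dict:
    """Certify a report for submission only when no critical rule has failed."""
    critical_failures = [
        r.rule_id for r in results if r.severity == "critical" and not r.passed
    ]
    return {
        "certified": not critical_failures,
        "blocking_rules": critical_failures,
    }

# Hypothetical run: one critical failure blocks; a high-severity failure alone would not
results = [
    RuleResult("BASEL_RWA_001", "critical", True),
    RuleResult("MIFID_LEI_001", "critical", False),
    RuleResult("BASEL_AC_001", "high", False),
]
decision = submission_gate(results)
```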
Architecture
┌─────────────────────────────────────────────────────────────────┐
│ REGULATORY REPORTING PLATFORM │
└─────────────────────────────────────────────────────────────────┘
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Core Banking│ │ Risk Systems│ │ Trading │
│ (Positions) │ │ (Exposures) │ │ (Trades) │
└──────┬──────┘ └──────┬──────┘ └──────┬──────┘
│ │ │
└──────────────────┼──────────────────┘
▼
┌───────────────────────┐
│ Ingestion Layer │
│ • CDC Streaming │
│ • T+1 Batch │
└───────────┬───────────┘
│
┌───────────▼───────────┐
│ DQ Recommender │
│ • Basel rules │
│ • MiFID rules │
│ • CCAR rules │
└───────────┬───────────┘
│
┌───────────▼───────────┐
│ Validation Layer │
│ • Great Expectations │
│ • Threshold checks │
└───────────┬───────────┘
│
┌────────────────┼────────────────┐
▼ ▼ ▼
┌──────────┐ ┌──────────┐ ┌──────────┐
│ Basel III│ │ MiFID II │ │ CCAR │
│ Reports │ │ Reports │ │ Reports │
└──────────┘ └──────────┘ └──────────┘
Implementation Example
# Regulatory DQ Rules - Great Expectations Suite
# Generated by DQ Recommender Agent
expectations:
  # Basel III - Risk Weighted Assets
  - expectation_type: expect_column_values_to_not_be_null
    kwargs:
      column: exposure_at_default
    meta:
      regulation: "Basel III"
      rule_id: "BASEL_RWA_001"
      severity: "critical"
      business_justification: "EAD required for RWA calculation"
  - expectation_type: expect_column_values_to_be_between
    kwargs:
      column: probability_of_default
      min_value: 0.0003
      max_value: 1.0
    meta:
      regulation: "Basel III"
      rule_id: "BASEL_PD_001"
      severity: "critical"
      business_justification: "PD floor per Basel III IRB approach"
  - expectation_type: expect_column_values_to_be_in_set
    kwargs:
      column: asset_class
      value_set: ["CORPORATE", "SOVEREIGN", "BANK", "RETAIL", "EQUITY"]
    meta:
      regulation: "Basel III"
      rule_id: "BASEL_AC_001"
      severity: "high"
  # MiFID II - Transaction Reporting
  - expectation_type: expect_column_values_to_match_regex
    kwargs:
      column: lei_code
      regex: "^[A-Z0-9]{20}$"
    meta:
      regulation: "MiFID II"
      rule_id: "MIFID_LEI_001"
      severity: "critical"
      business_justification: "Valid LEI required for transaction reporting"
  - expectation_type: expect_column_values_to_not_be_null
    kwargs:
      column: execution_timestamp
    meta:
      regulation: "MiFID II"
      rule_id: "MIFID_TS_001"
      severity: "critical"
Expected Outcomes
Single Customer View
Customer 360 & Golden Record
Create a unified, trusted view of customers across all products, channels, and interactions.
The Challenge
Banks and insurers struggle with fragmented customer data across legacy systems, making it impossible to deliver personalized experiences or accurate risk assessments.
- Customer data scattered across 50+ systems
- No consistent customer identifier across products
- Duplicate and conflicting customer records
- Real-time vs batch synchronization challenges
Agent-Powered Solution
Use AI agents to discover, profile, and match customer records across systems, creating a continuously updated golden record.
Architecture
┌─────────────────────────────────────────────────────────────────┐
│ CUSTOMER 360 PLATFORM │
└─────────────────────────────────────────────────────────────────┘
Source Systems:
┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐
│ Core │ │ Cards │ │ Loans │ │ Digital │ │ CRM │
│ Banking │ │ System │ │ System │ │ Banking │ │ System │
└────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘
│ │ │ │ │
└───────────┴───────────┴───────────┴───────────┘
│
┌──────────────▼──────────────┐
│ Discovery Agent │
│ • Schema extraction │
│ • Attribute mapping │
│ • PII classification │
└──────────────┬──────────────┘
│
┌──────────────▼──────────────┐
│ Profiler Agent │
│ • Data quality scores │
│ • Matching candidates │
│ • Survivorship rules │
└──────────────┬──────────────┘
│
┌──────────────▼──────────────┐
│ Master Data Layer │
│ ┌─────────────────────┐ │
│ │ Golden Record │ │
│ │ • customer_id │ │
│ │ • master_name │ │
│ │ • master_address │ │
│ │ • confidence_score│ │
│ └─────────────────────┘ │
└──────────────┬──────────────┘
│
┌───────────────────┼───────────────────┐
▼ ▼ ▼
┌───────────┐ ┌───────────┐ ┌───────────┐
│ Real-time │ │ Analytics │ │ Marketing │
│ Servicing │ │ & BI │ │ Campaigns │
└───────────┘ └───────────┘ └───────────┘
Implementation Example
-- Customer Matching & Golden Record Creation
-- models/gold/customer_golden_record.sql
{{
  config(
    materialized='table',
    unique_key='master_customer_id',
    tags=['customer_360', 'mdm']
  )
}}

WITH source_customers AS (
    -- Aggregate customers from all source systems
    SELECT 'CORE_BANKING' AS source_system, customer_id, name, email, phone, address, updated_at
    FROM {{ ref('bronze_core_banking_customers') }}
    UNION ALL
    SELECT 'CARDS' AS source_system, card_holder_id AS customer_id, name, email, phone, address, updated_at
    FROM {{ ref('bronze_cards_customers') }}
    UNION ALL
    SELECT 'DIGITAL' AS source_system, user_id AS customer_id, name, email, phone, address, updated_at
    FROM {{ ref('bronze_digital_customers') }}
),

-- Apply matching rules (simplified - production would use ML matching)
matched_clusters AS (
    SELECT
        *,
        -- Deterministic matching on email
        FIRST_VALUE(customer_id) OVER (
            PARTITION BY LOWER(TRIM(email))
            ORDER BY updated_at DESC
        ) AS cluster_id
    FROM source_customers
    WHERE email IS NOT NULL
),

-- Apply survivorship rules to create golden record
golden_record AS (
    SELECT
        cluster_id AS master_customer_id,
        -- Name: Most recent from CRM or Core Banking
        FIRST_VALUE(name) OVER (
            PARTITION BY cluster_id
            ORDER BY
                CASE source_system
                    WHEN 'CRM' THEN 1
                    WHEN 'CORE_BANKING' THEN 2
                    ELSE 3
                END,
                updated_at DESC
        ) AS master_name,
        -- Email: Most complete (longest)
        FIRST_VALUE(email) OVER (
            PARTITION BY cluster_id
            ORDER BY LENGTH(email) DESC
        ) AS master_email,
        -- Address: Most recent
        FIRST_VALUE(address) OVER (
            PARTITION BY cluster_id
            ORDER BY updated_at DESC
        ) AS master_address,
        -- Confidence score based on match quality
        COUNT(*) OVER (PARTITION BY cluster_id) AS source_count,
        COUNT(DISTINCT source_system) OVER (PARTITION BY cluster_id) AS system_coverage
    FROM matched_clusters
)

SELECT DISTINCT
    master_customer_id,
    master_name,
    master_email,
    master_address,
    source_count,
    system_coverage,
    CASE
        WHEN system_coverage >= 4 THEN 'HIGH'
        WHEN system_coverage >= 2 THEN 'MEDIUM'
        ELSE 'LOW'
    END AS confidence_level,
    CURRENT_TIMESTAMP() AS _created_at
FROM golden_record
Expected Outcomes
ODS Migration & Modernization
Legacy ODS to Modern Data Platform
Migrate operational data stores to cloud-native platforms while maintaining data quality and lineage.
The Challenge
Legacy ODS systems are expensive to maintain, difficult to scale, and lack the flexibility for modern analytics. Migration risks include data loss, quality degradation, and business disruption.
- Complex dependencies on legacy ODS
- Undocumented transformation logic
- Risk of data loss during migration
- Need to maintain parallel operations
Agent-Powered Solution
Use agents to discover and profile the legacy ODS, auto-generate validation rules for the migration, and reconcile data continuously between the two platforms.
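As an illustration of the continuous-reconciliation idea, one rule type (an aggregate match within a relative tolerance) might look like the following. The function name, values, and tolerance are assumptions for the sketch:

```python
def aggregate_match(legacy_value: float, target_value: float, tolerance: float) -> bool:
    """True when the relative difference between legacy and target aggregates
    is within tolerance. A zero legacy value must be matched exactly."""
    if legacy_value == 0:
        return target_value == 0
    return abs(target_value - legacy_value) / abs(legacy_value) <= tolerance

# Example: a 0.05% drift in a balance sum passes a 0.1% tolerance; 0.2% does not
ok = aggregate_match(1_000_000.00, 1_000_500.00, tolerance=0.001)
bad = aggregate_match(1_000_000.00, 1_002_000.00, tolerance=0.001)
```

Relative (rather than absolute) tolerance keeps the same rule usable across tables whose aggregates differ by orders of magnitude.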
Architecture
┌─────────────────────────────────────────────────────────────────┐
│ ODS MIGRATION FRAMEWORK │
└─────────────────────────────────────────────────────────────────┘
Phase 1: Discovery & Baseline
┌─────────────┐ ┌─────────────────────┐
│ Legacy ODS │────────▶│ Discovery Agent │
│ (Oracle/DB2)│ │ • Object inventory │
│ │ │ • Dependency map │
│ │ │ • Consumer catalog │
└─────────────┘ └──────────┬──────────┘
│
┌──────────▼──────────┐
│ Profiler Agent │
│ • Baseline metrics │
│ • Quality scores │
│ • Pattern detection│
└──────────┬──────────┘
│
Phase 2: Design & Build ▼
┌─────────────────────┐
│ Modeller Agent │
│ • Target schema │
│ • Medallion design │
│ • dbt models │
└──────────┬──────────┘
│
Phase 3: Migration & Validation ▼
┌─────────────┐ ┌─────────────────────┐
│ Legacy ODS │────────▶│ Migration Engine │
└─────────────┘ │ (Fivetran/Airbyte) │
└──────────┬──────────┘
│
┌──────────▼──────────┐ ┌─────────────┐
│ Recommender Agent │────────▶│ Modern DWH │
│ • Reconciliation │ │ (Snowflake) │
│ • Validation rules │ └─────────────┘
└─────────────────────┘
Implementation Example
# ODS Migration Reconciliation Script
# Generated by DQ Recommender Agent
import great_expectations as gx
from typing import Dict, Any
class ODSMigrationValidator:
    """
    Validates data consistency between legacy ODS and modern platform.

    NOTE: _connect, _profile_table, and _get_count are helper methods
    elided from this example.
    """

    def __init__(self, legacy_conn: str, target_conn: str):
        self.legacy = self._connect(legacy_conn)
        self.target = self._connect(target_conn)
        self.context = gx.get_context()

    def generate_reconciliation_rules(self, table_name: str) -> Dict[str, Any]:
        """
        Auto-generate reconciliation rules based on profiling.
        """
        # Profile legacy table
        legacy_profile = self._profile_table(self.legacy, table_name)
        rules = {
            "table": table_name,
            "rules": [
                # Row count match
                {
                    "type": "row_count_match",
                    "tolerance": 0.0,  # Exact match required
                    "legacy_count": legacy_profile["row_count"]
                },
                # Aggregate matches for numeric columns
                *[
                    {
                        "type": "aggregate_match",
                        "column": col["name"],
                        "aggregations": ["sum", "avg", "min", "max"],
                        "tolerance": 0.001  # 0.1% tolerance for floating point
                    }
                    for col in legacy_profile["columns"]
                    if col["data_type"] in ["NUMBER", "DECIMAL", "FLOAT"]
                ],
                # Distinct count matches for key columns
                *[
                    {
                        "type": "distinct_count_match",
                        "column": col["name"],
                        "tolerance": 0.0
                    }
                    for col in legacy_profile["columns"]
                    if col["is_key"]
                ]
            ]
        }
        return rules

    def validate_migration(self, table_name: str) -> Dict[str, Any]:
        """
        Execute reconciliation validation.
        """
        rules = self.generate_reconciliation_rules(table_name)
        results = {"table": table_name, "validations": []}
        for rule in rules["rules"]:
            if rule["type"] == "row_count_match":
                legacy_count = self._get_count(self.legacy, table_name)
                target_count = self._get_count(self.target, table_name)
                passed = abs(legacy_count - target_count) <= rule["tolerance"]
                results["validations"].append({
                    "rule": "row_count_match",
                    "legacy": legacy_count,
                    "target": target_count,
                    "passed": passed
                })
            # ... additional rule implementations
        return results

# Usage
validator = ODSMigrationValidator(
    legacy_conn="oracle://legacy-ods:1521/PROD",
    target_conn="snowflake://account/warehouse/database"
)

# Validate all migrated tables
for table in ["CUSTOMER", "ACCOUNT", "TRANSACTION"]:
    result = validator.validate_migration(table)
    print(f"{table}: {'PASSED' if all(v['passed'] for v in result['validations']) else 'FAILED'}")
Expected Outcomes
Financial Services in Action
Anonymised examples of how we've helped banks and insurers transform their data operations.
Consumer Duty Compliance Platform
Built an automated evidence generation platform for FCA Consumer Duty compliance, integrating customer outcome data from 12 source systems.
Single Customer View Implementation
Created a unified customer view across 8M policyholders, enabling personalised communications and cross-sell optimisation.
Client names withheld by policy. References available under NDA.
Ready to Transform Your Data Operations?
Book a discovery call to discuss your specific regulatory and data challenges.