Agentic Data Quality on GCP
Lloyds Banking Agentic Hackathon
A multi-agent system that lets non-technical users detect, treat, and remediate data-quality issues across financial datasets — autonomously and auditably.

3rd place — an autonomous multi-agent data-quality pipeline on GCP, mentored by Google and Lloyds engineers on the Agent Development Kit.
The problem
Banks run on data, and bad data quietly costs them — wrong schemas, anomalies, and silent corruption flowing through legacy systems. The challenge, set with Lloyds Banking Group on Google Cloud, was to let non-technical users validate and fix data-quality issues at scale, with a full audit trail.
What we built
An autonomous multi-agent system built on Google's Agent Development Kit (ADK), coordinating specialized agents over a shared workflow:
- Identifier — automatically detects data-quality issues across datasets.
- Treatment — generates SQL remediation for the issues found.
- Remediator — executes fixes and validates the results (with shadow-table dry-runs before anything touches real data).
- Metrics — quantifies business impact and the cost of inaction.
- Orchestrator — coordinates the end-to-end pipeline, with a Knowledge Bank of best practices and mock JIRA ticketing.
It ran on GCP and BigQuery with a Streamlit front end, and it auto-discovered the environment (project, buckets, datasets, schemas) so that setup needed almost no manual configuration. The root orchestrator used Gemini.
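The division of labour between the agents can be sketched in plain Python. This is an illustration under stated assumptions, not the hackathon code: ordinary functions stand in for ADK agents, a list of dicts stands in for a BigQuery table, the only issue type is a missing value, and the Metrics step is reduced to a count of rows fixed. Every name here (`Issue`, `RunReport`, `identify`, `propose_sql`, `remediate`, `run_pipeline`) is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Issue:
    column: str
    kind: str
    row_count: int


@dataclass
class RunReport:
    issues: list = field(default_factory=list)
    sql: list = field(default_factory=list)
    rows_fixed: int = 0  # stand-in for the Metrics agent's output


def identify(rows):
    """Identifier: count missing (None) values per column."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            if value is None:
                counts[column] = counts.get(column, 0) + 1
    return [Issue(c, "missing_value", n) for c, n in sorted(counts.items())]


def propose_sql(table, issue, default):
    """Treatment: render a remediation statement for one issue.

    repr() quoting is for illustration only; it is not safe SQL escaping.
    """
    return (
        f"UPDATE {table} SET {issue.column} = {default!r} "
        f"WHERE {issue.column} IS NULL"
    )


def remediate(rows, issue, default):
    """Remediator: apply the fix to a copy, leaving the input untouched."""
    fixed = [dict(row) for row in rows]
    changed = 0
    for row in fixed:
        if issue.column in row and row[issue.column] is None:
            row[issue.column] = default
            changed += 1
    return fixed, changed


def run_pipeline(table, rows, defaults):
    """Orchestrator: run the stages in order and collect a report."""
    report = RunReport(issues=identify(rows))
    current = rows
    for issue in report.issues:
        default = defaults[issue.column]
        report.sql.append(propose_sql(table, issue, default))
        current, changed = remediate(current, issue, default)
        report.rows_fixed += changed
    return current, report


rows = [
    {"account_id": 1, "currency": "GBP", "balance": 100.0},
    {"account_id": 2, "currency": None, "balance": 250.0},
    {"account_id": 3, "currency": None, "balance": None},
]
cleaned, report = run_pipeline(
    "accounts", rows, {"currency": "GBP", "balance": 0.0}
)
for statement in report.sql:
    print(statement)
print("rows fixed:", report.rows_fixed)
```

Keeping each stage a separate function with a plain-data hand-off mirrors the point of the multi-agent split: every intermediate result (issues found, SQL proposed, rows changed) is available for the audit trail.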
Result
3rd place, with mentorship from Google and Lloyds engineers on ADK and cloud architecture — invaluable for learning how agentic systems are actually built and deployed in a regulated setting.
What I learned
- Agentic systems live or die on safety rails. Shadow-table validation and auditable decisions were what made "let an agent fix the data" credible to a bank.
- Make it usable by the non-technical user. The orchestration only matters if someone without SQL can drive it.
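The shadow-table safety rail mentioned above can be illustrated with a small sketch. It assumes SQLite (via Python's standard `sqlite3` module) as a local stand-in for BigQuery, and the function name `dry_run_on_shadow`, the `__shadow` suffix, and the shape of the audit record are all hypothetical. The idea is only this: run the proposed fix against a copy, check the copy, and touch the real table only if the check passes.

```python
import sqlite3


def dry_run_on_shadow(conn, table, fix_sql, check_sql):
    """Try a fix on a shadow copy; apply it to the real table only if valid.

    fix_sql and check_sql are templates containing a {table} placeholder.
    check_sql must return a single count of rows that are still bad.
    Returns an audit record describing the decision.
    """
    shadow = f"{table}__shadow"
    conn.execute(f"DROP TABLE IF EXISTS {shadow}")
    conn.execute(f"CREATE TABLE {shadow} AS SELECT * FROM {table}")

    # Run the candidate fix against the copy, never the original.
    conn.execute(fix_sql.format(table=shadow))
    remaining = conn.execute(check_sql.format(table=shadow)).fetchone()[0]

    audit = {
        "table": table,
        "fix": fix_sql,
        "remaining_bad_rows": remaining,
        "applied": False,
    }
    if remaining == 0:
        # Validation passed on the shadow, so apply to the real table.
        conn.execute(fix_sql.format(table=table))
        audit["applied"] = True

    conn.execute(f"DROP TABLE {shadow}")
    conn.commit()
    return audit


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (account_id INTEGER, currency TEXT)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)",
    [(1, "GBP"), (2, None), (3, "gbp")],
)
check = "SELECT COUNT(*) FROM {table} WHERE currency IS NULL"

# A fix that does not resolve the issue: UPPER(NULL) is still NULL.
rejected = dry_run_on_shadow(
    conn,
    "accounts",
    "UPDATE {table} SET currency = UPPER(currency) WHERE currency IS NULL",
    check,
)

# A fix that does resolve it.
accepted = dry_run_on_shadow(
    conn,
    "accounts",
    "UPDATE {table} SET currency = 'GBP' WHERE currency IS NULL",
    check,
)

print(rejected["applied"], accepted["applied"])
```

The returned audit record is the part a reviewer would care about: it states which statement was proposed, what validation found, and whether the real table was changed.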