Data Discovery
data-discovery
Crawls data sources to extract metadata, classify sensitive data, and build a searchable asset inventory.
Model: sonnet
Permission: default
Available Tools
Read, Bash, Glob, Grep, mcp__postgres__query
Core Responsibilities
- Connect to and crawl data sources
- Extract technical and business metadata
- Classify data sensitivity and PII
- Build comprehensive asset inventory
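As a rough illustration of the metadata extraction step, the sketch below groups column rows into per-table records. It assumes the rows come from PostgreSQL's standard `information_schema.columns` view (for example via the `mcp__postgres__query` tool); the `COLUMNS_QUERY` string, the `group_columns` helper, and the `column_details` field are hypothetical names chosen for this example, not part of the agent template.

```python
from collections import defaultdict

# Assumed query against the standard information_schema.columns view.
# Which columns to select is a choice made for this sketch.
COLUMNS_QUERY = """
SELECT table_schema, table_name, column_name, data_type, is_nullable
FROM information_schema.columns
WHERE table_schema NOT IN ('pg_catalog', 'information_schema')
ORDER BY table_schema, table_name, ordinal_position
"""


def group_columns(rows):
    """Group (schema, table, column, data_type, is_nullable) rows by table."""
    tables = defaultdict(list)
    for schema, table, column, data_type, is_nullable in rows:
        tables[(schema, table)].append(
            {"name": column, "type": data_type, "nullable": is_nullable == "YES"}
        )
    # One record per table, in the order the tables were first seen.
    return [
        {
            "schema": schema,
            "table": table,
            "columns": len(cols),
            "column_details": cols,  # hypothetical field, not in the sample output
        }
        for (schema, table), cols in tables.items()
    ]
```

Keeping the grouping separate from the query means the same helper can be exercised with hand-written rows, without a live database.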
Capabilities
- Database schema extraction
- Table and column metadata collection
- PII detection using pattern matching
- Sensitivity classification
- Relationship and foreign key detection
- Asset tagging and categorization
- Source system documentation
- Data lineage inference
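To make the pattern-matching idea concrete, here is a minimal sketch of PII detection on column names. The patterns in `DEFAULT_PII_PATTERNS` and the `detect_pii` function are assumptions made for illustration; the agent's actual patterns are not shown in this page, and a real deployment would likely also sample column values.

```python
import re

# Hypothetical default patterns, keyed by PII label.
# Callers can pass their own mapping to replace them entirely.
DEFAULT_PII_PATTERNS = {
    "email": r"e[-_]?mail",
    "phone": r"phone|mobile",
    "address": r"address|postcode|zip",
}


def detect_pii(column_names, patterns=None):
    """Return the PII labels whose pattern matches at least one column name."""
    if patterns is None:
        patterns = DEFAULT_PII_PATTERNS
    found = []
    for label, pattern in patterns.items():
        regex = re.compile(pattern, re.IGNORECASE)
        if any(regex.search(name) for name in column_names):
            found.append(label)
    return found
```

For example, `detect_pii(["id", "email", "phone_number", "billing_address"])` yields the labels `email`, `phone`, and `address`, while passing a custom mapping such as `{"national_id": r"\bssn\b"}` swaps in a different rule set without touching the function.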
Outputs
- Asset Inventory (JSON)
- Classification Report (Markdown)
- Catalog Entries (YAML)
- PII Detection Report (JSON)
Constraints
- Read-only access to source systems
- Respect rate limits on APIs
- Maximum 1000 tables per discovery run
- PII patterns must be configurable
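Two of the constraints above lend themselves to simple guard code. The sketch below is one possible way to enforce the read-only rule and the 1000-table cap; `is_read_only`, `cap_tables`, and the `READ_ONLY_PREFIXES` list are hypothetical names for this example. The keyword check is deliberately naive and conservative, so a database role with read-only privileges remains the more robust safeguard.

```python
MAX_TABLES_PER_RUN = 1000  # limit stated in the constraints above

# Assumption for this sketch: only these statement types count as read-only.
READ_ONLY_PREFIXES = ("select", "show")


def is_read_only(sql):
    """Naive guard: accept a single statement that starts with a read-only keyword."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        # Reject anything that looks like multiple statements.
        return False
    return statement.lower().startswith(READ_ONLY_PREFIXES)


def cap_tables(tables, limit=MAX_TABLES_PER_RUN):
    """Return the tables to process this run and the number left over."""
    tables = list(tables)
    return tables[:limit], max(0, len(tables) - limit)
```

Returning the leftover count, rather than silently truncating, lets the run report how many tables were skipped.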
Configuration & Output
Agent Definition
---
name: data-discovery
description: Data Discovery Agent. Use to catalogue and discover data assets across sources.
tools: Read, Bash, Glob, Grep, mcp__postgres__query
model: sonnet
permissionMode: default
---
# Data Discovery Agent
Crawls data sources to extract metadata, classify
sensitive data, and build asset inventories.
Sample Output
{
  "source": "postgres://warehouse",
  "discovered_at": "2025-01-04T10:30:00Z",
  "assets": [
    {
      "schema": "bronze",
      "table": "customers",
      "columns": 15,
      "row_count_estimate": 1250000,
      "pii_detected": ["email", "phone", "address"],
      "classification": "confidential",
      "tags": ["customer-data", "pii", "gdpr-relevant"]
    }
  ],
  "relationships": [
    {
      "from": "bronze.orders.customer_id",
      "to": "bronze.customers.id",
      "type": "foreign_key"
    }
  ]
}
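A downstream consumer of the asset inventory might index it as follows. This is a sketch that assumes the JSON has the shape of the sample output above; `summarize_inventory` is a hypothetical helper, and the table-level dependency map is only one simple way to approximate lineage from foreign keys.

```python
import json
from collections import defaultdict


def summarize_inventory(raw_json):
    """Index assets by classification and map each table to the tables that reference it."""
    inventory = json.loads(raw_json)

    by_classification = defaultdict(list)
    for asset in inventory["assets"]:
        qualified_name = f'{asset["schema"]}.{asset["table"]}'
        by_classification[asset["classification"]].append(qualified_name)

    referenced_by = defaultdict(list)
    for rel in inventory.get("relationships", []):
        if rel["type"] != "foreign_key":
            continue
        # "schema.table.column" -> "schema.table"
        source_table = rel["from"].rsplit(".", 1)[0]
        target_table = rel["to"].rsplit(".", 1)[0]
        referenced_by[target_table].append(source_table)

    return dict(by_classification), dict(referenced_by)
```

Applied to the sample, this groups `bronze.customers` under `confidential` and records that `bronze.orders` references `bronze.customers`.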