Every vendor is now an "AI company." Every product has a copilot. And every conference is saturated with GenAI announcements.
Beneath the hype, though, GenAI is delivering genuine value in enterprise data management. We've been separating signal from noise, implementing what works and learning from what doesn't.
Here's our honest assessment of where GenAI actually helps — and where it falls short.
Where GenAI Delivers Value Today
1. Natural Language to SQL (Text-to-SQL)
The promise: Business users query data in plain English.
The reality: It works — with caveats.
Modern LLMs are remarkably good at generating SQL from natural language. We've seen:
- 70-80% of simple queries generated correctly first time
- 50-60% of moderately complex queries needing only minor edits
- Complex multi-join queries still requiring data team involvement
Key Success Factors
Text-to-SQL works best when:
- The schema is well-documented with clear naming
- A semantic layer provides business definitions
- Users can see and verify the generated SQL
- The system learns from corrections
Where it struggles:
- Ambiguous business terms ("revenue" means different things to different teams)
- Complex aggregations with multiple conditions
- Understanding temporal nuances ("last quarter" vs "previous quarter")
Our recommendation: Implement for self-service analytics with data-literate users who can verify outputs. Don't deploy to general business users without guardrails.
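The guardrails above can be as simple as a validation layer between the LLM and the user: before any generated SQL is shown or run, check that it is read-only and references only known tables. A minimal sketch, using hypothetical table names (`orders`, `customers`) and plain regex rather than a full SQL parser:

```python
import re

# Hypothetical allow-list; in practice this comes from your catalogue/semantic layer
ALLOWED_TABLES = {"orders", "customers"}
FORBIDDEN = re.compile(r"\b(INSERT|UPDATE|DELETE|DROP|ALTER|TRUNCATE)\b", re.I)

def guardrail_check(sql: str) -> list[str]:
    """Return a list of violations; an empty list means the query may be shown to the user."""
    issues = []
    if FORBIDDEN.search(sql):
        issues.append("query is not read-only")
    # Collect table names appearing after FROM/JOIN and compare to the allow-list
    referenced = set(re.findall(r"\b(?:FROM|JOIN)\s+(\w+)", sql, re.I))
    unknown = referenced - ALLOWED_TABLES
    if unknown:
        issues.append(f"unknown tables: {sorted(unknown)}")
    return issues
```

A real implementation would use a proper SQL parser and enforce row-level security, but even this level of checking catches the most damaging failure modes before a business user ever sees the query.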
2. Automated Documentation
The promise: AI writes and maintains documentation automatically.
The reality: A genuine time-saver.
This is perhaps the most mature GenAI use case for data:
- Generate column descriptions from sample data
- Infer relationships between tables
- Create lineage documentation from SQL
- Keep docs in sync as schemas evolve
We've seen teams reduce documentation effort by 60-70% whilst improving completeness.
Where it struggles:
- Business context that isn't in the data
- Complex transformation logic
- Nuances that require domain expertise
Our recommendation: Use AI to generate first drafts, then have data owners review and enhance. Don't expect fully autonomous documentation.
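The "first draft, then review" workflow can be bootstrapped without an LLM at all for the mechanical part: infer a column's type and sample values directly from the data, and leave an explicit marker where the data owner must add business context. A sketch, with the review marker and output format as our own convention:

```python
def draft_column_docs(rows: list[dict]) -> dict[str, str]:
    """Draft one-line descriptions per column from sample rows (first draft only)."""
    docs = {}
    for col in rows[0]:
        values = [r[col] for r in rows if r[col] is not None]
        kind = type(values[0]).__name__ if values else "unknown"
        # Show up to three distinct sample values so reviewers see real data
        distinct = sorted({str(v) for v in values})[:3]
        docs[col] = (f"{kind} column; sample values: {', '.join(distinct)} "
                     f"(REVIEW: add business context)")
    return docs
```

Feeding these drafts (plus sample data) to an LLM produces more fluent descriptions, but the `REVIEW` marker is the important part: it makes the human step unskippable.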
3. Data Quality Rule Generation
The promise: AI learns what "normal" looks like and generates quality rules.
The reality: Highly effective for pattern detection.
LLMs excel at:
- Identifying implicit constraints (e.g., "country is always two letters")
- Detecting outlier patterns in historical data
- Suggesting referential integrity rules
- Flagging suspicious combinations
We've helped clients generate, in hours, 200+ quality rules that would have taken weeks to write manually.
Where it struggles:
- Business rules that aren't reflected in historical patterns
- Edge cases that are valid but rare
- Context about why certain patterns exist
Our recommendation: Use AI to accelerate rule discovery, but have data stewards review before activation. Never auto-deploy generated rules.
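Rule discovery of the kind described above (implicit constraints like "country is always two letters") can be illustrated with a simple profiler that emits candidate rules from historical values. A sketch, with the SQL-style rule strings as our own convention, and every output treated as a suggestion for steward review:

```python
def suggest_rules(column: str, values: list) -> list[str]:
    """Suggest candidate data-quality rules from historical values (review before activating)."""
    rules = []
    non_null = [v for v in values if v is not None]
    if len(non_null) == len(values):
        rules.append(f"{column} IS NOT NULL")
    # A single observed length suggests a fixed-width field
    lengths = {len(str(v)) for v in non_null}
    if len(lengths) == 1:
        rules.append(f"LENGTH({column}) = {lengths.pop()}")
    # A small distinct set suggests an enumeration
    distinct = sorted({str(v) for v in non_null})
    if len(distinct) <= 10:
        rules.append(f"{column} IN ({', '.join(distinct)})")
    return rules
```

Note how this embodies the failure mode in the list above: a valid-but-rare value absent from history would be flagged by the generated `IN` rule, which is exactly why a steward must review before activation.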
4. Anomaly Explanation
The promise: When something breaks, AI explains why.
The reality: Transformative for incident response.
This is where GenAI shines. Given an anomaly, an LLM can:
- Compare current patterns to historical norms
- Identify potential root causes
- Search documentation for relevant context
- Draft stakeholder communications
We've seen mean time to resolution drop by 60%+ when teams have AI-assisted root cause analysis.
Where it struggles:
- Novel issues with no historical precedent
- Cross-system problems requiring infrastructure knowledge
- Situations where the anomaly is actually correct
Our recommendation: Deploy for Tier 1 incident response. Escalate to humans when confidence is low.
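The first step in that list — comparing current patterns to historical norms — is often just descriptive statistics packaged as context for the LLM. A minimal sketch (metric name and threshold are illustrative; real pipelines would add seasonality handling):

```python
import statistics

def describe_anomaly(metric: str, history: list[float], current: float) -> str:
    """Summarise how far today's value sits from historical norms, as LLM prompt context."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    z = (current - mean) / stdev if stdev else float("inf")
    return (f"{metric}: current={current}, historical mean={mean:.1f}, "
            f"stdev={stdev:.1f}, z-score={z:.1f}")
```

The resulting summary, combined with retrieved documentation and recent change logs, is what the LLM reasons over — the model explains; the statistics detect.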
5. Semantic Search Over Data
The promise: Find relevant data by describing what you need.
The reality: A step change from keyword search.
Traditional data discovery relies on knowing the right terms. GenAI enables:
- "Find data about customer churn in EMEA"
- "Which tables have information about product returns?"
- "Where is revenue by channel calculated?"
Combined with a good catalogue, this dramatically improves data findability.
Where it struggles:
- Poorly documented data
- Ambiguous or overloaded terms
- Data that exists but isn't catalogued
Our recommendation: Implement as a layer on top of your data catalogue. Investment in metadata quality pays dividends.
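Production systems use embedding models for this, but the core idea — rank catalogue entries by similarity to a free-text query rather than exact keyword match — can be sketched with a bag-of-words cosine similarity. Catalogue entries here are hypothetical:

```python
import math
from collections import Counter

CATALOGUE = {  # hypothetical table descriptions from a data catalogue
    "fct_churn": "monthly customer churn events by region",
    "dim_product": "product master data including returns policy",
}

def _vec(text: str) -> Counter:
    return Counter(text.lower().split())

def search(query: str) -> str:
    """Return the catalogue entry whose description is most similar to the query."""
    q = _vec(query)
    def cosine(d: Counter) -> float:
        dot = sum(q[t] * d[t] for t in q)
        norm = (math.sqrt(sum(v * v for v in q.values()))
                * math.sqrt(sum(v * v for v in d.values())))
        return dot / norm if norm else 0.0
    return max(CATALOGUE, key=lambda k: cosine(_vec(CATALOGUE[k])))
```

Swapping the word-count vectors for embeddings gives you genuine semantic matching ("attrition" finds "churn") — but notice that either way, the quality of the descriptions in `CATALOGUE` is what determines the results. That is the metadata investment paying dividends.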
Where GenAI Falls Short
1. Fully Autonomous Data Engineering
The hype: AI writes and maintains your data pipelines.
The reality: Far from production-ready.
LLMs can generate dbt models or pipeline code, but:
- Error rates are too high for autonomous deployment
- Testing and validation still require humans
- Debugging AI-generated code is often harder than writing it yourself
What works instead: AI-assisted development — copilots that suggest code for humans to review and refine.
2. Data Strategy and Architecture
The hype: AI designs your data architecture.
The reality: LLMs lack the context to make strategic decisions.
Architecture decisions require understanding:
- Organisational dynamics and constraints
- Business strategy and priorities
- Technical debt and migration realities
- Team capabilities and culture
These are beyond what an LLM can assess.
What works instead: AI for research and option analysis, human judgement for decisions.
3. Complex Data Transformations
The hype: Describe what you want, AI transforms the data.
The reality: Works for simple cases, fails for complex ones.
AI can handle:
- Basic cleaning (nulls, duplicates, type conversions)
- Simple derived fields
- Standard aggregations
AI struggles with:
- Complex business logic with many conditions
- Multi-step transformations with dependencies
- Edge cases and exception handling
What works instead: AI generates scaffolding, humans refine the logic.
Implementation Principles
Based on our experience, here's how to approach GenAI in data management:
1. Start With Human-in-the-Loop
Every GenAI implementation should start with human review. Over time, as you build confidence, you can automate more. But never start fully autonomous.
2. Measure Accuracy Before Scaling
Before rolling out widely:
- Sample outputs for accuracy
- Track false positive and negative rates
- Understand failure modes
If accuracy is below 80%, you're not ready for broad deployment.
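Measuring those rates requires a manually reviewed sample: for each AI output, record what the system predicted and what a human judged correct. A sketch of the arithmetic (the 80% threshold is the one suggested above):

```python
def evaluate(labels: list[tuple[bool, bool]]) -> dict:
    """Compute accuracy and error rates from (predicted, actual) pairs in a reviewed sample."""
    tp = sum(1 for p, a in labels if p and a)
    tn = sum(1 for p, a in labels if not p and not a)
    fp = sum(1 for p, a in labels if p and not a)
    fn = sum(1 for p, a in labels if not p and a)
    return {
        "accuracy": (tp + tn) / len(labels),
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
        "false_negative_rate": fn / (fn + tp) if fn + tp else 0.0,
    }
```

Tracking these per failure mode (ambiguous terms, temporal phrases, multi-join queries) tells you not just whether the system is accurate, but where it breaks.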
3. Invest in Context
GenAI is only as good as the context it receives. Invest in:
- Clear, consistent naming conventions
- Comprehensive metadata and documentation
- Semantic layers with business definitions
- Quality historical data for learning
4. Plan for Failure
GenAI will make mistakes. Plan for:
- How users identify incorrect outputs
- Feedback mechanisms to improve the system
- Fallback processes when AI fails
- Audit trails for compliance
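The last two items — fallbacks aside — reduce to logging every AI interaction with its outcome. A minimal audit-record sketch (field names are our own convention; the `accepted` flag doubles as the feedback signal for improving the system):

```python
import json
import datetime

def audit_record(user: str, prompt: str, output: str, accepted: bool) -> str:
    """One JSON line per AI interaction: who asked what, what came back, and the verdict."""
    return json.dumps({
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "user": user,
        "prompt": prompt,
        "output": output,
        "accepted": accepted,  # human verdict; feeds both compliance and model improvement
    })
```

Append-only JSON lines are deliberately boring: easy to ship to existing log infrastructure, easy to query when an auditor asks what the AI said and who approved it.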
5. Be Sceptical of Vendor Claims
Most vendors are overstating GenAI capabilities. Ask:
- What's the accuracy rate in production?
- What are the failure modes?
- How much setup is required?
- Can we see reference customers?
If a vendor claims their AI "just works" without mentioning limitations, they're selling hype. Every AI system has boundaries — honest vendors tell you what they are.
The Near-Term Roadmap
Here's what we expect to become production-ready over the next 12-18 months:
| Capability | Timeline | Confidence |
|------------|----------|------------|
| Text-to-SQL with guardrails | Now | High |
| Automated documentation | Now | High |
| DQ rule generation (human review) | Now | High |
| Anomaly explanation | Now | High |
| Semantic data discovery | Now | Medium |
| Autonomous simple pipelines | 12-18 months | Medium |
| Complex transformation generation | 18-24 months | Low |
Conclusion
GenAI is real, but it's not magic. The use cases delivering value today are about augmentation — making humans more effective — not replacement.
The organisations getting value are:
- Starting with specific, bounded use cases
- Keeping humans in the loop
- Investing in the foundational data quality and metadata that makes AI effective
- Being honest about what works and what doesn't
The hype will fade. The practical applications will endure.