Priority Implementation Roadmap

Based on analysis of todo.md and current project state (post v0.12.0), here’s a prioritized list of the best next items to implement.

🚀 Immediate Next Steps (v0.13.0)

1. Schema Inspection & Reflection (High Impact, Medium Effort) ✅ COMPLETED

Priority: CRITICAL

[x] Table reflection (db.reflect_table(name))
[x] Database reflection (db.reflect())
[x] Schema introspection utilities (db.get_table_names(), db.get_view_names(), etc.)
[x] Column metadata introspection (db.get_columns(table_name), etc.)

Why First:

Foundation for other features: Enables better schema management, migrations, and tooling
Developer experience: Critical for working with existing databases
Low risk: SQLAlchemy already provides reflection APIs; the work is mainly a DataFrame API wrapper around them
High value: Users frequently need to inspect existing schemas
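
To make the scope of reflection concrete, the sketch below reads table names and column metadata directly from a database catalog. It uses Python's standard-library sqlite3 module rather than Moltres or SQLAlchemy, so the helpers `get_table_names` and `get_columns` are standalone illustrations that only borrow the method names listed above; their signatures and return shapes are assumptions, not the project's actual API.

```python
import sqlite3


def get_table_names(conn: sqlite3.Connection) -> list[str]:
    """Return user-defined table names from SQLite's catalog."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


def get_columns(conn: sqlite3.Connection, table_name: str) -> list[dict]:
    """Return column metadata for one table (assumed return shape)."""
    # PRAGMA arguments cannot be bound as parameters, so quote the identifier.
    quoted = '"' + table_name.replace('"', '""') + '"'
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    rows = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
    return [
        {"name": r[1], "type": r[2], "nullable": not r[3], "primary_key": bool(r[5])}
        for r in rows
    ]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL, age INTEGER)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER)")

tables = get_table_names(conn)
user_columns = get_columns(conn, "users")
```

A wrapper over SQLAlchemy's inspector would return the same kind of information for any supported dialect instead of relying on SQLite-specific catalog queries.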

Estimated Effort: 2-3 weeks

Status: COMPLETED (v0.13.0)

2. FILTER Clause for Conditional Aggregation (High Impact, Low-Medium Effort) ✅ COMPLETED

Priority: HIGH

[x] FILTER clause support (COUNT(*) FILTER (WHERE condition))

Why Second:

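SQL standard: Defined in SQL:2003 and supported natively by PostgreSQL, SQLite 3.30+, and DuckDB; dialects without it (such as MySQL and SQL Server) need a CASE-expression fallback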
Common use case: Very useful for conditional aggregations without subqueries
Clean API: Fits naturally into existing aggregation methods
Performance: More efficient than subquery alternatives
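
As an illustration of what conditional aggregation buys, the sketch below computes the same result twice: once with a portable CASE expression and once with the FILTER clause. It runs on the standard-library sqlite3 module, not on Moltres; the `orders` table and its columns are made up for the example, and the FILTER query is skipped on SQLite builds older than 3.30, which lack the clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (status, amount) VALUES (?, ?)",
    [("paid", 10.0), ("paid", 25.0), ("refunded", 5.0), ("pending", 40.0)],
)

# Portable form: works on dialects that have no FILTER clause.
case_sql = """
    SELECT COUNT(*) AS total,
           SUM(CASE WHEN status = 'paid' THEN 1 ELSE 0 END) AS paid_count,
           SUM(CASE WHEN status = 'paid' THEN amount END) AS paid_amount
    FROM orders
"""
case_result = conn.execute(case_sql).fetchone()

# FILTER form: the condition is attached to each aggregate directly.
filter_sql = """
    SELECT COUNT(*) AS total,
           COUNT(*) FILTER (WHERE status = 'paid') AS paid_count,
           SUM(amount) FILTER (WHERE status = 'paid') AS paid_amount
    FROM orders
"""
if sqlite3.sqlite_version_info >= (3, 30, 0):
    filter_result = conn.execute(filter_sql).fetchone()
else:
    # Older SQLite: fall back to the CASE form, as a compiler might.
    filter_result = case_result
```

Both queries scan the table once, which is where the advantage over one subquery per condition comes from.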

Estimated Effort: 1-2 weeks

Status: COMPLETED (v0.13.0)

3. Schema Management - Constraints (High Impact, Medium Effort) ✅ COMPLETED

Priority: HIGH

[x] Unique constraints (UNIQUE)
[x] Check constraints (CHECK)
[x] Foreign key constraints (FOREIGN KEY ... REFERENCES)
[x] Indexes (CREATE INDEX, DROP INDEX)

Why Third:

Database fundamentals: Essential for production use
Enables migrations: Foundation for schema evolution
Data integrity: Critical for real-world applications
SQL standard: Well-defined, low ambiguity
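
The four constraint kinds listed above can be seen working together in the following sketch. It issues plain DDL through the standard-library sqlite3 module rather than through Moltres, so it shows the SQL the feature ultimately targets, not the project's API; table, column, and index names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE)")
conn.execute(
    """
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,
        amount REAL NOT NULL CHECK (amount > 0),
        FOREIGN KEY (customer_id) REFERENCES customers (id)
    )
    """
)
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

conn.execute("INSERT INTO customers (id, email) VALUES (1, 'a@example.com')")
conn.execute("INSERT INTO orders (customer_id, amount) VALUES (1, 9.5)")


def is_rejected(sql: str) -> bool:
    """Return True when the database refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


violations = {
    "unique": is_rejected("INSERT INTO customers (email) VALUES ('a@example.com')"),
    "check": is_rejected("INSERT INTO orders (customer_id, amount) VALUES (1, -1)"),
    "foreign_key": is_rejected("INSERT INTO orders (customer_id, amount) VALUES (99, 5)"),
}

# Indexes are managed separately from the table definition.
conn.execute("DROP INDEX idx_orders_customer")
```

Each of the three bad inserts is rejected by the database itself, which is the data-integrity guarantee the section refers to.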

Estimated Effort: 2-3 weeks

Status: COMPLETED (v0.13.0)

📈 Short-term (v0.14.0)

No items currently planned for v0.14.0

🔧 Developer Experience (v0.15.0)

7. Better Type Safety (Medium Impact, High Effort)

Priority: MEDIUM

[ ] Better type inference for schemas
[ ] Generic DataFrame types with schema
[ ] Type-safe column references
[ ] Better mypy coverage (reduce Any types)

Why:

Long-term value: Improves developer experience significantly
Catches errors early: Type safety prevents runtime issues
IDE support: Better autocomplete and error detection
Note: High effort but foundational for long-term maintainability
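
One possible shape for a schema-parameterized DataFrame type is sketched below. `TypedFrame`, `UserSchema`, and the `col` and `select` methods are hypothetical names introduced for illustration; nothing here is an existing Moltres class. The sketch only validates column names at runtime, while the generic parameter is what a static checker such as mypy could later build on.

```python
from typing import Generic, TypedDict, TypeVar, get_type_hints

SchemaT = TypeVar("SchemaT")


class UserSchema(TypedDict):
    # Illustrative schema; the column names are assumptions.
    id: int
    name: str


class TypedFrame(Generic[SchemaT]):
    """Hypothetical schema-aware wrapper around a list of row dicts."""

    def __init__(self, schema: type[SchemaT], rows: list[dict]) -> None:
        # Column names and Python types come from the schema's annotations.
        self.columns = get_type_hints(schema)
        self.rows = rows

    def col(self, name: str) -> str:
        """Return a validated column reference, failing fast on typos."""
        if name not in self.columns:
            raise KeyError(f"unknown column {name!r}; expected one of {sorted(self.columns)}")
        return name

    def select(self, *names: str) -> list[dict]:
        """Project the rows onto the validated columns."""
        cols = [self.col(n) for n in names]
        return [{c: row[c] for c in cols} for row in self.rows]


users: TypedFrame[UserSchema] = TypedFrame(UserSchema, [{"id": 1, "name": "Ada"}])
names_only = users.select("name")

try:
    users.col("nmae")  # misspelled on purpose
    typo_caught = False
except KeyError:
    typo_caught = True
```

Because the schema travels with the frame, this kind of wrapper can be introduced one module at a time, which fits the incremental approach recommended for this item.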

Estimated Effort: 3-4 weeks

8. Test Coverage Improvements (Medium Impact, Medium Effort)

Priority: MEDIUM

[ ] Increase coverage from 75% to 80%+
[ ] Property-based testing with Hypothesis
[ ] Load testing

Why:

Quality assurance: Prevents regressions
Confidence: Enables faster feature development
Current gap: Coverage is already at 75%, so reaching 80%+ is a small push

Estimated Effort: 2-3 weeks

🌐 Ecosystem Expansion (v0.16.0+)

9. DuckDB Support (High Impact, Low Effort) ✅ COMPLETED

Priority: MEDIUM-HIGH

[x] DuckDB dialect support

Why:

Growing popularity: DuckDB is rapidly gaining adoption
Low effort: A SQLAlchemy dialect for DuckDB is already available through the duckdb-engine package
Analytics focus: Aligns with Moltres’s analytics use case
Embedded database: Great for local development and testing

Estimated Effort: 1 week

Status: COMPLETED

10. BigQuery Support (High Impact, Medium Effort)

Priority: MEDIUM-HIGH

[ ] BigQuery dialect support

Why:

Enterprise adoption: BigQuery is widely used
Cloud-native: Important for modern data stacks
Analytics focus: Strong analytics capabilities

Estimated Effort: 2 weeks

🛠️ Infrastructure Improvements

11. Automated Release Process (Low Impact, Low Effort)

Priority: MEDIUM

[ ] Automated release process
[ ] Version bump automation
[ ] Changelog generation

Why:

Time savings: Reduces manual work for releases
Consistency: Ensures release process is repeatable
Low effort: Can leverage existing tools (semantic-release, etc.)
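
The version-bump step is small enough to sketch by hand. The function below is a minimal illustration that assumes plain MAJOR.MINOR.PATCH version strings; `bump_version` is a hypothetical helper, and a tool such as semantic-release would normally derive the bump type from commit messages instead of taking it as an argument.

```python
import re


def bump_version(version: str, part: str) -> str:
    """Return the next version after bumping 'major', 'minor', or 'patch'."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {version!r}")
    major, minor, patch = (int(group) for group in match.groups())

    # Bumping a component resets every component to its right.
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown version part: {part!r}")


next_release = bump_version("0.12.0", "minor")
```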

Estimated Effort: 1 week

12. Enhanced Documentation (Medium Impact, Medium Effort)

Priority: MEDIUM

[ ] Enhanced docs/index.md with better organization
[ ] Migration guides (Pandas, SQLAlchemy, Spark)
[ ] Performance benchmarks documentation

Why:

Adoption: Good documentation drives adoption
User onboarding: Reduces friction for new users
Marketing: Helps communicate value proposition

Estimated Effort: 2-3 weeks

📊 Advanced Features (Future)

13. UNNEST / Table-Valued Functions (High Impact, High Effort)

Priority: MEDIUM (after foundation)

[ ] UNNEST() support
[ ] Table-valued functions in FROM clause

Why:

Completes explode(): The API exists but still needs SQL compilation
Complex feature: Requires significant compiler work
Dialect-specific: Different implementations per database
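
The dialect differences are easiest to see side by side. The sketch below is a deliberately simplified string builder, not the Moltres compiler: `compile_explode` and its parameters are hypothetical, identifiers are interpolated without quoting, and the SQLite branch assumes the array is stored as JSON text.

```python
def compile_explode(dialect: str, table: str, id_col: str, array_col: str,
                    alias: str = "item") -> str:
    """Build a one-row-per-element query for the given dialect (sketch only)."""
    if dialect == "postgresql":
        # Set-returning function joined laterally against each source row.
        return (
            f"SELECT {table}.{id_col}, {alias} FROM {table} "
            f"CROSS JOIN LATERAL unnest({table}.{array_col}) AS {alias}"
        )
    if dialect == "duckdb":
        # DuckDB allows unnest() directly in the SELECT list.
        return f"SELECT {id_col}, unnest({array_col}) AS {alias} FROM {table}"
    if dialect == "sqlite":
        # No array type: expand a JSON array with the json_each table-valued function.
        return (
            f"SELECT {table}.{id_col}, j.value AS {alias} "
            f"FROM {table}, json_each({table}.{array_col}) AS j"
        )
    raise NotImplementedError(f"explode is not supported for dialect {dialect!r}")


postgres_sql = compile_explode("postgresql", "posts", "id", "tags")
duckdb_sql = compile_explode("duckdb", "posts", "id", "tags")
sqlite_sql = compile_explode("sqlite", "posts", "id", "tags")
```

The three outputs share no common syntax beyond SELECT and FROM, which is why this item is rated high effort and high risk.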

Estimated Effort: 3-4 weeks

14. Transaction Control Enhancements (Medium Impact, Medium Effort)

Priority: MEDIUM

[ ] Savepoints
[ ] Transaction isolation levels
[ ] Locking (SELECT ... FOR UPDATE)

Why:

Production needs: Important for concurrent access
SQL standard: Well-defined features
Note: Current transaction support may be sufficient for many use cases
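
Savepoints are the simplest of the three to demonstrate. The sketch below uses the standard-library sqlite3 module with manual transaction control; the `accounts` table is invented for the example, and it does not show how Moltres would expose savepoints. Isolation levels and SELECT ... FOR UPDATE are left out because SQLite does not offer them in the form other databases do.

```python
import sqlite3

# isolation_level=None disables implicit BEGINs so transactions are explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")

conn.execute("BEGIN")
conn.execute("INSERT INTO accounts (id, balance) VALUES (1, 100)")

# Mark a point inside the transaction that can be rolled back to.
conn.execute("SAVEPOINT before_bonus")
conn.execute("UPDATE accounts SET balance = balance + 50 WHERE id = 1")

# Undo only the work done since the savepoint; the INSERT survives.
conn.execute("ROLLBACK TO SAVEPOINT before_bonus")
conn.execute("RELEASE SAVEPOINT before_bonus")
conn.execute("COMMIT")

balance = conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
```

The outer transaction commits with the inserted row intact while the update inside the savepoint is discarded, which is the partial-rollback behavior a savepoint API would need to expose.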

Estimated Effort: 2-3 weeks

🎯 Strategic Recommendations

Quick Wins (Do First)

Schema inspection/reflection
FILTER clause
DuckDB support

Foundation Building (Do Early)

Schema management (constraints, indexes)

High-Value Features (Do Soon)

Better type safety
Test coverage improvements

Ecosystem Expansion (Do When Ready)

BigQuery support
Snowflake support
Redshift support

Nice-to-Have (Do Later)

Advanced JSON functions
Full-text search
PIVOT/UNPIVOT

📝 Implementation Notes

Dependencies

Schema inspection should come before schema management features
Type safety improvements can be incremental
Dialect support can be added independently

Risk Assessment

Low Risk: Schema inspection, FILTER clause, DuckDB support
Medium Risk: Schema management
High Risk: Type safety overhaul, UNNEST support

Effort vs Impact Matrix

High Impact, Low Effort:

FILTER clause
DuckDB support

High Impact, Medium Effort:

Schema inspection
Schema management
BigQuery support

High Impact, High Effort:

UNNEST support

Medium Impact, Medium Effort:

Advanced transaction control

Medium Impact, High Effort:

Type safety improvements

🎯 Recommended Sprint Plan

Sprint 1 (v0.13.0) - 4-6 weeks ✅ COMPLETED

Schema inspection & reflection ✅
FILTER clause ✅
Schema management (UNIQUE, CHECK, FOREIGN KEY, Indexes) ✅

Sprint 2 (v0.14.0) - 4-6 weeks

No items currently planned

Sprint 3 (v0.15.0) - 4-6 weeks

Type safety improvements (incremental)
Test coverage to 80%+
DuckDB support ✅
Documentation enhancements

Sprint 4 (v0.16.0) - 4-6 weeks

BigQuery support
Automated release process
Migration guides
Performance benchmarks

💡 Key Insights

Start with schema inspection - It’s foundational and enables many other features
Prioritize SQL standard features - They work across databases and provide the most value
Incremental type safety - Don’t try to do it all at once
Focus on developer experience - Schema inspection, better docs, and type safety all help adoption
Quick wins matter - DuckDB support and FILTER clause provide high value with low effort