Priority Implementation Roadmap

Based on analysis of todo.md and current project state (post v0.12.0), here’s a prioritized list of the best next items to implement.

🚀 Immediate Next Steps (v0.13.0)

1. Schema Inspection & Reflection (High Impact, Medium Effort) ✅ COMPLETED

Priority: CRITICAL

  • [x] Table reflection (db.reflect_table(name))

  • [x] Database reflection (db.reflect())

  • [x] Schema introspection utilities (db.get_table_names(), db.get_view_names(), etc.)

  • [x] Column metadata introspection (db.get_columns(table_name), etc.)

Why First:

  • Foundation for other features: Enables better schema management, migrations, and tooling

  • Developer experience: Critical for working with existing databases

  • Low risk: SQLAlchemy already provides reflection APIs, mainly needs DataFrame API wrapper

  • High value: Users frequently need to inspect existing schemas

Estimated Effort: 2-3 weeks


2. FILTER Clause for Conditional Aggregation (High Impact, Low-Medium Effort) ✅ COMPLETED

Priority: HIGH

  • [x] FILTER clause support (COUNT(*) FILTER (WHERE condition))

Why Second:

  • SQL standard: Supported by PostgreSQL, MySQL 8.0+, SQL Server, Oracle

  • Common use case: Very useful for conditional aggregations without subqueries

  • Clean API: Fits naturally into existing aggregation methods

  • Performance: More efficient than subquery alternatives

Estimated Effort: 1-2 weeks


Status: COMPLETED (v0.13.0)


3. Schema Management - Constraints (High Impact, Medium Effort) ✅ COMPLETED

Priority: HIGH

  • [x] Unique constraints (UNIQUE)

  • [x] Check constraints (CHECK)

  • [x] Foreign key constraints (FOREIGN KEY ... REFERENCES)

  • [x] Indexes (CREATE INDEX, DROP INDEX)

Why Third:

  • Database fundamentals: Essential for production use

  • Enables migrations: Foundation for schema evolution

  • Data integrity: Critical for real-world applications

  • SQL standard: Well-defined, low ambiguity

Estimated Effort: 2-3 weeks


Status: COMPLETED (v0.13.0)


📈 Short-term (v0.14.0)

No items currently planned for v0.14.0


🔧 Developer Experience (v0.15.0)

7. Better Type Safety (Medium Impact, High Effort)

Priority: MEDIUM

  • [ ] Better type inference for schemas

  • [ ] Generic DataFrame types with schema

  • [ ] Type-safe column references

  • [ ] Better mypy coverage (reduce Any types)

Why:

  • Long-term value: Improves developer experience significantly

  • Catches errors early: Type safety prevents runtime issues

  • IDE support: Better autocomplete and error detection

  • Note: High effort but foundational for long-term maintainability

Estimated Effort: 3-4 weeks


8. Test Coverage Improvements (Medium Impact, Medium Effort)

Priority: MEDIUM

  • [ ] Increase coverage from 75% to 80%+

  • [ ] Property-based testing with Hypothesis

  • [ ] Load testing

Why:

  • Quality assurance: Prevents regressions

  • Confidence: Enables faster feature development

  • Current gap: Already at 75%, small push to 80%+

Estimated Effort: 2-3 weeks


🌐 Ecosystem Expansion (v0.16.0+)

9. DuckDB Support (High Impact, Low Effort) ✅ COMPLETED

Priority: MEDIUM-HIGH

  • [x] DuckDB dialect support

Why:

  • Growing popularity: DuckDB is rapidly gaining adoption

  • Low effort: SQLAlchemy has DuckDB support

  • Analytics focus: Aligns with Moltres’s analytics use case

  • Embedded database: Great for local development and testing

Status: COMPLETED

Estimated Effort: 1 week


10. BigQuery Support (High Impact, Medium Effort)

Priority: MEDIUM-HIGH

  • [ ] BigQuery dialect support

Why:

  • Enterprise adoption: BigQuery is widely used

  • Cloud-native: Important for modern data stacks

  • Analytics focus: Strong analytics capabilities

Estimated Effort: 2 weeks


🛠️ Infrastructure Improvements

11. Automated Release Process (Low Impact, Low Effort)

Priority: MEDIUM

  • [ ] Automated release process

  • [ ] Version bump automation

  • [ ] Changelog generation

Why:

  • Time savings: Reduces manual work for releases

  • Consistency: Ensures release process is repeatable

  • Low effort: Can leverage existing tools (semantic-release, etc.)

Estimated Effort: 1 week


12. Enhanced Documentation (Medium Impact, Medium Effort)

Priority: MEDIUM

  • [ ] Enhanced docs/index.md with better organization

  • [ ] Migration guides (Pandas, SQLAlchemy, Spark)

  • [ ] Performance benchmarks documentation

Why:

  • Adoption: Good documentation drives adoption

  • User onboarding: Reduces friction for new users

  • Marketing: Helps communicate value proposition

Estimated Effort: 2-3 weeks


📊 Advanced Features (Future)

13. UNNEST / Table-Valued Functions (High Impact, High Effort)

Priority: MEDIUM (after foundation)

  • [ ] UNNEST() support

  • [ ] Table-valued functions in FROM clause

Why:

  • Completes explode(): API exists but needs SQL compilation

  • Complex feature: Requires significant compiler work

  • Dialect-specific: Different implementations per database

Estimated Effort: 3-4 weeks


14. Transaction Control Enhancements (Medium Impact, Medium Effort)

Priority: MEDIUM

  • [ ] Savepoints

  • [ ] Transaction isolation levels

  • [ ] Locking (SELECT ... FOR UPDATE)

Why:

  • Production needs: Important for concurrent access

  • SQL standard: Well-defined features

  • Note: Current transaction support may be sufficient for many use cases

Estimated Effort: 2-3 weeks


🎯 Strategic Recommendations

Quick Wins (Do First)

  1. Schema inspection/reflection

  2. FILTER clause

  3. DuckDB support

Foundation Building (Do Early)

  1. Schema management (constraints, indexes)

High-Value Features (Do Soon)

  1. Better type safety

  2. Test coverage improvements

Ecosystem Expansion (Do When Ready)

  1. BigQuery support

  2. Snowflake support

  3. Redshift support

Nice-to-Have (Do Later)

  1. Advanced JSON functions

  2. Full-text search

  3. PIVOT/UNPIVOT


📝 Implementation Notes

Dependencies

  • Schema inspection should come before schema management features

  • Type safety improvements can be incremental

  • Dialect support can be added independently

Risk Assessment

  • Low Risk: Schema inspection, FILTER clause, DuckDB support

  • Medium Risk: Schema management

  • High Risk: Type safety overhaul, UNNEST support

Effort vs Impact Matrix

High Impact, Low Effort:

  • Schema inspection

  • FILTER clause

  • DuckDB support

High Impact, Medium Effort:

  • Schema management

  • BigQuery support

Medium Impact, High Effort:

  • Type safety improvements

  • UNNEST support

  • Advanced transaction control



💡 Key Insights

  1. Start with schema inspection - It’s foundational and enables many other features

  2. Prioritize SQL standard features - They work across databases and provide most value

  3. Incremental type safety - Don’t try to do it all at once

  4. Focus on developer experience - Schema inspection, better docs, and type safety all help adoption

  5. Quick wins matter - DuckDB support and FILTER clause provide high value with low effort