Migration Guide
This guide helps you migrate to Moltres from other data processing libraries.
Migrating from Pandas
Basic Operations
Pandas:
import pandas as pd
df = pd.read_csv("data.csv")
filtered = df[df["age"] > 18]
result = filtered.groupby("category")["amount"].sum()
Moltres:
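from moltres import connect, col
# Moltres aggregate functions; Python's built-in sum() does not work on columns
# (import path assumed here; check it against your Moltres version)
from moltres.expressions import functions as F
db = connect("sqlite:///data.db")
df = db.read.csv("data.csv")
filtered = df.where(col("age") > 18)
result = filtered.group_by("category").agg(F.sum(col("amount")))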
Key Differences
Lazy Evaluation: Moltres operations are lazy until .collect() is called
SQL Pushdown: Operations execute in the database
No In-Memory Data: Data stays in the database until results are collected
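To make the lazy-evaluation and pushdown points concrete, here is a minimal sketch using only Python's standard sqlite3 module. The LazyQuery class is a hypothetical stand-in written for this guide, not Moltres's implementation: it records filters without touching the database and only builds and runs SQL when collect() is called.

```python
import sqlite3


class LazyQuery:
    """Hypothetical stand-in for a lazy DataFrame (not Moltres's implementation)."""

    def __init__(self, conn, table, conditions=()):
        self.conn = conn
        self.table = table  # assumed to be a trusted identifier
        self.conditions = tuple(conditions)

    def where(self, condition, *params):
        # Record the filter and return a new query; no SQL runs here.
        return LazyQuery(self.conn, self.table, self.conditions + ((condition, params),))

    def to_sql(self):
        sql = f"SELECT * FROM {self.table}"
        if self.conditions:
            sql += " WHERE " + " AND ".join(f"({cond})" for cond, _ in self.conditions)
        params = [p for _, group in self.conditions for p in group]
        return sql, params

    def collect(self):
        # Only now is a statement sent to the database, which does the filtering.
        sql, params = self.to_sql()
        return self.conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO people VALUES (?, ?)", [("Alice", 30), ("Bob", 12)])

adults = LazyQuery(conn, "people").where("age > ?", 18)  # nothing executed yet
lazy_sql, lazy_params = adults.to_sql()
adult_rows = adults.collect()  # rows are fetched into memory only at this point
```

In a pandas script every intermediate step materializes a new in-memory object; here the intermediate `adults` value is only a description of a query.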
Migrating from SQLAlchemy ORM
Query Building
SQLAlchemy ORM:
from sqlalchemy.orm import Session
from models import User
session = Session()
users = session.query(User).filter(User.age > 18).all()
Moltres:
from moltres import connect, col
db = connect("postgresql://...")
users = db.table("users").select().where(col("age") > 18).collect()
CRUD Operations
SQLAlchemy ORM:
# Create
user = User(name="Alice", age=30)
session.add(user)
session.commit()
# Update
user.age = 31
session.commit()
# Delete
session.delete(user)
session.commit()
Moltres:
# Create
db.createDataFrame([{"name": "Alice", "age": 30}]).write.insertInto("users")
# Update
df = db.table("users").select()
df.write.update("users", where=col("name") == "Alice", set={"age": 31})
# Delete
df.write.delete("users", where=col("name") == "Alice")
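The practical difference from the ORM version is that these writes are set-based: they act on every row matching the predicate rather than on a loaded object. As a rough illustration, the sketch below runs comparable statements with the standard sqlite3 module. The exact SQL Moltres emits is an assumption here and may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, age INTEGER)")

# Create: one INSERT per row of the source data.
conn.execute("INSERT INTO users (name, age) VALUES (?, ?)", ("Alice", 30))

# Update: affects every row matching the WHERE clause, not a single object.
conn.execute("UPDATE users SET age = ? WHERE name = ?", (31, "Alice"))
after_update = conn.execute("SELECT name, age FROM users").fetchall()

# Delete: again driven by a predicate.
conn.execute("DELETE FROM users WHERE name = ?", ("Alice",))
after_delete = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]

conn.commit()
```

When porting ORM code, check that each predicate identifies exactly the rows the original object-level operation touched (a primary key is usually safer than a name).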
Migrating from PySpark
DataFrame Operations
PySpark:
from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()
df = spark.read.csv("data.csv", header=True, inferSchema=True)
result = df.filter(df.age > 18).groupBy("category").sum("amount")
Moltres:
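from moltres import connect, col
# Aggregate functions module (import path assumed; Python's built-in sum() will not work)
from moltres.expressions import functions as F
db = connect("postgresql://...")
df = db.read.csv("data.csv")
result = df.where(col("age") > 18).group_by("category").agg(F.sum(col("amount")))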
Key Differences
No Cluster: Moltres runs against existing databases; no cluster is needed
Similar API: Moltres claims about 98% API compatibility with PySpark
SQL Pushdown: All operations compile to SQL
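For intuition about what "compile to SQL" means for the filter-and-aggregate pipeline above, the following sketch runs a hand-written equivalent query with the standard sqlite3 module. The table name, sample rows, and exact query text are assumptions for illustration; the SQL that Moltres generates may be shaped differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (category TEXT, age INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("books", 25, 10.0), ("books", 40, 5.0), ("toys", 16, 99.0), ("toys", 30, 7.5)],
)

# where(age > 18) -> WHERE, group_by("category") -> GROUP BY, sum(amount) -> SUM(...)
query = """
    SELECT category, SUM(amount) AS total
    FROM sales
    WHERE age > 18
    GROUP BY category
    ORDER BY category
"""
totals = conn.execute(query).fetchall()
```

The whole pipeline becomes a single statement, so filtering and aggregation happen inside the database engine rather than on Spark executors.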
Migrating from Ibis
Query Building
Ibis:
import ibis
con = ibis.postgres.connect(...)
table = con.table("users")
result = table.filter(table.age > 18).group_by("category").aggregate(...)
Moltres:
from moltres import connect, col
db = connect("postgresql://...")
df = db.table("users").select().where(col("age") > 18)
result = df.group_by("category").agg(...)
Key Differences
DataFrame API: Moltres uses a DataFrame API (like Pandas/PySpark)
CRUD Operations: Moltres supports INSERT/UPDATE/DELETE
Type Safety: Full type hints throughout
Migration Checklist
Pre-Migration
[ ] Identify all data sources
[ ] Map current operations to Moltres equivalents
[ ] Identify breaking changes
[ ] Plan migration strategy (big bang vs. gradual)
Migration Steps
Setup
Install Moltres
Configure database connections
Test connectivity
Data Migration
Migrate data to target database
Verify data integrity
Set up indexes
Code Migration
Replace library imports
Update API calls
Update data access patterns
Testing
Test all operations
Verify results match
Run performance tests
Deployment
Deploy to staging
Monitor for issues
Deploy to production
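The "Verify results match" step can be sketched as an order-insensitive comparison of the rows produced by the old and new code paths. The helper name rows_match and the float tolerance are hypothetical choices for this guide; rows are assumed to be tuples of hashable values.

```python
from collections import Counter


def rows_match(legacy_rows, migrated_rows, float_digits=6):
    """Compare two result sets as multisets, ignoring row order."""

    def normalize(row):
        # Round floats so tiny representation differences do not fail the check.
        return tuple(round(v, float_digits) if isinstance(v, float) else v for v in row)

    return Counter(map(normalize, legacy_rows)) == Counter(map(normalize, migrated_rows))


legacy = [("books", 15.0), ("toys", 7.5)]
migrated = [("toys", 7.5), ("books", 15.000000001)]

same = rows_match(legacy, migrated)              # same rows, different order
different = rows_match(legacy, [("books", 15.0)])  # a row is missing
```

Comparing as multisets matters because SQL results have no guaranteed order unless the query includes ORDER BY, and duplicate rows should still be counted.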
Post-Migration
[ ] Monitor performance
[ ] Verify data correctness
[ ] Update documentation
[ ] Train team members
Common Migration Patterns
Pattern 1: Gradual Migration
Keep existing system running
Migrate one module at a time
Use Moltres for new features
Gradually replace old code
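One possible way to migrate a module at a time is to route calls through a small switch, so each module can be moved over (and moved back) independently. This is a sketch under assumptions: MIGRATED_MODULES, dispatch, and the two example functions are hypothetical names, and the "migrated" function merely stands in for code rewritten against Moltres.

```python
from typing import Callable, Set

# Hypothetical flag set: add a module name once its migrated version is verified.
MIGRATED_MODULES: Set[str] = {"reports"}


def dispatch(module: str, legacy_impl: Callable, migrated_impl: Callable, *args):
    """Call the migrated implementation only for modules that have been switched over."""
    impl = migrated_impl if module in MIGRATED_MODULES else legacy_impl
    return impl(*args)


def legacy_total(amounts):
    # Stands in for the existing implementation.
    return ("legacy", sum(amounts))


def migrated_total(amounts):
    # Stands in for the rewritten implementation.
    return ("migrated", sum(amounts))


reports_result = dispatch("reports", legacy_total, migrated_total, [1, 2, 3])
billing_result = dispatch("billing", legacy_total, migrated_total, [1, 2, 3])
```

Removing a name from the set is then the rollback path for that module.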
Pattern 2: Big Bang Migration
Migrate entire system at once
Requires thorough testing
Higher risk but faster completion
Pattern 3: Hybrid Approach
Use Moltres for new features
Keep existing code as-is
Migrate when touching old code
Troubleshooting
Common Issues
Performance Differences
Add indexes
Optimize queries
Use connection pooling
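Because the work now runs inside the database, a slow query is usually diagnosed with the database's own tools. The sketch below uses SQLite's EXPLAIN QUERY PLAN through the standard sqlite3 module to compare the plan before and after adding an index; the table, index name, and data are made up, and other databases expose similar information through their own EXPLAIN commands.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(f"user{i}", i % 90) for i in range(1000)],
)


def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a human-readable step.
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


query = "SELECT name FROM users WHERE age = 42"

plan_before = plan(query)  # typically reports a full table scan
conn.execute("CREATE INDEX idx_users_age ON users (age)")
plan_after = plan(query)   # typically reports a search using the new index
```

The exact plan wording varies between SQLite versions, so inspect it rather than matching it literally.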
API Differences
Check documentation
Use type hints for IDE help
Review examples
Data Type Mismatches
Verify schema
Check type mappings
Use explicit casting
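A common source of mismatches is a column that was numeric in pandas or Spark but ends up stored as text in the database, for example after loading a CSV. The sketch below shows the effect in SQLite with the standard sqlite3 module, and how an explicit CAST restores numeric comparison. The table and values are invented for illustration, and comparison rules differ between databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw (age TEXT)")  # age was loaded as text
conn.executemany("INSERT INTO raw VALUES (?)", [("9",), ("30",)])

# The column is TEXT, so SQLite compares as strings: '9' sorts after '18'.
as_text = conn.execute("SELECT age FROM raw WHERE age > 18").fetchall()

# An explicit cast makes the comparison numeric.
as_int = conn.execute(
    "SELECT CAST(age AS INTEGER) FROM raw WHERE CAST(age AS INTEGER) > 18"
).fetchall()
```

Here the uncast query wrongly keeps the row with age 9, while the cast query returns only the row with age 30.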
Getting Help
Check documentation
Search GitHub issues
Ask questions in discussions
Review examples in docs/