Operational Runbooks
This document provides operational procedures for running Moltres in production environments.
Deployment
Pre-Deployment Checklist
[ ] All tests pass (
pytest)[ ] Type checking passes (
mypy src)[ ] Linting passes (
ruff check .)[ ] Documentation is up to date
[ ] Changelog is updated
[ ] Version number is bumped
[ ] Dependencies are reviewed for security issues
[ ] Performance benchmarks are within acceptable ranges
Deployment Steps
Build Package
python -m build
Test Installation
pip install dist/moltres-*.whl python -c "import moltres; print(moltres.__version__)"
Upload to PyPI
twine upload dist/*
Verify Release
pip install --upgrade moltres python -c "import moltres; print(moltres.__version__)"
Create GitHub Release
Tag the release:
git tag vX.Y.ZPush tag:
git push origin vX.Y.ZCreate release on GitHub with changelog
Schema Migrations
Adding New Tables
Create Migration Script
from moltres import connect from moltres.table.schema import column
db = connect(os.getenv(“DATABASE_URL”))
Create new table
db.create_table( “new_table”, [ column(“id”, “INTEGER”, primary_key=True), column(“name”, “TEXT”), ], if_not_exists=True, ).collect()
2. **Test Migration**
- Test on staging environment first
- Verify table structure
- Test data insertion/retrieval
3. **Apply to Production**
- Run during maintenance window
- Monitor for errors
- Verify table creation
### Modifying Existing Tables
1. **Add Columns**
```python
# Use ALTER TABLE (database-specific)
db.sql("ALTER TABLE existing_table ADD COLUMN new_column TEXT").collect()
Drop Columns
Use ALTER TABLE (database-specific)
db.sql(“ALTER TABLE existing_table DROP COLUMN old_column”).collect()
3. **Rename Columns**
```python
# Use ALTER TABLE (database-specific)
db.sql("ALTER TABLE existing_table RENAME COLUMN old_name TO new_name").collect()
Migration Best Practices
Backup First: Always backup database before migrations
Test on Staging: Test all migrations on staging environment
Rollback Plan: Have rollback scripts ready
Monitor: Watch for errors during migration
Document: Document all schema changes
Incident Response
Database Connection Issues
Symptoms:
Connection timeouts
“Connection refused” errors
Pool exhaustion errors
Diagnosis:
from moltres.utils.health import check_connection_health, check_pool_health
# Check connection health
result = check_connection_health(db)
print(result)
# Check pool health
pool_result = check_pool_health(db)
print(pool_result)
Resolution:
Check database server status
Verify network connectivity
Check firewall rules
Review connection pool settings
Increase pool size if needed:
db = connect( dsn, pool_size=20, # Increase from default max_overflow=40, )
Query Performance Issues
Symptoms:
Slow query execution
Timeout errors
High CPU usage
Diagnosis:
# Enable query logging
from moltres import register_performance_hook
def log_slow_queries(sql: str, elapsed: float, metadata: dict):
if elapsed > 1.0:
print(f"Slow query ({elapsed:.2f}s): {sql[:200]}")
register_performance_hook("query_end", log_slow_queries)
Resolution:
Analyze slow queries with EXPLAIN
Add indexes on frequently queried columns
Optimize query patterns
Consider query caching
Scale database resources if needed
Memory Issues
Symptoms:
Out of memory errors
High memory usage
Process crashes
Diagnosis:
import psutil
import os
process = psutil.Process(os.getpid())
print(f"Memory usage: {process.memory_info().rss / 1024 / 1024} MB")
Resolution:
Use streaming for large datasets:
Use streaming mode
df = db.read.csv(“large_file.csv”, stream=True)
2. Process data in chunks
3. Close database connections when done
4. Monitor memory usage
5. Scale application resources
### Data Corruption
**Symptoms:**
- Unexpected query results
- Constraint violations
- Data inconsistencies
**Diagnosis:**
```python
# Verify data integrity
result = db.sql("SELECT COUNT(*) FROM table_name").collect()
print(f"Row count: {result[0]['count']}")
# Check for duplicates
result = db.sql("""
SELECT column_name, COUNT(*) as count
FROM table_name
GROUP BY column_name
HAVING COUNT(*) > 1
""").collect()
Resolution:
Identify corrupted data
Restore from backup if needed
Fix data inconsistencies
Add data validation
Implement data integrity checks
Monitoring
Key Metrics to Monitor
Query Performance
Average query duration
P95/P99 query latency
Slow query count
Connection Pool
Active connections
Pool utilization
Connection wait time
Error Rates
Query error rate
Connection error rate
Timeout rate
Resource Usage
Memory usage
CPU usage
Network I/O
Setting Up Monitoring
from moltres.utils.telemetry import get_metrics_collector, get_structured_logger
# Get metrics
metrics = get_metrics_collector()
print(metrics.get_metrics())
# Structured logging
logger = get_structured_logger()
# Logs are automatically structured JSON
Alerting Thresholds
Query Duration: Alert if P95 > 5 seconds
Error Rate: Alert if error rate > 1%
Connection Pool: Alert if utilization > 80%
Memory Usage: Alert if usage > 80% of available
Backup and Recovery
Backup Procedures
Database Backups
# PostgreSQL pg_dump -h host -U user -d dbname > backup.sql # MySQL mysqldump -h host -u user -p dbname > backup.sql # SQLite sqlite3 database.db ".backup backup.db"
Application Data Backups
Export critical data to files
Store backups in secure location
Encrypt sensitive backups
Recovery Procedures
Restore from Backup
# PostgreSQL psql -h host -U user -d dbname < backup.sql # MySQL mysql -h host -u user -p dbname < backup.sql # SQLite cp backup.db database.db
Point-in-Time Recovery
Use database transaction logs
Restore to specific timestamp
Verify data integrity
Scaling
Horizontal Scaling
Read Replicas
Set up read replicas for read-heavy workloads
Route read queries to replicas
Keep writes on primary
Connection Pooling
Increase pool size for more connections
Use connection pooler (e.g., PgBouncer)
Vertical Scaling
Database Resources
Increase CPU/memory for database server
Add faster storage (SSD)
Optimize database configuration
Application Resources
Increase application server resources
Use multiple application instances
Load balance requests
Maintenance Windows
Scheduled Maintenance
Database Maintenance
Run VACUUM/OPTIMIZE (database-specific)
Update statistics
Check for corruption
Application Maintenance
Deploy updates
Restart services
Clear caches
Maintenance Checklist
[ ] Notify users of maintenance window
[ ] Backup database
[ ] Put application in maintenance mode
[ ] Perform maintenance tasks
[ ] Verify system health
[ ] Remove maintenance mode
[ ] Monitor for issues
Troubleshooting
Common Issues
“Connection pool exhausted”
Increase pool size
Check for connection leaks
Use connection pooler
“Query timeout”
Optimize slow queries
Increase query timeout
Add indexes
“Table does not exist”
Verify table name
Check schema/database
Verify permissions
“Permission denied”
Check database user permissions
Verify table access
Check firewall rules
Debug Mode
Enable debug logging:
import logging
logging.basicConfig(level=logging.DEBUG)
logging.getLogger("moltres").setLevel(logging.DEBUG)
Getting Help
Check documentation
Search GitHub issues
Review logs
Contact support (if available)