# Operational Runbooks

This document provides operational procedures for running Moltres in production environments.

## Deployment

### Pre-Deployment Checklist

- [ ] All tests pass (`pytest`)
- [ ] Type checking passes (`mypy src`)
- [ ] Linting passes (`ruff check .`)
- [ ] Documentation is up to date
- [ ] Changelog is updated
- [ ] Version number is bumped
- [ ] Dependencies are reviewed for security issues
- [ ] Performance benchmarks are within acceptable ranges

### Deployment Steps

1. **Build Package**
   ```bash
   python -m build
   ```

2. **Test Installation**
   ```bash
   pip install dist/moltres-*.whl
   python -c "import moltres; print(moltres.__version__)"
   ```

3. **Upload to PyPI**
   ```bash
   twine upload dist/*
   ```

4. **Verify Release**
   ```bash
   pip install --upgrade moltres
   python -c "import moltres; print(moltres.__version__)"
   ```

5. **Create GitHub Release**
   - Tag the release: `git tag vX.Y.Z`
   - Push tag: `git push origin vX.Y.Z`
   - Create release on GitHub with changelog

## Schema Migrations

### Adding New Tables

1. **Create Migration Script**
   ```python
from moltres import connect
from moltres.table.schema import column

db = connect(os.getenv("DATABASE_URL"))

# Create new table
db.create_table(
    "new_table",
    [
        column("id", "INTEGER", primary_key=True),
        column("name", "TEXT"),
    ],
    if_not_exists=True,
).collect()
   ```

2. **Test Migration**
   - Test on staging environment first
   - Verify table structure
   - Test data insertion/retrieval

3. **Apply to Production**
   - Run during maintenance window
   - Monitor for errors
   - Verify table creation

### Modifying Existing Tables

1. **Add Columns**
   ```python
# Use ALTER TABLE (database-specific)
db.sql("ALTER TABLE existing_table ADD COLUMN new_column TEXT").collect()
   ```

2. **Drop Columns**
   ```python
# Use ALTER TABLE (database-specific)
db.sql("ALTER TABLE existing_table DROP COLUMN old_column").collect()
   ```

3. **Rename Columns**
   ```python
# Use ALTER TABLE (database-specific)
db.sql("ALTER TABLE existing_table RENAME COLUMN old_name TO new_name").collect()
   ```

### Migration Best Practices

1. **Backup First**: Always backup database before migrations
2. **Test on Staging**: Test all migrations on staging environment
3. **Rollback Plan**: Have rollback scripts ready
4. **Monitor**: Watch for errors during migration
5. **Document**: Document all schema changes

## Incident Response

### Database Connection Issues

**Symptoms:**
- Connection timeouts
- "Connection refused" errors
- Pool exhaustion errors

**Diagnosis:**
```python
from moltres.utils.health import check_connection_health, check_pool_health

# Check connection health
result = check_connection_health(db)
print(result)

# Check pool health
pool_result = check_pool_health(db)
print(pool_result)
```

**Resolution:**
1. Check database server status
2. Verify network connectivity
3. Check firewall rules
4. Review connection pool settings
5. Increase pool size if needed:
   ```python
   db = connect(
       dsn,
       pool_size=20,  # Increase from default
       max_overflow=40,
   )
   ```

### Query Performance Issues

**Symptoms:**
- Slow query execution
- Timeout errors
- High CPU usage

**Diagnosis:**
```python
# Enable query logging
from moltres import register_performance_hook

def log_slow_queries(sql: str, elapsed: float, metadata: dict):
    if elapsed > 1.0:
        print(f"Slow query ({elapsed:.2f}s): {sql[:200]}")

register_performance_hook("query_end", log_slow_queries)
```

**Resolution:**
1. Analyze slow queries with EXPLAIN
2. Add indexes on frequently queried columns
3. Optimize query patterns
4. Consider query caching
5. Scale database resources if needed

### Memory Issues

**Symptoms:**
- Out of memory errors
- High memory usage
- Process crashes

**Diagnosis:**
```python
import psutil
import os

process = psutil.Process(os.getpid())
print(f"Memory usage: {process.memory_info().rss / 1024 / 1024} MB")
```

**Resolution:**
1. Use streaming for large datasets:
   ```python
# Use streaming mode
df = db.read.csv("large_file.csv", stream=True)
   ```
2. Process data in chunks
3. Close database connections when done
4. Monitor memory usage
5. Scale application resources

### Data Corruption

**Symptoms:**
- Unexpected query results
- Constraint violations
- Data inconsistencies

**Diagnosis:**
```python
# Verify data integrity
result = db.sql("SELECT COUNT(*) FROM table_name").collect()
print(f"Row count: {result[0]['count']}")

# Check for duplicates
result = db.sql("""
    SELECT column_name, COUNT(*) as count
    FROM table_name
    GROUP BY column_name
    HAVING COUNT(*) > 1
""").collect()
```

**Resolution:**
1. Identify corrupted data
2. Restore from backup if needed
3. Fix data inconsistencies
4. Add data validation
5. Implement data integrity checks

## Monitoring

### Key Metrics to Monitor

1. **Query Performance**
   - Average query duration
   - P95/P99 query latency
   - Slow query count

2. **Connection Pool**
   - Active connections
   - Pool utilization
   - Connection wait time

3. **Error Rates**
   - Query error rate
   - Connection error rate
   - Timeout rate

4. **Resource Usage**
   - Memory usage
   - CPU usage
   - Network I/O

### Setting Up Monitoring

```python
from moltres.utils.telemetry import get_metrics_collector, get_structured_logger

# Get metrics
metrics = get_metrics_collector()
print(metrics.get_metrics())

# Structured logging
logger = get_structured_logger()
# Logs are automatically structured JSON
```

### Alerting Thresholds

- **Query Duration**: Alert if P95 > 5 seconds
- **Error Rate**: Alert if error rate > 1%
- **Connection Pool**: Alert if utilization > 80%
- **Memory Usage**: Alert if usage > 80% of available

## Backup and Recovery

### Backup Procedures

1. **Database Backups**
   ```bash
   # PostgreSQL
   pg_dump -h host -U user -d dbname > backup.sql
   
   # MySQL
   mysqldump -h host -u user -p dbname > backup.sql
   
   # SQLite
   sqlite3 database.db ".backup backup.db"
   ```

2. **Application Data Backups**
   - Export critical data to files
   - Store backups in secure location
   - Encrypt sensitive backups

### Recovery Procedures

1. **Restore from Backup**
   ```bash
   # PostgreSQL
   psql -h host -U user -d dbname < backup.sql
   
   # MySQL
   mysql -h host -u user -p dbname < backup.sql
   
   # SQLite
   cp backup.db database.db
   ```

2. **Point-in-Time Recovery**
   - Use database transaction logs
   - Restore to specific timestamp
   - Verify data integrity

## Scaling

### Horizontal Scaling

1. **Read Replicas**
   - Set up read replicas for read-heavy workloads
   - Route read queries to replicas
   - Keep writes on primary

2. **Connection Pooling**
   - Increase pool size for more connections
   - Use connection pooler (e.g., PgBouncer)

### Vertical Scaling

1. **Database Resources**
   - Increase CPU/memory for database server
   - Add faster storage (SSD)
   - Optimize database configuration

2. **Application Resources**
   - Increase application server resources
   - Use multiple application instances
   - Load balance requests

## Maintenance Windows

### Scheduled Maintenance

1. **Database Maintenance**
   - Run VACUUM/OPTIMIZE (database-specific)
   - Update statistics
   - Check for corruption

2. **Application Maintenance**
   - Deploy updates
   - Restart services
   - Clear caches

### Maintenance Checklist

- [ ] Notify users of maintenance window
- [ ] Backup database
- [ ] Put application in maintenance mode
- [ ] Perform maintenance tasks
- [ ] Verify system health
- [ ] Remove maintenance mode
- [ ] Monitor for issues

## Troubleshooting

### Common Issues

1. **"Connection pool exhausted"**
   - Increase pool size
   - Check for connection leaks
   - Use connection pooler

2. **"Query timeout"**
   - Optimize slow queries
   - Increase query timeout
   - Add indexes

3. **"Table does not exist"**
   - Verify table name
   - Check schema/database
   - Verify permissions

4. **"Permission denied"**
   - Check database user permissions
   - Verify table access
   - Check firewall rules

### Debug Mode

Enable debug logging:

```python
import logging

logging.basicConfig(level=logging.DEBUG)
logging.getLogger("moltres").setLevel(logging.DEBUG)
```

### Getting Help

1. Check documentation
2. Search GitHub issues
3. Review logs
4. Contact support (if available)