1. Monitor CPU Usage

  • Use Kubernetes-native tools like kubectl top to check CPU usage for the PostgreSQL pod:

    kubectl top pod <postgres-pod-name> -n <namespace>
    
  • Determine whether CPU usage is sustained or only spikes intermittently; sustained load usually points at queries or connection volume, while periodic spikes often line up with cron jobs, autovacuum runs, or batch traffic. The sketch below shows one way to sample usage over time.
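
  • A minimal way to sample usage over time, assuming the watch utility is available where you run kubectl (the --containers flag breaks usage down per container, which helps because CrunchyData pods run sidecars next to the database container):

    # Refresh the per-container CPU/memory readings every 15 seconds
    watch -n 15 kubectl top pod <postgres-pod-name> -n <namespace> --containers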

2. Check PostgreSQL Metrics

  • Use PostgreSQL’s built-in statistics and monitoring tools:
    • pg_stat_activity: Identify active queries and their states.

      SELECT * FROM pg_stat_activity;
      
    • pg_stat_statements: Identify the most resource-intensive queries (the extension must be listed in shared_preload_libraries and created with CREATE EXTENSION pg_stat_statements before it collects data).

      -- total_exec_time exists in PostgreSQL 13 and later; on older versions use total_time
      SELECT
          query,
          calls,
          total_exec_time,
          rows,
          100.0 * total_exec_time / sum(total_exec_time) OVER () AS percentage
      FROM
          pg_stat_statements
      ORDER BY
          total_exec_time DESC;
      
    • Reset pg_stat_statements:

      SELECT pg_stat_statements_reset();
      
    • pg_stat_user_tables: Check table-level statistics like sequential scans, index scans, etc.

      SELECT * FROM pg_stat_user_tables;
      
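    • To see what is consuming CPU right now, a minimal sketch that lists only active backends, longest-running first (all columns used are standard pg_stat_activity columns; the pg_backend_pid() filter simply excludes your own session):

      SELECT
          pid,
          usename,
          state,
          now() - query_start AS running_for,
          left(query, 80) AS query_preview
      FROM pg_stat_activity
      WHERE state = 'active'
        AND pid <> pg_backend_pid()
      ORDER BY running_for DESC;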

3. Analyze Slow Queries

  • Enable log_min_duration_statement to log slow queries:

    ALTER SYSTEM SET log_min_duration_statement = '1000'; -- Log queries slower than 1 second
    SELECT pg_reload_conf();
    
  • Check the PostgreSQL logs for slow queries:

    kubectl logs <postgres-pod-name> -n <namespace>
    
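  • If the pod runs multiple containers (CrunchyData PGO instance pods typically include sidecars alongside the PostgreSQL container, commonly named database in PGO v5; verify the container name in your own deployment), point kubectl logs at that container explicitly:

    kubectl logs <postgres-pod-name> -n <namespace> -c database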

4. Check Index Usage

  • Ensure that queries are using indexes efficiently:
    • Look for tables with a high ratio of sequential scans to index scans in pg_stat_user_tables (a query for this follows the list below).

    • Use EXPLAIN or EXPLAIN ANALYZE to analyze query plans:

      EXPLAIN ANALYZE SELECT * FROM your_table WHERE your_condition;
      
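  • As referenced above, a minimal sketch for spotting tables that are read by sequential scan far more often than by index; seq_scan, idx_scan, and n_live_tup are standard pg_stat_user_tables columns, and the size threshold is an arbitrary starting point:

    SELECT
        relname,
        seq_scan,
        idx_scan,
        n_live_tup
    FROM pg_stat_user_tables
    WHERE seq_scan > coalesce(idx_scan, 0)   -- idx_scan is NULL for tables with no indexes
      AND n_live_tup > 10000                 -- ignore small tables where seq scans are cheap
    ORDER BY seq_scan DESC;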

5. Review Resource Limits

  • Check the resource requests and limits configured for the PostgreSQL pod; with the CrunchyData operator these are usually defined on the PostgresCluster spec rather than edited directly on the pod:

    resources:
      limits:
        cpu: "2"
        memory: "4Gi"
      requests:
        cpu: "1"
        memory: "2Gi"
    
  • Compare these values against the usage observed with kubectl top in step 1; a CPU limit set too low causes throttling that can look like the database is permanently pegged. The sketch below shows one way to read the live values from the running pod.
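
  • A quick way to read the requests and limits actually applied to the running pod (plain kubectl, no operator-specific tooling assumed):

    kubectl get pod <postgres-pod-name> -n <namespace> \
      -o jsonpath='{range .spec.containers[*]}{.name}{"\t"}{.resources}{"\n"}{end}'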

6. Check for Connection Spikes

  • High CPU usage can be caused by too many connections, since each connection is a separate backend process competing for CPU. Check the total connection count:

    SELECT COUNT(*) FROM pg_stat_activity;
    
  • If the count is regularly near max_connections, or the application opens many short-lived connections, consider a connection pooler such as PgBouncer before raising max_connections (changing max_connections requires a restart). The breakdown below shows where the connections are coming from.
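
  • A minimal breakdown of current connections by database, user, and state, plus the configured ceiling (standard pg_stat_activity columns and a standard SHOW command):

    SELECT datname, usename, state, count(*) AS connections
    FROM pg_stat_activity
    GROUP BY datname, usename, state
    ORDER BY connections DESC;

    SHOW max_connections;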

7. Inspect Disk I/O

  • High CPU usage can sometimes be a symptom of a disk I/O bottleneck (for example, a poor cache hit ratio forcing constant reads). kubectl describe pod will surface volume-related events but not I/O performance, so combine it with node-level monitoring and PostgreSQL's own I/O statistics, as in the sketch below.
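
  • A rough cache-hit-ratio check per database using pg_stat_database (blks_hit and blks_read are standard columns); a ratio well below roughly 99% on a busy OLTP database suggests the working set does not fit in shared_buffers and reads are going to disk:

    SELECT
        datname,
        blks_hit,
        blks_read,
        round(100.0 * blks_hit / nullif(blks_hit + blks_read, 0), 2) AS cache_hit_pct
    FROM pg_stat_database
    WHERE datname IS NOT NULL
    ORDER BY blks_read DESC;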

8. Review CrunchyData Operator Logs

  • Check the logs of the CrunchyData operator for errors or repeated reconciliation failures:

    kubectl logs <crunchy-operator-pod-name> -n <namespace>
    
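  • If you do not know the exact pod name, you can target the operator Deployment instead; the deployment and namespace names here are placeholders and depend on how PGO was installed:

    kubectl logs deploy/<operator-deployment-name> -n <operator-namespace> --since=1h --tail=500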

9. Scale Resources

  • If the database is under-provisioned, consider scaling up:
    • Increase CPU/memory requests and limits (with the CrunchyData operator, change them on the PostgresCluster spec so the operator rolls the change out, rather than editing pods directly).
    • Add more replicas if using a clustered setup and route read-only traffic to them; replicas offload reads, but the primary still handles every write. A hedged spec excerpt follows this list.
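
  • A partial sketch of what this could look like with the PGO v5 PostgresCluster API; the cluster and instance set names are hypothetical, the excerpt omits required fields such as postgresVersion, storage, and backups, and field names should be verified against the operator version you run:

    apiVersion: postgres-operator.crunchydata.com/v1beta1
    kind: PostgresCluster
    metadata:
      name: hippo                 # hypothetical cluster name
    spec:
      instances:
        - name: instance1         # hypothetical instance set name
          replicas: 2             # one primary plus one read replica
          resources:
            requests:
              cpu: "2"
              memory: 4Gi
            limits:
              cpu: "4"
              memory: 8Gi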

10. Use External Monitoring Tools

  • Integrate with monitoring tools such as Prometheus and Grafana, or Crunchy's pgMonitor-based monitoring stack, to get continuous visibility into CPU, query, and connection trends instead of relying on point-in-time checks like the ones above.