# Monitoring Stack - cAdvisor + Prometheus + Grafana

## 🚀 USAGE

### **Step 1: Start the entire monitoring stack**

```bash
cd monitoring
docker-compose up -d
```

### **Step 2: Verify the services are running**

```bash
docker-compose ps

# OUTPUT:
# cadvisor      Running  0.0.0.0:8080->8080/tcp
# node-exporter Running  0.0.0.0:9100->9100/tcp
# prometheus    Running  0.0.0.0:9090->9090/tcp
# grafana       Running  0.0.0.0:3000->3000/tcp
```
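
Beyond `docker-compose ps`, each service can also be probed over HTTP. The sketch below is one possible way to do that, assuming the ports listed above are published on `localhost`. The function names `check_service` and `check_stack` are made up for illustration; the paths `/healthz`, `/-/healthy`, and `/api/health` are the health endpoints cAdvisor, Prometheus, and Grafana normally expose.

```bash
# Probe one HTTP endpoint and report the result.
# Usage: check_service <name> <url>
check_service() {
  if curl -fsS --max-time 5 -o /dev/null "$2"; then
    echo "$1 OK"
  else
    echo "$1 FAILED"
    return 1
  fi
}

# Probe all four services; returns non-zero if any probe fails.
check_stack() {
  failed=0
  check_service cadvisor      http://localhost:8080/healthz    || failed=1
  check_service node-exporter http://localhost:9100/metrics    || failed=1
  check_service prometheus    http://localhost:9090/-/healthy  || failed=1
  check_service grafana       http://localhost:3000/api/health || failed=1
  return "$failed"
}
```

After the stack has started, calling `check_stack` prints one line per service and exits non-zero if any of them did not answer.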

### **Step 3: Access the services**

| Service | URL | Purpose |
|---------|-----|---------|
| **cAdvisor** | http://localhost:8080 | View container metrics directly |
| **Prometheus** | http://localhost:9090 | View raw metrics and run queries |
| **Grafana** | http://localhost:3000 | View dashboards and charts |

### **Step 4: Log in to Grafana**

```
URL: http://localhost:3000
Username: admin
Password: admin123
```

---

## 📊 USING GRAFANA

### **1. Prometheus datasource is auto-configured**
- ✅ `http://prometheus:9090` is already added
- You can use it right away

### **2. Import dashboards from Grafana**

Go to: **Grafana → Dashboards → Import**

**Recommended Dashboards:**

1. **Docker Containers Monitoring** (ID: 14282)
   - CPU, RAM, and network usage of containers

2. **Node Exporter Full** (ID: 1860)
   - CPU, RAM, disk, and network usage of the host machine

3. **Prometheus Stats** (ID: 3662)
   - Monitors Prometheus itself

**How to import:**
```
1. Open Grafana
2. Click "+" → Import
3. Paste Dashboard ID (e.g., 14282)
4. Select Prometheus datasource
5. Click Import
```

### **3. Or create a custom dashboard**

```
1. Open Grafana
2. Click "+" → Create → Dashboard
3. Add Panel
4. Query metrics from Prometheus
```

**Popular metrics to query:**

```
# Container CPU % (of one core)
rate(container_cpu_usage_seconds_total[5m]) * 100

# Container memory (MB)
container_memory_usage_bytes / 1024 / 1024

# Host CPU % (non-idle, averaged over all cores)
100 - avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100

# Host memory used (MB)
(node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / 1024 / 1024

# Network receive rate (bytes/s)
rate(node_network_receive_bytes_total[5m])
```
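
These expressions can also be run outside Grafana through the Prometheus HTTP API at `/api/v1/query`. Below is a minimal sketch, assuming Prometheus is reachable at `http://localhost:9090` as in this README; the function name `prom_query` and the `PROM_URL` override are illustrative, not part of the stack.

```bash
# Run an instant PromQL query against the Prometheus HTTP API.
# PROM_URL is a hypothetical override; it defaults to the URL used in this README.
# --data-urlencode takes care of escaping brackets, braces, and spaces in the query.
prom_query() {
  curl -fsS -G --data-urlencode "query=$1" \
    "${PROM_URL:-http://localhost:9090}/api/v1/query"
}
```

For example, `prom_query 'rate(node_network_receive_bytes_total[5m])'` returns the result as JSON, which is handy for checking that a query works before building a panel around it.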

---

## 🧪 WHEN RUNNING A LOCUST TEST

### **Set up 2 terminals and 2 browser tabs:**

1. **Terminal 1: Run the app + monitoring**
   ```bash
   cd monitoring
   docker-compose up -d
   ```

2. **Terminal 2: Run Locust**
   ```bash
   cd ..
   # The web UI is enabled by default and listens on port 8089
   locust -f locustfile.py --host=http://localhost:8000
   ```

3. **Browser Tab 1: Locust Web UI**
   ```
   http://localhost:8089
   ```

4. **Browser Tab 2: Grafana Dashboard**
   ```
   http://localhost:3000
   ```

### **Workflow:**

```
1. Open the Locust UI (8089)
2. Open Grafana (3000)
3. Start the test from Locust
4. Watch the real-time metrics in Grafana
5. Monitor CPU, RAM, and network
6. Stop the test when done
```
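
When a repeatable run is more useful than clicking through the UI, the same test can be scripted. The sketch below assumes Locust 1.0 or later (where `--headless`, `-u`, `-r`, `-t`, and `--csv` are available) and the same `locustfile.py` and host as above. The function name, its default values, and the `results` CSV prefix are illustrative.

```bash
# Run Locust without the web UI for a fixed duration, then exit.
# Usage: run_load_test [users] [spawn_rate] [duration]
#   users      = peak number of concurrent users   (default 50)
#   spawn_rate = users started per second          (default 5)
#   duration   = total run time, e.g. 30s, 2m, 1h  (default 2m)
run_load_test() {
  locust -f locustfile.py --host=http://localhost:8000 \
    --headless \
    -u "${1:-50}" \
    -r "${2:-5}" \
    -t "${3:-2m}" \
    --csv results
}
```

Calling `run_load_test 100 10 5m` from the directory containing `locustfile.py` writes request statistics to `results_stats.csv`, while Grafana keeps showing CPU, RAM, and network for the same time window.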

---

## 🛑 STOP MONITORING

```bash
cd monitoring
docker-compose down

# Delete the data (if you want a full reset)
docker volume rm monitoring_prometheus-data monitoring_grafana-data
```

---

## 📁 FILE STRUCTURE

```
monitoring/
├── docker-compose.yml          (Main config)
├── prometheus.yml              (Prometheus config)
├── README.md                   (This file)
└── grafana/
    └── provisioning/
        ├── datasources/
        │   └── prometheus.yml  (Auto-add Prometheus datasource)
        └── dashboards/
            └── dashboard.yml   (Dashboard provisioning)
```
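
For orientation, the sketch below writes what the two smallest config files in this tree typically look like into a scratch directory, so nothing under `monitoring/` is overwritten. The contents are assumptions inferred from the service names and ports used in this README (`prometheus:9090`, `cadvisor:8080`, `node-exporter:9100`), not a copy of the repository's actual files; the 15s scrape interval is likewise illustrative.

```bash
# Write example configs to a scratch directory (not into monitoring/).
out_dir=$(mktemp -d)
mkdir -p "$out_dir/grafana/provisioning/datasources"

# Assumed Prometheus scrape config: one job per exporter in the stack.
cat > "$out_dir/prometheus.yml" <<'EOF'
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: cadvisor
    static_configs:
      - targets: ['cadvisor:8080']
  - job_name: node-exporter
    static_configs:
      - targets: ['node-exporter:9100']
EOF

# Assumed Grafana datasource provisioning file pointing at Prometheus.
cat > "$out_dir/grafana/provisioning/datasources/prometheus.yml" <<'EOF'
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
EOF

echo "Wrote example configs to $out_dir"
```

Comparing these against the real files is a quick way to spot a mistyped target or datasource URL.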

---

## 💾 PERSISTENT DATA

```
Prometheus data:  monitoring_prometheus-data (docker volume)
Grafana data:     monitoring_grafana-data (docker volume)

Data is preserved when you restart docker-compose.
```
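
To confirm the volumes exist and see where Docker keeps them, something like the following can help. It is a small sketch that assumes the `monitoring_` project prefix shown in the volume names above; the function names are illustrative.

```bash
# List the named volumes whose names contain the Compose project prefix.
list_monitoring_volumes() {
  docker volume ls --filter name=monitoring_
}

# Print the path where a volume is stored on the Docker host.
# Usage: volume_mountpoint monitoring_grafana-data
volume_mountpoint() {
  docker volume inspect --format '{{ .Mountpoint }}' "$1"
}
```

On Docker Desktop the reported mountpoint is a path inside the Docker VM rather than on the host filesystem.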

---

## 🔧 TROUBLESHOOTING

### **Problem: cAdvisor does not run on Docker Desktop for Windows**

```bash
# cAdvisor needs --privileged mode (already configured)
# If it still fails, try:
docker run --privileged gcr.io/cadvisor/cadvisor:latest
```

### **Problem: Prometheus cannot connect to cAdvisor**

```bash
# Check the Prometheus targets page in a browser:
#   http://localhost:9090/targets

# If a target shows "DOWN", check the container logs:
docker logs prometheus
```
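
The same target information is available as JSON from `/api/v1/targets`, which makes it easy to check from a script. The helper below is a rough sketch: `count_down_targets` is a made-up name, and the pattern assumes Prometheus's compact JSON output with no space after the colon in `"health":"down"`.

```bash
# Count scrape targets whose health is "down".
# Reads the JSON body of /api/v1/targets on stdin and prints a number.
count_down_targets() {
  grep -o '"health":"down"' | wc -l | tr -d ' '
}
```

A typical call would be `curl -fsS http://localhost:9090/api/v1/targets | count_down_targets`; any result other than `0` means at least one target is failing to scrape.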

### **Problem: Grafana shows no metrics**

```bash
# 1. Verify the Prometheus datasource in the Grafana UI:
#    http://localhost:3000 → Configuration → Data Sources

# 2. Test that Prometheus answers queries:
curl 'http://localhost:9090/api/v1/query?query=up'
```

---

## 📈 METRICS EXPLANATION

### **cAdvisor Metrics:**
```
container_cpu_usage_seconds_total      = Cumulative CPU time used (seconds)
container_memory_usage_bytes           = Current memory usage (bytes)
container_network_receive_bytes_total  = Cumulative bytes received
container_network_transmit_bytes_total = Cumulative bytes transmitted
```

### **Node Exporter Metrics:**
```
node_cpu_seconds_total            = Cumulative CPU time per mode (seconds)
node_memory_MemTotal_bytes        = Total memory
node_memory_MemAvailable_bytes    = Available memory
node_disk_reads_completed_total   = Disk reads completed
node_disk_writes_completed_total  = Disk writes completed
node_network_receive_bytes_total  = Cumulative bytes received
```

---

## 🎯 BEST PRACTICES

1. **Monitor CPU & RAM** → Find bottlenecks
2. **Monitor Network** → Check bandwidth usage
3. **Monitor Disk I/O** → Check database performance
4. **Set Resource Limits** → Avoid OOM kills
5. **Retention Policy** → Prometheus keeps 30 days of data

---

## 🚀 NEXT STEPS

1. ✅ Start monitoring stack
2. ✅ Run Locust test
3. ✅ Monitor in real time from Grafana
4. ✅ Analyze results
5. ✅ Optimize the app based on metrics

Happy monitoring! 📊🎯
