A Step-by-Step Guide to Monitoring Milvus Deployed with Docker Compose

2025-01-08

By 臧伟

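Background

Milvus Standalone is a single-machine server deployment in which all components are packaged into one Docker image, which makes it very easy to deploy. For medium-sized datasets, running Milvus Standalone on a single machine with enough memory is a good choice. In addition, Milvus Standalone supports high availability through master-slave replication.

Milvus also natively supports Prometheus for collecting metrics and Grafana for visualizing metrics and creating alerts. However, the documentation only lists the steps for deploying the monitoring services on Kubernetes (https://milvus.io/docs/zh/monitor.md). In fact, Prometheus and Grafana can also be deployed alongside Milvus Standalone to monitor the Milvus service.

Deploying monitoring for Milvus Standalone with Docker Compose

First, let's look at the complete Docker Compose file: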

services:
  etcd:
    container_name: milvus-etcd
    image: quay.io/coreos/etcd:v3.5.5
    environment:
      - ETCD_AUTO_COMPACTION_MODE=revision
      - ETCD_AUTO_COMPACTION_RETENTION=1000
      - ETCD_QUOTA_BACKEND_BYTES=4294967296
      - ETCD_SNAPSHOT_COUNT=50000
    volumes:
      - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/etcd:/etcd
    command: etcd -advertise-client-urls=http://127.0.0.1:2379 -listen-client-urls http://0.0.0.0:2379 --data-dir /etcd
    healthcheck:
      test: ["CMD", "etcdctl", "endpoint", "health"]
      interval: 30s
      timeout: 20s
      retries: 3

  minio:
    container_name: milvus-minio
    image: minio/minio:RELEASE.2023-03-20T20-16-18Z
    environment:
      MINIO_ACCESS_KEY: minioadmin
      MINIO_SECRET_KEY: minioadmin
    ports:
      - "9001:9001"
      - "9000:9000"
    volumes:
      - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/minio:/minio_data
    command: minio server /minio_data --console-address ":9001"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9000/minio/health/live"]
      interval: 30s
      timeout: 20s
      retries: 3

  standalone:
    container_name: milvus-standalone
    image: milvusdb/milvus:v2.4.11
    command: ["milvus", "run", "standalone"]
    security_opt:
    - seccomp:unconfined
    environment:
      ETCD_ENDPOINTS: etcd:2379
      MINIO_ADDRESS: minio:9000
    volumes:
      - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/milvus:/var/lib/milvus
      - ./milvus.yaml:/milvus/configs/milvus.yaml 
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9091/healthz"]
      interval: 30s
      start_period: 90s
      timeout: 20s
      retries: 3
    ports:
      - "19530:19530"
      - "9091:9091"
    depends_on:
      - "etcd"
      - "minio"

  prometheus:
    image: prom/prometheus
    container_name: prometheus
    user: root
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
    ports:
      - 9090:9090
    restart: unless-stopped
    volumes:
      - ./prometheus:/etc/prometheus
      - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/prometheus:/prometheus

  grafana:
    image: grafana/grafana
    container_name: grafana
    user: root
    ports:
      - 3000:3000
    restart: unless-stopped
    environment:
      - GF_SECURITY_ADMIN_USER=admin
      - GF_SECURITY_ADMIN_PASSWORD=grafana
    volumes:
      - ./grafana/datasource.yml:/etc/grafana/provisioning/datasources/datasource.yml
      - ./grafana/dashboard.yml:/etc/grafana/provisioning/dashboards/main.yml
      - ./grafana/dashboards:/var/lib/grafana/dashboards
      - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/grafana:/var/lib/grafana

networks:
  default:
    name: milvus

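Deploying Prometheus

Milvus exports the metrics of each Milvus component for Prometheus at http://<component-host>:9091/metrics. Inside the Compose network the Milvus container is reachable by its service name, standalone, so we set this address in the scrape_configs section of the Prometheus configuration: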

scrape_configs:
  # Scrape the metrics endpoint exposed by the Milvus Standalone container on port 9091
  - job_name: 'milvus-standalone'
    honor_labels: true
    metrics_path: /metrics
    static_configs:
    - targets: ['standalone:9091']
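
The fragment above shows only the scrape_configs section. Because the Compose file mounts ./prometheus to /etc/prometheus and starts Prometheus with --config.file=/etc/prometheus/prometheus.yml, the configuration lives in ./prometheus/prometheus.yml. A minimal sketch of a complete file might look like the following; the global block and its 15s intervals are assumptions added for illustration and are not taken from the original template.

```yaml
# ./prometheus/prometheus.yml (sketch; the interval values are assumptions)
global:
  scrape_interval: 15s       # how often targets are scraped
  evaluation_interval: 15s   # how often rules are evaluated

scrape_configs:
  - job_name: 'milvus-standalone'
    honor_labels: true
    metrics_path: /metrics
    static_configs:
      # "standalone" is the Compose service name of the Milvus container
      - targets: ['standalone:9091']
```

If the global block is omitted, Prometheus falls back to its built-in default intervals.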

Then add the Prometheus service to the Docker Compose file:

  prometheus:
    image: prom/prometheus
    container_name: prometheus
    user: root
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
    ports:
      - 9090:9090
    restart: unless-stopped
    volumes:
      - ./prometheus:/etc/prometheus
      - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/prometheus:/prometheus
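
The standalone service already defines a healthcheck, so one optional refinement is to start Prometheus only after Milvus reports healthy. The fragment below is a sketch of that idea and is not part of the original template; it would be merged into the prometheus service definition shown above.

```yaml
services:
  prometheus:
    # Optional (assumption, not in the original template):
    # wait until the Milvus healthcheck passes before starting Prometheus
    depends_on:
      standalone:
        condition: service_healthy
```

This is not strictly required: without it, Prometheus simply marks the target as down until Milvus begins serving /metrics.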

Deploying Grafana

In the Prometheus deployment above, we exposed Prometheus on port 9090, so we define the Prometheus data source in Grafana accordingly:

datasources:
- name: Prometheus
  type: prometheus
  url: http://prometheus:9090 
  isDefault: true
  access: proxy
  editable: true

A Milvus Standalone monitoring dashboard is also available at https://github.com/milvus-io/milvus-docs/blob/v2.5.x/assets/standalone-monitoring/grafana/dashboards/milvus-standalone-dashboard.json

Likewise, we need to add the Grafana service to the Docker Compose file:
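
The Compose file also mounts ./grafana/dashboard.yml as a dashboard provisioning file and ./grafana/dashboards to /var/lib/grafana/dashboards, but the article does not show the provisioning file itself. A minimal sketch of what it could contain is below; the provider name and the option values are assumptions, and the actual file in the linked repository may differ.

```yaml
# ./grafana/dashboard.yml (sketch; provider name and option values are assumptions)
apiVersion: 1

providers:
  - name: 'milvus'              # hypothetical provider name
    orgId: 1
    type: file                  # load dashboards from JSON files on disk
    disableDeletion: false
    updateIntervalSeconds: 30   # how often Grafana rescans the directory
    options:
      # must match the container path that ./grafana/dashboards is mounted to
      path: /var/lib/grafana/dashboards
```

With a provider like this, placing milvus-standalone-dashboard.json in ./grafana/dashboards should make the dashboard appear in Grafana without importing it by hand.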

  grafana:
    image: grafana/grafana
    container_name: grafana
    user: root
    ports:
      - 3000:3000
    restart: unless-stopped
    environment:
      - GF_SECURITY_ADMIN_USER=admin
      - GF_SECURITY_ADMIN_PASSWORD=grafana
    volumes:
      - ./grafana/datasource.yml:/etc/grafana/provisioning/datasources/datasource.yml
      - ./grafana/dashboard.yml:/etc/grafana/provisioning/dashboards/main.yml
      - ./grafana/dashboards:/var/lib/grafana/dashboards
      - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/grafana:/var/lib/grafana

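At this point, we can open the Grafana UI at http://<your-host>:3000 and log in with the admin user and password set in the Compose file.

[Image: Grafana.png]

Then open the Milvus Standalone monitoring dashboard:

[Image: Milvus Standalone.png]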

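The complete Docker Compose file and related files are available at https://github.com/milvus-io/milvus-docs/tree/v2.5.x/assets/standalone-monitoring . Note that this template uses Milvus 2.4.11 as an example; to use a different Milvus version, just change the Docker image tag accordingly.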
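
One way to make the version easier to change is to reuse the variable interpolation that the template already applies to DOCKER_VOLUME_DIRECTORY. In the sketch below, MILVUS_VERSION is a hypothetical variable name introduced for illustration; it does not appear in the original template.

```yaml
services:
  standalone:
    # MILVUS_VERSION is a hypothetical variable; when unset, the tag falls back to v2.4.11
    image: milvusdb/milvus:${MILVUS_VERSION:-v2.4.11}
```

The variable can then be set in the shell environment or in a .env file next to the Compose file, and the other services stay unchanged.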

Summary

This article showed how to add Prometheus and Grafana to a Milvus Standalone service deployed with Docker Compose, making it easy to monitor the service.
