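A Step-by-Step Guide to Monitoring Milvus Deployed with Docker Compose
Background
Milvus Standalone is a single-machine server deployment that packages all components into one Docker image, which makes it very easy to deploy. For medium-sized datasets, running Milvus Standalone on a single machine with enough memory is a good choice. In addition, Milvus Standalone supports high availability through master-slave replication.
Milvus also has built-in support for Prometheus to collect metrics and for Grafana to visualize them and create alerts. However, the documentation only lists the steps for deploying the monitoring services on Kubernetes (https://milvus.io/docs/zh/monitor.md). In fact, Prometheus and Grafana can also be deployed alongside Milvus Standalone to monitor the Milvus service.
Deploying monitoring for Milvus Standalone with Docker Compose
First, let's look at the complete Docker Compose file: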
    services:
      etcd:
        container_name: milvus-etcd
        image: quay.io/coreos/etcd:v3.5.5
        environment:
          - ETCD_AUTO_COMPACTION_MODE=revision
          - ETCD_AUTO_COMPACTION_RETENTION=1000
          - ETCD_QUOTA_BACKEND_BYTES=4294967296
          - ETCD_SNAPSHOT_COUNT=50000
        volumes:
          - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/etcd:/etcd
        command: etcd -advertise-client-urls=http://127.0.0.1:2379 -listen-client-urls http://0.0.0.0:2379 --data-dir /etcd
        healthcheck:
          test: ["CMD", "etcdctl", "endpoint", "health"]
          interval: 30s
          timeout: 20s
          retries: 3

      minio:
        container_name: milvus-minio
        image: minio/minio:RELEASE.2023-03-20T20-16-18Z
        environment:
          MINIO_ACCESS_KEY: minioadmin
          MINIO_SECRET_KEY: minioadmin
        ports:
          - "9001:9001"
          - "9000:9000"
        volumes:
          - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/minio:/minio_data
        command: minio server /minio_data --console-address ":9001"
        healthcheck:
          test: ["CMD", "curl", "-f", "http://localhost:9000/minio/health/live"]
          interval: 30s
          timeout: 20s
          retries: 3

      standalone:
        container_name: milvus-standalone
        image: milvusdb/milvus:v2.4.11
        command: ["milvus", "run", "standalone"]
        security_opt:
          - seccomp:unconfined
        environment:
          ETCD_ENDPOINTS: etcd:2379
          MINIO_ADDRESS: minio:9000
        volumes:
          - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/milvus:/var/lib/milvus
          - ./milvus.yaml:/milvus/configs/milvus.yaml
        healthcheck:
          test: ["CMD", "curl", "-f", "http://localhost:9091/healthz"]
          interval: 30s
          start_period: 90s
          timeout: 20s
          retries: 3
        ports:
          - "19530:19530"
          - "9091:9091"
        depends_on:
          - "etcd"
          - "minio"

      prometheus:
        image: prom/prometheus
        container_name: prometheus
        user: root
        command:
          - '--config.file=/etc/prometheus/prometheus.yml'
        ports:
          - 9090:9090
        restart: unless-stopped
        volumes:
          - ./prometheus:/etc/prometheus
          - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/prometheus:/prometheus

      grafana:
        image: grafana/grafana
        container_name: grafana
        user: root
        ports:
          - 3000:3000
        restart: unless-stopped
        environment:
          - GF_SECURITY_ADMIN_USER=admin
          - GF_SECURITY_ADMIN_PASSWORD=grafana
        volumes:
          - ./grafana/datasource.yml:/etc/grafana/provisioning/datasources/datasource.yml
          - ./grafana/dashboard.yml:/etc/grafana/provisioning/dashboards/main.yml
          - ./grafana/dashboards:/var/lib/grafana/dashboards
          - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/grafana:/var/lib/grafana

    networks:
      default:
        name: milvus
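The short-form depends_on list in this file only controls start order; it does not wait for etcd or MinIO to pass their health checks. As an optional variation (an assumption on our part, not something the original template does), the long form of depends_on defined by the Compose specification can make Milvus wait until both dependencies report healthy. A minimal sketch of the fragment to merge into the standalone service:

```yaml
# Optional variation: replace the short-form depends_on list of the
# standalone service with the long form, so that Milvus starts only after
# the healthcheck blocks of etcd and minio have succeeded.
services:
  standalone:
    depends_on:
      etcd:
        condition: service_healthy
      minio:
        condition: service_healthy
```

This relies on the healthcheck sections already present for etcd and minio in the file above.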
Deploying Prometheus
Milvus exports the metrics of each Milvus component for Prometheus at http://<component-host>:9091/metrics, so we point Prometheus at this address in the scrape_configs section of its configuration:
    scrape_configs:
      # Scrape the metrics endpoint exposed by the Milvus Standalone container
      - job_name: 'milvus-standalone'
        honor_labels: true
        metrics_path: /metrics
        static_configs:
          - targets: ['standalone:9091']
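For context, here is a sketch of what a complete configuration file might look like. Because the Compose file mounts ./prometheus to /etc/prometheus and starts Prometheus with --config.file=/etc/prometheus/prometheus.yml, the file would live at ./prometheus/prometheus.yml on the host. The global block and its 15s intervals are illustrative assumptions, not values taken from the original template.

```yaml
# ./prometheus/prometheus.yml (sketch; the interval values are assumptions)
global:
  scrape_interval: 15s      # how often Prometheus scrapes each target
  evaluation_interval: 15s  # how often rules are evaluated

scrape_configs:
  - job_name: 'milvus-standalone'
    honor_labels: true
    metrics_path: /metrics
    static_configs:
      # "standalone" is the Compose service name, which resolves on the
      # shared "milvus" network; 9091 is the Milvus metrics port.
      - targets: ['standalone:9091']
```

Once the stack is running, the Status > Targets page of the Prometheus UI on port 9090 should show whether the milvus-standalone target is being scraped.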
At the same time, add the Prometheus service to the Docker Compose file:
      prometheus:
        image: prom/prometheus
        container_name: prometheus
        user: root
        command:
          - '--config.file=/etc/prometheus/prometheus.yml'
        ports:
          - 9090:9090
        restart: unless-stopped
        volumes:
          - ./prometheus:/etc/prometheus
          - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/prometheus:/prometheus
Deploying Grafana
In the Prometheus deployment above, we exposed Prometheus on port 9090, so we define the Prometheus data source in Grafana accordingly:
    datasources:
      - name: Prometheus
        type: prometheus
        url: http://prometheus:9090
        isDefault: true
        access: proxy
        editable: true
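Grafana's provisioning files normally begin with an apiVersion field, which the excerpt above does not show. Assuming the excerpt is the body of ./grafana/datasource.yml (the path mounted in the Compose file), the complete file would plausibly look like this sketch:

```yaml
# ./grafana/datasource.yml (sketch; the apiVersion line is our assumption,
# based on Grafana's provisioning file format)
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    # "prometheus" is the Compose service name on the shared network
    url: http://prometheus:9090
    isDefault: true
    access: proxy
    editable: true
```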
A ready-made Milvus Standalone monitoring dashboard is also available at https://github.com/milvus-io/milvus-docs/blob/v2.5.x/assets/standalone-monitoring/grafana/dashboards/milvus-standalone-dashboard.json
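For Grafana to load that JSON file at startup, it needs a dashboard provider definition. The Compose file mounts ./grafana/dashboard.yml for this purpose but the article does not show its contents, so the following is only a sketch of what such a file could contain. The provider name and the update interval are hypothetical; the path matches the /var/lib/grafana/dashboards mount from the Compose file.

```yaml
# ./grafana/dashboard.yml (sketch; provider name and interval are hypothetical)
apiVersion: 1

providers:
  - name: milvus-dashboards
    type: file                  # load dashboards from JSON files on disk
    disableDeletion: false
    updateIntervalSeconds: 30   # how often Grafana rescans the directory
    options:
      # Directory where ./grafana/dashboards is mounted in the container
      path: /var/lib/grafana/dashboards
```

With this in place, saving milvus-standalone-dashboard.json under ./grafana/dashboards on the host should make the dashboard appear in Grafana without a manual import.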
Likewise, we need to add the Grafana service to the Docker Compose file:
      grafana:
        image: grafana/grafana
        container_name: grafana
        user: root
        ports:
          - 3000:3000
        restart: unless-stopped
        environment:
          - GF_SECURITY_ADMIN_USER=admin
          - GF_SECURITY_ADMIN_PASSWORD=grafana
        volumes:
          - ./grafana/datasource.yml:/etc/grafana/provisioning/datasources/datasource.yml
          - ./grafana/dashboard.yml:/etc/grafana/provisioning/dashboards/main.yml
          - ./grafana/dashboards:/var/lib/grafana/dashboards
          - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/grafana:/var/lib/grafana
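At this point, you can open the Grafana UI at http://<your-host>:3000 and log in with the admin user and password set in the Compose file.
[Figure: Grafana.png]
Then open the Milvus Standalone monitoring dashboard:
[Figure: Milvus Standalone.png]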
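The complete Docker Compose file and the related configuration files are available at https://github.com/milvus-io/milvus-docs/tree/v2.5.x/assets/standalone-monitoring. Note that this template uses Milvus 2.4.11 as an example; to run a different Milvus version, change the tag of the Milvus Docker image accordingly.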
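One way to make the version easier to change is Compose variable interpolation, the same mechanism the template already uses for DOCKER_VOLUME_DIRECTORY. In the sketch below, MILVUS_VERSION is a hypothetical variable name that we introduce for illustration; it is not part of the original template.

```yaml
# Sketch: take the Milvus image tag from an environment variable.
# MILVUS_VERSION is a hypothetical name; v2.4.11 remains the default
# when the variable is not set.
services:
  standalone:
    image: milvusdb/milvus:${MILVUS_VERSION:-v2.4.11}
```

The variable can then be set in the shell or in a .env file next to the Compose file before starting the stack.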
Summary
This article showed how to add Prometheus and Grafana to a Milvus Standalone service deployed with Docker Compose, which makes it straightforward to monitor Milvus Standalone.