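Monitoring Metrics¶
GOST exposes its internal monitoring data as Prometheus metrics.
Enabling the Service¶
The metrics service can run in two modes: as a global service or as a regular service.
When it runs as a global service, reloading the configuration through the web API does not affect it.
Global Service¶
Define the metrics service with the -metrics command-line option or the metrics object in the configuration file.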
services:
- name: service-0
  addr: ":8080"
  handler:
    type: auto
  listener:
    type: tcp

metrics:
  addr: :9000
  # unix domain socket
  # addr: unix:///var/run/gost.sock
  path: /metrics
  auth:
    username: user
    password: pass
  auther: auther-0
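metrics.addr (string) - Address of the metrics HTTP API service
metrics.path (string, default=/metrics) - API path
Regular Service¶
When the metrics service runs as a regular service, it can use every feature that services support.
Authentication¶
Authentication uses HTTP Basic Auth.
Set the credentials in the configuration file with the auth or auther option. If auther is set, auth is ignored.
Once the service is enabled, the metrics are available at http://localhost:9000/metrics.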
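To check the endpoint from a script rather than a browser, the request needs an HTTP Basic Auth header whenever auth or auther is configured. The following is a minimal sketch using only the Python standard library; the function names build_metrics_request and fetch_metrics are hypothetical helpers, and the URL and the user/pass credentials are simply the values from the configuration example above.

```python
import base64
import urllib.request


def build_metrics_request(url, username=None, password=None):
    """Build a GET request for the metrics endpoint.

    An Authorization header is added only when both credentials are given.
    """
    request = urllib.request.Request(url, method="GET")
    if username is not None and password is not None:
        raw = f"{username}:{password}".encode("utf-8")
        token = base64.b64encode(raw).decode("ascii")
        request.add_header("Authorization", f"Basic {token}")
    return request


def fetch_metrics(url, username=None, password=None, timeout=5.0):
    """Return the metrics page as text.

    Raises urllib.error.URLError (or HTTPError, e.g. 401) on failure.
    """
    request = build_metrics_request(url, username, password)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the example configuration, fetch_metrics("http://localhost:9000/metrics", "user", "pass") should return the plain-text metrics page, assuming GOST is running locally with that configuration.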
Example metrics
gost_chain_errors_total{chain="chain-0",host="host-0"} 1
gost_service_handler_errors_total{host="host-0",service="service-0"} 1
gost_service_request_duration_seconds_bucket{host="host-0",service="service-0",le="0.005"} 0
gost_service_request_duration_seconds_bucket{host="host-0",service="service-0",le="0.01"} 0
gost_service_request_duration_seconds_bucket{host="host-0",service="service-0",le="0.025"} 0
gost_service_request_duration_seconds_bucket{host="host-0",service="service-0",le="0.05"} 0
gost_service_request_duration_seconds_bucket{host="host-0",service="service-0",le="0.1"} 0
gost_service_request_duration_seconds_bucket{host="host-0",service="service-0",le="0.25"} 1
gost_service_request_duration_seconds_bucket{host="host-0",service="service-0",le="0.5"} 1
gost_service_request_duration_seconds_bucket{host="host-0",service="service-0",le="1"} 1
gost_service_request_duration_seconds_bucket{host="host-0",service="service-0",le="2.5"} 1
gost_service_request_duration_seconds_bucket{host="host-0",service="service-0",le="5"} 1
gost_service_request_duration_seconds_bucket{host="host-0",service="service-0",le="10"} 1
gost_service_request_duration_seconds_bucket{host="host-0",service="service-0",le="15"} 1
gost_service_request_duration_seconds_bucket{host="host-0",service="service-0",le="30"} 2
gost_service_request_duration_seconds_bucket{host="host-0",service="service-0",le="60"} 2
gost_service_request_duration_seconds_bucket{host="host-0",service="service-0",le="+Inf"} 2
gost_service_request_duration_seconds_sum{host="host-0",service="service-0"} 15.172895206
gost_service_request_duration_seconds_count{host="host-0",service="service-0"} 2
gost_service_requests_in_flight{host="host-0",service="service-0"} 0
gost_service_requests_total{host="host-0",service="service-0"} 2
gost_service_transfer_input_bytes_total{host="host-0",service="service-0"} 1018
gost_service_transfer_output_bytes_total{host="host-0",service="service-0"} 7327
gost_services{host="host-0"} 1
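Each line above follows the Prometheus text exposition format: a metric name, an optional set of labels in braces, and a numeric value. As an illustration, the sketch below parses such lines into (name, labels, value) tuples. It is a simplified parser written for this page, not part of GOST: it does not handle the optional trailing timestamp or label values containing "}", and parse_sample and parse_metrics are hypothetical names.

```python
import re

# name{label="value",...} value   (labels are optional)
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>[^}]*)\})?'
    r'\s+(?P<value>\S+)$'
)
# One label pair such as host="host-0"; escaped characters stay escaped.
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_sample(line):
    """Parse one sample line into (name, labels_dict, float_value)."""
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a sample line: {line!r}")
    labels = dict(LABEL_RE.findall(match.group("labels") or ""))
    return match.group("name"), labels, float(match.group("value"))


def parse_metrics(text):
    """Parse a metrics page, skipping blank lines and '#' comment lines."""
    samples = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        samples.append(parse_sample(stripped))
    return samples
```

For production use, an established client library is a better choice than a hand-written regular expression; the sketch only shows how the lines are structured.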
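Metric Descriptions¶
gost_services (type=gauge) - Number of running services
gost_service_requests_total (type=counter) - Total number of requests handled by the service
gost_service_transfer_input_bytes_total (type=counter) - Number of bytes received by the service
gost_service_transfer_output_bytes_total (type=counter) - Number of bytes sent by the service
gost_service_requests_in_flight (type=gauge) - Number of requests the service is currently processing
gost_service_request_duration_seconds_* (type=histogram) - Distribution of request processing durations
gost_service_handler_errors_total (type=counter) - Number of requests the service failed to handle
gost_chain_errors_total (type=counter) - Number of connection failures when establishing the forwarding chain itself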
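The histogram series are cumulative: each le bucket counts every request that took at most that many seconds, so the count for one interval is the difference between neighbouring buckets, and the mean duration is _sum divided by _count. The sketch below works this out for the values in the example output; the helper names are hypothetical and the numbers are copied from that example.

```python
# Cumulative bucket counts from the example output: (le upper bound, count).
BUCKETS = [
    (0.005, 0), (0.01, 0), (0.025, 0), (0.05, 0), (0.1, 0),
    (0.25, 1), (0.5, 1), (1.0, 1), (2.5, 1), (5.0, 1),
    (10.0, 1), (15.0, 1), (30.0, 2), (60.0, 2), (float("inf"), 2),
]
DURATION_SUM = 15.172895206  # gost_service_request_duration_seconds_sum
DURATION_COUNT = 2           # gost_service_request_duration_seconds_count


def per_bucket_counts(cumulative):
    """Convert cumulative counts into counts per (lower, upper] interval."""
    result = []
    lower, previous = 0.0, 0
    for upper, count in cumulative:
        result.append(((lower, upper), count - previous))
        lower, previous = upper, count
    return result


def mean_duration(total_seconds, count):
    """Average request duration in seconds, or None if nothing was observed."""
    return total_seconds / count if count else None


# Keep only the intervals that actually contain requests.
occupied = [(interval, n) for interval, n in per_bucket_counts(BUCKETS) if n > 0]
average = mean_duration(DURATION_SUM, DURATION_COUNT)
```

For the example data, one request falls in the interval from 0.1 s to 0.25 s and the other in the interval from 15 s to 30 s, and the mean duration is about 7.59 s.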
Prometheus¶
Add a job to scrape_configs in the Prometheus configuration file prometheus.yaml.
global:
  scrape_interval: 15s

# A list of scrape configurations.
scrape_configs:
  - job_name: 'gost'
    scrape_interval: 5s
    static_configs:
      - targets: ['127.0.0.1:9000']
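Once Prometheus is scraping the job, the collected series can be read back through the Prometheus HTTP API endpoint /api/v1/query. The sketch below only builds the query URL. It assumes Prometheus itself listens on http://127.0.0.1:9090 (its default port, not something this page specifies), and both build_query_url and the example PromQL expression are illustrative.

```python
import urllib.parse

# Per-second request rate of service-0 over the last minute (example PromQL).
REQUEST_RATE = 'rate(gost_service_requests_total{service="service-0"}[1m])'


def build_query_url(prometheus_base, promql):
    """Return the instant-query URL for a PromQL expression.

    prometheus_base is the address of the Prometheus server itself,
    not the GOST metrics address that Prometheus scrapes.
    """
    query = urllib.parse.urlencode({"query": promql})
    return f"{prometheus_base.rstrip('/')}/api/v1/query?{query}"


# Assumed Prometheus address; adjust to the actual deployment.
url = build_query_url("http://127.0.0.1:9090", REQUEST_RATE)
```

Sending a GET request to the resulting URL should return a JSON document with the current value of the expression, provided the Prometheus server is reachable at the assumed address.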
Grafana Dashboard¶
You can use the following dashboard to visualize the metrics:
https://grafana.com/grafana/dashboards/16037