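Editor's note:
This article starts from the basics of Prometheus, explores its powerful data-processing capabilities, shows how to use Prometheus for white-box and black-box monitoring, and covers approaches to running Prometheus at scale.
This article comes from Sohu (搜狐网) and was edited and recommended by Alice of Huolongguo Software (火龙果软件).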
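Prometheus was the second project, after Kubernetes, to formally join the CNCF, and it is the de facto standard monitoring solution in the container and cloud-native world. The final part of this article builds a complete Kubernetes monitoring architecture from scratch.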
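Goals of Monitoring
The book Site Reliability Engineering: How Google Runs Production Systems points out that a monitoring system needs to support both white-box and black-box monitoring effectively. White-box monitoring exposes the actual internal state of a system; by watching its metrics we can anticipate problems and address potential sources of uncertainty early. Black-box monitoring, typically HTTP or TCP probes, lets the right people be notified quickly when a system or service fails. A well-built monitoring system serves the following purposes:
Long-term trend analysis: continuously collecting and aggregating samples makes it possible to analyze metrics over long periods. For example, by looking at the growth rate of disk usage we can predict when resources will need to be expanded.
Comparative analysis: How does resource usage differ between two versions of a system? How do concurrency and load change under different capacities? Monitoring makes it easy to track and compare a system over time.
Alerting: when a failure occurs or is about to occur, the monitoring system must react quickly and notify the administrators, so that the problem can be handled promptly or prevented altogether, avoiding impact on the business.
Fault analysis and localization: once a problem occurs, it has to be investigated and handled. Analyzing different metrics together with historical data makes it possible to find and fix the root cause.
Data visualization: dashboards give direct access to intuitive information such as system state, resource usage, and service health.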
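Previous-generation monitoring systems, however, tend to run into the following problems in practice:
Monitoring detached from the business: the metrics collected are disconnected from the business itself. Customers may care about service availability and SLA levels, while the monitoring system can only raise alerts based on system load;
Hard to operate: systems in the Nagios family are themselves difficult to run. Installation, configuration, and administration require specialists, and the process is far from simple;
Poor scalability: the monitoring system itself is hard to scale as the monitored estate grows;
Hard to locate problems: when something goes wrong (say, host load rises abnormally), users are still looking at a black box. They cannot see what the services on the host are really doing, so the alerts do little to help them analyze and locate the root cause.
From these requirements we can extract the keywords of a complete monitoring solution: data analysis, trend prediction, alerting, fault localization, and visualization.
In addition, as more and more product companies move to the cloud or to containers, a monitoring solution needs one more keyword: cloud native.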
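Main Content
Today's session explains how Prometheus, a next-generation monitoring solution, addresses these problems, in the following parts:
Getting to know Prometheus
Making data talk: PromQL and visualization
Alertmanager and alert handling
White-box and black-box monitoring
Monitoring at scale
Monitoring a Kubernetes cluster from scratch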
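Getting to Know Prometheus
Prometheus was inspired by Google's Borgmon monitoring system (much as Kubernetes evolved from Google's Borg). Former Google engineers began developing it as open source software at SoundCloud in 2012, and early versions were released publicly in early 2015. In May 2016 it became the second project, after Kubernetes, to formally join the CNCF, and version 1.0 was released in July of the same year. Version 2.0, built on a completely new storage layer, was released at the end of 2017 and works better with container and cloud platforms.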
Download the latest Node Exporter binary from https://prometheus.io/download/ and run it directly:
$ node_exporter
INFO[0000] Starting node_exporter (version=0.15.2, branch=HEAD, revision=98bc64930d34878b84a0f87dfe6e1a6da61e532d) source="node_exporter.go:43"
INFO[0000] Enabled collectors: source="node_exporter.go:50"
INFO[0000] - time source="node_exporter.go:52"
INFO[0000] - meminfo source="node_exporter.go:52"
INFO[0000] - textfile source="node_exporter.go:52"
INFO[0000] - filesystem source="node_exporter.go:52"
INFO[0000] - netdev source="node_exporter.go:52"
INFO[0000] - cpu source="node_exporter.go:52"
INFO[0000] - diskstats source="node_exporter.go:52"
INFO[0000] - loadavg source="node_exporter.go:52"
INFO[0000] Listening on :9100 source="node_exporter.go:76"
Visit http://localhost:9100/metrics to see all the monitoring data that Node Exporter has collected from the current host, as shown below:

Each metric is preceded by a block of information like the following:
# HELP node_cpu Seconds the cpus spent in each mode.
# TYPE node_cpu counter
node_cpu{cpu="cpu0",mode="idle"} 362812.7890625
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 3.0703125
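A # HELP line describes the metric and a # TYPE line declares its type. Node Exporter returns the current host's monitoring samples identified by metric name and labels.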
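To make the line format concrete, here is a minimal Python sketch that splits one sample line into metric name, labels, and value. It is an illustration only, not how Prometheus itself parses the format: the function name parse_sample is hypothetical, and the sketch assumes simple lines without the optional trailing timestamp.

```python
import re

# Matches lines such as: node_cpu{cpu="cpu0",mode="idle"} 362812.7890625
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
# Matches one label pair such as: mode="idle"
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_sample(line):
    """Parse one sample line into (name, labels, value).

    Returns None for blank lines and for # HELP / # TYPE comment lines.
    Assumption: the optional trailing timestamp is not supported here.
    """
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    match = SAMPLE_RE.match(line)
    if match is None:
        raise ValueError(f"not a sample line: {line!r}")
    name, raw_labels, raw_value = match.groups()
    labels = dict(LABEL_RE.findall(raw_labels or ""))
    return name, labels, float(raw_value)
```

Calling parse_sample on the node_cpu line above yields the name node_cpu, the labels cpu and mode, and the float value; comment lines are skipped.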
Find the latest Prometheus Server package at https://prometheus.io/download/; the latest stable 2.x.x release is used here.
Create the configuration file prometheus.yml as follows:
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: 'node'
    static_configs:
      - targets: ['localhost:9100']
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
Then start Prometheus:
$ prometheus --config.file=prometheus.yml --storage.tsdb.path=/data/prometheus
......
level=info ts=2018-03-11T13:38:06.317645234Z caller=main.go:486 msg="Server is ready to receive web requests."
level=info ts=2018-03-11T13:38:06.317679086Z caller=manager.go:59 component="scrape manager" msg="Starting scrape manager..."
Visit http://localhost:9090 to open the Prometheus Server UI. Querying the metric name node_load1 returns the host load samples collected so far.

In the example above we used a Node Exporter instance to obtain the host's monitoring data. A running Node Exporter instance is called a Target. Prometheus periodically pulls samples from the Node Exporter instance and stores them in its time series database, which is implemented on local disk.

In real-world use, Exporters fall into two categories:
Standalone: like Node Exporter, these do not produce data themselves. They only fetch data from a data source and return it in a format Prometheus supports.
Integrated into the application: to make the internal state of a system easier to monitor, some open source projects such as Kubernetes and etcd build Prometheus support in directly, exposing their internal state through in-process instrumentation.
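As a rough illustration of the standalone kind, the following Python sketch serves one gauge over HTTP using only the standard library. It is a toy under stated assumptions, not the Node Exporter implementation: the metric name demo_load1, the port 8000, and the function names are all hypothetical, and os.getloadavg() is only available on Unix-like systems.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metrics(load1):
    """Format one gauge sample in the Prometheus text exposition format."""
    return (
        "# HELP demo_load1 1m load average (hypothetical demo metric).\n"
        "# TYPE demo_load1 gauge\n"
        f"demo_load1 {load1}\n"
    )


class MetricsHandler(BaseHTTPRequestHandler):
    """Answers GET /metrics with the current load average."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        # The exporter produces no data itself; it reads it from the OS.
        body = render_metrics(os.getloadavg()[0]).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, serving /metrics on the given (assumed) port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Running serve() and adding localhost:8000 as a target in a scrape job would let Prometheus pull this metric like any other.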
Making Data Talk: PromQL and Data Visualization
Understanding Time Series
Each line of monitoring data returned by Node Exporter's /metrics endpoint is called a sample in Prometheus. A collected sample consists of three parts:
Metric: the metric name together with a set of labels (the label set) describing the sample's characteristics, which uniquely identify the series;
Timestamp: a timestamp with millisecond precision, normally determined by the scrape time;
Value: a float64 value representing the current value of the sample.
Prometheus stores all collected samples as time series in an in-memory database and periodically persists them to disk. Each time series is named by its metric name and a label set. As shown below, a time series can be thought of as a two-dimensional matrix with time on the X axis:

This multi-dimensional data model opens up many possibilities. For example, if data comes from different data centers, we can add a label to the samples to tell them apart:
node_cpu{cpu="cpu0",mode="idle",dc="dc0"}
In terms of internal implementation, all monitoring samples stored in Prometheus are alike: each is a set of labels, a timestamp, and a value.
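The data model can be sketched as a toy in-memory store in Python. This is only an illustration of "metric name plus label set identifies a series"; the class TinyStore and its methods are hypothetical and have nothing to do with the real storage engine.

```python
from collections import defaultdict


def series_key(name, labels):
    """A series is identified by its metric name plus its full label set."""
    return (name, frozenset(labels.items()))


class TinyStore:
    """Toy store: one list of (timestamp_ms, value) samples per series."""

    def __init__(self):
        self.series = defaultdict(list)

    def append(self, name, labels, timestamp_ms, value):
        self.series[series_key(name, labels)].append((timestamp_ms, float(value)))

    def select(self, name, **matchers):
        """Return series with this name whose labels equal every matcher."""
        wanted = set(matchers.items())
        return {
            key: samples
            for key, samples in self.series.items()
            if key[0] == name and wanted <= key[1]
        }
```

Appending node_cpu samples with dc="dc0" and dc="dc1" creates two separate series, and select("node_cpu", dc="dc0") returns only the first.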
From a storage point of view all metrics are the same, but in different scenarios they differ subtly.
For example, among the samples returned by Node Exporter, node_load1 reflects the current system load, so its samples go up and down over time. node_cpu is different: it is an ever-increasing value because it reflects accumulated CPU time, and in theory it keeps growing for as long as the system stays up.
To help users understand and distinguish these differences, Prometheus defines four metric types: Counter, Gauge, Histogram, and Summary.
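Counter: a counter that only goes up
A Counter is a simple but powerful tool. For example, an application can record how many times certain events occur; storing this data as a time series makes it easy to see how the event rate changes. PromQL's built-in aggregation operators and functions let users analyze the data further.
For example, the rate() function returns the growth rate of HTTP requests:
rate(http_requests_total[5m])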
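To show the idea behind computing a rate from a counter, here is a simplified Python sketch. It is not the actual rate() implementation: the real function also extrapolates to the edges of the range, which this sketch deliberately omits, and the name simple_rate is hypothetical. A drop in value is treated as a counter reset (for example a process restart).

```python
def simple_rate(samples):
    """Per-second increase of a counter over (timestamp_s, value) samples.

    A decrease between two consecutive samples is treated as a counter
    reset, so the new value counts as increase since zero.
    Returns None when fewer than two samples or no time has elapsed.
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    for (_, previous), (_, current) in zip(samples, samples[1:]):
        increase += current - previous if current >= previous else current
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None
```

For samples (0, 100), (60, 160), (120, 220) the counter grew by 120 over 120 seconds, so the sketch returns 1.0 requests per second.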
Gauge: a gauge that can go up and down
Unlike a Counter, a Gauge reflects the current state of the system, so its samples can both increase and decrease. Common examples are node_memory_MemFree (the amount of memory currently free on the host) and node_memory_MemAvailable (the amount of available memory); both are Gauge metrics.
With a Gauge, users can read the current state of the system directly.
For Gauge metrics, the built-in PromQL function delta() returns how much a sample changed over a time range. For example, to compute the difference in CPU temperature over two hours:
delta(cpu_temp_celsius{host="zeus"}[2h])
You can also use deriv() to compute the per-second derivative of a series using simple linear regression, or use predict_linear() directly to predict the trend of the data. For example, to predict how much disk space will be left in 4 hours:
predict_linear(node_filesystem_free{job="node"}[1h], 4 * 3600)
Analyzing Data Distribution with Histogram and Summary
Most of the time people tend to look at the average of a quantity, such as average CPU usage or average page response time. The problem with this is obvious. Take the average response time of API calls: if most requests are answered within about 100ms but a few take 5s, the average is pulled well away from the median and no longer describes what most users experience. This phenomenon is known as the long-tail problem.
To tell whether everything is slow on average or only the long tail is slow, the simplest approach is to group requests by latency range, for example counting how many requests took 0 to 10ms and how many took 10 to 20ms. This makes it quick to see why a system is slow. Histogram and Summary exist to solve exactly this kind of problem: with metrics of these types we can quickly understand the distribution of the samples.
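The grouping described above can be sketched in a few lines of Python. The bucket boundaries and the function name bucket_counts are illustrative assumptions. Note one difference from Prometheus: Histogram buckets in Prometheus are cumulative, whereas each slot here counts only its own range.

```python
import bisect


def bucket_counts(latencies_ms, upper_bounds_ms):
    """Count observations per latency range.

    upper_bounds_ms must be sorted; each bound is inclusive.
    The returned list has one extra final slot for values above
    the largest bound (the long tail).
    """
    counts = [0] * (len(upper_bounds_ms) + 1)
    for latency in latencies_ms:
        # bisect_left finds the first bound that is >= latency.
        counts[bisect.bisect_left(upper_bounds_ms, latency)] += 1
    return counts
```

With latencies of 5, 10, 15, 95, and 5000 ms and bounds of 10, 20, and 100 ms, the single 5000 ms request lands in the overflow slot, while the mean (1025 ms) would suggest that every request is slow.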
For example, the metric prometheus_tsdb_wal_fsync_duration_seconds is of type Summary.
It records how long wal_fsync operations take in the Prometheus Server. The following samples can be obtained from the Prometheus Server's /metrics endpoint:
prometheus_tsdb_wal_fsync_duration_seconds{quantile="0.5"} 0.012352463
prometheus_tsdb_wal_fsync_duration_seconds{quantile="0.9"} 0.014458005
prometheus_tsdb_wal_fsync_duration_seconds{quantile="0.99"} 0.017316173
prometheus_tsdb_wal_fsync_duration_seconds_sum 2.888716127000002
prometheus_tsdb_wal_fsync_duration_seconds_count 216
From these samples we can tell that the Prometheus Server has performed wal_fsync 216 times, taking 2.888716127000002s in total. The median (quantile=0.5) is 0.012352463s and the 90th percentile (quantile=0.9) is 0.014458005s.

Prometheus's storage model means that different labels represent different dimensions. Users can query, filter, and aggregate samples along these dimensions.
For example, the expression node_load1 returns every time series named node_load1 in the time series database:

To find the time series that match certain dimensions, filter by label:

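With labels as the core dimensions, users can query and filter time series effectively. That alone would not be powerful enough, though. With the rich aggregation operators and built-in functions that Prometheus provides, PromQL can easily answer questions such as the following:
What is the current CPU usage of the system?
avg(irate(node_cpu{mode!="idle"}[2m])) without (cpu, mode)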

Which five hosts have the highest CPU usage?
topk(5, avg(irate(node_cpu{mode!="idle"}[2m])) without (cpu, mode))

Roughly what will disk space look like in 4 hours?
predict_linear(node_filesystem_free{job="node"}[2h], 4 * 3600)

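Here avg() and topk() are built-in PromQL aggregation operators, while irate() and predict_linear() are built-in functions. irate() computes the per-second instant rate of change of a time series, based on the last two samples in the given range. predict_linear() predicts the trend of the data using simple linear regression.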
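The regression idea behind predict_linear() can be sketched in plain Python. This is a simplified illustration, not the Prometheus implementation: the name predict_linear_sketch is hypothetical, and the sketch extrapolates from the timestamp of the last sample, which is an assumption made here for simplicity.

```python
def predict_linear_sketch(samples, seconds_ahead):
    """Least-squares fit of value = intercept + slope * t, then extrapolate.

    samples: list of (timestamp_s, value).
    Assumption: "now" is taken to be the timestamp of the last sample.
    Returns None if a line cannot be fitted.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    covariance = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    variance = sum((t - mean_t) ** 2 for t, _ in samples)
    if variance == 0:
        return None  # all samples share one timestamp
    slope = covariance / variance
    intercept = mean_v - slope * mean_t
    return intercept + slope * (samples[-1][0] + seconds_ahead)
```

If free disk space drops steadily from 1000 to 800 units over one hour, the fitted line reaches roughly zero four hours after the last sample, which is the kind of early warning the query above is meant to give.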
Take Grafana as an example: add Prometheus as a data source, then use PromQL to visualize the data. Grafana v5.1 provides full support for all four Prometheus metric types, which can be visualized with the Graph Panel, Singlestat Panel, and Heatmap Panel.
Visualizing host CPU usage over time with the Graph Panel:

Showing the current state with the Singlestat Panel:

Showing the data distribution with the Heatmap Panel:

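Prometheus provides powerful data query and processing capabilities through PromQL. External systems can use the HTTP API that Prometheus exposes to query samples with PromQL and build custom features such as data visualization on top. PromQL is the main interface to Prometheus functionality, both internally and externally.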
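As a hedged sketch of how an external system might use that API, the following Python code builds an instant-query URL for the /api/v1/query endpoint and turns the JSON response into a dictionary. The helper names are hypothetical, the server address is whatever your own deployment uses, and error handling is kept to a minimum.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(base_url, promql):
    """URL for an instant query against the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def parse_instant_vector(payload):
    """Map each returned series (as a sorted label tuple) to its float value."""
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error')}")
    result = {}
    for item in payload["data"]["result"]:
        labels = tuple(sorted(item["metric"].items()))
        # "value" is a [timestamp, "string value"] pair.
        result[labels] = float(item["value"][1])
    return result


def instant_query(base_url, promql):
    """Run the query over HTTP (requires a reachable Prometheus server)."""
    with urlopen(build_query_url(base_url, promql), timeout=10) as response:
        return parse_instant_vector(json.load(response))
```

For example, instant_query("http://localhost:9090", "node_load1") would return one entry per node_load1 series, assuming the server from the earlier example is running.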

The Alert Handling Center: Alertmanager
In the Prometheus architecture, alerting is split into two independent parts: alert generation and alert handling.
Alerting rules are defined in files. Prometheus periodically evaluates the PromQL expression in each rule to decide whether the trigger condition is met and, if so, generates an alert inside Prometheus.
Alerting rule files are written in YAML:
groups:
  - name: hostStatsAlert
    rules:
      - alert: hostCpuUsageAlert
        expr: sum(avg without (cpu)(irate(node_cpu{mode!='idle'}[5m]))) by (instance) > 0.85
        for: 1m
        labels:
          severity: page
        annotations:
          summary: "Instance {{ $labels.instance }} CPU usage high"
          description: "{{ $labels.instance }} CPU usage above 85% (current value: {{ $value }})"
This rule raises an alert when a host's CPU usage stays above 85% for at least one minute (the for clause). The alert state is shown in the Prometheus UI.
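The effect of the for clause can be illustrated with a small Python state machine. It is a simplified sketch, not Prometheus's rule evaluator: the function name alert_states is hypothetical, and the evaluation timestamps are made up for the example. The three states (inactive, pending, firing) are the ones Prometheus uses for alerts.

```python
def alert_states(evaluations, for_seconds):
    """State of one alert after each rule evaluation.

    evaluations: list of (timestamp_s, condition_is_true).
    The alert is "pending" while the condition has held for less than
    for_seconds, "firing" once it has held long enough, and falls back
    to "inactive" as soon as the condition is false.
    """
    states = []
    active_since = None
    for timestamp, condition_is_true in evaluations:
        if not condition_is_true:
            active_since = None
            states.append("inactive")
            continue
        if active_since is None:
            active_since = timestamp
        held_for = timestamp - active_since
        states.append("firing" if held_for >= for_seconds else "pending")
    return states
```

With for_seconds=60, a condition that first becomes true at t=15 is pending at t=15 and t=30, fires at t=75, and returns to inactive as soon as an evaluation finds it false.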

So far, Prometheus triggers alerts internally by periodically evaluating the alerting rule files.

Subsequent handling of alerts is done by Alertmanager. As a standalone component, it receives and processes alerts from Prometheus Server (or from other client programs). It can process these alerts further, for example deduplicating them, grouping them, and routing them to the correct receiver. It has built-in support for notification channels such as email and Slack, and it also supports webhooks, which opens up further integrations, for example with DingTalk (钉钉) or WeCom (企业微信). Alertmanager also provides silencing and inhibition mechanisms to tune notification behavior.
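Deduplication and grouping can be pictured with the following Python sketch. It only illustrates the concept; it is not how Alertmanager is implemented, the function name group_alerts is hypothetical, and alerts are modeled simply as dictionaries of labels.

```python
from collections import defaultdict


def group_alerts(alerts, group_by):
    """Drop alerts with identical label sets, then bucket by group_by labels.

    alerts: list of label dictionaries.
    group_by: list of label names that define a notification group.
    Returns {group key tuple: [alert labels, ...]}.
    """
    seen = set()
    groups = defaultdict(list)
    for labels in alerts:
        fingerprint = frozenset(labels.items())
        if fingerprint in seen:
            continue  # duplicate of an alert already collected
        seen.add(fingerprint)
        key = tuple(labels.get(name, "") for name in group_by)
        groups[key].append(labels)
    return dict(groups)
```

Grouping by alertname means that CPU alerts from many instances end up in one group, so a receiver would get one notification for the group instead of one per host.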

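As a complete open source monitoring solution, Prometheus overturns the check-alert model of traditional monitoring systems and replaces it with a new model based on centralized rule evaluation, unified analysis, and alerting.
Black-Box Monitoring with Blackbox Exporter
The previous parts mainly covered Node Exporter. Exporters of this kind monitor the internal state of services or infrastructure, which is white-box monitoring. Watching these metrics lets us anticipate problems and address potential sources of uncertainty.
From the viewpoint of a complete monitoring strategy, besides extensive white-box monitoring, a suitable amount of black-box monitoring should be added. Black-box monitoring tests the externally visible behavior of a service from the user's perspective. Common forms include HTTP and TCP probes, used to check whether a site or service is reachable and how fast it responds.
The biggest difference between the two is that black-box monitoring is failure-oriented: when a failure occurs, it detects it quickly, whereas white-box monitoring focuses on proactively discovering or predicting potential problems. A complete monitoring setup should find potential problems from the white-box side and quickly detect problems that have already occurred from the black-box side.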

By analogy with the well-known agile testing pyramid, complete monitoring needs a large amount of white-box monitoring to observe the internal state of services and so support effective fault analysis.
It also needs some black-box monitoring to detect whether key services have failed.
Blackbox Exporter is the official black-box monitoring solution provided by the Prometheus community. It lets users probe the network over HTTP, HTTPS, DNS, TCP, and ICMP. Users can fetch the Blackbox Exporter source with the go get command and build a local executable.
When Blackbox Exporter runs, it needs a probe configuration file, for example blackbox.yml:
modules:
  http_2xx:
    prober: http
    http:
      method: GET
  http_post_2xx:
    prober: http
    http:
      method: POST
Starting blackbox_exporter starts a probe service:
blackbox_exporter --config.file=/etc/prometheus/blackbox.yml
Once it is running, visiting http://127.0.0.1:9115/probe?module=http_2xx&target=baidu.com returns the result of Blackbox Exporter probing the baidu.com site:
probe_http_duration_seconds{phase="connect"} 0.055551141
probe_http_duration_seconds{phase="processing"} 0.049736019
probe_http_duration_seconds{phase="resolve"} 0.011633673
probe_http_duration_seconds{phase="tls"} 0
probe_http_duration_seconds{phase="transfer"} 3.8919e-05
# HELP probe_http_redirects The number of redirects
# TYPE probe_http_redirects gauge
probe_http_redirects 0
# HELP probe_http_ssl Indicates if SSL was used for the final redirect
# TYPE probe_http_ssl gauge
probe_http_ssl 0
# HELP probe_http_status_code Response HTTP status code
# TYPE probe_http_status_code gauge
probe_http_status_code 200
# HELP probe_http_version Returns the version of HTTP of the probe response
# TYPE probe_http_version gauge
probe_http_version 1.1
# HELP probe_ip_protocol Specifies whether probe ip protocol is IP4 or IP6
# TYPE probe_ip_protocol gauge
probe_ip_protocol 4
# HELP probe_success Displays whether or not the probe was a success
# TYPE probe_success gauge
probe_success 1
In Prometheus, adding a corresponding scrape job is all it takes to collect the probe results for each site as samples:
- job_name: 'blackbox'
  metrics_path: /probe
  params:
    module: [http_2xx]
  static_configs:
    - targets:
        - http://prometheus.io    # Target to probe with http.
        - https://prometheus.io   # Target to probe with https.
        - http://example.com:8080 # Target to probe with http on port 8080.
  relabel_configs:
    - source_labels: [__address__]
      target_label: __param_target
    - source_labels: [__param_target]
      target_label: instance
    - target_label: __address__
      replacement: 127.0.0.1:9115
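The three relabel_configs rules in the job above can be traced step by step with a small Python simulation. This is only a walk-through of what those particular rules do to one target's labels; it is not a general relabeling engine, and the function name relabel_for_blackbox is hypothetical.

```python
def relabel_for_blackbox(target, exporter_address="127.0.0.1:9115"):
    """Apply the three relabel rules from the job above to one target.

    Returns the label set Prometheus would end up with for this target.
    """
    # Before relabeling, __address__ holds the configured target.
    labels = {"__address__": target}
    # Rule 1: copy __address__ into the ?target= URL parameter.
    labels["__param_target"] = labels["__address__"]
    # Rule 2: use the probed site as the instance label on stored samples.
    labels["instance"] = labels["__param_target"]
    # Rule 3: actually scrape the Blackbox Exporter, not the site itself.
    labels["__address__"] = exporter_address
    return labels
```

For the target http://prometheus.io, Prometheus ends up scraping 127.0.0.1:9115/probe with target=http://prometheus.io, while the stored samples still carry instance="http://prometheus.io".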
Monitoring at Scale
So far we have covered the basic architecture and main mechanisms of Prometheus, as shown below:

Prometheus periodically pulls monitoring data from Targets, stores it in local time series, and exposes a query interface through PromQL.
Internally it periodically evaluates the alerting rule files and generates alerts, which Alertmanager then handles.
Here is the question: in this setup Prometheus is a single point, and so is Alertmanager.
Can such a structure support monitoring at large scale?
To fully understand the high-availability deployment modes of Prometheus, we first need to understand how it stores data.

As shown above, Prometheus 2.x stores samples on local disk in a custom format. Data is organized in two-hour time windows: the data produced within each two hours is stored in one block. Each block contains all the sample data for that window (chunks), a metadata file (meta.json), and an index file (index).
Samples for the current window, which are still being collected, are kept in memory. To make sure the data can be recovered if Prometheus crashes or restarts during this period, Prometheus replays the write-ahead log (WAL) at startup. If time series are deleted through the API during this period, the deletion records are kept in separate tombstone files.

Storing all samples by time window significantly improves query efficiency: to query all samples in a time range, Prometheus only needs to read from the blocks that fall within that range. Deleting historical data also becomes very simple: just delete the directories of the corresponding blocks.
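Why time-windowed blocks make range queries cheap can be shown with a short Python sketch. It is a conceptual illustration only: the tuple layout (min_time_ms, max_time_ms, directory) and the function name are hypothetical, and treating the upper bound of a block as exclusive is an assumption of this sketch.

```python
BLOCK_MS = 2 * 60 * 60 * 1000  # two-hour window, in milliseconds


def blocks_for_range(blocks, start_ms, end_ms):
    """Directories of the blocks that overlap the query range.

    blocks: list of (min_time_ms, max_time_ms, directory); the upper
    bound is treated as exclusive (an assumption of this sketch).
    Blocks outside the range are never opened.
    """
    return [
        directory
        for min_time, max_time, directory in blocks
        if min_time <= end_ms and max_time > start_ms
    ]
```

A query that spans only part of the second and third windows touches two block directories and ignores every other block on disk.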
For a single-node Prometheus, this local-filesystem storage supports millions of metrics and hundreds of thousands of data points per second. To keep its own management and deployment simple, Prometheus gave up the complexity of managing HA.
So, first of all, a few points about this storage model need to be clear:
Prometheus itself is not suited to durable long-term storage of historical data; by default it retains only 15 days of data.
Local storage also means that Prometheus itself cannot scale elastically in any effective way.
When monitoring becomes very large, the main challenges for a single Prometheus are:
Service availability: how to make sure Prometheus is not a single point of failure;
A larger scale means more scrape jobs, so (write) operations become very resource intensive;
It also means a large demand for data storage.
Simple HA: Service Availability
Because of Prometheus's pull-based design, ensuring the availability of the Prometheus service only requires deploying multiple Prometheus Server instances that scrape the same Exporter targets.

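The basic HA mode only ensures the availability of the Prometheus service. It does not solve data consistency between Prometheus Servers or persistence (lost data cannot be recovered), nor does it allow dynamic scaling. This deployment therefore suits scenarios where the monitoring scale is modest, the Prometheus Server is not migrated often, and only short-term monitoring data needs to be kept.
Basic HA + Remote Storage
On top of the basic HA mode, adding Remote Storage support stores the monitoring data in a third-party storage service.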

While Prometheus scrapes samples and saves them locally, it also sends them to a Remote Storage Adaptor, which converts the format for the third-party storage and persists the data.
When Prometheus queries data, it also fetches data from the Remote Storage Adaptor and merges it with local data before evaluating the query.
This solves service availability and also ensures persistence: when a Prometheus Server goes down or loses data, it can recover quickly, and the Prometheus Server can also be migrated easily. This option therefore suits users whose monitoring scale is modest but who want their monitoring data persisted and their Prometheus Server to be easy to migrate.
Basic HA + Remote Storage + Federation
When a single Prometheus Server cannot handle a large number of scrape jobs, users can consider Prometheus federation, splitting the scrape jobs across different Prometheus instances, that is, functional partitioning at the job level.

This deployment generally fits two scenarios:
Scenario 1: a single data center with a large number of scrape jobs
Here the performance bottleneck of Prometheus is mainly the large number of scrape jobs, so users need to use federation to assign different types of scrape jobs to different Prometheus child servers, achieving functional partitioning. For example, one Prometheus Server scrapes infrastructure metrics and another scrapes application metrics. An upper-level Prometheus Server then aggregates the data.
Scenario 2: multiple data centers
This mode also suits multiple data centers. When the Prometheus Server cannot talk directly to the Exporters in a data center, deploying a dedicated Prometheus Server in each data center to handle that data center's scrape jobs is a good approach. It spares users a lot of network configuration: they only need to make sure that the main Prometheus Server instance can reach the Prometheus Server in each data center. The central Prometheus Server aggregates the data from all data centers.
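In federation, the upper-level server scrapes selected series from the lower-level servers through their /federate endpoint, passing one or more match[] selectors. The Python sketch below only builds such a URL to show its shape; the host name prometheus-dc1 and the function name are hypothetical, and in a real deployment this would be expressed as a scrape job in the upper-level server's configuration rather than in code.

```python
from urllib.parse import urlencode


def build_federate_url(base_url, matchers):
    """URL an upper-level server would scrape on a lower-level server.

    matchers: list of series selectors, each sent as a match[] parameter.
    """
    query = urlencode([("match[]", matcher) for matcher in matchers])
    return f"{base_url.rstrip('/')}/federate?{query}"
```

For example, build_federate_url("http://prometheus-dc1:9090", ['{job="node"}']) selects every series of the node job on that (assumed) data center server.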
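Choosing a High-Availability Option
The sections above presented three high-availability deployment options for different scenarios. The deployment should of course be adjusted to the monitoring scale and to your own needs. The table below shows which problems each of the three options solves, so users can choose flexibly according to their requirements.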

As for Alertmanager, the instances in an Alertmanager cluster exchange state with each other using a gossip protocol, so Prometheus only needs to be configured with multiple Alertmanager instances. For details on Alertmanager clustering, see: https://github.com/yunlzheng/prometheus-book/blob/master/ha/alertmanager-high-availability.md

Service Discovery and Cloud Native: Kubernetes as an Example
In container or cloud environments such as Kubernetes, an important problem for Prometheus is how to dynamically discover all the targets deployed in Kubernetes that need monitoring.

As shown in the figure above, the resources in Kubernetes can be divided into several categories:
1. Infrastructure layer (Node): cluster nodes, which provide runtime resources for the whole cluster and its applications
2. Container infrastructure (Container): provides the runtime environment for applications
3. User applications (Pod): a Pod contains a group of containers that work together and provide one function (or a set of functions)
4. Internal service load balancing (Service): within the cluster, a Service exposes application functionality and provides internal load balancing for application-to-application access
5. External access entry (Ingress): an Ingress provides an entry point from outside the cluster, so that external clients can reach services deployed in the Kubernetes cluster
Therefore, leaving aside Kubernetes's own components, a complete monitoring system should cover the following five aspects:
Cluster node status: obtain each node's basic running state from the kubelet service on that node;
Cluster node resource usage: deploy Node Exporter on every node as a DaemonSet to collect node resource usage;
Containers running on nodes: obtain the running state and resource usage of all containers on each node from the cAdvisor built into the kubelet;
From a black-box perspective, deploy the Blackbox Exporter probe service in the cluster to check the availability of Services and Ingresses;
If applications deployed in the cluster have built-in Prometheus support, find the corresponding Pod instances and collect their internal metrics from them.
For a pull-based monitoring system like Prometheus, it is clearly no longer feasible to define targets statically with static_configs. The solution is to introduce an intermediary (a service registry) that holds the access information for all current targets; Prometheus only needs to ask this intermediary which targets exist. This pattern is called service discovery.
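The intermediary pattern can be sketched in Python as follows. This is a conceptual toy, not Prometheus's service discovery code: the Registry class, its methods, and refresh_targets are all hypothetical names, and real registries (such as the Kubernetes API) are far richer.

```python
class Registry:
    """Toy service registry that knows the address of every live target."""

    def __init__(self):
        self._targets = {}

    def register(self, name, address):
        self._targets[name] = address

    def deregister(self, name):
        self._targets.pop(name, None)

    def targets(self):
        return sorted(self._targets.values())


def refresh_targets(registry, current):
    """Ask the registry for targets and diff against the current set.

    Returns (discovered, added, removed) as sets of addresses, so the
    scraper can start scraping new targets and stop scraping old ones.
    """
    discovered = set(registry.targets())
    return discovered, discovered - current, current - discovered
```

Calling refresh_targets periodically is what replaces the hand-maintained target list: containers that come and go only need to appear in, or disappear from, the registry.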

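Prometheus provides full support for Kubernetes. By talking to the Kubernetes API, it can automatically discover information about all Node, Service, Pod, Endpoints, and Ingress resources in Kubernetes.
After all targets have been found through service discovery, Prometheus's relabeling mechanism is used to filter these resources, rewrite metrics addresses, and so on, which makes monitoring of every kind of resource fully automatic.
For example, with the following scrape job configuration, container metrics are collected automatically from the cAdvisor built into the kubelet service on each cluster node: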
- job_name: 'kubernetes-cadvisor'
  scheme: https
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  kubernetes_sd_configs:
    - role: node
  relabel_configs:
    - action: labelmap
      regex: __meta_kubernetes_node_label_(.+)
    - target_label: __address__
      replacement: kubernetes.default.svc:443
    - source_labels: [__meta_kubernetes_node_name]
      regex: (.+)
      target_label: __metrics_path__
      replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
Or probe services over the network with a Blackbox Exporter deployed in the cluster:
- job_name: 'kubernetes-services'
  metrics_path: /probe
  params:
    module: [http_2xx]
  kubernetes_sd_configs:
    - role: service
  relabel_configs:
    - source_labels: [__address__]
      target_label: __param_target
    - target_label: __address__
      replacement: blackbox-exporter.example.com:9115
    - source_labels: [__param_target]
      target_label: instance
    - action: labelmap
      regex: __meta_kubernetes_service_label_(.+)
    - source_labels: [__meta_kubernetes_namespace]
      target_label: kubernetes_namespace
    - source_labels: [__meta_kubernetes_service_name]
      target_label: kubernetes_name
Summary
An online talk cannot cover everything about Prometheus in detail, but I hope today's session has given you a better understanding of it.
I have also compiled my Prometheus practice into an e-book: https://github.com/yunlzheng/prometheus-book. I hope it is of some help in learning and using Prometheus. Questions about Prometheus can also be discussed through GitHub Issues.
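Q&A
Q: Can Prometheus data be synchronized to InfluxDB automatically?
A: Yes, through remote_write. Prometheus sends the collected data to an Adaptor, which converts the data format and stores it in InfluxDB.
Q: How many Jobs can a single Prometheus Server run at most?
A: I have not tested this specifically. Note, however, that the job workload (write operations) directly affects Prometheus's performance; it is best to use federation to separate reads from writes.
Q: Is it better to implement alerting with Grafana or with Alertmanager? Is there a consolidated list of commonly used metrics you could share a link to? And how is the default retention time for historical data configured?
A: Grafana supports multiple data sources, and Prometheus is only one of them.
If you only use Prometheus, Alertmanager is the way to go. It implements many mechanisms for alert deduplication and silencing; being bombarded with emails is not much fun otherwise.
If you need alerts based on several of the data sources used in Grafana, then use Grafana.
Q: Where do you recommend storing Prometheus monitoring data, InfluxDB or ES? InfluxDB is free for a single node, but the multi-node version seems to be paid?
A: By default, data is saved locally. To persist data to third-party storage, that storage only needs to implement the remote_write interface. In theory any third-party storage can be connected.
InfluxDB is just one of the examples provided officially.
Q: Does "An upper-level Prometheus Server then aggregates the data" mean that this Prometheus collects data from the lower-level Prometheus servers? Which interface is used?
A: See Prometheus Federation. The idea is that a set of Prometheus instances handle the scrape jobs, and the global Prometheus gathers their data and exposes the query interface, which reduces the load on the global Prometheus.
Q: Can Keepalived be used with two Prometheus servers?
A: Plain load balancing is enough. Prometheus instances have no direct relationship with one another.
Q: Is there a good way to monitor business API endpoints with Prometheus? Can it monitor slow database queries?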