
Prometheus Technical Sharing: A Detailed Guide to Prometheus Functions and Calculation Formulas

2022/12/28 Prometheus Technical Resources Tags: prometheus functions, prometheus technical sharing, prometheus calculation formulas

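Compared with Zabbix, the strength of Prometheus is that you can write expressions with its many functions to derive exactly the data you need. These expressions are also what most people find hardest about it. CPU usage is a good example: Zabbix provides it out of the box, whereas in Prometheus you have to compute it with a formula.

To measure CPU usage, node_exporter collects the cumulative time the CPU has spent in each of its 8 common states, and then (sum of CPU time in all non-idle states) / (sum of CPU time in all states) = CPU usage. To get the CPU usage for one particular minute, you also need to understand the Counter data type. Because a Counter only ever increases, you have to take the increase over a chosen interval and then plug that value into the formula.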

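I. Common Functions

Prometheus provides a large number of functions for different kinds of data. A useful rule of thumb: whenever you work with a counter, wrap it in rate() or increase() before doing anything else. Some of the more commonly used functions are listed below.

1. rate function

rate() is designed specifically for counters. It returns the average per-second increase of the counter over the given time range.

Example: average per-second receive traffic (in bytes) on the eth0 interface over the last 1m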

rate(node_network_receive_bytes_total{device="eth0"}[1m])
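
To make the semantics concrete, here is a minimal Python sketch of roughly what rate() computes over one window. It assumes evenly scraped samples and leaves out the extrapolation to the window boundaries that Prometheus itself performs, so treat it as an illustration rather than the real implementation. The name simple_rate and the sample values are hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value)


def simple_rate(samples: List[Sample]) -> float:
    """Average per-second increase across the window.

    Simplification: real rate() also extrapolates to the window edges.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples in the window")
    increase = 0.0
    prev = samples[0][1]
    for _, value in samples[1:]:
        # A drop means the counter was reset (e.g. the target restarted):
        # count the new value as growth from zero.
        increase += value if value < prev else value - prev
        prev = value
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed


# Hypothetical receive-bytes counter scraped every 15 s for one minute.
window = [(0, 1000.0), (15, 2500.0), (30, 4000.0), (45, 5500.0), (60, 7000.0)]
bytes_per_second = simple_rate(window)
print(bytes_per_second)
```

The counter grows by 6000 bytes in 60 seconds, so the sketch reports 100 bytes per second for this window.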

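2. increase function

increase() returns the total increase of a counter over a time range.

rate() returns the average per-second increase over the same range, so increase(v[1m]) is equivalent to rate(v[1m]) multiplied by 60.

How do you choose between the two?

For fine-grained sampling, such as a 1m window, rate() is recommended.

For coarse-grained sampling, such as 5m, 10m, or longer windows, increase() is recommended.

Example: increase in received traffic on the eth0 interface over the last 1m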

increase(node_network_receive_bytes_total{device="eth0"}[1m])

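3. sum function

sum() adds values together. Note that applying sum() to a metric adds up the values from every monitored server, so to look at a single server you need to split the result.

Common ways to split:

1 by (instance) splits the result per monitored host
2 by (cluster_name) uses a custom label rather than a standard one; we can define it manually to group servers with different roles
Example: sum over all hosts of the average per-second receive traffic on eth0 over the last 1m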

sum(rate(node_network_receive_bytes_total{device="eth0"}[1m]))
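
The difference between a plain sum() and a sum() split by a label can be sketched in Python as follows. The host names, the cluster_name values, and the rates are made up for illustration; only the grouping logic matters.

```python
from collections import defaultdict
from typing import Dict, Tuple

# Hypothetical per-host rate() results: (instance, cluster_name) -> bytes/second.
rates: Dict[Tuple[str, str], float] = {
    ("web-01:9100", "web"): 120.0,
    ("web-02:9100", "web"): 80.0,
    ("db-01:9100", "db"): 300.0,
}

# sum(...): a single value across every monitored host.
total = sum(rates.values())

# sum(...) by (cluster_name): one value per distinct cluster_name label.
by_cluster: Dict[str, float] = defaultdict(float)
for (_instance, cluster), value in rates.items():
    by_cluster[cluster] += value

print(total, dict(by_cluster))
```

Grouping keeps only the label named in the by clause, which is why the per-host instance label no longer appears in the grouped result.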

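4. topk function

topk() returns the x series with the highest values, in other words a "top 3". It is useful when you have many servers and want the 3 servers that rank highest for a given key.

Usage with a Gauge:

topk(3,key)

Usage with a Counter:

topk(3,rate(key[1m]))

Note: the output of this function is not well suited to graphing.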
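
As a rough illustration, the selection that topk() performs at a single evaluation time can be written in Python like this. The function name, the instance names, and the values are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple


def topk(k: int, series: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return the k series with the largest current values, highest first."""
    return heapq.nlargest(k, series.items(), key=lambda item: item[1])


# Hypothetical gauge values keyed by instance.
connections = {"a": 12.0, "b": 57.0, "c": 33.0, "d": 90.0, "e": 4.0}
top3 = topk(3, connections)
print(top3)
```

In a graph the selection is repeated at every evaluation step, so the set of series in the top 3 can change from point to point, which is one reason the result tends to be awkward to plot.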

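5. count function

count() returns the number of series. Combined with a comparison, it counts how many series for a key have a value above or below a threshold, either now or at a past point in time.

Example: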

count(node_netstat_Tcp_CurrEstab >50)
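
The expression above works in two steps, which the following Python sketch mirrors: the comparison filters the series, and count() then counts what is left. The host names and connection counts are invented for the example.

```python
# Hypothetical current values of node_netstat_Tcp_CurrEstab, keyed by instance.
established = {"web-01": 75.0, "web-02": 12.0, "db-01": 51.0, "db-02": 50.0}

# The comparison `> 50` first drops every series that does not match...
matching = {host: value for host, value in established.items() if value > 50}

# ...and count() then returns how many series remain.
busy_hosts = len(matching)
print(busy_hosts)
```

Note that a host with exactly 50 established connections is excluded, because the comparison is strictly greater than.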

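6. irate function

irate(v range-vector) calculates the per-second instant rate of increase of the time series in the range vector. It is based on the last two data points. Breaks in monotonicity (such as counter resets due to target restarts) are automatically adjusted for.

Example: per-second rate of HTTP requests, looking back up to 5m for the two most recent samples

irate(http_requests_total{job="linux-01"}[5m])

irate should only be used when graphing volatile, fast-moving counters. Use rate for alerts and slow-moving counters, because brief changes in the rate can reset the FOR clause, and graphs consisting entirely of rare spikes are hard to read.

Note that when combining irate() with an aggregation operator (e.g. sum()) or a function that aggregates over time (any function ending in _over_time), always take irate() first and then aggregate. Otherwise irate() cannot detect counter resets when the target restarts.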
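
To see why irate() reacts so strongly to short spikes, here is a small Python sketch that compares an instant rate taken from the last two samples with the average over the whole window. The function name simple_irate and the sample values are hypothetical, and the reset handling is simplified.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value)


def simple_irate(samples: List[Sample]) -> float:
    """Per-second rate based only on the last two samples in the window."""
    if len(samples) < 2:
        raise ValueError("need at least two samples in the window")
    (t1, v1), (t2, v2) = samples[-2], samples[-1]
    # A drop between the two samples is treated as a counter reset.
    delta = v2 if v2 < v1 else v2 - v1
    return delta / (t2 - t1)


# Hypothetical request counter: steady growth, then a burst in the last 15 s.
window = [(0, 100.0), (15, 115.0), (30, 130.0), (45, 145.0), (60, 445.0)]
instant = simple_irate(window)
average = (window[-1][1] - window[0][1]) / (window[-1][0] - window[0][0])
print(instant, average)
```

With these numbers the instant rate is 20 requests per second while the window average is 5.75, which shows how a single burst dominates irate() but is smoothed out by an average over the window.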

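II. Calculating CPU Usage

1. CPU modes
A CPU is time-shared among several modes. You can see them with the familiar top command:

us: CPU time used by user processes
sy: CPU time used by the kernel
ni: CPU time used by user processes whose priority has been changed (niced)
id: idle CPU time
wa: CPU time spent waiting for I/O
hi: CPU time spent on hardware interrupts
si: CPU time spent on software interrupts
st: CPU time stolen by the hypervisor
Added together, these make up the total CPU time.

2. CPU time
Among the metrics collected by node-exporter, the CPU-related ones are mainly the node_cpu_seconds_total series. You can list all metrics as follows.

curl http://localhost:9100/metrics

The response contains all kinds of monitoring data; only the CPU-related part is shown here.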

# HELP node_cpu_seconds_total Seconds the cpus spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 26659.41
node_cpu_seconds_total{cpu="0",mode="iowait"} 4.79
node_cpu_seconds_total{cpu="0",mode="irq"} 0
node_cpu_seconds_total{cpu="0",mode="nice"} 0
node_cpu_seconds_total{cpu="0",mode="softirq"} 2.69
node_cpu_seconds_total{cpu="0",mode="steal"} 0
node_cpu_seconds_total{cpu="0",mode="system"} 31.65
node_cpu_seconds_total{cpu="0",mode="user"} 8.67
node_cpu_seconds_total{cpu="1",mode="idle"} 26634.43
node_cpu_seconds_total{cpu="1",mode="iowait"} 54.14
node_cpu_seconds_total{cpu="1",mode="irq"} 0
node_cpu_seconds_total{cpu="1",mode="nice"} 0.02
node_cpu_seconds_total{cpu="1",mode="softirq"} 1.23
node_cpu_seconds_total{cpu="1",mode="steal"} 0
node_cpu_seconds_total{cpu="1",mode="system"} 34.07
node_cpu_seconds_total{cpu="1",mode="user"} 9
node_cpu_seconds_total{cpu="2",mode="idle"} 26629.89
node_cpu_seconds_total{cpu="2",mode="iowait"} 6.57
node_cpu_seconds_total{cpu="2",mode="irq"} 0
node_cpu_seconds_total{cpu="2",mode="nice"} 0
node_cpu_seconds_total{cpu="2",mode="softirq"} 1.95
node_cpu_seconds_total{cpu="2",mode="steal"} 0
node_cpu_seconds_total{cpu="2",mode="system"} 24.66
node_cpu_seconds_total{cpu="2",mode="user"} 7.2
node_cpu_seconds_total{cpu="3",mode="idle"} 26699.96
node_cpu_seconds_total{cpu="3",mode="iowait"} 5.72
node_cpu_seconds_total{cpu="3",mode="irq"} 0
node_cpu_seconds_total{cpu="3",mode="nice"} 0.01
node_cpu_seconds_total{cpu="3",mode="softirq"} 1.27
node_cpu_seconds_total{cpu="3",mode="steal"} 0
node_cpu_seconds_total{cpu="3",mode="system"} 22.32
node_cpu_seconds_total{cpu="3",mode="user"} 7.33

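Each line above is the time, in seconds, that one CPU core has spent in one mode. Adding up the times of all modes for a single core gives approximately the total number of seconds the system has been running since boot, as reported by uptime. For example:

node_cpu_seconds_total{cpu="0",mode="idle"} 26659.41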
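
As a quick check of that claim, the following Python sketch adds up the eight mode values reported for cpu 0 in the sample output above. The variable names are arbitrary; the numbers are copied from the output.

```python
# Cumulative seconds per mode for cpu="0", taken from the sample output.
cpu0_seconds = {
    "idle": 26659.41,
    "iowait": 4.79,
    "irq": 0.0,
    "nice": 0.0,
    "softirq": 2.69,
    "steal": 0.0,
    "system": 31.65,
    "user": 8.67,
}

# Total time accounted for on this core since boot.
total_seconds = sum(cpu0_seconds.values())

# Share of that time spent idle, measured over the whole uptime.
idle_share = cpu0_seconds["idle"] / total_seconds
print(round(total_seconds, 2), round(idle_share, 4))
```

The total comes to about 26707 seconds, and almost all of it is idle time. Because these are since-boot totals, they say little about the load right now, which is why the formulas that follow work on the increase over a recent window instead.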

3. Deriving the CPU usage formula

1) Time cpu0 spent idle over the last 5 minutes

increase(node_cpu_seconds_total{cpu="0",mode="idle"}[5m])

increase means growth, so this expression is the current value of node_cpu_seconds_total minus its value 5 minutes ago, i.e. the CPU time spent in the idle state during those 5 minutes.

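2) Share of time cpu0 spent idle over the last 5 minutes. The denominator adds up all modes of cpu0, and sum() is applied on both sides so that the two results match on labels:

sum (increase(node_cpu_seconds_total{cpu="0",mode="idle"}[5m])) / sum (increase(node_cpu_seconds_total{cpu="0"}[5m]))
3) Share of time all CPUs of one host spent idle over the last 5 minutes:

sum (increase(node_cpu_seconds_total{mode="idle"}[5m])) / sum (increase(node_cpu_seconds_total[5m]))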

4) If Prometheus monitors multiple hosts, sum per host:

sum (increase(node_cpu_seconds_total{mode="idle"}[5m])) by (instance) / sum (increase(node_cpu_seconds_total[5m])) by (instance)

5) CPU usage = 1 - CPU idle ratio

100 * (1 - sum (increase(node_cpu_seconds_total{mode="idle"}[5m])) by (instance) / sum (increase(node_cpu_seconds_total[5m])) by (instance))

6) Using irate(), the formula can be simplified to:

100 - (avg(irate(node_cpu_seconds_total{mode="idle"}[5m])) by (instance) * 100)
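
The arithmetic behind the derivation can be checked with a short Python sketch. It assumes two snapshots of the per-mode counters for one host, taken 5 minutes apart and already summed over all cores, with no counter reset in between. The function name, the reduced set of modes, and every number are hypothetical.

```python
from typing import Dict

# mode -> cumulative CPU seconds, summed over all cores of one host
Snapshot = Dict[str, float]


def cpu_usage_percent(before: Snapshot, after: Snapshot) -> float:
    """100 * (1 - idle increase / total increase) over the interval."""
    idle_increase = after["idle"] - before["idle"]
    total_increase = sum(after[mode] - before[mode] for mode in after)
    return 100.0 * (1.0 - idle_increase / total_increase)


# Hypothetical 4-core host: 4 cores * 300 s = 1200 CPU seconds per 5 minutes.
before = {"idle": 100000.0, "user": 2000.0, "system": 1000.0, "iowait": 100.0}
after = {"idle": 100900.0, "user": 2200.0, "system": 1080.0, "iowait": 120.0}
usage = cpu_usage_percent(before, after)
print(usage)
```

Here 900 of the 1200 CPU seconds were idle, so the usage works out to 25 percent. This is the same calculation as step 5), written out by hand for a single host.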

III. Common Formulas

1. CPU usage

100 - (avg(irate(node_cpu_seconds_total{mode="idle"}[5m])) by (instance) * 100)
2. Available memory percentage

(node_memory_MemFree_bytes+node_memory_Cached_bytes+node_memory_Buffers_bytes) / node_memory_MemTotal_bytes * 100
3. Memory usage

100 - (node_memory_MemFree_bytes+node_memory_Cached_bytes+node_memory_Buffers_bytes) / node_memory_MemTotal_bytes * 100
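
For a feel of how the two memory formulas relate, here is a minimal Python sketch that applies them to made-up values. The function name and the figures are assumptions for illustration; the parameters correspond to the four node_memory_*_bytes metrics used above.

```python
def memory_usage_percent(
    mem_total: float, mem_free: float, cached: float, buffers: float
) -> float:
    """100 - (free + cached + buffers) / total * 100, all values in bytes."""
    available = mem_free + cached + buffers
    return 100.0 - available / mem_total * 100.0


GIB = 1024 ** 3

# Hypothetical host: 16 GiB total, 2 GiB free, 5 GiB cached, 1 GiB buffers.
used_percent = memory_usage_percent(16 * GIB, 2 * GIB, 5 * GIB, 1 * GIB)
available_percent = 100.0 - used_percent
print(used_percent, available_percent)
```

Cached and buffer memory are counted as available here because the formulas treat them as reclaimable, so this host shows 50 percent usage even though only 2 GiB is strictly free.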
4. Disk usage

100 - (node_filesystem_free_bytes{mountpoint="/",fstype=~"ext4|xfs"} / node_filesystem_size_bytes{mountpoint="/

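I'm Lele. Follow the 尊龍時凱 community for more on Prometheus. We focus on researching and sharing Zabbix and Prometheus techniques; watch for later articles or see the 尊龍時凱 technical documentation for more open-source content. If you have Prometheus questions, you can post them in the 尊龍時凱 community or join the community Q&A QQ group 617295020 to exchange open-source experience.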


