
Prometheus Technical Sharing: Prometheus Functions and Calculation Formulas Explained

2022/12/28 | Prometheus technical documentation | Tags: prometheus functions, prometheus technical sharing, prometheus calculation formulas

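Compared with Zabbix, the strength of Prometheus is the large set of calculation formulas it lets you use to derive exactly the data you need. These formulas are also widely regarded as the hard part. Take CPU usage: Zabbix provides it out of the box, whereas in Prometheus it has to be computed with a formula.

To measure CPU usage, node_exporter collects the cumulative time the CPU has spent in each of its 8 common modes, and CPU usage = (sum of CPU time in all non-idle modes) / (sum of CPU time in all modes). To get the CPU usage for one particular minute you also need to understand the Counter data type. A Counter only ever increases, so you have to take the increment over the chosen time window and plug that value into the formula.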

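I. Common functions

Prometheus provides many calculation functions for different kinds of data. One handy rule of thumb: whenever you work with a counter, wrap it in rate() or increase() before doing anything else. Some of the more commonly used functions follow:

1. The rate function

rate() is designed for use with the counter data type. It returns the average per-second increase of the counter over the given time range.

Example: the average per-second traffic received on the eth0 NIC over the last 1m

rate(node_network_receive_bytes_total{device="eth0"}[1m])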
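To make the idea of an "average per-second increase" concrete, here is a minimal Python sketch of a simplified rate(). It is not how Prometheus implements the function (the real one also extrapolates to the edges of the window); the function name simple_rate and the sample values are made up for illustration.

```python
# Simplified model of rate(); NOT the real Prometheus implementation,
# which additionally extrapolates to the window boundaries.
def simple_rate(samples):
    """samples: list of (timestamp_seconds, counter_value), oldest first."""
    if len(samples) < 2:
        return None  # a rate needs at least two samples
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # A drop means the counter was reset, so count again from zero.
        increase += cur - prev if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed

# Hypothetical receive-byte counter for eth0, scraped every 15 s for 1m.
eth0 = [(0, 1000), (15, 2500), (30, 4000), (45, 5500), (60, 7000)]
print(simple_rate(eth0))  # 100.0 bytes per second
```

The counter grew by 6000 bytes in 60 seconds, so the sketch reports 100 bytes per second, which is the kind of value the rate() query above would graph.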

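2. The increase function

increase() returns how much the value increased over a period of time.

rate(), by contrast, returns the average per-second increase over that period.

How do you choose between the two?

For fine-grained sampling, such as a 1m window, rate() is recommended.

For coarse-grained sampling, such as 5m, 10m, or even longer windows, increase() is recommended.

Example: the increase in traffic received on the eth0 NIC over the last 1m

increase(node_network_receive_bytes_total{device="eth0"}[1m])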
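The relationship between the two functions can be sketched in a few lines of Python. This is a simplified model that assumes the counter is never reset inside the window and ignores the extrapolation Prometheus performs; simple_increase, simple_rate, and the sample data are hypothetical.

```python
# Simplified model: assumes no counter reset inside the window.
def simple_increase(samples):
    """Last value minus first value of (timestamp_seconds, value) samples."""
    return samples[-1][1] - samples[0][1]

def simple_rate(samples):
    """Average per-second increase over the same samples."""
    elapsed = samples[-1][0] - samples[0][0]
    return simple_increase(samples) / elapsed

# Hypothetical eth0 receive-byte counter over a 1m window.
eth0 = [(0, 1000), (15, 2500), (30, 4000), (45, 5500), (60, 7000)]
print(simple_increase(eth0))   # 6000
print(simple_rate(eth0))       # 100.0
print(simple_rate(eth0) * 60)  # 6000.0, the increase again
```

In other words, the two functions carry the same information: the increase is the per-second rate multiplied by the number of seconds in the window, so the choice mostly comes down to which unit reads better on a given panel.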

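3. The sum function

sum() adds values together. Note that sum() totals the values across all monitored servers, so when you want to look at a single server you need to split the result.

Common ways to split:

1 by (instance)
2 by (cluster_name): a custom label rather than a standard one; it lets us manually group servers with different functions for display

Example: the sum, across all hosts, of the average per-second traffic received on eth0 over the last 1m

sum(rate(node_network_receive_bytes_total{device="eth0"}[1m]))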
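The difference between a plain sum() and a sum split by a label can be illustrated with a small Python sketch. The helper sum_by, the host names, and the cluster_name values are all hypothetical; the sketch only mimics the grouping behaviour, not PromQL itself.

```python
from collections import defaultdict

def sum_by(series, label):
    """Group (labels, value) pairs by one label and add up each group."""
    totals = defaultdict(float)
    for labels, value in series:
        totals[labels[label]] += value
    return dict(totals)

# Hypothetical per-host receive rates in bytes per second.
series = [
    ({"instance": "host-a:9100", "cluster_name": "web"}, 100.0),
    ({"instance": "host-b:9100", "cluster_name": "web"}, 250.0),
    ({"instance": "host-c:9100", "cluster_name": "db"}, 50.0),
]
print(sum(value for _, value in series))  # 400.0, like sum(...)
print(sum_by(series, "cluster_name"))     # {'web': 350.0, 'db': 50.0}
```

Grouping by "instance" instead would return one total per host, which is what is needed to look at a single server.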

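4. The topk function

topk() returns the x highest values, like a "top 3" in everyday terms. It is useful when we have many servers and want the three with the highest value for some key.

Usage with a Gauge:

topk(3,key)

Usage with a Counter:

topk(3,rate(key[1m]))

Note: the data this function returns is not well suited to graphing, because the series that make up the top 3 can change from one evaluation to the next.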
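As a rough illustration of what topk() selects, the following Python sketch picks the k largest values from a set of per-server readings. The function name topk is reused here only for readability, and the host names and values are invented.

```python
import heapq

def topk(k, series):
    """Return the k (name, value) pairs with the largest values."""
    return heapq.nlargest(k, series, key=lambda item: item[1])

# Hypothetical gauge readings, one per server.
load = [("host-a", 0.4), ("host-b", 2.1), ("host-c", 1.3),
        ("host-d", 0.9), ("host-e", 3.7)]
print(topk(3, load))  # [('host-e', 3.7), ('host-b', 2.1), ('host-c', 1.3)]
```

For a counter, the same selection would be applied to per-second rates rather than to the raw cumulative values.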

5. The count function

count() counts how many series for a key have a current or historical value greater than or less than a given threshold.

Example:

count(node_netstat_Tcp_CurrEstab > 50)
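The query above works in two steps: the comparison first filters out the series that do not pass, and count() then counts what is left. A minimal Python sketch of that behaviour, with a hypothetical helper count_where and made-up connection counts:

```python
def count_where(series, predicate):
    """Filter (name, value) pairs, then count the survivors."""
    matching = [value for _, value in series if predicate(value)]
    # PromQL yields an empty result, not 0, when no series passes the
    # filter; None stands in for "no data" here.
    return len(matching) if matching else None

# Hypothetical established-TCP-connection counts per host.
established = [("host-a", 12), ("host-b", 75), ("host-c", 230)]
print(count_where(established, lambda v: v > 50))   # 2
print(count_where(established, lambda v: v > 500))  # None
```

The "no data" case matters for alerting and dashboards, since a panel built on such a query shows nothing rather than zero when no host exceeds the threshold.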

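6. The irate function

irate(v range-vector) calculates the per-second instant rate of increase of the time series in the range vector. It is based on the last two data points. Breaks in monotonicity (such as counter resets caused by a target restart) are adjusted for automatically.

Example: the per-second rate of HTTP requests, computed from the two most recent samples within the last 5m

irate(http_requests_total{job="linux-01"}[5m])

irate should only be used when graphing volatile, fast-moving counters. Use rate for alerts and for slow-moving counters, because brief changes in the rate can reset the FOR clause, and graphs made up entirely of rare spikes are hard to read.

Note that when combining irate() with an aggregation operator (such as sum()) or with a function that aggregates over time (any function ending in _over_time), always apply irate() first and aggregate afterwards. Otherwise irate() cannot detect counter resets when a target restarts.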
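The ordering rule above is easier to see with numbers. The Python sketch below models irate() as "the last two samples, with a drop treated as a reset to zero", which is a simplification, and compares rate-then-sum against sum-then-rate for two hypothetical targets, one of which restarts.

```python
def simple_irate(samples):
    """Per-second rate from the last two (timestamp_seconds, value) samples."""
    (t0, v0), (t1, v1) = samples[-2], samples[-1]
    # A drop is treated as a counter reset: count again from zero.
    delta = v1 - v0 if v1 >= v0 else v1
    return delta / (t1 - t0)

# Two hypothetical targets; target b restarts before its last scrape.
a = [(0, 900), (10, 1000), (20, 1100)]
b = [(0, 400), (10, 500), (20, 10)]

# Correct order: rate per series first, then aggregate.
print(simple_irate(a) + simple_irate(b))  # 11.0

# Wrong order: add the raw counters first, then take the rate.
summed = [(t, va + vb) for (t, va), (_, vb) in zip(a, b)]
print(simple_irate(summed))  # 111.0, the reset of b is misread
```

Once the counters are added together, the drop caused by b restarting looks like a reset of the whole sum, so the combined value of 1110 is counted as new traffic and the result is off by a factor of ten.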

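II. Calculating CPU usage

1. CPU modes
A CPU is time-shared among different modes. You can see them with the familiar top command:

us: CPU time used by user processes
sy: CPU time used by the kernel
ni: CPU time used by user-space processes whose priority has been changed (niced)
id: idle CPU time (nobody is using it)
wa: CPU time spent waiting for I/O
hi: CPU time spent on hardware interrupts
si: CPU time spent on software interrupts
st: CPU time taken by the hypervisor (steal)

Added together, these make up the total CPU time.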

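2. CPU time
Among the metrics scraped by node-exporter, the CPU-related ones are mainly the node_cpu_seconds_total series. You can view all of the metrics as follows.

curl http://localhost:9100/metrics

The response contains all kinds of monitoring data; only the CPU-related part is shown here.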

# HELP node_cpu_seconds_total Seconds the cpus spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 26659.41
node_cpu_seconds_total{cpu="0",mode="iowait"} 4.79
node_cpu_seconds_total{cpu="0",mode="irq"} 0
node_cpu_seconds_total{cpu="0",mode="nice"} 0
node_cpu_seconds_total{cpu="0",mode="softirq"} 2.69
node_cpu_seconds_total{cpu="0",mode="steal"} 0
node_cpu_seconds_total{cpu="0",mode="system"} 31.65
node_cpu_seconds_total{cpu="0",mode="user"} 8.67
node_cpu_seconds_total{cpu="1",mode="idle"} 26634.43
node_cpu_seconds_total{cpu="1",mode="iowait"} 54.14
node_cpu_seconds_total{cpu="1",mode="irq"} 0
node_cpu_seconds_total{cpu="1",mode="nice"} 0.02
node_cpu_seconds_total{cpu="1",mode="softirq"} 1.23
node_cpu_seconds_total{cpu="1",mode="steal"} 0
node_cpu_seconds_total{cpu="1",mode="system"} 34.07
node_cpu_seconds_total{cpu="1",mode="user"} 9
node_cpu_seconds_total{cpu="2",mode="idle"} 26629.89
node_cpu_seconds_total{cpu="2",mode="iowait"} 6.57
node_cpu_seconds_total{cpu="2",mode="irq"} 0
node_cpu_seconds_total{cpu="2",mode="nice"} 0
node_cpu_seconds_total{cpu="2",mode="softirq"} 1.95
node_cpu_seconds_total{cpu="2",mode="steal"} 0
node_cpu_seconds_total{cpu="2",mode="system"} 24.66
node_cpu_seconds_total{cpu="2",mode="user"} 7.2
node_cpu_seconds_total{cpu="3",mode="idle"} 26699.96
node_cpu_seconds_total{cpu="3",mode="iowait"} 5.72
node_cpu_seconds_total{cpu="3",mode="irq"} 0
node_cpu_seconds_total{cpu="3",mode="nice"} 0.01
node_cpu_seconds_total{cpu="3",mode="softirq"} 1.27
node_cpu_seconds_total{cpu="3",mode="steal"} 0
node_cpu_seconds_total{cpu="3",mode="system"} 22.32
node_cpu_seconds_total{cpu="3",mode="user"} 7.33

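Each line above is the running time, in seconds, of one CPU core in one mode. Adding up the CPU time of all modes for one core gives roughly the total number of seconds the system has been running since boot, as reported by uptime. For example:

node_cpu_seconds_total{cpu="0",mode="idle"} 26659.41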
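The claim that one core's modes add up to roughly the uptime can be checked with a short Python sketch. It parses the cpu="0" lines quoted from the node-exporter output and sums them; the regular expression and the function name seconds_per_cpu are illustrative only and handle just this one metric, not the full exposition format.

```python
import re
from collections import defaultdict

# Matches only node_cpu_seconds_total lines with cpu and mode labels.
LINE = re.compile(r'node_cpu_seconds_total\{cpu="(\d+)",mode="(\w+)"\} (\S+)')

def seconds_per_cpu(text):
    """Add up the seconds of every mode, per CPU core."""
    totals = defaultdict(float)
    for cpu, _mode, value in LINE.findall(text):
        totals[cpu] += float(value)
    return dict(totals)

sample = """\
node_cpu_seconds_total{cpu="0",mode="idle"} 26659.41
node_cpu_seconds_total{cpu="0",mode="iowait"} 4.79
node_cpu_seconds_total{cpu="0",mode="irq"} 0
node_cpu_seconds_total{cpu="0",mode="nice"} 0
node_cpu_seconds_total{cpu="0",mode="softirq"} 2.69
node_cpu_seconds_total{cpu="0",mode="steal"} 0
node_cpu_seconds_total{cpu="0",mode="system"} 31.65
node_cpu_seconds_total{cpu="0",mode="user"} 8.67
"""
print(round(seconds_per_cpu(sample)["0"], 2))  # 26707.21
```

Core 0 therefore accounts for about 26707 seconds, or roughly 7.4 hours, of which almost all was spent idle.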

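3. Deriving the CPU usage formula

1) Time cpu0 spent idle over 5 minutes

increase(node_cpu_seconds_total{cpu="0",mode="idle"}[5m])

increase means increment, so this expression is the current value of node_cpu_seconds_total minus its value 5 minutes earlier, that is, the CPU time spent in the idle mode during those 5 minutes.

2) Fraction of time cpu0 spent idle over 5 minutes. Both sides are wrapped in sum() so that the denominator adds up all modes and the two sides have matching label sets:

sum (increase(node_cpu_seconds_total{cpu="0",mode="idle"}[5m])) / sum (increase(node_cpu_seconds_total{cpu="0"}[5m]))

3) Fraction of time all CPUs on one host spent idle over 5 minutes:

sum (increase(node_cpu_seconds_total{mode="idle"}[5m])) / sum (increase(node_cpu_seconds_total[5m]))

4) If Prometheus monitors multiple hosts, sum per host:

sum (increase(node_cpu_seconds_total{mode="idle"}[5m])) by (instance) / sum (increase(node_cpu_seconds_total[5m])) by (instance)

5) CPU usage = 1 - CPU idle fraction

100 * (1 - sum (increase(node_cpu_seconds_total{mode="idle"}[5m])) by (instance) / sum (increase(node_cpu_seconds_total[5m])) by (instance))

6) With irate(), the formula can be simplified. Each core accumulates about one second of CPU time per second across all of its modes, so the average per-second idle rate over a host's cores is already its idle fraction:

100 - (avg(irate(node_cpu_seconds_total{mode="idle"}[5m])) by (instance) * 100)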
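The derivation can be replayed on plain numbers. The Python sketch below takes two hypothetical scrapes of a two-core host taken 300 seconds apart and computes CPU usage both ways: with the idle/total ratio from step 5, and with the average idle rate from step 6. The function names and counter values are invented, and the idle-rate variant uses the full window rather than only the last two samples as irate() would.

```python
def cpu_usage_percent(before, after):
    """Step 5: 100 * (1 - idle increase / total increase).

    before/after map (cpu, mode) to cumulative seconds at two scrapes.
    """
    idle = total = 0.0
    for key, end in after.items():
        delta = end - before[key]
        total += delta
        if key[1] == "idle":
            idle += delta
    return 100 * (1 - idle / total)

def cpu_usage_from_idle_rate(before, after, window_seconds):
    """Step 6: 100 - average per-core idle rate * 100."""
    idle_rates = [(after[key] - before[key]) / window_seconds
                  for key in after if key[1] == "idle"]
    return 100 - (sum(idle_rates) / len(idle_rates)) * 100

# Hypothetical counters for a two-core host, scraped 300 s apart.
before = {("0", "idle"): 1000.0, ("0", "user"): 50.0, ("0", "system"): 20.0,
          ("1", "idle"): 1100.0, ("1", "user"): 40.0}
after = {("0", "idle"): 1240.0, ("0", "user"): 95.0, ("0", "system"): 35.0,
         ("1", "idle"): 1370.0, ("1", "user"): 70.0}

print(round(cpu_usage_percent(before, after), 2))              # 15.0
print(round(cpu_usage_from_idle_rate(before, after, 300), 2))  # 15.0
```

Both approaches agree here because each core accumulated exactly 300 seconds across its modes during the 300-second window; on a real host the two values are close rather than identical.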

III. Commonly used formulas

1. CPU usage

100 - (avg(irate(node_cpu_seconds_total{mode="idle"}[5m])) by (instance) * 100)

2. Percentage of memory still available

(node_memory_MemFree_bytes+node_memory_Cached_bytes+node_memory_Buffers_bytes) / node_memory_MemTotal_bytes * 100

3. Memory usage

100 - (node_memory_MemFree_bytes+node_memory_Cached_bytes+node_memory_Buffers_bytes) / node_memory_MemTotal_bytes * 100
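The two memory formulas are complements of each other, which a few lines of Python make explicit. The function name memory_percentages and the byte values are hypothetical; the parameters correspond to the MemFree, Cached, Buffers, and MemTotal metrics used above.

```python
def memory_percentages(mem_free, cached, buffers, mem_total):
    """Return (available %, used %) using the formulas above."""
    available = (mem_free + cached + buffers) / mem_total * 100
    return available, 100 - available

GIB = 1024 ** 3
# Hypothetical host: 8 GiB total, 1 GiB free, 2 GiB cached, 1 GiB buffers.
available, used = memory_percentages(1 * GIB, 2 * GIB, 1 * GIB, 8 * GIB)
print(available, used)  # 50.0 50.0
```

Cached and buffer memory are counted as available because the kernel can reclaim them when applications need more memory.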
4. Disk usage

100 - (node_filesystem_free_bytes{mountpoint="/",fstype=~"ext4|xfs"} / node_filesystem_size_bytes{mountpoint="/

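I'm Lele (樂樂). Follow the 尊龍時凱 community to keep learning Prometheus without getting lost. We focus on researching and sharing Zabbix and Prometheus techniques; for more open-source content, watch for upcoming articles or consult the 尊龍時凱 technical documentation. If you have Prometheus questions, you can also post them in the 尊龍時凱 community, or join the community Q&A QQ group 617295020 to exchange notes on open-source technology.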
