Most of us have run into a sudden power outage in the server room. When Zabbix comes back up, the log is full of PGRES_FATAL_ERROR messages. How should this be resolved?
 [select clock,ns,value from history_uint where itemid=36570 and clock<=1662337221 and clock>1661732421 order by clock desc limit 2]
134751:20220906:092021.356 [Z3005] query failed: [0] PGRES_FATAL_ERROR:ERROR:  could not read block 619 in file "base/17376/55998": read only 0 of 32768 bytes
 [select clock,ns,value from history_uint where itemid=36570 and clock<=1662337221 and clock>1661732421 order by clock desc limit 2]
134751:20220906:092021.359 [Z3005] query failed: [0] PGRES_FATAL_ERROR:ERROR:  could not read block 873 in file "base/17376/55991": read only 0 of 32768 bytes
 [select clock,ns,value from history_uint where itemid=36567 and clock<=1662337221 and clock>1661732421 order by clock desc limit 2]
134751:20220906:092021.361 [Z3005] query failed: [0] PGRES_FATAL_ERROR:ERROR:  could not read block 873 in file "base/17376/55991": read only 0 of 32768 bytes
[Z3005] query failed: [0] PGRES_FATAL_ERROR:ERROR: could not read block 874
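The cause turned out to be the sudden power loss, which damaged some of the table indexes in the PostgreSQL database, so they need to be repaired. However, because the Zabbix data uses the TimescaleDB time-series extension together with hypertable partitioning, it is not possible to single out one table for repair.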
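To get a feel for how many relation files are affected before attempting a repair, the errors can be summarized from the Zabbix server log. The following is a minimal sketch, not part of the original procedure: the function name list_damaged_files is hypothetical, and the log path used in the usage note is an assumption that depends on the installation.

```shell
#!/bin/bash
# list_damaged_files (hypothetical helper): print each distinct relation file
# mentioned in "could not read block" errors in the given log file.
list_damaged_files() {
    grep 'could not read block' "$1" \
        | grep -oE 'base/[0-9]+/[0-9]+' \
        | sort -u
}
```

It could be called as list_damaged_files /var/log/zabbix/zabbix_server.log (assumed path). In a path such as base/17376/55998, the last number is the relation's filenode, which can be mapped back to a table or index name from inside the affected database with SELECT pg_filenode_relation(0, 55998);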
After searching online, I found a similar error and a way to fix it on the following site:
http://lxadm.com/Repairing_broken_PostgreSQL_databases_/_tables
If your server happened to crash and the PostgreSQL database is corrupted, but it didn't contain particularly precious information, you may try the following fix.
The typical symptoms of a corrupted Postgres database look like this:
2013-03-05 11:29:50 GMT ERROR:  invalid page header in block 608102 of relation base/16385/16615
2013-03-05 11:29:50 GMT STATEMENT:  COPY public.history (itemid, clock, value) TO stdout;
2013-03-05 11:29:50 GMT LOG:  could not send data to client: Broken pipe
Or:
Query failed: [0] PGRES_FATAL_ERROR:ERROR:  right sibling's left-link doesn't match: block 149266 links to 70823 instead of expected 71357 in index "history_uint_1"
The actual fix is quite easy: set zero_damaged_pages = on, then run VACUUM FULL and REINDEX on every table. With this setting PostgreSQL zeroes out a page with a damaged header instead of raising an error, so any rows on that page are lost, which is why the fix only suits data you can afford to lose.
DATABASE=yourdatabase
TABLES=$(echo \\d | psql $DATABASE | grep "^ public" | awk '{print $3}')
for TABLE in $TABLES; do
  echo $TABLE
  echo "SET zero_damaged_pages = on; VACUUM FULL $TABLE; REINDEX TABLE $TABLE" | psql $DATABASE
done
On the Zabbix server or the PostgreSQL database server, create a shell script, paste the code above into it, and change the database name to the one actually in use.
vim pg_repair_index.sh
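#!/bin/bash
DATABASE=zabbix   # name of the database reporting the errors
# List the relations in the public schema, as printed by psql's \d command
TABLES=$(echo \\d | psql "$DATABASE" | grep "^ public" | awk '{print $3}')
for TABLE in $TABLES; do
    echo "$TABLE"
    # Zero out damaged pages, rewrite the table, then rebuild its indexes
    echo "SET zero_damaged_pages = on; VACUUM FULL $TABLE; REINDEX TABLE $TABLE" | psql "$DATABASE"
done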
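Because the loop rewrites every table it finds, it may be worth previewing the statements before sending them to psql. The sketch below is an illustration only: print_repair_sql is a hypothetical helper that reads table names on standard input and prints the SQL the loop would run, without touching the database.

```shell
#!/bin/bash
# print_repair_sql (hypothetical helper): read table names, one per line,
# and print the repair statements for each without executing them.
print_repair_sql() {
    while IFS= read -r table; do
        [ -n "$table" ] || continue   # skip blank lines
        printf 'SET zero_damaged_pages = on; VACUUM FULL %s; REINDEX TABLE %s;\n' \
            "$table" "$table"
    done
}
```

For example, echo \\d | psql zabbix | grep "^ public" | awk '{print $3}' | print_repair_sql lists what would be executed, assuming the database is named zabbix as in the script above.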
Give the script execute permission:
chmod +x pg_repair_index.sh
Run the script with ./pg_repair_index.sh. It goes through the tables one by one and rebuilds the indexes of each.
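Once the script has finished and Zabbix has been running for a while, one way to check whether the errors have stopped is to count the PGRES_FATAL_ERROR lines logged after the repair. This is a rough sketch under assumptions: count_fatal_since is a hypothetical helper, and it relies on log lines starting with PID:YYYYMMDD:HHMMSS.mmm as in the excerpt earlier in this article.

```shell
#!/bin/bash
# count_fatal_since (hypothetical helper): count PGRES_FATAL_ERROR lines in a
# Zabbix log whose timestamp is at or after the given YYYYMMDD:HHMMSS value.
count_fatal_since() {
    awk -F: -v since="$2" '
        /PGRES_FATAL_ERROR/ {
            # Field 2 is the date, field 3 starts with the time HHMMSS
            stamp = $2 ":" substr($3, 1, 6)
            if (stamp >= since) n++
        }
        END { print n + 0 }
    ' "$1"
}
```

For instance, count_fatal_since /var/log/zabbix/zabbix_server.log 20220906:100000 (log path assumed) prints the number of such errors logged from 10:00:00 on 6 September 2022 onward; a result of 0 suggests the queries are succeeding again.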