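1. Installing Monit
On Debian, install it directly with: apt-get install monit
Download the latest version from the official site: http://mmonit.com/download/
To build and install from source:
Install the build dependency first: yum install pam-devel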
tar zxvf monit-5.7.tar.gz
cd monit-5.7
./configure --prefix=/usr/local/monit --sysconfdir=/usr/local/monit/etc
make
make install
mkdir -p /usr/local/monit/etc
cp monitrc /usr/local/monit/etc
chmod 600 /usr/local/monit/etc/monitrc
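2. Editing the configuration file
On CentOS, when installed from the RPM package, the configuration file is /etc/monit.conf;
on Debian, when installed with apt-get, it is /etc/monit/monitrc;
for a source build, it is in the etc directory under the installation prefix (with the steps above, /usr/local/monit/etc/monitrc).
Below is a simple example. The shipped configuration file already contains many examples that can be used as a reference.
set daemon 120 # poll interval in seconds
set logfile /var/log/monit.log # Monit log file
set httpd port 2812 and # Monit's built-in HTTP server for viewing the monitored services
use address 192.168.10.197 # address the HTTP server listens on; comment out to listen on all local IPs
allow 192.168.10.0/24 # allow access from the local network
allow admin:monit # log in with user name admin and password monit
set mailserver smtp.12320.tv port 25 USERNAME "zsxlmonitor" PASSWORD "123456" # mail server and account used to send alerts
# Define the format of alert mails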
set mail-format {
from: zsxlmonitor@12320.tv
subject: [From Monit]$SERVICE $EVENT at $DATE
message: Date: $DATE
ServerHost: $HOST
Item:$SERVICE
Problem: $DESCRIPTION.
Action:$ACTION
}
# Alert recipients
set alert zsxlmonitor@12320.tv with reminder on 3 cycles # repeat the alert every 3 cycles
or:
set alert zsxlmonitor@12320.tv # send an alert mail for every error
##################### Monitoring examples #####################
# Check the sshd service
check process sshd with pidfile /var/run/sshd.pid
start program "/etc/init.d/sshd start"
stop program "/etc/init.d/sshd stop"
if failed port 22 protocol ssh then restart
if 5 restarts within 5 cycles then timeout
# Check the mysql service
check process mysql with pidfile /usr/local/mysql/var/vpser.pid
group database
start program = "/etc/init.d/mysql start"
stop program = "/etc/init.d/mysql stop"
if failed host 127.0.0.1 port 3306 then restart
if 5 restarts within 5 cycles then timeout
# Check the nginx service
check process nginx with pidfile /usr/local/nginx/logs/nginx.pid
start program = "/etc/init.d/nginx start"
stop program = "/etc/init.d/nginx stop"
if failed host localhost port 80 protocol http
then restart
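To run a custom script instead:
if failed host localhost port 80 protocol http then exec "/usr/bin/restart.sh"
The pidfile path and the start/stop script paths in the configuration must be absolute, and their arguments must be correct; otherwise Monit cannot check or start the service properly.
If the HTTP server is enabled, Monit can be managed directly at http://192.168.10.197:2812.
That page shows the running status of the monitored services.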
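The global `set alert` lines in the example configuration mail every event. As a rough sketch of narrowing that, Monit accepts an event list on `set alert` and a per-service `alert` line. The address dba@example.com and the choice of events below are illustrative assumptions, not part of the original setup, and the event-list keywords should be checked against the manual for the installed version. The `check process mysql` entry is a variant of the one in the example configuration, not an additional entry, since service names must be unique.

```monit
# Global recipient: mail only on timeout and "process does not exist" events
set alert zsxlmonitor@12320.tv only on { timeout, nonexist }

# Per-service recipient (hypothetical address): receives alerts for this service only
check process mysql with pidfile /usr/local/mysql/var/vpser.pid
    alert dba@example.com
    start program = "/etc/init.d/mysql start"
    stop program = "/etc/init.d/mysql stop"
    if failed host 127.0.0.1 port 3306 then restart
```

After editing the control file, `monit -t` runs a syntax check on it before the daemon is reloaded.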
3. Starting and stopping Monit
Start: /usr/local/monit/bin/monit
Stop: /usr/local/monit/bin/monit quit
Optional action arguments for non-daemon mode are as follows:
start all           - Start all services
start name          - Only start the named service
stop all            - Stop all services
stop name           - Only stop the named service
restart all         - Stop and start all services
restart name        - Only restart the named service
monitor all         - Enable monitoring of all services
monitor name        - Only enable monitoring of the named service
unmonitor all       - Disable monitoring of all services
unmonitor name      - Only disable monitoring of the named service
reload              - Reinitialize monit
status              - Print full status information for each service
summary             - Print short status information for each service
quit                - Kill monit daemon process
validate            - Check all services and start if not running
procmatch           - Test process matching pattern
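4. Configuration syntax
4.1 Check types
Process monitoring [a running program; a pidfile or a match pattern must be given]
CHECK PROCESS <unique name> <PIDFILE <path> | MATCHING <regex>>
File monitoring
CHECK FILE <unique name> PATH <path>
CHECK FIFO <unique name> PATH <path>
Filesystem monitoring
CHECK FILESYSTEM <unique name> PATH <path>
Directory monitoring
CHECK DIRECTORY <unique name> PATH <path>
Host monitoring
CHECK HOST <unique name> ADDRESS <host address>
System monitoring
CHECK SYSTEM <unique name>
Program monitoring
CHECK PROGRAM <unique name> PATH <executable file> [TIMEOUT <number> SECONDS]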
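The two forms that the earlier examples do not use, MATCHING and CHECK PROGRAM, can be sketched as follows. The service names, the pattern, and the script path are hypothetical placeholders chosen for illustration.

```monit
# Find a process by pattern when it writes no pidfile
# (the pattern "myworker --daemon" is an assumption)
check process myworker matching "myworker --daemon"
    if cpu > 80% for 5 cycles then alert

# Run an external script each cycle and judge it by its exit status
check program backup_check with path "/usr/local/bin/check_backup.sh" with timeout 30 seconds
    if status != 0 then alert
```

The `procmatch` command listed in section 3 can be used to test which processes a pattern would match before it goes into the control file.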
4.2 Actions [action]
ALERT        send an alert
RESTART      restart the service [using the defined start program and stop program: stop first, then start]
START        start the service [runs the defined start program]
STOP         stop the service [runs the defined stop program]
EXEC         run a script [runs the script given in quotes; the full path is required]
UNMONITOR    stop monitoring the service
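As an illustration of how these actions combine inside one check, here is a minimal sketch built on the nginx entry from section 2. The diagnostic script path is a hypothetical placeholder, and the thresholds are arbitrary.

```monit
check process nginx with pidfile /usr/local/nginx/logs/nginx.pid
    start program = "/etc/init.d/nginx start"
    stop program = "/etc/init.d/nginx stop"
    # ALERT: only notify, take no corrective action
    if cpu > 60% for 5 cycles then alert
    # RESTART: runs the stop program, then the start program
    if failed host localhost port 80 protocol http then restart
    # EXEC: run a script (hypothetical path) instead of a built-in action
    if cpu > 90% for 5 cycles then exec "/usr/local/bin/dump_nginx_state.sh"
    # Give up after repeated restarts so Monit does not cycle the service forever
    if 5 restarts within 5 cycles then timeout
```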
4.3 Resource items [RESOURCE]
--PROCESS
CPU([user|system|wait])       CPU usage breakdown
CPU                           CPU usage
TOTALCPU                      CPU usage, including child processes
SWAP                          swap usage < Byte, kB, MB, GB >
CHILDREN                      number of child processes
MEMORY                        memory usage < Byte, kB, MB, GB >
TOTALMEMORY                   memory usage, including child processes < Byte, kB, MB, GB >
LOADAVG([1min|5min|15min])    system load average
UPTIME                        uptime < "SECONDS", "MINUTES", "HOURS", or "DAYS" >
--FILE
SIZE                          size < "B","KB","MB","GB" >
PERMISSION                    permissions
UID
GID
PID
PPID
TIMESTAMP                     timestamp < "SECONDS", "MINUTES", "HOURS", or "DAYS" >
--FILESYSTEM
USAGE                         amount used
SPACE                         space usage [value]
INODE                         inode usage [value]
--HOST
host []                       host [IP address or domain name]
port []                       port [number]
type []                       transport type [TCP|UDP|TCPSSL]
protocol []                   service protocol [APACHE-STATUS DNS DWP FTP GPS HTTP IMAP CLAMAV LDAP2 LDAP3 LMTP MEMCACHE MYSQL NNTP NTP3 POP POSTFIX-POLICY RADIUS RDATE RSYNC SIP SMTP SSH TNS PGSQL]
The HTTP protocol supports in addition:
REQUEST
HOSTHEADER
CHECKSUM
The Apache-status protocol supports in addition:
logging (loglimit)
closing connections (closelimit)
performing DNS lookups (dnslimit)
in keepalive with a client (keepalivelimit)
replying to a client (replylimit)
receiving a request (requestlimit)
initialising (startlimit)
waiting for incoming connections (waitlimit)
gracefully closing down (gracefullimit)
performing cleanup procedures (cleanuplimit)
--PROGRAM
status                        exit status of the program
4.4 Tests [TEST]
FAILED [RESOURCE]      the item's value indicates a failure
CHANGED [RESOURCE]     the item has changed
EXIST [item]           the item exists
DOES NOT EXIST         the item does not exist
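4.5 Comparison operators
Any of the following forms may be used, followed by the value to compare against:
"<", ">", "!=", "=="
"gt", "lt", "eq", "ne"
"greater", "less", "equal", "notequal"
4.6 Check schedule
EVERY [number] CYCLES
every 2 cycles # check once every 2 cycles
EVERY [cron]
every "* * * * *" # fields: minute, hour, day of month, month, day of week; * matches every value, x-y is the range x to y, and "," separates individual values
every "* 8-19 * * 1-5" # Monday to Friday, hours 8 through 19, checked at the normal cycle interval
NOT EVERY [cron] is the inverse of EVERY [cron]: the check is skipped during the matching times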
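To show the operator spellings and the schedule statements in context, here is a small sketch. The `backup_check` service and its script path are hypothetical; the Apache and mysql pidfile paths are the ones used elsewhere in this article.

```monit
check process Apache with pidfile /usr/local/apache/logs/httpd.pid
    # Evaluate this service only Monday to Friday, hours 8 through 19
    every "* 8-19 * * 1-5"
    if cpu > 60% for 5 cycles then alert
    # Equivalent spellings of the same comparison:
    #   if cpu gt 60% for 5 cycles then alert
    #   if cpu is greater than 60% for 5 cycles then alert

# Hypothetical script check that runs on every second poll cycle
check program backup_check with path "/usr/local/bin/check_backup.sh"
    every 2 cycles
    if status != 0 then alert

check process mysql with pidfile /usr/local/mysql/var/vpser.pid
    # Skip checks on Sundays between 00:00 and 06:59, e.g. during a backup window
    not every "* 0-6 * * 0"
    if failed host 127.0.0.1 port 3306 then alert
```

Keeping `*` in the minute field is the safer choice here: with a 120-second poll interval, a single fixed minute could fall between two cycles and never match.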
5. Example configurations
1. System performance
# Monitor system performance under the name myhost
check system myhost
# alert if the 1-minute load average exceeds 4
if loadavg (1min) > 4 then alert
# alert if the 5-minute load average exceeds 2
if loadavg (5min) > 2 then alert
# alert if total memory usage exceeds 75%
if memory usage > 75% then alert
# alert if swap usage exceeds 25%
if swap usage > 25% then alert
# alert if CPU (user) usage exceeds 70%
if cpu usage (user) > 70% then alert
# alert if CPU (system) usage exceeds 30%
if cpu usage (system) > 30% then alert
# alert if CPU (wait) usage exceeds 20%
if cpu usage (wait) > 20% then alert
2. Disk monitoring
# Monitor the filesystem /dev/sdb1 under the name datafs
check filesystem datafs with path /dev/sdb1
# Mount and unmount the filesystem; uncomment these lines to enable the feature.
#start program = "/bin/mount /data"
#stop program = "/bin/umount /data"
# stop monitoring if the filesystem's permissions are not 660
if failed permission 660 then unmonitor
# stop monitoring if the filesystem's UID is not root
if failed uid root then unmonitor
# stop monitoring if the filesystem's GID is not disk
if failed gid disk then unmonitor
# alert if space usage exceeds 80% in 5 of the last 15 cycles
if space usage > 80% for 5 times within 15 cycles then alert
# unmount the filesystem if space usage exceeds 99%
#if space usage > 99% then stop
# alert if more than 30000 inodes are in use
if inode usage > 30000 then alert
# unmount the filesystem if inode usage exceeds 99%
if inode usage > 99% then stop
3. File monitoring
# Monitor the file /data/mydatabase.db under the name database
check file database with path /data/mydatabase.db
# alert if the file's permissions are not 700
if failed permission 700 then alert
# alert if the file's UID is not data
if failed uid data then alert
# alert if the file's GID is not data
if failed gid data then alert
# alert if the file's timestamp is older than 15 minutes
if timestamp > 15 minutes then alert
# run a script if the file is larger than 100 MB
if size > 100 MB then exec "/my/cleanup/script" as uid dba and gid dba
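4. Directory monitoring
# Monitor the directory /bin under the name bin
check directory bin with path /bin
# stop monitoring if the directory's permissions are not 755
#    if failed permission 755 then unmonitor
# stop monitoring if the directory's UID is not 0
#    if failed uid 0 then unmonitor
# stop monitoring if the directory's GID is not 0
#    if failed gid 0 then unmonitor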
5. Process monitoring
# Monitor the process whose pidfile is /usr/local/apache/logs/httpd.pid, under the name Apache
check process Apache with pidfile /usr/local/apache/logs/httpd.pid
start program = "/usr/local/apache/bin/httpd -k start"
stop program = "/usr/local/apache/bin/httpd -k stop"
# alert if the process's CPU usage stays above 60% for 5 cycles
if cpu > 60% for 5 cycles then alert
# restart if the process's CPU usage stays above 80% for 10 cycles
if cpu > 80% for 10 cycles then restart
# restart if the process's total memory usage stays above 200 MB for 5 cycles
#    if totalmem > 200.0 MB for 5 cycles then restart
# alert if the process has more than 200 child processes in 3 of the last 5 cycles
if children > 200 for 3 times within 5 cycles then alert
# restart if the process has more than 500 child processes in 5 of the last 15 cycles
if children > 500 for 5 times within 15 cycles then restart
# stop if the 5-minute load average stays above 10 for 8 cycles
#if loadavg(5min) greater than 10 for 8 cycles then stop
# restart if the HTTP check on 127.0.0.1 port 80 times out [5 s] or fails in 5 of the last 10 cycles
if failed host 127.0.0.1 port 80 protocol http for 5 times within 10 cycles then restart
# alert if http://127.0.0.1/check.php does not return content matching 'ok'
if failed url http://127.0.0.1/check.php
and content == 'ok'
then alert
# restart if the request for /somefile.html on 127.0.0.1:80 fails
if failed host 127.0.0.1 port 80 protocol http and request "/somefile.html" then restart
# send a request to the given host, check the response, and act on the result
if failed host 127.0.0.1 port 80
send "GET / HTTP/1.0\r\nHost: 127.0.0.1\r\n\r\n"
expect "HTTP/[0-9\.]{3} 200 OK"
then alert
# check the apache-status values and act on the result
if failed host 127.0.0.1 port 80 protocol apache-status
loglimit > 10% or
dnslimit > 50% or
waitlimit < 20%
then alert
#if failed port 443 type tcpssl protocol http with timeout 15 seconds then restart
# time out if the process is restarted 3 times within 5 cycles
#if 3 restarts within 5 cycles then timeout
#depends on apache_bin
#group server
6. Host monitoring
check host myserver with address 192.168.1.1
if failed icmp type echo count 3 with timeout 3 seconds then alert
if failed port 3306 protocol mysql with timeout 15 seconds then alert
if failed url http://user:password@www.foo.bar:8080/?querystring and content == 'action="j_security_check"' then alert