Setting up Telegram Alerts from the Grafana Dashboard (last article of the logger series)
We set up a Grafana dashboard as the UI to the log ingestion stack for our Python applications.
But no matter how beautiful your dashboard may be - I suspect you are not going to sit in front of it and marvel at it.
After all - the purpose of the log stack is to quickly diagnose and fix system issues. So we want to set up alerts. The alerts make us aware of issues quickly, and then we move to the dashboard, analyze our logs and zoom into the problem areas.
Now, let's continue from the previous post and set up Telegram alerts. I am going to assume you know how to Google, and by extension - know how to talk to Telegram's BotFather to set up a bot, get its API token and retrieve your chat id.
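If the chat id part trips you up: one common way is to send your bot a message and then look at what the Bot API's `getUpdates` method returns. Here is a minimal sketch of pulling the chat id out of such a response. The sample payload below is made up for illustration (the ids and text are hypothetical), and the function name is mine, not part of any library.

```python
def extract_chat_ids(get_updates_response):
    """Collect the distinct chat ids found in a parsed getUpdates response."""
    chat_ids = []
    for update in get_updates_response.get("result", []):
        # Direct and group messages sit under "message"; channel posts under "channel_post".
        message = update.get("message") or update.get("channel_post") or {}
        chat_id = message.get("chat", {}).get("id")
        if chat_id is not None and chat_id not in chat_ids:
            chat_ids.append(chat_id)
    return chat_ids


# Hypothetical sample, shaped like a getUpdates reply after messaging the bot twice.
sample = {
    "ok": True,
    "result": [
        {"update_id": 1, "message": {"chat": {"id": 123456789, "type": "private"}, "text": "hi"}},
        {"update_id": 2, "message": {"chat": {"id": 123456789, "type": "private"}, "text": "hello"}},
    ],
}

print(extract_chat_ids(sample))  # [123456789]
```

The number it prints is what goes into the chat id field later on.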
There are three pieces to Grafana alerts: alert rules, notification policies and contact points. We will make this one crisp and quick. Minimally, we just need alert rules and contact points, so we will stick to the minimum.
The first is your alert rules - which rules trigger alerts/notifications? Then there are your contact points. Given the set of alerts that are triggered - where are we sending these alerts?
Go to Grafana > Alerting > Contact points, create a Telegram contact point, and go ahead and plop your bot API token and chat id in there.
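Before trusting Grafana with the token and chat id, it can help to check them yourself. The sketch below builds a request to the Telegram Bot API `sendMessage` method using only the standard library. The function names and the default message text are my own, and the token and chat id in the usage comment are placeholders.

```python
import json
import urllib.request


def build_send_message_request(bot_token, chat_id, text):
    """Build (but do not send) a POST request for the Bot API sendMessage method."""
    url = f"https://api.telegram.org/bot{bot_token}/sendMessage"
    body = json.dumps({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_test_message(bot_token, chat_id, text="Test message before wiring up Grafana"):
    """Send the message and return Telegram's parsed JSON reply."""
    request = build_send_message_request(bot_token, chat_id, text)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


# Usage (placeholders, replace with your own values):
# send_test_message("<your-bot-token>", 123456789)
```

If the message lands in your chat, the same two values should work in the contact point.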
Now let us go to alert rules. Make the rule fire when there is at least one error in the last five minutes. To do that - we count the error lines over a 5-minute interval and sum them:
sum(count_over_time({filename="/var/log/example.log"} | json | level = `ERROR` [5m]))
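To make it concrete what that query is counting, here is a rough Python equivalent: parse each line as JSON, keep the ones whose level is ERROR, and count those inside the last five minutes. This is only an illustration. Loki works off the timestamp it stored for each line, whereas this sketch assumes a hypothetical `timestamp` field inside the JSON, and the sample lines are made up.

```python
import json
from datetime import datetime, timedelta, timezone


def count_errors(lines, now, window=timedelta(minutes=5)):
    """Count JSON log lines with level ERROR whose timestamp falls in the window ending at `now`."""
    count = 0
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # not JSON, skip it in this sketch
        if not isinstance(record, dict) or "timestamp" not in record:
            continue
        ts = datetime.fromisoformat(record["timestamp"])
        if record.get("level") == "ERROR" and now - window < ts <= now:
            count += 1
    return count


# Hypothetical log lines; the field names are assumptions.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
lines = [
    '{"timestamp": "2024-01-01T11:58:00+00:00", "level": "ERROR", "message": "db timeout"}',
    '{"timestamp": "2024-01-01T11:59:00+00:00", "level": "INFO", "message": "ok"}',
    '{"timestamp": "2024-01-01T11:50:00+00:00", "level": "ERROR", "message": "old error"}',
    "not a json line",
]

errors = count_errors(lines, now)
print(errors)      # 1
print(errors > 0)  # True, so a threshold of zero would be breached
```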
Our query already gives us a single aggregated value, so we can get rid of query B (the default reduce expression) and let the threshold (query C) take its place as query B. Set the threshold so the rule fires when the value is above zero. Now it looks something like this:
We see that it is `firing` - because there have been errors in the last five minutes. You can of course set a threshold greater than zero if you can tolerate some errors.
I create a sample folder and evaluation group, and set both the evaluation interval and the pending period to 5 minutes. Under the contact point - I select the Telegram one.
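How those two 5-minute settings interact is easier to see with a toy simulation. The rule is checked once per evaluation interval; the first breach puts it in Pending, and it only moves to Firing once it has stayed breached for the pending period. The sketch below is a simplified model of that behaviour, not Grafana's actual implementation, and the function name is mine.

```python
def simulate_states(breaches, interval_min=5, pending_min=5):
    """Return (minute, state) per evaluation, given whether the threshold was breached each time."""
    states = []
    pending_since = None
    for i, breached in enumerate(breaches):
        minute = i * interval_min
        if not breached:
            pending_since = None  # condition cleared, start over
            state = "Normal"
        else:
            if pending_since is None:
                pending_since = minute  # first breached evaluation
            state = "Firing" if minute - pending_since >= pending_min else "Pending"
        states.append((minute, state))
    return states


# Errors start showing up just before the evaluation at minute 5 and keep coming.
print(simulate_states([False, True, True, True]))
# [(0, 'Normal'), (5, 'Pending'), (10, 'Firing'), (15, 'Firing')]
```

Note the assumption baked in here: the condition has to be breached on consecutive evaluations. In this model a single error that has dropped out of the 5-minute window by the next evaluation sends the rule back to Normal. An error can also wait up to one interval before the first evaluation sees it, which is where a delay of roughly ten minutes comes from.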
For section 5 onwards - setting labels and whatnot - I completely ignore it. Who even knows what that is about? Lol. It is about formatting and making your Telegram message nice and informative. I am going to ignore that - I just use Telegram as an alert system to notify me - and when I know there are errors, I just spin up my dashboard.
I save and exit. Then I go for a cuppa coffee.
On the way to the coffee shop - roughly 10 minutes later (up to 5 minutes until the next evaluation, plus the 5-minute pending period) - you should receive a notification on Telegram to sit your ass back down at the desk and fix some issues.
Welcome to the bug free life.