qyeah98/voi

Voi Network

Voi is a layer-1 blockchain solution that starts with Algorand's technology.

Hardware Requirements 1, 2

  • CPU : 8 vCPU
  • RAM : 16 GB
  • Storage : 100 GB NVMe SSD or equivalent
  • Network :
    • Minimum : 100 Mbps
    • Recommended : 1 Gbps connection with low latency

Tip

CPU, network, and disk I/O can become bottlenecks during periods of high TPS. 3
If possible, choose a server with higher specs than the requirements above.

Install Algorand node

This section is still in preparation.
Please use the D13 guide and the VNBnode guide as references.

Check the node’s status

Run this command:

goal node status

Check that the node status is healthy:

  1. Genesis ID: must be voitest-v1.
  2. The node must be synced (caught up):
    Sync Time: shows 0.0s when the node is fully caught up.
    Also compare the Last committed block: number with the latest block shown in a Voi explorer.

----- Result -----

Last committed block: 2899408    <===== CHECK HERE
Time since last block: 1.3s
Sync Time: 0.0s    <=================== CHECK HERE
Last consensus protocol: https://github.com/algorandfoundation/specs/tree/abd3d4823c6f77349fc04c3af7b1e99fe4df699f
Next consensus protocol: https://github.com/algorandfoundation/specs/tree/abd3d4823c6f77349fc04c3af7b1e99fe4df699f
Round for next consensus protocol: 2899409
Next consensus protocol supported: true
Last Catchpoint: 
Genesis ID: voitest-v1   <============= CHECK HERE
Genesis hash: IXnoWtviVVJW5LGivNFc0Dq14V3kqaXuK2u5OQrdVZo=
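
The two checks above can also be scripted. The following is a minimal sketch and not part of the original guide: check_node_status is a hypothetical helper name, and it only assumes the goal node status output format shown in the result above.

```shell
# Hypothetical helper: reads `goal node status` output on stdin and
# verifies the two values this guide asks you to check.
check_node_status() {
  local expected_genesis="${1:-voitest-v1}"
  local status genesis sync
  status="$(cat)"
  # Take the first word after "Genesis ID:" and after "Sync Time:".
  genesis="$(printf '%s\n' "$status" | sed -n 's/^Genesis ID:[[:space:]]*//p' | awk '{print $1}')"
  sync="$(printf '%s\n' "$status" | sed -n 's/^Sync Time:[[:space:]]*//p' | awk '{print $1}')"

  if [ "$genesis" != "$expected_genesis" ]; then
    echo "FAIL: Genesis ID is '$genesis', expected '$expected_genesis'"
    return 1
  fi
  if [ "$sync" != "0.0s" ]; then
    echo "FAIL: node is still catching up (Sync Time: $sync)"
    return 1
  fi
  echo "OK: $genesis, fully synced"
}
```

Usage: goal node status | check_node_status
The block height still has to be compared against an explorer by hand, because this sketch does not query one.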

Note

If your node status is not correct, set the node up again by following the community guides step by step. 4, 5
If the problem persists, please ask in #node-runner-help on the Voi Network Discord.

Set up monitoring system: Prometheus + Grafana

Tip

Purpose
The purpose of system monitoring is to detect potential issues or anomalies in the system, so that they can be addressed proactively before they cause significant problems on the network.

Prometheus
An open-source monitoring system with a dimensional data model, flexible query language, efficient time series database and modern alerting approach.

Grafana
The open and composable observability and data visualization platform. Visualize metrics, logs, and traces from multiple sources like Prometheus and Loki.

Caution

This guide targets Ubuntu 22.04 LTS.

Install Prometheus

Run this command:

sudo apt-get install -y prometheus prometheus-node-exporter

Check that prometheus and prometheus-node-exporter have been installed:

sudo dpkg -l prometheus prometheus-node-exporter

----- Result -----

Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                     Version                     Architecture Description
+++-========================-===========================-============-==========================================
ii  prometheus               2.31.2+ds1-1ubuntu1.22.04.2 amd64        monitoring system and time series database
ii  prometheus-node-exporter 1.3.1-1ubuntu0.22.04.2      amd64        Prometheus exporter for machine metrics

Tip

The command installs both Prometheus and Prometheus Node Exporter.
The Prometheus Node Exporter exposes a wide variety of hardware- and kernel-related metrics. 6
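
For unattended setups, the dpkg listing can be checked by a script instead of by eye. This is a small sketch under the assumption that the output has the column layout shown in the result above; missing_packages is a hypothetical helper name.

```shell
# Hypothetical helper: reads `dpkg -l` output on stdin and prints every
# expected package that does not have an "ii" (installed) row.
missing_packages() {
  awk -v expected="$*" '
    BEGIN { n = split(expected, want, " ") }
    # Rows of installed packages start with "ii"; strip ":amd64"-style suffixes.
    $1 == "ii" { sub(/:.*/, "", $2); installed[$2] = 1 }
    END {
      for (i = 1; i <= n; i++)
        if (!(want[i] in installed)) print want[i]
    }'
}
```

Usage: dpkg -l prometheus prometheus-node-exporter 2>/dev/null | missing_packages prometheus prometheus-node-exporter
Empty output means both packages are installed.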

Install Grafana

Run this command:

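# Add Grafana's package signing key (apt-key is deprecated on newer releases but still available on Ubuntu 22.04)
wget -q -O - https://packages.grafana.com/gpg.key | sudo apt-key add -
# Register the Grafana OSS APT repository
echo "deb https://packages.grafana.com/oss/deb stable main" | sudo tee /etc/apt/sources.list.d/grafana.list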

sudo apt-get update && sudo apt-get install -y grafana

Check if grafana has been installed:

sudo dpkg -l grafana

----- Result -----

Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name           Version      Architecture Description
+++-==============-============-============-=================================
ii  grafana        10.2.3       amd64        Grafana

Start Prometheus and Grafana

Run this command:

sudo systemctl enable grafana-server.service prometheus.service prometheus-node-exporter.service
sudo systemctl start grafana-server.service prometheus.service prometheus-node-exporter.service

  1. Check that the Grafana status is active (running):

sudo systemctl status grafana-server.service --no-pager -l

----- Result -----

● grafana-server.service - Grafana instance
     Loaded: loaded (/lib/systemd/system/grafana-server.service; enabled; vendor preset: enabled)
     Active: active (running) since Tue 2023-12-26 04:15:18 CET; 1 day 7h ago       <===== CHECK HERE
       Docs: http://docs.grafana.org
   Main PID: 69210 (grafana)
      Tasks: 21 (limit: 76910)
     Memory: 50.5M
        CPU: 2min 1.834s
     CGroup: /system.slice/grafana-server.service
             └─69210 /usr/share/grafana/bin/grafana server --config=/etc/grafana/grafana.ini --pidfile=/run/grafana/grafana-server.pid --packaging=deb cfg:default.paths.logs=/var/l…

Dec 27 11:05:26 voi-node-testnet grafana[69210]: logger=grafana.update.checker t=2023-12-27T11:05:26.769563461+01:00 level=info msg="Update check succeeded" duration=7.663637ms
Dec 27 11:05:26 voi-node-testnet grafana[69210]: logger=cleanup t=2023-12-27T11:05:26.808206779+01:00 level=info msg="Completed cleanup jobs" duration=64.861523ms
Dec 27 11:05:26 voi-node-testnet grafana[69210]: logger=plugins.update.checker t=2023-12-27T11:05:26.874721247+01:00 level=info msg="Update check succeeded" duration=70.901991ms
Dec 27 11:15:26 voi-node-testnet grafana[69210]: logger=grafana.update.checker t=2023-12-27T11:15:26.769039867+01:00 level=info msg="Update check succeeded" duration=6.935579ms
Dec 27 11:15:26 voi-node-testnet grafana[69210]: logger=cleanup t=2023-12-27T11:15:26.810085388+01:00 level=info msg="Completed cleanup jobs" duration=66.294104ms
Dec 27 11:15:26 voi-node-testnet grafana[69210]: logger=plugins.update.checker t=2023-12-27T11:15:26.875407205+01:00 level=info msg="Update check succeeded" duration=72.228362ms
Dec 27 11:16:42 voi-node-testnet grafana[69210]: logger=infra.usagestats t=2023-12-27T11:16:42.755259475+01:00 level=info msg="Usage stats are ready to report"
Dec 27 11:25:26 voi-node-testnet grafana[69210]: logger=grafana.update.checker t=2023-12-27T11:25:26.768865417+01:00 level=info msg="Update check succeeded" duration=6.73339ms
Dec 27 11:25:26 voi-node-testnet grafana[69210]: logger=cleanup t=2023-12-27T11:25:26.809052346+01:00 level=info msg="Completed cleanup jobs" duration=65.2849ms
Dec 27 11:25:26 voi-node-testnet grafana[69210]: logger=plugins.update.checker t=2023-12-27T11:25:26.860347684+01:00 level=info msg="Update check succeeded" duration=56.999845ms

  2. Check that the Prometheus status is active (running):

sudo systemctl status prometheus.service --no-pager -l

----- Result -----

● prometheus.service - Monitoring system and time series database
     Loaded: loaded (/lib/systemd/system/prometheus.service; enabled; vendor preset: enabled)
     Active: active (running) since Tue 2023-12-26 04:15:19 CET; 1 day 7h ago       <===== CHECK HERE
       Docs: https://prometheus.io/docs/introduction/overview/
             man:prometheus(1)
   Main PID: 69220 (prometheus)
      Tasks: 18 (limit: 76910)
     Memory: 130.9M
        CPU: 4min 19.124s
     CGroup: /system.slice/prometheus.service
             └─69220 /usr/bin/prometheus

Dec 27 06:00:08 voi-node-testnet prometheus[69220]: ts=2023-12-27T05:00:08.716Z caller=compact.go:459 level=info component=tsdb msg="compact blocks" count=3 mint=1703613603258 maxt=1703635200000 ulid=01HJMT9WENTPNRFKS8V7AV04KN sources="[01HJKYTZ67785VT052NA9T7P1T 01HJM5PHHZPQ3NF1QEQFQ5PGAF 01HJMCJ8T01V9EDKPT5QVACB1S]" duration=183.420025ms
Dec 27 06:00:08 voi-node-testnet prometheus[69220]: ts=2023-12-27T05:00:08.725Z caller=db.go:1293 level=info component=tsdb msg="Deleting obsolete block" block=01HJKYTZ67785VT052NA9T7P1T
Dec 27 06:00:08 voi-node-testnet prometheus[69220]: ts=2023-12-27T05:00:08.731Z caller=db.go:1293 level=info component=tsdb msg="Deleting obsolete block" block=01HJM5PHHZPQ3NF1QEQFQ5PGAF
Dec 27 06:00:08 voi-node-testnet prometheus[69220]: ts=2023-12-27T05:00:08.740Z caller=db.go:1293 level=info component=tsdb msg="Deleting obsolete block" block=01HJMCJ8T01V9EDKPT5QVACB1S
Dec 27 08:00:03 voi-node-testnet prometheus[69220]: ts=2023-12-27T07:00:03.400Z caller=compact.go:518 level=info component=tsdb msg="write block" mint=1703649603257 maxt=1703656800000 ulid=01HJN15EHZ5AMEWV02TT0K5SM0 duration=136.587432ms
Dec 27 08:00:03 voi-node-testnet prometheus[69220]: ts=2023-12-27T07:00:03.402Z caller=head.go:805 level=info component=tsdb msg="Head GC completed" duration=1.514546ms
Dec 27 10:00:08 voi-node-testnet prometheus[69220]: ts=2023-12-27T09:00:08.421Z caller=compact.go:518 level=info component=tsdb msg="write block" mint=1703656803257 maxt=1703664000000 ulid=01HJN81AP7G92NC8F4CHQP4X60 duration=157.185812ms
Dec 27 10:00:08 voi-node-testnet prometheus[69220]: ts=2023-12-27T09:00:08.423Z caller=head.go:805 level=info component=tsdb msg="Head GC completed" duration=1.538992ms
Dec 27 10:00:08 voi-node-testnet prometheus[69220]: ts=2023-12-27T09:00:08.452Z caller=checkpoint.go:97 level=info component=tsdb msg="Creating checkpoint" from_segment=12 to_segment=13 mint=1703664000000
Dec 27 10:00:08 voi-node-testnet prometheus[69220]: ts=2023-12-27T09:00:08.541Z caller=head.go:974 level=info component=tsdb msg="WAL checkpoint complete" first=12 last=13 duration=88.867604ms

  3. Check that the Prometheus Node Exporter status is active (running):

sudo systemctl status prometheus-node-exporter.service --no-pager -l

----- Result -----

● prometheus-node-exporter.service - Prometheus exporter for machine metrics
     Loaded: loaded (/lib/systemd/system/prometheus-node-exporter.service; enabled; vendor preset: enabled)
     Active: active (running) since Tue 2023-12-26 04:15:18 CET; 1 day 7h ago       <===== CHECK HERE
       Docs: https://github.com/prometheus/node_exporter
   Main PID: 69211 (prometheus-node)
      Tasks: 29 (limit: 76910)
     Memory: 13.7M
        CPU: 11min 42.505s
     CGroup: /system.slice/prometheus-node-exporter.service
             └─69211 /usr/bin/prometheus-node-exporter

Dec 26 04:15:18 voi-node-testnet prometheus-node-exporter[69211]: ts=2023-12-26T03:15:18.990Z caller=node_exporter.go:115 level=info collector=thermal_zone
Dec 26 04:15:18 voi-node-testnet prometheus-node-exporter[69211]: ts=2023-12-26T03:15:18.990Z caller=node_exporter.go:115 level=info collector=time
Dec 26 04:15:18 voi-node-testnet prometheus-node-exporter[69211]: ts=2023-12-26T03:15:18.990Z caller=node_exporter.go:115 level=info collector=timex
Dec 26 04:15:18 voi-node-testnet prometheus-node-exporter[69211]: ts=2023-12-26T03:15:18.990Z caller=node_exporter.go:115 level=info collector=udp_queues
Dec 26 04:15:18 voi-node-testnet prometheus-node-exporter[69211]: ts=2023-12-26T03:15:18.990Z caller=node_exporter.go:115 level=info collector=uname
Dec 26 04:15:18 voi-node-testnet prometheus-node-exporter[69211]: ts=2023-12-26T03:15:18.990Z caller=node_exporter.go:115 level=info collector=vmstat
Dec 26 04:15:18 voi-node-testnet prometheus-node-exporter[69211]: ts=2023-12-26T03:15:18.990Z caller=node_exporter.go:115 level=info collector=xfs
Dec 26 04:15:18 voi-node-testnet prometheus-node-exporter[69211]: ts=2023-12-26T03:15:18.990Z caller=node_exporter.go:115 level=info collector=zfs
Dec 26 04:15:18 voi-node-testnet prometheus-node-exporter[69211]: ts=2023-12-26T03:15:18.990Z caller=node_exporter.go:199 level=info msg="Listening on" address=:9100
Dec 26 04:15:18 voi-node-testnet prometheus-node-exporter[69211]: ts=2023-12-26T03:15:18.990Z caller=tls_config.go:195 level=info msg="TLS is disabled." http2=false
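
The three status checks above can be condensed into one pass. The sketch below is an optional convenience rather than part of the original procedure; check_monitoring_services is a hypothetical helper name, and it relies only on systemctl is-active.

```shell
# Hypothetical helper: reports each monitoring service and returns
# non-zero if any of them is not active.
check_monitoring_services() {
  local failed=0 unit
  for unit in grafana-server.service prometheus.service prometheus-node-exporter.service; do
    # `systemctl is-active --quiet` exits 0 only when the unit is active.
    if systemctl is-active --quiet "$unit"; then
      echo "OK   $unit"
    else
      echo "DOWN $unit"
      failed=1
    fi
  done
  return "$failed"
}
```

Usage: check_monitoring_services
For any unit reported as DOWN, the full systemctl status output shown above is still the place to look for the reason.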

Add Voi metrics target to Prometheus config file

Run this command:

sudo sh -c "cat << EOT >> /etc/prometheus/prometheus.yml

  - job_name: voi
    static_configs:
      - targets: ['localhost:8080']
EOT"

Tip

Port 8080 is the node's REST API port.
You can query account, asset, and block information through the REST API. 7

Command to check an account:
curl 127.0.0.1:8080/v2/accounts/RWNOVLS5ZJM5GHM5WXIRMHG7NHXCYWLWYB5E4DTPILHQ7IIVNM5CVR4PVE -H "X-Algo-API-Token: $(cat /var/lib/algorand/algod.token)" | jq

Check Prometheus config file:

sudo cat /etc/prometheus/prometheus.yml

----- Result -----

# Sample config for Prometheus.

global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
      monitor: 'example'

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets: ['localhost:9093']

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s
    scrape_timeout: 5s

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ['localhost:9090']

  - job_name: node
    # If prometheus-node-exporter is installed, grab stats about the local
    # machine by default.
    static_configs:
      - targets: ['localhost:9100']

  - job_name: voi       # <======================== CHECK HERE
    static_configs:     # <======================== CHECK HERE
      - targets: ['localhost:8080']      # <======= CHECK HERE
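
Because the append command in this section adds the job every time it runs, running it twice leaves a duplicate voi job in the file. A possible guard is sketched below; add_voi_job is a hypothetical helper name, and the job definition is copied from this guide.

```shell
# Hypothetical helper: appends the voi scrape job to a Prometheus config
# only if it is not already there, so re-running the step is harmless.
add_voi_job() {
  local config="${1:-/etc/prometheus/prometheus.yml}"
  if grep -Eq "^[[:space:]]*-[[:space:]]*job_name:[[:space:]]*'?voi'?[[:space:]]*(#.*)?$" "$config"; then
    echo "voi job already present in $config"
    return 0
  fi
  # The heredoc body keeps the YAML indentation used under scrape_configs.
  cat >> "$config" << 'EOT'

  - job_name: voi
    static_configs:
      - targets: ['localhost:8080']
EOT
  echo "voi job added to $config"
}
```

Usage (from a root shell, since the file is owned by root): add_voi_job
Afterwards, promtool check config /etc/prometheus/prometheus.yml can validate the file before Prometheus is reloaded; promtool should be available alongside the prometheus package.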

Reload Prometheus

Run this command:

sudo systemctl reload prometheus.service
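
After the reload, it is worth confirming that Prometheus actually scrapes the new target. The sketch below uses the Prometheus HTTP API endpoint /api/v1/targets and jq; target_health is a hypothetical helper name.

```shell
# Hypothetical helper: reads the JSON returned by Prometheus's
# /api/v1/targets endpoint on stdin and prints "<job> <health>" per target.
target_health() {
  jq -r '.data.activeTargets[] | "\(.labels.job) \(.health)"'
}
```

Usage: curl -s http://localhost:9090/api/v1/targets | target_health
A line reading "voi up" means the node's metrics are being collected. A health of "unknown" usually just means the first scrape has not happened yet.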

Restart Prometheus and Grafana

Run this command:

sudo systemctl restart grafana-server.service prometheus.service prometheus-node-exporter.service

  1. Check that the Grafana status is active (running):

sudo systemctl status grafana-server.service --no-pager -l

----- Result -----

● grafana-server.service - Grafana instance
     Loaded: loaded (/lib/systemd/system/grafana-server.service; enabled; vendor preset: enabled)
     Active: active (running) since Tue 2023-12-26 04:15:18 CET; 1 day 7h ago       <===== CHECK HERE
       Docs: http://docs.grafana.org
   Main PID: 69210 (grafana)
      Tasks: 21 (limit: 76910)
     Memory: 50.5M
        CPU: 2min 1.834s
     CGroup: /system.slice/grafana-server.service
             └─69210 /usr/share/grafana/bin/grafana server --config=/etc/grafana/grafana.ini --pidfile=/run/grafana/grafana-server.pid --packaging=deb cfg:default.paths.logs=/var/l…

Dec 27 11:05:26 voi-node-testnet grafana[69210]: logger=grafana.update.checker t=2023-12-27T11:05:26.769563461+01:00 level=info msg="Update check succeeded" duration=7.663637ms
Dec 27 11:05:26 voi-node-testnet grafana[69210]: logger=cleanup t=2023-12-27T11:05:26.808206779+01:00 level=info msg="Completed cleanup jobs" duration=64.861523ms
Dec 27 11:05:26 voi-node-testnet grafana[69210]: logger=plugins.update.checker t=2023-12-27T11:05:26.874721247+01:00 level=info msg="Update check succeeded" duration=70.901991ms
Dec 27 11:15:26 voi-node-testnet grafana[69210]: logger=grafana.update.checker t=2023-12-27T11:15:26.769039867+01:00 level=info msg="Update check succeeded" duration=6.935579ms
Dec 27 11:15:26 voi-node-testnet grafana[69210]: logger=cleanup t=2023-12-27T11:15:26.810085388+01:00 level=info msg="Completed cleanup jobs" duration=66.294104ms
Dec 27 11:15:26 voi-node-testnet grafana[69210]: logger=plugins.update.checker t=2023-12-27T11:15:26.875407205+01:00 level=info msg="Update check succeeded" duration=72.228362ms
Dec 27 11:16:42 voi-node-testnet grafana[69210]: logger=infra.usagestats t=2023-12-27T11:16:42.755259475+01:00 level=info msg="Usage stats are ready to report"
Dec 27 11:25:26 voi-node-testnet grafana[69210]: logger=grafana.update.checker t=2023-12-27T11:25:26.768865417+01:00 level=info msg="Update check succeeded" duration=6.73339ms
Dec 27 11:25:26 voi-node-testnet grafana[69210]: logger=cleanup t=2023-12-27T11:25:26.809052346+01:00 level=info msg="Completed cleanup jobs" duration=65.2849ms
Dec 27 11:25:26 voi-node-testnet grafana[69210]: logger=plugins.update.checker t=2023-12-27T11:25:26.860347684+01:00 level=info msg="Update check succeeded" duration=56.999845ms

  2. Check that the Prometheus status is active (running):

sudo systemctl status prometheus.service --no-pager -l

----- Result -----

● prometheus.service - Monitoring system and time series database
     Loaded: loaded (/lib/systemd/system/prometheus.service; enabled; vendor preset: enabled)
     Active: active (running) since Tue 2023-12-26 04:15:19 CET; 1 day 7h ago       <===== CHECK HERE
       Docs: https://prometheus.io/docs/introduction/overview/
             man:prometheus(1)
   Main PID: 69220 (prometheus)
      Tasks: 18 (limit: 76910)
     Memory: 130.9M
        CPU: 4min 19.124s
     CGroup: /system.slice/prometheus.service
             └─69220 /usr/bin/prometheus

Dec 27 06:00:08 voi-node-testnet prometheus[69220]: ts=2023-12-27T05:00:08.716Z caller=compact.go:459 level=info component=tsdb msg="compact blocks" count=3 mint=1703613603258 maxt=1703635200000 ulid=01HJMT9WENTPNRFKS8V7AV04KN sources="[01HJKYTZ67785VT052NA9T7P1T 01HJM5PHHZPQ3NF1QEQFQ5PGAF 01HJMCJ8T01V9EDKPT5QVACB1S]" duration=183.420025ms
Dec 27 06:00:08 voi-node-testnet prometheus[69220]: ts=2023-12-27T05:00:08.725Z caller=db.go:1293 level=info component=tsdb msg="Deleting obsolete block" block=01HJKYTZ67785VT052NA9T7P1T
Dec 27 06:00:08 voi-node-testnet prometheus[69220]: ts=2023-12-27T05:00:08.731Z caller=db.go:1293 level=info component=tsdb msg="Deleting obsolete block" block=01HJM5PHHZPQ3NF1QEQFQ5PGAF
Dec 27 06:00:08 voi-node-testnet prometheus[69220]: ts=2023-12-27T05:00:08.740Z caller=db.go:1293 level=info component=tsdb msg="Deleting obsolete block" block=01HJMCJ8T01V9EDKPT5QVACB1S
Dec 27 08:00:03 voi-node-testnet prometheus[69220]: ts=2023-12-27T07:00:03.400Z caller=compact.go:518 level=info component=tsdb msg="write block" mint=1703649603257 maxt=1703656800000 ulid=01HJN15EHZ5AMEWV02TT0K5SM0 duration=136.587432ms
Dec 27 08:00:03 voi-node-testnet prometheus[69220]: ts=2023-12-27T07:00:03.402Z caller=head.go:805 level=info component=tsdb msg="Head GC completed" duration=1.514546ms
Dec 27 10:00:08 voi-node-testnet prometheus[69220]: ts=2023-12-27T09:00:08.421Z caller=compact.go:518 level=info component=tsdb msg="write block" mint=1703656803257 maxt=1703664000000 ulid=01HJN81AP7G92NC8F4CHQP4X60 duration=157.185812ms
Dec 27 10:00:08 voi-node-testnet prometheus[69220]: ts=2023-12-27T09:00:08.423Z caller=head.go:805 level=info component=tsdb msg="Head GC completed" duration=1.538992ms
Dec 27 10:00:08 voi-node-testnet prometheus[69220]: ts=2023-12-27T09:00:08.452Z caller=checkpoint.go:97 level=info component=tsdb msg="Creating checkpoint" from_segment=12 to_segment=13 mint=1703664000000
Dec 27 10:00:08 voi-node-testnet prometheus[69220]: ts=2023-12-27T09:00:08.541Z caller=head.go:974 level=info component=tsdb msg="WAL checkpoint complete" first=12 last=13 duration=88.867604ms

  3. Check that the Prometheus Node Exporter status is active (running):

sudo systemctl status prometheus-node-exporter.service --no-pager -l

----- Result -----

● prometheus-node-exporter.service - Prometheus exporter for machine metrics
     Loaded: loaded (/lib/systemd/system/prometheus-node-exporter.service; enabled; vendor preset: enabled)
     Active: active (running) since Tue 2023-12-26 04:15:18 CET; 1 day 7h ago       <===== CHECK HERE
       Docs: https://github.com/prometheus/node_exporter
   Main PID: 69211 (prometheus-node)
      Tasks: 29 (limit: 76910)
     Memory: 13.7M
        CPU: 11min 42.505s
     CGroup: /system.slice/prometheus-node-exporter.service
             └─69211 /usr/bin/prometheus-node-exporter

Dec 26 04:15:18 voi-node-testnet prometheus-node-exporter[69211]: ts=2023-12-26T03:15:18.990Z caller=node_exporter.go:115 level=info collector=thermal_zone
Dec 26 04:15:18 voi-node-testnet prometheus-node-exporter[69211]: ts=2023-12-26T03:15:18.990Z caller=node_exporter.go:115 level=info collector=time
Dec 26 04:15:18 voi-node-testnet prometheus-node-exporter[69211]: ts=2023-12-26T03:15:18.990Z caller=node_exporter.go:115 level=info collector=timex
Dec 26 04:15:18 voi-node-testnet prometheus-node-exporter[69211]: ts=2023-12-26T03:15:18.990Z caller=node_exporter.go:115 level=info collector=udp_queues
Dec 26 04:15:18 voi-node-testnet prometheus-node-exporter[69211]: ts=2023-12-26T03:15:18.990Z caller=node_exporter.go:115 level=info collector=uname
Dec 26 04:15:18 voi-node-testnet prometheus-node-exporter[69211]: ts=2023-12-26T03:15:18.990Z caller=node_exporter.go:115 level=info collector=vmstat
Dec 26 04:15:18 voi-node-testnet prometheus-node-exporter[69211]: ts=2023-12-26T03:15:18.990Z caller=node_exporter.go:115 level=info collector=xfs
Dec 26 04:15:18 voi-node-testnet prometheus-node-exporter[69211]: ts=2023-12-26T03:15:18.990Z caller=node_exporter.go:115 level=info collector=zfs
Dec 26 04:15:18 voi-node-testnet prometheus-node-exporter[69211]: ts=2023-12-26T03:15:18.990Z caller=node_exporter.go:199 level=info msg="Listening on" address=:9100
Dec 26 04:15:18 voi-node-testnet prometheus-node-exporter[69211]: ts=2023-12-26T03:15:18.990Z caller=tls_config.go:195 level=info msg="TLS is disabled." http2=false

Set up Grafana Dashboard

Warning

If ufw is enabled on your server, you need to open the Grafana port (default: 3000).
sudo ufw allow 3000/tcp
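
Opening port 3000 to every address exposes the Grafana login page to the internet. As a possible alternative, not part of the original guide, the rule can be limited to a single client address. The sketch below only prints the ufw command for review instead of running it; grafana_ufw_rule is a hypothetical helper name.

```shell
# Hypothetical helper: prints a ufw rule that allows only one client
# address to reach the Grafana port.
grafana_ufw_rule() {
  local client_ip="$1" port="${2:-3000}"
  # Loose IPv4 shape check (four dot-separated numeric groups), not a full validator.
  if ! printf '%s\n' "$client_ip" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'; then
    echo "usage: grafana_ufw_rule <client-ipv4> [port]" >&2
    return 1
  fi
  echo "sudo ufw allow from $client_ip to any port $port proto tcp"
}
```

Usage: grafana_ufw_rule 203.0.113.5
The address 203.0.113.5 is a documentation placeholder; replace it with the public IP of the machine you browse from.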

Important

Run the following steps from your client machine, NOT the server.

Access your Grafana

Launch a browser and go to http://<YOUR-SERVER-GLOBAL-IP>:3000/

Tip

Run this command on the server to check its global IP address:
curl inet-ip.info

grafana-01
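
If the login page does not load, Grafana's /api/health endpoint gives a quick way to tell a firewall problem from a Grafana problem. A minimal sketch, with grafana_health_ok as a hypothetical helper name:

```shell
# Hypothetical helper: reads the JSON body of Grafana's /api/health endpoint
# on stdin and succeeds only if it reports a healthy database.
grafana_health_ok() {
  grep -Eq '"database"[[:space:]]*:[[:space:]]*"ok"'
}
```

Usage: curl -s http://<YOUR-SERVER-GLOBAL-IP>:3000/api/health | grafana_health_ok && echo "Grafana is reachable"
If the same check succeeds on the server against localhost:3000 but fails from the client, the port is most likely blocked in between.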

Login to Grafana

Tip

The default username / password is admin / admin.

grafana-02

Go to Connections -> Data Sources

grafana-04

Click Add new data source and select Prometheus

grafana-05 grafana-06

Enter http://localhost:9090 into Prometheus server URL

grafana-07

Scroll to the bottom and click Save & test

grafana-08
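
The same data source can also be declared in a file through Grafana's provisioning mechanism, which helps when a node is rebuilt. This is an optional sketch and not what the screenshots show: the helper name write_prometheus_datasource and the file name prometheus.yaml are assumptions, while the directory is Grafana's default provisioning path on Debian/Ubuntu packages.

```shell
# Hypothetical helper: writes a Grafana data source provisioning file that
# declares the same Prometheus data source as the UI steps above.
write_prometheus_datasource() {
  local dir="${1:-/etc/grafana/provisioning/datasources}"
  cat > "$dir/prometheus.yaml" << 'EOT'
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://localhost:9090
    isDefault: true
EOT
}
```

Usage (from a root shell): write_prometheus_datasource, followed by sudo systemctl restart grafana-server.service so that Grafana loads the file.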

Go to Dashboards to create a dashboard

grafana-10 grafana-11

Click New > Import

grafana-12

Click Upload JSON file

Please download the Voi Grafana JSON file and upload it.

grafana-13

Select the Prometheus data source

grafana-15 grafana-16

Complete

grafana-17

(Optional) Set up notification system: Grafana + PagerDuty

Tip

PagerDuty
PagerDuty can automatically group related alerts into a single incident to minimize noise while centralizing relevant context. Incident notifications are automatically sent using any preferred combination of phone calls, SMS, push notifications and emails.

Caution

The Free plan does not support international phone / SMS notifications.
To receive notifications by phone call or SMS, you have to subscribe to the Professional plan ($21 per user/month). 8
This guide does not cover how to sign up for PagerDuty.
If you do not have an account, please sign up in advance.

You do not need a PagerDuty account to receive alerts in Slack / Discord / Telegram.
In that case, skip the "Generate PagerDuty API Key" section.

Generate PagerDuty API Key

Go to Service Directory

pagerduty-01

Click +New Service

pagerduty-02

Input Service Name

pagerduty-03

Select Escalation Policy

Tip

To assign an existing escalation policy, click "Select an existing Escalation Policy".

pagerduty-04

Reduce Noise

Tip

Not every alert should be an incident.
PagerDuty’s noise reduction capabilities leverage data science and machine learning to cut down on system noise and reduce alert fatigue.
This means fewer incidents and interruptions for on-call responders, richer context around the incidents that do trigger, and lower resolution times.

pagerduty-05 pagerduty-06

Select integration service

pagerduty-07 pagerduty-08

Copy Integration Key

pagerduty-09
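
Before wiring the key into Grafana, it can be tested on its own. The sketch below assumes the integration selected above is an Events API v2 integration, which the screenshots alone do not confirm; pagerduty_test_event is a hypothetical helper name, and the summary and source values are placeholders.

```shell
# Hypothetical helper: prints an Events API v2 "trigger" payload for the
# given integration key. It does not send anything by itself.
pagerduty_test_event() {
  local key="$1"
  printf '{"routing_key":"%s","event_action":"trigger","payload":{"summary":"Test alert from Voi node","source":"voi-node","severity":"info"}}\n' "$key"
}
```

Usage: pagerduty_test_event "<YOUR-INTEGRATION-KEY>" | curl -s -X POST -H 'Content-Type: application/json' -d @- https://events.pagerduty.com/v2/enqueue
Sending the payload opens a real incident on the service, so resolve it in PagerDuty afterwards.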

Integrate Grafana with PagerDuty

Access your Grafana

Launch a browser and go to http://<YOUR-SERVER-GLOBAL-IP>:3000/

Go to Alerting > Contact points

pagerduty-10 pagerduty-11

Add Contact Point

pagerduty-12

Tip

When you click "Test", you will receive a notification from PagerDuty.
To change how you are contacted, edit your notification rules in PagerDuty. 9

To receive alerts in Slack / Discord / Telegram instead, select Slack / Discord / Telegram as the integration.

pagerduty-13

Change Grafana Notification policies

pagerduty-14 pagerduty-15 pagerduty-16

Go to Voi Node Dashboard

pagerduty-17 pagerduty-18

Creating Alert Rule

Tip

Q. What is a Grafana alert rule?
A. An alert rule consists of one or more queries and expressions, a condition, and the duration over which the condition needs to be met to start firing.
While queries and expressions select the data set to evaluate, a condition sets the threshold that an alert must meet or exceed to create an alert.

Important

Choosing which metrics to watch is the most important part of monitoring your node.
This guide monitors the Total Peers metric and sends a notification when the peer count is 0.
I do not have an established best practice yet, so let's discuss it on the Voi Network Discord.

pagerduty-19 pagerduty-20 pagerduty-21 pagerduty-22 pagerduty-23 pagerduty-24 pagerduty-25 pagerduty-26 pagerduty-27 pagerduty-28
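
To make the roles of the condition and the duration in this rule concrete, here is a simplified model of how such a rule moves between states. It is an illustration only, not Grafana's implementation: alert_state is a hypothetical name, the condition is fixed to "peer count is 0" as in this guide, and each argument after the first two stands for one evaluation of the rule.

```shell
# Simplified model of an alert rule, not Grafana's implementation.
# Usage: alert_state <interval_seconds> <for_seconds> <peer_count>...
# Each peer_count is the result of one evaluation, oldest first.
alert_state() {
  local interval="$1" pending_for="$2" breaching=0 peers
  shift 2
  for peers in "$@"; do
    if [ "$peers" -eq 0 ]; then
      breaching=$((breaching + 1))   # condition met on this evaluation
    else
      breaching=0                    # condition cleared, start over
    fi
  done
  if [ "$breaching" -eq 0 ]; then
    echo "Normal"
  elif [ $(( (breaching - 1) * interval )) -ge "$pending_for" ]; then
    echo "Firing"    # condition has held for at least the configured duration
  else
    echo "Pending"   # condition is met but has not held long enough yet
  fi
}
```

Usage: alert_state 60 300 8 0 0 prints Pending, because the peer count has been 0 for only one minute of the required five. A short duration notifies faster, while a longer one filters out brief peer drops.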

Complete

Tip

You can find the alerting status on the dashboard.

pagerduty-29

Important

To keep the network healthy, it is important to detect anomalies in node metrics as soon as possible.
Setting up monitoring and notifications makes the network more robust.
I believe this is the responsibility of node operators.

Footnotes

  1. Voi Node Running Guide

  2. Algorand Developer Portal

  3. Discord Voi Network chris.voi

  4. D13 Set Up Voi Participation Node on Ubuntu 22.04

  5. VNBnode Guide

  6. Prometheus: MONITORING LINUX HOST METRICS WITH THE NODE EXPORTER

  7. Algorand REST API

  8. PagerDuty Plan and pricing

  9. PagerDuty User Profile
