Commit 924033a

committed
++Day-3 && --Typo in Day-2
1 parent a389e05 commit 924033a

2 files changed

+91
-1
lines changed

day-2/readme.md

Lines changed: 1 addition & 1 deletion
@@ -88,7 +88,7 @@ helm repo update
8888

8989
### 🚀 Step 3: Deploy the chart into a new namespace "monitoring"
9090
```bash
91-
kubeclt create ns monitoring
91+
kubectl create ns monitoring
9292
```
9393
```bash
9494
helm install monitoring \

day-3/readme.md

Lines changed: 90 additions & 0 deletions
@@ -0,0 +1,90 @@
1+
2+
## Metrics in Prometheus:
3+
- Metrics in Prometheus are the core data objects that represent measurements collected from monitored systems.
4+
- These metrics provide insights into various aspects of **system performance, health, and behavior**.
5+
6+
## Labels:
7+
- Metrics are paired with Labels.
8+
- Labels are key-value pairs that allow you to differentiate between dimensions of a metric, such as different services, instances, or endpoints.
9+
10+
11+
## Example:
12+
```bash
13+
container_cpu_usage_seconds_total{namespace="kube-system", endpoint="https-metrics"}
14+
```
15+
- `container_cpu_usage_seconds_total` is the metric.
16+
- `{namespace="kube-system", endpoint="https-metrics"}` are the labels.
17+
18+
## Types of Metrics in Prometheus
19+
- **Counter**:
20+
- A Counter is a cumulative metric whose value only ever increases; it drops back to zero only when the process exposing it restarts. It is used for counting events such as HTTP requests, errors, or completed tasks.
21+
- **Example**: Counting the number of times a container restarts in your Kubernetes cluster.
22+
- **Metric Example**: `kube_pod_container_status_restarts_total`
23+
24+
- **Gauge**:
25+
- A Gauge is a metric that represents a single numerical value that can go up and down. It is typically used for things like memory usage, CPU usage, or the current number of active users.
26+
- **Example**: Monitoring the memory usage of a container in your Kubernetes cluster.
27+
- **Metric Example**: `container_memory_usage_bytes`
28+
29+
- **Histogram**:
30+
- A Histogram samples observations (usually things like request durations or response sizes) and counts them in configurable buckets.
31+
- It also provides a sum of all observed values and a count of observations.
32+
- **Example**: Measuring the response time of Kubernetes API requests in various time buckets.
33+
- **Metric Example**: `apiserver_request_duration_seconds_bucket`
34+
35+
- **Summary**:
36+
- Similar to a Histogram, a Summary samples observations and provides a total count of observations, their sum, and configurable quantiles (percentiles).
37+
- **Example**: Monitoring the 95th percentile of request durations to understand high latency in your Kubernetes API.
38+
- **Metric Example**: `go_gc_duration_seconds` (a Summary exposed by Go processes, with one series per `quantile` label)
39+
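To make the differences between Counter, Gauge, and Histogram more concrete, here is a minimal sketch in plain Python. The class and attribute names are illustrative assumptions and are not the API of any Prometheus client library; the sketch only mirrors the behavior described above.

```python
import bisect
import math


class Counter:
    """Cumulative value that only increases (until the process restarts)."""

    def __init__(self):
        self.value = 0.0

    def inc(self, amount=1.0):
        if amount < 0:
            raise ValueError("a counter cannot decrease")
        self.value += amount


class Gauge:
    """Point-in-time value that may rise or fall."""

    def __init__(self):
        self.value = 0.0

    def set(self, value):
        self.value = float(value)


class Histogram:
    """Counts observations in cumulative buckets and tracks their sum and count."""

    def __init__(self, upper_bounds):
        # An implicit +Inf bucket catches every observation.
        self.upper_bounds = sorted(upper_bounds) + [math.inf]
        self.bucket_counts = [0] * len(self.upper_bounds)
        self.sum = 0.0
        self.count = 0

    def observe(self, value):
        # Buckets are cumulative: an observation counts toward every
        # bucket whose upper bound (the `le` label) is >= the value.
        first = bisect.bisect_left(self.upper_bounds, value)
        for i in range(first, len(self.upper_bounds)):
            self.bucket_counts[i] += 1
        self.sum += value
        self.count += 1


# Hypothetical request durations in seconds.
durations = Histogram([0.1, 0.5, 1.0])
for seconds in (0.05, 0.3, 0.3, 2.0):
    durations.observe(seconds)

print(durations.bucket_counts)  # one cumulative count per bucket, +Inf last
```

The three outputs of the histogram correspond to the `_bucket`, `_sum`, and `_count` series that Prometheus stores for a metric such as `apiserver_request_duration_seconds`.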
40+
## What is PromQL?
41+
- PromQL (Prometheus Query Language) is a powerful and flexible query language used to query data from Prometheus.
42+
- It allows you to retrieve and manipulate time series data, perform mathematical operations, aggregate data, and much more.
43+
44+
- Key Features of PromQL:
45+
- Selecting Time Series: You can select specific metrics with filters and retrieve their data.
46+
- Mathematical Operations: PromQL allows for mathematical operations on metrics.
47+
- Aggregation: You can aggregate data across multiple time series.
48+
- Functions: PromQL includes a wide range of built-in functions to analyze and manipulate data.
49+
50+
## Basic Examples of PromQL
51+
- `container_cpu_usage_seconds_total`
52+
- Return all time series with the metric `container_cpu_usage_seconds_total`.
53+
- `container_cpu_usage_seconds_total{namespace="kube-system",pod=~"kube-proxy.*"}`
54+
- Return all time series with the metric `container_cpu_usage_seconds_total` and the given `namespace` and `pod` labels.
55+
- `container_cpu_usage_seconds_total{namespace="kube-system",pod=~"kube-proxy.*"}[5m]`
56+
- Return a range vector: for each series matched by the same selector, all samples from the 5 minutes leading up to the query time.
57+
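The label matchers in the selectors above can be sketched in Python to show how `=` and `=~` decide which series are returned. The `matches` helper and the sample pod names are hypothetical. PromQL uses RE2 syntax and anchors regex matchers to the whole label value, which `re.fullmatch` approximates here.

```python
import re


def matches(labels, equal=None, regex=None):
    """Return True if a series' labels satisfy the given matchers.

    equal: {label: value} pairs for `=` matchers
    regex: {label: pattern} pairs for `=~` matchers (matched against the whole value)
    """
    for name, value in (equal or {}).items():
        # A missing label is treated like an empty value.
        if labels.get(name, "") != value:
            return False
    for name, pattern in (regex or {}).items():
        if re.fullmatch(pattern, labels.get(name, "")) is None:
            return False
    return True


# Hypothetical label sets for three series of the same metric.
series = [
    {"namespace": "kube-system", "pod": "kube-proxy-abc12"},
    {"namespace": "kube-system", "pod": "coredns-xyz89"},
    {"namespace": "default", "pod": "kube-proxy-def34"},
]

selected = [
    s for s in series
    if matches(s, equal={"namespace": "kube-system"}, regex={"pod": "kube-proxy.*"})
]
print(selected)  # only the kube-proxy pod in kube-system remains
```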
58+
## Aggregation & Functions in PromQL
59+
- Aggregation in PromQL combines multiple time series into fewer series, optionally grouped by the labels you choose.
60+
- **Sum Up All CPU Usage**:
61+
```bash
62+
sum(rate(node_cpu_seconds_total[5m]))
63+
```
64+
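- This query sums the per-second CPU time across all nodes, cores, and modes. Because `node_cpu_seconds_total` also counts the `idle` mode, add a matcher such as `mode!="idle"` to measure actual usage.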
65+
66+
- **Average Memory Usage per Namespace:**
67+
```bash
68+
avg(container_memory_usage_bytes) by (namespace)
69+
```
70+
- This query provides the average memory usage grouped by namespace.
71+
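As a rough illustration of what `avg(...) by (namespace)` does, the following Python sketch groups samples by one label and averages each group. The `avg_by` helper and the sample values are assumptions made for this example only.

```python
from collections import defaultdict


def avg_by(samples, label):
    """Average sample values grouped by the value of one label.

    samples: list of (labels_dict, value) pairs taken at a single instant
    """
    groups = defaultdict(list)
    for labels, value in samples:
        groups[labels.get(label, "")].append(value)
    return {key: sum(values) / len(values) for key, values in groups.items()}


# Hypothetical memory usage in bytes for three containers.
samples = [
    ({"namespace": "kube-system", "pod": "a"}, 100.0),
    ({"namespace": "kube-system", "pod": "b"}, 300.0),
    ({"namespace": "default", "pod": "c"}, 50.0),
]

print(avg_by(samples, "namespace"))
```

Every label other than the grouping label (here `pod`) is dropped from the result, which is why the output has one entry per namespace.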
72+
- **rate() Function:**
73+
- The rate() function calculates the per-second average rate of increase of the time series in a specified range.
74+
```bash
75+
rate(container_cpu_usage_seconds_total[5m])
76+
```
77+
- This calculates the average per-second CPU usage (in CPU cores) over the last 5 minutes.
78+
- **increase() Function:**
79+
- The increase() function returns the increase in a counter over a specified time range.
80+
```bash
81+
increase(kube_pod_container_status_restarts_total[1h])
82+
```
83+
- This gives the total increase in container restarts over the last hour.
84+
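The idea behind `increase()` and `rate()` can be sketched in a few lines of Python. This is a simplified model: it handles counter resets but skips the extrapolation to the window boundaries that Prometheus performs, so real results can differ slightly. The function names mirror PromQL, but the code and sample data are illustrative assumptions.

```python
def increase(samples):
    """Total increase of a counter over a window.

    samples: list of (timestamp_seconds, value) pairs sorted by time
    """
    total = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        if curr >= prev:
            total += curr - prev
        else:
            # A drop means the counter was reset (for example after a
            # restart), so the new value is counted from zero.
            total += curr
    return total


def rate(samples):
    """Average per-second increase over the window, or None if undefined."""
    if len(samples) < 2:
        return None
    elapsed = samples[-1][0] - samples[0][0]
    return increase(samples) / elapsed


# Hypothetical counter samples scraped every 60 seconds, with one reset.
window = [(0, 10.0), (60, 40.0), (120, 5.0), (180, 35.0)]

print(increase(window))  # 30 + 5 + 30
print(rate(window))
```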
85+
- **histogram_quantile() Function:**
86+
- The histogram_quantile() function calculates quantiles (e.g., 95th percentile) from histogram data.
87+
```bash
88+
histogram_quantile(0.95, sum(rate(apiserver_request_duration_seconds_bucket[5m])) by (le))
89+
```
90+
- This calculates the 95th percentile of Kubernetes API request durations.
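To show how a quantile can be estimated from cumulative buckets, here is a minimal Python sketch of the interpolation idea behind `histogram_quantile()`. It assumes observations are spread evenly inside each bucket and that the lowest bucket starts at 0. The function and the bucket data are illustrative assumptions, not Prometheus source code.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets: list of (upper_bound, cumulative_count) pairs sorted by
             upper bound, with +Inf as the last bound
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound) or count == lower_count:
                # No finite range to interpolate in: fall back to the
                # highest bound seen so far.
                return lower_bound
            # Linear interpolation inside the bucket that contains the rank.
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound


# Hypothetical request-duration buckets: (le, cumulative count).
buckets = [(0.1, 10), (0.5, 60), (1.0, 90), (math.inf, 100)]

print(histogram_quantile(0.5, buckets))   # falls inside the (0.1, 0.5] bucket
print(histogram_quantile(0.95, buckets))  # falls in +Inf, so the last finite bound
```

Because the result is interpolated, its accuracy depends on how well the bucket boundaries match the real distribution of request durations.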