- Training
- Telemetry in Demo Apps (Local Machine)
- Telemetry in IB (In an FTE - Dockerised Functional Test Environment)
- Telemetry in Tauspace (Standardised Pipeline for Telemetry in Tauspace Apps)
- Telemetry in RAIN (Telemetry setup and tested in Rain SIT environment)
Docker -> Do the training on the Docker App quickly. Read through and understand the basics so that you can get the app running.
- Research and learning on Telemetry and OpenTelemetry. (including sample apps and demos)
- Get basic telemetry system working on a mini app
- Get end-to-end telemetry system working in self-contained IB (in an FTE) using OpenTelemetry: Phoenix, Ecto, Cowboy.
- Standardized architecture for setting up telemetry in a Tauspace Project
- Create a phased deployment strategy for integrating telemetry into IB in production.
- Notifications
- Log management
- Define coding standard for Tauspace
- Define business objects method
- AWS OpenSearch working with sample projects, feeding in telemetry from: Phoenix, Ecto, Application, Traces
- AWS OpenSearch (Kibana): get tutorials working
- On premise: Elasticsearch + Grafana or Kibana
https://www.youtube.com/watch?v=4OBtc_eIKIE
https://github.com/kamilkowalski/opentelemetry-demo
https://drive.google.com/file/d/1oNUli4gvUmP56UPTa3fvRELi9ahlM3fr/view
Get this running, and observe the different areas. Go through the code and understand what is happening.
https://opentelemetry.io/docs/erlang/getting-started/
https://opentelemetry.io/docs/demo/
- https://elixirschool.com/blog/instrumenting-phoenix-with-telemetry-part-one
- https://elixirschool.com/blog/instrumenting-phoenix-with-live-dashboard
https://elixirschool.com/en/lessons/advanced/telemetry/
https://blog.miguelcoba.com/series/elixir-telemetry
https://opensearch.org/docs/latest/opensearch/install/docker/
Telemetry and observability give you the data you need for debugging, monitoring, and performance optimization. As applications become more complex and distributed, real-time insight into their behavior and performance is a requirement, not an optional extra.
Before embarking on this learning path, it's imperative to have a solid understanding of Elixir and the OTP (Open Telecom Platform) framework. This foundational knowledge will serve as the basis upon which you'll build your telemetry and observability skills. Additionally, a rudimentary understanding of distributed systems and microservices architecture will be highly beneficial, as these are the types of environments where telemetry and observability are most crucial.
https://samuelmullen.com/articles/the-hows-whats-and-whys-of-elixir-telemetry
Telemetry is the automated process of collecting data from various components of a system and transmitting it to a centralized location for analysis. In software systems, this data can range from system metrics like CPU usage, memory allocation, and network latency to events such as user logins, system errors, and database queries.
Types of Data:
- System Metrics: CPU usage, memory allocation, network latency
- Events: User logins, system errors, database queries
Reading: Introduction to Telemetry
In Elixir, the Telemetry library serves as a dynamic dispatching library for metrics and events. It allows you to instrument your code with custom events and metrics, which can then be processed by various handlers for logging, monitoring, or other forms of data analysis.
Key Concepts:
- Instrumentation: Adding code to collect custom events and metrics.
- Handlers: Functions attached to an event that receive its measurements and metadata for logging, monitoring, or other forms of analysis. They run synchronously in the process that emits the event.
Reading: Telemetry Events and Metrics in Elixir
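As a minimal sketch of instrumentation and handlers, the script below attaches one handler and emits one event. The event name [:demo, :login, :stop], the Demo.Handler module, and the measurement and metadata keys are hypothetical names chosen for illustration. It assumes Mix.install can fetch the telemetry package.

```elixir
Mix.install([{:telemetry, "~> 1.0"}])

defmodule Demo.Handler do
  # A handler receives the event name, measurements, metadata and the config
  # given at attach time. Here it only forwards the data to a process.
  def handle_event([:demo, :login, :stop], measurements, metadata, config) do
    send(config.reply_to, {:seen, measurements.duration, metadata.user_id})
  end
end

# Attach the handler under a unique id (hypothetical id and event name).
:ok =
  :telemetry.attach(
    "demo-login-handler",
    [:demo, :login, :stop],
    &Demo.Handler.handle_event/4,
    %{reply_to: self()}
  )

# Instrumentation: emit the event from application code.
:telemetry.execute([:demo, :login, :stop], %{duration: 42}, %{user_id: "u-1"})

receive do
  {:seen, duration, user_id} -> IO.puts("login by #{user_id} took #{duration}")
after
  100 -> IO.puts("no event received")
end
```

Because the handler runs in the process that calls :telemetry.execute/3, the message is already in the mailbox when the receive block runs.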
Metrics are quantitative measurements that provide a snapshot of your system at any given time. These could include latency, error rates, or throughput. Events, on the other hand, are significant occurrences within your system that you may want to monitor, such as a user login or a system failure.
Metrics: Quantitative measurements like latency, error rates, or throughput.
Events: Significant occurrences such as a user login or a system failure.
Reading: Telemetry Metrics
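The sketch below shows how metrics can be declared on top of events with the telemetry_metrics package. The metric names are hypothetical. By convention the last segment of the name is the measurement and the rest is the event name. It assumes Mix.install can fetch the package.

```elixir
Mix.install([{:telemetry_metrics, "~> 1.0"}])

import Telemetry.Metrics

# Each definition describes how to aggregate one measurement of one event.
# These are declarations only; a reporter is what actually aggregates them.
metrics = [
  counter("demo.login.stop.count"),
  summary("demo.login.stop.duration"),
  last_value("vm.memory.total")
]

for metric <- metrics do
  IO.inspect({metric.__struct__, metric.event_name, metric.measurement})
end
```

A list like this is what you would hand to a reporter such as Telemetry.Metrics.ConsoleReporter.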
Telemetry Poller is a package that periodically gathers measurements from the Erlang VM and other sources and emits them as telemetry events. This is useful for values that are not event-driven and need to be sampled at regular intervals, such as memory usage or run queue lengths. Its built-in VM measurements cover memory, run queue lengths, and system counts. Other values, such as CPU utilization, need a custom measurement function.
Purpose: To periodically gather metrics from the Erlang VM and other sources.
Common Metrics:
- Memory usage
- Run queue lengths
- CPU utilization (via a custom measurement)
Reading: Telemetry Poller Documentation
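A small sketch of polling, assuming Mix.install can fetch telemetry_poller. It starts a poller that samples VM memory every 100 ms and attaches a handler to the resulting [:vm, :memory] event. The Demo.VMHandler module and the handler id are hypothetical.

```elixir
Mix.install([{:telemetry_poller, "~> 1.0"}])

defmodule Demo.VMHandler do
  # The handler runs inside the poller process, so it forwards the sample
  # to the process recorded in the config at attach time.
  def handle_event([:vm, :memory], measurements, _metadata, config) do
    send(config.reply_to, {:vm_memory, measurements.total})
  end
end

:ok =
  :telemetry.attach(
    "demo-vm-memory",
    [:vm, :memory],
    &Demo.VMHandler.handle_event/4,
    %{reply_to: self()}
  )

# Sample the built-in :memory measurement every 100 ms.
{:ok, _poller} = :telemetry_poller.start_link(measurements: [:memory], period: 100)

total =
  receive do
    {:vm_memory, bytes} -> bytes
  after
    1_000 -> nil
  end

IO.puts("VM memory in use: #{inspect(total)} bytes")
```

In an application you would normally put the poller in the supervision tree instead of calling start_link from a script.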
- Telemetry Reporters: These libraries help you visualize or store the metrics you've collected. Examples include Telemetry.Metrics.ConsoleReporter and TelemetryMetricsPrometheus.Core.
- Dynamic Telemetry: This involves attaching and detaching telemetry handlers at runtime, allowing for more flexible monitoring strategies.
- Telemetry Attachments: Learn how to use :telemetry.attach/4 to attach functions to telemetry events dynamically.

- Telemetry Reporters: Libraries for visualizing or storing metrics.
- Dynamic Telemetry: Attaching and detaching telemetry handlers at runtime.
- Telemetry Attachments: Using :telemetry.attach/4 for dynamic event handling.
Reading:
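To illustrate attaching and detaching at runtime, the sketch below emits the same event twice, once with a handler attached and once after it has been detached. The event name, handler id, and Demo.Audit module are hypothetical. It assumes Mix.install can fetch the telemetry package.

```elixir
Mix.install([{:telemetry, "~> 1.0"}])

defmodule Demo.Audit do
  def handle_event(_event, measurements, _metadata, config) do
    send(config.reply_to, {:audited, measurements.count})
  end
end

event = [:demo, :order, :created]

# Attach at runtime, emit once, then detach and emit again.
:ok = :telemetry.attach("demo-audit", event, &Demo.Audit.handle_event/4, %{reply_to: self()})
:telemetry.execute(event, %{count: 1}, %{})

:ok = :telemetry.detach("demo-audit")
:telemetry.execute(event, %{count: 2}, %{})

# Only the first event reached the handler.
first =
  receive do
    {:audited, count} -> count
  after
    0 -> nil
  end

second =
  receive do
    {:audited, count} -> count
  after
    0 -> nil
  end

IO.inspect({first, second})
# prints {1, nil}
```

Emitting an event with no handlers attached is allowed, which is why the second execute call does nothing.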
OpenTelemetry is an open-source observability framework designed for cloud-native software. It provides APIs, libraries, agents, and instrumentation for distributed tracing and metrics collection.
Features:
- APIs
- Libraries
- Agents
- Instrumentation
Reading:
- OpenTelemetry Overview
- OpenTelemetry Elixir Erlang Docs
- OpenTelemetry Elixir GitHub
- OpenTelemetry API Reference
- Erlang OTP Design Principles
OpenTelemetry provides a more comprehensive solution for observability. It offers auto-instrumentation: instrumentation libraries collect traces and metrics from common frameworks with little change to your existing codebase. In Elixir these libraries typically attach to the telemetry events that frameworks such as Phoenix and Ecto already emit.
Key Features:
- Auto-instrumentation: Automatically collect traces and metrics.
- Custom Spans and Traces: For specific workflows in your application.
Reading: Getting Started with OpenTelemetry in Elixir
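For workflows that auto-instrumentation does not cover, a custom span can be sketched as below. The Demo.Checkout module, the span name, and the attribute key are hypothetical. The script installs only the opentelemetry_api package, so the span is a no-op and nothing is exported; exporting needs the SDK and an exporter configured in a real project.

```elixir
Mix.install([{:opentelemetry_api, "~> 1.2"}])

defmodule Demo.Checkout do
  require OpenTelemetry.Tracer, as: Tracer

  # Wrap one unit of work in a span and record an attribute on it.
  def run(items) do
    Tracer.with_span "demo.checkout" do
      Tracer.set_attributes(%{"checkout.item_count" => length(items)})
      Enum.sum(items)
    end
  end
end

IO.inspect(Demo.Checkout.run([10, 20, 12]))
```

with_span returns the value of its block, so wrapping existing code in a span does not change what the function returns.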
Similarities:
- Metrics Collection: Both Elixir Telemetry and OpenTelemetry can collect metrics to monitor the performance of your application.
- Event-based: Elixir Telemetry dispatches events to handlers, and the OpenTelemetry instrumentation libraries for Elixir build their spans from those same events.
Differences:
- BEAM-native vs Cross-language: Elixir Telemetry (the :telemetry library) runs only on the BEAM, whereas OpenTelemetry is a cross-language standard.
- Tracing: OpenTelemetry supports tracing to track the flow of requests through various services, while Elixir Telemetry does not have built-in support for tracing.
- Overhead: Elixir Telemetry is designed to have low overhead: emitting an event calls the attached handlers directly in the emitting process. OpenTelemetry adds span processing and export on top, so it tends to have a higher overhead.
- Instrumentation: Elixir Telemetry lets you define custom events tailored to your application. OpenTelemetry, being a standard, comes with shared naming conventions for spans and attributes (semantic conventions) and a standard export protocol (OTLP).
When to use Elixir Telemetry:
- You're Working Solely in Elixir: If your entire stack is in Elixir, using a native tool can be more efficient.
- Low Overhead is Crucial: For applications where performance is a key concern, the low overhead of Elixir Telemetry can be beneficial.
- Custom Instrumentation: If you need to tailor your metrics and events very specifically to your application, Elixir Telemetry provides more flexibility.
When to use OpenTelemetry:
- Cross-Language Support is Needed: If your application involves multiple languages or services written in different languages, OpenTelemetry provides a unified standard.
- Tracing is Required: For microservices architectures or complex systems where tracing the flow of requests is important, OpenTelemetry is the better choice.
- Standardization: If you aim for a standardized way of collecting metrics and traces that can be understood across different teams or even different companies, OpenTelemetry is more suitable.
- OpenTelemetry Phoenix
- OpenTelemetry Cowboy
- OpenTelemetry Ecto: https://hexdocs.pm/opentelemetry_ecto/OpentelemetryEcto.html
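The three libraries above are usually wired up once at application start. The fragment below is a sketch only: MyApp, MyApp.Repo, MyAppWeb.Endpoint, and the [:my_app, :repo] telemetry prefix are hypothetical, and the exact setup options depend on the library versions in your project, so check the package docs before copying it. It is not runnable on its own because it needs a Phoenix project with these dependencies.

```elixir
defmodule MyApp.Application do
  use Application

  @impl true
  def start(_type, _args) do
    # Attach the instrumentation before the supervision tree starts,
    # so the first requests and queries are already traced.
    :opentelemetry_cowboy.setup()
    OpentelemetryPhoenix.setup(adapter: :cowboy2)
    # Assumption: the Repo emits events under the [:my_app, :repo] prefix.
    OpentelemetryEcto.setup([:my_app, :repo])

    children = [MyApp.Repo, MyAppWeb.Endpoint]
    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end
end
```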
Distributed tracing is a technique used to profile and monitor applications, especially those built using a microservices architecture. It helps you understand how a single request flows through your complex system of microservices.
Reading: Distributed Tracing in OpenTelemetry
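What ties the spans of one request together across services is the trace id, which travels between services in the W3C traceparent header (version, trace id, parent span id, flags). The module below is an illustration of that idea only, with hypothetical names. In a real project the OpenTelemetry SDK and its propagators create and parse this header for you.

```elixir
defmodule Demo.TraceContext do
  # A root context gets a fresh 16-byte trace id and 8-byte span id.
  def new_root do
    %{trace_id: random_hex(16), span_id: random_hex(8), flags: "01"}
  end

  # A child span keeps the trace id and gets its own span id.
  def child(%{trace_id: trace_id, flags: flags}) do
    %{trace_id: trace_id, span_id: random_hex(8), flags: flags}
  end

  # traceparent layout: version-traceid-spanid-flags, lowercase hex.
  def to_header(%{trace_id: t, span_id: s, flags: f}), do: "00-#{t}-#{s}-#{f}"

  def from_header("00-" <> rest) do
    with [t, s, f] <- String.split(rest, "-"),
         32 <- byte_size(t),
         16 <- byte_size(s),
         2 <- byte_size(f) do
      {:ok, %{trace_id: t, span_id: s, flags: f}}
    else
      _ -> :error
    end
  end

  def from_header(_other), do: :error

  defp random_hex(bytes) do
    bytes |> :crypto.strong_rand_bytes() |> Base.encode16(case: :lower)
  end
end

# Service A starts a trace and sends the header with its outgoing request.
root = Demo.TraceContext.new_root()
header = Demo.TraceContext.to_header(root)

# Service B parses the header and starts its own span in the same trace.
{:ok, incoming} = Demo.TraceContext.from_header(header)
span_b = Demo.TraceContext.child(incoming)

IO.inspect(span_b.trace_id == root.trace_id)
```

Both services report spans with the same trace id, which is how a tracing backend can reassemble the full path of one request.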
Understanding the best practices can help you make the most out of your telemetry setup. This includes guidelines on what metrics to collect, how to name them, what their cardinality should be, and how to handle errors and exceptions.
Reading: OpenTelemetry Best Practices
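As one example of keeping cardinality under control, the sketch below tags a counter with the HTTP method and a status class instead of raw values such as the request path. The metric name, Demo.Tags module, and metadata keys are hypothetical. It assumes Mix.install can fetch telemetry_metrics.

```elixir
Mix.install([{:telemetry_metrics, "~> 1.0"}])

defmodule Demo.Tags do
  # Reduce raw request metadata to a small, bounded set of tag values.
  # The path is dropped on purpose: one tag value per URL would be unbounded.
  def low_cardinality(metadata) do
    %{
      method: metadata.method,
      status_class: "#{div(metadata.status, 100)}xx"
    }
  end
end

metric =
  Telemetry.Metrics.counter("demo.http.request.count",
    tags: [:method, :status_class],
    tag_values: &Demo.Tags.low_cardinality/1
  )

IO.inspect(metric.tag_values.(%{method: "GET", status: 404, path: "/users/8841"}))
```

Each distinct combination of tag values becomes its own time series in most backends, so a handful of methods and five status classes stays cheap where raw paths or user ids would not.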
Reading: Building Observability Into Your Elixir Project
- Elixir Telemetry Workflow:
graph TD;
A[Start] --> B[Instrument Code];
B --> C[Collect Metrics and Events];
C --> D[Process with Handlers];
D --> E[Visualize/Store];
E --> F[Analyze];
F --> G[Optimize Code];
G --> A;
- OpenTelemetry Workflow:
graph TD;
A[Start] --> B[Auto-instrumentation];
B --> C[Collect Traces and Metrics];
C --> D[Send to Backend];
D --> E[Visualize/Store];
E --> F[Analyze];
F --> G[Optimize Code];
G --> A;
Logical architecture, 3 parts: Producer, Processing, Console (display + search)
DOD
- Get on-prem telemetry system working
- Define coding standard for Tauspace
- Define business objects method
- AWS OpenSearch working with sample projects, feeding in telemetry from: Phoenix, Ecto, Application - Chicken, Traces
- Deploy plan for RAIN
- Deploy V1 to RAIN
- AWS OpenSearch (Kibana): get tutorials working
- On premise: Elasticsearch + Grafana or Kibana
Elixir Telemetry module
Ecto telemetry
Phoenix Telemetry
Blog Elixir schools
Core telemetry elements
Log management
Notifications
RAIN tie back
IB tie back
