Commit daec884

[Issue pixelsdb#1297] add retina docs (pixelsdb#1303)
1 parent 3f2de0c commit daec884

1 file changed, +50 -0 lines changed

pixels-retina/README.md

Lines changed: 50 additions & 0 deletions
@@ -0,0 +1,50 @@
# Pixels Retina

Retina (<ins>re</ins>al-<ins>ti</ins>me a<ins>na</ins>lytics) is the real-time data synchronization framework in Pixels.
It replays data-change operations from a log-based CDC (change-data-capture) source as mirror transactions on the columnar table data.
It introduces a lightweight MVCC mechanism and a corresponding version storage to support parallel mirror-transaction replay and
concurrent analytical query processing, and a vectorized-filter-on-read (VFoR) approach that lets analytical queries read consistent
data snapshots.
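
To make the filter-on-read idea concrete, here is a minimal sketch of how a reader could compute a visibility mask over a batch of row versions. It is not the Retina implementation: the class name, the per-row `createTs`/`deleteTs` arrays, the `NOT_DELETED` marker, and the rule `createTs <= snapshotTs < deleteTs` are all assumptions chosen for illustration, following a common MVCC visibility convention.

```java
import java.util.BitSet;

/** Hypothetical sketch only; names and data layout are assumptions, not the Retina API. */
final class VisibilityFilterSketch {
    /** Assumed marker for a row version that has not been deleted. */
    static final long NOT_DELETED = Long.MAX_VALUE;

    /**
     * Builds a selection mask for one batch of row versions.
     * Assumed rule: a version is visible to a snapshot when
     * createTs <= snapshotTs < deleteTs.
     */
    static BitSet visibleRows(long[] createTs, long[] deleteTs, long snapshotTs) {
        if (createTs.length != deleteTs.length) {
            throw new IllegalArgumentException("timestamp arrays must have equal length");
        }
        BitSet mask = new BitSet(createTs.length);
        // One tight loop over plain arrays, so the check can run batch-at-a-time.
        for (int i = 0; i < createTs.length; i++) {
            if (createTs[i] <= snapshotTs && snapshotTs < deleteTs[i]) {
                mask.set(i);
            }
        }
        return mask;
    }

    /** Applies the mask to one column of the batch, keeping only visible rows. */
    static long[] select(long[] column, BitSet mask) {
        long[] out = new long[mask.cardinality()];
        int n = 0;
        for (int i = mask.nextSetBit(0); i >= 0; i = mask.nextSetBit(i + 1)) {
            out[n++] = column[i];
        }
        return out;
    }
}
```

In this sketch the base data is never rewritten: the reader derives the snapshot by masking rows at read time, which is the general property that separates filter-on-read from merge-on-read.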

Compared to the merge-on-read (MoR) approach based on catalog snapshots in existing lakehouse systems,
such as Apache Iceberg and Apache Paimon, Retina supports true row-granular (instead of batch-granular) transactional
data-change replay without the expensive version merging and data compaction mechanisms.
Evaluations show that Retina simultaneously provides 10-ms-level data freshness and scalable data-change
replay throughput of over 3.2M rows/s, without compromising query performance or resource cost-efficiency.
It significantly outperforms state-of-the-art lakehouses, Iceberg and Paimon, which provide minute-level data freshness
and one order of magnitude lower data-change throughput.

## Retina Components

The components related to Retina are:

- Sink: It connects to CDC streams from Debezium, reconstructs the data-change messages in the
CDC stream into mirror transactions, and sends the data-change operations in each mirror transaction through stream RPC to Pixels-Retina.
[Source code](https://github.com/pixelsdb/pixels-sink).
- Replayer: It receives the data-change operations from Pixels-Sink and replays them on the columnar data tables.
Source code:
[core data structures and operations](../cpp/pixels-retina),
[top-level replay and garbage collection](.),
[client](../pixels-common/src/main/java/io/pixelsdb/pixels/common/retina),
and [server](../pixels-daemon/src/main/java/io/pixelsdb/pixels/daemon/retina).
RPC handling is in this directory and .
- Transaction Service: It allocates transaction timestamps for mirror transactions and analytical queries, and manages the timestamp watermarks
for the MVCC protocol of Retina.
Source code:
[client](../pixels-common/src/main/java/io/pixelsdb/pixels/common/transaction) and
[server](../pixels-daemon/src/main/java/io/pixelsdb/pixels/daemon/transaction).
- Index Service: It is a multi-version index that maps the index key (e.g., primary key or secondary key) to the row location.
The Replayer looks up and updates the primary index during data-change replay.
Source code:
[framework and interfaces](../pixels-common/src/main/java/io/pixelsdb/pixels/common/index),
[pluggable implementations](../pixels-index),
[clients](../pixels-common/src/main/java/io/pixelsdb/pixels/common/index/service),
and [server](../pixels-daemon/src/main/java/io/pixelsdb/pixels/daemon/index).
- Catalog Service (i.e., metadata service): It manages the schema, statistics, and data catalog of tables.
Source code: [client](../pixels-common/src/main/java/io/pixelsdb/pixels/common/metadata) and
[server](../pixels-daemon/src/main/java/io/pixelsdb/pixels/daemon/metadata).
- Columnar file format: It provides the definition, reader, and writer of the Pixels file format. [Source code](../pixels-core).
- Trino connector: It runs inside the Trino cluster to access the services of Retina/Pixels, and calls the file reader to read data.
[Source code](https://github.com/pixelsdb/pixels-trino).
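
As an illustration of the Index Service entry in the list above, the following is a rough sketch of a multi-version index that maps a key to a row location as of a given timestamp. It is not the Pixels index interface: `MultiVersionIndexSketch`, `RowLocation`, its fields, the string keys, and the use of a null tombstone for deletes are hypothetical choices made only to show the idea.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch only; not the interface found in pixels-common or pixels-index. */
final class MultiVersionIndexSketch {
    /** Assumed shape of a row location: a file id plus a row offset inside that file. */
    static final class RowLocation {
        final long fileId;
        final int rowOffset;

        RowLocation(long fileId, int rowOffset) {
            this.fileId = fileId;
            this.rowOffset = rowOffset;
        }
    }

    // For each index key, the versions of its row location ordered by timestamp.
    private final Map<String, TreeMap<Long, RowLocation>> versions = new HashMap<>();

    /** Records that, from timestamp ts onward, key points to loc. */
    void put(String key, long ts, RowLocation loc) {
        versions.computeIfAbsent(key, k -> new TreeMap<>()).put(ts, loc);
    }

    /** Records a delete at timestamp ts as a null tombstone (an assumption of this sketch). */
    void delete(String key, long ts) {
        versions.computeIfAbsent(key, k -> new TreeMap<>()).put(ts, null);
    }

    /** Returns the location visible at readTs, or null if the key is absent or deleted. */
    RowLocation lookup(String key, long readTs) {
        TreeMap<Long, RowLocation> chain = versions.get(key);
        if (chain == null) {
            return null;
        }
        // floorEntry picks the newest version whose timestamp is <= readTs.
        Map.Entry<Long, RowLocation> entry = chain.floorEntry(readTs);
        return entry == null ? null : entry.getValue();
    }
}
```

The sketch is single-threaded and keeps every version forever; a real service would also need concurrency control and a way to drop versions that no reader can still see.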

## Usage