80 changes: 80 additions & 0 deletions blogs/announcing-r1-cross-cluster-join.md
@@ -0,0 +1,80 @@
# Announcing R1 — Cross-cluster JOIN on Elasticsearch

![SoftClient4ES Logo](https://raw.githubusercontent.com/SOFTNETWORK-APP/SoftClient4ES/main/logo_375x300.png)

*R1 blog series — Post 1 of 4 (launch day). Next: "The JOIN matrix, explained: rows 1, 2, 3."*

<!-- Channel B (Medium / personal LinkedIn). Hero image staged in elasticsql/linkedin/ at publish time. -->

---

## Elasticsearch can't JOIN. So everybody ETLs.

You have orders in one Elasticsearch index and customers in another. You want the total revenue per customer name. In SQL that is one line. In Elasticsearch it is a project: denormalize at index time, or stand up a pipeline that copies both indices into a warehouse, JOIN them there, and ship the result back.

That is the tax. You bought a search engine and ended up running an ETL job just to answer a question a JOIN would have answered for free.
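
For concreteness, here is a minimal sketch of the "one line" in question. SQLite's in-memory engine stands in for the SQL surface purely so the snippet runs anywhere; it is not SoftClient4ES, and the `amount` column and the sample rows are assumptions made for illustration.

```python
import sqlite3

# SQLite is only a stand-in so the query shape can be run locally.
# It is NOT SoftClient4ES; table contents below are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (10, 1, 40.0), (11, 1, 60.0), (12, 2, 25.0);
""")

# Total revenue per customer name: one JOIN plus one GROUP BY.
REVENUE_PER_CUSTOMER = """
    SELECT c.name, SUM(o.amount) AS revenue
    FROM orders o
    JOIN customers c ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
"""

rows = conn.execute(REVENUE_PER_CUSTOMER).fetchall()
print(rows)
```

The point is the query text, not the engine: the same JOIN plus GROUP BY shape is what the ETL pipeline exists to make possible.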

**R1 ends that.** SoftClient4ES R1 ships query-time **cross-index JOIN** on Elasticsearch — across indices, and across clusters — over the surfaces you already use: JDBC, ADBC, Arrow Flight SQL, and the REPL.

> Stop ETL'ing Elasticsearch into your warehouse just to JOIN it.

---

## Two deployment shapes — pick your scale

R1 has exactly two shapes, and you self-select:

**Single-cluster (free in Community).** Drop in a driver and JOIN across the indices of your existing cluster. No new infrastructure. Up to two cross-index JOINs per query are free.

```sql
SELECT e.name, d.dept_name
FROM employees e
JOIN departments d ON e.dept_id = d.id;
```

**Multi-cluster federation (Pro+).** Deploy the federation server and JOIN across *separate* regional Elasticsearch clusters from one query. The catalog prefix routes each leg to its cluster:

```sql
SELECT o.id, c.name
FROM `prod_us`.orders o
JOIN `prod_eu`.customers c ON o.customer_id = c.id;
```

Community gets single-cluster cross-index JOINs for free. Cross-cluster federation is Pro+. The boundary is the cluster meter (how many clusters a query touches), not a hidden feature switch.

---

## The two ES-impossible superpowers

R1 is built around two things Elasticsearch cannot do natively and a do-it-yourself client cannot do either:

1. **Query-time cross-index JOIN** on every surface — REPL, JDBC, ADBC, Arrow Flight SQL, Federation.
2. **Persisted Materialized Views** — a pre-joined, pre-aggregated index you query instantly.

The drivers are the *free* delivery channel for superpower #1. JOIN depth is *metered* — a ladder, not a wall:

- query-time JOINs (`maxJoins`)
- across clusters (`maxClusters`)
- persisted as a Materialized View (`maxMaterializedViews`)

Community sits at the bottom rung of each (2 JOINs / 1 cluster / 1 MV); Pro and Enterprise climb the ladder.
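
One way to picture the ladder is as three independent limits. The sketch below is hypothetical: the `Limits` and `Usage` classes and the `exceeded` helper are illustrative names, not the SoftClient4ES API. Only the Community values (2 JOINs / 1 cluster / 1 MV) come from this post.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Limits:
    """The three meters named in the list above."""
    max_joins: int
    max_clusters: int
    max_materialized_views: int


@dataclass(frozen=True)
class Usage:
    """What a given workload consumes (hypothetical shape)."""
    joins: int
    clusters: int
    materialized_views: int


# Community values as stated in this post: 2 JOINs / 1 cluster / 1 MV.
COMMUNITY = Limits(max_joins=2, max_clusters=1, max_materialized_views=1)


def exceeded(usage: Usage, limits: Limits) -> List[str]:
    """Return the names of the meters that this usage goes over."""
    over = []
    if usage.joins > limits.max_joins:
        over.append("maxJoins")
    if usage.clusters > limits.max_clusters:
        over.append("maxClusters")
    if usage.materialized_views > limits.max_materialized_views:
        over.append("maxMaterializedViews")
    return over


within = exceeded(Usage(joins=2, clusters=1, materialized_views=1), COMMUNITY)
two_clusters = exceeded(Usage(joins=1, clusters=2, materialized_views=0), COMMUNITY)
print(within, two_clusters)
```

Each meter is checked on its own, which is what makes it a ladder: going over one rung does not say anything about the other two.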

---

## Arrow-native, end to end

The Arrow Flight SQL surface streams results as Arrow columnar batches — zero-copy, no JSON in the hot path. The full benchmark lands in R1.1; for now, know that the columnar path exists and that DuckDB, Pandas, and Polars consume it directly through ADBC.

---

## Get started

- **Run your first JOIN in five minutes:** the [JDBC quickstart](https://softclient4es.dev/integrations/jdbc/).
- **See what each edition includes:** [editions & pricing](https://softclient4es.dev/licensing/).
<!-- pending 17.1: /sql/joins/ (JOIN matrix walkthrough — web PR #12, base release-r1) -->
- **Understand the JOIN matrix (rows 1, 2, 3):** the JOIN matrix walkthrough — `https://softclient4es.dev/sql/joins/` (publishing alongside R1).

R1 is the release where Elasticsearch learned to JOIN. The next post walks the JOIN matrix row by row.

🔗 GitHub: https://github.com/SOFTNETWORK-APP/SoftClient4ES
💼 Follow for more: https://www.linkedin.com/company/softnetwork-app/
48 changes: 48 additions & 0 deletions blogs/how-customers-use-r1.md
@@ -0,0 +1,48 @@
# How teams are using R1

![SoftClient4ES Logo](https://raw.githubusercontent.com/SOFTNETWORK-APP/SoftClient4ES/main/logo_375x300.png)

*R1 blog series — Post 4 of 4 (week 3). Previously: "SRE cross-cluster incident triage in one SQL query."*

<!-- Channel B (Medium / personal LinkedIn). Hero image staged in elasticsql/linkedin/ at publish time. -->
<!-- P6: vignettes ship anonymized by default. Replace with a named reference ONLY if the lead confirms one in writing before publish. -->

---

## Patterns, not logos

A few weeks in, the usage patterns are clear enough to share. These are anonymized vignettes drawn from opt-in telemetry signals and early conversations — the shapes are real, the names are not. (If a team agrees to be named, we will say so explicitly.)

---

### A team running analytics off live Elasticsearch

A product-analytics team had two indices — events and accounts — and a standing weekly report that JOINed them. The old workflow exported both to a warehouse on a schedule just to run that one JOIN. With R1 they point Superset at the JDBC driver and run the JOIN against the live cluster. The warehouse hop is gone; the report is current instead of a day old.

The free Community tier covers it: one JOIN, single cluster, well under the result meter.

### A team correlating across two regions

A platform team runs Elasticsearch in two regions and kept hitting the "is this one incident or two?" wall during triage. They deployed the federation server and now JOIN the two regional clusters in a single query when an incident looks cross-regional. Running two clusters puts them on Pro; the value is the minutes saved per incident, not the licence line.

### A team replacing a denormalization pipeline

A data team had been denormalizing at index time — duplicating customer fields onto every order document — purely so they could "JOIN" later. With cross-index JOIN they stopped duplicating and let the query do the work. Smaller indices, no re-index when a customer attribute changes.

---

## The common thread

Every one of these started the same way: a JOIN that Elasticsearch could not do, worked around with ETL or denormalization. R1 removed the workaround. The teams that adopt fastest are the ones who already had the JOIN in their head and just needed somewhere to type it.

---

## Where to go next

- **See which tier fits your shape:** [editions & pricing](https://softclient4es.dev/licensing/).
- **What we collect, and how to opt out:** [privacy & telemetry](https://softclient4es.dev/privacy/telemetry/).

Want to be a named reference? Reach out — we would rather show your real numbers than an anonymized sketch.

🔗 GitHub: https://github.com/SOFTNETWORK-APP/SoftClient4ES
💼 Follow for more: https://www.linkedin.com/company/softnetwork-app/
79 changes: 79 additions & 0 deletions blogs/join-matrix-walkthrough.md
@@ -0,0 +1,79 @@
# The JOIN matrix, explained: rows 1, 2, 3

![SoftClient4ES Logo](https://raw.githubusercontent.com/SOFTNETWORK-APP/SoftClient4ES/main/logo_375x300.png)

*R1 blog series — Post 2 of 4 (week 1). Previously: "Announcing R1 — Cross-cluster JOIN on Elasticsearch." Next: "SRE cross-cluster incident triage in one SQL query."*

<!-- Channel B (Medium / personal LinkedIn). Hero image staged in elasticsql/linkedin/ at publish time. -->

---

## One feature, three execution shapes

"Cross-index JOIN on Elasticsearch" is one promise, but underneath it runs three different ways depending on *where* the data lives. We call them the three rows of the JOIN matrix. You never pick a row — the planner does, by counting how many distinct clusters your query touches. But understanding the rows tells you what to expect.

---

## Row 1 — same-cluster passthrough

Both indices live in the *same* Elasticsearch cluster. The driver pushes the scans down to ES, hands the rows to an embedded DuckDB engine in-process, and DuckDB does the JOIN. No coordinator, no network hop between clusters.

```sql
SELECT e.emp_id, e.name, d.dept_name
FROM jdbc_join_emp e
JOIN jdbc_join_dept d ON e.dept_id = d.id;
```

This is the free path in Community. INNER / LEFT / RIGHT / FULL all work; so do `INSERT ... SELECT`, `CREATE TABLE AS SELECT`, `ON CONFLICT` upserts, and prepared statements. The same DuckDB engine backs JDBC, ADBC, Arrow Flight SQL, and the REPL — Row 1 looks identical on every surface.

---

## Row 2 — cross-cluster conveyor

The JOIN spans two clusters, and you are *writing* the result somewhere (an `INSERT` or `CTAS` whose target is on one of them). The coordinator runs the source `SELECT`, streams the rows to the target cluster's sidecar, and bulk-loads them. Source scan → coordinator → target sidecar bulk-load. That conveyor is Row 2.

---

## Row 3 — multi-source coordinator

The JOIN reads from several clusters at once. Each leg is scanned independently, materialized to a per-query scratch view in the coordinator's DuckDB, and joined coordinator-local. The catalog prefix names the cluster:

```sql
SELECT o.id, c.name
FROM `prod_us`.orders o
JOIN `prod_eu`.customers c ON o.customer_id = c.id;
```

R1's Row 3 is multi-**ES**-cluster. Heterogeneous sources (Postgres, MySQL, Snowflake) are an R2b concern — the architecture is ready, the connectors come later.
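
The row selection described above can be sketched in a few lines. This is an illustration under stated assumptions, not the actual planner: `pick_row` and the `Row` enum are hypothetical names, and the rule that any cross-cluster write is treated as Row 2 is an assumption, since this post does not spell out how a write whose source JOIN itself spans several clusters is handled.

```python
from enum import Enum
from typing import Optional, Sequence


class Row(Enum):
    PASSTHROUGH = 1   # Row 1: everything lives in one cluster
    CONVEYOR = 2      # Row 2: cross-cluster write (INSERT / CTAS)
    COORDINATOR = 3   # Row 3: read that joins several clusters


def pick_row(source_clusters: Sequence[str],
             target_cluster: Optional[str] = None) -> Row:
    """Choose a matrix row from the distinct clusters a query touches.

    source_clusters: catalog prefix of each table being read.
    target_cluster:  catalog prefix of the write target, or None for a read.
    """
    touched = set(source_clusters)
    if target_cluster is not None:
        touched.add(target_cluster)

    if len(touched) <= 1:
        return Row.PASSTHROUGH
    # Assumption: a write that crosses clusters is always the conveyor.
    if target_cluster is not None:
        return Row.CONVEYOR
    return Row.COORDINATOR


print(pick_row(["prod_us", "prod_us"]))       # same cluster read
print(pick_row(["prod_us"], "prod_eu"))       # cross-cluster write
print(pick_row(["prod_us", "prod_eu"]))       # multi-cluster read
```

The only input is the set of catalog prefixes, which matches the claim that you never pick a row yourself: the query text already carries everything needed to decide.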

---

## Counting JOINs: the meter

The quota counts JOIN operators, not tables. An N-table query has N−1 JOINs:

- 2 tables → 1 JOIN
- 3 tables → 2 JOINs
- 4 tables → 3 JOINs

Community's `maxJoins=2` means up to a 3-table query is free. `UNNEST` (flattening a nested array) does **not** count against the meter. Exceed your tier and the query is rejected with a clear upgrade message — no silent truncation of the JOIN itself.

| Tier | maxJoins | Largest query allowed |
|---|---|---|
| Community | 2 | 3-table |
| Pro | 5 | 6-table |
| Enterprise | ∞ | unbounded |
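
The arithmetic above is small enough to sketch. The function names and the use of `PermissionError` are illustrative assumptions, not the product's API or its actual error message; the `maxJoins` values come from the table.

```python
import math

# maxJoins per tier, taken from the table above.
MAX_JOINS = {"community": 2, "pro": 5, "enterprise": math.inf}


def join_count(n_tables: int) -> int:
    """An N-table query has N-1 JOIN operators."""
    if n_tables < 1:
        raise ValueError("a query needs at least one table")
    return n_tables - 1


def check_quota(tier: str, n_tables: int) -> int:
    """Return the JOIN count if the tier allows it, otherwise reject.

    The rejection mirrors the behaviour described above: the whole query
    is refused rather than being silently truncated.
    """
    joins = join_count(n_tables)
    limit = MAX_JOINS[tier]
    if joins > limit:
        raise PermissionError(
            f"{joins} JOINs exceeds maxJoins={limit} for tier '{tier}'"
        )
    return joins


print(check_quota("community", 3))  # 3 tables, 2 JOINs, allowed
print(check_quota("pro", 6))        # 6 tables, 5 JOINs, allowed
```

A 4-table query on Community would raise instead of returning, because 3 JOINs is over the limit of 2.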

---

## Where to go next

- **Run Row 1 yourself in five minutes:** the [JDBC quickstart](https://softclient4es.dev/integrations/jdbc/).
- **See how the meter maps to price:** [editions & pricing](https://softclient4es.dev/licensing/).
<!-- pending 17.1: /sql/joins/ (JOIN matrix walkthrough — web PR #12, base release-r1) -->
- **The full matrix reference, with every variant:** the JOIN matrix walkthrough — `https://softclient4es.dev/sql/joins/`.

Next week: one SQL query that triages an incident spanning three regions.

🔗 GitHub: https://github.com/SOFTNETWORK-APP/SoftClient4ES
💼 Follow for more: https://www.linkedin.com/company/softnetwork-app/
55 changes: 55 additions & 0 deletions blogs/sre-cross-cluster-incident-triage.md
@@ -0,0 +1,55 @@
# SRE cross-cluster incident triage in one SQL query

![SoftClient4ES Logo](https://raw.githubusercontent.com/SOFTNETWORK-APP/SoftClient4ES/main/logo_375x300.png)

*R1 blog series — Post 3 of 4 (week 2). Previously: "The JOIN matrix, explained: rows 1, 2, 3." Next: "How teams are using R1."*

<!-- Channel B (Medium / personal LinkedIn). Hero image staged in elasticsql/linkedin/ at publish time. -->

---

## 2 a.m., three regions, one incident

Your platform runs Elasticsearch per region — `prod_us`, `prod_eu`, `prod_fr`. Latency spikes in EU; checkout errors climb in the US. Are they the same incident? With a cluster per region, the honest answer is usually "open three dashboards and eyeball the timestamps."

That is the gap R1 closes. With the federation server in front of your regional clusters, you JOIN across them in a single query and read the correlation directly, with no per-region tab-switching and no exporting to a third system first.

---

## One query instead of N dashboards

The catalog prefix routes each leg to its regional cluster. The shape below is illustrative — **substitute your own per-region index names**; the mechanics are exactly the verified cross-cluster JOIN:

```sql
-- Illustrative: per-region index names are yours to choose.
SELECT u.user_id, u.action, e.error_code, e.region
FROM `prod_us`.user_events u
JOIN `prod_eu`.error_events e
ON u.user_id = e.user_id
WHERE e.ts > NOW() - INTERVAL '15' MINUTE
ORDER BY e.ts;
```

The narrative — correlate logs, metrics, and traces across regions — is the use case. The engine underneath is the same Row 3 multi-source coordinator from last week's post: each leg scanned independently, joined coordinator-local. No data leaves a region except the rows the JOIN actually needs.

---

## What it takes

Cross-cluster correlation runs on the federation server, which is Pro+ (it spans two or more clusters, and the cluster meter is the paywall). A single-cluster team gets the same JOIN mechanics for free within one cluster — you only need federation once an incident genuinely crosses cluster boundaries.

No perf claims here on purpose: the quantified benchmark is R1.1's beat, not this one. The point of this post is *expressiveness* — one query where there used to be a manual cross-reference.

---

## Where to go next

- **See what Pro+ includes and why federation sits there:** [editions & pricing](https://softclient4es.dev/licensing/).
- **Run the single-cluster JOIN first, for free:** the [JDBC quickstart](https://softclient4es.dev/integrations/jdbc/).
<!-- pending 17.2: /integrations/federation-helm/ (federation operator guide — web PR #8, base release-r1) -->
- **Deploy the federation server:** the federation operator guide — `https://softclient4es.dev/integrations/federation-helm/` (publishing with R1).

Next week: how teams are putting R1 to work.

🔗 GitHub: https://github.com/SOFTNETWORK-APP/SoftClient4ES
💼 Follow for more: https://www.linkedin.com/company/softnetwork-app/
38 changes: 38 additions & 0 deletions linkedin/r1-bi-showcase.md
@@ -0,0 +1,38 @@
# LinkedIn — R1 BI-Tool Showcase

> **Angle**: BI integration, tested vs compatible. Connect your existing tools to Elasticsearch via JDBC / Arrow Flight SQL, with honest framing.
> **Target**: Data analysts, BI engineers, data platform teams
> **Demo**: `docker compose --profile superset-flight up`
> **Timing**: J+14 (two weeks after launch)

---

🔌 Your BI stack already speaks SQL. Now Elasticsearch does too.

R1 turns Elasticsearch into a SQL source your existing tools can query — with cross-index JOINs they could never do before. Here's where each tool stands, honestly:

✅ Tested — we ran them through verification:
• Apache Superset (dedicated dialect)
• DBeaver
• DataGrip
• Grafana (via Arrow Flight SQL)

🔄 Compatible — the protocol works, formal regression is best-effort:
• Tableau
• Power BI
• Metabase
• DbVisualizer

We don't upgrade a tool's tier to sound better. "Tested" means we tested it. "Compatible" means it should work and we'll tell you what to watch for.

A note for the Power BI folks: use Import mode with explicit JOIN SQL. (No, there is no magic "default driver" — you write the JOIN, the driver runs it against ES.)

Honest gap, on purpose: explicit JOIN SQL works today. Full subquery and CTE support arrives in R2a. Every compatible-tier page says so up front, so you're never surprised mid-dashboard.

Connect your tool 👉 https://softclient4es.dev/integrations/jdbc/
Tested integrations 👉 https://softclient4es.dev/integrations/superset/ · https://softclient4es.dev/integrations/dbeaver/ · https://softclient4es.dev/integrations/grafana/

🔗 GitHub: https://github.com/SOFTNETWORK-APP/SoftClient4ES
💼 Follow for more: https://www.linkedin.com/company/softnetwork-app/

#Elasticsearch #BusinessIntelligence #Tableau #PowerBI #Metabase #SQL #DataAnalytics
38 changes: 38 additions & 0 deletions linkedin/r1-demo-teaser.md
@@ -0,0 +1,38 @@
# LinkedIn — R1 Demo Teaser

> **Angle**: Demo/screenshot of a live single-cluster JOIN in Superset (or the REPL), started with a single docker-compose command.
> **Target**: Data analysts, data engineers, BI teams on Elasticsearch
> **Demo**: `docker compose --profile superset-flight up`
> **Timing**: J+7 (one week after launch)

---

📊 A cross-index JOIN on Elasticsearch — running live in Apache Superset. One docker-compose command.

Last week R1 shipped cross-index JOIN. This week, watch it.

```bash
docker compose --profile superset-flight up
```

Superset opens fully provisioned. Point it at the Arrow Flight SQL endpoint and run:

```sql
SELECT e.name, d.dept_name, e.salary
FROM jdbc_join_emp e
JOIN jdbc_join_dept d ON e.dept_id = d.id
ORDER BY e.salary DESC;
```

Two Elasticsearch indices. One JOIN. No Lucene, no JSON DSL, no warehouse in between.

[SCREENSHOT: Superset results grid showing the joined employees × departments rows — capture from the demo profile before posting.]

This is the free single-cluster path — up to 2 cross-index JOINs per query in Community. The same query runs unchanged from the REPL, JDBC, and ADBC.

Try it yourself 👉 https://softclient4es.dev/integrations/jdbc/

🔗 GitHub: https://github.com/SOFTNETWORK-APP/SoftClient4ES
💼 Follow for more: https://www.linkedin.com/company/softnetwork-app/

#Elasticsearch #ApacheSuperset #SQL #BI #DataEngineering #ArrowFlightSQL
46 changes: 46 additions & 0 deletions linkedin/r1-editorial-calendar.md
@@ -0,0 +1,46 @@
# R1 Editorial Calendar — Blog + LinkedIn

Planning note for the R1 launch marketing waves. J+0 = R1 launch day. Blog posts are
Channel B (Medium / personal LinkedIn first; company page reshares). LinkedIn drafts live
in `elasticsql/linkedin/`. Nothing here is auto-published — each beat is posted manually
by the project lead.

## Cadence

| When | Blog post (`elasticsql/blogs/`) | LinkedIn draft (`elasticsql/linkedin/`) |
|---|---|---|
| **J+0** launch day | `announcing-r1-cross-cluster-join.md` | `r1-launch-announcement.md` |
| **J+7** week 1 | `join-matrix-walkthrough.md` | `r1-demo-teaser.md` |
| **J+14** week 2 | `sre-cross-cluster-incident-triage.md` | `r1-bi-showcase.md` |
| **J+21** week 3 | `how-customers-use-r1.md` | `r1-federation-reveal.md` |
| **J+28** week 4 | — *(Epic 18 owns this beat)* | — *(Epic 18: R1.1 Arrow zero-copy benchmark)* |

Publish on the **personal LinkedIn account first** (far higher reach than the company page),
then reshare from the company page (`linkedin.com/company/softnetwork-app`).

## Timing guards (hard, not suggestions)

### Epic-18 non-overlap guard
- The R1.1 Arrow Flight SQL **zero-copy benchmark** is Epic 18's beat, scheduled for **week 4 (J+28)**.
- The 17.8 marketing waves **END at week 3** (federation reveal). They do NOT spill into week 4.
- **The week-3 federation reveal carries NO performance claim** — no "faster", no Nx, no ms,
no rows/sec. Cross-cluster is sold on *expressiveness* (one query instead of N dashboards),
not speed. The speed story is reserved for R1.1 so the benchmark lands with full impact.

### R2a 90-day quiet-window hold
- **HOLD: J+0 → J+90 — no R2a teasers.** No subqueries, no CTEs, no "coming soon: WITH",
no set-op (UNION-dedup / INTERSECT / EXCEPT) previews on any marketing surface for the
first 90 days post-R1. This protects the R1 launch window from being diluted.
- Honest *limitation* framing (e.g. "explicit JOIN works today; subqueries land in R2a")
is allowed and encouraged on docs pages — that is expectation-setting, not a teaser.
The hold is on *promotional* R2a content, not on honest gap disclosure.
- **Policing owner: the project lead / founder marketer.** If launch momentum tempts an
early R2a reveal, the lead arbitrates and the default is "wait." Same owner decides
whether Blog Post 4 names a real customer (default: ships anonymized — see P6).

## Notes

- LinkedIn account credentials / scheduling access: open question (epic OQ#3) — drafts are
committed regardless, so they can be posted manually at each beat even if no scheduler exists.
- Hero images for the blog posts and the LinkedIn carousel are staged in `elasticsql/linkedin/`
(`softclient4es-carousel.pdf`, `web1.png`..`web5.png`) — reuse or refresh per beat.