Description
When a SQL query uses ORDER BY on an aggregation alias (e.g., ORDER BY "Total Spend" DESC), the ordering is not applied to the Elasticsearch terms aggregation. The resulting ES query has no "order" clause in the terms aggregation, so results come back in the default doc_count order.
Failing Query (generated by Apache Superset)
SELECT customer_name AS customer_name, country AS country,
       SUM(total_price) AS "Total Spend", COUNT(*) AS "Orders"
FROM ecommerce
GROUP BY customer_name, country
ORDER BY "Total Spend" DESC
LIMIT 10
Root Cause
In ElasticAggregation.apply() (bridge/.../ElasticAggregation.scala), the direction for an aggregation is resolved by looking up the aggregation's Identifier properties in bucketsDirection (which comes from request.sorts):
val direction =
  bucketsDirection
    .get(identifier.identifierName) // e.g., "SUM(total_price)": no match
    .orElse(bucketsDirection.get(identifier.aliasOrName)) // e.g., "total_price": no match
bucketsDirection contains "Total Spend" -> Desc (from ORDER BY "Total Spend" DESC).
However:
identifier.identifierName = "SUM(total_price)" (reconstructed via functions) — no match
identifier.aliasOrName = identifier.fieldAlias.getOrElse(identifier.name) = "total_price" — no match
The Identifier.fieldAlias is still None because SingleSearch.update() has not been called at this point. The alias "Total Spend" exists only on the Field level (Field.fieldAlias = Some(Alias("Total Spend"))), not on the Identifier level.
So direction = None, and the aggregation has no ordering. Later, in buildBuckets, aggregationsDirection is empty because no aggregation reported a direction.
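The failed lookup can be reproduced in isolation. The sketch below is only an approximation: `Identifier`, `Field`, `Alias`, and `SortOrder` are simplified, hypothetical stand-ins for the library's types, and `direction` is a free-standing function written for this example, not the code in `ElasticAggregation.apply()`.

```scala
object DirectionLookupRepro {
  sealed trait SortOrder
  case object Asc extends SortOrder
  case object Desc extends SortOrder

  // Hypothetical, simplified stand-ins for the library's types.
  final case class Alias(alias: String)

  final case class Identifier(
    name: String,
    functions: List[String] = Nil,
    fieldAlias: Option[String] = None // still None before SingleSearch.update()
  ) {
    // Rebuilds e.g. "SUM(total_price)" from the function chain.
    def identifierName: String =
      functions.foldRight(name)((fn, inner) => s"$fn($inner)")

    def aliasOrName: String = fieldAlias.getOrElse(name)
  }

  final case class Field(identifier: Identifier, fieldAlias: Option[Alias])

  // Direction resolution as described above: only Identifier-level names are tried.
  def direction(field: Field, bucketsDirection: Map[String, SortOrder]): Option[SortOrder] =
    bucketsDirection
      .get(field.identifier.identifierName)
      .orElse(bucketsDirection.get(field.identifier.aliasOrName))

  // SUM(total_price) AS "Total Spend"
  val totalSpend: Field =
    Field(Identifier("total_price", List("SUM")), Some(Alias("Total Spend")))

  // ORDER BY "Total Spend" DESC
  val bucketsDirection: Map[String, SortOrder] = Map("Total Spend" -> Desc)

  def main(args: Array[String]): Unit =
    println(direction(totalSpend, bucketsDirection)) // None
}
```

Neither `"SUM(total_price)"` nor `"total_price"` is a key of the map, so the lookup yields `None` even though the alias is present on the `Field`.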
Fix
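In ElasticAggregation.apply(), also look up the Field-level alias in bucketsDirection by appending one more fallback to the direction resolution:
.orElse(fieldAlias.flatMap(a => bucketsDirection.get(a.alias)))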
Since import sqlAgg._ brings Field members into scope, fieldAlias is Field.fieldAlias = Some(Alias("Total Spend")). This correctly resolves bucketsDirection.get("Total Spend") = Some(Desc).
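A minimal sketch of the proposed fallback, under the same caveats: the types are simplified, hypothetical stand-ins, and the `Field` is passed explicitly here instead of relying on `import sqlAgg._` to bring `fieldAlias` into scope.

```scala
object DirectionLookupFix {
  sealed trait SortOrder
  case object Asc extends SortOrder
  case object Desc extends SortOrder

  // Hypothetical, simplified stand-ins for the library's types.
  final case class Alias(alias: String)

  final case class Identifier(
    name: String,
    functions: List[String] = Nil,
    fieldAlias: Option[String] = None
  ) {
    def identifierName: String =
      functions.foldRight(name)((fn, inner) => s"$fn($inner)")

    def aliasOrName: String = fieldAlias.getOrElse(name)
  }

  final case class Field(identifier: Identifier, fieldAlias: Option[Alias])

  def direction(field: Field, bucketsDirection: Map[String, SortOrder]): Option[SortOrder] =
    bucketsDirection
      .get(field.identifier.identifierName)
      .orElse(bucketsDirection.get(field.identifier.aliasOrName))
      // Proposed extra fallback: the alias carried by the Field itself.
      .orElse(field.fieldAlias.flatMap(a => bucketsDirection.get(a.alias)))

  val totalSpend: Field =
    Field(Identifier("total_price", List("SUM")), Some(Alias("Total Spend")))

  val bucketsDirection: Map[String, SortOrder] = Map("Total Spend" -> Desc)

  def main(args: Array[String]): Unit =
    println(direction(totalSpend, bucketsDirection)) // Some(Desc)
}
```

Placing the new lookup last keeps the two existing lookups authoritative, so queries that already resolve a direction through the Identifier should behave as before.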
Files to modify
| File | Change |
| --- | --- |
| bridge/.../ElasticAggregation.scala | Add .orElse(fieldAlias.flatMap(a => bucketsDirection.get(a.alias))) to direction resolution |
Tests
Integration test: Query with ORDER BY "Total Spend" DESC should produce ES aggregation with "order": {"Total Spend": "desc"} on the terms bucket
Parser + bridge test: Verify direction is Some(Desc) for aliased aggregation with ORDER BY on alias
Additional Problem — ORDER BY aggregation not in SELECT
A related gap exists: if an aggregation appears only in ORDER BY (not in SELECT), the aggregation is never created in the Elasticsearch query.
SELECT profile.city
FROM dql_users
GROUP BY profile.city
ORDER BY COUNT(*) DESC
Here COUNT(*) is only in ORDER BY. SingleSearch.aggregates does not extract aggregations from ORDER BY (only from SELECT, HAVING, and WHERE since issue #53). The aggregation is never passed to the bridge layer, so the terms aggregation cannot reference it in its "order" clause.
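The gap can be illustrated with a toy model. Everything in the sketch below (`Expr`, `Search`, and the shape of `aggregates`) is a hypothetical simplification made up for this example; it only mirrors the behaviour described above, where ORDER BY is not consulted when aggregations are collected.

```scala
object OrderByAggregationGap {
  // Hypothetical, simplified model of a parsed query; not the library's real types.
  final case class Expr(text: String, isAggregation: Boolean)

  final case class Search(
    select: List[Expr],
    having: List[Expr] = Nil,
    where: List[Expr] = Nil,
    orderBy: List[Expr] = Nil
  ) {
    // Mirrors the described behaviour: SELECT, HAVING and WHERE are scanned, ORDER BY is not.
    def aggregates: List[Expr] =
      (select ++ having ++ where).filter(_.isAggregation).distinct
  }

  // SELECT profile.city FROM dql_users GROUP BY profile.city ORDER BY COUNT(*) DESC
  val query: Search = Search(
    select = List(Expr("profile.city", isAggregation = false)),
    orderBy = List(Expr("COUNT(*)", isAggregation = true))
  )

  def main(args: Array[String]): Unit =
    println(query.aggregates) // List()
}
```

With no aggregation collected, there is nothing for the terms bucket's "order" clause to point at.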
Expected ES aggregation (for the failing Superset query)
{
  "aggs": {
    "customer_name": {
      "terms": {
        "field": "customer_name",
        "size": 11,
        "min_doc_count": 1,
        "order": { "Total Spend": "desc" }
      },
      "aggs": {
        "country": {
          "terms": { "field": "country", "size": 11, "min_doc_count": 1 },
          "aggs": {
            "Total Spend": { "sum": { "field": "total_price" } },
            "Orders": { "value_count": { "field": "_index" } }
          }
        }
      }
    }
  }
}
Actual ES aggregation (no "order")
{
  "aggs": {
    "customer_name": {
      "terms": { "field": "customer_name", "size": 11, "min_doc_count": 1 },
      "aggs": {
        "country": {
          "terms": { "field": "country", "size": 11, "min_doc_count": 1 },
          "aggs": {
            "Total Spend": { "sum": { "field": "total_price" } },
            "Orders": { "value_count": { "field": "_index" } }
          }
        }
      }
    }
  }
}
Suggested approach (ORDER BY aggregation not in SELECT)
Add extractAggregationFields to FieldSort (similar to Criteria.extractAggregationFields added in "Aggregations referenced only in HAVING or WHERE are not created" #53)
Have SingleSearch.aggregates also collect ORDER BY aggregations
Key observations
FieldSort in OrderBy.scala already has hasAggregation and isBucketScript but no extractAggregationFields
ORDER BY maps to a different Elasticsearch mechanism ("order" on the terms agg) than HAVING (bucket_selector)
Impact
Any query with ORDER BY <agg_alias> gets wrong ordering
The size (from LIMIT) is correct, but without proper ordering, the top-N results are meaningless
Related
[…] SingleSearch.aggregates)
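As a rough illustration of the suggested approach above, the sketch below gives a toy `FieldSort` an `extractAggregationFields` helper and lets a toy `Search.aggregates` consume it. All types and method bodies here are hypothetical simplifications; only the names `FieldSort`, `hasAggregation`, `extractAggregationFields`, and `aggregates` come from the issue, and the real signatures may differ.

```scala
object OrderByAggregationSketch {
  // Hypothetical, simplified model; not the library's real types.
  final case class Expr(text: String, isAggregation: Boolean)

  final case class FieldSort(expr: Expr, descending: Boolean) {
    def hasAggregation: Boolean = expr.isAggregation

    // Proposed helper: expose the aggregation referenced by this sort, if any.
    def extractAggregationFields: List[Expr] =
      if (hasAggregation) List(expr) else Nil
  }

  final case class Search(select: List[Expr], orderBy: List[FieldSort] = Nil) {
    // Collect aggregations from SELECT and, additionally, from ORDER BY.
    def aggregates: List[Expr] =
      (select.filter(_.isAggregation) ++ orderBy.flatMap(_.extractAggregationFields)).distinct
  }

  // SELECT profile.city FROM dql_users GROUP BY profile.city ORDER BY COUNT(*) DESC
  val query: Search = Search(
    select = List(Expr("profile.city", isAggregation = false)),
    orderBy = List(FieldSort(Expr("COUNT(*)", isAggregation = true), descending = true))
  )

  def main(args: Array[String]): Unit =
    println(query.aggregates.map(_.text)) // List(COUNT(*))
}
```

The `distinct` call is there so that an aggregation appearing in both SELECT and ORDER BY is created only once in this model.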