force_external_table_query_rewrite semantics are misleading: DISABLE only affects non-ref external base tables for partitioned MVs #71963

@HangyuanLiu

Describe the bug

The MV property force_external_table_query_rewrite is documented and named as if it controls whether query rewrite is enabled for external-table-based materialized views in general.

However, the current implementation does not behave like a global rewrite switch:

  • For partitioned MVs, DISABLE only affects external non-ref base tables.
  • If the external table is the ref base table that provides the MV partition column, DISABLE does not suppress rewrite.
  • For this property itself, the accepted enum values effectively collapse into two behaviors:
    • DISABLE
    • everything else (CHECKED / LOOSE / NOCHECK / FORCE_MV / legacy true)
  • The actual freshness / consistency behavior for ref-base rewrite is controlled by query_rewrite_consistency, not by force_external_table_query_rewrite.

As a result, the property name and current docs are misleading for users.

Expected behavior

One of the following should be true:

  1. force_external_table_query_rewrite=DISABLE should consistently disable rewrite for external-table-based MVs, including the common single-table ref-base topology.
  2. The property should be renamed or documented clearly to reflect its real scope, for example that it only gates rewrite behavior for external non-ref base tables in partitioned MV timeliness checks.

At minimum, users should not reasonably interpret this property as a global on/off switch for external MV rewrite if that is not what it does.

Actual behavior

For a partitioned MV built on a single external table, where the MV partition column is derived from that table:

  • force_external_table_query_rewrite=DISABLE does not prevent rewrite.
  • The MV can still be selected for rewrite.
  • Setting query_rewrite_consistency=DISABLE does suppress rewrite in the same topology.

This makes the two properties look overlapping from a user perspective, but they actually control different things.

Minimal reproduction

CREATE MATERIALIZED VIEW test_mv1
PARTITION BY dt
REFRESH DEFERRED MANUAL
PROPERTIES (
  "replication_num" = "1"
)
AS
SELECT dt, sum(val) AS sv
FROM ext_catalog.db.t1
GROUP BY dt;

REFRESH MATERIALIZED VIEW test_mv1 WITH SYNC MODE;

ALTER MATERIALIZED VIEW test_mv1
SET ("force_external_table_query_rewrite" = "DISABLE");

SELECT dt, sum(val)
FROM ext_catalog.db.t1
GROUP BY dt;

Observed: rewrite can still hit test_mv1.

In contrast, setting:

ALTER MATERIALIZED VIEW test_mv1
SET ("query_rewrite_consistency" = "DISABLE");

does suppress rewrite in the same topology.

Why this is confusing

From the property name and documentation, users will naturally expect:

  • force_external_table_query_rewrite=DISABLE means "do not rewrite using this external-table MV"

But the current behavior is closer to:

  • for partitioned MVs, DISABLE gates rewrite only through external non-ref base tables
  • do not gate the external ref base table path

This is especially confusing for the most common topology: a partitioned MV on a single external base table.

Code path / root cause

The property is parsed as QueryRewriteConsistencyMode:

  • true -> CHECKED
  • false -> DISABLE
  • enum values are also accepted

Relevant code:

  • fe/fe-core/src/main/java/com/starrocks/catalog/TableProperty.java
  • analyzeExternalTableQueryRewrite(...)
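
The mapping above can be sketched as a small standalone model. This is not the StarRocks source: the class name ExternalRewriteModeSketch, the nested Mode enum, and the parse helper are hypothetical, and the case-insensitive, whitespace-trimming match is an assumption made for illustration.

```java
import java.util.Locale;

// Hypothetical stand-in for the parsing described in this issue; not the StarRocks code.
final class ExternalRewriteModeSketch {
    // Mirrors the enum values this issue lists as accepted by the property.
    enum Mode { DISABLE, LOOSE, CHECKED, NOCHECK, FORCE_MV }

    // Legacy booleans map to CHECKED / DISABLE; anything else must be an enum name.
    // Assumption: matching ignores case and surrounding whitespace.
    static Mode parse(String raw) {
        String v = raw.trim().toUpperCase(Locale.ROOT);
        if (v.equals("TRUE")) {
            return Mode.CHECKED;
        }
        if (v.equals("FALSE")) {
            return Mode.DISABLE;
        }
        // Throws IllegalArgumentException for values outside the enum.
        return Mode.valueOf(v);
    }
}
```

In this model, parse("true") and parse("CHECKED") yield the same value, which is why the legacy boolean form and the enum form are indistinguishable after parsing.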

But the actual read sites only check whether the value is DISABLE:

  • fe/fe-core/src/main/java/com/starrocks/catalog/mv/MVTimelinessArbiter.java
    • needsRefreshOnNonRefBaseTables(...)
    • skips ref base tables first, then applies DISABLE only to external non-ref base tables
  • fe/fe-core/src/main/java/com/starrocks/catalog/mv/MVTimelinessNonPartitionArbiter.java
    • for non-partitioned MVs, any external base table under DISABLE returns fullRefresh, which blocks rewrite
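
The two read sites described above can be modeled roughly as follows. This is a simplified sketch, not the arbiter code: DisableGateSketch, BaseTable, and blocksRewrite are hypothetical names, and the real "returns fullRefresh" outcome is reduced to a boolean meaning "rewrite is blocked".

```java
import java.util.List;

// Hypothetical model of the DISABLE gate as described in this issue; not the StarRocks code.
final class DisableGateSketch {
    // Minimal description of one base table of an MV.
    static final class BaseTable {
        final boolean external; // table lives in an external catalog
        final boolean refBase;  // table provides the MV partition column

        BaseTable(boolean external, boolean refBase) {
            this.external = external;
            this.refBase = refBase;
        }
    }

    // Returns true when force_external_table_query_rewrite=DISABLE would block rewrite.
    static boolean blocksRewrite(boolean partitionedMv, boolean disabled, List<BaseTable> tables) {
        if (!disabled) {
            return false; // every non-DISABLE value behaves the same at this gate
        }
        for (BaseTable t : tables) {
            if (partitionedMv && t.refBase) {
                continue; // partitioned path: ref base tables are skipped before the check
            }
            if (t.external) {
                return true; // external table reached the check under DISABLE
            }
        }
        return false;
    }
}
```

Under this model, the reproduction topology (a partitioned MV with one external ref base table) is never blocked, while the same table under a non-partitioned MV is.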

Meanwhile, the real rewrite consistency control for query rewrite is handled by:

  • fe/fe-core/src/main/java/com/starrocks/catalog/MvRefreshArbiter.java
  • query_rewrite_consistency

So in practice, force_external_table_query_rewrite is not a general rewrite-consistency property even though it accepts the same enum values.

Additional inconsistency

The property accepts:

  • DISABLE
  • LOOSE
  • CHECKED
  • NOCHECK
  • FORCE_MV

In this property's current behavior, however, the timeliness gate does not meaningfully distinguish between values other than DISABLE. This makes the supported value set look much richer than the actual behavior.

Suggested fixes

Any of these would help:

  1. Make force_external_table_query_rewrite=DISABLE consistently disable rewrite for external-table MVs, including ref-base topology.
  2. Rename the property, keeping the old name as an alias for compatibility, so it reflects the real scope.
  3. Clearly distinguish in docs:
    • force_external_table_query_rewrite: current gate for external non-ref base tables
    • query_rewrite_consistency: actual rewrite consistency / freshness policy for ref-base rewrite
  4. Emit an analyzer warning if users set force_external_table_query_rewrite on a topology where it has no effect, such as a single external ref-base partitioned MV.
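
As a rough illustration of option 4, the check could look something like the sketch below. Everything here is hypothetical: NoEffectWarningSketch, BaseTable, warnIfNoEffect, and the warning text are invented for illustration and do not correspond to an existing StarRocks analyzer hook.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical sketch of the proposed analyzer warning; not an existing StarRocks API.
final class NoEffectWarningSketch {
    static final class BaseTable {
        final boolean external; // table lives in an external catalog
        final boolean refBase;  // table provides the MV partition column

        BaseTable(boolean external, boolean refBase) {
            this.external = external;
            this.refBase = refBase;
        }
    }

    // Returns a warning when setting the property cannot change rewrite behavior.
    static Optional<String> warnIfNoEffect(boolean partitionedMv, List<BaseTable> tables) {
        if (!partitionedMv) {
            // Non-partitioned MVs: DISABLE applies to any external base table.
            return Optional.empty();
        }
        for (BaseTable t : tables) {
            if (t.external && !t.refBase) {
                // At least one table is actually gated by the property.
                return Optional.empty();
            }
        }
        return Optional.of("force_external_table_query_rewrite has no effect here: "
                + "no external non-ref base table; use query_rewrite_consistency instead");
    }
}
```

The check only needs the MV's partitioning and base-table roles, so it could run when the property is set rather than at rewrite time.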

Impact

This looks primarily like a UX / naming / documentation bug rather than a correctness bug, but it is highly misleading:

  • users can believe rewrite was disabled when it was not
  • docs and property name encourage the wrong mental model
  • accepted enum values imply semantics that are not actually applied by this property
