Steps to reproduce the behavior (Required)
- Create an Iceberg catalog and base table with date partitions:
CREATE EXTERNAL CATALOG bug08_ice
PROPERTIES(
"type" = "iceberg",
"iceberg.catalog.type" = "hadoop",
"iceberg.catalog.warehouse" = "file:///tmp/starrocks-sql-test/iceberg/bug08_confirm"
);
SET catalog bug08_ice;
CREATE DATABASE bug08_ice_db;
USE bug08_ice_db;
CREATE TABLE t1 (id int, dt date, val int) PARTITION BY (dt);
INSERT INTO t1 VALUES
(1, '2024-01-01', 10),
(2, '2024-01-02', 20),
(3, '2024-01-03', 30),
(4, '2024-01-04', 40),
(5, '2024-01-05', 50);
- Create a partitioned MV with partition_ttl_number = 2:
SET catalog default_catalog;
CREATE DATABASE bug08_mv_db;
USE bug08_mv_db;
CREATE MATERIALIZED VIEW test_mv1
PARTITION BY dt
REFRESH DEFERRED MANUAL
PROPERTIES (
"replication_num" = "1",
"partition_ttl_number" = "2"
)
AS SELECT dt, sum(val) AS sv
FROM bug08_ice.bug08_ice_db.t1
GROUP BY dt;
- Run the initial refresh:
REFRESH MATERIALIZED VIEW test_mv1 WITH SYNC MODE;
SELECT dt FROM test_mv1 ORDER BY dt;
The MV correctly keeps only the latest 2 partitions: 2024-01-04, 2024-01-05.
- Insert one new base partition and refresh again:
INSERT INTO bug08_ice.bug08_ice_db.t1 VALUES (6, '2024-01-06', 60);
REFRESH MATERIALIZED VIEW test_mv1 WITH SYNC MODE;
SELECT dt FROM test_mv1 ORDER BY dt;
SELECT count(*) FROM test_mv1;
SHOW PARTITIONS FROM test_mv1;
Expected behavior (Required)
After every refresh, including incremental refresh, the MV should keep at most partition_ttl_number partitions.
With partition_ttl_number = 2, after inserting 2024-01-06 and refreshing, the MV should contain only:
2024-01-05
2024-01-06
So SELECT count(*) FROM test_mv1 should return 2.
Real behavior (Required)
The initial refresh behaves correctly, but the incremental refresh does not trim the old MV partition.
After inserting 2024-01-06 and refreshing again, the MV contains:
2024-01-04
2024-01-05
2024-01-06
So SELECT count(*) FROM test_mv1 returns 3.
This means partition_ttl_number is honored when the MV is first populated, but is not enforced after later incremental refreshes.
Additional observations
I reproduced this locally and the FE logs show:
- initial refresh partition diff: adds=p20240104,p20240105
- incremental refresh partition diff: adds=p20240106, deletes=
- the incremental refresh only refreshes p20240106
The current code path also looks suspicious:
RangePartitionDiffer trims the candidate add set by partition_ttl_number, which explains why the initial refresh only creates the latest N partitions.
MVPCTRefreshRangePartitioner.syncAddOrDropPartitions() only calls filterPartitionsByTTL(adds, true) on the newly added partitions, and does not trim already existing stale MV partitions after incremental refresh.
Relevant code paths:
fe/fe-core/src/main/java/com/starrocks/sql/common/RangePartitionDiffer.java
fe/fe-core/src/main/java/com/starrocks/scheduler/mv/pct/MVPCTRefreshRangePartitioner.java
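To make the suspected gap concrete, here is a minimal standalone sketch of the two behaviors. It is not the StarRocks implementation: the class TtlTrimSketch and its methods are hypothetical names made up for illustration, and it assumes partition names of the form pYYYYMMDD so that lexicographic order matches date order. The first method mimics applying the TTL only to the newly added partitions (what the logs suggest happens), the second computes which partitions would have to be dropped if the TTL were applied to existing and added partitions together (the expected behavior).

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration only; none of these names exist in StarRocks.
class TtlTrimSketch {

    // Keep only the latest n names (assumes pYYYYMMDD names sort by date).
    static TreeSet<String> latest(Collection<String> names, int n) {
        TreeSet<String> sorted = new TreeSet<>(names);
        while (sorted.size() > Math.max(n, 0)) {
            sorted.pollFirst(); // discard the oldest partition
        }
        return sorted;
    }

    // Suspected current behavior: the TTL limits only the added partitions,
    // so partitions that already exist in the MV are never removed.
    static List<String> resultWhenTrimmingAddsOnly(Collection<String> existing,
                                                   Collection<String> adds,
                                                   int ttlNumber) {
        TreeSet<String> result = new TreeSet<>(existing);
        result.addAll(latest(adds, ttlNumber));
        return new ArrayList<>(result);
    }

    // Expected behavior: apply the TTL to existing + added partitions and
    // report every partition that falls outside the latest ttlNumber.
    static List<String> partitionsToDrop(Collection<String> existing,
                                         Collection<String> adds,
                                         int ttlNumber) {
        TreeSet<String> all = new TreeSet<>(existing);
        all.addAll(adds);
        TreeSet<String> keep = latest(all, ttlNumber);
        List<String> drop = new ArrayList<>();
        for (String name : all) {
            if (!keep.contains(name)) {
                drop.add(name);
            }
        }
        return drop;
    }

    public static void main(String[] args) {
        List<String> existing = List.of("p20240104", "p20240105");
        List<String> adds = List.of("p20240106");
        // prints [p20240104, p20240105, p20240106]
        System.out.println(resultWhenTrimmingAddsOnly(existing, adds, 2));
        // prints [p20240104]
        System.out.println(partitionsToDrop(existing, adds, 2));
    }
}
```

With the values from this report, the adds-only variant ends up with three partitions, matching the observed count(*) of 3, while the combined variant identifies p20240104 as the partition that should have been dropped.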
StarRocks version (Required)
Reproduced on a local FE runtime with:
- show variables like 'version_comment' = fix/bug-28-iceberg-row-dml-reject-e9c501d
- source checkout HEAD = 189283f334c (upstream/main)
I cannot provide select current_version() output because it is not supported in this local runtime environment.