Commit 635c7e5

[Issue pixelsdb#437] redesign the schema of metadata. (pixelsdb#474)
Support the new features in pixels metadata: 1. user management and role-based authentication; 2. range-based table partitioning; 3. schema versioning; 4. path-granular catalog management. Currently, only the RPCs for catalog management are fully implemented. In addition, the metadata of tables and layouts are optimized. I also fixed a bug in layout permission conversion and a bug in layout dao's update method.
1 parent 0bf317b commit 635c7e5

81 files changed, +3693 -1671 lines changed

docs/TPC-H.md

Lines changed: 10 additions & 10 deletions
@@ -12,20 +12,16 @@ The file(s) of each table are stored in a separate directory named by the table

 ## Create TPC-H Database
 Log in trino-cli and use the SQL statements in `scripts/sql/tpch_schema.sql` to create the TPC-H database in Pixels.
-Change the value of the `storage` table property in the create-table statement to `hdfs` if HDFS is used as the
-underlying storage system instead of S3.
-> Note that trino-cli can execute only one SQL statement at each time.
+In each `CREATE TABLE` statement, the table property `storage` defines the type of storage system used to store the
+files in this table, whereas the table property `paths` defines the URIs of the paths in which the table files are stored.
+Multiple URIs can be listed one after another, separated by semicolons, in `paths`.
+> Note that the URIs in `paths` can have no storage scheme or their storage scheme must be consistent with `storage`.

 Then, use `SHOW SCHEMAS` and `SHOW TABLES` statements to check if the tpch database has been
 created successfully.

-Connect to MySQL using the user `pixels`, and execute the SQL statements in `scripts/sql/tpch_layouts.sql`
-to create the table layouts for the tables in the TPC-H database in Pixels. Actually, these layouts
-should be created by the storage layout optimizer ([Rainbow](https://ieeexplore.ieee.org/document/8509421)).
-However, we directly load the layouts here for simplicity.
-
 Create the container to store the tables in S3. The container name is the same as the hostname
-(e.g., `pixels-tpch`) in the `LAYOUT_ORDER_PATH` and `LAYOUT_COMPACT_PATH` of each table layout.
+(e.g., `pixels-tpch`) in the `paths` of each table.
 Change the bucket name if it already exists.

 During data loading, Pixels will automatically create the folders in the bucket to store the files in each table.
@@ -94,6 +90,10 @@ The last parameter `-c` of `COMPACT` command is the maximum number
 of threads used for data compaction. For large tables such as `lineitem`, you can increase `-c` to
 improve the compaction performance. Compaction is normally faster than loading with same number of threads.

+> `compact.factor` in `$PIXELS_HOME/pixels.properties` determines how many row groups are compacted into a single
+> file. The default value is 32, which is appropriate in most conditions. An experimental evaluation of the effects
+> of compact factor on AWS S3 can be found in our [ICDE'22](https://ieeexplore.ieee.org/document/9835615) paper.
+
 To avoid scanning the small files in the ordered path during query execution,
 create an empty bucket in S3 and change the ordered path in the metadata database
 to the empty bucket.
@@ -112,7 +112,7 @@ STAT -s tpch -t partsupp -o false -c true
 STAT -s tpch -t orders -o false -c true
 STAT -s tpch -t lineitem -o false -c true
 ```
-After it is finished, statistics of each tpch column can be found in the `pixels_metadata.COLS` metadata table.
+When it is finished, statistics of each tpch column can be found in the `pixels_metadata.COLS` metadata table.
 Finally, manually update the row count for each tpch table in `pixels_metadata.TBLS.TBL_ROW_COUNT`.

 Set `splits.index.type=cost_based` and restart Trino to benefit from cost-based query optimization.
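
The updated doc text says that `paths` holds one or more URIs separated by semicolons, and that each URI either has no storage scheme or a scheme consistent with `storage`. A minimal sketch of that rule follows. The class `PathsPropertySketch`, its method, and the example URIs are hypothetical and are not part of Pixels; only the bucket name `pixels-tpch` comes from the doc.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Pixels: splits a `paths` value and checks schemes.
final class PathsPropertySketch
{
    static List<String> parsePaths(String storage, String paths)
    {
        List<String> uris = new ArrayList<>();
        for (String raw : paths.split(";"))
        {
            String uri = raw.trim();
            if (uri.isEmpty())
            {
                continue; // tolerate a trailing or doubled semicolon
            }
            int schemeEnd = uri.indexOf("://");
            // A URI without "://" carries no storage scheme and is accepted as-is.
            if (schemeEnd > 0 && !uri.substring(0, schemeEnd).equalsIgnoreCase(storage))
            {
                throw new IllegalArgumentException(
                        "scheme of '" + uri + "' is inconsistent with storage '" + storage + "'");
            }
            uris.add(uri);
        }
        if (uris.isEmpty())
        {
            throw new IllegalArgumentException("paths must contain at least one URI");
        }
        return uris;
    }

    public static void main(String[] args)
    {
        // Example values are made up for illustration.
        System.out.println(parsePaths("s3", "s3://pixels-tpch/customer;pixels-tpch/orders"));
    }
}
```

How Pixels itself resolves a scheme-less URI is not shown in this diff, so the sketch only checks consistency and leaves resolution out.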

pixels-cache/src/main/java/io/pixelsdb/pixels/cache/ColumnletId.java renamed to pixels-cache/src/main/java/io/pixelsdb/pixels/cache/ColumnChunkId.java

Lines changed: 4 additions & 4 deletions
@@ -30,20 +30,20 @@
  * @author guodong
  * @author hank
  */
-public class ColumnletId
+public class ColumnChunkId
 {
     public short rowGroupId;
     public short columnId;
     public boolean direct;

-    public ColumnletId(short rowGroupId, short columnId, boolean direct)
+    public ColumnChunkId(short rowGroupId, short columnId, boolean direct)
     {
         this.rowGroupId = rowGroupId;
         this.columnId = columnId;
         this.direct = direct;
     }

-    public ColumnletId() {}
+    public ColumnChunkId() {}

     @Override
     public boolean equals(Object o)
@@ -56,7 +56,7 @@ public boolean equals(Object o)
         {
             return false;
         }
-        ColumnletId other = (ColumnletId) o;
+        ColumnChunkId other = (ColumnChunkId) o;
         return Objects.equals(rowGroupId, other.rowGroupId) &&
                 Objects.equals(columnId, other.columnId);
     }
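
The `equals` method in this hunk compares `rowGroupId` and `columnId` but not `direct`. The matching `hashCode` is not visible in the diff. A minimal sketch of an identifier with the same equality rule, and a hash over the same two fields, could look as follows. `ColumnChunkKeySketch` is a hypothetical stand-in and not the Pixels class.

```java
import java.util.Objects;

// Hypothetical stand-in for an id whose identity is (rowGroupId, columnId) only.
final class ColumnChunkKeySketch
{
    final short rowGroupId;
    final short columnId;
    final boolean direct; // deliberately excluded from equals and hashCode

    ColumnChunkKeySketch(short rowGroupId, short columnId, boolean direct)
    {
        this.rowGroupId = rowGroupId;
        this.columnId = columnId;
        this.direct = direct;
    }

    @Override
    public boolean equals(Object o)
    {
        if (this == o)
        {
            return true;
        }
        if (!(o instanceof ColumnChunkKeySketch))
        {
            return false;
        }
        ColumnChunkKeySketch other = (ColumnChunkKeySketch) o;
        // Primitive comparison avoids the boxing done by Objects.equals on shorts.
        return rowGroupId == other.rowGroupId && columnId == other.columnId;
    }

    @Override
    public int hashCode()
    {
        // Hash the same fields that equals compares, so equal ids share a bucket.
        return Objects.hash(rowGroupId, columnId);
    }
}
```

With this rule, two ids that differ only in `direct` collapse to a single entry in a `HashSet` or `HashMap`.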

pixels-cache/src/main/java/io/pixelsdb/pixels/cache/PixelsCacheReader.java

Lines changed: 1 addition & 1 deletion
@@ -188,7 +188,7 @@ public ByteBuffer get(long blockId, short rowGroupId, short columnId, boolean di
         return null;
     }

-    public void batchGet(List<ColumnletId> columnletIds, byte[][] container)
+    public void batchGet(List<ColumnChunkId> columnChunkIds, byte[][] container)
     {
         // TODO batch get cache items. merge cache accesses to reduce the number of jni invocation.
     }

pixels-cache/src/main/java/io/pixelsdb/pixels/cache/PixelsCacheWriter.java

Lines changed: 9 additions & 9 deletions
@@ -165,21 +165,21 @@ public PixelsCacheWriter build()
             MemoryMappedFile indexFile = new MemoryMappedFile(builderIndexLocation, builderIndexSize);
             PixelsRadix radix;
             // check if cache and index exists.
-            Set<String> cachedColumnlets = new HashSet<>();
+            Set<String> cachedColumnChunks = new HashSet<>();
             // if overwrite is not true, and cache and index file already exists, reconstruct radix from existing index.
             if (!builderOverwrite && PixelsCacheUtil.checkMagic(indexFile) && PixelsCacheUtil.checkMagic(cacheFile))
             {
                 // cache exists in local cache file and index, reload the index.
                 radix = PixelsCacheUtil.loadRadixIndex(indexFile);
-                // build cachedColumnlets for PixelsCacheWriter.
+                // build cachedColumnChunks for PixelsCacheWriter.
                 int cachedVersion = PixelsCacheUtil.getIndexVersion(indexFile);
                 MetadataService metadataService = new MetadataService(
                         cacheConfig.getMetaHost(), cacheConfig.getMetaPort());
                 Layout cachedLayout = metadataService.getLayout(
                         cacheConfig.getSchema(), cacheConfig.getTable(), cachedVersion);
-                Compact compact = cachedLayout.getCompactObject();
+                Compact compact = cachedLayout.getCompact();
                 int cacheBorder = compact.getCacheBorder();
-                cachedColumnlets.addAll(compact.getColumnletOrder().subList(0, cacheBorder));
+                cachedColumnChunks.addAll(compact.getColumnChunkOrder().subList(0, cacheBorder));
                 metadataService.shutdown();
             }
             // else, create a new radix tree, and initialize the index and cache file.
@@ -193,7 +193,7 @@ public PixelsCacheWriter build()
             Storage storage = StorageFactory.Instance().getStorage(cacheConfig.getStorageScheme());

             return new PixelsCacheWriter(cacheFile, indexFile, storage, radix,
-                    cachedColumnlets, etcdUtil, builderHostName);
+                    cachedColumnChunks, etcdUtil, builderHostName);
         }
     }

@@ -321,9 +321,9 @@ private int internalUpdateAll(int version, Layout layout, String[] files)
     {
         int status = 0;
         // get the new caching layout
-        Compact compact = layout.getCompactObject();
+        Compact compact = layout.getCompact();
         int cacheBorder = compact.getCacheBorder();
-        List<String> cacheColumnletOrders = compact.getColumnletOrder().subList(0, cacheBorder);
+        List<String> cacheColumnletOrders = compact.getColumnChunkOrder().subList(0, cacheBorder);
         // set rwFlag as write
         logger.debug("Set index rwFlag as write");
         try
@@ -429,9 +429,9 @@ private int internalUpdateIncremental(int version, Layout layout, String[] files
         /**
          * Get the new caching layout.
          */
-        Compact compact = layout.getCompactObject();
+        Compact compact = layout.getCompact();
         int cacheBorder = compact.getCacheBorder();
-        List<String> nextVersionCached = compact.getColumnletOrder().subList(0, cacheBorder);
+        List<String> nextVersionCached = compact.getColumnChunkOrder().subList(0, cacheBorder);
         /**
          * Prepare structures for the survived and new coming cache elements.
          */
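
All three hunks above select the cached column chunks the same way: take the first `cacheBorder` entries of the column chunk order. A minimal sketch of that prefix selection, using plain collections instead of the Pixels `Compact` object, is shown below. The class name and the `"rowGroupId:columnId"` entry format are assumptions for illustration only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: the cached set is the prefix [0, cacheBorder) of the order list.
final class CacheBorderSketch
{
    static Set<String> cachedColumnChunks(List<String> columnChunkOrder, int cacheBorder)
    {
        if (cacheBorder < 0 || cacheBorder > columnChunkOrder.size())
        {
            throw new IllegalArgumentException("cacheBorder is out of range: " + cacheBorder);
        }
        // subList is a view; copying into a HashSet detaches it from the source list.
        return new HashSet<>(columnChunkOrder.subList(0, cacheBorder));
    }

    public static void main(String[] args)
    {
        // The entry format below is assumed, not taken from the diff.
        List<String> order = Arrays.asList("0:2", "0:0", "1:2", "1:0");
        System.out.println(cachedColumnChunks(order, 2));
    }
}
```

The explicit range check is an addition of the sketch; `subList` alone would throw `IndexOutOfBoundsException` for a border larger than the list.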

pixels-cache/src/main/java/io/pixelsdb/pixels/cache/PixelsPartitionCacheWriter.java

Lines changed: 6 additions & 6 deletions
@@ -373,10 +373,10 @@ public int updateAll(int version, Layout layout)
             }
             String fileStr = keyValue.getValue().toString(StandardCharsets.UTF_8);
             String[] files = fileStr.split(";");
-            Compact compact = layout.getCompactObject();
+            Compact compact = layout.getCompact();
             int cacheBorder = compact.getCacheBorder();
-            List<String> cacheColumnletOrders = compact.getColumnletOrder().subList(0, cacheBorder);
-            return internalUpdateAll(version, cacheColumnletOrders, files);
+            List<String> cacheColumnChunkOrders = compact.getColumnChunkOrder().subList(0, cacheBorder);
+            return internalUpdateAll(version, cacheColumnChunkOrders, files);
         }
         catch (IOException | InterruptedException e)
         {
@@ -399,10 +399,10 @@ public int updateIncremental(int version, Layout layout)
             }
             String fileStr = keyValue.getValue().toString(StandardCharsets.UTF_8);
             String[] files = fileStr.split(";");
-            Compact compact = layout.getCompactObject();
+            Compact compact = layout.getCompact();
             int cacheBorder = compact.getCacheBorder();
-            List<String> cacheColumnletOrders = compact.getColumnletOrder().subList(0, cacheBorder);
-            return internalUpdateIncremental(version, cacheColumnletOrders, files);
+            List<String> cacheColumnChunkOrders = compact.getColumnChunkOrder().subList(0, cacheBorder);
+            return internalUpdateIncremental(version, cacheColumnChunkOrders, files);
         }
         catch (IOException | InterruptedException e)
         {

pixels-cli/src/main/java/io/pixelsdb/pixels/cli/Main.java

Lines changed: 4 additions & 5 deletions
@@ -124,7 +124,7 @@ public static void main(String args[])
                 .help("specify the number of consumer threads used for data generation");
         argumentParser.addArgument("-e", "--enable_encoding").setDefault(true)
                 .help("specify the option of enabling encoding or not");
-        argumentParser.addArgument("-l", "--loading_data_path")
+        argumentParser.addArgument("-l", "--loading_data_paths")
                 .help("specify the path of loading data");

         Namespace ns = null;
@@ -330,12 +330,11 @@ public static long executeSQL(String jdbcUrl, Properties jdbcProperties, String

     /**
      * Check if the order or compact path from pixels metadata is valid.
-     * @param path the order or compact path from pixels metadata.
+     * @param paths the order or compact paths from pixels metadata.
      */
-    public static void validateOrderOrCompactPath(String path)
+    public static void validateOrderOrCompactPath(String[] paths)
     {
-        requireNonNull(path, "path is null");
-        String[] paths = path.split(";");
+        requireNonNull(paths, "paths is null");
         checkArgument(paths.length > 0, "path must contain at least one valid directory");
         try
         {
pixels-cli/src/main/java/io/pixelsdb/pixels/cli/executor/CompactExecutor.java

Lines changed: 11 additions & 10 deletions
@@ -19,6 +19,7 @@
  */
 package io.pixelsdb.pixels.cli.executor;

+import com.google.common.base.Joiner;
 import io.pixelsdb.pixels.common.metadata.MetadataService;
 import io.pixelsdb.pixels.common.metadata.domain.Compact;
 import io.pixelsdb.pixels.common.metadata.domain.Layout;
@@ -76,8 +77,8 @@ public void execute(Namespace ns, String command) throws Exception

         requireNonNull(layout, String.format("writable layout is not found for table '%s.%s'.",
                 schemaName, tableName));
-        Compact compact = layout.getCompactObject();
-        int numRowGroupInBlock = compact.getNumRowGroupInBlock();
+        Compact compact = layout.getCompact();
+        int numRowGroupInBlock = compact.getNumRowGroupInFile();
         int numColumn = compact.getNumColumn();
         CompactLayout compactLayout;
         if (naive.equalsIgnoreCase("yes") || naive.equalsIgnoreCase("y"))
@@ -91,15 +92,15 @@ public void execute(Namespace ns, String command) throws Exception

         // get input file paths
         ConfigFactory configFactory = ConfigFactory.Instance();
-        validateOrderOrCompactPath(layout.getOrderPath());
-        validateOrderOrCompactPath(layout.getCompactPath());
+        validateOrderOrCompactPath(layout.getOrderedPathUris());
+        validateOrderOrCompactPath(layout.getCompactPathUris());
         // PIXELS-399: it is not a problem if the order or compact path contains multiple directories
-        Storage orderStorage = StorageFactory.Instance().getStorage(layout.getOrderPath());
-        Storage compactStorage = StorageFactory.Instance().getStorage(layout.getCompactPath());
+        Storage orderStorage = StorageFactory.Instance().getStorage(layout.getOrderedPathUris()[0]);
+        Storage compactStorage = StorageFactory.Instance().getStorage(layout.getCompactPathUris()[0]);
         long blockSize = Long.parseLong(configFactory.getProperty("block.size"));
         short replication = Short.parseShort(configFactory.getProperty("block.replication"));
-        List<Status> statuses = orderStorage.listStatus(layout.getOrderPath());
-        String[] targetPaths = layout.getCompactPath().split(";");
+        List<Status> statuses = orderStorage.listStatus(layout.getOrderedPathUris());
+        String[] targetPaths = layout.getCompactPathUris();
         int targetPathId = 0;

         // compact
@@ -186,8 +187,8 @@ public void execute(Namespace ns, String command) throws Exception
         while (!compactExecutor.awaitTermination(100, TimeUnit.SECONDS));

         long endTime = System.currentTimeMillis();
-        System.out.println("Pixels files in '" + layout.getOrderPath() + "' are compacted into '" +
-                layout.getCompactPath() + "' by " + threadNum + " threads in " +
+        System.out.println("Pixels files in '" + Joiner.on(";").join(layout.getOrderedPathUris()) + "' are compacted into '" +
+                Joiner.on(";").join(layout.getCompactPathUris()) + "' by " + threadNum + " threads in " +
                 (endTime - startTime) / 1000 + "s.");
     }
 }
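
The hunk at line 92 keeps `String[] targetPaths` next to `int targetPathId = 0`, but the loop that consumes them is outside the visible diff. One plausible reading is that compacted files are spread over the target paths in round-robin order. The sketch below illustrates that reading only; `TargetPathRoundRobinSketch` and the example paths are hypothetical, and the real loop may differ.

```java
// Hypothetical sketch of round-robin selection over several compact paths.
final class TargetPathRoundRobinSketch
{
    private final String[] targetPaths;
    private int targetPathId = 0;

    TargetPathRoundRobinSketch(String[] targetPaths)
    {
        if (targetPaths == null || targetPaths.length == 0)
        {
            throw new IllegalArgumentException("at least one target path is required");
        }
        this.targetPaths = targetPaths.clone();
    }

    // Returns the current target path and advances to the next one, wrapping at the end.
    String next()
    {
        String path = targetPaths[targetPathId];
        targetPathId = (targetPathId + 1) % targetPaths.length;
        return path;
    }

    public static void main(String[] args)
    {
        // Example URIs are made up for illustration.
        TargetPathRoundRobinSketch rr = new TargetPathRoundRobinSketch(
                new String[] {"s3://bucket-a/compact", "s3://bucket-b/compact"});
        for (int i = 0; i < 3; i++)
        {
            System.out.println(rr.next());
        }
    }
}
```

The sketch is not thread-safe; a multi-threaded compactor would need to pick the path before handing work to the pool, or guard the counter.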

pixels-cli/src/main/java/io/pixelsdb/pixels/cli/executor/LoadExecutor.java

Lines changed: 6 additions & 4 deletions
@@ -33,6 +33,7 @@
 import java.util.concurrent.LinkedBlockingQueue;

 import static io.pixelsdb.pixels.cli.Main.validateOrderOrCompactPath;
+import static java.util.Objects.requireNonNull;

 /**
  * @author hank
@@ -48,13 +49,14 @@ public void execute(Namespace ns, String command) throws Exception
         String origin = ns.getString("original_data_path");
         int rowNum = Integer.parseInt(ns.getString("row_num"));
         String regex = ns.getString("row_regex");
-        String loadingDataPath = ns.getString("loading_data_path");
+        String[] loadingDataPaths = requireNonNull(
+                ns.getString("loading_data_paths"), "paths is null").split(";");
         int threadNum = Integer.parseInt(ns.getString("consumer_thread_num"));
         boolean enableEncoding = Boolean.parseBoolean(ns.getString("enable_encoding"));
         System.out.println("enable encoding: " + enableEncoding);
-        if (loadingDataPath != null && !loadingDataPath.isEmpty())
+        if (loadingDataPaths.length > 0)
         {
-            validateOrderOrCompactPath(loadingDataPath);
+            validateOrderOrCompactPath(loadingDataPaths);
         }

         if (!origin.endsWith("/"))
@@ -64,7 +66,7 @@ public void execute(Namespace ns, String command) throws Exception

         Storage storage = StorageFactory.Instance().getStorage(origin);

-        Parameters parameters = new Parameters(schemaName, tableName, rowNum, regex, enableEncoding, loadingDataPath);
+        Parameters parameters = new Parameters(schemaName, tableName, rowNum, regex, enableEncoding, loadingDataPaths);

         // source already exist, producer option is false, add list of source to the queue
         List<String> fileList = storage.listPaths(origin);
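
Two behaviors of the new parsing are worth noting. The removed code tolerated a missing `-l` option, whereas `requireNonNull` now throws when the option is absent. Also, in Java `"".split(";")` returns a one-element array holding the empty string, so `loadingDataPaths.length > 0` stays true for an empty value. If the option is meant to remain optional, which is an assumption here and not something the commit states, a null-tolerant helper could look like the sketch below. `LoadingPathsSketch` is hypothetical and not part of Pixels.

```java
import java.util.Arrays;

// Hypothetical helper: turns an optional, semicolon-separated option value into an array.
final class LoadingPathsSketch
{
    static String[] parseLoadingPaths(String raw)
    {
        if (raw == null || raw.trim().isEmpty())
        {
            return new String[0]; // option not given: no paths to validate
        }
        return Arrays.stream(raw.split(";"))
                .map(String::trim)
                .filter(s -> !s.isEmpty()) // drop empty segments such as "a;;b"
                .toArray(String[]::new);
    }

    public static void main(String[] args)
    {
        System.out.println(parseLoadingPaths(null).length);
        System.out.println(Arrays.toString(parseLoadingPaths("a;b")));
    }
}
```

With such a helper, a `length > 0` guard before validation would distinguish between "no paths given" and "paths given".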

pixels-cli/src/main/java/io/pixelsdb/pixels/cli/executor/StatExecutor.java

Lines changed: 8 additions & 8 deletions
@@ -64,17 +64,17 @@ public void execute(Namespace ns, String command) throws Exception
             {
                 if (orderedEnabled)
                 {
-                    String orderedPath = layout.getOrderPath();
-                    validateOrderOrCompactPath(orderedPath);
-                    Storage storage = StorageFactory.Instance().getStorage(orderedPath);
-                    files.addAll(storage.listPaths(orderedPath));
+                    String[] orderedPaths = layout.getOrderedPathUris();
+                    validateOrderOrCompactPath(orderedPaths);
+                    Storage storage = StorageFactory.Instance().getStorage(orderedPaths[0]);
+                    files.addAll(storage.listPaths(orderedPaths));
                 }
                 if (compactEnabled)
                 {
-                    String compactPath = layout.getCompactPath();
-                    validateOrderOrCompactPath(compactPath);
-                    Storage storage = StorageFactory.Instance().getStorage(compactPath);
-                    files.addAll(storage.listPaths(compactPath));
+                    String[] compactPaths = layout.getCompactPathUris();
+                    validateOrderOrCompactPath(compactPaths);
+                    Storage storage = StorageFactory.Instance().getStorage(compactPaths[0]);
+                    files.addAll(storage.listPaths(compactPaths));
                 }
             }
         }
