feat: add v1 spec by aheev · Pull Request #1 · Ladybug-Memory/icebug-format

aheev · 2026-03-25T13:05:16Z

added v1 specs for memory and disk

aheev · 2026-03-25T13:05:28Z

@adsharma could you PTAL?

adsharma · 2026-03-25T18:41:52Z

+```
+
+### Initialization script
+A script (typically containing node and relationship table creation rules) that can be executed by the query engine to create the graph in the database. This script is responsible for creating the graph structure, including node tables and relationship tables, and performing necessary validations.


> create the graph in the database

Creates metadata. Data stays in Parquet files, which could be remote (in object storage).

adsharma · 2026-03-25T18:45:11Z

+    name STRING,
+    age INT
+) WITH (
+    format = 'parquet',


format could be optional. It could be inferred from the other parameter.
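
A minimal sketch of the inference suggested here, assuming the format can be read off the file extension of the path. The helper name `infer_format` and the extension map are hypothetical and not part of the spec under review; only the `parquet` and `arrow` format names come from the examples in the diff.

```python
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical extension-to-format map; the spec under review does not define one.
_EXTENSION_FORMATS = {".parquet": "parquet", ".arrow": "arrow"}


def infer_format(file_path: str, explicit_format: Optional[str] = None) -> str:
    """Return the explicit format if given, otherwise guess it from the extension."""
    if explicit_format is not None:
        return explicit_format
    suffix = PurePosixPath(file_path).suffix.lower()
    if suffix not in _EXTENSION_FORMATS:
        # Refuse to guess rather than silently pick a default format.
        raise ValueError(f"cannot infer format from {file_path!r}")
    return _EXTENSION_FORMATS[suffix]


if __name__ == "__main__":
    print(infer_format("path/to/node_table_1.parquet"))
```

An explicit `format` still wins when present, so existing scripts that spell it out would keep working under this assumption.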

adsharma · 2026-03-25T18:46:44Z

+    age INT
+) WITH (
+    format = 'parquet',
+    file_path = 'path/to/node_table_1.parquet'


A URL could be more appropriate here, since it could be an s3://... style URL or anything else supported by VFS. Would love to see XET here as well, since some of our datasets live on Hugging Face.

This is currently called STORAGE. There is some cost to renaming it, so I'd leave it alone if it's not the most important thing.
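
As a rough illustration of why a URL is more general than a file path, the sketch below splits a storage location into a scheme and a path using only the Python standard library. The function name `describe_storage`, the returned keys, and the bucket name are assumptions made for illustration; how a real engine's VFS resolves each scheme is not specified in this thread.

```python
from urllib.parse import urlsplit


def describe_storage(location: str) -> dict:
    """Classify a storage location as local or remote based on its URL scheme."""
    parts = urlsplit(location)
    if parts.scheme in ("", "file"):
        # A bare path or file:// URL refers to the local filesystem.
        return {"kind": "local", "path": parts.path}
    # Any other scheme (for example s3) is passed through for a VFS layer to resolve.
    return {
        "kind": "remote",
        "scheme": parts.scheme,
        "host": parts.netloc,
        "path": parts.path.lstrip("/"),
    }


if __name__ == "__main__":
    print(describe_storage("path/to/node_table_1.parquet"))
    # "example-bucket" is a made-up bucket name.
    print(describe_storage("s3://example-bucket/graph/node_table_1.parquet"))
```

Plain relative paths keep working because they parse with an empty scheme. Windows drive-letter paths would need separate handling, which this sketch does not attempt.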

adsharma · 2026-03-25T18:52:41Z

+    name STRING,
+    age INT
+) WITH (
+    format = 'arrow',


Also suggest merging the two into a single URL. Previous comment about STORAGE applies here as well.

aheev · 2026-03-26T02:36:24Z

@adsharma can we get rid of the init script and metadata file from the spec? These are completely impl-specific, so we are not really specifying anything here.

aheev · 2026-03-26T02:45:04Z

Re: comments on examples

@adsharma those are just examples to show how the tables / metadata might look, not really an impl of the spec. I will create a baseline impl, based on the ladybugDB impl, in a future PR.

adsharma · 2026-03-26T16:19:26Z

> can we rid of init script

This whole spec is modeled after DuckLake. Specification: https://ducklake.select/docs/stable/specification/introduction

DuckLake : SQL :: Icebug-Disk : Cypher

If we're going to spend some time writing specs, it makes sense to specify what data types are valid in schema.cypher.

adsharma · 2026-03-26T16:20:47Z

> I will create a baseline impl, which would be based on ladybugDB impl, in a future PR

It's not clear that another impl needs to live in this repo. If, let's say, Grafeo or ArcadeDB wants to implement this spec, the impl could live in their respective repos.

aheev · 2026-03-26T16:56:38Z

> > I will create a baseline impl, which would be based on ladybugDB impl, in a future PR
>
> It's not clear that another impl needs to live in this repo. If lets say Grafeo or ArcadeDB want to implement this spec, the impl could live in their respective repos.

Initially, I had the same idea, but I was concerned about the helper scripts like icebug-format.py. If you’re planning to remove them, I'm okay with having impls in their own repos

aheev · 2026-03-26T17:02:32Z

> > can we rid of init script
>
> This whole spec is modeled after DuckLake. Specification: https://ducklake.select/docs/stable/specification/introduction
>
> DuckLake : SQL :: Icebug-Disk : Cypher
>
> If we're going to spend some time writing specs, it makes sense to specify what data types are valid in schema.cypher.

Isn't icebug-format about graph storage rather than a full-fledged graph database 🤔? How queries are executed is part of the query engine, right?

adsharma · 2026-03-26T22:47:15Z

> isn't icebug-format about graph storage rather than full-fledged graph database

curl -LO https://datasets.ldbcouncil.org/snb-interactive-v1-ducklake/sf1.ducklake
duckdb
memory D ATTACH 'ducklake:sf1.ducklake' (AUTOMATIC_MIGRATION TRUE);
memory D .schema

Now sf1.ducklake contains only the schema. This is fundamental. Without this, there is no difference between icebug and graphar other than minor syntax.

I imagine a similar sf1.icebug after the spec is fully implemented. It'd be a valid lbdb database that you can open with lbug cli. Contains only the schema and no data.

aheev · 2026-03-27T05:57:08Z

> > isn't icebug-format about graph storage rather than full-fledged graph database
>
> curl -LO https://datasets.ldbcouncil.org/snb-interactive-v1-ducklake/sf1.ducklake
> duckdb
> memory D ATTACH 'ducklake:sf1.ducklake' (AUTOMATIC_MIGRATION TRUE);
> memory D .schema
>
> Now sf1.ducklake contains only the schema. This is fundamental. Without this, there is no difference between icebug and graphar other than minor syntax.
>
> I imagine a similar sf1.icebug after the spec is fully implemented. It'd be a valid lbdb database that you can open with lbug cli. Contains only the schema and no data.

Got it now. Changes are in the latest commit.

feat: add v1 spec

3e38223

adsharma reviewed Mar 25, 2026

remove lbug impl specific details

0864f94

aheev mentioned this pull request May 9, 2026

feat: Implement icebug-disk format LadybugDB/ladybug#429

Closed

2 participants