feat: add v1 spec #1

Open
aheev wants to merge 2 commits into Ladybug-Memory:main from aheev:v1

aheev commented Mar 25, 2026

added v1 specs for memory and disk

Author

aheev commented Mar 25, 2026

@adsharma could you PTAL?

Comment thread format/icebug-disk/spec.md Outdated
### Initialization script
A script (typically containing node and relationship table creation rules) that can be executed by the query engine to create the graph in the database. This script is responsible for creating the graph structure, including node tables and relationship tables, and performing necessary validations.
Collaborator

> create the graph in the database

This only creates metadata. The data stays in Parquet files, which could be remote (in object storage).

Comment thread format/icebug-disk/spec.md Outdated
name STRING,
age INT
) WITH (
format = 'parquet',
Collaborator

`format` could be optional, since it could be inferred from the other parameter (`file_path`).
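
As an illustration of the suggestion above, here is a minimal sketch of treating `format` as optional and falling back to the file extension. The function name `infer_format` and the extension mapping are hypothetical; neither the spec nor any existing implementation defines them.

```python
from pathlib import PurePosixPath

# Hypothetical mapping for illustration only; the spec does not define it.
EXTENSION_TO_FORMAT = {".parquet": "parquet", ".arrow": "arrow"}


def infer_format(file_path, explicit_format=None):
    """Return the explicit format if given, else guess it from the extension."""
    if explicit_format is not None:
        return explicit_format
    suffix = PurePosixPath(file_path).suffix.lower()
    if suffix not in EXTENSION_TO_FORMAT:
        # Refuse to guess: the caller has to pass the format explicitly.
        raise ValueError(f"cannot infer format from {file_path!r}")
    return EXTENSION_TO_FORMAT[suffix]
```

With this sketch, `infer_format('path/to/node_table_1.parquet')` returns `'parquet'`, and an explicit `format` always wins over the extension.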

Comment thread format/icebug-disk/spec.md Outdated
age INT
) WITH (
format = 'parquet',
file_path = 'path/to/node_table_1.parquet'
Collaborator

A URL would be more appropriate here, since it could be an s3://... style URL or anything else supported by the VFS. I would love to see XET here as well, since some of our datasets live on Hugging Face.

This is currently called STORAGE. There is some cost to renaming it, so I'd leave it alone if it's not the most important thing.
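
To make the URL suggestion concrete, here is a minimal sketch of how a reader might classify a storage location by its URL scheme. The function name `storage_backend` is hypothetical, and treating a bare path as a local file is an assumption for illustration, not something the spec states.

```python
from urllib.parse import urlsplit


def storage_backend(location):
    """Classify a storage location by URL scheme; bare paths count as local files."""
    # urlsplit returns an empty scheme for plain relative or absolute POSIX paths.
    scheme = urlsplit(location).scheme.lower()
    return scheme if scheme else "file"
```

For example, `storage_backend('s3://bucket/nodes.parquet')` returns `'s3'`, while `storage_backend('path/to/node_table_1.parquet')` returns `'file'`. A Windows path such as `C:\data\nodes.parquet` would be misread as scheme `c`, so a real implementation would need to handle that case separately.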

Comment thread format/icebug-disk/spec.md Outdated
Comment thread format/icebug-memory/spec.md Outdated
name STRING,
age INT
) WITH (
format = 'arrow',
Collaborator

I'd also suggest merging the two into a single URL. The previous comment about STORAGE applies here as well.

Comment thread format/icebug-memory/spec.md Outdated
Author

aheev commented Mar 26, 2026

@adsharma can we get rid of init script and metadata file from the spec? These are completely impl-specific, so we are not really specifying anything here.

Author

aheev commented Mar 26, 2026

Re: comments on examples

@adsharma those are just examples to show how the tables / metadata might look, not really an impl of the spec. I will create a baseline impl, which would be based on ladybugDB impl, in a future PR

@adsharma
Collaborator

> can we get rid of init script

This whole spec is modeled after DuckLake. Specification: https://ducklake.select/docs/stable/specification/introduction

DuckLake : SQL :: Icebug-Disk : Cypher

If we're going to spend some time writing specs, it makes sense to specify what data types are valid in schema.cypher.

@adsharma
Collaborator

> I will create a baseline impl, which would be based on ladybugDB impl, in a future PR

It's not clear that another impl needs to live in this repo. If, let's say, Grafeo or ArcadeDB want to implement this spec, the impl could live in their respective repos.

Author

aheev commented Mar 26, 2026

> > I will create a baseline impl, which would be based on ladybugDB impl, in a future PR
>
> It's not clear that another impl needs to live in this repo. If, let's say, Grafeo or ArcadeDB want to implement this spec, the impl could live in their respective repos.

Initially, I had the same idea, but I was concerned about the helper scripts like icebug-format.py. If you’re planning to remove them, I'm okay with having impls in their own repos

Author

aheev commented Mar 26, 2026

> > can we get rid of init script
>
> This whole spec is modeled after DuckLake. Specification: https://ducklake.select/docs/stable/specification/introduction
>
> DuckLake : SQL :: Icebug-Disk : Cypher
>
> If we're going to spend some time writing specs, it makes sense to specify what data types are valid in schema.cypher.

isn't icebug-format about graph storage rather than a full-fledged graph database? 🤔 How queries are executed is part of the query engine, right?

Collaborator

adsharma commented Mar 26, 2026

> isn't icebug-format about graph storage rather than a full-fledged graph database

curl -LO https://datasets.ldbcouncil.org/snb-interactive-v1-ducklake/sf1.ducklake
duckdb
memory D ATTACH 'ducklake:sf1.ducklake' (AUTOMATIC_MIGRATION TRUE);
memory D .schema

Now sf1.ducklake contains only the schema. This is fundamental. Without this, there is no difference between icebug and GraphAr other than minor syntax.

I imagine a similar sf1.icebug after the spec is fully implemented. It'd be a valid lbdb database that you can open with the lbug CLI, containing only the schema and no data.

Author

aheev commented Mar 27, 2026

> > isn't icebug-format about graph storage rather than a full-fledged graph database
>
> curl -LO https://datasets.ldbcouncil.org/snb-interactive-v1-ducklake/sf1.ducklake
> duckdb
> memory D ATTACH 'ducklake:sf1.ducklake' (AUTOMATIC_MIGRATION TRUE);
> memory D .schema
>
> Now sf1.ducklake contains only the schema. This is fundamental. Without this, there is no difference between icebug and GraphAr other than minor syntax.
>
> I imagine a similar sf1.icebug after the spec is fully implemented. It'd be a valid lbdb database that you can open with the lbug CLI, containing only the schema and no data.

Got it now. The changes are in the latest commit.
