An opinionated Graph Database for serving millions of GitOIDs (Git Object Identifiers).
Big Tent stores software artifacts as Items connected by typed Edges, enabling efficient queries about software composition, provenance, and dependencies.
# Build from source
cargo build --release
# Or use Docker
docker pull spicelabs/bigtent:latest
# Run server with a cluster
./target/release/bigtent --rodeo /path/to/cluster/ --port 3000
# Batch lookup identifiers
./target/release/bigtent --rodeo /path/to/cluster/ --lookup identifiers.json
# Query an item
curl http://localhost:3000/item/gitoid:blob:sha256:abc123...
# Get OpenAPI documentation
curl http://localhost:3000/openapi.json

For detailed setup instructions, see GETTING_STARTED.md.
| Document | Description |
|---|---|
| GETTING_STARTED.md | Installation, building, and first steps |
| ARCHITECTURE.md | System design, components, and data flow |
| BENCHMARKING.md | Merge benchmarking suite and performance tracking |
| PERFORMANCE.md | Performance tuning and optimization |
| info/config.md | Configuration reference |
| info/files_and_formats.md | File format specifications |
Big Tent is both a Rust crate (usable to read Artifact Dependency Graphs, or ADGs, from files) and a server with a REST API.
# Serve a cluster directory
bigtent --rodeo /path/to/cluster/
# With custom host and port
bigtent --rodeo /path/to/cluster/ --host 0.0.0.0 --port 8080
# Pre-cache index for faster queries (uses more memory)
bigtent --rodeo /path/to/cluster/ --cache-index true

use bigtent::rodeo::goat::GoatRodeoCluster;
use bigtent::rodeo::goat_trait::GoatRodeoTrait;
let clusters = GoatRodeoCluster::cluster_files_in_dir(path, false, vec![]).await?;
if let Some(item) = clusters[0].item_for_identifier("gitoid:blob:sha256:...") {
println!("Found: {:?}", item);
}

Look up identifiers from a JSON file without starting a server:
# Output to stdout
bigtent --rodeo /path/to/cluster/ --lookup identifiers.json
# Output to a file
bigtent --rodeo /path/to/cluster/ --lookup identifiers.json --output results.json

The input file must be a JSON array of identifier strings:

["gitoid:blob:sha256:abc123...", "pkg:npm/lodash@4.17.21"]

Output is a JSON object mapping each identifier to its Item or null:
{
"gitoid:blob:sha256:abc123...": { "identifier": "...", "connections": [...], ... },
"pkg:npm/lodash@4.17.21": null
}

All endpoints are available at both / and /omnibor/ prefixes.

- GET /openapi.json - Full OpenAPI 3.1 specification (auto-generated from code)
- GET /item/{gitoid} or GET /item?identifier=... - Get a single Item
- POST /bulk - Get multiple Items (POST array of GitOID strings)
- GET /aa/{gitoid} or GET /aa?identifier=... - Resolve alias to canonical Item
- POST /aa - Bulk alias resolution
- GET /north/{gitoid} - Find containers/builders (traverse upward via build:up, alias:to, contained:up)
- POST /north - Bulk north traversal
- GET /north_purls/{gitoid} - Same as north, but return only Package URLs
- GET /flatten/{gitoid} - Find contained items (traverse downward)
- GET /flatten_source/{gitoid} - Flatten with source information
- GET /node_count - Total items in the cluster
- GET /purls - Download all Package URLs as text file

In path parameters ({gitoid}), identifiers are not URL encoded. This allows
copy/pasting Package URLs directly without escaping.
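
As a rough illustration of the rules above, the following sketch builds request URLs by inserting the identifier verbatim into the path, and serializes a list of identifiers as the JSON array that `POST /bulk` takes. The helper names `endpoint_url` and `bulk_body` and the base address are hypothetical and not part of the Big Tent crate. The sketch uses only the standard library and does not send any request.

```rust
/// Joins a base URL, an optional `/omnibor` prefix, an endpoint name, and an
/// identifier. The identifier is inserted verbatim, without URL encoding.
/// (Hypothetical helper, not part of the bigtent crate.)
fn endpoint_url(base: &str, endpoint: &str, identifier: &str, omnibor: bool) -> String {
    let base = base.trim_end_matches('/');
    let prefix = if omnibor { "/omnibor" } else { "" };
    format!("{base}{prefix}/{endpoint}/{identifier}")
}

/// Serializes identifiers as a JSON array of strings, the body shape that
/// `POST /bulk` takes. Only quotes, backslashes, and control characters are
/// escaped, which is enough for GitOIDs and Package URLs.
/// (Hypothetical helper, not part of the bigtent crate.)
fn bulk_body(identifiers: &[&str]) -> String {
    let mut out = String::from("[");
    for (i, id) in identifiers.iter().enumerate() {
        if i > 0 {
            out.push(',');
        }
        out.push('"');
        for c in id.chars() {
            match c {
                '"' => out.push_str("\\\""),
                '\\' => out.push_str("\\\\"),
                c if c.is_control() => out.push_str(&format!("\\u{:04x}", c as u32)),
                c => out.push(c),
            }
        }
        out.push('"');
    }
    out.push(']');
    out
}

fn main() {
    // Assumed server address; match it to your --host and --port.
    let base = "http://localhost:3000";
    println!("{}", endpoint_url(base, "item", "pkg:npm/lodash@4.17.21", false));
    println!("{}", endpoint_url(base, "north", "pkg:npm/lodash@4.17.21", true));
    println!("{}", bulk_body(&["gitoid:blob:sha256:abc123...", "pkg:npm/lodash@4.17.21"]));
}
```

The query form (`GET /item?identifier=...`) is left out on purpose: whether the query value needs percent-encoding is not stated here, so check the OpenAPI specification before relying on it.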
Combine multiple clusters into one:
bigtent --fresh-merge /path/to/cluster1/ /path/to/cluster2/ --dest /output/

See PERFORMANCE.md for tuning merge operations.
To create an OmniBOR Corpus (cluster files), use Goat Rodeo.
BigTent Items are identified by GitOIDs or Package URLs (PURLs):
gitoid:blob:sha256:fee53a18d32820613c0527aa79be5cb30173c823a9b448fa4817767cc84c6f03
pkg:npm/lodash@4.17.21
pkg:maven/org.apache.logging.log4j/log4j-core@2.17.0
A GitOID has the form gitoid:<object_type>:<hash_algorithm>:<hex_digest>.
See GETTING_STARTED.md for details.
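
To make the GitOID form described above concrete, here is a minimal sketch that splits a GitOID string into its fields. The `GitOid` struct and `parse_gitoid` function are hypothetical names for illustration and are not the Big Tent crate's API. The validation is deliberately loose: it checks the `gitoid` prefix, non-empty fields, and a hex-only digest, but not the digest length or the set of allowed object types and algorithms.

```rust
/// The fields of a GitOID string that follow the `gitoid` prefix.
/// (Hypothetical type for illustration, not part of the bigtent crate.)
#[derive(Debug, PartialEq)]
struct GitOid<'a> {
    object_type: &'a str,
    hash_algorithm: &'a str,
    hex_digest: &'a str,
}

/// Splits `gitoid:<object_type>:<hash_algorithm>:<hex_digest>` into its parts.
/// Returns `None` if the prefix is wrong, a field is missing or empty, or the
/// digest contains non-hex characters. Digest length is not checked.
fn parse_gitoid(s: &str) -> Option<GitOid<'_>> {
    let mut parts = s.splitn(4, ':');
    if parts.next()? != "gitoid" {
        return None;
    }
    let object_type = parts.next()?;
    let hash_algorithm = parts.next()?;
    let hex_digest = parts.next()?;
    if object_type.is_empty() || hash_algorithm.is_empty() || hex_digest.is_empty() {
        return None;
    }
    if !hex_digest.chars().all(|c| c.is_ascii_hexdigit()) {
        return None;
    }
    Some(GitOid {
        object_type,
        hash_algorithm,
        hex_digest,
    })
}

fn main() {
    let id = "gitoid:blob:sha256:fee53a18d32820613c0527aa79be5cb30173c823a9b448fa4817767cc84c6f03";
    match parse_gitoid(id) {
        Some(gitoid) => println!("{:?}", gitoid),
        None => println!("not a GitOID"),
    }
    // A Package URL is a valid Big Tent identifier, but it is not a GitOID.
    assert!(parse_gitoid("pkg:npm/lodash@4.17.21").is_none());
}
```

A check like this can help a client decide whether an identifier is a GitOID or a Package URL before sending it to the server.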
Multi-architecture images (amd64, arm64) are published to Docker Hub:
docker pull spicelabs/bigtent:latest
docker run -p 3000:3000 -v /path/to/clusters:/data \
spicelabs/bigtent --rodeo /data --host 0.0.0.0 --port 3000

See GETTING_STARTED.md for full Docker usage including merges and lookups.
BigTent does not include built-in authentication or authorization. All API endpoints are publicly accessible. For production, place BigTent behind a reverse proxy or use network-level access controls.
BigTent outputs structured JSON logs to stderr. Control verbosity with the
RUST_LOG environment variable:
RUST_LOG=info bigtent --rodeo /path/to/cluster/

Every HTTP request is logged with URI, status code, and response time.
There is no built-in metrics endpoint; parse the JSON logs or use an
external observability tool. The GET /node_count endpoint serves as a
basic health check.
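
One way to use `GET /node_count` as a health probe is sketched below with only the standard library. It assumes that a 200 response means the server is healthy and that the server listens on `127.0.0.1:3000`. Both are assumptions for illustration, as are the function names `status_code` and `is_healthy`.

```rust
use std::io::{Read, Write};
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Extracts the numeric status code from the first line of an HTTP response,
/// for example "HTTP/1.1 200 OK" yields Some(200).
fn status_code(response: &str) -> Option<u16> {
    let status_line = response.lines().next()?;
    let mut fields = status_line.split_whitespace();
    if !fields.next()?.starts_with("HTTP/") {
        return None;
    }
    fields.next()?.parse().ok()
}

/// Sends `GET /node_count` and reports whether the server answered 200.
/// Treating 200 as "healthy" is an assumption of this sketch.
fn is_healthy(addr: SocketAddr, timeout: Duration) -> std::io::Result<bool> {
    let mut stream = TcpStream::connect_timeout(&addr, timeout)?;
    stream.set_read_timeout(Some(timeout))?;
    let request =
        format!("GET /node_count HTTP/1.1\r\nHost: {addr}\r\nConnection: close\r\n\r\n");
    stream.write_all(request.as_bytes())?;
    // With `Connection: close` the server should end the stream after replying.
    let mut response = String::new();
    stream.read_to_string(&mut response)?;
    Ok(status_code(&response) == Some(200))
}

fn main() {
    // Hypothetical address; match it to the --host and --port you passed.
    let addr: SocketAddr = "127.0.0.1:3000".parse().expect("valid socket address");
    match is_healthy(addr, Duration::from_secs(2)) {
        Ok(true) => println!("healthy"),
        Ok(false) => println!("unhealthy: non-200 response"),
        Err(e) => println!("unreachable: {e}"),
    }
}
```

In a real deployment an HTTP client library or the orchestrator's built-in HTTP probe would usually replace the raw socket code; the sketch only shows the shape of the check.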
| Variable | Description | Example |
|---|---|---|
| RUST_LOG | Log level filter | RUST_LOG=info, RUST_LOG=bigtent=debug |
| RUST_BACKTRACE | Enable stack traces | RUST_BACKTRACE=1 |
No config.toml or other configuration files are needed. All configuration
is via CLI arguments. See info/config.md for the complete
reference.
Send SIGHUP to reload cluster files without restarting:
kill -HUP <bigtent_pid>

License: Apache 2.0