LibExcessive

High-performance, parallel file I/O with RAII semantics
High-performance on-disk data structures
A bunch of utility functions and classes

Try it in 5 minutes

Create a CMakeLists.txt:

cmake_minimum_required(VERSION 3.14)
project(BTreeDemo)

# Adds excessive to the project
include(FetchContent)

FetchContent_Declare(
        excessive
        GIT_REPOSITORY https://gitea.com/komozoi/excessive.git
        GIT_TAG v0.2.1
        GIT_SHALLOW TRUE
        GIT_PROGRESS ON
        SYSTEM
)

FetchContent_MakeAvailable(excessive)

# Create the demo executable and link excessive into it
add_executable(demo demo.cpp)
target_link_libraries(demo PRIVATE excessive)

And create a file demo.cpp:

#include <fs/FdHandle.h>
#include <fs/BTree.h>
#include <fcntl.h>
#include <random>
#include <cstdio>


struct btree_entry_s {
    int key;
    int value;

    static int compare(const btree_entry_s &a, const btree_entry_s &b) {
        return a.key - b.key;
    }
};

int main() {
    FdHandle file = FdHandle::open("btree.bin", O_RDWR | O_CREAT, 0644);
	if (!file) {
		printf("Failed to open file!\n");
		return 1;
	}

    // It is easy to check if the file already existed or was just created
    printf("File is %s.\n", file.isNew() ? "newly created" : "existing");

    BTree<btree_entry_s> tree(file, 0, btree_entry_s::compare);

    // Add 100 random elements to the tree
    std::random_device rd;
    for (int i = 0; i < 100; ++i) {
        btree_entry_s new_entry{(int) rd() % 5000, (int) rd()};
        tree.insert(new_entry);
    }

    // See what the next highest values are for given inputs
    // Each run this would change as the BTree grows
    for (int i = 0; i < 5000; i += 500) {
        btree_entry_s result{i, 0};
        if (tree.findNext(result)) {
            printf("Next highest entry from %i is (%i, %i)\n", i, result.key, result.value);
        } else {
            printf("No next highest entry found for %i\n", i);
        }
    }

    // All data is already written to the file (although not necessarily flushed)
    // For this reason, no cleanup is needed here for the BTree.

    // File automatically closes when all references go out of scope,
    // but can be closed manually with:
    // write_handle.close();

	return 0;
}

If you are using an IDE, this may be enough to import the project and run it - very convenient!

If not, run these commands to compile and run:

# Setup
mkdir build && cd build && cmake ..

# Compile
make

# Run the demo to create the data file
./demo

# Run the demo again to add more to the data file and see the effects
./demo

And that's it - efficient and persistent data storage in less than 60 lines of code. No extra installation steps or complex APIs. It just works.

Overview

LibExcessive is intended for large, data-heavy backend applications such as servers and data processing tools where speed and reliability matter. The design goals and features are:

Provide familiar, Java-like APIs and richer helper types, especially containers
- With less verbosity than Java APIs, thankfully.
Provide rich, threadsafe, and extremely efficient utilities for interacting with files
- Threadsafe file handles and transactions
- Mmap handles
- Open file reference counting
- Utilities for keeping data on-disk
  - BTree
  - Files with dynamically allocated regions
(planned) Utilities for building on-disk indexes and databases
Favor explicit memory and performance control. Many components are designed to be friendly to custom allocators and memory pools.
- Tracking memory separately for different components of an application, which helps to find memory hogs
- Safer allocation and memory management with less heap fragmentation
Small, focused algorithms and helpers for string handling, byte buffers, serialization, and more
Support modern C++ compilers from C++17 and up.

Much of the code was originally written to run on the Teensy 4.1, which is extremely memory constrained compared to our desktop computers, having a mere 1MiB of RAM. For this reason, there is a lot of code for tightly controlling memory usage and performance.

Everything is tested on Debian Linux currently, although any Unix flavor should work. I do not plan on supporting Windows.

Main repository is on Gitea at https://gitea.com/komozoi/excessive, but is mirrored to GitHub at https://github.com/komozoi/libexcessive.

Building

I develop this with CLion, which imports the CMake project for me, but if you prefer to build on the terminal, it's not hard:

git clone https://gitea.com/komozoi/excessive.git
cd libexcessive
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make

Usage Examples

Tests are included for almost everything, which you can use as a larger reference if needed.

File Access with Mmap

#include "fs/FdHandle.h"


int main() {
    const char* temp_filename = "mmap_test.tmp";

    // All file handles are smart pointers!
    FdHandle write_handle = FdHandle::open(temp_filename, O_RDWR | O_CREAT, 0660);
    MmapHandle write_mmap = write_handle.getMmapHandle(0, sizeof(my_struct_t));

    // Writing structs with mmap is easy and safe
    my_struct_t value{1, true, {}};
    write_mmap.write(value);

    // File automatically closes when all references go out of scope,
    // but can be closed manually with:
    // write_handle.close();
}

Bigint

Unlike in other libraries, the bigint implementation uses a fixed width. The bigint code was originally designed for use in an EVM implementation, where most values are 256-bit. This is designed to work like a typical register or even normal fixed-width datatype, just bigger, and acts like you would expect with truncation and such.

There are plenty of great libraries that implement variable-width bigint; there is no reason to add that to this library. Originally I wasn't going to include my own bigint implementation as I figured existing libraries were sufficient, but I changed my mind when I saw that they were all variable-length, which does not work for my typical applications.

Examples:

#include "bigint.h"

int main() {
    uint256_t a = uint256_t("0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF");
    uint256_t b("0x1234567890ABCDEF1234567890ABCDEF");

    uint256_t sum  = a + b;
    uint256_t prod = a * b;
    uint256_t pow  = b.pow(4);

    printf("a     = %s\n", a.toHexString().c_str());
    printf("b     = %s\n", b.toHexString().c_str());
    printf("a + b = %s\n", sum.toHexString().c_str());
    printf("a * b = %s\n", prod.toHexString().c_str());
    printf("b**4  = %s\n", pow.toHexString().c_str());

    // Output:
    // a     = ffffffffffffffffffffffffffffffff
    // b     = 1234567890abcdef1234567890abcdef
    // a + b = 0000000000000000000000001234567890abcdef1234567890abcdefffffffff
    // a * b = 1234567890abcdef1234567890abcdeeffffffffffffffedcba98765432111
    // b**4  = 33ee0e405772f4bd1fa6d7a4e8c14117ea371272c23e2b10

    // Default sizes include uint128_t, uint192_t, and uint256_t.  Custom sizes are also possible.
    uint192_t threeWordValue = "0x6935282358963433459348abcdef1ee7";
    UnsignedFixedWidthBigInt<7> sevenWordValue = "0x8b20159b1c579b1088048f054bedebfd02de6b23919371be36d872ec46fe9cebe684edd2675ab1101262b78877b3c09966366c07df0fcccf";

    // It is also possible to multiply directly with doubles
    // The result of the multiplication is floored and almost exact
    uint192_t productWithDouble = threeWordValue * 0.7311;
    // productWithDouble would be something like 0x4cead4b7be06ccb79c0e92711d09028c

    uint256_t x3("0x783924abc37678847777fcba");
    x3 = x3.root(3);  // 0xc6fc718e
}

Smart Pointer with Copy on Write Behavior

#include <alloc/pointer.h>

struct Data {
    int value;
};

int main() {
    // UNIQUE (default-style ownership)
    sp<Data> x(SpPointerType::UNIQUE, Data{10});

    // Copying a UNIQUE pointer does NOT immediately copy the data
    sp<Data> y = x;
    // x stays UNIQUE
    // y becomes COPY_ON_WRITE
    // both point to the same underlying object (for now)

    // First write triggers a deep copy
    y.mut().value = 20;
    // now:
    // x->value == 10
    // y->value == 20
    // they no longer share memory


    // You can keep copying before mutation is needed
    sp<Data> z = x;
    // still sharing with x

    z.mut().value = 30;
    // z detaches and becomes independent
    // x is still unchanged


    // SHARED mode = always shared, no copy-on-write
    sp<Data> sharedA(SpPointerType::SHARED, Data{100});
    sp<Data> sharedB = sharedA;

    sharedB.mut().value = 200;
    // both see the change:
    // sharedA->value == 200
    // sharedB->value == 200


    // Move = transfer ownership, no copies
    sp<Data> moved = std::move(sharedA);
    // sharedA is now null
    // moved owns the data


    // Scoped lifetime (RAII)
    {
        sp<Data> temp(SpPointerType::UNIQUE, Data{5});
        sp<Data> alias = temp;
        // alias is COPY_ON_WRITE

        // object is destroyed exactly once when both go out of scope
    }

    return 0;
}

Simple Containers

Simple examples showing the style of usage that matches the library design:

#include "ds/ArrayList.h"


int main() {
    ArrayList<int> list;
    list.add(1);
    list.add(2);
    list.addCopies(5, 3); // add three copies of 5

    for (int element: list)
        printf("value %d\n", element);

    return 0;
}

#include "ds/ArrayList.h"


int main() {
    // Initialize list as {2, 3, 4}
    ArrayList<int> list{2,3,4};

    // Add 1 to the beginning
	list.addFirst(1);

    // 1, 2, 3, 4
    for (int element: list)
        printf("value %d\n", element);
}

Name		Name	Last commit message	Last commit date
Latest commit History 90 Commits
.gitea/workflows		.gitea/workflows
.github		.github
include		include
src		src
.gitignore		.gitignore
CMakeLists.txt		CMakeLists.txt
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
RELEASE_NOTES.md		RELEASE_NOTES.md
SECURITY.md		SECURITY.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

LibExcessive

Try it in 5 minutes

Overview

Building

Usage Examples

File Access with Mmap

Bigint

Smart Pointer with Copy on Write Behavior

Simple Containers

About

Uh oh!

Releases 3

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

LibExcessive

Try it in 5 minutes

Overview

Building

Usage Examples

File Access with Mmap

Bigint

Smart Pointer with Copy on Write Behavior

Simple Containers

About

Topics

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases 3

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages