SumEstimation

SumEstimation is a framework for estimating the sum of scores across large-scale embedding datasets using a variety of sampling strategies. It is designed to support both approximate and hybrid estimation techniques across multiple similarity functions (KDE, Softmax) and dataset types (image or text embeddings).

Motivation

Computing exact sums over large embedding datasets (millions of vectors) is computationally expensive. This repo explores sampling-based estimators that can reliably approximate the total sum with fewer computations, enabling scalable deployment in ranking, evaluation, and retrieval workflows.
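For reference, the exact quantity being approximated can be sketched as below. This README names the KDE and Softmax similarity functions without defining them, so the forms used here (a Gaussian kernel with a hypothetical `bandwidth` parameter, and the exponential of the inner product) are assumptions, as are all function names. Plain Python lists keep the sketch self-contained; it is not the repository's implementation.

```python
import math

def softmax_score(query, vector):
    # Assumed form: unnormalized softmax score, exp(<query, vector>).
    return math.exp(sum(q * v for q, v in zip(query, vector)))

def kde_score(query, vector, bandwidth=1.0):
    # Assumed form: Gaussian kernel on squared Euclidean distance.
    # `bandwidth` is a hypothetical parameter, not taken from the repo.
    sq_dist = sum((q - v) ** 2 for q, v in zip(query, vector))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def exact_sum(query, dataset, score_fn):
    # Exact baseline: one score evaluation per dataset vector,
    # so the cost per query grows linearly with the dataset size.
    return sum(score_fn(query, v) for v in dataset)
```

Every query touches every vector in this baseline, which is the cost the sampling-based estimators aim to avoid.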

Supported Methods

TopK: Uses nearest neighbors by similarity.
Random: Uniform random sampling of dataset points.
OurAlgorithm: An adaptive sampler that selects a budgeted number of items per query.
Combined: Combines TopK and Random sampling (hybrid).

Each method supports per-query evaluation and produces score estimates under a variety of hyperparameter and estimator configurations.
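To make the TopK, Random, and Combined ideas concrete, here is a minimal sketch of textbook versions of each estimator over a list of non-negative scores. The function names and parameters (`k`, `m`, `rng`) are hypothetical and the repository's own implementations may differ. OurAlgorithm is not sketched because its sampling rule is not described here. For simplicity the sketch sorts precomputed scores to find the top k; in practice those would come from a nearest-neighbor index so that the full score list never has to be computed.

```python
import random

def random_estimate(scores, m, rng):
    # Uniform sample of m items without replacement, scaled up by n / m.
    n = len(scores)
    sample = rng.sample(range(n), m)
    return n / m * sum(scores[i] for i in sample)

def topk_estimate(scores, k):
    # Sum of the k largest scores; for non-negative scores this
    # never exceeds the true sum.
    return sum(sorted(scores, reverse=True)[:k])

def combined_estimate(scores, k, m, rng):
    # Hybrid: exact sum over the top k, plus a scaled uniform sample
    # of size m drawn from the remaining items.
    order = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
    head, tail = order[:k], order[k:]
    head_sum = sum(scores[i] for i in head)
    if not tail or m == 0:
        return head_sum
    m = min(m, len(tail))
    sample = rng.sample(tail, m)
    return head_sum + len(tail) / m * sum(scores[i] for i in sample)
```

A possible call, assuming `scores` holds one score per dataset item for a single query, is `combined_estimate(scores, k=10, m=100, rng=random.Random(0))`. Handling the largest scores exactly and sampling only the remainder tends to reduce variance when a few items dominate the sum.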

Installation

Clone the repository and install the required dependencies:

git clone https://github.com/your-org/sum-estimation-public.git
cd sum-estimation-public
pip install -r requirements.txt

Citation

If you use this repository in your work, please cite the paper or contact us via the repository.

Contact

For questions, feedback, or contributions:

Steve Mussmann mussmann@gatech.edu

Mehul Smriti Raje mehul@coactive.ai, mehul.raje@gmail.com

Folders and files

1. create_embeddings
2. create qdrant cluster
3. run experiments
4. plotting
README.md
requirements.txt
