Releases · tabularis-ai/be_great

What's Changed (v0.0.12)

Added conditions parameter for constrained sampling, enabling logical constraints during generation (e.g. conditions={"age": ">= 30", "sex": "== 'Female'"}). Constraints are enforced at the token level using a trie-based LogitsProcessor, guaranteeing that every generated row satisfies the specified conditions. This closes GitHub issue #62.
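The idea behind trie-based constraint enforcement can be sketched in a few lines. This is a toy illustration, not the library's code: it assumes single-character tokens and a small integer column, whereas the actual LogitsProcessor works on tokenizer ids. All function names below are hypothetical.

```python
import math

END = "<end>"  # marker meaning "the value may stop here"

def build_trie(sequences):
    """Build a nested-dict trie from the allowed token sequences."""
    root = {}
    for seq in sequences:
        node = root
        for tok in seq:
            node = node.setdefault(tok, {})
        node[END] = {}
    return root

def allowed_next(trie, prefix):
    """Return the tokens that may follow `prefix` (empty set if the prefix is invalid)."""
    node = trie
    for tok in prefix:
        if tok not in node:
            return set()
        node = node[tok]
    return set(node.keys())

def mask_logits(logits, allowed):
    """Set the score of every disallowed token to -inf so it can never be picked."""
    return {tok: (score if tok in allowed else -math.inf)
            for tok, score in logits.items()}

# Toy column "age" with values 0..99 and the condition age >= 30.
allowed_values = [str(v) for v in range(100) if v >= 30]
trie = build_trie(allowed_values)

first_tokens = allowed_next(trie, "")     # digits that can start a valid value
after_three = allowed_next(trie, "3")     # digits that can follow "3"

# Pretend the model prefers "2" as the first digit; masking removes that option.
raw_logits = {"2": 1.5, "3": 0.2, "9": -0.3}
masked = mask_logits(raw_logits, first_tokens)
best_first_token = max(masked, key=masked.get)
```

Because disallowed tokens are masked at every step, any completed value is one of the sequences stored in the trie, which is why no rejection or post-filtering step is needed.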

Added a comprehensive metrics suite (be_great.metrics) for evaluating synthetic data quality across four dimensions:

Statistical: ColumnShapes, ColumnPairTrends, BasicStatistics
Privacy: DistanceToClosestRecord, kAnonymization, lDiversity, IdentifiabilityScore, DeltaPresence, MembershipInference
Utility: MLEfficiency (train-on-synthetic, test-on-real)
Discriminator: DiscriminatorMetric
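To give a sense of what a privacy metric such as DistanceToClosestRecord measures, here is a minimal standalone sketch. It is not the be_great.metrics implementation: the function name, the plain Euclidean distance, and the absence of feature scaling are all simplifying assumptions.

```python
import math

def distance_to_closest_record(real_rows, synthetic_rows):
    """For each synthetic row, return the Euclidean distance to its nearest real row."""
    distances = []
    for s in synthetic_rows:
        nearest = min(math.dist(s, r) for r in real_rows)
        distances.append(nearest)
    return distances

# Toy numeric data: (age, income). Real metrics would normally scale features first.
real = [(25.0, 50_000.0), (40.0, 82_000.0), (31.0, 61_000.0)]
synthetic = [(25.0, 50_000.0), (35.0, 61_003.0)]

dcr = distance_to_closest_record(real, synthetic)
```

A distance of zero means the synthetic row is an exact copy of a real record, which is the case this kind of metric is meant to flag.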

Revamped LoRA fine-tuning support with a new lora_config parameter for full control over LoRA hyperparameters, automatic detection of target modules across model architectures, and proper save/load of LoRA adapter weights. peft is now an optional dependency installable via pip install be_great[lora].
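One plausible way to detect LoRA target modules across architectures is to match the leaf names of a model's modules against known projection-layer names. The sketch below illustrates that heuristic only; the function name and the list of known names are assumptions, and the library's actual detection logic may differ. In practice the module names would come from PyTorch's `model.named_modules()`.

```python
# Projection-layer names used by some common architectures (not exhaustive).
KNOWN_TARGETS = ("q_proj", "k_proj", "v_proj", "o_proj", "c_attn", "c_proj")

def detect_target_modules(module_names):
    """Return the sorted known projection names that occur as a leaf in `module_names`."""
    leaves = {name.rsplit(".", 1)[-1] for name in module_names}
    return sorted(leaves & set(KNOWN_TARGETS))

# Module paths in the style of GPT-2 and LLaMA-like models.
gpt2_like = [
    "transformer.h.0.attn.c_attn",
    "transformer.h.0.attn.c_proj",
    "transformer.h.0.ln_1",
]
llama_like = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.self_attn.v_proj",
    "model.layers.0.mlp.gate_proj",
]

gpt2_targets = detect_target_modules(gpt2_like)
llama_targets = detect_target_modules(llama_like)
```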

Added a random_conditional_col parameter to fit() (enabled by default). A different random column is selected for preconditioning in each training epoch, so that no single column is overfitted and the synthetic data is more balanced.
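The per-epoch selection can be pictured with a short sketch. This is an illustration of the idea only, with a hypothetical helper name and a seed argument added for reproducibility; it does not reproduce how fit() implements it.

```python
import random

def pick_conditional_columns(columns, epochs, seed=None):
    """Choose one conditioning column per epoch, uniformly at random."""
    rng = random.Random(seed)
    return [rng.choice(columns) for _ in range(epochs)]

columns = ["age", "sex", "income", "education"]
schedule = pick_conditional_columns(columns, epochs=10, seed=0)

# Each epoch starts its training prompts from the column chosen for that epoch.
for epoch, col in enumerate(schedule):
    print(f"epoch {epoch}: precondition on '{col}'")
```

Over many epochs every column gets a turn as the starting point, so the model is not trained to condition on one fixed column only.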

minor: Added scipy as a required dependency. Updated default model in Colab example to tabularisai/Qwen3-0.3B-distil. Added new examples for constrained sampling and random preconditioning. Improved device management with centralized _resolve_device(). Fixed typo in _partial_df_to_prompts. Cleaned up old dist artifacts from the repository.

What's Changed

Added guided_sampling, a new sampling mode that generates each row feature by feature for more reliable data generation. This addresses several sampling issues reported in GitHub issue #45.
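Feature-by-feature guidance can be sketched as a loop that prompts for one named column at a time. The code below is a conceptual illustration with a canned stand-in for the language model; the helper names are hypothetical and the "column is value" prompt format is assumed from GReaT's textual encoding of rows.

```python
def guided_sample_row(columns, generate_value):
    """Build one row by requesting the value of each column in turn."""
    row = {}
    prompt = ""
    for col in columns:
        prompt += f"{col} is "
        value = generate_value(prompt)   # stand-in for a language-model call
        row[col] = value
        prompt += f"{value}, "
    return row, prompt.rstrip(", ")

def fake_model(prompt):
    """Canned responses keyed on the end of the prompt (for demonstration only)."""
    canned = {"age is ": "42", "sex is ": "Female"}
    for suffix, value in canned.items():
        if prompt.endswith(suffix):
            return value
    raise ValueError(f"no canned value for prompt: {prompt!r}")

row, text = guided_sample_row(["age", "sex"], fake_model)
```

Since the loop asks for every column explicitly, a row cannot come back with a feature missing, which is one failure mode of free-form generation.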

Added a float_precision parameter to the GReaT class, which controls the decimal precision of floating-point values. Limiting the precision reduces token usage and can improve generation quality for numerical data.
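The effect of limiting precision on the text the model sees can be shown with a small standalone sketch. The encoder below is a hypothetical stand-in, not the library's code, and it uses character count as a rough proxy for token count.

```python
def encode_row(row, float_precision=None):
    """Encode a row as 'column is value' text, optionally rounding floats."""
    parts = []
    for col, value in row.items():
        if isinstance(value, float) and float_precision is not None:
            value = round(value, float_precision)
        parts.append(f"{col} is {value}")
    return ", ".join(parts)

row = {"age": 39, "bmi": 27.123456789, "income": 50321.5}

full = encode_row(row)                      # floats written at full precision
short = encode_row(row, float_precision=2)  # floats rounded to two decimals

print(full)
print(short)
```

Fewer digits per value means shorter training and sampling sequences, and fewer digit tokens the model has to get right.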

Improved error handling with clearer, more actionable feedback when sampling fails.

minor: Set report_to=[] as default to disable Weights & Biases logging unless explicitly enabled.

be_great v0.0.13

be_great v0.0.9