Partition summary for embeddings #12

@vahuja4

Interesting approach to drift detection! Could you tell me whether the partition summary for embeddings is the same as the one described at https://dm4ml.github.io/gate/how-it-works/, or whether other factors are taken into account? The statistics listed there are:
coverage: The fraction of the column that has non-null values.
mean: The mean of the column.
p50: The median of the column.
num_unique_values: The number of unique values in the column.
occurrence_ratio: The count of the most frequent value divided by the total count.
p95: The 95th percentile of the column.
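
To make the question concrete, here is a minimal sketch of how I read those six statistics for a single numeric column. It is not taken from the gate source: `partition_summary` and `percentile` are hypothetical names, the linear-interpolation percentile is my own choice, and I am assuming that "total count" in `occurrence_ratio` means all rows, nulls included.

```python
import math
from collections import Counter
from statistics import mean


def percentile(sorted_values, q):
    """Percentile by linear interpolation; q is a fraction in [0, 1]."""
    if not sorted_values:
        return math.nan
    pos = q * (len(sorted_values) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def partition_summary(column):
    """Summarize one column (a list of numbers, with None for nulls)."""
    total = len(column)
    values = sorted(v for v in column if v is not None)
    if total == 0 or not values:
        # Nothing to summarize: no coverage, every other statistic undefined.
        return {"coverage": 0.0, "mean": math.nan, "p50": math.nan,
                "num_unique_values": 0, "occurrence_ratio": math.nan,
                "p95": math.nan}
    most_common_count = Counter(values).most_common(1)[0][1]
    return {
        "coverage": len(values) / total,               # fraction of non-null rows
        "mean": mean(values),
        "p50": percentile(values, 0.50),               # median
        "num_unique_values": len(set(values)),
        "occurrence_ratio": most_common_count / total, # assumption: nulls count in the total
        "p95": percentile(values, 0.95),
    }


summary = partition_summary([1.0, 1.0, 2.0, None, 4.0])
print(summary)
```

For embeddings, a function like this could be called once per dimension, but whether gate does that or something else is exactly what I am asking.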
