
Storage and clean up #713

Description

@Didayolo

Implementing a storage quota and effectively managing capacity could be crucial for enhancing Codabench's scalability and ensuring long-term service availability in production.

Features

Deleting files from MinIO

@OhMaley: Add a button, on the admin analytics page, to manually start a script that removes all orphan files.
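
A minimal sketch of what such a script could compute, assuming an "orphan" is an object stored in the bucket that no database record references (this definition is an assumption, and `find_orphan_keys` and the object keys are hypothetical names used only for illustration). Listing the bucket and querying the database are deliberately left out:

```python
def find_orphan_keys(storage_keys, referenced_keys):
    """Return the storage object keys that no database record points to."""
    referenced = set(referenced_keys)
    # A set removes duplicate keys; sorting makes the result deterministic.
    return sorted(key for key in set(storage_keys) if key not in referenced)


# Hypothetical object keys, for illustration only.
in_bucket = ["dataset/a.zip", "submission/b.zip", "submission/c.zip"]
in_database = ["dataset/a.zip", "submission/c.zip"]

orphans = find_orphan_keys(in_bucket, in_database)
print(orphans)
```

A real script would probably report the orphans (count and total size) before deleting anything, so that the admin can confirm the action.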

Quota

We need to:

  • Impose a storage quota on each user (e.g. 15GB, platform setting) Quota #1254
  • Idea: a separate quota for each joined competition
  • Admins should be able to manually increase the quota of any user (on Django admin interface) Quota #1254
  • Have a user interface to manage and delete submissions
    Solved by: Resource interface : Cleanup and Quota #918
  • Currently, submissions can't be removed if they are part of a benchmark (which makes sense, to avoid spamming submissions when there is a daily limit).
  • Solution: submissions count against the participants' quota, but participants can remove a submission (even if it's part of a benchmark) as long as it's not on the leaderboard. Submission soft-delete functionality added #1738
  • Be able to delete tasks when deleting a benchmark Task deletion #810
    Partially Solved by: Resource interface : Cleanup and Quota #918
  • Be able to re-use past submissions Re-using resources to save time and storage space #632
  • A failed submission still remains as Data after being deleted (it is removed from the leaderboard/submission panel but can still be found under submissions in resources)
  • The dumps should be counted too
  • Refresh the quota figure automatically when a submission/dataset is uploaded in the resources interface
  • Readable format for the quota in the user table of the admin interface, e.g. MB
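
The quota check and the readable format from the list above could be sketched as follows. This is only an illustration under stated assumptions: the 15 GB default comes from the example above, decimal units (1 GB = 1000^3 bytes) are assumed, and `used_storage`, `can_upload` and `format_size` are hypothetical helper names, not existing Codabench functions.

```python
DEFAULT_QUOTA_BYTES = 15 * 1000**3  # 15 GB example value; decimal units assumed


def used_storage(dataset_sizes, submission_sizes, dump_sizes):
    """Total bytes counted against a user: datasets, submissions and dumps."""
    return sum(dataset_sizes) + sum(submission_sizes) + sum(dump_sizes)


def can_upload(used_bytes, upload_bytes, quota_bytes=DEFAULT_QUOTA_BYTES):
    """True if the upload fits in the quota (an admin may raise quota_bytes)."""
    return used_bytes + upload_bytes <= quota_bytes


def format_size(num_bytes):
    """Render a byte count in a readable decimal unit (B, KB, MB, GB, TB)."""
    size = float(num_bytes)
    for unit in ("B", "KB", "MB", "GB"):
        if size < 1000:
            return f"{size:.1f} {unit}"
        size /= 1000
    return f"{size:.1f} TB"


# Hypothetical sizes in bytes, for illustration only.
used = used_storage([2 * 1000**3], [500 * 1000**2], [100 * 1000**2])
print(format_size(used), "used of", format_size(DEFAULT_QUOTA_BYTES))
print("upload allowed:", can_upload(used, 13 * 1000**3))
```

Passing `quota_bytes` explicitly is one way to model the per-user override that admins would set from the Django admin interface.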

Light/archive mode after competition end

Option to clean competitions when they are completed. It would, for instance, keep only leaderboard submissions.

Automatic cleaning

  • Failed submissions, orphan datasets, useless benchmarks, etc. should be marked and cleaned automatically. Users could manually unmark an object to keep it (from the resources interface). Deleted objects should be kept in a trash can for one month. This feature needs to be discussed.
  • Clean up old failed uploads #302 (schedule)
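
The trash-can retention rule could be sketched as below. Since the feature still needs to be discussed, this is only one possible reading: "one month" is approximated as 30 days, and `split_trash` and the object identifiers are hypothetical.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=30)  # "one month" approximated as 30 days (assumption)


def split_trash(trashed_at_by_id, now):
    """Split trashed objects into those to purge and those still retained."""
    purge, keep = [], []
    for obj_id, trashed_at in sorted(trashed_at_by_id.items()):
        # An object is purged once it has spent the full retention period in the trash.
        if now - trashed_at >= RETENTION:
            purge.append(obj_id)
        else:
            keep.append(obj_id)
    return purge, keep


# Hypothetical trashed objects, for illustration only.
trash = {
    "failed-submission-1": datetime(2023, 4, 1),
    "orphan-dataset-7": datetime(2023, 5, 20),
}
to_purge, to_keep = split_trash(trash, now=datetime(2023, 6, 1))
print("purge:", to_purge, "keep:", to_keep)
```

A scheduled task could run this check periodically and permanently delete only the objects in the purge list, which leaves users the full retention period to unmark or restore an object.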

Monitoring and statistics

Per benchmark submission size limit

Migrate the CodaLab feature to Codabench. This may be less important once the quota feature is implemented.

  • Limit the size of submissions
  • Django admin interface to define the limit independently for each benchmark
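
As a rough sketch of the per-benchmark check, assuming each benchmark may define its own limit in bytes and that benchmarks without one fall back to an optional platform default (`is_submission_allowed` and the benchmark identifiers are hypothetical):

```python
def is_submission_allowed(size_bytes, benchmark_id, limits, default_limit=None):
    """Check a submission's size against the limit defined for its benchmark.

    `limits` maps a benchmark id to its maximum submission size in bytes.
    A limit of None means that no limit applies.
    """
    limit = limits.get(benchmark_id, default_limit)
    return limit is None or size_bytes <= limit


# Hypothetical per-benchmark limits in bytes, for illustration only.
limits = {"benchmark-a": 300 * 1000**2}

print(is_submission_allowed(250 * 1000**2, "benchmark-a", limits))
print(is_submission_allowed(350 * 1000**2, "benchmark-a", limits))
print(is_submission_allowed(350 * 1000**2, "benchmark-b", limits))
```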

Interface

@ihsaan-ullah

Improve resources interface

The interface for managing datasets, submissions and tasks could be improved. Indeed, improving the resources interface is also part of helping participants to manage their storage space.

As shown in the screenshot below, we can manage and remove submissions from the "resources" interface. However, we do not track which benchmark a submission was made on, so it is hard to know what we are removing.

[Screenshot, 2023-04-29 at 01:41:54]

Some issues

Labels: Enhancement (Feature suggestions and improvements), Post-it (Internal ideas)
