Commit 214b059

Init
1 parent 626e351 commit 214b059

22 files changed: 2260 additions and 1 deletion

README.md

Lines changed: 30 additions & 1 deletion
@@ -1 +1,30 @@
-# MiniLongBench
+# *MiniLongBench*
+
+Welcome to the "*MiniLongBench* tutorials"!
+
+## Getting Started
+
+
+Ensure that you have Python 3.11.7 installed on your machine. You can download it from the [official Python website](https://www.python.org/downloads/).
+
+After ensuring you have the correct Python version, install all the required packages listed in `requirements.txt` using `pip`. Run the following command:
+
+```shell
+pip install -r requirements.txt
+```
+
+
+## Tutorials Overview
+
+This repository contains the following Jupyter notebooks, which serve as tutorials:
+
+1. **Predict the scores of LLMs on the full LongBench benchmark (`eval_by_pred.ipynb`):**
+- This notebook shows how to obtain MiniLongBench scores by predicting the scores of LLMs on the full LongBench benchmark.
+
+2. **Directly calculate the scores of LLMs on MiniLongBench (`eval_directly.ipynb`):**
+- This notebook shows how to obtain MiniLongBench scores directly.
+
+
+
+
+
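
The Getting Started steps in the README ask for Python 3.11.7 and a `pip install` from `requirements.txt`. A quick pre-flight check along the following lines may help before opening the notebooks. This is only a sketch: the helper names `version_ok` and `requirements_present` are hypothetical, and comparing just the major and minor version (3.11) rather than the exact patch release is an assumption, not something the README states.

```python
import sys
from pathlib import Path

# The README names Python 3.11.7; matching only major.minor is an assumption.
REQUIRED = (3, 11)


def version_ok(current=sys.version_info, required=REQUIRED):
    """Return True when the interpreter's (major, minor) equals the required pair."""
    return tuple(current[:2]) == tuple(required)


def requirements_present(root="."):
    """Return True when a requirements.txt file exists directly under root."""
    return (Path(root) / "requirements.txt").is_file()


if __name__ == "__main__":
    if not version_ok():
        found = ".".join(str(part) for part in sys.version_info[:3])
        print(f"Expected Python {REQUIRED[0]}.{REQUIRED[1]}.x, found {found}")
    if not requirements_present():
        print("requirements.txt not found; run this from the repository root")
```

Running the script from the repository root prints nothing when both checks pass.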

data/anchor.pkl

729 Bytes
Binary file not shown.

data/irt_dataset.jsonlines

Lines changed: 20 additions & 0 deletions
Large diffs are not rendered by default.

data/irt_model/best_parameters.json

Lines changed: 1 addition & 0 deletions
Large diffs are not rendered by default.

data/irt_model/parameters.json

Lines changed: 1 addition & 0 deletions
Large diffs are not rendered by default.

data/longbench.pickle

1.45 MB
Binary file not shown.

data/longbench_embeddings.pkl

37.1 MB
Binary file not shown.

data/longbench_embeddings_pca.pkl

371 KB
Binary file not shown.

data/new_anchor.pkl

1.09 KB
Binary file not shown.

data/new_irt_model/best_parameters.json

Lines changed: 1 addition & 0 deletions
Large diffs are not rendered by default.
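
The file names in this commit (`data/irt_model/...`, `data/anchor.pkl`) suggest that the prediction route in `eval_by_pred.ipynb` relies on item response theory (IRT) fitted on a small anchor set, although the commit page does not show the method itself. Purely as an illustration of that general idea, the sketch below uses a two-parameter logistic IRT model with made-up item parameters: it estimates one model's ability from its results on a few anchor items, then averages the predicted success probabilities over all items to approximate a full-benchmark score. Every name and number is hypothetical, responses are simplified to 0/1 even though LongBench metrics are typically continuous, and nothing here reads the repository's files or reproduces the notebooks.

```python
import math


def p_correct(theta, a, b):
    """2PL IRT: success probability for ability theta on an item
    with discrimination a and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def estimate_ability(responses, items, grid=None):
    """Grid-search maximum-likelihood ability from 0/1 responses on anchor items."""
    grid = grid or [i / 10 for i in range(-40, 41)]  # theta in [-4.0, 4.0]

    def log_likelihood(theta):
        total = 0.0
        for y, (a, b) in zip(responses, items):
            p = p_correct(theta, a, b)
            total += math.log(p) if y else math.log(1.0 - p)
        return total

    return max(grid, key=log_likelihood)


def predict_mean_score(theta, items):
    """Average predicted success probability over a set of items."""
    return sum(p_correct(theta, a, b) for a, b in items) / len(items)


# Hypothetical (discrimination, difficulty) pairs for a tiny "full" benchmark.
all_items = [(1.0, -1.0), (1.2, -0.5), (0.8, 0.0), (1.5, 0.5), (1.0, 1.0), (0.9, 1.5)]
anchor_idx = [0, 2, 4]  # hypothetical anchor subset
anchor_items = [all_items[i] for i in anchor_idx]
anchor_responses = [1, 1, 0]  # made-up results of one model on the anchors

theta_hat = estimate_ability(anchor_responses, anchor_items)
predicted = predict_mean_score(theta_hat, all_items)
print(f"estimated ability: {theta_hat:.1f}, predicted full-set score: {predicted:.3f}")
```

The direct route (`eval_directly.ipynb`) would, by contrast, skip the ability estimate and simply report the scores measured on the reduced benchmark.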
