Re-update the LLM implementation by yilunzhao · Pull Request #48 · Yale-LILY/NLP4Code

yilunzhao · 2023-04-25T20:50:57Z

#46, re-update the implementation for llama, alpaca, santacoder

niansong1996 · 2023-04-25T20:57:53Z

Seems like there is an error from CI. I've seen this before; check here to see if it's useful.

yilunzhao · 2023-04-26T01:57:38Z

Hi @niansong1996, sorry for the late reply. I have resolved the CI error. It seems that I have to change the transformers version in requirements.txt to avoid the error.

niansong1996 · 2023-04-26T02:17:43Z

That is okay. What we can do is use this branch to evaluate the new models before we decide whether to upgrade the transformers version in the main branch.

niansong1996 · 2023-04-27T06:02:02Z

@yilunzhao I am getting the following error when testing LLaMA:
RuntimeError: NCCL error in: ../torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:46, unhandled cuda error, NCCL version 2.10.3

The command I ran is:
python finetuning/trainer.py validate --config finetuning/training_configs/few_shot/spider.yaml --model.beam_size 1 --data.val_max_instances 1 --data.val_batch_size 1 --model.print_generation_results true --model.print_eval_every_n_batches 1 --model.init_args.transformer_model_name decapoda-research/llama-7b-hf --data.init_args.transformer_model_name decapoda-research/llama-7b-hf --trainer.devices 2

Now if I use one GPU, I will get this error:
RuntimeError: CUDA error: no kernel image is available for execution on the device

Can you see if you can replicate those errors and figure out why they are happening?
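One way to narrow down a "no kernel image is available" error is to compare the GPU's compute capability with the architectures the installed torch build was compiled for. The following is a minimal diagnostic sketch, not part of this PR: the helper names `parse_arch` and `build_supports` are hypothetical, and the compatibility rule is a simplification (a compiled `sm_XY` kernel is assumed to run on GPUs with the same major version and an equal or higher minor version; `compute_XY` PTX is assumed usable on any equal or newer GPU).

```python
import re
from typing import Iterable, Optional, Tuple

Capability = Tuple[int, int]


def parse_arch(tag: str) -> Optional[Tuple[str, Capability]]:
    """Parse a tag such as 'sm_86' or 'compute_80' into (kind, (major, minor))."""
    match = re.fullmatch(r"(sm|compute)_(\d+)(\d)", tag)
    if match is None:
        return None  # unknown or suffixed tags are ignored in this sketch
    kind, major, minor = match.groups()
    return kind, (int(major), int(minor))


def build_supports(capability: Capability, arch_list: Iterable[str]) -> bool:
    """Return True if any compiled architecture can serve the given GPU capability."""
    for tag in arch_list:
        parsed = parse_arch(tag)
        if parsed is None:
            continue
        kind, (major, minor) = parsed
        # Compiled kernels: same major version, minor not newer than the device.
        if kind == "sm" and major == capability[0] and minor <= capability[1]:
            return True
        # PTX: can be JIT-compiled for any equal or newer device.
        if kind == "compute" and (major, minor) <= capability:
            return True
    return False


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None
    if torch is not None and torch.cuda.is_available():
        print("torch", torch.__version__, "built for CUDA", torch.version.cuda)
        archs = torch.cuda.get_arch_list()
        for index in range(torch.cuda.device_count()):
            cap = torch.cuda.get_device_capability(index)
            name = torch.cuda.get_device_name(index)
            print(index, name, cap, "supported:", build_supports(cap, archs))
```

If the check reports an unsupported device, the torch wheel likely needs to be replaced with one built for a matching CUDA version. For the multi-GPU NCCL failure, rerunning with the environment variable `NCCL_DEBUG=INFO` set usually gives a more specific message than "unhandled cuda error".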

yilunzhao · 2023-04-27T21:04:57Z

Hi @niansong1996, I think this error is raised because the installed torch build is incompatible with the CUDA version on ziva. Could you try reinstalling torch with pip install torch==1.12.1+cu116 torchvision==0.13.1+cu116 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu116 and see whether that resolves the issue?

And this is my pip freeze:

absl-py==1.4.0
aiohttp==3.8.4
aiosignal==1.3.1
appdirs==1.4.4
astunparse==1.6.3
async-timeout==4.0.2
attrs==23.1.0
cachetools==5.3.0
certifi==2022.12.7
charset-normalizer==3.1.0
click==8.1.3
deepspeed==0.6.7
docker-pycreds==0.4.0
docopt==0.6.2
docstring-parser==0.15
filelock==3.12.0
frozenlist==1.3.3
fsspec==2023.4.0
func-timeout==4.3.5
gitdb==4.0.10
GitPython==3.1.31
google-auth==2.17.3
google-auth-oauthlib==1.0.0
grpcio==1.54.0
hjson==3.1.0
huggingface-hub==0.14.1
idna==3.4
importlib-metadata==6.6.0
joblib==1.2.0
jsonargparse==4.15.0
Markdown==3.4.3
MarkupSafe==2.1.2
multidict==6.0.4
ninja==1.11.1
nltk==3.8.1
numpy==1.24.3
oauthlib==3.2.2
openai==0.27.5
overrides==7.3.1
packaging==23.1
pandas==2.0.1
pathtools==0.1.2
Pillow==9.5.0
pipreqs==0.4.13
protobuf==4.22.3
psutil==5.9.5
py-cpuinfo==9.0.0
pyasn1==0.5.0
pyasn1-modules==0.3.0
pydantic==1.10.7
pyDeprecate==0.3.2
python-dateutil==2.8.2
pytorch-lightning==1.7.4
pytz==2023.3
PyYAML==6.0
regex==2023.3.23
requests==2.29.0
requests-oauthlib==1.3.1
rsa==4.9
scipy==1.10.1
sentencepiece==0.1.98
sentry-sdk==1.21.0
setproctitle==1.3.2
six==1.16.0
smmap==5.0.0
sqlparse==0.4.4
tensorboard==2.12.2
tensorboard-data-server==0.7.0
tensorboard-plugin-wit==1.8.1
tokenizers==0.13.3
torch==1.12.1+cu116
torchaudio==0.12.1+cu116
torchmetrics==0.9.3
torchvision==0.13.1+cu116
tqdm==4.65.0
transformers @ git+https://github.com/huggingface/transformers@11fd2c773b11c3fcfe0fa25aa4b92db03c83636c
tree-sitter==0.19.0
typing_extensions==4.5.0
tzdata==2023.3
urllib3==1.26.15
wandb==0.15.0
Werkzeug==2.3.1
yarg==0.1.9
yarl==1.9.2
zipp==3.15.0
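
When comparing two environments like this, a quick sanity check is whether torch, torchvision, and torchaudio all carry the same CUDA build tag (here `cu116`). The sketch below parses `pip freeze` output for that tag; the helper names `cuda_tags` and `tags_consistent` are hypothetical and only illustrate the idea.

```python
import re
from typing import Dict, Iterable, Optional

# Matches pins such as "torch==1.12.1+cu116"; the "+cu116" part is optional.
PIN = re.compile(
    r"^(?P<name>[A-Za-z0-9_.\-]+)==(?P<version>[^+\s]+)(?:\+(?P<local>\S+))?$"
)


def cuda_tags(
    freeze_lines: Iterable[str],
    packages: Iterable[str] = ("torch", "torchvision", "torchaudio"),
) -> Dict[str, Optional[str]]:
    """Map each package of interest to its local version tag, or None if absent."""
    wanted = {name.lower() for name in packages}
    tags: Dict[str, Optional[str]] = {}
    for line in freeze_lines:
        match = PIN.match(line.strip())
        if match and match.group("name").lower() in wanted:
            tags[match.group("name").lower()] = match.group("local")
    return tags


def tags_consistent(tags: Dict[str, Optional[str]]) -> bool:
    """True when at least one package was found and all share one non-empty tag."""
    values = set(tags.values())
    return len(values) == 1 and None not in values


if __name__ == "__main__":
    sample = [
        "torch==1.12.1+cu116",
        "torchaudio==0.12.1+cu116",
        "torchmetrics==0.9.3",
        "torchvision==0.13.1+cu116",
    ]
    found = cuda_tags(sample)
    print(found, "consistent:", tags_consistent(found))
```

Lines that are not simple `name==version` pins, such as the `transformers @ git+https://...` entry, are skipped by the pattern.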

re-upload the implementation for llama, alpaca, santacoder

529edce

yilunzhao mentioned this pull request Apr 25, 2023

Implementing more CodeLMs #41

Merged

Merge branch 'main' into yilunzhao/llm_implementation

7caf0ea

yilunzhao force-pushed the yilunzhao/llm_implementation branch from 3f2247e to 56ba84f Compare April 25, 2023 22:46

try to pass CI check

9dbc8cd

yilunzhao force-pushed the yilunzhao/llm_implementation branch 2 times, most recently from d537aba to 9dbc8cd Compare April 26, 2023 01:49

Merge branch 'main' into yilunzhao/llm_implementation

06c7858

modify process_output function in exectutor.py to handle the case tha…

ce36f13

…t llama-based model uses empty string as tokenizer_eos_token
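
To illustrate why an empty EOS token needs special handling when post-processing generations: in Python, `"abc".find("")` returns 0 and `"abc".split("")` raises `ValueError`, so naive truncation at the EOS token either discards the whole output or crashes. The following is a minimal sketch of a guard for that case; `truncate_at_eos` is a hypothetical helper and is not the repository's actual `process_output` function.

```python
def truncate_at_eos(text: str, eos_token: str) -> str:
    """Cut generated text at the first EOS token.

    If the tokenizer reports an empty EOS token, the text is returned
    unchanged, because searching for "" would match at position 0.
    """
    if not eos_token:
        return text
    index = text.find(eos_token)
    return text if index == -1 else text[:index]


if __name__ == "__main__":
    print(truncate_at_eos("SELECT 1</s> extra", "</s>"))
    print(truncate_at_eos("SELECT 1", ""))
```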

yilunzhao added 3 commits May 5, 2023 16:03

fix error related to llama eos_token

60fd89b

add starcoder, gpt-neox-20b; test gpt-j-6b

83546d9

archieve code for NeurIPS exps; add gpt-4, pythia, replit, dolly, sta…

86d3bd5

…rchat, etc

niansong1996 merged commit cd3a9fb into main Jul 10, 2023

niansong1996 merged 8 commits into main from yilunzhao/llm_implementation