
Commit cf5dee2

readme for llama2 and azure
1 parent d844995 commit cf5dee2

3 files changed

Lines changed: 189 additions & 30 deletions


README.md

Lines changed: 39 additions & 30 deletions
@@ -23,11 +23,19 @@ and much more!
https://github.com/silvanmelchior/IncognitoPilot/assets/6033305/05b0a874-6f76-4d22-afca-36c11f90b1ff

The video shows Incognito Pilot with GPT-4.
While your conversation and approved code results are sent to OpenAI, your **data is kept locally** on your machine.
And you can go even further and use Llama 2 to have everything running on your machine.

## :package: Installation (GPT via OpenAI API)

This section shows how to install **Incognito Pilot** using a GPT model via OpenAI's API.

- For **Llama 2**, check [Installation for Llama 2](/docs/INSTALLATION_LLAMA.md) instead.
- For **GPT on Azure**, check [Installation with Azure](/docs/INSTALLATION_AZURE.md) instead.
- If you don't have docker, you can install **Incognito Pilot** on your system directly, using the development setup (see below).

Follow these steps:

1. Install [docker](https://www.docker.com/).
2. Create an empty folder somewhere on your system.
@@ -41,21 +49,18 @@ And you can go even further and use Llama 2 to be fully locally.
```shell
docker run -i -t \
    -p 3030:80 \
    -e OPENAI_API_KEY="sk-your-api-key" \
    -v /home/user/ipilot:/mnt/data \
    silvanmelchior/incognito-pilot:latest-slim
```

You can now visit http://localhost:3030 and should see the **Incognito Pilot** interface.
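If you prefer a quick check from the terminal first (optional, and assuming `curl` is installed on your host), you can probe the port before opening the browser:

```shell
# a 2xx/3xx response indicates the UI is being served on port 3030
curl -I http://localhost:3030
```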

It's also possible to run **Incognito Pilot** with the free trial credits of OpenAI, without adding a credit card.
At the moment, however, this does not include GPT-4, so see below for how to change the model to GPT-3.5.

## :rocket: Getting started (GPT)

In the **Incognito Pilot** interface, you will see a chat interface, with which you can interact with the model.
Let's try it out!
@@ -78,8 +83,6 @@ To change this, head back to the console and press Ctrl-C to stop the container.
Now re-run the command, but remove the `-slim` suffix from the image.
This will download a much larger version, equipped with [many packages](/docker/requirements_full.txt).

### Change model

To use a model other than the default one (GPT-4), set the environment variable `LLM`.
@@ -89,32 +92,40 @@ OpenAI's GPT models have the prefix `gpt:`, so to use GPT-3.5 for example (the o
```shell
-e LLM="gpt:gpt-3.5-turbo"
```

Please note that GPT-4 is considerably better in the interpreter setup than GPT-3.5.
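For reference, a complete run command with the model switched to GPT-3.5 could look like the following sketch, which simply combines the installation command from above with the `LLM` variable:

```shell
docker run -i -t \
    -p 3030:80 \
    -e LLM="gpt:gpt-3.5-turbo" \
    -e OPENAI_API_KEY="sk-your-api-key" \
    -v /home/user/ipilot:/mnt/data \
    silvanmelchior/incognito-pilot:latest-slim
```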
## :gear: Settings

### Change port

To serve the UI at a different port than 3030, just expose the internal port 80 to a different one, for example 8080:

```shell
docker run -i -t \
    -p 8080:80 \
    ... \
    silvanmelchior/incognito-pilot
```

### Timeout

By default, the Python interpreter stops after 30 seconds.
To change this, set the environment variable `INTERPRETER_TIMEOUT`.
To allow 2 minutes, for example, add the following to the docker run command:

```shell
-e INTERPRETER_TIMEOUT="120"
```

### Autostart

To automatically start **Incognito Pilot** with docker at system startup, remove the `-i -t` from the run command and add the following:

```shell
--restart always
```

Together with a bookmark of the UI URL, you'll have **Incognito Pilot** at your fingertips whenever you need it.
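Putting the pieces together, an always-on setup could look like the following sketch; note that the `-d` flag is an assumption here (standard docker, but not part of the instructions above) to detach the container so it keeps running in the background:

```shell
# sketch: auto-restarting, detached container, flags combined from the sections above
docker run -d \
    --restart always \
    -p 3030:80 \
    -e OPENAI_API_KEY="sk-your-api-key" \
    -v /home/user/ipilot:/mnt/data \
    silvanmelchior/incognito-pilot:latest-slim
```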

## :toolbox: Own dependencies

@@ -149,9 +160,7 @@ Then run the container like this:
```shell
docker run -i -t \
    ... \
    incognito-pilot-custom
```

docs/INSTALLATION_AZURE.md

Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
# :package: Installation (GPT via Azure)

This section shows how to install **Incognito Pilot** using a GPT model via Azure.
Follow these steps:

1. Install [docker](https://www.docker.com/).
2. Create an empty folder somewhere on your system.
   This will be the working directory to which **Incognito Pilot** has access.
   The code interpreter can read your files in this folder and store any results.
   In the following, we assume it to be */home/user/ipilot*.
3. Log in to the Azure portal and create an [Azure OpenAI Service](https://azure.microsoft.com/en-us/products/ai-services/openai-service-b).
4. You will see the access key and endpoint, which we will use later.
5. Open Azure OpenAI Studio and deploy a model.
6. Now, just run the following command (replace your working directory, model name and API information):
```shell
docker run -i -t \
    -p 3030:80 \
    -e LLM="gpt-azure:your-deployment-name" \
    -e AZURE_API_KEY="your-azure-openai-api-key" \
    -e AZURE_API_BASE="https://your-azure-openai-service-name.openai.azure.com/" \
    -v /home/user/ipilot:/mnt/data \
    silvanmelchior/incognito-pilot:latest-slim
```

You can now visit http://localhost:3030 and should see the **Incognito Pilot** interface.

Make sure you have access to a model which is capable of function calling; otherwise you will get an error similar to "unknown argument 'function'".

Let's head back to the [Getting Started](/README.md#rocket-getting-started-gpt) section.

docs/INSTALLATION_LLAMA.md

Lines changed: 120 additions & 0 deletions
@@ -0,0 +1,120 @@
# :package: Installation (Llama 2)

This section shows how to install **Incognito Pilot** using Llama 2.
Please note that you will only get satisfactory results with the largest model *llama-2-70b-chat*, which needs considerable hardware resources.
And even then, the experience will not be comparable to GPT-4, since Llama 2 was not fine-tuned for this task.

Nevertheless, it's a lot of fun to see what's already possible with open-source models.
At the moment, there are two ways of using **Incognito Pilot** with Llama 2:

- Using a cloud API from [replicate](https://replicate.com/).
  While you don't have the advantage of a fully local setup here, you can try out the 70B model in a quick way without owning powerful hardware.
- Using Hugging Face's [Text Generation Inference](https://github.com/huggingface/text-generation-inference) container,
  which allows you to run Llama 2 locally with a simple `docker run` command.

## Replicate

Follow these steps:
1. Install [docker](https://www.docker.com/).
2. Create an empty folder somewhere on your system.
   This will be the working directory to which **Incognito Pilot** has access.
   The code interpreter can read your files in this folder and store any results.
   In the following, we assume it to be */home/user/ipilot*.
3. Create a [Replicate](https://replicate.com/) account,
   add a [credit card](https://replicate.com/account/billing)
   and copy your [API key](https://replicate.com/account/api-tokens).
4. Now, just run the following command (replace your working directory and API key):

```shell
docker run -i -t \
    -p 3030:80 \
    -e LLM="llama-replicate:replicate/llama-2-70b-chat:2c1608e18606fad2812020dc541930f2d0495ce32eee50074220b87300bc16e1" \
    -e REPLICATE_API_KEY="your-replicate-api-key" \
    -v /home/user/ipilot:/mnt/data \
    silvanmelchior/incognito-pilot:latest-slim
```
You can of course also choose a [different model](https://replicate.com/blog/all-the-llamas), but the smaller ones are much less suited for this task.

Now visit http://localhost:3030 and you should see the **Incognito Pilot** interface.
Does it work? Great, let's move to the [Getting started](#rocket-getting-started-llama-2) section.

## Text Generation Inference

Follow these steps:
1. Install [docker](https://www.docker.com/).
2. Create an empty folder somewhere on your system.
   This will be the working directory to which **Incognito Pilot** has access.
   The code interpreter can read your files in this folder and store any results.
   In the following, we assume it to be */home/user/ipilot*.
3. Create a [Hugging Face](https://huggingface.co/) account.
4. Make sure you get access to the [Llama 2 model weights](https://huggingface.co/meta-llama/Llama-2-70b-chat-hf) on Hugging Face.
5. In the *Files and versions* tab, download the following three files (we assume them to be in */home/user/tokenizer*):
   - tokenizer.json
   - tokenizer.model
   - tokenizer_config.json
6. Create an [access token](https://huggingface.co/settings/tokens).

Now, let's first run the *Text Generation Inference* service.
Check out their [Readme](https://github.com/huggingface/text-generation-inference#readme).
I had to run something similar to this:
64+
```shell
65+
docker run \
66+
--gpus all \
67+
--shm-size 1g \
68+
-p 8080:80 \
69+
-v /home/user/tgi_cache:/data
70+
-e HUGGING_FACE_HUB_TOKEN=hf_your-huggingface-api-token
71+
ghcr.io/huggingface/text-generation-inference \
72+
--model-id "meta-llama/Llama-2-70b-chat-hf"
73+
```
74+
You can of course also choose a different model, but the smaller ones are much less suited for this task.
Once the container shows a success message, you are ready for the next step.

Visit http://localhost:8080/info.
You should see a JSON with model information.
We will need the value for *max_total_tokens* in the next command.
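If you prefer the terminal, the same JSON can be fetched there as well (assuming `curl` and `python3` are available on your host; `json.tool` just pretty-prints the response):

```shell
curl -s http://localhost:8080/info | python3 -m json.tool
```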
Now, just run the following command (replace your directories and max tokens):

```shell
docker run -i -t \
    -p 3030:80 \
    -e LLM="llama-tgi:http://host.docker.internal:8080" \
    -e MAX_TOKENS="your-max-tokens" \
    -e TOKENIZER_PATH="/mnt/tokenizer/tokenizer.model" \
    -v /home/user/tokenizer:/mnt/tokenizer \
    -v /home/user/ipilot:/mnt/data \
    silvanmelchior/incognito-pilot:latest-slim
```

Visit http://localhost:3030 and you should see the **Incognito Pilot** interface.
## :rocket: Getting started (Llama 2)

In the **Incognito Pilot** interface, you will see a chat interface, with which you can interact with the model.
Let's try it out!

1. **File Access**: Type "Create a text file with all numbers from 0 to 100".
   You will see how the *Code* part of the UI shows you a Python snippet.
   As soon as you approve, the code will be executed on your machine (within the docker container).
   You will see the result in the *Result* part of the UI.
   As soon as you approve it, it will be sent back to the model.
   In the case of using an API (like Replicate), this of course also means that this result will be sent to their services.
   After the approval, the model will confirm the execution to you.
   Check your working directory now (e.g. */home/user/ipilot*): You should see the file! (A shell sketch for this check follows after this list.)
2. **Math**: Type "What is 1 + 2 * 3 + 4 * 5 + 6 * 7 + 8 * 9?".
   The model will use the Python interpreter to come to the correct result (1 + 6 + 20 + 42 + 72 = 141).
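As a quick way to verify the first example from your host shell, you can inspect the working directory; the file name `numbers.txt` is hypothetical here, use whatever name the model actually chose:

```shell
ls /home/user/ipilot
head /home/user/ipilot/numbers.txt  # hypothetical name; check the ls output for the real one
```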
Now you should be ready to use **Incognito Pilot** for your own tasks.
One more thing: The version you just used has nearly no packages shipped with the Python interpreter.
This means things like reading images or Excel files will not work.
To change this, head back to the console and press Ctrl-C to stop the container.
Now re-run the command, but remove the `-slim` suffix from the image.
This will download a much larger version, equipped with [many packages](/docker/requirements_full.txt).

Let's head back to the [Settings](/README.md#gear-settings) section.

0 commit comments
