Q&A with ChatGPT enriched with YouTube transcriptions

This program uses OpenAI's GPT-3 to generate answers to your questions based on transcriptions of presentations on AI from YouTube.

Dependencies

OpenAI API key, stored in file openai.key
Cassandra database supporting vector search. Currently that means you need to build and run this branch: https://github.com/datastax/cassandra/tree/cep-vsearch
- TL;DR: git clone, git checkout cep-vsearch, ant jar, bin/cassandra -f
- ⚠️ Note: this is a prerelease and uses FLOAT VECTOR[N] syntax for declaring a vector column. This syntax will change in the near future.
JDK 11. Exactly 11.
To get cqlsh with vector support, run bin/cqlsh from the Cassandra source root.
You can install the Python dependencies for cassgpt by running pip install -r requirements.txt from this source tree.
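
As a rough illustration of the first dependency, the key file can be read like this. This is a minimal sketch, not the code in ai.py; the function name read_api_key is hypothetical, and only the file name openai.key comes from this README.

```python
from pathlib import Path


def read_api_key(path="openai.key"):
    """Return the API key stored in `path`, without surrounding whitespace.

    Hypothetical helper: the real program may load the key differently.
    """
    key = Path(path).read_text(encoding="utf-8").strip()
    if not key:
        raise ValueError(f"{path} is empty")
    return key
```

Stripping whitespace avoids authentication failures caused by a trailing newline that most editors add to the file.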

Usage

python gen-qa-openai.py [--load_data]

Specifying --load_data will download the dataset, merge the transcriptions into larger chunks, generate embeddings for each chunk using OpenAI's text-embedding-ada-002 model, and insert the chunks and embeddings into the database. This will take around twenty minutes and cost about $5 as of May 2023. This only needs to be done once.
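
The "merge the transcriptions into larger chunks" step might look roughly like the sketch below. It assumes the transcript arrives as a list of short text segments; the function name and the window and stride values are illustrative assumptions, not the ones used in gen-qa-openai.py.

```python
def merge_segments(segments, window=20, stride=4):
    """Join consecutive transcript segments into overlapping chunks.

    Each chunk covers up to `window` segments, and a new chunk starts
    every `stride` segments. Both defaults are hypothetical.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    chunks = []
    for start in range(0, len(segments), stride):
        piece = segments[start:start + window]
        chunks.append(" ".join(s.strip() for s in piece))
    return chunks
```

Overlapping windows mean a sentence split across two segments still appears whole in at least one chunk. Each chunk would then be sent to the embedding model and stored alongside its vector.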

Once the dataset is loaded, the program will prompt you for a question; it will find the most relevant context from the transcriptions using Cassandra vector search, and feed the resulting context + question to OpenAI to generate an answer to your query.
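
The "context + question" step can be pictured as follows. This is a sketch under assumptions: contexts is taken to be the list of transcript chunks returned by the vector search, most relevant first, and the prompt wording, the separator, and the max_chars budget are made up for illustration.

```python
def build_prompt(question, contexts, max_chars=3000):
    """Assemble a prompt from ranked context chunks and a question.

    Chunks are added in order until the next one would exceed
    `max_chars` (a hypothetical budget counted over chunk text only).
    """
    header = "Answer the question based on the context below.\n\nContext:\n"
    footer = f"\n\nQuestion: {question}\nAnswer:"
    selected = []
    used = 0
    for text in contexts:
        if used + len(text) > max_chars:
            break
        selected.append(text)
        used += len(text)
    return header + "\n\n---\n\n".join(selected) + footer
```

The returned string is what would be sent to the OpenAI completion endpoint. Capping the context keeps the request within the model's input limit.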

The program assumes Cassandra is running on localhost; edit the source if it's somewhere else.
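
If you do edit the source, one option is to read the host from the environment instead of hard-coding it. This is a sketch only; the CASSANDRA_HOSTS variable and the function name are hypothetical and do not exist in this repository.

```python
import os


def cassandra_contact_points(default="127.0.0.1"):
    """Return the Cassandra hosts to connect to.

    Reads a comma-separated list from the hypothetical CASSANDRA_HOSTS
    environment variable and falls back to localhost when it is unset
    or blank.
    """
    raw = os.environ.get("CASSANDRA_HOSTS", "")
    hosts = [h.strip() for h in raw.split(",") if h.strip()]
    return hosts or [default]
```

The resulting list could be passed as the contact points when creating the driver's Cluster object in db.py, assuming that is where the connection is made.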

Need to start over?

Instead of rebuilding the embeddings from scratch (slow!), dump them from Cassandra and re-load them into a fresh database.

python dump.py
python load.py

Both scripts also assume Cassandra is running on localhost.
