A simple Python-based UI in front of a zero-shot RAG prompt that helps users generate beats using AI. The generated beats are not high quality, but quickly generating many simple ideas that can seed new and more interesting ones is extremely useful when you are stuck.
- The user can enter their OpenAI key and model on the first screen.
- On the second screen they can enter their style prompt and the tempo of the song.
- The OpenAI API integration returns JSON with the relevant result, which is converted to MIDI data and stored in the `./beat_sessions` folder.
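
The JSON-to-MIDI step in the last bullet can be sketched as follows. This is a minimal illustration, not the project's actual converter: the JSON shape (`tempo`, `steps_per_beat`, `tracks`), the `beat_to_events` helper, and the 480 ticks-per-beat resolution are all assumptions made for the example. The drum note numbers follow the General MIDI percussion map.

```python
import json

# Hypothetical response shape; the project's real JSON schema is not shown here.
SAMPLE_RESPONSE = """
{
  "tempo": 120,
  "steps_per_beat": 4,
  "tracks": {
    "kick":  [1, 0, 0, 0, 1, 0, 0, 0],
    "snare": [0, 0, 0, 0, 1, 0, 0, 0]
  }
}
"""

# General MIDI percussion note numbers.
GM_DRUM_NOTES = {"kick": 36, "snare": 38, "closed_hat": 42}

TICKS_PER_BEAT = 480  # a common MIDI resolution, assumed for this sketch


def beat_to_events(raw_json):
    """Convert a JSON step grid into sorted (tick, note, velocity) tuples."""
    beat = json.loads(raw_json)
    ticks_per_step = TICKS_PER_BEAT // beat["steps_per_beat"]
    events = []
    for name, steps in beat["tracks"].items():
        note = GM_DRUM_NOTES[name]
        for index, hit in enumerate(steps):
            if hit:
                # Fixed velocity of 100 for every hit, to keep the sketch small.
                events.append((index * ticks_per_step, note, 100))
    return sorted(events)


events = beat_to_events(SAMPLE_RESPONSE)
```

The resulting event list could then be handed to whichever MIDI library writes the file; that part is left out because the README does not say which one is used.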
The main aim of this simple project was to answer the following questions:
- Is it possible to generate the output from an AI in the required format with 90% accuracy?
  Yes, with good prompt engineering. Providing the required format and instructions in the RAG prompt gives a high degree of success.
- Can it do specialised tasks, like generating high-quality beats, from a user's prompt?
  No. The output (the beat MIDI) is not high quality, and we cannot control the output parameters predictably. For example, we cannot generate beats with the same number of bars across runs, even though this was specified in the final input prompt to the LLM.
- What kinds of errors, exceptions, edge cases and other scenarios can come up in a production system that uses AI and lead to a degraded experience for the user?
  Since it is an API integration, we can hit rate limits, exceeded billing quotas, and third-party API downtime due to outages or maintenance.
  Since the AI's output is non-deterministic, we can get junk output, varying output quality, and non-compliance with the required output format.
  Additionally, AI systems add a lot of latency, so caching and/or using local vector chains with HNSW/keyword search before falling back to external LLMs may be an important consideration.
- How do UI elements work in Python for a desktop app?
  We can use `tkinter` to build windowed desktop apps.
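
The failure modes listed above (non-compliant output, transient API errors) are usually handled by validating every reply and retrying with backoff. The sketch below is one way to do that under stated assumptions: `call_model` is a hypothetical callable that returns the raw reply text, the `tracks` key is an assumed schema, and `parse_beat` and `generate_with_retry` are names invented for this example. Real code would also catch the client library's own rate-limit and quota exceptions.

```python
import json
import time


class InvalidBeatError(ValueError):
    """Raised when the model's reply does not match the expected format."""


def parse_beat(raw, expected_steps):
    """Parse the reply and check the constraints the prompt asked for."""
    try:
        beat = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise InvalidBeatError(f"not valid JSON: {exc}") from exc
    if not isinstance(beat, dict):
        raise InvalidBeatError("top-level value is not an object")
    tracks = beat.get("tracks")  # assumed key, see the lead-in
    if not isinstance(tracks, dict) or not tracks:
        raise InvalidBeatError("missing 'tracks' object")
    for name, steps in tracks.items():
        if not isinstance(steps, list) or len(steps) != expected_steps:
            raise InvalidBeatError(
                f"track {name!r} does not have {expected_steps} steps"
            )
    return beat


def generate_with_retry(call_model, expected_steps, max_attempts=3,
                        base_delay=1.0, sleep=time.sleep):
    """Call the model until a valid beat comes back or attempts run out."""
    last_error = None
    for attempt in range(max_attempts):
        try:
            return parse_beat(call_model(), expected_steps)
        except (InvalidBeatError, ConnectionError, TimeoutError) as exc:
            last_error = exc
            if attempt < max_attempts - 1:
                # Exponential backoff: base_delay, then 2x, 4x, ...
                sleep(base_delay * 2 ** attempt)
    raise RuntimeError(
        f"no usable beat after {max_attempts} attempts"
    ) from last_error
```

Passing `sleep` as a parameter keeps the function easy to test without real delays. Checking the step count in code matters because, as noted above, stating it in the prompt does not guarantee the model respects it.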
Any system that uses a simple integration with OpenAI or any other LLM should tolerate high variance and errors. On critical and irreversible paths, AI must be paired with strong human-in-the-loop approval.
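
A human-in-the-loop gate can be as small as a confirmation step in front of the irreversible action. The following is only a sketch: `apply_if_approved` and its parameters are hypothetical names, and a desktop app would more likely show a dialog than read from the console.

```python
def apply_if_approved(action, description, ask=input):
    """Run `action` only after a human explicitly confirms it.

    `ask` defaults to the built-in input(); it is a parameter so a GUI
    prompt or a test double can be swapped in.
    """
    answer = ask(f"{description} Proceed? [y/N] ")
    if answer.strip().lower() not in ("y", "yes"):
        return False  # anything other than an explicit yes is a refusal
    action()
    return True
```

Defaulting to "no" means an empty or unexpected answer never triggers the action.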