An Image-Based Caption Generator
BitNBuild-25 Project
A pretrained Faster R-CNN model (trained on the COCO dataset) detects objects in the image. The detected objects are then sent to Google's Gemini, and the response is received as JSON and displayed in the frontend.
- Python
- PyTorch
- FastAPI
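
The detection-to-caption flow described above could look roughly like the sketch below. It is not the project's actual code: the function names (`detect_objects`, `build_prompt`, `parse_caption`), the score threshold, the prompt wording, and the `caption` JSON key are all assumptions made for illustration. The detector uses torchvision's pretrained `fasterrcnn_resnet50_fpn`, which may differ from the model variant used here, and the Gemini call itself is left out because the client setup is not described in this README.

```python
import json
from collections import Counter


def detect_objects(image_path, score_threshold=0.6):
    """Run a COCO-pretrained Faster R-CNN and return labels with scores.

    Assumes torch, torchvision and Pillow are installed; imports are local
    so the helper functions below work without them.
    """
    import torch
    from PIL import Image
    from torchvision.models.detection import (
        FasterRCNN_ResNet50_FPN_Weights,
        fasterrcnn_resnet50_fpn,
    )
    from torchvision.transforms.functional import to_tensor

    weights = FasterRCNN_ResNet50_FPN_Weights.DEFAULT  # trained on COCO
    model = fasterrcnn_resnet50_fpn(weights=weights).eval()
    image = to_tensor(Image.open(image_path).convert("RGB"))
    with torch.no_grad():
        output = model([image])[0]
    categories = weights.meta["categories"]
    return [
        {"label": categories[int(label)], "score": float(score)}
        for label, score in zip(output["labels"], output["scores"])
        if float(score) >= score_threshold  # hypothetical cutoff
    ]


def build_prompt(detections):
    """Summarise detections as object counts and ask for a JSON caption."""
    counts = Counter(d["label"] for d in detections)
    summary = ", ".join(f"{label}: {n}" for label, n in sorted(counts.items()))
    return (
        "Objects detected in an image (label: count): "
        f"{summary}. "
        'Reply with JSON of the form {"caption": "..."}.'
    )


def parse_caption(response_text):
    """Extract the first JSON object from the model's text reply."""
    start = response_text.find("{")
    end = response_text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in response")
    return json.loads(response_text[start:end + 1])
```

In a FastAPI route, the uploaded image would go through `detect_objects`, the string from `build_prompt` would be sent to Gemini, and the dictionary from `parse_caption` would be returned to the frontend.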