multi-modal-ai

Here are 11 public repositories matching this topic...

DHT-AI-Studio / RAPTOR

RAPTOR (Rapid AI-Powered Text and Object Recognition) is an AI-native Content Insight Engine that transforms passive media storage into an intelligent knowledge platform through automated analysis, semantic search, and actionable insights. RAPTOR reduces manual tagging by 85% and makes content discovery 10x faster.

nlp machine-learning microservices ai computer-vision deep-learning artificial-intelligence semantic-search ai-framework audio-processing content-analysis digital-asset-management video-analysis vector-database ai-automation llm multi-modal-ai content-intelligence ai-orchestration

Updated Feb 2, 2026
Python

Tanush1912 / sales-forge-backend

Sales Forge is a high-performance, real-time voice interaction platform designed to train sales representatives through adaptive AI personas. It provides a low-latency, immersive roleplay experience that simulates real-world sales challenges.

python docker websockets postgresql roleplay real ai-agents conversational-ai fastapi voice-ai generative-ai sales-training sales-enablement multi-modal-ai coaching-platform

Updated Jan 20, 2026
Python

veiston / Hygieia

Hygieia (Desktop Doctor) is a local, private medical diagnosis AI. It runs on Python and Ollama and supports images and documents.

ai medical web-scraping ai-agents vision-language-model multi-modal-ai mmllm

Updated Feb 10, 2026
Python

nickcottrell / vrgb-kafka

Color-based semantic routing for Apache Kafka: tag events with RGB hex codes for flexible consumer-side filtering. This eliminates topic proliferation and enables dynamic routing without payload deserialization. Python reference implementation with a validated 5x speedup over content-based routing.

python distributed-systems machine-learning kafka stream-processing apache-kafka real-time-processing message-broker event-routing multi-modal-ai

Updated Nov 15, 2025
Python

mwasifanwar / ChronoPredict

A multi-modal system that analyzes social media, news, art, and music to predict emerging cultural movements and artistic trends years before they go mainstream.

creative-ai cultural-analytics social-dynamics trend-prediction multi-modal-ai multi-modal-ai-analysis

Updated Nov 11, 2025
Python

ShivamMishra1603 / video-xplore

AI video analysis + web research in one tool. Upload videos, ask questions, get comprehensive insights with current web data.

multi-modal-ai agentic-ai google-gemini-api

Updated Aug 30, 2025
Python

metacore-stack / WanderLens-AI

An intelligent travel planning platform powered by GPT-4 and DALL-E 3 that generates personalized, optimized itineraries with route optimization, budget allocation, and AI-generated visual content through advanced prompt engineering and multi-modal AI integration.

machine-learning ai openai fastapi itinerary-generator streamlit gpt-4 prompt-engineering travel-tech langchain dall-e-3 multi-modal-ai travel0planning

Updated Feb 3, 2026
Python

Diluksha-Upeka / Neurospace

A Multi-Modal RAG Knowledge Engine: an intelligent knowledge graph system that ingests video, audio, and PDF documents to create a connected semantic web. Features a graph-based retrieval engine (GraphRAG), multi-modal search, and an interactive React Flow visualization dashboard. Built with FastAPI, Next.js, Neo4j, and LlamaIndex.

python docker typescript neo4j nextjs knowledge-graph openai rag fastapi react-flow vector-database llamaindex genai multi-modal-ai graphrag

Updated Mar 6, 2026
Python

Lipeka / Multi-modal-Recommendation-System

A multi-modal recommender system that suggests books or music based on voice input, audio song recognition, typed queries, and real-time weather in your city.

python deep-learning artificial-intelligence speech-recognition gradio rag large-language-models multi-modal-ai

Updated Jun 16, 2025
Python

Aish-p / Text-Vision-Agent

Text-Vision-Agent is an AI-powered assistant that generates images from text descriptions and provides detailed image descriptions. It combines image generation using FluxPipeline with vision-based language models like ChatOllama, enabling seamless text-to-image and image interpretation interactions.

generative-ai multi-modal-ai nlp-and-vision-integration chatollama fluxpipeline image-generation-and-description

Updated Feb 16, 2025
Python

Monographatmosphericphenomenon995 / reflective-reasoning-transformer

🔍 Enhance reasoning accuracy with the Reflective Reasoning Transformer, leveraging causal reasoning graphs for better dynamic reasoning performance.

nlp deep-learning transformers agi causal-inference interpretability pre-training causal-graphs llm chain-of-thought multi-modal-ai adaptive-ai agentic-ai trending-agi relevant-agi-2025 self-reflective-architectures benchmark-outperformer ai-faithfulness

Updated Mar 13, 2026
Python
