Found 97 repositories (showing 30)
riedlerm
Implementation and evaluation of multimodal RAG with text and image inputs for industrial applications
CornelliusYW
This repository contains a Multimodal Retrieval-Augmented Generation (RAG) Pipeline that integrates images, audio, and text for advanced multimodal querying and response generation.
microsoft
Enterprise-ready solution leveraging multimodal Generative AI (Gen AI) to enhance existing or new applications beyond text—implementing RAG, image classification, video analysis, and advanced image embeddings.
HyeonjeongHa
Official PyTorch implementation of "MM-PoisonRAG: Disrupting Multimodal RAG with Local and Global Poisoning Attacks"
utkartist
Multimodal Retrieval-Augmented Generation (RAG) is an advanced technique that combines text and image data to enhance the capabilities of large language models (LLMs) like GPT-4. This tutorial will guide you through the process of implementing a multimodal RAG system using GPT-4 and Llama Index.
ranasaurus9
This is a sample code implementation of Multimodal RAG using Google Gemini & MongoDB Atlas Vector Search.
Thaman-N
Advanced Contract Analysis System: A comprehensive legal contract analysis system using generative AI. The project implements various NLP techniques, prompt engineering approaches (CoT, TroT, GoT), Retrieval-Augmented Generation (RAG), multimodal inputs, QLoRA fine-tuning, and evaluation frameworks.
TeenLucifer
An implementation of a multimodal RAG system that supports images, tables, and formulas.
kirollos2001
A Python-based Retrieval-Augmented Generation (RAG) system designed to handle multimodal inputs and outputs. This project implements an advanced RAG architecture capable of processing and retrieving information across multiple modalities (text, images, etc.).
Benedictusy
In this project, I implemented a multimodal RAG (Retrieval-Augmented Generation) video question answering system that can understand both visual and textual information in videos to provide accurate answers.
This project implements a Multimodal Retrieval-Augmented Generation (RAG) pipeline using AWS Bedrock's Nova and Titan models. The system ingests PDFs, extracts text, tables, embedded images, and full-page images, then performs similarity search using FAISS to generate grounded answers with Amazon Nova using both text and visual context.
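Several of the pipelines above share a final "grounded answer" step: retrieved text chunks (and captions for retrieved images) are packed into a prompt so the model answers only from the supplied context. A minimal sketch of that step, with a hypothetical helper name (this is a generic illustration, not code from any listed repository):

```python
# Sketch of the grounded-prompt assembly step common to these RAG pipelines.
# build_grounded_prompt is a hypothetical helper, not from any listed repo.

def build_grounded_prompt(question, text_chunks, image_captions=()):
    """Assemble a context-grounded prompt from retrieved multimodal pieces."""
    parts = ["Answer using only the context below."]
    for i, chunk in enumerate(text_chunks, 1):
        parts.append(f"[Text {i}] {chunk}")
    for i, cap in enumerate(image_captions, 1):
        parts.append(f"[Image {i}] {cap}")
    parts.append(f"Question: {question}")
    return "\n".join(parts)
```

The resulting string would be sent to the generation model (e.g. a Bedrock or Gemini chat endpoint) as the user message.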
Dat-Bois
Multimodal RAG implementation for Recipe1M dataset
steve601
Implementing a multimodal RAG system.
mmm-megahed
Multimodal RAG implementation for Moodle with evaluation experiments
MansoobeZahra
Study assistant to support studying; involves a multimodal RAG implementation and a multi-agent system.
hash2004
This repository features a Multimodal Agentic RAG system combining advanced techniques such as Agentic Ingestion and RAG Fusion.
Multimodal (text, images, tables) RAG pipeline implementation using Llama 3.1, Google Gemini 1.5 Flash, and Chroma DB.
jemayz
ATLAST is a multimodal chatbot implementing RAG across three domains: Medical, Islamic, and Insurance.
This project highlights an Agentic RAG implementation using ApertureDB, a graph-based multimodal data store. Hugging Face SmolAgents is employed to implement a multi-agent LLM workflow.
Naveed05
Advanced GenAI projects implementing Retrieval-Augmented Generation (RAG) across text, audio, and multimodal pipelines using vector databases and foundation models.
Gauravmangate27
NovaSearch – Multimodal RAG Engine (Python, LLMs, LangChain, FastAPI). Developed a multimodal RAG system enabling semantic search across text and image data using OpenAI embeddings and CLIP. Implemented real-time ingestion and hybrid retrieval pipelines with Kafka, Spark Streaming, FAISS, and Elasticsearch k-NN, improving retrieval.
nandanavijesh
End-to-end RAG implementation using Jina Embeddings v2 and FAISS for vector search, with Groq llama-3.2-vision for grounded, multimodal response generation.
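The embed-index-search pattern this entry (and many others in this list) describes can be sketched in a few lines. Below is a toy, dependency-free stand-in: the hash-based `embed()` replaces a real embedding model (e.g. Jina Embeddings v2), and brute-force cosine search replaces a FAISS index; it illustrates the shape of the retrieval step only, not any repository's actual code:

```python
# Toy sketch of dense retrieval: embed documents and query, rank by cosine
# similarity. embed() is a hypothetical stand-in for a real embedding model;
# a FAISS index would replace the brute-force search in a real system.

import math

def embed(text, dim=8):
    """Toy deterministic embedding: character codes folded into dim buckets, L2-normalized."""
    vec = [0.0] * dim
    for ch in text.lower():
        vec[ord(ch) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a, b):
    """Dot product of two unit vectors equals their cosine similarity."""
    return sum(x * y for x, y in zip(a, b))

def retrieve(query, docs, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]
```

The retrieved passages would then be passed, together with any images, to a vision-capable generation model for the grounded answer.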
FormalIngenieroniel
This project implements a Multimodal Retrieval-Augmented Generation (RAG) system designed to identify, retrieve, and describe specific train wagons based on visual and textual data.
fllin1
Modern search engine techniques implemented end-to-end: keyword (BM25), semantic (embeddings), hybrid fusion (weighted, RRF), multimodal (image→text), and LLM-enhanced retrieval (RAG with Gemini).
This project implements a Multimodal Retrieval-Augmented Generation (RAG) pipeline that combines text and visual understanding using a Vision-Language Model (VLM). It enables querying across both documents and images, retrieving relevant multimodal context, and generating grounded responses.
pparitoshh
A collection of Generative AI implementations focused on real-world applications like Retrieval-Augmented Generation (RAG), chatbots, and multimodal systems. Includes production-ready code, tutorials, and experiments using LangChain, OpenAI, and open-source models (Llama, Mistral). Contributions welcome!
rafamartinezquiles
This project implements a multimodal pipeline capable of ingesting text, extracting knowledge, and enabling intelligent search using Retrieval-Augmented Generation (RAG). It uses cutting-edge tools like LangChain, OpenAI, and Neo4j to build a searchable knowledge graph from unstructured documents like employee handbooks.
This Streamlit application implements a Multimodal Retrieval-Augmented Generation (RAG) system. It processes various types of documents including text files, PDFs, PowerPoint presentations, and images. The app leverages Large Language Models and Vision Language Models to extract and index information from these documents.
kratipandya
This project implements a multimodal Retrieval-Augmented Generation (RAG) search engine focused on scientific content from arXiv. It allows users to search through research papers using text queries, image uploads, or audio inputs, and provides AI-generated answers based on relevant content.
karimtawfikk
This project implements a multimodal RAG system for designing creative flower arrangements. Flower images are stored in ChromaDB with OpenCLIP embeddings, enabling natural language queries like “What flowers would look elegant for a wedding bouquet?” The model then generates personalized bouquet suggestions grounded in retrieved visuals.