Found 13 repositories (showing 13)
anyantudre
Florence-2 is a novel vision foundation model with a unified, prompt-based representation for a variety of computer vision and vision-language tasks.
Ravi-Teja-konda
A VLM-driven tool that processes surveillance videos, extracts frames, and generates insightful annotations using a fine-tuned Florence-2 Vision-Language Model. Includes a Gradio-based interface for querying and analyzing video footage.
jacobmarks
Run SOTA Vision-Language Model Florence-2 on your data!
CharlesCNorton
ViLMA (Vision-Language Model Active Monitoring) - A real-time desktop monitoring tool leveraging Florence-2
PRITHIVSAKTHIUR
This application utilizes the powerful Florence-2 vision-language model from Microsoft to generate comprehensive captions for images. The model is capable of understanding visual content and expressing it in natural language.
The MultiModal-Vision-Language-Model-Training repository provides scripts for fine-tuning vision-language models (PaliGemma, BLIP-2, BLIP, SmolVLM, Qwen-VL, Florence-2) on the SkinCAP and ROCOv2 datasets for medical image captioning. Optimized with LoRA and 4-bit quantization, it includes efficient training and evaluation (loss, accuracy, ROUGE, BLEU).
SUP3RMASS1VE
Florence-2 is a large vision-language model capable of various image and text generation tasks, such as object detection, captioning, and grounding. This demo allows users to interact with these capabilities by uploading images and selecting from various tasks.
No description available
5hak1r
A multimodal image captioning and audio narration system using the Florence-2 Vision-Language Model.
sandrarairan
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks. Model summary: this Hub repository contains a Hugging Face Transformers implementation of Microsoft's Florence-2 model. Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks.
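The prompt-based interface described in this summary can be sketched in a few lines. This is a minimal illustration, not code from the repository: the task tokens follow Microsoft's published Florence-2 model card, and `build_prompt` is a hypothetical helper showing how a task token (optionally followed by extra text, e.g. a grounding phrase) forms the model's text prompt. The model itself is not loaded here.

```python
# Sketch of Florence-2's prompt-based task selection.
# Task tokens are taken from Microsoft's Florence-2 model card;
# the mapping names below are illustrative.
TASK_PROMPTS = {
    "caption": "<CAPTION>",
    "detailed_caption": "<DETAILED_CAPTION>",
    "object_detection": "<OD>",
    "dense_region_caption": "<DENSE_REGION_CAPTION>",
    "ocr": "<OCR>",
}

def build_prompt(task: str, text_input: str = "") -> str:
    """Compose the text prompt Florence-2 expects: a task token,
    optionally followed by additional input text (hypothetical helper)."""
    return TASK_PROMPTS[task] + text_input

print(build_prompt("caption"))  # → <CAPTION>
```

In the actual Transformers workflow, such a prompt string is passed together with an image to the model's processor, and the task token determines which capability (captioning, detection, OCR, …) the model exercises.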
asifa-nazir
A comparative analysis of Vision-Language Models (Florence-2 vs. BLIP-2) performing dense region captioning on isolated objects segmented by SAM3.
Abdeen-A-AI
This project implements an advanced generative AI pipeline for extracting and rating features from images. It combines the power of Florence-2, a state-of-the-art vision-language model, with a fine-tuned version of Mistral-v3, a cutting-edge large language model.
r-vage
Smart Model Loader for ComfyUI — for vision-language models, text LLMs, and WD14 taggers across 8 backends (Transformers, GGUF, vLLM, SGLang, Ollama, llama.cpp, YOLO, WD14). Supports QwenVL, Mistral3, Florence-2, LLaVA, YOLO with multi-task chaining, few-shot training, and auto-download. V3 API + Nodes 2.0 compatible. NVIDIA/AMD/ROCm.