Search Results

Found 13,289 repositories (showing 30)

sdnext

vladmandic

💛83

SD.Next: All-in-one WebUI for AI generative image and video creation, captioning and processing

7.0k

559

Apache-2.0

Python

Updated 12 hours ago

ai-art, caption, diffusers, +7

neuraltalk2

karpathy

💛83

Efficient Image Captioning code in Torch, runs on GPU

5.6k

1.3k

Jupyter Notebook

Updated 1 day ago

sketch-code

ashnkumar

💛77

Keras model to generate HTML code from hand-drawn website mockups. Applies an image captioning architecture to hand-drawn source images.

5.2k

681

Python

Updated 16 hours ago

augmentation, deep-learning, image-processing, +2

a-PyTorch-Tutorial-to-Image-Captioning

sgrvinod

💛80

Show, Attend, and Tell | a PyTorch Tutorial to Image Captioning

2.9k

727

MIT

Python

Updated 3 days ago

attention-mechanism, computer-vision, encoder-decoder, +5

no-code-architects-toolkit

The NCA Toolkit API eliminates monthly subscription fees by consolidating common API functionalities into a single FREE API. Designed for businesses, creators, and developers, it streamlines advanced media processing, including video editing and captioning, image transformations, cloud storage, and Python code execution.

2.3k

988

GPL-2.0

Python

Updated 1 day ago

Caption-Anything

ttengwang

💛73

Caption-Anything is a versatile tool combining image segmentation, visual captioning, and ChatGPT, generating tailored captions with diverse controls for user preferences. https://huggingface.co/spaces/TencentARC/Caption-Anything https://huggingface.co/spaces/VIPLab/Caption-Anything

1.8k

104

BSD-3-Clause

Python

Updated 1 day ago

chatgpt, controllable-generation, controllable-image-captioning, +2

ComfyUI-Prompt-Assistant

yawiii

💛72

The Prompt Assistant provides one-click access to LLM/VLM services such as Zhipu, SiliconFlow, Gemini, local Ollama, and Baidu for prompt translation, polishing and expansion, and image captioning. It also supports one-click insertion of prompt presets and search over prompt history, making it an all-in-one prompt plugin.

1.8k

GPL-3.0

JavaScript

Updated 9 hours ago

comfyui, expand, prompt, +2

densecap

jcjohnson

🧡66

Dense image captioning in Torch

1.6k

427

MIT

Jupyter Notebook

Updated 2 weeks ago

ImageCaptioning.pytorch

ruotianluo

💛71

I decided to sync up this repo with self-critical.pytorch. (The old master is kept in the old master branch for archival purposes.)

1.5k

423

MIT

Python

Updated 1 day ago

describe-anything

NVlabs

💛72

[ICCV 2025] Implementation for Describe Anything: Detailed Localized Image and Video Captioning

1.5k

Apache-2.0

Python

Updated 5 days ago

describe-anything, detailed-localized-captioning, large-multimodal-models, +1

bottom-up-attention

peteanderson80

💛70

Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome

1.5k

376

MIT

Jupyter Notebook

Updated 2 days ago

caffe, captioning-images, faster-rcnn, +5

CLIP_prefix_caption

rmokady

💛74

Simple image captioning model

1.4k

222

MIT

Jupyter Notebook

Updated 1 day ago

joycaption

fpgaminer

💛72

JoyCaption is an image captioning Visual Language Model (VLM) being built from the ground up as a free, open, and uncensored model for the community to use in training Diffusion models.

1.1k

Apache-2.0

Jupyter Notebook

Updated 1 day ago

captioning, joycaption, vlm

awesome-image-captioning

zhjohnchan

🧡68

A curated list of image captioning and related area resources. :-)

1.1k

182

Updated 3 days ago

self-critical.pytorch

ruotianluo

🧡69

Unofficial PyTorch implementation of Self-critical Sequence Training for Image Captioning, and others.

1.0k

276

MIT

Python

Updated 3 days ago

image-captioning

xmodaler

YehLi

💛72

X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval).

969

105

NOASSERTION

Python

Updated 3 days ago

cross-modal-retrieval, image-captioning, pretraining, +4

image_captioning

DeepRNN

❤️49

TensorFlow implementation of "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"

796

352

MIT

Python

Updated 1 month ago

conceptual-captions

google-research-datasets

💛71

Conceptual Captions is a dataset containing (image-URL, caption) pairs designed for the training and evaluation of machine learned image captioning systems.

563

NOASSERTION

Shell

Updated 4 days ago

meshed-memory-transformer

aimagelab

🧡67

Meshed-Memory Transformer for Image Captioning. CVPR 2020

544

136

BSD-3-Clause

Python

Updated 3 days ago

caption-generation, captioning-images, cvpr2020, +4

Dank-Learning

alpv95

❤️41

Dank Learning codebase, generate a meme from any image using AI. Uses a modified version of the Show and Tell image captioning network

483

MIT

Jupyter Notebook

Updated 7 months ago

VLP

LuoweiZhou

🧡51

Vision-Language Pre-training for Image Captioning and Question Answering

423

Apache-2.0

Python

Updated 2 months ago

Awesome-Visual-Captioning

forence

❤️36

This repository focuses on Image Captioning, Video Captioning, Seq-to-Seq Learning, and NLP

411

Updated 4 months ago

AoANet

husthuaan

❤️46

Code for paper "Attention on Attention for Image Captioning". ICCV 2019

339

MIT

Python

Updated 2 months ago

attention-mechanism, iccv2019, image-captioning

AdaptiveAttention

jiasenlu

🧡66

Implementation of "Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning"

338

NOASSERTION

Jupyter Notebook

Updated 3 days ago

attention-mechanism, image-captioning, torch

Image-Captioning

yashk2810

❤️42

Image Captioning using InceptionV3 and beam search

328

123

MIT

Jupyter Notebook

Updated 4 months ago

beam-search, cnn, image-captioning, +3

fairseq-image-captioning

krasserm

❤️41

Transformer-based image captioning extension for pytorch/fairseq

318

Apache-2.0

Python

Updated 3 months ago

fairseq, image-captioning, pytorch, +1

image-captioning

JDAI-CV

❤️46

Implementation of 'X-Linear Attention Networks for Image Captioning' [CVPR 2020]

273

Python

Updated 2 months ago

image-captioning, vision-and-language

catr

saahiluppal

🧡66

Image Captioning Using Transformer

270

Apache-2.0

Python

Updated 5 days ago

image-captioning, transformer

NexusRAG

LeDat98

🧡66

Hybrid RAG system combining vector search, knowledge graph (LightRAG), and cross-encoder reranking, with Docling document parsing, visual intelligence (image/table captioning), agentic streaming chat, and inline citations. Powered by Gemini or local Ollama models.

259

Python

Updated 8 hours ago

chromadb, citation, docling, +13

VSUA-Captioning

ltguo19

❤️40

Code for "Aligning Linguistic Words and Visual Semantic Units for Image Captioning", ACM MM 2019

258

MIT

Python

Updated 7 months ago

captioning, deep-learning, language-generation, +2
