Found 614 repositories(showing 30)
SakiRinn
Lightweight and powerful real-time audio/speech translation tool based on Windows LiveCaptions.
cddqssc
A one-stop video subtitle translation tool
ruotianluo
Use transformer for captioning
njchoma
Image Captioning based on Bottom-Up and Top-Down Attention model
Kamino666
这是一个基于Pytorch平台、Transformer框架实现的视频描述生成 (Video Captioning) 深度学习模型。 视频描述生成任务指的是:输入一个视频,输出一句描述整个视频内容的文字(前提是视频较短且可以用一句话来描述)。本repo主要目的是帮助视力障碍者欣赏网络视频、感知周围环境,促进“无障碍视频”的发展。
aws-samples
Convert AWS Transcribe output into multiple caption formats.
zarzouram
Pytorch implementation of image captioning using transformer-based model.
milkymap
Implementation of the paper CPTR : FULL TRANSFORMER NETWORK FOR IMAGE CAPTIONING
ataturhan21
A deep learning project for generating descriptive captions for images using ConvNeXt and Transformer models in PyTorch
livekit-examples
No description available
MIBlue119
A simple implement to translate the .srt/.vtt caption with Open AI API
No description available
luo3300612
Optimized code based on M2 for faster image captioning training
MatejMecka
Translate Subtitles using deepl.
Video URL transcriber and translator using AI. Download from Youtube and translate automatically by adding subtitles to the video
No description available
kaylode
Image captioning with Transformer
This repository contains code for generating captions for images using a Transformer-based model. The model used is the `VisionEncoderDecoderModel` from the Hugging Face Transformers library, specifically the `nlpconnect/vit-gpt2-image-captioning` model.
No description available
A real-time caption translation tool based on VOSK speech recognition and machine translation, which supports transcribing audio into target language subtitles in real time and displaying the translated content.
No description available
tranquoctrinh
Image Captioning with Encoder as Efficientnet and Decoder as Decoder of Transformer combined with the attention mechanism.
laicsiifes
No description available
No description available
Notebook implementing a custom bimodal transformer architecture for masked language modelling in order to learn to caption COCO images.
avijeett007
No description available
stephenllh
A Transformer model that caption images
thangman22
A simple, real-time speech transcription and translation application built with vanilla JavaScript and Tailwind CSS.
Luanramos
Captions and Transitions
RicRicci22
In this repo, I try to build a transformer for image captioning. I will train it on the UCM dataset, a famous dataset of remote sensing images