Found 95 repositories (showing 30)
tencent-ailab
The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images from an image prompt.
amazon-science
Official implementation for the paper "Prompt Pre-Training with Over Twenty-Thousand Classes for Open-Vocabulary Visual Recognition"
AMAAI-Lab
Text2midi is the first end-to-end model for generating MIDI files from textual descriptions. By leveraging pretrained large language models and a powerful autoregressive transformer decoder, text2midi allows users to create symbolic music that aligns with detailed textual prompts, including musical attributes like chords, tempo, and style.
ShiZhengyan
[NeurIPS 2023 Main Track] This is the repository for the paper titled "Don’t Stop Pretraining? Make Prompt-based Fine-tuning Powerful Learner"
AGENDD
This repo is an exploratory experiment to enable frozen pretrained RWKV language models to accept speech-modality input. We followed the idea of SLAM_ASR and used the RWKV language model as the LLM; instead of writing a prompt template, we directly finetuned the initial state of the RWKV model.
ssbuild
share data, prompt data, pretraining data
DNE-Digital
Dolores is a Python library designed to improve the developer experience when working with pretrained language models. Dolores provides prompts for interacting with language models that result in interesting or useful outputs.
MedICL-VU
[ISBI 2024 Oral] ProMISe: Prompt-driven 3D Medical Image Segmentation Using Pretrained Image Foundation Models
TOM-tym
Official PyTorch implementation of our ICCV2023 paper “When Prompt-based Incremental Learning Does Not Meet Strong Pretraining”
hedongxiao-tju
[NeurIPS 2025] One Prompt Fits All: Universal Graph Adaptation for Pretrained Models
PRIS-CV
Code release for "Category-Specific Prompts for Animal Action Recognition with Pretrained Vision-Language Models"
SufyanDanish
A comprehensive survey of Vision–Language Models: Pretrained models, fine-tuning, prompt engineering, adapters, and benchmark datasets
c-box
Code for ACL 2022 long paper: Can Prompt Probe Pretrained Language Models? Understanding the Invisible Risks from a Causal View
adriaciurana
Generate prompts for a pretrained LLM using a genetic algorithm (GA)
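A minimal sketch of how GA-based prompt search typically works, assuming the usual loop of selection, crossover, and mutation with elitism. The fitness function here is a toy stand-in (token overlap with an illustrative target prompt); in the real setting it would score each candidate prompt by querying the pretrained LLM, e.g. via downstream task accuracy. All names and parameters below are hypothetical, not taken from the repository.

```python
import random

rng = random.Random(0)

# Illustrative target and vocabulary; in practice fitness would come from
# evaluating the prompt with the pretrained LLM, not from a fixed target.
TARGET = "think step by step before you answer".split()
VOCAB = sorted(set(TARGET) | {"the", "question", "solve", "now", "carefully"})

def fitness(cand):
    # Toy objective: position-wise token matches against the target prompt.
    return sum(a == b for a, b in zip(cand, TARGET))

def mutate(cand):
    # Replace one token with a random vocabulary word.
    cand = list(cand)
    cand[rng.randrange(len(cand))] = rng.choice(VOCAB)
    return cand

def crossover(a, b):
    # One-point crossover between two parent prompts.
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def evolve(pop_size=30, generations=40):
    pop = [[rng.choice(VOCAB) for _ in TARGET] for _ in range(pop_size)]
    history = []  # best fitness per generation
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        elite = pop[: pop_size // 5]  # elitism: carry the top 20% forward
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            children.append(mutate(crossover(a, b)))
        pop = elite + children
    return max(pop, key=fitness), history

best, history = evolve()
print(" ".join(best), fitness(best))
```

Because elitism always carries the current best individual into the next generation, the per-generation best fitness is monotonically non-decreasing, which makes the search easy to monitor.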
sangmichaelxie
Code for the NeurIPS 2021 paper "Why Do Pretrained Language Models Help in Downstream Tasks? An Analysis of Head and Prompt Tuning"
wangyu-sd
Molecular Chemical reActivity pretraining and prompted-finetuning enhanced molecular representation learning
zzyking
Adds image cropping for better prompting, based on the official repo of Qwen-VL (通义千问-VL), the chat & pretrained large vision-language model proposed by Alibaba Cloud.
IvanC987
A Flask-based GUI for latent diffusion image generation with real-time denoising (DDIM/DDPM), CFG, and advanced user controls (prompt input, img2img, upscaling, etc.). This project integrates various supporting pretrained models (Stable Diffusion VAE, CLIP, Real-ESRGAN, VGG16) into a custom Latent Diffusion Model pipeline for image synthesis.
Wxy-24
This is the implementation of [ISBI26] QwenCLIP: Boosting Medical Vision-Language Pretraining via LLM Embeddings and Prompt Tuning
kaoyuky
This is the PyTorch implementation for "Rethinking Remote Sensing Pretrained Model: Instance-Aware Visual Prompting for Remote Sensing Scene Classification". If you have any questions, please contact kys220900680@hnu.edu.cn
zhaoziheng
This is the official repository to conduct knowledge enhancement pretraining in "Large-Vocabulary Segmentation for Medical Images with Text Prompts".
This project combines object detection with voice-assistant features to help visually impaired people reach their destination and read sign boards using computer vision. The prototype performs real-time object detection with the YOLOv3 deep neural network, and the detected object and its class are conveyed to the blind person through speech. Alongside this, a voice assistant handles frequent requirements and utilities such as sending emails and retrieving information from the internet. The work combines pretrained YOLOv3 weights with the Darknet detection framework to build rapid, real-time multi-object detection suitable for a compact, portable device with minimal response time. Several prototypes have been built with blind users' different needs in mind, such as Object Recognizer for the Blind, Visual Aid for the Blind, and Google Lookup; among these approaches, computer-vision-based solutions are emerging as one of the most promising options due to their affordability and accessibility. The project ultimately aims to create a wearable visual aid for visually impaired people.
In this assignment, we’re going to finetune a pretrained Stable Diffusion model to create images based on Naruto-themed prompts. We’ll use the "small-stable-diffusion-v0" model and a dataset of Naruto-related captions. By the end, our model should generate awesome Naruto-style images from text prompts.
Lohith0204
AI Text-to-Image Generator is a deep learning–based application that converts natural language text prompts into high-quality images using pretrained diffusion models. The project demonstrates the practical use of generative AI by leveraging modern text-to-image architectures to transform user descriptions into visually meaningful outputs.
The code of our paper "Entity-related Unsupervised Pretraining with Visual Prompts for Multimodal Aspect-based Sentiment Analysis"
ExpressAI
Prompting Evaluation for Pretrained Language Models
3-Flamingo
For my paper: A Script Event Prediction Method Based on Multi-Level Joint Pretraining and Prompt Fine-Tuning
1maginat0r
[ISBI 2024 Oral] ProMISe: Prompt-driven 3D Medical Image Segmentation Using Pretrained Image Foundation Models
assignments-sliit
Caption generation using the Flickr8k dataset by @jbrownlee, and image generation from caption prompts using pretrained models
jiaolifengmi
Official PyTorch code for "Prompt-based Continual Learning for Extending Pretrained CLIP Models' Knowledge (ACMMM Asia 2024)".