Found 739 repositories (showing 30)
huggingface
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
lucidrains
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch
jeonsworld
Pytorch reimplementation of the Vision Transformer (An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale)
baofff
A PyTorch implementation of the paper "All are Worth Words: A ViT Backbone for Diffusion Models".
lukemelas
Vision Transformer (ViT) in PyTorch
p0p4k
unofficial vits2-TTS implementation in pytorch
naver-ai
[ECCV 2024] Official PyTorch implementation of RoPE-ViT "Rotary Position Embedding for Vision Transformer"
omerbt
Official Pytorch Implementation for "Splicing ViT Features for Semantic Appearance Transfer" presenting "Splice" (CVPR 2022 Oral)
Pytorch version of Vision Transformer (ViT) with pretrained models. This is part of CASL (https://casl-project.github.io/) and ASYML project.
junyuchen245
Vision Transformer for 3D medical image registration (Pytorch)
thuanz123
An unofficial implementation of both ViT-VQGAN and RQ-VAE in Pytorch
gupta-abhay
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
A Simplified PyTorch Implementation of Vision Transformer (ViT)
omihub777
PyTorch implementation of Vision Transformer [Dosovitskiy, A. (ICLR'21)] modified to obtain over 90% accuracy FROM SCRATCH on CIFAR-10 with a small number of parameters (6.3M; the original ViT-B has 86M).
A PyTorch-based Chinese TTS model built on VITS-BigVGAN, with an added prosody prediction model.
NVlabs
Official PyTorch implementation of A-ViT: Adaptive Tokens for Efficient Vision Transformer (CVPR 2022)
Simple and easy to understand PyTorch implementation of Vision Transformer (ViT) from scratch, with detailed steps. Tested on common datasets like MNIST, CIFAR10, and more.
juntang-zhuang
PyTorch repository for ICLR 2022 paper (GSAM) which improves generalization (e.g. +3.8% top-1 accuracy on ImageNet with ViT-B/32)
FENRlR
Application of MB-iSTFT-VITS components to vits2_pytorch
jaehyunnn
An unofficial implementation of ViTPose [Y. Xu et al., 2022]
Intellindust-AI-Lab
Pytorch implementation of "EdgeCrafter: Compact ViTs for Edge Dense Prediction via Task-Specialized Distillation"
ChenMnZ
(AAAI 2023 Oral) Pytorch implementation of "CF-ViT: A General Coarse-to-Fine Method for Vision Transformer"
fudong03
PyTorch Implementation of "MMST-ViT: Climate Change-aware Crop Yield Prediction via Multi-Modal Spatial-Temporal Vision Transformer", accepted by ICCV 2023.
jaiwei98
Collection and Implementation of Mobile-based Vision Transformer in Pytorch
teodorToshkov
A PyTorch implementation of VITGAN: Training GANs with Vision Transformers
yeyupiaoling
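A PyTorch-based speech synthesis project that uses VITS, a speech synthesis method. This end-to-end model is very simple to use: it needs no complicated steps such as text alignment and supports one-click training and generation, which greatly lowers the barrier to learning.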
gpastal24
VitPose without MMCV dependencies
sndnyang
PyTorch Implementation of "Your ViT is Secretly a Hybrid Discriminative-Generative Diffusion Model"
likelyzhao
vit model from tensorflow
masora1030
Replacing Labeled Real-Image Datasets with Auto-Generated Contours (CVPR 2022)