This PyTorch-based image captioning model uses a ResNet-50 encoder and a Transformer decoder to generate descriptive captions from Flickr8k images. Features include data augmentation (image transforms and synonym replacement), training with AdamW and early stopping, and inference via greedy decoding, beam search (k = 3, 5, or 10), or nucleus sampling.
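The three inference strategies named above differ only in how the next token is chosen from the decoder's output distribution. The following is a minimal, dependency-free sketch of that difference, not the repository's code: the function names (`greedy_pick`, `nucleus_filter`, `nucleus_pick`, `beam_search`) and the `step_log_probs` callback are hypothetical, and plain Python lists stand in for the PyTorch tensors a real decoder would return.

```python
import random


def greedy_pick(probs):
    """Greedy decoding: take the single most probable token index."""
    return max(range(len(probs)), key=probs.__getitem__)


def nucleus_filter(probs, top_p=0.9):
    """Keep the smallest set of top tokens whose cumulative probability
    reaches top_p, and renormalise their probabilities to sum to 1."""
    order = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
    kept, mass = [], 0.0
    for idx in order:
        kept.append(idx)
        mass += probs[idx]
        if mass >= top_p:
            break
    return {idx: probs[idx] / mass for idx in kept}


def nucleus_pick(probs, top_p=0.9, rng=random):
    """Nucleus (top-p) sampling: sample from the filtered distribution."""
    filtered = nucleus_filter(probs, top_p)
    tokens = list(filtered)
    weights = [filtered[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


def beam_search(step_log_probs, beam_width, max_len, eos):
    """Beam search: keep the beam_width best partial captions per step.

    step_log_probs(tokens) is an assumed callback that returns one
    log-probability per vocabulary index for the next token.
    """
    beams = [([], 0.0)]  # (token list, summed log-probability)
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            if tokens and tokens[-1] == eos:
                # Finished captions are carried over unchanged.
                candidates.append((tokens, score))
                continue
            for tok, lp in enumerate(step_log_probs(tokens)):
                candidates.append((tokens + [tok], score + lp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
        if all(t and t[-1] == eos for t, _ in beams):
            break
    return beams[0][0]
```

With `beam_width=1` the beam search reduces to greedy decoding. This sketch ranks beams by raw summed log-probability, which tends to favour shorter captions; length normalisation is a common refinement that is left out here.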