X-modaler is a versatile, high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval).
Stars: 969
Forks: 105
Watchers: 969
Open Issues: 16