We introduce Reasoning via Video, a new paradigm that uses maze-solving video generation to probe multimodal reasoning; our VR-Bench shows that fine-tuned video models consistently outperform strong VLMs on long-horizon spatial planning tasks.
Stars: 58
Forks: 6
Watchers: 58
Open Issues: 0
a79c65a Merge pull request #14 from FoundationAgents/refactor/unified_generation
29a7d37 refactor: Refactor the game adapter interface to unify the video generation method
4ba4d66 Merge pull request #11 from FoundationAgents/feature/autoenv_skin_generation
76624ab Merge branch 'main' of https://github.com/ImYangC7/VR-Bench
9f93083 Merge pull request #10 from FoundationAgents/feature/add_dynamic_prompt