Found 33 repositories (showing 30)
sierra-research
τ-Bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains
amazon-agi
τ²-Bench-Verified is a corrected and verified version of the original τ²-bench benchmark. This release addresses issues discovered in the original dataset where task definitions, expected actions, and evaluation criteria did not properly align with the stated policies or database contents.
jbarnes850
Multi-turn tool-use training pipeline for tau2-bench using slime
AGI-Eval-Official
No description available
oscaralvaro
Copy of the original repository https://github.com/sierra-research/tau2-bench?tab=readme-ov-file
Maniktherana
AgentChangeBench
Vattikondadheeraj
tau2-bench
jitrc
Enhanced tau2-bench for deeper analysis
Leaderboard repository for tau2-bench-agent
wuTims
Datadog observability project for tau2-bench-agent
rezitdinovAR
Purple agent for tau2 bench on AgentBeats
vvvgo
A2A purple agent for tau2-bench on AgentBeats
leary-comos
No description available
APengX
No description available
uservan
No description available
moyai-ai
Evaluates tau2-bench using multiple agent frameworks and foundational models
mink555
No description available
codesque16
No description available
tienanh28122000
No description available
APengX
No description available
EnvCommons
tau2bench
ABHINAV2400
No description available
safikhanSoofiyani
τ²-Bench: Evaluating Conversational Agents in a Dual-Control Environment
agentbeater
No description available
EnvCommons
tau2bench implementation in ORS
runshengdu
No description available
mpdmanash
Tau2-like bench with a STRIPS planner as verifier and world simulator
jdf-prog
No description available
singamsettysunil-code
No description available
dzhunkoffski
No description available