Search Results

Found 33 repositories (showing 30)

tau2-bench

sierra-research

🧡68

τ-Bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains

981

245

MIT

Python

Updated 2 hours ago

ai, benchmark, conversational-agents, +2

tau2-bench-verified

τ²-Bench-Verified is a corrected and verified version of the original τ²-bench benchmark. This release fixes issues found in the original dataset, where task definitions, expected actions, and evaluation criteria did not align with the stated policies or database contents.

MIT

Python

Updated 6 days ago

Tau2-RL-Pipeline

jbarnes850

🧡60

Multi-turn tool-use training pipeline for tau2-bench using slime

Apache-2.0

Python

Updated 1 week ago

tau2-bench-revised

AGI-Eval-Official

❤️40

No description available

MIT

Updated 1 month ago

tau2-bench

oscaralvaro

🧡55

Copy of the original repository https://github.com/sierra-research/tau2-bench?tab=readme-ov-file

MIT

Python

Updated 4 days ago

AgentChangeBench

Maniktherana

🧡50

AgentChangeBench

MIT

Python

Updated 1 month ago

agents, neurips-2025, tau2-bench

tau2-bench_dummy

Vattikondadheeraj

🧡60

tau2-bench

MIT

Python

Updated 3 weeks ago

tau2-enhanced

jitrc

❤️35

Enhanced tau2-bench for deeper analysis

Python

Updated 5 months ago

tau2-bench-agent-leaderboard

wuTims

❤️45

Leaderboard repository for tau2-bench-agent

Python

Updated 2 months ago

tau2-observe

wuTims

❤️40

Datadog observability project for tau2-bench-agent

Apache-2.0

Python

Updated 3 months ago

tau2-purple

rezitdinovAR

🧡65

Purple agent for tau2 bench on AgentBeats

Python

Updated 6 minutes ago

tau2-purple-agent

vvvgo

🧡65

A2A purple agent for tau2-bench on AgentBeats

Python

Updated 10 minutes ago

tau2-bench

leary-comos

🧡50

No description available

MIT

Python

Updated 3 weeks ago

tau2-bench

APengX

❤️25

No description available

Python

Updated 3 months ago

tau2-bench

uservan

❤️40

No description available

MIT

Python

Updated 1 month ago

tau2xagents-bench

moyai-ai

❤️45

Evaluates tau2-bench using multiple agent frameworks and foundational models

Python

Updated 1 month ago

TAU2-Bench

mink555

❤️45

No description available

Python

Updated 3 weeks ago

tau2-bench

codesque16

❤️40

No description available

MIT

Python

Updated 1 month ago

tau2-bench

tienanh28122000

❤️40

No description available

MIT

Python

Updated 1 month ago

Tau2_Benchmk

APengX

❤️20

No description available

Python

Updated 2 months ago

tau2benchenv

EnvCommons

🧡50

tau2bench

Python

Updated 2 weeks ago

tau2-bench

ABHINAV2400

❤️30

No description available

MIT

Python

Updated 7 months ago

tau2-bench

safikhanSoofiyani

❤️40

τ²-Bench: Evaluating Conversational Agents in a Dual-Control Environment

MIT

Python

Updated 4 months ago

Tau2-Bench

agentbeater

❤️20

No description available

Python

Updated 1 month ago

tau2bench

EnvCommons

🧡50

tau2bench implementation in ORS

Python

Updated 2 weeks ago

tau2_bench_exp

runshengdu

❤️40

No description available

MIT

Python

Updated 2 months ago

tau3_bench

mpdmanash

❤️45

Tau2 like bench with STRIPS planner as verifier and world simulator

Python

Updated 2 months ago

frozen-tau2-bench

jdf-prog

❤️40

No description available

MIT

Python

Updated 2 months ago

tau2-bench--1

singamsettysunil-code

❤️40

No description available

MIT

Python

Updated 1 month ago

agentx-tau2bench-sol

dzhunkoffski

🧡55

No description available

Python

Updated 1 day ago
