[Year-end is approaching!] Benchmark for evaluating LLMs on Chinese kinship term inference (中文亲属关系). Given a relation chain (e.g., "my father's elder brother"), the model must output the correct address term (e.g., 伯父). Answers are scored by an LLM judge (LLM-as-Judge). Supported providers: SiliconFlow, OpenRouter, OpenAI, Gemini.
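To make the task concrete, here is a minimal sketch of one benchmark item. It is not taken from the repository: the `REFERENCE` table and the `build_prompt` and `score` helpers are hypothetical names, and the LLM judge described above is replaced with plain exact-match scoring so the example runs without any API key.

```python
# Hypothetical reference table: relation chain -> accepted address terms.
# The entries are illustrative only and are not taken from the benchmark's data.
REFERENCE = {
    ("father", "elder brother"): {"伯父", "伯伯"},
    ("father", "younger brother"): {"叔父", "叔叔"},
    ("mother", "brother"): {"舅舅", "舅父"},
}


def build_prompt(chain):
    """Turn a relation chain into a natural-language question."""
    path = "my " + "'s ".join(chain)
    return f"In Chinese, what do I call {path}? Answer with the term only."


def score(chain, answer):
    """Exact-match scoring: a stand-in for the LLM judge (an assumption)."""
    return answer.strip() in REFERENCE[tuple(chain)]


if __name__ == "__main__":
    chain = ("father", "elder brother")
    print(build_prompt(chain))
    print(score(chain, "伯父"))
    print(score(chain, "叔父"))
```

Exact match is stricter than a judge model. It rejects any valid regional or colloquial variant that is missing from the table, which is presumably one reason a benchmark like this would use an LLM judge instead.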
Stars: 3
Forks: 0
Watchers: 3
Open Issues: 0
Overall repository health assessment
No package.json found
This might not be a Node.js project
1 commit