Found 108,540 repositories (showing 30)
dotnet
Powerful .NET library for benchmarking
A microbenchmark support library
docker
The Docker Bench for Security is a script that checks for dozens of common best-practices around deploying Docker containers in production.
facebookresearch
Fast, modular reference implementation of Instance Segmentation and Object Detection algorithms in PyTorch.
TechEmpower
Source for the TechEmpower Framework Benchmarks project
aquasecurity
Checks whether Kubernetes is deployed according to security best practices as defined in the CIS Kubernetes Benchmark
krausest
A comparison of the performance of a few popular JavaScript frameworks
masonr
YABS - a simple bash script to estimate Linux server performance using fio, iperf3, & Geekbench
jeinlee1991
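ReLE Evaluation: capability evaluation of Chinese AI large language models (continuously updated). It currently covers 359 large models, including commercial models such as chatgpt, gpt-5.2, o4-mini, Google gemini-3-pro, Claude-4.6, Wenxin ERNIE-X1.1, ERNIE-5.0, qwen3-max, qwen3.5-plus, Baichuan, iFlytek Spark, and SenseTime senseChat, as well as open-source models such as step3.5-flash, kimi-k2.5, ernie4.5, MiniMax-M2.5, deepseek-v3.2, Qwen3.5, llama4, Zhipu GLM-5, GLM-4.7, LongCat, gemma3, and mistral. It provides not only a leaderboard but also a large-model defect library of more than 2 million items, so the community can study, analyze, and improve large models.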
erikbern
Benchmarks of approximate nearest neighbor libraries in Python
OpenBMB
[ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning.
bestiejs
A benchmarking library. As used on jsPerf.com.
Kodezi
Kodezi Chronos is a debugging-first language model that achieves state-of-the-art results on SWE-bench Lite (80.33%) and 67% real-world fix accuracy, over six times better than GPT-4. Built with Adaptive Graph-Guided Retrieval and Persistent Debug Memory. Model available Q1 2026 via Kodezi OS.
SWE-bench
SWE-bench: Can Language Models Resolve Real-world Github Issues?
foolwood
Visual Tracking Paper List
Text recognition (optical character recognition) with deep learning methods, ICCV 2019
HTTP(S) benchmark tools, testing/debugging, & restAPI (RESTful)
SWE-agent
The 100 line AI agent that solves GitHub issues or helps you in your command line. Radically simple, no huge configs, no giant monorepo—but scores >74% on SWE-bench verified!
devMEremenko
XcodeBenchmark measures the compilation time of a large codebase on iMac, MacBook, and Mac Pro
THUDM
A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
Beyond the Imitation Game collaborative benchmark for measuring and extrapolating the capabilities of language models
AutoCodeRoverSG
A project-structure-aware autonomous software engineer aiming for autonomous program improvement. Resolved 37.3% of tasks (pass@1) in SWE-bench lite and 46.2% of tasks (pass@1) in SWE-bench verified, with each task costing less than $0.7.
zombocom
Go faster, off the Rails - Benchmarks for your whole Rails app
kostya
Some benchmarks of different languages
EZLippi
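Webbench is a very simple website stress-testing tool for Linux, written by Radim Kolar in 1997. It uses fork() to simulate multiple clients accessing a specified URL at the same time, testing how the site performs under load. It can simulate up to 30,000 concurrent connections to test a site's load capacity. Official site: http://home.tiscali.cz/~cz210552/webbench.html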
soumith
Easy benchmarking of all publicly accessible implementations of convnets
graphdeeplearning
Repository for benchmarking graph neural networks (JMLR 2023)
jcjohnson
Benchmarks for popular CNN models
smallnest
:zap: Go web framework benchmark
miloyip
C/C++ JSON parser/generator benchmark