Found 99 repositories (showing 30)
aiplanethub
Build, evaluate and observe LLM apps
MingyuJ666
[ICML'25] Our study systematically investigates massive values in LLMs' attention mechanisms. First, we observe massive values are concentrated in low-frequency dimensions across different attention heads exclusively in attention queries (Q) and keys (K) while absent in values (V).
invariantlabs-ai
LLM proxy to observe and debug what your AI agents are doing.
llnl
ParaView-MCP integrates multimodal LLMs with ParaView via Model Context Protocol, enabling natural language control of scientific visualizations. The agent observes the viewport for visual feedback, making complex visualization tools accessible to all users while providing intelligent automation for experts.
dhyansraj
Enterprise-grade distributed AI agent framework | Develop → Deploy → Observe | K8s-native | Dynamic DI | Auto-failover | Multi-LLM | Python + Java + TypeScript
mehmetnadir
Zero-dependency browser automation CLI. 70+ commands, 10 test assertions, smart commands (click/fill by text — no LLM needed). MCP server for AI agents with 500x fewer tokens. Extract, observe, script runner. 50KB, pure CDP.
svd-ai-lab
sim — a CLI runtime that lets LLM agents launch, drive, and observe CAD/CAE simulators through one protocol
user1342
A security testing tool designed to evaluate the effectiveness of large language models (LLMs) in protecting secrets and preventing security breaches. With customisable LLM options, the tool allows you to simulate attacks on LLMs using various techniques and observe their defence capabilities.
ra189zor
Real-time observability and analytics platform for local LLMs, with dashboard and API.
yourconscience
Observe and analyze LLM agents' decision-making through Space Rangers text adventures! 👾🚀📊
maltyxx
An autonomous Web Application Firewall (WAF) that uses a Large Language Model (LLM) to learn and adapt its security rules automatically based on observed traffic.
solzilberman
Minimal implementation of thought-act-observe design pattern for LLMs (gpt-3.5-turbo).
TengJiao33
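Automated AI leaderboard notifications. Pushes real-time leaderboard information from HF, OpenRouter, LMSYS, and Artificial Analysis every day, so you stay on top of LLM developments.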
kyyasdev
A small project that captures everything our LLM traffic touches: FastAPI intercepted each prompt, Postgres archived the full exchange, and the React dashboard replayed token counts like telemetry. It wasn’t just a proxy; it was proof we could observe any model in real time, down to the user label and individual completion.
JoNeedsSleep
LLMs play Diplomacy, testing out their Machiavellian prowess, and we get to observe them.
azank1
MCP server that observes every prompt, scores quality in real time, and closes the loop with iterative refinement. Built on FastMCP, SQLite, and Bayesian priors — no extra LLM required.
sfc-gh-sdickson
Testing Tool for LLM Observability
dongkoony
Production-ready MLOps platform for monitoring and evaluating LLM response quality with automated alerts and real-time analytics
kunish
Wheel. LLM API Gateway — Aggregate, Balance, Observe.
kernel-systems
Observe the Claude Code agent's LLM calls
logsv
An open-source platform to govern, evaluate, observe, and control LLMs in production.
kevinsze1996
An interface that lets people choose different LLMs and observe them talking to each other
PoiName1923
Streaming pipeline to observe trades on Binance and build an LLM and a dashboard based on the data.
cruz209
VIGIL: A reflective runtime for LLM agents that observes behavior, appraises failures, and proposes its own fixes (even to itself)
vijayashankar-g
A minimal AI coding agent that observes terminal output, thinks using an LLM, fixes broken code with tools, and loops until the program runs successfully.
spyrae
Local AI agent framework for macOS. Observes work patterns via native APIs, analyzes with local LLM (Ollama), stores everything locally. Tauri 2.0 + Swift + React + SQLite.
inchara23
PropInsight is an AI-powered property inspection report generator that uses LLMs to analyze property types and observed issues, generating comprehensive and data-driven reports for smarter decision-making.
Soham041201
A Claude-Code–style CLI built with Bun, TypeScript, React Ink, and Playwright that observes UI interactions and network calls, correlates them using LLM + Vision, and generates safe, structured API documentation.
Spartan-71
A learning-in-public repo. I'm going from zero RL knowledge → fine-tuning LLMs with reinforcement learning. Every folder is a phase. Every experiment has notes on what I tried and what I observed.
Screen-Mate is an AI-powered desktop assistant that observes your screen, understands context, and provides real-time, proactive help using OCR, YOLO, and LLMs — offering smart suggestions and debugging support through a minimal floating overlay.