Found 70 repositories (showing 30)
taylorwilsdon
Since OpenAI and friends refuse to give us a max_ctx param in /models, here's the current context window, input token, and output token limits for OpenAI (API), Anthropic, Qwen, DeepSeek, Llama, Phi, Gemini, and Mistral
yash9439
Transform any codebase, web page, or document into an optimized LLM prompt. CodeToPrompt intelligently compresses code and filters content to overcome context window limits.
alexandephilia
Context Limiter & Output Vetter for context bloat. A highly specialized, structure-aware JSON compressor built specifically to intercept and compress MCP responses before they annihilate your LLM's context window.
taylorbayouth
A Python script to compress large text files for LLM context windows, optimizing the ratio of essential information to tokens used. It offers various compression techniques (key points, glossary terms, paraphrasing, etc.) to fit important content within token limits, reducing the risk of losing context and improving clarity and impact.
Code repository for the paper "The Inherent Limits of Pretrained LLMs: The Unexpected Convergence of Instruction Tuning and In-Context Learning Capabilities"
danwritecode
Solving context limits when working with LLMs by implementing a "chunkable" proc macro on your prompt structs.
Abzaek
Export ChatGPT conversations to Markdown to bypass message limits and seamlessly transfer context between LLMs.
paradite
Information on LLM models, context window token limit, output token limit, pricing and more.
bradAGI
The complete guide to free AI/LLM inference APIs — 20+ providers, 81+ verified $0 models, rate limits, context windows, and code examples
zhuangziGiantfish
Unable to Forget: Proactive Interference Reveals Working Memory Limits in LLMs Beyond Context Length
hilyfux
Stop AI Coding from forgetting. A knowledge graph–driven memory layer for LLMs (ChatGPT, Claude, Codex, DeepSeek, Gemini), enabling persistent long-term memory beyond context window limits. Build smarter AI agents with structured context, better consistency, and scalable multi-step reasoning across complex coding workflows.
zircote
Claude Code plugin for processing documents 100x larger than context limits using the Recursive Language Model pattern. Rust-powered chunking, hybrid semantic + BM25 search, and sub-LLM orchestration.
GPUforLLM
Accurate VRAM calculator for Local LLMs (Llama 4, DeepSeek V3, Qwen 2.5). Calculates GGUF quantization, GQA context overhead, and offloading limits
FilippoLeone
The LLM Code Prompter is a command-line utility designed to generate structured prompts from code repos for GPT-4 models, leveraging the maximum context limit.
ElliotOne
Deterministic context budgeting for LLM prompts, demonstrating stable prompt packing within fixed token limits.
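The "deterministic context budgeting" entry above describes packing prompt sections into a fixed token budget with stable results. A minimal sketch of that idea follows; the function names, the priority-based ordering, and the whitespace "tokenizer" are illustrative assumptions, not this repository's actual API:

```python
def pack_prompt(sections, budget, count_tokens):
    """Pack prompt sections into a fixed token budget.

    Sections are (name, priority, text) tuples; sorting by (priority, name)
    makes the packing order stable, so the same inputs always pack the same way.
    """
    packed, remaining = [], budget
    for name, priority, text in sorted(sections, key=lambda s: (s[1], s[0])):
        cost = count_tokens(text)
        if cost <= remaining:          # keep a section only if it fits whole
            packed.append(name)
            remaining -= cost
    return packed, remaining

# Illustrative usage with a naive whitespace token counter (an assumption).
sections = [
    ("history", 2, "turn one turn two"),   # 4 tokens, lowest priority
    ("system", 0, "be terse"),             # 2 tokens, highest priority
    ("query", 1, "what is up"),            # 3 tokens
]
packed, left = pack_prompt(sections, budget=7,
                           count_tokens=lambda t: len(t.split()))
```

Here "system" and "query" fit within the 7-token budget, "history" is dropped, and the result is identical on every run.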
Fast CLI tool for counting tokens with LLM context limit comparison
vilsonrodrigues
Context management for LLM agents. Persistent memory that survives context limits.
drandrewlaw
Intelligent conversation compaction for LLM applications. Never hit context window limits again. Works with any LLM provider.
sezginpaydas
Persistent vector-based memory for local LLMs using PostgreSQL to overcome context window limits.
glpayson
Summarize long LLM chat conversations using rolling summarization to preserve context and continue conversations past token limits
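The rolling-summarization pattern the entry above describes (keep recent turns verbatim, fold older ones into a summary) can be sketched as below; `summarize` stands in for an LLM summarization call, and the whitespace token counter is a placeholder assumption:

```python
def rolling_summarize(messages, budget, count_tokens, summarize):
    """Keep the newest messages verbatim within `budget` tokens;
    fold everything older into a single rolling summary."""
    summary, kept, used = "", [], 0
    overflowed = False
    for msg in reversed(messages):          # walk newest to oldest
        cost = count_tokens(msg)
        if not overflowed and used + cost <= budget:
            kept.append(msg)
            used += cost
        else:
            overflowed = True               # keep the verbatim window contiguous
            summary = summarize(summary, msg)  # an LLM call in a real system
    kept.reverse()                          # restore chronological order
    return summary, kept

# Toy stand-ins for illustration only.
count = lambda m: len(m.split())
concat = lambda s, m: (m + " " + s).strip()    # real code would call an LLM here
summary, recent = rolling_summarize(
    ["hello there friend", "how are you", "fine thanks and you"],
    budget=6, count_tokens=count, summarize=concat,
)
```

With a 6-token budget, only the latest message survives verbatim and the two older ones are folded into the summary.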
rudramadhabofficial
An infinite-context LLM agent inspired by MIT's Recursive Language Models (RLM). Uses autonomous tool-use and recursion to navigate and analyze massive datasets without RAG or context limits.
Sahith59
A local-first backend AI framework that solves LLM context limits by dynamically routing queries, branching conversations independently, and compressing old memories mathematically using ChromaDB
vishal-labade
AI Evals v2 is a structured, reproducible LLM evaluation framework that isolates behavioral reliability from memory capacity. It introduces controlled experiment families, a Memory Compliance Score (MCS), and context-cliff analysis to quantify when reliability failures stem from scale vs. context limits.
ArthurusDent
Optimal Ollama is a cross-platform benchmarking and tuning tool designed to find the "Sweet Spot" for your local LLMs. It helps you determine the maximum context window a model can handle on your specific hardware before performance degrades or memory limits are exceeded.
roycrisses
A CLI tool + importable library that analyzes any codebase, finds the most relevant files to a user's question, trims content to fit any LLM's token limit, and outputs a ready-to-use context block (to clipboard or directly to an LLM API).
Aryan-202
An intelligent optimization engine that dynamically adjusts LLM selection, context size, and token limits based on real-time hardware telemetry to maximize inference efficiency and prevent resource bottlenecks.
cheikh2shift
Library for getting LLM context window limits
0pfleet
Generate documents with exact token counts to test LLM context window limits
varriaza
Find the Pareto Frontier for open LLMs across model, quantization and context limits
PsiClawOps
Active context window probing for LLM providers — finds real enforced limits, not just advertised ones
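Active probing of an enforced limit, as the last entry describes, typically amounts to a binary search over probe sizes. This is a minimal sketch under that assumption; `accepts(n)` stands in for sending an n-token probe request and checking whether the provider rejects it:

```python
def probe_context_limit(accepts, lo=1, hi=2_000_000):
    """Binary-search the largest probe size (in tokens) that `accepts` allows.

    Assumes acceptance is monotone: if n tokens are rejected,
    every larger n is rejected too.
    """
    if not accepts(lo):
        return 0                      # even the smallest probe is rejected
    while lo < hi:
        mid = (lo + hi + 1) // 2      # bias upward so the loop terminates
        if accepts(mid):
            lo = mid                  # mid fits; the limit is at least mid
        else:
            hi = mid - 1              # mid rejected; the limit is below mid
    return lo

# Simulated provider that silently enforces a 131072-token window.
limit = probe_context_limit(lambda n: n <= 131_072)
```

Against a real API, each `accepts` call costs a request, so the binary search finds the enforced limit in roughly log2(hi) probes (about 21 here) rather than thousands of linear attempts.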