llama.cpp deployment scripts for GPT-OSS 20B GGUF model (Windows & Linux)
Stars: 0 · Forks: 0 · Watchers: 0 · Open issues: 0
30 commits
f3e3d59 fix: remove 32k presets (not feasible on RTX 3060 12GB), use full-GPU presets
c955c3f fix: 32k preset GPU layers 30->20 to avoid VRAM OOM on RTX 3060
86bf71c feat: add switch.sh/switch.ps1 for quick config profile switching
fced85c feat: support settings.local.ini for machine-specific overrides
1ed0179 fix: revert KV cache type to f16, remove deprecated defrag-thold
e612d32 fix: add KV cache stability flags and fix Windows --jinja dead code bug
a69bd14 feat: add Gradio frontend with streaming chat and tool support
b48382d feat: add status.sh/status.ps1 for live server monitoring (throughput, KV cache, GPU)
f6d8c67 feat: add official Agents SDK tool examples (web_search, web_open, weather, calc)
af11c54 fix: make --jinja opt-in via ENABLE_JINJA in settings.ini (default false)
c201183 feat: add --jinja flag for tool calling + function call examples