Universal Vulkan GGUF loader. Loads GGUF v1, v2, and v3 files in all quantized formats.
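As a rough illustration of what supporting GGUF v1 through v3 involves at the file-header level, the sketch below parses the fixed header fields. It is not taken from this repository: the function name `read_gguf_header` and the returned dictionary keys are hypothetical. It relies on the published GGUF layout, where the magic bytes `GGUF` are followed by a 32-bit version, and the tensor and metadata counts are 32-bit in v1 and 64-bit in v2 and v3. It also assumes a little-endian file; v3 additionally permits big-endian files, which this sketch does not handle.

```python
import struct

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file


def read_gguf_header(buf: bytes) -> dict:
    """Parse the fixed GGUF header fields from the start of `buf`.

    Hypothetical helper for illustration; assumes little-endian data.
    """
    if len(buf) < 8 or buf[:4] != GGUF_MAGIC:
        raise ValueError("not a GGUF file (bad magic or truncated header)")

    # The version is a 32-bit unsigned integer right after the magic.
    (version,) = struct.unpack_from("<I", buf, 4)

    if version == 1:
        counts_fmt = "<II"  # v1 stores both counts as uint32
    elif version in (2, 3):
        counts_fmt = "<QQ"  # v2 and v3 widen both counts to uint64
    else:
        raise ValueError(f"unsupported GGUF version: {version}")

    # struct.error is raised here if the buffer is too short for the counts.
    tensor_count, kv_count = struct.unpack_from(counts_fmt, buf, 8)
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }
```

In practice a loader would pass the first 24 bytes of the file (for example from `open(path, "rb").read(24)`) and then continue with the metadata key-value section, whose string and array lengths follow the same 32-bit versus 64-bit split.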
Stars: 0
Forks: 0
Watchers: 0
Open issues: 0
28 commits
07d8168 Phase 6: Moonshot Expansions - Hybrid ray-traced attention for 1M+ tokens, multimodal video decode prototype, adaptive precision switching, federated multi-GPU external memory, WASM/Docker/browser ecosystem
896b599 Phase 5: Ultimate Performance Polish - Optimize ray query MoE with dynamic thresholds/coop matrices, full bindless overhaul, async enhancements, dequant fusion expansion, benchmark suite upgrades
4bd2b56 Phase 4: Verify & Stabilize - Stress tests for sparse MoE, GEMM benchmarks, multi-GPU scaling validation, numerical stability checks, portability audit
65d8342 Phase 3: Out-of-the-Box Moonshots - Sparse MoE attention with ray queries, mixed-precision support, async compute queues, Vulkan video extensions, CI/GitHub Actions, comparison table
732178f Phase 2: Performance Polish - GEMM shader optimizations, kernel fusion, paged KV cache, bindless descriptors, benchmark enhancements
648b1fe Phase 1: Verification & Hardening - Replace all mock test code with real model loading and generation, add speculative decoding flag, implement GPU profiling with timestamps
9d7fef1 Phase 1: Verification & Hardening - Add runtime Vulkan extension checks with fallbacks, implement per-layer GPU timestamp queries for matmul/attention/transfer breakdown, expand tests with integration tests for speculative decoding, multi-GPU, LoRA, verify FlashAttention causality on long prompts, tune dynamic workgroup sizes
f366d34 Phase 3: Maximum Fusion & Optimizations - fused FFN and RMSNorm kernels
ecb1c7a Phase 2: True FlashAttention-2 Mastery - complete tiled causal attention
13dc162 Phase 1: Perfect GEMM Suite - optimized tiled matmul kernels
971fc5d feat: Phase 1 - Rewrite GEMM shaders with modern optimizations