dhawalc/turboQuantDC - GitHub Explorer | Trending

Stars: 32
Forks: 7
Watchers: 32
Open Issues: 0

Repository Health Score: 60/100 (Fair)

Score Breakdown

Activity: 30/30 (100%). Active development, updated this week.

Recent Commits

perf: Triton-accelerate full pipeline (quantize + dequantize + WHT rotation)

Dhawal•4 days ago

a391a98

perf: Triton WHT rotation kernel for O(d log d) fused rotation

Dhawal•4 days ago

c1126ab

perf: wire Triton fused quantize kernel into GenerationCache

Dhawal•4 days ago

5bc5a3c

perf: wire Triton fused dequantize kernel into GenerationCache

Dhawal•4 days ago

6018fbf

perf: add speed benchmark for Triton vs Python quantize/dequantize

Dhawal•4 days ago

40b7d03

feat: overnight validation — 236B model running on single RTX 4090

Dhawal•4 days ago

afef829

feat: cross-layer KV cache with shared codebook/rotation resources

Dhawal•4 days ago

c86f04e

feat: 1-bit value quantization with correction (V compression is free)

Dhawal•5 days ago

6b8a749

feat: self-correcting KV cache with periodic refresh (prevents error accumulation)

Dhawal•5 days ago

f4c10ba

feat: ultra-streaming engine for 200B+ models on consumer GPUs

Dhawal•5 days ago

3688a3e

perf: switch to WHT rotation as default (O(d log d), 98% quality match)

Dhawal•5 days ago

a40eaab

feat: unified 70B-on-4090 launcher with auto-configured KV compression

Dhawal•5 days ago

dc904ec

feat: integrate HybridCache + fixed eviction + streaming into autoresearch sweep

Dhawal•5 days ago

9959393

feat: TurboQuant weight compression (TQ-W) for ultra-low-bit model deployment

Dhawal•5 days ago

c94127f

feat: HybridCache combining boundary anchoring + gradient bits + per-head allocation

Dhawal•5 days ago

f22e6a1
