GitHub Explorer

by Alexey Ratnikov

GitHub Explorer

GitHub Explorer|TRENDING COMPARE|FEEDBACK

Back to search

cksac/turboquant-model - GitHub Explorer | GitHub Explorer | Trending | Compare

Back to search

turboquant-model

cksac•PUBLIC

View on GitHub

Created on Mar 27, 2026

Updated on Apr 7, 2026

Stars

172

Forks

Watchers

172

Open Issues

Repository Health Score

🧡

50/100

Fair

Overall repository health assessment

Score Breakdown

Activity

Active development - updated this week

30/30

100%

Issues Analytics

Total Issues

All time

Open

0% of total

Closed

Recent Commits

Merge pull request #8 from cksac/copilot/implement-turboquant-model

cksac•2 days ago

c486c48View on GitHub

Address code review: use actual prime for table_size, clarify test assertions

copilot-swe-agent[bot]•2 days ago

bd516deView on GitHub

Add hash-based weight compression module (TurboQuant-Model)

copilot-swe-agent[bot]•2 days ago

af4ae1fView on GitHub

Initial plan

copilot-swe-agent[bot]•2 days ago

e9a56b0View on GitHub

feat: per-group blockwise calibration (4-bit PPL -0.51, KLD -26%)\n\nAdd per-group alpha correction to blockwise calibration. Instead of a\nsingle scalar per row (M,), each group gets its own learnable correction\n(M,G), injected into weight_norms during the forward pass for gradient\nflow through the per-group norm scaling.\n\nResults on Qwen3.5-0.8B-Base (4-bit, 4s/50i):\n- Per-row: PPL 13.6971 (-0.259), KLD 0.1170 (-10%), 12.9 min\n- Per-group: PPL 13.4427 (-0.514), KLD 0.0959 (-26%), 14.0 min\n\nPer-group is 2x better PPL and 2.6x better KLD at only 8% more time.\nRecovers 28.1% of the quantization gap (vs 14.2% for per-row).\n\nChanges:\n- CalibrationConfig.per_group: default True\n- Blockwise cal: per-group alpha via weight_norms injection\n- _fold_alpha: handle (M,G) alpha shape\n- Test script: --per-group flag\n- Updated docs and site with per-group results"

cksac•2 days ago

16c7de7View on GitHub

feat: block-wise norm calibration (4-bit PPL -0.26, KLD -10%)\n\nAdd block-wise end-to-end norm calibration that optimizes per-row norms\nthrough each transformer block sequentially, using MSE + angular + KLD\nloss against pre-captured FP targets.\n\nKey results on Qwen3.5-0.8B-Base (4-bit):\n- PPL: 13.9564 → 13.6971 (-0.2592)\n- KLD: 0.1301 → 0.1170 (-10.1%)\n- Calibration time: ~13 min (4 samples, 50 iters)\n\nPer-layer calibration was harmful (+0.07 PPL), confirming that\nlocally optimal norms don't compose through the network.\n4+4 residual doesn't benefit (already cos≈1.000).\n\nChanges:\n- norm_calibration.py: calibrate_norms_blockwise() with sequential\n block processing, FP target pre-capture, exp parameterization\n- cli.py: --calibrate flag and calibrate subcommand now use blockwise\n- Defaults: n_samples=4, n_iters=50 (3.9x faster, equal quality)\n- docs/techniques/blockwise-calibration.md\n- site/src/app/techniques/blockwise-calibration/page.tsx\n- Updated nav chain: Norm Compression → Block-wise Cal → QJL"

cksac•2 days ago

6825487View on GitHub

feat: CPU offload for pass 2 + embedding quantization (INT8/INT4)

cksac•3 days ago

d54fc7dView on GitHub

feat: add comprehensive survey of recent quantization papers and their compatibility with TurboQuant

cksac•6 days ago

39ae95eView on GitHub

feat: add MMLU benchmark for f16 reference and 4+4bit per-layer rotation with factored_int8 norm

cksac•6 days ago

1edf89aView on GitHub

feat: implement per-layer rotation strategy in quantization and add corresponding tests