Found 21 repositories (showing 21)
Relaxed-System-Lab
🚀🚀 Efficient implementations of Native Sparse Attention
HKUSTDial
Trainable fast and memory-efficient sparse attention
No description available
No description available
deep-spin
AdaSplash: Adaptive Sparse Flash Attention (aka Flash Entmax Attention)
Danielohayon
No description available
alexdremov
Fast, flexible, and chill sparse flash attention kernel
mvideet
Developing a CUDA kernel for Adaptive Sparse Flash Attention (Goncalves et al.)
gabrielmaialva33
Pure Gleam tensor library with quantization (INT8, NF4, AWQ), Flash Attention, and 2:4 Sparsity - 7.5x memory multiplication
li-guohao
Adaptive Sparse Attention Module with Flash Attention - 5.45x speedup on consumer GPUs
raayandhar
Implementation of Sparse Flash (Splash) Attention in CUDA. FP32 only; nothing production-grade.
BhoumikPatidar
MLSys 2026 NVIDIA Track: FlashInfer-Bench Contest - DeepSeek Sparse Attention
pranay5255
No description available
raayandhar
Sparse Causal Flash Attention. QK-sparse and Hash-sparse attention kernels.
Anonymous44414
No description available
ykirpichev
No description available
benzhang0323
No description available
reachtarunhere
No description available
tranhohoangvu
Deep Learning coursework (2025): attention mechanisms (Self/Flash/Linear/Sparse) and OCR with ResNet + Transformer Decoder.
NguyenQuangTrung19
Deep Learning final project exploring advanced attention mechanisms in LLMs (self-attention, MQA, GQA, Flash/linear/sparse attention, RoPE) with PyTorch demos, plus a CNN + Transformer-Decoder OCR model for image-to-text with evaluation on test data.
MindIntels
⚡ Production-ready Flash Attention library unifying FlashAttention-2/3/4 + FFPA innovations, including polynomial exp2 emulation, conditional rescaling, ping-pong pipelining, GQA/MQA/MLA, paged KV-cache, block-sparse masking, and Triton auto-tuned GPU kernels
All 21 repositories loaded
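The common technique behind most of these repositories is block-sparse attention: the attention score matrix is partitioned into blocks, and only the blocks marked active in a block mask are computed. A minimal NumPy sketch of that idea follows; the function and variable names are illustrative (not taken from any repository above), and real kernels fuse this into a single Flash-Attention-style tiled GPU pass instead of materializing the full score matrix.

```python
# Minimal block-sparse attention sketch (illustrative names, not any
# specific repository's API). Inactive blocks are left at -inf so the
# softmax assigns them zero weight, exactly as a block mask would.
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax; exp(-inf) becomes exactly 0.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def block_sparse_attention(q, k, v, block_mask, block=4):
    """q, k, v: (seq, dim) arrays; block_mask: (seq//block, seq//block)
    boolean grid. Scores are computed only for blocks where block_mask
    is True; all other positions stay -inf before the softmax."""
    seq, dim = q.shape
    scores = np.full((seq, seq), -np.inf)
    nblocks = seq // block
    for i in range(nblocks):
        for j in range(nblocks):
            if block_mask[i, j]:
                qi = q[i * block:(i + 1) * block]
                kj = k[j * block:(j + 1) * block]
                scores[i * block:(i + 1) * block,
                       j * block:(j + 1) * block] = qi @ kj.T / np.sqrt(dim)
    return softmax(scores, axis=-1) @ v
```

With an all-True mask this reduces to ordinary dense attention; with a block-diagonal mask, each query block attends only to its own key block, which is the source of the memory and speed savings the descriptions above advertise.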