LLVM Plugin to Instrument Global Memory Accesses in CUDA Kernels
Stars: 10
Forks: 3
Watchers: 10
Open Issues: 1
d1ee4f2: Replace the __shfl call with a call to __shfl_sync using a full mask. This should be equivalent to the deprecated __shfl (see the CUDA C Programming Guide for CUDA 9.2, Sections B.15.2 and B.15.5.1).
c208918: Replace __ballot(1) with __activemask() (see the CUDA C Programming Guide for CUDA 9.2, Section H.6.2).
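The two replacements described in these commits can be illustrated with a small standalone program. This is a minimal sketch, not code from the plugin: the kernel name warp_demo, the helper run_demo, the FULL_MASK macro, and the buffer names are all hypothetical. It also assumes the kernel is launched with one full warp of 32 threads and that every lane reaches the __shfl_sync call, because a full mask names all 32 lanes and each named lane must execute the intrinsic.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical name: a mask that names all 32 lanes of a warp.
#define FULL_MASK 0xffffffffu

// Hypothetical kernel showing both replacements side by side.
__global__ void warp_demo(int *broadcast, unsigned *active) {
    int lane = threadIdx.x % warpSize;
    int value = lane + 100;

    // Was: __shfl(value, 0). Every lane reads the value held by lane 0.
    int from_lane0 = __shfl_sync(FULL_MASK, value, 0);

    // Was: __ballot(1). Returns a bitmask of the lanes currently active.
    unsigned mask = __activemask();

    broadcast[threadIdx.x] = from_lane0;
    active[threadIdx.x] = mask;
}

// Runs the kernel on one warp and checks the results on the host.
bool run_demo() {
    const int n = 32;
    int *d_broadcast = nullptr;
    unsigned *d_active = nullptr;
    if (cudaMalloc(&d_broadcast, n * sizeof(int)) != cudaSuccess) return false;
    if (cudaMalloc(&d_active, n * sizeof(unsigned)) != cudaSuccess) {
        cudaFree(d_broadcast);
        return false;
    }

    warp_demo<<<1, n>>>(d_broadcast, d_active);

    int h_broadcast[n];
    unsigned h_active[n];
    bool ok =
        cudaMemcpy(h_broadcast, d_broadcast, n * sizeof(int),
                   cudaMemcpyDeviceToHost) == cudaSuccess &&
        cudaMemcpy(h_active, d_active, n * sizeof(unsigned),
                   cudaMemcpyDeviceToHost) == cudaSuccess;

    for (int i = 0; ok && i < n; ++i) {
        // Lane 0 holds 100, so every lane should have received 100.
        if (h_broadcast[i] != 100) ok = false;
        // A lane that called __activemask() is active, so its own bit is set.
        if (((h_active[i] >> i) & 1u) == 0u) ok = false;
    }

    cudaFree(d_broadcast);
    cudaFree(d_active);
    return ok;
}

int main() {
    bool ok = run_demo();
    std::printf("%s\n", ok ? "ok" : "failed");
    return ok ? 0 : 1;
}
```

If the original __shfl call could be reached by only part of a warp, a full mask would no longer be appropriate, which is why the commit message says the replacement "should be" equivalent rather than that it always is.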