Found 179 repositories (showing 30)
hunterlew
CNN acceleration on a Virtex-7 FPGA with Verilog HDL
mit-han-lab
[MLSys 2021] IOS: Inter-Operator Scheduler for CNN Acceleration
tirumalnaidu
OpenCL HLS based CNN Accelerator on Intel DE10 Nano FPGA.
atrifex
Implementing CNN code in CUDA and OpenCL to evaluate its performance on NVIDIA GPUs, AMD GPUs, and an FPGA platform.
AlexMontgomerie
SAMO: Streaming Architecture Mapping Optimisation
Beryex
RL-Pruner: Structured Pruning Using Reinforcement Learning for CNN Compression and Acceleration
yhinai
RISC-V vector and tensor compute extensions for Vortex GPGPU acceleration for ML workloads. Optimized for transformer models, CNNs, and generative AI with configurable precision (FP32/16/BF16/INT8).
JingyangXiang
PyTorch implementation of our paper "SUBP: Soft Uniform Block Pruning for 1xN Sparse CNNs Multithreading Acceleration", accepted at NeurIPS 2023.
musco-ai
MUSCO: Multi-Stage COmpression of neural networks
flsgavin
Systolic array acceleration model of CNN, DNN and other networks.
ShahinQazvineh
This is a hybrid CNN-SVM model initially coded for structural damage detection using acceleration data
tianyili2017
Source code for accelerating the ETH-CNN model. A related journal paper was published in IEEE TIP 2020.
Austin-TheTrueShinobi
Research repository for the IS-WiN laboratory at Clemson University. Research conducted includes inference, classification, and GPU (CUDA) CNN acceleration.
dev-the-desai
A high-performance 32-bit RISC-V processor core with an integrated CNN accelerator, implemented on Xilinx Nexys A7 FPGA. This project combines general-purpose computing capabilities with specialized neural network acceleration, optimized for embedded systems and edge computing applications.
JingfeiChang
A method for CNN compression and acceleration.
Developed and implemented a high-performance accelerator for Convolutional Neural Networks (CNNs) on the PYNQ-Z2 FPGA, focusing on optimizing computational efficiency and resource utilization. Conducted performance comparisons between FPGA-based and CPU-based CNN acceleration.
HemantaIngle
In this project, our main aim is to accelerate CNN (Convolutional Neural Network) image recognition with a platform deployable on an FPGA. CNNs are used for image classification, speech recognition, and video analysis. CNNs are usually accelerated on a GPU (Graphics Processing Unit), which is relatively slow and consumes a high amount of power, as a CNN requires 20 GFLOPS/image. CPU acceleration is cheaper, since CPUs are readily available on most x86 machines, but its performance is proportional to power. Modern application-specific integrated circuits (ASICs) and Field Programmable Gate Arrays (FPGAs) offer better power efficiency and faster computation than the GPU. With the FPGA as a reconfigurable base with a parallel architecture, we decided to target CNN acceleration on an FPGA using PipeCNN, an algorithm that is synthesized via HLS (High-Level Synthesis) tools such as Intel's Quartus and the OpenCL toolkit. Modern large-scale FPGAs such as the Stratix 10 and Arria 10 have shown 10 percent lower power consumption than GPUs, with the added advantage of a pipelined parallel architecture and dedicated DSP blocks for faster and more efficient computation. The main goal of the project is to design an OpenCL accelerator that is a generic yet powerful means of improving throughput in inference computations.
missionfission
No description available
dragen1860
Fast, efficient and elegant Android neural network library with strong GPU acceleration. CNN and SqueezeNet are implemented so far.
CRAFT-THU
XB-SIM∗: A simulation framework for modeling and exploration of ReRAM-based CNN acceleration design
Build an accurate digit recognition model using PyTorch. Train a deep learning CNN on the MNIST dataset to classify handwritten digits. GPU acceleration for faster training. Ideal for image recognition enthusiasts.
BGUCompSci
Solution of the Helmholtz equation using multigrid with CNN acceleration.
Keio-CSG
[JSSC'24] Code for CSNR estimation from the paper "A 818–4094 TOPS/W Capacitor-Reconfigured Analog CIM for Unified Acceleration of CNNs and Transformers"
thelakshyadubey
A deepfake face detection system using transfer learning with Xception CNN. Trained on real and fake face datasets using data augmentation, mixed precision, and GPU acceleration. Accurately classifies facial images as real or fake with high confidence. Ideal for media forensics.
tianxiaochen1108
The project is implemented in Python and includes a CNN layer, a BiGRU layer, and a Self-Attention layer. By inputting joint angles and acceleration data from three lower limb joints of ACLR patients across three planes of motion, it can accurately predict knee joint contact forces in patients.
Activity recognition is the problem of identifying a person's behavior from sensor data, such as the accelerometer in a smartphone. It is among the most widely studied topics in machine learning-based classification. The Cooking Activity Recognition Challenge (CARC) asked participants to recognize food preparation using motion capture and acceleration sensors. Two smartphones, two wristbands, and motion-capture equipment were used to collect three-axis (x, y, z) acceleration data and motion data for the CARC dataset. One of the hardest problems in this investigation was breaking complicated tasks down into smaller activities that are part of larger activities. Using a Convolutional Neural Network (CNN) and a Bidirectional LSTM, we built a deep learning approach that extracts dynamical data for macro and micro activity identification. The proposed model reaches a classification accuracy of 83% for macro activities and 85.3% for micro activities.
This repository contains the complete RTL-to-GDSII flow for implementing a systolic array architecture using the SkyWater SKY130 open-source PDK and OpenLane. The project demonstrates the digital ASIC design flow from synthesis to layout, targeting a custom matrix multiplication accelerator core.
eduardo4jesus
Phasor-Driven Acceleration for FFT-based CNNs
ThalesMMS
Multi-Layer Perceptron, CNN and Attention models in Rust with GPU acceleration.
ThalesMMS
Multi-Layer Perceptron, CNN and Attention models in Swift with GPU acceleration.