Found 18 repositories (showing 18)
xmba15
Small C++ library to quickly deploy models using ONNX Runtime
mohamedsamirx
YOLOv12 Inference Using CPP and ONNX Runtime
Mobile-Artificial-Intelligence
Babylon.cpp is a C and C++ library for grapheme-to-phoneme conversion and text-to-speech synthesis. For phonemization, an ONNX Runtime port of the DeepPhonemizer model is used. For speech synthesis, VITS models are used. Piper models are compatible after a conversion script is run.
mohamedsamirx
YOLO V8 and V11 Inference Using CPP and ONNX Runtime
hahahappyboy
Using ONNX Runtime from C++ and Python
ataffe
An example of using YOLO with ONNX Runtime in C++
JPaulDuncan
A pure C# LLM inference engine built from scratch — no Python, no llama.cpp bindings, no ONNX Runtime. SharpInfer loads GGUF and Safetensors models directly, dequantizes weights in managed code, and runs the full transformer forward pass natively on .NET 8.
Choise-ieee
YAMNet for speech classification using C++ and ONNX Runtime; finalist entry in the 2025 Qualcomm Edge Intelligence Innovation Application Competition
Rohithdgrr
RAY AI is a high-performance, private, and secure offline AI assistant for Android. It leverages llama.cpp and ONNX Runtime to provide state-of-the-art LLM inference directly on your device without requiring an internet connection.
sinously
No description available
Ronakdeora
Header-only C++17 RAII wrapper for ONNX Runtime inference
nvduy0511
No description available
ONNX Runtime GPU inference test code
bresilla
No description available
GururajPAthani
Using a YOLO model for object classification with ONNX Runtime in C++
PrashantNeupane33
Depth estimation with the OAK-D Pro in C++ with ONNX Runtime
G-B-KEVIN-ARJUN
"Faster AI: Accelerating Qwen 2.5 from 7 t/s to 82 t/s on a single RTX 4060 using Llama.cpp and ONNX" a comparative analysis of LLM inference runtimes (PyTorch, ONNX, Llama.cpp) on consumer hardware. Benchmarking throughput, latency, and quantization trade-offs to optimize local deployment.
Dyagnosys
A comprehensive benchmarking suite evaluating deep learning inference frameworks (ONNX Runtime, NCNN, MNN, TVM, llama.cpp) across various models and threading configurations. It provides tools for model creation, performance measurement, and result visualization, helping developers and researchers optimize heterogeneous inference workloads.