- ONNX-Ecosystem: includes ONNX Runtime (CPU, Python), dependencies, tools to convert from various frameworks, and Jupyter notebooks to help get started
- Additional dockerfiles
|API|Supported Versions|Samples|
|---|---|---|
|Python|3.5, 3.6, 3.7, 3.8 (3.8 excludes Win GPU and Linux ARM) - see Python Dev Notes|Samples|
|Ruby (external project)|2.4-2.7|Samples|
|CPU|GPU|IoT/Edge/Mobile|Other|
|---|---|---|---|
|Default CPU - MLAS (Microsoft Linear Algebra Subprograms) + Eigen|NVIDIA CUDA|Intel OpenVINO||
|Intel DNNL|NVIDIA TensorRT|ARM Compute Library (preview)|Rockchip NPU (preview)|
|Intel nGraph|DirectML|Android Neural Networks API (preview)|Xilinx Vitis-AI (preview)|
|Intel MKL-ML (build option)|AMD MIGraphX (preview)|ARM-NN (preview)||
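Which of these execution providers is usable at runtime depends on how the package was built. As a minimal sketch (assuming a Python install and a model file at `model.onnx`, a placeholder path), you can query the providers compiled into your build and request a preferred one when creating a session:

```python
import onnxruntime as ort

# List the execution providers compiled into this build, in priority order,
# e.g. ['CUDAExecutionProvider', 'CPUExecutionProvider'] on a GPU build.
print(ort.get_available_providers())

# Prefer CUDA when present, falling back to the default CPU provider.
# "model.onnx" is a placeholder for your own model file.
preferred = [p for p in ("CUDAExecutionProvider", "CPUExecutionProvider")
             if p in ort.get_available_providers()]
session = ort.InferenceSession("model.onnx", providers=preferred)

# Providers actually assigned to this session.
print(session.get_providers())
```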
- ONNX Runtime can be deployed to any cloud for model inference, including Azure Machine Learning Services.
- ONNX Runtime Server (beta) is a hosting application for serving ONNX models using ONNX Runtime, providing a REST API for prediction.
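As an illustrative sketch only (the exact request schema is defined by the server's prediction protobuf; the host, port, model name, and JSON payload below are assumptions rather than the documented contract), a prediction call against a locally running server might look like:

```python
import requests

# Assumed endpoint layout; consult the ONNX Runtime Server docs
# for the exact URL scheme of your deployment.
url = "http://localhost:8001/v1/models/default/versions/1:predict"

# Hypothetical JSON payload; the real schema follows the server's
# prediction protobuf definition.
payload = {
    "inputs": {
        "input": {
            "dims": ["1", "3", "224", "224"],
            "dataType": 1,        # FLOAT
            "rawData": "...",     # placeholder: base64-encoded tensor bytes
        }
    }
}

resp = requests.post(url, json=payload,
                     headers={"Content-Type": "application/json"})
resp.raise_for_status()
print(resp.json())  # outputs keyed by the model's output names
```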
The expanding selection of IoT devices with sensors and constant signal streams creates new opportunities to move AI workloads to the edge. This is particularly important when massive volumes of incoming data or signals are not efficient or useful to push to the cloud due to storage or latency constraints. Consider surveillance footage where 99% is uneventful, or real-time person-detection scenarios where immediate action is required. In these scenarios, executing model inference directly on the target device is crucial.
Install or build the package you need to use in your application. (Sample implementations using the C++ API are available.)
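To verify an installed package, a minimal inference sketch in Python follows; it assumes a model file at `model.onnx` (a placeholder path) whose single input is a fixed-shape float32 tensor:

```python
import numpy as np
import onnxruntime as ort

# Load the model; "model.onnx" is a placeholder for your own model file.
session = ort.InferenceSession("model.onnx")

# Inspect the model's first input. For models with dynamic dimensions,
# meta.shape contains symbolic names and you must pick concrete sizes.
meta = session.get_inputs()[0]

# Build a random input of the declared shape (assumed fully specified here).
x = np.random.rand(*meta.shape).astype(np.float32)

# Passing None as the output list returns every model output.
outputs = session.run(None, {meta.name: x})
print([o.shape for o in outputs])
```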
For production scenarios, it’s strongly recommended to build only from an official release branch.