Accelerated Edge Machine Learning

Production-grade AI engine to speed up training and inferencing in your existing technology stack.

In a rush? Get started easily:

pip install onnxruntime

pip install onnxruntime-genai

Interested in using other languages? See the many others we support →

Generative AI

Integrate the power of Generative AI and Large Language Models (LLMs) into your apps and services with ONNX Runtime. No matter what language you develop in or what platform you need to run on, you can make use of state-of-the-art models for image synthesis, text generation, and more.

Learn more about ONNX Runtime & Generative AI →

Use ONNX Runtime with your favorite language and get started with the tutorials:

Quickstart Tutorials Install ONNX Runtime Hardware acceleration Get started (Docs)

Python

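import numpy as np
import onnxruntime as ort

# Load the model and create an InferenceSession
model_path = "path/to/your/onnx/model"
session = ort.InferenceSession(model_path)

# Load and preprocess the input image into input_tensor.
# Placeholder only: the shape and dtype here are assumptions; match your model.
input_tensor = np.random.rand(1, 3, 224, 224).astype(np.float32)

# Run inference, using the model's declared input name instead of assuming "input"
input_name = session.get_inputs()[0].name
outputs = session.run(None, {input_name: input_tensor})
print(outputs)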

Python Docs

Videos

Check out some of our videos to help you get started!

What is ONNX Runtime (ORT)?

Converting Models to ONNX Format

Optimize Training and Inference with ONNX Runtime (ACPT/DeepSpeed)

ONNX Runtime YouTube channel →

Cross-Platform

Do you program in Python? C#? C++? Java? JavaScript? Rust? No problem. ONNX Runtime has you covered with support for many languages. And it runs on Linux, Windows, Mac, iOS, Android, and even in web browsers.

Performance

CPU, GPU, or NPU: whatever hardware you run on, ONNX Runtime optimizes for latency, throughput, memory utilization, and binary size. Beyond excellent out-of-the-box performance for common usage patterns, model optimization techniques and runtime configurations are available to further improve performance for specific use cases and models.
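As a rough illustration of the runtime configurations mentioned above, here is a minimal Python sketch that picks an execution provider and turns on graph optimizations through SessionOptions. The helper names (choose_providers, create_tuned_session) and the provider preference order are assumptions made for this example, not part of ONNX Runtime; the right settings depend on your model and hardware.

```python
from typing import List, Optional, Sequence

# Preference order is an assumption for illustration; tune it for your hardware.
PREFERRED_PROVIDERS = (
    "CUDAExecutionProvider",
    "CoreMLExecutionProvider",
    "CPUExecutionProvider",
)


def choose_providers(available: Sequence[str],
                     preferred: Sequence[str] = PREFERRED_PROVIDERS) -> List[str]:
    """Return the preferred providers that are actually available, in order.

    Falls back to the CPU provider if none of the preferred ones is present.
    """
    chosen = [p for p in preferred if p in available]
    return chosen or ["CPUExecutionProvider"]


def create_tuned_session(model_path: str, intra_op_threads: Optional[int] = None):
    """Create an InferenceSession with graph optimizations requested explicitly."""
    # Imported lazily so choose_providers can be used without onnxruntime installed.
    import onnxruntime as ort

    options = ort.SessionOptions()
    # Request all graph-level optimizations when the model is loaded.
    options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL
    if intra_op_threads is not None:
        # Number of threads used to parallelize work inside a single operator.
        options.intra_op_num_threads = intra_op_threads

    providers = choose_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, sess_options=options, providers=providers)
```

A call such as create_tuned_session("path/to/your/onnx/model", intra_op_threads=4) would then return a session configured this way. Whether these settings actually help is something to measure on your own workload.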

ONNX Runtime Inferencing

ONNX Runtime powers AI in Microsoft products including Windows, Office, Azure Cognitive Services, and Bing, as well as in thousands of other projects across the world. ONNX Runtime is cross-platform, supporting cloud, edge, web, and mobile experiences.

Learn more about ONNX Runtime Inferencing →

Web Browsers

Run PyTorch and other ML models in the web browser with ONNX Runtime Web.

Mobile Devices

Infuse your Android and iOS mobile apps with AI using ONNX Runtime Mobile.

ONNX Runtime Training

ONNX Runtime reduces costs for large model training and enables on-device training.

Learn more about ONNX Runtime Training →

Large Model Training

Accelerate training of popular models, including Hugging Face models like Llama-2-7b and curated models from the Azure AI | Machine Learning Studio model catalog.

On-Device Training

On-device training with ONNX Runtime lets developers take an inference model and train it locally to deliver a more personalized and privacy-respecting experience for customers.
