# ONNX Runtime

## Docs

- [Custom Operators](https://mintlify.wiki/microsoft/onnxruntime/advanced/custom-operators.md): Create and register custom operators in ONNX Runtime
- [Mobile Deployment](https://mintlify.wiki/microsoft/onnxruntime/advanced/mobile-deployment.md): Deploy ONNX models on Android and iOS devices with ONNX Runtime
- [ORT Format Models](https://mintlify.wiki/microsoft/onnxruntime/advanced/ort-format.md): Optimized model format for ONNX Runtime deployment
- [Server Deployment](https://mintlify.wiki/microsoft/onnxruntime/advanced/server-deployment.md): Deploy ONNX models with ONNX Runtime Server for production inference
- [WebAssembly Deployment](https://mintlify.wiki/microsoft/onnxruntime/advanced/webassembly.md): Deploy ONNX models in browsers and web applications using WebAssembly
- [OrtApi Structure Reference](https://mintlify.wiki/microsoft/onnxruntime/api/c-cpp/ort-api.md): Complete reference for the ONNX Runtime C API structure and functions
- [C/C++ API Overview](https://mintlify.wiki/microsoft/onnxruntime/api/c-cpp/overview.md): Overview of ONNX Runtime C and C++ APIs for model inference
- [Execution Providers in C/C++](https://mintlify.wiki/microsoft/onnxruntime/api/c-cpp/providers.md): Configuring GPU acceleration and specialized hardware execution providers
- [Session Management](https://mintlify.wiki/microsoft/onnxruntime/api/c-cpp/session.md): Loading models and running inference with OrtSession
- [Tensor Operations](https://mintlify.wiki/microsoft/onnxruntime/api/c-cpp/tensors.md): Creating and manipulating tensors with OrtValue
- [InferenceSession Class](https://mintlify.wiki/microsoft/onnxruntime/api/csharp/inference-session.md): C# InferenceSession API reference for running ONNX models
- [SessionOptions Class](https://mintlify.wiki/microsoft/onnxruntime/api/csharp/session-options.md): C# SessionOptions API reference for configuring ONNX Runtime sessions
- [Tensor Operations](https://mintlify.wiki/microsoft/onnxruntime/api/csharp/tensors.md): C# Tensor API reference for working with multi-dimensional arrays in ONNX Runtime
- [OrtEnvironment Class](https://mintlify.wiki/microsoft/onnxruntime/api/java/ort-environment.md): Java OrtEnvironment API reference for managing ONNX Runtime resources
- [OrtSession Class](https://mintlify.wiki/microsoft/onnxruntime/api/java/ort-session.md): Java OrtSession API reference for running ONNX model inference
- [Java API Overview](https://mintlify.wiki/microsoft/onnxruntime/api/java/overview.md): Overview of ONNX Runtime Java API for model inference
- [InferenceSession (JavaScript)](https://mintlify.wiki/microsoft/onnxruntime/api/javascript/inference-session.md): JavaScript InferenceSession API for running ONNX models in browsers and Node.js
- [Node.js-Specific APIs](https://mintlify.wiki/microsoft/onnxruntime/api/javascript/node.md): ONNX Runtime Node.js-specific APIs and configurations
- [Tensor Class (JavaScript)](https://mintlify.wiki/microsoft/onnxruntime/api/javascript/tensor.md): JavaScript Tensor API for creating and manipulating multi-dimensional arrays
- [Web-Specific APIs](https://mintlify.wiki/microsoft/onnxruntime/api/javascript/web.md): ONNX Runtime Web-specific APIs and configurations for browser environments
- [InferenceSession](https://mintlify.wiki/microsoft/onnxruntime/api/python/inference-session.md): Main class for running ONNX models with ONNX Runtime
- [IOBinding](https://mintlify.wiki/microsoft/onnxruntime/api/python/io-binding.md): Bind inputs and outputs to device memory for zero-copy inference
- [Execution Providers](https://mintlify.wiki/microsoft/onnxruntime/api/python/providers.md): Hardware acceleration with execution providers
- [Quantization API](https://mintlify.wiki/microsoft/onnxruntime/api/python/quantization.md): Model quantization for reduced size and improved performance
- [RunOptions](https://mintlify.wiki/microsoft/onnxruntime/api/python/run-options.md): Configuration options for individual inference runs
- [SessionOptions](https://mintlify.wiki/microsoft/onnxruntime/api/python/session-options.md): Configuration options for InferenceSession
- [Transformers Tools](https://mintlify.wiki/microsoft/onnxruntime/api/python/transformers.md): Optimization tools for transformer models (BERT, GPT, T5, etc.)
- [Execution Providers](https://mintlify.wiki/microsoft/onnxruntime/concepts/execution-providers.md): Understanding execution providers and hardware acceleration in ONNX Runtime
- [Graph Optimization Techniques](https://mintlify.wiki/microsoft/onnxruntime/concepts/graph-optimizations.md): Understanding graph transformations and optimization levels in ONNX Runtime
- [ONNX Model Format](https://mintlify.wiki/microsoft/onnxruntime/concepts/onnx-format.md): Understanding the ONNX format and how models are represented
- [Core Concepts Overview](https://mintlify.wiki/microsoft/onnxruntime/concepts/overview.md): Fundamental concepts and architecture of ONNX Runtime
- [InferenceSession and Session Management](https://mintlify.wiki/microsoft/onnxruntime/concepts/sessions.md): Deep dive into InferenceSession configuration and lifecycle management
- [CoreML Execution Provider](https://mintlify.wiki/microsoft/onnxruntime/execution-providers/coreml.md): Accelerate ONNX models on Apple devices using the CoreML Execution Provider
- [CUDA Execution Provider](https://mintlify.wiki/microsoft/onnxruntime/execution-providers/cuda.md): Accelerate ONNX models on NVIDIA GPUs using the CUDA Execution Provider
- [DirectML Execution Provider](https://mintlify.wiki/microsoft/onnxruntime/execution-providers/directml.md): Enable GPU acceleration on Windows with DirectML for cross-vendor hardware support
- [OpenVINO Execution Provider](https://mintlify.wiki/microsoft/onnxruntime/execution-providers/openvino.md): Optimize ONNX models for Intel hardware with the OpenVINO Execution Provider
- [Execution Providers Overview](https://mintlify.wiki/microsoft/onnxruntime/execution-providers/overview.md): Learn about ONNX Runtime execution providers and how to choose the right one for your hardware
- [QNN Execution Provider](https://mintlify.wiki/microsoft/onnxruntime/execution-providers/qnn.md): Optimize inference on Qualcomm hardware with the QNN Execution Provider
- [TensorRT Execution Provider](https://mintlify.wiki/microsoft/onnxruntime/execution-providers/tensorrt.md): Achieve maximum performance on NVIDIA GPUs with the TensorRT Execution Provider
- [WebGPU Execution Provider](https://mintlify.wiki/microsoft/onnxruntime/execution-providers/webgpu.md): Enable GPU-accelerated inference in web browsers using the WebGPU Execution Provider
- [C/C++ Inference API](https://mintlify.wiki/microsoft/onnxruntime/inference/c-cpp-api.md): High-performance ONNX model inference using the C++ API with real code examples
- [C# Inference API](https://mintlify.wiki/microsoft/onnxruntime/inference/csharp-api.md): Run ONNX model inference in .NET applications with C# API examples
- [Java Inference API](https://mintlify.wiki/microsoft/onnxruntime/inference/java-api.md): Run ONNX model inference in Java and Android applications with complete API examples
- [JavaScript Inference API](https://mintlify.wiki/microsoft/onnxruntime/inference/javascript-api.md): Run ONNX models in web browsers and Node.js with the JavaScript API
- [Model Optimization for Inference](https://mintlify.wiki/microsoft/onnxruntime/inference/model-optimization.md): Optimize ONNX models for production deployment with quantization, graph optimization, and profiling
- [Inference Overview](https://mintlify.wiki/microsoft/onnxruntime/inference/overview.md): Learn how to run inference with ONNX Runtime across different programming languages and platforms
- [Python Inference API](https://mintlify.wiki/microsoft/onnxruntime/inference/python-api.md): Complete guide to running ONNX model inference in Python with real code examples
- [Installation](https://mintlify.wiki/microsoft/onnxruntime/installation.md): Install ONNX Runtime on various platforms and programming languages including Python, C/C++, C#, Java, and JavaScript
- [Introduction](https://mintlify.wiki/microsoft/onnxruntime/introduction.md): High-performance ML inferencing and training accelerator for deep learning models
- [Converting PyTorch Models to ONNX](https://mintlify.wiki/microsoft/onnxruntime/model-conversion/pytorch.md): Learn how to convert PyTorch models to ONNX format using torch.onnx.export with detailed examples and best practices
- [Model Quantization Guide](https://mintlify.wiki/microsoft/onnxruntime/model-conversion/quantization.md): Comprehensive guide to quantizing ONNX models for improved performance and reduced model size using ONNX Runtime quantization tools
- [Converting scikit-learn Models to ONNX](https://mintlify.wiki/microsoft/onnxruntime/model-conversion/scikit-learn.md): Learn how to convert scikit-learn machine learning models to ONNX format using skl2onnx for production deployment with ONNX Runtime
- [Converting TensorFlow Models to ONNX](https://mintlify.wiki/microsoft/onnxruntime/model-conversion/tensorflow.md): Learn how to convert TensorFlow and Keras models to ONNX format using tf2onnx with practical examples and optimization techniques
- [Benchmarking Models](https://mintlify.wiki/microsoft/onnxruntime/performance/benchmarking.md): Guide to benchmarking ONNX models in ONNX Runtime for accurate performance measurement and comparison
- [Memory Optimization](https://mintlify.wiki/microsoft/onnxruntime/performance/memory-optimization.md): Techniques for optimizing memory usage in ONNX Runtime including memory patterns, arena allocation, and recomputation strategies
- [Threading and Parallelism](https://mintlify.wiki/microsoft/onnxruntime/performance/threading.md): Guide to configuring thread pools, parallelism, and concurrency in ONNX Runtime for optimal performance
- [Performance Tuning Guide](https://mintlify.wiki/microsoft/onnxruntime/performance/tuning.md): Comprehensive guide to tuning ONNX Runtime for optimal performance including session options, execution providers, and optimization techniques
- [Quickstart](https://mintlify.wiki/microsoft/onnxruntime/quickstart.md): Get started with ONNX Runtime in minutes: load models, prepare inputs, run inference, and process outputs across multiple programming languages
- [Distributed Training](https://mintlify.wiki/microsoft/onnxruntime/training/distributed-training.md): Scale ONNX Runtime training across multiple GPUs and nodes with DeepSpeed, PyTorch DDP, and other distributed frameworks
- [On-Device Training](https://mintlify.wiki/microsoft/onnxruntime/training/on-device-training.md): Train ONNX models on edge devices and mobile platforms with the lightweight ONNX Runtime Training API
- [ORTModule](https://mintlify.wiki/microsoft/onnxruntime/training/ortmodule.md): Accelerate PyTorch model training with ORTModule, a drop-in wrapper for torch.nn.Module
- [Training Overview](https://mintlify.wiki/microsoft/onnxruntime/training/overview.md): Accelerate PyTorch model training with ONNX Runtime's high-performance training capabilities