# ONNX Runtime

## Docs

- [Custom Operators](https://mintlify.wiki/microsoft/onnxruntime/advanced/custom-operators.md): Create and register custom operators in ONNX Runtime
- [Mobile Deployment](https://mintlify.wiki/microsoft/onnxruntime/advanced/mobile-deployment.md): Deploy ONNX Runtime models on Android and iOS devices
- [ORT Format Models](https://mintlify.wiki/microsoft/onnxruntime/advanced/ort-format.md): Optimized model format for ONNX Runtime deployment
- [Server Deployment](https://mintlify.wiki/microsoft/onnxruntime/advanced/server-deployment.md): Deploy ONNX Runtime models with ONNX Runtime Server for production inference
- [WebAssembly Deployment](https://mintlify.wiki/microsoft/onnxruntime/advanced/webassembly.md): Deploy ONNX Runtime models in browsers and web applications using WebAssembly
- [OrtApi Structure Reference](https://mintlify.wiki/microsoft/onnxruntime/api/c-cpp/ort-api.md): Complete reference for the ONNX Runtime C API structure and functions
- [C/C++ API Overview](https://mintlify.wiki/microsoft/onnxruntime/api/c-cpp/overview.md): Overview of ONNX Runtime C and C++ APIs for model inference
- [Execution Providers in C/C++](https://mintlify.wiki/microsoft/onnxruntime/api/c-cpp/providers.md): Configuring GPU acceleration and specialized hardware execution providers
- [Session Management](https://mintlify.wiki/microsoft/onnxruntime/api/c-cpp/session.md): Loading models and running inference with OrtSession
- [Tensor Operations](https://mintlify.wiki/microsoft/onnxruntime/api/c-cpp/tensors.md): Creating and manipulating tensors with OrtValue
- [InferenceSession Class](https://mintlify.wiki/microsoft/onnxruntime/api/csharp/inference-session.md): C# InferenceSession API reference for running ONNX models
- [SessionOptions Class](https://mintlify.wiki/microsoft/onnxruntime/api/csharp/session-options.md): C# SessionOptions API reference for configuring ONNX Runtime sessions
- [Tensor Operations](https://mintlify.wiki/microsoft/onnxruntime/api/csharp/tensors.md): C# Tensor API reference for working with multi-dimensional arrays in ONNX Runtime
- [OrtEnvironment Class](https://mintlify.wiki/microsoft/onnxruntime/api/java/ort-environment.md): Java OrtEnvironment API reference for managing ONNX Runtime resources
- [OrtSession Class](https://mintlify.wiki/microsoft/onnxruntime/api/java/ort-session.md): Java OrtSession API reference for running ONNX model inference
- [Java API Overview](https://mintlify.wiki/microsoft/onnxruntime/api/java/overview.md): Overview of ONNX Runtime Java API for model inference
- [InferenceSession (JavaScript)](https://mintlify.wiki/microsoft/onnxruntime/api/javascript/inference-session.md): JavaScript InferenceSession API for running ONNX models in browsers and Node.js
- [Node.js-Specific APIs](https://mintlify.wiki/microsoft/onnxruntime/api/javascript/node.md): ONNX Runtime Node.js-specific APIs and configurations
- [Tensor Class (JavaScript)](https://mintlify.wiki/microsoft/onnxruntime/api/javascript/tensor.md): JavaScript Tensor API for creating and manipulating multi-dimensional arrays
- [Web-Specific APIs](https://mintlify.wiki/microsoft/onnxruntime/api/javascript/web.md): ONNX Runtime Web-specific APIs and configurations for browser environments
- [InferenceSession](https://mintlify.wiki/microsoft/onnxruntime/api/python/inference-session.md): Main class for running ONNX models with ONNX Runtime
- [IOBinding](https://mintlify.wiki/microsoft/onnxruntime/api/python/io-binding.md): Bind inputs and outputs to device memory for zero-copy inference
- [Execution Providers](https://mintlify.wiki/microsoft/onnxruntime/api/python/providers.md): Hardware acceleration with execution providers
- [Quantization API](https://mintlify.wiki/microsoft/onnxruntime/api/python/quantization.md): Model quantization for reduced size and improved performance
- [RunOptions](https://mintlify.wiki/microsoft/onnxruntime/api/python/run-options.md): Configuration options for individual inference runs
- [SessionOptions](https://mintlify.wiki/microsoft/onnxruntime/api/python/session-options.md): Configuration options for InferenceSession
- [Transformers Tools](https://mintlify.wiki/microsoft/onnxruntime/api/python/transformers.md): Optimization tools for transformer models (BERT, GPT, T5, etc.)
- [Execution Providers](https://mintlify.wiki/microsoft/onnxruntime/concepts/execution-providers.md): Understanding execution providers and hardware acceleration in ONNX Runtime
- [Graph Optimization Techniques](https://mintlify.wiki/microsoft/onnxruntime/concepts/graph-optimizations.md): Understanding graph transformations and optimization levels in ONNX Runtime
- [ONNX Model Format](https://mintlify.wiki/microsoft/onnxruntime/concepts/onnx-format.md): Understanding the ONNX format and how models are represented
- [Core Concepts Overview](https://mintlify.wiki/microsoft/onnxruntime/concepts/overview.md): Fundamental concepts and architecture of ONNX Runtime
- [InferenceSession and Session Management](https://mintlify.wiki/microsoft/onnxruntime/concepts/sessions.md): Deep dive into InferenceSession configuration and lifecycle management
- [CoreML Execution Provider](https://mintlify.wiki/microsoft/onnxruntime/execution-providers/coreml.md): Accelerate ONNX models on Apple devices using the CoreML Execution Provider
- [CUDA Execution Provider](https://mintlify.wiki/microsoft/onnxruntime/execution-providers/cuda.md): Accelerate ONNX models on NVIDIA GPUs using the CUDA Execution Provider
- [DirectML Execution Provider](https://mintlify.wiki/microsoft/onnxruntime/execution-providers/directml.md): Enable GPU acceleration on Windows with DirectML for cross-vendor hardware support
- [OpenVINO Execution Provider](https://mintlify.wiki/microsoft/onnxruntime/execution-providers/openvino.md): Optimize ONNX models for Intel hardware with the OpenVINO Execution Provider
- [Execution Providers Overview](https://mintlify.wiki/microsoft/onnxruntime/execution-providers/overview.md): Learn about ONNX Runtime execution providers and how to choose the right one for your hardware
- [QNN Execution Provider](https://mintlify.wiki/microsoft/onnxruntime/execution-providers/qnn.md): Optimize inference on Qualcomm hardware with the QNN Execution Provider
- [TensorRT Execution Provider](https://mintlify.wiki/microsoft/onnxruntime/execution-providers/tensorrt.md): Achieve maximum performance on NVIDIA GPUs with the TensorRT Execution Provider
- [WebGPU Execution Provider](https://mintlify.wiki/microsoft/onnxruntime/execution-providers/webgpu.md): Enable GPU-accelerated inference in web browsers using the WebGPU Execution Provider
- [C/C++ Inference API](https://mintlify.wiki/microsoft/onnxruntime/inference/c-cpp-api.md): High-performance ONNX model inference using the C++ API with real code examples
- [C# Inference API](https://mintlify.wiki/microsoft/onnxruntime/inference/csharp-api.md): Run ONNX model inference in .NET applications with C# API examples
- [Java Inference API](https://mintlify.wiki/microsoft/onnxruntime/inference/java-api.md): Run ONNX model inference in Java and Android applications with complete API examples
- [JavaScript Inference API](https://mintlify.wiki/microsoft/onnxruntime/inference/javascript-api.md): Run ONNX models in web browsers and Node.js with the JavaScript API
- [Model Optimization for Inference](https://mintlify.wiki/microsoft/onnxruntime/inference/model-optimization.md): Optimize ONNX models for production deployment with quantization, graph optimization, and profiling
- [Inference Overview](https://mintlify.wiki/microsoft/onnxruntime/inference/overview.md): Learn how to run inference with ONNX Runtime across different programming languages and platforms
- [Python Inference API](https://mintlify.wiki/microsoft/onnxruntime/inference/python-api.md): Complete guide to running ONNX model inference in Python with real code examples
- [Installation](https://mintlify.wiki/microsoft/onnxruntime/installation.md): Install ONNX Runtime on various platforms and programming languages including Python, C/C++, C#, Java, and JavaScript
- [Introduction](https://mintlify.wiki/microsoft/onnxruntime/introduction.md): High-performance ML inferencing and training accelerator for deep learning models
- [Converting PyTorch Models to ONNX](https://mintlify.wiki/microsoft/onnxruntime/model-conversion/pytorch.md): Learn how to convert PyTorch models to ONNX format using torch.onnx.export with detailed examples and best practices.
- [Model Quantization Guide](https://mintlify.wiki/microsoft/onnxruntime/model-conversion/quantization.md): Comprehensive guide to quantizing ONNX models for improved performance and reduced model size using ONNX Runtime quantization tools.
- [Converting scikit-learn Models to ONNX](https://mintlify.wiki/microsoft/onnxruntime/model-conversion/scikit-learn.md): Learn how to convert scikit-learn machine learning models to ONNX format using skl2onnx for production deployment with ONNX Runtime.
- [Converting TensorFlow Models to ONNX](https://mintlify.wiki/microsoft/onnxruntime/model-conversion/tensorflow.md): Learn how to convert TensorFlow and Keras models to ONNX format using tf2onnx with practical examples and optimization techniques.
- [Benchmarking Models](https://mintlify.wiki/microsoft/onnxruntime/performance/benchmarking.md): Guide to benchmarking ONNX Runtime models for accurate performance measurement and comparison
- [Memory Optimization](https://mintlify.wiki/microsoft/onnxruntime/performance/memory-optimization.md): Techniques for optimizing memory usage in ONNX Runtime including memory patterns, arena allocation, and recomputation strategies
- [Threading and Parallelism](https://mintlify.wiki/microsoft/onnxruntime/performance/threading.md): Guide to configuring thread pools, parallelism, and concurrency in ONNX Runtime for optimal performance
- [Performance Tuning Guide](https://mintlify.wiki/microsoft/onnxruntime/performance/tuning.md): Comprehensive guide to tuning ONNX Runtime for optimal performance including session options, execution providers, and optimization techniques
- [Quickstart](https://mintlify.wiki/microsoft/onnxruntime/quickstart.md): Get started with ONNX Runtime in minutes. Learn how to load models, prepare inputs, run inference, and process outputs across multiple programming languages.
- [Distributed Training](https://mintlify.wiki/microsoft/onnxruntime/training/distributed-training.md): Scale ONNX Runtime training across multiple GPUs and nodes with DeepSpeed, PyTorch DDP, and other distributed frameworks
- [On-Device Training](https://mintlify.wiki/microsoft/onnxruntime/training/on-device-training.md): Train ONNX models on edge devices and mobile platforms with the lightweight ONNX Runtime Training API
- [ORTModule](https://mintlify.wiki/microsoft/onnxruntime/training/ortmodule.md): Accelerate PyTorch model training with ORTModule - a drop-in replacement for torch.nn.Module
- [Training Overview](https://mintlify.wiki/microsoft/onnxruntime/training/overview.md): Accelerate PyTorch model training with ONNX Runtime's high-performance training capabilities