The InferenceSession class is the main entry point for running inference with ONNX models in C#.
Namespace
Microsoft.ML.OnnxRuntime
Class Declaration
public class InferenceSession : IDisposable
Constructors
InferenceSession(string)
Constructs an InferenceSession from a model file.
public InferenceSession(string modelPath)
Parameters:
modelPath (string): Path to the ONNX or ORT model file
Example:
using Microsoft.ML.OnnxRuntime;
var session = new InferenceSession("model.onnx");
InferenceSession(string, SessionOptions)
Constructs an InferenceSession with custom session options.
public InferenceSession(string modelPath, SessionOptions options)
Parameters:
modelPath (string): Path to the model file
options (SessionOptions): Session configuration options
Example:
var options = new SessionOptions();
options.GraphOptimizationLevel = GraphOptimizationLevel.ORT_ENABLE_ALL;
var session = new InferenceSession("model.onnx", options);
InferenceSession(byte[])
Constructs an InferenceSession from a model in a byte array.
public InferenceSession(byte[] model)
Parameters:
model (byte[]): ONNX model as byte array
Example:
using System.IO;
using Microsoft.ML.OnnxRuntime;
byte[] modelBytes = File.ReadAllBytes("model.onnx");
var session = new InferenceSession(modelBytes);
InferenceSession(byte[], SessionOptions)
Constructs an InferenceSession from bytes with custom options.
public InferenceSession(byte[] model, SessionOptions options)
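A minimal sketch of this overload, assuming a model file named model.onnx is available on disk (the file name is a placeholder). It reads the model into memory, applies the same optimization level used in the earlier example, and disposes both the options and the session when done.

```csharp
using System;
using System.IO;
using Microsoft.ML.OnnxRuntime;

// "model.onnx" is a placeholder path, not a file shipped with the library.
byte[] modelBytes = File.ReadAllBytes("model.onnx");

using (var options = new SessionOptions())
{
    options.GraphOptimizationLevel = GraphOptimizationLevel.ORT_ENABLE_ALL;

    // The session is built from the in-memory bytes with the options applied.
    using (var session = new InferenceSession(modelBytes, options))
    {
        Console.WriteLine($"Model has {session.InputMetadata.Count} input(s)");
    }
}
```

Loading from bytes is useful when the model does not live on the file system, for example when it is an embedded resource or is downloaded at startup.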
InferenceSession(string, SessionOptions, PrePackedWeightsContainer)
Constructs a session that shares pre-packed weights across multiple sessions.
public InferenceSession(
    string modelPath,
    SessionOptions options,
    PrePackedWeightsContainer prepackedWeightsContainer)
Parameters:
modelPath (string): Path to the model
options (SessionOptions): Session options
prepackedWeightsContainer (PrePackedWeightsContainer): Shared weights container
Example:
var container = new PrePackedWeightsContainer();
var options = new SessionOptions();
var session1 = new InferenceSession("model.onnx", options, container);
var session2 = new InferenceSession("model.onnx", options, container);
// Both sessions share pre-packed weights
Properties
InputMetadata
Gets metadata for input nodes.
public IReadOnlyDictionary<string, NodeMetadata> InputMetadata { get; }
Example:
foreach (var input in session.InputMetadata)
{
    Console.WriteLine($"Input: {input.Key}");
    Console.WriteLine($"  Type: {input.Value.ElementDataType}");
    Console.WriteLine($"  Shape: [{string.Join(", ", input.Value.Dimensions)}]");
}
InputNames
Gets the ordered list of input names.
public IReadOnlyList<string> InputNames { get; }
Example:
var inputName = session.InputNames[0];
Console.WriteLine($"First input: {inputName}");
OutputMetadata
Gets metadata for output nodes.
public IReadOnlyDictionary<string, NodeMetadata> OutputMetadata { get; }
OutputNames
Gets the ordered list of output names.
public IReadOnlyList<string> OutputNames { get; }
OverridableInitializerMetadata
Gets metadata for overridable initializers.
public IReadOnlyDictionary<string, NodeMetadata> OverridableInitializerMetadata { get; }
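The three metadata properties share the same dictionary type, so one helper can print any of them. The sketch below is illustrative: the class and method names (ModelInfo, Describe) are hypothetical, and it assumes NodeMetadata exposes IsTensor so that non-tensor values (maps, sequences) can be skipped before reading tensor-specific fields.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML.OnnxRuntime;

static class ModelInfo
{
    // Hypothetical helper: prints name, element type, and shape for each entry.
    public static void Describe(string title, IReadOnlyDictionary<string, NodeMetadata> metadata)
    {
        Console.WriteLine($"{title} ({metadata.Count}):");
        foreach (var entry in metadata)
        {
            if (!entry.Value.IsTensor)
            {
                // Maps and sequences have no single element type or shape.
                Console.WriteLine($"  {entry.Key}: non-tensor value");
                continue;
            }

            // Dimensions that are dynamic in the model are reported as -1.
            string shape = string.Join(", ", entry.Value.Dimensions);
            Console.WriteLine($"  {entry.Key}: {entry.Value.ElementDataType} [{shape}]");
        }
    }
}
```

With a session in hand, this could be called as ModelInfo.Describe("Outputs", session.OutputMetadata) or ModelInfo.Describe("Overridable initializers", session.OverridableInitializerMetadata).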
Methods
Run
Runs inference on the model.
Run(IReadOnlyCollection<NamedOnnxValue>)
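public IDisposableReadOnlyCollection<DisposableNamedOnnxValue> Run(
    IReadOnlyCollection<NamedOnnxValue> inputs)
Parameters:
inputs (IReadOnlyCollection<NamedOnnxValue>): Collection of named input values
Returns: A disposable collection of named output values. Dispose it when finished to release the native output buffers.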
Example:
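using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML.OnnxRuntime;
using Microsoft.ML.OnnxRuntime.Tensors;
// Create a 1x4 input tensor
var inputData = new float[] { 1.0f, 2.0f, 3.0f, 4.0f };
var tensor = new DenseTensor<float>(inputData, new[] { 1, 4 });
var inputs = new List<NamedOnnxValue>
{
    // "input" must match an input name in the model; see session.InputNames
    NamedOnnxValue.CreateFromTensor("input", tensor)
};
// Run inference; disposing the results releases the native output buffers
using (var results = session.Run(inputs))
{
    var output = results.First().AsTensor<float>();
    Console.WriteLine("Output: " + string.Join(", ", output.ToArray()));
}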
Run(IReadOnlyCollection<NamedOnnxValue>, IReadOnlyCollection<string>)
Runs inference with specific output names.
public IDisposableReadOnlyCollection<DisposableNamedOnnxValue> Run(
    IReadOnlyCollection<NamedOnnxValue> inputs,
    IReadOnlyCollection<string> outputNames)
Example:
var outputNames = new[] { "output1", "output2" };
using (var results = session.Run(inputs, outputNames))
{
    foreach (var result in results)
    {
        Console.WriteLine($"{result.Name}: {result.AsTensor<float>().GetValue(0)}");
    }
}
Run(IReadOnlyCollection<NamedOnnxValue>, IReadOnlyCollection<string>, RunOptions)
Runs inference with specific output names and custom run options.
public IDisposableReadOnlyCollection<DisposableNamedOnnxValue> Run(
    IReadOnlyCollection<NamedOnnxValue> inputs,
    IReadOnlyCollection<string> outputNames,
    RunOptions options)
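A sketch of passing RunOptions to a single run, assuming the overload that takes inputs, output names, and a RunOptions instance. The class and method names (TaggedRun, RunWithLogId) and the idea of tagging each call with a request identifier are illustrative assumptions; LogId is the RunOptions property used to label log lines for that run.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML.OnnxRuntime;

static class TaggedRun
{
    // Hypothetical helper: runs the model once and returns the first output as a flat array.
    public static float[] RunWithLogId(
        InferenceSession session,
        IReadOnlyCollection<NamedOnnxValue> inputs,
        string requestId)
    {
        using (var runOptions = new RunOptions())
        {
            // Tag log output produced by this call so it can be correlated with the request.
            runOptions.LogId = requestId;

            // Request every model output; pass a subset of names to fetch fewer.
            using (var results = session.Run(inputs, session.OutputNames, runOptions))
            {
                return results.First().AsTensor<float>().ToArray();
            }
        }
    }
}
```

RunOptions is disposable, so it is scoped to the call here; an application that uses the same settings for every run could keep one instance alive instead.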
RunAsync
Asynchronously runs inference.
public Task<IDisposableReadOnlyCollection<DisposableNamedOnnxValue>> RunAsync(
IReadOnlyCollection<NamedOnnxValue> inputs)
Example:
var results = await session.RunAsync(inputs);
using (results)
{
// Process results
}
GetMemoryInfosForInputs
Gets memory information for all inputs.
public IDisposableReadOnlyCollection<OrtMemoryInfo> GetMemoryInfosForInputs()
GetMemoryInfosForOutputs
Gets memory information for all outputs.
public IDisposableReadOnlyCollection<OrtMemoryInfo> GetMemoryInfosForOutputs()
Dispose
Releases resources used by the session.
public void Dispose()
Best Practice:
using (var session = new InferenceSession("model.onnx"))
{
// Use session
}
// Session is automatically disposed
Complete Example
Image Classification
using Microsoft.ML.OnnxRuntime;
using Microsoft.ML.OnnxRuntime.Tensors;
using System;
using System.Collections.Generic;
using System.Linq;
// Implements IDisposable so the classifier can be used in a using statement
class ImageClassifier : IDisposable
{
    private readonly InferenceSession _session;
    public ImageClassifier(string modelPath)
    {
        var options = new SessionOptions();
        options.GraphOptimizationLevel = GraphOptimizationLevel.ORT_ENABLE_ALL;
        _session = new InferenceSession(modelPath, options);
        // Print model info
        Console.WriteLine("Model Inputs:");
        foreach (var input in _session.InputMetadata)
        {
            Console.WriteLine($"  {input.Key}: {input.Value.ElementDataType} {string.Join("x", input.Value.Dimensions)}");
        }
    }
    public float[] Classify(float[] imageData, int[] shape)
    {
        // Create input tensor
        var tensor = new DenseTensor<float>(imageData, shape);
        var inputs = new List<NamedOnnxValue>
        {
            NamedOnnxValue.CreateFromTensor(_session.InputNames[0], tensor)
        };
        // Run inference and copy the first output into a managed array
        using (var results = _session.Run(inputs))
        {
            var output = results.First().AsTensor<float>();
            return output.ToArray();
        }
    }
    public void Dispose()
    {
        _session?.Dispose();
    }
}
// Usage
class Program
{
    static void Main()
    {
        using (var classifier = new ImageClassifier("resnet50.onnx"))
        {
            var imageData = new float[1 * 3 * 224 * 224];
            // ... load and preprocess image data ...
            var predictions = classifier.Classify(
                imageData,
                new[] { 1, 3, 224, 224 }
            );
            // Get top prediction
            var maxIndex = Array.IndexOf(predictions, predictions.Max());
            Console.WriteLine($"Predicted class: {maxIndex}");
            // Formatting as a percentage assumes the model outputs probabilities
            // (for example, a final softmax layer) rather than raw logits
            Console.WriteLine($"Confidence: {predictions[maxIndex]:P2}");
        }
    }
}
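The Confidence line in the example above formats the raw model output as a percentage, which is only meaningful if the model's final layer already produces probabilities. Many classification models emit unnormalized logits instead. A minimal sketch of a softmax step that could be applied first; the class name Scores is hypothetical and not part of ONNX Runtime.

```csharp
using System;
using System.Linq;

static class Scores
{
    // Converts logits to probabilities that sum to 1.
    // Subtracting the maximum first keeps Math.Exp from overflowing on large logits.
    public static float[] Softmax(float[] logits)
    {
        if (logits.Length == 0)
        {
            return Array.Empty<float>();
        }

        float max = logits.Max();
        double[] exps = logits.Select(v => Math.Exp(v - max)).ToArray();
        double sum = exps.Sum();
        return exps.Select(e => (float)(e / sum)).ToArray();
    }
}
```

In the example above this would be used as var probabilities = Scores.Softmax(predictions); before taking the maximum. The predicted class index is unchanged, because softmax preserves ordering.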
Batch Processing
// Implements IDisposable so the native session is released deterministically
public class BatchInference : IDisposable
{
    private readonly InferenceSession _session;
    public BatchInference(string modelPath)
    {
        _session = new InferenceSession(modelPath);
    }
    // Runs the model once per item, treating each item as a 1 x N tensor
    public List<float[]> ProcessBatch(List<float[]> batch)
    {
        var results = new List<float[]>();
        foreach (var item in batch)
        {
            var tensor = new DenseTensor<float>(item, new[] { 1, item.Length });
            var inputs = new List<NamedOnnxValue>
            {
                // "input" must match the model's input name
                NamedOnnxValue.CreateFromTensor("input", tensor)
            };
            using (var output = _session.Run(inputs))
            {
                results.Add(output.First().AsTensor<float>().ToArray());
            }
        }
        return results;
    }
    public void Dispose()
    {
        _session?.Dispose();
    }
}
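The class above calls Run once per item. When the model accepts a variable batch dimension, the items can instead be stacked into a single N x D tensor and processed in one call. The following is a sketch under those assumptions: every item has the same length, the model's first input has shape [batch, D], and the first output's leading dimension is the batch. The names BatchedRun, Flatten, Split, and RunBatch are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML.OnnxRuntime;
using Microsoft.ML.OnnxRuntime.Tensors;

static class BatchedRun
{
    // Copies N equal-length rows into one contiguous buffer of length N * D.
    public static float[] Flatten(IReadOnlyList<float[]> batch)
    {
        if (batch.Count == 0) return Array.Empty<float>();
        int width = batch[0].Length;
        var flat = new float[batch.Count * width];
        for (int i = 0; i < batch.Count; i++)
        {
            if (batch[i].Length != width)
                throw new ArgumentException("All items must have the same length.");
            Array.Copy(batch[i], 0, flat, i * width, width);
        }
        return flat;
    }

    // Splits a flat buffer back into `rows` equal-length slices.
    public static List<float[]> Split(float[] flat, int rows)
    {
        var result = new List<float[]>(rows);
        if (rows == 0) return result;
        int width = flat.Length / rows;
        for (int i = 0; i < rows; i++)
        {
            var row = new float[width];
            Array.Copy(flat, i * width, row, 0, width);
            result.Add(row);
        }
        return result;
    }

    // One Run call for the whole batch (assumes a dynamic batch dimension).
    public static List<float[]> RunBatch(InferenceSession session, IReadOnlyList<float[]> batch)
    {
        if (batch.Count == 0) return new List<float[]>();
        var tensor = new DenseTensor<float>(Flatten(batch), new[] { batch.Count, batch[0].Length });
        var inputs = new List<NamedOnnxValue>
        {
            NamedOnnxValue.CreateFromTensor(session.InputNames[0], tensor)
        };
        using (var results = session.Run(inputs))
        {
            float[] flatOutput = results.First().AsTensor<float>().ToArray();
            return Split(flatOutput, batch.Count);
        }
    }
}
```

If the model was exported with a fixed batch size of 1, this call fails with a shape mismatch and the per-item loop remains the right approach.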
Thread Safety
The InferenceSession class is thread-safe for Run() operations. Multiple threads can call Run() concurrently on the same session instance.
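using System.Threading.Tasks;
using (var session = new InferenceSession("model.onnx"))
{
    // CreateInputs and ProcessResults are placeholders for application code
    Parallel.For(0, 10, i =>
    {
        var inputs = CreateInputs(i);
        using (var results = session.Run(inputs))
        {
            ProcessResults(results);
        }
    });
}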
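Best Practices
- Reuse sessions: Create a session once and reuse it for every request; constructing a session loads and optimizes the model
- Use SessionOptions: Set the graph optimization level and other options to suit the workload
- Dispose properly: Always dispose sessions and the result collections returned by Run()
- Batch when possible: Process multiple inputs in a single Run() call when the model supports it
- Use execution providers: Enable GPU acceleration when available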
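As an illustration of the execution provider point, the sketch below tries to enable the CUDA provider and falls back to the default CPU provider if that fails. It assumes the GPU build of the ONNX Runtime package is referenced and that AppendExecutionProvider_CUDA throws when CUDA support is missing. The class name SessionFactory and the fallback policy are hypothetical choices, not part of the library.

```csharp
using System;
using Microsoft.ML.OnnxRuntime;

static class SessionFactory
{
    // Hypothetical helper: prefers CUDA device 0, otherwise runs on the CPU.
    public static InferenceSession Create(string modelPath)
    {
        using (var options = new SessionOptions())
        {
            options.GraphOptimizationLevel = GraphOptimizationLevel.ORT_ENABLE_ALL;

            try
            {
                // Requires the GPU package and a working CUDA installation.
                options.AppendExecutionProvider_CUDA(0);
            }
            catch (Exception ex)
            {
                // Assumption: a missing provider surfaces as an exception here.
                Console.WriteLine($"CUDA provider unavailable, using CPU: {ex.Message}");
            }

            return new InferenceSession(modelPath, options);
        }
    }
}
```

The caller owns the returned session and should dispose it, for example with using (var session = SessionFactory.Create("model.onnx")) { ... }.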