InferenceSession Class - ONNX Runtime

The InferenceSession class is the main entry point for running inference with ONNX models in C#.

Namespace

Microsoft.ML.OnnxRuntime

Class Declaration

public class InferenceSession : IDisposable

Constructors

InferenceSession(string)

Constructs an InferenceSession from a model file.

public InferenceSession(string modelPath)

Parameters:

modelPath (string): Path to the ONNX or ORT model file

Example:

using Microsoft.ML.OnnxRuntime;

var session = new InferenceSession("model.onnx");

InferenceSession(string, SessionOptions)

Constructs an InferenceSession with custom session options.

public InferenceSession(string modelPath, SessionOptions options)

Parameters:

modelPath (string): Path to the model file
options (SessionOptions): Session configuration options

Example:

var options = new SessionOptions();
options.GraphOptimizationLevel = GraphOptimizationLevel.ORT_ENABLE_ALL;

var session = new InferenceSession("model.onnx", options);
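
Other settings are applied to the same SessionOptions object before the session is created. The following is a minimal configuration sketch, not a recommended setup: the thread count of 4 is an arbitrary illustrative value, and the CUDA call assumes that the GPU build of the ONNX Runtime package is installed and that device 0 exists.

```csharp
using Microsoft.ML.OnnxRuntime;

using var options = new SessionOptions();

// Apply all available graph optimizations when the model is loaded
options.GraphOptimizationLevel = GraphOptimizationLevel.ORT_ENABLE_ALL;

// Assumed value: number of threads used to parallelize work inside an operator
options.IntraOpNumThreads = 4;

// Assumption: GPU package installed and CUDA device 0 present
options.AppendExecutionProvider_CUDA(0);

using var session = new InferenceSession("model.onnx", options);
```

Graph nodes that the CUDA provider does not support are expected to fall back to the default CPU provider.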

InferenceSession(byte[])

Constructs an InferenceSession from a model in a byte array.

public InferenceSession(byte[] model)

Parameters:

model (byte[]): ONNX model as a byte array

Example:

using System.IO;

byte[] modelBytes = File.ReadAllBytes("model.onnx");
var session = new InferenceSession(modelBytes);

InferenceSession(byte[], SessionOptions)

Constructs an InferenceSession from bytes with custom options.

public InferenceSession(byte[] model, SessionOptions options)

InferenceSession with PrePackedWeightsContainer

Constructs a session that shares pre-packed weights across multiple sessions.

public InferenceSession(
    string modelPath, 
    SessionOptions options,
    PrePackedWeightsContainer prepackedWeightsContainer)

Parameters:

modelPath (string): Path to the model
options (SessionOptions): Session options
prepackedWeightsContainer (PrePackedWeightsContainer): Shared weights container

Example:

var container = new PrePackedWeightsContainer();
var options = new SessionOptions();

var session1 = new InferenceSession("model.onnx", options, container);
var session2 = new InferenceSession("model.onnx", options, container);
// Both sessions share pre-packed weights

Properties

InputMetadata

Gets metadata for input nodes.

public IReadOnlyDictionary<string, NodeMetadata> InputMetadata { get; }

Example:

foreach (var input in session.InputMetadata)
{
    Console.WriteLine($"Input: {input.Key}");
    Console.WriteLine($"  Type: {input.Value.ElementDataType}");
    Console.WriteLine($"  Shape: [{string.Join(", ", input.Value.Dimensions)}]");
}
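
Models with dynamic axes typically report those dimensions as -1 in Dimensions, so the reported shape cannot always be used directly to allocate a tensor. The sketch below, which assumes that -1 convention, substitutes a concrete size for each dynamic dimension and computes the resulting element count. ShapeHelper, Resolve, and ElementCount are hypothetical names for illustration, not part of ONNX Runtime.

```csharp
using System;
using System.Linq;

static class ShapeHelper
{
    // Replaces every dynamic (negative) dimension with the given concrete size.
    public static int[] Resolve(int[] dimensions, int dynamicSize) =>
        dimensions.Select(d => d < 0 ? dynamicSize : d).ToArray();

    // Number of elements a tensor of this shape holds (product of all dimensions).
    public static long ElementCount(int[] shape) =>
        shape.Aggregate(1L, (count, d) => count * d);
}
```

For example, a reported shape of [-1, 3, 224, 224] resolved with a size of 1 becomes [1, 3, 224, 224], which holds 150528 elements.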

InputNames

Gets the ordered list of input names.

public IReadOnlyList<string> InputNames { get; }

Example:

var inputName = session.InputNames[0];
Console.WriteLine($"First input: {inputName}");

OutputMetadata

Gets metadata for output nodes.

public IReadOnlyDictionary<string, NodeMetadata> OutputMetadata { get; }

OutputNames

Gets the ordered list of output names.

public IReadOnlyList<string> OutputNames { get; }

OverridableInitializerMetadata

Gets metadata for overridable initializers.

public IReadOnlyDictionary<string, NodeMetadata> OverridableInitializerMetadata { get; }

Methods

Run

Runs inference on the model.

Run(IReadOnlyCollection<NamedOnnxValue>)

public IDisposableReadOnlyCollection<DisposableNamedOnnxValue> Run(
    IReadOnlyCollection<NamedOnnxValue> inputs)

Parameters:

inputs: Collection of input tensors

Returns: Collection of output tensors

Example:

using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML.OnnxRuntime;
using Microsoft.ML.OnnxRuntime.Tensors;

// Create input tensor
var inputData = new float[] { 1.0f, 2.0f, 3.0f, 4.0f };
var tensor = new DenseTensor<float>(inputData, new[] { 1, 4 });
var inputs = new List<NamedOnnxValue>
{
    NamedOnnxValue.CreateFromTensor("input", tensor)
};

// Run inference
using (var results = session.Run(inputs))
{
    var output = results.First().AsTensor<float>();
    Console.WriteLine("Output: " + string.Join(", ", output.ToArray()));
}

Run(IReadOnlyCollection<NamedOnnxValue>, IReadOnlyCollection<string>)

Runs inference with specific output names.

public IDisposableReadOnlyCollection<DisposableNamedOnnxValue> Run(
    IReadOnlyCollection<NamedOnnxValue> inputs,
    IReadOnlyCollection<string> outputNames)

Example:

var outputNames = new[] { "output1", "output2" };
using (var results = session.Run(inputs, outputNames))
{
    foreach (var result in results)
    {
        Console.WriteLine($"{result.Name}: {result.AsTensor<float>().GetValue(0)}");
    }
}

Run(IReadOnlyCollection<NamedOnnxValue>, RunOptions)

Runs inference with custom run options.

public IDisposableReadOnlyCollection<DisposableNamedOnnxValue> Run(
    IReadOnlyCollection<NamedOnnxValue> inputs,
    RunOptions options)

RunAsync

Asynchronously runs inference.

public Task<IDisposableReadOnlyCollection<DisposableNamedOnnxValue>> RunAsync(
    IReadOnlyCollection<NamedOnnxValue> inputs)

Example:

var results = await session.RunAsync(inputs);
using (results)
{
    // Process results
}

GetMemoryInfosForInputs

Gets memory information for all inputs.

public IDisposableReadOnlyCollection<OrtMemoryInfo> GetMemoryInfosForInputs()

GetMemoryInfosForOutputs

Gets memory information for all outputs.

public IDisposableReadOnlyCollection<OrtMemoryInfo> GetMemoryInfosForOutputs()

Dispose

Releases resources used by the session.

public void Dispose()

Best Practice:

using (var session = new InferenceSession("model.onnx"))
{
    // Use session
}
// Session is automatically disposed

Complete Example

Image Classification

using Microsoft.ML.OnnxRuntime;
using Microsoft.ML.OnnxRuntime.Tensors;
using System;
using System.Collections.Generic;
using System.Linq;

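// Implements IDisposable so that instances can be used in a using statement
class ImageClassifier : IDisposable
{
    private readonly InferenceSession _session;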
    
    public ImageClassifier(string modelPath)
    {
        var options = new SessionOptions();
        options.GraphOptimizationLevel = GraphOptimizationLevel.ORT_ENABLE_ALL;
        _session = new InferenceSession(modelPath, options);
        
        // Print model info
        Console.WriteLine("Model Inputs:");
        foreach (var input in _session.InputMetadata)
        {
            Console.WriteLine($"  {input.Key}: {input.Value.ElementDataType} {string.Join("x", input.Value.Dimensions)}");
        }
    }
    
    public float[] Classify(float[] imageData, int[] shape)
    {
        // Create input tensor
        var tensor = new DenseTensor<float>(imageData, shape);
        var inputs = new List<NamedOnnxValue>
        {
            NamedOnnxValue.CreateFromTensor(_session.InputNames[0], tensor)
        };
        
        // Run inference
        using (var results = _session.Run(inputs))
        {
            var output = results.First().AsTensor<float>();
            return output.ToArray();
        }
    }
    
    public void Dispose()
    {
        _session?.Dispose();
    }
}

// Usage
class Program
{
    static void Main()
    {
        using (var classifier = new ImageClassifier("resnet50.onnx"))
        {
            var imageData = new float[1 * 3 * 224 * 224];
            // ... load and preprocess image data ...
            
            var predictions = classifier.Classify(
                imageData, 
                new[] { 1, 3, 224, 224 }
            );
            
            // Get top prediction
            var maxIndex = Array.IndexOf(predictions, predictions.Max());
            Console.WriteLine($"Predicted class: {maxIndex}");
            // Assumes the model's final layer applies softmax, so outputs are probabilities
            Console.WriteLine($"Confidence: {predictions[maxIndex]:P2}");
        }
    }
}
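
Formatting the top score as a percentage is only meaningful when the model outputs probabilities. Many classification models emit raw logits instead. A minimal sketch of converting logits to probabilities follows, assuming the output is a single row of logits. Scores and Softmax are hypothetical names, not part of ONNX Runtime.

```csharp
using System;
using System.Linq;

static class Scores
{
    // Numerically stable softmax: subtracting the maximum keeps Math.Exp from overflowing.
    // Throws InvalidOperationException if logits is empty (from Max()).
    public static float[] Softmax(float[] logits)
    {
        float max = logits.Max();
        double[] exps = logits.Select(v => Math.Exp(v - max)).ToArray();
        double sum = exps.Sum();
        return exps.Select(e => (float)(e / sum)).ToArray();
    }
}
```

Under that assumption, the array returned by Classify could be passed through Scores.Softmax before the maximum is taken. The predicted class index is unchanged because softmax preserves ordering.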

Batch Processing

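public class BatchInference : IDisposable
{
    private readonly InferenceSession _session;
    
    public BatchInference(string modelPath)
    {
        _session = new InferenceSession(modelPath);
    }
    
    // Release the native session when the instance is no longer needed
    public void Dispose()
    {
        _session?.Dispose();
    }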
    
    public List<float[]> ProcessBatch(List<float[]> batch)
    {
        var results = new List<float[]>();
        
        foreach (var item in batch)
        {
            var tensor = new DenseTensor<float>(item, new[] { 1, item.Length });
            var inputs = new List<NamedOnnxValue>
            {
                NamedOnnxValue.CreateFromTensor("input", tensor)
            };
            
            using (var output = _session.Run(inputs))
            {
                results.Add(output.First().AsTensor<float>().ToArray());
            }
        }
        
        return results;
    }
}
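
The loop above calls Run once per item, so each call processes a batch of one. When the model's first dimension is a dynamic batch axis, the items can instead be packed into one contiguous buffer and sent in a single call. The sketch below shows only the packing and unpacking steps. BatchPacking, Flatten, and Split are hypothetical helper names, and all rows are assumed to have the same length.

```csharp
using System;
using System.Collections.Generic;

static class BatchPacking
{
    // Copies N equally sized rows into one buffer laid out as [N, rowLength].
    public static float[] Flatten(IReadOnlyList<float[]> rows)
    {
        if (rows.Count == 0) return Array.Empty<float>();
        int rowLength = rows[0].Length;
        var flat = new float[rows.Count * rowLength];
        for (int i = 0; i < rows.Count; i++)
        {
            if (rows[i].Length != rowLength)
                throw new ArgumentException("All rows must have the same length.", nameof(rows));
            Array.Copy(rows[i], 0, flat, i * rowLength, rowLength);
        }
        return flat;
    }

    // Splits a flat [rowCount, rowLength] buffer back into individual rows.
    public static List<float[]> Split(float[] flat, int rowCount)
    {
        if (rowCount <= 0 || flat.Length % rowCount != 0)
            throw new ArgumentException("Buffer length must be a multiple of rowCount.", nameof(rowCount));
        int rowLength = flat.Length / rowCount;
        var rows = new List<float[]>(rowCount);
        for (int i = 0; i < rowCount; i++)
        {
            var row = new float[rowLength];
            Array.Copy(flat, i * rowLength, row, 0, rowLength);
            rows.Add(row);
        }
        return rows;
    }
}
```

The flattened buffer can be wrapped as new DenseTensor<float>(flat, new[] { batch.Count, rowLength }) and passed to one Run call. The output tensor's ToArray() result can then be divided per item with Split, assuming the model also returns one row per input.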

Thread Safety

The InferenceSession class is thread-safe for Run() operations. Multiple threads can call Run() concurrently on the same session instance.

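using System.Threading.Tasks;

var session = new InferenceSession("model.onnx");

// CreateInputs and ProcessResults are placeholders for application code
Parallel.For(0, 10, i =>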
{
    var inputs = CreateInputs(i);
    using (var results = session.Run(inputs))
    {
        ProcessResults(results);
    }
});
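
When concurrent runs need to return their results in input order, each iteration can write to its own slot of a pre-sized array, which avoids locking. The following is a sketch of that pattern only. ParallelRunner and RunAll are hypothetical names, and the infer delegate stands in for application code that builds the inputs, calls session.Run, and copies the output out of the disposable result collection.

```csharp
using System;
using System.Threading.Tasks;

static class ParallelRunner
{
    // Runs infer for each input concurrently; results[i] always corresponds to inputs[i].
    public static float[][] RunAll(float[][] inputs, Func<float[], float[]> infer)
    {
        var results = new float[inputs.Length][];
        Parallel.For(0, inputs.Length, i =>
        {
            // Each iteration writes only to its own slot, so no lock is required.
            results[i] = infer(inputs[i]);
        });
        return results;
    }
}
```

Copying the output to a plain array inside the delegate matters because the collection returned by Run is disposed at the end of its using block.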

Performance Tips

Reuse sessions: Create session once, use many times
Use SessionOptions: Configure optimizations appropriately
Dispose properly: Always dispose sessions and results
Batch when possible: Process multiple inputs together
Use execution providers: Enable GPU acceleration when available