SessionOptions

The SessionOptions class allows you to configure various aspects of an InferenceSession, including graph optimization, thread pool sizes, execution providers, and profiling.

Constructor

SessionOptions()

Creates a new SessionOptions object with default settings.

Properties

graph_optimization_level

GraphOptimizationLevel

Controls the level of graph optimizations applied.

ORT_DISABLE_ALL - No optimizations
ORT_ENABLE_BASIC - Basic optimizations, such as constant folding and redundant node elimination
ORT_ENABLE_EXTENDED - Extended optimizations, such as more complex node fusions
ORT_ENABLE_ALL - All optimizations, including layout transformations (default)

intra_op_num_threads

int

Number of threads used to parallelize execution within a single node (operator). Default is 0, which lets ONNX Runtime choose the thread count.

inter_op_num_threads

int

Number of threads used to run independent nodes in parallel. Only takes effect when execution_mode is ORT_PARALLEL. Default is 0, which lets ONNX Runtime choose the thread count.
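As a rough illustration of how the two thread settings relate, the sketch below splits a fixed core budget between them. The helper names (suggest_thread_counts, apply_thread_counts) and the splitting heuristic are assumptions made for this example and are not part of ONNX Runtime; the only API relied on is the pair of properties described above.

```python
import os
from typing import Optional, Tuple


def suggest_thread_counts(parallel_branches: int = 1,
                          cores: Optional[int] = None) -> Tuple[int, int]:
    """Return (intra_op, inter_op) so that intra_op * inter_op <= cores.

    Hypothetical heuristic: give one inter-op thread to each independent
    graph branch, then divide the cores evenly among intra-op threads.
    """
    cores = cores or os.cpu_count() or 1
    inter_op = max(1, min(parallel_branches, cores))
    intra_op = max(1, cores // inter_op)
    return intra_op, inter_op


def apply_thread_counts(options, intra_op: int, inter_op: int) -> None:
    """Assign the counts to any object exposing the two thread properties."""
    options.intra_op_num_threads = intra_op
    options.inter_op_num_threads = inter_op
```

With ONNX Runtime installed, the result could be applied with apply_thread_counts(ort.SessionOptions(), *suggest_thread_counts(2)). Any such split is only a starting point and should be checked by measurement.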

execution_mode

ExecutionMode

Controls whether operators are executed sequentially or in parallel.

ORT_SEQUENTIAL - Execute operators sequentially (default)
ORT_PARALLEL - Execute operators in parallel when possible

execution_order

ExecutionOrder

Controls the order in which graph nodes are executed.

DEFAULT - Use default topological order
PRIORITY_BASED - Use priority-based scheduling

enable_profiling

bool

Enable profiling to collect performance data. Default is False.

optimized_model_filepath

str

File path for the optimized model. If set, the graph is serialized to this path after optimizations are applied during session creation.

log_severity_level

int

Logging verbosity level (0=Verbose, 1=Info, 2=Warning, 3=Error, 4=Fatal). Default is 2.
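Because the property takes a bare integer, giving the levels names can make call sites easier to read. The sketch below is one way to do that; LogSeverity and set_log_severity are hypothetical helpers defined here, not names exported by ONNX Runtime.

```python
from enum import IntEnum


class LogSeverity(IntEnum):
    """Hypothetical names for the integer levels listed above."""
    VERBOSE = 0
    INFO = 1
    WARNING = 2
    ERROR = 3
    FATAL = 4


def set_log_severity(options, level: LogSeverity) -> None:
    """Store the level as a plain int on the options object."""
    options.log_severity_level = int(level)
```

For example, set_log_severity(sess_options, LogSeverity.ERROR) would have the same effect as sess_options.log_severity_level = 3.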

log_verbosity_level

int

VLOG level. Applies only in debug builds and when log_severity_level is 0 (Verbose). Default is 0.

enable_mem_pattern

bool

Enable memory pattern optimization. Default is True.

enable_mem_reuse

bool

Enable memory reuse optimization. Default is True.

enable_cpu_mem_arena

bool

Enable CPU memory arena allocator. Default is True.

Methods

add_session_config_entry()

Adds a custom session configuration entry. Both the key and the value are passed as strings.

add_session_config_entry(
    key: str,
    value: str
)

key

str

required

Configuration key.

value

str

required

Configuration value.

Common Configuration Keys:

session.load_model_format - Model format to load: "ONNX" or "ORT"
session.use_env_allocators - Use allocators registered on the environment ("1" to enable)
session.record_ep_graph_assignment_info - Record EP graph assignment ("1" to enable)
session.disable_prepacking - Disable weight prepacking ("1" to disable prepacking)
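Since add_session_config_entry() accepts only strings, settings kept as Python booleans or numbers need converting first. The sketch below shows one possible convention, mapping True/False to "1"/"0"; to_config_value and apply_config_entries are hypothetical helpers written for this example.

```python
from typing import Mapping


def to_config_value(value) -> str:
    """Convert a Python value to the string form a config entry needs."""
    if isinstance(value, bool):
        return "1" if value else "0"
    return str(value)


def apply_config_entries(options, entries: Mapping[str, object]) -> None:
    """Call add_session_config_entry once per item, passing strings only."""
    for key, value in entries.items():
        options.add_session_config_entry(key, to_config_value(value))
```

A call such as apply_config_entries(sess_options, {"session.disable_prepacking": True}) would then issue add_session_config_entry("session.disable_prepacking", "1").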

register_custom_ops_library()

Registers a shared library that contains custom operator implementations.

register_custom_ops_library(library_path: str)

library_path

str

required

Path to the shared library (.so, .dll, or .dylib).

add_external_initializers()

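Supplies in-memory data for initializers that the model stores externally. names and values are matched by position, so both lists must have the same length.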

add_external_initializers(
    names: list[str],
    values: list[OrtValue]
)

names

list[str]

required

Names of the initializers.

values

list[OrtValue]

required

OrtValue objects containing initializer data.

add_free_dimension_override_by_name()

Fixes a named free (symbolic) dimension in the model inputs to a specific value.

add_free_dimension_override_by_name(
    dim_name: str,
    dim_value: int
)

dim_name

str

required

Name of the dimension to override.

dim_value

int

required

Value to use for the dimension.
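To make the idea of a free dimension concrete, the sketch below mimics the effect of an override on a shape that mixes fixed sizes with named dimensions. It is a plain-Python illustration only and does not reflect how ONNX Runtime applies overrides internally; resolve_shape and the dimension names "batch" and "seq" are hypothetical.

```python
from typing import Dict, List, Union

Dim = Union[int, str]


def resolve_shape(shape: List[Dim], overrides: Dict[str, int]) -> List[Dim]:
    """Replace named dimensions that have an override; keep everything else."""
    return [overrides.get(dim, dim) if isinstance(dim, str) else dim
            for dim in shape]


# A shape with a symbolic batch dimension becomes fully static.
print(resolve_shape(["batch", 3, 224, 224], {"batch": 1}))
```

The corresponding call on a real session would be sess_options.add_free_dimension_override_by_name("batch", 1), assuming the model names its first input dimension "batch".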

Example Usage

Basic Configuration

import onnxruntime as ort

sess_options = ort.SessionOptions()

# Enable all optimizations
sess_options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL

# Set thread counts
sess_options.intra_op_num_threads = 4
sess_options.inter_op_num_threads = 2

# Enable parallel execution
sess_options.execution_mode = ort.ExecutionMode.ORT_PARALLEL

# Create session with options
sess = ort.InferenceSession("model.onnx", sess_options=sess_options)

Enable Profiling

import onnxruntime as ort

sess_options = ort.SessionOptions()
sess_options.enable_profiling = True

sess = ort.InferenceSession("model.onnx", sess_options=sess_options)

# Run inference (inputs is a dict mapping input names to NumPy arrays)
outputs = sess.run(None, inputs)

# Get profiling results
profile_file = sess.end_profiling()
print(f"Profiling data saved to: {profile_file}")
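The file returned by end_profiling() can be inspected with ordinary JSON tooling. The sketch below sums time per operator type; it assumes the file is a JSON array of trace events in which node-level entries carry "cat": "Node", a "dur" field in microseconds, and an "args" object with an "op_name" key. Those field names are assumptions about the profile layout and should be checked against an actual profile; load_profile and total_time_by_op are hypothetical helpers.

```python
import json
from collections import defaultdict
from typing import Dict, Iterable


def load_profile(path: str) -> list:
    """Read the JSON file whose path end_profiling() returned."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def total_time_by_op(events: Iterable[dict]) -> Dict[str, int]:
    """Sum 'dur' per operator type over node-level events only."""
    totals: Dict[str, int] = defaultdict(int)
    for event in events:
        if event.get("cat") != "Node":
            continue  # skip session-level and other non-node events
        op_name = event.get("args", {}).get("op_name", "unknown")
        totals[op_name] += event.get("dur", 0)
    return dict(totals)
```

A typical use would be total_time_by_op(load_profile(profile_file)), sorted by value to find the most expensive operator types.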

Save Optimized Model

import onnxruntime as ort

sess_options = ort.SessionOptions()
sess_options.optimized_model_filepath = "model_optimized.onnx"
sess_options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_EXTENDED

sess = ort.InferenceSession("model.onnx", sess_options=sess_options)
# The optimized model is written to model_optimized.onnx during session creation

Custom Configuration

import onnxruntime as ort

sess_options = ort.SessionOptions()

# Load ORT format model
sess_options.add_session_config_entry("session.load_model_format", "ORT")

# Record EP graph assignment for debugging
sess_options.add_session_config_entry("session.record_ep_graph_assignment_info", "1")

sess = ort.InferenceSession("model.ort", sess_options=sess_options)

# Get graph assignment info
assignment = sess.get_provider_graph_assignment_info()

Register Custom Operators

import onnxruntime as ort

sess_options = ort.SessionOptions()
sess_options.register_custom_ops_library("custom_ops.so")

sess = ort.InferenceSession("model_with_custom_ops.onnx", sess_options=sess_options)

Performance Tuning

import onnxruntime as ort

# For CPU inference
sess_options = ort.SessionOptions()
sess_options.intra_op_num_threads = 8  # Use 8 threads within each op
sess_options.inter_op_num_threads = 1  # Only used with ORT_PARALLEL; no effect in sequential mode
sess_options.execution_mode = ort.ExecutionMode.ORT_SEQUENTIAL  # Run ops one at a time
sess_options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL

# For GPU inference
# Note: some execution providers (for example, CUDA) may fall back to sequential execution
sess_options = ort.SessionOptions()
sess_options.execution_mode = ort.ExecutionMode.ORT_PARALLEL
sess_options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL
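Settings like these are best compared by measurement on the target machine. The sketch below is a minimal timing harness using only the standard library; time_runs is a hypothetical helper, and the warmup and repeat counts are arbitrary defaults chosen for this example.

```python
import statistics
import time
from typing import Callable, Dict


def time_runs(run_once: Callable[[], object],
              warmup: int = 3,
              repeats: int = 20) -> Dict[str, float]:
    """Call run_once repeatedly and report latency statistics in milliseconds."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    for _ in range(warmup):
        run_once()  # warm-up calls are not timed
    samples_ms = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_once()
        samples_ms.append((time.perf_counter() - start) * 1000.0)
    return {
        "mean_ms": statistics.mean(samples_ms),
        "median_ms": statistics.median(samples_ms),
        "min_ms": min(samples_ms),
    }
```

With a session and an inputs dict already created, time_runs(lambda: sess.run(None, inputs)) gives figures that can be compared across different SessionOptions configurations.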

Related APIs

InferenceSession - Create sessions with options
RunOptions - Per-run configuration
Execution Providers - Hardware acceleration
