onnx-mlir


Representation and reference lowering of ONNX models in the MLIR compiler infrastructure


This project is maintained by onnx


Using the Python interfaces

Onnx-mlir provides runtime utilities to compile and run ONNX models in Python. These utilities are implemented by the OnnxMlirCompiler compiler interface (include/OnnxMlirCompiler.h) and the ExecutionSession class (src/Runtime/ExecutionSession.hpp). Both utilities have associated Python bindings generated with the pybind library.

Configuring the Python interfaces

Using pybind, a C/C++ binary can be directly imported by the Python interpreter. For onnx-mlir, there are five such libraries: one to compile onnx-mlir models, two to run the models, and another two to compile and run the models.

  1. The shared library to compile onnx-mlir models is generated by PyOMCompileSession (src/Compiler/PyOMCompileSession.hpp) and built as the shared library build/Debug/lib/PyCompile.cpython-<target>.so.
  2. The shared library to run onnx-mlir models is generated by PyExecutionSession (src/Runtime/PyExecutionSession.hpp) and built as the shared library build/Debug/lib/PyRuntimeC.cpython-<target>.so.
  3. The Python library to run onnx-mlir models (src/Runtime/python/PyRuntime.py).
  4. The shared library to compile and run onnx-mlir models is generated by PyOMCompileExecutionSessionC (src/Runtime/PyOMCompileExecutionSession.hpp) and built as the shared library build/Debug/lib/PyCompileAndRuntimeC.cpython-<target>.so.
  5. The Python library to compile and run onnx-mlir models (src/Runtime/python/PyCompileAndRuntime.py). This library takes a .onnx file and options as inputs; it loads the file, then compiles and runs it.

The Python interpreter can import a module normally as long as that module is in your PYTHONPATH. Another alternative is to create a symbolic link to it in your working directory.

cd <working directory>
ln -s <path to the shared library to compile onnx-mlir models>(e.g. `build/Debug/lib/PyCompile.cpython-<target>.so`) .
ln -s <path to the shared library to run onnx-mlir models>(e.g. `build/Debug/lib/PyRuntimeC.cpython-<target>.so`) .
ln -s <path to the Python library to run onnx-mlir models>(e.g. src/Runtime/python/PyRuntime.py) .
ln -s <path to the shared library to compile and run onnx-mlir models>(e.g. `build/Debug/lib/PyCompileAndRuntimeC.cpython-<target>.so`) .
ln -s <path to the Python library to compile and run onnx-mlir models>(e.g. src/Runtime/python/PyCompileAndRuntime.py) .
python3
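As an alternative to the symbolic links above, the build output directories can simply be prepended to PYTHONPATH. The paths below assume the default build layout described earlier and may differ for your setup.

```shell
# Prepend the onnx-mlir build outputs to PYTHONPATH (paths assume the
# default layout described above; adjust to your actual build directory).
PYTHONPATH="build/Debug/lib:src/Runtime/python:${PYTHONPATH}"
export PYTHONPATH
echo "${PYTHONPATH}"
```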

The Python interface to run models: PyRuntime

Running the PyRuntime interface

An ONNX model is a computation graph, and often the graph has a single entry point to trigger the computation. Below is an example of doing inference with a model that has a single entry point.

import numpy as np
from PyRuntime import OMExecutionSession

model = 'model.so' # LeNet from ONNX Zoo compiled with onnx-mlir

# Create a session for this model.
session = OMExecutionSession(shared_lib_path=model)
# Input and output signatures of the default entry point.
print("input signature in json", session.input_signature())
print("output signature in json", session.output_signature())
# Do inference using the default entry point.
a = np.full((1, 1, 28, 28), 1, np.dtype(np.float32))
outputs = session.run(input=[a])

for output in outputs:
    print(output.shape)

If the computation graph has multiple entry points, users have to set a specific entry point to do inference with. Below is an example of doing inference with multiple entry points.

import numpy as np
from PyRuntime import OMExecutionSession

model = 'multi-entry-points-model.so'

# Create a session for this model.
session = OMExecutionSession(shared_lib_path=model, use_default_entry_point=False) # False to manually set an entry point.

# Query entry points in the model.
entry_points = session.entry_points()

for entry_point in entry_points:
  # Set the entry point to do inference.
  session.set_entry_point(name=entry_point)
  # Input and output signatures of the current entry point.
  print("input signature in json", session.input_signature())
  print("output signature in json", session.output_signature())
  # Do inference using the current entry point.
  a = np.arange(10).astype('float32')
  b = np.arange(10).astype('float32')
  outputs = session.run(input=[a, b])
  for output in outputs:
    print(output.shape)

Using model tags

If a model was compiled with --tag, the value of --tag must be passed to OMExecutionSession. Tags are useful when there are multiple sessions for multiple models in the same Python script. Below is an example of doing inference with tags.

import numpy as np
from PyRuntime import OMExecutionSession

encoder_model = 'encoder/model.so' # Assumed that the model was compiled using `--tag=encoder`
decoder_model = 'decoder/model.so' # Assumed that the model was compiled using `--tag=decoder`

# Create a session for the encoder model.
encoder_sess = OMExecutionSession(shared_lib_path=encoder_model, tag="encoder")
# Create a session for the decoder model.
decoder_sess = OMExecutionSession(shared_lib_path=decoder_model, tag="decoder")

If two models were not compiled with --tag, they must be compiled with different .so filenames in order to be used in the same process. Indeed, when no tag is given, we use the filename as the default tag. Below is an example of doing inference without tags.

import numpy as np
from PyRuntime import OMExecutionSession

encoder_model = 'my_encoder.so'
decoder_model = 'my_decoder.so'

# Create a session for the encoder model.
encoder_sess = OMExecutionSession(shared_lib_path=encoder_model) # tag will be `my_encoder` by default.
# Create a session for the decoder model.
decoder_sess = OMExecutionSession(shared_lib_path=decoder_model) # tag will be `my_decoder` by default.
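The default tag described above (the output file name without its extension) can be derived as in this small sketch. The helper name default_tag is ours, for illustration only, and is not part of the PyRuntime API:

```python
import os

def default_tag(shared_lib_path):
    # When no --tag was given, the file name without its extension acts
    # as the default tag, e.g. `my_encoder` for `my_encoder.so`.
    return os.path.splitext(os.path.basename(shared_lib_path))[0]

print(default_tag('my_encoder.so'))     # my_encoder
print(default_tag('decoder/model.so'))  # model
```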

To use functions without tags, e.g. run_main_graph, set tag = "NONE".

PyRuntime model API

The complete interface of OMExecutionSession can be seen in the sources mentioned previously. However, using the constructor and the run method is enough to perform inference.

def __init__(self, shared_lib_path: str, tag: str, use_default_entry_point: bool):
    """
    Args:
        shared_lib_path: relative or absolute path to your .so model.
        tag: a string that was passed to `--tag` when compiling the .so model. By default, it is the output file name without its extension, namely, `filename` in `filename.so`
        use_default_entry_point: use the default entry point that is `run_main_graph_{tag}` or not. Set to True by default.
    """

def run(self, input: List[ndarray]) -> List[ndarray]:
    """
    Args:
        input: A list of NumPy arrays, the inputs of your model.

    Returns:
        A list of NumPy arrays, the outputs of your model.
    """

def input_signature(self) -> str:
    """
    Returns:
        A string containing a JSON representation of the model's input signature.
    """

def output_signature(self) -> str:
    """
    Returns:
        A string containing a JSON representation of the model's output signature.
    """

def entry_points(self) -> List[str]:
    """
    Returns:
        A list of entry point names.
    """

def set_entry_point(self, name: str):
    """
    Args:
        name: an entry point name.
    """
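Since input_signature() returns a JSON string, the inputs it describes can be allocated programmatically. The signature string and type-name mapping below are hypothetical examples, not the official format; they only illustrate how such a JSON string can be turned into NumPy inputs.

```python
import json
import numpy as np

# Hypothetical signature string; the real layout comes from
# session.input_signature() and may differ.
sig = '[ { "type" : "f32" , "dims" : [1, 1, 28, 28] , "name" : "image" } ]'

# Assumed mapping from type names to NumPy dtypes (not an official table).
dtypes = {"f32": np.float32, "f64": np.float64, "i64": np.int64}

# Allocate one zero-filled array per entry in the signature.
inputs = [
    np.zeros(entry["dims"], dtype=dtypes[entry["type"]])
    for entry in json.loads(sig)
]
print(inputs[0].shape, inputs[0].dtype)  # (1, 1, 28, 28) float32
```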

The Python interface to compile models: PyCompile

Running the PyCompile interface

An ONNX model can be compiled directly from the command line. The resulting library can then be executed using Python as shown in the previous sections. At times, it might also be convenient to compile a model directly in Python. This section explores the Python methods to do so.

The OMCompileSession object takes a file name at construction time. For compilation, compile() takes a flags string as input, which will override any default options set from environment variables.

import numpy as np
from PyCompile import OMCompileSession

# Load onnx model and create OMCompileSession object.
file = './mnist.onnx'
compiler = OMCompileSession(file)
# Generate the library file; rc == 0 indicates success. The option is set to "-O3".
rc = compiler.compile("-O3")
# Get the output file name
model = compiler.get_compiled_file_name()
if rc:
    print("Failed to compile with error code", rc)
    exit(1)
print("Compiled onnx file", file, "to", model, "with rc", rc)

The PyCompile module exports the OMCompileSession class, which drives the compilation of an ONNX model into an executable model. Typically, a compiler object is created for a given model by providing the file name of the ONNX model. Then, all the compiler options can be set as one whole std::string to generate the desired executable. Finally, the compilation itself is performed by calling compile(), to which the user passes the option string as input.

The compile() command returns a return code reflecting the status of the compilation: a zero value indicates success, and a nonzero value indicates an error code. Because the suffixes of shared libraries may differ across operating systems, the output file name can be retrieved using the get_compiled_file_name() method.
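The platform-dependent suffix mentioned above can be illustrated with a small sketch. The mapping below reflects common shared-library conventions, not anything queried from PyCompile; it shows why get_compiled_file_name() spares callers from hard-coding the suffix themselves.

```python
import sys

# Typical shared-library suffixes per platform (common convention).
suffixes = {"linux": ".so", "darwin": ".dylib", "win32": ".dll"}

def shared_lib_name(base, platform=sys.platform):
    # Pick the suffix matching the platform identifier prefix.
    for prefix, ext in suffixes.items():
        if platform.startswith(prefix):
            return base + ext
    return base + ".so"  # fallback assumption

print(shared_lib_name("mnist", "linux"))   # mnist.so
print(shared_lib_name("mnist", "darwin"))  # mnist.dylib
```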

PyCompile model API

The complete interface of OnnxMlirCompiler can be seen in the sources mentioned previously. However, using the constructor and the methods below is enough to compile models.

def __init__(self, file_name: str):
    """
    Constructor for an ONNX model contained in a file.
    Args:
        file_name: relative or absolute path to your ONNX model.
    """
def __init__(self, input_buffer: void *, buffer_size: int):
    """
    Constructor for an ONNX model contained in an input buffer.
    Args:
        input_buffer: buffer containing the protobuf representation of the model.
        buffer_size: byte size of the input buffer.
    """
def compile(self, flags: str):
    """
    Method to compile a model from a file.
    Args:
        flags: all the options users would like to set.
    Returns:
        Zero on success, error code on failure.
    """
def compile_from_array(self, output_base_name: str, target: OnnxMlirTarget):
    """
    Method to compile a model from an array.
    Args:
        output_base_name: base name (relative or absolute, without suffix)
        where the compiled model should be written into.
        target: target for the compiler's output. Typical values are
        OnnxMlirTarget.emit_lib or emit_jni.
    Returns:
        Zero on success, error code on failure.
    """
def get_compiled_file_name(self):
    """
    Method to provide the full (absolute or relative) output compiled file name, including
    its suffix.
    Returns:
        String containing the file name after successful compilation; empty string on failure.
    """
def get_error_message(self):
    """
    Method to provide the compilation error message.
    Returns:
        String containing the error message; empty string on success.
    """

The Python interface to compile and run models: PyCompileAndRuntime

Running the PyCompileAndRuntime interface

import numpy as np
from PyCompileAndRuntime import OMCompileExecutionSession

# Load onnx model and create OMCompileExecutionSession object.
inputFileName = './mnist.onnx'
# Set the full name of compiled model
sharedLibPath = './mnist.so'
# Set the compile option as "-O3"
session = OMCompileExecutionSession(inputFileName, sharedLibPath, "-O3")

# Print the models input/output signature, for display.
# Signature functions are for info only; comment them out if they cause problems.
session.print_input_signature()
session.print_output_signature()

# Do inference using the default entry point.
a = np.full((1, 1, 28, 28), 1, np.dtype(np.float32))
outputs = session.run(input=[a])

for output in outputs:
    print(output.shape)

PyCompileAndRuntime model API

PyCompileAndRuntime is a new class that combines compilation and execution. Its constructor takes an .onnx input file and compiles the model with the options given by the user; the model can then be run with given inputs.

def __init__(self, input_model_path: str, compiled_file_path: str, flags: str, use_default_entry_point: bool):
    """
    Constructor for an ONNX model contained in a file.
    Args:
        input_model_path: relative or absolute path to your ONNX model.
        compiled_file_path: relative or absolute path to your compiled file.
        flags: all the options users would like to set.
        use_default_entry_point: use the default entry point that is `run_main_graph` or not. Set to True by default.
    """
def get_compiled_result(self):
    """
    Method to provide the results of the compilation.
    Returns:
        Int containing the results. 0 represents successful compilation; others on failure.
    """
def get_compiled_file_name(self):
    """
    Method to provide the full (absolute or relative) output file name, including
    its suffix.
    Returns:
        String containing the file name after successful compilation; empty string on failure.
    """
def get_error_message(self):
    """
    Method to provide the compilation error message.
    Returns:
        String containing the error message; empty string on success.
    """
def entry_points(self) -> List[str]:
    """
    Returns:
        A list of entry point names.
    """
def set_entry_point(self, name: str):
    """
    Args:
        name: an entry point name.
    """
def run(self, input: List[ndarray]) -> List[ndarray]:
    """
    Args:
        input: A list of NumPy arrays, the inputs of your model.

    Returns:
        A list of NumPy arrays, the outputs of your model.
    """
def input_signature(self) -> str:
    """
    Returns:
        A string containing a JSON representation of the model's input signature.
    """

def output_signature(self) -> str:
    """
    Returns:
        A string containing a JSON representation of the model's output signature.
    """