onnx-mlir


Representation and reference lowering of ONNX models in the MLIR compiler infrastructure



This project is maintained by onnx


Using the Python interfaces

Onnx-mlir provides runtime utilities to compile and run ONNX models in Python. These utilities are implemented by the OnnxMlirCompiler compiler interface (include/OnnxMlirCompiler.h) and the ExecutionSession class (src/Runtime/ExecutionSession.hpp). Both utilities have an associated Python binding generated by the pybind library.

Configuring the Python interfaces

Using pybind, a C/C++ binary can be directly imported by the Python interpreter. For onnx-mlir, there are five such libraries: one to compile onnx-mlir models, two to run the models, and the other two to both compile and run the models.

  1. The shared library to compile onnx-mlir models is generated by PyOMCompileSession (src/Compiler/PyOMCompileSession.hpp) and is built as a shared library at build/Debug/lib/PyCompile.cpython-<target>.so.
  2. The shared library to run onnx-mlir models is generated by PyExecutionSession (src/Runtime/PyExecutionSession.hpp) and is built as a shared library at build/Debug/lib/PyRuntimeC.cpython-<target>.so.
  3. The Python library to run onnx-mlir models (src/Runtime/python/PyRuntime.py).
  4. The shared library to compile and run onnx-mlir models is generated by PyOMCompileExecutionSessionC (src/Runtime/PyOMCompileExecutionSession.hpp) and is built as a shared library at build/Debug/lib/PyCompileAndRuntimeC.cpython-<target>.so.
  5. The Python library to compile and run onnx-mlir models (src/Runtime/python/PyCompileAndRuntime.py). It takes a .onnx file and options as input, then loads, compiles, and runs the model.

The Python interpreter can import a module normally as long as that module is on your PYTHONPATH. An alternative is to create a symbolic link to it in your working directory.

cd <working directory>
ln -s <path to the shared library to compile onnx-mlir models>(e.g. `build/Debug/lib/PyCompile.cpython-<target>.so`) .
ln -s <path to the shared library to run onnx-mlir models>(e.g. `build/Debug/lib/PyRuntimeC.cpython-<target>.so`) .
ln -s <path to the Python library to run onnx-mlir models>(e.g. src/Runtime/python/PyRuntime.py) .
ln -s <path to the shared library to compile and run onnx-mlir models>(e.g. `build/Debug/lib/PyCompileAndRuntimeC.cpython-<target>.so`) .
ln -s <path to the Python library to compile and run onnx-mlir models>(e.g. src/Runtime/python/PyCompileAndRuntime.py) .
python3
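Alternatively, instead of symbolic links, the same directories can be put on PYTHONPATH. A minimal sketch, where the checkout location and the Debug build tree are assumptions to adjust to your setup:

```shell
# Assumption: ONNX_MLIR_HOME points at your onnx-mlir checkout; after a Debug
# build, the shared libraries live under build/Debug/lib and the Python
# helpers under src/Runtime/python.
export ONNX_MLIR_HOME=/path/to/onnx-mlir
export PYTHONPATH=$ONNX_MLIR_HOME/build/Debug/lib:$PYTHONPATH
export PYTHONPATH=$ONNX_MLIR_HOME/src/Runtime/python:$PYTHONPATH
```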

Python interface to run models: PyRuntime

Running the PyRuntime interface

An ONNX model is a computation graph, and often the graph has a single entry point to trigger the computation. Below is an example of doing inference on a model with a single entry point.

import numpy as np
from PyRuntime import OMExecutionSession

model = 'model.so' # LeNet from ONNX Zoo compiled with onnx-mlir

# Create a session for this model.
session = OMExecutionSession(shared_lib_path=model)
# Input and output signatures of the default entry point.
print("input signature in json", session.input_signature())
print("output signature in json", session.output_signature())
# Do inference using the default entry point.
a = np.full((1, 1, 28, 28), 1, np.dtype(np.float32))
outputs = session.run(input=[a])

for output in outputs:
    print(output.shape)

If the computation graph has multiple entry points, users have to set a specific entry point to do inference. Below is an example of doing inference on a model with multiple entry points.

import numpy as np
from PyRuntime import OMExecutionSession

model = 'multi-entry-points-model.so'

# Create a session for this model.
session = OMExecutionSession(shared_lib_path=model, use_default_entry_point=False) # False to manually set an entry point.

# Query entry points in the model.
entry_points = session.entry_points()

for entry_point in entry_points:
  # Set the entry point to do inference.
  session.set_entry_point(name=entry_point)
  # Input and output signatures of the current entry point.
  print("input signature in json", session.input_signature())
  print("output signature in json", session.output_signature())
  # Do inference using the current entry point.
  a = np.arange(10).astype('float32')
  b = np.arange(10).astype('float32')
  outputs = session.run(input=[a, b])
  for output in outputs:
    print(output.shape)

Using model tags

If a model was compiled with --tag, the value of --tag must be passed to OMExecutionSession. Tags are useful when there are multiple sessions for multiple models in the same Python script. Below is an example of doing multiple inferences using tags.

import numpy as np
from PyRuntime import OMExecutionSession

encoder_model = 'encoder/model.so' # Assumes the model was compiled with `--tag=encoder`
decoder_model = 'decoder/model.so' # Assumes the model was compiled with `--tag=decoder`

# Create a session for the encoder model.
encoder_sess = OMExecutionSession(shared_lib_path=encoder_model, tag="encoder")
# Create a session for the decoder model.
decoder_sess = OMExecutionSession(shared_lib_path=decoder_model, tag="decoder")

If the two models were not compiled with --tag, then they must be compiled with different .so file names if they are to be used in the same process. Indeed, when no tag is given, the file name is used as the default tag. Below is an example of doing multiple inferences without tags.

import numpy as np
from PyRuntime import OMExecutionSession

encoder_model = 'my_encoder.so'
decoder_model = 'my_decoder.so'

# Create a session for the encoder model.
encoder_sess = OMExecutionSession(shared_lib_path=encoder_model) # tag will be `my_encoder` by default.
# Create a session for the decoder model.
decoder_sess = OMExecutionSession(shared_lib_path=decoder_model) # tag will be `my_decoder` by default.
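The default-tag rule described above (the file name without its directory or extension) can be sketched as follows; `default_tag` is a hypothetical helper for illustration, not part of the PyRuntime API:

```python
import os

def default_tag(shared_lib_path):
    """Mimic the default-tag rule: the base file name without its extension."""
    base = os.path.basename(shared_lib_path)
    return os.path.splitext(base)[0]

print(default_tag('my_encoder.so'))     # → my_encoder
print(default_tag('decoder/model.so'))  # → model
```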

To use functions without tags, e.g. run_main_graph, set tag = "NONE".

PyRuntime model API

The complete interface of OMExecutionSession can be seen in the sources mentioned previously. However, using the constructor and the run method is enough to perform inference.

def __init__(self, shared_lib_path: str, tag: str, use_default_entry_point: bool):
    """
    Args:
        shared_lib_path: relative or absolute path to your .so model.
        tag: a string that was passed to `--tag` when compiling the .so model. By default, it is the output file name without its extension, namely, `filename` in `filename.so`
        use_default_entry_point: use the default entry point that is `run_main_graph_{tag}` or not. Set to True by default.
    """

def run(self, input: List[ndarray]) -> List[ndarray]:
    """
    Args:
        input: A list of NumPy arrays, the inputs of your model.

    Returns:
        A list of NumPy arrays, the outputs of your model.
    """

def input_signature(self) -> str:
    """
    Returns:
        A string containing a JSON representation of the model's input signature.
    """

def output_signature(self) -> str:
    """
    Returns:
        A string containing a JSON representation of the model's output signature.
    """

def entry_points(self) -> List[str]:
    """
    Returns:
        A list of entry point names.
    """

def set_entry_point(self, name: str):
    """
    Args:
        name: an entry point name.
    """
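The strings returned by input_signature() and output_signature() are JSON. Below is a minimal sketch of consuming one; the sample string and its field names are an assumption for illustration, so check the actual output of your session:

```python
import json

# Hypothetical signature string, shaped as a JSON list with one
# descriptor per tensor (assumed field names: name, type, dims).
sig = '[{"type": "f32", "dims": [1, 1, 28, 28], "name": "input"}]'

def parse_signature(sig_json):
    """Extract (name, type, dims) for each tensor in a signature string."""
    return [(t["name"], t["type"], t["dims"]) for t in json.loads(sig_json)]

print(parse_signature(sig))  # → [('input', 'f32', [1, 1, 28, 28])]
```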

Python interface to compile models: PyCompile

Running the PyCompile interface

An ONNX model can be compiled directly from the command line. The resulting library can then be executed using Python as shown in the previous sections. At times it may be convenient to compile a model directly in Python. This section explores the Python methods to do so.

The OMCompileSession object takes a file name at construction time. For compilation, the compile() method takes a flags string as input, which overrides any default options set through environment variables.

from PyCompile import OMCompileSession

# Load the onnx model and create the OMCompileSession object.
file = './mnist.onnx'
compiler = OMCompileSession(file)
# Generate the library file; compilation succeeds when rc == 0.
# The optimization level is set here with "-O3".
rc = compiler.compile("-O3")
if rc:
    print("Failed to compile with error code", rc)
    exit(1)
# Get the output file name.
model = compiler.get_compiled_file_name()
print("Compiled onnx file", file, "to", model, "with rc", rc)

The PyCompile module exports the OMCompileSession class to drive the compilation of an ONNX model into an executable model. Typically, a compiler object is created for a given model by providing the file name of the ONNX model. Then, all the compiler options can be set as one full std::string to generate the desired executable. Finally, the compilation itself is performed by calling the compile() command, to which the user passes the options string as input.

The compile() command returns a code reflecting the status of the compilation. A zero value indicates success; nonzero values are error codes. Because different operating systems may use different suffixes for library files, the output file name can be retrieved using the get_compiled_file_name() method.
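To illustrate why the suffix cannot be hard-coded, here is a sketch of typical shared-library suffixes per platform; the exact name onnx-mlir actually emits is whatever get_compiled_file_name() reports:

```python
import sys

def shared_lib_suffix(platform=sys.platform):
    """Typical shared-library suffix per platform (illustrative only)."""
    if platform.startswith("win"):
        return ".dll"
    if platform == "darwin":
        return ".dylib"
    return ".so"  # Linux and most Unix-like systems
```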

PyCompile model API

The complete interface of OnnxMlirCompiler can be seen in the sources mentioned previously. However, using the constructor and the methods below is enough to compile models.

def __init__(self, file_name: str):
    """
    Constructor for an ONNX model contained in a file.
    Args:
        file_name: relative or absolute path to your ONNX model.
    """
def __init__(self, input_buffer: void *, buffer_size: int):
    """
    Constructor for an ONNX model contained in an input buffer.
    Args:
        input_buffer: buffer containing the protobuf representation of the model.
        buffer_size: byte size of the input buffer.
    """
def compile(self, flags: str):
    """
    Method to compile a model from a file.
    Args:
        flags: all the options users would like to set.
    Returns:
        Zero on success, error code on failure.
    """
def compile_from_array(self, output_base_name: str, target: OnnxMlirTarget):
    """
    Method to compile a model from an array.
    Args:
        output_base_name: base name (relative or absolute, without suffix)
        where the compiled model should be written into.
        target: target for the compiler's output. Typical values are
        OnnxMlirTarget.emit_lib or emit_jni.
    Returns:
        Zero on success, error code on failure.
    """
def get_compiled_file_name(self):
    """
    Method to provide the full (absolute or relative) output compiled file name, including
    its suffix.
    Returns:
        String containing the file name after successful compilation; empty string on failure.
    """
def get_error_message(self):
    """
    Method to provide the compilation error message.
    Returns:
        String containing the error message; empty string on success.
    """

Python interface to compile and run models: PyCompileAndRuntime

Running the PyCompileAndRuntime interface

import numpy as np
from PyCompileAndRuntime import OMCompileExecutionSession

# Load the onnx model and create the OMCompileExecutionSession object.
inputFileName = './mnist.onnx'
# Set the full name of the compiled model.
sharedLibPath = './mnist.so'
# Set the compile option to "-O3".
session = OMCompileExecutionSession(inputFileName, sharedLibPath, "-O3")

# Print the model's input/output signatures, for display.
# These signature functions are informational only; comment them out if they cause problems.
session.print_input_signature()
session.print_output_signature()

# Do inference using the default entry point.
a = np.full((1, 1, 28, 28), 1, np.dtype(np.float32))
outputs = session.run(input=[a])

for output in outputs:
    print(output.shape)

PyCompileAndRuntime model API

PyCompileAndRuntime is a new class that combines the compilation and execution functionality. Its constructor takes a .onnx input file, compiles the model with the options given by the user, and then runs the model with inputs.

def __init__(self, input_model_path: str, compiled_file_path: str, flags: str, use_default_entry_point: bool):
    """
    Constructor for an ONNX model contained in a file.
    Args:
        input_model_path: relative or absolute path to your ONNX model.
        compiled_file_path: relative or absolute path to your compiled file.
        flags: all the options users would like to set.
        use_default_entry_point: use the default entry point that is `run_main_graph` or not. Set to True by default.
    """
def get_compiled_result(self):
    """
    Method to provide the result of the compilation.
    Returns:
        Int containing the result: zero indicates successful compilation; other values indicate failure.
    """
def get_compiled_file_name(self):
    """
    Method to provide the full (absolute or relative) output file name, including
    its suffix.
    Returns:
        String containing the file name after successful compilation; empty string on failure.
    """
def get_error_message(self):
    """
    Method to provide the compilation error message.
    Returns:
        String containing the error message; empty string on success.
    """
def entry_points(self) -> List[str]:
    """
    Returns:
        A list of entry point names.
    """
def set_entry_point(self, name: str):
    """
    Args:
        name: an entry point name.
    """
def run(self, input: List[ndarray]) -> List[ndarray]:
    """
    Args:
        input: A list of NumPy arrays, the inputs of your model.

    Returns:
        A list of NumPy arrays, the outputs of your model.
    """
def input_signature(self) -> str:
    """
    Returns:
        A string containing a JSON representation of the model's input signature.
    """

def output_signature(self) -> str:
    """
    Returns:
        A string containing a JSON representation of the model's output signature.
    """