onnx-mlir

Representation and reference lowering of ONNX models in the MLIR compiler infrastructure


Using the Python interfaces

Onnx-mlir has runtime utilities to compile and run ONNX models in Python. These utilities are implemented by the OnnxMlirCompiler compiler interface (include/OnnxMlirCompiler.h) and the ExecutionSession class (src/Runtime/ExecutionSession.hpp). Both utilities have an associated Python binding generated by the pybind library.

Configuring the Python interfaces

Using pybind, a C/C++ binary can be directly imported by the Python interpreter. For onnx-mlir, there are five such libraries: one to compile onnx-mlir models, two to run the models, and the other two to compile and run the models.

  1. The shared library to compile onnx-mlir models is generated by PyOMCompileSession (src/Compiler/PyOMCompileSession.hpp) and built as a shared library to build/Debug/lib/PyCompile.cpython-<target>.so.
  2. The shared library to run onnx-mlir models is generated by PyExecutionSession (src/Runtime/PyExecutionSession.hpp) and built as a shared library to build/Debug/lib/PyRuntimeC.cpython-<target>.so.
  3. The Python library to run onnx-mlir models (src/Runtime/python/PyRuntime.py).
  4. The shared library to compile and run onnx-mlir models is generated by PyOMCompileExecutionSessionC (src/Runtime/PyOMCompileExecutionSession.hpp) and built as a shared library to build/Debug/lib/PyCompileAndRuntimeC.cpython-<target>.so.
  5. The Python library to compile and run onnx-mlir models (src/Runtime/python/PyCompileAndRuntime.py). This library takes a .onnx file and compile options as inputs; it loads the model, then compiles and runs it.

The Python interpreter can import these modules normally as long as they are in your PYTHONPATH. An alternative is to create symbolic links to them in your working directory.
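For the PYTHONPATH route, a minimal setup might look like the following sketch; the paths are taken from the build layout listed above, and `ONNX_MLIR_ROOT` is a placeholder for your own checkout location.

```shell
# Sketch: make both the pybind shared libraries and the Python wrappers
# importable. ONNX_MLIR_ROOT is an assumed variable pointing to your checkout.
export ONNX_MLIR_ROOT=/path/to/onnx-mlir
export PYTHONPATH=$ONNX_MLIR_ROOT/build/Debug/lib:$ONNX_MLIR_ROOT/src/Runtime/python:$PYTHONPATH
```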

cd <working directory>
ln -s <path to the shared library to compile onnx-mlir models>(e.g. `build/Debug/lib/PyCompile.cpython-<target>.so`) .
ln -s <path to the shared library to run onnx-mlir models>(e.g. `build/Debug/lib/PyRuntimeC.cpython-<target>.so`) .
ln -s <path to the Python library to run onnx-mlir models>(e.g. src/Runtime/python/PyRuntime.py) .
ln -s <path to the shared library to compile and run onnx-mlir models>(e.g. `build/Debug/lib/PyCompileAndRuntimeC.cpython-<target>.so`) .
ln -s <path to the Python library to compile and run onnx-mlir models>(e.g. src/Runtime/python/PyCompileAndRuntime.py) .
python3

Python interface to run models: PyRuntime

Running the PyRuntime interface

An ONNX model is a computation graph, and often the graph has a single entry point to trigger the computation. Below is an example of doing inference for a model with a single entry point.

import numpy as np
from PyRuntime import OMExecutionSession

model = 'model.so' # LeNet from ONNX Zoo compiled with onnx-mlir

# Create a session for this model.
session = OMExecutionSession(shared_lib_path=model)
# Input and output signatures of the default entry point.
print("input signature in json", session.input_signature())
print("output signature in json", session.output_signature())
# Do inference using the default entry point.
a = np.full((1, 1, 28, 28), 1, np.dtype(np.float32))
outputs = session.run(input=[a])

for output in outputs:
    print(output.shape)

If the computation graph has multiple entry points, users have to set a specific entry point to do inference. Below is an example of doing inference with multiple entry points.

import numpy as np
from PyRuntime import OMExecutionSession

model = 'multi-entry-points-model.so'

# Create a session for this model.
session = OMExecutionSession(shared_lib_path=model, use_default_entry_point=False) # False to manually set an entry point.

# Query entry points in the model.
entry_points = session.entry_points()

for entry_point in entry_points:
  # Set the entry point to do inference.
  session.set_entry_point(name=entry_point)
  # Input and output signatures of the current entry point.
  print("input signature in json", session.input_signature())
  print("output signature in json", session.output_signature())
  # Do inference using the current entry point.
  a = np.arange(10).astype('float32')
  b = np.arange(10).astype('float32')
  outputs = session.run(input=[a, b])
  for output in outputs:
    print(output.shape)

Using model tags

If a model was compiled with --tag, the value of --tag must be passed to OMExecutionSession. Tags are useful when there are multiple sessions for multiple models in the same python script. Below is an example of doing multiple inferences using tags.

import numpy as np
from PyRuntime import OMExecutionSession

encoder_model = 'encoder/model.so' # Assumed that the model was compiled using `--tag=encoder`
decoder_model = 'decoder/model.so' # Assumed that the model was compiled using `--tag=decoder`

# Create a session for the encoder model.
encoder_sess = OMExecutionSession(shared_lib_path=encoder_model, tag="encoder")
# Create a session for the decoder model.
decoder_sess = OMExecutionSession(shared_lib_path=decoder_model, tag="decoder")

If the two models were not compiled with --tag, they must be compiled with different .so filenames in order to be used in the same process. Indeed, when no tag is given, the filename is used as the default tag. Below is an example of doing multiple inferences without using tags.

import numpy as np
from PyRuntime import OMExecutionSession

encoder_model = 'my_encoder.so'
decoder_model = 'my_decoder.so'

# Create a session for the encoder model.
encoder_sess = OMExecutionSession(shared_lib_path=encoder_model) # tag will be `my_encoder` by default.
# Create a session for the decoder model.
decoder_sess = OMExecutionSession(shared_lib_path=decoder_model) # tag will be `my_decoder` by default.

To use a function without a tag, e.g. run_main_graph, set tag = "NONE".
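As a rough illustration of the naming scheme, the behavior described above can be sketched as a small helper. This is an assumption drawn from the constructor documentation (default entry point `run_main_graph_{tag}`, with `tag = "NONE"` selecting the untagged `run_main_graph`); the helper name is hypothetical, not part of the PyRuntime API.

```python
def default_entry_point(tag: str) -> str:
    """Sketch of the default entry-point name for a given tag.

    Assumption: the default entry point is `run_main_graph_{tag}`,
    and tag "NONE" selects the untagged `run_main_graph`.
    """
    if tag == "NONE":
        return "run_main_graph"
    return f"run_main_graph_{tag}"

print(default_entry_point("encoder"))  # run_main_graph_encoder
print(default_entry_point("NONE"))     # run_main_graph
```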

PyRuntime model API

The complete interface to OMExecutionSession can be seen in the sources mentioned previously. However, using the constructor and the run method is enough to perform inferences.

def __init__(self, shared_lib_path: str, tag: str, use_default_entry_point: bool):
    """
    Args:
        shared_lib_path: relative or absolute path to your .so model.
        tag: a string that was passed to `--tag` when compiling the .so model. By default, it is the output file name without its extension, namely, `filename` in `filename.so`
        use_default_entry_point: use the default entry point that is `run_main_graph_{tag}` or not. Set to True by default.
    """

def run(self, input: List[ndarray]) -> List[ndarray]:
    """
    Args:
        input: A list of NumPy arrays, the inputs of your model.

    Returns:
        A list of NumPy arrays, the outputs of your model.
    """

def input_signature(self) -> str:
    """
    Returns:
        A string containing a JSON representation of the model's input signature.
    """

def output_signature(self) -> str:
    """
    Returns:
        A string containing a JSON representation of the model's output signature.
    """

def entry_points(self) -> List[str]:
    """
    Returns:
        A list of entry point names.
    """

def set_entry_point(self, name: str):
    """
    Args:
        name: an entry point name.
    """

Python interface to compile models: PyCompile

Running the PyCompile interface

An ONNX model can be compiled directly from the command line. The resulting library can then be executed using Python as shown in the previous section. At times, it might be convenient to also compile a model directly in Python. This section explores the Python methods to do so.

The OMCompileSession object takes a file name at construction time. For the compilation, compile() takes a flags string as input, which overrides any default options set from the environment variable.

import numpy as np
from PyCompile import OMCompileSession

# Load onnx model and create OMCompileSession object.
file = './mnist.onnx'
compiler = OMCompileSession(file)
# Generate the library file. Success when rc == 0; here the optimization option is set to "-O3".
rc = compiler.compile("-O3")
# Get the output file name
model = compiler.get_compiled_file_name()
if rc:
    print("Failed to compile with error code", rc)
    exit(1)
print("Compiled onnx file", file, "to", model, "with rc", rc)

The PyCompile module exports the OMCompileSession class to drive the compilation of an ONNX model into an executable model. Typically, a compiler object is created for a given model by providing the filename of the ONNX model. Then, all the compiler options can be set as a single std::string to generate the desired executable. Finally, the compilation itself is performed by calling compile(), to which the user passes the option string as input.

The compile() command returns a code reflecting the status of the compilation. A zero value indicates success; nonzero values are error codes. Because different operating systems may use different suffixes for libraries, the output filename can be retrieved using the get_compiled_file_name() method.
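The return-code convention above can be wrapped in a small helper. This is only a sketch: it assumes a compiler object exposing the compile(), get_compiled_file_name(), and get_error_message() methods documented below, and the helper name compile_or_raise is hypothetical.

```python
def compile_or_raise(compiler, flags: str = "-O3") -> str:
    """Compile a model and return the output file name, raising on failure.

    `compiler` is assumed to expose compile(), get_compiled_file_name(),
    and get_error_message() as described in the PyCompile API.
    """
    rc = compiler.compile(flags)
    if rc != 0:
        # Nonzero rc is an error code; surface the compiler's message.
        raise RuntimeError(
            f"onnx-mlir compilation failed (rc={rc}): "
            f"{compiler.get_error_message()}")
    return compiler.get_compiled_file_name()
```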

PyCompile model API

The complete interface to OnnxMlirCompiler can be seen in the sources mentioned previously. However, using the constructor and the methods below is enough to compile models.

def __init__(self, file_name: str):
    """
    Constructor for an ONNX model contained in a file.
    Args:
        file_name: relative or absolute path to your ONNX model.
    """
def __init__(self, input_buffer: void *, buffer_size: int):
    """
    Constructor for an ONNX model contained in an input buffer.
    Args:
        input_buffer: buffer containing the protobuf representation of the model.
        buffer_size: byte size of the input buffer.
    """
def compile(self, flags: str):
    """
    Method to compile a model from a file.
    Args:
        flags: all the options users would like to set.
    Returns:
        Zero on success, error code on failure.
    """
def compile_from_array(self, output_base_name: str, target: OnnxMlirTarget):
    """
    Method to compile a model from an array.
    Args:
        output_base_name: base name (relative or absolute, without suffix)
        where the compiled model should be written into.
        target: target for the compiler's output. Typical values are
        OnnxMlirTarget.emit_lib or emit_jni.
    Returns:
        Zero on success, error code on failure.
    """
def get_compiled_file_name(self):
    """
    Method to provide the full (absolute or relative) output compiled file name, including
    its suffix.
    Returns:
        String containing the file name after successful compilation; empty string on failure.
    """
def get_error_message(self):
    """
    Method to provide the compilation error message.
    Returns:
        String containing the error message; empty string on success.
    """

Python interface to compile and run models: PyCompileAndRuntime

Running the PyCompileAndRuntime interface

import numpy as np
from PyCompileAndRuntime import OMCompileExecutionSession

# Load onnx model and create OMCompileExecutionSession object.
inputFileName = './mnist.onnx'
# Set the full name of compiled model
sharedLibPath = './mnist.so'
# Set the compile option as "-O3"
session = OMCompileExecutionSession(inputFileName, sharedLibPath, "-O3")

# Print the models input/output signature, for display.
# Signature functions are for info only; comment them out if they cause problems.
session.print_input_signature()
session.print_output_signature()

# Do inference using the default entry point.
a = np.full((1, 1, 28, 28), 1, np.dtype(np.float32))
outputs = session.run(input=[a])

for output in outputs:
    print(output.shape)

PyCompileAndRuntime model API

PyCompileAndRuntime is a class that combines compilation and execution. Its constructor takes a .onnx input file and compiles the model with the options given by the user; the model can then be run with given inputs.

def __init__(self, input_model_path: str, compiled_file_path: str, flags: str, use_default_entry_point: bool):
    """
    Constructor for an ONNX model contained in a file.
    Args:
        input_model_path: relative or absolute path to your ONNX model.
        compiled_file_path: relative or absolute path to your compiled file.
        flags: all the options users would like to set.
        use_default_entry_point: use the default entry point that is `run_main_graph` or not. Set to True by default.
    """
def get_compiled_result(self):
    """
    Method to provide the results of the compilation.
    Returns:
        Int containing the results. 0 represents successful compilation; others on failure.
    """
def get_compiled_file_name(self):
    """
    Method to provide the full (absolute or relative) output file name, including
    its suffix.
    Returns:
        String containing the file name after successful compilation; empty string on failure.
    """
def get_error_message(self):
    """
    Method to provide the compilation error message.
    Returns:
        String containing the error message; empty string on success.
    """
def entry_points(self) -> List[str]:
    """
    Returns:
        A list of entry point names.
    """
def set_entry_point(self, name: str):
    """
    Args:
        name: an entry point name.
    """
def run(self, input: List[ndarray]) -> List[ndarray]:
    """
    Args:
        input: A list of NumPy arrays, the inputs of your model.

    Returns:
        A list of NumPy arrays, the outputs of your model.
    """
def input_signature(self) -> str:
    """
    Returns:
        A string containing a JSON representation of the model's input signature.
    """

def output_signature(self) -> str:
    """
    Returns:
        A string containing a JSON representation of the model's output signature.
    """