onnx-mlir


Representation and reference lowering of ONNX models in the MLIR compiler infrastructure



This project is maintained by onnx


Using the Python interfaces

Onnx-mlir provides runtime utilities to compile and run ONNX models in Python. These utilities are implemented by the OnnxMlirCompiler compiler interface (include/OnnxMlirCompiler.h) and the ExecutionSession class (src/Runtime/ExecutionSession.hpp). Both utilities have an associated Python binding generated by the pybind library.

Configuring the Python interfaces

Using pybind, a C/C++ binary can be directly imported by the Python interpreter. For onnx-mlir, there are five such libraries: one to compile onnx-mlir models, two to run the models, and the other two to both compile and run the models.

  1. The shared library to compile onnx-mlir models is generated by PyOMCompileSession (src/Compiler/PyOMCompileSession.hpp) and is built as a shared library at build/Debug/lib/PyCompile.cpython-<target>.so.
  2. The shared library to run onnx-mlir models is generated by PyExecutionSession (src/Runtime/PyExecutionSession.hpp) and is built as a shared library at build/Debug/lib/PyRuntimeC.cpython-<target>.so.
  3. The Python library to run onnx-mlir models (src/Runtime/python/PyRuntime.py).
  4. The shared library to compile and run onnx-mlir models is generated by PyOMCompileExecutionSessionC (src/Runtime/PyOMCompileExecutionSession.hpp) and is built as a shared library at build/Debug/lib/PyCompileAndRuntimeC.cpython-<target>.so.
  5. The Python library to compile and run onnx-mlir models (src/Runtime/python/PyCompileAndRuntime.py). It takes a .onnx file and options as input, then loads, compiles, and runs the model.

The Python interpreter can import a module normally as long as that module is on your PYTHONPATH. An alternative is to create a symbolic link to it in your working directory.

cd <working directory>
ln -s <path to the shared library to compile onnx-mlir models>(e.g. `build/Debug/lib/PyCompile.cpython-<target>.so`) .
ln -s <path to the shared library to run onnx-mlir models>(e.g. `build/Debug/lib/PyRuntimeC.cpython-<target>.so`) .
ln -s <path to the Python library to run onnx-mlir models>(e.g. src/Runtime/python/PyRuntime.py) .
ln -s <path to the shared library to compile and run onnx-mlir models>(e.g. `build/Debug/lib/PyCompileAndRuntimeC.cpython-<target>.so`) .
ln -s <path to the Python library to compile and run onnx-mlir models>(e.g. src/Runtime/python/PyCompileAndRuntime.py) .
python3
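Alternatively, instead of symbolic links, the same directories can be put on PYTHONPATH. A minimal sketch, where the checkout location and the Debug build tree are assumptions to adjust to your setup:

```shell
# Assumption: ONNX_MLIR_HOME points at your onnx-mlir checkout; after a Debug
# build, the shared libraries live under build/Debug/lib and the Python
# helpers under src/Runtime/python.
export ONNX_MLIR_HOME=/path/to/onnx-mlir
export PYTHONPATH=$ONNX_MLIR_HOME/build/Debug/lib:$PYTHONPATH
export PYTHONPATH=$ONNX_MLIR_HOME/src/Runtime/python:$PYTHONPATH
```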

Python interface to run models: PyRuntime

Running the PyRuntime interface

An ONNX model is a computation graph, and often the graph has a single entry point to trigger the computation. Below is an example of doing inference on a model with a single entry point.

import numpy as np
from PyRuntime import OMExecutionSession

model = 'model.so' # LeNet from ONNX Zoo compiled with onnx-mlir

# Create a session for this model.
session = OMExecutionSession(shared_lib_path=model)
# Input and output signatures of the default entry point.
print("input signature in json", session.input_signature())
print("output signature in json", session.output_signature())
# Do inference using the default entry point.
a = np.full((1, 1, 28, 28), 1, np.dtype(np.float32))
outputs = session.run(input=[a])

for output in outputs:
    print(output.shape)

If the computation graph has multiple entry points, users have to set a specific entry point to do inference. Below is an example of doing inference on a model with multiple entry points.

import numpy as np
from PyRuntime import OMExecutionSession

model = 'multi-entry-points-model.so'

# Create a session for this model.
session = OMExecutionSession(shared_lib_path=model, use_default_entry_point=False) # False to manually set an entry point.

# Query entry points in the model.
entry_points = session.entry_points()

for entry_point in entry_points:
  # Set the entry point to do inference.
  session.set_entry_point(name=entry_point)
  # Input and output signatures of the current entry point.
  print("input signature in json", session.input_signature())
  print("output signature in json", session.output_signature())
  # Do inference using the current entry point.
  a = np.arange(10).astype('float32')
  b = np.arange(10).astype('float32')
  outputs = session.run(input=[a, b])
  for output in outputs:
    print(output.shape)

Using model tags

If a model was compiled with --tag, the value of --tag must be passed to OMExecutionSession. Tags are useful when there are multiple sessions for multiple models in the same Python script. Below is an example of doing multiple inferences using tags.

import numpy as np
from PyRuntime import OMExecutionSession

encoder_model = 'encoder/model.so' # Assumes the model was compiled with `--tag=encoder`
decoder_model = 'decoder/model.so' # Assumes the model was compiled with `--tag=decoder`

# Create a session for the encoder model.
encoder_sess = OMExecutionSession(shared_lib_path=encoder_model, tag="encoder")
# Create a session for the decoder model.
decoder_sess = OMExecutionSession(shared_lib_path=decoder_model, tag="decoder")

If the two models were not compiled with --tag, then they must be compiled with different .so file names if they are to be used in the same process. Indeed, when no tag is given, the file name is used as the default tag. Below is an example of doing multiple inferences without tags.

import numpy as np
from PyRuntime import OMExecutionSession

encoder_model = 'my_encoder.so'
decoder_model = 'my_decoder.so'

# Create a session for the encoder model.
encoder_sess = OMExecutionSession(shared_lib_path=encoder_model) # tag will be `my_encoder` by default.
# Create a session for the decoder model.
decoder_sess = OMExecutionSession(shared_lib_path=decoder_model) # tag will be `my_decoder` by default.
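The default-tag rule described above (the file name without its directory or extension) can be sketched as follows; `default_tag` is a hypothetical helper for illustration, not part of the PyRuntime API:

```python
import os

def default_tag(shared_lib_path):
    """Mimic the default-tag rule: the base file name without its extension."""
    base = os.path.basename(shared_lib_path)
    return os.path.splitext(base)[0]

print(default_tag('my_encoder.so'))     # → my_encoder
print(default_tag('decoder/model.so'))  # → model
```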

To use functions without tags, e.g. run_main_graph, set tag = "NONE".

PyRuntime model API

The complete interface of OMExecutionSession can be seen in the sources mentioned previously. However, using the constructor and the run method is enough to perform inference.

def __init__(self, shared_lib_path: str, tag: str, use_default_entry_point: bool):
    """
    Args:
        shared_lib_path: relative or absolute path to your .so model.
        tag: a string that was passed to `--tag` when compiling the .so model. By default, it is the output file name without its extension, namely, `filename` in `filename.so`
        use_default_entry_point: use the default entry point that is `run_main_graph_{tag}` or not. Set to True by default.
    """

def run(self, input: List[ndarray]) -> List[ndarray]:
    """
    Args:
        input: A list of NumPy arrays, the inputs of your model.

    Returns:
        A list of NumPy arrays, the outputs of your model.
    """

def input_signature(self) -> str:
    """
    Returns:
        A string containing a JSON representation of the model's input signature.
    """

def output_signature(self) -> str:
    """
    Returns:
        A string containing a JSON representation of the model's output signature.
    """

def entry_points(self) -> List[str]:
    """
    Returns:
        A list of entry point names.
    """

def set_entry_point(self, name: str):
    """
    Args:
        name: an entry point name.
    """
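The strings returned by input_signature() and output_signature() are JSON. Below is a minimal sketch of consuming one; the sample string and its field names are an assumption for illustration, so check the actual output of your session:

```python
import json

# Hypothetical signature string, shaped as a JSON list with one
# descriptor per tensor (assumed field names: name, type, dims).
sig = '[{"type": "f32", "dims": [1, 1, 28, 28], "name": "input"}]'

def parse_signature(sig_json):
    """Extract (name, type, dims) for each tensor in a signature string."""
    return [(t["name"], t["type"], t["dims"]) for t in json.loads(sig_json)]

print(parse_signature(sig))  # → [('input', 'f32', [1, 1, 28, 28])]
```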

Python interface to compile models: PyCompile

Running the PyCompile interface

An ONNX model can be compiled directly from the command line. The resulting library can then be executed using Python as shown in the previous sections. At times it may be convenient to compile a model directly in Python. This section explores the Python methods to do so.

The OMCompileSession object takes a file name at construction time. For compilation, the compile() method takes a flags string as input, which overrides any default options set through environment variables.

from PyCompile import OMCompileSession

# Load the onnx model and create the OMCompileSession object.
file = './mnist.onnx'
compiler = OMCompileSession(file)
# Generate the library file; compilation succeeds when rc == 0.
# The optimization level is set here with "-O3".
rc = compiler.compile("-O3")
if rc:
    print("Failed to compile with error code", rc)
    exit(1)
# Get the output file name.
model = compiler.get_compiled_file_name()
print("Compiled onnx file", file, "to", model, "with rc", rc)

The PyCompile module exports the OMCompileSession class to drive the compilation of an ONNX model into an executable model. Typically, a compiler object is created for a given model by providing the file name of the ONNX model. Then, all the compiler options can be set as one full std::string to generate the desired executable. Finally, the compilation itself is performed by calling the compile() command, to which the user passes the options string as input.

The compile() command returns a code reflecting the status of the compilation. A zero value indicates success; nonzero values are error codes. Because different operating systems may use different suffixes for library files, the output file name can be retrieved using the get_compiled_file_name() method.
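To illustrate why the suffix cannot be hard-coded, here is a sketch of typical shared-library suffixes per platform; the exact name onnx-mlir actually emits is whatever get_compiled_file_name() reports:

```python
import sys

def shared_lib_suffix(platform=sys.platform):
    """Typical shared-library suffix per platform (illustrative only)."""
    if platform.startswith("win"):
        return ".dll"
    if platform == "darwin":
        return ".dylib"
    return ".so"  # Linux and most Unix-like systems
```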

PyCompile model API

The complete interface of OnnxMlirCompiler can be seen in the sources mentioned previously. However, using the constructor and the methods below is enough to compile models.

def __init__(self, file_name: str):
    """
    Constructor for an ONNX model contained in a file.
    Args:
        file_name: relative or absolute path to your ONNX model.
    """
def __init__(self, input_buffer: void *, buffer_size: int):
    """
    Constructor for an ONNX model contained in an input buffer.
    Args:
        input_buffer: buffer containing the protobuf representation of the model.
        buffer_size: byte size of the input buffer.
    """
def compile(self, flags: str):
    """
    Method to compile a model from a file.
    Args:
        flags: all the options users would like to set.
    Returns:
        Zero on success, error code on failure.
    """
def compile_from_array(self, output_base_name: str, target: OnnxMlirTarget):
    """
    Method to compile a model from an array.
    Args:
        output_base_name: base name (relative or absolute, without suffix)
        where the compiled model should be written into.
        target: target for the compiler's output. Typical values are
        OnnxMlirTarget.emit_lib or emit_jni.
    Returns:
        Zero on success, error code on failure.
    """
def get_compiled_file_name(self):
    """
    Method to provide the full (absolute or relative) output compiled file name, including
    its suffix.
    Returns:
        String containing the file name after successful compilation; empty string on failure.
    """
def get_error_message(self):
    """
    Method to provide the compilation error message.
    Returns:
        String containing the error message; empty string on success.
    """

Python interface to compile and run models: PyCompileAndRuntime

Running the PyCompileAndRuntime interface

import numpy as np
from PyCompileAndRuntime import OMCompileExecutionSession

# Load the onnx model and create the OMCompileExecutionSession object.
inputFileName = './mnist.onnx'
# Set the full name of the compiled model.
sharedLibPath = './mnist.so'
# Set the compile option to "-O3".
session = OMCompileExecutionSession(inputFileName, sharedLibPath, "-O3")

# Print the model's input/output signatures, for display.
# These signature functions are informational only; comment them out if they cause problems.
session.print_input_signature()
session.print_output_signature()

# Do inference using the default entry point.
a = np.full((1, 1, 28, 28), 1, np.dtype(np.float32))
outputs = session.run(input=[a])

for output in outputs:
    print(output.shape)

PyCompileAndRuntime model API

PyCompileAndRuntime is a new class that combines the compilation and execution functionality. Its constructor takes a .onnx input file, compiles the model with the options given by the user, and then runs the model with inputs.

def __init__(self, input_model_path: str, compiled_file_path: str, flags: str, use_default_entry_point: bool):
    """
    Constructor for an ONNX model contained in a file.
    Args:
        input_model_path: relative or absolute path to your ONNX model.
        compiled_file_path: relative or absolute path to your compiled file.
        flags: all the options users would like to set.
        use_default_entry_point: use the default entry point that is `run_main_graph` or not. Set to True by default.
    """
def get_compiled_result(self):
    """
    Method to provide the result of the compilation.
    Returns:
        Int containing the result: zero indicates successful compilation; other values indicate failure.
    """
def get_compiled_file_name(self):
    """
    Method to provide the full (absolute or relative) output file name, including
    its suffix.
    Returns:
        String containing the file name after successful compilation; empty string on failure.
    """
def get_error_message(self):
    """
    Method to provide the compilation error message.
    Returns:
        String containing the error message; empty string on success.
    """
def entry_points(self) -> List[str]:
    """
    Returns:
        A list of entry point names.
    """
def set_entry_point(self, name: str):
    """
    Args:
        name: an entry point name.
    """
def run(self, input: List[ndarray]) -> List[ndarray]:
    """
    Args:
        input: A list of NumPy arrays, the inputs of your model.

    Returns:
        A list of NumPy arrays, the outputs of your model.
    """
def input_signature(self) -> str:
    """
    Returns:
        A string containing a JSON representation of the model's input signature.
    """

def output_signature(self) -> str:
    """
    Returns:
        A string containing a JSON representation of the model's output signature.
    """