Representation and Reference Lowering of ONNX Models in the MLIR Compiler Infrastructure
Onnx-mlir has runtime utilities to compile and run ONNX models in Python. These utilities are implemented by the OnnxMlirCompiler compiler interface (include/OnnxMlirCompiler.h) and the ExecutionSession class (src/Runtime/ExecutionSession.hpp). Both utilities have an associated Python binding generated with the pybind library.

Using pybind, a C/C++ binary can be directly imported by the Python interpreter. For onnx-mlir, there are five such libraries: one to compile onnx-mlir models, two to run the models, and the other two to both compile and run the models.
1. The compilation library is generated from PyOMCompileSession (src/Compiler/PyOMCompileSession.hpp) and built as a shared library at `build/Debug/lib/PyCompile.cpython-<target>.so`.
2. The execution library is generated from PyExecutionSession (src/Runtime/PyExecutionSession.hpp) and built as a shared library at `build/Debug/lib/PyRuntimeC.cpython-<target>.so`.
3. The compile-and-run library is generated from PyOMCompileExecutionSession (src/Runtime/PyOMCompileExecutionSession.hpp) and built as a shared library at `build/Debug/lib/PyCompileAndRuntimeC.cpython-<target>.so`.
The Python interpreter can import these modules as usual, provided they are in your PYTHONPATH. An alternative is to create symbolic links to them in your working directory.
cd <working directory>
ln -s <path to the shared library to compile onnx-mlir models>(e.g. `build/Debug/lib/PyCompile.cpython-<target>.so`) .
ln -s <path to the shared library to run onnx-mlir models>(e.g. `build/Debug/lib/PyRuntimeC.cpython-<target>.so`) .
ln -s <path to the Python library to run onnx-mlir models>(e.g. src/Runtime/python/PyRuntime.py) .
ln -s <path to the shared library to compile and run onnx-mlir models>(e.g. `build/Debug/lib/PyCompileAndRuntimeC.cpython-<target>.so`) .
ln -s <path to the Python library to compile and run onnx-mlir models>(e.g. src/Runtime/python/PyCompileAndRuntime.py) .
python3
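Equivalently, instead of symbolic links, the directories containing these libraries can be put on the module search path. A minimal sketch, assuming the default Debug build layout shown above (the onnx-mlir checkout path is a placeholder):

import sys

# Extend sys.path, the in-process equivalent of PYTHONPATH.
sys.path.append('<onnx-mlir root>/build/Debug/lib')     # PyCompile, PyRuntimeC, PyCompileAndRuntimeC
sys.path.append('<onnx-mlir root>/src/Runtime/python')  # PyRuntime.py, PyCompileAndRuntime.py

from PyRuntime import OMExecutionSession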
An ONNX model is a computation graph, and often the graph has a single entry point to trigger the computation. Below is an example of doing inference for a model that has a single entry point.
import numpy as np
from PyRuntime import OMExecutionSession
model = 'model.so' # LeNet from ONNX Zoo compiled with onnx-mlir
# Create a session for this model.
session = OMExecutionSession(shared_lib_path=model)
# Input and output signatures of the default entry point.
print("input signature in json", session.input_signature())
print("output signature in json",session.output_signature())
# Do inference using the default entry point.
a = np.full((1, 1, 28, 28), 1, np.dtype(np.float32))
outputs = session.run(input=[a])
for output in outputs:
    print(output.shape)
If the computation graph has multiple entry points, users have to set a specific entry point before doing inference. Below is an example of doing inference with multiple entry points.
import numpy as np
from PyRuntime import OMExecutionSession
model = 'multi-entry-points-model.so'
# Create a session for this model.
session = OMExecutionSession(shared_lib_path=model, use_default_entry_point=False) # False to manually set an entry point.
# Query entry points in the model.
entry_points = session.entry_points()
for entry_point in entry_points:
    # Set the entry point to do inference.
    session.set_entry_point(name=entry_point)
    # Input and output signatures of the current entry point.
    print("input signature in json", session.input_signature())
    print("output signature in json", session.output_signature())
    # Do inference using the current entry point.
    a = np.arange(10).astype('float32')
    b = np.arange(10).astype('float32')
    outputs = session.run(input=[a, b])
    for output in outputs:
        print(output.shape)
If a model was compiled with `--tag`, the value of `--tag` must be passed to OMExecutionSession. Tags are useful to distinguish multiple sessions for multiple models within the same Python script. Below is an example of doing inference with tags.
import numpy as np
from PyRuntime import OMExecutionSession
encoder_model = 'encoder/model.so' # Assumed that the model was compiled using `--tag=encoder`
decoder_model = 'decoder/model.so' # Assumed that the model was compiled using `--tag=decoder`
# Create a session for the encoder model.
encoder_sess = OMExecutionSession(shared_lib_path=encoder_model, tag="encoder")
# Create a session for the decoder model.
decoder_sess = OMExecutionSession(shared_lib_path=decoder_model, tag="decoder")
If the two models were not compiled with `--tag`, they must be compiled with different .so filenames to be used in the same process. Indeed, when no tag is given, the output filename is used as the default tag. Below is an example of doing inference on multiple models without tags.
import numpy as np
from PyRuntime import OMExecutionSession
encoder_model = 'my_encoder.so'
decoder_model = 'my_decoder.so'
# Create a session for the encoder model.
encoder_sess = OMExecutionSession(shared_lib_path=encoder_model) # tag will be `my_encoder` by default.
# Create a session for the decoder model.
decoder_sess = OMExecutionSession(shared_lib_path=decoder_model) # tag will be `my_decoder` by default.
To use functions without tags, e.g. `run_main_graph`, set `tag = "NONE"`.
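For instance, a minimal sketch (the model filename is a placeholder for a library exporting the untagged `run_main_graph`):

from PyRuntime import OMExecutionSession

# tag="NONE" selects untagged functions such as run_main_graph.
session = OMExecutionSession(shared_lib_path='model.so', tag="NONE")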
The full interface of OMExecutionSession can be seen in the sources mentioned previously. However, using the constructor and the run method is enough to perform inference.
def __init__(self, shared_lib_path: str, tag: str, use_default_entry_point: bool):
    """
    Args:
        shared_lib_path: relative or absolute path to your .so model.
        tag: a string that was passed to `--tag` when compiling the .so model. By default, it is the output file name without its extension, namely, `filename` in `filename.so`.
        use_default_entry_point: whether to use the default entry point, `run_main_graph_{tag}`. Set to True by default.
    """

def run(self, input: List[ndarray]) -> List[ndarray]:
    """
    Args:
        input: A list of NumPy arrays, the inputs of your model.
    Returns:
        A list of NumPy arrays, the outputs of your model.
    """

def input_signature(self) -> str:
    """
    Returns:
        A string containing a JSON representation of the model's input signature.
    """

def output_signature(self) -> str:
    """
    Returns:
        A string containing a JSON representation of the model's output signature.
    """

def entry_points(self) -> List[str]:
    """
    Returns:
        A list of entry point names.
    """

def set_entry_point(self, name: str):
    """
    Args:
        name: an entry point name.
    """
An ONNX model can be compiled directly from the command line. The resulting library can then be executed using Python as shown in the previous sections. At times, it may be convenient to also compile a model directly in Python. This section explores the Python methods to do so.
The OMCompileSession object takes a file name at construction time. For compilation, `compile()` takes a `flags` string as input, which overrides any default options set from the environment variable.
import numpy as np
from PyCompile import OMCompileSession
# Load onnx model and create OMCompileSession object.
file = './mnist.onnx'
compiler = OMCompileSession(file)
# Generate the library file with option "-O3". Success when rc == 0.
rc = compiler.compile("-O3")
# Get the output file name
model = compiler.get_compiled_file_name()
if rc:
    print("Failed to compile with error code", rc)
    exit(1)
print("Compiled onnx file", file, "to", model, "with rc", rc)
The PyCompile module exports the OMCompileSession class to drive the compilation of an ONNX model into an executable model. Typically, a compiler object is created for a given model by providing the file name of the ONNX model. Then, all compiler options can be set as a single string to generate the desired executable. Finally, the compilation itself is performed by calling `compile()`, to which the user passes the options string as input.
The `compile()` command returns a code reflecting the status of the compilation: a zero value indicates success, and a nonzero value is an error code. Since different operating systems may use different suffixes for libraries, the output file name can be retrieved using the `get_compiled_file_name()` method.
The full interface of OnnxMlirCompiler can be seen in the sources mentioned previously. However, using the constructor and the methods below is enough to compile models.
def __init__(self, file_name: str):
    """
    Constructor for an ONNX model contained in a file.
    Args:
        file_name: relative or absolute path to your ONNX model.
    """

def __init__(self, input_buffer: void *, buffer_size: int):
    """
    Constructor for an ONNX model contained in an input buffer.
    Args:
        input_buffer: buffer containing the protobuf representation of the model.
        buffer_size: byte size of the input buffer.
    """

def compile(self, flags: str):
    """
    Method to compile a model from a file.
    Args:
        flags: all the options users would like to set.
    Returns:
        Zero on success, error code on failure.
    """

def compile_from_array(self, output_base_name: str, target: OnnxMlirTarget):
    """
    Method to compile a model from an array.
    Args:
        output_base_name: base name (relative or absolute, without suffix)
            where the compiled model should be written into.
        target: target for the compiler's output. Typical values are
            OnnxMlirTarget.emit_lib or emit_jni.
    Returns:
        Zero on success, error code on failure.
    """

def get_compiled_file_name(self):
    """
    Method to provide the full (absolute or relative) output compiled file name, including
    its suffix.
    Returns:
        String containing the file name after successful compilation; empty string on failure.
    """

def get_error_message(self):
    """
    Method to provide the compilation error message.
    Returns:
        String containing the error message; empty string on success.
    """
import numpy as np
from PyCompileAndRuntime import OMCompileExecutionSession
# Load onnx model and create OMCompileExecutionSession object.
inputFileName = './mnist.onnx'
# Set the full name of compiled model
sharedLibPath = './mnist.so'
# Set the compile option as "-O3"
session = OMCompileExecutionSession(inputFileName, sharedLibPath, "-O3")
# Print the model's input/output signature, for display.
# Signature functions are for info only; comment them out if they cause problems.
session.print_input_signature()
session.print_output_signature()
# Do inference using the default entry point.
a = np.full((1, 1, 28, 28), 1, np.dtype(np.float32))
outputs = session.run(input=[a])
for output in outputs:
    print(output.shape)
The PyCompileAndRuntime module provides the OMCompileExecutionSession class, which combines compilation and execution. Its constructor takes an `.onnx` input file and compiles the model with the options given by the user; the resulting model can then be run with inputs.
def __init__(self, input_model_path: str, compiled_file_path: str, flags: str, use_default_entry_point: bool):
    """
    Constructor for an ONNX model contained in a file.
    Args:
        input_model_path: relative or absolute path to your ONNX model.
        compiled_file_path: relative or absolute path to your compiled file.
        flags: all the options users would like to set.
        use_default_entry_point: whether to use the default entry point, `run_main_graph`. Set to True by default.
    """

def get_compiled_result(self):
    """
    Method to provide the result of the compilation.
    Returns:
        Int containing the result: zero on successful compilation; error code on failure.
    """

def get_compiled_file_name(self):
    """
    Method to provide the full (absolute or relative) output file name, including
    its suffix.
    Returns:
        String containing the file name after successful compilation; empty string on failure.
    """

def get_error_message(self):
    """
    Method to provide the compilation error message.
    Returns:
        String containing the error message; empty string on success.
    """

def entry_points(self) -> List[str]:
    """
    Returns:
        A list of entry point names.
    """

def set_entry_point(self, name: str):
    """
    Args:
        name: an entry point name.
    """

def run(self, input: List[ndarray]) -> List[ndarray]:
    """
    Args:
        input: A list of NumPy arrays, the inputs of your model.
    Returns:
        A list of NumPy arrays, the outputs of your model.
    """

def input_signature(self) -> str:
    """
    Returns:
        A string containing a JSON representation of the model's input signature.
    """

def output_signature(self) -> str:
    """
    Returns:
        A string containing a JSON representation of the model's output signature.
    """