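Store arrays in one ONNX graph
Once a model is converted, arrays can be stored as constants in the graph and retrieved through an output. This lets the user store training parameters or other information such as a vocabulary. The last sections show how to remove an output or promote an intermediate result to an output.
Train and convert a model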
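The example trains a logistic regression on the Iris dataset with scikit-learn and converts it to ONNX with skl2onnx, but the model could equally be trained and produced by another converter library.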
import pprint
import numpy
from onnx import load
from onnxruntime import InferenceSession
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from skl2onnx import to_onnx
from skl2onnx.helpers.onnx_helper import (
    add_output_initializer,
    select_model_inputs_outputs,
)
data = load_iris()
X, y = data.data.astype(numpy.float32), data.target
X_train, X_test, y_train, y_test = train_test_split(X, y)
model = LogisticRegression(penalty="elasticnet", C=2.0, solver="saga", l1_ratio=0.5)
model.fit(X_train, y_train)
onx = to_onnx(model, X_train[:1], target_opset=12, options={"zipmap": False})
/home/xadupre/github/scikit-learn/sklearn/linear_model/_sag.py:350: ConvergenceWarning: The max_iter was reached which means the coef_ did not converge
warnings.warn(
Add training parameters
new_onx = add_output_initializer(
    onx, ["C", "l1_ratio"], [numpy.array([model.C]), numpy.array([model.l1_ratio])]
)
Inference
sess = InferenceSession(new_onx.SerializeToString(), providers=["CPUExecutionProvider"])
print("output names:", [o.name for o in sess.get_outputs()])
res = sess.run(None, {"X": X_test[:2]})
print("outputs")
pprint.pprint(res)
output names: ['label', 'probabilities', 'C', 'l1_ratio']
outputs
[array([1, 0], dtype=int64),
array([[2.4983161e-03, 8.6063814e-01, 1.3686356e-01],
[9.7875208e-01, 2.1247936e-02, 2.7418587e-08]], dtype=float32),
array([2.]),
array([0.5])]
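The major drawback of this solution is increased prediction time, because onnxruntime copies the constants for every prediction. The constants can either be stored in a separate ONNX graph or removed.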
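Select outputs
The next function removes unneeded outputs from a model, not only the constants. The resulting model keeps only the probabilities.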
simple_onx = select_model_inputs_outputs(new_onx, ["probabilities"])
sess = InferenceSession(
    simple_onx.SerializeToString(), providers=["CPUExecutionProvider"]
)
print("output names:", [o.name for o in sess.get_outputs()])
res = sess.run(None, {"X": X_test[:2]})
print("outputs")
pprint.pprint(res)
# Function select_model_inputs_outputs can also promote an intermediate
# result to an output.
output names: ['probabilities']
outputs
[array([[2.4983161e-03, 8.6063814e-01, 1.3686356e-01],
[9.7875208e-01, 2.1247936e-02, 2.7418587e-08]], dtype=float32)]
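This example only uses ONNX graphs in memory and never saves or loads a model. Saving and loading can be done with the following snippets of code.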
Save a model
Load a model
model = load("simplified_model.onnx")
sess = InferenceSession(model.SerializeToString(), providers=["CPUExecutionProvider"])
print("output names:", [o.name for o in sess.get_outputs()])
res = sess.run(None, {"X": X_test[:2]})
print("outputs")
pprint.pprint(res)
output names: ['probabilities']
outputs
[array([[2.4983161e-03, 8.6063814e-01, 1.3686356e-01],
[9.7875208e-01, 2.1247936e-02, 2.7418587e-08]], dtype=float32)]
Total running time of the script: (0 minutes 0.034 seconds)