MelWeightMatrix

MelWeightMatrix - 17

Version

Name: MelWeightMatrix (GitHub)
Domain: main
Since version: 17
Function: False
Support level: SupportType.COMMON
Shape inference: True

This version of the operator has been available since version 17.

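Summary

Generates a MelWeightMatrix that can be used to re-weight a tensor containing linearly sampled frequency spectra (from a DFT or STFT) into num_mel_bins frequency bands, based on the [lower_edge_hertz, upper_edge_hertz] range on the mel scale. This function defines the mel scale in terms of a frequency f in hertz according to the following formula: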

mel(f) = 2595 * log10(1 + f/700)
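
As a minimal sketch, the formula above can be written directly in Python. The function names `hz_to_mel` and `mel_to_hz` are illustrative and are not part of the operator; the inverse is obtained by solving the formula for f.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Convert a frequency in hertz to mels: mel(f) = 2595 * log10(1 + f/700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel, obtained by solving the formula for f."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


# 0 Hz maps to 0 mel, and the mapping is monotonically increasing.
zero_mel = hz_to_mel(0.0)
# Converting to mels and back should recover the original frequency.
round_trip = mel_to_hz(hz_to_mel(440.0))
```

The mapping is close to linear at low frequencies and logarithmic at high frequencies, which is why bands that are equally spaced in mels become wider in hertz as frequency increases.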

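In the returned matrix, all the triangles (filterbanks) have a peak value of 1.0.

The returned MelWeightMatrix can be used to right-multiply a spectrogram S of shape [frames, num_spectrogram_bins] containing linear-scale spectrum values (e.g. STFT magnitudes), producing a "mel spectrogram" M of shape [frames, num_mel_bins].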
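
The shape bookkeeping of this right-multiplication can be illustrated with toy data. The values of `S` and `W` below are hypothetical and are not produced by the operator; the `matmul` helper is a plain-Python stand-in for a matrix product.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


# Hypothetical spectrogram S: 2 frames, 3 spectrogram bins.
S = [[1.0, 2.0, 3.0],
     [0.0, 4.0, 0.0]]

# Hypothetical weight matrix W: 3 spectrogram bins, 2 mel bins.
W = [[1.0, 0.0],
     [0.5, 0.5],
     [0.0, 1.0]]

# M = S x W has shape [frames, num_mel_bins] = [2, 2].
M = matmul(S, W)
```

Each column of `W` plays the role of one filter: entry `M[t][b]` is the weighted sum of frame `t` over the spectrogram bins covered by mel band `b`.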

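Attributes

output_datatype - INT (default is '1')

The data type of the output tensor. It must be one of the values of the DataType enum in TensorProto that correspond to T3. The default value is 1 = FLOAT.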

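Inputs

num_mel_bins (heterogeneous) - T1

The number of bands in the mel spectrum.
dft_length (heterogeneous) - T1

The size of the original DFT. It is used to infer the size of the one-sided DFT, which is understood to be floor(dft_length/2) + 1, i.e. the spectrogram contains only the non-redundant DFT bins.
sample_rate (heterogeneous) - T1

The number of samples per second of the input signal used to create the spectrogram. It is used to compute the frequency corresponding to each spectrogram bin, which determines how the bins are mapped onto the mel scale.
lower_edge_hertz (heterogeneous) - T2

The lower bound on the frequencies to be included in the mel spectrum. This corresponds to the lower edge of the lowest triangular band.
upper_edge_hertz (heterogeneous) - T2

The desired upper edge of the highest frequency band.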
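
To make the relationship between dft_length, sample_rate, and the spectrogram bins concrete, the following sketch computes the one-sided bin count and a frequency for each bin. It assumes the standard DFT convention that bin k corresponds to k * sample_rate / dft_length hertz; the helper names are illustrative, and a particular runtime may map bins to frequencies slightly differently.

```python
def onesided_bin_count(dft_length: int) -> int:
    """Number of non-redundant DFT bins: floor(dft_length / 2) + 1."""
    return dft_length // 2 + 1


def bin_frequencies_hz(dft_length: int, sample_rate: int) -> list:
    """Frequency of each one-sided bin, assuming bin k sits at k * sample_rate / dft_length."""
    return [k * sample_rate / dft_length
            for k in range(onesided_bin_count(dft_length))]


# Example values chosen for illustration: a 512-point DFT of a 16 kHz signal.
freqs = bin_frequencies_hz(dft_length=512, sample_rate=16000)
num_bins = len(freqs)
```

Under this convention the bins run from 0 Hz up to the Nyquist frequency (sample_rate / 2 when dft_length is even), so an upper_edge_hertz above that value would lie outside the range the spectrogram covers.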

Outputs

output (heterogeneous) - T3

The mel weight matrix. The output has the shape: [floor(dft_length/2) + 1][num_mel_bins].
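
The following is a rough sketch of how a triangular mel filterbank with this output shape could be built. It is not the ONNX reference implementation, and its values may differ from what a given runtime produces. It assumes band edges equally spaced on the mel scale, triangles interpolated linearly in mels, bin k located at k * sample_rate / dft_length hertz, and lower_edge_hertz < upper_edge_hertz. The function name `mel_weight_matrix_sketch` is hypothetical.

```python
import math


def hz_to_mel(f_hz):
    """mel(f) = 2595 * log10(1 + f/700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_weight_matrix_sketch(num_mel_bins, dft_length, sample_rate,
                             lower_edge_hertz, upper_edge_hertz):
    """Return a [floor(dft_length/2) + 1][num_mel_bins] matrix as a list of rows."""
    num_spectrogram_bins = dft_length // 2 + 1
    lo = hz_to_mel(lower_edge_hertz)
    hi = hz_to_mel(upper_edge_hertz)
    # num_mel_bins triangles need num_mel_bins + 2 edges, equally spaced in mels.
    edges = [lo + (hi - lo) * i / (num_mel_bins + 1)
             for i in range(num_mel_bins + 2)]

    weights = [[0.0] * num_mel_bins for _ in range(num_spectrogram_bins)]
    for k in range(num_spectrogram_bins):
        # Assumed bin-to-frequency mapping (standard DFT convention).
        m = hz_to_mel(k * sample_rate / dft_length)
        for b in range(num_mel_bins):
            left, center, right = edges[b], edges[b + 1], edges[b + 2]
            rising = (m - left) / (center - left)
            falling = (right - m) / (right - center)
            # Triangle that reaches 1.0 at the center edge and is 0 outside the band.
            weights[k][b] = max(0.0, min(rising, falling))
    return weights


# Example values chosen for illustration.
W = mel_weight_matrix_sketch(num_mel_bins=8, dft_length=512, sample_rate=16000,
                             lower_edge_hertz=0.0, upper_edge_hertz=8000.0)
```

In this sketch each triangle reaches exactly 1.0 only at its center frequency, so the largest sampled value in a column can be slightly below 1.0 when no spectrogram bin falls exactly on that center.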

Type Constraints

T1 in ( tensor(int32), tensor(int64) )

Constrain to integer tensors.
T2 in ( tensor(bfloat16), tensor(double), tensor(float), tensor(float16) )

Constrain to float tensors.
T3 in ( tensor(bfloat16), tensor(double), tensor(float), tensor(float16), tensor(int16), tensor(int32), tensor(int64), tensor(int8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(uint8) )

Constrain to any numerical types.