quantize(batchSize:input:output:axis:scale:bias:filterParameters:)
Quantizes the input tensor and writes the result to the output tensor.
Declaration
static func quantize(batchSize: Int, input: BNNSNDArrayDescriptor, output: BNNSNDArrayDescriptor, axis: Int? = nil, scale: BNNSNDArrayDescriptor?, bias: BNNSNDArrayDescriptor?, filterParameters: BNNSFilterParameters? = nil) throws
Parameters
- batchSize:
The number of input-output pairs to process.
- input:
The descriptor of the input.
- output:
The descriptor of the output.
- axis:
The index of the axis to which the function applies scale and bias. Set to nil to quantize the entire tensor using scale and bias.
- scale:
The scale. Set to nil for a scale of 1.0.
- bias:
The bias. Set to nil for a bias of 0.0.
- filterParameters:
Runtime filter parameters.
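As a minimal sketch of the whole-tensor case, the following code passes nil for axis so that a single scale applies to every element. (The single-element .vector(1) shape for the scale descriptor is an assumption about how per-tensor scale values are supplied; it is not stated in the parameter list above.)

    import Accelerate

    // Quantize an entire vector with one scale (axis: nil).
    let inputValues = [0.5, 1.0, 1.5, 2.0] as [Float]

    let input = BNNSNDArrayDescriptor.allocate(
        initializingFrom: inputValues,
        shape: .vector(4))

    let output = BNNSNDArrayDescriptor.allocateUninitialized(
        scalarType: Int16.self,
        shape: input.shape)

    // Assumption: a per-tensor scale is a single-element vector.
    let scale = BNNSNDArrayDescriptor.allocate(
        initializingFrom: [10] as [Float],
        shape: .vector(1))

    try? BNNS.quantize(batchSize: 1,
                       input: input,
                       output: output,
                       axis: nil,
                       scale: scale,
                       bias: nil)

    print(output.makeArray(of: Int16.self)!)

    input.deallocate()
    output.deallocate()
    scale.deallocate()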
Discussion
The following code quantizes a single-precision matrix to a 16-bit integer matrix. Because the code applies the scale along the zeroth axis, which contains four elements, the scale tensor also contains four elements.
static func quantize() {
    let inputValues = [1, 2, 3, 4,
                       5, 6, 7, 8] as [Float]
    
    // Create a single-precision input descriptor from the source values.
    let input = BNNSNDArrayDescriptor.allocate(
        initializingFrom: inputValues,
        shape: .matrixRowMajor(4, 2))
    
    // Create an uninitialized 16-bit integer descriptor with the same shape.
    let output = BNNSNDArrayDescriptor.allocateUninitialized(
        scalarType: Int16.self,
        shape: input.shape)
    
    // Create a scale descriptor with one element per entry along axis 0.
    let scale = BNNSNDArrayDescriptor.allocate(
        initializingFrom: [1, 10, 100, 1000] as [Float],
        shape: .vector(4))
    
    try? BNNS.quantize(batchSize: 1,
                       input: input,
                       output: output,
                       axis: 0,
                       scale: scale,
                       bias: nil)
    
    // Prints:
    // [1, 20, 300, 4000,
    //  5, 60, 700, 8000]
    print(output.makeArray(of: Int16.self)!)
    
    input.deallocate()
    output.deallocate()
    scale.deallocate()
}
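For the inverse direction, the following self-contained sketch quantizes a vector and converts it back to single precision. It assumes that the companion BNNS.dequantize function accepts the same parameters as BNNS.quantize and inverts the scaling; check the dequantize documentation before relying on this.

    import Accelerate

    // Round-trip sketch: quantize with a per-tensor scale, then dequantize.
    let values = [1.5, 2.5, 3.5, 4.5] as [Float]

    let source = BNNSNDArrayDescriptor.allocate(
        initializingFrom: values,
        shape: .vector(4))

    let quantized = BNNSNDArrayDescriptor.allocateUninitialized(
        scalarType: Int16.self,
        shape: source.shape)

    let scale = BNNSNDArrayDescriptor.allocate(
        initializingFrom: [10] as [Float],
        shape: .vector(1))

    try? BNNS.quantize(batchSize: 1,
                       input: source,
                       output: quantized,
                       axis: nil,
                       scale: scale,
                       bias: nil)

    let restored = BNNSNDArrayDescriptor.allocateUninitialized(
        scalarType: Float.self,
        shape: source.shape)

    // Assumption: dequantize shares quantize's signature and undoes the scale.
    try? BNNS.dequantize(batchSize: 1,
                         input: quantized,
                         output: restored,
                         axis: nil,
                         scale: scale,
                         bias: nil)

    print(restored.makeArray(of: Float.self)!)

    source.deallocate()
    quantized.deallocate()
    scale.deallocate()
    restored.deallocate()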