quantization
MRFI quantization methods
A quantization class has two static functions, quantize() and dequantize().
Both take the tensor x as an argument, plus any other arguments specified in the config file.
Both functions should modify x in place and must not return anything.
quantize() should turn the input x into an integer-valued tensor that keeps the float32 type
(pseudo quantization), so that PyTorch can still forward it correctly.
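The contract described above can be sketched as follows. This is a minimal illustration, not MRFI code: the class name StepQuantization and its step argument are hypothetical, and a plain Python list of floats stands in for a float32 tensor so that the example runs without PyTorch. With a real tensor, the same idea would use in-place tensor operations.

```python
class StepQuantization:
    """Hypothetical quantization following the two-static-function contract."""

    @staticmethod
    def quantize(x, step):
        # Overwrite x in place with integer-valued floats; return nothing.
        for i, v in enumerate(x):
            x[i] = float(round(v / step))

    @staticmethod
    def dequantize(x, step):
        # Map the integer-valued floats back to the original scale, in place.
        for i, v in enumerate(x):
            x[i] = v * step


values = [0.26, -0.49, 1.0]  # stands in for a float32 tensor
result = StepQuantization.quantize(values, step=0.25)
quantized = list(values)     # copy of the integer-valued floats
StepQuantization.dequantize(values, step=0.25)
```

Here result is None because nothing is returned; the caller only ever sees the modified values.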
Warning
An integer bit flip error mode always needs a quantization.
The bit_width argument and the resulting integer range (e.g. -128 to 127) should be
consistent with the corresponding error mode arguments.
Because MRFI does not check value bounds, for performance reasons, wrong arguments or
a wrong quantization implementation may silently lead to unexpected experiment results.
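Since the bounds are not checked by MRFI, one way to catch a mismatch early is to test the quantized values yourself before running an experiment. The helper below is a hypothetical debugging aid, not part of MRFI. It assumes a signed two's-complement range such as -128 to 127 for 8 bits, and it uses a list in place of a tensor.

```python
def signed_int_range(bit_width):
    """Return (low, high) for a signed two's-complement integer."""
    return -(1 << (bit_width - 1)), (1 << (bit_width - 1)) - 1


def out_of_range(values, bit_width):
    """Return the values that are not integers or fall outside the range."""
    low, high = signed_int_range(bit_width)
    return [v for v in values if v != int(v) or not low <= v <= high]


quantized = [-128.0, 0.0, 127.0, 130.0]  # 130 does not fit in 8 signed bits
bad = out_of_range(quantized, bit_width=8)
```

An empty list from out_of_range() suggests the quantization output and the bit_width given to the error mode agree, at least for the sampled values.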
Runtime dynamic quantization
If you set dynamic_range to "auto" in the MRFI config,
MRFI automatically sets this value to the maximum range of the input tensor.
This feature can be used to simulate runtime dynamic quantization.
Note, however, that fault injection can also change the dynamic range seen by later layers.
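The effect of the "auto" setting can be illustrated with a small sketch. It assumes that the maximum range means the largest absolute value in the tensor, which the text above does not state explicitly. The function name auto_dynamic_range is hypothetical, and a list stands in for a tensor.

```python
def auto_dynamic_range(x):
    """Assumed behaviour of "auto": the largest absolute value in x."""
    return max(abs(v) for v in x)


clean = [0.5, -1.5, 0.75]
faulty = [0.5, -1.5, 96.0]  # one activation corrupted by an injected fault

clean_range = auto_dynamic_range(clean)
faulty_range = auto_dynamic_range(faulty)
```

The second call shows the caveat: a single corrupted value upstream enlarges the range chosen for the later layer, which coarsens the quantization of every other value in it.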
SymmericQuantization
Simple symmetric quantization.
Uniformly maps a float tensor in the range [-dynamic_range*scale_factor, +dynamic_range*scale_factor]
into the integer range [-2**(bit_width-1)+1, 2**(bit_width-1)-1].
Outliers are clipped.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
bit_width | int | Often 8 or 16. | required |
dynamic_range | float | Usually maximum of values. | required |
scale_factor | float | Extra factor on dynamic range. | 1.0 |
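The mapping described for this class can be sketched in a few lines. This is an illustration under stated assumptions, not the MRFI implementation: the function names are hypothetical, round-to-nearest is assumed, and a list stands in for a float32 tensor.

```python
def symmetric_quantize(x, bit_width, dynamic_range, scale_factor=1.0):
    limit = dynamic_range * scale_factor   # float range is [-limit, +limit]
    q_max = (1 << (bit_width - 1)) - 1     # e.g. 127 for bit_width=8
    for i, v in enumerate(x):
        clipped = max(-limit, min(limit, v))          # clip outliers
        x[i] = float(round(clipped / limit * q_max))  # integer-valued float


def symmetric_dequantize(x, bit_width, dynamic_range, scale_factor=1.0):
    limit = dynamic_range * scale_factor
    q_max = (1 << (bit_width - 1)) - 1
    for i, v in enumerate(x):
        x[i] = v * limit / q_max           # back to the float scale


values = [1.0, -0.25, 3.0, 0.0]
symmetric_quantize(values, bit_width=8, dynamic_range=1.0)
# values is now [127.0, -32.0, 127.0, 0.0]; the outlier 3.0 was clipped
```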
PositiveQuantization
Simple positive quantization.
Uniformly maps a float tensor in the range [0, dynamic_range*scale_factor]
into the integer range [0, 2**(bit_width)-1].
Outliers are clipped.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
bit_width | int | Often 8 or 16. | required |
dynamic_range | float | Usually maximum of values. | required |
scale_factor | float | Extra factor on dynamic range. | 1.0 |
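A comparable sketch for the unsigned case follows. As before, the function name is hypothetical, round-to-nearest is an assumption, and a list stands in for a tensor; negative inputs count as outliers and are clipped to 0.

```python
def positive_quantize(x, bit_width, dynamic_range, scale_factor=1.0):
    limit = dynamic_range * scale_factor   # float range is [0, limit]
    q_max = (1 << bit_width) - 1           # e.g. 255 for bit_width=8
    for i, v in enumerate(x):
        clipped = max(0.0, min(limit, v))             # clip outliers
        x[i] = float(round(clipped / limit * q_max))  # integer-valued float


values = [2.0, -1.0, 0.4, 5.0]
positive_quantize(values, bit_width=8, dynamic_range=2.0)
# values is now [255.0, 0.0, 51.0, 255.0]
```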
FixPointQuantization
Fixed-point quantization.
Quantizes a float tensor into the binary fixed-point representation integer_bit.decimal_bit.
The input dynamic range is therefore [-2**integer_bit, 2**integer_bit]; outliers are clipped.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
integer_bit | int | Integer bits of value. | required |
decimal_bit | int | Decimal bits of value. | required |
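The fixed-point mapping can be sketched as scaling by 2**decimal_bit after clipping to the stated input range. The function names are hypothetical, round-to-nearest is assumed, and a list stands in for a tensor. How MRFI encodes the upper bound 2**integer_bit itself is not stated above, so the sketch simply keeps it.

```python
def fixpoint_quantize(x, integer_bit, decimal_bit):
    limit = float(1 << integer_bit)   # input range is [-limit, +limit]
    scale = 1 << decimal_bit          # one integer step equals 1/scale
    for i, v in enumerate(x):
        clipped = max(-limit, min(limit, v))  # clip outliers
        x[i] = float(round(clipped * scale))  # integer-valued float


def fixpoint_dequantize(x, integer_bit, decimal_bit):
    scale = 1 << decimal_bit
    for i, v in enumerate(x):
        x[i] = v / scale              # back to the float scale


values = [1.3, -0.3, 9.0]
fixpoint_quantize(values, integer_bit=3, decimal_bit=4)
quantized = list(values)              # [21.0, -5.0, 128.0]
fixpoint_dequantize(values, integer_bit=3, decimal_bit=4)
# values is now [1.3125, -0.3125, 8.0]
```

With decimal_bit=4 the resolution is 1/16, so 1.3 becomes 1.3125 after the round trip and the outlier 9.0 is clipped to 8.0.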