float_params2

Version 1.0.1 (2.7 KB) by Marco Cococcioni

MATLAB Code for Parameters of Floating-Point Arithmetics


Updated 2021/6/10


`float_params2` is a MATLAB function for obtaining the parameters of several floating-point arithmetics. The parameters are built into the code and are not computed at run time.

The parameters are

- the unit roundoff,

- the smallest positive (subnormal) floating-point number,

- the smallest positive normalized floating-point number,

- the largest floating-point number,

- the number of binary digits in the significand (including the implicit leading bit),

and the arithmetics supported are

- bfloat8,

- bfloat16,

- IEEE half precision (fp16),

- IEEE single precision (fp32),

- IEEE double precision (fp64),

- IEEE quadruple precision (fp128).
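The five parameters listed above are tied together by standard formulas, so a table of built-in values can be cross-checked independently. The following is a minimal Python sketch of that check, not the author's MATLAB code: the function name `float_parameters` and the `FORMATS` table are hypothetical, the `(p, emax)` pairs are the standard values for each format, and round-to-nearest is assumed for the unit roundoff. bfloat8 is left out because its field widths are not stated here.

```python
from fractions import Fraction

# Hypothetical table: name -> (p, emax), where p counts significand bits
# including the implicit leading bit and emax is the largest exponent.
FORMATS = {
    "bfloat16": (8, 127),
    "fp16": (11, 15),
    "fp32": (24, 127),
    "fp64": (53, 1023),
    "fp128": (113, 16383),
}

def float_parameters(p, emax):
    """Return the five parameters as exact rationals (round-to-nearest assumed)."""
    emin = 1 - emax                      # smallest normalized exponent
    two = Fraction(2)
    return {
        "u": two ** (-p),                            # unit roundoff
        "xmins": two ** (emin - p + 1),              # smallest positive subnormal
        "xmin": two ** emin,                         # smallest positive normalized
        "xmax": two ** emax * (2 - two ** (1 - p)),  # largest finite number
        "p": p,                                      # significand bits
    }

fp16 = float_parameters(*FORMATS["fp16"])
print(float(fp16["xmax"]))  # 65504.0
```

Exact rationals are used rather than floats so that the fp128 values, which overflow a double, can still be represented and compared.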

The code was developed in MATLAB R2020a and works with releases back to at least R2016b.

This is a small extension of Nick Higham's float_params, to which I added support for the 8-bit Brain Float (bfloat8), as proposed at Intel by Naveen K. Mellempudi.

More details can be found here: https://arxiv.org/abs/1905.12334

I also renamed NVIDIA's tf32 to tf19, to reflect that it is a 19-bit floating-point format.
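The 19-bit count comes from adding up the stored fields of the format. A small Python sketch of that arithmetic follows; the `LAYOUTS` table and `total_bits` helper are hypothetical names introduced for illustration, and the field widths are the published ones for each format (the fraction width excludes the implicit leading bit).

```python
# Hypothetical table: name -> (sign bits, exponent bits, stored fraction bits).
LAYOUTS = {
    "tf32": (1, 8, 10),      # fp32's exponent range with fp16's fraction width
    "bfloat16": (1, 8, 7),
    "fp16": (1, 5, 10),
    "fp32": (1, 8, 23),
}

def total_bits(name):
    """Number of bits actually stored for the named format."""
    return sum(LAYOUTS[name])

print(total_bits("tf32"))  # 19
```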

Cite As

Marco Cococcioni (2025). float_params2 (https://www.mathworks.com/matlabcentral/fileexchange/93835-float_params2), MATLAB Central File Exchange. Retrieved 2025/3/11.

MATLAB Release Compatibility

Created with R2021a

Compatible with any release

Platform Compatibility

Windows macOS Linux


Acknowledgements

Inspired by: float_params


float_params2.m

Version	Published	Release Notes
1.0.1	2021/6/10	very small update
1.0.0	2021/6/10	