argmax for tensors with custom type index and AVX2 optimization (mex)

Version 1.2.0.0 (33.0 KB) by Emanuele Ruffaldi
MEX-based argmax for tensors, supporting a user-specified type for the resulting index
43.0 downloads
Updated 2017/9/3

This MEX function provides argmax functionality in MATLAB, avoiding the two-output syntax that MATLAB's max function requires:
[~,Y] = max(X,[],dim)
In addition, it can return the indices in a user-specified type (e.g. int32) rather than only the default double.
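To make the two ideas above concrete (argmax along one dimension, with a caller-chosen index type), here is a minimal scalar sketch in C++. It is not the package's source: the function name `argmax_along_dim`, its signature and the use of `std::vector` are assumptions made for illustration. It assumes column-major storage as in MATLAB and returns 1-based indices.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not the package's API: argmax along one dimension of a
// column-major (MATLAB-style) array. Index is the caller-chosen output type.
template <typename T, typename Index>
std::vector<Index> argmax_along_dim(const std::vector<T>& x,
                                    const std::vector<std::size_t>& dims,
                                    std::size_t dim /* 0-based */) {
    assert(dim < dims.size());
    // inner = product of the sizes before `dim`, outer = product of those after.
    std::size_t inner = 1, outer = 1;
    for (std::size_t i = 0; i < dim; ++i) inner *= dims[i];
    for (std::size_t i = dim + 1; i < dims.size(); ++i) outer *= dims[i];
    const std::size_t n = dims[dim];
    assert(n > 0 && x.size() == inner * n * outer);

    std::vector<Index> out(inner * outer);
    for (std::size_t o = 0; o < outer; ++o) {
        for (std::size_t i = 0; i < inner; ++i) {
            const std::size_t base = o * n * inner + i;
            std::size_t best = 0;
            for (std::size_t k = 1; k < n; ++k) {
                // Strict '>' keeps the first occurrence of the maximum.
                if (x[base + k * inner] > x[base + best * inner]) best = k;
            }
            // Store a 1-based index, as MATLAB does.
            out[o * inner + i] = static_cast<Index>(best + 1);
        }
    }
    return out;
}
```

For the 2-by-3 matrix [1 5 3; 4 2 6], stored column-major as {1, 4, 5, 2, 3, 6}, calling `argmax_along_dim<float, std::int16_t>(x, {2, 3}, 0)` gives one int16 row index per column, and passing 1 as the last argument gives one column index per row.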
Speed: when compiled with -march=native on machines with AVX2, it gives notable speedups over MATLAB (except for double). With 256-bit AVX2 registers, the maximum can be computed in parallel over 4, 8, 16 or even 32 elements for double, float/int32, int16 and int8 respectively. The interesting part is the propagation of the indices, because an AVX2 max by itself is trivial. To use this feature, pass -march=native to mex (e.g. by modifying the XML configuration).
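The index propagation mentioned above can be sketched as follows for a flat float array. This is an illustrative reconstruction, not the package's code: the name `argmax_f32` is hypothetical, indices are 0-based, and NaNs get no special treatment. Each of the 8 lanes keeps its own running maximum and the index where it was found; a compare mask selects both the new value and the new index, and a final pass reduces the 8 lanes to one result. When the compiler is not targeting AVX2, the same lane-wise logic runs as plain loops.

```cpp
#include <cassert>
#include <cstdint>
#if defined(__AVX2__)
#include <immintrin.h>
#endif

// Hypothetical sketch: 0-based argmax of n floats, ties go to the lowest index.
std::int32_t argmax_f32(const float* x, std::int32_t n) {
    assert(x != nullptr && n > 0);
    constexpr std::int32_t W = 8;  // floats per 256-bit register
    std::int32_t best = 0;
    std::int32_t i = 1;            // start of the scalar tail
    if (n >= W) {
        float lane_max[W];
        std::int32_t lane_idx[W];
        const std::int32_t blocks_end = n - n % W;
#if defined(__AVX2__)
        __m256 vmax = _mm256_loadu_ps(x);
        __m256i vidx = _mm256_setr_epi32(0, 1, 2, 3, 4, 5, 6, 7);
        __m256i vcur = vidx;
        const __m256i vstep = _mm256_set1_epi32(W);
        for (std::int32_t b = W; b < blocks_end; b += W) {
            vcur = _mm256_add_epi32(vcur, vstep);  // indices of this block
            const __m256 v = _mm256_loadu_ps(x + b);
            const __m256 gt = _mm256_cmp_ps(v, vmax, _CMP_GT_OQ);
            vmax = _mm256_blendv_ps(vmax, v, gt);  // take new values where greater
            // The same mask carries the indices along with the values.
            vidx = _mm256_castps_si256(_mm256_blendv_ps(
                _mm256_castsi256_ps(vidx), _mm256_castsi256_ps(vcur), gt));
        }
        _mm256_storeu_ps(lane_max, vmax);
        _mm256_storeu_si256(reinterpret_cast<__m256i*>(lane_idx), vidx);
#else
        // Portable fallback with the same per-lane bookkeeping.
        for (std::int32_t l = 0; l < W; ++l) { lane_max[l] = x[l]; lane_idx[l] = l; }
        for (std::int32_t b = W; b < blocks_end; b += W)
            for (std::int32_t l = 0; l < W; ++l)
                if (x[b + l] > lane_max[l]) { lane_max[l] = x[b + l]; lane_idx[l] = b + l; }
#endif
        // Horizontal reduction: largest value wins, lowest index breaks ties.
        std::int32_t bl = 0;
        for (std::int32_t l = 1; l < W; ++l)
            if (lane_max[l] > lane_max[bl] ||
                (lane_max[l] == lane_max[bl] && lane_idx[l] < lane_idx[bl]))
                bl = l;
        best = lane_idx[bl];
        i = blocks_end;
    }
    for (; i < n; ++i)
        if (x[i] > x[best]) best = i;  // leftover elements, first occurrence kept
    return best;
}
```

The explicit tie-break in the reduction is a design choice of this sketch: without it, the lane-wise scan could report a later duplicate of the maximum than a sequential scan would.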

Added a comparison of the results by the values the indices select: the indices returned by MATLAB and by this code can differ if the matrix contains duplicate values.
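A by-value comparison of this kind might look like the sketch below. It is an assumption about how such a check could be written, not the package's test code: the name `argmax_agree_by_value` is hypothetical, the matrix is column-major, and both index vectors hold one 1-based row index per column. The two index types may differ, for example double from MATLAB's max and int16 from the MEX function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical check, not part of the package: x is a rows-by-cols matrix in
// column-major order; a and b hold one 1-based row index per column.
template <typename T, typename IndexA, typename IndexB>
bool argmax_agree_by_value(const std::vector<T>& x, std::size_t rows,
                           const std::vector<IndexA>& a,
                           const std::vector<IndexB>& b) {
    assert(rows > 0 && x.size() % rows == 0);
    const std::size_t cols = x.size() / rows;
    assert(a.size() == cols && b.size() == cols);
    for (std::size_t j = 0; j < cols; ++j) {
        const std::size_t ra = static_cast<std::size_t>(a[j]) - 1;  // to 0-based
        const std::size_t rb = static_cast<std::size_t>(b[j]) - 1;
        assert(ra < rows && rb < rows);
        // Different indices are acceptable as long as they select equal values.
        if (x[j * rows + ra] != x[j * rows + rb]) return false;
    }
    return true;
}
```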

Usage:
Y = argmax(X, dim, int16(0)); % returns indices as int16

TODOs:
- min
- min and max in one pass
- check on dimension and specified type
- remake in C using Python for code generation
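One possible reading of the "check on dimension and specified type" item is that the length of the reduced dimension must fit in the requested index type (an int16 index, for instance, cannot address more than 32767 elements). The sketch below illustrates that reading only. The names `index_type_fits` and `to_index` are hypothetical, and the sketch is limited to integer index types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical helper: can 1-based indices 1..dim_length be stored in Index?
template <typename Index>
bool index_type_fits(std::size_t dim_length) {
    static_assert(std::is_integral<Index>::value, "integer index types only");
    return dim_length <=
           static_cast<std::uintmax_t>(std::numeric_limits<Index>::max());
}

// Hypothetical checked conversion of a 1-based index to the requested type.
template <typename Index>
Index to_index(std::size_t one_based) {
    assert(one_based >= 1 && index_type_fits<Index>(one_based));
    return static_cast<Index>(one_based);
}
```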

Cite As

Emanuele Ruffaldi (2024). argmax for tensors with custom type index and AVX2 optimization (mex) (https://github.com/eruffaldi/mat_argmax_nd), GitHub. Retrieved .

MATLAB Release Compatibility
Created with R2012b
Compatible with any release
Platform Compatibility
Windows macOS Linux
Categories
Find more on Call Python from MATLAB in Help Center and MATLAB Answers
Tags
mex
Acknowledgements

Inspired by: ARGMAX/ARGMIN

Versions that use the GitHub default branch cannot be downloaded

Version Published Release Notes
1.2.0.0

AVX2 optimization: float, double, int32, int16 and int8
Comprehensive testing across types and dimensions, with by-value verification and speed tests

1.0.0.0

Better title

To view or report issues in this GitHub add-on, visit the GitHub repository.