How can I get the FLOPs count of a deep network? The Analyze Network app does not seem to report this metric.
In the MATLAB analyzeNetwork app, a typical CNN model is shown with its number of parameters and the size of each feature map, but there is no FLOPs count?...
Accepted Answer
Walter Roberson
2021-9-24
This is quite unlikely to happen in the near future, if ever.
The translation of CUDA calls into machine instructions depends on the optimization level, the ability of the compiler, and the CUDA version. The translation of machine instructions into gflops depends on the other instructions scheduled and on the exact GPU model, because even within one architecture, Nvidia releases models with different numbers of streaming multiprocessors (SMs) and very different implementations of double precision. The models with the highest double precision performance are never the models with the highest single precision performance, and it is not uncommon for the model from the previous architecture that had the highest double precision performance to outperform most models of the new architecture in double precision.
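To make the dependence on the exact model concrete, here is a minimal sketch of a back-of-the-envelope peak-throughput estimate. Everything in it is an assumption for illustration: the `GpuSpec` fields, the two device descriptions, and all the numbers are invented and do not describe any real GPU. It only shows how SM count, per-SM unit counts, and clock rate combine, and why a newer device can lead in single precision while trailing in double precision.

```python
from dataclasses import dataclass


@dataclass
class GpuSpec:
    """Hypothetical device description; the numbers below match no real GPU."""
    name: str
    sm_count: int            # number of streaming multiprocessors
    fp32_units_per_sm: int   # single precision units per SM
    fp64_units_per_sm: int   # double precision units per SM
    clock_ghz: float         # sustained clock in GHz


def peak_gflops(spec: GpuSpec, units_per_sm: int) -> float:
    # Assumes one fused multiply-add per unit per cycle, counted as 2 operations.
    return spec.sm_count * units_per_sm * 2 * spec.clock_ghz


# Two invented devices: an older compute-oriented one and a newer consumer one.
older = GpuSpec("older-compute-card", sm_count=60,
                fp32_units_per_sm=64, fp64_units_per_sm=32, clock_ghz=1.5)
newer = GpuSpec("newer-consumer-card", sm_count=80,
                fp32_units_per_sm=128, fp64_units_per_sm=2, clock_ghz=2.0)

for spec in (older, newer):
    fp32 = peak_gflops(spec, spec.fp32_units_per_sm)
    fp64 = peak_gflops(spec, spec.fp64_units_per_sm)
    print(f"{spec.name}: FP32 peak {fp32:.0f} gflops, FP64 peak {fp64:.0f} gflops")
```

With these invented numbers the newer device has the higher single precision peak and the older device has the higher double precision peak. A real kernel would fall below either peak by an amount that depends on the compiler, the instruction mix, and scheduling.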
3 Comments
Walter Roberson
2021-9-24
If MATLAB cannot predict gflops, can it at least measure them? That clearly depends upon what tools Nvidia provides.
What Nvidia provides is counters of a series of different classes of instructions. Nvidia also provides a performance graph based upon assigning a weight to each of the classes of instructions. The person running the tool can configure the weights.
But... the weights they use do not correspond to any actual model. And all the instructions in the same class are given the same weight, even though the different instructions may have different graduation rates. That is, some instructions are limited in how many may be executed simultaneously, so they complete at rates much lower than their clock cycles per instruction would suggest. The handling of square root and reciprocal square root is especially odd, due to the extra work needed to handle 0 and infinity according to IEEE standards.
So... you cannot convert between the counters and gflops without knowing which instructions were being executed because members of the classes can have quite different performance.
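A minimal sketch of why per-class counters are not enough, under loudly stated assumptions: the class names, weights, instruction names, and cycle costs below are all invented for illustration and are not Nvidia's counters or real hardware figures. Two kernels report identical class counters, so a weighted sum credits them with the same amount of floating point work, yet the instructions inside one class differ in cost, so the achieved rate differs.

```python
# Hypothetical weights: floating point operations credited per instruction class.
CLASS_WEIGHTS = {"fp32_fma": 2.0, "fp32_special": 1.0}

# Hypothetical cost in cycles of individual instructions (invented numbers).
INSTRUCTION_CYCLES = {"fma": 1.0, "rsqrt": 4.0, "sin": 8.0}


def weighted_flops(class_counts):
    """Weighted sum over instruction classes, as a profiler-style estimate."""
    return sum(CLASS_WEIGHTS[cls] * n for cls, n in class_counts.items())


def total_cycles(instruction_counts):
    """Cycles spent, using the per-instruction costs rather than per-class ones."""
    return sum(INSTRUCTION_CYCLES[op] * n for op, n in instruction_counts.items())


# Both kernels show the same class counters...
counters = {"fp32_fma": 10_000, "fp32_special": 1_000}

# ...but the "special" class hides different instructions in each kernel.
kernel_a = {"fma": 10_000, "rsqrt": 1_000}
kernel_b = {"fma": 10_000, "sin": 1_000}

flops = weighted_flops(counters)
for name, mix in (("kernel_a", kernel_a), ("kernel_b", kernel_b)):
    cycles = total_cycles(mix)
    print(f"{name}: {flops:.0f} weighted flops over {cycles:.0f} cycles "
          f"= {flops / cycles:.3f} flops per cycle")
```

The weighted count is the same for both kernels, but the flops-per-cycle figures differ, which is the gap the counters alone cannot close.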
The architecture of the 3000 series has some interesting changes for integer work that have to be taken into consideration when measuring gflops.
Remember though that gflops has to do with FLOATING point operations per second, but models might be programmed in integer. If a model is mostly integer, should the gflops measure be near zero, since few floating point operations were done?
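One hardware-independent way to look at this is a theoretical operation count computed from layer shapes alone, which is a different quantity from achieved gflops. The sketch below is an illustration under assumptions, not something analyzeNetwork provides: the function names and the two example layers are hypothetical, a multiply-accumulate is counted as 2 operations, and bias terms, grouped convolutions, and non-convolution layers are ignored. It tallies floating point and integer operations separately, which shows how a mostly integer model ends up with a small floating point total.

```python
def conv2d_macs(out_h, out_w, out_channels, in_channels, kernel_h, kernel_w):
    """Multiply-accumulates for a plain 2-D convolution (no groups, no bias)."""
    return out_h * out_w * out_channels * in_channels * kernel_h * kernel_w


def count_ops(layers):
    """Sum operations per numeric type, counting each multiply-accumulate as 2."""
    totals = {"float": 0, "int": 0}
    for layer in layers:
        totals[layer["dtype"]] += 2 * conv2d_macs(*layer["shape"])
    return totals


# Hypothetical layers; shape = (out_h, out_w, out_channels, in_channels, k_h, k_w).
layers = [
    {"name": "conv1", "dtype": "float", "shape": (112, 112, 64, 3, 7, 7)},
    {"name": "conv2_quantized", "dtype": "int", "shape": (56, 56, 64, 64, 3, 3)},
]

totals = count_ops(layers)
print(f"floating point operations: {totals['float']:,}")
print(f"integer operations:        {totals['int']:,}")
```

The output feature-map sizes and channel counts are the kind of values the question says the app already shows, so a count like this could be assembled by hand. It says nothing about how fast a particular GPU would execute those operations.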