What does a cross (horizontal line) in the regression plot of a neural network with multivariate input and output mean?

Hello everyone,
I have trained a neural network and got the regression plot below.
First of all, I have normalized the data by subtracting from each variable its mean over the samples and dividing by its standard deviation, so that all inputs and outputs are normalized and in the same range. Is that allowed, or does it introduce errors into my data?
I have tried several network structures and always get such a cross in the regression plot. My guess is that some of the output data is insensitive to the input data. Is that right? If so, how can I check that?
Thanks for your help! Best regards,
Pablo

Accepted Answer

Cris LaPierre 2020-12-2
Edited: Cris LaPierre 2020-12-3
Normalizing is a standard preprocessing step. It is helpful when your model has several inputs on different scales, because it prevents any one feature from dominating the model due to its scale. When you have a single input, this is unnecessary. Also, this is for preprocessing. I don't think it makes sense to do this after the fact, and it could be affecting your visualization.
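For illustration, the normalization described in the question can be sketched in plain Python (not MATLAB toolbox code). The function names `zscore_columns` and `unscale_columns` are hypothetical, and using the sample standard deviation is an assumption, since the post does not say which variant was used. The sketch also shows that the scaling can be undone exactly, so it does not by itself discard information.

```python
from statistics import mean, stdev

def zscore_columns(rows):
    """Standardize each column (variable) across the rows (samples)."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stds = [stdev(col) for col in columns]
    # Guard against constant columns, whose standard deviation is zero.
    safe_stds = [s if s > 0 else 1.0 for s in stds]
    scaled = [[(v - m) / s for v, m, s in zip(row, means, safe_stds)]
              for row in rows]
    return scaled, means, safe_stds

def unscale_columns(scaled, means, stds):
    """Invert zscore_columns using the stored means and deviations."""
    return [[v * s + m for v, m, s in zip(row, means, stds)]
            for row in scaled]

# Two variables on very different scales, three samples.
samples = [[1.0, 100.0], [2.0, 300.0], [3.0, 500.0]]
scaled, means, stds = zscore_columns(samples)
restored = unscale_columns(scaled, means, stds)
print(scaled)    # both columns now span the same range
print(restored)  # matches the original samples
```

Keeping `means` and `stds` around is the important design point: network outputs can then be mapped back to physical units for reporting.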
A cross would suggest there are two different types of data in your data set: one with a relationship and one without. The horizontal part indicates data points for which there is no relationship between the Target and the Output.
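A minimal sketch of why the unrelated part shows up as a horizontal line, using a one-input least-squares line as a stand-in for the network (an assumption made for brevity; the helper `fit_line` and the toy data are hypothetical). When the target carries no dependence on the input, the best fit under squared error collapses to the target mean, so the Output stays constant while the Target varies.

```python
from statistics import mean

def fit_line(x, t):
    """Ordinary least-squares fit t ~ a*x + b for one input and one target."""
    mx, mt = mean(x), mean(t)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxt = sum((xi - mx) * (ti - mt) for xi, ti in zip(x, t))
    a = sxt / sxx
    b = mt - a * mx
    return a, b

x = [-1.0, 0.0, 1.0, -1.0, 0.0, 1.0]
t_linked = [2.0 * xi for xi in x]               # target depends on the input
t_unlinked = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]  # target uncorrelated with the input

a1, b1 = fit_line(x, t_linked)
outputs_linked = [a1 * xi + b1 for xi in x]

a2, b2 = fit_line(x, t_unlinked)
outputs_unlinked = [a2 * xi + b2 for xi in x]

# Pairs of (target, output), i.e. the points of a regression plot.
print(list(zip(t_linked, outputs_linked)))      # points on the diagonal
print(list(zip(t_unlinked, outputs_unlinked)))  # targets vary, output is constant
```

Plotting both sets of pairs in one target-versus-output diagram gives the diagonal plus a horizontal bar, which is the cross described above.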
1 Comment
Pablo Noever 2020-12-3
Edited: Pablo Noever 2020-12-3
Thank you Cris for your reply.
First of all: as I guessed and you have confirmed, the cross in the regression plot is due to missing links between the inputs and outputs of the samples. By performing a sensitivity analysis on the normalized input and output data of the samples, I have discarded every input that does not contribute to any output parameter and every output parameter that is not affected by any input parameter. With that, all data points on the horizontal line disappear. Thanks for that. The result is as follows:
So I get very good agreement between the target (sample output) and the ANN output.
Second: I agree that normalizing the sample inputs is necessary. But normalizing the sample outputs (targets) as well can be helpful, because all data is then in the same range and it is easier to judge whether the approximation is good for all of it. I have repeated the training without normalizing the targets (see below), and you can see good agreement (R=1). But the lowest values lie in a very small range compared to the others, so it is hard to evaluate their deviations: at this scale they are simply not visible.
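One possible way to judge each output on its own scale, even when the targets are not normalized, is to report a per-output error divided by that output's spread. This is only a sketch: the function `per_output_nrmse` and the toy numbers are hypothetical and not taken from the data in this thread.

```python
from math import sqrt
from statistics import pstdev

def per_output_nrmse(targets, outputs):
    """RMSE of each output column divided by the spread of its targets."""
    result = []
    for t_col, o_col in zip(zip(*targets), zip(*outputs)):
        rmse = sqrt(sum((o - t) ** 2 for t, o in zip(t_col, o_col)) / len(t_col))
        spread = pstdev(t_col)
        # A constant target has no spread; report the raw RMSE in that case.
        result.append(rmse / spread if spread > 0 else rmse)
    return result

# Two outputs whose ranges differ by six orders of magnitude.
targets = [[1000.0, 0.001], [2000.0, 0.002], [3000.0, 0.003]]
# Every prediction is 1 % too high, regardless of scale.
outputs = [[1.01 * v for v in row] for row in targets]

nrmse = per_output_nrmse(targets, outputs)
print(nrmse)  # both entries are about equal despite the scale gap
```

A pooled regression plot hides the small-range output, whereas a scale-free number per output makes its deviation comparable to the others.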
Nevertheless, this is just a detail. The main problem is solved. Insensitive target values cannot be represented by the ANN because they have no link to the input parameters, and thus they produce a horizontal line in the regression plot. A sensitivity analysis is a good method for discarding all insensitive input and output parameters.
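The thread does not say which sensitivity analysis was used. As a hedged sketch, a simple screening based on absolute Pearson correlation could look like this in plain Python; the helpers `pearson` and `screen`, the threshold of 0.1, and the toy data are all assumptions. Note that correlation only detects linear dependence, so a nonlinear link could be missed by this particular check.

```python
from statistics import mean

def pearson(a, b):
    """Pearson correlation of two equally long sequences."""
    ma, mb = mean(a), mean(b)
    saa = sum((v - ma) ** 2 for v in a)
    sbb = sum((v - mb) ** 2 for v in b)
    if saa == 0 or sbb == 0:
        return 0.0  # a constant column cannot be related to anything
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    return sab / (saa * sbb) ** 0.5

def screen(inputs, targets, threshold=0.1):
    """Return indices of inputs and outputs that have at least one link."""
    in_cols = list(zip(*inputs))
    out_cols = list(zip(*targets))
    strength = [[abs(pearson(i, o)) for o in out_cols] for i in in_cols]
    keep_inputs = [k for k, row in enumerate(strength) if max(row) >= threshold]
    keep_outputs = [j for j in range(len(out_cols))
                    if max(row[j] for row in strength) >= threshold]
    return keep_inputs, keep_outputs

x0 = [-1.0, 0.0, 1.0, -1.0, 0.0, 1.0]
x1 = [1.0, -2.0, 1.0, 1.0, -2.0, 1.0]    # input that drives no output
y0 = [2.0 * v for v in x0]               # output driven by x0
y1 = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]   # output linked to no input

inputs = [list(row) for row in zip(x0, x1)]
targets = [list(row) for row in zip(y0, y1)]
keep_inputs, keep_outputs = screen(inputs, targets)
print(keep_inputs, keep_outputs)  # only the linked input and output remain
```

Training on the retained columns only is what removes the horizontal bar, since every remaining target then has some input it can be predicted from.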
