How can I normalize 3D points to be between 0 and 1 without changing the distribution?

Hello everyone,
I am completely new to MATLAB and I would like to know if there is a way to normalize 3D points to be between 0 and 1 without changing the distribution. It's important not to change the distribution because I will need to compare two scatter plots after that step.
I can't use the mapminmax function because its output is between -1 and 1. Is there another way to do it?
Thank you in advance for your help,
Marine
  2 comments
Guillaume 2018-12-10
What does it mean for a point (2D or 3D) to be between 0 and 1 (a 1D value!)? Do you mean that each coordinate should be between 0 and 1 (i.e. the points are within a quadrant of the unit square/cube)? Or that their distance from the origin should be between 0 and 1 (i.e. within a quadrant of the unit circle/sphere)? Or something else?
What does "not changing the distribution" mean? The distribution with regard to what?
Marine Bertschy 2018-12-10
Edited: Marine Bertschy 2018-12-10
I mean that the points are distributed in a 3D space (here each point corresponds to the position of a neuron in a slide).
My goal is to compare the relative positions of two 3D scatter plots (corresponding to the manual and the automatic counting of the neurons of the slide).
But the problem is that I manually counted the neurons using micrometers, and the algorithm that we used to do the counting automatically uses pixel values. That is why I need to normalize the (X,Y,Z) of each point to compare both results with each other.
Not changing the distribution means that I don't want to change the relative distribution of the points in the 3D space.
I'm sorry, I'm not really comfortable with maths language.


Accepted Answer

John D'Errico 2018-12-10
Edited: John D'Errico 2018-12-10
Sorry, but as soon as you do any normalization at all, you ARE changing the distribution.
It may be that you are not changing the distribution in any way that is significant to you, but it is somewhat difficult to know what you would find significant.
As well, you have not even said what normalization means to you, since I can certainly think of different ways you might attempt such an act, some of them more disruptive than others to the "distribution" of the numbers.
So, consider an arbitrary set of numbers, living in R^3. You can visualize this set as describing a sort of 3-dimensional potato, floating in space.
Simplest might be a simple shift, or translation. Can we shift the entire potato anywhere in space, by adding a constant to all numbers? Such a translation is just a shift of the implicit origin of that space, but it does indeed change the "distribution" in at least a simple sense. We might add the same number to each of x, y, and z, or a different value to each. But if we add a different value to each channel, this might be argued to be a more significant distortion to the distribution. How do I know what you would take to be significant?
Next, we might scale each of x, y, and z by some constant. The same constant, or one that is specific to each channel. Does that matter? Well, it does, again, to some extent. Suppose I start with a set of points on a perfect sphere, but then scale each of x, y, and z by different constants? Now, what was once a perfect sphere is now an ellipsoid, so no longer a sphere. So while we can translate a sphere around anywhere we want by moving the implicit origin, after that translation, a sphere is still a sphere, is still a sphere. However, an ellipsoid is NOT a sphere. Topologically, the two are equivalent though, in the sense that no holes or strange folds were introduced.
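The sphere-versus-ellipsoid point can be checked numerically. The following is only a sketch on made-up data: the variable names (`P`, `Pu`, `Pa`) and the scale factors 2 and [1 2 3] are assumptions chosen for illustration, not anything from the question.

```matlab
% Sample points on the unit sphere (Gaussian vectors scaled to length 1).
rng(0);                              % fixed seed, only for repeatability
P = randn(200,3);
P = P ./ sqrt(sum(P.^2,2));          % every row now has norm 1

% Uniform scaling: the same factor on x, y and z. Still a sphere (radius 2).
Pu = 2 * P;
ru = sqrt(sum(Pu.^2,2));

% Per-axis scaling: a different factor for each column. Now an ellipsoid.
Pa = P .* [1 2 3];
ra = sqrt(sum(Pa.^2,2));

% Spread of the distances from the origin: ~0 for a sphere, large otherwise.
fprintf('uniform scaling:  max(r) - min(r) = %g\n', max(ru) - min(ru));
fprintf('per-axis scaling: max(r) - min(r) = %g\n', max(ra) - min(ra));
```

With uniform scaling all points stay at the same distance from the origin; with per-axis scaling they do not, which is the distortion described above.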
So it really may matter what you consider to be validly normalized. When has the distribution changed?
Consider a simple example, in only ONE dimension. Since you talk about distributions, I'll do exactly that.
X = exp(randn(1,100));
Here, I have created a random variable X. X is in fact lognormally distributed. But now, let me perform a simple translation on X.
Y = X - 1;
Y is a translated version of X. And, while X was indeed lognormally distributed, Y does NOT follow a lognormal distribution. (You could describe it as a shifted lognormal distribution though.)
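One way to see this, sketched under the assumption of a larger sample (1000 values instead of 100) and a fixed seed: a lognormal variable is strictly positive, while the shifted copy is not, even though the spacing between the sorted values is unchanged. The name `shapeSame` is made up for this illustration.

```matlab
rng(1);                          % fixed seed, only for repeatability
X = exp(randn(1,1000));          % lognormal: log(X) is normally distributed
Y = X - 1;                       % translated copy of X

% A lognormal variable is strictly positive; the shifted one need not be.
fprintf('min(X) = %g, min(Y) = %g\n', min(X), min(Y));
fprintf('fraction of Y <= 0: %g\n', mean(Y <= 0));

% The shift leaves the gaps between sorted values unchanged (up to round-off).
shapeSame = max(abs(diff(sort(X)) - diff(sort(Y)))) < 1e-12;
disp(shapeSame)
```

So log(Y) is not even defined for part of the sample, which is why Y cannot be lognormal, although its shape is the same as that of X.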
I know, this may all seem pedantic. But the fact is, you said you did not wish to change the distribution, and any normalization will indeed do exactly that, on at least some level.
  3 comments
John D'Errico 2018-12-10
Then you need to be very careful in how you approach the normalization. I'll make up an example, since I lack any data. I'll do it in two dimensions, since that makes it easy to plot things to understand what I did.
xy = [rand(10,1)*5,randn(10,1)]
xy =
4.5013 -0.84177
2.5401 0.47066
2.3184 -0.10761
0.64343 0.43052
1.8446 0.4701
2.0855 0.097055
2.9005 -1.7308
2.8352 0.51591
0.95134 -0.40305
0.17789 -0.12938
Some garbage data, but it will suffice to get my points across. Assume this corresponds to a set of 10 pairs of numbers, (x,y), thus each row of the matrix is one point in the (x,y) plane. We can plot them as
plot(xy(:,1),xy(:,2),'o')
grid on
axis equal
[Figure: scatter plot of the 10 (x,y) points]
Just some random, garbage data. But let's pretend that where things lie in space is important. Now, in order to normalize the two sets to lie in [0,1], we might decide to start in one of two ways. We could translate x by subtracting the minimum over all of the vector x. Likewise, translate y by subtracting the minimum of y from each y.
xy1 = xy - min(xy,[],1)
xy1 =
4.3234 0.889
2.3622 2.2014
2.1405 1.6232
0.46554 2.1613
1.6667 2.2009
1.9077 1.8278
2.7226 0
2.6573 2.2467
0.77345 1.3277
0 1.6014
So here, we have shifted x down, but y was shifted up.
min(xy,[],1)
ans =
0.17789 -1.7308
The net result is that one point in the new xy1 now has x equal to zero and another has y equal to zero, but they are different points! So we translated the two variables by different amounts. That may be improper for your goals. Or not. Only you know.
Alternatively, we might have done the translation so that we shifted them equally.
min(min(xy))
ans =
-1.7308
xy2 = xy - min(min(xy))
xy2 =
6.2321 0.889
4.2708 2.2014
4.0491 1.6232
2.3742 2.1613
3.5754 2.2009
3.8163 1.8278
4.6313 0
4.5659 2.2467
2.6821 1.3277
1.9087 1.6014
Clearly not the same as what we did to get xy1. But here the potato was shifted by the same amount in each variable.
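As a rough check, assuming that "relative positions" means the distances between pairs of points: both translations leave every pairwise distance unchanged, and they differ only in where the cloud sits relative to the origin. The helper `pd` is hypothetical, written by hand here so that no toolbox is needed.

```matlab
rng(3);                                    % fixed seed, only for repeatability
xy  = [rand(10,1)*5, randn(10,1)];         % same kind of made-up data as above
xy1 = xy - min(xy,[],1);                   % per-column shift
xy2 = xy - min(xy(:));                     % one common shift

% Hypothetical helper: matrix of Euclidean distances between all row pairs.
pd = @(A) sqrt((A(:,1) - A(:,1).').^2 + (A(:,2) - A(:,2).').^2);

d0 = pd(xy);
d1 = pd(xy1);
d2 = pd(xy2);

% Largest change in any pairwise distance; both are at round-off level.
fprintf('per-column shift: %g\n', max(abs(d0(:) - d1(:))));
fprintf('common shift:     %g\n', max(abs(d0(:) - d2(:))));
```

The two results only start to differ in shape once a scaling step is applied, because the scale factors are then computed from differently placed data.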
Now, we can also decide to scale the data. A linear scaling is the same idea. For example, we might rescale each dimension by the maximum of that variable now. Thus, here are two rescalings of the data:
xy1s = xy1./max(xy1,[],1)
xy1s =
1 0.39569
0.54637 0.97986
0.49509 0.72247
0.10768 0.96199
0.38551 0.97961
0.44124 0.81357
0.62974 0
0.61463 1
0.1789 0.59097
0 0.71278
xy2s = xy2./max(max(xy2))
xy2s =
1 0.14265
0.6853 0.35324
0.64972 0.26045
0.38096 0.3468
0.5737 0.35315
0.61237 0.29329
0.74314 0
0.73265 0.3605
0.43037 0.21305
0.30626 0.25696
You need to recognize the differences, and decide which is appropriate for your problem.
plot(xy1s(:,1),xy1s(:,2),'o')
grid on
axis([0 1 0 1])
axis equal
axis square
[Figure: scatter plot of xy1s in the unit square]
Compare that to...
plot(xy2s(:,1),xy2s(:,2),'o')
grid on
axis([0 1 0 1])
[Figure: scatter plot of xy2s in the unit square]
In both cases, the data was "normalized" to live in the unit square. I cannot know which meaning of a normalization that does not change their relative positions you would prefer. But you should be able to make the choice, and how to do the normalization should be clear, whether in 2-d or in 3-d.
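For the 3-d case, the same two choices might look like the sketch below. The data are made up (the extents 400, 300 and 40 are arbitrary), and the names `Pa`, `Pb` and `Pc` are assumptions for illustration. `Pa` and `Pb` mirror xy1s and xy2s; `Pc` is a third variant that is not in the text above: a per-axis shift followed by a single common scale factor.

```matlab
rng(2);                                   % fixed seed, only for repeatability
P = [rand(50,1)*400, rand(50,1)*300, rand(50,1)*40];   % made-up (x,y,z) rows

% Like xy1s: shift and scale each axis on its own.
% Every axis fills [0,1], but the aspect ratio of the cloud is lost.
Pa = P - min(P,[],1);
Pa = Pa ./ max(Pa,[],1);

% Like xy2s: one common shift and one common scale factor.
% The aspect ratio is kept; the cloud need not touch 0 on every axis.
Pb = P - min(P(:));
Pb = Pb ./ max(Pb(:));

% Variant (not from the text above): per-axis shift, one common scale factor.
% The aspect ratio is kept and every axis starts at 0.
Pc = P - min(P,[],1);
Pc = Pc ./ max(Pc(:));

scatter3(Pc(:,1), Pc(:,2), Pc(:,3), 'o')
axis equal
grid on
```

Whichever variant is chosen, the same one would have to be applied to both point sets being compared; otherwise any difference seen may come from the normalization rather than from the data.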
Marine Bertschy 2018-12-10
I think the best way for me is to rescale my data.
Thank you very much for your time and your help regarding my question.


More Answers (1)

KSSV 2018-12-10
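A = randsample(1000,100) ;          % 100 integers drawn from 1:1000 (Statistics and Machine Learning Toolbox)
normA = A - min(A(:)) ;             % shift so that the minimum is 0
normA = normA ./ max(normA(:)) ;    % scale so that the maximum is 1
subplot(211)
plot(A)                             % original values
subplot(212)
plot(normA)                         % same shape, now in [0,1]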
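The same min/max normalization can probably also be written with built-in functions, assuming a release that has them: `rescale` (R2017b or later) and `normalize` with the `'range'` method (R2018a or later). The small matrix `A` below is made up so that the results are easy to read.

```matlab
A = [1 10 100; 2 20 200; 4 40 400];      % made-up data, one variable per column

% Global scaling: one minimum and one maximum for the whole array.
G = rescale(A);

% Per-column scaling: pass the column minima and maxima explicitly.
C = rescale(A, 'InputMin', min(A,[],1), 'InputMax', max(A,[],1));

% normalize with 'range' also maps each column to [0,1].
N = normalize(A, 'range');

disp(G)
disp(C)
disp(N)
```

`G` keeps the relative sizes of the columns, while `C` and `N` stretch each column to fill [0,1] on its own, which is the same trade-off as in the accepted answer.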

Release

R2018b
