How to represent gray scale images as affine subspaces?
4 次查看(过去 30 天)
显示 更早的评论
M
2023-10-23
How to represent gray scale images as affine subspaces?
4 个评论
回答(4 个)
Image Analyst
2023-10-23
I don't know what you mean. What's the context? What do you mean by "model"? What do you mean by "affine subspaces"? Do you just want to warp or spatially transform the image?
If you have any more questions, then attach your data and code to read it in with the paperclip icon after you read this:
1 个评论
M
2023-10-23
编辑:M
2023-10-23
@Image Analyst @Matt J In the attached paper they represented the images in Affine subspaces, I am asking generally if there is a popular method/code of representing the image in affine space.
My data is huge attached is a sample.
Matt J
2023-10-23
编辑:Matt J
2023-10-23
One way, I suppose would be to train an affine neural network with the Deep Learning Toolbox, e.g.,
layers=[imageinputLayer(120,160,1);
convolution2dLayer([120,160],N) )
regressionLayer];
XTrain=images;
YTrain=zeros(1,1,N,size(XTrain,4));
net=trainNetwork(XTrain, YTrain, layers,.... )
but you would need a truly huge number of images and good regularization for the training to be well-posed. You should probably look at augmentedImageDatastore.
34 个评论
M
2023-10-23
@Matt this is a normal CNN, I am looking for affine representation and Grassmann manifold.
Matt J
2023-10-23
编辑:Matt J
2023-10-23
this is a normal CNN
I don't know why you'd call it "normal". There are no ReLUs or pooling layers, and the filter kernel is the size of the whole image (you could have equivalently used a fullyconnectedLayer as well). The weights of the CNN will provide equations for the best fitting N-dimensional affine subspace to your image data set.
If you do not want an equation matrix, what form should the result take?
M
2023-10-24
编辑:M
2023-10-24
@Matt J The weights of the CNN will provide equations for the best fitting N-dimensional affine subspace to your image data set.
How did you reach to this? what is the mathematics behind it?
net=trainNetwork(XTrain, YTrain, layers,.... )
What is the options that fit this NN?
@Matt J I tried to do the following, but I think what I have done is wrong
But I keep getting the following error: Number of observations in X and Y disagree.
Also, I still didnt get what is the relation between this and affine and grassmann space!!
layers=[ imageInputLayer([120 160 1], 'Normalization', 'rescale-zero-one')
convolution2dLayer([120,160],7200)
regressionLayer];
options = trainingOptions('adam', ...
'MiniBatchSize',200, ...
'MaxEpochs',10, ...
'InitialLearnRate',1e-3, ...
'LearnRateSchedule','piecewise', ...
'LearnRateDropFactor',0.1, ...
'LearnRateDropPeriod',20, ...
'ValidationData', {images_Test, labels_Test}, ...
'ValidationFrequency',200, ...
'Shuffle','every-epoch', ...
'Plots','training-progress');
Net = trainNetwork(XTrain, YTrain, layers, options);
Catalytic
2023-10-24
编辑:Catalytic
2023-10-24
The paper you've posted is about image classification, so I assume that's your task as well. As I understand @Matt J's approach, the idea is that you would separate your images into classes and train a separate neural network for every image class. The weights of each network give you the affine subspace that each of your classes belong to within the Grassmannian manifold, and the output of the network will be near zero for any image within that subspace. Therefore, if you want to classify a new image, you pass it through each of the networks and see which network gives the smallest output.
M
2023-10-24
编辑:M
2023-10-24
the idea is that you would separate your images into classes
I have 3 classes , so I have to separate each class in image set and save each weight after training?
The weights of each network give you the affine subspace that each of your classes belong to within the Grassmannian manifold,
How did you reach to this conclusion, what is the mathematics behind it?
and the output of the network will be near zero for any image within that subspace.
do you mean that all the labels will be zeros?
Therefore, if you want to classify a new image, you pass it through each of the networks and see which network gives the smallest output.
How that can be done? and What is the advange of doing this? Do you think this is a computually cost comparing with CNN since we have only a single weight there?
Matt J
2023-10-24
编辑:Matt J
2023-10-24
I have 3 classes , so I have to separate each class in image set and save each weight after training?
Or you can just save the net object that trainNetwork gives you.
How did you reach to this conclusion, what is the mathematics behind it?
For some matrix A and vector b, an affine subspace is the set of vectors x satisfying A*x+b=0. The Grassmanian manifold, as I understand it, is the set of all {A,b} pairs defining subspaces of a fixed dimension k. The neural network above implements A*x(:)+b where x is an input image and A and b are the training weights. We want to find the weights such that A*x(:)+b=0 over each class of images.
do you mean that all the labels will be zeros?
No, the network responses will be zeros. It's a regression network.
What is the advange of doing this?
I don't know that there is an advantage. You brought the Sharma article to the party. Does it say there should be an advantage relative to conventional CNNs?
M
2023-10-24
编辑:M
2023-10-24
@Matt J your apporach is different that what was suggested in the paper, but it may give a good result if we calculate the Grassman distance between the training and testing subspaces
But I am not able to test and evaluate your approach, could you please suggest a continuation for your following code?
layers=[imageinputLayer(120,160,1);
convolution2dLayer([120,160],N) )
regressionLayer];
XTrain=images;
YTrain=zeros(1,1,N,size(XTrain,4));
net=trainNetwork(XTrain, YTrain, layers,.... )
Matt J
2023-10-24
编辑:Matt J
2023-10-24
This example ran for me, but I wouldn't expect you to run it exactly this way. You probably want to set up an augmentedDataStore, as I mentioned before. You will need to read examples of that in the documentation.
N=10;
layers=[imageInputLayer([120,160,1])
fullyConnectedLayer(N)
regressionLayer];
XTrain=rand(120,160,1,5);
YTrain=zeros(1,1,N,size(XTrain,4));
options = trainingOptions('adam', ...
'MiniBatchSize',200, ...
'MaxEpochs',10, ...
'InitialLearnRate',1e-3, ...
'LearnRateSchedule','piecewise', ...
'LearnRateDropFactor',0.1, ...
'LearnRateDropPeriod',20, ...
'ValidationFrequency',200, ...
'Shuffle','every-epoch', ...
'Plots','training-progress');
net=trainNetwork(XTrain, YTrain, layers,options);
M
2023-10-24
编辑:Matt J
2023-10-24
They do the same thing. You can use analyzeNetwork to verify that the activations are 1x1xN in both cases.
N is the number of image right?
N is the number of rows in the matrix equations A*x=b. So, if you are considering a Grassmanian manifold Gr(n,k) of k-dimensional affine subspaces of then N=n-k.
Keep in mind that the total number of layer weights will be 120*160*N+N, so you mustn't be too liberal with your choice of N.
Do I have to train 3 neural nets, one for each class and pass the test image through these NN?
Yes. You must train 3 neural nets and each of these neural nets must be trained only with images in the corresponding class.
Also how can I test this neural net? Pass an image?
Yes. If the image is a member of the class that the NN is supposed to detect, the outputs should all be zero, or close to zero, depending on how well the test image agrees with the equations for the affine subspace, A*x(:)+b=0.
M
2023-10-24
编辑:M
2023-10-24
@Matt J Thanks a lot for you answers.
Could you please tell me based on what do we choose the number of N?
If it is a member of the class that NN is supposed to detect, the outputs should all be zero, or close to zero
How can I see the outputs?
Also, How can we extract the pair of (A,B) from the neural net so I can use them in further analysis?
Matt J
2023-10-24
编辑:Matt J
2023-10-24
Could you please tell me based on what do we choose the number of N?
No, I've never seen this kind of classification done before. Doesn't Sharma make some sort of choice of what subspace dimensions the images are supposed to live in?
How can I see the outputs?
Also, How can we extract the pair of (A,B) from the neural net so I can use them in further analysis?
Once the network is trained, A will be the Weights property of the fullyConectedLayer and b will be the Bias property.
Matt J
2023-10-24
编辑:Matt J
2023-10-24
Well, in general, we can write the estimation of A,b as the norm minimization problem,
where X is a matrix whose rows are your training images (from one class). In principal, there are a number of solution approaches to this when X and A are of small dimension. However, you seemed to imply that X will be a very big matrix, probably too big to store in RAM. Indeed also, you will need a large number of images depending on how many unknown variables A(i,j) b(i) that you are expecting to use to represent the subspaces. Ideally, you would want size(X,1) to be considerably greater than N=size(A,1).
The advantage of using trainNetwork is that it is uniquely suited to large data: it allows your X data to be kept in image data stores instead of in RAM and breaks the optimization down into steps which use only small sub-chunks of X . It also has regularization methods built into it, which is helpful when you don't have a lot of data.
M
2023-10-25
编辑:M
2023-10-25
@Matt J Thank you, But how can I verify that the obtained weight satisfies A*x(:)+b=0, ? Also I tried to do test with the same input images by Y = predict(net,xTrain), it didn't give zeros or close to zero!
X is a matrix whose rows are your training images
Do you mean that I have to convert each image to row to perfom the compution?
I tried the following the answer was not zero:
W = net.Layers(2,1).Weights;
B = net.Layers(2,1).Bias;
A = XTrain;
s = size(A);
img1 = reshape(A,s(1)*s(2)*s(3),s(4));
>> W*img1+B;
%or
norm(img1'*W'+B');
Matt J
2023-10-25
编辑:Matt J
2023-10-25
I would suggest first testing on simulated low-dimensional image data that you know obeys an affine model perfectly,
images=randn(3,2)*randn(2,1000) + randn(3,1); %3x1 images
These toy images live in , so we can verify whether the NN find the correct result using simple plane-fitting, e.g., by downloading planarFit() from here,
p=planarFit(images);
A=p.normal;
b=-p.distance;
norm(A*images+b) %very close to zero
ans = 4.8727e-14
plot(p); legend Images
Matt J
2023-10-25
编辑:Matt J
2023-10-25
I don't understand what you did and what this plot represents. Is this generated from your real images? Your real images are not 3x1. They are 120x160.
In any case, if your images are in the same class, but do not lie in or near to a common affine subspace, then your model fails, and the method fails.
M
2023-10-25
@Matt J Yes, they are 120×160. I understood you wrong in the previous answer. Sorry.
In any case, if your images are in the same class, but do not lie in or near to a common affine subspace
How to test this?
Matt J
2023-10-25
Well, the neural network training determines that. Assuming the training is successful (and that is an important assumption), the training responses will be close to zero.
Matt J
2023-10-25
The output of the network. Remember as well that "close to zero" is a relative term. I don't think we expect your image classes to fit the model without errors, and we don't know how big the errors will be. We just have to hope that it is small enough to allow the classes to be distinguished.
Matt J
2023-10-25
编辑:Matt J
2023-10-25
Here's a working example of the plane fitting, but with a neural network. I had to turn of L2-regularization to get it to work. Not sure what that will mean for your real use case.
images=randn(3,2)*randn(2,1000) + randn(3,1); %3x1 images
images=reshape(images,3,1,1,[]);
N=1;
layers=[imageInputLayer([3,1,1],'Normalization','none')
fullyConnectedLayer(N,'WeightL2Factor',0,'BiasL2Factor',0 )
regressionLayer];
XTrain=images;
YTrain=zeros(1,1,N,size(XTrain,4));
options = trainingOptions('adam', ...
'MiniBatchSize',100, ...
'MaxEpochs',50, ...
'InitialLearnRate',1, ...
'LearnRateSchedule','piecewise', ...
'LearnRateDropFactor',0.1, ...
'LearnRateDropPeriod',20, ...
'ValidationFrequency',200, ...
'Shuffle','every-epoch');
close all force
net=trainNetwork(XTrain, YTrain, layers,options);
Training on single CPU.
|========================================================================================|
| Epoch | Iteration | Time Elapsed | Mini-batch | Mini-batch | Base Learning |
| | | (hh:mm:ss) | RMSE | Loss | Rate |
|========================================================================================|
| 1 | 1 | 00:00:00 | 1.78 | 1.6 | 1.0000 |
| 5 | 50 | 00:00:00 | 0.06 | 2.0e-03 | 1.0000 |
| 10 | 100 | 00:00:00 | 8.13e-03 | 3.3e-05 | 1.0000 |
| 15 | 150 | 00:00:00 | 1.83e-03 | 1.7e-06 | 1.0000 |
| 20 | 200 | 00:00:00 | 9.71e-05 | 4.7e-09 | 1.0000 |
| 25 | 250 | 00:00:01 | 1.20e-05 | 7.2e-11 | 0.1000 |
| 30 | 300 | 00:00:01 | 1.44e-06 | 1.0e-12 | 0.1000 |
| 35 | 350 | 00:00:01 | 3.81e-08 | 7.3e-16 | 0.1000 |
| 40 | 400 | 00:00:01 | 6.86e-09 | 2.4e-17 | 0.1000 |
| 45 | 450 | 00:00:01 | 6.34e-09 | 2.0e-17 | 0.0100 |
| 50 | 500 | 00:00:01 | 5.71e-09 | 1.6e-17 | 0.0100 |
|========================================================================================|
Training finished: Max epochs completed.
A=net.Layers(2).Weights; s=norm(A);
A=A/s;
b=net.Layers(2).Bias/s;
norm(A*images(:,:)+b)
ans = single
2.7123e-06
p=planarFit.groundtruth(images(:,:), A,-b);
plot(p)
M
2023-10-26
I tried to change so many parameters but still I am getting very high numbers of mini batch rmse and mini batch loss as wel very high number of norm(A*images(:,:)+b), I dont know what is the problem!
Do you have any suggestion please. thanks
M
2023-10-26
编辑:M
2023-10-26
because the norm of my real data is 1.3900e+05
And
Epoch | Iteration | Time Elapsed | Mini-batch | Mini-batch | Base Learning |
| | | (hh:mm:ss) | RMSE | Loss | Rate
| 200 | 2600 | 00:01:20 | 9.26 | 42.9 | 1.0000e-09 |
My real image which belong to the same class 2608 images are attached in the link without augmentation https://we.tl/t-hDRA25aaJ1
N=15;
layers=[imageInputLayer([120 120 1],'Normalization', 'rescale-zero-one')
fullyConnectedLayer(N,'WeightL2Factor',0,'BiasL2Factor',0 )
regressionLayer];
YTrain=zeros(1,1,N,size(XTrain,4));
options = trainingOptions('adam', ...
'MiniBatchSize',200, ...
'MaxEpochs',200, ...
'InitialLearnRate',1, ...
'LearnRateSchedule','piecewise', ...
'LearnRateDropFactor',0.1, ...
'LearnRateDropPeriod',20, ...
'ValidationFrequency',200, ...
'Shuffle','every-epoch', ...
'Plots','training-progress');
net=trainNetwork(XTrain, YTrain, layers,options);
A=net.Layers(2).Weights;
%s=norm(A);
%A=A/s;
b=net.Layers(2).Bias;
Matt J
2023-10-26
编辑:Matt J
2023-10-26
The training curves look pretty good to me. But you can't use rescale-to-one normalization. That ruins the affineness of the network. Also, the normalization of the weights and biases will be important.
A=net.Layers(2).Weights;
s=vecnorm(A,2,2);
A=A./s;
b=net.Layers(2).Bias./s;
M
2023-10-26
编辑:M
2023-10-26
When I tried
ayers=[imageInputLayer([120 120 1],'Normalization','none')
fullyConnectedLayer(N,'WeightL2Factor',0,'BiasL2Factor',0 )
regressionLayer];
I got
Epoch | Iteration | Time Elapsed | Mini-batch | Mini-batch | Base Learning |
| | | (hh:mm:ss) | RMSE | Loss | Rate
| 200 | 2600 | 00:01:25 | 2752.34 | 3.8e+06 | 1.0000e-09 |
The RMSE and Loss are very high! Why?
And with the above and weight and Bias normalization I got norm 820!
But you can't use rescale-to-one normalization. That ruins the affineness of the network. Also, the normalization of the weights and biases will be important.
I noticed good enhancement in the norm with this comment, but I didnt get why That ruins the affineness
M
2023-10-26
编辑:M
2023-10-26
@Matt J I want to ask you please if I use any optimization technique and so on to update the weight and bias with cost function A*x(:)+b=0, and estimate "N", Would that give better results?
Also, If I use the optimizqtion to estimiate the (A, B and N) with with cost function A*x(:)+b=0, would that be better?
Thanks
Matt J
2023-10-26
You have more experience than I do. As I said before, I've never seen this classification strategy before.
Matt J
2023-10-27
编辑:Matt J
2023-10-27
Well, in general, we can write the estimation of A,b as the norm minimization problem,
If X can be fit in RAM, you could just use svd() to solve it
N=14;
X=images(:,:)';
vn=vecnorm(X,inf,1);
[~,~,V]=svd([X./vn, ones(height(X),1)] , 0);
Abt=V(:,end+1-N:end)./vn';
A=Abt(1:end-1,:)';
b=Abt(end,:)';
s=vecnorm(A,2,2);
[A,b]=deal(A./s, b./s);
43 个评论
M
2023-10-30
编辑:M
2023-10-30
@Matt J Thank you, Could you please explain the code more?
But I didn't get why did you convert the images as concatenated matrices? as well why did you use this type of norm insead of frob norm
Do you know what is the best way of the test may be for all the classes?
I am getting the following error
load('XT.mat')
size(XT)
ans = 1×4
120 120 1 1700
N=14;
X=XT(:,:)';
size(X)
ans = 1×2
204000 120
vn=vecnorm(X,inf,1);
[~,~,V]=svd([X./vn, ones(height(X),1)] , 0);
Abt=V(:,end+1-N:end)./vn';
Arrays have incompatible sizes for this operation.
A=Abt(1:end-1,:)';
b=Abt(end,:)';
s=vecnorm(A,2,2);
[A,b]=deal(A./s, b./s);
norm(A*X(:,:)+b)
Matt J
2023-10-30
编辑:Matt J
2023-10-30
You'll probably have to downsample the images so that you have more samples than pixels. I've used sepblockfun from this FEX download,
to do so.
load('XT.mat')
XT=sepblockfun(XT,[3,3],'mean');
size(XT)
ans = 1×4
40 40 1 1700
N=14;
X=reshape(XT,[], size(XT,4)).';
size(X)
ans = 1×2
1700 1600
vn=max([vecnorm(X,inf,1),1],1);
[~,~,V]=svd([X, ones(height(X),1)]./vn , 0);
Abt=V(:,end+1-N:end)./vn';
A=Abt(1:end-1,:)';
b=Abt(end,:)';
s=vecnorm(A,2,2);
[A,b]=deal(A./s, b./s);
PercentError = norm(A*X.'+b)/norm(X.')*100
PercentError = 0.0058
Matt J
2023-11-1
编辑:Matt J
2023-11-1
vn=max([vecnorm(X,inf,1),1],1); %For normalizing X
[~,~,V]=svd([X, ones(height(X),1)]./vn , 0); %Solve equations [X,1]*[A';b']=0
Abt=V(:,end+1-N:end)./vn'; %Pick best N solutions and undo normalization
A=Abt(1:end-1,:)'; %Separate [A';b'] into A and b parts.
b=Abt(end,:)';
Matt J
2023-11-2
You need to pick N solutions because we want A to have N rows.
You need to undo the normalization so that the A,b solution will work on the original, unnormalized image data.
M
2023-11-27
编辑:M
2023-11-27
Hello @Matt J
I am trying your code with XTrain with size 120*120*40000
But I am getting the following error:
Error using svd
Requested array size is too large. Consider using the 'econ' option.
How can I use the economy option in your code please? Or is there any other solution please?
Thanks
M
2023-11-28
I tried the following but I got this error
Error using tall/reshape
Size arguments must be real integers.
Do you have any idea please?
N=4;
XTrain = tall(Data_S);
X=reshape(XTrain,[], size(XTrain,4)).';
size(X)
vn=max([vecnorm(X,inf,1),1],1);
[~,~,V]=svd([X, ones(height(X),1)]./vn , 0);
Abt=V(:,end+1-N:end)./vn';
A=Abt(1:end-1,:)';
b=Abt(end,:)';
s=vecnorm(A,2,2);
[A,b]=deal(A./s, b./s);
Matt J
2023-11-28
编辑:Matt J
2023-11-28
Maybe as follows? I don't know how much RAM you have.
N=4;
XTrain = Data_S;
X=reshape(XTrain,[], size(XTrain,4)).';
size(X)
vn=max([vecnorm(X,inf,1),1],1);
%Process as tall
T=tall([X, ones(height(X),1)]./vn);
clear X
[~,~,V]=svd( T , 0);
clear T
Abt=gather( V(:,end+1-N:end) ) ./vn';
A=Abt(1:end-1,:)';
b=Abt(end,:)';
s=vecnorm(A,2,2);
[A,b]=deal(A./s, b./s);
M
2023-11-28
Error using tall/reshape
Size arguments must be real integers.
This error for this line
X=reshape(XTrain,[], size(XTrain,4)).';
M
2023-11-28
Still I am getting the same error
Evaluating tall expression using the Parallel Pool 'Processes':
Evaluation 0% complete
Error using tall/svd
Requested array size is too large. Consider using the 'econ' option.
Learn more about errors encountered during GATHER.
Error in tall/gather (line 50)
[varargout{:}, readFailureSummary] = iGather(varargin{:});
M
2023-11-29
编辑:M
2023-11-29
@Matt J I upgraded my RAM to 64 GB but still I am getting now this error with svd(___,"econ"):
Evaluating tall expression using the Parallel Pool 'Processes':
Evaluation 0% complete
Error using indexing
Index in position 2 is invalid. Array indices must be positive integers or logical
values.
Learn more about errors encountered during GATHER.
Error in tall/gather (line 50)
[varargout{:}, readFailureSummary] = iGather(varargin{:});
The problem in this line: Abt=gather( V(:,end+1-N:end) ) ./vn';
M
2023-11-29
@Matt J when I check the memory :
It seems the problem is not the memory, I cant figure the problem, Do you have any suggestion please!
memory
Maximum possible array: 74782 MB (7.841e+10 bytes) *
Memory available for all arrays: 74782 MB (7.841e+10 bytes) *
Memory used by MATLAB: 3974 MB (4.167e+09 bytes)
Physical Memory (RAM): 64758 MB (6.790e+10 bytes)
M
2023-11-29
编辑:M
2023-11-29
Update
I coverted the XTrian to 4D instead of 3D ; I mean with size 120*120*1* 39698 instead of 120*120* 39698,
And now I am getting this message since two hr and Matlab still busy but the Evaluation 45% is not changing
Starting parallel pool (parpool) using the 'Processes' profile ...
Connected to parallel pool with 12 workers.
Evaluating tall expression using the Parallel Pool 'Processes':
- Pass 1 of 2: Completed in 10 sec
Evaluation 45% complete
Matt J
2023-11-29
Not sure, but if it is hanging, it is for hardware-specific reasons. Maybe try it on a different computer?
M
2023-12-5
编辑:M
2023-12-5
I want to ask you a question please,
If I want to calculate the grassman distance between the subspaces,
Which subspace I have to consider in the code?
Is (A*XTrain'+b) a subspace and (A*XTest+b) a subspace?
Where XTrian is all the Training Images and XTrest is one test image?
Note, I tried to use the PercentError = norm(A*XTest.'+b)/norm(XTest.')*100 to evaluate the performance but it is failed
Another question,
Why did you consider V in the computions not U, [~,~,V]=svd( T , 0);
Thanks
Matt J
2023-12-5
Is (A*XTrain'+b) a subspace and (A*XTest+b) a subspace?
The subspace is defined by the pair of matrices {A,b}, but I don't know how the Grassman distance is defined nor how the {A,b} representation would be used to compute it.
Why did you consider V in the computions not U, [~,~,V]=svd( T , 0);
V contains the (approximate) null vectors of T.
M
2023-12-5
编辑:M
2023-12-5
@Matt J but when I tried to compute A'*A it didnt give me the identity which is supposed to give Identity, that's why I am asking about V.
The subspace is defined by the pair of matrices {A,b},
So the test and train should belong to the same subspace, but still I am not able to evaluate this point.
@Torsten do you think it can be applied in this case? when we have pair of matrices {A,b}, of training and there is an X matrix for test
Can we measure it as the following between the directions of (A*XTrain'+b) and (A*XTest+b) ? where A and B in both are the same?
There purpose is to judge that a matrix is belong to the subspace or not(Classification)
[~,S,~] = svd(directions1.'*directions2);
distance = sqrt(sum(arrayfun(@(i)acos(S(i,i))^2,1:size(S,1))))
or by
subspace(directions1,directions2)
Matt J
2023-12-5
编辑:Matt J
2023-12-5
but when I tried to compute A'*A it didnt give me the identity which is supposed to give Identity
I don't know why you think it should.
There purpose is to judge that a matrix is belong to the subspace or not(Classification)
If so, then maybe an alternative distance metric could be,
distance(x)=norm(A*x+b)
M
2023-12-5
M
2023-12-5
You can check in the 2nd reference that
[A, b0] orthogonal affine coordinates if [A, b0] ∈ R n×(k+1) , A TA = I, A T b0 = 0
Matt J
2023-12-5
编辑:Matt J
2023-12-5
Failed, it gave me less norm for images that don't belong to the same class compared to the norm for images that belong to the same class
Then either you have insufficient training data or you need to increase N.
[A, b0] orthogonal affine coordinates if [A, b0] ∈ R n×(k+1) , A TA = I, A T b0 = 0
That's not what A and b are in this context.
M
2023-12-5
编辑:M
2023-12-5
I have 40,000 training and 8 testing, At least this should be indicator to Algorithm performance!
As well I tried different number of N, but I didn't find improvement in the performance
That's not what A and b are in this context.
So what is the A and b in this context? Are not the direction and origin?
Matt J
2023-12-5
编辑:Matt J
2023-12-5
As well I tried different number of N, but I didn't find improvement in the performance
Then perhaps the images simply do not follow an affine model. Or, the dimension of the subspace is very low and requires a very large N (but if so that will be computationally problematic).
So what is the A and b in this context? Are not the direction and origin?
No, they are simply matrices of equation coefficients. The equation for the subspace is A*x=b.
Matt J
2023-12-5
编辑:Matt J
2023-12-5
You could also try a somewhat modified pre-normalization from what we had before,
N=4;
XTrain = tall(Data_S);
X=reshape(XTrain,[], size(XTrain,4)).';
mu=mean(X,1); %<-----
X=X-mu; %<-----
vn=max([vecnorm(X,inf,1),1],1);
[~,~,V]=svd([X, ones(height(X),1)]./vn , 0);
Abt=V(:,end+1-N:end)./vn';
A=Abt(1:end-1,:)';
b=Abt(end,:)';
b=b+A*mu(:); %<-----
s=vecnorm(A,2,2);
[A,b]=deal(A./s, b./s);
Torsten
2023-12-5
If you would in short explain how images are usually given, how this representation can be transformed to a representation by an affine subspace and how you plan to cluster images for which the distance between the subspace representations is "near", maybe more participants in this forum could help.
M
2023-12-5
编辑:M
2023-12-5
With N=14 the basis you are talking about would fill a 40000x(40000-N) matrix. It would be unrealisitic to compute it and store in RAM.
The direction and origins approach to compute the distance and kernel may work, because it is used in the literature, But I cant reach for a conclusion how to use get them
M
2023-12-5
编辑:M
2023-12-11
how images are usually given,
My images are frame based representation (Grayscale) , I have 3 classes , currently I Picked up for each class 40000 images for training and 9000 images for testing
how this representation can be transformed to a representation by an affine subspace
@Matt J suggeted to use SVD to represent each class as affine subspace as AX+b=0
how you plan to cluster images for which the distance between the subspace representations is "near",
I am thinking to have a subspace for each class, then for test for one frame I can compute the distance between it and the 3 subspaces, the lowest distance will indicate that this frame belongs to the this class
Torsten
2023-12-5
Ok, with this explanation I think Matt will remain the only participant in the discussion :-)
Matt J
2023-12-5
I am thinking to have a subspace for each class
It's important to keep in mind that this could be a bad modeling assumption. If you are getting bad results because the subspaces are not truly affine, we cannot help you overcome that.
Torsten
2023-12-5
编辑:Torsten
2023-12-5
So you say you have 3 classes for your images that are known right from the beginning.
Say you determine the affine subspace for each 49000 images that best represents your image and you compute the mutual distance by SVD between these 49000 affine subspaces (which would give a 49000x49000 matrix). Now say you cluster your images according to this distance matrix into 3 (similar) clusters derived from the distance matrix.
The question you should ask yourself is: would these three clusters resemble the 3 classes that you think the images belong to right at the beginning ?
If the answer is no and if you consider the 3 classes from the beginning as fixed, then the distance measure via SVD is not adequate for your application.
M
2023-12-6
编辑:M
2023-12-6
@Matt J unfortunately the pre-normalization, and increasing the number of N didnt improve the performance.
I sure that the problem in the indicitor (norm), other indicators may provide better results as Grassman Kernel and distance. because that's proved in the literature. but still I am looking how can I apply them
Matt J
2023-12-5
编辑:Matt J
2023-12-5
So how Can I get the direction and origins so I can compute the grassman distance?
Here is another variant that gives a Basis/Origin description of the subspace.
N=100; %estimated upper bound on subspace dimension
X=reshape(XTrain,[], size(XTrain,4)); %<----no tranpose
mu=mean(X,2);
X=X-mu;
[Q,~,~]=qr(X , 0);
Basis=Q(:,1:N); %Basis direction vectors
Origin=mu-Basis*(Basis.'*mu); %Orthogonalize
18 个评论
Torsten
2023-12-6
编辑:Torsten
2023-12-6
Just one question: Is it really true that in the end, you want to characterize the difference between two images by one single number, namely their distance in the grassman metric ? Just naively thought: can this ever work for such a complex thing like an image ? Or are the images themselves quite similar with only slight differences that are to be worked out ?
Matt J
2023-12-6
编辑:Matt J
2023-12-6
why did you decide to take into consideration only the first N columns of Q
The final columns of a QRP decomposition contribute the least to the decomposition, and might be attributable to noise. For example, here is a wide matrix X of rank N=2 as well as a noise-corrupted version, Xnoisy:
X=repmat( rand(4,2) ,1,4)
X = 4×8
0.8430 0.7861 0.8430 0.7861 0.8430 0.7861 0.8430 0.7861
0.7153 0.5335 0.7153 0.5335 0.7153 0.5335 0.7153 0.5335
0.4414 0.4160 0.4414 0.4160 0.4414 0.4160 0.4414 0.4160
0.0416 0.8772 0.0416 0.8772 0.0416 0.8772 0.0416 0.8772
Xnoisy=X + 1e-4*randn(size(X))
Xnoisy = 4×8
0.8429 0.7862 0.8430 0.7861 0.8431 0.7862 0.8430 0.7861
0.7155 0.5336 0.7153 0.5334 0.7151 0.5336 0.7152 0.5335
0.4414 0.4159 0.4413 0.4160 0.4414 0.4160 0.4414 0.4159
0.0418 0.8770 0.0415 0.8772 0.0417 0.8771 0.0416 0.8771
When we take the QRP decomposition of X, we see that the last two rows of R are essentially 0, which means that only linear combinations of the first 2 columns of Q are used in the decomposition to regenerate the columns of X. The final two columns of Q can be discarded. This makes sense, because we know that X is rank-2.
[Q,R,~]=qr(X,"econ"); R
R = 4×8
-1.3583 -0.9308 -1.3583 -0.9308 -0.9308 -1.3583 -0.9308 -1.3583
0 -0.7432 -0.0000 -0.7432 -0.7432 -0.0000 -0.7432 -0.0000
0 0 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
0 0 0 -0.0000 -0.0000 0.0000 -0.0000 0.0000
When we do this with Xnoisy, the final two rows are still quite small, but the effect of the noise is that they are not as close to zero as they should be. Therefore, the final two columns of Q need to be discarded based on some other criterion besides R(i,:)=0.
[Q,R]=qr(Xnoisy, "econ"); R
R = 4×8
-1.1912 -1.0617 -1.1911 -1.0615 -1.1911 -1.0617 -1.1911 -1.0616
0 0.8472 -0.0003 0.8474 -0.0001 0.8473 -0.0002 0.8473
0 0 0.0003 0.0000 0.0004 -0.0000 0.0003 -0.0000
0 0 0 -0.0002 -0.0002 -0.0001 -0.0002 -0.0000
You don't necessarily have to choose N manually, though. You could have used some threshold on the diagonal values of R, e.g.,
N=find(abs(diag(abs(R))> 0.01*max(abs(R(:)))) ,1,'last')
N = 2
M
2023-12-6
编辑:M
2023-12-6
Many papers in the literature used the affine grassmannian, grassman distance, grassman kernal, etc.. for image classification!
In Addition, in my application the images themselves quite similar
But I am stuck in how can I test a one image whether it belongs to a certain affine subspace
Torsten
2023-12-6
this is not a new approach for image classification!
Many papers in the literature used the affine grassmannian, grassman distance, grassman kernal, etc.. for image classification!
It was just a question. From my naive view without background in Image Processing, it's hard to believe that one number could suffice to classify images in similar groups.
But I am stuck in how can I test a one image whether it belongs to a certain affine subspace
But if both images can be represented by an affine subspace, we derived the formula to compute the distance of these two subspaces. If the distance is 0, they belong to the same affine subspace. Or is my train of thought too simple ?
M
2023-12-6
编辑:M
2023-12-6
@Torsten your thoughts are reasonable, same can applied to the Grassman Kernel.
But if we formulate a subspace of all Training images as MAtt suggested by finding A and b in AX+b=0 where X is all the training images, and a subspace for only one test image, how we can we compute the distance between these two subspaces.
I know the formulas but I still cant get how to apply it, should I solve the AX+b=0 and get the direction and origin and then apply?
Torsten
2023-12-6
Didn't we have the formula for the distance of two affine spaces defined by (A1,b1) and (A2,b2) ?
Here, (A1,b1) is the best affine subspace you could get from your training with the 40000 images and (A2,b2) is the affine subspace for the test image.
M
2023-12-6
@Torsten I wanted to try with that but I dont why Matt answered me that it is not the same that we can apply in the grassman distance.
So what is the A and b in this context? Are not the direction and origin?
No, they are simply matrices of equation coefficients. The equation for the subspace is A*x=b.
Also, what about the Basis and origin of the QR, can we apply the grassman distance and kernel on them?
Basis=Q(:,1:N); %Basis direction vectors
Origin=mu-Basis*(Basis.'*mu); %Orthogonalize
Matt J
2023-12-6
编辑:Matt J
2023-12-6
Also, what about the Basis and origin of the QR, can we apply the grassman distance and kernel on them?
@M You're the one who told us that you could. That's why you asked us how to calculate them. You wrote earlier "So how Can I get the direction and origins so I can compute the grassman distance? ".
M
2023-12-6
编辑:M
2023-12-6
@Matt J Yes, But my question is the Basis and origin of the QR in the context of A and b ? so I can apply here or in Kernel formula
A denotes to direction and b to origin
function distance = GrassmannDist(A1, b1, A2, b2)
% Compute Grassmann distance of two afine spaces embeded in R^m of the same dimension of
% the form V = { A*x + b : x in R^n }
% where A is (m x n) matrix and b is (m x 1) vector.
% Usualy n <= dimesnion(V) = rank(A) < m.
pMat1 = GetpMat(A1, b1);
pMat2 = GetpMat(A2, b2);
s = svd(pMat1'*pMat2);
distance = norm(acos(min(s,1)));
end
function pMat = GetpMat(A, b)
A = orth(A);
b = b(:);
b0 = [b-A*(A'*b); 1];
b0 = b0/norm(b0);
A(end+1,end) = 0;
pMat = [A,b0];
end
Matt J
2023-12-6
编辑:Matt J
2023-12-6
Note that there is a reason that I separated my A,b computation proposal and my Basis/Origin proposal into separate answers. They are very different.
The A,b description of a subspace involves finding the null vectors of a matrix. The equation A*X+b=0 or equivalently [X.',1]*[A,b].'=0 is the same as saying that [A,b].' is in the null space of the matrix [X.',1]. In order to find [A,b], we need to compute the null vectors of [X.',1], which are traditionally given by the final V columns of its SVD.
Conversly in the Basis/Origin description, we are searching for a set of basis vectors for the column space of X. As I explained above, these vectors lie in the initial columns of Q in the QRP decomposition of X.
M
2023-12-7
@Matt J Thanks you so much Matt for the clarification!
which are traditionally given by the final V columns of its SVD.
is there here a criteria for selcting the N final V columns of its SVD as you suggested for N initial columns of Q?
Also, Regarding Basis/Origin description, can you give me an idea how do we usually use this information for classification?
Matt J
2023-12-7
编辑:Matt J
2023-12-7
is there here a criteria for selcting the N final V columns of its SVD as you suggested for N initial columns of Q?
The dimension of the subspace that you fit to X will be NumPixels-N, where NumPixels is the total number of pixels in one image.
Also, Regarding Basis/Origin description, can you give me an idea how do we usually use this information for classification?
No, pursuing it was your idea. You said it would help you compute the Grassman distance, whatever that is.
M
2023-12-11
编辑:M
2023-12-11
I reached to a conclusion that representing the images as it is as vectors in a subspace is not a good idea!
Especially if the test images are not typical for the training set.(stacking the images as vectors causes problems!)
I think I have to do some feature extractions of region of interest first then representing the features as subspaces.(whether as matrices or vectors) , Still I am thinking how to do that.
另请参阅
类别
在 Help Center 和 File Exchange 中查找有关 Custom Training Loops 的更多信息
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!发生错误
由于页面发生更改,无法完成操作。请重新加载页面以查看其更新后的状态。
您也可以从以下列表中选择网站:
如何获得最佳网站性能
选择中国网站(中文或英文)以获得最佳网站性能。其他 MathWorks 国家/地区网站并未针对您所在位置的访问进行优化。
美洲
- América Latina (Español)
- Canada (English)
- United States (English)
欧洲
- Belgium (English)
- Denmark (English)
- Deutschland (Deutsch)
- España (Español)
- Finland (English)
- France (Français)
- Ireland (English)
- Italia (Italiano)
- Luxembourg (English)
- Netherlands (English)
- Norway (English)
- Österreich (Deutsch)
- Portugal (English)
- Sweden (English)
- Switzerland
- United Kingdom(English)
亚太
- Australia (English)
- India (English)
- New Zealand (English)
- 中国
- 日本Japanese (日本語)
- 한국Korean (한국어)