Binary GA returns floating point numbers

Hello,
I am trying to find the optimal wavelengths in NIR spectra for PLS regression. My code works, but the solution sometimes contains floating-point numbers. How can I tell ga that 0 and 1 are the only allowed gene values?
Or can I simply treat any nonzero value as true?
% Set seed for reproducibility
rng(42);
% Download data and define arrays
url = 'https://figshare.com/ndownloader/files/1649903';
filename = 'data.xlsx';
data = readmatrix(websave(filename, url));
X = data(2:end,9:end);
y = data(2:end,1);
% Average blocks of 4 adjacent wavelengths (reshape is column-major, so group columns along dim 2)
Xavg = reshape(mean(reshape(X, size(X,1), 4, []), 2), size(X,1), []);
% Define the fitness function. ga minimizes, so return the RMSE itself rather than its reciprocal
fitness_function = @(solution) sqrt(mean((y - regressor(Xavg(:,logical(solution)), y)).^2));
% Define the initial population
init_pop = generate_initial_population(size(Xavg,2), 50, 25);
% Define the GA instance
options = optimoptions('ga', 'PopulationSize', 50, 'InitialPopulationMatrix', init_pop, ...
    'MutationFcn', {@mutationadaptfeasible, 0.3}, 'CrossoverFcn', @crossoverscattered, ...
    'EliteCount', 10, 'MaxGenerations', 100, 'UseParallel', true, 'Display', 'iter');
% Run GA
[solution, fval] = ga(fitness_function, size(Xavg,2), [], [], [], [], zeros(size(Xavg,2),1), ones(size(Xavg,2),1), [], options);
function y_pred = regressor(X, y)
    % Candidate numbers of PLS components
    parameters_gs = 1:6;
    best_mse = inf;
    best_n_components = 0;
    for n_components = parameters_gs
        % Fit a PLS model with n_components components
        [~,~,~,~,beta] = plsregress(X, y, n_components);
        % Predict on the training data (beta includes the intercept term)
        y_pred = [ones(size(X,1),1) X] * beta;
        % Keep the component count with the lowest training MSE
        mse = mean((y - y_pred).^2);
        if mse < best_mse
            best_mse = mse;
            best_n_components = n_components;
        end
    end
    % Refit with the best number of components and return its predictions
    [~,~,~,~,beta] = plsregress(X, y, best_n_components);
    y_pred = [ones(size(X,1),1) X] * beta;
end
function init_population = generate_initial_population(array_size, solutions_per_pop, number_of_bands)
    % Start with a logical array of zeros
    init_population = false(solutions_per_pop, array_size);
    % Define an index array the size of the spectral wavelengths
    index_array = 1:array_size;
    for i = 1:solutions_per_pop
        % Randomly shuffle the index array
        index_array = index_array(randperm(length(index_array)));
        % Select the first number_of_bands shuffled indices and flip those entries of the population array
        init_population(i, index_array(1:number_of_bands)) = ~init_population(i, index_array(1:number_of_bands));
    end
    init_population = double(init_population);
end
Thanks for helping
F

Accepted Answer

Walter Roberson on 19 Feb 2024
% Set seed for reproducibility
rng(42);
% Load data and define arrays
data = readmatrix('Data/File_S1.xlsx');
Error using readmatrix
Unable to find or open 'Data/File_S1.xlsx'. Check the path and filename or file permissions.
X = data(2:end,9:end);
y = data(2:end,1);
% Average blocks of 4 adjacent wavelengths (reshape is column-major, so group columns along dim 2)
Xavg = reshape(mean(reshape(X, size(X,1), 4, []), 2), size(X,1), []);
% Define the fitness function. ga minimizes, so return the RMSE itself rather than its reciprocal
fitness_function = @(solution) sqrt(mean((y - regressor(Xavg(:,logical(solution)), y)).^2));
% Define the initial population
init_pop = generate_initial_population(size(Xavg,2), 50, 25);
% Define the GA instance
options = optimoptions('ga', 'PopulationSize', 50, 'InitialPopulationMatrix', init_pop, ...
    'MutationFcn', {@mutationadaptfeasible, 0.3}, 'CrossoverFcn', @crossoverscattered, ...
    'EliteCount', 10, 'MaxGenerations', 100, 'UseParallel', true, 'Display', 'iter', ...
    'PopulationType', 'bitstring');
% Run GA
[solution, fval] = ga(fitness_function, size(Xavg,2), [], [], [], [], zeros(size(Xavg,2),1), ones(size(Xavg,2),1), [], options);
function y_pred = regressor(X, y)
    % Candidate numbers of PLS components
    parameters_gs = 1:6;
    best_mse = inf;
    best_n_components = 0;
    for n_components = parameters_gs
        % Fit a PLS model with n_components components
        [~,~,~,~,beta] = plsregress(X, y, n_components);
        % Predict on the training data (beta includes the intercept term)
        y_pred = [ones(size(X,1),1) X] * beta;
        % Keep the component count with the lowest training MSE
        mse = mean((y - y_pred).^2);
        if mse < best_mse
            best_mse = mse;
            best_n_components = n_components;
        end
    end
    % Refit with the best number of components and return its predictions
    [~,~,~,~,beta] = plsregress(X, y, best_n_components);
    y_pred = [ones(size(X,1),1) X] * beta;
end
function init_population = generate_initial_population(array_size, solutions_per_pop, number_of_bands)
    % Start with a logical array of zeros
    init_population = false(solutions_per_pop, array_size);
    % Define an index array the size of the spectral wavelengths
    index_array = 1:array_size;
    for i = 1:solutions_per_pop
        % Randomly shuffle the index array
        index_array = index_array(randperm(length(index_array)));
        % Select the first number_of_bands shuffled indices and flip those entries of the population array
        init_population(i, index_array(1:number_of_bands)) = ~init_population(i, index_array(1:number_of_bands));
    end
    init_population = double(init_population);
end
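
Compared with the code in the question, the options in this answer add 'PopulationType', 'bitstring', which makes ga work on genes that are only 0 or 1. As far as I understand the ga documentation, bitstring populations ignore bounds and other constraints and are meant to be paired with uniform mutation (or a custom function) rather than mutationadaptfeasible, so it may be worth switching the mutation function as well. Below is a minimal sketch on a toy problem, not the NIR data: the hidden target mask, the cost function, and the 0.05 mutation rate are assumptions made up for illustration. It needs the Global Optimization Toolbox.

```matlab
% Toy problem (hypothetical): recover a hidden 0/1 mask of length nvars.
rng(42);
nvars = 12;
target = double(rand(1, nvars) > 0.5);   % made-up "best" band mask, for illustration only

% ga minimizes, so return a cost: the number of genes that differ from the target.
cost = @(bits) sum(bits ~= target);

options = optimoptions('ga', ...
    'PopulationType', 'bitstring', ...           % restrict genes to 0 and 1
    'PopulationSize', 50, ...
    'MutationFcn', {@mutationuniform, 0.05}, ... % assumed rate; uniform mutation is the documented choice for bitstring
    'CrossoverFcn', @crossoverscattered, ...
    'MaxGenerations', 100, ...
    'Display', 'off');

% Bounds and constraints are left empty here because they are not used for bitstring populations.
[solution, fval] = ga(cost, nvars, [], [], [], [], [], [], [], options);

% Check that every gene is exactly 0 or 1 before using the result as a mask.
isBinary = all(solution == 0 | solution == 1);
disp(isBinary);
selected = logical(solution);
```

If every gene is exactly 0 or 1, logical(solution) can be used directly as a column mask, as in Xavg(:,logical(solution)). Without the bitstring setting, logical() still treats any nonzero gene as true, so a fractional value such as 0.3 would select its column too.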

Release

R2023b
