GA vs patternsearch for Integer problem with equality constraints

I am attempting to solve a simple problem using GA.
My problem has 8 decision variables, all integers, with lb of zeros and ub of TWOS.
I need to ensure that the sum of the variables is always equal to 2.
Therefore I used ga with the following set up:
[x,fval,exitflag,output] = ga(@(x)ObjFun,nvars,A,b,[],[],lb,ub,[],IntCon,opts)
I am writing A and b so that an equality condition is enforced (A(:,2)=-A(:,1) and b(2)=-b(1))
I am solving this simple problem to test the convergence and reliability of ga in a simplified version of my actual problem, before jumping to the real monster.
This problem has 36 possible combinations results. They are:
Comb=[00000002,00000011,00000020,00000101,00000110,00000200,00001001,00001010,00001100,00002000,00010001,00010010,00010100,00011000,00020000,00100001,00100010,00100100,00101000,00110000,00200000,01000001,01000010,01000100,01001000,01010000,01100000,02000000,10000001,10000010,10000100,10001000,10010000,10100000,11000000,20000000]
I solved them all outside of GA and all the possible results (objective function results) are:
R=[97.23,86.40,109.57,93.80,94.21,97.67,89.55,89.75,85.00,95.27,93.71,94.11,94.65,90.41,97.55,88.71,88.81,89.54,85.25,84.12,93.78,93.26,93.61,94.14,89.88,94.07,88.99,97.17,89.43,89.60,90.22,85.84,90.17,85.09,84.29,93.70]
as a student working on research I am trying to get a stable, reliable and fast optimization code to start with those smaller versions of my problem and then build into the big one.
using GA, with all default options, I get no convergence with an exitflag of -2.
however, if I increase the population size (default is 10*8) to 100, 200... I start getting convergence of an exiitflag of 1. HOWEVER, that depends on how I set the RNG.
using 400 population works for all RNG I have tried. but this means that for my actual problem, that will be a lot more complex, I will need such a large population that the problem will take forever to solve. That's my intuition.
so my questions now are:
1) are there other GA options I should try to have a fast and reliable code, given this example and al these conditions?
2) is it possible to use patternsearch enforcing integer constraints for This problem that I am solving?
however, even setting those options, matlab goes to non-integer decision variables in the second interation.)
Thanks for anyone that took the time to read all this,
Conrado

1 个评论

Note that my objective function has no repeated result.
There is only 1 optimal solution.
I am looking for the minimum one.

请先登录,再进行评论。

回答(1 个)

An exitflag of -2 means that GA never even found a feasible starting point for your problem. A larger population size lets it randomly SOMETIMES find a feasible start point.
If you are looking for something that is seriously consistent, and will run quickly, then GA is going to be problematic, at least as you want to use it. However, I think you have parameterized the problem completely incorrectly.
Effectively, the problem is you are trying to search over a huge search space, with a constraint that seriously limits the locus of feasible points.
Your statement is the constraint is the sum of these variables MUST be exactly 2, in a 0-1 binary IP.
All that means is you can represent the problem with EXACTLY TWO integer variables. Represent the set of ALL solutions as an integer of the form:
X = 2^n1 + 2^n2
where n1 > n2. Period. That is really all you need to do.
Now your problem has NO equality constraints on the sum of X. NONE. As well, it becomes a simpler problem, because the search space is now only a two variable integer space, with the limits on n1 being 1 to nmax, and n2 is constrained to run between 0 and n1-1.
Now, don't do this the wrong way. DO NOT form X as the sum of those two powers of 2, because if the problem is at all large, then you will see overflows. That would happen if nmax is greater than 52.
But all you need to do is search over the space I have described. n1 and n2 define the locations of the non-zero elements. DONE. And not only that, but TRIVIALLY DONE. The result will be efficient as hell, because you never need to worry about finding a feasible point. EVERY point is now feasible point. And a 2-dimensional search space is an effectively trivial space to search over.
Now, suppose you decide to make the problem more difficult? Suppose you decide that the sum was actually going to be 3, or 5, and there are hundreds of variables in the binary integer IP? Still trivial to solve. Still relatively efficient as hell. Suppose we have 5 variables? Just solve this as a problem where you will optimize over a search space that is again of the form
X = 2^n1 + 2^n2 + 2^n3 + 2^n4 + 2^n5
Again, don't actually form that big number. All n1,n2,n3,n4,n5 are now is the locations of where the corresponding binary bit appears in the binary representation of X. Now you will have the constraints
nmax >= n1 > n2 > n3 > n4 > n5 >= 0
Again, this means you would just have an integer search space. Still simple to solve. Massively efficient compared to the kludge of an approach you were trying to use.

12 个评论

wait, I dont understand why you said that the exitflag of -2 means that ga didnt find a feasible starting point.
I have binary decision variables and the only constraints I have is to ensure that their sum must be 2.
I monitor GA and it does start and tries only candidate solution that satisfies the constraints but in the end it diverges to one that doesnt satisfy the constraints and stops.
This is a simplified version of a larger problem.
I am writing it with binary decision variables to simplify it.
The actual problem is integer, but not binary :(
so changing the way the problem is formulated may work in this simplified version but not the actual version later on unfortunately
Ok, I CORRECTED MY QUESTION, I had put that the ub matrix was of ONES but it is actually of TWOS
I still need clarification here. I am not sure I understand your proposed solution:
currently I have 8 decision variables x(1)..x(8) and their limits are lb=[0..0] and ub=[2..2]
and the constraints:
x(1)+..+x(8) ≤ 2
x(1)+..+x(8) ≥ 2
that is how my problem is formulated. but keep in mind that in the final version of this code my ub will not be TWOS, and the constraints will not be sum to 2. but these numbers are always EQUAL.
I honestly dont understand how exactly you are telling me to rewrite it, although I can see it should be possible.
I belieive that what John is suggesting is to consider yourself as having two particles that can each take a value from 1 to 8. There are only 8*8 = 64 such values (and John cuts this in half because the particles are indistinguishable, but I will act as if they are distinguishable for clarity of argument). If particle 1 is at 6 and particle 2 is at 8, then you have the binary value
0 0 0 0 0 1 0 1
If particle 1 and particle 2 are both at 3 then you have the value
0 0 2 0 0 0 0 0
There is no need to optimize using a slow, balky ga solver. Instead, just do exhaustive search over the 64 possible values. Or speed it up more by using about half that number of values.
Alan Weiss
MATLAB mathematical toolbox documentation
I agree, this small problem shouldnt be solved the way i did, but I did it to test for when I have a problem with over 100 options (instead of only 0,1 and 2 as options) and when it doesnt converge, I will need to understand why.
I understand I could have 'flipped' the way i wrote the original problem in this simple case.
but as in my previous answer, I will need to increase the problem dimensions.
so now I have 8 decision variables with 3 options for each
but later i will have 8 decision variables with 100 options for each
thats wy I am reluctant to go with that 'flip the problem' solution.
Finally, we get a hint as to the real problem size. However, you still don't understand though, that the way you are trying to constrain the problem using a sum equality constraint makes the problem a huge one, but with effectively very tight constraints. Then it becomes essentially impossible for GA to find feasible points at all. It spends most of its time just looking around to locate the feasible solutions. Most of the test points will not in fact be even feasible.
My point is, while your approach SEEMS mathematically elegant, it is in fact an impractical, inefficient method to define the problem, because this forces GA to search for integer solutions to a linear equality constraint. The net result is GA becomes very slow to run, even if it manages to do anything at all. Mathematical elegance and programming efficiency are not always the same thign. In fact, it is often the case where they are antithetical.
So in the case of your real problem, you say you have 8 decision variables, with 100 options for each? Does that mean your real problem is of the form where exactly two of them must be non-zero, but they can take on any of 100 values each? That is not even a very large problem. What is the search space you really need to search over? Then we can show you how to encode the problem in a more efficient way.
I have no problem with changing my approach
i dont understand how yours would be implemented
I would really appreaciate if you could write the problem the way you formulated in the similar way I did mine:
"I have 8 decision variables x(1)..x(8) and their limits are lb=[0..0] and ub=[2..2]
and the constraints:
x(1)+..+x(8) ≤ 2
x(1)+..+x(8) ≥ 2"
I cant fully understand the implementation of your idea.
thanks,
and the hint for the problem size that you were relieved to finally get maybe not be accuarate. I cited hundred but realisticaly it will be as large as I can make it work. dozens? hundreds? thousands? I dont want to predefine that now while the most simple is still giving GA a hard time...
Again, you are trying to be mathematically elegant, while avoiding the questions I have asked.
x(1)+..+x(8) ≤ 2
x(1)+..+x(8) ≥ 2"
This is just a cute way of saying you want the sum to be 2. That is a just a tricky way to write an equaliity constraint.
Regardless, what is the search space? You show your example has possible feasible points of:
00000002, 00000011, etc.
If the constraint were the sum must be 3, does that mean
00000111, 00000003, 00000012
are all possibilities?
Do you ALWAYS have exactly 8 decision variables? I am asking this because your real problem is clearly different from the one you are posing, and I need to understand that to help you to find a better way to encode the solution space in a way that does not require the sum constraint.
==============================================================
Since you seem so determined to have me solve the problem you don't really have, I'll do so now. In the case of this simple problem, you might formulate the objective function as:
function result = obj(N)
X = zeros(1,8);
X(N(1)) = X(N(1)) + 1;
X(N(2)) = X(N(2)) + 1;
% compute the objective, as a function of the vector X.
stuff...
end
Since you give no clue as to your objective function, that should be sufficient. You will need to fill in the part about stuff, sicne you never how us what your objective is. In the above encoding scheme, N is an INTEGER vector of length 2. Both elements are constrained to be integer, and the bounds are given as
LB = [1 1];
UB = [8 8];
It is as easy as that. As you can see, there will NEVER be a need in this for inequality constraints to masquerade as an equality constraint. There is not any need for an equality constraint at all, because that is built implicitly into how I have formulated the search space.
Now, if you really want help, can you please answer my questions above???
I missed the questions you had asked.
here are the answers to all of them:
1) I have fixed 8 decision variables
2) We can safely assume I have 100 options for each.
3) This means that each one of them can assume a value of 0 to 100, however the sum of all of them must be 100.
indeed, nothing huge, yet using my "mathematically elegant" approach GA is having a hard time with only 2 options. I dont understand why.
sorry for missing your questions, I hope now you are clear about the problem and can forget that "have me solve the problem you don't really have" statement.
my objective function is getting the x values to call an analysis software and compute the results, there is no closed form solution that can be written in an equation.
Unless I missed something in the proposed formulation, the problem I have with using x(1) and x(2) with lb=[1 1 ] and ub=[8 8] is that this means that I can have x=[1 2] as well as x=[2 1] and they both yield the same result. Is that why you add the constraints: nmax >= n1 > n2 > n3 > n4 > n5 >= 0 ?
Ok yes it is. I wll try it now

请先登录,再进行评论。

类别

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by