Okay, it's my turn. Let's try something absolutely ridiculous.
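% reference implementation using triu()
f = @(x)sum(sum(triu(x)));
% run the polygon-mask version on the test arrays and compare against the reference
testme = [halfsum_poly(A) halfsum_poly(B) halfsum_poly(C)];
isgood = isequal(testme,ref)
function summa = halfsum_poly(A)
% y-coordinates of the triangle vertices; the 0.5 offset puts them on pixel edges rather than pixel centers
y = [0 0 1 0]*mxsz + 0.5;
% rasterize the polygon into a logical mask of sz(1) rows by sz(2) columns
mk = poly2mask(x,y,sz(1),sz(2));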
Wait, as ridiculous as that is, it actually works? Of course it works. These are basic image processing tools. It doesn't use triu(), and it even works for non-square inputs. I don't think anyone would expect this silliness to be fast, but it doesn't have the worst complexity out of the examples on this page. For large inputs (1000x1000), it's comparable in speed to many of the examples here.
Would your TA accept this if you turned it in for your homework? Wanna gamble?
Are there more sensible ways to solve the problem using logical indexing? Yes.
Are there ways which might seem wasteful, but might have advantages? Yes.
Are there ways to do it with loops that are both concise and fast? Yes.
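For what it's worth, here is a minimal sketch of what the logical-indexing and loop approaches could look like. This is only one way to write each of them, not necessarily the best one. The test array and the variable names (nr, nc, mask, s_mask, s_loop, s_ref) are made up for illustration, and the mask line assumes implicit expansion (R2016b or newer).

```matlab
% Hypothetical test input, non-square on purpose (not from the thread)
A = reshape(1:12, 3, 4);
[nr, nc] = size(A);

% Logical indexing: keep elements whose column index is >= their row index
mask = (1:nc) >= (1:nr).';   % nr-by-nc logical array via implicit expansion
s_mask = sum(A(mask));

% Loop: for each row, add the part that lies on or to the right of the diagonal
s_loop = 0;
for k = 1:min(nr, nc)
    s_loop = s_loop + sum(A(k, k:end));
end

% Compare both against the triu()-based reference
s_ref = sum(sum(triu(A)));
disp(isequal(s_mask, s_loop, s_ref))
```

Neither version calls triu() to get its result, and both handle non-square inputs. The loop never builds a full-size mask, which may matter for large arrays.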
What's the lesson here? If you're going to post an answer among many others, try to post something that adds to the information present. Test it to make sure it works (you can run it right here in the editor). Describe what your code does (comments and otherwise). Does your answer provide particular benefits? Does it have relative drawbacks?