For a typical classification problem, where each input image receives a single label, you can use the 'confusion' function to calculate these statistics: http://www.mathworks.com/help/nnet/ref/confusion.html
However, it sounds like you have a special case where each input image produces an output vector of length 701, in which any number of elements can be 0 or 1. My guess is that the above function was not designed for this use, and it may even be that the statistics you mention are not well defined for the type of problem you have described.
I would recommend that you look at the Wikipedia page on Confusion Matrices, which nicely explains the various statistics: https://en.wikipedia.org/wiki/Confusion_matrix
It may be possible to simply convert each matrix into a long vector of 1s and 0s and then calculate precision and recall according to the formulas. However, I'm not fully convinced this would be the correct approach, as it discards a vast amount of semantic meaning and may only provide some ballpark statistics.
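To make the flattening idea concrete, here is a minimal sketch in plain Python (not MATLAB, and not the 'confusion' function). The helper names `flatten` and `precision_recall` are hypothetical, and the toy data uses 4 outputs per image instead of 701. It assumes targets and outputs are already thresholded to 0/1, and it pools every element into one count, so treat the result as a ballpark figure only.

```python
def flatten(matrix):
    """Concatenate the rows of a 0/1 matrix into one long list."""
    return [value for row in matrix for value in row]


def precision_recall(targets, outputs):
    """Pooled precision and recall over all elements of two 0/1 matrices.

    Assumption: both matrices have the same shape and contain only 0s and 1s.
    """
    t = flatten(targets)
    o = flatten(outputs)
    if len(t) != len(o):
        raise ValueError("targets and outputs must have the same size")

    # Count true positives, false positives and false negatives.
    tp = sum(1 for a, b in zip(t, o) if a == 1 and b == 1)
    fp = sum(1 for a, b in zip(t, o) if a == 0 and b == 1)
    fn = sum(1 for a, b in zip(t, o) if a == 1 and b == 0)

    # precision = TP / (TP + FP), recall = TP / (TP + FN);
    # return NaN when the denominator is zero (statistic undefined).
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    return precision, recall


# Toy data: 2 images, 4 outputs each (the real case would have 701).
targets = [[1, 0, 1, 0],
           [0, 1, 1, 0]]
outputs = [[1, 0, 0, 0],
           [0, 1, 1, 1]]

precision, recall = precision_recall(targets, outputs)
print(precision, recall)  # 0.75 0.75
```

In the toy data there are 3 true positives, 1 false positive and 1 false negative, so both statistics come out at 3/4. Because every position is pooled, a label that is always wrong can be hidden by many labels that are always right, which is one way the semantic meaning gets lost.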