Please refer to the documentation link -
My understanding is that you can use it for monochrome images. The input size depends on the number of channels in the image itself: it is [height, width, depth], where depth is the number of channels, so 1 for a monochrome image.
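
To make the [height, width, depth] convention concrete, here is a minimal sketch in Python with NumPy. It is only an illustration, not the API of the library in question: the helper name `input_size_for` and the 28x28 image dimensions are made up for this example, and it assumes the image is stored as an array with the channel axis last.

```python
import numpy as np

def input_size_for(image):
    """Return [height, width, depth] for an image array (hypothetical helper).

    A 2-D array is treated as monochrome, so its depth is 1.
    A 3-D array is assumed to be laid out as (height, width, channels).
    """
    if image.ndim == 2:
        height, width = image.shape
        return [height, width, 1]
    if image.ndim == 3:
        return list(image.shape)
    raise ValueError("expected a 2-D or 3-D image array")

# Example images; the 28x28 size is an arbitrary choice for illustration.
gray = np.zeros((28, 28), dtype=np.uint8)    # monochrome, no channel axis
rgb = np.zeros((28, 28, 3), dtype=np.uint8)  # three colour channels

print(input_size_for(gray))  # [28, 28, 1]
print(input_size_for(rgb))   # [28, 28, 3]
```

If your monochrome images are stored without a channel axis, as `gray` is here, you may need to add one (for example with `gray[:, :, np.newaxis]`) so the data matches the declared input size.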