is an image a 2d grid or a cube
1 次查看(过去 30 天)
显示 更早的评论
I am having an argument with my colleague that an image is a 3 dimensional object
He says that in geometric sense an image is a 2d square and not a cube.
But since the image has 3 components height,width and channels i feel that it should be treated as a cube.
The following article from cs231n supports my point
For example, suppose that the input volume has size [32x32x3], (e.g. an RGB CIFAR-10 image). If the receptive field (or the filter size) is 5x5, then each neuron in the Conv Layer will have weights to a [5x5x3] region in the input volume, for a total of 5*5*3 = 75 weights (and +1 bias parameter). Notice that the extent of the connectivity along the depth axis must be 3, since this is the depth of the input volume.
Moreover,in pre deep learning we had techniques for image segmentation such as mean shift,HOG feature descriptor,they all relied on the assumption of an image being a cube.(I am just making that u[)
@Image Analyst would appreciate your take on this.
2 个评论
Rik
2021-9-23
You forgot to actually tag Image analyst, so I edited your question for you.
A color image can generally be considered a 2D object, where each element can have multiple properties. It depends on the specific application whether that would be considered 3D or not. In medical imaging (and probably outside of medical imaging) there exists a concept of 2.5D, which uses a slab of a 3D object.
Many of the tools used in deep learning are derived from very large datasets, which tended to be RGB jpeg images. If your application doesn't use RGB, but grayscale instead, the solution tends to be to duplicate your one channel to 3, or to leave two channels as 0.
I'm not sure this topic is best suited to this forum, since it is only vaguely related to Matlab. I think one of the StackOverflow fora would be a better choice of venue.
采纳的回答
Jan
2021-9-23
This is a question of taste. It depends on what you want to consider as elements of the array.
- An image is a 2D object containing pixels with e.g. 8, 16, 24, or 32 bits. These bits can be represented e.g. as 3 Byte for a 24 bit image in UINT8 format.
- An image can be sees as 3D array if you consider the color value of e.g. RGB channels as position in the color space.
For Matlab and other programming tools, an image is a pile of bytes and some information about the structure without any meaning. The structure for storing the image information is chosen such, that they can be processed efficiently. The question, if the data are 2D or 3D is an artificial interpretation. Both views are valid.
0 个评论
更多回答(1 个)
Image Analyst
2021-9-23
It's a question of semantics. A gray scale image is a 2-D image in that it takes 2 values (x,y) or (column, row) to refer to a single pixel.
A volumetric image, like CT or MRI, is a 3-D volumetric image in that it takes 3 values (x,y,z) or (column, row, slice) to refer to a pixel, both programmatically and intuitively.
A color image is normally thought of (outside of a computer program) as a 2-D "thing" by most people. However each location has 3 or more color values. Three for RGB and more for hyperspectral images. So you can intuitively think of the color image as a single 2-D image, or as a stack of several 2-D images where each 2-D image represents one color channel. Of course, in coding/computer programming if you have the color image as a single variable (instead of separate variables for each color channel) then you'll need 3 indexes to index into the array to specify a single pixel's value for that particular color at that particular location. So in that "coding" sense, a color image is a 3-D image. Well it's at least certainly a 3-D variable regardless of how you want to think about it in a layman's/non-computer sense.
0 个评论
另请参阅
类别
在 Help Center 和 File Exchange 中查找有关 3-D Volumetric Image Processing 的更多信息
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!