Is an image a 2D grid or a cube?

Question

sparsh garg 2021-9-23

Direct link to this question: https://ww2.mathworks.cn/matlabcentral/answers/1459224-is-an-image-a-2d-grid-or-a-cube

Answered: Image Analyst 2021-9-23

Accepted Answer: Jan

I am having an argument with my colleague about whether an image is a three-dimensional object.

He says that, in a geometric sense, an image is a 2D square and not a cube.

But since an image has three components (height, width, and channels), I feel that it should be treated as a cube.

The following passage from cs231n supports my point:

For example, suppose that the input volume has size [32x32x3], (e.g. an RGB CIFAR-10 image). If the receptive field (or the filter size) is 5x5, then each neuron in the Conv Layer will have weights to a [5x5x3] region in the input volume, for a total of 5*5*3 = 75 weights (and +1 bias parameter). Notice that the extent of the connectivity along the depth axis must be 3, since this is the depth of the input volume.
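To make the arithmetic in the quoted passage concrete, here is a minimal Python sketch. The function name `conv_neuron_params` is made up for illustration and is not part of cs231n or any library; it only multiplies out the filter extent and the input depth as the passage describes.

```python
def conv_neuron_params(filter_h, filter_w, input_depth):
    """Return (weights, biases) for a single neuron in a conv layer.

    The filter covers filter_h x filter_w spatially and always spans
    the full depth of the input volume, as in the quoted passage.
    """
    weights = filter_h * filter_w * input_depth
    biases = 1  # one bias parameter per neuron
    return weights, biases


# 5x5 receptive field on a 32x32x3 input (e.g. an RGB CIFAR-10 image)
rgb_weights, rgb_biases = conv_neuron_params(5, 5, 3)

# Same filter on a single-channel (grayscale) input, for comparison
gray_weights, gray_biases = conv_neuron_params(5, 5, 1)
```

With a depth of 3 this gives the 75 weights plus 1 bias from the quote; with a depth of 1 the same 5x5 filter needs only 25 weights.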

Moreover, before deep learning we had techniques such as mean shift for image segmentation and the HOG feature descriptor, and they all relied on the assumption of an image being a cube. (I am just making that up.)

@Image Analyst, I would appreciate your take on this.

2 Comments

Rik 2021-9-23

You forgot to actually tag Image Analyst, so I edited your question for you.

A color image can generally be considered a 2D object, where each element can have multiple properties. It depends on the specific application whether that would be considered 3D or not. In medical imaging (and probably outside of medical imaging) there exists a concept of 2.5D, which uses a slab of a 3D object.

Many of the tools used in deep learning were developed on very large datasets, which tended to consist of RGB JPEG images. If your application uses grayscale instead of RGB, the usual solution is to duplicate your one channel to three, or to leave two channels as 0.
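The two workarounds for grayscale input could look roughly like this. It is only a sketch in plain Python with nested lists; the helper names `replicate_channels` and `zero_pad_channels` and the 2x2 sample image are invented for the example, and a real pipeline would more likely use an array library.

```python
def replicate_channels(gray, n=3):
    """Copy each gray value into n identical channels."""
    return [[[value] * n for value in row] for row in gray]


def zero_pad_channels(gray, n=3):
    """Keep the gray value in channel 0 and fill the other channels with 0."""
    return [[[value] + [0] * (n - 1) for value in row] for row in gray]


# Hypothetical 2x2 grayscale image, indexed as gray[row][col]
gray = [[10, 20],
        [30, 40]]

replicated = replicate_channels(gray)  # indexed as [row][col][channel]
padded = zero_pad_channels(gray)
```

Either way the result has the height x width x 3 layout that a network built for RGB input expects, even though it carries no more information than the original 2D grid.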

I'm not sure this topic is best suited to this forum, since it is only vaguely related to MATLAB. I think one of the Stack Overflow fora would be a better choice of venue.

sparsh garg 2021-9-23

Ice comets will fall in the middle of the Sahara before I post a question on Stack Overflow. At least you and the people at other forums are open-minded enough to accept questions. There, a question is accepted only if the moderator likes it; otherwise: "This question is closed/deleted as it's not related."


Answer 1

Jan 2021-9-23

Direct link to this answer: https://ww2.mathworks.cn/matlabcentral/answers/1459224-is-an-image-a-2d-grid-or-a-cube#answer_793589

This is a question of taste. It depends on what you want to consider as elements of the array.

An image is a 2D object containing pixels with, e.g., 8, 16, 24, or 32 bits. These bits can be represented, e.g., as 3 bytes per pixel for a 24-bit image in UINT8 format.
An image can be seen as a 3D array if you consider the color values of, e.g., the RGB channels as a position in the color space.

For MATLAB and other programming tools, an image is a pile of bytes, plus some information about the structure, without any meaning. The structure for storing the image information is chosen such that it can be processed efficiently. The question of whether the data are 2D or 3D is an artificial interpretation. Both views are valid.
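As a small illustration of "a pile of bytes and some information about the structure", the following Python sketch reads one and the same buffer in both ways. The 2x2 size, the sample values, and the row-major interleaved R, G, B layout are assumptions made for the example; MATLAB arranges image data in memory differently.

```python
HEIGHT, WIDTH, CHANNELS = 2, 2, 3

# 12 bytes: four pixels, each stored as R, G, B (assumed layout)
buf = bytes([255, 0, 0,     0, 255, 0,
             0, 0, 255,     255, 255, 255])

# View 1: a 3D array of bytes, indexed by (row, column, channel)
cube = memoryview(buf).cast("B", shape=[HEIGHT, WIDTH, CHANNELS])


# View 2: a 2D grid in which every element is one 24-bit pixel value
def pixel24(row, col):
    """Pack the three bytes of pixel (row, col) into a single integer."""
    start = (row * WIDTH + col) * CHANNELS
    return int.from_bytes(buf[start:start + CHANNELS], "big")


grid = [[pixel24(r, c) for c in range(WIDTH)] for r in range(HEIGHT)]

green_of_second_pixel = cube[0, 1, 1]  # three indexes, one byte
second_pixel = grid[0][1]              # two indexes, one 24-bit value
```

Nothing in `buf` changes between the two views; only the structure information attached to it does.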


Answer 2

Image Analyst 2021-9-23

Direct link to this answer: https://ww2.mathworks.cn/matlabcentral/answers/1459224-is-an-image-a-2d-grid-or-a-cube#answer_793639

It's a question of semantics. A grayscale image is a 2-D image in that it takes 2 values, (x,y) or (column, row), to refer to a single pixel.

A volumetric image, like CT or MRI, is a 3-D volumetric image in that it takes 3 values (x,y,z) or (column, row, slice) to refer to a pixel, both programmatically and intuitively.

A color image is normally thought of (outside of a computer program) as a 2-D "thing" by most people. However, each location has 3 or more color values: three for RGB and more for hyperspectral images. So you can intuitively think of the color image as a single 2-D image, or as a stack of several 2-D images where each 2-D image represents one color channel. Of course, in coding/computer programming, if you have the color image as a single variable (instead of separate variables for each color channel), then you'll need 3 indexes to index into the array to specify a single pixel's value for that particular color at that particular location. So in that "coding" sense, a color image is a 3-D image. Well, it's at least certainly a 3-D variable, regardless of how you want to think about it in a layman's/non-computer sense.
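The index counting described above can be sketched in plain Python with nested lists. The helpers `shape` and `channel_plane` and the tiny sample images are made up for the illustration and do not correspond to any MATLAB function.

```python
def shape(nested):
    """Return the dimensions of a regular nested list as a tuple."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


def channel_plane(image, channel):
    """Pull one color channel out of a color image as a 2-D image."""
    return [[pixel[channel] for pixel in row] for row in image]


# Hypothetical 2x2 grayscale image: two indexes reach one pixel value
gray = [[0, 128],
        [64, 255]]

# Hypothetical 2x2 RGB image: three indexes reach one color value
color = [[[255, 0, 0], [0, 255, 0]],
         [[0, 0, 255], [255, 255, 255]]]

gray_value = gray[1][0]        # (row, column)
blue_value = color[1][0][2]    # (row, column, channel)
red_plane = channel_plane(color, 0)
```

Here `color` is a 3-D variable, while each plane returned by `channel_plane` is again an ordinary 2-D image, which matches the "stack of several 2-D images" reading.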
