Segmenting a text document

Sam Terry 2013-5-6
I'm working on segmenting characters in a document written in Kannada (a South Indian language). The problem is that each segmented character is overwritten at the same location, so I can access only the last segmented character. The characters also appear to be segmented in a random order. Could anyone please help me segment the document properly? The code I am working on is below (take any Kannada text document as the input image).
%% Image segmentation and extraction
%% Read image
imagen = imread('kann.jpg');
%% Show image
figure(1)
imshow(imagen);
title('INPUT IMAGE WITH NOISE')
%% Convert to gray scale
if size(imagen,3) == 3 % RGB image
    imagen = rgb2gray(imagen);
end
%% Convert to binary image (text becomes true, background false)
threshold = graythresh(imagen);
imagen = ~im2bw(imagen, threshold);
%% Remove all objects containing fewer than 30 pixels
imagen = bwareaopen(imagen, 30);
pause(1)
%% Show binary image
figure(2)
imshow(~imagen);
title('INPUT IMAGE WITHOUT NOISE')
%% Label connected components
[L, Ne] = bwlabel(imagen);
%% Measure properties of image regions
propied = regionprops(L, 'BoundingBox');
hold on
%% Plot bounding boxes
for n = 1:size(propied,1)
    rectangle('Position', propied(n).BoundingBox, 'EdgeColor', 'g', 'LineWidth', 2)
end
hold off
pause(1)
%% Objects extraction
figure
for n = 1:Ne
    [r, c] = find(L == n);
    n1 = imagen(min(r):max(r), min(c):max(c));
    imshow(~n1);
    pause(0.5)
end
[EDITED, Jan, Code formatted - please do this by your own - Thanks]
1 Comment
Jan 2013-5-6
Do you mean that "take any Kannada text document as the input image" is an easy job for the readers of this forum? It is not for me.
This seems contradictory: you say you can access only the last character, and also that the characters are segmented in random order. How can you know the order when you get only one?
What exactly does "random" order mean? Does it change from run to run?


Answers (1)

Image Analyst 2013-5-6
Edited: Image Analyst 2013-5-6
I didn't examine the code in detail but it looks like you're overwriting n1 each time. Did you want to index that so you save all values of it?
Characters get segmented in a top to bottom order, then move over to the next column looking for the next "true" pixel, and so on across all columns. It is not random.
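If the goal is left-to-right, line-by-line order rather than the labeling order described above, one option is to re-sort the components by their bounding boxes. The following is only a minimal sketch on a synthetic image, not a tested solution for real Kannada scans: `lineTol` is a hypothetical tolerance chosen for this example, and a single Kannada glyph may consist of several disconnected components, which this heuristic does not merge.

```matlab
% Synthetic binary image: two text "lines", each with two blobs.
bw = false(20, 30);
bw(2:5,   20:24) = true;   % top line, right blob
bw(2:5,   3:7)   = true;   % top line, left blob
bw(12:16, 3:7)   = true;   % bottom line, left blob
bw(12:16, 20:24) = true;   % bottom line, right blob

[L, Ne] = bwlabel(bw);
stats = regionprops(L, 'BoundingBox');
boxes = vertcat(stats.BoundingBox);   % Ne-by-4 matrix: [x y width height]

% Assumption: blobs whose top edges lie within lineTol pixels of the first
% blob of a line belong to that line. lineTol is a made-up tuning value.
lineTol = 5;
[~, byTop] = sort(boxes(:, 2));       % visit blobs from top to bottom
lineId = zeros(Ne, 1);
currentLine = 1;
lineTop = boxes(byTop(1), 2);
for k = 1:Ne
    idx = byTop(k);
    if boxes(idx, 2) - lineTop > lineTol
        currentLine = currentLine + 1;   % start a new text line
        lineTop = boxes(idx, 2);
    end
    lineId(idx) = currentLine;
end

% Sort by line number first, then by left edge within each line.
[~, readingOrder] = sortrows([lineId, boxes(:, 1)]);
sortedBoxes = boxes(readingOrder, :);
```

Here `readingOrder(k)` is the label of the k-th component in reading order, so `L == readingOrder(k)` selects that component in the labeled image.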
2 Comments
Sam Terry 2013-5-7
Yes, I would like to index it so that all of the segmented characters are saved. Could you please help me with that?
Image Analyst 2013-5-7
n1{n} = imagen(..........
but you already have the labeled image so they are all saved already. I don't know what imagen() does.
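As a minimal sketch of the indexing idea, assuming `imagen` in the question is the binary image variable (a small synthetic image `bw` stands in for it here), each crop can be stored in its own cell instead of being overwritten. The name `chars` is made up for this example.

```matlab
% Synthetic binary image standing in for the thresholded document.
bw = false(10, 20);
bw(2:6, 2:5)   = true;
bw(3:8, 10:16) = true;

[L, Ne] = bwlabel(bw);

% One cell per connected component, so no crop overwrites another.
chars = cell(1, Ne);
for n = 1:Ne
    [r, c] = find(L == n);
    chars{n} = bw(min(r):max(r), min(c):max(c));
end

% regionprops can return a crop of each component directly. Its 'Image'
% field contains only the pixels of that component, whereas the crop above
% also includes any neighbouring pixels that fall inside the bounding box.
stats = regionprops(L, 'Image');
```

After the loop, `imshow(~chars{k})` displays the k-th stored character, which is what the original loop showed only transiently.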
