How can I read from a file into a char array?

I have a large text file, and I need to calculate the number of times each individual letter occurs in the file. The easiest way I can think of to do that would be to have an array where each entry is a single char from the file, then run an array function on the whole thing and sum the number of times each letter is found. However, I am having trouble getting the text from the file into a char array. I have tried using fileread, which reads the entire file to a single entry in a string array, and I have tried using textscan, which reads the file into a cell array split up by words. Does anyone know if I can just get the file straight into a char array?
2 Comments
John 2014-9-28
Edited: John 2014-9-28
When you use fileread to read the text in a file you actually get a char array.
Let's say testfile.txt contains the text:
this is a test file
If you use fileread like this:
fileContents = fileread('testfile.txt')
fileContents will be a char array with the individual characters. Check that that is so with:
class(fileContents) %Should echo 'char'
isvector(fileContents) %Checks if fileContents is a vector, should return 1/true
The overall problem seems like a college homework assignment :-) so I will refrain from providing a solution. There are a couple of ways to keep a count of each character in the char array. One way would be to iterate through the char array and keep count of the characters you encounter in a Map container, where the keys are the individual characters and the values are the number of times each unique character occurs in the char array.
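The Map-based approach described above could look roughly like the following. This is only a minimal sketch: the sample string stands in for the output of fileread, and the variable names (str, counts, ch) are made up for illustration.

```matlab
% Sketch: count characters with a containers.Map.
% 'str' is an illustrative stand-in for fileread('testfile.txt').
str = 'this is a test file';

% Keys are single characters, values are running counts.
counts = containers.Map('KeyType', 'char', 'ValueType', 'double');

for k = 1:numel(str)
    ch = str(k);
    if isKey(counts, ch)
        counts(ch) = counts(ch) + 1;   % seen before: increment
    else
        counts(ch) = 1;                % first occurrence
    end
end

% Look up the count for a particular character, e.g. counts('t')
```

A Map is convenient when you want to look counts up by character, but the explicit loop will likely be slower on a large file than the vectorized approaches in the answers.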
Also, the unique function provides a pathway to another solution.
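One way the unique route might be sketched is to pair it with accumarray. The pairing is an assumption about what was meant here, and the sample string and variable names are again only illustrative.

```matlab
% Sketch: count characters using unique + accumarray.
% 'str' is an illustrative stand-in for fileread('testfile.txt').
str = 'this is a test file';

% chars: sorted list of distinct characters
% idx:   for each element of str, its position in chars
[chars, ~, idx] = unique(str);

% Sum a 1 for every occurrence of each index; idx(:) forces a column.
counts = accumarray(idx(:), 1).';

% chars(k) occurs counts(k) times in str.
```

This version only reports characters that actually appear in the text, so there are no empty bins to filter out afterwards.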
Zachary 2014-9-29
I rechecked my code, and I was completely mistaken: fileread does in fact give me an array of chars. I had tried vectorizing my code, which still seems kind of like magic to me, and I guess I was accessing my data incorrectly. Thanks!


Accepted Answer

per isakson 2014-9-28
Edited: per isakson 2014-9-29
Try
str = fileread( filespec );
num = double( str );
nch = histc( num, [1:255] ); % fix [32:255]
A little test - added later
>> char( find( histc( double('abcd1234'), [1:255] ) ) )
ans =
1234abcd
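Since the question asks about letters specifically, the same histc idea can be narrowed to the 26 letters. The sketch below assumes case should be ignored and that only plain a-z letters matter; the sample string and the names letters and nLetters are illustrative.

```matlab
% Sketch: case-insensitive counts for the letters a-z only.
% 'str' is an illustrative stand-in for fileread( filespec ).
str = 'Hello, World';

letters = 'a':'z';

% Lower-case the text, then bin the character codes on the codes of a..z.
% histc ignores values outside the edge range, so punctuation, digits
% and whitespace are simply not counted.
nLetters = histc( double( lower(str) ), double(letters) );

% nLetters(k) is the number of occurrences of letters(k).
```

Newer MATLAB releases recommend histcounts over histc. Its bin-edge conventions differ from histc, so the edges would need adjusting if you switch.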

More Answers (1)

Geoff Hayes 2014-9-28
Zachary - I think that you are on the right path using fileread. If I follow the fileread example,
io_contents = ...
fullfile(matlabroot,'toolbox','matlab','iofun','Contents.m');
filetext = fileread(io_contents);
Note that filetext is a 1x4244 array of char elements. So you can either loop over each element and update your "counting" array, or try something else. Remember that each character has an ASCII code, so we could use that to our advantage. If we convert the character array into a numeric array, we could then use a histogram function (for example histc) to determine the counts for each character:
charBinCounts = histc(double(uint8(filetext)),0:1:127);
So we take the 1x4244 character array filetext, convert it to 8-bit unsigned integers, and then convert to double (I needed to do both conversions because of histc). We then pass this numeric array to the histc function with the bins given by 0,1,2,...,126,127, since standard ASCII codes run from 0 through 127. (Unsigned 8-bit integers themselves range from 0 to 255; any character with a code above 127 falls outside these bins and is not counted.)
charBinCounts contains the counts for each character.
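Because the bins start at code 0 while MATLAB indexing starts at 1, the count for a given character sits at index (code + 1). A short sketch of reading the result, using an illustrative sample string in place of the file contents and made-up names nT and present:

```matlab
% Sketch: reading individual counts out of charBinCounts.
% 'filetext' here is an illustrative stand-in for the fileread output.
filetext = 'this is a test file';
charBinCounts = histc(double(uint8(filetext)), 0:1:127);

% Bin k holds the count for character code k-1, so add 1 to the code.
nT = charBinCounts(double('t') + 1);

% Characters that occur at least once, recovered from the non-empty bins.
present = char(find(charBinCounts) - 1);
```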
