Error in data formatting when creating a matrix of only certain columns from a very large txt file in MATLAB
I have a .txt data file with 8 columns and 4,377,008 lines. The first row includes header titles, and the rest of the rows are numbers in a variety of formats.
For my code, I only want to use the data from columns 1, 2 and 5 in the txt file.
Column 1 is time in s, and the data in this column is positive numbers with up to 2 decimal places. (e.g. 78.95)
Column 2 is a node number, which is a whole, positive number anywhere between 0 and 60,000 (e.g. 35 or 36125).
Column 5 is flow rate, which can be positive or negative, ranging from small values with up to 10 decimal places to 2-digit or 3-digit numbers (e.g. -69.0052 or 0.00714967).
I have used a simple readmatrix code:
Q_data = readmatrix("filename.txt");
remove = [3, 4, 6, 7, 8];
Q_data(:, remove) = []
I have also tried with selectedData, but in both cases the formatting of the data has completely changed. It appears to be standardised to the same number of decimal places, with no numbers higher than 10. For example:
This is what the data in the txt file looks like (only including columns 1, 2 and 5):
82.5, 12, -44.7079
And this is the equivalent row of data in the matrix in MATLAB:
0.0083 0.0012 -0.0045
As you can see, the format of the numbers has changed and I am losing some valuable information that I need to process the data.
Is there a way I can fix this? I would really appreciate any help :) Thank you in advance!
P.S. Apologies if the formatting is weird, or if I am missing any relevant information. This is my first time asking a question here!
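For the column selection described in the question, a minimal sketch of importing only columns 1, 2 and 5, rather than reading all 8 columns and deleting 5 of them, might look like the following. The header names, the comma delimiter, and the two sample rows are assumptions made up for illustration; the real file may differ. This only changes which columns are imported, and the stored values are the same as with the original approach.

```matlab
% Build a small stand-in file (hypothetical header names and values,
% comma delimiter assumed).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'time,node,c3,c4,flow,c6,c7,c8\n');
fprintf(fid, '82.5,12,0,0,-44.7079,0,0,0\n');
fprintf(fid, '78.95,36125,0,0,0.00714967,0,0,0\n');
fclose(fid);

% Detect the file layout, then ask for columns 1, 2 and 5 only.
opts = detectImportOptions(fname);
opts.SelectedVariableNames = opts.VariableNames([1 2 5]);
Q_data = readmatrix(fname, opts);   % numeric matrix with 3 columns

delete(fname);                      % clean up the stand-in file
```

For a file with millions of rows, selecting the columns at import time also avoids holding the five unused columns in memory.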
2 Comments
Mitchell Thurston
2024-12-7
This could just be the way the data prints out, with the display switching to scientific notation because of the second column, which you said could go up to the tens of thousands. What does this command print?
fprintf("%f, %f, %f\n", Q_data(10,:))
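As a small sketch of this check, the snippet below prints rows with an explicit format so that the display settings play no part. The matrix here is a hypothetical stand-in built from values quoted in the question, not the real data. Note that fprintf consumes a matrix column by column, so several rows need a transpose first.

```matlab
% Hypothetical stand-in for Q_data, using values quoted in the question.
Q_data = [82.5, 12, -44.7079; 78.95, 36125, 0.00714967];

% One row: the explicit format ignores the current "format" setting.
row1 = sprintf('%g, %g, %g', Q_data(1,:));
disp(row1)                                   % 82.5, 12, -44.7079

% Several rows: transpose so each row's values are consumed together.
fprintf('%g, %g, %g\n', Q_data(1:2,:).');
```

If the printed values match the text file, the data was imported correctly and only the on-screen display differs.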
Accepted Answer
More Answers (1)
Walter Roberson
2024-12-7
You are examining the outputs by using disp() (or implied disp(), such as just naming the variable on the command line.)
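The default output format is "format short". For a matrix, "format short" examines the maximum absolute value of all of the data and chooses one common scale factor and number of decimal places for the whole display based on that maximum. The scale factor is printed once at the top of the output (for example "1.0e+04 *"), so it is easy to miss; the stored values themselves are unchanged.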
You would be better off looking at the data with "format long g" in effect.
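A minimal sketch of the difference, using a hypothetical two-row matrix built from values quoted in the question: if the display applied a scale factor of 1.0e+04 (plausible with node numbers in the tens of thousands, but an assumption here), then 82.5/10^4 = 0.00825, 12/10^4 = 0.0012 and -44.7079/10^4 ≈ -0.0045, which is consistent with the row the question reports.

```matlab
% Hypothetical stand-in matrix, using values quoted in the question.
Q = [82.5, 12, -44.7079; 78.95, 36125, 0.00714967];

format short
disp(Q)        % one common scale factor is chosen for the whole matrix

format longG   % the documented spelling of "format long g"
disp(Q)        % each value is shown individually, with no common factor

format         % restore the default display format

% The display format never alters the stored numbers.
scaled = Q(1,:) / 1e4;   % what a 1.0e+04 scale factor would show for row 1
```

Only the display changes between the two disp calls; Q holds the same values throughout.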