Error in data formatting when creating a matrix of only certain columns from a very large txt file in MATLAB

I have a .txt data file with 8 columns and 4,377,008 lines. The first row includes header titles, and the rest of the rows are numbers in a variety of formats.
For my code, I only want to use the data from columns 1, 2 and 5 in the txt file.
Column 1 is time in seconds; the values are positive numbers with up to 2 decimal places (e.g. 78.95).
Column 2 is a node number: a non-negative whole number anywhere between 0 and 60,000 (e.g. 35 or 36125).
Column 5 is flow rate, which can be positive or negative and ranges from small values with up to 10 decimal places to 2-digit or 3-digit numbers (e.g. -69.0052 or 0.00714967).
I have used a simple readmatrix code:
Q_data = readmatrix("filename.txt");
remove = [3, 4, 6, 7, 8];
Q_data(:, remove) = []
And I have also tried with selectedData, but in both cases the formatting of the data has completely changed. It appears to have been standardised to the same number of decimal places, with no numbers higher than 10. For example:
This is what the data in the txt file looks like (only including columns 1, 2 and 5):
82.5, 12, -44.7079
and this is the equivalent row of data in the matrix in MATLAB:
0.0083 0.0012 -0.0045
As you can see, the format of the numbers has changed and I am losing some valuable information that I need to process the data.
Is there a way I can fix this? I would really appreciate any help :) Thank you in advance!
p.s. apologies if the formatting is weird, or I am missing any relevant information - this is my first time asking a question here!
2 Comments
Mitchell Thurston 2024-12-7
This could just be the way the data prints out, with MATLAB switching to scientific notation because the second column, which you said can reach the tens of thousands, is large. What does this command print?
fprintf("%f, %f, %f\n", Q_data(10,:))
Izzy 2024-12-22
This prints one row of the data, with each value in the right format! Unfortunately, when I try it for the whole matrix, it seems to print each whole column in one go, like an array, instead of row by row (i.e. 1, 1, 1, 1, 1, [...], 2, 2, 2, 2, 2, [...], 3, 3, 3, 3, 3, [...] while I need 1, 2, 3; 1, 2, 3; 1, 2, 3, [...]). I have used fprintf("%f, %f, %f\n", Q_data) and fprintf("%f, %f, %f\n", Q_data(:,:)), which both give the result described above. I am not sure whether I am misunderstanding the command, though...
I have checked the data and am happy to see that it is all saved in the correct format, which was my main concern, so I think my question is practically answered. Thank you for your response, and sorry it took me so long to reply!
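
The column-by-column output described in this comment follows from how fprintf consumes its data argument: it walks the matrix in column-major order, so passing Q_data directly fills the three placeholders with consecutive entries of column 1 first. A minimal sketch of one common workaround is to transpose the matrix, so that each original row becomes a column. Q_demo below is a made-up 2-by-3 stand-in for Q_data, not values read from the real file.

```matlab
% Q_demo is a hypothetical stand-in for Q_data (time, node, flow rate).
Q_demo = [82.5,  12,    -44.7079;
          78.95, 36125,  0.00714967];

% fprintf reads its data argument column by column and reuses the format
% string until the data runs out. Transposing turns each original row into
% a column, so every "%f, %f, %f\n" cycle receives one complete row.
fprintf("%f, %f, %f\n", Q_demo.');
```

For a matrix with millions of rows, printing a slice such as Q_data(1:20, :).' is likely more practical than printing the whole array.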


Accepted Answer

Izzy 2024-12-22
This came from a misunderstanding on my part! The data is saved correctly in the matrix, but since it was printing incorrectly, and I was using this as a check method, I thought there was an issue with the data.
Using fprintf("%f, %f, %f\n", Q_data(10,:)) successfully prints 1 row of data in the correct format, if this is useful to anyone in future applications!

More Answers (1)

Walter Roberson 2024-12-7
You are examining the outputs by using disp() (or implied disp(), such as just naming the variable on the command line.)
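The default output format is "format short". For a matrix, "format short" examines the maximum absolute value of all of the data and chooses a single common scale factor from it, so the number of decimal places shown for every element depends on that overall maximum. Only the display is affected; the stored values are unchanged.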
You would be better off looking at the data with "format long g" in effect.
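
To illustrate the difference between the two display settings, here is a small sketch. Q_demo is a hypothetical 2-by-3 stand-in for Q_data, built from the example values in the question rather than read from the actual file. With "format short" a matrix with this range of magnitudes is typically shown with a common scale factor in front, while "format longG" shows each element at its own magnitude.

```matlab
% Q_demo is a hypothetical stand-in for Q_data (time, node, flow rate).
Q_demo = [82.5,  12,    -44.7079;
          78.95, 36125,  0.00714967];

format short   % default display: one common scale factor for the whole matrix
disp(Q_demo)

format longG   % up to 15 significant digits per element, no common scale factor
disp(Q_demo)

format short   % restore the default display setting
```

The format command only changes how values are displayed in the Command Window, so switching between the two settings does not alter Q_demo itself.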

Release

R2023a
