readtable error : read all columns into a single column

15 次查看(过去 30 天)
I am using readtable to read a *.txt file into table
data = readtable('test.txt', 'Format','auto')
the data file have 10 columns, but in the loaded data table, all columns are cancatecated into a single column, any ideas how to fix this?
thank you.
  4 个评论
dpb
dpb 2023-11-9
You forgot to attach the data file so we have to guess...
First, try
data = readtable('test.txt');
with no additional parameters; generally the input scanner is pretty competent about finding the delimiter on its own.
If that fails, then telling it what the delimiter is will be next; the image you attached is from Excel where you imported it, it is NOT the text file; the Q? is whether Excel was able to detect the delimiter on its own or did you have to use the whizard to separate columns -- if the text file is fixed-width, that is generally the case and would be one likely cause for your result in MATLAB.
Guohua
Guohua 2023-11-9
Please see the attached .txt file, adding "ExpectedNumVariables" won't solve the issue.
I also tried "Format"
readtable(filepath,'Format','%s%s%s%f%f%f%f%f%f%f%f%f%f')
This wont work either.

请先登录,再进行评论。

回答(2 个)

Cris LaPierre
Cris LaPierre 2023-11-9
It looks like MATLAB is not correctly identifying the delimitar, which is a tab. You can manually specify that.
Some fo the datatypes are also incorrectly selected. You can use theImport Tool to ineratively set the import parameters and then generate a script.
You could also do that manually.
opts = delimitedTextImportOptions("NumVariables",13);
opts.Delimiter = "\t";
opts.VariableNamesLine = 1;
opts.DataLines = [2,inf];
opts = setvartype(opts,["datetime", "string", "string", "double", "double", "double", "double", "double", "double", "double", "double", "double", "double"]);
data = readtable("data_20091130.txt",opts)
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
data = 939×13 table
Date x_cusip x_sedol ModelComposite_USHISTORY GrowthComposite_USHISTORY QualityComposite_USHISTORY SentimentComposite_USHISTORY ValuationComposite_USHISTORY GrowthOrientedIndustriesForBacktest EquityDistressedFlag_USHistory DividendGrowthProb DistressedCompositeGlobal_Rank_9 DIV_GR_PROB_LogitGlobal_Monthly ___________ ___________ _________ ________________________ _________________________ __________________________ ____________________________ ____________________________ ___________________________________ ______________________________ __________________ ________________________________ _______________________________ 30-Nov-2009 "03216N103" <missing> NaN NaN -0.26873 NaN NaN 1 NaN NaN NaN NaN 30-Nov-2009 "016275109" <missing> NaN NaN 0.028927 NaN NaN 0 NaN NaN NaN NaN 30-Nov-2009 "860341106" "B298752" NaN NaN -0.19644 NaN NaN 0 NaN NaN NaN NaN 30-Nov-2009 "816074405" "B01RQH8" NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 30-Nov-2009 "002034957" <missing> NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 30-Nov-2009 "00846U101" "2520153" 0.025038 1.3957 -1.3517 0.052121 -1.4895 1 0 NaN 7 NaN 30-Nov-2009 <missing> <missing> NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 30-Nov-2009 "004403101" "2005650" NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 30-Nov-2009 "043436104" "2855855" 0.097163 -0.91015 0.72884 0.43454 1.1715 1 0 NaN 10 NaN 30-Nov-2009 "013104104" "2012467" NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 30-Nov-2009 "002824100" "2002305" -0.17297 0.83537 0.02089 -0.04318 -0.27188 0 0 NaN 10 0.49791 30-Nov-2009 "00081T108" "B0G7SZ5" -0.57832 -1.5548 -0.4272 -0.62305 -0.11965 0 0 NaN 3 NaN 30-Nov-2009 "G0450A105" "2740542" -0.40851 0.65944 0.38131 -1.0877 -0.058327 0 0 NaN 5 NaN 30-Nov-2009 "025195405" "B064919" -0.33473 -1.2372 0.94208 -2.3694 0.79164 0 1 NaN 1 NaN 30-Nov-2009 "008190100" "2034524" -0.068901 0.29572 -0.19788 -0.19018 0.45785 1 0 NaN 9 NaN 30-Nov-2009 "032654105" "2032067" 1.3453 -0.11685 0.14799 2.7512 0 1 0 NaN 9 0.65386
  1 个评论
Cris LaPierre
Cris LaPierre 2023-11-9
Using your second approach also works, but with some additional specifications.
readtable("data_20091130.txt","Format",'%{MM/dd/yyyy}D%s%s%f%f%f%f%f%f%f%f%f%f',...
'Delimiter','\t','TextType','string')
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
ans = 939×13 table
Date x_cusip x_sedol ModelComposite_USHISTORY GrowthComposite_USHISTORY QualityComposite_USHISTORY SentimentComposite_USHISTORY ValuationComposite_USHISTORY GrowthOrientedIndustriesForBacktest EquityDistressedFlag_USHistory DividendGrowthProb DistressedCompositeGlobal_Rank_9 DIV_GR_PROB_LogitGlobal_Monthly __________ ___________ _________ ________________________ _________________________ __________________________ ____________________________ ____________________________ ___________________________________ ______________________________ __________________ ________________________________ _______________________________ 11/30/2009 "03216N103" "" NaN NaN -0.26873 NaN NaN 1 NaN NaN NaN NaN 11/30/2009 "016275109" "" NaN NaN 0.028927 NaN NaN 0 NaN NaN NaN NaN 11/30/2009 "860341106" "B298752" NaN NaN -0.19644 NaN NaN 0 NaN NaN NaN NaN 11/30/2009 "816074405" "B01RQH8" NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 11/30/2009 "002034957" "" NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 11/30/2009 "00846U101" "2520153" 0.025038 1.3957 -1.3517 0.052121 -1.4895 1 0 NaN 7 NaN 11/30/2009 "" "" NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 11/30/2009 "004403101" "2005650" NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 11/30/2009 "043436104" "2855855" 0.097163 -0.91015 0.72884 0.43454 1.1715 1 0 NaN 10 NaN 11/30/2009 "013104104" "2012467" NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 11/30/2009 "002824100" "2002305" -0.17297 0.83537 0.02089 -0.04318 -0.27188 0 0 NaN 10 0.49791 11/30/2009 "00081T108" "B0G7SZ5" -0.57832 -1.5548 -0.4272 -0.62305 -0.11965 0 0 NaN 3 NaN 11/30/2009 "G0450A105" "2740542" -0.40851 0.65944 0.38131 -1.0877 -0.058327 0 0 NaN 5 NaN 11/30/2009 "025195405" "B064919" -0.33473 -1.2372 0.94208 -2.3694 0.79164 0 1 NaN 1 NaN 11/30/2009 "008190100" "2034524" -0.068901 0.29572 -0.19788 -0.19018 0.45785 1 0 NaN 9 NaN 11/30/2009 "032654105" "2032067" 1.3453 -0.11685 0.14799 2.7512 0 1 0 NaN 9 0.65386

请先登录,再进行评论。


Voss
Voss 2023-11-9
filename = 'data_20091130.txt';
% detect import options, specifying tab as delimiter:
opts = detectImportOptions(filename,'Delimiter','\t');
% set variable 2 and 3's data to be interpreted as character vectors:
opts = setvartype(opts,opts.VariableNames([2 3]),{'char','char'});
% read the file:
T = readtable(filename,opts)
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
T = 939×13 table
Date x_cusip x_sedol ModelComposite_USHISTORY GrowthComposite_USHISTORY QualityComposite_USHISTORY SentimentComposite_USHISTORY ValuationComposite_USHISTORY GrowthOrientedIndustriesForBacktest EquityDistressedFlag_USHistory DividendGrowthProb DistressedCompositeGlobal_Rank_9 DIV_GR_PROB_LogitGlobal_Monthly ___________ _____________ ___________ ________________________ _________________________ __________________________ ____________________________ ____________________________ ___________________________________ ______________________________ __________________ ________________________________ _______________________________ 30-Nov-2009 {'03216N103'} {0×0 char } NaN NaN -0.26873 NaN NaN 1 NaN {0×0 char} NaN NaN 30-Nov-2009 {'016275109'} {0×0 char } NaN NaN 0.028927 NaN NaN 0 NaN {0×0 char} NaN NaN 30-Nov-2009 {'860341106'} {'B298752'} NaN NaN -0.19644 NaN NaN 0 NaN {0×0 char} NaN NaN 30-Nov-2009 {'816074405'} {'B01RQH8'} NaN NaN NaN NaN NaN NaN NaN {0×0 char} NaN NaN 30-Nov-2009 {'002034957'} {0×0 char } NaN NaN NaN NaN NaN NaN NaN {0×0 char} NaN NaN 30-Nov-2009 {'00846U101'} {'2520153'} 0.025038 1.3957 -1.3517 0.052121 -1.4895 1 0 {0×0 char} 7 NaN 30-Nov-2009 {0×0 char } {0×0 char } NaN NaN NaN NaN NaN NaN NaN {0×0 char} NaN NaN 30-Nov-2009 {'004403101'} {'2005650'} NaN NaN NaN NaN NaN NaN NaN {0×0 char} NaN NaN 30-Nov-2009 {'043436104'} {'2855855'} 0.097163 -0.91015 0.72884 0.43454 1.1715 1 0 {0×0 char} 10 NaN 30-Nov-2009 {'013104104'} {'2012467'} NaN NaN NaN NaN NaN NaN NaN {0×0 char} NaN NaN 30-Nov-2009 {'002824100'} {'2002305'} -0.17297 0.83537 0.02089 -0.04318 -0.27188 0 0 {0×0 char} 10 0.49791 30-Nov-2009 {'00081T108'} {'B0G7SZ5'} -0.57832 -1.5548 -0.4272 -0.62305 -0.11965 0 0 {0×0 char} 3 NaN 30-Nov-2009 {'G0450A105'} {'2740542'} -0.40851 0.65944 0.38131 -1.0877 -0.058327 0 0 {0×0 char} 5 NaN 30-Nov-2009 {'025195405'} {'B064919'} -0.33473 -1.2372 0.94208 -2.3694 0.79164 0 1 {0×0 char} 1 NaN 30-Nov-2009 {'008190100'} {'2034524'} -0.068901 0.29572 -0.19788 -0.19018 0.45785 1 0 {0×0 char} 9 NaN 30-Nov-2009 {'032654105'} {'2032067'} 1.3453 -0.11685 0.14799 2.7512 0 1 0 {0×0 char} 9 0.65386

类别

Help CenterFile Exchange 中查找有关 MATLAB Report Generator 的更多信息

标签

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by