How can I access nested struct data by indexing the fields?

Hi,
I'm trying to write a loop that retrieves data from a URL using webread. On each iteration I get a set of data in struct format. Some field names are fixed, so I can use them in the loop. However, some names in the struct data are specific to each result, which makes them difficult to access. I tried different ways of indexing, but none worked.
Here is an example of the loop:
for i = 1 %:numel(LotusID)
    Prefixe = 'https://lotus.naturalproducts.net/api/search/simple?query=';
    Sufixe = 'LTS0120864'; % LotusID(i,1)
    url_lotus = strcat(Prefixe,Sufixe);
    html_lotus = webread(url_lotus);
end
I want to access all the information in this nested structure. However, I don't know the names of the subfields to access, or even how to create an index for them.
Here is an example of the nested struct data:
html_lotus.naturalProducts.taxonomyReferenceObjects
Inside this structure (html_lotus.naturalProducts.taxonomyReferenceObjects) there are 12 [1x1 struct]. So, I can't use indexing, because all of them are [1x1 struct]:
'x10_x_x_1021_NP980041X'
'x10_x_x_1021_NP900144C'
'x10_x_x_1248_CPB_x_x_57_x_x_302'
...
I want to get all the information inside the nested structure, but I don't know the field names and I have no index for them.
Specifically, in this case (html_lotus.naturalProducts.taxonomyReferenceObjects.x10_x_x_1021_NP980041X) there are 3 more substructures to go through before finally reaching the data.
How can I get this data? I tried to convert all the nested structures into cells, but it didn't work.
Thank you!
Alan

Accepted Answer

Stephen23 on 9 Feb 2023
Edited: Stephen23 on 9 Feb 2023
url = 'https://lotus.naturalproducts.net/api/search/simple?query=';
sfx = 'LTS0120864';
raw = webread(strcat(url,sfx))
raw = struct with fields:
    originalQuery: 'LTS0120864'
    determinedInputType: 'LOTUS ID'
    naturalProducts: [1×1 struct]
"However, I don't know the names of the subfields to access ..."
You can use FIELDNAMES() to get a cell array of the fieldnames:
fnr = fieldnames(raw)
fnr = 3×1 cell array
    {'originalQuery'      }
    {'determinedInputType'}
    {'naturalProducts'    }
You can loop over those and use dynamic field names to access the field content.
For example, the 3rd field contains another structure:
raw.(fnr{3})
ans = struct with fields:
    id: '604b8c8612e4996162762032'
    lotus_id: 'LTS0120864'
    wikidata_id: 'http://www.wikidata.org/entity/Q27140246'
    contains_sugar: 0
    heavy_atom_number: 21
    smiles: 'Oc1ccc2c(c1)OC[C@@H]1c3cc4c(cc3O[C@H]21)OCO4'
    inchi: 'InChI=1S/C16H12O5/c17-8-1-2-9-12(3-8)18-6-11-10-4-14-15(20-7-19-14)5-13(10)21-16(9)11/h1-5,11,16-17H,6-7H2/t11-,16-/m1/s1'
    inchikey: 'HUKSJTUUSUGIDC-BDJLRTHQSA-N'
    inchi2D: 'InChI=1S/C16H12O5/c17-8-1-2-9-12(3-8)18-6-11-10-4-14-15(20-7-19-14)5-13(10)21-16(9)11/h1-5,11,16-17H,6-7H2'
    inchikey2D: 'HUKSJTUUSUGIDC'
    smiles2D: 'Oc1ccc2c(c1)OCC1c3cc4c(cc3OC21)OCO4'
    sugar_free_smiles: 'OC1=CC=C2C(OCC3C4=CC=5OCOC5C=C4OC23)=C1'
    deep_smiles: ''
    traditional_name: '(+)-maackiain'
    synonyms: []
    cas: []
    iupac_name: '(1S,12S)-5,7,11,19-tetraoxapentacyclo[10.8.0.0²,¹⁰.0⁴,⁸.0¹³,¹⁸]icosa-2,4(8),9,13,15,17-hexaen-16-ol'
    contains_ring_sugars: 0
    contains_linear_sugars: 0
    collection: []
    molecular_formula: 'C16H12O5'
    molecular_weight: 284.2641
    npl_noh_score: 0
    npl_score: 1
    npl_sugar_score: 0.6364
    number_of_carbons: 16
    number_of_nitrogens: 0
    number_of_oxygens: 5
    number_of_rings: []
    max_number_of_rings: 15
    min_number_of_rings: 5
    number_repeated_fragments: []
    sugar_free_heavy_atom_number: 21
    sugar_free_total_atom_number: 21
    total_atom_number: 33
    bond_count: 25
    xrefs: []
    fragments: [1×1 struct]
    fragmentsWithSugar: [1×1 struct]
    murko_framework: 'O1c2cc3OC4c5ccccc5OCC4c3cc2OC1'
    ertlFunctionalFragments: [1×1 struct]
    ertlFunctionalFragmentsPseudoSmiles: [1×1 struct]
    pubchemFingerprint: [135×1 double]
    pubfp: []
    pfCounts: [1×1 struct]
    circularFingerprint: [35×1 double]
    substructureFingerprint: []
    extendedFingerprint: [180×1 double]
    pubchemBits: 'AB4cAAAAAAAAAAAAAAAAAACAYBAAAAwHAQAAAAAAAIACAABYAAAQAACwKAUZwAxwAQBgAAFABEIAAEAQAAQEABARAGAREbjkRGGMWEUeRKUDqHAd4AMHP3CFAIAQABACAEIAQAgACAE='
    pubchemBitsString: '0000000001111000001110000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010000011000001000000000000000000000110000111000001000000000000000000000000000000000000000000000000000000000000001010000000000000000000000000110100000000000000000000010000000000000000000000011010001010010100000100110000000001100110000000011101000000000000000000001100000000010000000000000100010000001000010000000000000000000000010000010000000000000100000001000000000000000001000100010000000000000000110100010001000100000011101001001110010001010000110001100010001101010100010011110000010001010100101110000000001010100001110101110000000011111000000111000001111110000001110101000010000000000000001000010000000000000001000010000000000000001000010000000000000001000010000000000000001000010'
    chemicalTaxonomyNPclassifierPathway: 'Shikimates and Phenylpropanoids'
    chemicalTaxonomyNPclassifierSuperclass: 'Isoflavonoids'
    chemicalTaxonomyNPclassifierClass: 'Pterocarpan'
    chemicalTaxonomyClassyfireKingdom: 'Organic compounds'
    chemicalTaxonomyClassyfireSuperclass: 'Phenylpropanoids and polyketides'
    chemicalTaxonomyClassyfireClass: 'Isoflavonoids'
    chemicalTaxonomyClassyfireDirectParent: 'Pterocarpans'
    allChemClassifications: {6×1 cell}
    taxonomyReferenceObjects: [1×1 struct]
    allTaxa: {38×1 cell}
    tanimoto: []
    alogp: 2.4420
    alogp2: 5.9634
    amralogp: 72.0780
    apol: 40.1715
    bcutDescriptor: [6×1 double]
    bpol: 20.7825
    eccentricConnectivityIndexDescriptor: 402
    fmfDescriptor: 0.9524
    fsp3: 0.2500
    fragmentComplexityDescriptor: 949.0500
    gravitationalIndexHeavyAtoms: 'NaN'
    hBondAcceptorCount: 0
    hBondDonorCount: 1
    hybridizationRatioDescriptor: 0.2500
    kappaShapeIndex1: 13.4400
    kappaShapeIndex2: 5
    kappaShapeIndex3: 2.0663
    manholdlogp: 2.6700
    petitjeanNumber: 0.4545
    petitjeanShapeTopo: 0.8333
    petitjeanShapeGeom: 'NaN'
    lipinskiRuleOf5Failures: 0
    numberSpiroAtoms: 0
    vabcDescriptor: 'NaN'
    vertexAdjMagnitude: 5.6439
    weinerPathNumber: 855
    weinerPolarityNumber: 37
    xlogp: 2.2540
    zagrebIndex: 126
    topoPSA: 57.1500
    tpsaEfficiency: 0.2012
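The loop over the top-level field names mentioned above could be sketched roughly as follows. To keep it runnable offline, the struct here is a small hand-made stand-in for the webread result (an assumption; the real response has many more fields), and the variable cls is introduced only for illustration.

```matlab
% Stand-in for the webread result (assumption: only a few fields kept).
raw = struct('originalQuery','LTS0120864', ...
             'determinedInputType','LOTUS ID', ...
             'naturalProducts',struct('lotus_id','LTS0120864'));

fnr = fieldnames(raw);       % cell array of the top-level field names
cls = cell(size(fnr));       % class of each field's content
for k = 1:numel(fnr)
    val = raw.(fnr{k});      % dynamic field name: the name comes from a variable
    cls{k} = class(val);
    fprintf('%-20s %s\n', fnr{k}, cls{k});
end
```

Checking the class of each value in this way is one possible starting point for deciding which fields need further unpacking.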
"Inside of this structure (html_lotus.naturalProducts.taxonomyReferenceObjects) there are 12 [1x1 struct]. So, I can't use an indexing because all them are [1x1 struct];"
There are indeed 12 fields, so you can easily use the method I just showed with FIELDNAMES() and dynamic field names. Let's try it now:
sub = raw.(fnr{3}).taxonomyReferenceObjects
sub = struct with fields:
    x10_x_x_1021_NP980041X: [1×1 struct]
    x10_x_x_1021_NP900144C: [1×1 struct]
    x10_x_x_1248_CPB_x_x_57_x_x_302: [1×1 struct]
    x10_x_x_1016_0031_9422_73_85046_0: [1×1 struct]
    x10_x_x_1021_NP0103158: [1×1 struct]
    x10_x_x_1248_CPB_x_x_28_x_x_3686: [1×1 struct]
    x10_x_x_3987_COM_95_7069: [1×1 struct]
    x10_x_x_1055_S_2001_14325: [1×1 struct]
    x10_x_x_3987_COM_92_6288: [1×1 struct]
    x10_x_x_1007_BF00564330: [1×1 struct]
    x10_x_x_1007_S10600_010_9696_0: [1×1 struct]
    x10_x_x_1016_J_x_x_BMC_x_x_2007_x_x_03_x_x_011: [1×1 struct]
fns = fieldnames(sub);
for k = 1:numel(fns)
tmp = sub.(fns{k});
display(tmp)
%... do whatever you want with that data...
end
tmp = struct with fields:
    OpenTreeOfLife: [1×1 struct]
    NCBI: [1×1 struct]
    GBIFBackboneTaxonomy: [1×1 struct]
tmp = struct with fields:
    OpenTreeOfLife: [1×1 struct]
    GBIFBackboneTaxonomy: [1×1 struct]
tmp = struct with fields:
    ITIS: [1×1 struct]
    NCBI: [1×1 struct]
    OpenTreeOfLife: [1×1 struct]
    GBIFBackboneTaxonomy: [1×1 struct]
tmp = struct with fields:
    NCBI: [1×1 struct]
    OpenTreeOfLife: [1×1 struct]
    GBIFBackboneTaxonomy: [1×1 struct]
tmp = struct with fields:
    NCBI: [1×1 struct]
    OpenTreeOfLife: [1×1 struct]
    GBIFBackboneTaxonomy: [1×1 struct]
tmp = struct with fields:
    NCBI: [1×1 struct]
    OpenTreeOfLife: [1×1 struct]
    GBIFBackboneTaxonomy: [2×1 struct]
tmp = struct with fields:
    GBIFBackboneTaxonomy: [1×1 struct]
tmp = struct with fields:
    OpenTreeOfLife: [1×1 struct]
    NCBI: [1×1 struct]
    GBIFBackboneTaxonomy: [1×1 struct]
tmp = struct with fields:
    GBIFBackboneTaxonomy: [1×1 struct]
tmp = struct with fields:
    iNaturalist: [1×1 struct]
    ITIS: [1×1 struct]
    OpenTreeOfLife: [1×1 struct]
    NCBI: [1×1 struct]
    GBIFBackboneTaxonomy: [1×1 struct]
tmp = struct with fields:
    OpenTreeOfLife: [1×1 struct]
    NCBI: [1×1 struct]
    GBIFBackboneTaxonomy: [1×1 struct]
tmp = struct with fields:
    ITIS: [1×1 struct]
    NCBI: [1×1 struct]
    OpenTreeOfLife: [2×1 struct]
    GBIFBackboneTaxonomy: [2×1 struct]
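If positional indexing is preferred over field names, one possible alternative (not part of the approach above) is to convert the structure with struct2cell, which returns the field contents in the same order that fieldnames returns the names. The data below is a hypothetical two-field stand-in for taxonomyReferenceObjects, and the taxid subfield is invented purely for illustration.

```matlab
% Hypothetical stand-in for raw.naturalProducts.taxonomyReferenceObjects.
sub = struct();
sub.x10_x_x_1021_NP980041X = struct('NCBI', struct('taxid', 1));
sub.x10_x_x_1021_NP900144C = struct('GBIFBackboneTaxonomy', struct('taxid', 2));

names  = fieldnames(sub);    % N-by-1 cell array of field names
values = struct2cell(sub);   % N-by-1 cell array of field contents, same order

secondName = names{2};       % name of the 2nd field
second     = values{2};      % content of the 2nd field, accessed by position
```

The names cell array is worth keeping alongside the values, since the converted cell array no longer records which reference each entry came from.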
Note that there is no simple, single answer for how to process a nested structure, because how it needs to be processed depends on the structure content and its meaning. MATLAB cannot know this. For example, those 12 fields shown above each consist of a scalar structure. But those scalar structures have different fieldnames: MATLAB cannot decide for you whether you want to process all of those scalar structures, or whether you need to ignore the ones that are missing a particular field. This is something that only you know, based on your needs. This means that you need to write the code that checks and processes the data according to your needs.
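As one illustration of such a check, the sketch below skips the entries that lack a particular field and copes with that field being either a 1×1 or a 2×1 structure array. The data is a hypothetical stand-in: the names refA, refB, refC and the subfield name are assumptions, not fields of the real response.

```matlab
% Hypothetical stand-in data: three references with differing sources.
sub = struct();
sub.refA = struct('NCBI', struct('name','a'), ...
                  'GBIFBackboneTaxonomy', struct('name','b'));
sub.refB = struct('GBIFBackboneTaxonomy', struct('name',{'c';'d'})); % 2x1 struct array
sub.refC = struct('NCBI', struct('name','e'));

wanted = 'GBIFBackboneTaxonomy';   % the source we choose to keep
found  = {};
fns = fieldnames(sub);
for k = 1:numel(fns)
    tmp = sub.(fns{k});
    if ~isfield(tmp, wanted)
        continue                   % skip references that lack this source
    end
    src = tmp.(wanted);
    for j = 1:numel(src)           % src may hold one or several elements
        found{end+1} = src(j).name; %#ok<AGROW>
    end
end
disp(found)
```

Whether to skip, warn, or substitute a default when the field is missing is exactly the kind of decision that depends on what the data means for your task.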

More Answers (1)

Askic V on 8 Feb 2023
Edited: Askic V on 8 Feb 2023
One quick and dirty solution, which uses the very unpopular eval function, is given here:
Prefixe = 'https://lotus.naturalproducts.net/api/search/simple?query=';
Sufixe = 'LTS0120864'; % LotusID(i,1)
url_lotus = strcat(Prefixe,Sufixe);
html_lotus = webread(url_lotus);
fields = fieldnames(html_lotus);
var_struct = struct();
for i = 1:numel(fields)
    eval_stri = ['html_lotus.' fields{i}]; % build the expression as text
    var_aux = eval(eval_stri);             % evaluate it to read the field
    if isa(var_aux,'struct')
        fprintf('\n%s is a struct\n',fields{i});
        var_struct = var_aux;              % keep the nested structure
    end
end
naturalProducts is a struct
var_struct
var_struct = struct with fields:
    (the same fields and values as the raw.naturalProducts listing shown in the accepted answer above)
Now, I'm sure some of the gurus will suggest a more elegant way.
But in the code above you can see how to use webread, how to check whether a variable is a structure, and how to get the field names and use them to read the members of a nested structure. You can go deeper into the nesting in the same way.
Anyway, the code above shows you how to get the names of the fields and how to create an index for them, which was your initial question.
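Going deeper in the same way can also be done without eval. The following is a minimal sketch that walks a nested structure with dynamic field names and an explicit stack, collecting the path to every leaf. The structure s and its field names are hypothetical stand-ins for a webread result, and non-scalar structure arrays are deliberately treated as leaves to keep the example short.

```matlab
% Hypothetical stand-in for a nested webread result.
s = struct('a', 1, 'b', struct('c', 'text', 'd', struct('e', [1 2 3])));

stack = {{'s', s}};   % each entry is {path text, value}
paths = {};           % collected paths to the leaves
while ~isempty(stack)
    item = stack{end};
    stack(end) = [];                  % pop the last entry
    p = item{1};
    v = item{2};
    if isstruct(v) && isscalar(v)
        fn = fieldnames(v);
        for k = numel(fn):-1:1        % push in reverse so fields pop in order
            stack{end+1} = {[p '.' fn{k}], v.(fn{k})}; %#ok<AGROW>
        end
    else
        paths{end+1} = p; %#ok<AGROW> % leaf value (or an unexpanded struct array)
    end
end
disp(paths(:))
```

An explicit stack is used here instead of a recursive function only so that the example runs as a plain script; a recursive function would be an equally reasonable design.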
2 Comments
Steven Lord on 8 Feb 2023
There's no need for eval here. Use dynamic field names.
s = struct('x', 1, 'y', 2, 'z', 3)
s = struct with fields:
    x: 1
    y: 2
    z: 3
f = 'y';
two = s.(f) % 2
two = 2
Alternatively, if you have multiple levels of indexing, you can use getfield.
nested = struct('w', 4, 'q', s)
nested = struct with fields:
    w: 4
    q: [1×1 struct]
thefields = {'q', 'z'};
three = getfield(nested, thefields{:}) % nested.q.z
three = 3
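As a small extension of the getfield idea, the list of field names can be built from a dotted path held as text, and setfield accepts the same list for writing. This is only a sketch: the variable pathStr and the value 30 are assumptions made for illustration.

```matlab
s = struct('x', 1, 'y', 2, 'z', 3);
nested = struct('w', 4, 'q', s);

pathStr = 'q.z';                          % hypothetical path given as text
parts = strsplit(pathStr, '.');           % {'q','z'}
val = getfield(nested, parts{:});         % reads nested.q.z
nested = setfield(nested, parts{:}, 30);  % writes nested.q.z = 30
```

This keeps the path as data, so the same code can follow paths of any depth without eval.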
