How to search for very similar strings?
Hi all,
I am doing a bibliometric analysis. In particular, I need to search for article titles in the reference lists of the citing papers. Here is my code:
for iMS = 1:length(MS)
    % Logical mask: true where a reference string contains this title (case-insensitive)
    Cit{iMS} = contains({MSCit.References}, MS(iMS).Title, 'IgnoreCase', true);
end
The code works pretty well; however, the data that I can export from Scopus is not perfect. Article titles are not consistent, so an exact match does not always work. Here are two examples:
Case 1:
Real article name: 'Biomethane production from different crop systems of cereals in Northern Italy'
Article name in the reference: 'Biomethane production from different crop systems of cereals in Nothern Italy'
Case 2:
Real article name: 'Methodology for the realisation of accelerated structural tests on tractors'
Article name in the reference: 'Methodology for the realization of accelerated structural tests on tractors'
As you can see, the two titles differ by a single character. Since I have more than 20,000 papers and fixing them by hand would be time-consuming, is there a way to programmatically search for very similar strings? Note that the strings may also differ in length.
Thank you,
Cheers
Accepted Answer
John D'Errico
2019-1-25
Edited: John D'Errico
2019-1-25
You probably want to do some reading here:
Plus, I see lots of code provided.
I'm sure some of those are better than others. And I would never count out anything written by Cleve.
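For a rough idea of what such code tends to look like, here is a minimal sketch of approximate substring matching based on edit distance. This is one common technique for this kind of problem, not necessarily what any particular submission implements. The function name fuzzySubstringDist is hypothetical. It returns the smallest number of single-character insertions, deletions, or substitutions needed to make the title appear somewhere inside the reference string.

```matlab
function d = fuzzySubstringDist(pattern, text)
% fuzzySubstringDist  Hypothetical helper, not a built-in or File Exchange function.
% Smallest edit distance between PATTERN and any substring of TEXT,
% ignoring case. 0 means PATTERN occurs exactly somewhere in TEXT.
p = lower(char(pattern));
t = lower(char(text));
np = numel(p);
nt = numel(t);
prev = zeros(1, nt + 1);      % an empty pattern matches anywhere at zero cost
for i = 1:np
    curr = zeros(1, nt + 1);
    curr(1) = i;              % i pattern characters matched against nothing
    for j = 1:nt
        cost = double(p(i) ~= t(j));
        curr(j + 1) = min([prev(j + 1) + 1, ...  % skip a pattern character
                           curr(j) + 1, ...      % skip a text character
                           prev(j) + cost]);     % match or substitute
    end
    prev = curr;
end
d = min(prev);                % best match may end at any position in TEXT
end
```

With the variables from the question, something like fuzzySubstringDist(MS(iMS).Title, MSCit(k).References) <= 2 could stand in for the exact contains test. The threshold of 2 is an arbitrary assumption and would need tuning. The double loop is slow in plain MATLAB, so for 20,000 papers it is probably worth running it only on the titles that contains failed to match.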
3 Comments
O.Hubert
2024-2-1
Certainly too late, but you could remove the accents and special characters from the string prior to running fzsearch.
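A minimal sketch of that kind of clean-up, assuming a small hand-written accent table is good enough for the data. The helper name normalizeTitle and the choice of which characters to fold are assumptions, not part of any toolbox.

```matlab
function s = normalizeTitle(s)
% normalizeTitle  Hypothetical helper: lower-case the text, fold a few
% accented Latin letters to plain ASCII, and replace everything that is
% not a letter, digit, or space with a space.
s = lower(char(s));
from = 'àáâãäåçèéêëìíîïñòóôõöùúûüýÿ';
to   = 'aaaaaaceeeeiiiinooooouuuuyy';
[tf, idx] = ismember(s, from);          % locate accented characters
s(tf) = to(idx(tf));                    % swap in their ASCII counterparts
s = regexprep(s, '[^a-z0-9 ]', ' ');    % punctuation and symbols become spaces
s = strtrim(regexprep(s, '\s+', ' '));  % collapse repeated spaces
end
```

Both the titles and the reference strings would need to go through the same function before the search. This only helps with accents and punctuation; spelling variants such as realisation/realization or typos such as Nothern still need a fuzzy comparison.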