Bridging the Simulation-Reality Gap: Challenges in Applying ANN Models Trained on Simulated NMR Spectra to Experimental Data

We are currently developing an artificial neural network (ANN) model using the Neural Network Fitting App in MATLAB. The primary objective of this model is to predict specific quantities based on the relative areas of individual peaks in NMR spectra. The model has been trained exclusively on simulated NMR spectral data and demonstrates excellent performance metrics on this dataset.
However, when applied to real NMR spectral data, the model's performance degrades significantly, showing poor predictive accuracy. We suspect that discrepancies between the simulated and real data may be affecting the model's generalizability.
If anyone has encountered similar challenges or has experience in bridging the gap between simulated and experimental data in NMR-based machine learning applications, your insights would be greatly appreciated.
Thank you in advance for your support.

Answers (1)

Shishir Reddy 2025-6-20
Hi Guna,
As I understand it, your model performs well on simulated spectra but degrades significantly on real NMR data. This is a common problem in machine learning: models trained on simulated data often fail to generalize to real-world data because of differences in noise, baseline, resolution, and unexpected artifacts.
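Before changing the model, it may help to check how far the real inputs actually sit from the simulated ones. The following is a minimal sketch, not a definitive diagnostic: `simAreas` and `realAreas` are hypothetical names filled with synthetic stand-in data, and both are assumed to be matrices with one spectrum per row and one relative peak area per column.

```matlab
% Hypothetical placeholder data: rows = spectra, columns = relative peak areas.
rng(0);                                    % reproducible example
simAreas  = rand(200, 5);                  % stand-in for simulated features
realAreas = rand(40, 5) + [0 0 0.3 0 0];   % stand-in for real features; feature 3 is shifted

muSim = mean(simAreas, 1);
sdSim = std(simAreas, 0, 1);

% Shift of the real mean, in units of the simulated standard deviation
zShift = (mean(realAreas, 1) - muSim) ./ sdSim;

% Ratio of spreads; values far from 1 suggest a different noise level
sdRatio = std(realAreas, 0, 1) ./ sdSim;

disp(table((1:numel(zShift))', zShift', sdRatio', ...
    'VariableNames', {'Feature', 'MeanShift', 'SpreadRatio'}));
```

Features with a large mean shift or a spread ratio far from 1 are the ones where the simulation most likely misses something present in the experiment, so they are a reasonable place to start when adjusting the simulated data.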
The following approaches may help resolve the issue:
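1. To make the simulated data resemble real NMR spectra more closely, add noise and baseline distortion before training. If the network inputs are peak areas rather than full spectra, apply the distortions to the spectra before integrating the peaks.
% Gaussian noise; the 0.01 amplitude is a placeholder to match to your measured noise level
noisy_data = simulated_data + 0.01 * randn(size(simulated_data));
% Quadratic baseline along the spectral axis (rows = spectra, columns = points)
x = linspace(0, 1, size(simulated_data, 2));
baseline = 0.01 * (x - 0.5).^2;
baseline_data = noisy_data + baseline; % row vector expands across all rows (R2016b or later)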
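2. Raw peak areas can be overly sensitive to noise. PCA can reduce the dimensionality and suppress some of that noise. Apply the same projection to any data you later pass to the model.
[coeff, score, ~, ~, ~, mu] = pca(training_data); % rows = observations, columns = features
X_reduced = score(:, 1:10); % use first 10 principal components
real_reduced = (real_data - mu) * coeff(:, 1:10); % same projection for real data (real_data is a placeholder name)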
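3. Use 10-20% of your real data to train or fine-tune the model, keeping the remainder for evaluation
% Assume net is your pretrained network; train continues from its current weights
% Inputs and targets are features-by-samples matrices (one column per spectrum)
net = train(net, real_inputs_subset, real_targets_subset);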
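To judge whether fine-tuning actually helps, it is worth measuring the error on real samples that were not used for tuning. The sketch below is one way this could look, under several assumptions: all data are synthetic placeholders, the variable names are hypothetical, and the network is built with `fitnet` (the function behind the Neural Network Fitting App, which requires Deep Learning Toolbox).

```matlab
% All data below are synthetic placeholders; replace with your own matrices.
% fitnet/train expect a features-by-samples layout.
rng(0);                                      % reproducible example
simX  = rand(5, 500);                        % simulated inputs
simT  = sum(simX, 1);                        % simulated targets
realX = rand(5, 60) + 0.05 * randn(5, 60);   % "real" inputs with extra noise
realT = sum(realX, 1) + 0.1;                 % "real" targets with an offset

net = fitnet(10);                            % one hidden layer, 10 neurons
net.trainParam.showWindow = false;           % suppress the training GUI
net = train(net, simX, simT);                % pretrain on simulated data

% Split the real data: 20% for fine-tuning, 80% held out for testing
idx     = randperm(size(realX, 2));
nTune   = round(0.2 * numel(idx));
tuneIdx = idx(1:nTune);
testIdx = idx(nTune+1:end);

mseBefore = mean((net(realX(:, testIdx)) - realT(testIdx)).^2);

net.trainParam.epochs = 50;                  % short run to limit drift from the pretrained weights
net = train(net, realX(:, tuneIdx), realT(tuneIdx));

mseAfter = mean((net(realX(:, testIdx)) - realT(testIdx)).^2);
fprintf('Held-out real MSE before: %.4g, after: %.4g\n', mseBefore, mseAfter);
```

With very few real samples the fine-tuned network can overfit them, so the held-out error is the number to watch rather than the training error. The epoch limit of 50 is an arbitrary choice for illustration.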
I hope this helps.
