SUDE: Sampling-enabled scalable manifold learning unveils the discriminative cluster structure of high-dimensional data
We propose SUDE, a scalable manifold learning method that copes with large-scale, high-dimensional data efficiently. It first seeks a set of landmarks to construct a low-dimensional skeleton of the entire data set, and then incorporates the non-landmark points into this skeleton via constrained locally linear embedding. This toolkit includes the main code of SUDE, along with two applications for preprocessing scRNA-seq and ECG data. The method has been published in Nature Machine Intelligence; more details can be found at https://www.nature.com/articles/s42256-025-01112-9.
HOW TO RUN
The sude.m function exposes the following hyperparameters for user configuration:
function [Y, id_samp, para] = sude(X, varargin)
% This function returns representation of the N by D matrix X in the lower-dimensional space and
% the ID of landmarks sampled by PPS. Each row in X represents an observation.
%
% Parameters are:
%
% 'NumDimensions'- A positive integer specifying the number of dimensions of the representation Y.
% Default: 2
% 'NumNeighbors' - A non-negative integer specifying the number of nearest neighbors for PPS to
% sample landmarks. It must be smaller than N.
% Default: adaptive
% 'Normalize' - Logical scalar. If true, normalize X using min-max normalization. If features in
% X are on different scales, 'Normalize' should be set to true because the learning
% process is based on nearest neighbors and features with large scales can override
% the contribution of features with small scales.
% Default: True
% 'LargeData' - Logical scalar. If true, the data can be split into multiple blocks to avoid the problem
% of memory overflow, and the gradient can be computed block by block using 'learning_l' function.
% Default: False
% 'InitMethod' - A string specifying the method for initializing Y before manifold learning.
% 'le' - Laplacian eigenmaps.
% 'pca' - Principal component analysis.
% 'mds' - Multidimensional scaling.
% Default: 'le'
% 'AggCoef' - A positive scalar specifying the aggregation coefficient.
% Default: 1.2
% 'MaxEpoch' - Maximum number of epochs to take.
% Default: 50
The main.m file provides a complete example:
% Input data
clear;
data = csvread('mfeat.csv');
% Obtain data size and true annotations
[~, m] = size(data);
ref = data(:, m);
X = data(:, 1:m-1);
clear data
% Perform SUDE embedding
t1 = clock;
[Y, idx, para] = sude(X,'NumNeighbors',10);
t2 = clock;
disp(['Elapsed time:', num2str(etime(t2,t1)),'s']);
[knnACC, svmACC, clusACC] = ml_eval(X, Y, ref);
disp(['knnACC:', num2str(knnACC),' svmACC:', num2str(svmACC),' clusACC:', num2str(clusACC)]);
plotcluster2(Y, ref);
Citation
Peng, Dehua, et al. “Sampling-Enabled Scalable Manifold Learning Unveils the Discriminative Cluster Structure of High-Dimensional Data.” Nature Machine Intelligence, vol. 7, no. 10, Sept. 2025, pp. 1669–84, https://doi.org/10.1038/s42256-025-01112-9.
| Version | Released | Release notes | Action |
|---|---|---|---|
| 1.0.0 | | | |
