Implementing Breadth first search using connectivity matriX

Question

0 个投票

I have a connectivity matrix in which each row corresponds to two connected nodes or one line segment. Some of these points are connected to each other resulting in several number of clusters. I want to find the number of clusters and number of nodes in each cluster. I know this can be done using Breadth-first search(BFS) or Depth first search(DFS) but the available bfs/dfs codes are specifically written for biographs. I can't use biographs and want to use only connectivity matrix to give me the information. connectivity matrix=[1 2; 2 3; 2 4; 3 4; 3 5;4 1; 2 6 ;2 7];

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

请先登录，再进行评论。

请先登录，再回答此问题。

Follow Question

Answer 1

Walter Roberson 2015-5-16

在 MATLAB Online 中打开

0 个投票

cmat = [1 2; 2 3; 2 4; 3 4; 3 5;4 1; 2 6 ;2 7];
numnode = max(cmat(:));
adj = zeros(numnode,numnode);  %adjacency matrix
%for each pair (i,j) in the connectivity list, set adj(i,j) to 1
fromlist = cmat(:,1);
tolist = cmat(:,2);
fromtoidx = sub2ind([numnode,numnode], fromlist, tolist);  %as linear indices
adj(fromtoidx) = 1;
tofromidx = sub2ind([numnode,numnode], tolist, fromlist);  %connections are bidirectional
adj(tofromidx) = 1;

You now have a matrix of 0's and 1's, with adj(i,j) = 1 meaning there is a link between i and j.

With this adjacency matrix, you can calculate transitions. consider (adj)^N where ^ is matrix power, adj^N being adj*adj*adj...*adj a total of N times. In that adj^N, if there is a non-zero entry at (i,j) then there is a way to get from i to j in N steps.

An advantage of this representation is that it allows you to perform your searches in parallel.

It has been a good 30 years since I read the algorithm to determine cut-edges from adjacency graphs. Scanning some literature it looks like perhaps it isn't trivial.

I think at this point it is worth confirming your definition of "cluster". Does a cluster occur exactly when there is a single cut-edge that partitions the graph into two pieces? If so, what if there are multiple cut-edges? Do you prefer to define in terms of cut-vertices? If there are two sub-graphs, then what is the maximum number of cut-edges permitted between the two sub-graphs before you determine that they are part of the same cluster?

I worked on data classification for several years. The data we were given did not have explicit connectivity information, just high-dimensional coordinates (e.g., concentration measurements for 500 different chemicals). The algorithms we used to cluster were never able to assign restricted adjacency graphs that could say that there was no connection between any two given vertices; they could only find probabilities that a vertex was part of a cluster with particular attributes, such as finding the hyperplane that best divided the vertices into two clusters.

In this forum we used to semi-regularly get questions about how to determine the "right" number of clusters to ask a routine such as k-means to divide the data into. The answer to that is that the best classification is always achieved by placing every node into its own cluster. For any lesser number of clusters to be considered "better" you needed to define a penalty function that encouraged fewer clusters -- and that penalty function had to be problem-dependent.

If you are lucky enough to have a situation with an actual adjacency list, then before we can tell you how many clusters there are, you need to give us a formal definition of how to tell whether whether two vertices are in different clusters. Because anything beyond there being a single cut-edge or single cut-vertex that partitions the graph is no longer obvious and is probably starting to get problem dependent.

2 个评论
显示无隐藏无

nitin arora 2015-5-19

编辑：Walter Roberson 2015-5-19

在 MATLAB Online 中打开

Thanks Walter. I have a set of coordinates of object A and B in a given space. I found the distance of all points from each other, i.e A-A, A-B, B-B distance using pdist function. If the distance between two objects i(A or B) and j (A or B) is less than a threshold(manually set), then i connect i and j using a line segment on a figure. So, my definition of cluster is all the points that are connected by line segments. I am running 3 loops(A-A, A-B, B-B) for connecting i and j. In the same loop, I recorded i and j object index values in a connectivity matrix such that each row represents a connection between i and j. I wrote the following BFS(breadth first search) code but it has several limitations such as

Identifying the last node in a cluster: It does not detect 5 as a part of 1st cluster
Repeats some index(It repeats the index 4)

Please advice. I am really stuck and I am on a tight deadline to figure this out.

clc;
clear all;
close all;
conncomp=[1 2; 2 3; 2 4; 3 4; 3 5];
%// Find all possible node IDs
%//Nitin finds unique number in an array without repition
nodeIDs = unique(conncomp(:)); 
%// Flag that tells us if there are any nodes we should visit
nodeIDList = false(1,numel(nodeIDs));
%// Stores our list of clusters
clusterList = {};
%// Keeps track of how many clusters we have
counter = 1;
%// queue - initialize to empty-keeps track of the connected nodes
queue = [];
nodescluster=[];
while (~all(nodeIDList))    
    % Find any node and make it currently working node
      currentnode = find(nodeIDList == false, 1);
      % Initialize our queue to contain this node and add this node to
      % cluster
      queue=currentnode;
      nodescluster=currentnode;
      %// While our queue is not empty; iteration stops when queue is empty
      %and all the nodes have been visited
      while (~isempty(queue))
          % Grab the current node from previous iteration and remove from
          % queue
           queue(1)=[];    
           if (nodeIDList(currentnode))
              continue;
           end
          %// Mark the current node as visited 
          nodeIDList(currentnode) = true;
   %// Remove those already visited
          visitedNodes = nodeIDList(true);
         UnvistitedNodes = nodeIDList(false);
          %// Retrieve all nodes connected to the current node
          connectedNodes = conncomp(conncomp(:,1) == currentnode, 2);
          queue = [queue,connectedNodes.'];
         nodescluster = [nodescluster,connectedNodes.'];
          %move to one of these connected nodes and set it as current node
          if(numel(queue)~= 0)
          currentnode=queue(1);
          end
      end
      %// Add connected components to its own cluster
      clusterList{counter} = nodescluster;
      counter = counter + 1;
  end
celldisp(clusterList)

Walter Roberson 2015-5-19

tagging myself

请先登录，再进行评论。

Answer 2

Alfonso Nieto-Castanon 2015-5-20

编辑：Alfonso Nieto-Castanon 2015-5-20

在 MATLAB Online 中打开

0 个投票

See help dmperm

If I am interpreting correctly, something like the following might be what you are after:

 c = [1 2; 2 3; 2 4; 3 4; 3 5;4 1; 2 6 ;2 7]; % edges
 n = max(c(:));                             % number of nodes
 C = sparse(c(:,1),c(:,2),1,n,n);           % adjacency matrix
 C = C|C'|speye(n);                         % forces symmetry
 [idx,~,p] = dmperm(C);                     % magic
 clusters = mat2cell(idx,1,diff(p));        % clusters
 N = numel(clusters);                       % # of clusters
 s = cellfun('length',clusters);            % # of nodes per cluster

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

请先登录，再进行评论。

Answer 3

Shan Sham 2018-3-2

0 个投票

Now only starts work with matlab. Am a mathematics student. I need a algorithm for the following Question. Let G and H be two directed graph with V(G)=V(H). Let S and T be a source and sink of the directed graph G. Adj(G) and Adj(H) denotes the adjacency matrices of G and H. I want to find out a path from a vertex of S to a vertex of T in such a way that the first arc must be in G and second arc in H and also the last arc must be in G. Suppose $P:x_1,x_2,\dots,x_i $ is a desired path. Then x_1 \in S and x_i \in T. {x_1 x_2, x_3 x_4,...,x_{i-1}x_i} \subseteq A(G) and {x_2 x_3, x_4 x_5,...,x_{i-2}x_{i-1}} \subseteq A(H). For the below graphs, S={1,2} and T={5,3}. Example of a such a desired path is * 2145* here 2 is in s and 5 is in T and 21,45 is in G and 14 is in H.

Help me to find a efficient algorithm using BFS.

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

请先登录，再进行评论。

Implementing Breadth first search using connectivity matriX

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

回答（3 个）

2 个评论
显示无隐藏无

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

类别

产品

标签

Community Treasure Hunt

Implementing Breadth first search using connectivity matriX

0 个评论 显示 -2更早的评论 隐藏 -2更早的评论

回答（3 个）

2 个评论 显示 无 隐藏 无

0 个评论 显示 -2更早的评论 隐藏 -2更早的评论

0 个评论 显示 -2更早的评论 隐藏 -2更早的评论

类别

产品

标签

另请参阅

Community Treasure Hunt

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

2 个评论
显示无隐藏无

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

0 个评论
显示 -2更早的评论隐藏 -2更早的评论