Connecting to MDCE Services on Worker Nodes from Head Node with MDCS

Environment: Red Hat Enterprise Linux 6.0
I am trying to get the mdce service on the worker nodes to communicate with the mdce service on the head node. When I run the admincenter on ANY node, I can connect to every node, but I can only see the number of cores on the node I am currently on. Further, the MDCE Status reads "running" on the current node and "unavailable" on all other nodes. When I attempt to start the MDCE Service, I receive:
"Error on machine cerebro:
The MATLAB Distributed Computing Server is already running.
Use nodestatus to obtain more information."
This is expected, since the service is still running from a previous attempt. Stopping and restarting the services does not help.
When I run
nodestatus -remotehost <currentnode>
everything looks fine. When I run
nodestatus -remotehost <anyothernode>
I receive a series of java exceptions that ends with
"java.net.NoRouteToHostException: No route to host"
The lack of connectivity with nodestatus and the GUI occurs whether I use the computer aliases or the local IP addresses or the remote IP addresses.
I have confirmed that all nodes can communicate with each other using ping and traceroute. In addition, I have confirmed that ports 27350 through 27355 are open on all nodes.
All services are being run as root.
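
Since ping and traceroute only show that ICMP gets through, a direct TCP connect to each port can help tell a closed port from a firewall rejection. The following is a minimal Python sketch, not part of the original post: the function name `probe_port` is made up for illustration, the port range is the one quoted above, and the mapping of "no route to host" to a firewall REJECT rule is a common cause, not a certainty.

```python
import errno
import socket
import sys

# Port range quoted in the question; adjust it if your nodes use other ports.
MDCE_PORTS = range(27350, 27356)


def probe_port(host, port, timeout=3.0):
    """Attempt one TCP connect and return a short label for the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        return "timeout (packets are probably being dropped)"
    except ConnectionRefusedError:
        return "refused (host reachable, nothing listening on this port)"
    except socket.gaierror:
        return "name resolution failed"
    except OSError as exc:
        if exc.errno == errno.EHOSTUNREACH:
            # Same condition Java reports as NoRouteToHostException.
            return "no route to host (often a firewall REJECT rule)"
        return "error: %s" % exc


if __name__ == "__main__":
    # Usage: python probe.py <node> [<node> ...]
    for host in sys.argv[1:]:
        for port in MDCE_PORTS:
            print("%s:%d -> %s" % (host, port, probe_port(host, port)))
```

Running it from the head node against each worker, and again in the opposite direction, shows whether the problem is one-sided.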
3 Comments
Jason Ross 2012-4-12
On the worker, block all ports and punch holes on ports 27350-27357 from the jobmanager.
On the clients, block all ports and punch holes on ports 27370-27375 from the jobmanager.
On the jobmanager, block all ports and punch holes on ports 27350-27355 from all workers, also block all ports and punch holes on ports 27350-27355 from the clients.
Generally the iptables command looks something like this:
iptables -A INPUT -p tcp --source source.hostname.here ! --dport 27370:27375 --syn -j REJECT --reject-with icmp-host-prohibited
Keep in mind that the above is only an example -- you'll likely need to tailor this to your own environment.
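
To keep the per-role port ranges straight, the rule text can be generated instead of typed by hand. This is a rough Python sketch that only builds command strings in the shape of the example above and runs nothing. The helper names `reject_rule` and `rules_for_node` and the host names in the usage line are hypothetical; the port ranges are the ones listed in this comment.

```python
# Port ranges per role, as listed in the comment above.
PORTS = {
    "worker": (27350, 27357),
    "client": (27370, 27375),
    "jobmanager": (27350, 27355),
}


def reject_rule(source, port_range):
    """Build one iptables command string; it is returned, never executed."""
    low, high = port_range
    # The "!" is placed before --dport; newer iptables releases deprecate
    # the form where the "!" follows the option name.
    return (
        "iptables -A INPUT -p tcp --source {src} ! --dport {low}:{high} "
        "--syn -j REJECT --reject-with icmp-host-prohibited"
    ).format(src=source, low=low, high=high)


def rules_for_node(role, jobmanager, workers=(), clients=()):
    """Return the rule strings for a node playing the given role."""
    if role not in PORTS:
        raise ValueError("unknown role: %r" % role)
    if role == "jobmanager":
        # The job manager accepts traffic from every worker and client.
        sources = list(workers) + list(clients)
    else:
        # Workers and clients only punch holes for the job manager.
        sources = [jobmanager]
    return [reject_rule(src, PORTS[role]) for src in sources]


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    for rule in rules_for_node("jobmanager", "head", ["w1", "w2"], ["c1"]):
        print(rule)
```

Review the printed rules before applying any of them, since the order of rules in the INPUT chain matters.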
Alberto 2014-3-13
Hi,
Could you give more detail on what I should modify in iptables? I get the same error.


Accepted Answer

Jason Ross 2012-4-10
In the Admin Center, there is a "Test Connectivity" test under the "Hosts" menu. Does that come back clean?
It sounds like there is something missing in name resolution:
  • host resolving its own name
  • forward lookup
  • reverse lookup
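
The three name resolution checks listed above can be scripted so they are easy to repeat on every node. The sketch below is one possible way to do it in Python using only the standard `socket` module; the function name `check_name_resolution` and the "consistent" heuristic (the reverse name or one of its aliases matches the name you started with) are assumptions made for illustration.

```python
import socket
import sys


def check_name_resolution(hostname):
    """Run a forward and a reverse lookup and report whether they agree."""
    result = {"hostname": hostname, "forward": None,
              "reverse": None, "consistent": False}
    try:
        # Forward lookup: name -> IPv4 address.
        address = socket.gethostbyname(hostname)
    except socket.gaierror as exc:
        result["forward"] = "failed: %s" % exc
        return result
    result["forward"] = address
    try:
        # Reverse lookup: address -> primary name plus aliases.
        reverse_name, aliases, _ = socket.gethostbyaddr(address)
    except (socket.herror, socket.gaierror) as exc:
        result["reverse"] = "failed: %s" % exc
        return result
    result["reverse"] = reverse_name
    # Compare case-insensitively, with and without the domain part.
    names = {reverse_name.lower()} | {alias.lower() for alias in aliases}
    short_names = {name.split(".")[0] for name in names}
    wanted = hostname.lower()
    result["consistent"] = wanted in names or wanted.split(".")[0] in short_names
    return result


if __name__ == "__main__":
    # Usage: python resolve.py <node> [<node> ...]
    for name in sys.argv[1:]:
        print(check_name_resolution(name))
```

Passing the node's own name (for example the value of `socket.gethostname()`) covers the first bullet, and passing the other nodes' names covers the forward and reverse lookups.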
You can also pass the "-infolevel 2" flag to the nodestatus command. It will tell you the ports that the job manager and workers are using.
You might want to try turning off the firewalls temporarily to see if the port range is too restrictive or something else is "off". I've definitely encountered a fat-finger issue with iptables that caused issues similar to what you are seeing.
