Error adding new node to Gluster cluster


While working on a Gluster project, I ran into an issue adding a third node to an existing cluster running on Oracle Linux 8. This was very frustrating, as the cluster was otherwise working fine. I went ahead and built a new cluster in my home lab to see if I could replicate the issue.

Even in the home lab, I was unable to add nodes to the cluster! The gluster peer probe failed:


[root@gluster1 ~]# gluster peer probe gluster3
peer probe: failed: gluster3 is either already part of another cluster or having volumes configured

After hitting the same problem in two different environments, I went through some troubleshooting steps, including checking SELinux, checking the firewall ports, and completely rebuilding the third node. No joy!

I started digging through log files… and saw this error in /var/log/glusterfs/glusterd.log every time I tried to add the node:

 

[2023-09-04 17:22:30.130617] E [socket.c:2253:__socket_read_frag] 0-rpc: wrong MSG-TYPE (1728250847) received from 192.168.200.110:49151
[2023-09-04 17:22:38.710707] E [socket.c:2253:__socket_read_frag] 0-rpc: wrong MSG-TYPE (1728250739) received from 192.168.200.110:49151
[2023-09-04 17:22:39.433087] E [socket.c:2253:__socket_read_frag] 0-rpc: wrong MSG-TYPE (1728250628) received from 192.168.200.110:49151

Hmm… odd error. What could cause this? Wrong message type… bad packet…

Then it hit me like a brick the next day! I had encrypted the cluster, and the third node didn’t have its .pem file added to the list of CAs! So I quickly appended the .pem file from gluster3 to /etc/ssl/glusterfs.ca on all nodes of the cluster. And BAM! It worked!
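
A minimal sketch of that append step, written so it is safe to re-run on each node. It is not necessarily the exact command used here: `append_cert_to_ca` is a hypothetical helper (not part of Gluster), `/tmp/gluster3.pem` is an assumed staging path for the copied certificate, and the duplicate check assumes the certificate file begins with a `-----BEGIN CERTIFICATE-----` line.

```shell
#!/usr/bin/env bash
# Sketch only: idempotently append a node's certificate to a CA bundle.
# append_cert_to_ca is a hypothetical helper, not a Gluster command.

# append_cert_to_ca CERT_FILE CA_FILE
append_cert_to_ca() {
    local cert="$1" ca="$2" marker

    if [ ! -s "$cert" ]; then
        echo "certificate missing or empty: $cert" >&2
        return 1
    fi

    # Use the first line after the BEGIN line to recognise this certificate.
    # Assumption: CERT_FILE starts with "-----BEGIN CERTIFICATE-----".
    marker=$(sed -n '2p' "$cert")
    if [ -z "$marker" ]; then
        echo "unexpected certificate format: $cert" >&2
        return 1
    fi

    touch "$ca"
    if grep -qF -- "$marker" "$ca"; then
        echo "already present in $ca"
        return 0
    fi

    # Make sure the bundle ends with a newline so PEM blocks do not run together.
    if [ -n "$(tail -c1 "$ca")" ]; then
        echo >> "$ca"
    fi
    cat "$cert" >> "$ca"
    echo "appended $cert to $ca"
}

# Example (run on every node; /tmp/gluster3.pem is an assumed path):
#   append_cert_to_ca /tmp/gluster3.pem /etc/ssl/glusterfs.ca
```

Checking before appending keeps the bundle from accumulating duplicate entries when the step is repeated across nodes.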

[root@gluster1 ~]# gluster peer probe gluster3
peer probe: success

So, why? What happened?

In all my clusters I enable encryption from day one. This encrypts the traffic between the cluster nodes, adding security, especially when running clusters in the cloud, where you never know who might be listening to your network traffic. This worked great when the clusters were built, but when I added a node, its traffic wasn’t being decrypted correctly: gluster1 was sending encrypted packets to gluster3, but gluster3 could not decrypt them. This was because gluster3 didn’t have the .pem file from gluster1 or gluster2 (the .pem file stores cryptographic keys and certificates). As a result, glusterd could not decrypt the packet and reported it as a bad packet.

Once I added the .pem file to all the nodes, every node could decrypt the messages, putting everything back to normal!
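
One way to check for this situation before probing is to ask OpenSSL whether a node's certificate validates against the local CA bundle. The sketch below makes several assumptions: the nodes use self-signed certificates, the certificate being checked has been copied to the local node (the `/tmp/gluster3.pem` path is hypothetical), and a reasonably recent `openssl` is installed. `cert_trusted_by` is an illustrative helper, not a Gluster command.

```shell
#!/usr/bin/env bash
# Sketch only: check whether a certificate validates against a CA bundle.

# cert_trusted_by CERT_FILE CA_FILE
# Returns 0 when CERT_FILE validates against the certificates in CA_FILE,
# non-zero otherwise (including when either file cannot be read).
cert_trusted_by() {
    local cert="$1" ca="$2"
    openssl verify -CAfile "$ca" "$cert" >/dev/null 2>&1
}

# Example (paths other than /etc/ssl/glusterfs.ca are assumptions):
#   if cert_trusted_by /tmp/gluster3.pem /etc/ssl/glusterfs.ca; then
#       echo "gluster3 certificate is trusted on this node"
#   else
#       echo "gluster3 certificate is NOT in this node's CA bundle"
#   fi
```

Running the check on every node, for every node's certificate, would show which bundle is missing an entry.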

As a note to the Gluster developers: PLEASE add better error handling. An error message in glusterd.log that said it was a key error would have been very helpful!
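
Until the log says so directly, a small wrapper can at least flag the pattern. This is a minimal sketch: `tls_hint` is a hypothetical helper, the match string is taken from the glusterd.log lines shown earlier, and the hint reflects only the cause found in this post. Other problems might produce the same message.

```shell
#!/usr/bin/env bash
# Sketch only: flag "wrong MSG-TYPE" errors and suggest checking the CA bundle.

# tls_hint LOG_FILE
# Prints a hint and returns 0 if LOG_FILE contains "wrong MSG-TYPE" errors;
# returns 1 if there are none or the file cannot be read.
tls_hint() {
    local log="$1" count
    # grep -c prints the number of matching lines (0 when there are none).
    count=$(grep -c 'wrong MSG-TYPE' "$log" 2>/dev/null)
    if [ "${count:-0}" -gt 0 ]; then
        echo "$count 'wrong MSG-TYPE' error(s) in $log"
        echo "Hint: if encryption is enabled, check that every node's certificate is in /etc/ssl/glusterfs.ca on every node."
        return 0
    fi
    return 1
}

# Example:
#   tls_hint /var/log/glusterfs/glusterd.log
```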

 

Author: admin

Erik is currently an Oracle ACE Director and VP of Enterprise Transformation at Mythics, serving as a lead strategist for Federal, State and Local Government and Commercial customers throughout the United States. These customer engagements include enterprise cloud transformations, data center consolidation and modernization efforts, Big Data projects and implementations of Oracle Engineered Systems. He is a board member of the DC metro area National Capital Oracle User Group, a board member of the Independent Oracle Users Group (IOUG), Cloud Computing Special Interest Group (SIG) and he is actively involved with the Oracle Enterprise Manager SIGs. Erik presents frequently at conferences, including Oracle OpenWorld, Oracle FedForum, COLLABORATE and other user groups and conferences around the United States. He has worked with Oracle and Sun Systems since the mid 90s, and is experienced with most of the core Oracle technologies.

When not flying to the far points of the country from the Atlanta Metro area, he enjoys spending time with his family at their observatory, where the telescopes outnumber the people.
