Error adding new node to Gluster cluster

While working on a Gluster project, I ran into an issue adding a third node to an existing cluster running on Oracle Linux 8. This was very frustrating, as the cluster itself was working fine. I went ahead and built a new cluster in my home lab to see if I could replicate the issue.

Even in the home lab, I was unable to add nodes to the cluster! The gluster peer probe failed!

[root@gluster1 ~]# gluster peer probe gluster3
peer probe: failed: gluster3 is either already part of another cluster or having volumes configured



After hitting the same failure in two different environments, I went through some troubleshooting steps, including checking SELinux and firewall ports and completely rebuilding the third node. No joy!

I started digging through log files… and saw this error in /var/log/glusterfs/glusterd.log every time I tried to add the node:


17:22:30.130617] E [socket.c:2253:__socket_read_frag] 0-rpc: wrong MSG-TYPE (1728250847) received from
[2023-09-04 17:22:38.710707] E [socket.c:2253:__socket_read_frag] 0-rpc: wrong MSG-TYPE (1728250739) received from
[2023-09-04 17:22:39.433087] E [socket.c:2253:__socket_read_frag] 0-rpc: wrong MSG-TYPE (1728250628) received from
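
If you are chasing the same symptom, a quick way to confirm it is to count these errors in the log, run the probe again, and see whether the count goes up. The snippet below is a minimal sketch of that check, assuming the log path mentioned above; the function name `count_msgtype_errors` is made up for illustration and is not part of Gluster.

```shell
# Hypothetical helper (not part of Gluster): count "wrong MSG-TYPE" errors
# in a glusterd log. Defaults to the log path mentioned in this post.
count_msgtype_errors() {
    log_file="${1:-/var/log/glusterfs/glusterd.log}"
    # -c prints the number of matching lines, -F treats the pattern as a
    # fixed string. grep exits non-zero when nothing matches, so "|| true"
    # keeps a count of 0 from looking like a failure.
    grep -cF 'wrong MSG-TYPE' "$log_file" || true
}
```

Run `count_msgtype_errors` before and after a `gluster peer probe` attempt; if the number climbs with every attempt, the probe is what triggers the error.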




Hmm… odd error. What could cause this? Wrong message type… a bad packet?

Then it hit me like a brick the next day! I had encrypted the cluster, and the third node didn’t have its pem file added to the list of CAs! So I quickly appended the .pem file from gluster3 to /etc/ssl/ on all nodes of the cluster. And BAM! It worked!

[root@gluster1 ~]# gluster peer probe gluster3
peer probe: success
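
The fix boils down to making sure the CA list on every node contains the certificate of every node, including the new one. Here is a rough sketch of that step, assuming each node's certificate has already been copied into one directory as `<node>.pem`. The directory layout and both function names are hypothetical, chosen only for illustration. Gluster's documentation names `/etc/ssl/glusterfs.ca` as the default CA file, but check your own setup before copying anything into place.

```shell
# Hypothetical helpers, not part of Gluster.

# build_ca_bundle CERT_DIR OUT_FILE
# Concatenate every *.pem file in CERT_DIR into OUT_FILE.
# OUT_FILE should live outside CERT_DIR so it is not picked up by the glob.
build_ca_bundle() {
    cert_dir="$1"
    out_file="$2"
    : > "$out_file"                     # start from an empty bundle
    for cert in "$cert_dir"/*.pem; do
        [ -f "$cert" ] || continue      # skip if the glob matched nothing
        cat "$cert" >> "$out_file"
        # If the file did not end with a newline, add one so that
        # consecutive PEM blocks do not run together.
        [ -z "$(tail -c 1 "$cert")" ] || echo >> "$out_file"
    done
}

# bundle_has_cert BUNDLE CERT
# Succeeds if the text of CERT appears inside BUNDLE.
# This is a plain text comparison; it does not validate the certificate.
bundle_has_cert() {
    bundle_flat=$(tr -d '\r\n' < "$1")
    cert_flat=$(tr -d '\r\n' < "$2")
    [ -n "$cert_flat" ] || return 1     # an empty cert never counts as present
    case "$bundle_flat" in
        *"$cert_flat"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Running `bundle_has_cert` against the existing CA file with the new node's pem before probing would have flagged the missing certificate right away. The rebuilt bundle then has to be distributed to every node in the cluster, not just the new one.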



So, why? What happened?

In all my clusters I enable encryption from day one. This encrypts the traffic between the cluster nodes, adding security, especially when running clusters in the cloud, where you never know who might be listening to your network traffic. This worked great when the clusters were built, but when I added a node, its traffic wasn’t being decrypted correctly: gluster1 was sending encrypted packets to gluster3, but gluster3 could not decrypt them. This was because gluster3 didn’t have the pem file from gluster1 or gluster2 (a pem file stores cryptographic keys and certificates, including those of certificate authorities). As a result, glusterd could not decrypt the packet, so it reported it as a bad packet.

Once I added the pem file to all the nodes, every node could decrypt the messages, and everything went back to normal!

As a note to the Gluster developers: PLEASE add better error handling. An error message in glusterd.log that said it was a key error would have been very helpful!

