[solved] join_cluster fails on cluster setup

Hi,

I have two ejabberd nodes, ejabberd@node01 and ejabberd@node02. They were previously running in a cluster. I made the first one leave the cluster and then tried joining again.

Unfortunately, it fails with these error messages:

Using passed parameter for remote master node name: ejabberd@node02
Using commands:
/opt/ejabberd/sbin/ejabberdctl
/usr/bin/erl
/usr/bin/erlc

{error_logger,{{2015,10,24},{16,52,8}},"Protocol: ~tp: register error: ~tp~n",["inet_tcp",badarg]}
{error_logger,{{2015,10,24},{16,52,8}},crash_report,[[{initial_call,{net_kernel,init,['Argument__1']}},{pid,<0.20.0>},{registered_name,[]},{error_info,{exit,{error,badarg},[{gen_server,init_it,6,[{file,"gen_server.erl"},{line,322}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,239}]}]}},{ancestors,[net_sup,kernel_sup,<0.10.0>]},{messages,[]},{links,[<0.17.0>]},{dictionary,[{longnames,false}]},{trap_exit,true},{status,running},{heap_size,610},{stack_size,27},{reductions,273}],[]]}
{error_logger,{{2015,10,24},{16,52,8}},supervisor_report,[{supervisor,{local,net_sup}},{errorContext,start_error},{reason,{'EXIT',nodistribution}},{offender,[{pid,undefined},{name,net_kernel},{mfargs,{net_kernel,start_link,[['ejabberd@node01',shortnames]]}},{restart_type,permanent},{shutdown,2000},{child_type,worker}]}]}
{error_logger,{{2015,10,24},{16,52,8}},supervisor_report,[{supervisor,{local,kernel_sup}},{errorContext,start_error},{reason,{shutdown,{failed_to_start_child,net_kernel,{'EXIT',nodistribution}}}},{offender,[{pid,undefined},{name,net_sup},{mfargs,{erl_distribution,start_link,[]}},{restart_type,permanent},{shutdown,infinity},{child_type,supervisor}]}]}
{error_logger,{{2015,10,24},{16,52,8}},crash_report,[[{initial_call,{application_master,init,['Argument__1','Argument__2','Argument__3','Argument__4']}},{pid,<0.9.0>},{registered_name,[]},{error_info,{exit,{{shutdown,{failed_to_start_child,net_sup,{shutdown,{failed_to_start_child,net_kernel,{'EXIT',nodistribution}}}}},{kernel,start,[normal,[]]}},[{application_master,init,4,[{file,"application_master.erl"},{line,133}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,239}]}]}},{ancestors,[<0.8.0>]},{messages,[{'EXIT',<0.10.0>,normal}]},{links,[<0.8.0>,<0.7.0>]},{dictionary,[]},{trap_exit,true},{status,running},{heap_size,376},{stack_size,27},{reductions,117}],[]]}
{error_logger,{{2015,10,24},{16,52,8}},std_info,[{application,kernel},{exited,{{shutdown,{failed_to_start_child,net_sup,{shutdown,{failed_to_start_child,net_kernel,{'EXIT',nodistribution}}}}},{kernel,start,[normal,[]]}}},{type,permanent}]}
{"Kernel pid terminated",application_controller,"{application_start_failure,kernel,{{shutdown,{failed_to_start_child,net_sup,{shutdown,{failed_to_start_child,net_kernel,{'EXIT',nodistribution}}}}},{kernel,start,[normal,[]]}}}"}

The network configuration of the machines is still the same as when they were clustered, so I'm not sure what the "inet_tcp",badarg could mean. Both nodes are running ejabberd 15.09.

Hi, I have the same problem.

I installed ejabberd-15.09-linux-x86_64-installer.run on Debian (jessie). I tried to cluster the nodes, but could not.
Has anyone got clustering working with ejabberd 15.09?
I followed these steps:

- Installed Erlang (18) and ejabberd
- Created the ejabberd.cfg file
- In ejabberdctl.cfg (locally): set INET_DIST_INTERFACE={10,0,0,1} and the Erlang node name
- Used the same Erlang cookie (/root/.erlangcookie)
I tried both the join_cluster and joincluster commands:
/opt/ejabberd-15.09/bin/ejabberdctl join_cluster 'ejabberd@server1'
/opt/ejabberd-15.09/bin/joincluster 'ejabberd@server1'
It returns:

--------------------------------------------------------------------

ejabberd cluster configuration

This ejabberd node will be configured for use in an ejabberd cluster.
IMPORTANT: all local data from the database will be lost, and
cluster database will be initialized. All data from the master
node will be replicated to this one.

--------------------------------------------------------------------
Press any key to continue, or Ctrl+C to stop now

Using passed parameter for remote master node name: ejabberd@70_slave
Using commands:
/opt/ejabberd-15.09/bin/ejabberdctl.

I also tried easy_cluster.erl (https://raymii.org/s/tutorials/Set_up_a_federated_XMPP_Chat_Network_with...), but that does not work either.

Can somebody please help me?

What user are you running ejabberd as? Can you try running that command with:

sudo -u -i /opt/ejabberd-15.09/bin/joincluster 'ejabberd@server1'

I installed and run ejabberd as the root user. I use Debian, and sudo -u -i does not work.
I ran this command (/opt/ejabberd-15.09/bin/joincluster 'ejabberd@server1') as root.
It returns:

ejabberd cluster configuration

This ejabberd node will be configured for use in an ejabberd cluster.
IMPORTANT: all local data from the database will be lost, and
cluster database will be initialized. All data from the master
node will be replicated to this one.

--------------------------------------------------------------------
Press any key to continue, or Ctrl+C to stop now

Using passed parameter for remote master node name: ejabberd@69_slave
Using commands:
/opt/ejabberd-15.09/bin/ejabberdctl

I think your problem (script exiting after showing the path of ejabberdctl) is caused by the "exec" command used in the joincluster script to call ejabberdctl. I removed it and the joincluster script continues, unfortunately resulting in the errors above in my case.
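For readers unfamiliar with this behavior: in a POSIX shell, exec replaces the shell process with the given command, so nothing after it in the script runs. A minimal, generic illustration follows. It is not the joincluster script itself, and the echo strings are made up for the demo.

```shell
#!/bin/sh
# Generic demo of why a script stops after "exec" (not taken from joincluster).

# With exec: the inner shell is replaced by "true", so the echo never runs.
out_exec=$(sh -c 'exec true; echo "after exec"')

# Without exec: "true" runs as a child process and the script continues.
out_plain=$(sh -c 'true; echo "after plain call"')

echo "with exec:    [$out_exec]"
echo "without exec: [$out_plain]"
```

This matches the symptom above: the script prints the path of ejabberdctl and then appears to stop, because control never returns to it.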

It's running under the "ejabberd" user (ejabberd was compiled manually with "./configure --prefix=/opt/ejabberd --enable-user=ejabberd"). I also tried re-joining the cluster as this user.

The "joincluster" script is not in /opt/ejabberd/bin for me, but somewhere under /opt/ejabberd/lib.

Just upgraded both nodes to 15.10, same problem. Can you give me any tips on how to investigate that "Protocol: ~tp: register error: ~tp~n",["inet_tcp",badarg] error?

I found the problem: in our case, ejabberdctl.cfg contains the same IP address for INET_DIST_INTERFACE and ERL_EPMD_ADDRESS. This seems to be a problem for the join_cluster script. I switched INET_DIST_INTERFACE back to its default value, which allowed the join_cluster script to run, and then restored the original value afterwards.
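A quick way to check for this condition before running join_cluster could look like the sketch below. The helper names cfg_value and same_dist_and_epmd_address are hypothetical, the path to ejabberdctl.cfg is supplied by the caller, and the sketch assumes a single address per variable, written either as {10,0,0,1} or as 10.0.0.1.

```shell
#!/bin/sh
# Hypothetical helpers: detect whether INET_DIST_INTERFACE and ERL_EPMD_ADDRESS
# are set to the same IP address in an ejabberdctl.cfg-style file.

# Print the normalized value of the last uncommented KEY=value line.
# $1 = key, $2 = config file
cfg_value() {
    sed -n "s/^[[:space:]]*$1=//p" "$2" | tail -n 1 | tr -d '{}" ' | tr ',' '.'
}

# Return 0 if both variables are set and point to the same address.
# $1 = config file
same_dist_and_epmd_address() {
    dist=$(cfg_value INET_DIST_INTERFACE "$1")
    epmd=$(cfg_value ERL_EPMD_ADDRESS "$1")
    [ -n "$dist" ] && [ "$dist" = "$epmd" ]
}
```

For example, `same_dist_and_epmd_address /path/to/ejabberdctl.cfg && echo "same address"`. If it reports a match, temporarily commenting out INET_DIST_INTERFACE while join_cluster runs is the workaround that helped in this thread.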
