single domain served by a cluster

A single domain can be served by a cluster of ejabberd nodes.

I had no problem getting a cluster to work with 2 different Erlang nodes: ejabberd1@first and ejabberd2@second,
where first and second are the hostnames of 2 different servers. But when I tried to use the same hostname,
I didn't see any other node in the web admin interface. The cluster doesn't work:
only one db node is running. I haven't seen much documentation for that kind of cluster.

Any idea how to configure the ejabberd master and slave nodes so that the cluster serves a single domain?
Is the clustering process the same as in http://www.process-one.net/en/ejabberd/guide_en#htoc20 ?

machine hostname != erlang node name != ejabberd hosted names

Let's see if I understood your plan correctly, and how to solve it.

The 'machine hostname' and the 'Jabber host name' can be completely different if you want.

You have two ejabberd nodes, one installed in a machine called "first", and the other in a machine called "second".
The Erlang node of ejabberd in the first machine is called "ejabberd1@first", and in the second machine it's called "ejabberd2@second".

You configured Mnesia in those nodes to work as a cluster.

You want both nodes in this cluster to serve the Jabber domain "example.com".
In each machine you have an ejabberd.cfg. In both files you put the same:

{hosts, ["example.com"]}.

When a Jabber client tries to connect to the Jabber server "example.com", the client will try to connect to the machine that DNS points to. So, if your DNS says that "example.com" is hosted in "first", the Jabber client will connect to "first" and log in to ejabberd.

The user can also configure the Jabber client to connect to "second" and log in to the domain "example.com". For example in Psi's Account Properties:

  • Account -> JabberID: bob@example.com
  • Connection -> Manually Specify Server Host/Port, Host: "second"

If you want clients to connect randomly to first or second when they log in to example.com, you need to configure your DNS server. This is outside the scope of ejabberd. The ejabberd in each of your machines will serve the clients that reach it. It is up to the clients (or your DNS server) to direct connections to either machine first or second.

cluster fall over

Thanks for your explanations. It's clear to me now.

I have used a load-balancer to test an ejabberd cluster.
I started with a basic configuration, using ejabberd's internal authentication, on my two servers: server1 & server2.
My Erlang nodes are: ejabberd1@server1 & ejabberd2@server2.
My ejabberd domain (hostname): mydomain.com

To do that, on both servers I changed the last line of ejabberdctl.cfg and wrote the Erlang node name there.
I also wrote the Erlang node and host in the ejabberdctl script in sbin.
I changed these lines in the ejabberd.cfg file:

{hosts, ["mydomain.com"]}.
{host_config, "mydomain.com", [{auth_method, [internal, anonymous]}]}.
{acl, admin, {user, "test", "mydomain.com"}}.
{host, "muc.mydomain.com"},

Then I followed the process to cluster the second node, and I didn't replicate any Mnesia DB.
The cluster worked. Two Jabber clients could talk to each other, and when a node went down, the clients switched to the other node. In the web admin I could see the two Erlang nodes and the ejabberd virtual host mydomain.com.

The problem now is that the cluster falls down. The second node disconnects itself.
Should I replicate some Mnesia DB?
Must I change some lines in my configuration of ejabberd.cfg?

Replicate tables so nodes can work independently

badarg wrote:

I didn't replicate any Mnesia DB.

The problem now is that the cluster falls down. The second node disconnects itself.

The Mnesia database in the second node gets all the information continuously from the first node: every user login, every roster, offline messages, vcards... When the first node is down, the second cannot access that information.

badarg wrote:

Should I replicate some Mnesia DB?

Yes, you should tell the Mnesia database that runs in the second node to replicate the tables that you consider important from the first node. Probably the interesting ones are: passwd, roster, offline_msg, last_activity, private_storage, vcard, privacy.

badarg wrote:

Must I change some lines in my configuration of ejabberd.cfg?

No. Check the ejabberd Guide: it explains how to tell the Mnesia database of the second node to replicate those tables.
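The guide remains the reference for this step. As a rough sketch only: Mnesia's own call for adding a replica is mnesia:add_table_copy(Table, Node, StorageType), evaluated in an Erlang shell on the second node once it has joined the cluster. The small Python helper below just prints one such call per table from the list above. The helper name and the choice of disc_copies as storage type are assumptions made for illustration; pick the storage type that suits each table.

```python
# Tables named in the reply above.
TABLES = ["passwd", "roster", "offline_msg", "last_activity",
          "private_storage", "vcard", "privacy"]


def add_table_copy_calls(tables, storage_type="disc_copies"):
    """Build one Erlang shell expression per table.

    storage_type is an assumption: disc_copies keeps the replica both in
    RAM and on disc; ram_copies and disc_only_copies are the alternatives.
    """
    return [
        f"mnesia:add_table_copy({table}, node(), {storage_type})."
        for table in tables
    ]


if __name__ == "__main__":
    # Paste the printed lines into an Erlang shell on the second node.
    for call in add_table_copy_calls(TABLES):
        print(call)
```

Each call returns {atomic, ok} on success or {aborted, Reason} otherwise, for example when the table does not exist because the corresponding ejabberd module is not enabled.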

erlang nodes disconnect themselves in cluster

As a test, I have replicated all the tables of the Mnesia DB.
But the second Erlang node still disconnects itself (as seen in the web admin interface of the master node).
(On both servers ejabberd still starts, and mnesia:info(). says that the other db node is stopped.)
In the best case, the cluster worked for one day.

Any idea of what could be wrong in my configuration?

network issues can cause ejabberd nodes to disconnect

I've seen cases where network issues can cause ejabberd nodes to disconnect from each other. So, if this is happening a lot to you, then perhaps you have a poor network connection between the nodes.

I wrote a script that monitors mnesia:info() and restarts one of the nodes if they disconnect from each other.

pls show the script

zjt wrote:

Submitted by zjt on Tue, 2009-02-24 00:53.

I wrote a script that monitors mnesia:info() and restarts one of the nodes if they disconnect from each other.

Please show the script; we also have the same problem. I am not able to reconnect the node when it disconnects from the network.

Please help me.

regards

alagar

Basically, here is a command

Basically, here is a command you can use to determine if the current node believes the other node is stopped.

echo "rpc:call($ERLANG_NODE, mnesia, info, [])." | exec erl -name dbinfo_$THIS_HOSTNAME

and you can parse that output with this perl code

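  # Returns the nodes that this node reports as stopped.
  # @mnesia_info holds the output lines of the command above.
  sub stopped_nodes {
      my ($this_node, @mnesia_info) = @_;
      for my $line ( @mnesia_info ) {
          if ( $line =~ m/stopped db nodes\s+=\s+\[(.*?)\]/ ) {
              my @stopped_nodes = ();
              my $list = $1;
              # Node names may be printed with or without single quotes.
              while ( $list =~ /'([^']+)'|([^\s,']+)/g ) {
                  push @stopped_nodes, defined $1 ? $1 : $2;
              }
              return @stopped_nodes;
          }
      }
      return $this_node;  # no "stopped db nodes" line: this node must be down
  }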

Here is some perl code that verifies that the other node is listening on port 5222, which is a good indication that it is still serving client traffic.

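  use IO::Socket::INET;
  use Sys::Syslog qw(:standard :macros);

  # Returns 1 if $master_server accepts TCP connections on port 5222.
  sub master_available {
      my ($master_server, $commandtimeout) = @_;
      my $available = 0;
      eval {
          # Turn the alarm into an exception so that eval can catch the timeout.
          local $SIG{ALRM} = sub { die "timed out\n" };
          alarm $commandtimeout;
          my $sock = IO::Socket::INET->new(
              PeerAddr => $master_server,
              PeerPort => 5222,
              Proto    => 'tcp',
          ) or die "could not connect: $!";
          $available = 1;
          close $sock;
          alarm 0;
      };
      if ( $@ ) {
          syslog(LOG_ERR, "unable to connect to master node: %s", $@);
          alarm 0;
      }
      return $available;
  }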

At that point, you can stop or restart the node as you see fit.

sorry how to run this script in windows

hi zjt,

How can I run this script on my Windows XP system? I am running my two clustered ejabberd nodes on XP. Please tell me the steps to run the command on my system.

I don't know Perl scripting.

Please help me.

regards,

alagar

I don't know. You should be

I don't know. You should be able to run the "erl dbinfo" command to get the status of the running/stopped nodes, but you'll have to experiment to find out exactly how to do that on windows.

The perl code is just used to parse the output of the "erl dbinfo" command, and then probe to see if the other ejabberd node is running, and then stop/restart the node. I just used perl because it is my language of choice. You could install perl on windows. Or you could use another language, or even erlang.
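As a rough illustration of the "another language" option, here is a minimal Python sketch of the same two checks: extracting the stopped nodes from the mnesia:info() output, and probing port 5222 on the other node. The function names, the default timeout and the sample text are assumptions made for this example; they are not part of the original script.

```python
import re
import socket


def stopped_db_nodes(mnesia_info_text):
    """Return the node names on the "stopped db nodes" line.

    Returns None when that line is missing, which the Perl version
    treats as "this node itself is down".
    """
    match = re.search(r"stopped db nodes\s*=\s*\[(.*?)\]",
                      mnesia_info_text, re.S)
    if match is None:
        return None
    # Node names may or may not be wrapped in single quotes.
    return [name.strip().strip("'")
            for name in match.group(1).split(",")
            if name.strip()]


def port_is_open(host, port=5222, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Sample text shaped like the relevant mnesia:info() lines (assumed).
    sample = (
        "running db nodes   = ['ejabberd1@server1']\n"
        "stopped db nodes   = ['ejabberd2@server2'] \n"
    )
    print(stopped_db_nodes(sample))  # ['ejabberd2@server2']
```

A monitoring loop would feed the real command output to stopped_db_nodes(), call something like port_is_open("server2") for each stopped node, and then decide whether to stop or restart the local node.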

Ultimately, this capability should be built into ejabberd directly. However, I am not a savvy erlang/ejabberd developer.

ok ths zit

thanks for your help
