Can't get both instances to start, only one at a time

Environment:

-CentOS 5.4 X86_64
-Two servers ( jabber111 & jabber112 )
-Ejabberd 2.1.6, installed from binary
-Verified I can telnet between the machines, and from outside the firewall to ports 5280 & 5222. Internally I can also telnet to 4369
-Cookie set and chmod 600 in /root
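
For anyone who wants to repeat the port and cookie checks from the list above, here is a minimal sketch of how they could be scripted. The helper names port_open and cookie_ok are made up for illustration and are not part of ejabberd. It assumes bash with /dev/tcp support, GNU stat, and a coreutils that ships timeout (which may not be the case on a stock CentOS 5 box), and that the cookie lives in the usual $HOME/.erlang.cookie.

```shell
#!/usr/bin/env bash
# Sketch only: port_open and cookie_ok are hypothetical helper names.

# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
port_open() {
    local host=$1 port=$2
    timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Return 0 if the cookie file exists, is non-empty, and has mode 600.
cookie_ok() {
    local file=$1
    [ -s "$file" ] && [ "$(stat -c '%a' "$file")" = "600" ]
}
```

On jabber112 this could be used as `port_open jabber111 4369 && echo open`, and on each node as `cookie_ok /root/.erlang.cookie && echo ok`. It does not compare the cookie contents between the two servers, which would still need to be checked by hand.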

I start both instances, and they work independently, but the database is not syncing. I have been loosely basing my install on the guide posted here, but I have not had much luck so far. For the time being I am setting this up as root, since that's the only way I have EVER seen it work in a clustered config.

To start ejabberd I am running /opt/ejabberd-2.1.6/bin/ejabberdctl on a vanilla clustered install. The ps output is below, which may answer any environment questions specific to ejabberd:

root      5951  0.0  0.8 197632 51716 pts/0    Sl   19:19   0:01 /opt/ejabberd-2.1.6/bin/beam.smp -K true -P 250000 -- -root /opt/ejabberd-2.1.6 -progname /opt/ejabberd-2.1.6/bin/erl -- -home /root -name ejabberd@jabber112.orl.___.net -smp auto -noshell -noinput -noshell -noinput -mnesia dir "/opt/ejabberd-2.1.6/database/ejabberd@jabber112.orl.____.net" -s ejabberd -ejabberd config "/opt/ejabberd-2.1.6/conf/ejabberd.cfg" log_path "/opt/ejabberd-2.1.6/logs/ejabberd.log" -sasl sasl_error_logger {file,"/opt/ejabberd-2.1.6/logs/erlang.log"}

However, I cannot get both instances to start after syncing the databases.

If I start one, I get this in erlang.log:

                    {restart_type,permanent},
                       {shutdown,brutal_kill},
                       {child_type,worker}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
          supervisor: {local,ejabberd_sup}
             started: [{pid,<0.246.0>},
                       {name,ejabberd_sm},
                       {mfa,{ejabberd_sm,start_link,[]}},
                       {restart_type,permanent},
                       {shutdown,brutal_kill},
                       {child_type,worker}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
          supervisor: {local,ejabberd_sup}
             started: [{pid,<0.253.0>},
                       {name,ejabberd_s2s},
                       {mfa,{ejabberd_s2s,start_link,[]}},
                       {restart_type,permanent},
                       {shutdown,brutal_kill},
                       {child_type,worker}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
          supervisor: {local,ejabberd_sup}
             started: [{pid,<0.256.0>},
                       {name,ejabberd_local},
                       {mfa,{ejabberd_local,start_link,[]}},
                       {restart_type,permanent},
                       {shutdown,brutal_kill},
                       {child_type,worker}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
          supervisor: {local,ejabberd_sup}
             started: [{pid,<0.259.0>},
                       {name,ejabberd_captcha},
                       {mfa,{ejabberd_captcha,start_link,[]}},
                       {restart_type,permanent},
                       {shutdown,brutal_kill},
                       {child_type,worker}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
          supervisor: {local,ejabberd_sup}
             started: [{pid,<0.262.0>},
                       {name,ejabberd_receiver_sup},
                       {mfa,
                           {ejabberd_tmp_sup,start_link,
                               [ejabberd_receiver_sup,ejabberd_receiver]}},
                       {restart_type,permanent},
                       {shutdown,infinity},
                       {child_type,supervisor}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
          supervisor: {local,ejabberd_sup}
             started: [{pid,<0.263.0>},
                       {name,ejabberd_c2s_sup},
                       {mfa,
                           {ejabberd_tmp_sup,start_link,
                               [ejabberd_c2s_sup,ejabberd_c2s]}},
                       {restart_type,permanent},
                       {shutdown,infinity},
                       {child_type,supervisor}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
          supervisor: {local,ejabberd_sup}
             started: [{pid,<0.264.0>},
                       {name,ejabberd_s2s_in_sup},
                       {mfa,
                           {ejabberd_tmp_sup,start_link,
                               [ejabberd_s2s_in_sup,ejabberd_s2s_in]}},
                       {restart_type,permanent},
                       {shutdown,infinity},
                       {child_type,supervisor}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
          supervisor: {local,ejabberd_sup}
             started: [{pid,<0.265.0>},
                       {name,ejabberd_s2s_out_sup},
                       {mfa,
                           {ejabberd_tmp_sup,start_link,
                               [ejabberd_s2s_out_sup,ejabberd_s2s_out]}},
                       {restart_type,permanent},
                       {shutdown,infinity},
                       {child_type,supervisor}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
          supervisor: {local,ejabberd_sup}
             started: [{pid,<0.266.0>},
                       {name,ejabberd_service_sup},
                       {mfa,
                           {ejabberd_tmp_sup,start_link,
                               [ejabberd_service_sup,ejabberd_service]}},
                       {restart_type,permanent},
                       {shutdown,infinity},
                       {child_type,supervisor}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
          supervisor: {local,ejabberd_sup}
             started: [{pid,<0.267.0>},
                       {name,ejabberd_http_sup},
                       {mfa,
                           {ejabberd_tmp_sup,start_link,
                               [ejabberd_http_sup,ejabberd_http]}},
                       {restart_type,permanent},
                       {shutdown,infinity},
                       {child_type,supervisor}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
          supervisor: {local,ejabberd_sup}
             started: [{pid,<0.268.0>},
                       {name,ejabberd_http_poll_sup},
                       {mfa,
                           {ejabberd_tmp_sup,start_link,
                               [ejabberd_http_poll_sup,ejabberd_http_poll]}},
                       {restart_type,permanent},
                       {shutdown,infinity},
                       {child_type,supervisor}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
          supervisor: {local,ejabberd_sup}
             started: [{pid,<0.269.0>},
                       {name,ejabberd_iq_sup},
                       {mfa,
                           {ejabberd_tmp_sup,start_link,
                               [ejabberd_iq_sup,gen_iq_handler]}},
                       {restart_type,permanent},
                       {shutdown,infinity},
                       {child_type,supervisor}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
          supervisor: {local,ejabberd_sup}
             started: [{pid,<0.270.0>},
                       {name,ejabberd_stun_sup},
                       {mfa,
                           {ejabberd_tmp_sup,start_link,
                               [ejabberd_stun_sup,ejabberd_stun]}},
                       {restart_type,permanent},
                       {shutdown,infinity},
                       {child_type,supervisor}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
          supervisor: {local,ejabberd_sup}
             started: [{pid,<0.271.0>},
                       {name,ejabberd_frontend_socket_sup},
                       {mfa,
                           {ejabberd_tmp_sup,start_link,
                               [ejabberd_frontend_socket_sup,
                                ejabberd_frontend_socket]}},
                       {restart_type,permanent},
                       {shutdown,infinity},
                       {child_type,supervisor}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
          supervisor: {local,ejabberd_sup}
             started: [{pid,<0.272.0>},
                       {name,cache_tab_sup},
                       {mfa,{cache_tab_sup,start_link,[]}},
                       {restart_type,permanent},
                       {shutdown,infinity},
                       {child_type,supervisor}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
          supervisor: {local,ejabberd_sup}
             started: [{pid,<0.273.0>},
                       {name,ejabberd_listener},
                       {mfa,{ejabberd_listener,start_link,[]}},
                       {restart_type,permanent},
                       {shutdown,infinity},
                       {child_type,supervisor}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
          supervisor: {local,cache_tab_sup}
             started: [{pid,<0.284.0>},
                       {name,{caps_features,cache_tab_caps_features_1}},
                       {mfa,
                           {cache_tab,start_link,
                               [cache_tab_caps_features_1,caps_features,
                                [{max_size,1000},{life_time,86400}],
                                <0.281.0>]}},
                       {restart_type,permanent},
                       {shutdown,brutal_kill},
                       {child_type,worker}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
          supervisor: {local,cache_tab_sup}
             started: [{pid,<0.285.0>},
                       {name,{caps_features,cache_tab_caps_features_2}},
                       {mfa,
                           {cache_tab,start_link,
                               [cache_tab_caps_features_2,caps_features,
                                [{max_size,1000},{life_time,86400}],
                                <0.281.0>]}},
                       {restart_type,permanent},
                       {shutdown,brutal_kill},
                       {child_type,worker}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
          supervisor: {local,cache_tab_sup}
             started: [{pid,<0.286.0>},
                       {name,{caps_features,cache_tab_caps_features_3}},
                       {mfa,
                           {cache_tab,start_link,
                               [cache_tab_caps_features_3,caps_features,
                                [{max_size,1000},{life_time,86400}],
                                <0.281.0>]}},
                       {restart_type,permanent},
                       {shutdown,brutal_kill},
                       {child_type,worker}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
          supervisor: {local,cache_tab_sup}
             started: [{pid,<0.287.0>},
                       {name,{caps_features,cache_tab_caps_features_4}},
                       {mfa,
                           {cache_tab,start_link,
                               [cache_tab_caps_features_4,caps_features,
                                [{max_size,1000},{life_time,86400}],
                                <0.281.0>]}},
                       {restart_type,permanent},
                       {shutdown,brutal_kill},
                       {child_type,worker}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
          supervisor: {local,cache_tab_sup}
             started: [{pid,<0.288.0>},
                       {name,{caps_features,cache_tab_caps_features_5}},
                       {mfa,
                           {cache_tab,start_link,
                               [cache_tab_caps_features_5,caps_features,
                                [{max_size,1000},{life_time,86400}],
                                <0.281.0>]}},
                       {restart_type,permanent},
                       {shutdown,brutal_kill},
                       {child_type,worker}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
          supervisor: {local,cache_tab_sup}
             started: [{pid,<0.289.0>},
                       {name,{caps_features,cache_tab_caps_features_6}},
                       {mfa,
                           {cache_tab,start_link,
                               [cache_tab_caps_features_6,caps_features,
                                [{max_size,1000},{life_time,86400}],
                                <0.281.0>]}},
                       {restart_type,permanent},
                       {shutdown,brutal_kill},
                       {child_type,worker}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
          supervisor: {local,cache_tab_sup}
             started: [{pid,<0.290.0>},
                       {name,{caps_features,cache_tab_caps_features_7}},
                       {mfa,
                           {cache_tab,start_link,
                               [cache_tab_caps_features_7,caps_features,
                                [{max_size,1000},{life_time,86400}],
                                <0.281.0>]}},
                       {restart_type,permanent},
                       {shutdown,brutal_kill},
                       {child_type,worker}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
          supervisor: {local,cache_tab_sup}
             started: [{pid,<0.291.0>},
                       {name,{caps_features,cache_tab_caps_features_8}},
                       {mfa,
                           {cache_tab,start_link,
                               [cache_tab_caps_features_8,caps_features,
                                [{max_size,1000},{life_time,86400}],
                                <0.281.0>]}},
                       {restart_type,permanent},
                       {shutdown,brutal_kill},
                       {child_type,worker}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
          supervisor: {local,ejabberd_sup}
             started: [{pid,<0.281.0>},
                       {name,'ejabberd_mod_caps_xmpp._____.com'},
                       {mfa,{mod_caps,start_link,["xmpp._____.com",[]]}},
                       {restart_type,transient},
                       {shutdown,1000},
                       {child_type,worker}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
          supervisor: {local,ejabberd_sup}
             started: [{pid,<0.297.0>},
                       {name,'ejabberd_mod_http_bind_xmpp._____.com'},
                       {mfa,
                           {ejabberd_tmp_sup,start_link,
                               ['ejabberd_mod_http_bind_xmpp._____.com',
                                ejabberd_http_bind]}},
                       {restart_type,permanent},
                       {shutdown,infinity},
                       {child_type,supervisor}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
          supervisor: {local,ejabberd_sup}
             started: [{pid,<0.298.0>},
                       {name,'ejabberd_mod_irc_sup_xmpp._____.com'},
                       {mfa,
                           {ejabberd_tmp_sup,start_link,
                               ['ejabberd_mod_irc_sup_xmpp._____.com',
                                mod_irc_connection]}},
                       {restart_type,permanent},
                       {shutdown,infinity},
                       {child_type,supervisor}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
          supervisor: {local,ejabberd_sup}
             started: [{pid,<0.299.0>},
                       {name,'ejabberd_mod_irc_xmpp._____.com'},
                       {mfa,{mod_irc,start_link,["xmpp._____.com",[]]}},
                       {restart_type,temporary},
                       {shutdown,1000},
                       {child_type,worker}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
          supervisor: {local,ejabberd_sup}
             started: [{pid,<0.305.0>},
                       {name,'ejabberd_mod_muc_sup_xmpp._____.com'},
                       {mfa,
                           {ejabberd_tmp_sup,start_link,
                               ['ejabberd_mod_muc_sup_xmpp._____.com',
                                mod_muc_room]}},
                       {restart_type,permanent},
                       {shutdown,infinity},
                       {child_type,supervisor}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
          supervisor: {local,ejabberd_sup}
             started: [{pid,<0.306.0>},
                       {name,'ejabberd_mod_muc_xmpp._____.com'},
                       {mfa,{mod_muc,start_link,
                                     ["xmpp._____.com",
                                      [{access,muc},
                                       {access_create,muc_create},
                                       {access_persistent,muc_create},
                                       {access_admin,muc_admin}]]}},
                       {restart_type,temporary},
                       {shutdown,1000},
                       {child_type,worker}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
          supervisor: {local,ejabberd_sup}
             started: [{pid,<0.318.0>},
                       {name,'ejabberd_mod_pubsub_xmpp._____.com'},
                       {mfa,
                           {mod_pubsub,start_link,
                               ["xmpp._____.com",
                                [{access_createnode,pubsub_createnode},
                                 {ignore_pep_from_offline,true},
                                 {last_item_cache,false},
                                 {plugins,["flat","hometree","pep"]}]]}},
                       {restart_type,transient},
                       {shutdown,1000},
                       {child_type,worker}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
          supervisor: {local,ejabberd_listeners}
             started: [{pid,<0.370.0>},
                       {name,{5222,{0,0,0,0},tcp}},
                       {mfa,
                           {ejabberd_listener,start,
                               [{5222,{0,0,0,0},tcp},
                                ejabberd_c2s,
                                [{certfile,
                                     "/opt/ejabberd-2.1.6/conf/server.pem"},
                                 starttls,
                                 {access,c2s},
                                 {shaper,c2s_shaper},
                                 {max_stanza_size,65536}]]}},
                       {restart_type,transient},
                       {shutdown,brutal_kill},
                       {child_type,worker}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
          supervisor: {local,ejabberd_listeners}
             started: [{pid,<0.371.0>},
                       {name,{5269,{0,0,0,0},tcp}},
                       {mfa,
                           {ejabberd_listener,start,
                               [{5269,{0,0,0,0},tcp},
                                ejabberd_s2s_in,
                                [{shaper,s2s_shaper},
                                 {max_stanza_size,131072}]]}},
                       {restart_type,transient},
                       {shutdown,brutal_kill},
                       {child_type,worker}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
          supervisor: {local,ejabberd_listeners}
             started: [{pid,<0.372.0>},
                       {name,{5280,{0,0,0,0},tcp}},
                       {mfa,
                           {ejabberd_listener,start,
                               [{5280,{0,0,0,0},tcp},
                                ejabberd_http,
                                [captcha,http_bind,http_poll,web_admin]]}},
                       {restart_type,transient},
                       {shutdown,brutal_kill},
                       {child_type,worker}]

=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
         application: ejabberd
          started_at: 'ejabberd@ejabber112.orl._____.net'

And when I go to start the failing node I get the following:

tail: /opt/ejabberd-2.1.6/logs/erlang.log: file truncated

=SUPERVISOR REPORT==== 15-Apr-2011::20:19:02 ===
     Supervisor: {local,mnesia_sup}
     Context:    start_error
     Reason:     killed
     Offender:   [{pid,undefined},
                  {name,mnesia_kernel_sup},
                  {mfa,{mnesia_kernel_sup,start,[]}},
                  {restart_type,permanent},
                  {shutdown,infinity},
                  {child_type,supervisor}]

=CRASH REPORT==== 15-Apr-2011::20:19:02 ===
  crasher:
    pid: <0.65.0>
    registered_name: mnesia_recover
    exception exit: killed
      in function  gen_server:terminate/6
    initial call: gen:init_it(gen_server,<0.61.0>,<0.61.0>,
                              {local,mnesia_recover},
                              mnesia_recover,
                              [<0.61.0>],
                              [{timeout,infinity}])
    ancestors: [mnesia_kernel_sup,mnesia_sup,<0.58.0>]
    messages: []
    links: [<0.89.0>]
    dictionary: []
    trap_exit: true
    status: running
    heap_size: 2584
    stack_size: 23
    reductions: 3394
  neighbours:

=CRASH REPORT==== 15-Apr-2011::20:19:02 ===
  crasher:
    pid: <0.57.0>
    registered_name: []
    exception exit: {shutdown,{mnesia_sup,start,[normal,[]]}}
      in function  application_master:init/4
    initial call: application_master:init(<0.5.0>,<0.56.0>,
                                          {appl_data,mnesia,
                                           [mnesia_dumper_load_regulator,
                                            mnesia_event,mnesia_fallback,
                                            mnesia_controller,
                                            mnesia_kernel_sup,
                                            mnesia_late_loader,mnesia_locker,
                                            mnesia_monitor,mnesia_recover,
                                            mnesia_substr,mnesia_sup,
                                            mnesia_tm],
                                           undefined,
                                           {mnesia_sup,[]},
                                           [mnesia,mnesia_backup,mnesia_bup,
                                            mnesia_checkpoint,
                                            mnesia_checkpoint_sup,
                                            mnesia_controller,mnesia_dumper,
                                            mnesia_event,mnesia_frag,
                                            mnesia_frag_hash,
                                            mnesia_frag_old_hash,mnesia_index,
                                            mnesia_kernel_sup,
                                            mnesia_late_loader,mnesia_lib,
                                            mnesia_loader,mnesia_locker,
                                            mnesia_log,mnesia_monitor,
                                            mnesia_recover,mnesia_registry,
                                            mnesia_schema,mnesia_snmp_hook,
                                            mnesia_snmp_sup,mnesia_subscr,
                                            mnesia_sup,mnesia_sp,mnesia_text,
                                            mnesia_tm],
                                           [],infinity,infinity},
                                          normal)
    ancestors: [<0.56.0>]
    messages: [{'EXIT',<0.58.0>,normal}]
    links: [<0.56.0>,<0.5.0>]
    dictionary: []
    trap_exit: true
    status: running
    heap_size: 610
    stack_size: 23
    reductions: 117
  neighbours:

It doesn't matter what order I start the nodes in: I can only get the first one to start. In the admin panel I see one node listed as "started" and the second as "stopped", so it would seem that they are aware of each other.

Now here is the real kicker... I can get them both to show up if I run the following command:
erl -name ejabberd@jabber112.orl._____.net -mnesia extra_db_nodes "['ejabberd@jabber111.orl.____.net']" -s mnesia

However, as soon as I exit, the admin panel reports the second instance as down again. If I try to sync the database from the Erlang console, this is what I get:

Erlang (BEAM) emulator version 5.6.5 [source] [64-bit] [smp:8] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.6.5  (abort with ^G)
(ejabberd@jabber111.orl.______.net)1> mnesia:change_table_copy_type(schema, node(), disc_copies).
{aborted,{already_exists,schema,
                         'ejabberd@jabber111.orl._____.net',disc_copies}}

Any suggestions? I have been racking my brain on this for two days and am at my wits' end! :)

thanks!

John
