Late unavailable presence can knock a newer session out of a chatroom

I have a session logged into an ejabberd server that is a member of a chatroom.
I log in a second time with the same user/resource.

From reading: http://www.xmpp.org/rfcs/rfc3921.html#session
It seems that if a second session connects, the first session should be terminated, thus, an unavailable presence should be sent out to other chatroom members and then the second session is notified of success.

However, let's say that the first session's client is no longer listening. It's gotten itself into some kind of problem where the session is broken at its end. I've seen cases where the log prints the "Replaced session for" message up to 20 minutes after the second session has been opened and acknowledged. I assume that this is because of some kind of timeout and that the message is being sent asynchronously.

The consequence of this is that my second session joins the chatroom and subsequently gets logged back out because an unavailable presence is sent to the chatroom for the now terminated first session...

Does this seem right? Is it possible? Have I described the scenario correctly?

It seems to me that the correct way to do this would be for the unavailable presence to be sent out only if there is no session for that user signed into the chatroom at that time.

Here are the appropriate log messages

Here are the appropriate log messages for this with the port numbers highlighted.
At around 2008-10-31 10:00:00 I simulated a disconnect by dropping the network connection of the first session.
At 2008-10-31 10:21:17 my most recent session was knocked out of the chatroom it had joined:

=INFO REPORT==== 2008-10-31 09:57:54 ===
I(<0.233.0>:ejabberd_listener:112) : (#Port<0.446>) Accepted connection {{proxyipaddress},3756} -> {{serveripaddress},5222}

=INFO REPORT==== 2008-10-31 09:57:58 ===
I(<0.236.0>:extauth:73) : extauth call '["auth", "myusername", "ejabberd-service",
"mypassword"]' received data response: [0,1]

=INFO REPORT==== 2008-10-31 09:57:58 ===
I(<0.402.0>:ejabberd_c2s:560) : ({socket_state,tls,{tlssock,#Port<0.446>,#Port<0
.448>},<0.401.0>}) Accepted authentication for myusername

=INFO REPORT==== 2008-10-31 09:57:59 ===
I(<0.402.0>:ejabberd_c2s:805) : ({socket_state,tls,{tlssock,#Port<0.446>,#Port<0
.448>},<0.401.0>}) Opened session for myusername@ejabberd-service/cl-xmpp

=INFO REPORT==== 2008-10-31 10:01:38 ===
I(<0.233.0>:ejabberd_listener:112) : (#Port<0.459>) Accepted connection {{proxyipaddress},3834} -> {{serveripaddress},5222}

=INFO REPORT==== 2008-10-31 10:01:42 ===
I(<0.236.0>:extauth:73) : extauth call '["auth", "myusername", "ejabberd-service",
"mypassword"]' received data response: [0,1]

=INFO REPORT==== 2008-10-31 10:01:42 ===
I(<0.409.0>:ejabberd_c2s:560) : ({socket_state,tls,{tlssock,#Port<0.459>,#Port<0
.461>},<0.408.0>}) Accepted authentication for myusername

=INFO REPORT==== 2008-10-31 10:01:45 ===
I(<0.409.0>:ejabberd_c2s:805) : ({socket_state,tls,{tlssock,#Port<0.459>,#Port<0
.461>},<0.408.0>}) Opened session for myusername@ejabberd-service/cl-xmpp

=INFO REPORT==== 2008-10-31 10:21:17 ===
I(<0.402.0>:ejabberd_c2s:1297) : ({socket_state,tls,{tlssock,#Port<0.446>,#Port<
0.448>},<0.401.0>}) Replaced session for myusername@ejabberd-service/cl-xmpp

Badlop and I had the

Badlop and I had the following conversation:

[11:42:07] <yellowdog> If you would like to use the proxy server I'm using to simulate a network outage for this problem then I can send you on the details and drop the connection for you.
[11:43:40] <badlop> i don't know how much debugging and how much drops i'll need to find the lines of code to change, and how much to test the new code :)
[11:43:54] <badlop> so i'll find a cable to drop myself
[11:44:23] <badlop> if i can write a patch, i'll ping you to test
[11:44:29] <yellowdog> No worries.   Can you reproduce this?
[11:45:13] <badlop> when i get a laptop with ethernet connection, probably :)
[11:45:48] <yellowdog> Well if you want to verify it with my proxy you can...
[11:46:39] <yellowdog> My Erlang isn't great, but perhaps in ejabberd_c2s.erl:handle_info(replaced, _StateName, StateData) the state needs to change before calling send_text... or send_text needs a tighter timeout...
[11:58:52] <badlop> "up to 20 minutes after the second session has been opened and acknowledged"
[12:00:32] <badlop> so:
1. user@server/res joins room@conference.localhost
2. the client crashes in such way that ejabberd doesn't notice
3. user joins again
4. ejabberd 'disconnects' the old connection
5. 20 minutes after 3., the second connection is kicked of the room
right?
[12:01:08] <yellowdog> correct
[12:01:53] <yellowdog> One point I didn't mention, I have {max_user_sessions, 1}. in my ejabberd.cfg
[12:02:00] <badlop> did you find where those '20 minutes' are specified?
[12:02:07] <badlop> maybe that timeout is of TCP, not ejabberd
[12:02:41] <yellowdog> Could be... it's just the time that I observed in the log which I posted to the forum.
[12:03:21] <yellowdog> I don't think having a different timeout is a sufficient solution anyway.   You'd just be lowering the possibility of a race condition.
[12:04:15] <badlop> what i wanted is to make experiments shorter to run :)
[12:04:35] <yellowdog> Ah :)
[12:06:37] <yellowdog> I imagine that the socket timeout would be specified when the socket is created, that would allow the OS to take care of it in a thread-safe way.   I guess 20 minutes could be a default, since it seems rather long, so it would then depend on the library that you're interfacing with.
[12:11:27] <badlop> in 3. room doesn't allow second client to login, because first client is already using the nick
[12:12:43] <badlop> you have this, right?
{access, max_user_sessions, [{1, all}]}.
[12:14:36] <yellowdog> I don't have it defined like that
[12:14:47] <yellowdog> It's defined as I mentioned above.
[12:15:20] <yellowdog> Hmmm.   I'll need to check into what you just said about the room not allowing the second client to join... maybe I'm missing a message from the server.
[12:15:39] <yellowdog> (i.e. my client cl-xmpp is ignoring it)
[12:17:39] <badlop> the option max_user_sessions is mean to be defined like i said:
http://svn.process-one.net/ejabberd/trunk/doc/guide.html#configmaxsessions
[12:18:10] <yellowdog> I see... perhaps it's something that was leftover from our old 1.1.2 installation.   I'll change this and try my test again.
[12:21:27] <badlop> in 3., my client sends:

<presence to='room@conference.localhost/participant'
  xml:lang='es'>
  <show>dnd</show>
  <status>No molestar</status>
  <x xmlns='http://jabber.org/protocol/muc'>
    <password/>
    <history maxchars='10000'
  maxstanzas='20'/>
  </x>
</presence>

and it receives:
<presence from='room@conference.localhost/participant'
  xml:lang='es'
  type='error'
  to='testt@localhost/tka'>
  <show>dnd</show>
  <status>No molestar</status>
  <x xmlns='http://jabber.org/protocol/muc'>
    <password/>
    <history maxstanzas='20'
  maxchars='10000'/>
  </x>
  <error type='cancel'
  code='409'>
    <conflict xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/>
    <text xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'>El apodo ya está siendo usado por otro ocupante</text>
  </error>
</presence>
[12:21:42] <badlop> "the nickname is already used by other participant"
[12:28:37] <yellowdog> Hmmm... I'll look into this... I don't think I get that.   It does make sense though.

I've modified the Lisp xmpp

I've modified the Lisp xmpp client that I'm using to be more verbose.

When my client connects/authorizes/creates a session, it starts out by sending presence to the server:

<presence></presence>

It gets a strange response, as if its username were in lower case from the same username in uppercase:

<presence from="3232235945.3434453863.G3448@ejabberd-service/cl-xmpp"
            to="3232235945.3434453863.g3448@ejabberd-service/cl-xmpp"/>

And vice versa:

<presence from="3232235945.3434453863.g3448@ejabberd-service/cl-xmpp"
            to="3232235945.3434453863.G3448@ejabberd-service/cl-xmpp"/>

It then sends a message to a server that controls room access via xmpp messages. This server is just another xmpp-client, but is an administrator on the ejabberd server.
The message my client sends asks to be invited to the chatroom. The server does the following:

<message from="myroom@conference.ejabberd-service"
           to="3232235945.3434453863.G3448@ejabberd-service"
         type="normal">
  <x xmlns="http://jabber.org/protocol/muc#user">
    <invite from="control-server@ejabberd-service/cl-xmpp"><reason>invitation</reason></invite>
  </x>
  <x xmlns="jabber:x:conference" jid="myroom@conference.ejabberd-service">invitation</x>
  <body>
    control-server@ejabberd-service/cl-xmpp invites you to the room myroom@conference.ejabberd-service (invitation)
  </body>
</message>

Next my client joins the room:

<presence from="3232235945.3434453863.G3448@ejabberd-service"
            to="myroom@conference.ejabberd-service/myusername">
  <x xmlns="http://jabber.org/protocol/muc"></x>
</presence>

And receives the confirmation:

<presence from="myroom@conference.ejabberd-service/3232235945.3434453863.G3448"
            to="3232235945.3434453863.G3448@ejabberd-service/cl-xmpp">
  <x xmlns="http://jabber.org/protocol/muc#user">
    <item affiliation="member" role="visitor"/>
  </x>
</presence>

At some point later, the "Replaced session for" message is logged for the old session and my new session receives the 2 messages:

<presence from="3232235945.3434453863.G3448@ejabberd-service/cl-xmpp"
            to="3232235945.3434453863.g3448@ejabberd-service/cl-xmpp"
          type="unavailable">
  <status>Replaced by new connection</status>
</presence>
<presence from="myroom@conference.ejabberd-service/3232235945.3434453863.G3448"
            to="3232235945.3434453863.G3448@ejabberd-service/cl-xmpp"
          type="unavailable">
  <status>Replaced by new connection</status>
  <x xmlns="http://jabber.org/protocol/muc#user">
    <item affiliation="member" role="none"/>
  </x>
</presence>

It seems strange the mixed case messages I'm getting. My usernames are always in uppercase, never in lowercase.
I'll try and look into that aspect a little more as it seems confusing.

I've been able to replicate

I've been able to replicate this using pidgin. I think the reason I don't get a conflict is because the control-server sents the invite...

I'm still not sure what is happening with the mixed case usernames in some of the messages... I did see that when I use Psi it tries to change any username I type in to lowercase. I'm using a case-sensitive external auth script.

Still being knocked out of chatrooms

I thought I would be able to get around this problem if I made sure that I used unique nicknames in the chatroom for the new ejabberd session.

I was however mistaken. It seems that ALL chatroom entries for an ejabberd user JID are signed out, rather than just the ones for the particular ejabberd session that has terminated.

I thought about using unique resources for the subsequent logins but that won't work for 2 reasons:
1) I have explicitly set max_user_sessions to 1.
2) My business logic requires that a user can have only 1 session. I really had hoped that as soon as the new session is created the old session can no longer affect anything.

This seems to be a problem for long lived sessions... I imagine that lots of other applications must maintain connectivity after network downtime...

I think that this situation can be remedied if ejabberd sends the unavailable presence to the chatrooms before trying to send the unavailable presence to the users JID and not after.

Follow up

As a follow-up for anybody that is interested:

First, a quick note on multiple sessions using different resources:
It should be possible for 2 resources on the same basic JID to enter a room and share a nick.
Example scenario:
I connect from home and join a room.
I connect from work and join the same room.
I want to be able to see the messages at home and at work, while at the same time not show up as two separate nicks.

This is outlined in XEP-0045 section 7.1.10:
"However, if the bare JID of the present occupant matches the bare JID of the user seeking to enter the room, then the service SHOULD allow entry to the user, so that the user has two (or more) in-room "sessions" with the same roomnick, one for each resource."

Unfortunately this is not working in the current release of ejabberd as outlined here: http://www.ejabberd.im/node/3258

To accomplish my goal of not having my subsequent session kicked out, I used a combination of unique resources and unique room nicknames. When the first session is replaced, the second session is not affected.
If you are trying to maintain 1 session per username, then it is also vital to set max_user_sessions to 1.

Syndicate content