Just how big can an ejabberd installation get?

Hello. I'm new here and have some questions about ejabberd that I haven't been able to find quick answers to. My apologies if the answers should be obvious.

We're working on a new service that's going to have something chat-like at its core; we're sure we want to use XMPP, and are still speccing out the exact implementation. So:

Just how big can an ejabberd installation get? If things go well, we need to support millions of registered users, with hundreds of thousands active at one time. Is anyone doing that already? One cluster? Several clusters in a farm?

Mnesia - it looks like it may be too small (4 gig hard table size limit, if I read right). Is this likely to be a problem? I've seen mentions of the next version talking to an SQL database. How close to ready is that? If it's not ready, might it be possible for me to shuttle user data in and out of ejabberd at need by some other means?

Thanks for your patience,

Andy Hickmott
Pando Networks

Anyone? Or is there a better place for me to ask this kind of question?

This forum is a good place. But you will want to ask this on the ejabberd mailing list (adding a link to this thread), to reach different people.

You can ask on jadmin (general list about Jabber server administration) too.

big server: possible hints

Until somebody with better knowledge of ejabberd and Erlang answers, I'll try to give some hints. These are probably things you have already noticed, and they probably contain errors, since I'm speaking about things I've not tried myself.

we're sure we want to use XMPP, and are still speccing out the exact implementation

You can run tests to check the limitations of the different implementations: how many total accounts (1M, 10M?...), how many roster items (10M, 100M?...), etc.

If things go well, we need to support millions of registered users, with hundreds of thousands active at one time.

In ejabberd:

  • Every registered account is a line in a table. Having several millions is not a problem.
  • Every item in a roster is a line in a table. If every registered user has many items in their roster (say, 100 contacts), the roster table may reach the size limit in Mnesia (this is just speculation).
  • If limits are not reached, having millions of registered users is not a problem for CPU or RAM usage. The problem is the number of concurrent connected users at one time, and how active they are (messages/presences sent/received per second).
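The roster point above is simple arithmetic, so here is a rough sketch of it. It is only a back-of-envelope estimate: the 200 bytes per roster item is a made-up figure for illustration, and the 4 GiB limit is the number from the question, which I have not verified. Measure a real table before trusting any of this.

```python
# Back-of-envelope check: does a roster table fit under a size limit?
# All figures are illustrative assumptions, not measured ejabberd or Mnesia values.

def roster_table_estimate(accounts, contacts_per_account, bytes_per_item, limit_bytes):
    """Return (rows, size_in_bytes, fits) for a roster table."""
    rows = accounts * contacts_per_account   # one table row per roster item
    size = rows * bytes_per_item             # assumed average bytes per row
    return rows, size, size <= limit_bytes


if __name__ == "__main__":
    limit = 4 * 1024**3  # 4 GiB, the limit quoted in the question (unverified)
    for accounts in (1_000_000, 10_000_000):
        rows, size, fits = roster_table_estimate(accounts, 100, 200, limit)
        print(f"{accounts} accounts: {rows} roster rows, "
              f"{size / 1024**3:.1f} GiB, fits={fits}")
```

With these assumed numbers the roster table outgrows the limit long before the account table does, which is why the roster is the table to watch.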

Is anyone doing that already?

I don't know about ejabberd, but there were some emails on the jadmin mailing list some months ago from people interested in similar things.

One cluster? Several clusters in a farm?

You will probably need two things:

  • Split the user base (if it reaches database limits) into several separate domains. That is, divide the total user base into groups:
    • by language: en.pando.com, es.pando.com...
    • by country: us.pando.com, uk.pando.com...
    • or by theme: edu.pando.com, games.pando.com...
    Every domain will have its own user base, its own database, and its own server or cluster of servers. If this is possible, I think much of the work is already done, since every group will be completely independent.
  • Split the computational work for each group across several machines, using an Erlang cluster for each group. Every node in the cluster will share the same database (the same user base). Mnesia will keep each node's copy of the database up to date.
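To make the two-level split concrete, here is a minimal sketch of how a front end might pick the cluster for a user from the domain part of the JID. The domain names are the examples from the list above; the node names and the `CLUSTERS` table are hypothetical, and this is not how ejabberd itself routes traffic, just an illustration of the idea.

```python
# Sketch: map a JID's domain to the (hypothetical) cluster that owns it.
# Node names below are invented for illustration only.

CLUSTERS = {
    "en.pando.com": ["ejabberd@en1.example", "ejabberd@en2.example"],
    "es.pando.com": ["ejabberd@es1.example"],
}


def domain_of(jid):
    """Return the lowercased domain part of a JID like user@domain/resource."""
    bare = jid.split("/", 1)[0]            # drop the resource, if any
    return bare.split("@", 1)[-1].lower()  # drop the local part, if any


def cluster_for(jid):
    """Return the list of nodes serving this JID's domain."""
    domain = domain_of(jid)
    if domain not in CLUSTERS:
        raise KeyError(f"no cluster configured for domain {domain!r}")
    return CLUSTERS[domain]


if __name__ == "__main__":
    print(cluster_for("andy@en.pando.com/laptop"))
```

Because each domain has its own user base and database, nothing in one cluster needs to know about the others beyond this kind of lookup.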

I've seen mentions of the next version talking to an SQL database. How close to ready is that?

The CVS version of ejabberd has ODBC/PostgreSQL storage for the roster table (the biggest table in the database). If the bottleneck in your situation is database limits, then you will prefer PostgreSQL. If it's computational limits, then you will prefer Mnesia and deploy it in a cluster.

Thanks

Thanks for the info. I'm not sure it's enough to proceed with, but it looks good. I'll try the mailing list.
