Activating Warp Speed for Hubzilla's Database
How to gain significant speed and performance for Hubzilla with just a few minor adjustments.
Note: The DB section is about MySQL/MariaDB. However, PostgreSQL will be similarly affected. The solution is database agnostic.
Preface
As my hub (hub.netzgemeinde.eu) grew, I noticed sizable performance issues (including timeouts in Hubzilla) originating from the database. The MariaDB in question was on a conventional SAS RAID, so the logical step was moving the database to an SSD host, making the old host a slave to the new DB master. This significantly improved speed.
However, I noticed something odd: The slave (the former master) was experiencing rather high iowait time (up to 20%). Digging deeper into the issue, I found that PHP's sessions were being stored in the database. And that is a very bad idea: it's one of the surest ways (apart from missing indexes and the like) to bring a database to its knees!
Let me tell you a story: There was this sports club which had an online shop for its merchandise, and that shop also stored its sessions in a database. Now, against all odds, the club won the cup, and guess what happened and who was on call that evening? I tried to mitigate this by adding 64 (yes, 64) cores to the database server, but it just gobbled them up in microseconds. So I told the customer that there was nothing we could do about this except wait until the rush was over, and then make some optimizations, including removing the sessions from the database. (This had to be done by developers.) After that (and a couple of other optimizations) the shop ran fine. This was the worst offender I've seen, yet by far not the only one.
Back to Hubzilla
The slave
I've changed the replication to ignore the session table:
replicate-ignore-table = hub_netzgemeinde_eu.session
Usually this is something that makes every experienced DBA (including me) cringe - partial replication is one sure way to break your slave/backups, but in this case it's OK, since there is no point in having the session table replicated or backed up. If you have to perform disaster recovery, like promoting the slave to master or restoring a backup, the sessions are the least of your concerns :-D
Let's have a look at the performance data - iowait:
Guess the time when I disabled replication of the session table? ;-) This is rather compelling, isn't it?
This fixed the iowait issue on the slave's host. Now to the master.
The master
Now it's time to fix this issue once and for all:
Take the session handling out of the database
There are basically two options: Use file based session handling or external session storage. Both have advantages and disadvantages:
- File based session handling
- Store your sessions in files residing in /tmp. Some time ago this was considered to be a bad/slow solution (after all, a database is faster, eh?). However, /tmp is now often mounted as tmpfs, which means /tmp is a RAM disk, which is much, much faster than a database. (If your /tmp is not on tmpfs, try the other option.) There are two caveats, though. First, when the machine your PHP interpreter (either fpm or mod_php) runs on has to reboot (either due to planned maintenance or power loss), the sessions will be gone and the users will have to log in again. That's an acceptable price to pay. Second, there is a scaling issue: If you run a high-performance setup (i.e. load balancer, more than one web server), this can't be used, since all application servers have to access the shared sessions.
- External session storage
- You run a small key/value in-memory database on another host. The usual ones are memcached or Redis (I myself have had mixed experiences with Redis as a session store, but your mileage may vary). All application servers (Apache with PHP) can access the storage, so this is the way to go if you have more than one application server. If you use Redis, the sessions can also be saved to disk at regular intervals, so you won't lose them if your session storage goes down (reboot, update, whatever); however, you will lose some of the speed advantage. I myself would recommend memcached: it's blazingly fast and stores the sessions only in memory, but feel free to toy with Redis.
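Whether file based sessions in /tmp actually live in RAM depends on how /tmp is mounted on your host. Here is a minimal sketch for checking this, assuming a Linux machine with GNU coreutils `stat` (the helper name `fs_type` is made up for this example):

```shell
#!/bin/sh
# fs_type DIR: print the type of the filesystem that holds DIR.
# Assumes GNU coreutils stat: -f queries the filesystem, %T is its type name.
fs_type() {
    stat -f -c %T "$1"
}

dir=/tmp   # the session save path you plan to use
if [ "$(fs_type "$dir")" = "tmpfs" ]; then
    echo "$dir is on tmpfs (RAM backed)"
else
    echo "$dir is on $(fs_type "$dir"), not tmpfs"
fi
```

If the result is not tmpfs, file based sessions will hit the disk, and the external storage option is probably the better fit.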
Setting it up in Hubzilla
Unfortunately, Hubzilla doesn't support changing the session handler out of the box. I've written a small patch (which I will submit for review) to enable it (it's attached to this article):
cd <your instance>/Zotlabs/Web
# copy Session.patch here
patch < Session.patch
After that you can switch the session handler. By default it's still the database, so the patch won't do any harm on its own. If you want to change the handler, edit .htconfig.php in Hubzilla's base directory:
App::$config["system"]['session_custom'] = true;
App::$config["system"]['session_save_handler'] = "files";
App::$config["system"]['session_save_path'] = "/tmp";
(Or whatever you prefer: memcached, redis, ...)
After that all your former sessions will be unavailable, i.e. users have to log in again (including Nomad users), but after that things run rather smoothly.
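One way to confirm that the switch took effect with the `files` handler is to look for session files in the save path: PHP's file handler names them `sess_<session id>`. Here is a small sketch, assuming the `/tmp` save path from the config above and read access to it (the function name `count_sessions` is only for illustration):

```shell
#!/bin/sh
# count_sessions DIR: count PHP session files (sess_*) directly inside DIR.
count_sessions() {
    find "$1" -maxdepth 1 -type f -name 'sess_*' | wc -l | tr -d ' '
}

echo "session files in /tmp: $(count_sessions /tmp)"
```

The count should grow as users log in again. On systemd hosts where the web server or php-fpm runs with PrivateTmp enabled, the files live in a private /tmp and will not show up here.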
About the performance gain
In my case it's hard to say, because the SSD is so fast. Still, the graph seems to indicate that even on a fast SSD host (4 cores, 16 GB RAM, 10 GB InnoDB buffer pool) there is a performance gain. I'd love to see someone try this on a slow machine: how much performance did you gain? I'd really be interested in feedback.
Conclusion
By changing Hubzilla's session handling one can gain a significant performance boost. The only disadvantage is that the sessions may be a little bit volatile (i.e. not reboot safe), but that's a small price to pay.