MAL has been growing steadily since 2009, when its search rankings began to catch up with and then exceed those of ANN and other competing sites. The server load on a social site grows exponentially over time, and the servers have not been upgraded fast enough to keep up with it. Xinil has been trying every little trick he can to keep things running, even if that means disabling small features here and there, but obviously this has still not been enough to stop the ever-increasing lag.
Pretty graphs, courtesy of Alexa (follow the blue line):
[graph: user growth since 2009]
[graph: pageviews over the last 6 months]
Note that although MAL has about the same number of active users as ANN, it gets about double the traffic on average (because people click more links). You can see the spikes each weekend, the largest recent one being this past Christmas, when the lag grew to 9 hours or more on the slowest server.
The cause of the lag is simply too many updates from too many users: the database servers just can't keep up with those big swings. There are multiple database servers (1 master and 3 slaves iirc), and some of them are slower than others, so one of them may apply your changes after a few seconds while another might take an hour under load. Since the server is chosen randomly each time you request a page, your changes may appear and disappear randomly for as long as the slowest server is lagged. You can usually gauge how far each server is lagging by refreshing a busy page, like the main forum page, a few times and looking at the most recent post times.
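To make the "appear and disappear" effect concrete, here is a toy simulation of the setup described above. It is only a sketch: the `Replica` class, the lag values, and the post data are all made up for illustration and have nothing to do with MAL's actual code or server names.

```python
import random


class Replica:
    """Hypothetical read server that applies the master's writes after a fixed delay."""

    def __init__(self, name, lag_seconds):
        self.name = name
        self.lag_seconds = lag_seconds

    def visible_posts(self, write_log, now):
        # A write made at time t only shows up on this server at t + lag_seconds.
        return [post for t, post in write_log if t + self.lag_seconds <= now]


def read_page(replicas, write_log, now, rng=random):
    # Each page request is served by a randomly chosen server.
    replica = rng.choice(replicas)
    return replica.name, replica.visible_posts(write_log, now)


# (time written in seconds, post text); the numbers are assumptions for the demo.
write_log = [(0, "old post"), (100, "your new post")]
replicas = [Replica("fast", 5), Replica("medium", 60), Replica("slow", 3600)]

# Refresh the page a few times, 100 seconds after posting.
for _ in range(5):
    name, posts = read_page(replicas, write_log, now=200)
    print(name, "shows your post:", "your new post" in posts)
```

Whenever the request lands on the "slow" server, your post is missing, and it will keep flickering like that until the slowest server has caught up.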
I often see people saying "if we just deleted old inactive accounts or old forum posts, the lag would cease!", but actually it's only the actively posting users that really matter. Each time you update anything that requires writing data to the servers (e.g. a forum post), the master server has to lock the information to prevent any further writes until that one is processed (imagine two people posting to the forum at the same time, getting the same post number, and overwriting each other). Because of those locks and the coordination with the other servers, writing new information to the databases takes more time than just reading from them. Things like inactive users and old posts never even get requested, so they just sit on a hard drive somewhere and take up space. They don't actually contribute to lag at all (except that they do slow down searches). I'm simplifying a lot of things, but basically it's the updates you submit that slow MAL down the most.
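The post-number example above can be sketched in a few lines. This is a toy, in-process illustration of why writes have to take turns, assuming a made-up `Forum` class; a real database server does its locking very differently and far more cleverly.

```python
import threading


class Forum:
    """Hypothetical forum store; all names here are invented for the example."""

    def __init__(self):
        self.posts = {}
        self.next_number = 1
        self._write_lock = threading.Lock()

    def add_post(self, text):
        # Only one writer at a time may claim a post number. Without the lock,
        # two writers could read the same next_number and overwrite each other.
        with self._write_lock:
            number = self.next_number
            self.posts[number] = text
            self.next_number = number + 1
        return number

    def read_post(self, number):
        # Reads do not wait for the write lock in this sketch, which is
        # part of why reading is cheaper than writing.
        return self.posts.get(number)


forum = Forum()
threads = [
    threading.Thread(target=forum.add_post, args=(f"post from user {i}",))
    for i in range(50)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(forum.posts))  # 50 posts, numbered 1 to 50, none overwritten
```

Every writer has to queue for the same lock, so the more people post at once, the longer the queue gets, while old posts that nobody touches cost nothing here.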
The response time is generally acceptable on weekdays, when most users are at school or work, but the traffic on the weekend is double that. Long weekends and holidays are especially bad: the slowest database servers end up lagged by 7+ hours, making the site close to unusable. New users have trouble verifying and logging in for hours, and established users see everything appear slowly and disappear randomly. It's extremely frustrating.
Steps are being taken to reduce the lag in any way possible, and we are expecting some serious improvements soon, but I'm really not sure exactly when they will happen. Everyone knows it's bad, and has been for a long while, but we just have to deal with it until it's fixed.