Okay, that was less bad than I thought it would be.

The gory details, for the curious: Mastodon runs a service called Sidekiq that manages the event queue, including posts to the local timeline, pushing things out to the federated timeline, pulling things in from the federated timeline, and more. To keep that queue from backing up and producing huge latency between when you hit "post" and when things actually appear, it's necessary to run a significant number of connections to the database. However, Postgres appears to be snagging a chunk of memory for each of those connections and then not letting it go, so we had a ton of idle processes taking up all our RAM. I was able to clear things by restarting the primary Mastodon processes, but that's not an ideal solution. So more investigation ahead!
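For a rough sense of how this adds up, here is a back-of-the-envelope sketch in Python. Every number in it (process counts, pool sizes, the per-connection memory figure) is a made-up placeholder rather than a measurement from this server, and the `Service` class and helper functions are hypothetical names used only for illustration. The one thing it leans on is that Postgres handles each client connection with its own backend process, so idle connections can still hold memory.

```python
from dataclasses import dataclass


@dataclass
class Service:
    """A group of identical processes that each hold a database connection pool."""
    name: str
    processes: int   # number of OS processes running this service
    pool_size: int   # maximum DB connections each process may keep open


def total_connections(services):
    """Upper bound on open connections: processes x pool size, summed over services."""
    return sum(s.processes * s.pool_size for s in services)


def idle_memory_mib(connections, mib_per_backend):
    """Memory held if every connection's backend keeps mib_per_backend MiB."""
    return connections * mib_per_backend


# Hypothetical deployment; none of these figures come from the real server.
services = [
    Service("sidekiq", processes=4, pool_size=25),
    Service("web", processes=2, pool_size=5),
]

connections = total_connections(services)
memory = idle_memory_mib(connections, mib_per_backend=10.0)  # assumed figure
print(f"{connections} connections could hold about {memory:.0f} MiB")
```

The point of the arithmetic is only that the total scales with processes times pool size, so a queue tuned for low latency can multiply whatever each connection holds onto.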


@kfitz thank you for the hard work, and for the info - it’s easier to understand the whole system through one specific example 😀

@kfitz If you haven't had a chance to try out pgBouncer, a friend mentioned it might be worth exploring if you're running into postgres memory problems:

@kfitz @quinnanya Yeah, this sounds to me like db connections that stick around too long. It may be fixable with a configuration change in Rails.

@kfitz If you all need an extra pair of eyes on your situation I’m happy to help. This seems pretty important!

@kfitz also feel free to ignore me! I know adding more cooks doesn’t necessarily help anything 😁. But I’m here if you think I might be useful.

@kfitz Here are some tips for reducing #mastodon server load if it helps, including pgtune, pgbouncer, jemalloc, & restarting sidekiq regularly due to memory leaks
