Deploying Mastodon on AWS

Mastodon (and the greater ActivityPub/Fediverse) is a fun new world to explore, so I decided to learn more about its interesting corners by deploying my own instance and moving to it. What I learned is documented below in no particular order, in case it is useful to others.

In hosting my own instance, my goal was minimal maintenance at relatively low cost. I also wanted to use Terraform as much as possible for the infrastructure, though not to configure the box itself. The Terraform I used is shared in my tf-aws-mastodon repository.

A mild warning: running your own Mastodon instance is relatively costly and requires ongoing maintenance effort. With my current design the cost is likely $15-25/mo (more on this below), which is far more than masto.host charges for its cheapest tier (it has the advantages of cheaper hosting and probably a shared Postgres cluster).

Hardware Requirements

Mastodon doesn’t appear to publish any minimum specifications yet, so I tested a few configurations to see what worked.

First Attempt

My first instance started life as a t4g.small (2GB RAM, 2 vCPU) with everything hosted on one box. Primary storage was a 50GB gp3 EBS volume set to the minimum specs.

It almost immediately ran out of memory, with nearly every process contributing to the memory pressure.

With considerable tuning I was able to get this instance size to barely work (only occasional OOMs), so I would consider it the smallest practical size.

Techniques used to reduce memory usage:

Reduce all memory use by Postgres
- Postgres is extremely friendly to low memory situations, especially on new/small instances
Reduce process count for Puma
- Puma loads three processes by default (a controller and two workers). By disabling workers entirely, I was able to reduce Puma’s memory usage to just over a third of what it was. As long as the user count is fairly low, this should be fine.
Reduce thread count for Sidekiq
- Sidekiq runs 25 worker threads by default (and pre-warms 25 connections to the database), but personal instances don’t need this much capacity. Reducing this to 5-10 should be fine, but it doesn’t have as much memory impact as you might hope.
The streaming server seems to be memory efficient so I didn’t touch it
Redis never grew enough to be a concern, but is relatively easy to memory constrain
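
As a rough illustration of the Puma and Sidekiq items above, here is a minimal Python sketch that assembles low-memory settings. It assumes the stock Mastodon setup, where Puma reads WEB_CONCURRENCY and MAX_THREADS from the environment and Sidekiq reads DB_POOL and accepts a -c concurrency flag. The helper name and the default values are hypothetical and purely illustrative.

```python
def low_memory_settings(sidekiq_threads: int = 5, puma_threads: int = 5) -> dict:
    """Build environment overrides for a small, single-user instance.

    Assumption: the services read these variables as in a stock
    install-from-source setup; adjust to match your own unit files.
    """
    if sidekiq_threads < 1 or puma_threads < 1:
        raise ValueError("thread counts must be at least 1")
    return {
        # Environment for the web (Puma) service.
        "web": {
            "WEB_CONCURRENCY": "0",  # no worker processes, Puma serves from one process
            "MAX_THREADS": str(puma_threads),
        },
        # Environment for the Sidekiq service.
        "sidekiq": {
            # Keep the database pool the same size as the thread count.
            "DB_POOL": str(sidekiq_threads),
        },
        # Command line for the Sidekiq service (-c sets the thread count).
        "sidekiq_command": f"bundle exec sidekiq -c {sidekiq_threads}",
    }


if __name__ == "__main__":
    settings = low_memory_settings(sidekiq_threads=5)
    for name, value in settings["web"].items():
        print(f"{name}={value}")
    print(settings["sidekiq_command"])
```

Keeping DB_POOL equal to the Sidekiq thread count means the number of pre-warmed Postgres connections shrinks along with the threads.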

I discuss these memory reduction changes in more depth in my follow-up post.

Second Attempt

After learning the rough requirements from my first round, I decided to split the database onto its own host, so that each box can be tuned independently in the future. In this design I am using a t4g.small for all the Mastodon processes and a t4g.micro (1GB RAM, 2 vCPU) for Postgres.

Primary storage is a 25GB gp3 volume attached to each box, set to the minimum specs. This is probably overkill for the Mastodon server, as I am hosting all media on S3, but the cost is not substantial.

For additional savings, I suspect the Postgres instance could be reduced further to a t4g.nano (512MB RAM), but I haven’t tried that yet. On the micro size, it is averaging about 1% CPU usage.

Future Attempts

A reasonable question might be “why not containers (and therefore AWS Fargate)?” And to that I say “probably next!”

In short, I wanted to follow the Mastodon install-from-source guide, which expects you to have an “all in one” instance. They also publish charts for k8s and container images, so there is a clear path there as well.

Also, while I love running containers on Fargate, I have never run non-ephemeral systems there. Usually I ran containers on Fargate and databases on RDS/DynamoDB/etc, so running a whole stack there will be a new adventure.

Cost on Fargate is also likely to be higher, as it provides fully committed CPU, unlike the t4g instance types, which are designed for bursty traffic. The tradeoff would therefore be lower maintenance (due to pre-made containers) against higher cost.

Finally, since Mastodon doesn’t really suggest RAM sizing anywhere (except some commented out sections of their Helm chart) I needed some “real world” experience first.

Other Infrastructure

For media storage I chose to place everything in S3 with CloudFront as a CDN. While there are much cheaper CDN options, my usage is well within the CloudFront free tier. I currently consume ~25GB/mo of data transfer on CloudFront, depending on the number of images I post. Storage volume depends on a variety of factors, but with regular automated cleaning I am using about 50GB.

I am not fronting the Mastodon nginx server itself with CloudFront, as 99% of the traffic to nginx is ActivityPub POSTs, and that would needlessly burn CloudFront bandwidth with no user-visible benefit.

For email I am using SES for no particular reason. For a single person instance you will never actually send any mail.

Cost

The primary cost here is EC2, at ~$22/mo ($18 for compute, $4 for EBS). S3 storage for media is likely to cost a few extra dollars, and the rest should be largely free. With savings plans, the compute cost could be reduced by about 30%, though I won’t do that until I reach a rough minimum arrangement (probably after I play with containers).
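
The arithmetic above can be sketched in a few lines of Python. The figures are the rough estimates from this section, and the sketch assumes the ~30% discount applies only to the compute portion, not to EBS. The function name is hypothetical.

```python
COMPUTE_PER_MONTH = 18.00     # USD, EC2 compute estimate from this section
EBS_PER_MONTH = 4.00          # USD, EBS estimate from this section
SAVINGS_PLAN_DISCOUNT = 0.30  # "about 30%", assumed to apply to compute only


def monthly_ec2_cost(compute: float, ebs: float, discount: float = 0.0) -> float:
    """Return the monthly EC2 total after discounting the compute portion."""
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be between 0 and 1")
    return round(compute * (1.0 - discount) + ebs, 2)


if __name__ == "__main__":
    on_demand = monthly_ec2_cost(COMPUTE_PER_MONTH, EBS_PER_MONTH)
    committed = monthly_ec2_cost(COMPUTE_PER_MONTH, EBS_PER_MONTH, SAVINGS_PLAN_DISCOUNT)
    print(f"on-demand: ${on_demand:.2f}/mo")
    print(f"with savings plan: ${committed:.2f}/mo")
```

With these inputs the total drops from about $22/mo to roughly $16.60/mo, before S3 and any other small charges.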

Grab Bag of Learnings

Asset precompilation takes a very long time on a t4g.small and literally locks up the box for the duration, so consider installing Mastodon on a larger instance and shrinking it once the install is complete
- This is not required in containers as they are pre-baked with assets
- It is possible to compile these “off box” then copy them over, but only the first compile takes forever. Subsequent upgrades tend to be much faster
All Mastodon components can connect to Postgres over an IPv6 address
- Since IPv6 addresses survive instance stop/start in AWS, they are a bit easier to use internally than mucking about with EIPs
The official Mastodon install guide has a few errors/omissions
- The instructions for getting certificates for nginx don’t work at all, but the fix is documented in issue #17375
- Filesystem permissions don’t work without running an undocumented chmod, as discussed in issue #5803
Load on your box is pretty strictly based on the number of statuses headed in and out
- Joining a relay will considerably increase your load (and media storage requirements) so weigh the costs. This load is continuous.
- Any post will cause every recipient instance to send numerous requests to your server causing an instant spike in load. This load is extremely localized to posts and subsequent boosts.

Don’t Forget

Unattended upgrades are enabled by default on Ubuntu, but consider configuring them to reboot automatically when needed
Everything can be rebuilt except your Postgres database, so get backups working (and tested)
- pg_dump/pg_dumpall work well enough for small instances; consider moving to pgbackrest when you have some spare time
- Another option is to use AWS Backup to snapshot the instance regularly
- Backing up Redis is not strictly required, but if you lose its data your timeline will start from scratch. Mastodon stores the most recent N events (800 in 4.1.0, 400 before) in a key for your user. I implicitly back this up using AWS Backup, since Redis persists to disk regularly.
Ensure your media storage doesn’t grow without bounds by creating cleanup cron jobs (or using the new admin panel)
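
For the pg_dump item above, a minimal Python sketch of a dated backup might look like the following. It assumes pg_dump is on the PATH and can authenticate as the current user, and that the database is named mastodon_production as in the install guide. The function names and the output directory are hypothetical.

```python
import datetime
import subprocess


def pg_dump_command(database: str, out_dir: str, today: datetime.date) -> list:
    """Build a pg_dump invocation that writes a dated custom-format archive."""
    out_file = f"{out_dir}/{database}-{today.isoformat()}.dump"
    # The custom format is compressed and can be restored with pg_restore.
    return ["pg_dump", "--format=custom", f"--file={out_file}", database]


def run_backup(database: str = "mastodon_production",
               out_dir: str = "/var/backups/postgres") -> str:
    """Run pg_dump and return the path of the archive it wrote."""
    cmd = pg_dump_command(database, out_dir, datetime.date.today())
    # check=True raises CalledProcessError if pg_dump exits non-zero,
    # so a failed backup is not silently ignored.
    subprocess.run(cmd, check=True)
    return cmd[2].split("=", 1)[1]
```

Calling run_backup() from a daily cron job and copying the resulting file off the box would cover the basics. Restoring one of these archives into a scratch database with pg_restore is the simplest way to confirm the backup is actually usable.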

Discussion

Please send comments my way by responding to my post on Mastodon.
