Engineering At Scale

Brenn
Dec 21, 2022

Engineering at scale is much harder but also much simpler than people realize.

At a small scale, almost anything can work. At large scale, your choices become constrained…

In the past, the first thing people did to scale applications was to buy a bigger computer. This works OK for a *very brief* period. Pretty soon you run out of "bigger," or the next step up costs something like the GDP of a small country.

So the only way forward for reasonable people is scaling horizontally: splitting your work across lots and lots of small/medium machines.

Thereโ€™s a lot of advantages here. Small/medium machines are plentifully available, cheap, and there is an intrinsic redundancy. Thereโ€™s just one BIG downside.

You have to figure out how to split the work correctly, accurately, and efficiently across 5, 10, 100, 10,000+ machines.
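One common answer to that question is to assign each piece of work to a machine by hashing its key. This is a minimal sketch of that idea; the key format and shard count are illustrative, not from the original post:

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard by hashing it, so each machine owns a stable slice of keys."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % num_shards

# Every caller computes the same shard for the same key, with no coordination needed.
shard = shard_for("user:42", 10)
```

The catch: changing `num_shards` remaps almost every key, which is why production systems tend to reach for consistent hashing instead of a plain modulo.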

And this is where the difficulty and simplicity of distributed systems comes in. Turns out there are a pretty limited set of ways to make this work. So you need to be **very aware, and choose carefully**.

Those successful methods almost inevitably involve at least 3 key things:

1. Partitioning/sharding/splitting data and computation.
2. Decoupling different parts of the system through messaging systems and async processing.
3. Caching.
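Item 2, decoupling through messaging, can be sketched with Python's standard-library `queue`; the worker logic here is a hypothetical stand-in for real processing:

```python
import queue
import threading

# Producer and consumer only share a queue, so each side can run,
# fail, and scale independently of the other.
work = queue.Queue()
results = []

def worker():
    while True:
        item = work.get()
        if item is None:  # sentinel value signals shutdown
            break
        results.append(item * 2)  # stand-in for real processing
        work.task_done()

t = threading.Thread(target=worker)
t.start()

# The producer just enqueues and moves on; it never waits on the worker directly.
for n in range(5):
    work.put(n)
work.put(None)
t.join()
# results is now [0, 2, 4, 6, 8]
```

In a real system the in-process queue would be something like Kafka or RabbitMQ, but the shape of the decoupling is the same: producers and consumers agree only on the message format, not on each other's timing.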

Understanding these concepts is the simple part. Applying them in practice is the hard part.
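As one small example of the gap between concept and practice, even the simplest of the three, caching, forces a decision about staleness. A minimal TTL cache sketch (the class name and API are illustrative assumptions):

```python
import time

class TTLCache:
    """Tiny illustrative cache: serve a stored value until it expires, then recompute."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self.store = {}  # key -> (value, expiry_time)

    def get(self, key, compute):
        entry = self.store.get(key)
        now = time.monotonic()
        if entry is not None and entry[1] > now:
            return entry[0]  # cache hit: skip the expensive work
        value = compute()    # cache miss or expired: do the work once
        self.store[key] = (value, now + self.ttl)
        return value

cache = TTLCache(ttl_seconds=60)
calls = 0

def expensive():
    global calls
    calls += 1
    return "result"

cache.get("k", expensive)
cache.get("k", expensive)
# calls is 1 — the second lookup was served from the cache
```

The hard part in practice is everything this sketch ignores: invalidation, memory bounds, and what every machine in the fleet does when a hot key expires at the same moment.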

In future posts, Iโ€™ll go over the key techniques used to scale systems.
