Engineering at scale is much harder but also much simpler than people realize.
At a small scale, almost anything can work. At a large scale, choices become constrained…
In the past, the first thing people did to scale applications was to buy a bigger computer. This works OK for a very brief period. Pretty soon you run out of “bigger”, or the next step up costs something like the GDP of a small country.
So the only way forward for reasonable people is scaling horizontally: splitting your work across lots and lots of small/medium machines.
There are a lot of advantages here. Small/medium machines are cheap and plentiful, and running many of them gives you intrinsic redundancy. There’s just one BIG downside.
You have to figure out: How do you split the work correctly, accurately, and efficiently across 5, 10, 100, 10,000+ machines?
And this is where the difficulty and the simplicity of distributed systems come in. Turns out there’s a pretty limited set of ways to make this work. So you need to be very aware, and choose carefully.
Those successful methods almost inevitably involve at least three key things (a minimal sketch of each follows the list):
1. Partitioning/sharding/splitting data and computation.
2. Decoupling different parts of the system through messaging systems and async processing.
3. Caching.
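To make the first idea concrete, here is a minimal sketch of hash-based sharding in Python. The shard names and the key format are made up for illustration; real systems often use consistent hashing (or a database’s built-in sharding) rather than a plain modulo, because modulo remaps most keys whenever the shard count changes.

```python
import hashlib

# Minimal hash-based sharding sketch. SHARDS and the key format are illustrative.
SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]

def shard_for(key: str) -> str:
    """Route a key to a shard by hashing it and taking the digest modulo the shard count."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

# Every node that applies the same rule agrees on where a key lives,
# so reads and writes for "user:42" always land on the same shard.
print(shard_for("user:42"))
```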
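For the second idea, here is a sketch of decoupling with a work queue. It uses an in-process queue and a thread purely to keep the example self-contained; in a real system the queue would be a message broker such as Kafka, RabbitMQ, or SQS, and the worker would run on separate machines. The job fields are hypothetical.

```python
import queue
import threading

# In-process stand-in for a message broker; the job fields are made up.
work_queue: "queue.Queue[dict]" = queue.Queue()

def handle_order(order_id: int) -> None:
    # The request handler only enqueues the slow work and returns immediately.
    work_queue.put({"type": "send_receipt", "order_id": order_id})

def worker() -> None:
    # A separate worker (in real life: another process or machine) drains the queue at its own pace.
    while True:
        job = work_queue.get()
        print(f"processing {job}")
        work_queue.task_done()

threading.Thread(target=worker, daemon=True).start()
handle_order(1001)
work_queue.join()  # wait here only so the demo prints before the script exits
```

The point of the pattern is that the producer and the consumer no longer have to be up, fast, or scaled at the same time: the queue absorbs bursts and lets each side fail or scale independently.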
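And for the third, a sketch of the cache-aside pattern: check the cache, fall back to the source of truth on a miss, then populate the cache. The in-process dict and the load_user_from_db helper are stand-ins; in practice the cache is usually something like Redis or Memcached.

```python
import time

# In-process TTL cache used as a stand-in for Redis/Memcached.
CACHE: dict = {}
TTL_SECONDS = 60.0

def load_user_from_db(user_id: int) -> dict:
    # Hypothetical slow database query, faked here so the sketch runs on its own.
    return {"id": user_id, "name": "example"}

def get_user(user_id: int) -> dict:
    key = f"user:{user_id}"
    entry = CACHE.get(key)
    if entry is not None and time.monotonic() - entry[0] < TTL_SECONDS:
        return entry[1]                    # cache hit: skip the database entirely
    user = load_user_from_db(user_id)      # cache miss: hit the source of truth
    CACHE[key] = (time.monotonic(), user)  # repopulate so the next read is fast
    return user

print(get_user(42))  # miss, loads from the "database"
print(get_user(42))  # hit, served from the cache
```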
Understanding these concepts is the simple part. Applying them in practice is the hard part.
In future posts, I’ll go over the key techniques used to scale systems.