An Intro to Scaling Basics
I recently commented on a LinkedIn post discussing the differences between "coding" and "software engineering" (though I'll still assert that the former is a skill of the latter and not a separate profession). The post boiled down to - "it's possible that coding may come easy to some people, but software engineering is a much more complex set of skills that won't be overtaken by GenAI anytime soon. These skills include things like scaling the system."
Someone was intrigued enough to ask me for some basic scaling assistance, and my response - as it will be here - was that scaling is very nuanced. Still, there are certain principles that hold regardless of your particular use case.
My experience is predominantly in backend scaling, so I don't have domain-specific tips at this juncture for unblocking JavaScript issues on the front end, but many of the principles should apply in a general sense regardless.
Rule #1 when working toward a scalable solution is - don't design for more than an order of magnitude beyond your current expected load. I'm defining orders of magnitude as powers of 10, starting at 10^0, or 1 user.
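To put the rule in concrete arithmetic terms, here's a tiny sketch - the function name and the "next power of 10" framing are just my way of expressing the rule, not any formal standard:

```python
# The "one order of magnitude" rule as arithmetic: design for the next power
# of 10 above your current expected load, not for the load you dream about.
import math

def design_target(current_users: int) -> int:
    """Next power of 10 above the current load (never more than ~10x)."""
    return 10 ** (math.floor(math.log10(max(current_users, 1))) + 1)

print(design_target(1))    # 10
print(design_target(750))  # 1000 -- not the 1,000,000 you're daydreaming about
```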
People often get too carried away thinking about the load they want the system to handle rather than the load it should handle. This isn't limited to neophytes. Entire organizations will overdesign a system in order to match where they want to be rather than where they are.
This is self-defeating for two primary reasons. First, scalable code is necessarily more complex than code built for one user. You need to handle many more edge cases across various types of users and support far more systems than just a local PC, which means more ways to fail.
Second, the problems you think you're guarding against preemptively aren't the same scaling problems you'll actually face when you reach the user load you desire. By then you've already got a ton of complex code that needs to be worked around in order to add more complex code to solve the issue at hand.
So, start with meeting the needs of one user - either a theoretical user or a Sugar Daddy who will pay you indiscriminate sums just to manage their collection of ant farms through a glorified spreadsheet. If you meet that use case, don't worry about creating generalized ant farm tracking software. Just run with it.
Once you've worked out your scalability for 1 - i.e. you're feature-complete from your initial requirements and have tested sufficiently to ensure the site is usable - you can think about scaling up to 10 users.
I'll admit that I'm exaggerating a bit. Chances are good that if you've scaled for 1 user you've probably scaled for 10 users, but don't take scalability for granted.
If, for instance, your application can only handle one request at a time (i.e. it's single-threaded), you may start to see performance problems at 10 users if all of them are expected to be online at the same time. This exposes another key point about scalability - know your load profile.
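To make that concrete, here's a minimal sketch using only Python's standard library; the one-second handler is a stand-in for whatever real work your app does:

```python
# A minimal sketch of the single-threaded bottleneck. The port and the fake
# one-second workload are arbitrary, purely for illustration.
from http.server import BaseHTTPRequestHandler, HTTPServer, ThreadingHTTPServer
import time

class SlowHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        time.sleep(1)  # pretend each request does ~1 second of work
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok\n")

# HTTPServer handles one request at a time: 10 simultaneous users queue up
# behind each other, and the last one waits roughly 10 seconds.
# server = HTTPServer(("localhost", 8000), SlowHandler)

# ThreadingHTTPServer handles each request in its own thread, so those same
# 10 users each wait roughly 1 second.
server = ThreadingHTTPServer(("localhost", 8000), SlowHandler)
server.serve_forever()
```

Swapping the server class obviously isn't the whole story - real frameworks and application servers handle this for you - but it shows why "works for 1" doesn't automatically mean "works for 10."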
Let's assume that you've scaled up to 10,000 users. Well, are those users all online at the same time, or are they split among different time zones? Scaling for those two profiles will be significantly different.
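Here's a hedged back-of-the-envelope comparison; the requests-per-user and peak-window numbers are invented purely to show how different the two profiles can look for the same 10,000 users:

```python
# Back-of-the-envelope peak load for two hypothetical profiles of the same
# 10,000 users. Requests per day and peak windows are made-up assumptions.
USERS = 10_000
REQUESTS_PER_USER_PER_DAY = 50

def peak_rps(users, requests_per_day, peak_fraction, peak_window_hours):
    """Requests/sec if `peak_fraction` of daily traffic lands in the window."""
    peak_requests = users * requests_per_day * peak_fraction
    return peak_requests / (peak_window_hours * 3600)

# Profile A: a single-region user base that mostly shows up 9am-5pm local time.
# Assume 80% of the day's traffic lands in that 8-hour window.
print(f"single region: ~{peak_rps(USERS, REQUESTS_PER_USER_PER_DAY, 0.8, 8):.1f} req/s")

# Profile B: users spread evenly across time zones, traffic smeared over 24h.
print(f"global spread:  ~{peak_rps(USERS, REQUESTS_PER_USER_PER_DAY, 1.0, 24):.1f} req/s")
```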
With that in mind, there are a few ways you can take advantage of existing offerings that may significantly reduce your headaches around scalability, especially if you have a small engineering team.
If you're producing static content like a blog or images of your artwork, you can get big gains using something like GitHub Pages. GitHub doesn't charge for deploying these types of sites. You may have to pay a nominal fee (around $15/year) if you want a custom domain, and you may need to pay for additional storage and bandwidth if you're hosting a lot of content, but that's still significantly cheaper (say $50/year for the domain, storage, and bandwidth) than hosting an interactive website.
You'll likely face issues with huge amounts of content, but it's reasonable to assume that most use cases for static sites can handle user loads well into the 1000s with just the basics.
Once you've crossed into the interactive website zone (something that requires a database or other means of maintaining your site's state), things begin to get a bit more interesting and a bit more expensive. This calculator provides a good back-of-the-envelope estimate for load, and this article also gives good guidance for estimates.
It seems reasonable that for about $10/mo. you can scale up to about 10K users with few headaches. There are several platforms that will host your application and handle autoscaling for you. At low traffic levels, they're either free or low cost (around $5-12/mo.), but they get pricier with higher loads, so pay attention.
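As a sketch of how I'd sanity-check that kind of number - with placeholder throughput and pricing you'd replace with your platform's actual figures:

```python
# A toy cost estimate, assuming a small instance comfortably serves ~50 req/s
# and costs about $7/month. Placeholder numbers, not any vendor's pricing.
import math

def instances_needed(peak_rps, rps_per_instance=50, headroom=2.0):
    """Round up, with a safety factor so a traffic spike doesn't topple you."""
    return max(1, math.ceil(peak_rps * headroom / rps_per_instance))

def monthly_cost(peak_rps, cost_per_instance=7.0):
    return instances_needed(peak_rps) * cost_per_instance

for users, rps in [(10_000, 14), (100_000, 140)]:
    print(f"{users:>7} users @ ~{rps} req/s peak -> "
          f"{instances_needed(rps)} instance(s), ~${monthly_cost(rps):.0f}/mo")
```

The point isn't the exact dollar amount; it's that a five-minute calculation tells you whether a platform's pricing tier is in the right ballpark before you commit.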
Cost, however, is relative. For you, $100/mo. might be reasonable. If so, it's likely that you can scale up to 100K users for most cases with relative ease while relying on someone else to operate your infrastructure.
It's also likely that, at some point, the autoscaling mechanisms might break down or appear too costly to continue to invest in.
If you've reached that point, you'll want to look into container management systems, with managed Kubernetes as the leading candidate for reaching your next scaling break point. At any stage of the journey, my advice is to leverage the existing tools available, even if it means vendor lock-in. This goes double if your engineering team is small. You're paying a premium in lieu of a seasoned SRE team and the expertise they provide. Keep in mind that even seasoned SREs look for the simplest possible solution rather than attempting to reinvent the wheel. There are big gains and peace of mind to be had by offloading complexity to someone who makes it their full-time job.
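If it helps demystify what you're paying for, the core of what an autoscaler like the Kubernetes Horizontal Pod Autoscaler does boils down to a proportional scaling rule roughly like this (the metric values and replica bounds here are arbitrary examples):

```python
# A sketch of the proportional rule behind horizontal autoscaling:
# scale replicas by how far an observed metric is from its target.
import math

def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """e.g. CPU at 90% against a 60% target -> roughly 1.5x the replicas."""
    desired = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, desired))

# 4 pods running at 90% CPU against a 60% target -> scale out to 6 pods.
print(desired_replicas(current_replicas=4, current_metric=90, target_metric=60))
```

The managed version adds the hard parts - gathering the metrics, damping flapping, rolling pods in and out safely - which is exactly the kind of complexity you're offloading.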
This isn't to say that in the epic "build vs. buy" conversation, you should always choose "buy," because, even among reputable vendors, you may get nickel-and-dimed for every service and will likely get sold a horse that's been painted black and white. It's the heart of capitalism to over-market and under-deliver.
But you should evaluate every major purchase, balancing your own instincts against the advice of your engineering team (and management team). Remember, the ultimate goal is to release and maintain your software easily and at a cost you deem reasonable. Using those two as the variables to optimize against, you can be more pragmatic in your choices.
As I mentioned above, these are all very general guidelines, because every scaling situation is unique. If your core clientele is data scientists or ML engineers, chances are good that scaling to 10 or 100 users is a much different problem from scaling an e-commerce site to 1,000 or 100K users.
And even for the companies that are known for massive scaling, like Google, Netflix, or Amazon, the problems will be extremely varied.
For instance, Netflix relies on content that isn't likely to change too often but is very resource-intensive to serve up. They don't need to ensure immediate consistency in the experience so much as they need to make sure streams don't stall on their end.
Google indexes everything under the sun, but - at least with their core search business - they can have some inconsistency as long as everything returns results quickly.
Amazon needs to manage inventory across a seemingly infinite number of items and ensure that inconsistencies are minimized, so users aren't hit with the nasty shock of racing to buy the last in-stock unit of the latest Beanie Baby only to find it's already gone.
And all of those pale in comparison, consistency-wise, to travel website scaling. Having worked at two large travel websites, I can say the number of users isn't as great as at the companies I just mentioned, but the data needs to be as accurate as reason dictates.
People get emotional about travel and vacations and don't like seeing prices spike between requests or, worse, seeing their dream trip evaporate before their eyes. Items on Amazon can be restocked and people will sigh and accept the hiccup. While travel can be delayed, two people cannot occupy the same plane seat at the same time. If the destination is time-bound (like, say, the Olympics or a Taylor Swift concert), it's imperative that people have accurate information before shelling out a lot of money.
So, if some Product Manager quotes a Google white paper stating that every additional second of search time loses X number of customers, remind them that what you're scaling matters. People would much rather wait 10 seconds for good flight results than get stale results from a 1-second search that will be repriced immediately, erasing the gains of the quick search.
Finally, while there are lessons to incorporate from previous scaling attempts, assume you're going to fail at every step. The best method for moving forward is to measure the immediate bottleneck, do just enough to fix it, and move on to the next bottleneck. It can be frustrating, but the basics of scalability, and the patience that must accompany them, are valuable and rewarding skills to accumulate.
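In practice, "measure the immediate bottleneck" can start as simply as timing the suspect operation and looking at tail latency rather than the average; `handle_request` below is just a placeholder for whatever you're actually profiling:

```python
# A minimal sketch of "measure before you fix": time a suspect operation many
# times and look at tail latency (p95/p99), not just the average.
import random
import statistics
import time

def handle_request():
    time.sleep(random.uniform(0.01, 0.05))  # placeholder for real work

samples = []
for _ in range(200):
    start = time.perf_counter()
    handle_request()
    samples.append(time.perf_counter() - start)

cuts = statistics.quantiles(samples, n=100)  # 99 percentile cut points
print(f"p50={cuts[49]*1000:.1f}ms  p95={cuts[94]*1000:.1f}ms  "
      f"p99={cuts[98]*1000:.1f}ms")
```

Fix whatever dominates the tail, re-measure, and repeat.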
Until next time, my human and robot friends.