Software Testing - Meh, But Even Faster!
Today, we continue traversing the landscape of software testing by exploring performance testing.
For a start-up or a micro-operation, performance testing may not be strictly necessary. It's important to ensure that the software functions as you expect it to before worrying about whether or not it can scale. If the product is a little slow at first, or occasionally wonky, and you still need to nail down the functionality, do that first.
People are forgiving when trying out a brand-new product, as long as it works without major flaws. A little hiccup or delay won't immediately put them off.
Once you've got the functionality handled, though, it's time to look at any reliability or performance bottlenecks. Or, as in several of the performance-enhancement projects I've been a part of, at the areas of the stack you want to improve in order to allow more concurrent customers to access your site.
The first thing to do before testing is to make an ironclad promise to yourself and your team that you won't write new code unless it's absolutely necessary. Performance and reliability problems are usually the result of complexity, and adding more complexity is likely to add performance and reliability problems rather than reduce them.
The corollary is that reducing complexity often solves the performance and reliability problems you're investigating, sometimes without requiring a deep investigation at all. At one point in my career, we were attempting to double the throughput of hotel searches we served per minute. Most of the gains we made came from tuning a handful of existing database queries. We didn't need to procure more hardware, and we didn't need to rewrite vast swaths of code to achieve our aims - we just needed to simplify some SQL.
This may seem ho-hum to people who just want to pad their lifetime lines-of-code metric, but treating projects like this as a mystery rather than a chore helps keep the team's engagement up. And, if you can solve a problem like doubling throughput without an additional capital outlay or a months-long project, everyone tends to be happy. Don't worry! This is planet Earth. There's always another problem to solve.
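If you're curious what "simplify some SQL" looks like in practice, the heart of it is just measuring a candidate rewrite against the query it replaces. Here's a minimal sketch of that kind of harness, assuming a Postgres database reached through psycopg2 - the connection string, schema, and both queries are hypothetical stand-ins, not the actual hotel-search SQL.

```python
# A minimal sketch of comparing an existing query against a pared-down rewrite,
# assuming Postgres and psycopg2. The DSN, schema, and queries are hypothetical.
import statistics
import time

import psycopg2

DSN = "dbname=hotels user=readonly"  # hypothetical connection string

# Hypothetical stand-ins for the query under test and a candidate rewrite.
ORIGINAL = """
    SELECT h.id, h.name
    FROM hotels h
    JOIN availability a ON a.hotel_id = h.id
    WHERE a.city = %(city)s
"""
SIMPLIFIED = """
    SELECT id, name
    FROM hotels
    WHERE id IN (SELECT hotel_id FROM availability WHERE city = %(city)s)
"""


def median_query_ms(conn, sql, params, runs=20):
    """Run a query repeatedly and return the median wall-clock time in ms."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        with conn.cursor() as cur:
            cur.execute(sql, params)
            cur.fetchall()
        samples.append((time.perf_counter() - start) * 1000)
    return statistics.median(samples)


if __name__ == "__main__":
    params = {"city": "Berlin"}
    with psycopg2.connect(DSN) as conn:
        print(f"original:   {median_query_ms(conn, ORIGINAL, params):.1f} ms")
        print(f"simplified: {median_query_ms(conn, SIMPLIFIED, params):.1f} ms")
```

Run both against realistic data, compare the medians, and you've got a much stronger argument than a hunch about which shape of query to ship.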
Now that we've determined we don't want to clutter our systems with more complexity, the next step is to identify what we're measuring in our performance tests. Though this sounds obvious, it's very easy to select a metric that's adjacent to the problem you're trying to solve and wander gradually further and further from the core issue into foreign territory.
Network latency is always my favorite bogeyman in these situations. In the case I outlined above - doubling the throughput of our hotel searches - it's easy to zero in on the service-to-service calls and try to either reduce their number or shrink the time spent on each one. Now we've inadvertently pegged network latency to our throughput problem, even though network latency may not be the problem at all.
Reducing the number of service calls or the time per call would've been expensive in terms of development time. We would've needed to rewrite entire portions of our stack for dubious reasons and for gains that were difficult to predict. And we would've fixated on a network latency problem while largely ignoring the main issue - increasing throughput.
While latency was the ultimate contributor to the issue above, it was latency within the database, which calls for an entirely different approach than trying to shrink the network paths in the system. By focusing on the core problem of throughput and looking for innovative solutions that swatted away our preconceived notions, we saved ourselves a lot of unnecessary work.
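Keeping the measurement tied to the core metric also keeps the tooling simple. Here's a rough sketch that just counts end-to-end searches completed per minute against a hypothetical staging endpoint, rather than instrumenting every service-to-service call along the way.

```python
# A rough sketch of measuring the metric that actually matters here: end-to-end
# searches completed per minute. The endpoint URL and payload are hypothetical.
import time
from concurrent.futures import ThreadPoolExecutor

import requests

SEARCH_URL = "https://staging.example.com/api/search"  # hypothetical endpoint
DURATION_S = 60    # measure for one minute
CONCURRENCY = 32   # simulated concurrent searchers


def one_search():
    """Issue a single search and report whether it succeeded."""
    try:
        resp = requests.get(SEARCH_URL, params={"city": "Berlin"}, timeout=10)
        return resp.status_code == 200
    except requests.RequestException:
        return False


def searches_per_minute():
    """Fire waves of concurrent searches for one minute and count successes."""
    completed = 0
    deadline = time.monotonic() + DURATION_S
    with ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
        while time.monotonic() < deadline:
            futures = [pool.submit(one_search) for _ in range(CONCURRENCY)]
            completed += sum(f.result() for f in futures)
    return completed


if __name__ == "__main__":
    print(f"throughput: {searches_per_minute()} searches/minute")
```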
Once you've got a core metric to focus on - or even if you're still searching for one - make sure you establish a baseline for all the affected systems under test.
Don't baseline only the production systems; include the pre-production systems you'll be verifying your experiments against.
Things have changed a bit with the ease of spinning up cloud-based infrastructure, but typically your test or staging system is going to be far smaller and simpler than your production system, and, therefore, not an apples-to-apples comparison.
If you can temporarily spin up your test systems to match your production systems while also handling a similar load profile, then that's obviously an ideal solution. A 1:1 pairing of test to prod is going to give you the best indicator of how changes will work in production.
Usually, though, budget and system complexity make this difficult or impossible. If this applies to you, then work to stabilize your test environment, regardless of its size, so you can make reasoned arguments about its performance.
Also, take as many notes as possible about the discrepancies between the two systems. Maybe your test system is homogeneous and your tests point to the same search locations over and over again, thus relying more heavily on caching infrastructure than your production system will.
Or, maybe the opposite is true, and your production systems issue a handful of very common queries and the test system is designed to try more complex scenarios. In either case, you'll want to align your testing scripts to the realities of the production system as much as possible.
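One low-effort way to do that alignment is to sample your test searches in the same proportions production sees. The sketch below assumes you can pull rough query frequencies from a day of production logs; the searches and counts themselves are hypothetical placeholders.

```python
# A small sketch of shaping the test load to match the production traffic mix.
# The search descriptions and counts are hypothetical placeholders, roughly the
# sort of thing you'd extract from a day of production access logs.
import random

PRODUCTION_MIX = {
    "city:paris,guests:2": 54_000,
    "city:london,guests:2": 31_000,
    "city:tokyo,guests:4": 9_000,
    "multi-city,flexible-dates": 6_000,  # the long tail of complex searches
}


def sample_searches(n):
    """Draw n test searches in the same proportions production sees."""
    queries = list(PRODUCTION_MIX)
    weights = list(PRODUCTION_MIX.values())
    return random.choices(queries, weights=weights, k=n)


if __name__ == "__main__":
    batch = sample_searches(1000)
    # Feed `batch` into your load-testing scripts; the cache-hit profile of the
    # test run now roughly mirrors production instead of one repeated search.
    print(batch[:5])
```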
Don't worry if alignment only gets you so far. Your goal here is to develop a formula that correlates the performance of one system with the other. Ideally, you'd be able to say that if your staging system can handle a throughput of 500 searches per minute, production can handle 10,000.
Chances are you'll never achieve such a simple relationship, but you're looking for patterns and predictability. Even if your variability is as wild as 25%, you can still say that 500 searches per minute in staging translates to somewhere between 7,500 and 12,500 in production. You can then use that to guide your work, or to assume an appropriate amount of risk when releasing your changes, based on the level of variability.
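Written down, that projection is nothing more than a scaling factor plus an error band. A back-of-the-envelope sketch, using the example numbers from above rather than real measurements:

```python
# A back-of-the-envelope version of the staging-to-production projection:
# a scaling factor learned from past runs, plus a variability band. The
# numbers are the example figures from the text, not real measurements.

def project_production(staging_per_min, scale=20.0, variability=0.25):
    """Project production throughput from a staging measurement.

    scale: observed production/staging ratio (here 10,000 / 500 = 20).
    variability: how far past projections have missed, as a fraction.
    """
    midpoint = staging_per_min * scale
    return midpoint * (1 - variability), midpoint, midpoint * (1 + variability)


low, mid, high = project_production(500)
print(f"500/min in staging -> {low:,.0f} to {high:,.0f}/min in production "
      f"(midpoint {mid:,.0f})")
```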
Finally, when it comes to testing - or improving performance in general - watch out for premature optimization. If you've got a specific goal - doubling your search capacity to 10,000 per minute - stick to that goal. There's a tendency to modularize your systems to make them easier to extend in the future. Except for the cases where this is blindingly obvious (and I hesitate even saying that, since everyone tends to think their premature optimization is blindingly obvious), don't do it!
Performance enhancements in particular tend to be one-offs. Rarely can you apply the exact same template to the next bottleneck, unless you're dealing with a lowest common denominator situation. Adding code to modularize it in the future just adds complexity and, thus, a greater likelihood that you're working against your aims.
If you don't have a specific goal, resist the temptation to tinker or overthink in an effort to preemptively improve performance. People are very reluctant to add a new service call or reconfigure parts of the stack for fear of adding another dreaded "hop" to the network. Engineers often rail against network latency without knowing whether or not the current latency is even poor.
If adding a new hop to the network greatly simplifies the architecture and keeps you within your permitted performance bounds, do it! Worry about that network hop when it's actually an issue. Don't fight a simple solution over a remote chance that it might make things slightly slower.
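If you want numbers instead of a hunch, the check is cheap. Something like the sketch below, with a hypothetical internal endpoint and a hypothetical 50 ms budget, tells you whether the new hop even registers against your actual goal.

```python
# A hedged sketch of checking whether a proposed new network hop actually
# threatens your performance goal. The service URL and the 50 ms budget are
# hypothetical; substitute your own endpoint and your own numbers.
import statistics
import time

import requests

NEW_HOP_URL = "https://internal.example.com/geo/lookup"  # hypothetical service
BUDGET_MS = 50.0  # how much added latency the overall goal can tolerate


def measure_hop(samples=100):
    """Time repeated calls to the proposed new dependency, in milliseconds."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        requests.get(NEW_HOP_URL, timeout=5)
        timings.append((time.perf_counter() - start) * 1000)
    return timings


if __name__ == "__main__":
    timings = measure_hop()
    p50 = statistics.median(timings)
    p95 = statistics.quantiles(timings, n=20)[18]  # 95th-percentile cut point
    print(f"p50={p50:.1f} ms, p95={p95:.1f} ms, budget={BUDGET_MS} ms")
    if p95 <= BUDGET_MS:
        print("The extra hop fits the budget; take the simpler architecture.")
    else:
        print("Now there's a real number to argue about, not a hunch.")
```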
That's it for my performance-testing thoughts. I'll round out the testing series by addressing manual testing and the role of QAs next.
Until next time, my human and robot friends.