Wednesday, February 27, 2019

An Exercise In Optimization

Sometimes when developing code you can spend too much time trying to optimize code that does not need to be optimized, especially if it makes the code harder to understand (often negatively referred to as "premature optimization").  However, sometimes optimization can be very worthwhile.

A quick example to share.  I was working on a segment of code last week that was taking a long time to run, on the order of 5+ minutes for something that seemed like it should be much quicker.  The code made sense on a read-through, but it definitely felt as though this one might be worth spending time to optimize.

Upon closer inspection, I realized that a key index was missing from the database lookup.  This code had recently changed from processing a flat file to pulling information from a SQLite database.  The table it was having to query was large, and the column it was querying against wasn't part of an index, so every lookup had to scan the whole table.  Upon adding the index, the run time of the code dropped from over 5 minutes down to 31 seconds.
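
Here is a minimal sketch of how a missing index can be spotted and fixed, using Python's built-in sqlite3 module.  The table name (records), column names, and index name are hypothetical stand-ins, not the ones from the real code.  It asks SQLite for the query plan before and after the index is created.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The detail text is the last column of each plan row.
    return [row[-1] for row in rows]

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records (id INTEGER PRIMARY KEY, lookup_key TEXT, payload TEXT)"
)
conn.executemany(
    "INSERT INTO records (lookup_key, payload) VALUES (?, ?)",
    [(f"key{i}", f"value{i}") for i in range(1000)],
)

sql = "SELECT payload FROM records WHERE lookup_key = ?"

# Without an index on lookup_key, SQLite has to scan the whole table.
plan_before = query_plan(conn, sql, ("key42",))

# Add an index on the column used in the WHERE clause.
conn.execute("CREATE INDEX idx_records_lookup_key ON records (lookup_key)")

# The same query can now search the index instead of scanning.
plan_after = query_plan(conn, sql, ("key42",))

print(plan_before)
print(plan_after)
```

The exact wording of the plan varies between SQLite versions, but the first plan should report a scan of the table and the second should mention the new index.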

All done!  Or not.  Once you are there, it's worth making sure you are not overlooking other possible issues.  I realized it was also querying one other table that, although it was smaller, was suffering from the same issue.  After adding an appropriate index for that table as well, the run time had dropped to 15 seconds.

But that's not all!  Upon a closer review of the code, I realized that in a lot of cases it was repeatedly looking up data with the same parameters, so I added an in-memory cache to hold that data as needed throughout the processing.  This dropped the total run time down to 5 seconds.
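
The caching idea might look something like the sketch below.  This is not the actual code; the CachedLookup class, the records table, and the db_hits counter are all made up for illustration.  A plain dictionary remembers each result, so a repeated lookup with the same parameter never goes back to the database.

```python
import sqlite3

class CachedLookup:
    """Hypothetical wrapper that caches lookups by key in a dict."""

    def __init__(self, conn):
        self.conn = conn
        self.cache = {}
        self.db_hits = 0  # counts how often the database is actually queried

    def get(self, key):
        if key not in self.cache:
            self.db_hits += 1
            row = self.conn.execute(
                "SELECT payload FROM records WHERE lookup_key = ?", (key,)
            ).fetchone()
            # Cache misses too (as None), so absent keys are not re-queried.
            self.cache[key] = row[0] if row else None
        return self.cache[key]

# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (lookup_key TEXT, payload TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?)", [("a", "alpha"), ("b", "beta")]
)

lookup = CachedLookup(conn)
results = [lookup.get(k) for k in ["a", "b", "a", "a", "b", "missing", "missing"]]

print(results)
print(lookup.db_hits)
```

Seven lookups turn into three queries here.  A cache like this only makes sense if the underlying data does not change during the run; otherwise it would need some form of invalidation.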

And then I realized I had also left another lookup out of the caching, one that was only needed when populating the cache.  After moving that last lookup to the code that utilized the cache, the run time dropped again, down to 14 milliseconds.

From 5+ minutes to 14 milliseconds.

I wasn't aware I could make it that fast, but by being mindful of what seems reasonable and being careful in my review of how the processing was operating (and not exclusively looking at just the database or just one segment of code), I was able to dramatically improve it.  Sometimes optimization can be very worthwhile.  Just use your best judgement on where it seems worthwhile, and where it would just complicate the code.
