Memoization in Javascript
Memoization is one of those great things which makes you ask yourself: “how could I have lived without this?”. This isn’t all about science, it’s about performance awesomeness.
Before we start talking about how memoization works and when to use it, we should get to know what this term actually means, according to a great ancient god of the internet called Wikipedia, this is the definition of memoization:
In computing, memoization is an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls and returning the cached result when the same inputs occur again.
Thanks, Wikipedia, this seems like a great and simple explanation.
When should I use memoization?
Well, you should use memoization when you have expensive operations which can have their results cached for later use, this way you won’t be recalculating things and spending unnecessary CPU time.
This technique is specially useful when it comes to dynamic programming, because when we dissolve our code into smaller parts we are able to cache the results only for the subproblems we are dealing with instead of storing every procedure’s result.
Another extremely important factor to decide whether or not you should use memoization is the reusability of the problem’s results. For example: if you’ve got a function that gets the current UNIX timestamp and does an operation using this value you certainly won’t want the result of this function to be cached, because everytime you call it the result may be different, therefore not being cacheable (unless you want some bugs).
You should be aware that my answer is very simplistic and there are other factors you should take into account when using memoization, such as the range of potential inputs and most common inputs. If you are interested into another interesting examples and considerations about memoize please read this StackOverflow discussion.
Enough talking, it’s time we get to code. Open your favorite text editor and follow me!
I’m ready, give me some memoization awesomeness!
Let’s have a humble start and use a common and easy to understand example: the Fibonacci’s sequence.
Our function will calculate the Nth Fibonnacci number recursively. Considering the starting numbers of the sequence as 0 and 1, this is the function we’re going to use:
Now that you’ve seen our raw function, let me show you the whole code we will execute so that you can see every meaningful data printed to the console:
Please notice that I’ve also created a counter in order to know how many times we’ve calculated the fibonacci
value for that index. I also measured the elapsed time between the start and the end of the calculation so that we can compare the results before and after the use of the memoization technique.
When calling our fibonacci
function with 5
as an argument, for example, this is what happens:
- Our function calls
fibonacci(4)
andfibonacci(3)
, because the 5th Fibonacci number is equal to the 4th + 3rd numbers. - Each of these calls does the same thing, asking for the sum of the two fibonacci numbers immediately before
- When asking for
fibonacci(1)
orfibonacci(0)
the function will return the argument value itself
If you’re having trouble understanding this recursion I highly recommend you read the answers to this question in StackOverflow.
Give it a chance and run the code above (preferrably on Node.js so that your browser won’t get stuck). You will notice that the bigger the Fibonacci’s number index gets the most times the fibonacci
value is calculated. You will also notice a meaningful increase in the elapsed time.
When running the code above to get the 20th fibonacci number you will notice that fibonacci(n)
gets calculated 21891 times. This seems a lot (and, in fact, it is), so here comes memoization to the rescue.
We are now going to store the results for the operations happening inside our function on an outer function, which also contains our main fibonacci
calculation algorithm. They key for caching each result will be the input number that was provided. Take a look at our new code:
Basically, it does the same thing as before, it is still recursive, but this time, instead of calling itself and calculating a new result for every number, it checks if that input has already been used and then, if it was, it retrieves the result for that.
For this to work, I just had to create an object called resultsCache
which has key/value pairs in which the key
is the input provided for the function that calculates the fibonacci number and the value
is its result. If that input has not been used yet, I register the result for it on our resultsCache
object before returning the value.
Well, now we’ve got our function let’s add a little bit of code in order to measure it’s performance:
So, we’ve got the same algorithm applied both with and without memoization, let’s run both and compare the results:
- 20th Fibonacci Number (Result: 6765)
- Without memoization
- Elapsed Time: 2ms.
- Calculated fibonacci 21891 times.
- Memoized
- Elapsed Time: 0ms.
- Calculated fibonacci 21 times.
- Without memoization
- 30th Fibonacci Number (Result: 832040)
- Without memoization
- Elapsed Time: 23ms.
- Calculated fibonacci 2692537 times.
- Memoized
- Elapsed Time: 1ms.
- Calculated fibonacci 31 times.
- Without memoization
- 50th Fibonacci Number (Result: 12586269025)
- Without memoization
- Elapsed Time: 564623ms.
- Calculated fibonacci 40730022147 times.
- Memoized
- Elapsed Time: 0ms.
- Calculated fibonacci 51 times.
- Without memoization
As the desired fibonacci index increases, so does the difference between memoized and non-memoized routines. Discovering the 50th fibonacci number without applying the memoize technique took us 564623 milliseconds (which is about 9 minutes) while the memoized version of the same code ran in less than 1 millisecond.
At this point you may be thinking if there is an easier way to create memoized functions. I think we both agree that doing what we just did for every single function we want to apply memoization on our code would be a pain. So, let me answer you: yes there is an elegant way for doing that.
Show me the “Elegant Way”!
Thanks to JavaScript awesome language design we are able to return functions from functions, so we will use this “feature” to return a function which has a cache property inside itself. The code I’m going to show you here was heavily inspired by this article written by Addy Osmany in which he talks about memoization techniques and performance. This may not be the best implementation when it comes do performance, but I’m keeping it simple because I want readers to comprehend memoization and its benefits as a whole instead of implementing the faster available algorithm.
That’s it! Now we’re able to use the memoization technique on any function just by passing it to our memoize
function. Just like:
How about we use this technique with our old fibonacci function? Be careful, just assigning the memoized function to a new variable won’t be enough, because it’s a recursive function, what we’ve really got to do is replace itself with the new memoized function. The whole code would look like this:
And now we’re finally done. I hope you have enjoyed this wild ride through the world of memoization. Before saying goodbye I would like to add some really useful links in which this article was based, they’re all really cool and I’m sure you will learn something with them:
See you soon!
In this post you should’ve learned:
- What is memoization and how it works
- How to decide whether or not you should use memoization
- How to implement memoization on JavaScript