Engineers, be explicit in your Promises
Small tweaks to your asynchronous code can wind up with gigantic rewards.
Raise your hand if you've seen this one before:
export async function getItem() {
  const dataOne = await getDataOne();
  const dataTwo = await getDataTwo();
  return {
    "one": dataOne,
    "two": dataTwo
  };
}
Looks pretty straightforward, right? It's easy to read, it's a small amount of code that looks pretty easy to unit test, and overall this function looks like it could be found in pretty much any JavaScript codebase out there. You see variations of this in just about every major language: Java, Scala, Python, C#, you name it, and there is some concept of a Future / Promise / whatever you want to call it.
The senior engineers in the audience, though, are reading through this and shaking their heads, because they can see the performance problem here: the second call doesn't start until the first one has finished. They have a one-line fix that will cut our execution time roughly in half, assuming the two calls take about the same amount of time.
How Promises express flow of execution.
First, a quick primer, since JavaScript treats Futures/Promises a little bit differently. JavaScript runs your asynchronous code on a single-threaded event loop: while one Promise is waiting on I/O (a network call, a database query), the loop is free to start or resume other work. You have very little control over how that work is scheduled, and you can't simply add threads to scale up the capacity of an application that needs a bunch of Promises running concurrently.
In Java, Scala, Kotlin, etc, you can create pools of threads at the application level. Want your Future which accesses your database to run in an isolated pool away from Futures that run expensive computations? Not a problem. The upshot of this is that you need to be more careful about how you size out these various pools, and you need to monitor the pools in order to tune them for optimal performance.
Regardless of the language, it becomes very easy to overwhelm a pool of threads very quickly, and the results of that can be stalled code, slow API responses, and an overall bad user experience. Even the slightest little issue (like what's in our function above) can balloon into a massive user experience problem.
One of my technical interview questions, which catches a ton of engineers out.
First of all, for prospective applicants who have come across this article prior to interviewing with me, congratulations, you're going to get bonus points for doing your research. Tell me that you read this article!
Okay - here goes. This is a surprisingly easy question that winds up frustrating so many people. I've used it as a base skill check, but also it's a great way to see how candidates go through a flow of thinking, and seeing when candidates are brave enough to raise their hand for help. So here we are.
I have a method which winds up calling a local county API to get a local list of residents. The API always returns results in the same order, and returns a paginated list of 10 items at a time. You call this API with the request parameter "page=N" to get the various paginated results. page=1, page=2, and so on.
Since this is a local county API, not well funded and not brought up to the kind of performance that we would expect, that county API will return results after 1 second of time. No matter how you call it, it's always grinding away on some local county database that may or may not be indexed correctly, and after 1 second, it will return the JSON that we need.
Here's my conundrum: I want my method to grab the first 100 residents, only show me people who are old enough to vote, and then spit out those people as JSON, but I want my method to run in less than 2 seconds.
Do you have your solution?
It's amazing how many engineers get stumped by this question. Even engineers that bill themselves out as lead level wind up missing this one, badly. Here's what I wind up seeing the most:
const people = [];
for (let page = 1; page <= 10; page++) {
  const results = await callApi(page);
  for (const person of results) {
    if (person.age >= 18) {
      people.push(person);
    }
  }
}
return people;
I've seen a million different variations of this solution in multiple languages, either whiteboarded or written on paper or whatever. Engineers rush through writing this code and they get that really accomplished look on their faces up until the point at which I ask them how quickly it's going to run.
Then there's a bunch of erasing and scribbling and what have you, and this is when engineers get stumped, because they have that for loop just implanted in their brain, and the idea that this function winds up taking 10 seconds to run just breaks a lot of people to the point where they can't recover. (All is not lost here! If through having a collaborative conversation with me, they find their way, then they've shown me how willing they are to raise their hands and ask for help instead of grinding through something for hours.)
The easiest performance gain you're ever going to make.
So alright, what's the correct solution here? The one I would write is this:
const people = [];
for (let page = 1; page <= 10; page++) {
  people.push(callApi(page).then((results) => results.filter((person) => person.age >= 18)));
}
return Promise.all(people).then((pages) => pages.flat());
This will guarantee you 100 people or fewer in your response, and will take a little over 1 second, because you've spawned 10 Promises that run concurrently and you collect all of their data with the Promise.all method at the end.
Note to engineers: Promise.all is a valuable tool, so use it. It takes an array of promises and returns a single Promise that resolves to an array of what those promises return, in the same order. Have an Array<Promise<String>>? It'll give you a Promise<Array<String>>. This is extremely powerful.
In our case, Promise.all resolves to an Array of Arrays (one filtered array per page), and the call to flat() will flatten that array into its elements, e.g. [[a, b, c], [d, e, f], [x, y, z]] becomes [a, b, c, d, e, f, x, y, z].
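To see the difference in timing, here is a minimal sketch that runs both approaches against a simulated API. The callApi stub, its 100 ms default delay (instead of the county API's full second), and the made-up ages are all assumptions for illustration; only the looping strategy mirrors the interview question.

```javascript
// Hypothetical stand-in for the county API: resolves one page of 10 residents
// after `delayMs` milliseconds. The API in the question takes 1000 ms; a
// shorter default keeps the demo quick.
function callApi(page, delayMs = 100) {
  return new Promise((resolve) => {
    setTimeout(() => {
      const residents = [];
      for (let i = 0; i < 10; i++) {
        const id = (page - 1) * 10 + i;
        // Made-up ages cycling from 10 to 29.
        residents.push({ id, age: 10 + (id % 20) });
      }
      resolve(residents);
    }, delayMs);
  });
}

const isVoter = (person) => person.age >= 18;

// One page at a time: total time is roughly 10 x delayMs.
async function getVotersSequential() {
  const people = [];
  for (let page = 1; page <= 10; page++) {
    const results = await callApi(page);
    people.push(...results.filter(isVoter));
  }
  return people;
}

// All pages in flight at once: total time is roughly 1 x delayMs.
function getVotersConcurrent() {
  const pages = [];
  for (let page = 1; page <= 10; page++) {
    pages.push(callApi(page).then((results) => results.filter(isVoter)));
  }
  return Promise.all(pages).then((filtered) => filtered.flat());
}

// Resolves with the elapsed milliseconds and the result of `fn`.
function timed(fn) {
  const start = Date.now();
  return fn().then((result) => ({ ms: Date.now() - start, result }));
}
```

Calling timed(getVotersSequential) and timed(getVotersConcurrent) and logging the ms field of each should show the sequential version taking about ten times as long, while both return the same people in the same order, since Promise.all preserves the order of the array it was given.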
Should you have the word await in your code?
In my opinion the answer to this question is "no". Nearly all modern JS web frameworks will accept a Promise as a return value, or will allow you to respond to the request within a Promise's resolution.
For example, AWS Lambda since Spring of 2018 allows you to return Promises to their handlers, rather than calling the traditional callback function. Express has a bit of middleware that you can incorporate to return Promises directly instead of calling next() or calling res.send().
Given this, you shouldn't need to write the word await in your code; just chain your Promises together and return the correct value.
So what do we do instead of writing the word await? There are two options available:
First: methods that are marked as async are effectively passing Promises around behind the scenes, so any of your methods that have to deal with Promises should be marked as async. You'll find that this winds up being viral - most of your methods will wind up being marked as async.
export async function getItem(id) {
  return serviceLayer.getItem(id).then((item) => Object.assign({}, item, { secureProperty: undefined }));
}
Second: be more explicit! Specifically return Promises. TypeScript makes this easy: you can declare that your method returns a Promise<T>, and it'll raise compile-time errors if you aren't returning T within a Promise. I prefer this approach, because you will tend to forget to mark a method as async, and when you're dealing with Promises directly you can more closely read and understand the flow of execution.
export function getItem(id: string): Promise<Item> {
  return Promise.resolve()
    .then(() => serviceLayer.getItem(id))
    .then((item) => Object.assign({}, item, { secureProperty: undefined }));
}
This code may look a bit more verbose, but you can instantly tell what you're dealing with, so six months from now you don't need to ask whether or not this method is working with a Promise.
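The same explicit style also takes care of the getItem function from the top of the article. A minimal sketch, assuming getDataOne and getDataTwo are independent of each other; their bodies below are hypothetical stand-ins that simulate a slow call with a timer:

```javascript
// Hypothetical stand-ins for the two data sources in the opening example;
// each resolves after a short delay to simulate a slow call.
function getDataOne() {
  return new Promise((resolve) => setTimeout(() => resolve("first"), 50));
}

function getDataTwo() {
  return new Promise((resolve) => setTimeout(() => resolve("second"), 50));
}

// Both calls start immediately; Promise.all resolves once both have finished,
// with the results in the same order as the input array.
function getItem() {
  return Promise.all([getDataOne(), getDataTwo()]).then(([dataOne, dataTwo]) => ({
    one: dataOne,
    two: dataTwo,
  }));
}
```

If you would rather keep async/await, writing const [dataOne, dataTwo] = await Promise.all([getDataOne(), getDataTwo()]) gets the same concurrency. The important part is that both calls are started before either one is waited on.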
Recovering from error states with Promises.
The easiest way I've found to deal with exceptions is at the business logic level; call your asynchronous services that return Promises, then use the catch method of a Promise to fall back to a default state. (Unless you're mutating state, you probably don't need to let those random exceptions flow back to the user!)
This winds up being easier than having a ton of try/catch blocks within each service. Your service may need to deal with third party outages etc, but let your business logic determine what to do when that happens, not your individual service.
Dealing with these kinds of errors doesn't stop you from needing the occasional try/catch block. However, it does help with overall readability once you get into that Promise state of mind. "Here is what my code is going to do when it's successful, and here is what my code is going to do when I run into an exception."
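As a rough sketch of that idea: the business logic below calls two services and uses catch to fall back to a default when one of them fails. The names getRecommendations, getProfile, and getHomePage are hypothetical, and the failing service is simulated with Promise.reject.

```javascript
// Hypothetical service call: rejects to simulate a third-party outage.
function getRecommendations(userId) {
  return Promise.reject(new Error(`recommendation service unavailable for ${userId}`));
}

// Hypothetical service call that succeeds.
function getProfile(userId) {
  return Promise.resolve({ id: userId, name: "Sample User" });
}

// Business logic decides what an outage means: here an empty list is an
// acceptable default for a read-only page, so the error is logged and the
// chain continues with the fallback value.
function getHomePage(userId) {
  const recommendations = getRecommendations(userId).catch((err) => {
    console.warn(`falling back to no recommendations: ${err.message}`);
    return [];
  });
  return Promise.all([getProfile(userId), recommendations]).then(
    ([profile, recs]) => ({ profile, recommendations: recs })
  );
}
```

Note that the catch is attached to the one call that is allowed to fail. A failure in getProfile would still reject the whole chain, which is probably what you want for data the page can't do without.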
What I've learned about readable syntax in asynchronous code.
JavaScript has done a relatively good job at abstracting away the mechanics of creating asynchronous methods, and that gives engineers the ability to write high-performing code easily. "With great power comes great responsibility," however, and it's easy to fall into traps that wind up hurting performance.
Often, syntactic sugar within your code can reduce your overall lines of code, but there are times when it's better to be a little more verbose. While the nature of Promise.resolve, reject, then, and catch can be a little daunting, once you get into that mindset it can be very expressive and the readability of your code goes way up. Promises behave in almost a functional programming style, where well-written code can be read quickly and easily.
Last, remember that asynchronous code can be viral. Methods that only hit your CPU or RAM can be synchronous, but the moment you start hitting databases or external services, more and more asynchronous code gets into your codebase. Make sure that you are very explicit in writing your Promise code, and you'll be able to better support your code in the long term.