Software Development

Exponential backoff is an algorithm that repeatedly attempts some action until it succeeds, waiting an amount of time that grows exponentially between attempts, up to some maximum number of attempts. This can be a useful way to manage network calls to external services, so that temporary errors or glitches don’t cause permanent failures.

The basic structure of an exponential backoff algorithm is a loop that may be implemented like this:
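(A minimal sketch follows; `callService()` is a hypothetical stand-in for whatever network call you want to retry, and `waitMillis()` is a stub that gets fleshed out a bit further down.)

```java
public class RetryExample {

    // Hypothetical stand-in for the action we want to retry; assume it
    // throws an exception when the call fails.
    static void callService() throws Exception {
        // ... make the network call here ...
    }

    static void runWithRetries() throws Exception {
        final int maxAttempts = 5;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                callService();
                return; // success: end the loop immediately
            } catch (Exception e) {
                if (attempt == maxAttempts - 1) {
                    throw e; // out of attempts: let the failure propagate
                }
                Thread.sleep(waitMillis(attempt)); // wait, then try again
            }
        }
    }

    // Fixed wait for now; the discussion below describes how this
    // should really behave, and a fuller version follows it.
    static long waitMillis(int attempt) {
        return 1000;
    }
}
```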

That is, you create a loop that runs a set number of times (in this case, up to 5). Within that loop, a try block attempts to execute the main code you want to run. If that code succeeds, the loop ends immediately. If an exception is thrown instead, it is caught and one of two things happens: if we’re on our last attempt, the exception is rethrown out of the loop; otherwise, the program waits for a little while before moving on to the next iteration of the loop and trying the code again.

We want the amount of time the program waits between attempts to do two things. First, we want it to grow exponentially on each loop iteration: if several retries fail within a few seconds, the next try should wait a little longer before continuing. This is a safeguard against rate limiting, for example, where the operation we’re performing can only happen a certain number of times during some defined interval. Second, we want the wait to have some element of randomness to it. This helps if the program is running multiple times simultaneously; a conflict between two instances is all but guaranteed to recur if both retry the action at the same time, every time. It also helps when you’re accessing an external web service that has been down for a while: you don’t want everyone spamming requests at that server the instant it comes back up, lest it be brought down again. Better to space them out.
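Putting those two requirements together, the wait calculation might look something like this. It’s a sketch, not the only formula: the 500 ms base delay and the simple doubling are assumptions you’d tune for your own service. This version can drop in for the `waitMillis()` stub above:

```java
import java.util.concurrent.ThreadLocalRandom; // at the top of the file

// Doubles a base delay on each attempt, then adds a random jitter so that
// multiple instances retrying at once don't all wake up at the same moment.
static long waitMillis(int attempt) {
    final long baseDelayMillis = 500;
    long exponential = baseDelayMillis * (1L << attempt); // 500, 1000, 2000, 4000, ...
    long jitter = ThreadLocalRandom.current().nextLong(baseDelayMillis);
    return exponential + jitter;
}
```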

Now, in many cases you won’t need to implement an exponential backoff algorithm directly in your own code, because many libraries build the feature in. Google’s HTTP Client Library for Java, for example, includes a configurable exponential backoff handler. Where your library doesn’t support it, you’ll likely want to build something that implements exponential backoff for your application cleanly, without having to write an explicit loop every time you want to make an API call retryable. Carlos Alexandro Becker wrote a nice little backoff algorithm for Java 8 using functional interfaces that does the trick, though it differs a bit from what I have above.

At any rate, that’s a look at a basic backoff algorithm that can be implemented in Java.

Software Development

Just a quick one in case this saves someone some time. If you’re used to using Google Cloud Datastore with Google App Engine (in Java), then you know that large string values are passed into the App Engine Datastore API using the Text type. But if you’re using the gcloud libraries instead (com.google.cloud.datastore), you don’t have this option. Strings are passed into the entity builder either as String objects or as StringValue objects. E.g.:
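(A sketch; the kind, key, and property names below are placeholders.)

```java
import com.google.cloud.datastore.Datastore;
import com.google.cloud.datastore.DatastoreOptions;
import com.google.cloud.datastore.Entity;
import com.google.cloud.datastore.Key;
import com.google.cloud.datastore.StringValue;

Datastore datastore = DatastoreOptions.getDefaultInstance().getService();
Key key = datastore.newKeyFactory().setKind("Article").newKey("my-article");

Entity entity = Entity.newBuilder(key)
        .set("title", "Hello, Datastore")              // plain String
        .set("summary", StringValue.of("A summary"))   // StringValue
        .build();
datastore.put(entity);
```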

So, what do you do when you need to store a String larger than 1500 bytes? It turns out that the Datastore itself sets the 1500-byte limit for indexed properties. You can add a String up to 1MB in size if you explicitly set the property to be unindexed. The Text type in the App Engine libraries is essentially a facade for an unindexed String value.

In order to set an unindexed property in your Java code, you need to define your String as a com.google.cloud.datastore.StringValue object, and set it to be excluded from indexes. Then, pass the StringValue into your entity builder, and you’re good to go:
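For example, continuing with the placeholder names from the sketch above (veryLongString standing in for your oversized value):

```java
// Build the value explicitly and mark it as excluded from indexes;
// unindexed string properties can hold up to 1 MB instead of 1500 bytes.
StringValue body = StringValue.newBuilder(veryLongString)
        .setExcludeFromIndexes(true)
        .build();

Entity entity = Entity.newBuilder(key)
        .set("body", body)
        .build();
datastore.put(entity);
```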

I saw some answers on how to do this with Node.js, but I had to look up the javadoc for StringValue.Builder to sort this out in the Java libraries.

It’s worth noting that setting properties as unindexed whenever you’re not actually going to use them in any Datastore queries is good practice anyway: it improves performance and reduces Datastore-related costs.