A laziness trick to relax your code

LMAX Exchange

So, this week on Distilled Derivatives, some more laziness. And also, exceptions. Everyone loves them, really useful things for letting you move error handling into a single location that they are. Now and again though, their misuse can really get in the way of getting something done.

Retrieve the content of the first functioning url

Given n urls in some ordered collection, get the content of the first “working” url, where working is defined to mean “has http code 200 and non empty response content”. Once you’ve found it, put the response body of the url into some local file.

This should be easy, thought I. How wrong I was.

First problem

All the java http libraries throw IOException. Ok, that’s not too bad, we can get past that with only a little evil. Here’s our first go.

  def withUrlConnection() : String = {
    for (url <- urls) {
      try
      {
        val actualUrl = new URL(url)
        val connection = actualUrl.openConnection().asInstanceOf[HttpURLConnection]
        return stringFromHttpUrlConnection(connection)
      }
      catch {
        case e: IOException => {}
      }
    }
    throw new RuntimeException("all failed")
  }

This isn’t too bad – we’re only teetering on the edge of breaking the “don’t use exception for control flow” rules, rather than falling into the abyss. We can simply take the resulting String and write it to a file (implementation elided). Oh, and in case you were curious, here’s the implementation of stringFromHttpURLConnection (we’ll need it again later).

N.B I’ve completely elided any closing of any connection everywhere in this article, because like the technique I’m trying to describe, I am lazy.

  val stringFromHttpUrlConnection: (HttpURLConnection) => String = { connection : HttpURLConnection =>
    Source.fromInputStream(connection.getInputStream).getLines().mkString("n")
  }

The problem worsened, however, when some kind soul tried to help out by creating a function like the following:

Problem 2 – an unhelpful helper

  def withUnhelpfulFunction(path: String) {
    var done = false
    for (url <- urls) {
      try {
        if (!done) {
          downloadFile(url, path)
          done = true
        }
      } catch {
        case e: IOException => {}
      }
    }
  }

  def downloadFile(url: String, path: String) {
    // lobs url response body into path on success, IOException if not. Implementation too boring to include.
  }

At first glance this function looks like it might save us some code, but once you start talking about failure or multiple urls, we’re screwed. It’d be perfect if we wanted a one shot download that noisily failed, and that’s probably the use case the original author had in mind. In general the approach the function takes is a bit evil – it’s performing IO and letting us know we succeeded by not throwing an exception. That’s not the world’s greatest signal.

So, two quite ugly solutions with exceptions. Is there another way?

My hobby: getting lazy evaluation to write my program for me

Functional programming has given us some excellent tools to help decompose this problem, and there’s a neat lazy evaluation trick that can get us out of jail here. What do we need? A function that will:

  • synchronously resolve a url to the response body
  • include the possibility of failure in its return type

Roughly, this is equivalent to URL -> Maybe String (or Option[String], in Scala land). Try as I might, I can not find a scala or java http client library that follows this approach. Dispatch looks terrific but is also asynchronous, and I don’t really want to talk about futures in this post. Handily, numerous other sandbox projects I’ve tinkered with have already forced me into writing my own, so I have a handy function lying around.

  def withHttpLib() : String = {
    val requestor: HttpRequestor = new InsecureHttpRequestor()
    val results = urls.map(url => requestor.performGet(url, stringFromHttpUrlConnection))
    results.find(_.wasSuccessful()).getOrElse(throw new RuntimeException("all failed")).getResult
  }

Look ma, no for or if! Even better, this approach appears much shorter, however, this is mostly because much of the responsibility has moved into the little http library. Let’s walk through the execution of this program.

A little secret

  val urls = Stream("www.digihippo.net/home.html",
                    "www.digihippo.net/index.php",
                    "www.digihippo.net/foo.bar")

The structure the urls are sitting in is actually Scala’s excellent Stream abstraction. By default, operations on Streams are applied lazily. For example, if we invoked map on a particular instance of Stream, execution of the function we pass in is deferred until someone tries to extract an element from the Stream. In our earlier examples, we use Scala’s for comprehension to iterate across urls. In our favourite implementation, you can see that we’ve started applying map to it instead.

N.B These won’t actually work if you do new URL(url) on them. They’re in the current format because my library doesn’t yet work out whether to connect securely or not from the passed in url String.

What’s this http library doing then?

    val requestor: HttpRequestor = new InsecureHttpRequestor()

The requestor here is an object from my http library. It’s insecure because it doesn’t deal with https (it doesn’t lack confidence), just so you know. To execute a given request, one provides a url, and a response decoder – the requestor returns you a type parametrized Http object, which may contain the thing you wanted to decode, or an error. What’s happening underneath? Well, something like this:

  • Attempt to get url
  • On failure, return Http.failure() with some helpful details
  • On success, attempt to parse using provided response decoder
  • If response decoder succeeds, return whatever it decoded
  • If it failed, return Http.failure() with some helpful details

Hold on though – why does our decoder get the opportunity to return any sort of Http? Why doesn’t it just attempt a parse and give up? Well, we have to really think about what makes any given http request a failure. The definition in this post is not necessary universal – sometimes we expect a 404, or a 500, and still wish to parse something from the resulting connection.

As such, at the moment, the decoder can examine any aspect of the created HttpURLConnection and decide for itself, if it wants to, that this request was a failure. Our decoder does no such thing though – it just attempts to grab the response body and whack it into a string. The library will helpfully catch any exception resulting from playing with HttpURLConnection and return you an unsuccessful Http instance with the error information stashed inside.

What is this line really doing?

    val results = urls.map(url => requestor.performGet(url, stringFromHttpUrlConnection))

So, hands up – who thinks we’re firing off http requests as we invoke map here? No-one? Good. Glad to see people are keeping up. What we’re really doing here is creating a new Stream, which at the moment is actually a collection of functions that, when called, will return an Http[String]. So this line actually has no side effects, we’re really just defining how to compute the next elements of results, rather than computing them.

This is where the magic happens

    results.find(_.wasSuccessful()).getOrElse(throw new RuntimeException("all failed")).getResult

Now, the density of this line is considerably higher than the others. If that’s a problem for you, you might be a n00b. So, what the hell is happening?

Well, that find invocation would be better named findFirst. It traverses results, examining each element to see if it was a success. That means it needs to compute the element! So, find actually does this:

  • Execute function to obtain element. N.B this finally executes our map function
  • Examine result to see if it was successful
  • If not, repeat with next element

So, in fact, just by defining results in the way that we have means that find expresses the control flow of our application for us. One could legitimately criticize the first two solutions for repeating this standard library code, and, in fact, functional languages go out of their way to make extracting this sort of legwork easier. If you’re wondering why this is important, measure how many times you write try/catch/finally in java (or similar) next week.

The remainder of the line does the necessarily evil to fulfil something like the original contract, where we wanted a String that we could later write out to a file. We need to extract the actual value from the find result (find actually gives us back Option[Http[String]], because there’s no guarantee any of our results matched), throwing if none were successful, and then we need to extract the String out of the found successful request.

An apology

Purist functional programmers will likely be groaning at this point. The side effects of our little getter don’t happen at the top of our program, necessarily, and the usage of Option (and the Http type, which is pretty similar to Option) is not particularly idiomatic (hitting get like functions on monadic types is frowned upon).

Some side notes

Interestingly this very same function was implemented as part of a bash build script elsewhere, and it embodies a lot of the lessons that we’ve seen in the code here. Here’s a cut down equivalent.

#!/bin/bash

function noisy_curl() {
    echo "issuing request to get $1"
    return `curl -s -f $1`
}

noisy_curl http://www.digihippo.net/main.html ||
noisy_curl http://www.digihippo.net/index.php ||
noisy_curl http://www.digihippo.net/site.html

This is, given how much effort we’ve gone to set up the same effect in Scala, embarrassingly trivial. One can start to see the elegance of the unix way of doing one simple task, and writing the result to stdout, while also indicating success with a return code. Here we use the -f flag to tell curl to ‘fail’ if we get a suitably erroneous looking response code, and then just set up a series of curl invocations such that the first success short circuits the others. One starts to agree with all those people on the internet who think UNIX pipelining looks a helluva lot like composition as seen in functional programs.

We’ve come a long way baby.

So let’s stop (for now). We could certainly expand on our example by doing something more interesting that putting that result String into a file. We could consider approaches where we never parse the whole String but instead hold on to the InputStream instead. These deserve to be posts in their own right though, so we will not discuss them further here.

What did we learn?

  • Lazy evaluation allows you to elide boring control flow from your code. More generally, a functional approach can help you separate the valuable parts of your code from arduous syntactic toil
  • Wrapping exceptions into return types at the point where they are thrown makes life so much easier
  • Communicating errors in exceptions is really only good for one shot calling patterns

…and, again, more generally, exceptions are vastly overused. How often per day do you get a broken web page when browsing? Is it really exceptional? Are browsers dealing with 404 or 500 via an exception path? I seriously hope not. Exception abuse seems particularly prevalent in java, perhaps because so many libraries use IOException as their IO alarm, and possibly also because the language makes it so difficult to work with something like the Http type. Once again, I find myself thinking that Java 8 really cannot arrive soon enough.

Moral of the story

Stop throwing all that stuff around and relax, dude.

Any opinions, news, research, analyses, prices or other information ("information") contained on this Blog, constitutes marketing communication and it has not been prepared in accordance with legal requirements designed to promote the independence of investment research. Further, the information contained within this Blog does not contain (and should not be construed as containing) investment advice or an investment recommendation, or an offer of, or solicitation for, a transaction in any financial instrument. LMAX Group has not verified the accuracy or basis-in-fact of any claim or statement made by any third parties as comments for every Blog entry.

LMAX Group will not accept liability for any loss or damage, including without limitation to, any loss of profit, which may arise directly or indirectly from use of or reliance on such information. No representation or warranty is given as to the accuracy or completeness of the above information. While the produced information was obtained from sources deemed to be reliable, LMAX Group does not provide any guarantees about the reliability of such sources. Consequently any person acting on it does so entirely at his or her own risk. It is not a place to slander, use unacceptable language or to promote LMAX Group or any other FX and CFD provider and any such postings, excessive or unjust comments and attacks will not be allowed and will be removed from the site immediately.