Haskell Is Actually Practical
Haskell has all these language features that seem cool, but then you wonder, what is this actually good for? When am I ever going to need lazy evaluation? What’s the point of currying?
As it turns out, these language constructs come in handy more often than you’d expect. This article gives some real-world examples of how Haskell’s weird features can actually help you write better programs.
Lazy Evaluation
Lazy evaluation means that Haskell only evaluates an expression when its value is actually needed. This means you can do cool things like construct infinite lists.
To take a trivial example, suppose you want to write a function that finds the first `n` even numbers. You could implement this in a lot of different ways, but let’s look at one possible implementation (in Python):
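```python
def first_evens(n):
    # A sketch (the function name is illustrative): build a list of the
    # first 2n natural numbers, then keep only the even ones.
    return [x for x in range(2 * n) if x % 2 == 0]
```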
Here we construct a list with `2n` numbers and then take every even number from that list. Here’s how we could do the same in Haskell:
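```haskell
firstEvens :: Int -> [Int]
-- [0..] is an infinite list; laziness means only the part we
-- actually use ever gets evaluated.
firstEvens n = take n (filter even [0..])
```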
Instead of constructing a list with `2n` elements, we construct an infinite list of even numbers and then take the first `n`.
Okay, so that’s pretty cool, but what’s the point? When am I ever going to use this in real life?
Why It’s Useful
I recently wrote a simple spam classifier in Python. To classify a text as spam or not-spam, it counts the number of blacklisted words in the text. If the number reaches some threshold, the text is classified as spam.1
Before reading further, think for a minute about how you could implement this.
Originally, I wanted to write something like this:
- filter the list for only blacklisted words
- see if the length of the list reaches the threshold
Here’s the equivalent Python code:
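```python
def classify(text, blacklist, threshold):
    # A sketch of the concise version: count the blacklisted words,
    # then compare the count against the threshold.
    words = text.split()
    num_blacklisted = sum(1 for word in words if word in blacklist)
    return num_blacklisted >= threshold
```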
This code is simple and concise. The problem is, it requires iterating through the entire list before returning, which wastes a huge amount of time. The text might contain tens of thousands of words, but could be identified as spam within the first hundred.
I ended up implementing it like this:
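```python
def classify(text, blacklist, threshold):
    # The faster but uglier version: iterate manually and bail out
    # as soon as the count reaches the threshold.
    num_blacklisted = 0
    for word in text.split():
        if word in blacklist:
            num_blacklisted += 1
            if num_blacklisted >= threshold:
                return True
    return False
```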
Instead of using built-in functions like `sum`, this implementation manually iterates over the list, keeping track of the number of blacklisted words, and breaks out once the number reaches the threshold. It’s faster, but much uglier.
What if we could write our code using the first approach, but with the speed of the second approach? This is where lazy evaluation comes in.
If our program is lazily evaluated, it can figure out when the count reaches the threshold and return immediately instead of waiting around to evaluate the whole list.
Here’s a Haskell implementation:
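```haskell
-- assumes blacklisted :: String -> Bool and threshold :: Int
-- are defined elsewhere
classify :: String -> Bool
classify text = (length $ filter blacklisted $ words text) >= threshold
```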
(For those unfamiliar with Haskell syntax, see note 2.)
Unfortunately, this doesn’t quite work. If the condition is true for the first `k` elements of the list, then it will also be true for the first `k+1` elements, but Haskell has no way of knowing that. If you call `classify` on an infinite list, it will run forever.
We can get around this problem like so:
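```haskell
classify :: String -> Bool
classify text =
    (length $ take threshold $ filter blacklisted $ words text) == threshold
```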
Note that `take k` takes the first `k` elements of the list and drops the rest. (If you call `take k` on a list with `n` elements where `n < k`, it simply returns the entire list.)
So this function will take the first `threshold` blacklisted words. If it runs through the entire list before finding `threshold` blacklisted words, it returns `False`. If it ever successfully finds `threshold` blacklisted words, it immediately stops and returns `True`.
Using lazy evaluation, we can write a concise implementation of `classify` that runs as efficiently as our more verbose implementation above.
(If you want, it is also possible to do this in Python using generators.)
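For instance, here’s a sketch using `itertools.islice`:

```python
from itertools import islice

def classify(text, blacklist, threshold):
    # A generator expression is lazy: words are examined only on demand.
    # (text.split() is still eager, but the filtering and counting stop early.)
    blacklisted = (word for word in text.split() if word in blacklist)
    # islice stops pulling from the generator after `threshold` matches.
    return len(list(islice(blacklisted, threshold))) == threshold
```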
Partial Application
In Haskell, functions are automatically curried. This means you can call a function with some but not all of the arguments and it will return a partially-applied function.
This is easier to understand if we look at an example. Let’s take a look at some Haskell code:
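```haskell
add :: Int -> Int -> Int
add x y = x + y
```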
`add` is a simple function that takes two arguments and returns their sum. You can call it by writing, for example, `add 2 5`, which would return `7`.
You can also partially apply `add`. If you write `add 2`, instead of returning a value, it returns a function that takes a single argument and returns that number plus `2`. In effect, `add 2` returns a function that looks like this:
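```haskell
-- (the name addTwo is just for illustration)
addTwo :: Int -> Int
addTwo y = 2 + y
```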
You could also think of it as taking the original `add` function and replacing all occurrences of `x` with `2`.
Then we can pass in `5` to this new function:
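```haskell
addTwo 5  -- evaluates to 7
```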
In fact, in Haskell, `(add 2) 5` is equivalent to `add 2 5`: it calls `add 2`, which returns a unary function, and then passes in `5` to that function.
A similar function could be constructed in Python like so:
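```python
def add(x):
    # Return a closure that remembers x and waits for the second argument.
    def add_x(y):
        return x + y
    return add_x
```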
Then you could call `(add(2))(5)` to get `7`.
Why It’s Useful
To take a simple example, suppose you want to add 2 to every element in a list. You could map over the list using a lambda:
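```haskell
map (\x -> x + 2) [1, 2, 3]  -- [3, 4, 5]
```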
Or you could do this more concisely by partially applying the `+` function:
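```haskell
map (+ 2) [1, 2, 3]  -- [3, 4, 5]
```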
It might seem like this just saves you from typing a few characters once in a while, but this sort of pattern comes up all the time.
This summer I was working on a program that required merging a series of rankings. I had a list of maps where each map represented a ranking along a different dimension, and I needed to find the sum ranking for each key. I could have done it like this:
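```haskell
import qualified Data.Map as Map

-- A sketch (key and value types are illustrative): merge the rankings
-- with an explicit lambda and an explicit argument.
mergeRanks :: [Map.Map String Int] -> Map.Map String Int
mergeRanks ranks = Map.unionsWith (\rankA rankB -> rankA + rankB) ranks
```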
(Note: `unionsWith` takes the union of a list of maps, using the given function to combine the values of duplicate keys.)
With partial application, we can instead write:
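```haskell
mergeRanks :: [Map.Map String Int] -> Map.Map String Int
mergeRanks = Map.unionsWith (+)
```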
This new function uses partial application in two ways. First, it passes in `+` instead of creating a lambda. Second, it partially applies `unionsWith`: the call `unionsWith (+)` gives a function that takes in a list of maps and returns the union of the maps.
Notice also how `mergeRanks` is not defined with any arguments. Because the call to `unionsWith` returns a function, we can simply assign `mergeRanks` to the value of that function.
Perhaps this example is a bit on the confusing side; I intentionally chose a complex example that has real-world value. Once you grok partial application, it shows up more often than you might think, and you can use it to perform some pretty sophisticated operations.
And I haven’t even mentioned function composition.
Here’s a more complicated usage of partial application combined with function composition, similar in spirit to code I wrote this summer. See if you can figure out what it does.
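```haskell
import Data.List (sortBy)
import Data.Ord (comparing)
import qualified Data.Map as Map

mystery :: [Map.Map String Int] -> [String]
mystery = map fst . sortBy (comparing snd) . Map.toList . Map.unionsWith (+)
```

(Hint: it builds on `mergeRanks` above.)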
In one program of about 500 lines, I wrote about a dozen pieces of code similar to this one.
Pattern Matching
Pattern matching gives us a new way of writing functions. To take the canonical example, let’s look at the factorial function. Here’s a simple Python implementation:
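```python
def fac(n):
    if n == 0:
        return 1
    return n * fac(n - 1)
```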
And the same program written in Haskell:
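```haskell
fac :: Integer -> Integer
fac n = if n == 0 then 1 else n * fac (n - 1)
```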
But we could also write this using pattern matching:
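```haskell
fac :: Integer -> Integer
fac 0 = 1
fac n = n * fac (n - 1)
```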
Think of this as saying:
- the factorial of 0 is 1
- the factorial of some number n is n * fac (n - 1)
So pattern matching is declarative rather than imperative: a declarative program describes the way things are rather than what to do.
Why It’s Useful
Wait, isn’t this just a different way of writing the same thing? Sure, it’s interesting, but what can pattern matching do that `if` statements can’t?
Well, quite a lot, actually.3 Pattern matching makes it trivial to deconstruct a data structure into its component values, and it’s closely tied to how Haskell defines data types.
Suppose we want to implement the `map` function. Recall that `map` takes a function and a list and returns the list obtained by applying the function to each element of the list. So `map (*2) [1,2,3] == [2,4,6]`. (Notice how I used partial application there?)
You may wish to take a moment to consider how you would implement `map`.
Without using pattern matching, we could implement `map` like this:
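```haskell
-- called map' to avoid clashing with the Prelude's map;
-- the (:) operator prepends a value to a list (see note 4)
map' :: (a -> b) -> [a] -> [b]
map' f xs = if null xs
            then []
            else f (head xs) : map' f (tail xs)
```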
But this is a bit clunky, and we can do a lot better by using pattern matching. Think about how to define `map` recursively:
- The map of an empty list is just an empty list.
- The map of a non-empty list is the function applied to the head, prepended to the map of the tail.
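```haskell
map' :: (a -> b) -> [a] -> [b]
map' f []     = []
map' f (x:xs) = f x : map' f xs
```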
So much nicer!
This sort of design pattern comes in handy when you’re operating over data structures. To take a real-world example, I recently wrote a function that operated over an intersection of three values:
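```haskell
-- (the field types and the function below are illustrative)
data Intersection = Intersection Double Double Double

sumValues :: Intersection -> Double
sumValues (Intersection x y z) = x + y + z
```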
I could pass in an `Intersection` value, and pattern matching made it easy to pull out the three values into the variables `x`, `y`, and `z`.
Summary
Haskell has a number of language features that appear strange to someone with an imperative-programming background. But not only do these language features allow the programmer to write more concise and elegant functions, they teach lessons that you can carry with you when you use more imperative programming languages.
Many modern languages partially or fully support some of these features; Python, for example, supports lazy evaluation with generator expressions, and it’s possible to implement pattern matching in Lisp. And I’m excited to see that Rust supports sophisticated pattern matching much like Haskell.
If you want to learn more about Haskell, check out Learn You a Haskell for Great Good! Or if you’ve already dipped your toes into the Haskell ocean and want to go for a dive, Real World Haskell can teach you how to use Haskell to build real programs.
P.S. This site is relatively new, so if you see a mistake, please leave a comment and I’ll try and fix it.
Notes
1. I realize this is a terrible way to implement a spam classifier. ↩

2. Note on Haskell syntax: the `$` operator groups expressions, so `length $ filter blacklisted $ words text` is equivalent to `length (filter blacklisted (words text))`. The `words` function splits a string into a list of words; `words text` is roughly equivalent to Python’s `text.split()`. ↩

3. Well, technically nothing, because every Turing-complete language is computationally equivalent. Anything that can be written in Python can also be written in assembly; that doesn’t mean you want to write everything in assembly. ↩

4. The `:` operator is a cons: given a value and a list, it prepends the value to the head of the list. ↩