Haskell Is Actually Practical
Haskell has all these language features that seem cool, but then you wonder, what is this actually good for? When am I ever going to need lazy evaluation? What’s the point of currying?
As it turns out, these language constructs come in handy more often than you’d expect. This article gives some real-world examples of how Haskell’s weird features can actually help you write better programs.
Lazy Evaluation
Lazy evaluation means that Haskell only evaluates an expression when its value is actually needed. This means you can do cool things like construct infinite lists.
To take a trivial example, suppose you want to write a function that finds the first `n` even numbers. You could implement this in a lot of different ways, but let’s look at one possible implementation (in Python):
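```python
def first_evens(n):
    # A sketch (the function name is illustrative): build a list of the
    # first 2n natural numbers, then keep only the even ones.
    return [x for x in range(2 * n) if x % 2 == 0]
```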
Here we construct a list with `2n` numbers and then take every even number from that list. Here’s how we could do the same in Haskell:
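```haskell
firstEvens :: Int -> [Int]
-- [0..] is an infinite list; laziness means only the part we
-- actually use ever gets evaluated.
firstEvens n = take n (filter even [0..])
```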
Instead of constructing a list with `2n` elements, we construct an infinite list of even numbers and then take the first `n`.
Okay, so that’s pretty cool, but what’s the point? When am I ever going to use this in real life?
Why It’s Useful
I recently wrote a simple spam classifier in Python. To classify a text as spam or not-spam, it counts the number of blacklisted words in the text. If the number reaches some threshold, the text is classified as spam.1
Before reading further, think for a minute about how you could implement this.
Originally, I wanted to write something like this:
- filter the list for only blacklisted words
- see if the length of the list reaches the threshold
Here’s the equivalent Python code:
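```python
def classify(text, blacklist, threshold):
    # A sketch of the concise version: count the blacklisted words,
    # then compare the count against the threshold.
    words = text.split()
    num_blacklisted = sum(1 for word in words if word in blacklist)
    return num_blacklisted >= threshold
```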
This code is simple and concise. The problem is, it requires iterating through the entire list before returning, which wastes a huge amount of time. The text might contain tens of thousands of words, but could be identified as spam within the first hundred.
I ended up implementing it like this:
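```python
def classify(text, blacklist, threshold):
    # The faster but uglier version: iterate manually and bail out
    # as soon as the count reaches the threshold.
    num_blacklisted = 0
    for word in text.split():
        if word in blacklist:
            num_blacklisted += 1
            if num_blacklisted >= threshold:
                return True
    return False
```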
Instead of using built-in functions like `sum`, this implementation manually iterates over the list, keeping track of the number of blacklisted words, and breaks out once the number reaches the threshold. It’s faster, but much uglier.
What if we could write our code using the first approach, but with the speed of the second approach? This is where lazy evaluation comes in.
If our program is lazily evaluated, it can figure out when the count reaches the threshold and return immediately instead of waiting around to evaluate the whole list.
Here’s a Haskell implementation:
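```haskell
-- assumes blacklisted :: String -> Bool and threshold :: Int
-- are defined elsewhere
classify :: String -> Bool
classify text = (length $ filter blacklisted $ words text) >= threshold
```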
(For those unfamiliar with Haskell syntax, see note 2.)
Unfortunately, this doesn’t quite work. If the condition is true for the first `k` elements of the list, then it will also be true for the first `k+1` elements, but Haskell has no way of knowing that. If you call `classify` on an infinite list, it will run forever.
We can get around this problem like so:
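```haskell
classify :: String -> Bool
classify text =
    (length $ take threshold $ filter blacklisted $ words text) == threshold
```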
Note that `take k` takes the first `k` elements of the list and drops the rest. (If you call `take k` on a list with `n` elements where `n < k`, it simply returns the entire list.)
So this function will take the first `threshold` blacklisted words. If it runs through the entire list before finding `threshold` blacklisted words, it returns `False`. If it ever successfully finds `threshold` blacklisted words, it immediately stops and returns `True`.
Using lazy evaluation, we can write a concise implementation of `classify` that runs as efficiently as our more verbose implementation above.
(If you want, it is also possible to do this in Python using generators.)
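For instance, here’s a sketch using `itertools.islice`:

```python
from itertools import islice

def classify(text, blacklist, threshold):
    # A generator expression is lazy: words are examined only on demand.
    # (text.split() is still eager, but the filtering and counting stop early.)
    blacklisted = (word for word in text.split() if word in blacklist)
    # islice stops pulling from the generator after `threshold` matches.
    return len(list(islice(blacklisted, threshold))) == threshold
```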
Partial Application
In Haskell, functions are automatically curried. This means you can call a function with some but not all of the arguments and it will return a partially-applied function.
This is easier to understand if we look at an example. Let’s take a look at some Haskell code:
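```haskell
add :: Int -> Int -> Int
add x y = x + y
```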
`add` is a simple function that takes two arguments and returns their sum. You can call it by writing, for example, `add 2 5`, which would return `7`.
You can also partially apply `add`. If you write `add 2`, instead of returning a value, it returns a function that takes a single argument and returns that number plus `2`. In effect, `add 2` returns a function that looks like this:
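```haskell
-- (the name addTwo is just for illustration)
addTwo :: Int -> Int
addTwo y = 2 + y
```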
You could also think of it as taking the original `add` function and replacing all occurrences of `x` with `2`.
Then we can pass in `5` to this new function:
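```haskell
addTwo 5  -- evaluates to 7
```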
In fact, in Haskell, `(add 2) 5` is equivalent to `add 2 5`: it calls `add 2`, which returns a unary function, and then passes in `5` to that function.
A similar function could be constructed in Python like so:
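```python
def add(x):
    # Return a closure that remembers x and waits for the second argument.
    def add_x(y):
        return x + y
    return add_x
```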
Then you could call `(add(2))(5)` to get `7`.
Why It’s Useful
To take a simple example, suppose you want to add 2 to every element in a list. You could map over the list using a lambda:
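```haskell
map (\x -> x + 2) [1, 2, 3]  -- [3, 4, 5]
```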
Or you could do this more concisely by partially applying the `+` function:
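```haskell
map (+ 2) [1, 2, 3]  -- [3, 4, 5]
```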
It might seem like this just saves you from typing a few characters once in a while, but this sort of pattern comes up all the time.
This summer I was working on a program that required merging a series of rankings. I had a list of maps where each map represented a ranking along a different dimension, and I needed to find the sum ranking for each key. I could have done it like this:
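```haskell
import qualified Data.Map as Map

-- A sketch (key and value types are illustrative): merge the rankings
-- with an explicit lambda and an explicit argument.
mergeRanks :: [Map.Map String Int] -> Map.Map String Int
mergeRanks ranks = Map.unionsWith (\rankA rankB -> rankA + rankB) ranks
```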
(Note: `unionsWith` takes the union of a list of maps, using the given function to combine the values of duplicate keys.)
With partial application, we can instead write:
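```haskell
mergeRanks :: [Map.Map String Int] -> Map.Map String Int
mergeRanks = Map.unionsWith (+)
```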
This new function uses partial application in two ways. First, it passes in `+` instead of creating a lambda. Second, it partially applies `unionsWith`: the call `unionsWith (+)` gives a function that takes in a list of maps and returns the union of the maps.
Notice also how `mergeRanks` is not defined with any arguments. Because the call to `unionsWith` returns a function, we can simply assign `mergeRanks` to the value of that function.
Perhaps this example is a bit on the confusing side; I intentionally chose a complex example that has real-world value. Once you grok partial application, it shows up more often than you might think, and you can use it to perform some pretty sophisticated operations.
And I haven’t even mentioned function composition.
Here’s a more complicated usage of partial application combined with function composition, similar in spirit to code I wrote this summer. See if you can figure out what it does.
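```haskell
import Data.List (sortBy)
import Data.Ord (comparing)
import qualified Data.Map as Map

mystery :: [Map.Map String Int] -> [String]
mystery = map fst . sortBy (comparing snd) . Map.toList . Map.unionsWith (+)
```

(Hint: it builds on `mergeRanks` above.)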
In one program of about 500 lines, I wrote about a dozen pieces of code similar to this one.
Pattern Matching
Pattern matching gives us a new way of writing functions. To take the canonical example, let’s look at the factorial function. Here’s a simple Python implementation:
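```python
def fac(n):
    if n == 0:
        return 1
    return n * fac(n - 1)
```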
And the same program written in Haskell:
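```haskell
fac :: Integer -> Integer
fac n = if n == 0 then 1 else n * fac (n - 1)
```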
But we could also write this using pattern matching:
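```haskell
fac :: Integer -> Integer
fac 0 = 1
fac n = n * fac (n - 1)
```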
Think of this as saying:
- the factorial of 0 is 1
- the factorial of some number n is n * fac (n - 1)
So pattern matching is declarative rather than imperative: a declarative program describes the way things are rather than what to do.
Why It’s Useful
Wait, isn’t this just a different way of writing the same thing? Sure, it’s interesting, but what can pattern matching do that `if` statements can’t?
Well, quite a lot, actually.3 Pattern matching makes it trivial to deconstruct a data structure into its component values, and it’s closely tied to how Haskell defines data types.
Suppose we want to implement the `map` function. Recall that `map` takes a function and a list and returns the list obtained by applying the function to each element of the list. So `map (*2) [1,2,3] == [2,4,6]`. (Notice how I used partial application there?)
You may wish to take a moment to consider how you would implement `map`.
Without using pattern matching, we could implement `map` like this:
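```haskell
-- called map' to avoid clashing with the Prelude's map;
-- the (:) operator prepends a value to a list (see note 4)
map' :: (a -> b) -> [a] -> [b]
map' f xs = if null xs
            then []
            else f (head xs) : map' f (tail xs)
```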
But this is a bit clunky, and we can do a lot better by using pattern matching. Think about how to define `map` recursively:
- The map of an empty list is just an empty list.
- The map of a non-empty list is the function applied to the head, prepended to the map of the tail.
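```haskell
map' :: (a -> b) -> [a] -> [b]
map' f []     = []
map' f (x:xs) = f x : map' f xs
```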
So much nicer!
This sort of design pattern comes in handy when you’re operating over data structures. To take a real-world example, I recently wrote a function that operated over an intersection of three values:
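```haskell
-- (the field types and the function below are illustrative)
data Intersection = Intersection Double Double Double

sumValues :: Intersection -> Double
sumValues (Intersection x y z) = x + y + z
```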
I could pass in an `Intersection` value, and pattern matching made it easy to pull out the three values into the variables `x`, `y`, and `z`.
Summary
Haskell has a number of language features that appear strange to someone with an imperative-programming background. But not only do these language features allow the programmer to write more concise and elegant functions, they teach lessons that you can carry with you when you use more imperative programming languages.
Many modern languages partially or fully support some of these features; Python, for example, supports lazy evaluation with generator expressions, and it’s possible to implement pattern matching in Lisp. And I’m excited to see that Rust supports sophisticated pattern matching much like Haskell.
If you want to learn more about Haskell, check out Learn You a Haskell for Great Good! Or if you’ve already dipped your toes into the Haskell ocean and want to go for a dive, Real World Haskell can teach you how to use Haskell to build real programs.
P.S. This site is relatively new, so if you see a mistake, please leave a comment and I’ll try and fix it.
Notes
1. I realize this is a terrible way to implement a spam classifier. ↩

2. Note on Haskell syntax: the `$` operator groups expressions, so `length $ filter blacklisted $ words text` is equivalent to `length (filter blacklisted (words text))`. The `words` function splits a string into a list of words; `words text` is roughly equivalent to Python’s `text.split()`. ↩

3. Well, technically nothing, because every Turing-complete language is computationally equivalent. Anything that can be written in Python can also be written in assembly; that doesn’t mean you want to write everything in assembly. ↩

4. The `:` operator is a cons: given a value and a list, it prepends the value to the head of the list. ↩