Robin Hanson Emulator
After years of toil, I have developed an artificial intelligence that can speak as eloquently as Robin Hanson on any subject.
After years of toil, I have developed an artificial intelligence that can speak as eloquently as Robin Hanson on any subject.
Disclaimer: I haven’t researched or thought about this much, and a lot of what I’m saying is probably derivative or completely wrong. I just wanted to work through some of my thoughts.
What would happen if we implemented basic income guarantees tomorrow?
Assume we’re just talking about the United States here. Assume we don’t have any major technological advances between today and tomorrow, so we can’t automate every single person’s job. Let’s say that the income guarantee is enough to live off of—maybe $30,000.
What would people do? And would the economy continue to generate enough money to be able to pay for everyone’s income guarantee?
When people automatically get $30,000, this dramatically reduces their willingness to work. There are a lot of jobs that people only work because they desperately need a job, and they would really prefer not to. Once they get a basic income guarantee, demand for these jobs will drop dramatically. If the jobs are important, wages will increase until some people once again become willing to take those jobs.
Exactly how much people are willing to work depends on the tax rate. Let’s say we have a progressive taxation scheme which starts much higher than the current tax rate—maybe 50% at the lowest bracket and 90% at the highest (I’m just making up numbers here). That means if you make $30,000 a year for doing nothing and take a job that pays $30,000, now you’re making $45,000 after taxes. People have diminishing marginal utility of money, so people will be less willing to do this, but there should still be a lot of people who want to make more than the basic income and end up taking jobs.
Which jobs will they take?
When people have a basic income, that dramatically changes their incentives to work. In economic terms, supply of labor drops. Which jobs continue to be prominent depends on which jobs have high or low price elasticity of demand for labor.
To get more concrete, let’s think about two jobs: garbage collector and fast food burger flipper. Probably a small minority of the people in these jobs actually enjoy them; if these people suddenly had a guaranteed $30,000 a year, how would the market respond?
People really need garbage collectors, so they have a high willingness to pay for their salaries. Or, more precisely, they have a high willingness to accept higher taxes so that the government can employ garbage collectors. In all likelihood, not enough people will be willing to work as garbage collectors for their current salaries. Demand for garbage collectors is highly inelastic, so as supply of willing workers decreases, wages will increase by a lot. The increase in wages should be enough to incentivize people to continue working as garbage collectors.
The labor supply for burger flippers would similarly decrease. Fast food companies would have to raise salaries by a lot in order to get people to keep working for them, which means they would have to increase food prices. The increased food prices would decrease quantity demanded, and fast food companies would shrink (and possibly disappear entirely). I am probably okay with this.
But since people have less need to work, they should become more willing to work intrinsically enjoyable goods, so we should see an increase in the supply of short films, music, and other similar goods. Interestingly, writing books seems to be so intrinsically enjoyable that the market’s already over-saturated even without a basic income guarantee—publishers get way more manuscripts than they can use.
There’s a spectrum between “everybody intrinsically enjoys this” and “nobody intrinsically enjoys this”, and every job lands somewhere on the spectrum. Even among jobs that most people don’t intrinsically enjoy, we will still see differences. A lot fewer people will work in factory farms, since I can’t imagine that anybody would actually want to do that. But we probably won’t see that big a reduction in the quantity of auto mechanics. A lot of people like working on cars—people often do it as a hobby. We’d expect these people to be willing to work as auto mechanics for only relatively little pay.
I want to talk a little extra about software development since it’s my field. Generally speaking, a lot of programmers enjoy programming, but there are a lot of kinds that are more fun than others. We’d probably see more people starting their own companies and fewer people working software jobs that involve a lot of boring repetition.
This changes the incentives for companies hiring developers. Boring routine work becomes more expensive since fewer developers are willing to do it, so companies have stronger incentives to automate as much work as possible.
There probably won’t be a huge effect since developers tend to make well $30,000, so that extra money doesn’t do as much for them; the most affected jobs will be those that pay less than or about as much as the basic income.
An economy with a basic income guarantee would reduce or remove unimportant jobs while still retaining important jobs. Prices would be higher and people probably wouldn’t buy as much, but the things they’d buy less of would mostly be the things that weren’t really important to begin with. People aren’t perfectly rational; a lot of purchases people make just keep them going on the hedonic treadmill and don’t actually improve their lives.
Perhaps a world with McDonald’s is better than one without, but if it is, it’s certainly not much better, and I wouldn’t feel too bad about it if McDonald’s went out of business after all the low-level employees quit.
Please explain in the comments why I’m wrong about everything. I think the economic effects of a basic income guarantee could be really interesting and possibly surprising, and I want to hear what you think.
Haskell has all these language features that seem cool, but then you wonder, what is this actually good for? When am I ever going to need lazy evaluation? What’s the point of currying?
As it turns out, these language constructs come in handy more often than you’d expect. This article gives some real-world examples of how Haskell’s weird features can actually help you write better programs.
Lazy evaluation means that Haskell will only evaluate an expression if its value becomes needed. This means you can do cool things like construct infinite lists.
To take a trivial example, suppose you want to write a function that finds the first n
even numbers. You could implement this in a lot of different ways, but let’s look at one possible implementation (in Python):
def first_n_evens(n):
numbers = range(1, 2*n + 1)
return [ x for x in numbers if x % 2 == 0 ]
Here we construct a list with 2n
numbers and then take every even number from that list. Here’s how we could do the same in Haskell:
firstNEvens n = take n $ [ x | x <- [1..], even x ]
Instead of constructing a list with 2n
elements, we construct an infinite list of even numbers and then take the first n
.
Okay, so that’s pretty cool, but what’s the point? When am I ever going to use this in real life?
I recently wrote a simple spam classifier in Python. To classify a text as spam or not-spam, it counts the number of blacklisted words in the text. If the number reaches some threshold, the text is classified as spam. 1
Before reading further, think for a minute about how you could implement this.
Originally, I wanted to write something like this.
- filter the list for only blacklisted words
- see if the length of the list reaches the threshold
Here’s the equivalent Python code:
class LazyClassifier(Classifier):
def classify(self, text):
return len(filter(self.blacklisted, text.split())) >= self.threshold
This code is simple and concise. The problem is, it requires iterating through the entire list before returning, which wastes a huge amount of time. The text might contain tens of thousands of words, but could be identified as spam within the first hundred.
I ended up implementing it like this:
class ImperativeClassifier(Classifier):
def classify(self, text):
words = text.split()
count = 0
for word in words:
if self.blacklisted(word):
count += 1
if count >= self.threshold:
return 1
return 0
Instead of using higher-order functions like sum
, this implementation manually iterates over the list, keeping track of the number of blacklisted words, and breaks out once the number reaches the threshold. It’s faster, but much uglier.
What if we could write our code using the first approach, but with the speed of the second approach? This is where lazy evalution comes in.
If our program is lazily evaluated, it can figure out when the count reaches the threshold and return immediately instead of waiting around to evaluate the whole list.
Here’s a Haskell implementation:
classify text = (length $ filter blacklisted $ words text) >= threshold
(For those unfamiliar with Haskell syntax, see note.2)
Unfortunately, this doesn’t quite work. If the condition is true for the first k
elements of the list then it will also be true for the first k+1
elements, but Haskell has no way of knowing that. If you call classify
on an infinite list, it will run forever.
We can get around this problem like so:
classify text =
(length $ take threshold $ filter blacklisted $ words text) == threshold
Note that the take
operation takes the first k
elements of the list and drops the rest. (If you call take k
on a list with n
elements where n < k
, it will simply return the entire list.)
So this function will take the first threshold
blacklisted words. If it runs through the entire list before finding threshold
blacklisted words, it returns False
. If it ever successfully finds threshold
blacklisted words, it immediately stops and returns True
.
Using lazy evalution, we can write a concise implementation of classify
that runs as efficiently as our more verbose implementation above.
(If you want, it is also possible to do this in Python using generators.)
In Haskell, functions are automatically curried. This means you can call a function with some but not all of the arguments and it will return a partially-applied function.
This is easier to understand if we look at an example. Let’s take a look at some Haskell code:
add :: Int -> Int -> Int
add x y = x + y
add
is a simple function that takes two arguments and returns their sum. You can call it by writing, for example, add 2 5
which would return 7
.
You can also partially apply add
. If you write add 2
, instead of returning a value, it returns a function that takes a single argument and returns that number plus 2
. In effect, add 2
returns a function that looks like this:
add2 y = 2 + y
You could also think of it as taking the original add
function and replacing all occurrences of x
with 2
.
Then we can pass in 5
to this new function:
(add 2) 5 == 7
In fact, in Haskell, (add 2) 5
is equivalent to add 2 5
: it calls add 2
, which returns a unary function, and then passes in 5
to that function.
A similar function could be constructed in Python like so:
def add(x):
return lambda y: x + y
Then you could call (add(2))(5)
to get 7
.
To take a simple example, suppose you want to add 2 to every element in a list. You could map over the list using a lambda:
map (\x -> x + 2) myList
Or you could do this more concisely by partially applying the +
function:
map (+2) myList
It might seem like this just saves you from typing a few characters once in a while, but this sort of pattern comes up all the time.
This summer I was working on a program that required merging a series of rankings. I had a list of maps where each map represented a ranking along a different dimension, and I needed to find the sum ranking for each key. I could have done it like this:
mergeRanks :: (Ord k) => [Map.Map k Int] -> Map.Map k Int
mergeRanks rankings = Map.unionsWith (\x y -> x + y) rankings
(Note: unionsWith
takes the union of a list of maps by applying the given function to each map’s values.)
With partial application, we can instead write:
mergeRanks = Map.unionsWith (+)
This new function uses partial appliation in two ways. First, it passes in +
instead of creating a lambda.
Second, it partially applies unionsWith
. This call to unionsWith
gives a function that takes in a list of maps and returns the union of the maps.
Notice also how mergeRanks
is not defined with any arguments. Because the call to unionsWith
returns a function, we can simply assign mergeRanks
to the value of that function.
Perhaps this example is a bit on the confusing side; I intentionally chose a complex example that has real-world value. Once you grok partial applications, they show up more often than you might think, and you can use them to perform some pretty sophisticated operations.
And I haven’t even mentioned function composition.
Here’s a more complicated usage of partial application combined with function composition that I wrote this summer. See if you can figure out what it does.
yearSlice year stocks =
filter ((>0) . length) $
map (filter ((==year) . fyearq)) stocks
In one program of about 500 lines, I wrote about a dozen pieces of code similar to this one.
Pattern matching gives us a new way of writing functions. To take the canonical example, let’s look at the factorial function. Here’s a simple Python implementation.
def fac(n):
if n == 0:
return 1
else:
return n * fac(n - 1)
And the same program written in Haskell:
fac n =
if n == 0
then 1
else n * fac (n - 1)
But we could also write this using pattern matching.
fac 0 = 1
fac n = n * fac (n - 1)
Think of this as saying
- the factorial of 0 is 1
- the factorial of some number n is n * fac (n - 1)
So pattern matching is more declarative rather than imperative–a declarative program describes the way things are rather than what to do.
Wait, isn’t this just a different way of writing the same thing? Sure, it’s interesting, but what can pattern matching do that if
statements can’t?
Well, quite a lot, actually.3 Pattern matching makes it trivial to deconstruct a data structure into its component values. Haskell’s pattern matching intricately relates to how Haskell handles data types.
Suppose we want to implement the map
function. Recall that map
takes a function and a list and returns the list obtained by applying the function to each element of the list. So map (*2) [1,2,3] == [2,4,6]
. (Notice how I used partial application there?)
You may wish to take a moment to consider how you would implement map
.
Without using pattern matching, we could implement map
like this:
map f xs =
if xs == []
then []
else (f (head xs)) : (map (tail xs))
But this is a bit clunky, and we can do a lot better by using pattern matching. Think about how to define map
recursively:
- The map of an empty list is just an empty list.
- The map of a list is the updated head of the list plus the map of the tail of the list.
map f [] = []
map f (x:xs) = (f x) : (map f xs)
So much nicer!
This sort of design pattern comes in handy when you’re operating over data structures. To take a real-world example, I recently wrote a function that operated over an intersection of three values:
scoreIntersection (Intersection x y z) = whatever
I could pass in the Intersection
type and pattern matching made it easy to pull out the three values into the variables x
, y
, and z
.
Haskell has a number of language features that appear strange to someone with an imperative-programming background. But not only do these language features allow the programmer to write more concise and elegant functions, they teach lessons that you can carry with you when you use more imperative programming languages.
Many modern languages partially or fully support some of these features; Python, for example, supports lazy evaluation with generator expressions, and it’s possible to implement pattern matching in Lisp. And I’m excited to see that Rust supports sophisticated pattern matching much like Haskell.
If you want to learn more about Haskell, check out Learn You a Haskell for Great Good! Or if you’ve already dipped your toes into the Haskell ocean and want to go for a dive, Real World Haskell can teach you how to use Haskell to build real programs.
P.S. This site is relatively new, so if you see a mistake, please leave a comment and I’ll try and fix it.
I realize this is a terrible way to implement a spam classifier. ↩
Note on Haskell Syntax
The $
operator groups expressions, so
length $ filter blacklisted $ words text
is equivalent to
length (filter blacklisted (words text))
The words
function splits a string into a list of words. words
text
is roughly equivalent to Python’s text.split()
. ↩
Well, technically nothing, because every Turing-complete language is computationally equivalent. Anything that can be written in Python can also be written in assembly; that doesn’t mean you want to write everything in assembly. ↩
The :
operator is a cons
–given a value and a list, it prepends the value to the head of the list. ↩
This is a collection of some of my favorite resources on utilitarianism.
Light Books
Practical Ethics, Peter Singer
Animal Liberation, Peter Singer
The Life You Can Save, Peter Singer
Heavy Books
The Methods of Ethics, Henry Sidgwick
Utilitarianism, John Stuart Mill
The Principles of Morals and Legislation, Jeremy Bentham
Introductions
Introduction to Utilitarianism, William MacAskill
Consequentialism FAQ
Utilitarian FAQ
Common Criticisms of Utilitarianism
Organizations
80,000 Hours
Effective Animal Activism
GiveWell
The High Impact Network
Giving What We Can
Collections
Utilitarian Philosophers
Utilitarianism: past, present and future
Recommended Reading
Communities
Felicifia
Effective Altruists on Facebook
Less Wrong
Essays and Blogs
Essays on Reducing Suffering
The Effective Altruism Blog
Reflective Disequilibrium
Everyday Utilitarian
Measuring Shadows
Philosophy, et cetera
Reducing Suffering
If you are a United States citizen and you want to do as much good as possible with your vote, then how should you use it? (These principles apply outside the US as well, but my analysis focuses on US elections.)
For those who care about maximizing the welfare of society, the importance of voting increases as the population increases. Below is the mathematical justification for this claim. These calculations assume that you know the correct person to vote for. If you wish to avoid math, you can skip to the next section.
Continue readingWhen I took physics, I learned that stars radiate light all throughout the electromagnetic spectrum, and radiate the most at some point in the visible spectrum. Our sun radiates more yellow than any other frequency; blue stars radiate more blue; and red stars radiate more red. Given that visible light falls in such a narrow range (with wavelengths ranging from 400 to 700 nanometers), why do all stars’ peak frequencies occur in this range? It seems like a remarkable coincidence.
I wondered about this question for some time, until yesterday I finally realized the answer.
The sun radiates light mostly in the visible spectrum; when this light hits objects on earth, some of it is absorbed, and some is reflected. Most of the light that gets reflected is in the 400 to 700 nanometer range, so any device that picks up light will be most efficient if it can pick up this range. Our eyes evolved to use light to perceive objects, so they evolved to see light in this range. In other words, the reason we see light in the 400 to 700 nanometer range is because that is the range where the sun emits the most radiation. And we can see other stars because stars’ peak radiations do not vary all that much, so they all fall within the visible spectrum.