• A lot of ink has been spilled about how to get various decision algorithms to cooperate with each other. However, most approaches require the algorithm to provide some kind of information about itself to a potential cooperator.

Consider FDT, the hot new kid on the block. In order for FDT to cooperate with you, it needs to reason about how your actions are related to the algorithm that the FDT agent is running. The naive way of providing this information is simple: give the other agent a copy of your “source code”. Assuming that said code is analyzable without running into halting problem issues, this should allow FDT to work out whether it wants to cooperate with you.

## Security needs of decision algorithms

However, just handing out your source code to anyone isn’t a great idea. This is because agents often want to pretend to be running a different decision algorithm from the one they really are.

• In the philosophy of computer science, there is a problem sometimes called the “waterfall problem”: properly interpreted, isn’t everything doing computation?

Suppose we have a waterfall, and we compare the positions of the atoms at the start of the waterfall to their positions at the bottom (which we will assume are randomly, but deterministically, changed). Now, suppose we want to argue that this waterfall is implementing a chess-playing algorithm. Then all we have to do is provide some mapping from a chess position to the “input” of the waterfall, and some mapping from the “output” to a legal move, and we’re done. And, of course, we can always do this, provided we’re willing to spend enough time gerrymandering our mappings!

Scott Aaronson argues that all the real work goes into producing the mapping - in practice, computing the mapping would require solving the chess-playing problem itself! We could show this by producing a program that, without access to the waterfall, solves the problem as quickly (in complexity terms) as one with access to it.
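To make the gerrymandering concrete, here is a minimal sketch (my own illustration, not from the post) in which an arbitrary deterministic scramble stands in for the waterfall, and we “interpret” it as computing the successor function. Note where the real work happens: building the decoding table already requires knowing the answer.

```java
import java.util.HashMap;
import java.util.Map;

public class WaterfallDemo {
    // The "waterfall": an arbitrary, deterministic scramble of its input.
    // It knows nothing about arithmetic (or chess). Multiplication by an odd
    // constant plus a rotation is bijective on ints, so distinct inputs give
    // distinct outputs.
    static int waterfall(int x) {
        return Integer.rotateLeft(x * 0x9E3779B1, 13);
    }

    public static void main(String[] args) {
        // Gerrymandered mappings "interpreting" the waterfall as the
        // successor function on 0..9. Building `decode` requires already
        // computing n + 1 ourselves -- that's where all the work goes.
        Map<Integer, Integer> encode = new HashMap<>();
        Map<Integer, Integer> decode = new HashMap<>();
        for (int n = 0; n < 10; n++) {
            encode.put(n, n);
            decode.put(waterfall(n), n + 1);
        }
        // With the mappings in hand, the waterfall "computes" n + 1.
        for (int n = 0; n < 10; n++) {
            int result = decode.get(waterfall(encode.get(n)));
            System.out.println(n + " -> " + result);
        }
    }
}
```

The same trick would “make” the waterfall play chess, given a big enough table - which is exactly Aaronson’s point.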

## The epistemic question

There’s a related question, though, which is: given a waterfall, how can we tell if it is computing something (or some particular computation)? We might think that we can at least answer negatively in some cases. If the action of the waterfall is indistinguishable from a random function (that is, it is a pseudorandom function), then surely it cannot be computing anything.

• If you’re anything like me, you often find yourself writing code like this:

```java
public List<ThingBit> getTheThings(ThingHolder t) {
    if (t.hasThings()) {
        List<Thing> things = t.getThings();
        List<ThingBit> thingBits = new ArrayList<ThingBit>();
        for (Thing thing : things) {
            thingBits.add(thing.getBit());
        }
        return thingBits;
    } else {
        return null;
    }
}
```


Now, there’s a lot wrong with this code1, but the intent is at least clear. However, there’s a good chance it will blow up when I actually try to use it, because it doesn’t account for errors. Did you know that in the context this method may be called in, t may be null? Or that some Things don’t have bits, indicated by getBit() returning null? Or that getBit() actually interfaces with a database and can throw all kinds of exceptions?

Well, if you didn’t know that, you didn’t handle it, and after running it a bit you’ll get lots of lovely NullPointerExceptions. So that’s problem number 1.

Sometimes you don’t know what the error behaviour of code is

However, sometimes I do know that getBit() might be null. I’ve seen it before, and it can certainly happen, but a lot of the time it doesn’t. So I know I should handle the null case… but I don’t anyway. Why do I code for the happy path when I know there’s an unhappy path? I’m just going to have to go back and put it in once I write some tests, or worse, after it crashes when in use.

My favourite CWE is the wonderfully unenforceable CWE-655: Insufficient Psychological Acceptability. This makes the kind-of-obvious point that if your security measures are too annoying or difficult to use, then people will bypass them, rendering them entirely pointless. The same is true for error-handling techniques in software: if they’re too painful to use, then people won’t use them until they’re forced to2. So that’s problem number 2.

Sometimes correct error handling is psychologically unacceptable to the programmer
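For contrast, here is a sketch of what the unhappy-path version might look like. The Thing types are stand-ins I’ve stubbed out so the example runs; the defensive moves (null checks, empty list instead of null, catching the database’s exceptions) are the point.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class DefensiveDemo {
    // Minimal stand-ins for the hypothetical types in the snippet above.
    static class ThingBit {}
    static class Thing {
        ThingBit bit;
        Thing(ThingBit bit) { this.bit = bit; }
        ThingBit getBit() { return bit; } // imagine this may also throw
    }
    static class ThingHolder {
        List<Thing> things;
        ThingHolder(List<Thing> things) { this.things = things; }
        boolean hasThings() { return things != null && !things.isEmpty(); }
        List<Thing> getThings() { return things; }
    }

    static List<ThingBit> getTheThings(ThingHolder t) {
        if (t == null || !t.hasThings()) {
            return Collections.emptyList(); // never hand null back to callers
        }
        List<ThingBit> thingBits = new ArrayList<ThingBit>();
        for (Thing thing : t.getThings()) {
            try {
                ThingBit bit = thing.getBit(); // may talk to a database
                if (bit != null) {             // some Things have no bit
                    thingBits.add(bit);
                }
            } catch (RuntimeException e) {
                // Skipping (or logging) a bad Thing is itself a design
                // decision -- the point is that it has to be made somewhere.
            }
        }
        return thingBits;
    }

    public static void main(String[] args) {
        System.out.println(getTheThings(null).size());
        List<Thing> things = new ArrayList<Thing>();
        things.add(new Thing(new ThingBit()));
        things.add(new Thing(null)); // a Thing without a bit
        System.out.println(getTheThings(new ThingHolder(things)).size());
    }
}
```

It is roughly three times the code of the happy-path version, which is exactly why it doesn’t get written up front.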

1. Every time I write another pointless for-loop to get around the lack of map, I die a little inside.

2. If your programming language doesn’t enforce handling of errors, then ensuring that errors are handled is a basic software quality issue. And often a security issue - perhaps this deserves its own CWE!
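On footnote 1: Java 8 streams do supply the missing map, for what it’s worth. A quick sketch of the loop-free version:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class MapDemo {
    public static void main(String[] args) {
        List<String> things = Arrays.asList("foo", "quux", "ba");
        // map + collect replaces the hand-written for-loop-and-accumulator.
        List<Integer> lengths = things.stream()
                .map(String::length)
                .collect(Collectors.toList());
        System.out.println(lengths);
    }
}
```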

• This blog has been built with Nix for some time, but the deployment of the blog has been a hand-written shell script that just rsync’d the files across to a VPS. How quaint.

I was bored and looking for a reason to try NixOps, so now all the deployment happens through it, which is really quite nice.

• [Epistemic status: strongly stated, weakly held]

When faced with problems that involve ongoing learning, most strategies involve a balance between “exploration” and “exploitation”. Exploration means taking opportunities that increase your knowledge about how good your opportunities are, whereas exploitation means putting resources into what you currently believe is the best opportunity. A good strategy will involve both: if you only explore, then you will never actually reap any rewards, whereas if you only exploit then you will likely spend all your resources on a poor opportunity.
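The standard toy model of this trade-off is the multi-armed bandit. As an illustration (mine, not the post’s), here is a minimal epsilon-greedy sketch: a fixed fraction of pulls go to exploration, the rest exploit the empirically best arm. The payoff numbers and epsilon are arbitrary choices for the demo.

```java
import java.util.Random;

public class EpsilonGreedy {
    public static void main(String[] args) {
        // Two "opportunities" with unknown true payoff probabilities.
        double[] truePayoff = {0.3, 0.7};
        double epsilon = 0.1;              // fraction of pulls spent exploring
        double[] totalReward = new double[2];
        int[] pulls = new int[2];
        Random rng = new Random(42);

        for (int step = 0; step < 10000; step++) {
            int choice;
            if (rng.nextDouble() < epsilon || pulls[0] == 0 || pulls[1] == 0) {
                choice = rng.nextInt(2);   // explore: learn about the arms
            } else {
                // exploit: pick the arm with the better empirical average
                choice = totalReward[0] / pulls[0] >= totalReward[1] / pulls[1]
                        ? 0 : 1;
            }
            double reward = rng.nextDouble() < truePayoff[choice] ? 1.0 : 0.0;
            pulls[choice]++;
            totalReward[choice] += reward;
        }
        // Most pulls should have gone to the genuinely better arm (index 1).
        System.out.println(pulls[1] > pulls[0]);
    }
}
```

Pure exploration (epsilon = 1) would split pulls evenly and forfeit half the reward; pure exploitation (epsilon = 0) risks locking onto the worse arm on the strength of a few early pulls. The argument below is that the EA community’s epsilon is currently too low.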

When we in the EA community are thinking about how to spend our resources, we face an exploration/exploitation problem. In this post I’m going to argue that:

1. The value of exploration is higher than ever
2. The EA community is not doing enough exploration
3. Exploration through experimentation is particularly neglected