[UPDATE: this problem was featured in FiveThirtyEight's Riddler column. Check it out!]
I've discovered an interesting result recently: If two cars approach an N-way stop sign at the same time and both go their intended ways without yielding, they have a 1/3 chance of crashing as N -> ∞.
Motivation
I originally thought of this problem because, well, I often drive through stop signs. Others have probably realized this, but there is a certain scenario where I can stop at the stop sign right after someone else, and still go without yielding because I know we won't crash. That only happens if I'm turning right, and the person with right-of-way is to the right of me (i.e. in the road I am turning into). The original question I posed to myself: what scenarios allow me to go without looking, knowing I won't crash with the other car, as N gets larger?
The answer to that question turned out to be uninteresting - no matter the value of n, there is only one combination of my intended direction and the other drivers' position which cannot cause a crash. This scenario involves me turning right into the road adjacent to me, and the other driver starting from that road. Every other combination will cause a crash in some scenario, depending on the other car's intended direction (the unknown). I assume their direction as an unknown because I don't know what the other driver will do – even looking at their turn signal, since sometimes it's wrong. Turn signals also cease to be useful after a 4-way intersection. On the flip side, there is also one combination of those 2 aforementioned inputs that will always cause a crash, no matter which way the other car turns.
A More Interesting Question
I simulated up to n=6 by hand but realized the simple pattern to my first question pretty quickly. Along the way I thought it might be more useful to ask, "given all combinations of variables, which scenarios will lead to a crash?" For n=3, the answer is 4/8. For n=4, it is 14/27. n=5, 32/64, and n=6, 60/125. Ok, so about 1/2. But why such a round number, and why should it be constant? I found this surprising enough to keep investigating.
Whether two cars will crash at an n-way intersection is a function of 3 variables: car A's intended direction, car B's initial position relative to car A, and car B's intended direction. You don't need to know car A's position, since we can define it as a fixed value, then compute everything relative to that position.
For an n-way intersection, there are (n-1)^3 possible scenarios (no U-turns). Why? Because car A can go to n-1 possible directions, and car B can start at n-1 possible positions (car B can't start at car A's position), and car B can go to n-1 possible positions. (n-1)(n-1)(n-1) = (n-1)^3.
It helps to think of each direction as an index, starting at 0, and ending at n-1. For simplicity, car A always starts at index 0. Car B can start anywhere in [1,n-1]. Car A's destination can be anywhere in [1,n-1]. Car B's destination can be anywhere in [0,n-1], excluding its start position. As an arbitrary convention, I defined the indices as going counterclockwise from start. Here is an example for n = 8:
Note that the directions don't need to be evenly spaced around a circle, they can be any irregular shape, as long as the kth direction is strictly counterclockwise to the (k-1)th direction. In real life, intersections aren't perfect polygons.
What Defines a Crash?
For low n, the crash scenarios are straightforward. But what about this scenario, where n=6?
It's ambiguous whether the two cars will crash here. In real life they might, depending on their paths to their intended direction and the width of the lanes (each direction only has one lane). So I define a rule as to whether two cars crash: their paths must cross exactly once. In the example above, the cars meet at a point (like a double root), or depending on how you draw them, they cross twice or not at all.
Consider a 4-way intersection to make sense of this rule. You pull up to the intersection planning to turn left, and the person across from you plans to turn left as well. If you both go at once, you don't crash, since you can both veer left and avoid each other. No matter how you draw the paths, the two cars will either cross twice, zero times, or bounce off at a point. So this is not a crash.
Here are some crash scenarios for further illustration.
First Attempt at Solution
To find a solution, my first instinct was to create a rule for individual cases: whether two cars will crash, given the 3 variables stated above. I decided to simulate this with a python script. After some trial and error, I landed on a simple algorithm:
Let's walk through the code:
Line 3 just checks to make sure the inputs are valid.
I employed one trick, on line 7:
if other_end == 0:
other_end = num_ways
This simplifies the algorithm, and is sort of like saying that 0 degrees = 360 degrees.
Line 10 is a simple case: if two cars end up in the same road, they will crash.
Lines 12-17 describe the cases where car B's end position is less than car A's end position. In these cases, the cars are safe if car B also starts at an index lower than car A's end position.
Lines 17-21 cover the final cases, where car B's end position is greater than car A's. In those cases, the cars are safe if car B starts at an index >= car A's end position. We can use >= here, rather than just >, because if car B starts at car A's end position, it can safely turn without intersecting car A. This is not the case on lines 12-17, because car B would have to turn left and therefore cross car A. These cases also illustrate why I employed the trick on line 7. If car B is going to position 0, we can treat it as position N and it will fall into the final else statement.
I used this algorithm to run some simulations for different values of N, calculating the total crash rate across all scenarios. I was surprised to see that the rate was not 1/2, as I had estimated earlier, but appeared to tend toward 1/3 instead. This is what it looks like for up to N = 200.
I couldn't be sure of this, because after N = a few hundred, the computation takes too long to run (Remember calculating all scenarios is O((n-1)^3). So I had an empirical estimate of 1/3, but no proof as to why this would be the case. It could be that that rate is actually trending toward zero, or .319, or not converging at all, but 1/3 seemed like a solid assumption worth investigating.
I did some more simulations, this time wondering what the crash rate would be if I held one of the variables constant. The graphs below show what happens when you vary car A's destination for N=8.
Hmm, that looks a lot like a parabola! This makes sense intuitively – you probably have a higher chance of getting hit if you're heading substantially across the intersection, rather than immediately to the left or right. The graphs on the right are non-normalized, and the graphs on the left are normalized such that domain goes from 0 to 1, not 1 to N-1. Looking at the bottom left graph, we can establish a relationship between the overall crash rate and the rate for each value of A's destination (denoted as k).
This is is just a weighted sum. Each k value has an equal weighting, because there are (N-1)^2 scenarios for each k. And there are N-1 valid k values, so each k value's weighting is 1/(N-1) of the whole.
Let's see what happens for a much larger N, like 200. We'll try fitting the normalized curve (bottom left) to a quadratic to see what happens.
This definitely looks like a parabola. The parabolic best-fit equation has these coefficients:
Again we can't be certain, but the coefficients seem to be converging on these values as N -> ∞.
If we take the integral of this equation, it is basically like the weighted sum above, but as N->∞. What is the integral of this equation from 0 to 1? 1/3! Once I saw this, I was more convinced of both my coefficient values and the 1/3 crash rate. However, we're still missing the connection between any individual scenario and the bigger picture. Given all of the input variables, why does the crash rate tend towards 1/3 as N->∞? We know that this function approaches a quadratic for large N, but what is this function?
Solution
This basically ends up being a combinatorics problem. I solved it by using the previous simulation to my advantage, hoping to derive that same quadratic function of car A's end index. It helped me to "back-solve" in this way; I think it would have taken me a lot longer to figure out if I didn't look at the output for different values of N.
Let's look at the scenarios where car A's destination is known, as above. We'll denote their destination as k. Following the rules in the code, we can work out how many scenarios will result in no crashes.
Say it's an 8-way intersection, and car A is going to position 3. If car B starts in position 4, it can go to 4 positions without crashing: [5,6,7,8]. If car B starts in position 5, 6 or 7, it can also go to 4 positions safely. So there are 4 start positions here (4-7) and 4 safe end positions for each: a total of 16 safe scenarios. This generalizes to (N-k-1)^2.
if car B starts at position 3, it is an interesting edge case. This car can go to 5 end positions safely: [4,5,6,7,8], or (N-k) positions for the general case.
For any other scenario with car B starting at a position >= k, the car must cross car A's path and cause a crash. So for this side of the intersection, we have
safe scenarios.
What about start positions below k? In this case, the only acceptable end positions are between [1,k-1] (k-1 possible positions). In our example, this means positions 1 and 2. Each start position can only go to k-2 possible end positions, because they cannot end where they started. So this leaves us with (k-1)*(k-2) safe scenarios. Note that for k=1 or 2, there are no safe scenarios on this side.
In total, the equation for number of safe scenarios at a given k is:
The total number of scenarios for a given k is (N-1)^2. So we can write our crash rate as:
We want to know what this value is as N-> ∞, so let's take the limit and find out.
Before we can solve this, let's replace k with this expression
where f is a fraction from (0, 1). Since k can only go from [1,N-1], f will never actually reach zero or one, but as N goes to ∞, it will be very close.
Skipping some boring algebra, this is the result we get:
This is the same equation I found from my curve fitting simulations! And since N is now infinity, df (1/N) approaches zero. There are infinitely many values of f crammed into the domain (0,1), so the distance between them must approach zero.
Remember f is a "normalized" version of k, which represents car A's end position. So integrating is like taking a weighted sum of each crash rate across all valid k values.
So that's the answer, confirming that 1/3 is indeed the magic number in this problem.
U Turns
[Updated] What if u-turns are allowed? I originally omitted them since people don't usually make u-turns at stop signs, but it turns out they are legal to do in most US states. It also turns out that this doesn't affect the result. We get this expression for number of safe scenarios for a given k (see if you can derive it)
This only introduces new terms proportional to N, not N^2. Since the limit's denominator is still proportional to N^2, those terms go to zero, and we end up with the same equation.
Connection to the Circle
I didn't figure this part out myself, it was pointed out to me by Zach Wissner-Gross, after I submitted this problem to his Riddler column at FiveThirtyEight. It turns out that this is exactly equivalent to the probability of two chords intersecting on a circle. If you choose any 4 random points on the perimeter of a circle, and draw two chords between those points, you have a 1/3 chance of those chords intersecting.
Here's why: pick one of the points to start with, then you have 3 points remaining to complete the first chord (so three choices). Once you've chosen that point, the other two points form the second chord and you're done. No matter which point you start with, only one of the three choices results in the chords intersecting!
Wrapping Up
I'm not sure if other people have solved this problem before, but some googling didn't come up with any results, and I thought it would be fun to work out anyway. Maybe I'm using the wrong search terms. Maybe this is just a chapter 3 homework problem in some probability textbook.
I still don't have an intuitive feel for why this should make sense; it would seem more natural if the answer here were 1/2, or 0. But it's not. I don't know why. I suspect the intuition comes from looking at average sizes of different regions of a circle as a clock hand sweeps around it, but maybe someone else will point it out to me.
There are some other variations of this problem I'd like to try.
roundabouts
multiple-lane roads
I'm not a mathematician, this problem just caught my interest and I was surprised to see what came out of it. This is sort of a pointless problem since no intersections are large enough, but isn't that the case for most of math? Maybe there are connections to other areas of math that I don't know about.
I'd also like to point out that running the python simulations helped a lot here. It sort of feels like cheating, but this is a great way to get more information about a problem before you derive a solution.
So the moral of the story is: if you show up to an infinite-way stop sign, you have a 1/3 chance of getting hit if you just go without yielding. I'd take my chances. In any single scenario your odds will be updated because you know your intended position and the start position of the other car. Or maybe you can't see them, since the intersection needs room for infinity roads. Either way, now you know your odds! (sorry Han Solo)
Comments