Friday, 2 October 2015

An empirical look at the scaling of world-record running and swimming speeds.

Rather than physics, I'm going to talk about running, and then swimming. I have run a few races in the last year or so, but I am not as knowledgeable about sports physiology as I am about physics.

Out of curiosity, one day I looked up the records for various running races, calculated their average speeds, and plotted them versus distance. Before I attempt to analyse them, let's look at the data.

Record running speed vs. race distance. Linear-log and log-log scales. Red dots are 100 m and marathon.
A few comments about the data. It comes from Wikipedia. Distances range from the 40 yard NFL combine sprint to the 24-hour record. Different races have different standards for timing. Some have official records that are kept, and some are informal. Sometimes the fastest time for a certain distance is actually a subset of a longer race, e.g. running the fastest 30 km as part of a marathon. Somewhat implicit here is the assumption that the current world records are close to the pinnacle of human possibility. Because these are average speeds, they neglect information about speed variation within a race.
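As a minimal sketch of the speed calculation, here is the arithmetic for a handful of men's records. The times are recalled from the 2015 record lists rather than taken from the data set behind the plots, so treat them as illustrative assumptions; `average_speed_kmh` is just a name chosen for this example.

```python
# Average speed from a record: distance divided by time, converted to km/h.
# ASSUMPTION: these times are recalled from 2015 record lists, not copied
# from the post's own data set.

def average_speed_kmh(distance_m: float, time_s: float) -> float:
    """Return the average speed in km/h for distance_m metres run in time_s seconds."""
    return distance_m / time_s * 3.6

records = {
    "100 m": (100.0, 9.58),
    "150 m": (150.0, 14.35),
    "200 m": (200.0, 19.19),
    "marathon": (42195.0, 2 * 3600 + 2 * 60 + 57),  # 2:02:57
}

speeds = {name: average_speed_kmh(d, t) for name, (d, t) in records.items()}

for name, v in speeds.items():
    print(f"{name:>8}: {v:5.2f} km/h")

ratio = speeds["marathon"] / speeds["100 m"]
print(f"marathon speed / 100 m speed: {ratio:.2f}")
```

Being averages, these numbers say nothing about how the speed varied within each race.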

If we look at the data, we see four to five different regimes. At least, I do.

Let's look at each of these individually.

1. Acceleration Zone. 0-100 m. For the shortest races, below 100 m, the average speeds are slower because a large portion of the time is accelerating. The shorter the race, the less time is spent at top speed.

2. The Usain Bolt Golden Zone. 100-200 m. This is sort of a transition zone between 1 and 3, where the race is long enough to reach top speed and coast at it, but not so long that the runners start to slow down. Usain Bolt holds all of these records at roughly the same speed, his fastest average race being the 150 meters, which he ran at 37.6 km/h. For a long time, the 200 m record was faster than the 100 m, but Usain Bolt effectively tied them up.

3. Sprint Zone. 200-1000 m. The runners are trying to go as fast as they can, but it doesn't last forever. As the race gets longer, they get slower at roughly the -1/5 power of distance*. I have a feeling terms like "fast-twitch" and "anaerobic" would come into play here if the physiology were to be discussed.

4. Endurance Zone. 1.5-42 km. The races are long enough that conserving energy becomes more important than going all-out. Speed decreases with roughly the -1/13 (-0.08) power of distance. Marathon champions are slightly more than half as fast as Usain Bolt.

5. Pain Zone/Low Stats Zone. 50+ km. Few venture beyond the marathon. This is the ultimate test of human endurance, what separates us from the gazelles. Speed decreases with roughly the -1/5 power again. This may be a different physiological regime than the Endurance Zone, or it could be confounded by two things. First, far fewer people run these races, so a true champion has not emerged (at the risk of stereotyping, I find it suspicious that no supermarathon record holder is Kenyan). Second, the races are long enough that people have to take bathroom breaks, lowering the average speed.
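The exponents quoted for each zone are slopes on a log-log plot. A rough sketch of how such a slope could be extracted is below: an ordinary least-squares line through log(speed) versus log(distance). The data here are synthetic, generated from exact power laws with made-up prefactors, so the only thing the example shows is that the fit recovers the exponent it was given. The function name `fit_power_law` and the zone boundaries are assumptions for illustration.

```python
import math

def fit_power_law(distances, speeds):
    """Least-squares fit of log(speed) = log(c) + a * log(distance).

    Returns (a, c) so that speed is approximately c * distance ** a.
    """
    xs = [math.log(d) for d in distances]
    ys = [math.log(v) for v in speeds]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    a = sxy / sxx
    c = math.exp(y_mean - a * x_mean)
    return a, c

# SYNTHETIC data: exact power laws with arbitrary prefactors, not real records.
sprint_d = [200, 400, 800, 1000]
sprint_v = [30.0 * d ** (-1 / 5) for d in sprint_d]

endurance_d = [1500, 5000, 10000, 21097.5, 42195]
endurance_v = [12.0 * d ** (-1 / 13) for d in endurance_d]

a_sprint, _ = fit_power_law(sprint_d, sprint_v)
a_endurance, _ = fit_power_law(endurance_d, endurance_v)

print(f"sprint zone exponent:    {a_sprint:.3f}")
print(f"endurance zone exponent: {a_endurance:.3f}")
```

With real records the points scatter around the line, and where one zone ends and the next begins is a judgment call that shifts the fitted slope.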

I mentioned low statistics. What I mean by that is that there are certain races where the record is clearly not what is humanly possible, but because comparatively fewer people run that distance, the best hasn't been achieved yet. This is responsible for the jumble in between half and full marathon speeds, and you can see an explicit example if you look at the Sprint Zone. 200, 400, 800 m and (less so) 1 km are commonly contested events, while nobody really runs 500 meters. It falls below the trend of human excellence.
500 meter glory is ripe for the taking. Also, this is a log-log plot so a straight line means a power law.
Now let's look at women's records.

Women's records compared to men's, and the ratio in speeds.
Generally, the fastest women are about 12% slower than the fastest men, with some variation. The small-sample effects are more pronounced, especially around the marathon distance. One could try to squint out a trend in the ratio, but it is essentially constant: if you do a power law fit, the exponent is consistent with zero, 0.002±0.003 (it's even closer to zero if you take out ultramarathons). Both the 100 m and the marathon have the same ratio, about 9% faster for men. The best and worst are both ultramarathons (6% and 20%), indicating sample-size effects. My interpretation of this information is that whatever physiological differences separate sprinters and distance runners, they do not differ between men and women.
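To make "consistent with zero" concrete, here is a sketch of a log-log slope fit that also returns the standard error of the slope. The ratios below are synthetic placeholders that scatter around 0.9 with no built-in trend; they are not the values behind the plot. The helper name `fit_slope_with_error` is an assumption, and the standard error uses the textbook formula for simple linear regression.

```python
import math

def fit_slope_with_error(xs, ys):
    """Simple linear regression of ys on xs; returns (slope, standard error of slope)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Residual sum of squares around the fitted line.
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    stderr = math.sqrt(rss / (n - 2) / sxx)
    return slope, stderr

# SYNTHETIC women/men speed ratios at a few distances (metres); placeholders only.
distances = [100, 400, 1500, 10000, 42195, 100000]
ratios = [0.91, 0.89, 0.90, 0.88, 0.91, 0.89]

log_d = [math.log(d) for d in distances]
log_r = [math.log(r) for r in ratios]

exponent, error = fit_slope_with_error(log_d, log_r)
print(f"ratio exponent: {exponent:+.4f} +/- {error:.4f}")
print("consistent with zero" if abs(exponent) < 2 * error else "nonzero trend")
```

The two-standard-error threshold is a convention chosen for this sketch, not something the original analysis states.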

However, things are different when we look at swimming records. Pool events range from 50 to 1500 meters, and there are longer outdoor events, which are harder to compare because I imagine the environmental conditions start to make a big difference. Looking at the data, it looks similar to running: longer is slower. It gets interesting when we compare men and women.
The two longest races are outdoors and are suspect. The first six are pool races.

As you can see, the relative advantage men have over women decreases as the race gets longer. This is pretty interesting, because it's something that's found in swimming but not running. What is the physiological reason for this? I don't know, but if I had to hypothesize, I'd say that in longer races, more energy is spent on maintaining buoyancy than in sprints. Women are naturally more buoyant than men, so over longer distances this starts to matter and tires men out comparatively more.

Now, let's get silly. My squint-analysis tells me that there is a sprint zone below 400 m. For men the scaling exponent is -0.13, while for women it's -0.11 (being less negative is consistent with the previous paragraph). For the endurance zone, it's -0.0526 for men and -0.0485 for women, with an error of about 0.003. These numbers are about two-thirds of their running counterparts, and I can't explain why, but I'm sure it's interesting. The scaling exponent of the ratio is about -0.008 for all the data or -0.014 for only the pool records. If that seems ridiculously small, look at the y-axis of that ratio graph. These numbers are really small, but they are consistently nonzero; the error is roughly 0.001. With this information, I draw a natural conclusion: women will overtake men in a billion kilometer race.
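For anyone who wants to play with the extrapolation, here is a sketch of the arithmetic. If the men/women speed ratio follows ratio(d) = r0 * (d / d0) ** k with k negative, it reaches 1 at d = d0 * r0 ** (-1 / k). The exponents are the two quoted above; the starting ratio of 1.10 at 50 m is a hypothetical placeholder, not a value read off the plot, and `crossover_distance` is a name made up for this example.

```python
def crossover_distance(d0_m: float, ratio_at_d0: float, exponent: float) -> float:
    """Distance (metres) at which ratio(d) = ratio_at_d0 * (d / d0_m) ** exponent equals 1."""
    if exponent >= 0 or ratio_at_d0 <= 1:
        raise ValueError("need a ratio above 1 that shrinks with distance")
    return d0_m * ratio_at_d0 ** (-1.0 / exponent)

# HYPOTHETICAL starting point: men 10% faster than women at 50 m.
d0_m = 50.0
r0 = 1.10

for k in (-0.008, -0.014):  # exponents quoted in the text
    d = crossover_distance(d0_m, r0, k)
    print(f"exponent {k}: ratio reaches 1 at about {d / 1000:.3g} km")
```

Because the exponent sits in the denominator of a power, the answer swings by orders of magnitude with tiny changes in the inputs, so this illustrates the arithmetic rather than checking any particular figure.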

Extrapolation is never wrong and always justified.

I am not the only person who has had these ideas: a lot of it was discussed in this paper, which I didn't consult until after writing everything above. They analyse sexual dimorphism in a lot of different sports, and find jumping (especially pole vault) is the most dimorphic...sexually. They also discuss these trends for running, swimming, and speed skating. They also did experiments with pigeons over hundreds of kilometers and found no evidence of sexual dimorphism.

In looking at all this information, I am basically doing what I would do with a set of experimental data where I don't really understand the underlying mechanisms (in this case, how people work). I look at plots, try to split the data into different regions, and look for trends within each region, and then try to figure out what underlying phenomena lead to those trends. Sometimes you can learn stuff this way.

If any physiologists or kinesiologists are reading this and have some ideas about the trends I've discussed, please share them.

*What this means is that in a race 32 times as long as another, the speed would be half. I would recommend reading the intro to my article on animal speeds for an overview of scaling analysis. If you want.
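The footnote's arithmetic can be checked in a couple of lines. This is a small sketch under the assumption that speed follows a pure power law in distance; the two helper names are invented for the example.

```python
def speed_factor(distance_ratio: float, exponent: float) -> float:
    """Factor by which speed changes when distance is multiplied by distance_ratio."""
    return distance_ratio ** exponent

def distance_ratio_to_halve(exponent: float) -> float:
    """How many times longer a race must be for the speed to drop by half."""
    return 0.5 ** (1.0 / exponent)

# Sprint Zone exponent of -1/5: a race 32 times as long is run at half the speed.
print(f"{speed_factor(32, -1 / 5):.3f}")

# Endurance Zone exponent of -1/13: halving the speed takes a much longer race.
print(f"{distance_ratio_to_halve(-1 / 13):.0f}")
```

With the -1/13 exponent the distance has to grow by a factor of 2^13, or 8192, before the speed halves, which is why the Endurance Zone looks so flat on the plot.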


  1. Interesting research. Thanks for putting this together. Another point to consider about longer-than-marathon running distances: not only is the sample smaller, but most longer-than-marathon events are trail races. There are very few (if any) ultramarathon road races (from 50 km to 100 miles), and even if they are out there, they lack the prestige and prize money to entice world-class marathoners to train for and compete in them.

  2. Love it! This is super interesting.

    You've mentioned that data is poor for the ultra folks, and this is true. But another important factor is that marathons are done on roads in controlled conditions, whereas ultra races are often over very difficult terrain and/or harsh conditions - like over mountains or through the desert.

    An interesting comparison might be determining zones by amount of time rather than distance (that is, once you get out of the really short distances where acceleration time is a major factor). Super endurance races become games of eating and drinking as your body starts seeking energy from glycogen stores in the muscles and liver in order to keep moving. Elite athletes might not drop off in a marathon as much as average athletes in part because: 1) their bodies and running form are more efficient, thus burning fewer calories; and 2) they spend less time on the course, and thus burn fewer base calories. It would be super interesting to map onto your existing data to see if there is a more generalizable drop-off in pace after a certain number of hours. Of course an athlete will hit the glycogen stores more quickly if s/he is working harder. But I wonder if, even for expert endurance runners, there is a generalizable period of time we are capable of running, after which the human body just says "fuck it, I'm done. Also, you're an asshole".*

    *Anyone who has ever hit the wall will attest that this is a direct quote.

  3. If I plot the data vs time instead of distance, it looks liiiike.....


    Fairly similar. One notable difference is that there appears to be a sharper transition between sprint and endurance for women. Actually, that was there in the distance data but I didn't notice it.
