After AI beat them, professional Go players got better and more creative
A game of the board game Go in Japan, 1876
For many decades, it seemed professional Go players had reached a hard limit on how well it was possible to play. They were not getting better. Decision quality largely plateaued from 1950 to the mid-2010s:
Then, in March 2016, DeepMind demonstrated AlphaGo, an AI that could beat the best human Go players. This is how the humans reacted:
After a few years, the weakest professional players were better than the strongest players before AI. The strongest players pushed beyond what had been thought possible.
Or were they cheating by using the AI? No.1 They really were getting better.
And it wasn’t simply that they imitated the AI in a mechanical way. They got more creative, too: there was an uptick in historically novel moves and sequences. Shin et al. calculate that about 40 percent of the improvement came from moves that could have been memorized by studying the AI. But moves that deviated from what the AI would do also improved, and these “human moves” accounted for the remaining 60 percent of the improvement.
My guess is that AlphaGo’s success forced the humans to reevaluate certain moves and abandon weak heuristics. This let them see possibilities that had been missed before.
Something is considered impossible. Then somebody does it. Soon it is standard. This is a common pattern. Until Roger Bannister ran the four-minute mile, the best runners had clustered just above four minutes for decades. Within a few months, Bannister was no longer the only runner to have run a four-minute mile. These days, high schoolers do it. The same story can be told about the French composer Pierre Boulez: his music was considered unplayable until recordings started circulating on YouTube and elsewhere. Now it is standard repertoire at concert halls.
The recent development in Go suggests that superhuman AI systems can have this effect, too. They can prove something is possible and lift people up. This doesn’t mean that AI systems will not displace humans at many tasks, and it doesn’t mean that humans can always adapt to keep up with the systems—in fact, the human Go players are not keeping up. But the flourishing of creativity and skills tells us something about what might happen at the tail end of the human skill distribution when more AI systems come online. As humans learn from AIs, they might push through blockages that have kept them stalled and reach higher.
Another interesting detail about the flourishing in Go, teased out in this paper by Shin, Kim, and Kim, is that the trend shift actually happened 18 months after AlphaGo. This coincides with the release of Leela Zero, an open-source Go engine. Being open source, Leela Zero allowed Go players to build tools, like Lizzie, that show the AI’s reasoning as it picks moves. And by giving people direct access, it made massive input learning possible2. This is likely what caused the machine-mediated unleashing of human creativity.
This is not the first time this kind of machine-mediated flourishing has happened. When Deep Blue beat the chess world champion Garry Kasparov in 1997, it was assumed this would be a blow to human chess players. It wasn’t. Chess became more popular than ever. And the games did not become machine-like and predictable; instead, top players like Magnus Carlsen became more inventive than ever.
Our potential is greater than we realize. Even in highly competitive domains, like chess and Go, performance can sit far below the limit of what is possible. Perhaps AI will give us a way to push through these limits in more domains.
The data in the graph showing the improvement is from Games of Go on Disk, a project that transcribes games from professional Go tournaments. These games happen in person and have precautions against cheating. There was a recent incident in Chinese chess when Yan Chenglong, last year's winner of the national tournament, was accused of cheating with an anal bead that let him send information to a computer by squeezing, receiving moves back as coded vibrations. So who knows. But cheating doesn’t seem common enough to explain the trend.
It is Shin, Kim and Kim who claim Leela Zero helped because, unlike AlphaGo, it showed the reasoning behind the move, not just the move.
This is interesting in light of cognitive apprenticeship theory, which posits that people have a hard time learning cognitive skills, like literacy or Go, because our learning is adapted for imitation and apprenticeship-like situations, and this works poorly for cognitive skills, which happen hidden in the head. By opening up the box so that the thought process can be observed, as Lizzie does, you allow people to apprentice themselves to the cognition, not just the actions.
I am not sure I believe this explanation! When I look at subreddits for Go players who use Lizzie, my impression is that they don’t look at the reasoning all that much. They use it mainly to pinpoint moves where the winrate suddenly drops, so they can focus their learning on their biggest mistakes.
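That style of review can be sketched in a few lines of Python. This is a hypothetical illustration of scanning a game's winrate curve for the biggest drops; the function name and the winrate numbers are invented for the example, not output from Lizzie or any real engine:

```python
def biggest_mistakes(winrates, top_n=3):
    """Return (move_number, winrate_drop) for the largest winrate drops.

    winrates[i] is the engine's winrate for the player, evaluated after
    move i. A large drop between consecutive positions flags a likely
    blunder worth studying.
    """
    drops = [
        (i + 1, round(winrates[i] - winrates[i + 1], 3))
        for i in range(len(winrates) - 1)
    ]
    # Sort by drop size, largest first, and keep the worst few.
    drops.sort(key=lambda d: d[1], reverse=True)
    return drops[:top_n]

# Invented winrate swings over a short game: moves 3 and 6 are the
# positions where the player's winrate collapses.
rates = [0.52, 0.55, 0.54, 0.31, 0.33, 0.35, 0.20]
print(biggest_mistakes(rates, top_n=2))  # → [(3, 0.23), (6, 0.15)]
```

The point of the sketch is that the engine's reasoning never enters the loop: the player only needs the winrate numbers to find where to focus their study.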
The true explanation for why open source helped might actually be the inverse of what Shin, Kim and Kim propose. It might be that open source helped because it let people do massive input learning, simply flooding themselves with data on how the AI plays, bypassing reasoning altogether. It could be that reasoning was holding people back before. Human moves tend to follow heuristics that are explainable and that simplify things so people can do the computations in their heads. The AIs don’t care about these heuristics and explanations, and so can play cleaner. In chess parlance, the AI is more “concrete”: it reasons from specific variations rather than from general principles. Doing massive input training on this kind of concrete play, bypassing heuristics and explanations, might be what drove the improved decision quality.
In chess, the new batch of young grandmasters got there largely by playing 10+ hours a day of online speed chess instead of following the older strategies that emphasized targeted learning, deliberate practice, and slower exercises. This is another example of a shift toward massive input, pushing beyond heuristics to pure pattern matching, and it was, like the shift in Go, facilitated by AI engines.