Machine learning and artificial intelligence have risen steeply in popularity and prominence over the past few decades, and with increasing vigor over the past few years. It seems like everyone and their dog is using machine learning to solve some sort of problem. This rise can be quantified: the prevalence of ‘machine learning’ in printed text more than doubled between 2015 and 2019.[1]
Put simply, machine learning involves designing and implementing algorithms that ‘train’ computers to perform specific tasks. This could be a well-defined task, like determining whether a particular picture shows a cat or a dog, or a more ambiguous goal, like deciding which TV programs or music playlists to recommend based on your watch/listen history on an application like Netflix or Spotify. While these applications represent tangible business uses of machine learning, the machine learning community at large has also long pursued a loftier goal of expanding the scope and capabilities of its craft to tackle a different set of problems: outperforming expert humans at complex games.
This might bring back nostalgic memories of IBM’s ‘Watson’ computer convincingly beating Jeopardy legends Ken Jennings and Brad Rutter in 2011, but the arc of research has continued since then. DeepMind (now a subsidiary of Google’s parent company, Alphabet) has developed a suite of game-playing programs that have made serious waves in their respective communities by exploiting a technique known as reinforcement learning. Instead of explicitly showing the program the correct answers, reinforcement learning works by providing feedback after each attempt as to whether the program performed well or poorly.
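To make the idea of learning from feedback concrete, here is a minimal Python sketch. It is a toy of my own construction, far simpler than anything DeepMind uses: the function names, the four possible actions, and the hidden scoring rule are all assumptions chosen for illustration. The learner is never told which action is correct; it only receives a reward after each attempt and keeps a running average of how well each action has performed.

```python
def play(action):
    """Hidden scoring rule (hypothetical): the learner only sees the reward."""
    return 1.0 if action == 2 else 0.0


def learn(num_actions=4, attempts=200, explore_every=10):
    """Learn which action is best purely from reward feedback."""
    estimates = [0.0] * num_actions  # running average reward per action
    counts = [0] * num_actions       # how often each action was tried
    for t in range(attempts):
        if t % explore_every == 0:
            # Scheduled exploration: occasionally try each action in turn.
            action = (t // explore_every) % num_actions
        else:
            # Exploitation: pick the action that has looked best so far.
            action = max(range(num_actions), key=lambda a: estimates[a])
        reward = play(action)
        counts[action] += 1
        # Incremental update of the running average for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates


estimates = learn()
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print(best_action, estimates)
```

Real systems replace the small table of averages with neural networks and must plan over long sequences of moves, but the loop of acting, receiving feedback, and updating is the same in spirit.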
For instance, AlphaGo was a reinforcement-learning-based project undertaken by DeepMind with the singular goal of mastering the ancient board game of ‘Go’ [2], which is widely considered to be one of the most complex board games in existence. [3] It was due to this complexity that computers had not historically been successful at playing Go on the same level as professionals: there are simply too many possible future states of the game for the computer to consider for a ‘brute force’ approach to work. Herein lies the power of reinforcement learning. AlphaGo was first trained on moves from a large collection of games between strong amateur players, and then played several million simulated games against itself, slowly improving its strategy. Through this process, the program developed its own internal logic for evaluating board positions, without having to enumerate all possible future moves. This project culminated in 2016 when AlphaGo played a high-profile series of games of Go (which is chronicled by the worthwhile documentary found here) against one of the world’s top Go players at the time, Lee Sedol of South Korea. AlphaGo ultimately prevailed, winning 4 of the 5 games and shocking the Go community to its core. [4]
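A quick back-of-the-envelope calculation shows why brute force fails. The sketch below assumes the commonly quoted rough figures of about 35 legal moves per turn over about 80 turns for chess, and about 250 moves per turn over about 150 turns for Go. These numbers are estimates I am supplying for illustration, not figures from this post, and the function name is my own.

```python
def game_tree_size(moves_per_turn, turns):
    """Rough count of move sequences: one choice per turn, multiplied out."""
    return moves_per_turn ** turns


# Commonly quoted rough estimates (assumptions, not exact values).
chess_sequences = game_tree_size(35, 80)
go_sequences = game_tree_size(250, 150)

# Count the decimal digits to get a feel for the magnitudes.
chess_digits = len(str(chess_sequences))
go_digits = len(str(go_sequences))
print(chess_digits, go_digits)
```

Under these assumptions the chess count has 124 digits and the Go count has 360 digits, so no computer could ever examine every sequence one by one.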
Presciently, at the end of the AlphaGo documentary the professional Go consultant on the DeepMind team, Fan Hui, states in admiration of AlphaGo: “[m]aybe he [AlphaGo] can show humans something we never discover.” In the years since the Sedol-AlphaGo showdown, the team at DeepMind has been hard at work, eventually generalizing AlphaGo into the AlphaZero engine, which has been successfully used to master several different complex games, such as Go, Shogi, and Chess. In the case of Chess, for instance, this has led to a unique playing style of the AlphaZero engine, which has even inspired the long-reigning world champion and chess grandmaster Magnus Carlsen to reconsider some facets of his own game.
While both the AlphaZero and AlphaGo stories are impressive, it has been hard to say whether any of their successes would satisfy Fan Hui’s speculation that the programs themselves can show us (humans) something we haven’t discovered. Until now, that is. A recent ground-breaking study by the DeepMind team applied the AlphaZero framework to the more abstract problem of multiplying matrices. [i] Put simply, matrices are arrays of numbers that often serve as a mathematical representation of sets of equations. As a result, matrix operations such as multiplication lie at the heart of a vast collection of applications, from computer graphics to quantum mechanics, and beyond. While we have known the importance of these operations for a long time, finding the most efficient algorithms to carry them out is a very hard problem, and one where very few solutions are known. To tackle this problem, the DeepMind team built a ‘game’ for their machine-learning algorithm to play, in which it seeks out different ways of combining the entries of matrices to make their multiplication more efficient. To contextualize the complexity of this problem, while the number of possible moves at any given point in a chess or Go game is in the hundreds at most, in the game of algorithm discovery there are more than 1,000,000,000,000 (1 trillion) possible moves at each step.
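To give a flavor of what ‘combining the entries in different ways’ means, here is a minimal Python sketch comparing the textbook method for multiplying two 2×2 matrices, which uses 8 multiplications, with the long-known Strassen scheme, which gets the same answer with only 7. This is the classic human-discovered example, not one of the algorithms found by AlphaTensor, and the function names are my own.

```python
def multiply_2x2_textbook(A, B):
    """Textbook 2x2 product: 8 scalar multiplications."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return [[a * e + b * g, a * f + b * h],
            [c * e + d * g, c * f + d * h]]


def multiply_2x2_strassen(A, B):
    """Strassen's 2x2 scheme: 7 scalar multiplications, more additions."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    m1 = (a + d) * (e + h)
    m2 = (c + d) * e
    m3 = a * (f - h)
    m4 = d * (g - e)
    m5 = (a + b) * h
    m6 = (c - a) * (e + f)
    m7 = (b - d) * (g + h)
    # Recombine the seven products into the four entries of the result.
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
textbook_result = multiply_2x2_textbook(A, B)
strassen_result = multiply_2x2_strassen(A, B)
print(textbook_result, strassen_result)
```

Saving one multiplication looks trivial here, but when the same trick is applied recursively to large matrices split into blocks, the savings compound. Searching for recipes of this kind is, loosely speaking, the game the program is asked to play.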
While the complexity of this problem is daunting, the DeepMind team was able to effectively train their new ‘AlphaTensor’ program to find efficient algorithms for a variety of matrix sizes, in some cases rediscovering long-known algorithms, and in certain cases improving upon them. One case of particular note is the multiplication of 4×4 matrices (grids of numbers with 4 rows and 4 columns), where, in modular arithmetic, the AlphaTensor program finds a new algorithm that uses 47 multiplications instead of the 49 required by the so-called ‘Strassen two-level algorithm,’ which had been the best-known method for the past 50 years! On larger matrices, the authors compare the time it takes to calculate the answer and find that the algorithms discovered by AlphaTensor provide a speedup of up to 25% over standard methods on the hardware they tested.
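The arithmetic behind these comparisons can be sketched in a few lines of Python. The helper names below are my own, and the figure of 47 multiplications for the 4×4 case is the one reported in the AlphaTensor paper for modular arithmetic; it is used here only as an input, not derived.

```python
def textbook_multiplications(n):
    """The textbook method uses n * n * n scalar multiplications."""
    return n ** 3


def recursive_multiplications(base_count, base_size, n):
    """Multiplications when a base_size x base_size recipe that needs
    base_count multiplications is applied recursively to n x n matrices."""
    count, size = 1, 1
    while size < n:
        size *= base_size
        count *= base_count
    if size != n:
        raise ValueError("n must be a power of base_size")
    return count


# 4x4 matrices: textbook vs. two-level Strassen vs. the reported 47.
counts_4x4 = (textbook_multiplications(4),
              recursive_multiplications(7, 2, 4),
              recursive_multiplications(47, 4, 4))

# 16x16 matrices: the gap widens as the recipes are applied recursively.
counts_16x16 = (textbook_multiplications(16),
                recursive_multiplications(7, 2, 16),
                recursive_multiplications(47, 4, 16))
print(counts_4x4, counts_16x16)
```

For 4×4 matrices this gives 64, 49, and 47 multiplications respectively, and for 16×16 matrices 4096, 2401, and 2209. Fewer multiplications do not automatically mean a faster program on real hardware, which is why the authors also measured run time directly.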
All in all, this represents the first time (to my knowledge, as well as the authors’) that a machine-learning algorithm has determined a provably (and objectively) better solution to an abstract problem that humans have not been able to solve. While it bears some similarity to the novel playstyles of AlphaZero in chess, in this case the AlphaTensor program was able to concretely reveal new and more efficient ways of performing basic mathematical tasks. Importantly, the AlphaTensor program is not just doing the calculations more effectively than standard methods; it is actually finding better ways of calculating them. As a result, this could mark a turning point in the scientific and computational world, where machine learning ceases to be only a tool for solving a particular task and also becomes a means of determining how to solve it. The far-reaching implications of the AlphaTensor discovery are hard to overstate and extend across virtually all facets of the digital world, beyond the immediate prospect of speeding up scientific discovery and technological improvement.
REFERENCES
[i] A. Fawzi et al., “Discovering faster matrix multiplication algorithms with reinforcement learning,” Nature, 2022.
FOOTNOTES
[1] This is quoted from the Google NGrams tool, which mines all accessible text for particular words and phrases. The trend for prevalence of ‘machine learning’ in English text from 2000-2019 is shown here.
[2] Interestingly, Go is thought to be the longest continually played game in history, with active players since the Zhou dynasty in China (~4th century BCE).
[3] In fact, the number of legal board positions in Go is far larger than the number of atoms in the observable universe.
[4] The AlphaGo documentary contains interesting commentary on the games played, showing one particularly exceptional moment where a clever move by Sedol in the 4th game – which the AlphaGo engine calculated as having a less than 1/10,000 probability of being played by a human opponent – ultimately led to his lone victory in the series.
DISCLAIMER:
This blog and its contents are for informational purposes only. Information relating to investment approaches or individual investments should not be construed as advice or endorsement. Any views expressed in this blog were prepared based upon the information available at the time and are subject to change. All information is subject to possible correction. In no event shall Viewpoint Investment Partners Corporation be liable for any damages arising out of, or in any way connected with, the use or inability to use this blog appropriately.