AlphaGo & Google

Google has been up to a lot in its DeepMind subsidiary. Now, it has given AlphaGo—the artificial intelligence program best known for beating Go world champion Lee Sedol—to China. AlphaGo will train the AIs that may become teachers and doctors, scholars and bureaucrats. AlphaGo, as its name suggests, is meant to be the ultimate master of Go: It can be scaled up and made more powerful than any human mind can ever be in the game. And that’s what the Chinese government wants.

The dance with China

But Google has had some trouble with China in recent years. In the past, Google tried to get into China’s massive market by self-censoring its search results to comply with censorship mandates put forth by the Chinese government. In this case, however, handing over a program that has just made headlines for taking down the world champion of a traditionally Asian game signals something different.

Vision of Google

Google’s hope is that AlphaGo will help shape the direction advanced artificial intelligence takes in China, and help determine how AI research is received there. DeepMind CEO Demis Hassabis said, “We hope it can be a catalyst for some research in China. I hope it can make a great contribution there.”

The tango of AlphaGo

What will people, beyond government ministries, do with AlphaGo? While it may be hard to predict what China will do with the program, there is little question that the country is hungry for predictive tools and AI technology. Chinese consumers are already using mobile payment programs based on facial recognition algorithms, and companies including Alibaba and Tencent have spun off popular AI tools that perform language translation or transcribe speech to text. The tech giants want to provide these services to their users, and the government wants to tap into that technology to help streamline its bureaucracy.

As AI encounters and combines with the real world, something extraordinary happens…

AlphaGo may not be able to make a great contribution on its own, but it will kickstart progress on both the consumer and enterprise sides of many AI applications. Google is optimistic about new developments arising from this collaboration: “If you’re an entrepreneur in China, why not develop these ideas that were kind of being ignored by the US?” asked Hassabis. Let’s hope the result of keeping up with Google is more than just another Xiaomi.

AlphaGo is really just DeepMind’s first public step in the direction of AI research. Think of AlphaGo as machine learning for robotics (a style of work that previously existed only in academia). This kind of approach to studying AI might lend itself to other areas of human activity outside of robotics. In the long run, it might develop the kind of foresight needed to detect disturbances in operating conditions before they become problems. With this kind of technology, the number of accidents, injuries, and deaths that could be considered self-inflicted might be dramatically reduced.

Google’s DeepMind is particularly interested in AI as a useful tool for human physical activity…and this has huge implications for the near future of Google’s products.

What does the AI behind Google DeepMind's AlphaGo have in store?

1. Look-ahead search

This is known as tree search: the nodes of the tree hold board positions and each branch corresponds to a possible move, producing a large number of branches per node. For the method to succeed, it needs a good way of evaluating positions at leaf nodes, and a good way of prioritizing which branches to explore.
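The two ingredients named above can be sketched in a few lines. This is a minimal depth-limited look-ahead over a hand-built toy tree, not AlphaGo's actual search (which uses Monte Carlo tree search with learned networks); `evaluate` and the tree layout are illustrative assumptions.

```python
# Minimal depth-limited look-ahead search over a toy game tree.
# Each position is a dict with a static "score" and optional "children".
# The two ingredients from the text appear explicitly: leaf evaluation
# (`evaluate`) and branch prioritization (sorting children best-first).

def evaluate(position):
    """Heuristic score for a position (here: just the stored number)."""
    return position["score"]

def search(position, depth):
    """Negamax-style look-ahead: best score reachable within `depth` plies."""
    children = position.get("children", [])
    if depth == 0 or not children:
        return evaluate(position)
    # Prioritize promising branches: visit the best static scores first.
    ordered = sorted(children, key=evaluate, reverse=True)
    # Negate recursively: what is good for the opponent is bad for us.
    return max(-search(child, depth - 1) for child in ordered)

# Tiny hand-built tree: root with two candidate moves, each with replies.
tree = {
    "score": 0,
    "children": [
        {"score": 3, "children": [{"score": -1}, {"score": 4}]},
        {"score": 5, "children": [{"score": 2}]},
    ],
}
print(search(tree, 2))
```

In a real Go engine the branching factor is enormous, which is exactly why AlphaGo pairs search with learned evaluation and move-prioritization networks rather than exhaustive enumeration.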

2. Reinforcement learning

The basic principle is to have the algorithm play and constantly update its policy model: it chooses a move based on the model, receives a reward, learns from that reward and adjusts the policy accordingly, then chooses the next move from the updated policy, and so on. When a game ends, the reward is +1 for winning and -1 for losing, and a new game starts automatically. The rest of the time, the reward is simply 0.
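The play–reward–update loop described above can be sketched with a deliberately tiny "game": one move per game, +1 for a win and -1 for a loss, a new game starting after each one. This is a generic REINFORCE-style update on assumed win probabilities, not DeepMind's training setup.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible

def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    exps = [math.exp(p) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def play(action):
    """Toy one-move game: action 0 wins 80% of the time, action 1 wins 20%."""
    win_prob = 0.8 if action == 0 else 0.2
    return 1 if random.random() < win_prob else -1  # +1 win, -1 loss

prefs = [0.0, 0.0]   # policy parameters (one preference per action)
alpha = 0.1          # learning rate

for _ in range(2000):  # each iteration is one game; a new game starts automatically
    probs = softmax(prefs)
    action = 0 if random.random() < probs[0] else 1  # sample a move from the policy
    reward = play(action)                            # +1 / -1 at the end of the game
    # REINFORCE update: raise the chosen action's probability in
    # proportion to the reward received (lower it after a loss).
    for a in range(2):
        grad = (1 - probs[a]) if a == action else -probs[a]
        prefs[a] += alpha * reward * grad

probs = softmax(prefs)
print(round(probs[0], 2))  # the policy ends up strongly preferring the winning action
```

In AlphaGo the same principle operates over full games of Go, with a deep neural network in place of these two scalar preferences, but the reward structure (+1/-1 at the end, 0 in between) is the same one the text describes.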

3. Long-term decision making

Since the process never really stops, the player's objective is to make long-term decisions that result in a "winning" portfolio. You would do this by using tree search to simulate the investment decisions taken by you and other firms, and deep learning to extract useful representations of the world.
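A toy version of this idea: plan over several periods by looking ahead through hypothetical outcomes and choosing the option with the best expected final wealth. The assets, returns, and probabilities below are invented for illustration; this is an expectimax sketch standing in for the tree search the text alludes to, with no learned representations.

```python
# Toy long-horizon planner: each period, choose between a safe and a
# risky asset, maximizing expected final wealth by looking ahead over
# the remaining periods (a tiny expectimax over hypothetical outcomes).

def best_value(wealth, periods_left):
    """Expected final wealth under optimal play from this state."""
    if periods_left == 0:
        return wealth
    # Safe asset: a certain 2% return per period.
    safe = best_value(wealth * 1.02, periods_left - 1)
    # Risky asset: 50% chance of +10%, 50% chance of -5%.
    risky = (0.5 * best_value(wealth * 1.10, periods_left - 1)
             + 0.5 * best_value(wealth * 0.95, periods_left - 1))
    return max(safe, risky)

# Starting from 100.0 with a 3-period horizon.
print(round(best_value(100.0, 3), 2))
```

Here the risky asset's expected return (2.5% per period) beats the safe one's, so the look-ahead always picks it; with real markets, the branching outcomes would come from a learned model of the world rather than two hard-coded probabilities.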
