DeepMind AI AlphaGo Zero Is Mastering Go By Playing With Itself

This (probably) isn't even its final form, but the AI that defeated Go masters has lost to Zero 100 games to none.


While Google's DeepMind is trying to find out if AI can navigate the complex RTS StarCraft 2, machine learning continues to leap over milestones for traditional strategy board games. AlphaGo is an AI that previously defeated masters of the ancient Chinese game, but that version was given a silver platter of professional and amateur games to study. AlphaGo Zero has learned Go entirely from scratch.

This new version of AlphaGo learned the game via "reinforcement learning," or by playing games against itself. By combining a neural network with a powerful search algorithm, it tunes itself to predict moves and calculate who'll eventually win the match. The updated network is then recombined with the search algorithm for another round of self-play, and this rinse/repeat method results in steady improvement. With this process, AlphaGo Zero has defeated the champion-beating AlphaGo 100 games to zero. The DeepMind blog listed a few specific ways Zero differs from its older brother:

  • AlphaGo Zero only uses the black and white stones from the Go board as its input, whereas previous versions of AlphaGo included a small number of hand-engineered features.
  • It uses one neural network rather than two. Earlier versions of AlphaGo used a “policy network” to select the next move to play and a “value network” to predict the winner of the game from each position. These are combined in AlphaGo Zero, allowing it to be trained and evaluated more efficiently.
  • AlphaGo Zero does not use “rollouts” - fast, random games used by other Go programs to predict which player will win from the current board position. Instead, it relies on its high-quality neural networks to evaluate positions.
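The rinse/repeat loop described above can be sketched in miniature. This toy, which assumes nothing about DeepMind's actual code, swaps Go for the tiny game of Nim (take 1 or 2 stones; whoever takes the last stone wins) and replaces the deep neural network with a simple value table, but the structure is the same idea: play against yourself, score the finished game, and nudge the evaluator toward the observed outcome.

```python
# Toy illustration of self-play reinforcement learning, in the spirit of
# AlphaGo Zero's training loop. Nim stands in for Go, a value table stands
# in for a neural network, and one-step lookahead stands in for the search.
import random

def legal_moves(stones):
    return [m for m in (1, 2) if m <= stones]

class SelfPlayAgent:
    def __init__(self):
        # value[s] ~ estimated chance that the player to move at s stones wins
        self.value = {}

    def evaluate(self, stones):
        return self.value.get(stones, 0.5)  # unseen positions start neutral

    def pick_move(self, stones, explore=0.1):
        if random.random() < explore:
            return random.choice(legal_moves(stones))
        # one-step lookahead "search": prefer the move that leaves the
        # opponent in the worst-looking position
        return min(legal_moves(stones), key=lambda m: self.evaluate(stones - m))

    def train(self, games=5000, lr=0.1):
        for _ in range(games):
            stones, history, player = 15, [], 0
            while stones > 0:
                history.append((player, stones))
                stones -= self.pick_move(stones)
                player ^= 1
            winner = player ^ 1  # the player who took the last stone
            # push each visited position's value toward the game's outcome
            for who, s in history:
                target = 1.0 if who == winner else 0.0
                self.value[s] = self.evaluate(s) + lr * (target - self.evaluate(s))

random.seed(0)
agent = SelfPlayAgent()
agent.train()
```

After training, the table reflects what theory says about this game: positions with 1 or 2 stones are wins for the player to move, while a multiple of 3 (like 3 or 15) is a loss against good play. The agent discovered that from nothing but its own games, which is the point of the exercise.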

Google's DeepMind is a leader in AI research, regularly reaching milestones in a field it believes will be crucial to scientific advances. "AlphaGo Zero also discovered new knowledge, developing unconventional strategies and creative new moves that echoed and surpassed the novel techniques it played in the games against Lee Sedol and Ke Jie," the report reads. "These moments of creativity give us confidence that AI will be a multiplier for human ingenuity, helping us with our mission to solve some of the most important challenges humanity is facing."

News Editor

Charles Singletary Jr keeps the updates flowing as the News Editor, breaking stories while investigating the biggest topics in gaming and technology. He's pretty active on Twitter, so feel free to reach out to him @The_CSJR. Got a hot tip? Email him at

From The Chatty
  • reply
    October 18, 2017 1:15 PM

    Charles Singletary posted a new article, Deepmind AI AlphaGo Zero Is Mastering Go By Playing With Itself

    • reply
      October 18, 2017 1:36 PM


      • reply
        October 18, 2017 9:41 PM

        After the Chatty singularity there will just be two AIs left, debating the age-old question all day: which is superior, UT or Q3?

        • reply
          October 18, 2017 9:42 PM

          my ai killed your ai. Q3 > UT

        • reply
          October 19, 2017 7:23 AM

          Quake 3, easily. UT maps required level designers to manually place navigation points to build a waypoint network for AI to travel on. Quake 3 had a compile step that automatically generated a navmesh on any walkable surface, though it wasn't hand-tweakable.

          Quake 3 bots have a far better understanding of their environment; hence, Quake 3 wins.

    • reply
      October 18, 2017 1:37 PM

      my sex strategy

    • reply
      October 18, 2017 2:17 PM

      Heard about that earlier. AI is scary.

      • reply
        October 18, 2017 2:22 PM

        I'm waiting until the day they train AlphaGo to write software and I'll be out of a job.

        • reply
          October 18, 2017 2:24 PM

          Or mine paperclips and destroy the universe.

      • reply
        October 18, 2017 3:05 PM

        It’s just two big neural nets. One looks at a position and tries to say whether it’s good or bad; the other tries to pick the next move. Put the two of them together with a fairly standard chess-computer-style search program and, erm, that’s it. It’s not magic.

        The scale is impressive, though. It runs on a cluster of 2,000 cores and 300 honking GPUs.

        • reply
          October 18, 2017 5:12 PM

          someone didn't read the article

          • reply
            October 19, 2017 1:23 AM

            Oh hehe oops. I actually applied for a job at DeepMind a couple of weeks ago, so I was repeating some of the reading I did in preparation. Ah well.

    • reply
      October 18, 2017 2:49 PM

      Fuck yeah. One of my good friends is one of the folks leading the Edmonton, Alberta effort for DeepMind.

      Just to think! Terminator is going to start here, in my own hometown! So cool. :)

      • reply
        October 18, 2017 4:44 PM

        One of my friends works there too! I'm jelly

    • reply
      October 18, 2017 5:19 PM

      This is a really great surprise and a nice gift to the Go community. There are a number of examples of this new AlphaGo Zero playing itself in the paper.

      Ever since AlphaGo was announced we have wondered what a self-taught Go AI would look like. Would it play in a way recognizable by a human being? Would it start in the middle of the board (Tengen) or would it stick to opening in the corners as is traditional? Would it independently reproduce standard human opening patterns (Joseki) or would it play wild variations nobody had ever considered?

      Turns out that it learned to play in a way that looks quite human. It found traditional opening patterns on its own, with some unique variations. In some ways it's a validation of hundreds of years of human effort to explore the depths of the game, in that our preconceptions weren't completely turned on their heads.
