Monte-Carlo Search Algorithms

Daniel Bjorge and John Schaeffer

Problem

Important stuff

Where it needs to be recognized

Solution

Search Algorithms

Which is faster?

Best early prediction?

Best β parameter?

Comparing Search Algorithms

Option 1: Computer Go

State expansion: Slow

Termination detection: Slower

Heuristics: Slow

Option 2: Artificial Trees

State expansion: Fast

Termination detection: Faster

Heuristics: 010110

Existing Frameworks

P-Game (Kocsis 2006)

Artificial

Simple

Fuego (Enzenberger and Müller)

Generic

Fast

Scalable

Existing Frameworks

P-Game (Kocsis 2006)

Small Scale

Fuego (Enzenberger and Müller)

Complex

Only Go Implemented

Gomba Search Framework (Go Multi-Armed Bandit Analysis)

Scalable

Multiple Metrics

Tweakable Trees

Swappable Search Strategies

Delicious in paprikash!

Eager Game Tree Generation

Tree Generator

Memory

Search Algorithm

(Small trees)

Eager Game Tree Generation

Tree Generator

Memory

(Big trees)

Lazy Game Tree Generation

Tree Generator

Memory

Search Algorithm

(Any trees)
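The eager and lazy approaches above can be contrasted with a minimal sketch. It assumes each node is identified by its path from the root and that its value is derived deterministically from a seed, so a lazy generator never has to store the tree. The names `node_value`, `children`, and `eager_tree`, and the value range, are hypothetical and not taken from Gomba.

```python
import hashlib
from typing import Dict, Iterator, Tuple

# A node is identified by the child indices taken from the root (assumption).
Path = Tuple[int, ...]


def node_value(seed: int, path: Path) -> int:
    """Lazy: recompute a node's value in [-9, 9] on demand from seed and path."""
    digest = hashlib.sha256(f"{seed}:{path}".encode()).digest()
    return digest[0] % 19 - 9


def children(path: Path, branching: int) -> Iterator[Path]:
    """Lazy: yield child paths one at a time without storing them."""
    for i in range(branching):
        yield path + (i,)


def eager_tree(seed: int, branching: int, depth: int) -> Dict[Path, int]:
    """Eager: materialize every node up front; memory grows as branching**depth."""
    tree: Dict[Path, int] = {}
    frontier = [()]
    for _ in range(depth + 1):
        for path in frontier:
            tree[path] = node_value(seed, path)
        frontier = [c for p in frontier for c in children(p, branching)]
    return tree
```

Because the lazy functions recompute the same value for the same path, a search algorithm can revisit any node of an arbitrarily large tree while only the visited part occupies memory.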

Minimax Values

7 0 1 -4 9 6 3 4 -9

Maximizing

Minimizing

Minimax Values

7 0 1 -4 9 6 3 4 -9

Maximizing

Minimizing

Minimax Values

7 0 1 -4 9 6 3 4 -9

$O(b^{d/2})$

Maximizing

Minimizing
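A minimal sketch of how a minimax value is computed from leaf values. It assumes the nine leaves shown above form a depth-2 tree with branching factor 3, a maximizing root, and minimizing children; the slides do not confirm that shape, so the layout is an illustration only.

```python
from typing import List, Union

# A tree is either a leaf value or a list of subtrees (assumed representation).
Tree = Union[int, List["Tree"]]


def minimax(node: Tree, maximizing: bool = True) -> int:
    """Return the minimax value of a node, alternating max and min by level."""
    if isinstance(node, int):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


leaves = [7, 0, 1, -4, 9, 6, 3, 4, -9]
# Group the leaves into three minimizing nodes of three leaves each (assumption).
tree = [leaves[i:i + 3] for i in range(0, len(leaves), 3)]
root_value = minimax(tree)  # max(min(7,0,1), min(-4,9,6), min(3,4,-9)) = 0
```

Under that assumed layout the minimizing nodes evaluate to 0, -4, and -9, so the maximizing root takes the value 0.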

Downward Minimax Evaluation

7 0 1 4 9 6 3 4 0

Example

0

Maximizing

Minimizing

Difficulty

Minimax

Optimal

White to play

Easier

Quantified Difficulty

0

Maximizing

Minimizing

Application to Search Algorithms

Many-Armed Bandit Solutions

10837...

0100...

1419-27-13...

1222...

-12-4-2-8…

UCT: Upper Confidence Bound for Trees

mal arm

(Kocsis and Szepesvári, 2006)
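UCT treats the choice of child at each tree node as a bandit problem and picks the child with the highest upper confidence bound. A minimal sketch of that selection step, assuming the standard UCB1 index; the function names and the default exploration constant `c` are illustrative, not taken from the slides.

```python
import math
from typing import Sequence


def ucb1_score(mean: float, visits: int, parent_visits: int,
               c: float = math.sqrt(2)) -> float:
    """UCB1 index: average reward plus an exploration bonus."""
    if visits == 0:
        return math.inf  # plain UCT tries every unvisited child first
    return mean + c * math.sqrt(math.log(parent_visits) / visits)


def select_child(means: Sequence[float], visits: Sequence[int],
                 c: float = math.sqrt(2)) -> int:
    """Return the index of the child with the highest UCB1 index."""
    parent_visits = sum(visits)
    scores = [ucb1_score(m, n, parent_visits, c) for m, n in zip(means, visits)]
    return max(range(len(scores)), key=scores.__getitem__)
```

The bonus shrinks as a child is visited more often, so rarely visited children are revisited occasionally even when their average reward looks poor.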

UCT-FPU: UCT with First Play Urgency

mal arm

(Hartland et al., 2006)
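First Play Urgency changes only how unvisited children are scored: instead of an infinite index, they receive a fixed urgency value, so a well-performing visited child can be exploited before every sibling has been tried. A minimal sketch under that reading; the parameter name `fpu` and its default value are assumptions.

```python
import math
from typing import Sequence


def fpu_score(mean: float, visits: int, parent_visits: int,
              fpu: float = 1.0, c: float = math.sqrt(2)) -> float:
    """UCB1 index, except unvisited children get the fixed urgency value."""
    if visits == 0:
        return fpu
    return mean + c * math.sqrt(math.log(parent_visits) / visits)


def select_child_fpu(means: Sequence[float], visits: Sequence[int],
                     fpu: float = 1.0) -> int:
    """Return the index of the child with the highest FPU-adjusted index."""
    parent_visits = sum(visits)
    scores = [fpu_score(m, n, parent_visits, fpu) for m, n in zip(means, visits)]
    return max(range(len(scores)), key=scores.__getitem__)
```

A low `fpu` favors early exploitation of children already visited, while a very high `fpu` behaves like plain UCT.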

Infinitely-Armed Bandit Solutions

10837...

-12-4-2-8…

Infinite-Armed Bandit Solutions: Adapting to Trees

K-Failure … for trees

UCB-V (∞), UCT-V (∞)

UCB-V (∞) AIR, UCT-V (∞) AIR

(Wang, Audibert, and Munos, 2006)
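UCB-V extends the UCB index with the empirical variance of each arm's rewards. A rough sketch of the finite-arm UCB-V index as it is usually stated, with rewards assumed to lie in [0, b]; the constants `zeta` and `c` are assumed defaults, and the infinite-arm and AIR variants named above are not covered here.

```python
import math


def ucb_v_score(mean: float, variance: float, pulls: int, t: int,
                b: float = 1.0, zeta: float = 1.2, c: float = 1.0) -> float:
    """UCB-V index: mean + variance-scaled bonus + range-scaled correction."""
    if pulls == 0:
        return math.inf  # an arm that was never pulled is tried first
    e = zeta * math.log(t)  # exploration function, growing with total pulls t
    return mean + math.sqrt(2.0 * variance * e / pulls) + c * 3.0 * b * e / pulls
```

An arm whose rewards vary little gets a smaller bonus than a noisy arm with the same mean and pull count, which is the practical difference from UCB1.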

K-Failure

Random

UCT-FPU

1-Failure

3-Failure

10-Failure

mal arm
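A minimal sketch of the K-Failure strategy for an unlimited supply of arms, assuming Bernoulli rewards: keep pulling the current arm until it has failed k times, then discard it for a fresh arm and never return. Drawing each new arm's success probability uniformly at random is an assumption made for illustration.

```python
import random
from typing import Tuple


def k_failure(k: int, budget: int, rng: random.Random) -> Tuple[int, int]:
    """Run K-Failure for `budget` pulls; return (total reward, arms tried)."""
    total_reward = 0
    arms_tried = 0
    pulls = 0
    while pulls < budget:
        p = rng.random()  # success probability of a fresh arm (assumed uniform)
        arms_tried += 1
        failures = 0
        # Stay on this arm until it has failed k times or the budget runs out.
        while failures < k and pulls < budget:
            pulls += 1
            if rng.random() < p:
                total_reward += 1
            else:
                failures += 1
    return total_reward, arms_tried
```

For example, `k_failure(3, 1000, random.Random(0))` simulates the 3-Failure strategy for 1000 pulls with a fixed seed; larger k gives each arm more chances before it is abandoned.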

UCT-V (∞)

UCT-V (∞) with AIR

UCT-V (∞) with AIR: Gomba

mal arm

In Fuego

Fuego vs GnuGo, 40s moves

Search Algorithm: UCT; UCT-V; UCT-V (∞), β=.1; UCT-V (∞), β=.2

Upper 95% Confidence Bound; Lower 95% Confidence Bound

Wrapping Up

Gomba
• Fast
• Scalable
• Reasonably accurate
• Simple
• Interchangeable

Infinite Bandit-Based Solutions
• Improvement

Questions?

Thank you very much!

Advisors
• Kocsis Levente
• Sárközy Gábor
• Stanley Selkow

SZTAKI Info Lab
• Benczúr András
• Recski Gábor
• Zséder Attila
• Kornai András
• Erdélyi Miklós

Fuego Development Team
