Adversarial Search

Evaluation by rollouts – play multiple games to termination from a state s (using a simple, fast rollout policy) and count wins and losses
- For each rollout, repeat until terminal state: Play a move according to a policy, then record result
- Fraction of wins correlates w/ true value of position
- Pick move that gives best outcome by WL ratio or whatever metric
- Allocate rollouts to more promising nodes, more uncertain nodes
Selective search – explore parts of the tree that will help improve the decision at the root, regardless of depth