Archived from groups: comp.ai.games (
More info?)
Moti ezra a écrit :
> Hi All,
>
> I am trying to come up with an Othello program.
> I am using alpha-beta prunning search algorithm.
>
> I suspected some bugs so I striped my software from it's heuristics
> and used instead [number of my discs - Number of opponnent discs] as
> an evaluation function.
This is the maximizer evaluation function, which is known as being one
of the dumbest. A correlation study between the disc difference at any
time and the final score on a game database will show that:
- the correlation is always very tiny, meaning that the final score
depends only a very little on the disc difference in a midgame position.
- the correlation is negative for the first 48 plies, null around ply
48, and positive for the last 12 moves. So maximizing the disc
difference mostly select bad moves during most of the game.
> I have a primitive baseline version that uses the same idea but up to
> depth 1 , that is only looking at current best avilable move.
>
> When I am searching up to depth 1-2 I can win the baseline, but If I
> increase my depth to 5-6, I loose!!! (52 / 12).
Obviously, this is a result from a single game, I don't understand how
you can draw any conclusion from just a single game. To look at what you
want, you need to do some statistics on many games.
> I even tried to multiply the evaluation function with -1 ,(same
> result, but differnt game)
The "minimizer" is not much better; now, you are selecting bad moves at
the end of the game instead of the start of the game.
> Does anyone have an Idea?
Here are some statistical data, to show you what you have to do before
drawing any conclusion.
A tournament has been organized, with 81 "rounds" where each player play
from depth 0 (a random move is chosen) to depth 8. In each round, 200
games were played, with the 10 first moves forced, so that the games are
all different. The 200 different 10 first moves were randomly chosen
from a game database.
Here are the results for different evaluation functions. The result
shown is the percentage of win for the Black side (a draw is considered
as a half win):
Maximizer
White
0 1 2 3 4 5 6 7 8
0 46.50 42.25 62.00 61.50 53.50 59.75 53.50 55.75 47.75
1 59.75 46.25 67.75 70.00 60.75 69.75 58.50 65.25 57.50
B 2 47.25 39.00 53.00 47.50 46.75 48.00 44.75 42.50 43.25
l 3 32.75 29.25 47.50 43.75 56.50 46.75 45.25 45.25 37.50
a 4 51.50 37.50 67.00 47.75 54.00 44.00 44.25 38.25 23.75
c 5 35.50 26.50 49.75 43.25 53.50 42.00 37.50 38.75 31.50
k 6 53.50 45.50 63.00 56.00 61.50 56.50 61.25 38.75 38.00
7 44.25 36.25 54.75 49.75 63.25 42.25 54.25 44.50 43.25
8 59.50 46.00 67.50 56.00 69.75 61.75 62.00 52.25 61.00
The maximizer eval uses the disc difference.
Interpreting this weak eval is quite hard but we can see that:
- all the matchs are relatively close.
- the 1 depth bot is the strongest.
- there is an odd/even effect.
- there is a visible enhancement between depth 2 and 8, so looking
deeper does work, and going deeper than depth 8 will probably bring a
stronger player.
Random eval
White
0 1 2 3 4 5 6 7 8
0 46.50 45.50 50.50 33.25 31.25 25.50 29.50 20.50 18.75
1 44.50 48.50 40.00 27.75 38.00 27.00 22.25 18.75 19.00
B 2 53.50 55.00 46.75 35.25 30.75 26.50 31.25 17.50 19.25
l 3 53.75 54.75 59.50 44.00 36.50 34.50 27.75 21.50 18.00
a 4 65.50 59.50 62.25 52.25 53.50 34.75 42.00 26.00 35.50
c 5 65.25 65.00 67.75 60.00 58.75 42.25 34.25 33.00 30.00
k 6 77.50 74.25 75.00 69.75 58.25 58.25 42.25 40.25 39.50
7 79.25 73.75 77.50 74.25 65.50 59.00 63.75 41.75 39.00
8 83.00 85.50 82.50 77.75 68.00 73.75 61.25 59.00 49.50
The random eval is build from the board hashcode used as a seed to a
random genrator, so the eval is unique to each board, but not correlated
to it.
The strength is clearly enhanced when looking deeper. This is a well
known result obtained when combining alphabeta with a random eval, where
alphabeta choses moves with a greater mobility because of the property
of the max() function on a set of values.
Pattern based eval (from Edax's bot).
White
0 1 2 3 4 5 6 7 8
0 46.50 0.50 0.00 0.00 0.00 0.00 0.00 0.00 0.00
1 100.00 54.50 8.50 3.75 0.50 2.00 1.00 0.00 0.00
B 2 100.00 80.50 47.25 17.75 12.00 3.75 4.25 2.00 0.50
l 3 100.00 91.50 76.50 41.25 19.75 9.25 8.25 4.00 2.75
a 4 100.00 96.50 87.25 71.50 47.00 21.25 18.50 9.00 10.25
c 5 100.00 98.50 91.50 82.00 70.00 48.50 24.75 18.00 10.75
k 6 100.00 99.00 97.00 90.00 76.75 62.75 49.75 33.25 24.00
7 100.00 100.00 97.00 91.50 84.25 76.25 62.50 51.00 29.00
8 100.00 100.00 96.00 95.75 87.25 87.50 69.25 62.50 49.50
With a stronger evaluation function (pattern based à la Logistello), the
effect of the depth becomes obvious.
To me, it is clear that the "deeper the smarter", whatever the
evaluation function, however a good evaluation function helps a lot.
Good luck with your othello program.
--
Richard