|
Post by Steve Draper on Aug 9, 2015 5:54:14 GMT -8
I'm not sure why we'd be dis-proportionately fast in Sudokus, but it could be propnet optimization I guess (we do a fair amount of logical transformation to try to reduce the network, though I do not see anything I would expect to be more effective here especially. The propnet IS faster now than the numbers posted on the course as I have made a few improvements since then, but only by tens of percent. OK, then. It's a bit basic. I solve Suduko without partitioning, or any heuristics, by saying that if all my rollouts are scoring nothing, I'll use the rollout depth as the score. That way I'm maximimizing the number of moves I'll stay alive. As it happens that approach solves Sudoku 1 in metagaming, and in Sudoku 2 it manages 8 "good moves" and then spots a solution (as a rollout scores the touchdown). Maybe as my player gets faster it'll get there on Sudoku 3+. Also, unfortunately this approach doesn't work elsewhere as far as I know, so maybe instead I'll need to actually add something cleverer, but not yet. This technique is useful in a number of games, and actually Sancho uses something similar too (though not in Sudoku). We use it in puzzles where statistically the goal value is monotonically increasing (so the longer the game goes the likely the higher the end score will be) and is volatile (the goal values get changed often, implying (with the montonicity) that it will correlate well to game length). This is helpful in several puzzles when turned into a non-random rollout policy - choose moves that avoid termination when there is a choice. Works well in several of the knights tour variants, and also in MaxKnights. Using the monotonicity of goals even more starkly, you can implement a playout policy that always (or in an epsilon greedy fashion) chooses moves that increase the goal value as a playout policy. A variant of this (adding shallow back-tracking to try other paths actually in the playout when terminal states are hit or even when you cannot find a goal-increasing move choice) is how we reliably solve the larger knights tour variants (especially the nasty ones where going 'wrong' does not terminate, and so on). As an FYI, if you want a really hard knight's tour variant to work on, I recommend Stanford's 'multiKnightsTour' (which is actually a pseudo-2-player game, though it factorizes into two independent puzzles). Even if you take that GDL and hack it into a simple puzzle (throw out the second role), the way it is constructed makes it significantly more challenging than most of the other knight's, tour variants. It has several interesting properties: - It's decently large (10X10)
- Does not terminate until 100 moves have been played regardless of how much you repeat yourself
- Score goes up on the first revisit-a-cell move, but then not the FOLLOWING move as it scores as you LEAVE a cell!
- Score map contains an inconsistent 'bump' in the middle which can be awkward
|
|
|
Post by Lars Ericson on Aug 15, 2015 18:07:35 GMT -8
The following Python program solves Sudoku 1 in 6 passes. It is very naive. It simply forms the initial matrix for the game, and then fills in each blank with the set of possible values for that blank based on the rules of Sudoko. If the set happens to be a singleton set, it fills in the solution. Then repeat until there are no more sets. I wrote this program just so I could understand Sudoku better. For example after writing it, it occurred to me that the number of moves for a GGP player is equal to the number of blanks. Not an earth-shattering realization, but I just hadn't thought about the game. Next thing I want to do is code this solution as a sequence of moves, add it to the game KIF, translate it into Clingo ASP, and verify in Clingo that I have a correct 50-move solution (which I know already but this is warm-up). That will tell me that I have translated all the terms correctly from KIF to LP. Then I can take out the solution, leave the number of moves and the correctly translated initial GDL in, and Clingo should find the same solution (though it may have to work much harder than my Python program, which was written with an understanding of the game's rules). So here is a run of the program: game steps 50
initial
[[9 7 0 3 0 4 0 6 5]
[0 2 0 5 0 6 0 8 0]
[0 0 0 0 0 0 0 0 0]
[0 0 5 8 0 2 9 0 0]
[0 0 2 0 4 0 3 0 0]
[0 0 8 7 0 5 1 0 0]
[0 0 0 0 0 0 0 0 0]
[0 6 0 2 0 8 0 3 0]
[8 4 0 1 0 9 0 2 7]]
update
[[9 7 1 3 {8, 1, 2} 4 2 6 5]
[{1, 3, 4} 2 {1, 3, 4} 5 {1, 9, 7} 6 {4, 7} 8 {1, 9, 3, 4}]
[{1, 3, 4, 5, 6} {8, 1, 3, 5} {1, 3, 4, 6} 9 {8, 1, 2, 9, 7} {1, 7} {2, 4, 7} {1, 9, 4, 7} {1, 2, 3, 4, 9}]
[{1, 3, 4, 6, 7} {1, 3} 5 8 {1, 3, 6} 2 9 {4, 7} {4, 6}]
[{1, 6, 7} {1, 9} 2 {9, 6} 4 1 3 {5, 7} {8, 6}]
[{3, 4, 6} {9, 3} 8 7 {9, 3, 6} 5 1 4 {2, 4, 6}]
[{1, 2, 3, 5, 7} {1, 9, 3, 5} {1, 9, 3, 7} {4, 6} {3, 5, 6, 7} {3, 7} {8, 4, 5, 6} {1, 9, 4, 5} {8, 1, 9, 4, 6}]
[{1, 5, 7} 6 {1, 9, 7} 2 {5, 7} 8 {4, 5} 3 {1, 9, 4}]
[8 4 3 1 {3, 5, 6} 9 {5, 6} 2 7]]
update
[[9 7 1 3 8 4 2 6 5]
[{3, 4} 2 4 5 {1, 7} 6 {4, 7} 8 {1, 9, 3, 4}]
[{3, 4, 5, 6} {8, 3, 5} {4, 6} 9 {8, 1, 2, 7} 7 {4, 7} {1, 7} {1, 3, 4}]
[{1, 3, 4, 6, 7} {1, 3} 5 8 {3, 6} 2 9 7 6]
[{6, 7} 9 2 6 4 1 3 {5, 7} {8, 6}]
[{3, 6} {9, 3} 8 7 {9, 3, 6} 5 1 4 {2, 6}]
[{1, 2, 5, 7} {1, 9, 5} {9, 7} {4, 6} {3, 5, 6, 7} {3, 7} {8, 4, 5, 6} {1, 9, 5} {8, 1, 9, 4, 6}]
[{1, 5, 7} 6 {9, 7} 2 {5, 7} 8 {4, 5} 3 {1, 9, 4}]
[8 4 3 1 {5, 6} 9 {5, 6} 2 7]]
update
[[9 7 1 3 8 4 2 6 5]
[3 2 4 5 1 6 7 8 {1, 9, 3}]
[{3, 5, 6} {8, 3, 5} 6 9 {1, 2} 7 4 1 {1, 3, 4}]
[{1, 3, 4} {1, 3} 5 8 3 2 9 7 6]
[7 9 2 6 4 1 3 5 8]
[{3, 6} 3 8 7 {9, 3} 5 1 4 2]
[{1, 2, 5, 7} {1, 5} {9, 7} 4 {3, 5, 6, 7} 3 {8, 4, 5, 6} {1, 9, 5} {8, 1, 9, 4}]
[{1, 5, 7} 6 {9, 7} 2 {5, 7} 8 {4, 5} 3 {1, 9, 4}]
[8 4 3 1 {5, 6} 9 {5, 6} 2 7]]
update
[[9 7 1 3 8 4 2 6 5]
[3 2 4 5 1 6 7 8 9]
[5 {8, 5} 6 9 2 7 4 1 3]
[{1, 4} 1 5 8 3 2 9 7 6]
[7 9 2 6 4 1 3 5 8]
[6 3 8 7 9 5 1 4 2]
[{1, 2, 5} {1, 5} {9, 7} 4 {5, 6, 7} 3 {8, 5, 6} 9 {1, 9}]
[{1, 5} 6 {9, 7} 2 {5, 7} 8 5 3 {1, 9, 4}]
[8 4 3 1 {5, 6} 9 {5, 6} 2 7]]
update
[[9 7 1 3 8 4 2 6 5]
[3 2 4 5 1 6 7 8 9]
[5 8 6 9 2 7 4 1 3]
[4 1 5 8 3 2 9 7 6]
[7 9 2 6 4 1 3 5 8]
[6 3 8 7 9 5 1 4 2]
[{1, 2} 5 7 4 {5, 6, 7} 3 {8, 6} 9 1]
[1 6 {9, 7} 2 7 8 5 3 {1, 4}]
[8 4 3 1 {5, 6} 9 6 2 7]]
update
[[9 7 1 3 8 4 2 6 5]
[3 2 4 5 1 6 7 8 9]
[5 8 6 9 2 7 4 1 3]
[4 1 5 8 3 2 9 7 6]
[7 9 2 6 4 1 3 5 8]
[6 3 8 7 9 5 1 4 2]
[2 5 7 4 6 3 8 9 1]
[1 6 9 2 7 8 5 3 4]
[8 4 3 1 5 9 6 2 7]]
solved True
steps 6
moves
(does robot (mark 1 1 1 3 1))
(does robot (mark 1 1 3 1 2))
(does robot (mark 1 3 2 1 9))
(does robot (mark 2 2 2 3 1))
(does robot (mark 2 3 3 2 4))
(does robot (mark 3 3 1 3 3))
(does robot (mark 1 1 2 2 8))
(does robot (mark 1 2 1 3 4))
(does robot (mark 1 3 2 3 7))
(does robot (mark 2 1 3 2 7))
(does robot (mark 2 1 3 3 6))
(does robot (mark 2 2 1 2 9))
(does robot (mark 2 2 2 1 6))
(does robot (mark 1 2 1 1 3))
(does robot (mark 1 2 2 2 1))
(does robot (mark 1 2 3 1 7))
(does robot (mark 1 3 1 3 6))
(does robot (mark 1 3 3 1 4))
(does robot (mark 1 3 3 2 1))
(does robot (mark 2 1 2 2 3))
(does robot (mark 2 2 1 1 7))
(does robot (mark 2 2 3 2 5))
(does robot (mark 2 2 3 3 8))
(does robot (mark 2 3 1 2 3))
(does robot (mark 2 3 3 3 2))
(does robot (mark 3 1 2 1 4))
(does robot (mark 3 1 2 3 3))
(does robot (mark 1 2 3 3 9))
(does robot (mark 1 3 1 1 5))
(does robot (mark 1 3 2 2 2))
(does robot (mark 1 3 3 3 3))
(does robot (mark 2 1 1 2 1))
(does robot (mark 2 3 1 1 6))
(does robot (mark 2 3 2 2 9))
(does robot (mark 3 1 3 2 9))
(does robot (mark 3 2 3 1 5))
(does robot (mark 1 3 1 2 8))
(does robot (mark 2 1 1 1 4))
(does robot (mark 3 1 1 2 5))
(does robot (mark 3 1 1 3 7))
(does robot (mark 3 1 3 3 1))
(does robot (mark 3 2 1 1 1))
(does robot (mark 3 2 2 2 7))
(does robot (mark 3 3 3 1 6))
(does robot (mark 3 1 1 1 2))
(does robot (mark 3 1 2 2 6))
(does robot (mark 3 1 3 1 8))
(does robot (mark 3 2 1 3 9))
(does robot (mark 3 2 3 3 4))
(does robot (mark 3 3 2 2 5)) Here is the code: from __future__ import division
import sys
import numpy as np
### First a LISP parser copied off the Internet from Peter Norvig: ###
Symbol = str # A Lisp Symbol is implemented as a Python str
List = list # A Lisp List is implemented as a Python list
Number = (int, float) # A Lisp Number is implemented as a Python int or float
def parse(program):
"Read a Scheme expression from a string."
return read_from_tokens(tokenize(program))
def tokenize(s):
"Convert a string into a list of tokens."
return s.replace('(',' ( ').replace(')',' ) ').split()
def atom(token):
"Numbers become numbers; every other token is a symbol."
try:
return int(token)
except ValueError:
try:
return float(token)
except ValueError:
return Symbol(token)
def read_from_tokens(tokens):
"Read an expression from a sequence of tokens."
if len(tokens) == 0:
raise SyntaxError('unexpected EOF while reading')
token = tokens.pop(0)
if '(' == token:
L = []
while tokens[0] != ')':
L.append(read_from_tokens(tokens))
tokens.pop(0) # pop off ')'
return L
elif ')' == token:
raise SyntaxError('unexpected )')
else:
return atom(token)
### End LISP parser ###
def parse_file(kf):
with open (kf, "r") as myfile:
lisp = myfile.readlines()
program = '('+''.join([x for x in lisp if x[0] != ';']).replace('\n', '')+')'
gdl = parse(program)
return gdl
def twod(X):
[a1,a2,a3,a4]=X
row=(a1-1)*3+(a2-1)
col=(a3-1)*3+(a4-1)
return (row,col)
def fourd(row,col):
a1=int(row/3)+1
a2=(row%3)+1
a3=int(col/3)+1
a4=(col%3)+1
return (a1,a2,a3,a4)
def empty_game():
return np.zeros((9,9),dtype=object)
def initialize():
kf = 'sudoku.kif'
M=empty_game()
gdl = parse_file(kf)
init = [(x[1][1:-1],x[1][-1]) for x in gdl if x[0] == 'init']
for (x,mark) in init:
(a,b)=twod(x)
w=list(fourd(a,b))
if x != w:
print("fail", (x,w))
quit()
init = [(twod(x),mark) for (x,mark) in init]
for ((x,y),m) in init:
M[x,y]=0 if m == 'b' else int(m)
return M
def subgame_bounds(x,y):
return (3*int(x/3),3*int(x/3)+3,3*int(y/3),3*int(y/3)+3)
def determined(M):
L=[x for x in list(M.flatten()) if type(x)==type(0) and x != 0]
return set(L)
moves = ''
def update(M):
global moves
N=empty_game()
for x in range(9):
for y in range(9):
if M[x,y]==0 or type(M[x,y])==type(set()):
(x1,x2,y1,y2)=subgame_bounds(x,y)
subM = M[x1:x2,y1:y2]
Stile=determined(subM)
Srow=determined(M[x])
Scol=determined(M[:,y])
filled = Stile.union(Srow.union(Scol))
S=set(range(1,10)).difference(filled)
if len(S)==0:
print("UNSOLVABLE")
print("(x,y)",(x,y))
print("subM", subM)
print("Stile", Stile)
print("Srow", Srow)
print("Scol", Scol)
quit()
if len(S)==1:
S=list(S)[0]
(i,j,k,l)=fourd(x,y)
moves = moves + "(does robot (mark %d %d %d %d %d))\n" % (i,j,k,l,S)
N[x,y]=S
else:
N[x,y]=M[x,y]
return N
if __name__ == '__main__':
np.set_printoptions(linewidth=160)
M=initialize()
blanks=len([x for x in M.flatten() if x==0])
print("game steps", blanks)
solved = False
print("initial")
print(M)
for i in range(10):
M1=update(M)
solved = np.array_equal(M,M1)
M=M1
if solved:
break
else:
print("update")
print(M1)
print("solved", solved)
print("steps", i)
print("moves")
print(moves)
|
|
|
Post by steadyeddie on Nov 6, 2015 15:37:15 GMT -8
OK, then. It's a bit basic. I solve Suduko without partitioning, or any heuristics, by saying that if all my rollouts are scoring nothing, I'll use the rollout depth as the score. That way I'm maximimizing the number of moves I'll stay alive. As it happens that approach solves Sudoku 1 in metagaming, and in Sudoku 2 it manages 8 "good moves" and then spots a solution (as a rollout scores the touchdown). Maybe as my player gets faster it'll get there on Sudoku 3+. Also, unfortunately this approach doesn't work elsewhere as far as I know, so maybe instead I'll need to actually add something cleverer, but not yet. An update on this. While attempting to solve Fotushiki 6x6, it appears SteadyEddie is suddenly good at Suduko. And Max Knights. The big difference is that I have added something that appears to be similar to MAST. Once I added that and cranked up the "leverage", all the sudukos, even 6E, became possible. I imagine that is the difference between where I was at and galvanise (who can do the Sodukus), assuming galvanise has no further tricks. On a point of interest is I also added a little extra logic for puzzles that are "disappearing move" puzzles that chooses the move with the most score difference to a move it eliminates. This even solves Futoshiki 6x6 sometimes, depending on config options I have become bored fiddling with (I think perhaps a 0% record on simultaneous move games, where StreadyEddie even manages to lose to Random, is probably a better use of my time).
|
|
dpoly
New Member
Posts: 16
|
Post by dpoly on Apr 17, 2017 20:12:55 GMT -8
A typical game has 6,670,903,752,021,072,936,959 incorrect solutions. Given the size of the space, it is extremely unlikely that MCTS will happen upon the right solution, i.e the chances are 1/10^21. Most Sudokus are easily solved given one sub-goal: make any valid move. There are no blind alleys, all sub-goals lead to the solution. Given 81 cells and N starting knowns, there are at most (81-N)*9 possible first moves, of which (usually) several will be valid. So the problem is (a) define 'valid move' to set any cell as known (b) choose a strategy that pursues any move as a sub-goal. For the remaining minority (no valid move but no solution yet), a brute-force search is required. In my experience, that only ever requires searching a few hundred positions. I'm not a GDL expert, but I know I can do this on other platforms without too much difficulty.
|
|