I’m conceptualizing a solver for a variant of sudoku called multi-sudoku, where multiple boards overlap like so:
If I understand the game correctly, you must solve each grid in such a way that the overlap between any two or more grids is part of the solution for each.
I’m unsure as to how I should be thinking about this. Anybody got any hints/conceptual clues? Additionally, if any topics in artificial intelligence come to mind, I’d like to hear those too.
2
Answers
Constraint programming (CP)
Sudoku is a typical constraint programming problem. You have a set of variables (the fields in the grid) each with a domain (here the digits
0
to9
) and a set of constraints over these variables (the fact that a number occurs only once in a row, column, block,…).A generic way to solve constraint programming problems is arc consistency (AC): you propagate constraints. By variables that are (partly) filled in, you can reduce the domain of the remaining variables, etc. Finally if propagation no longer can make the domains smaller, you have to make a choice.
With a choice, you select a value for a certain variable. A good strategy is to select a variable with a small amount of possible values left. Next you propagate again and possibly make another choice and so on.
It is possible that your program finds out that a choice was wrong: it makes the domain of one or more variables empty. In that case, you backtrack: you undo a choice you’ve made earlier (as well as the propagation that was done after that choice) and pick another value.
This answer evidently does not aim to provide an in-depth overview of the topic, but the Wikipedia page can give a better overview and pointers to more information.
There are constraint programming solvers like ECLiPSe (not the IDE), MiniZinc, etc. where one can simply define the variables, domains and constraints.
Solving the problem with ECLiPSe
On the ECLiPSe website, you can find a model for sudoku. Given you read some documentation about ECLiPSe, you can turn this file into a model for multi-sudoku. I’ve made some small modifications resulting in the following quick-and-dirty solution:
I “borrowed” the model of the sudoku from Joachim Schimpf, so credits to him. Furthermore note that this answer does not advice to use ECLiPSe over another tool. I’m simply more familiar with the Prolog toolchain when it comes to constraint programming. But if you are more into C++, Gecode will do the trick with approximately the same (or even better) performance.
which generates the output:
It took my machine approximately 0.11 seconds. Furthermore there are 60 valid solutions in total.
The last two “matrices” show the solution for the two Sudoku’s. As you can see (I haven’t checked it fully), they share a block (the same output), and all sudoku constraints are valid. A more convenient representation of the solution is shown below:
I don’t know about a constraint programming library in Python, nor do I know of a port of ECLiPSe to Python. But my experience is that all modern programming languages have such tool.
The advantage of using a constraint programming tool like ECLiPSe, Gecode,… is first of all that you only have to specify your problem, how it is solved is something you don’t have to worry. Furthermore such libraries implement 30 years of research into constraint programming: they are extremely optimized, exploit constraints and structures in a way most people cannot imagine and are less likely to contain bugs (than a custom made algorithm). Furthermore if new strategies, algorithms,… are found, an update of ECLiPSe will result in faster processing of the model.
A disadvantage is that some hard problems can still not be solved using constraint programming: the search space is simply too large, the constraints are that complex that the domains are not reduced to small sets, and there is not enough processing power to make enough choices in order to find a valid solution. Another disadvantage is that it is not always easy to specify a problem: although programmers aim to design good constraints, there are always complex problems where there are no easy-to-use constraints defined for.
Other techniques
Evidently there are other AI techniques to solve problems. A technique that is commonly used to tackle hard search and optimization problems is evolutionary computing: one starts by filling in the sudoku allowing some values to be wrong, then at each time step they aim to fix one or more fields. Sometimes they introduce new errors in order to find a valid solution eventually.
A different approach would be to use a genetic algorithm. Which is based on biological evolution. Genetic algorithms are part of evolutionary algorithms and therefore share the analogy “fitness-function”. A valid solution is found via recombination, selection and mutation.
The basic concept is to initialize a number of solutions “individuals” which consist of “chromosomes” by random (Solutions can be wrong of course). Define a “fitness-function” which evaluates the quality of every individual within the current “generation”. Take with a higher probability the individuals with better fitness to recombine them into a “child” solution and mutate, with a low probability, some individuals within the new generation. Repeat until some criteria (max_iteration_count, valid_solution_found) is true.
To fit a genetic algorithm to your problem. You could do something like the following. Create for every 3×3 or 9×9 sudoku a local correct solution. These are the chromosomes. Put them in one data structure. Thats an individual. Create a bunch of them to form a generation. Evaluate every individual by the constraints its violating. Calculate corresponding probability. Recombine the individuals, mutate some chromosomes, repeat.
You could see it as combination of local and global optimization. Since you would need some kind of greedy algorithm (or whatever you want to use to solve the local sudoku) to find a local, and the GA to find the overall global optima. I think it would not make sense to only use the GA and try to solve this Problem. I implemented a GA on a combinatorial Problem recently and without local optimization the results were, most of the time, quite terrible.
The good thing about GA though, is that they make large steps at the beginning of the search converging towards an optima but still cover big parts of the search space. But as always with EA, tuning the parameters can be quiet tricky.