"Paint-By-Number" is one of many names for a type of graphical logic puzzle originating in Japan. Other common names for these puzzles are Nonograms, Griddlers, Hanjie, and Picross. The puzzles consist of a blank grid with numbers along the top and one side. The numbers tell the sizes of the blocks of the foreground color in that row and column. To solve the puzzle, you must reconstruct the image by figuring out how the color blocks must be place in each row and column. One place to find examples of these puzzles is at my web site, webpbn.com.
The 'pbnsolve' program is a tool to automatically solve such puzzles. It was written primary to validate puzzles created by users for the webpbn.com site. We solve the puzzle to determine if it has a unique solution and sometimes to get a feel for how easy or difficult it is. There are many similar programs available on the web. Steve Simpson has a nice solver, and links to many others, including a very good solver by Mirek and Petr Olák. I have posted a page with comparisons of some of these solvers.
The Olák's solver came close to doing what I needed for webpbn.com, but the fact that the code is all in Czech makes it hard to modify for a person ignorant of that language, so I wrote my own.
This is fast, but not complete. It will miss some cells that could be marked.
The trickiest part in coding this is finding the left-most and right-most solutions. The code for this in pbnsolve is fairly fast, but extremely ugly, with many special cases.
Unlike most other solvers, pbnsolve handles multicolor puzzles. For each cell, it maintains a list of what colors that cell could be. (This is a major improvement over the webpbn.com "helper" program, in which cells are always either one specific color, or completely unknown.)
The left-right overlap algorithm is extended slightly to handle this case. When we intersect the left and right solutions, we do the following:
For all blocksIf you do this, you typically find that there are some cells that can't be some colors, for example if the first clue for a line is a green seven, then the first seven cells in the line can't be red.For all cells between position of the left-most cell of the block in the left solution, and the right-most cell of the block in the right solutionThat cell may be the color of that block.
Steve Simpson has described what appears to be a better line solving algorithm. Someday I should figure out if that can be adapted to multi-color puzzles.
Then it enters a loop where it keeps grabbing a line off the job list, and handing it to the line solver. Every time the line solver marks a cell, the crossing lines are put on the job list (or have their priority bumped up if they are already there).
Often, this process will suffice to solve the puzzle. If it succeeds, we are assured that there is a unique solution, and that it is solvable by mortals humans, since humans are as good or better at line solving than the left-right overlap algorithm.
In many cases, however, this will fail to fully solve the puzzle. The job list will empty out, with some cells still in an unknown state.
When pbnsolve reaches this point, it does an exhaustive check to make sure that every cell that can be marked by looking at one clue at a time has been marked. This is necessary because our line solver algorithm is not complete and can miss marking some cells. The exhaustive check tries marking every unsolved cell with each of the colors that it could possibly be, and then checks if the row and column containing that cell can still be solved. If not, it removes that color from that cell's list of possible colors. If this algorithm succeeds in marking any cells, we resume the normal line-by-line solving algorihtm.
If the exhaustive check fails, then logical solving will have failed us, and it is time to try something else.
When the logic solver stalls, the program makes a guess by picking a cell and assigning it a color. It then restarts the logical solver, this time keeping an undo history of all the changes it makes.
One of three things will happen. The first possibility is that the logical slover will stall again, with the puzzle still incomplete. If so, we make another guess and proceed by starting up the logical solver again.
The second possibility is that the logical solver could hit a contradition. This occurs if the line-solver is unable to find any legal solutions for a line. If that happens, we backtrack to the most recent guess, using the undo-history to undo everything we did. We then invert the guess. This is not a guess anymore, we know that the cell must be some color other than the color we guessed before. We can then restart the logical solver and see how far we get.
The third possibility is that the puzzle solves completely. Then, of course, we can stop, unless we are trying to determine if the solution is unique. If we are trying to prove uniqueness, we need to backtrack just as we did in the previous case. If backtracking leads to other solutions, then we know there are multiple solutions. If it only leads to contraditions, then the first solution we found must have been unique.
Choosing good guesses is fairly important to the performance of this algorithm. Several different guessing algorithms are implemented in pbnsolve. You choose between them by editing the config.h file and recompiling.
This guessing algorithm worked well on most puzzles, but there were some puzzles which humans could solve quite well, that the program would just run (seemingly) forever on.
Depth-first search gets hopelessly inefficient if the chain of choice points gets too long. If you have to make ten guesses to to get to a solution, then the search tree has something like 210 branches. Exhaustively exploring all that is pretty hopeless.
However on many of the puzzles where this kind of thing was happening, you could do much better if you made different guesses. Some guesses would get you much further than others, so fewer guesses would be needed to reach a solution. But the guessing heuristics weren't good enough to find these for all puzzles.
The way it works is that it chooses a set of cells that seem likely candidates for a guess. Specifically, it chooses:
There are some important special cases that come up. Sometimes running the logical solver on one of the candidate guesses will end up solving the puzzle completely. In that case, obviously, we can abort the probing and run with the solution we found.
Similarly, sometimes one of the candidate guesses will lead to a contradiction. This is almost as good. Again, we abort the probing phase, invert the guess that led to the contradiction, and proceed with that.
There are a lot of inefficiencies in this. It does a lot of computation to explore in various directions, and discards all the results of all that work. Then it chooses one and repeats the logical solution of that. Really kind of stupid, and it's not surprising that many of the puzzles that solved quickly with the guessing algorithm, run quite a bit slower with the probing approach.
But it's not exponentially slower in those cases. In many cases where the guessing approach got lost in an exponential search, the probing approach finds better guesses, making the search tree substantially shallower, and thus winning an exponential speed improvement. This can be seen in the Full Results Table, where it can be seen that though pbnsolve is a bit slower on many puzzles of moderate difficulty, it is a lot faster on some really hard puzzles.
Originally I believed that the success of the probing algorithm was founded on the fact that it finds guesses that get us farther toward the solution, thus reducing the number of branch-points in the search tree and reducing the n in its 2n size. But when I started collecting statistics, I discovered that something different was actually happening. On the puzzles where probing really performs well, pbnsolve was almost never choosing the probe that got it the furthest toward the goal, because it was almost always finding a contradiction before the probe sequence ended. Here are some statistics on a few difficult puzzles, from pbnsolve-1.08:
|Puzzle||Size||CPU Time||Number of Probing Sequences|
|Total||Ended in Contradiction
|#1611: Merka||55 x 60||0.01 sec||25||25||100.0%||0|
|#1694: Tragic||45 x 50||0.03 sec||193||191||99.0%||2|
|#6739: Karate||40 x 40||0.79 sec||9,961||9,351||93.9%||610|
|#2040: Hot||55 x 60||0.82 sec||2,508||2,417||96.4%||91|
|#2712: Lion||47 x 47||6.0 sec||36,036||33,910||94.1%||2,126|
|#6574: Forever||25 x 25||9.4 sec||221,846||194,838||87.8%||27,008|
|#8098: 9-Dom||19 x 19||10.2 sec||164,216||152,037||92.6%||12,178|
|#2556: Flag||65 x 45||118.7 min||127,421,555||102,554,824||80.5%||24,866,731|
Many of the most troublesome puzzles are the ones with very large numbers of solutions, like Lion and Flag, or ones where you have to look many moves ahead to find contradictions, like 9-Dom. These are exactly the kinds of situations where contradictions are hard to find.
So it appears what probing is really about is pruning the solution space by aggressively searching for contradictions.
So in version 1.09 pbnsolve, by default, uses a somewhat kludgy "plod & sprint" algorithm. It starts out plodding through the solution space with the standard probing algorithm. But if it is not finding many contradictions, if the rate of finding them falls below 88%, then we switch to sprint mode for a while, making our decisions with the guessing algorithm for a while. After about 4000 iterations of that, we give it up and resume plodding for a while.
In practice, most puzzles are solved without ever sprinting. When we do sprint, it actually slows us down a little in about half the cases. But in the remaining cases it gives us quite impressive speed ups, solving some puzzles in a fraction of a second that used to take hours. So this is a useful trick, if not a particularly brilliant one.
Though the merging code works, it doesn't actually improve the performance of the solver. The overhead for the extra bookkeeping almost always exceeds the gains made. So, by default, pbnsolve does not do merging, but it can be turned on by a command line option.
I think basically this would only get us better performance if it found merges that worked in situations where contradictions could not be found. Most of the time we do find contradictions, and merges work out in only a fairly small percentage of the remaining cases, so they don't give you a lot of benefit.
Basically, everytime we run the line solver described in section 3.1, we save the result in a hash table. The line clue and the initial state of all the cells in the line form the hash key, and the final state of the line forms the data. We check the hash table before every call to the line solver, and skip the call if we can get the solution from the hash table. If the hash table gets too big, we simply empty it. In pbnsolve's implementation, we enlarge the hash table after the first few times it is is flushed. We also do a lot of compression of the data so the hash table doesn't get too big.
This works amazingly well. I would have expected hit rates on the table to be rather low, but they aren't. For a typical hard puzzles, one sees hit rates in the 80% to 98% range. Apparantly once the solver starts searching there is a lot of re-solving of the same problems. So this seems to cut the run-times on hard puzzles by 33% to 50%. The memory cost is substantial, but even small, frequently flushed tables pay off pretty well.
However, for every solver there seems to be a few rare puzzles that cause much more difficulty, where the search becomes lost in an exponentially branching tree of guesses. Pbnsolve seems to do better on most of these than other solvers, probably due to the probing algorithm, but there some rare puzzles that pbnsolve cannot solve in a reasonable amount of time.
All solvers are going to go exponential on some puzzles. We know this, thanks to some folks who proved that solving paint-by-number uzzles is NP-complete. Pbnsolve does fairly well, but I think there is still room for improvement.
I think a bigger performance improvement would come from a better search algorithm, perhaps using something like an A* algorithm instead of a depth-first search. Such searches don't explore in only a single direction, but instead explore along many branches of the search tree simultaneously. The down side is that all the leaves of the tree have to be kept in memory simultaneously, so this can eat more memory the longer it runs. (The current pbnsolve program uses only a modest amount of memory and memory consumption doesn't grow much no matter how long it runs.)
An interesting direction to go would be to try to apply more of the special reasoning tricks that humans use in solving more complex puzzles, like edge logic. This would be hard to do, but it would make the program into a better tool for assessing how hard the puzzle is for humans to solve.
There are whole classes of other non-traditional search algorithms that might work well here. Sometimes when a person solves a puzzle, they find that due to a mistake they are in a contradictory state. Sometimes then you can move things around to try to fudge the solution back into shape. Some kind of relaxation algorithm that temporarily produces illegal solutions and then fudges them back into shape might be interesting.