XML Format for Paint-by-Number Puzzles
"Paint-By-Number" is one of many names for a type of graphical logic puzzle
originating in Japan. Other common names for these puzzles are Nonograms,
Griddlers, Hanjie, and Picross. The puzzles consist of a blank grid with
numbers along the top and one side. The numbers tell the sizes of the blocks
of the foreground color in that row and column. To solve the puzzle, you must
reconstruct the image by figuring out how the color blocks must be placed in
each row and column.
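As a small illustration of what the clue numbers mean, the sketch below derives the clue for one row of a two-color puzzle. The function name `row_clues` is hypothetical, and the row is assumed to be a string that uses "X" for the foreground color and "." for the background, as in the color declarations of the sample file further down.

```python
from itertools import groupby

def row_clues(row, background="."):
    """Return the lengths of the consecutive foreground blocks in one row."""
    # groupby yields one group per run of identical characters;
    # only the runs that are not background contribute a clue number.
    return [len(list(cells)) for char, cells in groupby(row) if char != background]

print(row_clues("XX.X..XXX"))  # [2, 1, 3]
```

Column clues can be derived the same way after transposing the grid, for example with `zip(*rows)`.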
One place to find examples of these puzzles is at my web site,
That site offers a tool that allows one to export any puzzle in any
of several formats.
Many other sites provide puzzles to solve on-line, and
various books and magazines publish these puzzles regularly. Quite a few
people have also written programs that allow puzzles to be solved on a
home computer or a PDA, and various other people have written programs to
solve puzzles automatically.
Nonogram site includes a good collection of links to paint-by-number
In surveying these, I noticed a lack of a good, standardized file format
for paint-by-number puzzles.
Every program seems to invent its own file format.
So I decided to come up with an XML file format that would be rich enough
to handle most puzzle-storing applications.
I offer this with some trepidation, as XML formats are typically the
subject of endless bickering, and I have very little experience with
designing them, but I think this is OK and I presume future versions will
Obviously not every program that reads or writes these file formats needs
to support all features. None of my software, for example, supports triddlers.
The present format has the following charms to recommend it:
- Works for multicolor puzzles as well as black & white.
- Works for puzzles with triangles and for
These variations on the standard paint-by-number puzzle will be
familiar to users of the griddlers.net
- Can store single puzzles, or collections of puzzles.
- Can store puzzles either by giving just the clues,
just the image of the goal solution, or both.
- Can store various supplemental information about
puzzles, including titles, information about authorship and copyrights.
- Easily readable/writable using any standard XML parsing library.
- Reasonably well documented (you're looking at it).
Another XML format
for paint-by-number puzzles has been described by Steve Simpson. He designed
his own "so that a future solver will be able to deal with multicoloured
and non-rectangular puzzles," which is a bit perplexing, since
my format has always supported both those variations.
However, his format is fine and if there weren't multiple incompatible XML formats
for the same thing, then it wouldn't be XML.
Below is a sample XML paint-by-number puzzle description file. This is a
simple two-color puzzle. These files can contain multiple puzzles, but this
example contains just one.
This example includes both the clues for the puzzle and the intended solution.
It would be possible to omit either the clues (the two <clues> tags and
all their content) or the solution (the <solution> tag and all its
<!DOCTYPE pbn SYSTEM "https://webpbn.com/pbn-0.3.dtd">
<puzzle type="grid" defaultcolor="black">
<copyright>© 2004 by Jan Wolter</copyright>
A dancing stick figure man.
<color name="white" char=".">fff</color>
<color name="black" char="X">000</color>
A DTD file for this format is available from
The character entities that may appear in the file are identical to
those that may be used in HTML files.
In my experience, character entities in XML files are a pain. I recommend
not using them if possible.
The following tags appear in the document:
The root tag for the document. Should be included even if there is only
One puzzle in the set of puzzles. Appears inside <puzzleset>.
puzzle type. Defaults to "grid" if omitted. Current legal values
Other values may be defined in the future.
Cells are square and there will be a set of row clues
and a set of column clues.
Cells are triangular, the puzzle is a big hexagon, with
clues along six sides.
The name of the color to use for any <count> tag
that does not include a color= attribute. If this is not
defined, the default default color is "black".
The name of the background color. This defaults to "white".
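A reader might apply the documented defaults for the puzzle attributes roughly as follows. This is only a sketch using Python's standard `xml.etree.ElementTree`. The helper name `puzzle_settings` is hypothetical, and only the `type` and `defaultcolor` attributes, whose names appear in the sample file, are handled.

```python
import xml.etree.ElementTree as ET

def puzzle_settings(puzzle):
    """Read the puzzle attributes, falling back to the documented defaults."""
    return {
        "type": puzzle.get("type", "grid"),                    # "grid" if omitted
        "defaultcolor": puzzle.get("defaultcolor", "black"),   # "black" if omitted
    }

puzzle = ET.fromstring('<puzzle defaultcolor="red"/>')
print(puzzle_settings(puzzle))  # {'type': 'grid', 'defaultcolor': 'red'}
```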
Where does the puzzle come from?
This string should be different for each possible publisher of puzzles.
Appears inside <puzzle> and/or <puzzleset>.
If placed in the <puzzleset>
this is the default source for all puzzles that don't define a source.
May be omitted.
An identifier for the puzzle that is unique within the source.
Syntax depends on the source.
Can only appear in a <puzzle> tag, not in a <puzzleset> tag.
May be omitted.
In a <puzzle>, this is the title of the puzzle.
In a <puzzleset>, this is the title of the collection of puzzles.
May be omitted in either place.
In a <puzzle>, this is the name of the author of the puzzle.
If placed in a <puzzleset> tag, this is the default author name
for all puzzles that don't define an author name. May be omitted.
In a <puzzle>, this is an identifier for
the author of the puzzle which is unique within the source.
Syntax depends on the source.
If placed in a <puzzleset> tag,
this is the default author id for all puzzles that don't define
an author id. May be omitted.
In a <puzzle>, this is a copyright message for the puzzle.
If placed in the <puzzleset>
tag, this is the default copyright message for all puzzles that don't
define a copyright message. May be omitted.
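The fallback rule shared by the source, author, author id and copyright information (use the value in the puzzle, otherwise the one in the enclosing puzzle set) could be sketched like this. The function name `effective_text` and the sample data are made up for illustration; the tag names `copyright` and `authorid` are the ones mentioned in this document. Titles are not inherited this way, since a title in the puzzle set names the collection rather than supplying a default.

```python
import xml.etree.ElementTree as ET

def effective_text(puzzleset, puzzle, tag):
    """Text of <tag> in the puzzle, else in the enclosing puzzle set, else None."""
    for scope in (puzzle, puzzleset):
        value = scope.findtext(tag)   # looks at direct children only
        if value is not None:
            return value.strip()
    return None

DOC = """<puzzleset>
  <copyright>(c) Example Publisher</copyright>
  <puzzle><authorid>alice</authorid></puzzle>
  <puzzle><copyright>(c) Bob</copyright></puzzle>
</puzzleset>"""

puzzleset = ET.fromstring(DOC)
first, second = puzzleset.findall("puzzle")
print(effective_text(puzzleset, first, "copyright"))   # (c) Example Publisher
print(effective_text(puzzleset, second, "copyright"))  # (c) Bob
```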
A description of the puzzle. The sort of thing you might want to display
to the solver after they have solved the puzzle.
Can only appear in a <puzzle> tag.
May be omitted.
Defines a color name used in this puzzle. Must be in a <puzzle>
tag. There may be multiple color definitions for a puzzle. Each
color definition must have a name= attribute.
The name of the color. This can be any text string.
A one-character representation for the color. This is
used to represent the color in solutions. May be omitted,
especially if there are no solutions. White space characters
are not legal. Neither are the characters '\', '/', '|', '?',
'[' or ']'.
The content of the color tag is a color value, typically an RGB
color code. This is usually a 3 or 6-digit hexadecimal number,
like "3cc" or "210fbe". A three digit string like "123" is
equivalent to the six digit string "112233".
For puzzles with triangles, the value can contain two hexadecimal
color codes, separated by a "/" or "\", like "000/fff" or
Two color names are predefined. (Because of this, the color declarations
in the sample are redundant and could be omitted.)
Any other color used in the puzzle must be defined by a <color> tag.
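The rules above for color values and color characters can be checked with a few lines of code. The following is a minimal sketch, not part of the format itself; the function names are hypothetical, and it assumes color values are plain hexadecimal codes of three or six digits.

```python
import re

# Characters that may not be used as a color's char= value.
ILLEGAL_CHARS = set("\\/|?[]")

def is_legal_color_char(ch):
    """True if ch is a single non-whitespace character that is not reserved."""
    return len(ch) == 1 and not ch.isspace() and ch not in ILLEGAL_CHARS

def expand_hex(code):
    """Normalize a 3- or 6-digit hexadecimal color code to six digits."""
    code = code.strip().lower()
    if not re.fullmatch(r"[0-9a-f]{3}|[0-9a-f]{6}", code):
        raise ValueError(f"not a 3- or 6-digit hex color: {code!r}")
    if len(code) == 3:
        code = "".join(digit * 2 for digit in code)   # "123" -> "112233"
    return code

def parse_color_value(value):
    """Split a value such as "000/fff" into its one or two six-digit codes."""
    return [expand_hex(part) for part in re.split(r"[/\\]", value)]

print(expand_hex("123"))             # 112233
print(parse_color_value("000/fff"))  # ['000000', 'ffffff']
```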
A set of clues used in the puzzle. Must be in a <puzzle> tag. The
"type=" attribute is required. For a puzzle of type "grid" we expect
to see a set of clues with type "columns" and a set of clues of type
For a puzzle of type "triddler" we expect to see six sets of clues,
with types "top", "topright", "bottomright", "bottom", "bottomleft" and
"topleft". This labeling assumes that the puzzle is oriented so that
there are horizontal lines separating cells, but not vertical lines,
and that the horizontal clues are at the left of the puzzle, like this:
/ / 2 /
/ 3 / 1 /
___ ________ 3 /
1 1 1 /\ /\ /\ /
2 3 /\ /\ /\ /
___ /__\/ _\/__\/ \
1 \ /\ /\ / 2
___ \/__\/__\/ \
\ 1 \ 2 \
\ \ 1 \
The "topleft" and "bottomleft" clues are clues for horizontal rows of
cells above and below the bend on the left side of the puzzle.
The "top" and "topright" clues are for lines in the / direction. The
"bottom" and "bottomright" clues are for lines in the \ direction.
It is possible for some clue-sets to be empty (if the puzzle has a
The puzzle above would be represented like:
Each clue set contains one or more lines of clues.
Line tags must be in a <clues> tag.
The order of the lines within the clue is from top to bottom for the
horizontal clues, and from left to right for vertical or diagonal clues.
Each line contains zero or more counts. Count tags must be in a
The order of the counts is from left to right for horizontal lines,
and from top to bottom for vertical or diagonal lines.
The content of a count tag is normally a positive integer, which
gives the length of a block. Zero values are used to indicate blotted
Counts have an optional attribute called color:
The color for the clue. This can be any color name defined
by a <color> tag. If the color= attribute is omitted, then
it defaults to the value set by the defaultcolor= attribute on
the <puzzle> tag.
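Reading the clue sets with a standard XML library might look like the sketch below. It assumes the line tag is spelled `line` (the text above only calls it a "Line tag"), and the helper name `read_clues` and the sample data are invented for the example. Each count becomes a (length, color name) pair, with the color falling back to the puzzle's defaultcolor= attribute.

```python
import xml.etree.ElementTree as ET

def read_clues(puzzle):
    """Map each clue-set type to a list of lines of (length, color) pairs."""
    default = puzzle.get("defaultcolor", "black")
    result = {}
    for clues in puzzle.findall("clues"):
        lines = []
        for line in clues.findall("line"):        # assumed tag name
            lines.append([(int(count.text), count.get("color", default))
                          for count in line.findall("count")])
        result[clues.get("type")] = lines
    return result

DOC = """<puzzle type="grid" defaultcolor="black">
  <color name="red" char="R">f00</color>
  <clues type="columns">
    <line><count>2</count><count color="red">1</count></line>
    <line></line>
    <line><count>3</count></line>
  </clues>
</puzzle>"""

print(read_clues(ET.fromstring(DOC)))
# {'columns': [[(2, 'black'), (1, 'red')], [], [(3, 'black')]]}
```

A line with no counts comes back as an empty list, which matches the rule that a line may contain zero counts.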
A <puzzle> tag may contain solutions. Each solution can have a
type= attribute which can take one of the following values:
This is the goal solution intended by the designer of the puzzle.
If a solver reaches this solution, then the puzzle is considered
"solved". Well designed puzzles typically have only one goal
solution, but the file format allows for multiple goal solutions,
in which case reaching any goal would constitute "success".
This is the default if no type attribute is given.
This is a solution, but not necessarily a goal. If a solving
program found 16 solutions to a puzzle, it might write them
each out with type="solution".
This is a saved solution to the puzzle. It is not necessarily
correct or complete. If you had partially solved the puzzle,
and wanted to save it for further work later, then that would
be written with type="saved".
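A program that only cares about one kind of solution could filter on the type= attribute, treating a missing attribute as "goal". The sketch below uses a hypothetical helper `solutions_of_type` and invented sample data.

```python
import xml.etree.ElementTree as ET

def solutions_of_type(puzzle, wanted="goal"):
    """Return the <solution> elements whose type (default "goal") matches."""
    return [solution for solution in puzzle.findall("solution")
            if solution.get("type", "goal") == wanted]

DOC = """<puzzle>
  <solution><image>|X.|</image></solution>
  <solution type="solution"><image>|.X|</image></solution>
  <solution type="saved" id="monday"><image>|?X|</image></solution>
</puzzle>"""

puzzle = ET.fromstring(DOC)
print(len(solutions_of_type(puzzle)))                 # 1
print(solutions_of_type(puzzle, "saved")[0].get("id"))  # monday
```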
Solutions may also have an "id=" attribute, which can be used to label
They can contain <image> and <note> tags.
Each <solution> tag contains exactly one <image> tag.
The <image> tag gives that solution as a string of
characters. All white space characters (space, tab, newline, carriage
return, line feed) are ignored. All other characters must be either:
- A '|', '\' or '/' character to mark the beginnings and endings of
- A character matching the "char=" attribute of some defined color
to indicate a cell of that color.
- A '?' which means that the cell could be any color.
- A sequence of characters like '[245]' which means that the cell
could be one of the colors that have char= attributes of '2', '4',
or '5', but not any other color. If all declared colors (including
the background color) were listed, then this would be equivalent to
a '?'. If only one is listed, like '[X]' then that would be
equivalent to just writing 'X'.
The latter two forms are acceptable only in type="saved" puzzles, not
in type="goal" or type="solution" puzzles.
Probably to be really XMLy, there should be some fancy substructure to
this tag, but I felt it was simpler just to have it contain an image
of the puzzle.
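For grid puzzles, where each row starts and ends with a '|', the image string can be decoded with a short loop. The sketch below is one possible reading of the rules, not a reference implementation. The name `parse_grid_image` is hypothetical, and `color_chars` is assumed to be a string holding the char= value of every declared color. Each cell is returned as the set of color characters it may still take, so a '?' becomes the full set and a plain color character a one-element set. A stricter reader would reject '?' and bracketed cells outside type="saved" solutions.

```python
def parse_grid_image(text, color_chars):
    """Return a list of rows; each cell is the set of color chars still possible."""
    chars = "".join(text.split())        # all white space is ignored
    rows, row, i = [], None, 0
    while i < len(chars):
        ch = chars[i]
        if ch == "|":
            if row is None:
                row = []                 # a '|' outside a row opens one
            else:
                rows.append(row)         # a '|' inside a row closes it
                row = None
        elif row is None:
            raise ValueError(f"cell {ch!r} found outside a row")
        elif ch == "?":
            row.append(set(color_chars))
        elif ch == "[":
            end = chars.index("]", i)    # raises ValueError if ']' is missing
            options = set(chars[i + 1:end])
            if not options or not options <= set(color_chars):
                raise ValueError("bad color list in brackets")
            row.append(options)
            i = end
        elif ch in color_chars:
            row.append({ch})
        else:
            raise ValueError(f"unknown color character {ch!r}")
        i += 1
    if row is not None:
        raise ValueError("last row is not closed with '|'")
    return rows

print(parse_grid_image("|X.|\n|?[X]|", ".X"))
```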
For grid type puzzles, the solution is given row-by-row. Each row starts
and ends with a | character. There may be line-feeds separating the
rows, but there need not be. The solution in the sample above could
equally well be given as:
For triddlers the solution is also stored row-by-row, but the line
starting and line ending characters are / or \ depending on the slope
of the edge of the puzzle. Basically if the puzzle looks like this:
/\ B/\D /\
/\G /\I /\K /
\L /\ N/\ P/
then we save it like this (except, of course, that the letters are
replaced by whatever symbol indicates the color for that cell):
Notes are always optional.
They can appear in <puzzleset>, <puzzle> or <solution>
- Version 0.1 - Jul 26, 2007
- Original release.
- Version 0.2 - Jan 14, 2009
- Added <authorid> tag.
- Version 0.3 - Jan 20, 2009
- Generalized use of <note> tag.