Web Paint-by-Number Forum

Topic #1037: Pondering puzzles on the front page
By Valerie Mates (valerie)

#1: Valerie Mates (valerie) on Mar 17, 2021

[From a Facebook thread.]

I was wondering how long we will have the "twenty restored puzzles from the database crash" section on the home page, so I plugged the numbers into a calculator. At one puzzle a day, it will take 7.7 years. Wow.

I was looking at that because I was wondering why restored puzzles never show up in the Featured Puzzles section of the site, so I wondered if there was some detail in the database that was preventing them from being selected. But I checked and they can indeed be listed; they just don't tend to be chosen.

That surprised me, because 10% of the puzzles on the site are puzzles that were restored from the database crash, so there are plenty of restored puzzles that could be chosen. But I think the restored puzzles have a few strikes against them: They tend not to have descriptions and solvability ratings, and people tend to rate them lower. But eventually the system should start selecting them to be shown there. Probably it's already happened and I didn't notice.

In clearer language: Basically I was wondering why puzzles that were restored from the crash aren't picked to be featured at the top of the home page. It turns out that they can indeed be picked. But the restored puzzles tend to have lower ratings, so my theory is that they don't get picked as often because of that.

While I was looking at it, I was curious how many puzzles are in the pool of puzzles that have good enough quality ratings and other stats so that they *could* be picked for each of the categories on the front page. So I did a search in the database. Here's how many are currently in each pool:

Beginner: 5,771
Easy: 10,770
Intermediate: 3,571
Hard: 925
Brain-Busting: 51

I thought that was interesting.

#2: Brian Bellis (mootpoint) on Mar 17, 2021

Valerie, I don't remember ever seeing any of my restored puzzles (about 285) on the featured puzzle page.

How are featured puzzles chosen? Is it random within each of the difficulty ratings or does something else like quality or number of raters or solvers go into it?

Since most of my puzzles are smaller and easier, they probably fit into the first two categories. So a Mootpoint puzzle should show up at a rate of about 1.7%. At 4 beginner and easy puzzles each day, I think I should see one of my restored puzzles on the list every 14 days.
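
A rough check of that estimate, as a sketch only. It assumes all 285 restored puzzles are eligible and that every pick is uniform over the combined Beginner and Easy pools quoted in post #1; neither assumption is stated in the thread.

```python
# Back-of-the-envelope check of the estimate above.
# Assumption: all 285 restored puzzles are eligible, and each pick is
# uniform over the combined Beginner + Easy pools from post #1.
restored = 285
pool = 5_771 + 10_770            # Beginner + Easy pool sizes
picks_per_day = 4                # Beginner and Easy picks per day, as stated above

share = restored / pool                      # chance that one pick is one of the 285
days_between = 1 / (share * picks_per_day)   # expected days between appearances

print(f"share of pool: {share:.1%}")
print(f"about one every {days_between:.1f} days")
```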

#3: Valerie Mates (valerie) on Mar 18, 2021

Brian - The code that selects the day's featured puzzles runs five times, once for each difficulty level. The criteria are:

* Beginner: Difficulty rating up to 6, minimum quality rating 2.9, rated by at least 15 people.
* Easy: Difficulty rating from 6 to 10, minimum quality rating 3.7, rated by at least 15 people.
* Intermediate: Difficulty rating from 10 to 13, minimum quality rating 4.0, rated by at least 15 people.
* Hard: Difficulty rating from 13 to 16, minimum quality rating 4.0, rated by at least 10 people.
* Brain-Busting: Difficulty rating from 16 to 20, minimum quality rating 4.0, rated by at least 5 people.

For all categories, in order to be selected, puzzles also have to have only one solution and be solvable by color and line logic, moderate lookahead, or deep lookahead. That is, the code doesn't select puzzles that have a trivial solution or require guessing. It also doesn't choose puzzles where nobody has rated the solvability and uniqueness. The ratings for solvability and a unique solution can be the consensus of the people solving the puzzle; they don't need to be set by an administrator, but there needs to be at least something entered there - it can't be blank.

For each of the five difficulty levels, the system makes a list of all of the puzzles that match the criteria for that level, and then it selects two of them at random.
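
The selection steps described above can be sketched roughly like this. It is an illustration, not the site's actual code: the `Puzzle` record, its field names, the solvability strings, and the inclusive range boundaries are all assumptions.

```python
import random
from dataclasses import dataclass
from typing import Optional

@dataclass
class Puzzle:
    # Hypothetical record; field names are illustrative, not the site's schema.
    id: int
    difficulty: float
    quality: float
    num_raters: int
    unique: Optional[bool]       # None = nobody has rated uniqueness
    solvability: Optional[str]   # None = nobody has rated solvability

# (min difficulty, max difficulty, min quality, min raters), from the list above.
# Assumption: both ends of each difficulty range are inclusive.
CRITERIA = {
    "Beginner":      (0, 6, 2.9, 15),
    "Easy":          (6, 10, 3.7, 15),
    "Intermediate":  (10, 13, 4.0, 15),
    "Hard":          (13, 16, 4.0, 10),
    "Brain-Busting": (16, 20, 4.0, 5),
}

# Solvability values that qualify; trivial and guessing puzzles are excluded.
ALLOWED_SOLVABILITY = {"color and line logic", "moderate lookahead", "deep lookahead"}

def eligible(p: Puzzle, level: str) -> bool:
    lo, hi, min_quality, min_raters = CRITERIA[level]
    return (
        lo <= p.difficulty <= hi
        and p.quality >= min_quality
        and p.num_raters >= min_raters
        and p.unique is True                      # blank (None) does not qualify
        and p.solvability in ALLOWED_SOLVABILITY  # blank (None) does not qualify
    )

def pick_featured(puzzles, level, k=2, rng=random):
    """Build the pool for one level, then pick up to k puzzles at random."""
    pool = [p for p in puzzles if eligible(p, level)]
    return rng.sample(pool, min(k, len(pool)))
```

Running `pick_featured` once per key in `CRITERIA` gives the day's picks. With inclusive boundaries, a puzzle with a difficulty of exactly 6 would land in both the Beginner and Easy pools; the real code may treat the boundaries differently.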

Yesterday when I was wondering why restored puzzles don't seem to be selected, I ran the selection code but limited it to only choosing from the ID numbers of the restored puzzles. And it did select puzzles that had been restored. So that code *can* select restored puzzles; it just doesn't seem to ever do that.

I am guessing that there aren't many restored puzzles with enough ratings and other settings that qualify them to be selected.

Just now I re-ran the code for selecting beginner puzzles and split it into restored puzzles vs. regular puzzles. There are only 24 restored puzzles that qualify for selection, while there are 5,748 non-restored puzzles that qualify. In other words, only 0.4% of the beginner puzzles that can be selected are restored puzzles. That would explain why they don't turn up very often!
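
To put that 0.4% in terms of days, here is a small sketch. It assumes the two daily Beginner picks are distinct puzzles drawn uniformly from the pool, which is a guess at how "two at random" works.

```python
# Pool sizes quoted above for the Beginner level.
restored, regular = 24, 5_748
pool = restored + regular

share = restored / pool   # fraction of the pool that is restored

# Assumption: two distinct puzzles are drawn uniformly each day.
p_none = (regular / pool) * ((regular - 1) / (pool - 1))
p_at_least_one = 1 - p_none
expected_days = 1 / p_at_least_one   # average wait between days with a restored pick

print(f"restored share of pool: {share:.2%}")
print(f"chance of a restored Beginner pick on a given day: {p_at_least_one:.2%}")
print(f"roughly one every {expected_days:.0f} days")
```

Under those assumptions a restored puzzle would show up in the Beginner slot only a few times a year.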

So I think that over time more restored puzzles will start to qualify for selection, as they get more ratings and as their "Solvability" and "Logical" ratings get entered.

There are 638 puzzles by username "mootpoint" that qualify for selection as a beginner puzzle. Three of those were restored from the database crash. So that is about 0.5% of your selectable beginner puzzles. So if they aren't selected very often, it looks like it's because the restored puzzles don't have a lot of ratings entered yet.

#4: Valerie Mates (valerie) on Jan 7, 2022

Well this is interesting!

There was a puzzle that pretty much everybody was unhappy with in today's selections, so I experimentally increased the minimum quality rating for a puzzle to be selected in the Beginner category from 2.9 to 3.2. Then I was curious how that changed the number of beginner-level puzzles that could be selected. To my surprise, I found that before the change, the number of puzzles that could be selected was 6055, and after I changed it, it was *still* 6055 puzzles. So I tried raising the minimum quality rating a little bit higher, and *still* there were 6055 beginner puzzles that could be chosen.

That got me curious about the quality ratings, so I looked up the ratings for all of the puzzles (of all levels) in the database. Here's what I found. The first column is the quality rating and the second column is how many puzzles have that rating:
+---------+----------+
| quality | count(*) |
+---------+----------+
|       0 |     3166 |
|       4 |      189 |
|       5 |      245 |
|       6 |      565 |
|       7 |      892 |
|       8 |     1558 |
|       9 |     2824 |
|      10 |     4710 |
|      11 |     4499 |
|      12 |     4061 |
|      13 |     3713 |
|      14 |     2952 |
|      15 |     1823 |
|      16 |     1185 |
|      17 |      670 |
|      18 |      312 |
|      19 |       72 |
|      20 |       40 |
+---------+----------+
You can see why changing the quality minimum from 2.9 to 3.2 made no difference! Puzzle ratings are stored as whole numbers, and there *are* no puzzles with scores between 0 and 4.
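
A quick way to see this from the table, as a sketch. The counts are copied from the table above and cover all levels, so the totals will not match the Beginner pool of 6055; `count_at_least` is a hypothetical helper.

```python
# Quality score -> number of puzzles, copied from the table above (all levels).
QUALITY_COUNTS = {
    0: 3166, 4: 189, 5: 245, 6: 565, 7: 892, 8: 1558, 9: 2824,
    10: 4710, 11: 4499, 12: 4061, 13: 3713, 14: 2952, 15: 1823,
    16: 1185, 17: 670, 18: 312, 19: 72, 20: 40,
}

def count_at_least(min_quality):
    """Number of puzzles whose stored quality score meets the cutoff."""
    return sum(n for q, n in QUALITY_COUNTS.items() if q >= min_quality)

# Any cutoff above 0 and up to 4 keeps exactly the same puzzles,
# because no stored score falls between 0 and 4.
for cutoff in (2.9, 3.2, 4):
    print(cutoff, count_at_least(cutoff))
```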

Then I spent a while down the rabbit hole of reading the code for how puzzle quality scores are computed.

Anyway, I have often wondered why puzzles that don't seem like anybody liked them still get selected for the front page, and this explains it.

I am experimentally going to set the minimum quality rating for all difficulty levels to 9 and see how that impacts the puzzles on the front page.

I looked up how many puzzles there currently are that *could* be selected at each difficulty level if the minimum quality rating is 9. Currently the numbers that can be selected are:

Beginner: 4,690 puzzles
Easy: 10,722
Intermediate: 3,652
Hard: 947
Brain-Busting: 63

So I think setting the minimum quality rating to 9 for all levels is a reasonable change to make, and it should somewhat improve the puzzles that are selected each day.

I am going to go ahead and make the change, and then keep an eye on it over the next few days to see how it goes.

#5: Joe (infrapinklizzard) on Jan 7, 2022

The number stored must be the average multiplied by four (and then reduced to an integer). Thus a 5-star average equals 20 and a 1-star average equals 4. There can't be user ratings below 1, which is why there are no scores from 1 to 3. The 0 is probably puzzles that have no rating yet.

#6: Valerie Mates (valerie) on Jan 7, 2022

Sort of. The algorithm is:

1) There is an old rating system and a new rating system. A puzzle can have ratings in either system -- or both.

2) For the old rating system, there are two slots in the database. One indicates the sum of the ratings and the other indicates the number of people who rated the puzzle. For example, if two people rated a puzzle and one person gave it a 5 and the other a 3, then the sum stored in the database would be 8 and the number of people giving a rating would be 2. So to find the puzzle's rating, you would divide 8 by 2 and find that the average rating was 4.

3) For the new rating system, each puzzle's ratings are stored in five slots that indicate the number of people who gave it each rating, from 1 (worst) to 5 (best). So, for example, if the same two people had voted, the slots would be:
1: 0 voters
2: 0
3: 1
4: 0
5: 1

4) When the system calculates the Quality score for each puzzle, it stores it in the database as an integer (technically a "tinyint"), so the scores cannot have decimal places. That's why I am surprised that the cutoff numbers have decimal places in them. I am guessing that they must be from a former version of the system.

5) To calculate the score, the code does this:

New-style points are calculated by adding up the number of people who voted for each score, multiplied by that score. So in the example with two raters, the code would multiply 3 by 1 and add that to 5 times 1, giving 8. We'll call that the new-style points. Then the new-style points are plugged into this:

int(4*(new-style-points + old-style-total-points)/(number-of-new-style-voters + number-of-old-style-voters) + 0.5)

So, basically, yes, it's sort-of the average rating multiplied by four and reduced to an integer.
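
Here is the formula above as a small sketch. The function and parameter names are made up for illustration, and returning 0 when nobody has voted is an assumption based on the guess in post #5.

```python
def quality_score(new_counts, old_sum=0, old_voters=0):
    """Combine old-style and new-style ratings into the stored integer score.

    new_counts: five vote counts, for ratings 1 (worst) through 5 (best).
    old_sum, old_voters: the two old-style slots (sum of ratings, number of raters).
    """
    # New-style points: each rating value times the number of people who gave it.
    new_points = sum(score * n for score, n in enumerate(new_counts, start=1))
    new_voters = sum(new_counts)

    voters = new_voters + old_voters
    if voters == 0:
        return 0  # assumption: unrated puzzles are stored as 0

    # Average rating times four; adding 0.5 before int() rounds to the nearest integer.
    return int(4 * (new_points + old_sum) / voters + 0.5)

# The two-rater example from above: one vote of 3 and one vote of 5.
print(quality_score([0, 0, 1, 0, 1]))               # new-style only
print(quality_score([0, 0, 0, 0, 0], old_sum=8, old_voters=2))  # old-style only
```

Both calls describe the same average of 4 stars, so both give the same stored score. Because of the `+ 0.5`, the result is rounded to the nearest whole number rather than truncated.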

(Everybody wanted to hear that much detail, right? :) )

So, if the code that selects puzzles for the front page had cutoffs at around 3 and 4, maybe we actually want the cutoffs to be more like 12 (which is the old cutoff of 3, multiplied by 4) and 16 (4 times 4), to honor Jan's intent.

But I checked how many puzzles would meet the cutoffs of a quality score of 12 for a Beginner puzzle and 16 for the other levels. The numbers are:

Beginner: 1,049 puzzles
Easy: 530 puzzles
Intermediate: 884 puzzles
Hard: 429 puzzles
Brain-Busting: 26 puzzles.

Some of those numbers are kind of low, so for now I'm going to go with a cutoff of 9 at all levels and see how that goes. That should remove the lowest-scoring 1/3 of puzzles from the front page, which feels about right.

As always, I'm happy to have feedback, thoughts, ideas, corrections, etc.

#7: Byrdie (byrdie) on Jan 7, 2022

For me, I guess there would be a big question on what is meant by "quality." I've always related it to the image, therefore a 5/5 rating from me could mean that the image was really beautiful but the puzzle was high difficulty because it was virtually unsolvable. It also could mean the image was wonderful but it took all the tricks and skill I could muster to solve it.

I have a feeling that some people relate "quality" to the solving experience - "I didn't enjoy solving this puzzle because it was unsolvable so I'm going to give it a low quality rating."

My first example, a beautiful puzzle that's unsolvable, is a quandary for me because I see a 5/5 as a high rating but I don't believe that's a puzzle that belongs in the recommended puzzle list.

#8: Valerie Mates (valerie) on Jan 7, 2022

For me, puzzle quality includes everything -- whether I liked the image, whether it felt good to solve it, and anything else about it. But I think it's good that puzzle quality means different things to different people, because that all gets mixed into the score and makes the ratings more widely helpful to more people.

#9: Valerie Mates (valerie) on Jan 8, 2022

Today's picks all have a quality rating of at least 2 1/2, and most are higher. So I think this was a good change.

#10: Kristen Vognild (kristen) on Jan 8, 2022

Excellent! (I mean, as long as I see my own puzzles regularly, I'm happy) ;)

#11: Valerie Mates (valerie) on Jan 9, 2022

After two days of flight time, I would say that the puzzles with a score of 2 1/2 are noticeably less good than the ones with higher scores. So I'm going to raise the cutoff from 9 to 11 in all categories except for brain-busting. I'll leave the minimum for brain-busting at 9 because otherwise I think the pool of puzzles would be too small.

#12: Valerie Mates (valerie) on Jan 16, 2022

After a week of flight time, I think the new cutoffs for the Featured Puzzles are working well. The selected puzzles seem more enjoyable to me, whatever that means.
