67 Comments
Jan 7, 2022 · edited Jan 7, 2022 · Liked by Tom

I could be wrong and not thinking this all the way through, but I don't think this is optimal. Your trees handle bigrams and co-occurrence, but isn't it the case that a particular guess could seem to be suboptimal at one step, but turn out to have been optimal at a later step?

Imagine (for some word list) that guessing "s" first narrows the range as much as possible in step one (say, it narrows the range of possible words by 50%). The three branches from that guess (green, yellow, or black) lead to three new optimal guesses for your second guess. Let's say that green "s" means the next optimal guess narrows the field by a further 10%, yellow "s" means the next guess narrows the field by 5%, and black "s" means the next guess narrows the field by 3%. Taken together, the two guesses narrow the field by 60% ("s" was green), 55% ("s" was yellow), and 53% ("s" was black), which is an average of 56%.

Isn't it the case that, for that same word list, guessing "p" first might narrow the range by less (say only 49%), yet the three branches for the next guess might be better than they were for "s"? So "p" might seem like a worse guess at the moment, but it might be that the optimal guesses off each branch of that guess are better. Maybe the subsequent optimal guess after green "p" narrows by 9%, after yellow "p" by 8%, and after black "p" by 7%. So the sequence of guesses taken together gives you 58%, 57%, and 56%, for an average of 57% (higher than 56%).

I don't think you can actually optimize this with a single tree at each step. I think you would need to actually build the full tree of trees (not just the trees you have here, but the tree of these trees for each subsequent guess, all the way to solutions). Your approach, looking at only one of those subtrees at a time, assumes a constraint of that larger decision tree that I don't think it has.
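To make the arithmetic above concrete, here is a minimal sketch that just replays the made-up percentages from this comment. The equal weighting of the three branches is the comment's own simplification (a real evaluation would weight each branch by how likely it is), and the names `average`, `s_branches`, and `p_branches` are only for illustration.

```python
def average(values):
    """Plain mean; assumes the three branches are equally likely."""
    return sum(values) / len(values)

# First-guess narrowing plus second-guess narrowing for each branch
# (green, yellow, black), using the hypothetical numbers above.
s_branches = [50 + 10, 50 + 5, 50 + 3]   # guess "s" first
p_branches = [49 + 9, 49 + 8, 49 + 7]    # guess "p" first

s_two_step = average(s_branches)
p_two_step = average(p_branches)

# "s" wins when only the first step is compared (50 > 49),
# but "p" wins once the second step is included.
print(s_two_step, p_two_step)
```

A one-step (greedy) comparison picks "s", while the two-step comparison picks "p", which is the gap a full tree search would close.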

Jan 18, 2022 · edited Jan 18, 2022 · Liked by Tom

This is bugged for 213. RAISE>COUNT>BLOOD narrows the script's possibility array down to [PROXY, GROOM, PROOF]. Since the 2nd O in BLOOD is grey, it should exclude GROOM and PROOF, but instead it guesses both of them before PROXY.

In fact it guesses GROOM before PROOF even though PROOF would be the better guess between the two.
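For reference, a minimal sketch of how duplicate letters are usually scored, assuming the standard two-pass rule (greens first, then yellows limited by the letters still unmatched in the answer). This is not the post's code; `score` and the "G"/"Y"/"B" encoding are names chosen here for illustration.

```python
from collections import Counter

def score(guess, answer):
    """Return feedback as a string of G (green), Y (yellow), B (grey/black)."""
    result = ["B"] * len(guess)
    unmatched = Counter()
    # Pass 1: mark greens and count answer letters that were not matched.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "G"
        else:
            unmatched[a] += 1
    # Pass 2: a non-green letter is yellow only while unmatched copies remain.
    for i, g in enumerate(guess):
        if result[i] != "G" and unmatched[g] > 0:
            result[i] = "Y"
            unmatched[g] -= 1
    return "".join(result)

# Feedback described above for BLOOD: first O green, second O grey.
observed = "BBGBB"
candidates = ["PROXY", "GROOM", "PROOF"]
still_possible = [w for w in candidates if score("BLOOD", w) == observed]
print(still_possible)
```

Filtering by "would this candidate have produced the same feedback" drops GROOM and PROOF, because both would have turned the second O green.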


Hey, I believe this can be improved on by considering more than just the first word. It turns out that some "best" first words tend to result in some "bad" second guesses, so some different words are likely to be optimal. See https://jonathanolson.net/experiments/optimal-wordle-solutions for more information.

Jan 16, 2022 · edited Jan 16, 2022 · Liked by Tom

Nice approach. I have a question for Tom: Any kind of game cheater relies on some slightly sociopathic ability to take a contrived set of rules, sometimes as in this case ambiguous, and adapt them in a way that provides self-satisfaction and nothing more, unless there are prizes. But in the case of Wordle there aren't even bragging rights, because everyone knows you can cheat with two browser sessions. I'm pretty sure that the satisfaction here comes from the ability to understand and document the behavior of the game in a fun way (despite your reputation at parties) and not from the ability to cheat just a little but not too much. I.e., you are making analysis, not Wordle, fun. Still, I wonder, on the vast sliding scale from (at one extreme) playing honestly as if the word list is huge and the choice is random, all the way to (at the other extreme) just using the source code, or for that matter, using two browser sessions, to obtain a "1" score every single time ... what shady self-justification did you use to arrive at this particular level of cheating along the spectrum?

Incidentally, if you want to apply the lessons learned here but without memorizing the word list or actually using a cheat tool, an ideal first two guesses are LANES and TORIC. Then you'll usually have a very small number of possible words and you can either use your head or an anagrammer or Scrabble cheater to find them (depending on your level of self-delusion).

And finally, a new Wordle knockoff game has been launched, probably not by the author of the original game, but with well-planned digital marketing design that will probably attract all the search and game play away from the original within a couple of weeks ... and that uses a far less simplistic approach including allowing the user to choose the number of letters. I suspect that fairly soon the original game will be lost in obscurity because it has NO digital marketing thought at all, not even a decent URL.


Hi! Great tool, but a bit too much clicking on very small boxes. Maybe preselect "not in word" for each letter?

Jan 20, 2022 · edited Jan 20, 2022

Thank you for the enjoyable blog post! A question and an observation. Might you update the post and gist to match the code running on the server? And the solver stumbles on "FOCAL" — there's still a little problem processing black tiles and it enters a loop guessing "LOCAL." (The first time around, the filter doesn't correctly handle green repetitions following black repetitions of a letter. On following iterations, all guesses being placed in the same "group" rather than different groups results in the excess repetitions not being detected as intended.)
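One way to express the constraint described above is with per-letter counts: greens and yellows set a minimum count for a letter, and a black tile for the same letter makes that count exact. The sketch below is an assumption about how such a filter could look, not the code running on the server; `consistent` and the "G"/"Y"/"B" feedback string are hypothetical names.

```python
from collections import Counter

def consistent(candidate, guess, feedback):
    """True if `candidate` could be the answer given this guess and feedback."""
    counts = Counter(candidate)
    confirmed = Counter()  # green + yellow tiles seen per letter
    capped = set()         # letters that also received a black tile
    for i, (g, f) in enumerate(zip(guess, feedback)):
        if f == "G":
            if candidate[i] != g:
                return False
            confirmed[g] += 1
        else:
            # Yellow or black: the letter cannot sit at this position.
            if candidate[i] == g:
                return False
            if f == "Y":
                confirmed[g] += 1
            else:
                capped.add(g)
    # Greens and yellows give a lower bound on each letter's count.
    if any(counts[letter] < n for letter, n in confirmed.items()):
        return False
    # A black tile means there are no copies beyond the confirmed ones.
    return all(counts[letter] == confirmed[letter] for letter in capped)

# LOCAL guessed against FOCAL: first L black, last L green.
words = ["FOCAL", "LOCAL", "VOCAL"]
remaining = [w for w in words if consistent(w, "LOCAL", "BGGGG")]
print(remaining)
```

With the count made exact, LOCAL is ruled out after it has been guessed once, so the loop described above cannot start.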


Hey Tom, I love this tool! My partner and I have been using this to analyze our strategy after the fact. One thing I’d really love is the ability to view the list of remaining words (e.g., after you enter a word and it says “12 words remaining” it would be incredible to be able to see all twelve of those words). Obviously for large numbers of words it’s not practical, but anything below 30 or 40 or so would be super useful!

The other thing, which I imagine would be harder to implement, would be something that allows me to see the “strength” of a guess. Something like the average number of possible words remaining given any possible color configuration. That way I could compare the strength of my guess to the strength of the optimal guess.
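A rough sketch of the "strength" idea, assuming every remaining word is equally likely to be the answer: group the candidates by the color pattern a guess would produce, then average the group sizes over all possible answers. The function names and the three-word candidate list are illustrative only and are not taken from the tool.

```python
from collections import Counter

def score(guess, answer):
    """Feedback string: G green, Y yellow, B black (two-pass duplicate rule)."""
    result = ["B"] * len(guess)
    unmatched = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "G"
    for i, g in enumerate(guess):
        if result[i] != "G" and unmatched[g] > 0:
            result[i] = "Y"
            unmatched[g] -= 1
    return "".join(result)

def expected_remaining(guess, candidates):
    """Average number of candidates left after seeing the guess's feedback."""
    groups = Counter(score(guess, answer) for answer in candidates)
    # A group of size n is hit with probability n / N and leaves n words.
    return sum(n * n for n in groups.values()) / len(candidates)

candidates = ["PROXY", "GROOM", "PROOF"]
for guess in ["PROOF", "BLOOD"]:
    print(guess, expected_remaining(guess, candidates))
```

Lower is better: here PROOF separates all three candidates, while BLOOD leaves GROOM and PROOF indistinguishable.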

Thanks so much for building this!


Can't solve today for STUNG.


This is fantastic. Your approach to solving wordle seems much better than others I've seen, and the fact you have managed to code it is really impressive! One thing though... - test it with DROLL, a word that came up the other day. I tried it, and it started looping at DROOL and did not complete in six tries...

It probably needs a tweak to stop it trying to use a word it has already tried! ;-)
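The tweak could be as small as the guard sketched below, assuming the solver produces a ranked list of candidates at each step. `next_guess` is a hypothetical helper, not a function from the post, and a guard like this would only hide a filtering bug rather than fix it.

```python
def next_guess(ranked_candidates, tried):
    """Return the best-ranked candidate that has not been guessed yet."""
    for word in ranked_candidates:
        if word not in tried:
            return word
    return None  # nothing left: the feedback entered so far is inconsistent

tried = {"DROOL"}
print(next_guess(["DROOL", "DROLL"], tried))
```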


This is brilliant! Your solution approach is fantastic, and the fact you have managed to code it is even more impressive. One thing though - test it with DROLL, a word that came up the other day. I tried it, and it gets stuck in a loop on DROOL and doesn't complete in 6 tries! :-(


Today's Wordle word, April 17: 'AMPLE' gives me a "Cigar, I think you screwed up" after 'APPLE' in the following successful sequence on NYTimes Site: 'TRACE', 'BONUS', 'APPLE', 'AMPLE'.


This is a bit buggy today (4-6-2022) - I guessed RAISE --> CLOUT --> COCOA but after putting in the results of cocoa, it keeps suggesting the same disproven word as the best guess again.

R A I S E (Yellow A)

C L O U T (Green C, yellow O)

C O C O A (Green C, Green O, gray C, gray O, green A)

repeats


Minor bug today? It's fun to run the algorithm now and then after I solve, to see its approach. But I noticed that after its fourth guess today, CATCH, the green boxes are missing. End of the road. (RAISE, LYNCH, HUMPH, CATCH)


It didn’t work for the word ELDER. It told me I screwed up and called me a cigar.


I recently analyzed English 5 letter words. I wanted to know what letters are most common in five letter words. After seeing the letter frequencies it was easy to figure out what would be the best guesses for first words in Wordle. Here you can see the Wordle five letter statistics https://www.unscramblerer.com/wordle-solver/

However, choosing an optimal second word becomes much harder. It also probably ruins the fun at some point.
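A minimal sketch of the letter-frequency idea, assuming each letter is counted once per word and a first guess is ranked by the summed frequency of its distinct letters. The six-word list is a stand-in for a real word list, and the function names are made up for this example.

```python
from collections import Counter

def letter_frequencies(words):
    """Count how many words contain each letter (once per word)."""
    counts = Counter()
    for word in words:
        counts.update(set(word))
    return counts

def rank_by_coverage(words):
    """Sort words by the total frequency of their distinct letters."""
    freq = letter_frequencies(words)
    return sorted(words, key=lambda w: sum(freq[c] for c in set(w)), reverse=True)

# Tiny illustrative sample, not a real Wordle word list.
sample = ["RAISE", "COUNT", "BLOOD", "PROXY", "GROOM", "PROOF"]
print(letter_frequencies(sample).most_common(3))
print(rank_by_coverage(sample)[0])
```

Scoring by distinct letters penalizes repeated letters, which is one reason a frequency ranking says little about the second guess: the best follow-up depends on the feedback from the first.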


This has a bug in color response parsing. Try solving today’s wordle (230) with initial (manual) guess “GEESE”. It’ll fail and tell the user they screwed up.
