Wordle Start Words

Example completed Wordle

I have analysed commonly used start words for the game Wordle, which can be played here. A key concept in this analysis is that there is a fixed pool of possible solutions (1,735 at the time of writing, reducing by 1 each day). Each guess you make reduces the number of possible solutions, and once only 1 remains you have the answer.

The Criteria

For each chosen start word this analysis looks at the number of Remaining Solutions for each possible colour pattern and records 5 criteria:

  • Max – The worst case number of Remaining Solutions.
  • Entropy – The higher the number, the more the guess is expected to narrow down the answer compared with a guess that has a lower value. This isn’t a concept that can be described easily in a few words, but essentially it combines the probabilities of getting each colour pattern (eg bbbbb, bbbbg, bbbby, … where b is black, g green and y yellow) in such a way that the flattest distribution gives the highest value. Mathematicians call it the amount of information the guess will provide. Watch this video if you would like to know more about information theory entropy and how it is calculated.
  • One – The number of colour patterns that will uniquely identify the solution (Remaining Solutions of 1), ie there will be times when, given the chosen guess and the pattern that comes back, only 1 possible solution remains.
  • Two – The number of solution words, out of the 1,735 (at time of writing) available, that will have two or fewer Remaining Solutions after using the starting word.
  • Three – The number of solution words, out of the 1,735 (at time of writing) available, that will lead to three or fewer Remaining Solutions.
  • I have also indicated whether or not the potential best starting word can be the answer. There are 2 reasons that the answer can be No. Firstly, Wordle has a much longer list of words that will be accepted as guesses, but which will not solve the puzzle. Secondly, Wordle does not reuse answers, so once a word has already been a solution it can no longer be a solution.
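As a rough illustration of how these criteria could be computed, here is a minimal Python sketch. It is not the method used to produce the results in this post: the function names pattern and criteria are my own for illustration, the five-word list is a toy stand-in for the real solution list, and Entropy is computed as Shannon entropy in bits, which is an assumption and may be on a different scale from the Entropy figures reported here.

```python
from collections import Counter
from math import log2


def pattern(guess: str, solution: str) -> str:
    """Return the colour pattern for a guess: g green, y yellow, b black."""
    result = ["b"] * len(guess)
    unmatched = Counter()
    # First pass: mark greens and count solution letters not matched exactly.
    for i, (g, s) in enumerate(zip(guess, solution)):
        if g == s:
            result[i] = "g"
        else:
            unmatched[s] += 1
    # Second pass: mark yellows, using up unmatched solution letters.
    for i, g in enumerate(guess):
        if result[i] == "b" and unmatched[g] > 0:
            result[i] = "y"
            unmatched[g] -= 1
    return "".join(result)


def criteria(guess: str, solutions: list[str]) -> dict:
    """Group the solutions by colour pattern and summarise the group sizes."""
    sizes = list(Counter(pattern(guess, s) for s in solutions).values())
    total = len(solutions)
    return {
        "Max": max(sizes),
        # Assumption: Shannon entropy in bits of the pattern distribution.
        "Entropy": sum((n / total) * log2(total / n) for n in sizes),
        "One": sum(1 for n in sizes if n == 1),
        "Two": sum(n for n in sizes if n <= 2),
        "Three": sum(n for n in sizes if n <= 3),
    }


# Toy solution list, for illustration only.
toy_solutions = ["trace", "grace", "brace", "stare", "cigar"]
print(criteria("trace", toy_solutions))
```

Two and Three sum the group sizes because they count solution words, whereas One counts patterns; for groups of size 1 the two ways of counting give the same number.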

Results

Word    Max  Entropy  One  Two  Three  Answer?
about   284      751   25   51     66  y
adept   234      739   22   46     61  y
adieu   218      600   16   30     45  n
aisle   147      929   21   43     79  y
arise   116      939   23   43     61  y
arose   143      936   22   44     71  y
audio   331      694   21   41     59  y
clamp   449      803   28   54     72  y
clasp   347      911   34   48     81  y
close   190      938   20   60    102  y
cones   206      919   28   66     84  n
depot   284      769   24   46     79  n
hates   212      864   28   46     61  n
heart   208    1,094   30   64    103  y
lance   213    1,039   31   61     91  y
leant   164    1,047   29   53    110  y
meaty   252      787   21   53     68  y
ocean   182      870   34   56     70  y
opera   180      816   29   53     62  y
ouija   395      448   16   30     42  n
parse   206    1,166   32   76    118  y
pious   354      685   18   38     68  n
plaid   343      874   26   54     90  y
print   311      865   22   52     79  n
recap   243      861   23   53     86  n
roate   156      936   21   47     65  n
salet   165    1,147   31   69     96  n
scalp   347      823   27   47     71  y
serai   130      847   22   40     63  n
slate   165    1,146   32   66     93  n
slice   221      998   25   53     83  y
soare   143      981   21   41     68  n
stale   165    1,094   30   64    106  n
stare   173      994   18   44     80  y
steal   166      936   25   45     72  y
store   187      938   22   52     82  n
strap   271      896   28   58     88  y
tales   165      917   23   51     63  n
trace   203    1,169   32   76    100  n
train   204    1,060   33   55     94  n
tramp   330      788   27   47     74  y
trice   217    1,051   29   59     95  n
tried   243    1,044   32   64     79  n

Interpretation

Potential Answer

Theoretically, a guess that can’t be an answer could still be a good guess, but this is not supported by the evidence above. Words with “n” in the Answer? column don’t stand out in this list enough to warrant an unwinnable step. Thus, whilst the unwinnable TRACE has the highest Entropy and Two scores, it is only slightly better than the winnable PARSE, so I consider PARSE to be a better option.

Criteria Usefulness

None of these criteria are direct measures of the aim in playing the game. I play to find the word with as few guesses as I can. The best measure would therefore be about the number of guesses needed, which you would want to be less than 7. These criteria give numbers that can be much bigger than 7, meaning that small differences in them might rarely or never change the number of guesses needed.

Number-of-guesses criteria are not provided because such measures require dealing with very large numbers and using calculation techniques that are not easily emulated with spreadsheets.

The Max (Worst Case) criterion can be useful; however, it is the distribution of Remaining Solution sizes that is the key issue. Two (not real) extremes that illustrate the issue would be a word that splits the potential answers into 17 groups of 100 and 1 group of 35, versus another that splits them into 1 group of 400 and 1,335 groups of 1. The second distribution provides a better than 3 in 4 chance of finding the answer with one more guess, but its Worst Case is 4 times as large.
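The arithmetic behind these two extremes can be checked with a short Python sketch. The function name summarise is hypothetical, and the entropy figure is Shannon entropy in bits, which is an assumption and not necessarily the scale used for the Entropy column in the results.

```python
from math import log2


def summarise(sizes: list[int]) -> dict:
    """Summarise a list of Remaining Solution group sizes."""
    total = sum(sizes)
    return {
        "total": total,
        "max": max(sizes),
        # Share of solutions that land in a group of size 1, where the
        # next guess is certain to be the answer.
        "singleton_share": sum(1 for n in sizes if n == 1) / total,
        # Assumption: Shannon entropy in bits.
        "entropy_bits": sum((n / total) * log2(total / n) for n in sizes),
    }


first = summarise([100] * 17 + [35])      # 17 groups of 100 and 1 of 35
second = summarise([400] + [1] * 1335)    # 1 group of 400 and 1,335 of 1

print(first)
print(second)
```

Both distributions cover all 1,735 solutions. The second has the larger Max but also the larger entropy, which is why Max on its own can mislead.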

The Entropy criterion is a mathematically rigorous effort to evaluate which are the best distributions. However, depending on your strategy, it could be rational to put an even greater weight on small Remaining Solutions to improve the chances of quick answers at the cost of occasional slow answers, hence the One, Two and Three criteria.

Here is a table showing the number of words in categories of Remaining Solution sizes for a selection of the words in the table above.

Word     trace  parse  salet  heart  train  raise  irate  arise  meaty
Entropy  1,169  1,166  1,147  1,094  1,060    997    969    939    787
Answer?      n      y      n      y      y      y      y      y      y
>160       203    206    165    208    204      0      0      0    447
81-160     278    397    197    452    315    298    341    408    480
41-80      288    323    472    282    483    554    404    561    229
21-40      408    240    305    263    227    262    504    200    145
11-20      221    244    255    197    197    350    203    286    233
5-10       201    171    205    202    167    156    190    183    109
3-4         60     78     67     67     87     63     39     54     39
0-2         76     76     69     64     55     52     54     43     53
TOTAL    1,735  1,735  1,735  1,735  1,735  1,735  1,735  1,735  1,735
0-10       337    325    341    333    309    271    283    280    201
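One way a column of this table could be produced from a word's group sizes is sketched below in Python. The names BUCKETS and bucket_counts and the example group sizes are made up for illustration. The sketch assumes each cell is the number of solution words (the sum of group sizes) falling in that size range, which is consistent with every column totalling 1,735.

```python
# Size ranges as (label, low, high); high=None means no upper limit.
BUCKETS = [
    (">160", 161, None),
    ("81-160", 81, 160),
    ("41-80", 41, 80),
    ("21-40", 21, 40),
    ("11-20", 11, 20),
    ("5-10", 5, 10),
    ("3-4", 3, 4),
    ("0-2", 0, 2),
]


def bucket_counts(sizes: list[int]) -> dict:
    """Count how many solution words sit in groups of each size range."""
    counts = {}
    for label, low, high in BUCKETS:
        counts[label] = sum(
            n for n in sizes if n >= low and (high is None or n <= high)
        )
    counts["TOTAL"] = sum(sizes)
    return counts


# Made-up group sizes, for illustration only.
example_sizes = [203, 100, 90, 50, 30, 15, 7, 4, 2, 1, 1]
print(bucket_counts(example_sizes))
```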

Conclusions

  • There doesn’t appear to be a reason to use the allowed words that aren’t answers
  • Entropy is the most reliable criterion that is easily calculated
  • PARSE appears to be the best choice at the moment
  • The vast majority of high Entropy words have only two vowels and they are usually A and E
  • Three vowel higher Entropy words usually have A, E and I
  • Four vowel words are universally poor choices
  • Some 3 vowel words have reasonably high Entropy and significantly lower Max (good) but they also tend to have fewer small Remaining Solutions. The lower Entropy suggests that the low Max doesn’t make up for the poorer distribution.

Exit

Please leave comments if you feel there are ways that this analysis can be improved or corrected. Also, if you have start words that you would like me to analyse, let me know in the comments.

Published
Categorized as Maths