The 12dicts Word Lists

Alan Beale
June 3, 2007

The 12dicts package is a collection of English word lists.  They are different from other word lists which you can download from the Internet in the following ways:

I developed 12dicts about ten years ago out of dissatisfaction with the resources available on the Internet at that time; practically all lists I found were either enormous or of poor quality, and not infrequently both.  One of the original goals of 12dicts was to approximate a core vocabulary for English, which led me to the modus operandi of taking words from a set of twelve ESL and desk dictionaries.  As time passed, I expanded 12dicts to include more lists, most of which were not derived from the original dictionary set.  However, all of the lists used essentially the same methodology: establish a set of sources, and then include every word found in at least n of those sources, for some value of n.
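
As a rough illustration of this "at least n of the sources" rule, here is a minimal Python sketch.  It is not the tooling actually used to build 12dicts; the function name words_in_at_least and the three tiny sample sources are invented for the example, and real sources would first need consistent spelling and capitalization.

```python
from collections import Counter


def words_in_at_least(sources, n):
    """Return, in sorted order, the words found in at least n of the sources."""
    counts = Counter()
    for words in sources:
        # Convert to a set so a word repeated within one source counts once.
        counts.update(set(words))
    return sorted(word for word, hits in counts.items() if hits >= n)


# Three made-up "dictionaries", purely for illustration.
sources = [
    ["apple", "banana", "cherry"],
    ["apple", "cherry", "damson"],
    ["apple", "banana", "elder"],
]

core = words_in_at_least(sources, 2)
print(core)  # ['apple', 'banana', 'cherry']
```

Raising n makes the resulting list smaller and more conservative; lowering it admits words that only a few sources agree on.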

I consider 12dicts to have been very successful.  It has been used in applications ranging from word games to literacy programs.  It has influenced a body of open-source software, including the aspell spelling checker and OpenOffice.org Writer.

I have now completed Release 5 of 12dicts, which adds two new lists to the package: a lemmatized list, which divides the vocabulary of previously released lists into sets of closely related words, and a rearrangement of these lemmas into bands based on their frequency of use.  Release 5 of 12dicts can be downloaded here.  As with the previous versions, your comments and suggestions are welcome, and I would be pleased to learn how 12dicts is being used.
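
To make the idea of lemmas and frequency bands concrete, here is a small Python sketch.  Everything in it is an assumption made for illustration: the sample lemmas, the frequency counts, the function name band_lemmas, and the choice of equal-sized bands are all invented, and none of it reflects the actual file format or banding criteria of the Release 5 lists.

```python
def band_lemmas(lemma_freq, band_count):
    """Split lemmas into at most band_count bands, most frequent first."""
    # Rank by descending frequency; break ties alphabetically.
    ranked = sorted(lemma_freq, key=lambda lemma: (-lemma_freq[lemma], lemma))
    # Ceiling division gives the band size; max() guards against empty input.
    size = max(1, -(-len(ranked) // band_count))
    return [ranked[i:i + size] for i in range(0, len(ranked), size)]


# Hypothetical lemmas: each headword maps to its closely related forms.
lemmas = {
    "be": ["am", "is", "are", "was", "were", "been", "being"],
    "walk": ["walks", "walked", "walking"],
    "ox": ["oxen"],
    "wend": ["wends", "wended", "wending"],
}

# Invented frequency counts, one per lemma.
lemma_freq = {"be": 1000, "walk": 50, "ox": 5, "wend": 1}

for number, band in enumerate(band_lemmas(lemma_freq, 2), start=1):
    for headword in band:
        print(number, headword, ", ".join(lemmas[headword]))
```

Each printed line gives a band number, a headword, and the related forms grouped under it, so the most common lemmas appear in the lowest-numbered band.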



To comment on this page, e-mail Alan at wyrdplay.org