Top 100 words in English

Posted: Wed Jul 16, 2003 5:35 pm
by WidreMann
6.18% the
4.23% is, was, be, are, 's (= is), were, been, being, 're, 'm, am
2.94% of
2.68% and
2.46% a, an
1.80% in, inside (preposition)
1.62% to (infinitive verb marker)
1.37% have, has, 've, 's (= has), had, having, 'd (= had)
1.27% he, him, his
1.25% it, its
1.17% I, me, my
0.91% to (preposition)
0.86% they, them, their
0.86% not, n't, no (interjection)
0.83% for
0.83% you, your
0.70% she, her
0.65% with
0.64% on
0.62% that (conjunction)
0.58% this, these
0.57% that (demonstrative), those
0.55% do, did, does, done, doing
0.51% we, us, our
0.50% by
0.47% at
0.45% but (conjunction)
0.44% 's (possessive)
0.41% from
0.40% as (many parts of speech)
0.37% which
0.37% or
0.31% will, 'll
0.28% said, say, says, saying
0.25% would
0.25% what
0.23% there (existential, in "there is ..." phrases)
0.23% if
0.23% can
0.22% all
0.22% who, whose
0.21% so (adverb / conjunction)
0.20% go, went, gone, goes
0.20% more
0.19% other, another
0.19% one (numeral)
0.18% see, saw, seen, seeing
0.18% know, knew, known, knows, knowing


These 100 words make up 43% of all words in terms of usage.
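
For anyone who wants to check the 43% figure, here is a quick Python sketch that just adds up the percentages as posted above. The variable names are made up for illustration. Note that the list has 48 grouped entries, so the "100 words" presumably counts the individual forms within each group; that reading is an assumption, not something the list states.

```python
# Percentages copied from the list above, in the order posted
# (one value per grouped entry).
percentages = [
    6.18, 4.23, 2.94, 2.68, 2.46, 1.80, 1.62, 1.37, 1.27, 1.25,
    1.17, 0.91, 0.86, 0.86, 0.83, 0.83, 0.70, 0.65, 0.64, 0.62,
    0.58, 0.57, 0.55, 0.51, 0.50, 0.47, 0.45, 0.44, 0.41, 0.40,
    0.37, 0.37, 0.31, 0.28, 0.25, 0.25, 0.23, 0.23, 0.23, 0.22,
    0.22, 0.21, 0.20, 0.20, 0.19, 0.19, 0.18, 0.18,
]

# Combined share of all running words covered by the listed groups.
total = sum(percentages)

print(f"{len(percentages)} grouped entries, {total:.2f}% of usage")
```

The total comes out just over 43%, which matches the figure quoted.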

Posted: Wed Jul 16, 2003 5:53 pm
by ColdFront77
Interesting information about our English language, WidreMann... thank you for posting it.

Posted: Wed Jul 16, 2003 7:21 pm
by wx247
Interesting... Thanks for posting it.

Posted: Wed Jul 16, 2003 8:53 pm
by Colin
Thanks for the info! ;)

Posted: Thu Jul 17, 2003 12:29 am
by JetMaxx
I just can't believe the word "sex" didn't make the top 100. It's all I seem to hear on television nowadays (why I rarely watch) :D :D

Posted: Thu Jul 17, 2003 6:42 am
by coriolis
...or certain four-letter words. Of course they wouldn't make it into a scholarly work.

Posted: Thu Jul 17, 2003 6:44 am
by coriolis
WidreMann, what document(s) were used for this word study?

Posted: Thu Jul 17, 2003 8:12 am
by JCT777
Great list, WidreMann! Thanks for posting it. 8-)

Posted: Thu Jul 17, 2003 10:51 am
by raine
Great post, WidreMann! I find things like this fascinating; they make me stop and ponder... can't get enough of that :D

Thanks for sharing, my friend. Have a great day!

Blessings, Raine

Posted: Thu Jul 17, 2003 12:30 pm
by WidreMann
You'll notice that most of these words have a purely grammatical function, or are hybrids (such as "do" and "be", which are used mostly as auxiliaries, but also have content meanings).

The corpus used to come up with these numbers was a collection of written works, speeches and conversations. Of course, there is no good way to come up with an exact count of English words or their frequency in the language, but it's the best that can be done.

Here's the source: http://www.invisiblelighthouse.com/langlab/bncwords.html
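
In case it helps to see how numbers like these come out of a corpus, here is a minimal Python sketch. It is not the method used for the linked list, just an illustration under some assumptions: the `GROUPS` table and the `group_frequencies` function are hypothetical, only a few groups are filled in, the tokenizer is a crude regex, and homographs (like the two kinds of "to", or 's) are not told apart, which a real study would need a part-of-speech tagger for.

```python
import re
from collections import Counter

# Hypothetical mapping from word forms to the grouped entries used in the
# list. Only a handful of groups are shown; anything not listed here is
# counted under its own spelling.
GROUPS = {
    "a": "a, an", "an": "a, an",
    "he": "he, him, his", "him": "he, him, his", "his": "he, him, his",
    "see": "see, saw, seen, seeing", "saw": "see, saw, seen, seeing",
    "seen": "see, saw, seen, seeing", "seeing": "see, saw, seen, seeing",
}

def group_frequencies(text):
    """Return each group's share of all word tokens, as a percentage."""
    # Crude tokenizer: lowercase the text and keep runs of letters only.
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return {}
    counts = Counter(GROUPS.get(tok, tok) for tok in tokens)
    return {group: 100.0 * n / len(tokens) for group, n in counts.items()}

# Toy "corpus" of 11 words, just to show the shape of the output.
sample = "The cat saw a dog and the dog saw an owl"
freqs = group_frequencies(sample)

for group, pct in sorted(freqs.items(), key=lambda item: -item[1]):
    print(f"{pct:.2f}% {group}")
```

With a real corpus you would feed in millions of words instead of one sentence, and the grouping table would be the hard part, since that is where the judgment calls about what counts as "the same word" get made.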