Lesson 26 - lab exercise 2 - Countwords

Background:

1.   This lab exercise will count the occurrences of words in a text file.  Here are some special cases:

      sixty-three   counts as one word

      joyous - sparkling    counts as two words; a hyphen (-) that separates two words will have a blank space on each side

      'tis   counts as one word

      can't  counts as one word

      "The" and "the" both count as occurrences of the word "the".  In other words, convert all capital letters to lower case before counting a word.

2.   You are encouraged to use a combination of all the programming tools you have learned so far:

      Data structures        Algorithms
      ---------------        ----------
      apvector class         sorting
      apstring class         searches
      structures             text file processing
      fstreams
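
The special cases in item 1 can be handled one token at a time.  The following is a minimal sketch, not a required design.  It assumes the standard std::string and std::vector as stand-ins for apstring and apvector, and the helper names normalizeWord and splitWords are made up for illustration.  It splits a line on white space, trims punctuation from both ends of each token, and converts the rest to lower case.

```cpp
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: trim characters from both ends of a token that are
// not letters, digits, or apostrophes, then convert the rest to lower case.
// Interior hyphens and apostrophes are kept, so "sixty-three" and "can't"
// each stay one word. A lone "-" trims down to an empty string.
std::string normalizeWord(const std::string& token) {
    auto keep = [](unsigned char c) { return std::isalnum(c) || c == '\''; };
    std::size_t first = 0;
    std::size_t last = token.size();
    while (first < last && !keep(token[first])) ++first;
    while (last > first && !keep(token[last - 1])) --last;

    std::string word;
    for (std::size_t i = first; i < last; ++i) {
        unsigned char c = static_cast<unsigned char>(token[i]);
        word += static_cast<char>(std::tolower(c));
    }
    return word;
}

// Hypothetical helper: split a line on white space and return the
// normalized words, skipping tokens that normalize to nothing.
std::vector<std::string> splitWords(const std::string& line) {
    std::vector<std::string> words;
    std::istringstream in(line);
    std::string token;
    while (in >> token) {
        std::string word = normalizeWord(token);
        if (!word.empty()) words.push_back(word);
    }
    return words;
}
```

With this sketch, splitWords("The joyous - sparkling sixty-three, 'tis") yields the, joyous, sparkling, sixty-three and 'tis.  One known limitation: because apostrophes are kept at the ends of a token, a word wrapped in single quotes keeps its quote marks.  Your data file may call for a different rule.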

Assignment: 

1.   Your instructor will provide you with a data file to analyze.  Parse the file and print out the following statistical results:

      Total number of unique words used in the file.

      Total number of words in the file.

      The 30 most frequently occurring words, sorted in descending order by count.

      For example:

Number of words used = 521
Total # of words = 1573

  1   103   the
  2    97   of
  3    59   to
  4    43   and
  5    36   a
  6    32   we
  7    32   be
  8    26   will
  9    24   that
 10    21   is

... rest of top 30 words ...

 

2.   Turn in your source code and run output.
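
One possible way to organize the counting and the report in item 1 is sketched below.  This is only a sketch under stated assumptions: it uses std::vector and std::string in place of apvector and apstring, and the names WordCount, addWord, sortByCount, totalWords and printReport are invented for illustration.  It uses a linear search to find each word and a library sort on the counts.  You may prefer to write your own search and sort.

```cpp
#include <algorithm>
#include <cstddef>
#include <iomanip>
#include <ostream>
#include <string>
#include <vector>

// Hypothetical record pairing a word with its number of occurrences.
struct WordCount {
    std::string word;
    int count;
};

// Record one occurrence of word, using a linear search of the table.
void addWord(std::vector<WordCount>& table, const std::string& word) {
    for (std::size_t i = 0; i < table.size(); ++i) {
        if (table[i].word == word) {
            ++table[i].count;
            return;
        }
    }
    table.push_back(WordCount{word, 1});
}

// Sort by descending count. stable_sort keeps tied words in the
// order in which they were first added to the table.
void sortByCount(std::vector<WordCount>& table) {
    std::stable_sort(table.begin(), table.end(),
                     [](const WordCount& a, const WordCount& b) {
                         return a.count > b.count;
                     });
}

// The total number of words is the sum of all the counts.
int totalWords(const std::vector<WordCount>& table) {
    int total = 0;
    for (std::size_t i = 0; i < table.size(); ++i) total += table[i].count;
    return total;
}

// Print the two totals, then up to topN rows of rank, count, and word.
void printReport(const std::vector<WordCount>& table, std::size_t topN,
                 std::ostream& out) {
    out << "Number of words used = " << table.size() << "\n";
    out << "Total # of words = " << totalWords(table) << "\n\n";
    for (std::size_t i = 0; i < topN && i < table.size(); ++i) {
        out << std::setw(3) << (i + 1) << std::setw(6) << table[i].count
            << "   " << table[i].word << "\n";
    }
}
```

A program built on this sketch would call addWord once for every word read from the data file, then sortByCount(table), then printReport(table, 30, std::cout).  The linear search is simple but slow for large files.  Keeping the table sorted by word and using a binary search is one alternative worth considering.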