I need help with this Java program. The expected output and a zip file containing the Java files are attached.
- Attachment 1
- Attachment 2
- Attachment 3
Web search engines use a variety of information to determine the most relevant documents for a query. One important factor (especially in early search engines) is the frequency of occurrences of the query values in a document. In general, one can try to answer the question of how similar or dissimilar two documents are based on the similarity of their word frequency counts (relative to the document size). A necessary step in answering these types of questions is to compute the word frequencies of all words in a document. This step requires many search operations to be done within a word database. In order to perform these search operations in an efficient manner, hash tables can be used as the data structure of this database.

In this assignment, you will write a method wordCount that reads a file (document) and outputs (into another file) all the words encountered in the document along with their number of occurrences. The method should have the following signature:

public static void wordCount(String inputFileName, String outputFileName)

While implementing method wordCount, please use a hash table with separate chaining to keep the current counts of words already encountered in the input file. Here, words are defined to be simply strings of characters between two delimiting characters, which include a space and punctuation characters.

- Assuming that something like "Father's" is two words ("Father" and "s", because they are separated by delimiters) is OK for our purposes.
- For simplicity, assume any derivative words to be distinct, e.g. "book" and "books", "eat" and "eating" are all considered distinct.
- Do not distinguish words that only differ in upper or lower case of their characters, e.g. "Father" and "father" correspond to the same word. You can use appropriate methods of the String class to handle this easily (e.g. the String toLowerCase method).
- To extract words from an input string, you can use String.split() or the Java class StringTokenizer (which is sometimes viewed as deprecated, but it is not; it is considered a "legacy" class) to save yourself some programming.

The general procedure for obtaining word counts should include the following steps:

1. Scan in the next word.
2. Search for this word in the hash table.
3. If not found, insert a new entry for this word with an initial count of 1; otherwise increment the count.
4. If you're inserting a new word, check if the hash table needs to be expanded.

Hash tables should not require a (constant) table size to be provided. Therefore, implementations that use a constant hash table size will be penalized.
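
For reference, here is a minimal sketch of how the four steps could fit together, not the assignment's official solution. Several details are my own assumptions, since the attachments are not reproduced here: the class name WordCounter, the helper names add, count, capacity and addLine, the starting size of 8, the 0.75 load factor, the "word count" output line format, and treating every character other than a-z and 0-9 as a delimiter.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Locale;

// Hypothetical class name; the assignment only fixes the wordCount signature.
final class WordCounter {

    /** One link in a bucket's chain: a word and its running count. */
    private static final class Node {
        final String word;
        int count = 1;          // a newly inserted word starts at 1
        Node next;
        Node(String word, Node next) { this.word = word; this.next = next; }
    }

    private Node[] buckets = new Node[8]; // starting size only (assumed); the table grows
    private int size;                     // number of distinct words stored

    private static int indexFor(String word, int capacity) {
        return (word.hashCode() & 0x7fffffff) % capacity; // mask keeps the index non-negative
    }

    /** Steps 2 to 4: search the chain, increment on a hit, insert and maybe grow on a miss. */
    void add(String word) {
        int i = indexFor(word, buckets.length);
        for (Node n = buckets[i]; n != null; n = n.next) {
            if (n.word.equals(word)) { n.count++; return; }
        }
        buckets[i] = new Node(word, buckets[i]);
        size++;
        if (size > buckets.length * 3 / 4) resize(); // assumed load factor of 0.75
    }

    /** Returns the current count for a word, or 0 if it was never seen. */
    int count(String word) {
        for (Node n = buckets[indexFor(word, buckets.length)]; n != null; n = n.next) {
            if (n.word.equals(word)) return n.count;
        }
        return 0;
    }

    int capacity() { return buckets.length; }

    /** Doubles the bucket array and relinks every node into its new chain. */
    private void resize() {
        Node[] old = buckets;
        buckets = new Node[old.length * 2];
        for (Node head : old) {
            while (head != null) {
                Node next = head.next;
                int i = indexFor(head.word, buckets.length);
                head.next = buckets[i];
                buckets[i] = head;
                head = next;
            }
        }
    }

    /** Step 1: lower-case the line, then split on anything outside a-z and 0-9 (assumed delimiters). */
    void addLine(String line) {
        for (String w : line.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!w.isEmpty()) add(w); // split can yield a leading empty string
        }
    }

    public static void wordCount(String inputFileName, String outputFileName) {
        WordCounter table = new WordCounter();
        try (BufferedReader in = Files.newBufferedReader(Paths.get(inputFileName));
             PrintWriter out = new PrintWriter(Files.newBufferedWriter(Paths.get(outputFileName)))) {
            String line;
            while ((line = in.readLine()) != null) table.addLine(line);
            for (Node head : table.buckets) {
                for (Node n = head; n != null; n = n.next) out.println(n.word + " " + n.count);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e); // keeps the required signature free of a throws clause
        }
    }
}
```

Calling WordCounter.wordCount("input.txt", "output.txt") (placeholder file names) would write one line per distinct word, in bucket order rather than sorted. If the expected output uses a different order or format, only the final loop in wordCount should need to change.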