Programming with
Data Structures and Algorithms

Nile

We're going to do Nile again, this time without parallelism. However, if you thought we were going to let you get away with implementing it in Racket, you must be in de-Nile!

Program Specification

Please re-read the parallelism assignment. You are being asked to support two kinds of functionality: recommending a book and recommending a pair.

You must define any data structures you need yourself: you may not use built-in Java datatypes except when we indicate otherwise. You must also define your datatypes according to the design style laid out in this course.

The recommendation lists your program needs as input will be in files stored in a directory specified by the first command-line argument (accessible as args[0] in your main method). Each list will be stored in its own file, and every file in the directory will be an input list. Inside the lists every line will contain the description of a single book. Each book has a unique and unambiguous description; that is, if two lines in two different input lists are identical, they refer to the same book. Otherwise they refer to different books. Input lists will always contain at least two books, and they will never contain duplicates.

To retrieve the input lists, you should use the NileIO class we provide. It has two public static methods:

Support Code

We provide you the support code in a jar format, along with the javadoc.

We do not provide you with the source code (.java file) because we do not want you to concern yourself with the implementation of the classes we've written for you.

We would like you to use DrJava as your IDE. You will need to make sure the NileSupport jar is in your classpath. To do this, go to Project/Project Preferences, and then add the .jar to the classpath.

Part 1

Write a program in Java that performs the above tasks. Your program should consist of a file, Nile.java, containing a class with a main method, along with any other files you may need. Your program will be run in one of two ways:

  1. When run with one argument (the directory path), your program should print a list of the most popular pairs of books. The count should be first, followed by a newline, and each pair should be printed on its own line with only a plus sign separating the two books in the pair. For example,
              12
              Lewis Carroll, Alice in Wonderland+Douglas Hofstadter, Godel, Escher, Bach
              Frank Herbert, Dune+Arthur C. Clarke, Childhood's End
              Jane Austen, Emma+Emily Bronte, Wuthering Heights
            
    Each pair of books should only appear once, and order is irrelevant. Note that
              Lewis Carroll, Alice in Wonderland+Douglas Hofstadter, Godel, Escher, Bach
            
    and
              Douglas Hofstadter, Godel, Escher, Bach+Lewis Carroll, Alice in Wonderland
            
    count as the same pair.
  2. Your program may also be run with two arguments, accessible as args[0] and args[1] in your main method. The first argument will once again be the directory path, and the second argument will be a book description string in the same format as the input lists. Your program should print a list of the books most frequently paired with it, preceded by the count as above. For example,
              7
              Lewis Carroll, Alice in Wonderland
              Larry Niven, Ringworld
              N.K. Stouffer, The Legend of Rah and the Muggles
            
    If the given book description cannot be found in any input list, your program should print an error message and exit.
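
One possible way to make the two orderings of a pair count as the same pair is to store the two descriptions in a fixed order when the pair is created. The sketch below is only an illustration: the class name BookPair and its methods are hypothetical and are not part of the support code. It uses nothing beyond String, since book descriptions are lines of text.

```java
// Hypothetical helper for illustration only; not part of the NileSupport jar.
final class BookPair {
    private final String first;   // the description that sorts earlier
    private final String second;  // the description that sorts later

    BookPair(String a, String b) {
        // Store the descriptions in lexicographic order so that
        // (a, b) and (b, a) are represented identically.
        if (a.compareTo(b) <= 0) {
            this.first = a;
            this.second = b;
        } else {
            this.first = b;
            this.second = a;
        }
    }

    // Two pairs are the same when they hold the same two descriptions.
    boolean samePair(BookPair other) {
        return first.equals(other.first) && second.equals(other.second);
    }

    // Output format from the specification: only a plus sign between the books.
    String format() {
        return first + "+" + second;
    }
}
```

With this choice a pair always prints with the lexicographically smaller description first. That may differ from the order shown in the examples above, which is acceptable because order is irrelevant.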

No matter which way your program is run, it should print an error message and exit if the given directory is inaccessible or empty. If one of the input list files in the directory is inaccessible, your program should ignore that file and continue.

Note: Java automatically splits command line arguments at spaces, so if you gave as arguments

     /some/path Edwin A. Abbott, Flatland
you would get the args array
    ["/some/path", "Edwin", "A.", "Abbott,", "Flatland"]
To avoid this, you must put quotes around your book descriptions, like so:
    /some/path "Edwin A. Abbott, Flatland"
resulting in the correct args array
    ["/some/path", "Edwin A. Abbott, Flatland"]
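
The effect of quoting can be seen in a minimal sketch that looks only at the number of arguments. The class name ArgsSketch and the method describe are hypothetical, and the usage message for any other argument count is an assumption, since the specification covers only the one-argument and two-argument cases.

```java
// Hypothetical sketch for illustration only; your real entry point belongs in Nile.java.
final class ArgsSketch {
    // Reports which of the two modes the given arguments select.
    static String describe(String[] args) {
        switch (args.length) {
            case 1:
                return "most popular pairs in " + args[0];
            case 2:
                return "books most often paired with \"" + args[1] + "\" in " + args[0];
            default:
                // Assumption: the specification does not say what to do here.
                return "usage: java Nile <directory> [\"<book description>\"]";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(args));
    }
}
```

Run with an unquoted description, this sketch receives five arguments and falls through to the usage message. Run with the quoted description, it receives two and selects the single-book mode.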

Your goal in this assignment is to produce a well-designed, working implementation. You do not need to focus on efficiency. Comment appropriately.

You may use the NileIO class we provide, but no outside code. To utilize the methods, import nile.* at the top of your main class.

Part 2 – The Task

We also want you to answer the following question. Suppose you wanted your program to be interactive and process several requests in a short period of time. In particular, if your program is part of the back-end of a popular server with lots of data, it should respond in as close to constant time as possible. How would you modify your solutions to Part 1 to obtain this? You don't need to provide code; just discuss the algorithms and data structures you would use, and how you might modify the design overall.

Turning in:

Your pdf file must be on Letter-sized paper. Mathematical content must be formatted appropriately. Please, no Arial.