See attached file

THE UNIVERSITY OF THE WEST INDIES COMP4217: Introduction to Database Principles FINAL PROJECT (Version 1.0) Semester 2, 2020/21 To be attempted by groups of up to 4 students for 15 coursework marks Section A: Project Brief: Consider the problem of monitoring the incidences of an infectious disease in a small country with a population less than 10 million people. The monitoring process involves managing data on:  Patient visits to a health facility  The results of tests for the disease  New cases of the disease discovered from testing  Periodic tests carried out on infected cases  Patients who recover from infection  Patients who have died from the disease A critical part of the process involves daily generation of statistics on the number of new infections, how many patients have recovered, and (sadly), how many have succumbed to the disease and died.

Your mission is to design, develop, and demonstrate a database that can be used to facilitate the process of monitoring infectious diseases. A description of the processes involved in monitoring the disease is given in Section B.

The statement of deliverables is given in Section C. There are also three appendices to this document containing specification of requirements for SQL scripts to update the database, and scripts to generate data for reports. Section B: Process Description: The process for trackin g infectious diseases is as follows: 1. A person who visits a health care facility is asked to complete a registration form indicating name, date of birth, address, gender, and other demographic data. 2. If the person has been a patient at the facility before then his/her medical record (including a unique patient identifier number) is retrieved and attached to the registration form. Please note, that a registration must be done for each visit. 3. If the patient is attending this health center for the first time he/she is assigned a unique patient identification number (a simple serial number). 4. The patient is tested for the infectious disease and the result of the test is recorded. The data for a test includes the date of the test, the doctor who requested the test, the technician who administered the test, and the result of the test (positive, negative, or indeterminate). A patient can be tested many times but only one test is allowed per day. 5. If the test is positive then a new case is recorded. Data for a new case includes the patient identification number, test result, and general comments regarding the patient. The new case is recorded as having status active. 6. Daily tests are recorded for the active cases. 7. If the result of a daily test is negative then the status of the case is recorded as recovered. 8. If a patient dies then a record of the death is made detailing the date of the death. COMP4217 FINAL PROJECT 2020/21: A DATABASE FOR MONITORING INCIDENCES OF AN INFECTIOUS DISEASE Section C: Project Deliverables: The project has four (4) deliverables for a total of 100 points. I. A written report (MS Word, or PDF) is to be submitted containing:

a) An Entity-Relationship diagram to depict the main entities and relationships in the database for infectious diseases. (10 points) b) A relational model derived from your ER model identifying each relation and the primary key for each relation. (10 points) c) A database schema detailing attributes, primary keys, foreign keys, and constraints. (10 points) II. An archive (zip format) containing the following SQL scripts a. A script file named make-did.sql with SQL commands to create the database. (2 points) b. A script file named load-did.sql with SQL commands to load database data from a set of text files containing comma separated values.

(2 points) c. A script file named dump-did.sql with SQL commands to save the data in each table as a CSV text file (2 points) d. Script files containing SQL commands to achieve each task listed in Appendix A. Each script file you should be named as indicated in each case. (8 points) e. Script files containing SQL commands to retrieve the data for each report listed in Appendix B. Each query should be saved in a script file named as indicated. (8 points) f. An SQL query to generate data other than any specified in Appendices A or B. (2 points) III. Each group will make a 10 minute presentation of the work done for the project.

The presentation shall highlight the database design decisions that were taken and justification for those decisions. Your presentation will be graded for: a. Quality of presentation (grammar, format) (6 points) b. Functionality as demonstrated by execution of at least two (2) update queries and three (3) reporting queries. (10 points) c. Discussion of database design issues to include (but not limited to: i. Normalization ii. Query optimization iii. Security (10 points) IV. Peer Review form (included as Appendix C). Each group member is required to submit one form. The Peer Review will be used to adjust each group member’s final score by adding (or subtracting) from the 80 points achieved by the group.

(20 points) Section D: Project Dates  All written reports and accompanying archives shall be submitted by midnight Tuesday April 13, 2021.  Presentations will be held on Thursday April 15, 2021. COMP4217 FINAL PROJECT 2020/21: A DATABASE FOR MONITORING INCIDENCES OF AN INFECTIOUS DISEASE APPENDIX A – SQL QUERIES FOR UPDATING THE DATABASE 1) Registering a patient (register.sql) Inputs: Patient Identification Number (PIN), last name, first name, address, gender, symptoms 2) Recording a test (update-test.sql) Inputs: Patient identification Number, test date, test result (Negative, Positive, Indeterminate), technician who performed the test, lab where the test was done, doctor who ordered the test. 3) Recording a new case (add-new-case.sql) Inputs: Patient Identification Number, data case was diagnosed, doctor who ordered the test. 4) Recording a recovery (update-recovery.sql) Inputs: date of recovery, doctor who discharged the patient 5) Recording a death (update-death.sql) Inputs: Date of death, doctor who pronounced the death COMP4217 FINAL PROJECT 2020/21: A DATABASE FOR MONITORING INCIDENCES OF AN INFECTIOUS DISEASE APPENDIX B - SQL QUERIES FOR GENERATING REPORT DATA 1) Daily Summary (daily-summary.sql) Report Date: DD/MM/YY New Cases: 999999 Recovered: 999999 Deaths: 999999 Active Cases: 999999 2) Case Details (case-details.sql) Case #: 9999999 Status: XXXXXX (Active/Reovered) Patient Id. #: 9999999 Name: XXXXXXXX XXXXXXXXXXX Age: 99 Gender: X Parish: XXXXXXXXXXXXXXXX 3) Test Profile - Ordered by Date of Test (test-profile1.sql) Case # Date Result ------------------------------------------------------------------------- 999999 DD/MM/YYYY XXXX 4) Test Profile - Ordered by Case # (test-profile2.sql) Case # Date Result ------------------------------------------------------------------------- 999999 DD/MM/YYYY XXXX 5) Active Cases – Ordered by Parish (active-cases1.sql) Last Name First Name Age Parish ___________ XXXXXXXXXX XXXXXXXXXXXXXX 999 XXXXXXXXXXXXXXX XXXXXXXXXX XXXXXXXXXXXXX 999 XXXXXXXXXXXXXXX 6) Number of Active Cases by Parish (active-cases2.sql) Parish Number of Cases Kingston 9999 … St. Thomas 9999 COMP4217 FINAL PROJECT 2020/21: A DATABASE FOR MONITORING INCIDENCES OF AN INFECTIOUS DISEASE 7) Number of Active Cases by Age Group (active-cases3.sql) Age Group Number of Cases 0-18 99999 18 -35 99999 35 -50 99999 50 and above 99999 Total 99999 8) Number of Deaths by Age Group (deaths1.sql) Age Group Number of Deaths 0-18 99999 18 -35 99999 35 -50 99999 50 and above 99999 Total 99999 *** END OF APPENDIX B *** COMP4217 FINAL PROJECT 2020/21: A DATABASE FOR MONITORING INCIDENCES OF AN INFECTIOUS DISEASE PEER REVIEW REPORT ACTIVITY PLEASE RATE EACH MEMBE’S CONTRIBUTION ON THE SCALE 0 TO 4 WHERE 0 = No Input and 4 = Equal Contribution by all) Team Member #1 Team Member #2 Team Member # 3 Team Member #4 Data Modelling Database Schema Testing Report Preparation Preparation of Presentation