Develop a plan to integrate this new data warehouse with an Internet application. Include in your plan:A description of the transformation processA schema diagram identifying the changes needed (revis
Introduction
There are certain discrepancies that may arise such as duplication of data. We will identify each table with a primary key which ought to be unique. A primary key is the candidate key that is selected to identify things uniquely within the relation. We will also employ the use of foreign keys. A foreign key is an attribute within one relation that matches the primary key of some relation. In some instances, it could be the same relation.
We will also use foreign key with cascade delete to enforce data quality. This means that, if a record in the parent table is deleted then the corresponding record in the child table should also be deleted automatically.
We will also use data types whereby a user cannot enter something that is contrary to the data type. For example, the attribute AmountOfLoans from table FFEL ought to be a currency, so we need to use a data type that supports currency such as DECIMAL(15,2).
DLGradPlus
PK DLGradPlusId
Recepients
NoOfLoans
AmountOfLoans
NoOfDisbursement
AmountOfDisbursement
FK DLId
DLParentPlus
PK DLParentPlusId
Recepients
NoOfLoans
AmountOfLoans
NoOfDisbursement
AmountOfDisbursement
FK DLId
FFEL
PK FFELId
Recepients
NoOfLoans
AmountOfLoans
NoOfDisbursement
FFELLoanType
FK LoandId
DirectLoanUnsubsidized
PK DLUnsubsidizedId
Recepients
NoOfLoans
AmountOfLoans
NoOfDisbursement
AmountOfDisbursement
FK DLId
DirectLoanSubsidized
PK DLSubsidizedId
Recepients
NoOfLoans
AmountOfLoans
NoOfDisbursement
AmountOfDisbursement
FK DLId
Grants
PK GrantId
GrantProgram
FK SchoolCode
FederalPellGrant
PK FederalPellfrantId
SumOfRecepients
SumOfDisbursement
FK GrantId
TeachGrant
PK TeachGrantId
SumOfRecipents
SumOfDisbursements
FK GrantId
IraqAfghanGrant
PK IraqAfghanGrantId
SumOfRecepients
SumOfDisbursements
FK GrantId
GainfulEmployment
PK GEId
CIPCode
CIPName
CredentialLevel
RepaymentRate
RepaymentNumerator
RepaymentDenominator
DirectLoans
PK DLId
LoanType
FK LoanId
Loans
PK LoanId
LoanType
FK SchoolCode
School
PK SchoolCode
SchoolName
Address
City
State
Zip
Province
Country
PostCode
SchoolType
Physical database
School (SchoolCode PK, SchoolName,Address, City, State, Zip, Province, Country, PostCode, SchoolType)
Loans (LoanId PK, LoanType, SchoolCode FK)
DirectLoans (DLId PK, LoanType, LoanId FK)
DirectLoanSubsidized (DLSubsidizedId PK, Recipients, NoOfLoans, AmountOfLoans, NoOfDisbursement, AmountOfDisbursement, DLId FK)
GainfulEmployement (GEId PK, CIPCode, CIPName, CredentialLevel, RepaymentRate, RepaymentNumerator, RepaymentDenominator, SchoolCode FK)
DirectLoanUnsubsidized (DLUnsubsidizedId PK, Recipients, NoOfLoans, AmountOfLoans, NoOfDisbursement, AmountOfDisbursement, DLId FK)
DLParentPlus (DLParentPlusId PK, Recipients, NoOfLoans, AmountOfLoans, NoOfDisbursement, DLId FK)
DLGradPlus (DLGradPlusId PK, Recipients, NoOfLoans, AmountOfLoans, NoOfDisbursement, AmountOfDisbursement, DLId FK)
Grants (GrantId PK, GrantProgram, SchoolCode FK)
FFEL (FFELId PK, Recipients, NoOfLoans, AmountOfLoans, NoOfDisbursement, FFELLoanType, LoanId FK)
FederalPellGrant (FederalPellGrantId PK, SumOfRecipients, SumOfDisbursement, GrantId FK)
TeachGrant (TeachGrantId PK, SumOfRecipients, SumOfDisbursements, GrantId FK)
IraqAfghanGrant (IraqAfghanGrantId PK, SumOfRecipients, SumOfDisbursements, GrantId FK)
The bold attributes with PK means it is the Primary Key and the attributes with FK means it is the Foreign Key attribute.
SQL Queries
1. Creating the School table
CREATE TABLE `School` (
`SchoolCode` VARCHAR (10) NOT NULL,
`SchoolName` VARCHAR (255) NOT NULL,
`Address` VARCHAR (255) NOT NULL,
`City` VARCHAR (255) NOT NULL,
`State` VARCHAR (5) NOT NULL,
`Zip` VARCHAR (10) NOT NULL,
`Province` VARCHAR (50) NOT NULL,
`Country` VARCHAR (100) NOT NULL,
`PostCode` VARCHAR (20) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
2. Showing School table data
SELECT * FROM School;
3. Which schools generate the highest amount of debt for the average student who attends?
SELECT s.SchoolName, MAX(l.NoOfDisbursements) AS Debt FROM school s, DirectLoanUnsubsidized l GROUP BY s.SchoolName;
4. Employment rate for the students who graduate from these schools?
SELECT s.SchoolName, AVG(g.RepaymentRate) AS EmploymentRate FROM School s, GainfulEmployment g WHERE s.SchoolCode = g.SchoolCode GROUP BY s.SchoolName;
Best Practices
I have used primary keys to ensure that there are no duplicate records. In this project, I have also used foreign keys so as to ensure relationship among the tables. The use of aliases in creating calculated fields has been performed. I have ensured that you cannot put an empty data by using the SQL NOT NULL key word in attributes.