statistics project (using R)

STAT462: Final Project

Submission
You will submit your report through Canvas. Your report must be uploaded by the end of the day on Sunday, April 30th. If you have finals on Monday, May 1st, then plan accordingly. The due date will not be adjusted in any way.

Grading Guidelines
You will be graded based on 3 criteria:

1. Presentation: You will be graded on the organization and clarity of the report. Your final report must be written using RMarkdown, or some similar software, and turned in as a PDF. Text, summaries, and plots should be cleanly integrated. Nothing should be printed in the report that is not specifically discussed (i.e., don't print random plots or laundry lists of numbers). Altogether, your report should not exceed 15 pages (shorter is fine).

2. Statistical Tools: You will be graded on how well you incorporate the various tools discussed throughout the course into your analysis. You should include as much as you can, while still keeping criterion (1) in mind.

3. Scientific Merit: You will be graded on your ability to interpret the results from statistical analyses, plots, etc., as well as how well you incorporate them into the overarching goal of the report. The overall goal of the report is to carry out a complete scientific study, so do not focus only on the statistics!

Ethical Guidelines
This project is meant to be a reflection of your own work. You are not allowed to discuss the project with anyone other than the instructor, and even then, I will only provide basic clarifications. Anyone caught discussing the project with anyone other than the instructor will, at the very least, receive a failing grade in the course. If there is anything unusual about any of the reports, be prepared to be called in to my office for further discussion. Note that this could delay your final grade, so make sure that the report is uniquely your own work.

Project Details
The data come from a study on childhood obesity. Pregnant mothers were enlisted, and each child was then followed from birth to two years old. The researchers who collected these data were interested in understanding childhood obesity and especially what role (if any) the human microbiome plays. The human microbiome refers to the different bacteria that live inside or on humans. The following variables are available.

• id: id of subject
• weight_born: weight at birth (in kg)
• height_born: height at birth (in cm)
• weight_2yr: weight at 2 years
• height_2yr: height at 2 years
• child_buccal_fb: Firmicutes-to-Bacteroidetes ratio in the child's mouth. These are two phyla of bacteria. The ratio of their abundances is thought to play an important role in obesity.

• child_gut_fb: Firmicutes-to-Bacteroidetes ratio in the child's gut.
• mom_buccal_fb: Firmicutes-to-Bacteroidetes ratio in the mother's mouth.
• child_buccal_alpha: Alpha diversity of types of bacteria in the child's mouth. The alpha diversity is a summary of how diverse the types of bacteria are (i.e., whether there are lots of different types of bacteria).
• child_gut_alpha: Alpha diversity in the child's gut.
• mom_buccal_alpha: Alpha diversity in the mother's mouth.
• Treatment: There were two treatments as part of this study, with labels Safety and Parenting. Safety refers to the control group, while Parenting refers to the treatment group, in which the parents received nutritional training.
• Antibiotic: Whether the child received antibiotics in the first two years.

• Gender: Gender of the child.

Your goal is to play the role of the statistician in this study and analyze the data. You should clearly detail everything that you do in your report. IN ADDITION: you are to apply one tool from outside this class. It can be a tool from another class (one that we didn't discuss), from the book, or from your own searching. You should also briefly describe this tool and explain why you think it is interesting to apply to these data.

You can load the data into R by placing the file into your working directory and using the command

    load("child_obese_data.Rdata")

This will load a dataframe called child_obese:

    head(child_obese)
    ##   id weight_born height_born weight_2yr height_2yr child_buccal_fb
    ## 1  1    3.712337    51.81976         NA         NA              NA
    ## 2  2    2.066262    51.28491   11.32099   88.50137       23.630935
    ## 3  3    4.118248    54.07935   12.52760   83.78521        1.782404
    ## 4  4    3.948013    51.03525         NA         NA              NA
    ## 5  5    4.022174    53.36464   11.05147   83.23308        1.848309
    ## 6  6    3.895072    52.35753   14.96314   88.48936        1.758953
    ##   child_gut_fb mom_buccal_fb child_buccal_alpha child_gut_alpha
    ## 1           NA            NA                 NA              NA
    ## 2    0.2649587     22.532844           2.048551        1.646911
    ## 3    0.4243042      7.034536           3.190728        2.025753
    ## 4           NA            NA                 NA              NA
    ## 5    7.6557240      6.264417           3.622135        1.457000
    ## 6    0.3407116      2.243395           3.878499        1.922044
    ##   mom_buccal_alpha Treatment Antibiotic Gender
    ## 1               NA
    ## 2         2.693387 Parenting         No      M
    ## 3         1.857338    Safety         No      M
    ## 4               NA
    ## 5         2.137427 Parenting         No      M
    ## 6         3.274502 Parenting         No      F
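The printout above shows two kinds of gaps: NA in the numeric columns and blank entries in Treatment, Antibiotic, and Gender. The following is a minimal sketch of one way to inspect them, not a required approach. It rebuilds a few columns of the six printed rows as a stand-in so that it runs without the data file. It assumes the blank entries are stored as empty strings, which the printout suggests but does not confirm, and the names na_counts and complete_rows are only illustrative.

```r
# Stand-in for a few columns of the six rows printed by head(child_obese).
# With the real file available, use load("child_obese_data.Rdata") instead.
child_obese <- data.frame(
  id          = 1:6,
  weight_born = c(3.712337, 2.066262, 4.118248, 3.948013, 4.022174, 3.895072),
  weight_2yr  = c(NA, 11.32099, 12.52760, NA, 11.05147, 14.96314),
  height_2yr  = c(NA, 88.50137, 83.78521, NA, 83.23308, 88.48936),
  # Assumption: blank cells in the printout are empty strings, not NA.
  Treatment   = c("", "Parenting", "Safety", "", "Parenting", "Parenting"),
  stringsAsFactors = FALSE
)

# Number of NA values per column. Empty strings are not NA, so the blank
# Treatment entries are not counted at this point.
na_counts <- colSums(is.na(child_obese))
print(na_counts)

# Store Treatment as a factor with the control group ("Safety") as the
# reference level. Any value outside the two labels, including the blank
# entries, becomes NA.
child_obese$Treatment <- factor(child_obese$Treatment,
                                levels = c("Safety", "Parenting"))

# Keep only the rows in which every column is observed.
complete_rows <- child_obese[complete.cases(child_obese), ]
print(nrow(complete_rows))
```

Putting Safety first in levels means that model coefficients for Treatment would be read as Parenting relative to the control group. Whether to drop incomplete rows or handle them some other way is an analysis decision that the sketch does not settle.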