Camile Faith
Week 6 Discussion: Mini Scenarios
1. You are creating a database for a research study that is investigating the effectiveness
of trampoline jumping for improving cognitive functioning in Alzheimer’s patients.
Enrolled patients are being random ized into either intervention or control groups.
Patients in the intervention group will receive a total of 24 sessions over 3 months, which
are supposed to occur twice a week. The Principal Investigator (PI) wants a tally of the
number of sessions that each patient participated in, in order to confirm that patients
received the correct “dose” of the intervention. In your database you create a table to
record the specific date of each trampoline session for every patient. The PI wants to
simplify the database and suggests only recording the total number of sessions that
each patient participated in. Explain to the PI why collecting more “granular” data could
be worthwhile in this instance.
2. You are using an Electronic Medical Record (EMR) report to collect data on patients who have
received a specific procedure. Midway through the study you realize that the specific
procedure code that the report is using relates to more than one procedure, not only the
one you are studying, meaning that when that code appears in a patient’s record, they
may or may not have actually had the procedure you are interested in. The only way to
be sure is to review the notes entered in each patient’s EMR. Unfortunately, that will take
too long for you to do alone. Luckily, there are 2 interns in your office looking for
something to work on! How would you divide the work of reviewing the patient records
between the 3 of you in a way that would ensure that the data is being recorded
accurately and consistently? How would you avoid this problem in the future?
3. The health and human services department in your state has just compiled a massive
“all claims database” that includes all public and private insurance payments for every
procedure provided at all hospitals in the state over the past decade, a total of 4.3 billion
claims. A couple of your colleagues at the state university are exploring the data and
discover a very small but significant increase in the number of hearing tests provided in
months beginning with the letter J. They begin to hypothesize about why people may be
experiencing increased hearing problems during those months. How would you help
them understand the need for caution in interpreting patterns found in “big data”?
4. You are working on a study related to exposure to secondhand smoke using secondary
self-reported survey data that was collected by someone else for an earlier study. One of
the things you are interested in doing is comparing the number of subjects who reported
living with someone who smokes. You notice that that particular variable is coded as
“Yes” or is blank. There are no notes in the codebook, so you are not sure if a blank cell
should be interpreted as a “No” or as missing data. You contact the original
investigators, and they say that they accidentally forgot to include those responses in the
final dataset. They send you a list of the subjects who answered “No” to the question,
and let you know that data is missing for anyone who isn’t a “Yes” already, or isn’t on this
“No” list. Explain the general steps you would take to go about updating your dataset
without losing or corrupting any of the original data.
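
As a starting point for scenario 4, the sketch below shows one possible way to apply the "No" list without overwriting anything. It is only an illustration under stated assumptions: the field names (subject_id, lives_with_smoker, lives_with_smoker_recoded), the subject IDs, and the "Conflict" label for a subject who is both a "Yes" and on the "No" list are all hypothetical and do not come from the original study. The key idea is that the original variable is left as received and the update goes into a new variable on a copy of the data.

```python
def recode_lives_with_smoker(rows, no_ids,
                             id_field="subject_id",
                             raw_field="lives_with_smoker",
                             new_field="lives_with_smoker_recoded"):
    """Return recoded copies of the rows; the input rows are never modified.

    All field names here are hypothetical placeholders.
    """
    no_ids = set(no_ids)
    known_ids = {row[id_field] for row in rows}
    # IDs on the "No" list that do not exist in the dataset need follow-up.
    unmatched = sorted(no_ids - known_ids)
    conflicts = []
    recoded = []
    for row in rows:
        new_row = dict(row)  # copy, so the original record stays untouched
        raw = (row.get(raw_field) or "").strip()
        on_no_list = row[id_field] in no_ids
        if raw == "Yes" and on_no_list:
            # Assumption: flag contradictions instead of silently picking one.
            conflicts.append(row[id_field])
            new_row[new_field] = "Conflict"
        elif raw == "Yes":
            new_row[new_field] = "Yes"
        elif on_no_list:
            new_row[new_field] = "No"
        else:
            new_row[new_field] = "Missing"
        recoded.append(new_row)
    return recoded, conflicts, unmatched


# Made-up example records; a blank string stands for an empty cell.
original = [
    {"subject_id": "S01", "lives_with_smoker": "Yes"},
    {"subject_id": "S02", "lives_with_smoker": ""},
    {"subject_id": "S03", "lives_with_smoker": ""},
    {"subject_id": "S04", "lives_with_smoker": "Yes"},
]
no_list = ["S02", "S04", "S99"]  # hypothetical list from the investigators

recoded, conflicts, unmatched = recode_lives_with_smoker(original, no_list)
for row in recoded:
    print(row["subject_id"], row["lives_with_smoker_recoded"])
print("Conflicts to query:", conflicts)
print("IDs not found in dataset:", unmatched)
```

In practice the same approach would usually be paired with keeping a read-only backup of the file as received, documenting the new variable in the codebook, and comparing row counts and "Yes" counts before and after the update.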