Good evening, I need to write a statistics report but I am really struggling with it. The report involves using SPSS and carrying out three hypothesis tests. The report need Are

Version 2, November 2021 PGM1010/PGM4310 Dr Maire Gorman [email protected]

Part 2 Guidance document

Contents
Overview
Using SPSS
Report Structure/Criteria
Marking rubric/Feedback sheet
Marking scheme
Suggested timeline for completion
Ongoing support
Finding datasets
Worked example
Technical notes
    Comparing different regions: raw versus relative values
    Converting String data (Categorical variables)
    Creating categorical variables from numerical variables
    Chi-squared tests of association

Overview
Produce a report of up to 2000 words. Note that references do not count towards the word limit. The report is designed to evaluate your ability to conduct and report quantitative research.

Using SPSS
For this report, you will need to use SPSS. There are two options:

Remote access to desktop computers
This service should allow you to run SPSS on your own laptop (remote desktop) as if you were sat in a computer room on campus.
https://www.aber.ac.uk/en/is/it-services/rooms/
https://faqs.aber.ac.uk/2968

Downloading SPSS onto your own laptop
Alternatively, if you wish, you can download SPSS directly onto your own machine:
https://faqs.aber.ac.uk/?search=spss%20software

Report Structure/Criteria
Your report should have an introduction followed by 3 main sections (one for each of the hypothesis tests), a discussion of dataset/test limitations and final conclusions. Descriptive statistics/plots should be included throughout, with SPSS output presented and interpreted. The three hypothesis tests should be undertaken as per the following criteria. For each test, choose one of the listed options:

Test | Option | Data type required
1 | Pearson correlation | Two numerical variables
1 | Spearman correlation | Two or more sets of ranking data
1 | Pearson correlation → Linear regression | Two numerical variables
1 | Pearson correlation → Multiple regression | Multiple numerical variables
2 | Chi-squared test of goodness of fit | Frequency count for one categorical variable
2 | Chi-squared test of association | Frequency count for two categorical variables
3 | Single-sample t-test | Numerical variable for a single group
3 | Repeated measures t-test | Two measurements (numerical variable) for a single group
3 | t-test for comparing two groups | Numerical measurement for two different groups
3 | One-way ANOVA | Numerical measurement for more than two different groups
3 | One-way ANOVA (repeated measures) | Repeated measurements (numerical variable) for a single group
3 | One-way MANOVA | Multiple numerical dependent variables + single categorical variable
3 | One-way MANOVA (repeated measures) | Repeated measures on a single group
3 | Two-factor ANOVA (independent measures) | Single numerical dependent variable + two categorical variables
3 | Two-factor ANOVA (repeated measures) | Repeated measures for different groups
3 | Two-factor ANOVA (mixed measures) |
Table 1: Criteria for three hypothesis tests to undertake.

For each of the tests, the options are listed in order of increasing difficulty, with the more advanced methods worth more marks (provided they are done correctly). If you choose (and justify) undertaking non-parametric equivalents, then this will be scored higher.

For each hypothesis test you should:
• Provide a justification for the variables you have chosen and your choice of test.
• Clearly state the null and alternative hypotheses.
• Identify the dependent and independent variables and describe how they have been measured/recorded.
• Plot the data and comment on the distribution. Note and comment upon any outliers. You may wish to undertake formal tests of the normality of the data.
• Where appropriate, clearly report descriptive data for the variables, including means and standard deviations.
• Report the findings (test statistic values, p-values, reject/do not reject the null hypothesis).
• Put your findings in context for a lay audience.

Marking rubric/Feedback sheet
Each criterion is marked on the scale Distinction (70+) / Clear Pass (60-69) / Pass (50-59) / Fail (40-49) / Clear Fail (0-39), with space for a comment:
• An introduction to the topic area and justification (using appropriate literature) of the chosen research question(s).
• Hypothesis test 1: Correlation
• Hypothesis test 2: Chi-squared tests
• Hypothesis test 3: T/ANOVA/MANOVA
• Ability to report findings in a way understandable to a lay audience. Professional presentation of report (headings, captions etc).
• A discussion of the problems, issues and limitations arising from the dataset, variables and tests used.

OVERALL MARK CATEGORY* (please circle one): Distinction / Clear Pass / Pass / Fail / Clear Fail
First Marker General Comment:
First marker details: Name: Dept: Date:
Second marker's comments:
Second marker details: Name: Dept: Date:
Final mark:
* Marks are not arrived at by mechanical addition, but by a consideration of the overall achievement of the work.

Marking scheme

Distinction (70+)
An excellent pass. The report produces a set of clear testable hypotheses and a coherent discussion of why the variables were chosen, and demonstrates an awareness of statistical assumptions. The report uses the appropriate techniques in a sophisticated fashion and the analysis of the results is clearly based on the model results.

Clear Pass (60-69)
The hypotheses are identified with some precision. There is some discussion of why the variables were chosen. Some discussion of the statistical assumptions is made and there is an attempt to critically analyse the statistical results.

Pass (50-59)
The hypotheses are identified, but their formulation is in need of considerable refinement to make them clear. There is some but limited awareness of the assumptions and the choice of variables for analysis.
Reports in this category are likely to have only a basic understanding of the relevant ideas, concepts and arguments and produce little detailed analysis of the statistical output.

Fail (40-49)
The report produces incoherent analyses. Hypotheses are incorrect and the methodology is very poor. The report will make little attempt to analyse the data and will provide little discussion of the assumptions of the regression model. The report will provide a very basic overview of results with no in-depth interpretations.

Clear Fail (0-39)
The report is poorly formulated. No hypotheses are identified and understanding of the methodology is non-existent. The report will lack any sense of analysis and will not attempt to explain data in terms of a model. The report will provide only a very superficial discussion of the quantitative results.

Suggested timeline for completion

Date | Suggested milestone
Friday 17th December | Familiarisation with requirements: identification of datasets which you may want to use. Chat with supervisor/personal tutor about the assignment.
Friday 14th January | Choice of dataset & identification of variables
Friday 21st January | Test 1 done
Friday 28th January | Test 2 done
Friday 4th February | Test 3 done
Friday 11th February | Introduction/conclusions write/edit
Friday 18th February | Report completed (this leaves a week for proofreading/buffer time)
Friday 25th February | Deadline [FIXED]
Table 2: Suggested timeline for completion.

Ongoing support
Small-group/individual appointments: I am happy to arrange meetings up until Friday 18th February.
Email queries: I will respond to email queries sent by Friday 18th February, usually within 1-2 working days.
For both forms of support, the more specific your questions, the more I can help. Please have clear objectives for what you wish to get out of appointments/sessions.
Finding datasets
Note: the dataset does not necessarily need to be within your general discipline area. I appreciate there are fields in which confidentiality/data protection/embargoes may present barriers. Ideally, we require a dataset which has a range of both categorical and numerical variables in order to satisfy the criteria: up to 1000 rows of data should more than suffice. Here are a few places which may be useful:

Sheffield University, Datasets for teaching: https://www.sheffield.ac.uk/mash/statistics/datasets
SAGE publishing: https://study.sagepub.com/paternoster/student-resources/spss-datasets
CALTEC online-workshops: http://calcnet.mth.cmich.edu/org/spss/prjs_datasets.htm

I found these resources by simply doing a Google search for “SPSS datasets teaching”. I'd really encourage you to set aside some time and have a search: even if you don't find anything suitable for this particular assignment, hopefully the exercise in itself will be useful for dissertation/PhD work in general. If at the end of the exercise you find that there isn't a suitable dataset, this is fine, as there is always the sample data. Do make sure, though, to explain this (describe the searching work that you have done) and credit will be awarded.

This process of selecting a dataset and selecting relevant variables (hence applying theory) is what makes this a Masters (PG) level assignment: it sits right at the top of Bloom's taxonomy, as it requires application of knowledge, creativity in devising research questions, and evaluation.

Worked example
Note: you are prohibited from using this example for your report.
Diet example
Sheffield University, Datasets for teaching: https://www.sheffield.ac.uk/mash/statistics/datasets

List of variables:

Variable | Data type
Person ID | Purely ID number (row number)
Gender | Categorical
Age | Numerical
Height | Numerical
Pre-weight, Weight after 10 weeks | Numerical: example of repeated measures
Weightlost | Numerical
Diet | Categorical
Table 3: List of variables and their datatype in dataset for worked example.

For any dataset (provided there aren't too many variables), one strategy is to list out the variables and their types as in Table 3. Identifying the datatype really does underpin the assignment, in terms of how you choose to plot the data and which tests apply.

Test | Options | Data required
1 | Pearson correlation → Linear/Multiple regression | Age & Height; Age & Weightbefore; Age & Weightafter; Age & Weightloss; Weightbefore & Weightloss; etc.
1 | Spearman correlation | N/A
2 | Chi-squared test of goodness of fit | Option to create categorical variables from numerical data.
2 | Chi-squared test of association/homogeneity |
3 | Single-sample t-test | Is average weightlost = X value?
3 | Repeated measures t-test |
3 | t-test for comparing two groups | Compare weightlost for diets 1 & 2, or 2 & 3, or 1 & 3.
3 | One-way ANOVA | Compare weightlost for diets 1, 2 and 3.
3 | One-way ANOVA (repeated measures) | N/A: need more than two measurements.
3 | One-way MANOVA | N/A
3 | One-way MANOVA (repeated measures) |
3 | Two-factor ANOVA (independent measures) | Weight loss; Gender, diet
3 | Two-factor ANOVA (repeated measures) | Gender + time (mixed); Diet + time (mixed)
3 | Two-factor ANOVA (mixed measures) |
Table 4: Worked example showing how to relate dataset variables to the various hypothesis tests.

Technical notes

Comparing different regions: raw versus relative values
Suppose that you wanted to compare a particular type of crime (e.g. burglary) for different regions of the UK (e.g. NE, NW, Midlands, London, SE, Scotland, Wales etc): you could do this if you have data for different counties within each region. It would be unfair to compare the raw values of each region, as each region has a different population. To account for this, it is worth considering creating relative values, i.e. the total number of burglaries divided by the population, or divided by the total number of crime reports.

Converting String data (Categorical variables)
Often datasets will have categorical variables recorded as string data: unfortunately, in this format we cannot use them directly. We need to convert the strings (words) into codes (number labels). This can be done easily in SPSS. As an example, say that we have haircolour recorded as string data (words: blonde, brown, black) and want to convert it into codes (still keeping it categorical) so that we can run statistical tests.

Figure 1: Example of string data

In the column heading in Figure 1, the letter “a” denotes that this variable is recorded using strings. The 3 circles indicate that it is categorical in nature. We can convert these strings into codes (i.e. number labels) as follows:

Transform → Recode into Different Variables

Figure 2: Preliminary dialogue box for recoding string data in SPSS into usable format with number labels.

Transfer the string variable of interest into the middle box. Type in the name of the new Output Variable and click Change.

Figure 3: Recoding string data - specifying new/old variable names.

Click on the Old and New Values box. Type in the string names and desired codes (1, 2, 3). Click Add for each in turn. Click Continue. Click OK.

Figure 4: Recoding string data - giving variables number labels.

The new variable will appear at the end of your SPSS file (i.e. last column).
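As a rough illustration of what the recode step above does, here is a minimal Python sketch (not SPSS syntax). The hair-colour values and the 1/2/3 codes are made up for illustration, and the names `hair_colour`, `codes` and `hair_code` are hypothetical. It maps each string to a number label in a new variable and compares the counts of valid values before and after.

```python
from collections import Counter

# Hypothetical hair-colour column stored as strings (illustrative values only).
hair_colour = ["blonde", "brown", "black", "brown", "blonde", "black", "brown"]

# Old value -> new value mapping, mirroring the "Old and New Values" dialogue.
codes = {"blonde": 1, "brown": 2, "black": 3}

# Recode into a *different* variable so the original column is left untouched.
# Any string missing from the mapping becomes None (i.e. a missing value).
hair_code = [codes.get(value) for value in hair_colour]

# Compare the number of valid values in the old and new columns.
n_original = len(hair_colour)
n_recoded = sum(code is not None for code in hair_code)

# Frequency of each code, still a categorical variable.
code_counts = Counter(hair_code)

print(hair_code)
print("original N =", n_original, "| recoded N =", n_recoded)
print(dict(code_counts))
```

If a string were misspelt in the data (say "blond"), it would not match the mapping and the recoded count would fall short of the original count, which is the kind of mismatch worth looking for.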
Under the Variable View tab you can change the number of decimal places.

Figure 5: Recoding string data - giving variables number labels (II)

Finally, you should check that all values have been converted, i.e. that there are no missing values. You can do this by using Descriptive Statistics (Analyse → Descriptive Statistics, select the old and new columns and check that the total number of data points is the same).

Figure 6: Check that recoding process has converted all data in original string variable into new variable.

Creating categorical variables from numerical variables
Suppose that you have height measured in cm. This is a numerical variable. It can be “cast” into categories, e.g. 141-150 cm, 151-160 cm, 161-170 cm, 171-180 cm, 181-190 cm. This may be useful if you have a dataset with lots of numerical variables and don't have enough categorical variables to fulfil the criteria. We can do this in SPSS as follows:

Transform → Recode into Different Variables

Transfer the variable you want to cast and type in the name you wish to give the new output variable. Click Change.

Figure 7: Recoding numerical variables into categorical variables.

Click on Old and New Values. Fill out the desired labels, clicking Add for each. Once finished click Continue. Click OK. Check that there are no missing values in the new column (categorical in nature). You can then go to Data → Define Variable Properties to tell SPSS what the labels you've assigned are.

Figure 8: Putting numerical data into categories.

Chi-squared tests of association
In the examples in the module we had “summary data”, i.e. totals for each category. When the data is inputted in this way we always had to remember to do the following step: Data → Weight Cases → weight by frequencies.
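To show why the weighting step matters for summary data, here is a small Python sketch (again not SPSS syntax). The groups, answers and frequencies are invented for illustration, and `summary_rows`, `unweighted` and `weighted` are hypothetical names. Without weighting, each summary row is treated as a single case; with weighting, each row stands for as many cases as its frequency.

```python
from collections import Counter

# Hypothetical summary data: one row per category combination plus its frequency.
summary_rows = [
    ("A", "yes", 12),
    ("A", "no", 8),
    ("B", "yes", 15),
    ("B", "no", 5),
]

# Unweighted: every summary row is counted as one case.
unweighted = Counter((group, answer) for group, answer, _ in summary_rows)

# Weighted by frequency: every summary row counts as `freq` cases.
weighted = Counter()
for group, answer, freq in summary_rows:
    weighted[(group, answer)] += freq

print("cases seen without weighting:", sum(unweighted.values()))
print("cases seen with weighting:", sum(weighted.values()))
```

In this made-up example the unweighted tally sees only the four summary rows, whereas the weighted tally recovers all forty cases that the totals represent.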
In the datasets you will be using, you will most likely have the raw data, where row totals are not available. Here you do not need to do the step above, as SPSS will count the relevant rows itself to make the totals as required. Here are two screenshots showing the difference:

1. Summary totals:
Figure 9: Summary total input (pre-recorded examples)
Here you would need to remember to “weight cases”.

2. Dataset (each row is an occurrence): a sample, with many more rows than shown.
Figure 10: Typical dataset input - each row represents a different participant.
Here you don't need to “weight cases”, as SPSS will count up the relevant rows.
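For raw data of the kind in Figure 10, the counting that SPSS does can be sketched in Python as below. This is only an illustration under stated assumptions: the data are invented, the names `raw_rows`, `observed`, `chi_squared` and `dof` are hypothetical, and the statistic is the plain (uncorrected) Pearson chi-squared, computed as the sum of (observed - expected)^2 / expected with expected = row total × column total / N. The p-value is left to SPSS.

```python
from collections import Counter

# Hypothetical raw data: one (group, answer) pair per participant.
raw_rows = (
    [("A", "yes")] * 12 + [("A", "no")] * 8 +
    [("B", "yes")] * 15 + [("B", "no")] * 5
)

# Count the rows to build the observed contingency table and its margins.
observed = Counter(raw_rows)
row_totals = Counter(group for group, _ in raw_rows)
col_totals = Counter(answer for _, answer in raw_rows)
n = len(raw_rows)

# Pearson chi-squared: sum over cells of (observed - expected)^2 / expected.
chi_squared = 0.0
for group in row_totals:
    for answer in col_totals:
        expected = row_totals[group] * col_totals[answer] / n
        chi_squared += (observed[(group, answer)] - expected) ** 2 / expected

# Degrees of freedom: (rows - 1) x (columns - 1).
dof = (len(row_totals) - 1) * (len(col_totals) - 1)

print("observed counts:", dict(observed))
print("chi-squared =", round(chi_squared, 3), "with", dof, "degree(s) of freedom")
```

No weighting is involved here because each row already represents exactly one participant; the totals come purely from counting rows.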