Need help to solve this assign. Files are attached below. Thanks.

MAT 240 Module 1 Journal Aid Transcript

[0:00] This is some aid for the Journal 1 assignment in MAT 240.

The topic for Journal 1 is to use the dataset Manchester.xlsx which records data about weather in Manchester, NH collected over a lot of months over a bunch of years. In this video, I’ll be doing a lot of the same kind of analyses. Unfortunately, I won’t be using the Manchester data – that’s for you to do. I’ll be using data from Central Park in New York City.

[0:51] We’ll do two kinds of things: visual descriptive statistics (in particular, a histogram with some estimation) and numeric descriptive statistics (in particular, measures of center and spread).

I’m not going to do the entire journal here. However, I’ll do a large chunk of it and show you the direction to go in.

[1:27] Before we go and jump into the analysis, let’s first take a look at the data.

Here’s the data in an Excel spreadsheet. If you don’t have Excel, or you’re not familiar with Excel, don’t worry. We’ll see that you don’t need to use Excel in this course. In any case, here the first column is Year and the second column is Month. So the first row there contains data for January 1876 for Central Park.

[2:09] We’re going to be looking at the variable TPCP, which stands for Total Precipitation. Here the value is 240, measured in tenths of a millimeter. So 240 is 24 millimeters, or just less than 1 inch. It’s always good to take a look at the numbers before jumping in to do statistical operations. That way, you have a feel for what the numbers are. So in this case, in January 1876, a combination of rain and snow had an equivalent of just under 1 inch of precipitation. Not a whole lot.
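If you want to double-check that unit conversion yourself, here is a minimal Python sketch. The function names are made up for illustration; the only facts used are that TPCP is in tenths of a millimeter and that one inch is 25.4 millimeters.

```python
MM_PER_INCH = 25.4  # exact definition of the inch in millimeters


def tenths_mm_to_mm(value):
    """Convert a TPCP reading (tenths of a millimeter) to millimeters."""
    return value / 10


def tenths_mm_to_inches(value):
    """Convert a TPCP reading (tenths of a millimeter) to inches."""
    return tenths_mm_to_mm(value) / MM_PER_INCH


jan_1876 = 240  # the January 1876 value read off the spreadsheet
print(tenths_mm_to_mm(jan_1876))                # 24.0 millimeters
print(round(tenths_mm_to_inches(jan_1876), 2))  # 0.94 inches, just under 1 inch
```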

[3:07] So to load a file from your computer, we choose a file, and here I already have the file in the right directory. I’m going to use Central Park. Click open, scroll all the way down, and load the file. And here’s the data. This is how you would load an Excel spreadsheet from your computer.

[3:52] There’s an easier way. We already have this dataset in StatCrunch for you. Here’s how you can access it:

Data – Load (again) – we have it as a Shared Dataset so everybody can see it. You get a very long list of shared datasets. To find ours, you can search for SNHU MAT 240. Makes sense? Click “Go”, and there are two files there. The Journal file is the one for you to work on your Journal. However, as I said, I’m not going to do your work for you. I’m going to do some very similar things, but for Central Park. Click on the dataset, and there you are. Again, the Central Park data. So now that we have the data loaded, what do we want to do?

[5:15] For the visual part, we want to draw a histogram for Total Precipitation, TPCP. By eye, because this is visual, we want to approximate the lowest value, the highest value, and check for outliers. Let’s use StatCrunch.

[5:43] We want to draw a graph, so click “Graph”. We want to draw a histogram, so click “Histogram”.

So far, I hope you agree with me, that this is not rocket science. Select a column … StatCrunch is asking us which variable we want to use. So I’m scrolling down and selecting TPCP or Total Precipitation. I click on “Compute” and there’s the histogram. Make it a little bit wider so that we can see it.

[6:34] So the lowest value … pretty close to 0 probably. There must have been some months when it hardly rained at all. The highest value – pretty close to 5,000. 5,000 is 500 millimeters, or about 20 inches. So there are several points out here that are in the 15 to 20 inches per month range. That’s a lot of rain. If we went back to the data, I wouldn’t be surprised if those were hurricane months, the rare months where a hurricane came through New York City. These two might be considered outliers. They might be considered months that are so unusual that we should treat them specially when we do an analysis.
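StatCrunch does the binning for you, but it can help to see what a histogram is doing underneath. The sketch below is only an illustration: the numbers are hypothetical, not the Central Park or Manchester data, and the bin width of 500 is an arbitrary choice. It counts how many months fall in each bin, prints a rough text histogram, and reports the lowest and highest values.

```python
from collections import Counter

# Hypothetical monthly totals in tenths of a millimeter.
# These are NOT the real Central Park or Manchester values.
tpcp = [120, 450, 610, 780, 820, 905, 990, 1130, 1270, 1480, 1760, 2300, 4800]

BIN_WIDTH = 500  # assumed bin width, chosen only for this example

# Map each value to the lower edge of its bin, then count months per bin.
counts = Counter((value // BIN_WIDTH) * BIN_WIDTH for value in tpcp)

lowest, highest = min(tpcp), max(tpcp)

# Print every bin from 0 up to the highest value, including empty ones,
# so that gaps before a possible outlier are visible.
for start in range(0, highest + 1, BIN_WIDTH):
    bar = "#" * counts.get(start, 0)
    print(f"{start:>4} to {start + BIN_WIDTH - 1:>4}: {bar}")

print("lowest:", lowest, "highest:", highest)
```

In this made-up sample, the single month up in the 4500 bin sits far from everything else, with several empty bins in between. That is the kind of gap to look for when you check your own histogram for outliers.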

[7:40] Before doing the next part, I’ve made the histogram real small, put it on the side. We’re going to need it again a little bit later.

[7:54] So we now want to compute some numeric descriptives for the second part of the journal – mean / median / mode / variance / standard deviation / that kind of stuff. So let’s pull up StatCrunch again - we want to do some statistics. Almost all of what we want to do this term is under STAT. Mean / median / mode and that stuff is under Summary Stats, and we want to choose the data in columns. We’ll scroll down to look for precipitation again. There it is, and we’ll click Compute.

So here’s a bunch of information. The mean – 900 some … standard deviation, standard error (we’re going to skip that for now, we’ll see a lot of it in the coming weeks) … variance / median / all the stuff we need to have computed. It’s not great to just know the numbers; we really should be able to see them on the graph.
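For reference, the same kinds of summary numbers can be computed with Python's standard `statistics` module. This is a sketch on hypothetical values, not the course data, and it assumes the sample (n - 1) versions of variance and standard deviation are the ones wanted.

```python
import math
import statistics

# Hypothetical monthly totals in tenths of a millimeter.
# These are NOT the real Central Park or Manchester values.
tpcp = [120, 450, 610, 780, 820, 905, 990, 1130, 1270, 1480, 1760, 2300, 4800]

n = len(tpcp)
mean = statistics.mean(tpcp)
median = statistics.median(tpcp)
variance = statistics.variance(tpcp)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(tpcp)      # sample standard deviation
std_err = std_dev / math.sqrt(n)      # standard error of the mean

print(f"n = {n}")
print(f"mean = {mean:.1f}")
print(f"median = {median}")
print(f"variance = {variance:.1f}")
print(f"standard deviation = {std_dev:.1f}")
print(f"standard error = {std_err:.1f}")
```

In this made-up sample the mean comes out above the median, because the one very wet month pulls the mean upward while the median barely moves. It is worth checking whether your own data shows the same pattern.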

[9:10] So here’s where StatCrunch really is nice. The mean, 900 something, is a little bit less than 1000, so where my cursor is, that’s the mean. It looks pretty good – it’s kind of in the middle of the data. The median is 800 something … it’s also pretty much right in the middle of the data. The standard deviation looks at how far the data is from the mean. Here’s where you need to use the Empirical Rule.

[9:46] The mean is around 1000 … the standard deviation is roughly 500, so one standard deviation runs from about 500 below the mean to about 500 above it, and as you can see, that range covers a lot, but not all, of the data. Look up the Empirical Rule. Interpret the Empirical Rule for the rainfall in your dataset. So that should be a lot of help … a lot of clues for your analysis. You can see that part of it is straight calculation, and part of it is interpretation: we want to make sure you understand how your results relate to the real world.
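As a rough illustration of the Empirical Rule, the sketch below builds the one, two, and three standard deviation bands and counts what share of the values land inside each. The rule of thumb (about 68%, 95%, and 99.7%) is stated for bell-shaped data, so a skewed rainfall histogram may not match it closely. The helper names and the sample values are hypothetical; the figures 1000 and 500 are the rounded numbers from the video.

```python
import statistics


def band(center, spread, k):
    """Return the (low, high) interval k standard deviations around the center."""
    return (center - k * spread, center + k * spread)


def share_within(values, center, spread, k):
    """Return the fraction of values inside the k standard deviation band."""
    low, high = band(center, spread, k)
    return sum(low <= v <= high for v in values) / len(values)


# Rounded figures from the video: mean about 1000, standard deviation about 500.
print(band(1000, 500, 1))  # (500, 1500)
print(band(1000, 500, 2))  # (0, 2000)

# Hypothetical monthly totals in tenths of a millimeter (NOT the real data).
tpcp = [120, 450, 610, 780, 820, 905, 990, 1130, 1270, 1480, 1760, 2300, 4800]
mean = statistics.mean(tpcp)
sd = statistics.stdev(tpcp)

for k, expected in ((1, 0.68), (2, 0.95), (3, 0.997)):
    actual = share_within(tpcp, mean, sd, k)
    print(f"within {k} sd: {actual:.0%} of months (rule of thumb: about {expected:.1%})")
```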

[10:36] And oh yes, I forgot to do the mode. The mode isn’t very useful for numeric data which is why StatCrunch didn’t put it out. But here’s how you use StatCrunch to modify what you just did. Under Options, click Edit. That takes you back to the previous screen, and I’m going to scroll down, and here in Statistics, if you go all the way down, you will see Mode. Mode is all the way down there because mode is not very useful for numbers. Then I’ll click Compute, and there it is. The mode is not unique.

The mode really isn’t very interesting here. The mode is “What is the number of tenths of a millimeter that has the most months?” I don’t know about you, but I don’t care. I don’t care if more months have 53.2 millimeters than have 49.1 millimeters. So that’s the mode, at least for my dataset here. It’s not unique, and I don’t care.
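To see what "not unique" means, here is a small sketch using `statistics.multimode` from the Python standard library (Python 3.8 or later). The values are hypothetical and only echo the 53.2 and 49.1 millimeter example, written as 532 and 491 tenths of a millimeter.

```python
import statistics

# Hypothetical monthly totals in tenths of a millimeter (NOT the real data).
# 532 and 491 each appear twice; every other value appears once.
tpcp = [532, 491, 532, 610, 491, 880, 1204]

# multimode returns every value tied for the highest count,
# in the order each first appears in the data.
modes = statistics.multimode(tpcp)
print(modes)  # [532, 491]

if len(modes) > 1:
    print("The mode is not unique.")
```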