
Lab 1: A Brief Introduction to Systat, One-Sample Descriptive Statistics, and the Normal Distribution

1.
GETTING STARTED. On your screen there should be an icon that reads "Computer's Hostname". The name will be different for each machine. Double click on it. This should open a window containing several folders, one of which reads "Systat 5.2.1". Double click on that one. Next, double click on the "Systat.5.2.1§" icon. At this point you should be in SYSTAT.
2.
SYSTAT DATA FILES. Much of the data we will work with is already in SYSTAT. To open one of these files, go to the File menu and select Open.... A window will appear. One of the items listed will be a folder entitled Data Files (to get there it might be necessary to click on "Desktop", then "Name of Your Computer", and then the "Systat" folder). Open this folder by double clicking on it. You can now open one of the data files by double clicking on the name of the file. At this time, we would like to open the file MEDICAL. When you do this, the data editor should appear. The data editor is a spreadsheet containing the variables and cases corresponding to the data file you have selected. If the data editor does not appear, pull down the Window menu and release the mouse on the option Editor. The data editor should then appear.

3.
This file (MEDICAL) contains information obtained from the results of the 1986 census. The data represent the mortality rates in each state for various causes of death. The first five columns give information about state, region, and division of the census. We will not be using these. The next seven columns contain the variables that correspond to the death rates. The title of the column represents the type of death.

4.
Choose one of the variables on death rates (ACCIDENT, CARDIO, CANCER, PULMONAR, PNEU_FLU, DIABETES, or LIVER) and do the following.

5.
ANALYSIS. Typically, we will begin labs by examining the basic descriptive statistics of the data. These include sample means, standard deviations, and other numerical summaries. To calculate these descriptive statistics, go to the Stats option of the Stats menu and select Statistics.... A window will appear with the column names of our data. Highlight the variable you chose and hit Select. In the lower right corner, you will see the options for statistics to calculate. Some of the statistics may seem foreign. We will be discussing these more in class and in lab.

6.
Calculate and record the basic summary statistics, including the mean, variance, standard deviation, and skewness. To do this, select your variable of choice, then select the statistics desired and hit OK. Remember that the mean and the standard deviation measure different things: the mean is a measure of location (central tendency), and the standard deviation is a measure of dispersion. What is the formula for variance in terms of standard deviation?
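For comparison with the SYSTAT output, the same four statistics can be computed by hand. The following Python sketch is only an illustration: the list `rates` holds made-up values (not taken from the MEDICAL file), the function name `summary` is hypothetical, and the skewness shown is the simple moment-based coefficient, which may differ slightly from the formula SYSTAT uses.

```python
import math

# Hypothetical values for illustration only; NOT taken from the MEDICAL file.
rates = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

def summary(values):
    """Return the mean, sample variance, standard deviation, and skewness."""
    n = len(values)
    mean = sum(values) / n
    # Sample variance uses the n - 1 divisor.
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    sd = math.sqrt(variance)
    # Moment-based skewness: third central moment over the 1.5 power
    # of the second central moment (both with divisor n).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    skewness = m3 / m2 ** 1.5
    return {"mean": mean, "variance": variance, "sd": sd, "skewness": skewness}

stats = summary(rates)
for name, value in stats.items():
    print(f"{name:>9}: {value:.4f}")
```

A positive skewness, as in these made-up values, indicates a longer right tail.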

7.
We will also be looking at various graphical representations of this data.

8.
Look at a histogram of the data, by selecting histogram under the Density option of the Graph menu. Once there, select your variable of choice by highlighting it and hitting Select. This should put your variable name in the box between Select and OK. Now hit OK and the graph should appear in the Systat View window.
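A histogram is just a count of how many observations fall in each of several equal-width intervals. The Python sketch below shows that counting step under simple assumptions: the function name `histogram_counts`, the choice of three bins, and the data are all hypothetical, and SYSTAT picks its own bin boundaries.

```python
def histogram_counts(values, bins):
    """Count observations in `bins` equal-width intervals from min to max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; cannot form equal-width bins")
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value is placed in the last bin rather than a new one.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts

# Hypothetical values for illustration only.
data = [1, 2, 2, 3, 3, 3, 4, 4, 10]
counts = histogram_counts(data, bins=3)
for i, c in enumerate(counts):
    print(f"bin {i}: " + "*" * c)
```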

9.
Look at a stem and leaf plot. Select Stem under the Graph menu.
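To see what a stem and leaf plot encodes, here is a minimal Python sketch that builds one for non-negative whole numbers, using the tens digit as the stem and the units digit as the leaf. The function name `stem_and_leaf` and the data are hypothetical, and SYSTAT's display chooses its stems differently and adds further markings.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return display lines: stem = tens digit(s), leaf = units digit."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    lines = []
    # Include empty stems so gaps in the data remain visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(d) for d in stems.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}")
    return lines

# Hypothetical values for illustration only.
plot = stem_and_leaf([12, 15, 21, 23, 23, 38, 41])
print("\n".join(plot))
```

Unlike a histogram, the display keeps every individual value readable.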

10.
Look at a boxplot. Select Box under the Graph menu. The boxplot is built from the median and the interquartile range rather than from the mean and the standard deviation.
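The quantities behind a boxplot can be sketched in Python as follows. This is an illustration under stated assumptions: the data are made up, the function name `box_summary` is hypothetical, the quartiles use the "inclusive" method of Python's `statistics.quantiles` (SYSTAT's convention may differ slightly), and the 1.5 × IQR rule for flagging outliers is the common textbook convention.

```python
import statistics

def box_summary(values):
    """Return the median, quartiles, IQR, fences, and flagged outliers."""
    q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [x for x in values if x < lower_fence or x > upper_fence]
    return {"q1": q1, "median": med, "q3": q3, "iqr": iqr,
            "fences": (lower_fence, upper_fence), "outliers": outliers}

# Hypothetical values for illustration only; one value is far from the rest.
data = [1, 2, 3, 4, 5, 6, 7, 8, 30]
box = box_summary(data)
print(box)
print("mean:", statistics.mean(data))
```

In this made-up sample the single large value pulls the mean above the median, while the median and the quartiles are unaffected by how large that value is.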

11.
Summarize your results, including any observations or comments on choosing the mean or median as a measure of location and how the skewness of the data affects that choice. Also include your opinions on which types of graphical representation are better suited to displaying the shape of the data and showing outliers.

12.
ENTERING DATA I. We will also be working with data that is not already supplied in SYSTAT. To enter data into SYSTAT, go to the File menu and select New. An empty SYSTAT Data Editor will appear (provided that the Editor option under the Window menu is active). Title 4 columns NORM1, NORM2, NORM3, and NORM4. Use the Fill Worksheet option under the Data menu to fill the worksheet to 200 rows. Then, use the Math option under the Data menu to set NORM1 to ZRN (select NORM1 on the left, ZRN on the right). This fills the column with a sample of 200 values taken at random from a standard Normal distribution. Repeat this procedure for NORM2 through NORM4.

The population mean and variance of a standard normal ($\mu$ and $\sigma^2$) are 0 and 1, respectively.
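The same experiment can be imitated outside SYSTAT. The Python sketch below draws four samples of 200 standard normal values and prints each sample mean and variance. It assumes Python's `random.gauss` as a stand-in for SYSTAT's ZRN; the function name `standard_normal_sample` and the seeds are hypothetical choices made only so the run is repeatable.

```python
import random
import statistics

def standard_normal_sample(n, seed=None):
    """Draw n values from a Normal distribution with mean 0 and sd 1."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

# One column per name, mirroring NORM1 through NORM4 in the data editor.
names = ["NORM1", "NORM2", "NORM3", "NORM4"]
columns = {name: standard_normal_sample(200, seed=i)
           for i, name in enumerate(names, start=1)}

for name, values in columns.items():
    m = statistics.mean(values)
    v = statistics.variance(values)  # sample variance, n - 1 divisor
    print(f"{name}: mean = {m:.3f}, variance = {v:.3f}")
```

The sample means and variances should be close to 0 and 1, but not exactly equal to them, and they will differ from column to column.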



 
Dennis Cox
1/23/1998