OBJECTIVES: This lab introduces aspects of experimental design, with an emphasis on randomization and randomized block designs. You will also investigate the effect of sample size on sampling distributions and see why randomization is a key to statistical inference.
DIRECTIONS: Follow the instructions below,
answering all questions. Your answers for each of the questions, including
output and any plots, should be summarized in the form of a brief report
(Word), to be handed in to the instructor before 1:00pm Friday Oct. 12.
__________________________________________________________________________________________________________
__________________________________________________________________________________________________________
1.) Design of Experiments . . .
Five treatments are being tested on a homogeneous group of 40 patients. In particular, there are four different medications and one placebo. You are asked to use randomization to assign the treatments to the available patients. To do so, you will use Minitab to generate the patient data, randomly reorder it, and divide it into five treatment groups.
Randomization . . .
a.) First of all, generate the data to represent
the 40 patients and store them in column 1 (C1).
(Hint: Investigate
"Calc/Make Patterned Data/Simple Set of Numbers" -- Each patient can be
represented by the numbers 1-40, of course :) ).
b.) Note that this first column contains your "original"
data from which you will now sample. Now take a random sample
of this data and store it in column 2. You may want to label your
columns as well. VERY IMPORTANT!: remember that we don't
want to sample the same patient twice, so make sure you sample
without replacement.
(Hint: Take
a look at "Calc/Random Data/Sample from Columns", and remember, all you're doing is sampling
40 rows from C1!)
c.) You should now have your random sample of the
40 patients. The next task is to assign the five treatments among
the patients. Be sure that the same number of patients receive
each treatment.
(Note: Perhaps
the simplest way is to just assign the first 8 patients in your random sample
to "Treatment 1", the next 8 patients to "Treatment 2", and so on to "Placebo.")
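For readers who want to check the logic of parts a.)-c.) outside Minitab, the following is a minimal Python sketch and not part of the Minitab procedure. It assumes the labels "Treatment 3" and "Treatment 4" by analogy with the labels in the note above. The function name randomize_treatments and the seed value are hypothetical choices made for illustration.

```python
import random


def randomize_treatments(n_patients=40, treatments=None, seed=None):
    """Completely randomized design: shuffle patient IDs, then split into equal groups."""
    if treatments is None:
        # Assumed labels; the handout names only Treatment 1, Treatment 2 and Placebo.
        treatments = ["Treatment 1", "Treatment 2", "Treatment 3",
                      "Treatment 4", "Placebo"]
    if n_patients % len(treatments) != 0:
        raise ValueError("patients cannot be split evenly among the treatments")

    rng = random.Random(seed)
    patients = list(range(1, n_patients + 1))       # part a: patients 1-40
    shuffled = rng.sample(patients, k=n_patients)   # part b: sample without replacement

    # Part c: the first 8 shuffled patients get the first treatment, and so on.
    group_size = n_patients // len(treatments)
    assignment = {}
    for i, treatment in enumerate(treatments):
        for patient in shuffled[i * group_size:(i + 1) * group_size]:
            assignment[patient] = treatment
    return assignment


assignment = randomize_treatments(seed=1)
for patient in sorted(assignment):
    print(patient, assignment[patient])
```

Because the sample is drawn without replacement, every patient appears exactly once, so each treatment ends up with exactly 8 patients.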
"Restricted Randomization" -- Block Design . . .
In this group of 40 patients, it is known that there are 20 females and 20 males. We suspect that gender could have some effect on the efficacy of the treatments. We thus have two blocks of subjects, each one homogeneous to the best of our knowledge. You are asked to implement a randomized block design and allocate the treatments at random within each of the two blocks.
d.) For this part, you should start a new worksheet
and generate the data as you did in part a.) above.
(Note: Remember
that this time, you're basically generating two sets of data -- 20 males
and 20 females -- to represent your "original" data. (i.e.,
you should have male patients 1-20, as well as female patients 1-20).
Thus, be sure to label your columns "Male" and "Female" to distinguish
the two blocks of data.)
e.) Next, you need to take a random sample of each block of patients, in the same manner you did in part b.) above. Store them in your next two columns. Again, to distinguish among the columns, they should be labeled appropriately.
f.) Now, within each randomized block of data, assign
the 5 treatments to the patients.
(Hint: You may
want to recall your method from part c.) above!)
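The block design in parts d.)-f.) can be sketched in Python along the same lines, again only as an illustration outside Minitab. The sketch assumes each block of 20 is split into 5 groups of 4. The treatment labels, the function name randomize_within_blocks, and the seed are hypothetical.

```python
import random

# Assumed labels; the handout names only Treatment 1, Treatment 2 and Placebo.
TREATMENTS = ["Treatment 1", "Treatment 2", "Treatment 3", "Treatment 4", "Placebo"]


def randomize_within_blocks(blocks, treatments, seed=None):
    """Randomized block design: shuffle and assign treatments separately in each block."""
    rng = random.Random(seed)
    design = {}
    for block_name, patient_ids in blocks.items():
        if len(patient_ids) % len(treatments) != 0:
            raise ValueError(f"block {block_name!r} cannot be split evenly")
        # Part e: random sample of the whole block, without replacement.
        shuffled = rng.sample(patient_ids, k=len(patient_ids))
        # Part f: consecutive chunks of the shuffled block get each treatment.
        group_size = len(patient_ids) // len(treatments)
        design[block_name] = {
            treatment: sorted(shuffled[i * group_size:(i + 1) * group_size])
            for i, treatment in enumerate(treatments)
        }
    return design


# Part d: male patients 1-20 and female patients 1-20.
blocks = {"Male": list(range(1, 21)), "Female": list(range(1, 21))}
design = randomize_within_blocks(blocks, TREATMENTS, seed=1)
for block_name, groups in design.items():
    for treatment, ids in groups.items():
        print(block_name, treatment, ids)
```

Randomizing separately inside each block guarantees that every treatment is given to the same number of males and females, which a single shuffle of all 40 patients does not.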
__________________________________________________________________________________________________________
__________________________________________________________________________________________________________
2.) Sampling Design and Moving Towards Statistical Inference . . .
There is a mayoral election that will be held in Owltown, USA. You are to assume that there are only going to be two candidates running -- Screech McTalons and Ima Hoot.
Let p be the proportion of voters that will vote for Mr. McTalons.
We wish to study the behavior of groups of samples of different sizes drawn from the population of voters (Owltown voter population: 500,000).
a.) Suppose the actual value of p is a favorable
0.7, or 70%.
Generate 25 samples of
size 50 from the population.
(Hint: Again
refer to "Calc/Random Data", and we will simulate random data from a Bernoulli
distribution!
Also, to make things easier, let the rows represent the 25 samples, and
store them in columns C1-C50.)
b.) Next, calculate the mean of each sample and
store the means in the next available column (C51). You may want to label
this column (e.g., "Means").
(Hint: Since
we're working with rows as each sample, you should investigate "Calc/Row
Statistics." Be sure to calculate the mean for each of the 25 rows,
which basically means calculating the row statistics among all 50 columns
. . .)
c.) Further analyze the means you just calculated
by looking at the basic descriptive statistics of the means. ("Stat -> Basic Statistics")
d.) Create a histogram of the estimates of p
from the 25 samples of size 50 drawn from the population with p=0.7.
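Parts a.)-d.) can also be mirrored in a short Python sketch, offered as a rough check on the Minitab workflow and not a substitute for it. It treats each voter as an independent Bernoulli draw, which is reasonable here because the samples are tiny relative to the 500,000 voters. The function names and the seed are hypothetical, and the text tally stands in for a proper histogram.

```python
import random
import statistics
from collections import Counter


def simulate_sample_proportions(p, sample_size, n_samples=25, seed=None):
    """Part a/b: draw n_samples Bernoulli(p) samples and return each sample's mean."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(n_samples):
        votes = [1 if rng.random() < p else 0 for _ in range(sample_size)]
        proportions.append(sum(votes) / sample_size)  # row mean = estimate of p
    return proportions


def describe(values):
    """Part c: basic descriptive statistics of the sample means."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "median": statistics.median(values),
        "max": max(values),
    }


def tally(values):
    """Part d: crude text histogram, one row per distinct estimate."""
    counts = Counter(values)
    return [f"{value:.2f} {'*' * counts[value]}" for value in sorted(counts)]


estimates = simulate_sample_proportions(p=0.7, sample_size=50, seed=1)
for name, value in describe(estimates).items():
    print(f"{name:>6}: {value:.4f}")
print("\n".join(tally(estimates)))
```

Each estimate is a multiple of 1/50, so the tally has only a handful of distinct rows. Changing the seed gives a different set of 25 estimates, just as regenerating the random data in Minitab would.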
e.) Now increase the sample size to 100 and
repeat parts a.)-d.) (same p of 0.7). You should probably
start a new worksheet before doing this! Remember, in doing this,
you will be increasing the number of columns (sample size) to 100!
f.) Finally, with a sample size of 100 again
(still 25 samples), suppose the actual value of p is now only 0.55,
or 55%, and repeat parts a.)-d.) with this new information. Again,
start a new worksheet before doing this.
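To see what parts e.) and f.) are meant to reveal, the three settings can be compared side by side. The sketch below is illustrative only. It uses the standard result that the standard deviation of a sample proportion is sqrt(p(1-p)/n), which is not stated in this handout. The function names and the seed are hypothetical.

```python
import math
import random
import statistics


def theoretical_se(p, n):
    """Standard deviation of a sample proportion under simple random sampling."""
    return math.sqrt(p * (1 - p) / n)


def simulate_proportions(p, n, n_samples, rng):
    """Return n_samples sample proportions, each from n Bernoulli(p) draws."""
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(n_samples)]


def compare_settings(settings, n_samples=25, seed=None):
    """For each (p, n) pair, return simulated mean and spread next to the theory."""
    rng = random.Random(seed)
    rows = []
    for p, n in settings:
        props = simulate_proportions(p, n, n_samples, rng)
        rows.append((p, n, statistics.mean(props),
                     statistics.stdev(props), theoretical_se(p, n)))
    return rows


# Parts a.)-d.), part e.) and part f.) respectively.
settings = [(0.7, 50), (0.7, 100), (0.55, 100)]
print("    p    n   mean   stdev  theory")
for p, n, mean, stdev, theory in compare_settings(settings, seed=1):
    print(f"{p:5.2f} {n:4d} {mean:6.3f} {stdev:7.4f} {theory:7.4f}")
```

With only 25 samples per setting, the simulated standard deviations will wander around the theoretical values, so the comparison is suggestive and not exact.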
__________________________________________________________________________________________________________
__________________________________________________________________________________________________________
;) BONUS! ;)