Paired Differences with R

Would shifting to a four-day work week reduce commuting mileage, fuel consumption and exhaust emissions? We will use the data comparing mileage driven for four- and five-day work weeks that is the main example at the start of Chapter 25 of De Veaux, Velleman and Bock, Stats: Data and Models, 2nd ed., 2008, Addison Wesley, Boston. If you have this book (or another by the same authors), run the ActivStats CD that comes with it, but instead of starting ActivStats, click on the Datasets button at the lower left of the start-up window. Choose the version of the text you have, go to the Text folder and find Ch25_mileage.txt. Copy it to the R folder on your computer. If you do not have the CD, the dataset is small enough that you can just type it in from the printout below.
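
If you are typing the data in, a minimal sketch along the following lines builds the data frame directly, with no file needed. The values are copied from the printout below; the column names are chosen here to match the ones read.delim produces from the file, so the rest of the session can be followed unchanged.

```r
# Data typed in by hand from the printout. The column names mirror those
# that read.delim() generates from Ch25_mileage.txt.
mileage <- data.frame(
  Name = c("Jeff", "Betty", "Roger", "Tom", "Aimee", "Greg",
           "Larry G", "Tad", "Larry M", "Leslie", "Lee"),
  X5.Day_mileage = c(2798, 7724, 7505, 838, 4592, 8107,
                     1228, 8718, 1097, 8089, 3807),
  X4.Day_mileage = c(2914, 6112, 6177, 1102, 3281, 4997,
                     1695, 6606, 1063, 6392, 3362)
)

# Quick check that all 11 rows and 3 columns were entered
str(mileage)
```

Typing mileage at the prompt then prints the table, which you can compare line by line with the printout.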

We had already copied all these files to the R folder on our computer, so we looked at the files in the R directory to find the name of the file, then read it into a data frame named mileage and attached it. We then read the file a second time without assigning the result, just to print the data and see what the variable names were. Then we created a variable diff for the differences (five-day mileage minus four-day mileage), typed diff to see them (and compare them to the textbook), and then used t.test on the differences.

> mileage <- read.delim(file="Ch25_mileage.txt",header=TRUE)
> attach(mileage)
> read.delim(file="Ch25_mileage.txt",header=TRUE)
      Name X5.Day_mileage X4.Day_mileage
1     Jeff           2798           2914
2    Betty           7724           6112
3    Roger           7505           6177
4      Tom            838           1102
5    Aimee           4592           3281
6     Greg           8107           4997
7  Larry G           1228           1695
8      Tad           8718           6606
9  Larry M           1097           1063
10  Leslie           8089           6392
11     Lee           3807           3362
> diff = X5.Day_mileage - X4.Day_mileage
> diff
 [1] -116 1612 1328 -264 1311 3110 -467 2112   34 1697  445
> t.test(diff)

        One Sample t-test

data:  diff 
t = 2.858, df = 10, p-value = 0.01701
alternative hypothesis: true mean is not equal to 0 
95 percent confidence interval:
  216.4276 1747.5724 
sample estimates:
mean of x 
      982 
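
The one-sample test on the differences is the paired t-test. As a sketch of why, the following recomputes the t statistic and the confidence interval by hand and compares them with t.test called on the two columns with paired = TRUE. The vector names five, four and d are chosen here for illustration (d rather than diff, so that R's built-in diff function is not masked); the values are copied from the printout above.

```r
# Mileage for the five-day and four-day weeks, from the printout
five <- c(2798, 7724, 7505, 838, 4592, 8107, 1228, 8718, 1097, 8089, 3807)
four <- c(2914, 6112, 6177, 1102, 3281, 4997, 1695, 6606, 1063, 6392, 3362)
d <- five - four

# Paired test on the two columns and one-sample test on the differences
paired <- t.test(five, four, paired = TRUE)
one    <- t.test(d)

# The same quantities by hand
n    <- length(d)
se   <- sd(d) / sqrt(n)                 # standard error of the mean difference
tval <- mean(d) / se                    # t statistic for H0: mean difference = 0
ci   <- mean(d) + c(-1, 1) * qt(0.975, df = n - 1) * se   # 95% interval

print(c(by.hand = tval,
        one.sample = unname(one$statistic),
        paired = unname(paired$statistic)))
print(ci)
```

All three t statistics should agree with the t = 2.858 reported above, and ci should reproduce the interval from 216.4276 to 1747.5724.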

Note the free confidence interval: t.test reports it along with the test. We should also make a picture!

> stem(diff)

  The decimal point is 3 digit(s) to the right of the |

  -0 | 531
   0 | 04
   1 | 3367
   2 | 1
   3 | 1

The display shows no obvious outliers or strong skewness, which is not too bad for such a small dataset. Note that we check assumptions on the differences, not the original data.
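
Other pictures of the differences can serve the same purpose. As one possible sketch, a histogram and a normal quantile plot can be drawn with base R graphics; the vector name d and the plot titles are chosen here for illustration, and the values are the differences listed above.

```r
# Differences (five-day minus four-day mileage), as listed above
d <- c(-116, 1612, 1328, -264, 1311, 3110, -467, 2112, 34, 1697, 445)

# Histogram of the differences
hist(d, main = "Differences in mileage", xlab = "Five-day minus four-day")

# Normal quantile plot: points close to the line suggest the
# differences are roughly normal
qqnorm(d)
qqline(d)
```

With only 11 differences, neither plot can confirm normality; they are mainly useful for spotting outliers or severe skewness.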


©2008 statistics.com, portions ©2006-2007 Robert W. Hayden and used by permission.