Attaching a Data Frame

Typing the name of the data frame and a $ every time quickly grows tiresome. However, if you type only the name of the variable, S-PLUS will not know where to find it. To solve this problem, put the data frame on your search path.

You can check your search path by typing search():

> search()
[1] "/GRAFT-U2/ap3b/.Data.hp"
[2] "/usr/statlocal/lib/Splus3.3/splus/.Functions"
[3] "/usr/statlocal/lib/Splus3.3/stat/.Functions"
[4] "/usr/statlocal/lib/Splus3.3/s/.Functions"
[5] "/usr/statlocal/lib/Splus3.3/s/.Datasets"
[6] "/usr/statlocal/lib/Splus3.3/stat/.Datasets"
[7] "/usr/statlocal/lib/Splus3.3/splus/.Datasets"

This means that if I type something, S-PLUS first looks for a function or variable of that name in my .Data directory, and then in the directories of functions and data sets that come with S-PLUS. You may have different directories on your search path, but this should give you the basic idea.

Next, attach the pain.relief data frame to your search path in the second position (you can specify a different position, but two is the default). Now when you type the name of a variable or function, S-PLUS will first look for it in your .Data directory, then in the pain.relief data frame, and then in the S-PLUS directories. Any variable already in your .Data directory therefore takes precedence over a variable of the same name in the data frame.

> attach(pain.relief)
> search()
[1] "/GRAFT-U2/ap3b/.Data.hp"
[2] "pain.relief"
[3] "/usr/statlocal/lib/Splus3.3/splus/.Functions"
[4] "/usr/statlocal/lib/Splus3.3/stat/.Functions"
[5] "/usr/statlocal/lib/Splus3.3/s/.Functions"
[6] "/usr/statlocal/lib/Splus3.3/s/.Datasets"
[7] "/usr/statlocal/lib/Splus3.3/stat/.Datasets"
[8] "/usr/statlocal/lib/Splus3.3/splus/.Datasets"
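
The position can also be given explicitly. The following is only a minimal sketch, assuming that attach accepts a pos argument as in standard S. The data frame is typed in directly from the values printed in this handout, as a stand-in for reading the file example2.

```r
# Stand-in for read.table("example2", header=T): the same values, typed in directly.
pain.relief <- data.frame(
    Dose   = rep(c(0.1, 0.2, 0.3, 0.4), 3),
    Water  = rep(c(10, 20, 30), c(4, 4, 4)),
    Relief = c(7, 15, 19, 23, 15, 13, 26, 38, 21, 28, 31, 47))

attach(pain.relief, pos=3)     # position 3 instead of the default 2
third <- search()[3]           # name of the entry now in position 3
detach("pain.relief")          # tidy up, detaching by name
```

With the data frame in position 3, whatever sits in position 2 is searched before it, so this is mainly useful when something else should keep precedence.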

When you are done with the problem, use the detach function to remove the data frame from your search path. If you type detach() with no arguments, S-PLUS will detach whatever is in the second position of your search path. If this is not what you want, you can specify by number which data frame or directory you want detached. As long as you use attach(pain.relief) without specifying a position, and have not attached anything else since, you can use plain detach().
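
As a rough illustration of detaching, the sketch below attaches the data frame in the default position and then removes it by name. It assumes that detach accepts a quoted name as well as a position number; the data frame is again typed in directly rather than read from example2, and the two in.path variables are introduced here only to record the state of the search path.

```r
# Stand-in for read.table("example2", header=T): the same values, typed in directly.
pain.relief <- data.frame(
    Dose   = rep(c(0.1, 0.2, 0.3, 0.4), 3),
    Water  = rep(c(10, 20, 30), c(4, 4, 4)),
    Relief = c(7, 15, 19, 23, 15, 13, 26, 38, 21, 28, 31, 47))

attach(pain.relief)                              # goes into position 2 by default
in.path.before <- any(search() == "pain.relief") # is it on the search path now?

detach("pain.relief")                            # detach(2) or plain detach() would do the same here
in.path.after <- any(search() == "pain.relief")  # and after detaching?
```

Detaching by name is a little safer than plain detach() when several things have been attached, because it does not depend on what currently occupies position 2.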

When a data frame is attached to your search path, you can refer to the variables in the data frame directly by name.

> Dose
   1   2   3   4   5   6   7   8   9  10  11  12
 0.1 0.2 0.3 0.4 0.1 0.2 0.3 0.4 0.1 0.2 0.3 0.4
> Water
  1  2  3  4  5  6  7  8  9 10 11 12
 10 10 10 10 20 20 20 20 30 30 30 30
> Relief
 1  2  3  4  5  6  7  8  9 10 11 12
 7 15 19 23 15 13 26 38 21 28 31 47

Assignments to these variables will not affect the data in the data frame. Rather, they will result in the creation of a variable of the same name in your .Data area. For example, suppose Dose needs to be replaced by Dose*10.

> pain.relief <- read.table("example2", header=T)
> attach(pain.relief)
> pain.relief$Dose
 [1] 0.1 0.2 0.3 0.4 0.1 0.2 0.3 0.4 0.1 0.2 0.3 0.4
> Dose
   1   2   3   4   5   6   7   8   9  10  11  12
 0.1 0.2 0.3 0.4 0.1 0.2 0.3 0.4 0.1 0.2 0.3 0.4
> Dose <- Dose*10
> Dose
 1 2 3 4 5 6 7 8 9 10 11 12
 1 2 3 4 1 2 3 4 1  2  3  4
> pain.relief$Dose
 [1] 0.1 0.2 0.3 0.4 0.1 0.2 0.3 0.4 0.1 0.2 0.3 0.4

Making an assignment to Dose created a new variable (called Dose) in the .Data area, which takes precedence over the original Dose (still available as pain.relief$Dose).
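
To undo this, remove the new copy with rm. The sketch below is a minimal illustration under the same assumptions as the session above, with the data frame typed in directly instead of read from example2; the names masked and unmasked are introduced here only to record what Dose refers to at each step.

```r
# Stand-in for read.table("example2", header=T): only Dose is needed here.
pain.relief <- data.frame(Dose = rep(c(0.1, 0.2, 0.3, 0.4), 3))

attach(pain.relief)
Dose <- Dose * 10          # creates a separate Dose that masks the attached one
masked <- Dose[1]          # first value of the new copy: 1
rm(Dose)                   # remove the copy; the data frame itself is untouched
unmasked <- Dose[1]        # Dose is found in the attached data frame again: 0.1
detach("pain.relief")
```

Once the copy is gone, typing Dose again gives the values stored in pain.relief.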

Another approach (after removing the new Dose from the .Data area) would be to change the original Dose vector inside the data frame. However, the attached Dose still refers to the old values of pain.relief$Dose. The changes do not take effect until the next time the pain.relief data frame is attached to the search path.

> pain.relief <- read.table("example2",header=T)
> attach(pain.relief)
> pain.relief$Dose
 [1] 0.1 0.2 0.3 0.4 0.1 0.2 0.3 0.4 0.1 0.2 0.3 0.4
> pain.relief$Dose <- Dose*10
> pain.relief$Dose
 1 2 3 4 5 6 7 8 9 10 11 12
 1 2 3 4 1 2 3 4 1  2  3  4
> Dose
   1   2   3   4   5   6   7   8   9  10  11  12
 0.1 0.2 0.3 0.4 0.1 0.2 0.3 0.4 0.1 0.2 0.3 0.4
> detach()
> attach(pain.relief)
> Dose
 1 2 3 4 5 6 7 8 9 10 11 12
 1 2 3 4 1 2 3 4 1  2  3  4

A third alternative would be to change Dose before attaching the data frame. In this case, the right-hand side of the assignment must use pain.relief$Dose (since pain.relief is not on the search path, and Dose is not in the .Data area).

> pain.relief <- read.table("example2", header=T)
> pain.relief$Dose <- pain.relief$Dose * 10
> attach(pain.relief)
> Dose
 1 2 3 4 5 6 7 8 9 10 11 12
 1 2 3 4 1 2 3 4 1  2  3  4

The point of this confusing example is to demonstrate some complications that arise when using attached data frames. I prefer the second or third approach, because they keep the data frame up to date and avoid cluttering the .Data area.

Please note that the rest of this handout uses the original pain.relief data frame.

Brian Junker 2002-08-26