By default, RStudio gives you four “panes” to work in.
These can all be configured, but we’ll stick with the default configuration for now.
The Source Pane provides several helpful features:
The console “is” R. You feed commands into it, and it produces output. It also keeps track of variables, functions, etc. that you define, and the previous commands you’ve given.
The console pane is where you directly interact with the console. You can do this by typing commands in directly, or by running them from the source pane.
Output from your commands usually appears in the console itself. Plots appear separately (by default in the “Plots/Help” pane).
The upper-right pane can show a few different things. By default you’ll see the Environment, where R shows you what it is keeping track of (data frames, variables, functions). Many of these, e.g. data frames and functions, can be opened in the source pane for viewing.
The History tab in the same pane lists the commands the console has run, in the order it ran them.
Plots is self-explanatory. Note that you can flip back and forth between various plots you’ve made.
The Help browser lets you search the documentation for functions you are using, or might use. You can also open help pages from the console. This is most useful when you know roughly what you want to do but need to check the syntax. If you’re not sure which function you’re looking for, check the “See Also” section at the bottom of a related help page.
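The console route to the same documentation uses a few base R helpers. A minimal sketch, with `plot` and `median` chosen purely as example topics:

```r
# Open the help page for a function (two equivalent forms)
?plot
help("plot")

# Search the installed documentation when you do not know the exact name
help.search("median")   # the shorthand is ??median

# Run the examples listed at the bottom of a help page
example(median)
```

In RStudio the pages open in the Help pane; in a plain terminal they are shown as text.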
A data frame is a fancy list. Let’s load the data frame from the lecture:
library(datasets)
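# stringsAsFactors=TRUE keeps Abbr a factor on R >= 4.0, matching the "Levels" output further down
states = data.frame(state.x77, Abbr=state.abb,
                    Region=state.region, Division=state.division,
                    stringsAsFactors=TRUE)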
class(states)
## [1] "data.frame"
typeof(states)
## [1] "list"
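Because a data frame is stored as a list of equal-length columns, ordinary list tools work on it. A short sketch of what that means in practice; it rebuilds `states` so that the block runs on its own:

```r
library(datasets)
states = data.frame(state.x77, Abbr = state.abb,
                    Region = state.region, Division = state.division)

is.list(states)        # TRUE: each column is one element of the list
length(states)         # the number of columns, not the number of rows
names(states)          # the list names are the column names
sapply(states, class)  # list functions iterate over the columns
```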
states['Alabama',]
##         Population Income Illiteracy Life.Exp Murder HS.Grad Frost  Area
## Alabama       3615   3624        2.1    69.05   15.1    41.3    20 50708
##         Abbr Region           Division
## Alabama   AL  South East South Central
head( states['Population'] ,10)
##             Population
## Alabama           3615
## Alaska             365
## Arizona           2212
## Arkansas          2110
## California       21198
## Colorado          2541
## Connecticut       3100
## Delaware           579
## Florida           8277
## Georgia           4931
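You can also extract the column with states$Population, which returns a plain vector rather than a one-column data frame. To pick out a single value, give both a row and a column: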
states['Alabama','Population']
## [1] 3615
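The difference between the extraction operators is easy to miss, so here is a small sketch comparing them. The variable names `pop_df`, `pop_vec` and `pop_too` are made up for this illustration:

```r
library(datasets)
states = data.frame(state.x77, Abbr = state.abb,
                    Region = state.region, Division = state.division)

pop_df  = states['Population']     # single brackets keep the data frame
pop_vec = states$Population        # $ extracts the column itself
pop_too = states[['Population']]   # [[ ]] does the same as $ here

class(pop_df)                 # "data.frame"
class(pop_vec)                # "numeric"
identical(pop_vec, pop_too)   # TRUE
```

Use single brackets when you want to keep working with a table, and `$` or `[[ ]]` when you need the values themselves, for example to pass to `mean()`.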
“Slicing” a data frame returns a new data frame with the desired rows and/or columns:
states[1:4,c('Population','Income')]
##          Population Income
## Alabama        3615   3624
## Alaska          365   6315
## Arizona        2212   4530
## Arkansas       2110   3378
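Let’s look at the big states. Population is recorded in thousands, so the condition below selects states with more than 10 million people: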
states[states$Population>10000,c('Area','Region')]
##                Area        Region
## California   156361          West
## Illinois      55748 North Central
## New York      47831     Northeast
## Ohio          40975 North Central
## Pennsylvania  44966     Northeast
## Texas        262134         South
You can also save this condition in a logical vector and reuse it:
bigstates = states$Population>10000
states[bigstates,'Abbr']
## [1] CA IL NY OH PA TX
## 50 Levels: AK AL AR AZ CA CO CT DE FL GA HI IA ID IL IN KS KY LA MA ... WY
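A saved logical vector can be counted, converted to positions, or combined with other conditions. A brief sketch; the name `big_south` is introduced here only for illustration:

```r
library(datasets)
states = data.frame(state.x77, Abbr = state.abb,
                    Region = state.region, Division = state.division)

bigstates = states$Population > 10000

sum(bigstates)     # TRUE counts as 1, so this is the number of big states
which(bigstates)   # row positions of the TRUE entries

# Combine conditions element by element with & (and) or | (or)
big_south = bigstates & states$Region == 'South'
rownames(states)[big_south]
```

Use the single `&` and `|` here: the doubled forms `&&` and `||` are meant for single TRUE/FALSE values, as in an `if` condition.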
A for loop: generate the first 10 elements of the Fibonacci sequence.
fibonacci = c(1,1)
for (i in 3:10) {
  fibonacci[i] = fibonacci[i-1] + fibonacci[i-2]
}
fibonacci
## [1] 1 1 2 3 5 8 13 21 34 55
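The same loop can be wrapped in a function so that the length is a parameter. This is one possible sketch rather than the lecture's version; `fib` is a hypothetical name. It preallocates the result and guards the loop, because `3:n` counts downwards when `n` is less than 3.

```r
# Return the first n Fibonacci numbers (n >= 1)
fib = function(n) {
  stopifnot(n >= 1)
  out = numeric(n)              # preallocate instead of growing the vector
  out[seq_len(min(n, 2))] = 1   # the first one or two elements are 1
  if (n >= 3) {                 # without this guard, 3:2 would give c(3, 2)
    for (i in 3:n) {
      out[i] = out[i-1] + out[i-2]
    }
  }
  out
}

fib(10)
## [1]  1  1  2  3  5  8 13 21 34 55
```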
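A while loop: a random walk. Start at 0 and take random steps of +1 or -1 until the position is 10 away from the start. The steps are random, so your numbers will differ from run to run.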
position = 0
total_steps = 0
while (abs(position) < 10) {
  my.rand = runif(1)          # one uniform draw between 0 and 1
  if (my.rand > 0.5) {
    step = 1
  } else {
    step = -1
  }
  position = position + step
  total_steps = total_steps + 1
}
total_steps
## [1] 26
position
## [1] 10
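To repeat the experiment many times it helps to wrap the walk in a function and fix the random seed. A sketch under some assumptions: `random_walk` and its `boundary` argument are hypothetical names, and `sample()` replaces the `runif()` test used above.

```r
# Walk from 0 in steps of +1 or -1 until |position| reaches the boundary
random_walk = function(boundary = 10) {
  position = 0
  total_steps = 0
  while (abs(position) < boundary) {
    position = position + sample(c(-1, 1), 1)   # each direction equally likely
    total_steps = total_steps + 1
  }
  c(position = position, total_steps = total_steps)
}

set.seed(1)       # makes the run reproducible
random_walk()

# Average number of steps over many walks
steps = replicate(1000, random_walk()[['total_steps']])
mean(steps)
```

For a symmetric walk the expected number of steps to get a distance `boundary` from the start is `boundary^2`, so the average should come out somewhere near 100.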
Basic Plotting: provide x and y data:
plot(states$Area,states$Population)
We can pass additional parameters to plot that tell it what the plot should look like.
plot(states$Area,states$Population,xlab='Area',ylab='Population',main='Nice Plot!')
Let’s split the states up by population density (here in thousands of people per square mile), using the median as the cutoff:
popdensity = states$Population/states$Area
dense = (popdensity>median(popdensity))
sparse = (popdensity<=median(popdensity))
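Then plot both groups on the same axes. plot() draws the first group and sets up the axes; points() adds the second group to the existing plot: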
plot(states$Area[dense],states$Population[dense],col='blue',xlab='Area',ylab='Population',main='Nice Colors!',xlim=c(0,6e5))
# points() adds to the existing plot; the axis limits were already set by plot()
points(states$Area[sparse],states$Population[sparse],col='red')
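An alternative is a single `plot()` call with one colour per point, plus a legend so the reader knows what the colours mean. A minimal sketch of that variation, not the lecture's approach:

```r
library(datasets)
states = data.frame(state.x77)
popdensity = states$Population / states$Area
dense = popdensity > median(popdensity)

# ifelse() builds a vector of colours, one entry per state
point_colors = ifelse(dense, 'blue', 'red')

plot(states$Area, states$Population, col = point_colors,
     xlab = 'Area', ylab = 'Population', main = 'Nice Colors!')
legend('topright', legend = c('dense', 'sparse'),
       col = c('blue', 'red'), pch = 1)
```

Since all points go through one `plot()` call, the axis limits are chosen to fit every state and no `xlim` is needed.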
You can produce side-by-side plots by setting a global graphics parameter. Here mfrow=c(1,2) asks for one row and two columns of plots:
par(mfrow=c(1,2))
plot(states$Area,states$Population,xlab='Area',ylab='Population',main='Super')
plot(states$Area,states$Income,xlab='Area',ylab='Income',main='Sweet!')
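Because `par()` changes a global setting, every later plot keeps the two-column layout until you change it back. One common pattern, sketched here with the hypothetical name `old_par`, is to save the previous settings and restore them afterwards:

```r
library(datasets)
states = data.frame(state.x77)

old_par = par(mfrow = c(1, 2))   # par() returns the settings it replaced
plot(states$Area, states$Population, xlab = 'Area', ylab = 'Population')
plot(states$Area, states$Income, xlab = 'Area', ylab = 'Income')
par(old_par)                     # back to one plot per figure

par('mfrow')                     # check the current layout
```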