Basics of R
```r
library(tidyverse)
basicData <- read_csv("https://soc114.github.io/data/basicData.csv")
```
Now that you have installed R, it is time to try using R for the first time. Because many excellent resources already exist on this topic, we will walk through the material in R4DS Ch 2 together in class. This material will prepare you to complete Problem Set 1.
After the basics in R4DS Ch 2, return to this page and continue below to learn about an object we will often use in this course: a data frame.
Using a data frame
The object we will use most often in this course is a data frame (class data.frame), usually in the form of a tibble, which is a particular type of data frame. A data frame is a rectangular object, like the spreadsheet below.
| Address | Income | Number of People |
|---|---|---|
| 5462 Park St | 54,896 | 2 |
| 4596 Ocean Ave Apt B | 22,465 | 1 |
| 6831 River Dr | 134,297 | 4 |
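As a minimal sketch, the spreadsheet above could be entered by hand as a tibble. The object name `households` and the column names `address`, `income`, and `n_people` are assumptions chosen for illustration; they are not necessarily the names used in basicData.csv.

```r
library(tidyverse)

# Hypothetical column names; values copied from the table above
households <- tibble(
  address  = c("5462 Park St", "4596 Ocean Ave Apt B", "6831 River Dr"),
  income   = c(54896, 22465, 134297),
  n_people = c(2, 1, 4)
)

# Printing the object shows one row per household and one column per variable
households
```

Storing income as a number rather than as text such as "54,896" is a deliberate choice here, since it lets R compute summaries on the column.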
The code at the top of this page loads basicData.csv into an object in R named basicData.
If you type basicData in your R console, you will see a printout of the data.
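Beyond printing, a few functions describe the shape of a data frame. The sketch below uses a small stand-in object, `exampleData`, read from literal CSV text so that it runs without an internet connection. The column names are assumed and may differ from those in basicData.csv, but the same functions could be applied to `basicData`.

```r
library(tidyverse)

# Stand-in for the course file: read_csv() parses literal CSV text wrapped in I()
# (requires readr 2.0 or later; column names here are assumed)
exampleData <- read_csv(
  I("address,income,n_people\n5462 Park St,54896,2\n4596 Ocean Ave Apt B,22465,1\n6831 River Dr,134297,4"),
  show_col_types = FALSE
)

nrow(exampleData)     # number of rows
names(exampleData)    # column names
glimpse(exampleData)  # one line per column, with its type and first values
```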
Data frames and empirical questions
The structure of these data corresponds to elements of the empirical questions we study.
- The outcome is a variable to be summarized. It appears as a column in the data. Here, the outcome might be income.
- The unit of analysis is the unit for which the outcome is defined. Each unit corresponds to a row of the data. Here, the unit of analysis is a household.
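The two elements above map onto simple operations on a data frame. The sketch below assumes a hand-built tibble named `households` with hypothetical column names, using the values from the spreadsheet earlier on this page. Counting rows counts units of analysis, and summarizing a column summarizes the outcome.

```r
library(tidyverse)

# Hypothetical object and column names; values from the example spreadsheet
households <- tibble(
  address  = c("5462 Park St", "4596 Ocean Ave Apt B", "6831 River Dr"),
  income   = c(54896, 22465, 134297),
  n_people = c(2, 1, 4)
)

# Unit of analysis: each row is one household
n_households <- nrow(households)

# Outcome: summarize the income column across households
income_summary <- households |>
  summarize(mean_income = mean(income), median_income = median(income))

n_households
income_summary
```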
The data may also contain other columns, such as the address of the household in this example. The other columns may represent predictor variables or variables that define population subgroups.
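One way another column can define population subgroups is sketched below. The subgroup used here, households with more than one person, is purely illustrative, and the object and column names are again assumptions rather than names taken from basicData.csv.

```r
library(tidyverse)

# Hypothetical object and column names; values from the example spreadsheet
households <- tibble(
  address  = c("5462 Park St", "4596 Ocean Ave Apt B", "6831 River Dr"),
  income   = c(54896, 22465, 134297),
  n_people = c(2, 1, 4)
)

# Define a subgroup indicator, then summarize the outcome within each subgroup
subgroup_summary <- households |>
  mutate(multi_person = n_people > 1) |>
  group_by(multi_person) |>
  summarize(mean_income = mean(income), n_households = n())

subgroup_summary
```

The result has one row per subgroup, so the unit of the summary is the subgroup while the unit of the underlying data remains the household.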
Now that you understand a data frame object, we will next use one to produce a visualization.