In this lab, you will…
Go to the sta199-fa21-003 organization on GitHub. Click on the repo with the prefix lab-02. It contains the starter documents you need to complete the lab.
Clone the repo and start a new project in RStudio. See the Lab 01 instructions for details on cloning a repo and starting a new R project.
We will use the tidyverse and viridis packages to create and customize plots in R.
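A minimal sketch of the setup, assuming both packages are already installed:

```r
# Load the packages used in this lab.
library(tidyverse)  # attaches ggplot2, dplyr, and the other core tidyverse packages
library(viridis)    # provides scale_color_viridis() and scale_fill_viridis()
```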
The data in this lab is in the `midwest` data frame. It is part of the ggplot2 R package, so the `midwest` data set is automatically loaded when you load the tidyverse package.
The data contains demographic characteristics of counties in the Midwest region of the United States.
Because the data set is part of the ggplot2 package, you can read documentation for the data set, including variable definitions, by typing `?midwest` in the console.
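Besides the help page, a quick look at the data frame itself can be useful. The following is one possible way to do this, not a required step of the lab:

```r
library(tidyverse)

# Open the help page for the data set (run interactively in the console):
# ?midwest

# Print each column's name, type, and first few values.
glimpse(midwest)

# List the variable names only.
names(midwest)
```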
As we’ve discussed in lecture, your plots should include an informative title, axes should be labeled, and careful consideration should be given to aesthetic choices.
In addition, the code should not exceed the 80 character limit, so that all the code can be read when you knit to PDF. To help with this, you can add a vertical line at 80 characters by clicking “Tools” \(\rightarrow\) “Global Options” \(\rightarrow\) “Code” \(\rightarrow\) “Display”, then set “Margin Column” to 80, and click “Apply”.
Remember that continuing to develop a sound workflow for reproducible data analysis is important as you complete the lab and other assignments in this course. There will be periodic reminders in this assignment to knit, commit, and push your changes to GitHub. You should have at least 3 commits with meaningful commit messages by the end of the assignment.
For more details and code examples for histograms, see the ggplot2 reference page.
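As an illustration of the general pattern, here is a small histogram sketch. It uses the `mpg` data set that ships with ggplot2 rather than `midwest`; the variable, bin width, and labels are choices made only for this example.

```r
library(tidyverse)

# Histogram of highway fuel economy in ggplot2's built-in mpg data.
hwy_hist <- ggplot(data = mpg, aes(x = hwy)) +
  geom_histogram(binwidth = 2) +
  labs(
    title = "Distribution of highway fuel economy",
    x = "Highway miles per gallon",
    y = "Count"
  )

hwy_hist
```

Changing `binwidth` changes how smooth or jagged the histogram looks, so it is worth trying a few values.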
See Introduction to the viridis color maps to read more about the viridis R package and see code examples.
`percollege`) versus percentage below poverty (`percbelowpoverty`) with points colored by `state`. Label the axes and legend and give the plot a title. Use the `scale_color_viridis` function to apply the viridis color palette to your plot.
🧶 ✅ ⬆️ Knit, commit, and push your changes to GitHub with the commit message “Added answer for Ex 1 -2”. Make sure to commit and push all changed files so that your Git pane is empty afterwards.
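The sketch below shows how `scale_color_viridis` attaches to a scatterplot. It is written against ggplot2's built-in `mpg` data, not `midwest`, and the variables and labels are assumptions made for illustration. Because the color variable here is categorical, the example passes `discrete = TRUE`.

```r
library(tidyverse)
library(viridis)

# Scatterplot with points colored by a categorical variable,
# using the viridis palette for the color scale.
mpg_scatter <- ggplot(data = mpg, aes(x = displ, y = hwy, color = class)) +
  geom_point() +
  scale_color_viridis(discrete = TRUE) +
  labs(
    title = "Highway fuel economy versus engine size",
    x = "Engine displacement (liters)",
    y = "Highway miles per gallon",
    color = "Vehicle class"
  )

mpg_scatter
```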
`se = FALSE` removes the confidence bands around the line.
`geom_smooth` with the argument `se = FALSE` to add a smooth curve fit to the data. Which plot do you prefer - this plot or the plot in Ex 2? Briefly explain your choice.
🧶 ✅ ⬆️ Now is another good time to knit, commit, and push your changes to GitHub with a meaningful commit message.
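To see what `se = FALSE` does in practice, here is a short sketch on the built-in `mpg` data (again an illustrative stand-in, not the lab data). With no `method` given, `geom_smooth` picks a default smoother and prints a message saying which one it used.

```r
library(tidyverse)

# Scatterplot with a smooth curve; se = FALSE drops the confidence band.
mpg_smooth <- ggplot(data = mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  geom_smooth(se = FALSE) +
  labs(
    title = "Highway fuel economy versus engine size",
    x = "Engine displacement (liters)",
    y = "Highway miles per gallon"
  )

mpg_smooth
```

Removing `se = FALSE` (or setting it to `TRUE`) brings back the shaded band, which makes the difference easy to compare.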
`area`) of a county based on state (`state`).
`metro`, whether a county is considered in a metro area. The y axis of the segmented barplot should range from 0 to 1.
Note: For this exercise, you should begin with the data wrangling code below. We will learn more about data wrangling code next week.
midwest <- midwest %>%
  mutate(metro = ifelse(inmetro == 1, "Yes", "No"))
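The same two ideas, a Yes/No variable built with `ifelse` and a bar plot whose y axis runs from 0 to 1, can be sketched on the built-in `mpg` data. The `automatic` variable and the `mpg_auto` name are hypothetical and exist only in this example. `position = "fill"` is what rescales each bar to proportions.

```r
library(tidyverse)

# Derive a Yes/No indicator, following the same ifelse() pattern as above.
mpg_auto <- mpg %>%
  mutate(automatic = ifelse(startsWith(trans, "auto"), "Yes", "No"))

# Segmented bar plot: each bar is stacked to a total height of 1.
mpg_bars <- ggplot(data = mpg_auto, aes(x = class, fill = automatic)) +
  geom_bar(position = "fill") +
  labs(
    title = "Share of automatic transmissions by vehicle class",
    x = "Vehicle class",
    y = "Proportion",
    fill = "Automatic"
  )

mpg_bars
```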
🧶 ✅ ⬆️ Now is another good time to knit, commit, and push your changes to GitHub with a meaningful commit message.
`ggplot2` reference page will be helpful in determining the theme. The `size` of the points is 0.75.)
🧶 ✅ ⬆️ Knit, commit, and push your final changes to GitHub with a meaningful commit message.
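For reference, point size and theme are both set with ordinary ggplot2 layers. The sketch below uses the built-in `mpg` data, and `theme_minimal()` is an arbitrary choice for illustration, not necessarily the theme this exercise calls for.

```r
library(tidyverse)

# size is set outside aes() because it is a fixed value, not mapped to data.
mpg_themed <- ggplot(data = mpg, aes(x = displ, y = hwy)) +
  geom_point(size = 0.75) +
  theme_minimal() +
  labs(
    title = "Highway fuel economy versus engine size",
    x = "Engine displacement (liters)",
    y = "Highway miles per gallon"
  )

mpg_themed
```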
Once you are finished with the lab, you will submit the PDF document produced from your final knit, commit, and push to Gradescope.
Before you wrap up the assignment, make sure all documents are updated on your GitHub repo. We will be checking these to make sure you have been practicing how to commit and push changes. Remember – you must turn in a .pdf file to the Gradescope page by the submission deadline to be considered “on time”.
To submit your assignment:
Component | Points |
---|---|
Ex 1 | 4 |
Ex 2 | 6 |
Ex 3 | 4 |
Ex 4 | 8 |
Ex 5 | 6 |
Ex 6 | 6 |
Ex 7 | 8 |
Workflow & formatting | 8 |
Grading notes: