Course: Data Wrangling with R

Welcome to Data Wrangling with R! This course provides an intensive, hands-on introduction to data wrangling with the R programming language. You will learn the fundamental skills required to acquire, munge, transform, manipulate, and visualize data in a computing environment that fosters reproducibility.

Class Information

Course Objectives

Upon successfully completing this course, you will be able to:

  • Perform your data analysis in a literate programming environment
  • Manage different types of data
  • Manage different data structures
  • Import, scrape, and export data
  • Index, subset, reshape, and transform your data
  • Compute descriptive statistics
  • Visualize data
  • Perform iterative operations
  • Write your own functions

…all with R!

Class Structure

Each week you will read selected tutorials on a specific data wrangling activity in R, and I will assign problems or activities to complete before each Saturday session. In each Saturday class, I’ll spend about 30 minutes reviewing the data wrangling activity and answering any burning questions. You will then break into predefined small groups to review each other’s code and approaches to the assigned problems. Finally, for the last hour of class, you and your small group will work together on another problem within the same data wrangling domain.

This course structure serves several purposes:

  1. It will teach you to read R programming tutorials and learn new techniques on your own
  2. The out-of-class assignments will ensure you come to each class prepared and will also prepare you for your final project
  3. The in-class peer review will help you get feedback on your code and also teach you to review other people’s code
  4. The in-class small group work will teach you to work on a coding task collaboratively and within a constrained time limit

Material

All required classroom material will be provided in class or online. Recommended but optional material will also be listed in the classroom notes.

Schedule

Session 1: Introduction
  • Intro to data wrangling, R, and course outline
  • Managing your workflow with RStudio Projects, R Markdown, and R Notebook
Session 2: Getting Your Data
  • Importing and exporting data
  • Scraping text & tables
Session 3: First Date Guidelines for Data
  • Understanding the basics of your data
  • Initial visualizations
Session 4: Exploratory Data Analysis
  • Transforming your data
  • Advancing your visualizations
Session 5: Controlling Your Data
  • Data frames vs. Tibbles
  • Tidy data
  • Relational data
Session 6: Dealing with Different Types of Data
  • Strings
  • Factors
  • Dates and times
Session 7: Creating Efficient Code in R
  • Writing functions
  • Iteration

Grading Policies

Course grades will consist of:

Final letter grades will be assigned according to the following cutoffs:

  • A     94 – 100%
  • A-    90 – 93%
  • B+    87 – 89%
  • B     83 – 86%
  • B-    80 – 82%
  • C+    77 – 79%
  • C     73 – 76%
  • C-    70 – 72%
  • D & F   Hopefully none!

Software

We will use the following software during the course. Plan on bringing a computer to each class meeting.

  • R and RStudio will be used to perform all programming activities, assignments, and the final project. You can find details on how to download these here.
  • Slack will replace e-mail and Blackboard for our course. You will receive an invitation to the UC Data Wrangling Slack team. You may wish to install one of the apps. Here is an introduction to Slack from one of Kris Shaffer’s courses. Although it was written for a completely different course and Slack team, it provides a nice introduction that you might find useful.

Fine Print

Academic Integrity: As with all Lindner College of Business efforts, this course will uphold the highest ethical standards, which are critical to building character. LCB instructors are required to report ANY incident of academic misconduct (e.g., cheating, plagiarism) to the college review process, which could result in severe consequences, including potential dismissal from the college. For further information on academic misconduct or related university policies and procedures, please see the UC Student Code of Conduct.

All academic programs at the Lindner College of Business apply a “Two Strikes Policy” regarding Academic Integrity. Any student who has been found responsible for two cases of academic misconduct may be dismissed from the College. The “Two Strikes Policy” supplements the UC Student Code of Conduct.

All cases of academic misconduct (e.g., cheating, plagiarism, falsification) will be formally reported by faculty. Students will be afforded due process for allegations, as outlined in the policy. If a student is found guilty of academic misconduct in two instances, the student may be dismissed from the Lindner College of Business. The “Two Strikes Policy” is now in effect.

Disability: Students with disabilities who need academic accommodations or other specialized services while attending the University of Cincinnati will receive reasonable accommodations to meet their individual needs as well as advocacy assistance on disability-related issues. Students requiring special accommodation must register with the Disability Services Office.

Attendance: Your attendance is expected at every meeting. If you must be absent, I request that you notify me in advance of the class meeting.

Grade appeals: If you think a grade on your work (homework, peer reviews, participation) was miscalculated, you have the right to appeal. The appeal must be submitted by email within 7 calendar days of the grade being released or posted. After that, your grade is final and will not be changed.

Acknowledgments: I have drawn ideas or readings from the following syllabi: