“What we have is a data glut.” - Vernor Vinge
Data are being generated by everything around us at all times. Every digital process and social media exchange produces them. Systems, sensors, and mobile devices transmit them. Countless databases collect them. Data are arriving from multiple sources at an alarming rate, and analysts and organizations are seeking ways to leverage these new sources of information. Consequently, analysts need to understand how to get data from these sources. Furthermore, since analysis is often a collaborative effort, analysts also need to know how to share their data.
Welcome to week 2! This week we will cover the process of importing, exporting, and scraping data. First, you will learn the basics of importing tabular and spreadsheet data. You will also cover the equally important process of getting data out of R. Then, since modern-day data wrangling often includes scraping data from the flood of web-based data becoming available to organizations and analysts, you will learn the fundamentals of web scraping with R. This includes importing spreadsheet data files stored online, scraping HTML text and data tables, and leveraging APIs.
Consequently, this week will give you a strong foundation for the different ways to get your data into and out of R. This will prepare you for your first challenge in completing your course project - that of acquiring your data!
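As a small preview of getting data into and out of R, the sketch below writes a data frame to a CSV file and reads it back using only base R. The `scores` data frame and its values are made up for illustration, and the file is written to a temporary location; the tutorials may well use other functions or packages for the same job.

```r
# Hypothetical example data (not part of the course materials)
scores <- data.frame(
  student = c("A", "B", "C"),
  score   = c(90.5, 85, 77.25),
  stringsAsFactors = FALSE
)

# Export: write the data frame to a CSV file in a temporary location
csv_path <- tempfile(fileext = ".csv")
write.csv(scores, csv_path, row.names = FALSE)

# Import: read the same file back into R as a new data frame
scores_in <- read.csv(csv_path, stringsAsFactors = FALSE)

# Inspect the imported data
print(scores_in)
```

Setting `row.names = FALSE` keeps R from writing its row labels as an extra column, which would otherwise show up as an unnamed first column on import.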
Please work through the following tutorials prior to Saturday’s class. The skills and functions introduced in these tutorials will be necessary to complete your assignment, which is due at the beginning of Saturday’s class, and will also be used in Saturday’s in-class small group work.
Use head() to display the first few rows of each data frame, use str() to display the structure of each data frame, and be sure that each code chunk fully displays your code.
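For reference, here is a minimal sketch of what those two calls look like in a code chunk. It uses `mtcars`, a data set that ships with R, purely as a stand-in; in your report you would call the same functions on the data frames you import.

```r
# Built-in example data set, used here only as a stand-in for imported data
data(mtcars)

# Display the first few rows of the data frame (six by default)
head(mtcars)

# Display the structure: number of observations and variables,
# plus each column's name, type, and first few values
str(mtcars)
```

If you want a different number of rows, pass the `n` argument, for example `head(mtcars, n = 10)`.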
Submission: Knit this R Markdown document to an HTML file, publish it on RPubs, and send me the URL for your published report prior to class (either by email or through Slack messenger). Also, knit to a PDF document and bring this document to class on Saturday for submission.