Resource Center

Here are links to a variety of data science and business analytics resources.

General business analytics, data science, statistical modeling

Careers

Programming tutorial hubs

Online courses

There are numerous online courses available through DataCamp, Coursera, edX, Udemy and others. Here are a few Python and R ones I’ve checked out over the years.

  • Intro to Data Science in Python - I did this short course in Feb 2017 (Coursera UMich). Great fun. If you want a good pandas/Python learning challenge, try the assignments.

  • Python for Everybody course - This site includes a bunch of videos and supplementary files. The whole thing was created by a professor at University of Michigan and is meant to be a totally open set of freely available learning materials for Python in the context of data analysis.

  • Coursera has some well regarded R based data science courses

Learning the command line

As you’ve no doubt gathered from this class, I’m a big fan of using the command line for certain data related tasks and think that command line skills are really important. There’s a new second edition of the O’Reilly book “Data Science at the Command Line”, and it is FREELY available online.

https://datascienceatthecommandline.com/ - home page for the book

https://datascienceatthecommandline.com/2e/ - the free 2nd edition

Chapter 1 is a great overview of why you should become adept at using the command line.

Learning R

Online R tutorials, books and examples for getting started

  • R-bloggers - The aggregator for R related blogs.

  • R for Data Science - Free, online version of the book, R for Data Science by Hadley Wickham and Garrett Grolemund.

  • Quick-R - This is a great site dedicated to helping R newbies get over the somewhat steep R learning curve.

  • fasteR: The fast lane to learning R - created by Norm Matloff who is a big proponent of learning base R first before doing things with tidyverse packages.

  • STAT 545 - Data wrangling, exploration, and analysis with R - Jenny Bryan’s course developed at UBC and still used even though she has moved on to RStudio. Not only does this cover R, it also gets into things like version control, web scraping and Shiny.

  • Cookbook for R - Another great site for learning R. In their words: “The goal of the cookbook is to provide solutions to common tasks and problems in analyzing data.”

  • Webinars from RStudio - The creators of the hugely popular RStudio IDE have a ton of learning resources on their site.

  • The Official R Manuals - These are accessible from the main R Project page in the Documentation section.

  • Contributed Documentation - Many people have written tutorials, books, and other free documentation for various aspects of R. This is part of the magic of the R community.

  • Introducing R to a non-programmer in one hour - Just what it says.

  • Teach yourself Shiny - A somewhat recent development by the folks at RStudio is something called a Shiny web app. Learn to create interactive, R driven web apps!

The base R vs tidyverse debate

The tidyverse has become increasingly popular, and with this popularity has come more scrutiny. In particular, there’s a healthy debate on whether new R users should first learn base R and then move on to the tidyverse or whether they should immediately be taught the tidyverse approach. It really isn’t an either-or question, and in this course you will learn both base R and tidyverse approaches. I do start with base R because I think you need a good understanding of things like vectors to make the most of the R language. At the end of the day, we use R to solve problems, and the more tools you have to tackle those problems, the better off you will be. A few good resources on this debate include the following.

Packages

The R ecosystem relies on high quality packages and its community of package developers. Here are some collections of package descriptions and links.

  • RStartHere - A very comprehensive and well organized list of packages for doing data science in R.

  • Awesome R - Curated list of R packages by category (IDE, data manipulation, etc.)

Learning Python

Online Python tutorials, books and examples for getting started

Blogs and listservs

  • Practical Business Python - Super relevant blog for business students learning Python.

  • Pycoders Weekly - Weekly email newsletter. Always has interesting stuff and almost always something directly data science related.

Libraries

  • Awesome Python - A curated list of awesome Python frameworks, libraries, software and resources

Statistics

If you are rusty on statistics, there’s a really good OpenIntro Stats book available as a free online book or you can pay what you want for a paperback copy. It includes R based material.

You can also find high quality free online statistics courses through the Open Learning Initiative as well as places like Coursera and edX.

Cross Validated is a great Q&A forum for all things statistics. Lots of R related content.

Publicly available data

Workflow and reproducible analysis