Data Processing in Shell
Learn powerful command-line skills to download, process, and transform data, including machine learning pipeline.
We live in a busy world with tight deadlines. As a result, we fall back on what is familiar and easy, favoring GUI interfaces like Anaconda and RStudio. However, taking the time to learn data analysis on the command line is a great long-term investment because it makes us stronger and more productive data people. In this course, we will take a practical approach to learn simple, powerful, and data-specific command-line skills. Using publicly available Spotify datasets, we will learn how to download, process, clean, and transform data, all via the command line. We will also learn advanced techniques such as command-line based SQL database operations. Finally, we will combine the powers of command line and Python to build a data pipeline for automating a predictive model.
Instructor: Susan Sun