A) Introduction to Data Science (W35-36)

Note: The Group Portfolio Assignment on Exploratory Data Analysis (EDA) is due Friday, 8 September 2023, at 12:00 PM (noon).

This topic comprises five sessions:

  • Welcome to Data Science! (Friday, September 1st, 10:15-14:15): This session will introduce students to the fundamentals of data science, with a focus on Python. Students will learn about the Python data science stack, essential tools and platforms, and software setup. They will also get a preview of the upcoming weeks and a refresher on Python basics.
  • Data Handling and Manipulation I (Lecture) (Monday, September 4th, 12:30-16:15): This session will cover the foundations of data handling in Python. Students will learn about the types of data that matter in data science and practice the essential operations: arrange, group-by, filter, select, and join. By the end of this session, students should have a solid understanding of the primary data manipulation techniques.
  • Exploratory Data Analysis & Essential Statistics (Tuesday, September 5th, 08:15-12:00): This session will introduce students to exploratory data analysis (EDA) and essential statistics. Students will learn how to use EDA to uncover patterns, spot anomalies, and frame questions about the data. They will also learn foundational measures and techniques for interpreting data.
  • EDA-Exercise on a Real Dataset (Tuesday, September 5th, 12:30-14:15): This session will give students the opportunity to practice the EDA techniques from the morning session on a real dataset. Students will identify patterns and insights and implement key EDA methods, with on-the-spot guidance from the teacher and TAs during the exercise.
  • Data Visualization in Data Science (Wednesday, September 6th, 10:15-14:15): This session will teach students the importance of effective data visualization in data science. Students will explore Seaborn, a Python library for intuitive statistical graphics, and Altair, a declarative visualization library for Python. They will also have the opportunity to create impactful visualizations with real datasets through hands-on exercises.
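
As a preview of the five operations named in the Data Handling and Manipulation session (select, filter, join, group-by, arrange), here is a minimal sketch. It assumes pandas as the data-handling library, which the schedule does not name explicitly, and uses two small tables (`orders`, `customers`) invented purely for illustration. It is not the course's official material.

```python
import pandas as pd

# Hypothetical toy data, invented for this illustration only.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4, 5],
    "customer_id": [10, 11, 10, 12, 11],
    "amount": [25.0, 40.0, 15.0, 60.0, 5.0],
})
customers = pd.DataFrame({
    "customer_id": [10, 11, 12],
    "city": ["Aalborg", "Aarhus", "Aalborg"],
})

# select: keep only the columns needed for the analysis
selected = orders[["customer_id", "amount"]]

# filter: keep only rows that satisfy a condition
large = selected[selected["amount"] >= 15.0]

# join: attach customer attributes via the shared key
joined = large.merge(customers, on="customer_id", how="left")

# group-by: aggregate the order amounts per city
totals = joined.groupby("city", as_index=False)["amount"].sum()

# arrange: sort the aggregated rows, largest total first
ranked = totals.sort_values("amount", ascending=False).reset_index(drop=True)

print(ranked)
```

Each step returns a new DataFrame, so intermediate results can be inspected one at a time. The same steps can also be chained into a single expression once they are familiar.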