Literature & Resources

While this course does not come with a list of mandatory readings, we will often refer to some central resources in R and Python, most of which are freely available in an up-to-date online version. We recommend using these resources for problem-solving and further self-study on the topic.

Main Literature

These works are the main references for data science using Python. We will frequently refer to selected chapters for further study. Documentation for the packages used, tutorials, papers, podcasts etc. will be added throughout.

  • VanderPlas, J. (2016). Python Data Science Handbook: Essential Tools for Working with Data. O’Reilly Media. Available online.
  • Wilke, C. O. (2019). Fundamentals of Data Visualization: A Primer on Making Informative and Compelling Figures. O’Reilly Media.

Supplementary Literature

Note: Papers, Business Cases, Videos, Tutorials, Podcasts, and Blogposts will be presented and assigned during the course.

Further Resources

Data Science Cloud services

  • Notebook-based:
    • Google Colab: Google's popular service for editing, running & sharing Jupyter notebooks (Python kernel only by default, but an R kernel can be accessed via some tricks)
    • Deepnote: New popular online notebook service with good integration with other services (Python, R & more)
    • Kaggle: Also provides its own cloud-based service to create and run computational notebooks. Convenient, unlimited, but a bit slow (Python, R).
  • Instance-based:
    • UCloud: New cloud infrastructure provided by AAU, AU, SDU
    • AAU Strato: AAU CLAUDIA infrastructure. Very powerful, but access requires some experience working in a terminal.

Community

  • Kaggle: Crowdsourced data science challenges. Nowadays it also hosts a vibrant community where you can find datasets and notebooks for all kinds of data science exercises.
  • madewithml

Tools & Helpers