Data Engineering and Machine Learning Operations in Business

This module equips students with the knowledge and skills required to design, develop, and implement data science projects in both business and research environments. Central to the module is a practical understanding of how to acquire, process, and store real-world data within a big data framework.

Students will learn to query databases through application programming interfaces (APIs), use common database frameworks for structured and unstructured data, and handle dynamic and large-scale data effectively. The module also covers refactoring machine learning models and their associated code for deployment in web-based applications.

Throughout the course, students will engage in hands-on activities that mirror the real-world challenges of deploying machine learning models as end-to-end solutions. This practical approach covers the full data science workflow, from data acquisition and processing to the operational deployment of machine learning models.

By the end of this module, students will understand the processes, techniques, and workflows needed to deliver functional machine learning solutions. They will be able to plan, manage, and execute complex machine learning projects independently, including developing client-facing application interfaces, with an emphasis on practical application and industry relevance.

Session Overview

Lecture 1: Introduction to Serverless ML and Databases

  • Building end-to-end services with ML models without complex infrastructure. Introduction to databases and API consumption.

Lecture 2: Serverless ML Pipelines in Python

  • Creating ML pipelines in Python and running a first prediction service using serverless technologies.

Lecture 3: Feature Engineering and Data Modeling

  • Data modeling and use of a serverless feature store. Introduction to the credit-card fraud prediction service.

Lecture 4: MLOps and Model Management

  • Exploring training and inference pipelines and the model registry within an MLOps framework.

Lecture 5: Serverless UI for ML Systems

  • Creating serverless user interfaces for ML systems, with a focus on stakeholder communication and interactive UI design.

Lecture 6: Automated Testing and Versioning in MLOps

  • MLOps principles in practice: automated testing, version control, and managing upgrades and rollbacks of ML models.

Lecture 7: Advanced MLOps Strategies

  • Continuation of MLOps principles with advanced strategies for maintaining and updating machine learning systems.

Lecture 8: Real-time Machine Learning Systems

  • Developing and managing real-time machine learning systems and online inference pipelines.

Lecture 9: Emerging Technologies in MLOps

  • An outlook on other relevant technologies for MLOps, exploring cutting-edge tools and methodologies in the field.

Literature