Portfolio Assignment 2
Note: M3 - Group Assignment 2. Deadline: Monday, 17 February.

Introduction
This assignment is designed to deepen your understanding of modern deep learning techniques, with a focus on classification. You will fine-tune a transformer model for one of the following classification tasks:
- Binary Classification
- Multi-Class Classification
- Multi-Label Classification
You can choose any dataset (from 🤗 Hugging Face Datasets, Kaggle, or your own collection) that is suitable for the classification task you wish to address.
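The three task types listed above differ mainly in how the target labels are represented. The following is a minimal sketch in plain Python; the class names and helper functions (`class_index`, `multi_hot`) are hypothetical and only illustrate the idea.

```python
from typing import List, Sequence


def class_index(label: str, classes: Sequence[str]) -> int:
    """Binary and multi-class: each example maps to exactly one class index."""
    return list(classes).index(label)


def multi_hot(labels: Sequence[str], classes: Sequence[str]) -> List[float]:
    """Multi-label: each example gets a 0/1 vector with one slot per class."""
    active = set(labels)
    return [1.0 if c in active else 0.0 for c in classes]


# Hypothetical label sets, for illustration only.
binary_classes = ["negative", "positive"]
topic_classes = ["sports", "politics", "tech"]

binary_target = class_index("positive", binary_classes)      # single index, 2 classes
multiclass_target = class_index("tech", topic_classes)       # single index, 3 classes
multilabel_target = multi_hot(["sports", "tech"], topic_classes)  # several labels at once
```

The multi-label vector uses floats because, in Hugging Face Transformers, setting `problem_type="multi_label_classification"` selects a binary cross-entropy loss that expects float targets.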
Task Description
Your assignment involves the following key steps:
Dataset Selection and Preprocessing
- Choose a dataset appropriate for your chosen classification task (binary, multi-class, or multi-label).
- Preprocess the data as needed (e.g., tokenization, handling class imbalance).
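One common way to handle the class imbalance mentioned above is to weight each class inversely to its frequency. The sketch below assumes integer class labels and uses the "balanced" heuristic `n_samples / (n_classes * count)`; the function name and the toy label list are illustrative, and other strategies (resampling, threshold tuning) are equally valid.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def balanced_class_weights(labels: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Return one weight per class, larger for rarer classes."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Toy imbalanced training labels: 8 examples of class 0, 2 of class 1.
train_labels = [0] * 8 + [1] * 2
weights = balanced_class_weights(train_labels)
print(weights)
```

If you train with PyTorch, weights like these can be passed as a tensor to `torch.nn.CrossEntropyLoss(weight=...)` so that errors on the minority class count more.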
Model Selection and Fine-Tuning
- Use a transformer model (e.g., BERT, RoBERTa, or DistilBERT) available from Hugging Face.
- Fine-tune the model on your selected dataset using libraries such as Simple Transformers or Hugging Face Transformers.
- Document your training process, including hyperparameter choices, evaluation metrics, and any challenges encountered.
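The fine-tuning steps above might look roughly like the following with Hugging Face Transformers. This is a sketch under stated assumptions, not a required setup: the `imdb` dataset, the `distilbert-base-uncased` checkpoint, and all hyperparameter values are placeholders, and the function is defined but not called because it downloads data and trains a model.

```python
# Hyperparameters worth recording in your documentation (placeholder values).
HYPERPARAMS = {
    "model_name": "distilbert-base-uncased",
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,
    "num_train_epochs": 2,
    "weight_decay": 0.01,
}


def fine_tune(output_dir: str = "finetuned-model"):
    """Fine-tune a transformer for binary classification (sketch)."""
    # Imported lazily so this file loads without the heavy dependencies.
    from datasets import load_dataset
    from transformers import (
        AutoModelForSequenceClassification,
        AutoTokenizer,
        DataCollatorWithPadding,
        Trainer,
        TrainingArguments,
    )

    # Assumption: a dataset with "text" and "label" columns.
    dataset = load_dataset("imdb")
    tokenizer = AutoTokenizer.from_pretrained(HYPERPARAMS["model_name"])

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True)

    tokenized = dataset.map(tokenize, batched=True)
    # Hold out part of the training data for validation; keep the
    # dataset's own test split for the final evaluation.
    split = tokenized["train"].train_test_split(test_size=0.1, seed=42)

    model = AutoModelForSequenceClassification.from_pretrained(
        HYPERPARAMS["model_name"], num_labels=2
    )
    args = TrainingArguments(
        output_dir=output_dir,
        learning_rate=HYPERPARAMS["learning_rate"],
        per_device_train_batch_size=HYPERPARAMS["per_device_train_batch_size"],
        num_train_epochs=HYPERPARAMS["num_train_epochs"],
        weight_decay=HYPERPARAMS["weight_decay"],
    )
    trainer = Trainer(
        model=model,
        args=args,
        train_dataset=split["train"],
        eval_dataset=split["test"],
        data_collator=DataCollatorWithPadding(tokenizer=tokenizer),
    )
    trainer.train()
    trainer.save_model(output_dir)
    tokenizer.save_pretrained(output_dir)
    return trainer
```

Keeping the hyperparameters in one dictionary makes it easy to copy them into your documentation and model card.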
Evaluation
- Evaluate your fine-tuned model using metrics appropriate to your task (e.g., accuracy, precision, recall, and F1-score; ROC-AUC for binary classification).
- Provide a brief analysis of the model’s performance on the test set.
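To make the metrics above concrete, here is a dependency-free sketch that computes accuracy, precision, recall, and F1 for a binary task from true and predicted labels. The function name and the toy label lists are illustrative; in practice, `sklearn.metrics` provides equivalent functions (including `roc_auc_score`, which needs predicted scores rather than hard labels).

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy test-set labels and predictions, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting precision and recall alongside accuracy matters most when classes are imbalanced, since accuracy alone can look good for a model that ignores the minority class.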
Deployment (Optional)
- Develop a simple web application using Streamlit or Gradio that lets users enter text and see the model's classification predictions.
- This step is meant to demonstrate how your model can be deployed in a real-world scenario.
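A Gradio version of the optional app could be sketched as follows. The helper `scores_to_label_dict`, the function `build_demo`, and the model ID in the usage comment are hypothetical; the app itself is only built when `build_demo` is called, since it needs `gradio`, `transformers`, and a fine-tuned model.

```python
from typing import Dict, List


def scores_to_label_dict(results: List) -> Dict[str, float]:
    """Convert pipeline output into the {label: confidence} mapping a label widget expects."""
    # Some pipeline configurations wrap the per-label scores in an extra list.
    if results and isinstance(results[0], list):
        results = results[0]
    return {item["label"]: float(item["score"]) for item in results}


def build_demo(model_id: str):
    """Build a text-in, label-out Gradio interface around a fine-tuned model."""
    # Imported lazily so this file loads without the optional dependencies.
    import gradio as gr
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)

    def predict(text: str) -> Dict[str, float]:
        # top_k=None asks the pipeline for scores of all labels.
        return scores_to_label_dict(classifier(text, top_k=None))

    return gr.Interface(fn=predict, inputs="text", outputs="label")


# Usage (not executed here; the model ID is a placeholder):
# build_demo("your-username/your-model").launch()
```

Separating the score conversion from the interface keeps the prediction logic easy to check without starting a web server.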
Upload to Hugging Face Hub
- Once fine-tuning is complete, upload your model to the Hugging Face Hub and provide a link to the repository.
- Ensure that your model card includes information about the dataset, training parameters, evaluation metrics, and intended use.
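The model card items listed above can be collected into a README for the Hub repository. The sketch below assembles such a card as plain text; the function `build_model_card`, the section headings, and all example values are illustrative assumptions, not a required template.

```python
from typing import Dict


def build_model_card(
    model_name: str,
    base_model: str,
    dataset: str,
    hyperparams: Dict[str, object],
    metrics: Dict[str, float],
    intended_use: str,
) -> str:
    """Assemble a minimal model card covering dataset, training, metrics, and use."""
    lines = [
        f"# {model_name}",
        "",
        f"Fine-tuned from `{base_model}`.",
        "",
        "## Dataset",
        dataset,
        "",
        "## Training parameters",
    ]
    lines += [f"- {name}: {value}" for name, value in hyperparams.items()]
    lines += ["", "## Evaluation metrics"]
    lines += [f"- {name}: {value:.3f}" for name, value in metrics.items()]
    lines += ["", "## Intended use", intended_use, ""]
    return "\n".join(lines)


# Placeholder values, for illustration only.
card = build_model_card(
    model_name="my-sentiment-classifier",
    base_model="distilbert-base-uncased",
    dataset="Describe and cite your dataset here.",
    hyperparams={"learning_rate": 2e-5, "num_train_epochs": 2},
    metrics={"accuracy": 0.75, "f1": 0.75},
    intended_use="Coursework demonstration; not intended for production use.",
)
print(card)

# Uploading (not executed here; requires authenticating with the Hub first):
# model.push_to_hub("your-username/my-sentiment-classifier")
# tokenizer.push_to_hub("your-username/my-sentiment-classifier")
```

The generated text can be saved as the repository's README.md so the card appears on the model page.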
Data
- You may use datasets from 🤗 Hugging Face or Kaggle, or create your own.
- Ensure that you cite your data source appropriately in your documentation.
Delivery
GitHub Repository
- Create a repository containing your code, notebooks, and documentation.
- Include a README.md with a brief description of your assignment, your approach, and instructions on how to run your code.
Colab or Jupyter Notebook
- Save your working notebook in the repository.
- Ensure that the notebook includes detailed explanations of each step, from data preprocessing to model evaluation.
Group Work
- You may work in groups of up to 3 members.
- Each group member’s contribution should be briefly outlined in the README or the notebook.
Technical Explainer Video
- Record a short (~5 minutes) technical explainer video presenting your main ideas, methodology, and results.
- You may use Panopto, OBS Studio, Loom, or any other screen-recording tool.
- Include the video link in your submission.
Submission
- Email Hamid (hamidb@business.aau.dk) the links to your GitHub repository and explainer video by the deadline.
Good luck and have fun exploring classification with transformer models in PyTorch and Hugging Face!