Group Assignment 3
Portfolio Exercise 3: GPT Models
Note: M3 - Group Assignment 3 Deadline: Wednesday 28th of February at 12:00 PM
Introduction
This assignment focuses on retrieval-augmented generation (RAG): extracting and synthesizing information from one or more documents. You will use LangChain to implement these concepts and build a system that retrieves relevant information from a database and uses it to generate responses.
Objective
Task Description
Your task is to create a system that uses RAG to extract information from one or more documents, such as scientific papers or reports. This involves integrating a database that stores vector representations of the documents and designing customized prompts so that GPT models generate answers from the retrieved content. Here are some project ideas:
- Build a QA system that retrieves information from one or more documents to answer complex queries.
- Develop a tool for summarizing research papers, where the system extracts key points from a database of paper vectors.
- Create a recommendation engine that suggests content based on user queries and retrieved document data.
- Explore other innovative applications of RAG, such as automated content generation, data analysis, or any other creative use case you can envision.
Key Components
- Database Integration: Set up a database to store and retrieve vectors representing document information.
- Customized Prompts: Design and implement prompts that effectively utilize GPT models for generation based on retrieved data.
- RAG Implementation: Use LangChain to integrate retrieval-augmented generation into your system.
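To make the three components concrete, here is a minimal, library-free sketch of the retrieval and prompt-building steps. It is an illustration under stated assumptions, not the required implementation: the bag-of-words `embed` function stands in for a real embedding model, `ToyVectorStore` stands in for a real vector database, and `PROMPT_TEMPLATE` and `build_prompt` are hypothetical names chosen for this example. In your submission, LangChain and a GPT model would take over these roles.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: word counts (an assumption standing in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """In-memory stand-in for a vector database (hypothetical, for illustration only)."""

    def __init__(self):
        self._items = []  # list of (text, vector) pairs

    def add(self, text):
        self._items.append((text, embed(text)))

    def search(self, query, k=2):
        """Return the k stored texts most similar to the query."""
        q = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


# Customized prompt: instructs the model to answer only from the retrieved context.
PROMPT_TEMPLATE = (
    "Answer the question using only the context below. "
    "If the context is insufficient, say so.\n\n"
    "Context:\n{context}\n\nQuestion: {question}\nAnswer:"
)


def build_prompt(store, question, k=2):
    """Retrieve the top-k passages and place them into the prompt template."""
    context = "\n".join(f"- {passage}" for passage in store.search(question, k))
    return PROMPT_TEMPLATE.format(context=context, question=question)


store = ToyVectorStore()
store.add("Transformers use self-attention to weigh tokens in a sequence.")
store.add("Vector databases index embeddings for similarity search.")
store.add("The deadline for the report is in February.")

question = "How do vector databases support similarity search?"
prompt = build_prompt(store, question, k=1)
print(prompt)  # this string is what would be sent to the GPT model
```

The sketch stops at the prompt on purpose: the generation step depends on which GPT model and client you choose, so it is left out rather than guessed at.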
Data
- Utilize open-source datasets or create your own corpus of documents for retrieval.
- Ensure the chosen datasets are suitable for demonstrating the capabilities of your RAG system.
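Before a corpus can be stored as vectors, long documents are usually split into smaller passages. The sketch below shows one common approach, fixed-size character windows with overlap. The function name `chunk_text` and the default sizes are assumptions made for this example, not requirements of the assignment; text-splitting utilities in your chosen framework can do the same job.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into character windows of chunk_size that overlap by `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


# Hypothetical sample text; replace with the contents of your own documents.
sample = "Retrieval-augmented generation combines a retriever with a generator. " * 5
chunks = chunk_text(sample, chunk_size=100, overlap=20)
print(len(chunks), "chunks; first chunk:", repr(chunks[0]))
```

Overlap keeps sentences that straddle a boundary retrievable from at least one chunk, at the cost of storing some text twice.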
Delivery
- Create a dedicated GitHub repository for this assignment.
- Store all relevant materials, including the Colab notebook, in the repository.
- Provide a README.md file with a concise description of the assignment and its components.
- You may work individually or in groups of up to three members.
- Submit your work by emailing a link to the repository to Hamid (hamidb@business.aau.dk).