Machine Learning and Pattern Recognition

AI3011 | Spring 2024



Project Description



This class involves a project that serves several purposes. First, it gives you the opportunity to explore in depth the multidimensional (pun intended) facets of the Machine Learning and Pattern Recognition course. Second, it supports the development of your critical thinking and hands-on application skills; in my opinion, this is one of the primary goals of university education.

Students will form groups of two or three (remember: one is a maverick, two is a pair, three is a team, and four is a crowd) to undertake a project. The project must be inspired by a real-world problem that could potentially be solved by a machine learning solution; the instructor will help students identify the problem. Students are expected to work with the instructor during the first three weeks to identify a problem for each group, the hardware and/or software resources needed, and the methodologies that may be required. Students are also encouraged to approach other faculty members for access to data from their research for this project.

Midterm Project Evaluation Criteria


Problem Statement

What is the problem you are trying to solve and why? What are the potential applications that will come out of the developed solution? What will be the potential impact of the solution?

Literature Survey

A survey of what other research groups, labs, and industry have done to solve the above problem. What solutions came out of those studies? How do you plan to take these solutions further: by fixing their shortcomings, extending them for your application, or improving their performance?

Dataset and Features Preprocessing

Whether you collected the dataset yourself or obtained it from another source, how did you pre-process the features? For example: were there any missing features, and how did you interpolate them (e.g., by regression)? Did your dataset require feature dimensionality reduction using PCA, LDA, etc.? Did you use any algorithm to determine which features are more important than others, or whether you will need to collect/procure more data to obtain more valuable features?
Please use plenty of plots and other visualization techniques, along with quantitative figures, in this section of your presentation. This will give the instructor an idea of the depth of your work and thinking so far.
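As an illustration of the kind of missing-value handling asked about above, here is a minimal sketch of regression-based imputation using only the Python standard library. The column names (height_cm, weight_kg), the toy rows, and the helper functions fit_line and impute_by_regression are hypothetical; this is one possible approach under those assumptions, not a required method.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares slope and intercept for y ~ slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx  # assumes the predictor is not constant
    return slope, y_bar - slope * x_bar


def impute_by_regression(rows, target, predictor):
    """Fill missing (None) values of `target` using a line fitted on complete rows."""
    complete = [r for r in rows if r[target] is not None]
    slope, intercept = fit_line(
        [r[predictor] for r in complete],
        [r[target] for r in complete],
    )
    filled = []
    for r in rows:
        r = dict(r)  # copy so the original rows stay untouched
        if r[target] is None:
            r[target] = slope * r[predictor] + intercept
        filled.append(r)
    return filled


# Hypothetical toy dataset: one weight value is missing.
rows = [
    {"height_cm": 150.0, "weight_kg": 50.0},
    {"height_cm": 160.0, "weight_kg": 60.0},
    {"height_cm": 170.0, "weight_kg": None},
    {"height_cm": 180.0, "weight_kg": 80.0},
]

missing_count = sum(r["weight_kg"] is None for r in rows)
filled_rows = impute_by_regression(rows, "weight_kg", "height_cm")
```

Reporting missing_count alongside the imputed values is the sort of quantitative detail that helps show how much of the dataset was affected by the imputation.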

Proposed ML Methodology

How are you planning to use ML on your project between midterm and endterm? Which algorithms do you think would be apt for tackling the problem statement of your project and why? Has any work been done already to this end (if so, any preliminary results)? What are the possible challenges that you foresee in your project (hardware/software, dataset availability, algorithmic complexity, etc.) and how do you think you will tackle them?


Note: All members of a project group must be present during the group presentation, which should be jointly delivered. Additionally, your presentation during the midterm must not exceed 8 minutes, and we will keep a maximum of 4 minutes for Q&A.

Endterm Project Evaluation Criteria


Problem Statement

What is the problem you are trying to solve and why? What are the potential applications that will come out of the developed solution? What will be the potential impact of the solution?

Literature Survey

A survey of what other research groups, labs, and industry have done to solve the above problem. What solutions came out of those studies? How do you plan to take these solutions further: by fixing their shortcomings, extending them for your application, or improving their performance?

Dataset and Features Preprocessing

  • If you collected the dataset yourself: what considerations did you take into account during data collection, how was the data collected, were there any ethical concerns (if so, how did you address them), how many features and datapoints are there in your dataset, etc.

  • If you did not collect the dataset yourself: what is the nature of the dataset and why did you choose that particular dataset, how was the data collected by its authors, were there any ethical concerns (if so, how did they address them), how many features and datapoints are there in that dataset, etc.

For both cases, how did you pre-process the features? For example: were there any missing features, and how did you interpolate them (e.g., by regression)? Did your dataset require feature dimensionality reduction using PCA, LDA, etc.? Did you use any algorithm to determine which features are more important than others, or whether you will need to collect/procure more data to obtain more valuable features?
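One simple way to address the question of which features matter more is a filter-style ranking by absolute Pearson correlation with the target. The sketch below uses only the Python standard library; the feature names, the toy values, and the helpers pearson and rank_features are hypothetical. Correlation captures only linear relationships, so treat this as a first look rather than a definitive importance measure.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    x_bar, y_bar = mean(xs), mean(ys)
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var_x = sum((x - x_bar) ** 2 for x in xs)
    var_y = sum((y - y_bar) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)  # assumes neither sequence is constant


def rank_features(features, target):
    """Return (name, |correlation|) pairs sorted from most to least correlated."""
    scores = {name: abs(pearson(values, target)) for name, values in features.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: exam score as the target, two candidate features.
features = {
    "hours_studied": [1.0, 2.0, 3.0, 4.0, 5.0],
    "shoe_size": [7.0, 9.0, 6.0, 8.0, 7.0],
}
target = [52.0, 58.0, 65.0, 70.0, 78.0]

ranking = rank_features(features, target)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

A ranking like this can motivate dropping weak features or, if every score is low, collecting more informative data.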

ML Methodology

Which ML methods did you use to work on the above problem statement and why? How do these models work? What challenges did you face in your project (hardware/software, dataset availability, algorithmic complexity, etc.) and how did you tackle them?

Performance Metrics and Deployability of the ML solution

Which performance metrics did you use, and what values did you achieve? How do these performance metrics show that your solution works? Can the solution be deployed at Plaksha to solve the problem you have chosen? If so, how? What challenges might the deployed solution face when it scales up?
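For a classification project, the metrics in question might be computed along the following lines. This is a minimal standard-library sketch for the binary case; the labels in y_true and y_pred and the helper binary_metrics are hypothetical, and regression or multi-class projects would need different metrics.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical held-out labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting precision and recall next to accuracy matters most when classes are imbalanced, since accuracy alone can look high while the minority class is mostly missed.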


Note: All members of a project group must be present during the group presentation, which should be jointly delivered. Additionally, your presentation during the endterm must not exceed 10 minutes, and we will keep a maximum of 5 minutes for Q&A.

Please understand that since this is a group project, every member of the group is expected to know every aspect of the project when the jury asks questions. Please do not say later, for example, “I was only in charge of data analysis and someone else was in charge of the literature survey.”