This individual coursework is intended to allow you to demonstrate your ability to design and build predictive systems. You are expected to use methods that are presented in the first 5 weeks of the module.
You should produce a Jupyter Notebook that performs the required analysis in a series of cells.
The project should follow the layout, format and style of the Jupyter Notebook for Chapter 2 of Hands-On Machine Learning with Scikit-Learn and TensorFlow, by Aurélien Géron (available online via Library Services).
The Notebook sets out a machine-learning project from end-to-end. You can follow the guidance offered in Appendix B of the same book (reproduced in part in this document) to help you develop the analysis in your project.
The Notebook you submit should contain Markup text that explains your analysis, totalling 2000 words. This is not many words, so you need to develop the skill of leading a reader through your analysis with short, concise commentary. Comments in code will NOT count towards the word count, but can be used sparingly to guide the examiner where useful.
The final Notebook should be saved and exported as a PDF before submission to Moodle. Please make sure that the Markup text and the relevant figures, images and tables are readable in the exported PDF.
You should also provide a link to a repository (e.g. OneDrive or a link to Sherlock) with a copy of the Notebook in its original .ipynb format.
Your narrative/Markup text that explains your analysis might consist of approximately 100 words in each of 20 markup cells over the course of the Notebook. However, you can distribute the word count across the Notebook at your discretion.
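As a way of keeping track of the word count while you write, the sketch below counts the words in the markup (Markdown) cells of a notebook and skips code cells, so comments in code are not counted. It is only a rough guide: it assumes the standard .ipynb JSON layout, splits on whitespace (so a heading marker such as `#` counts as a word), and the function name `count_markdown_words` is hypothetical. It does not represent how the examiner will count words.

```python
import json


def count_markdown_words(notebook_json: str) -> int:
    """Return a rough word count over the Markdown cells of a notebook."""
    notebook = json.loads(notebook_json)
    total = 0
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "markdown":
            continue  # code cells (and the comments inside them) are skipped
        source = cell.get("source", "")
        # The notebook format stores source as one string or a list of lines
        text = "".join(source) if isinstance(source, list) else source
        # Whitespace split: punctuation and '#' markers count as part of words
        total += len(text.split())
    return total


# A tiny in-memory notebook standing in for a real .ipynb file
example = json.dumps({
    "cells": [
        {"cell_type": "markdown",
         "source": ["# Frame the problem\n", "We predict house prices."]},
        {"cell_type": "code",
         "source": ["# this comment is not counted\n", "x = 1"]},
        {"cell_type": "markdown",
         "source": "Short, concise commentary."},
    ]
})

print(count_markdown_words(example))  # prints 11
```

To check a real notebook, read the file as text first, for example `count_markdown_words(open("my_notebook.ipynb", encoding="utf-8").read())`, where `my_notebook.ipynb` is a placeholder file name.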
You can add additional figures and explanatory text in cells as an Appendix, clearly marked, at the end of the Notebook. However, you should make clear which cells form part of your graded submission. The examiner is not obliged to read the material included in the Appendix cells.
The individual coursework task is to identify a dataset and explore building a predictive model using the methods and techniques presented in the first 5 weeks of the course.
This will characterise the type of predictive model you can build.
Visualize and explain the main trends in the data.
What is considered good performance will depend on the problem you tackle.
Further details, suggestions and guidance for each of these main steps can be found in Appendix B of Hands-On Machine Learning with Scikit-Learn and TensorFlow, by Aurélien Géron (available online via Library Services).
Your task is to find, select or construct a dataset around a predictive problem that interests you. If you have questions about the suitability of your chosen dataset, please discuss them with the TA in your assigned Teaching Group first.
Suggested links to example datasets are provided in the Slides that accompany the coursework.
This assignment contributes towards the achievement of the following module Learning Outcomes:
- Demonstrate strong data-based reasoning and computational thinking skills
- Experience creating and working with datasets
- Understand the stages of the data science/machine learning process
- Practise visualising data
- Build models that can be used to fit datasets
- Use models to make predictions
- Choose a best-fit model based on identified criteria
- Test model predictions against new data
Within each section of this coursework you may be assessed on the following aspects, as applicable and appropriate to this particular assessment, and should thus consider these aspects when fulfilling the requirements of each section:
Each part has requirements with allocated marks, maximum word count limits/page limits and where applicable, templates that are required to be used.