# Machine Learning Advanced Learning Path - MAI4CAREU Master in AI

This Machine Learning (ML) learning path offers a structured curriculum that delves into the subfield of artificial intelligence that has revolutionised society, technology, and industry – and is poised to shape our future even more profoundly. It is based on the introductory course offered at the **Master in Artificial Intelligence of the University of Cyprus**, which was developed with co-funding from the **MAI4CAREU European project**.

The course offers a guided exploration of ML’s fundamental principles and diverse applications, and consists of **eighteen units organised in four parts as shown below**, which are further arranged into **Introductory (seven units)** and **Advanced (eleven units)** according to their level of difficulty.

__The recommended order for studying all materials is the one shown below (from 1 to 18).__

**Part I: Introduction**

1. Introduction to Machine Learning

2. Data Preparation

**Part II: Supervised Learning**

3. Regression

4. Classification

5. Model Evaluation and Improvement

6. Trees and Forests

7. Kernel-based Methods 1

8. Kernel-based Methods 2

9. Neural Networks 1: Modelling

10. Neural Networks 2: Training

11. Neural Networks 3: Introduction to Deep Learning

**Part III: Unsupervised Learning**

12. Clustering

13. Dimensionality Reduction

14. Anomaly Detection

15. Recommender Systems

**Part IV: Reinforcement Learning**

16. Introduction to Reinforcement Learning

17. Markov Decision Processes and Dynamic Programming

18. Model-free Reinforcement Learning

### MAI4CAREU - Machine Learning: Introduction to Machine Learning

Let’s start by understanding what machine learning (ML) is, how it relates to artificial intelligence, and how it differs from traditional programming. Here we will introduce the basic terminology, along with the three main types of ML (supervised, unsupervised and reinforcement learning) and its diverse applications. Finally, we will look at the lifecycle of an ML project, its different phases and how they connect with each other.

### MAI4CAREU - Machine Learning: Data Preparation

The process of data preparation is a crucial part of a machine learning (ML) project. In this lecture you will understand the importance of data preparation and the whole pipeline of data collection, data preprocessing, data visualization and exploratory data analysis, data transformation, and dataset splitting.

### MAI4CAREU - Machine Learning: Regression

After data preparation, the next step is to create a predictive model. This lecture will explain what the problem of regression is in supervised learning with a simple model called k-nearest neighbour regression. We will concentrate on the linear regression model, aiming to fit the data by optimizing the mean squared error cost function. We will see how to minimise this function using gradient descent optimisation, as well as by computing its analytic solution. We will finish this lecture by presenting how to effectively handle the model’s main weakness of not modelling nonlinear relationships in the data.
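
As a rough illustration of the two fitting strategies mentioned above, the sketch below fits a one-feature linear model by gradient descent on the mean squared error and by the closed-form least-squares solution. The data, learning rate, and function names are assumptions made for this example and are not taken from the course materials.

```python
# Minimal sketch: fit y = w*x + b by minimising the mean squared error (MSE).
# Data and hyperparameters are made up for illustration.

def mse(w, b, xs, ys):
    """Mean squared error of the line y = w*x + b on the data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def fit_gradient_descent(xs, ys, lr=0.05, steps=5000):
    """Batch gradient descent on the MSE cost, starting from w = b = 0."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2 / n * sum(errors)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def fit_analytic(xs, ys):
    """Closed-form least-squares solution for a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]
w_gd, b_gd = fit_gradient_descent(xs, ys)
w_an, b_an = fit_analytic(xs, ys)
```

On this small dataset both routes should agree on the same line up to numerical precision; gradient descent simply reaches it iteratively.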

### MAI4CAREU - Machine Learning: Classification

We will start this lecture by explaining the problem of classification using a simple model called k-nearest neighbour classification. Further, we will focus on the logistic regression model for binary classification, and understand its relationship with linear regression from the previous lecture. We will introduce the concept of decision boundary and the cross-entropy cost function. We will continue with how to analyse the errors of binary classifiers using various metrics, the confusion matrix, and the ROC curve. Finally, we will present how to do multi-class classification using the one-vs-rest and the softmax classifier.
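
To make the link between logistic regression and the cross-entropy cost more concrete, here is a minimal sketch of binary logistic regression with one feature, trained by gradient descent. The toy data, the learning rate, and all function names are assumptions for illustration only.

```python
from math import exp, log

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def cross_entropy(w, b, xs, ys):
    """Mean binary cross-entropy of the model p = sigmoid(w*x + b)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total -= y * log(p) + (1 - y) * log(1 - p)
    return total / len(xs)

def fit_logistic(xs, ys, lr=0.1, steps=1000):
    """Batch gradient descent on the cross-entropy cost."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # For this cost the gradient has the same form as in linear regression:
        # (prediction - label) times the input.
        residuals = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= lr * sum(r * x for r, x in zip(residuals, xs)) / n
        b -= lr * sum(residuals) / n
    return w, b

# Toy, linearly separable data (made up for this example).
xs = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]
ys = [0, 0, 0, 1, 1, 1]

w, b = fit_logistic(xs, ys)
# The decision boundary is where w*x + b = 0, i.e. where p = 0.5.
predictions = [1 if sigmoid(w * x + b) >= 0.5 else 0 for x in xs]
```

The model is a linear function passed through a sigmoid, which is why the decision boundary of this classifier is linear in the input.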

### MAI4CAREU - Machine Learning: Model Evaluation and Improvement

Machine learning is about creating predictive models that generalize to unseen data. In this lecture, we will first focus on the issue of generalization and how it is related to model evaluation. We will introduce the concepts of overfitting, underfitting and the bias-variance tradeoff. We will see why it is important to split the dataset into training, validation, and test sets, as well as k-fold cross validation. The second focus of this lecture is on how to improve our models. We will understand when we need more data to do so, how to simplify the models using regularization, and how to tune their hyperparameters.
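
The idea behind k-fold cross-validation can be sketched as an index-splitting routine: each sample is used for validation exactly once and for training in the remaining folds. The helper name `k_fold_indices` is hypothetical, and shuffling is deliberately left out to keep the example short.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous, roughly equal folds."""
    # Distribute the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        val_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, val_idx
        start = stop

folds = list(k_fold_indices(10, 3))
for train_idx, val_idx in folds:
    print("train:", train_idx, "validation:", val_idx)
```

In practice the data would usually be shuffled first, and a separate test set would be held out before any cross-validation takes place.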

### MAI4CAREU - Machine Learning: Clustering

This lecture introduces one of the most important and common unsupervised learning problems: clustering. We will understand what clustering is and how it differs from supervised classification. We will focus on how the k-means clustering algorithm works using a step-by-step example, look at its problems and how to tune it. We will briefly mention other clustering algorithms such as k-medoids, DBSCAN and hierarchical clustering to understand how they differ from k-means, and finally, we will look at how clustering can help supervised learning.
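
The two alternating steps of k-means (assign each point to its nearest centroid, then move each centroid to the mean of its points) can be sketched as follows. The points and initial centroids are invented for this example, and the initialisation is passed in explicitly rather than chosen at random.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def k_means(points, centroids, max_iters=100):
    """Plain k-means; returns the final centroids and the cluster label of each point."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iters):
        # Assignment step: index of the nearest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its assigned points.
        new_centroids = []
        for j in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: centroids stopped moving
        centroids = new_centroids
    return centroids, labels

points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, labels = k_means(points, centroids=[(1, 1), (2, 1)])
```

Even with both initial centroids placed inside the same group, this run recovers the two visually obvious clusters; in general, though, the result of k-means depends on the initialisation.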

### MAI4CAREU - Machine Learning: Introduction to Reinforcement Learning

In this lecture we will understand what **reinforcement learning (RL)** is about and how it differs from supervised and unsupervised learning. We will explain how to formalize the RL problem using the notion of an agent interacting with an environment. We will look at the various agent components and categories, as well as the different applications of RL.

### MAI4CAREU - Machine Learning: Trees and Forests

This lecture is split into two parts. In the first part, we will focus on the decision tree model. Through a step-by-step example, we will understand how to train decision trees for classification using the concepts of entropy and information gain. We will then see how to extend our basic model using continuous variables, as well as how to create regression trees. The second part focuses on ensemble learning algorithms that aim to improve generalization performance. We will understand the difference between bagging, boosting, and stacking, and briefly look at random forests and XGBoost.
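
Entropy and information gain, the quantities used to choose splits when training a classification tree, can be computed in a few lines. The tiny dataset and the function names below are assumptions for illustration.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((count / n) * log2(count / n) for count in Counter(labels).values())

def information_gain(labels, feature_values):
    """Reduction in entropy obtained by splitting on a categorical feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    weighted_entropy = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - weighted_entropy

labels = ["yes", "yes", "no", "no"]
feature_a = ["a", "a", "b", "b"]  # separates the classes perfectly
feature_b = ["x", "y", "x", "y"]  # tells us nothing about the class

gain_a = information_gain(labels, feature_a)
gain_b = information_gain(labels, feature_b)
```

A tree learner based on this criterion would split on `feature_a` first, since it yields the larger information gain.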

### MAI4CAREU - Machine Learning: Kernel-based Methods 1 & 2

The following two lectures will guide you through **kernel-based methods**, from the fundamentals of **Support Vector Machines** to more advanced concepts such as **Radial Basis Functions**. Along the way, we will cover exact interpolation, optimization, and applications.

The first lecture **(Kernel-based methods 1)** starts by explaining the importance of efficiently learning nonlinear decision boundaries. We will explain what kernel methods are and the kernel trick, and present different kernels. We will then focus on the support vector machine (SVM) model and explain the concept of maximum margin classification for optimal generalization. We will look at the objective function of SVMs and how it relates to logistic regression. We will extend linear SVMs with kernels to create nonlinear SVMs, and finally, we will briefly look at how to extend SVMs for regression problems.
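
The kernel trick can be checked numerically on a small case: for two-dimensional inputs, the homogeneous degree-2 polynomial kernel gives the same value as an ordinary dot product in an explicit three-dimensional feature space. This is a minimal sketch; the choice of kernels and the function names are assumptions for illustration.

```python
from math import exp, sqrt

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def poly2_kernel(x, z):
    """Homogeneous polynomial kernel of degree 2: k(x, z) = (x . z)^2."""
    return dot(x, z) ** 2

def poly2_features(x):
    """Explicit feature map for 2-D inputs that reproduces poly2_kernel."""
    x1, x2 = x
    return (x1 * x1, sqrt(2) * x1 * x2, x2 * x2)

def rbf_kernel(x, z, gamma=1.0):
    """Gaussian (RBF) kernel: k(x, z) = exp(-gamma * ||x - z||^2)."""
    return exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

x, z = (1.0, 2.0), (3.0, -1.0)
implicit = poly2_kernel(x, z)                          # computed in input space
explicit = dot(poly2_features(x), poly2_features(z))  # computed in feature space
```

The kernel never builds the feature vectors, which matters for kernels such as the Gaussian one, whose corresponding feature space is infinite-dimensional.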

Following the previous lecture, where the focus is on classification, the second lecture **(Kernel-based methods 2)** focuses on regression using kernel methods. We will explain what we mean by exact interpolation and approximation. We will introduce Radial Basis Functions (RBFs) and how they can be used in RBF networks. We will understand how to find the parameters of an RBF network using the analytic solution, similarly to linear regression, for both exact interpolation and approximation. We will show how to use gradient descent to optimise more than just the standard parameters. We will briefly mention the normalized RBF network to understand its difference with the unnormalized one. Furthermore, we will look at how to extend RBF networks for classification and introduce the XOR problem. Finally, we will briefly look at the Gaussian Process model and understand how it differs from other models seen so far.

### MAI4CAREU - Machine Learning: Neural Networks 1, 2 and 3

In these three lectures we will dive into **artificial neural networks (NNs)**, the models that have revolutionized modern artificial intelligence. The unit covers the fundamentals of NN modelling, the training process, and an introduction to deep learning.

The first lecture **(Neural Networks 1: Modelling)** focuses on the fundamentals of artificial neural networks (NNs). We will start by understanding what a single artificial neuron does, and introduce the perceptron model, its learning algorithm, and limitations. We will proceed by explaining how to combine perceptrons to construct nonlinear decision boundaries. We will then present the feedforward NN model, detailing its mathematical representation, and demonstrating efficient implementation using linear algebra operations for both regression and classification tasks. Finally, we will illustrate why NNs compute their own features.
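
The perceptron learning rule and its best-known limitation can be sketched on two tiny logic problems. The threshold convention, learning rate, and function names here are assumptions for this example.

```python
def predict(weights, bias, x):
    """Threshold unit: outputs 1 if the weighted sum is positive, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

def train_perceptron(samples, lr=1.0, max_epochs=100):
    """Perceptron rule: adjust the weights only on misclassified samples."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, target in samples:
            error = target - predict(weights, bias, x)
            if error != 0:
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
                bias += lr * error
                mistakes += 1
        if mistakes == 0:
            break  # every training sample is classified correctly
    return weights, bias

and_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
xor_data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

w_and, b_and = train_perceptron(and_data)
w_xor, b_xor = train_perceptron(xor_data)

and_correct = all(predict(w_and, b_and, x) == t for x, t in and_data)
xor_correct = all(predict(w_xor, b_xor, x) == t for x, t in xor_data)
```

AND is linearly separable, so the rule converges; XOR is not, so no single perceptron can classify all four points, which motivates combining several units.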

The second lecture **(Neural Networks 2: Training)** delves into the training process. We begin by outlining the cost function for both regression and classification tasks. Through a detailed step-by-step example, we elucidate the backpropagation algorithm, which computes the gradient of the cost function with respect to the NN parameters. We show how to update the parameters using gradient descent (GD) and introduce stochastic and mini-batch GD. Efficiency in implementing backpropagation is also addressed. Subsequently, we introduce stochastic GD with momentum to expedite NN training and touch upon advanced optimization methods. Further enhancements to NN performance are discussed, including strategies like early stopping, hyperparameter tuning, and ensembles. Finally, we provide a brief overview of learning the NN structure using neuroevolution.
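
Backpropagation is an application of the chain rule, which the sketch below carries out by hand for a deliberately tiny network (one input, one sigmoid hidden unit, one linear output, squared-error cost) and then checks against finite differences. The architecture, parameter values, and names are assumptions chosen to keep the example short.

```python
from math import exp

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def forward(params, x):
    """Return the hidden activation h and the output y."""
    w1, b1, w2, b2 = params
    h = sigmoid(w1 * x + b1)
    y = w2 * h + b2
    return h, y

def cost(params, x, target):
    _, y = forward(params, x)
    return 0.5 * (y - target) ** 2

def backprop(params, x, target):
    """Gradient of the cost with respect to [w1, b1, w2, b2] via the chain rule."""
    w1, b1, w2, b2 = params
    h, y = forward(params, x)
    dy = y - target           # d cost / d y
    dw2, db2 = dy * h, dy     # output-layer parameters
    dh = dy * w2              # error propagated back to the hidden unit
    dz = dh * h * (1.0 - h)   # through the sigmoid derivative
    dw1, db1 = dz * x, dz     # hidden-layer parameters
    return [dw1, db1, dw2, db2]

def numerical_gradient(params, x, target, eps=1e-6):
    """Central finite differences, used only to check backprop."""
    grads = []
    for i in range(len(params)):
        plus, minus = list(params), list(params)
        plus[i] += eps
        minus[i] -= eps
        grads.append((cost(plus, x, target) - cost(minus, x, target)) / (2 * eps))
    return grads

params = [0.5, -0.3, 0.8, 0.1]
analytic = backprop(params, x=1.5, target=1.0)
numeric = numerical_gradient(params, x=1.5, target=1.0)

# One plain gradient-descent step with a hypothetical learning rate of 0.1.
updated = [p - 0.1 * g for p, g in zip(params, analytic)]
```

Comparing analytic and numerical gradients in this way is a common sanity check when implementing backpropagation by hand.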

In our last session on neural networks **(Neural Networks 3: Introduction to Deep Learning)**, we turn to deep learning. We begin by exploring the essence of deep learning and its significance in various applications. We proceed with an overview of convolutional networks for processing image data. Moving forward, we look at sequential data analysis, introducing recurrent NNs and how to train them by adapting the backpropagation algorithm accordingly. We present the challenges of such training posed by vanishing and exploding gradients. Subsequently, we present two approaches to mitigate these issues: Echo State Networks, elucidating their training methodology and associated pros and cons, followed by an overview of Long Short-Term Memory Networks. We finish this lecture by discussing how to handle text data using word embeddings.

### MAI4CAREU - Machine Learning: Dimensionality Reduction

In this lecture, we will learn about dimensionality reduction. We begin by addressing the underlying problem and the necessity for employing such methods. Next, we delve into the principal component analysis (PCA) algorithm, elucidating its role in linear dimensionality reduction. We proceed with the introduction of nonlinear dimensionality reduction approaches, such as kernel PCA and autoencoders. Moving forward, we present the concept of manifold learning, and we conclude this lecture by discussing the t-distributed stochastic neighbour embedding algorithm.

### MAI4CAREU - Machine Learning: Anomaly Detection

In this lecture, we will introduce the problem of anomaly detection, an important problem in unsupervised learning, and its difference from binary classification. We will then explain the concept of density estimation and how to use it for anomaly detection by fitting the parameters of a Gaussian probability density. Next, we will explore how to build anomaly detection models using kernel density estimation, one-class support vector machines, isolation forests and autoencoders. Finally, we will discuss how to evaluate anomaly detection systems.
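
Density-based anomaly detection with a single Gaussian can be sketched as follows: fit the mean and variance on normal data, then flag points whose estimated density falls below a threshold. The sensor-like readings and the threshold `epsilon` are invented for this example; in practice the threshold would be tuned on labelled validation data.

```python
from math import exp, pi, sqrt

def fit_gaussian(values):
    """Maximum-likelihood estimates of the mean and variance."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return mean, variance

def gaussian_pdf(x, mean, variance):
    return exp(-((x - mean) ** 2) / (2 * variance)) / sqrt(2 * pi * variance)

def is_anomaly(x, mean, variance, epsilon):
    """Flag x when its estimated density is below the threshold epsilon."""
    return gaussian_pdf(x, mean, variance) < epsilon

normal_readings = [9.8, 10.1, 10.0, 9.9, 10.2]  # made-up "normal" data
mean, variance = fit_gaussian(normal_readings)

epsilon = 0.01  # hypothetical threshold
typical_flag = is_anomaly(10.05, mean, variance, epsilon)
unusual_flag = is_anomaly(12.0, mean, variance, epsilon)
```

With several features, one simple extension is to fit one such Gaussian per feature and multiply the densities, which assumes the features are independent.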

### MAI4CAREU - Machine Learning: Recommender Systems

This lecture focuses on **recommender systems** – systems that provide suggestions for items that are most relevant to a particular user. We begin by explaining what recommender systems are and the problem they solve, using a step-by-step example. We then proceed with the introduction of the collaborative filtering algorithm, explaining why it is a type of unsupervised learning. Moving forward, we show how to adapt the algorithm to work with binary labels, and we provide implementation details that make the algorithm work better in practice. Next, we explore content-based filtering, outlining how it differs from collaborative filtering. We conclude this lecture by discussing the process of efficiently generating a recommendation from a large set of items.

### MAI4CAREU - Machine Learning: Markov Decision Processes and Dynamic Programming

We begin this lecture by formalizing the reinforcement learning (RL) problem using the framework of Markov Decision Processes (MDPs). We discuss the objectives in an MDP, what value functions are, and optimality principles in MDPs. We proceed by presenting the Bellman equations and how to use them to solve the two classes of problems we face in RL: prediction and control. We discuss how to compute the optimal value function using dynamic programming, where we introduce and compare the policy iteration and the value iteration algorithms.
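
Value iteration repeatedly applies the Bellman optimality backup until the value function stops changing. The sketch below runs it on a made-up three-state MDP; the transition format `(probability, next_state, reward)`, the state and action names, and the in-place update order are all assumptions for illustration.

```python
def action_value(outcomes, values, gamma):
    """Expected return of one action: sum over (probability, next_state, reward)."""
    return sum(p * (r + gamma * values[nxt]) for p, nxt, r in outcomes)

def value_iteration(transitions, gamma=0.9, tol=1e-10):
    """Compute optimal state values with in-place Bellman optimality backups."""
    values = {state: 0.0 for state in transitions}
    while True:
        delta = 0.0
        for state, actions in transitions.items():
            if not actions:  # terminal state: its value stays 0
                continue
            best = max(action_value(o, values, gamma) for o in actions.values())
            delta = max(delta, abs(best - values[state]))
            values[state] = best
        if delta < tol:
            return values

def greedy_policy(transitions, values, gamma=0.9):
    """Pick, in every non-terminal state, the action with the highest value."""
    return {
        state: max(actions, key=lambda a: action_value(actions[a], values, gamma))
        for state, actions in transitions.items()
        if actions
    }

# Toy deterministic MDP: reaching terminal state T from B pays a reward of 1.
transitions = {
    "A": {"stay": [(1.0, "A", 0.0)], "go": [(1.0, "B", 0.0)]},
    "B": {"back": [(1.0, "A", 0.0)], "go": [(1.0, "T", 1.0)]},
    "T": {},
}

values = value_iteration(transitions)
policy = greedy_policy(transitions, values)
```

Policy iteration would instead alternate full policy evaluation with greedy policy improvement; on a problem this small, both approaches should arrive at the same policy.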

### MAI4CAREU - Machine Learning: Model-free Reinforcement Learning

We begin this last session on reinforcement learning (RL) with the problem of **exploration-exploitation**, and the introduction of the **framework of multi-armed bandits**, where we explain the greedy and ε-greedy policies. We proceed with the problem of model-free prediction for estimating values in an unknown Markov Decision Process (MDP), and we present and compare Monte Carlo, Temporal Difference (TD) learning and Multi-Step TD learning methods. Next, we delve into model-free control for optimising values in an unknown MDP, where we present the SARSA and Q-learning algorithms. We conclude this lecture by discussing how optimistic initialisation of the value function can help exploration.
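
The difference between SARSA and Q-learning lies in the target each one bootstraps from, which the sketch below shows side by side together with ε-greedy action selection. The Q-table layout (a dictionary of dictionaries), the step size, the discount factor, and the state and action names are assumptions for this example.

```python
import random

def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon explore at random, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.choice(list(q_values))
    return max(q_values, key=q_values.get)

def q_learning_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Off-policy target: uses the best action value in the next state."""
    target = r + gamma * max(q[s_next].values())
    q[s][a] += alpha * (target - q[s][a])

def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy target: uses the action actually taken in the next state."""
    target = r + gamma * q[s_next][a_next]
    q[s][a] += alpha * (target - q[s][a])

def make_q():
    # Hypothetical Q-table with two states and two actions.
    return {
        "A": {"left": 0.0, "right": 0.0},
        "B": {"left": 1.0, "right": 2.0},
    }

q_ql, q_sarsa = make_q(), make_q()
q_learning_update(q_ql, "A", "right", 1.0, "B")
sarsa_update(q_sarsa, "A", "right", 1.0, "B", a_next="left")

rng = random.Random(0)
greedy_action = epsilon_greedy(q_ql["B"], epsilon=0.0, rng=rng)
```

Given the same transition, the two updates differ only when the next action taken is not the greedy one, as in this example where SARSA follows the exploratory action `left`.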