Introduction
Introduction to machine learning
Machine learning?
• Learning from data
• Large datasets are now ubiquitous, driven by the growth of the internet, medical records, cameras and images, …
• Applications we can’t program by hand
• Handwriting recognition, NLP, computer vision, …
• «Self-learning» algorithms
• e.g. product or movie recommendations, spam filtering (with occasional or optional supervision input)
Machine learning?
• Supervised learning
• Classification, regression
• Unsupervised learning
• Clustering, dimensionality reduction, density estimation
• Others: Reinforcement learning, sequence learning, semi-supervised learning, …
Supervised learning – Classification
Cancer data (malignant, benign)
Discrete output
(We could also have more than two output classes – this would be called multi-class classification.)
Often, we have more than two input features. Here, additional features could be tumor clump thickness, uniformity of cell size, uniformity of cell shape, etc.
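A minimal sketch of a two-class classifier, to make the idea of a discrete output concrete. The data below is made up for illustration (it is not real clinical data), the two feature names are assumptions borrowed from the examples above, and the nearest-centroid rule is just one simple choice of classifier, not a method prescribed by these slides.

```python
import math

# Made-up training data (NOT real clinical measurements).
# Each entry: (clump_thickness, cell_size_uniformity), class label.
train = [
    ((1.0, 1.0), "benign"),
    ((2.0, 1.0), "benign"),
    ((3.0, 2.0), "benign"),
    ((8.0, 7.0), "malignant"),
    ((9.0, 8.0), "malignant"),
    ((7.0, 9.0), "malignant"),
]


def centroids(samples):
    """Average the feature vectors of each class."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in acc)
            for label, acc in sums.items()}


def predict(model, features):
    """Return the class whose centroid is closest (Euclidean distance)."""
    return min(model, key=lambda label: math.dist(model[label], features))


model = centroids(train)
print(predict(model, (2.0, 2.0)))  # discrete output: one of the class labels
print(predict(model, (8.0, 8.0)))
```

Adding a third label to the training data would turn this into multi-class classification without changing the code.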
Supervised learning – Regression
Continuous (real-valued) output
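A minimal sketch of regression, in contrast to classification: the prediction is a real number rather than a class label. The data is made up, and fitting a straight line by ordinary least squares is an assumed choice of model for illustration only.

```python
# Made-up data: one input feature x and a real-valued target y.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(xs, ys)


def predict_y(x):
    """Continuous (real-valued) prediction for a new input x."""
    return slope * x + intercept


print(slope, intercept, predict_y(6.0))
```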
Unsupervised learning
[Figure: two panels, "supervised" and "unsupervised", each plotted over the features 𝑥1 and 𝑥2]
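To illustrate the unsupervised setting, the sketch below groups points (𝑥1, 𝑥2) without ever seeing a class label. The points are made up, and k-means with two clusters and hand-picked starting centers is an assumed choice of clustering method, not one named on this slide.

```python
import math

# Made-up, unlabeled 2-D points (x1, x2): no class labels are given.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5),
          (8.0, 8.0), (8.5, 9.0), (9.0, 8.5)]


def kmeans(points, centers, iterations=10):
    """Plain k-means: alternate an assignment step and an update step."""
    for _ in range(iterations):
        # Assignment step: index of the nearest center for each point.
        labels = [min(range(len(centers)),
                      key=lambda k: math.dist(p, centers[k]))
                  for p in points]
        # Update step: move each center to the mean of its assigned points.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centers.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members)))
            else:
                new_centers.append(centers[k])  # empty cluster: keep old center
        centers = new_centers
    return centers, labels


# Starting centers are picked by hand here to keep the run deterministic.
centers, labels = kmeans(points, centers=[points[0], points[3]])
print(labels)   # cluster index per point, discovered without labels
print(centers)
```

In the supervised case the same points would come with a known label each, and the task would be to predict that label for new points rather than to discover the groups.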
Unsupervised learning
Source: Su-In Lee et al., PNAS 2006
[Figure: axes "Individuals" and "Genes"]
Unsupervised learning
• Social network analysis
• Market / customer segmentation
• Identifying fake news
Sources:
https://en.wikipedia.org/wiki/Social_network_analysis#/media/File:Kencf0618FacebookNetwork.jpg
https://towardsdatascience.com/clustering-algorithms-for-customer-segmentation
https://medium.com/hackernoon/the-fake-news-arms-race-448675592803
Machine learning – A magic box?
• Data
• Space of possible solutions
• Characterise objective
• Find algorithm
• Run
• Validate result
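The steps above can be sketched end to end on a toy problem. Everything here is an assumption made for illustration: the data is made up, the space of solutions is a small set of threshold rules, the objective is the number of misclassified examples, and the algorithm is an exhaustive search.

```python
# 1. Data: made-up (feature, label) pairs, split into training and held-out sets.
train = [(1.0, 0), (2.0, 0), (3.0, 0), (6.0, 1), (7.0, 1), (8.0, 1)]
held_out = [(2.5, 0), (6.5, 1), (1.5, 0), (7.5, 1)]

# 2. Space of possible solutions: rules "predict 1 if x > t"
#    for candidate thresholds t = 0.0, 0.5, ..., 10.0.
candidates = [0.5 * i for i in range(21)]


# 3. Objective: number of misclassified examples (lower is better).
def errors(t, data):
    return sum(int(x > t) != y for x, y in data)


# 4. Algorithm: exhaustive search over the candidate thresholds.
def fit(data):
    return min(candidates, key=lambda t: errors(t, data))


# 5. Run.
best_t = fit(train)

# 6. Validate the result on data that was not used for fitting.
accuracy = 1 - errors(best_t, held_out) / len(held_out)
print(best_t, accuracy)
```

Each numbered comment corresponds to one bullet: changing the candidate set changes the space of solutions, changing `errors` changes the objective, and changing `fit` changes the algorithm, while the validation step stays the same.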