Supervised Learning
Introduction & Framework
CMPUT 261: Introduction to Artificial Intelligence
P&M §7.1-7.3

Assignments
• Assignment #2 is now available
• Due Oct 21/2025 (two weeks from today) at 11:59pm

Recap: Uncertainty
• We represent uncertainty about the world by probabilities
• We update our knowledge by conditioning on observations
• Observations = learning the value of a random variable
• Full, unstructured joint distributions are intractable to reason about
• Conditional independence is a kind of structure that is:
1. widespread
2. easy to reason about
3. allows tractable inference (computing distribution of unobserved variables)
• Belief networks let us compactly represent joint distributions with a lot of
conditional independence
• Variable elimination is an algorithm for efficient inference on belief networks

Supervised Learning, informally
• In the uncertainty section, we took the probability distribution as given
• Our only problem was to represent and derive distributions
• Question: Where do these probabilities come from?
• Supervised learning is a way to learn probabilities from examples
• Probability of a target feature (or label) given input features
• i.e., condition on input features to get probability of target
• Basic idea:
• Take a bunch of inputs (e.g., images) and "correct" outputs
• Learn a model that correctly maps inputs to outputs

Supervised Learning vs. Machine Learning vs. Deep Learning

What is the difference between Supervised Learning, Machine Learning, and Deep Learning?
[Figure: Venn diagram. Inside an orange circle labelled "Artificial Intelligence" is a blue circle labelled "Machine Learning", divided into three thirds: "reinforcement", "unsupervised", and "supervised". Inside the blue circle, intersecting all three thirds, is a red circle labelled "deep learning"; inside the red circle, again intersecting all three thirds, is a cyan circle labelled "LLMs". Green star #1 is in the "supervised" third of Machine Learning, outside the Deep Learning circle. Green star #2 is in the "supervised" third of Machine Learning, inside the Deep Learning circle.]

Lecture Outline
1. Recap & Logistics
2. Supervised Learning Problem
3. Measuring Prediction Quality
After this lecture, you should be able to:
• define supervised learning task, classification, regression, loss function
• represent categorical target values in multiple ways (indicator variables, indexes)
• define generalization performance
• identify an appropriate loss function for different tasks
• explain why a separate test set estimates generalization performance
• define 0/1 error, absolute error, (log-)likelihood loss, mean squared error, worst-case error

Supervised Learning
Definition: A supervised learning task consists of
• A set of input features X_1, …, X_d
• A set of target features Y_1, …, Y_k
• A set of training examples S = {(x^(i), y^(i))}_{i=1}^n sampled randomly from some population
• A set of test examples T = {(x^(i), y^(i))}_{i=1}^m sampled from the same population

The goal is to predict the values of the target features Y given the input features X;
i.e., learn a function h(x) that will map features X to a prediction of Y
• Classification: the Y_i are discrete
• Regression: the Y_i are real-valued
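As a concrete sketch of this framework (hypothetical toy data, not from the slides): a training set S, a test set T drawn from the same population, and a learned function h(x) — here a simple 1-nearest-neighbour classifier, one of many possible choices of h:

```python
# Sketch of a supervised learning task: training examples S = {(x^(i), y^(i))},
# test examples T from the same population, and a learned predictor h(x).

def learn_h(S):
    """Learn h from training set S by memorizing it (1-nearest neighbour)."""
    def h(x):
        # Predict the label of the training input closest to x (squared distance).
        nearest = min(S, key=lambda ex: sum((a - b) ** 2 for a, b in zip(ex[0], x)))
        return nearest[1]
    return h

# Toy classification data: inputs are 2-D points, targets are discrete labels.
S = [((0.0, 0.0), "neg"), ((0.1, 0.2), "neg"), ((1.0, 1.0), "pos"), ((0.9, 1.1), "pos")]
T = [((0.05, 0.1), "neg"), ((1.05, 0.9), "pos")]

h = learn_h(S)
test_accuracy = sum(h(x) == y for x, y in T) / len(T)
print(test_accuracy)
```

Because the targets here are discrete labels, this is a classification task; with real-valued targets (e.g., recovery time) it would be regression.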
Supervised Learning Examples
1. Computational vision: Given example images and labels representing objects, output a label
for the main object in the image
• Input features: Pixel values of the image
• Target features: One feature for each label (e.g., dog, plane, etc.)
2. Precision medicine: Given examples of symptoms, test results, and treatments, output an
estimate of recovery time
• Input features: symptoms, treatment indicators, test results, demographic information
• Target features: recovery time, survival time, etc.
3. Natural language processing: Given example sentences and labels representing
"sentiment", output how positive or negative the sentence is
• Input features: binary indicators for words or characters
• Target features: One feature per label (e.g., positive, negative)
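The lecture objectives mention several ways of measuring prediction quality; as a sketch (with made-up toy values, not taken from the slides), two of them — 0/1 error for classification and mean squared error for regression — can be computed as:

```python
# Sketch of two loss functions named in the lecture objectives (toy values).

def zero_one_error(y_true, y_pred):
    """0/1 error: fraction of predictions that are wrong (classification)."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_squared_error(y_true, y_pred):
    """Mean squared error: average squared difference (regression)."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Classification example (e.g., sentiment labels): 1 of 3 predictions is wrong.
print(zero_one_error(["pos", "neg", "pos"], ["pos", "pos", "pos"]))

# Regression example (e.g., recovery time in days): (2^2 + 1^2) / 2 = 2.5.
print(mean_squared_error([10.0, 20.0], [12.0, 19.0]))
```

Computing these on a held-out test set (rather than the training set) is what gives an estimate of generalization performance.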