We will deal with two unplugged (no-coding) activities to understand some foundational ideas about machine learning.
We will tackle one classification problem and one prediction problem, and we will talk about what classification and prediction actually mean.
The Goal: To teach a model to sort items into predefined categories or “classes.”
By turning images into data, we can find patterns in images.
Example:
Step 1: Create a landscape drawing area, 16 squares by 9 squares
Step 2: Draw _______ within your drawing area in less than 20 seconds without showing it to your neighbors.
Step 2: Draw a book within your drawing area in less than 20 seconds without showing it to your neighbors.
Step 2.5: Now you can take a look at each other’s drawings.
An example:

Step 3: Pixelate your drawing
For any square that has a line, a dot or any pen/pencil mark, shade the whole square.
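For readers who want to see the same idea in code, here is a minimal sketch of the pixelation step. It is only an illustration: the `pixelate` function and the `marks` list of pen positions are made up for this example, and the activity itself needs no code.

```python
import math

# Grid size from Step 1: 16 squares wide by 9 squares tall.
WIDTH, HEIGHT = 16, 9


def pixelate(marks):
    """Turn pen marks into a grid of 0s and 1s.

    `marks` is a hypothetical list of (x, y) points touched by the pen,
    measured in squares from the top-left corner of the drawing area.
    """
    grid = [[0] * WIDTH for _ in range(HEIGHT)]
    for x, y in marks:
        col, row = math.floor(x), math.floor(y)  # square containing the mark
        if 0 <= row < HEIGHT and 0 <= col < WIDTH:
            grid[row][col] = 1  # any mark at all shades the whole square
    return grid


# A made-up pen stroke: a short diagonal line.
marks = [(2.1, 1.5), (2.9, 2.2), (3.6, 3.0), (4.4, 3.8)]
grid = pixelate(marks)

# Print the grid with '#' for shaded squares and '.' for empty ones.
for row in grid:
    print("".join("#" if cell else "." for cell in row))
```

Each shaded square becomes a 1 and each empty square a 0, which is the form of the drawing that a rule can work with.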
An example:
Step 4: Write your algorithm (Training)
Use your drawing as well as the drawings of your teammates (only your teammates) to come up with an algorithm (a set of rules) that can identify an open book. In other words, the algorithm should decide whether a drawn book is open or closed.
MY CLASSIFICATION ALGORITHM
Algorithm Name: _______________________
My Rule (write it step-by-step):
1. ____________________________________________________
2. ____________________________________________________
3. ____________________________________________________
4. Classification Decision:
if _________________ then predict “open”.
else predict “closed”.
Go through the image one row at a time, from top to bottom.
For each row, check if it qualifies as a “Gapped Row.” A row is a “Gapped Row” if it meets both of these conditions:
Count the Gapped Rows and call the total gap_row_count.
if gap_row_count >= 1 then predict “open”.
else predict “closed”.
Step 5: Test your algorithm
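Before filling in the table, it may help to see how a Gapped Row rule like the example above could look in code. This is a sketch under a stated assumption: the two conditions that define a Gapped Row are not reproduced here, so `is_gapped_row` uses two stand-in conditions chosen purely for illustration. The function names and the two tiny drawings are also made up.

```python
def is_gapped_row(row):
    """Return True if a row of 0s and 1s counts as a "Gapped Row".

    ASSUMPTION: the two conditions below are illustrative stand-ins,
    not the handout's own conditions:
      1. the row has at least two shaded squares, and
      2. at least one unshaded square sits between the leftmost
         and rightmost shaded squares.
    """
    shaded = [i for i, cell in enumerate(row) if cell == 1]
    if len(shaded) < 2:
        return False
    left, right = shaded[0], shaded[-1]
    return 0 in row[left:right + 1]


def classify(grid):
    """Go through the grid row by row and apply the decision rule."""
    gap_row_count = sum(1 for row in grid if is_gapped_row(row))
    return "open" if gap_row_count >= 1 else "closed"


# Two tiny made-up drawings (smaller than 16 x 9 to keep the example short).
two_pages = [
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 1, 1, 1],
    [1, 1, 1, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0],
]
solid_block = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

print(classify(two_pages))
print(classify(solid_block))
```

With a different definition of a Gapped Row the same drawings could be classified differently, which is exactly what the test in this step is meant to reveal.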
| image_id | actual_class | predicted_class |
|---|---|---|
| 1 | | |
| 2 | | |
| 3 | | |
| 4 | | |
| 5 | | |
| 6 | | |
| 7 | | |
| 8 | | |
| 9 | | |
| 10 | | |
actual_class = closed
predicted_class = ?
actual_class = open
predicted_class = ?
actual_class = open
predicted_class = ?
actual_class = open
predicted_class = ?
actual_class = closed
predicted_class = ?
actual_class = closed
predicted_class = ?
actual_class = open
predicted_class = ?
actual_class = open
predicted_class = ?
actual_class = open
predicted_class = ?
actual_class = open
predicted_class = ?
| | Predicted: OPEN | Predicted: CLOSED |
|---|---|---|
| Actual: OPEN | _________ (True Positive) | ________ (False Negative) |
| Actual: CLOSED | _______ (False Positive) | ________ (True Negative) |
True Positives (TP):
The model correctly predicted “OPEN” ____ times.
True Negatives (TN):
The model correctly predicted “CLOSED” ____ times.
False Positives (FP):
The model incorrectly predicted “OPEN” when the book was actually closed ____ times.
False Negatives (FN):
The model incorrectly predicted “CLOSED” when the book was actually open ____ times.
Overall Accuracy: (Correct / Total) = ____
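The four counts and the accuracy can also be tallied in code. The sketch below treats “open” as the positive class, as in the table above. The `actual` list repeats the ten actual classes given for the test images (assuming they are listed in image order), while the `predicted` list is invented for illustration and is not the output of any real algorithm. The function name `confusion_counts` is also hypothetical.

```python
def confusion_counts(actual, predicted, positive="open"):
    """Count true/false positives and negatives for a two-class problem."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # predicted open, actually open
        elif p != positive and a != positive:
            tn += 1  # predicted closed, actually closed
        elif p == positive:
            fp += 1  # predicted open, actually closed
        else:
            fn += 1  # predicted closed, actually open
    return tp, tn, fp, fn


# Actual classes of the ten test images, assumed to be in image order.
actual = ["closed", "open", "open", "open", "closed",
          "closed", "open", "open", "open", "open"]

# MADE-UP predictions, only to show how the counting works.
predicted = ["closed", "open", "closed", "open", "open",
             "closed", "open", "open", "open", "closed"]

tp, tn, fp, fn = confusion_counts(actual, predicted)
accuracy = (tp + tn) / len(actual)  # Correct / Total

print("TP:", tp, "TN:", tn, "FP:", fp, "FN:", fn)
print("Accuracy:", accuracy)
```

Replacing `predicted` with your own algorithm's answers gives the numbers for your confusion matrix.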
In a medical test, which has worse consequences: a false negative or a false positive? Discuss the implications of each of these possible results.
Computers See Data, Not Pictures. We learned to translate a visual concept (a book) into structured data (a grid of 0s and 1s) that a computer can understand.
An Algorithm is a Set of Rules. We created algorithms—step-by-step instructions—to sort our data. An algorithm is the recipe for finding patterns.
A Model is the Result of Training. Our final, specific rule (e.g., “Predict ‘open’ if gap_row_count >= 1”) is our model. It’s the finished cake we can use to make predictions.
No Model is Perfect. Every model has strengths and weaknesses. “All models are wrong, but some are useful” George Box
Evaluation is Everything. We must test our model on unseen data to find its flaws (False Positives and False Negatives) and truly understand how well it works. A model is only as good as its test results.
Our response variable was categorical and had two categories (classes), i.e., open is a binary variable with TRUE or FALSE as possible values.
Thus the activity we just completed is a binary classification task.
open, closed, half-open


Our response variable was categorical and had two categories (classes) and hence we used a classification algorithm.
If we had a numeric response variable (e.g., temperature), then we would have used a prediction algorithm (commonly called regression).