.classify()

🔬

This card depicts two clusters of data points belonging to different categories. One such example is that the points on the left represent a spam email, and the points on the right, a non-spam email.

How can we predict what class something belongs to? The method depicted here is known as a Support Vector Machine, a technique which creates a classification boundary which maximizes the distance from a boundary to each cluster of points. This is one of many possible techniques.

A dotted and a solid line are visible, and each is a candidate for the boundary to separate the two classes. Instinctively, we can tell that the solid line is the better solution, because it looks more balanced on each side. The dotted line correctly classifies each set of points, but may incorrectly classify new points not yet in this set.

🧩

This graphic reminds us that maintaining healthy boundaries in matters of principle, relationships, health (and really any other domain) are formed in three steps. First, we must gather enough data to make the subsequent model valid. Second, we must go through a process of testing different boundaries to see which one fits the data best. Third, we must not compromise on the boundary that we set.

🖋️

Let’s reflect on strong and weak boundaries that we hold. What is an area in which you are particularly clear about what is right and wrong, or what serves and doesn’t serve you? What is an area in which you are less confident about these classifications?

📚

Watch an introduction to classification models

Return home.