19 Jun 2020

The K-Nearest Neighbors Algorithm

The k-nearest neighbors algorithm is a simple machine-learning algorithm that classifies a new data point according to the classes of the data points closest to it. An example illustrates the idea.

Consider an apartment renter. If three of the five renters living nearest to that renter earn more than $75k a year, then that renter probably earns more than $75k a year, too. This is not guaranteed, but it is likely. The k-nearest neighbors algorithm operates on this principle: given a positive integer k, it assigns a new data point the class that appears most frequently among the k known data points nearest to it.
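To make the voting step concrete, here is a minimal sketch of the idea, not a definitive implementation. The function name classify, the { features, label } data shape, the streetDistance helper, and the renter data are all assumptions made up for illustration. The distance function is passed in as a parameter, so the sketch does not commit to any particular metric.

```javascript
// Hypothetical sketch: names and data shapes are illustrative assumptions.
// points:   array of { features: number[], label: string }
// query:    number[] describing the new data point
// k:        how many nearest neighbors get a vote
// distance: any function (a, b) => number, supplied by the caller
function classify(points, query, k, distance) {
  // Rank every known point by its distance to the query and keep the k closest.
  const nearest = points
    .map((p) => ({ label: p.label, d: distance(p.features, query) }))
    .sort((a, b) => a.d - b.d)
    .slice(0, k);

  // Count how often each class appears among those k neighbors.
  const votes = new Map();
  for (const { label } of nearest) {
    votes.set(label, (votes.get(label) || 0) + 1);
  }

  // Pick the class with the most votes. On a tie, the class that was
  // encountered first among the neighbors wins.
  let best = null;
  let bestCount = 0;
  for (const [label, count] of votes) {
    if (count > bestCount) {
      best = label;
      bestCount = count;
    }
  }
  return best;
}

// Made-up data echoing the renter example: each renter has a position
// along a street and an income class.
const renters = [
  { features: [1], label: ">75k" },
  { features: [2], label: ">75k" },
  { features: [3], label: "<=75k" },
  { features: [4], label: ">75k" },
  { features: [6], label: "<=75k" },
  { features: [20], label: "<=75k" },
  { features: [25], label: "<=75k" },
];

// Assumed one-dimensional distance for this toy data set.
const streetDistance = (a, b) => Math.abs(a[0] - b[0]);

const predicted = classify(renters, [3.4], 5, streetDistance);
console.log(predicted);
```

In this toy data set, three of the five renters nearest to position 3.4 are in the ">75k" class, so that class wins the vote even though most renters on the street as a whole are not in it.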

Here is my JavaScript implementation of the kNN algorithm.

As this great blog post explains, it’s crucial to pick an appropriate metric for measuring the nearness of neighbors. In my implementation, I used a common metric, the Euclidean distance.
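For reference, the Euclidean distance between two points is the square root of the sum of the squared differences of their coordinates. A small sketch of that calculation follows. The function name euclideanDistance and its array-based signature are assumptions for illustration, and are not necessarily what the implementation mentioned above uses.

```javascript
// Hypothetical helper: name and signature are assumptions for illustration.
// Computes the Euclidean distance between two points given as
// equal-length arrays of numbers.
function euclideanDistance(a, b) {
  if (a.length !== b.length) {
    throw new Error("Points must have the same number of dimensions");
  }
  let sumOfSquares = 0;
  for (let i = 0; i < a.length; i++) {
    const diff = a[i] - b[i];
    sumOfSquares += diff * diff;
  }
  return Math.sqrt(sumOfSquares);
}

// The classic 3-4-5 right triangle: the distance from (0, 0) to (3, 4) is 5.
const d = euclideanDistance([0, 0], [3, 4]);
console.log(d);
```

Because every coordinate contributes its squared difference, a feature measured on a much larger scale than the others will tend to dominate the result, which is one reason the choice of metric matters.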