KNN-Python from scratch

Original article was published on Deep Learning on Medium

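Here we are working on a K-nearest neighbours (KNN) classifier, written from scratch in Python, that predicts the species of a flower from its measurements.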

1. Import required Python libraries

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

2. Load the dataset

Load the CSV file and display its first 5 rows
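A minimal sketch of this step. The original code and file name are not shown here, so `load_dataset` is a hypothetical helper, the two inline rows are made-up placeholders standing in for the real file, and the `sepalLength` / `sepalWidth` column names are assumed by analogy with `petalLength` and `petalWidth`.

```python
import io

import pandas as pd


def load_dataset(source):
    """Read a CSV file (a path or a file-like object) into a DataFrame."""
    return pd.read_csv(source)


# Placeholder for the real CSV file: two made-up rows, assumed column names.
sample_csv = io.StringIO(
    "Id,sepalLength,sepalWidth,petalLength,petalWidth,Species\n"
    "1,5.0,3.0,1.5,0.2,species_a\n"
    "2,6.0,3.0,4.5,1.5,species_b\n"
)

data = load_dataset(sample_csv)
print(data.head())  # head() returns the first 5 rows by default
```

With the real dataset, pass the path of the CSV file to `load_dataset` instead of the in-memory sample.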

3. Visualising the different species (classes) we are going to work with

We plot petalLength against petalWidth for each Species
The three Species we are currently working on
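One way such a plot could be produced is sketched below. The `plot_species` helper and the toy values are illustrative assumptions, not the original plotting code; only the `petalLength`, `petalWidth` and `Species` column names come from the text.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_species(data, x_col="petalLength", y_col="petalWidth", label_col="Species"):
    """Draw one scatter series per species so the groups are easy to tell apart."""
    fig, ax = plt.subplots()
    for species, group in data.groupby(label_col):
        ax.scatter(group[x_col], group[y_col], label=species)
    ax.set_xlabel(x_col)
    ax.set_ylabel(y_col)
    ax.legend(title=label_col)
    return ax


# Made-up toy values, only to exercise the function.
toy = pd.DataFrame({
    "petalLength": [1.4, 1.5, 4.5, 4.7, 5.9, 6.1],
    "petalWidth": [0.2, 0.3, 1.5, 1.4, 2.1, 2.3],
    "Species": ["species_a", "species_a", "species_b", "species_b", "species_c", "species_c"],
})
ax = plot_species(toy)
```

Calling `plt.show()` afterwards displays the figure in an interactive session.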

Our task is now to predict which Species a new data point belongs to, based on its sepal length, sepal width, petal length and petal width

Preprocessing Data

4. Removing the unnecessary Id column from the data
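A minimal sketch of dropping the column with pandas, assuming the data is held in a DataFrame. The toy rows are made up for illustration.

```python
import pandas as pd

# Made-up toy rows; only the Id and Species column names come from the article.
data = pd.DataFrame({
    "Id": [1, 2, 3],
    "petalLength": [1.4, 4.5, 5.9],
    "Species": ["species_a", "species_b", "species_c"],
})

# drop() returns a new DataFrame and leaves the original untouched.
data_without_id = data.drop(columns=["Id"])
print(data_without_id.columns.tolist())  # ['petalLength', 'Species']
```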

5. Shuffling the data, so that the train/test split is not biased by the original row order

This changes the order of the rows
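One possible way to shuffle the rows with pandas is sketched here. The `shuffle_rows` helper and the fixed seed are assumptions added for reproducibility. The original code may shuffle differently.

```python
import pandas as pd


def shuffle_rows(data, seed=42):
    """Return the rows in random order with a fresh 0..n-1 index."""
    # sample(frac=1) draws every row exactly once, in random order.
    return data.sample(frac=1, random_state=seed).reset_index(drop=True)


# Made-up toy frame, sorted by label the way an unshuffled dataset might be.
data = pd.DataFrame({
    "value": range(9),
    "Species": ["species_a"] * 3 + ["species_b"] * 3 + ["species_c"] * 3,
})
shuffled = shuffle_rows(data)
print(shuffled)
```

If the file is sorted by species, an unshuffled 70/30 split would leave the last species under-represented in the training set, which is why the shuffle comes before the split.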

6. Splitting the data into train and test sets, with 70% of the rows for training and 30% for testing

Train data: 105 rows, Test data: 45 rows
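The row counts above follow from a 70/30 cut of 150 rows. A minimal sketch, assuming the data has already been shuffled and sits in a DataFrame. `split_train_test` is a hypothetical helper and the 150-row frame is a dummy stand-in.

```python
import pandas as pd


def split_train_test(data, train_fraction=0.7):
    """Cut the rows at train_fraction: first part for training, rest for testing."""
    split_at = round(len(data) * train_fraction)
    train = data.iloc[:split_at].reset_index(drop=True)
    test = data.iloc[split_at:].reset_index(drop=True)
    return train, test


# Dummy frame with 150 rows, matching the row count implied by the article.
data = pd.DataFrame({"value": range(150)})
train, test = split_train_test(data)
print(len(train), len(test))  # 105 45
```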

KNN in 3 Steps

  1. Measure Distance (Euclidean distance or Manhattan distance)
  2. Get nearest neighbours
  3. Predict the class

Step 1

Measuring distance using Euclidean Distance
Mathematical formula: $d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$
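The formula above covers two coordinates. For our four measurements the same idea applies, with one squared difference per feature. A minimal sketch, where `euclidean_distance` is an illustrative name rather than the original function:

```python
import math


def euclidean_distance(point_a, point_b):
    """Square root of the summed squared differences, one term per feature."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(point_a, point_b)))


print(euclidean_distance([0, 0], [3, 4]))  # 5.0
```

The 3-4-5 triangle gives an easy sanity check: the squared differences are 9 and 16, and the square root of 25 is 5.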

Step 2

Getting the Nearest Neighbours
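Getting the neighbours could look like the sketch below: rank every training row by its distance to the new point and keep the first k. The `(features, label)` row layout, the helper names and the toy points are assumptions made for illustration.

```python
import math


def euclidean_distance(point_a, point_b):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(point_a, point_b)))


def get_neighbours(train_rows, query, k):
    """Return the k training rows closest to query.

    Each training row is a (features, label) pair.
    """
    ranked = sorted(train_rows, key=lambda row: euclidean_distance(row[0], query))
    return ranked[:k]


# Made-up toy points in two clearly separated groups.
train_rows = [
    ([1.0, 1.0], "species_a"),
    ([1.2, 0.9], "species_a"),
    ([5.0, 5.0], "species_b"),
    ([5.1, 4.8], "species_b"),
]
neighbours = get_neighbours(train_rows, [1.1, 1.0], k=2)
print(neighbours)
```

Sorting the whole training set is the simplest approach and is fine for a dataset of this size.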

Step 3

Predicting the class
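A common way to turn the neighbours into a prediction is a majority vote over their labels. The sketch below assumes each neighbour is a `(features, label)` pair. `predict_label` is a hypothetical name and the values are made up.

```python
from collections import Counter


def predict_label(neighbours):
    """Return the most frequent label among the neighbours."""
    labels = [label for _, label in neighbours]
    # On a tie, the label that appears first in the list wins.
    return Counter(labels).most_common(1)[0][0]


neighbours = [
    ([1.0, 1.0], "species_a"),
    ([1.3, 1.1], "species_b"),
    ([1.2, 0.9], "species_a"),
]
print(predict_label(neighbours))  # species_a
```

Choosing an odd k reduces the chance of ties when there are two classes, though with three classes ties remain possible.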


Predicting the Test Data
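Predicting the test data amounts to running the three steps once per test row. The sketch below folds them into one function. The default `k=3`, the function names and the toy rows are all assumptions, since the article's own value of k is not shown.

```python
import math
from collections import Counter


def knn_predict(train_rows, query, k=3):
    """Distance, nearest neighbours and majority vote for one query point."""
    def distance(features):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(features, query)))

    nearest = sorted(train_rows, key=lambda row: distance(row[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def predict_all(train_rows, test_features, k=3):
    """Predict a label for every row of test features."""
    return [knn_predict(train_rows, query, k) for query in test_features]


# Made-up toy data: (features, label) pairs for training, bare features for testing.
train_rows = [
    ([1.0, 1.0], "species_a"),
    ([1.2, 0.9], "species_a"),
    ([0.9, 1.1], "species_a"),
    ([5.0, 5.0], "species_b"),
    ([5.1, 4.8], "species_b"),
    ([4.9, 5.2], "species_b"),
]
test_features = [[1.1, 1.0], [5.0, 4.9]]
predictions = predict_all(train_rows, test_features)
print(predictions)  # ['species_a', 'species_b']
```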

Evaluating Model Performance
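The metric used in the original is not shown here. A simple choice is accuracy, the fraction of test rows whose predicted label matches the actual one. A minimal sketch under that assumption, with made-up labels:

```python
def accuracy(actual, predicted):
    """Fraction of predictions that match the actual labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


actual = ["species_a", "species_b", "species_b", "species_a"]
predicted = ["species_a", "species_b", "species_a", "species_a"]
print(accuracy(actual, predicted))  # 0.75
```

Here three of the four labels match, so the accuracy is 0.75.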

Final modified test data with predicted labels
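Attaching the predictions to the test set could be done as in this sketch, assuming the test data is a DataFrame. The `attach_predictions` helper, the `Predicted` column name and the toy rows are hypothetical.

```python
import pandas as pd


def attach_predictions(test_data, predictions, column="Predicted"):
    """Return a copy of test_data with the predicted labels as an extra column."""
    if len(test_data) != len(predictions):
        raise ValueError("one prediction per test row is required")
    result = test_data.copy()
    result[column] = list(predictions)
    return result


# Made-up toy rows for illustration.
test_data = pd.DataFrame({
    "petalLength": [1.4, 4.5],
    "petalWidth": [0.2, 1.5],
    "Species": ["species_a", "species_b"],
})
labelled = attach_predictions(test_data, ["species_a", "species_b"])
print(labelled)
```

Keeping the actual and predicted labels side by side makes misclassified rows easy to spot.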

For the full code, visit KNN-Python from scratch