Key Word(s): kNN, kNN Regression, MSE, Data Plotting

Title:

Exercise: Simple kNN Regression

Description:

The goal of this exercise is to re-create the plots given below. You will have come across these graphs in the lecture as well.

[Figure: the target plots to re-create]

Data Description:

Instructions:

Part 1: KNN by hand for k=1

  • Read the Advertisement data.
  • Get a subset of the data from row 5 to row 13.
  • Apply the kNN algorithm by hand and plot the first graph as given above (a rough sketch follows this list).
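A minimal sketch of Part 1 is given below. The file name "Advertising.csv" and the column names TV and Sales are assumptions (use whatever the scaffold provides); the grid-based prediction mirrors the np.linspace, np.min, and np.max hints further down.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Read the Advertisement data (file name assumed)
df = pd.read_csv("Advertising.csv")

# Subset from row 5 to row 13 (adjust the slice to match the scaffold)
df_new = df.iloc[5:13]
x = df_new['TV'].values        # predictor (column name assumed)
y = df_new['Sales'].values     # response (column name assumed)

# k=1 by hand: for every query point, predict the y of the single closest x
x_grid = np.linspace(np.min(x), np.max(x), 100)
y_pred = np.array([y[np.argmin(np.abs(x - xq))] for xq in x_grid])

plt.plot(x, y, 'x', label='data', color='k')
plt.plot(x_grid, y_pred, label='kNN prediction, k=1')
plt.xlabel('TV advertising budget')
plt.ylabel('Sales')
plt.legend()
plt.show()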

Part 2: Using sklearn package

  • Read the Advertisement dataset.
  • Split the data into train and test sets using the train_test_split() function.
  • Set k_list as the possible k values ranging from 1 to 70.
  • Use sklearn's KNeighborsRegressor() to fit the train data.
  • Predict on the test data.
  • Use the helper code to get the second plot above for k=1, 10, 70 (a rough sketch follows this list).
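A minimal sketch of Part 2 is given below. The train/test split fraction and random_state are illustrative assumptions, as are the file and column names; set them to the values given in the scaffold.

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Read the Advertisement dataset (file and column names assumed as in Part 1)
df = pd.read_csv("Advertising.csv")
x = df[['TV']].values          # 2-D feature array, as sklearn expects
y = df['Sales'].values

# Split into train and test sets (split parameters are assumptions)
x_train, x_test, y_train, y_test = train_test_split(x, y, train_size=0.75, random_state=42)

# Possible k values ranging from 1 to 70
k_list = np.arange(1, 71)

# Fit a KNeighborsRegressor for every k, then predict on the test data
knn_models = {}
for k in k_list:
    model = KNeighborsRegressor(n_neighbors=int(k))
    model.fit(x_train, y_train)
    knn_models[k] = model

# Predictions on the test set for the k values used in the plot
y_test_pred = {k: knn_models[k].predict(x_test) for k in (1, 10, 70)}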

Hints: ¶

np.argsort() Returns the indices that would sort an array.

df.iloc[] Purely integer-location based indexing; returns the subset of the dataframe selected by the row (and optionally column) positions passed as the argument.

plt.plot() Plot y versus x as lines and/or markers.

df.values Returns a Numpy representation of the DataFrame.

df.idxmin() Returns the index of the first occurrence of the minimum over the requested axis.

np.min() Returns the minimum along a given axis.

np.max() Returns the maximum along a given axis.

model.fit() Fit the k-nearest neighbors regressor from the training dataset.

model.predict() Predict the target for the provided data.

np.zeros() Returns a new array of given shape and type, filled with zeros.

train_test_split(X,y) Split arrays or matrices into random train and test subsets.

np.linspace() Returns evenly spaced numbers over a specified interval.

KNeighborsRegressor(n_neighbors=k_value) Regression based on k-nearest neighbors.
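A quick illustration of two of the hints above on toy data (the distances are made up purely for demonstration):

import numpy as np
import pandas as pd

dists = np.array([3.2, 0.5, 1.7])
print(np.argsort(dists))             # [1 2 0] -- indices that would sort the array
print(pd.Series(dists).idxmin())     # 1 -- index label of the first minimum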

Note: This exercise is auto-graded, so please remember to set all the parameters to the values mentioned in the scaffold before marking.

Part 1: KNN by hand for $k=1$

Plotting the data

Part 2: KNN for $k \ge 1$ using sklearn

⏸ In the plotting code above, re-run ax.plot(x_train, y_train, 'x', label='train', color='k') with x_test and y_test instead. According to you, which k value is the best, and why?
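The plotting code that the question refers to is not included in this extract. A minimal sketch of what it might look like, continuing from the Part 2 sketch above (x_train, y_train, and knn_models are assumed to exist from that sketch):

import numpy as np
import matplotlib.pyplot as plt

fig, axes = plt.subplots(1, 3, figsize=(15, 4))
x_grid = np.linspace(np.min(x_train), np.max(x_train), 100).reshape(-1, 1)

for ax, k in zip(axes, (1, 10, 70)):
    # Swap in x_test and y_test here to answer the question above
    ax.plot(x_train, y_train, 'x', label='train', color='k')
    ax.plot(x_grid, knn_models[k].predict(x_grid), label=f'kNN, k={k}')
    ax.set_xlabel('TV advertising budget')
    ax.set_ylabel('Sales')
    ax.legend()
plt.show()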

Machine Learning - K-Nearest Neighbors (KNN)


KNN is a simple, supervised machine learning (ML) algorithm that can be used for classification or regression tasks - and is also frequently used in missing value imputation. It is based on the idea that the observations closest to a given data point are the most "similar" observations in a data set, and we can therefore classify unforeseen points based on the values of the closest existing points. By choosing K , the user can select the number of nearby observations to use in the algorithm.

Here, we will show you how to implement the KNN algorithm for classification, and show how different values of K affect the results.

How does it work?

K is the number of nearest neighbors to use. For classification, a majority vote is used to determine which class a new observation should fall into. Larger values of K are often more robust to outliers and produce more stable decision boundaries than very small values (K=3 would be better than K=1, which might produce undesirable results).
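For instance, a hand-rolled majority vote over the K closest points might look like the sketch below (an illustration of the idea, not the sklearn internals):

import numpy as np
from collections import Counter

def knn_classify(point, data, classes, k):
    # Euclidean distance from the query point to every labelled point
    dists = np.linalg.norm(np.array(data) - np.array(point), axis=1)
    nearest = np.argsort(dists)[:k]           # indices of the k closest points
    votes = Counter(classes[i] for i in nearest)
    return votes.most_common(1)[0][0]         # majority class among the neighbors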

Start by visualizing some data points:
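A minimal snippet to do so, using the x, y, and classes lists defined in the example further down:

import matplotlib.pyplot as plt

x = [4, 5, 10, 4, 3, 11, 14, 8, 10, 12]
y = [21, 19, 24, 17, 16, 25, 24, 22, 21, 21]
classes = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]

plt.scatter(x, y, c=classes)    # color each point by its class label
plt.show()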

[Figure: scatter plot of the data points, colored by class]


Now we fit the KNN algorithm with K=1:

from sklearn.neighbors import KNeighborsClassifier

data = list(zip(x, y))
knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(data, classes)

And use it to classify a new data point:

[Figure: the new point classified with K=1]

Now we do the same thing, but with a higher K value which changes the prediction:

[Figure: the new point classified with a larger K]

Example Explained

Import the modules you need.

You can learn about the Matplotlib module in our "Matplotlib Tutorial".

scikit-learn is a popular library for machine learning in Python.

import matplotlib.pyplot as plt
from sklearn.neighbors import KNeighborsClassifier

Create arrays that resemble variables in a dataset. We have two input features ( x and y ) and then a target class ( class ). The input features that are pre-labeled with our target class will be used to predict the class of new data. Note that while we only use two input features here, this method will work with any number of variables:

x = [4, 5, 10, 4, 3, 11, 14, 8, 10, 12]
y = [21, 19, 24, 17, 16, 25, 24, 22, 21, 21]
classes = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]

Turn the input features into a set of points:

data = list(zip(x, y))
print(data)

[(4, 21), (5, 19), (10, 24), (4, 17), (3, 16), (11, 25), (14, 24), (8, 22), (10, 21), (12, 21)]

Using the input features and target class, we fit a KNN model on the data using 1 nearest neighbor:

knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(data, classes)

Then, we can use the same KNN object to predict the class of new, unforeseen data points. First we create new x and y features, and then call knn.predict() on the new data point to get a class of 0 or 1:

new_x = 8
new_y = 21
new_point = [(new_x, new_y)]
prediction = knn.predict(new_point)
print(prediction)

When we plot all the data along with the new point and class, we can see it's been labeled with the 0 class (its single nearest neighbor, (8, 22), belongs to class 0). The text annotation is just to highlight the location of the new point:

plt.scatter(x + [new_x], y + [new_y], c=classes + [prediction[0]])
plt.text(x=new_x-1.7, y=new_y-0.7, s=f"new point, class: {prediction[0]}")
plt.show()

However, when we change the number of neighbors to 5, the number of points used to classify our new point changes. As a result, so does the classification of the new point:

knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(data, classes)
prediction = knn.predict(new_point)
print(prediction)

When we plot the class of the new point along with the older points, we note that the color has changed based on the associated class label:
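Re-running the same plotting call as before (re-using the variables from the snippets above) now picks up the new prediction made with K=5:

plt.scatter(x + [new_x], y + [new_y], c=classes + [prediction[0]])
plt.text(x=new_x-1.7, y=new_y-0.7, s=f"new point, class: {prediction[0]}")
plt.show()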

