KNN Implementation in Python

'''
The k-nearest neighbors (kNN) algorithm works in a relatively simple way: under some distance measure, find the k training samples closest to a given sample, then predict from those k training samples.

Classification: the class that occurs most often among the k neighbors is the predicted class of the test sample.

Regression: the average of the target values of the k training samples is usually used as the prediction for the test sample.

The three elements of a kNN model: the distance measure, the choice of k, and the decision rule for classification or regression.
'''

import numpy as np


class KNNClassfier(object):
    def __init__(self, k=5, distance='euc'):
        self.k = k
        self.distance = distance
        self.x = None
        self.y = None

    def fit(self, X, Y):
        '''
        X : array-like [n_samples, n_features]
        Y : array-like [n_samples, 1]
        '''
        # kNN is a lazy learner: fitting only stores the training data.
        self.x = X
        self.y = Y

    def predict(self, X_test):
        '''
        X_test : array-like [n_samples, n_features]
        output : array-like [n_samples, 1]
        '''
        output = np.zeros((X_test.shape[0], 1))
        for i in range(X_test.shape[0]):
            # Distance from test sample i to every training sample.
            dis = []
            for j in range(self.x.shape[0]):
                if self.distance == 'euc':  # Euclidean distance
                    dis.append(np.linalg.norm(X_test[i] - self.x[j, :]))
                else:
                    raise ValueError("only distance='euc' is supported")
            # Training indices sorted by increasing distance.
            index = sorted(range(len(dis)), key=dis.__getitem__)
            # Labels of the k nearest training samples.
            labels = []
            for j in range(self.k):
                labels.append(self.y[index[j]])
            # Majority vote: pick the label that occurs most often.
            counts = []
            for label in labels:
                counts.append(labels.count(label))
            output[i] = labels[np.argmax(counts)]
        return output

    def score(self, x, y):
        # Accuracy = 1 - (misclassified samples / all samples).
        pred = self.predict(x)
        err = 0.0
        for i in range(x.shape[0]):
            if pred[i] != y[i]:
                err = err + 1
        return 1 - float(err / x.shape[0])


if __name__ == '__main__':
    from sklearn import datasets

    iris = datasets.load_iris()
    x = iris.data
    y = iris.target
    # x = np.array([[0.5,0.4],[0.1,0.2],[0.7,0.8],[0.2,0.1],[0.4,0.6],[0.9,0.9],[1,1]]).reshape(-1,2)
    # y = np.array([0,1,0,1,0,1,1]).reshape(-1,1)

    clf = KNNClassfier(k=3)
    clf.fit(x, y)
    print('myknn score:', clf.score(x, y))

    # Reference result from scikit-learn with the same k.
    from sklearn.neighbors import KNeighborsClassifier

    clf_sklearn = KNeighborsClassifier(n_neighbors=3)
    clf_sklearn.fit(x, y)
    print('sklearn score:', clf_sklearn.score(x, y))
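
The docstring at the top also mentions the regression case, where the prediction is the average of the k nearest targets. The class above only covers classification, so the following is a minimal sketch of the regression variant, not part of the original code. The helper name `knn_regress` is hypothetical, and the sketch assumes plain Python lists and `math.dist` (Python 3.8+) so that it runs without NumPy.

```python
import math


def knn_regress(x_train, y_train, x_query, k=5):
    """Predict a value for x_query as the mean target of its k nearest training samples."""
    if not 1 <= k <= len(x_train):
        raise ValueError("k must be between 1 and the number of training samples")
    # Pair each training sample's Euclidean distance to the query with its target.
    dist_and_target = [(math.dist(x, x_query), y) for x, y in zip(x_train, y_train)]
    # Sort by distance only; list.sort is stable, so ties keep training order.
    dist_and_target.sort(key=lambda pair: pair[0])
    nearest_targets = [y for _, y in dist_and_target[:k]]
    return sum(nearest_targets) / k


if __name__ == "__main__":
    x_train = [[0.0], [1.0], [2.0], [3.0], [10.0]]
    y_train = [0.0, 1.0, 2.0, 3.0, 10.0]
    # The three nearest samples to 1.1 are 1.0, 2.0 and 0.0, so the mean is 1.0.
    print(knn_regress(x_train, y_train, [1.1], k=3))
```

Only the decision rule changes compared with the classifier: the neighbor search is identical, and the majority vote is replaced by a mean.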

Handwritten digit recognition

from sklearn import datasets
from KNN import KNNClassfier
import matplotlib.pyplot as plt
import numpy as np
import time

digits = datasets.load_digits()
x = digits.data
y = digits.target

myknn_start_time = time.time()
clf = KNNClassfier(k=5)
clf.fit(x, y)
print('myknn score:', clf.score(x, y))
myknn_end_time = time.time()

from sklearn.neighbors import KNeighborsClassifier

sklearnknn_start_time = time.time()
clf_sklearn = KNeighborsClassifier(n_neighbors=5)
clf_sklearn.fit(x, y)
print('sklearn score:', clf_sklearn.score(x, y))
sklearnknn_end_time = time.time()

print('myknn uses time:', myknn_end_time - myknn_start_time)
print('sklearn uses time:', sklearnknn_end_time - sklearnknn_start_time)

You can see that on a larger dataset the hand-written kNN is very slow. The reason is that every search for the k nearest neighbors scans the entire training set, which requires a large number of distance computations.

A kNN implementation therefore also has to consider how to find the k nearest neighbors efficiently. Building a kd-tree reduces the number of distance computations, because the search can skip most of the points. For the structure of the kd-tree, see "Statistical Learning Methods".

- expericnce
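
To make the kd-tree idea concrete, here is a minimal sketch of building a tree and querying the k nearest neighbors with pruning. It is an illustration under stated assumptions, not the author's implementation and not the book's exact pseudocode: the names `KDNode`, `build_kdtree`, and `kd_knn` are hypothetical, points are plain Python lists, and only the standard library is used (`math.dist` needs Python 3.8+).

```python
import heapq
import math
from collections import Counter


class KDNode:
    def __init__(self, point, label, axis, left, right):
        self.point, self.label, self.axis = point, label, axis
        self.left, self.right = left, right


def build_kdtree(points, labels, depth=0):
    """Split the samples at the median of one coordinate, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    order = sorted(range(len(points)), key=lambda i: points[i][axis])
    mid = len(order) // 2
    lo, hi = order[:mid], order[mid + 1:]
    return KDNode(
        points[order[mid]], labels[order[mid]], axis,
        build_kdtree([points[i] for i in lo], [labels[i] for i in lo], depth + 1),
        build_kdtree([points[i] for i in hi], [labels[i] for i in hi], depth + 1),
    )


def kd_knn(root, query, k):
    """Return the k nearest (distance, label) pairs, closest first."""
    heap = []    # max-heap on distance, stored as (-distance, tie_breaker, label)
    visited = 0  # tie_breaker so that labels are never compared inside the heap

    def visit(node):
        nonlocal visited
        if node is None:
            return
        dist = math.dist(node.point, query)
        if len(heap) < k:
            heapq.heappush(heap, (-dist, visited, node.label))
        elif dist < -heap[0][0]:
            heapq.heapreplace(heap, (-dist, visited, node.label))
        visited += 1
        diff = query[node.axis] - node.point[node.axis]
        near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
        visit(near)
        # Cross the splitting plane only if it is closer than the current
        # k-th best distance; otherwise the far subtree cannot improve the result.
        if len(heap) < k or abs(diff) < -heap[0][0]:
            visit(far)

    visit(root)
    return sorted(((-d, label) for d, _, label in heap), key=lambda t: t[0])


if __name__ == "__main__":
    points = [[2, 3], [5, 4], [9, 6], [4, 7], [8, 1], [7, 2]]
    labels = ['a', 'a', 'b', 'b', 'b', 'a']
    tree = build_kdtree(points, labels)
    neighbors = kd_knn(tree, [3, 4.5], k=3)
    votes = Counter(label for _, label in neighbors)
    # Neighbor labels are ['a', 'a', 'b'], so the majority vote is 'a'.
    print([label for _, label in neighbors], votes.most_common(1)[0][0])
```

In this small example the query never descends into the right subtree of the root, because the splitting plane at x = 7 is farther away than the third-best distance found on the left. This pruning is what saves distance computations compared with the full scan.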
