KNN Python Implementation
'''
The k-nearest neighbor (kNN) algorithm works in a relatively simple way: under some distance measure, find the k training samples closest to a given sample, then predict from those k training samples.
Classification: the class that occurs most frequently among the k neighbors is the predicted class of the test sample.
Regression: usually the mean of the k neighbors' target values is used as the prediction for the test sample.
The three elements of a kNN model: the distance measure, the choice of k, and the classification or regression decision rule.
'''
import numpy as np


class KNNClassfier(object):
    def __init__(self, k=5, distance='euc'):
        self.k = k
        self.distance = distance
        self.x = None
        self.y = None

    def fit(self, X, Y):
        '''
        X : array-like [n_samples, n_features]
        Y : array-like [n_samples, 1]
        '''
        # kNN is a lazy learner: fitting only stores the training data
        self.x = X
        self.y = Y

    def predict(self, X_test):
        '''
        X_test : array-like [n_samples, n_features]
        output : array-like [n_samples, 1]
        '''
        output = np.zeros((X_test.shape[0], 1))
        for i in range(X_test.shape[0]):
            # Distance from test sample i to every training sample
            dis = []
            for j in range(self.x.shape[0]):
                if self.distance == 'euc':  # Euclidean distance
                    dis.append(np.linalg.norm(X_test[i] - self.x[j, :]))
                else:
                    raise ValueError('unsupported distance: %s' % self.distance)
            # Training-sample indices sorted by ascending distance
            index = sorted(range(len(dis)), key=dis.__getitem__)
            labels = []
            for j in range(self.k):
                labels.append(self.y[index[j]])
            # Majority vote among the labels of the k nearest neighbors
            counts = []
            for label in labels:
                counts.append(labels.count(label))
            output[i] = labels[np.argmax(counts)]
        return output

    def score(self, x, y):
        # Accuracy = 1 - error rate
        pred = self.predict(x)
        err = 0.0
        for i in range(x.shape[0]):
            if pred[i] != y[i]:
                err = err + 1
        return 1 - float(err / x.shape[0])


if __name__ == '__main__':
    from sklearn import datasets
    iris = datasets.load_iris()
    x = iris.data
    y = iris.target
    # x = np.array([[0.5,0.4],[0.1,0.2],[0.7,0.8],[0.2,0.1],[0.4,0.6],[0.9,0.9],[1,1]]).reshape(-1,2)
    # y = np.array([0,1,0,1,0,1,1]).reshape(-1,1)
    clf = KNNClassfier(k=3)
    clf.fit(x, y)
    print('myknn score:', clf.score(x, y))
    from sklearn.neighbors import KNeighborsClassifier
    clf_sklearn = KNeighborsClassifier(n_neighbors=3)
    clf_sklearn.fit(x, y)
    print('sklearn score:', clf_sklearn.score(x, y))
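
The introduction mentions that for regression the prediction is usually the mean of the k nearest training targets, but the class above only covers classification. The following is a minimal sketch of the regression case, not part of the original code. The function name `knn_regress` and the toy data are assumptions made for illustration, and the distances are computed with NumPy broadcasting rather than the explicit loops used above.

```python
import numpy as np


def knn_regress(X_train, y_train, X_test, k=3):
    """Predict each test sample as the mean target of its k nearest training samples."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)
    # Pairwise Euclidean distances, shape (n_test, n_train)
    diff = X_test[:, None, :] - X_train[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # Indices of the k smallest distances in each row
    nearest = np.argsort(dist, axis=1)[:, :k]
    # Average the targets of those k neighbors
    return y_train[nearest].mean(axis=1)


# Hypothetical toy data: the target equals the single feature
X_train = np.array([[0.0], [1.0], [2.0], [10.0]])
y_train = np.array([0.0, 1.0, 2.0, 10.0])
print(knn_regress(X_train, y_train, np.array([[1.1]]), k=3))
```

For the query 1.1 the three nearest training points are 1.0, 2.0 and 0.0, so the prediction is their mean. Note that the broadcasted difference array needs memory proportional to n_test × n_train × n_features, so this form suits small data sets only.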

Handwritten digit recognition

from sklearn import datasets
from KNN import KNNClassfier
import matplotlib.pyplot as plt
import numpy as np
import time

digits = datasets.load_digits()
x = digits.data
y = digits.target

myknn_start_time = time.time()
clf = KNNClassfier(k=5)
clf.fit(x,y)
print('myknn score:',clf.score(x,y))
myknn_end_time = time.time()

from sklearn.neighbors import KNeighborsClassifier
sklearnknn_start_time = time.time()
clf_sklearn = KNeighborsClassifier(n_neighbors=5)
clf_sklearn.fit(x,y)
print('sklearn score:',clf_sklearn.score(x,y))
sklearnknn_end_time = time.time()

print('myknn uses time:',myknn_end_time-myknn_start_time)
print('sklearn uses time:',sklearnknn_end_time-sklearnknn_start_time)

You can see that on a larger data set the hand-written kNN is very slow. The reason is that every query scans the entire training set to find its k nearest neighbors, which takes a large number of distance computations. A practical kNN therefore also has to consider how to find the k nearest neighbors efficiently, so that fewer distances are computed. Building a kd-tree lets the search skip most of the points and the distance computations for them. For the construction of a kd-tree, see the book 《Statistical Learning Methods》.
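
As a rough illustration of the kd-tree idea, here is a minimal pure-Python sketch. It is not the implementation from the book or from scikit-learn. The names `build_kdtree`, `knn_search` and `nearest_indices` are hypothetical, and the sketch assumes Euclidean distance, points given as tuples, and Python 3.8+ for `math.dist`. The tree splits on the median along one coordinate per level, and the search visits the far side of a split only when the splitting plane is closer than the current k-th best distance.

```python
import heapq
import math
import random


def build_kdtree(items, depth=0):
    """Build a kd-tree from (point, index) pairs by splitting on the median."""
    if not items:
        return None
    axis = depth % len(items[0][0])  # cycle through the coordinates
    items = sorted(items, key=lambda it: it[0][axis])
    mid = len(items) // 2
    return {
        'point': items[mid][0],
        'index': items[mid][1],
        'axis': axis,
        'left': build_kdtree(items[:mid], depth + 1),
        'right': build_kdtree(items[mid + 1:], depth + 1),
    }


def knn_search(node, target, k, heap=None):
    """Collect the k nearest points in a max-heap of (-distance, index)."""
    if heap is None:
        heap = []
    if node is None:
        return heap
    dist = math.dist(node['point'], target)
    if len(heap) < k:
        heapq.heappush(heap, (-dist, node['index']))
    elif dist < -heap[0][0]:
        # Closer than the current k-th best: replace the worst candidate
        heapq.heapreplace(heap, (-dist, node['index']))
    diff = target[node['axis']] - node['point'][node['axis']]
    if diff < 0:
        near, far = node['left'], node['right']
    else:
        near, far = node['right'], node['left']
    knn_search(near, target, k, heap)
    # The far subtree can only help if the splitting plane is closer than
    # the current k-th best distance, or if fewer than k points were found.
    if len(heap) < k or abs(diff) < -heap[0][0]:
        knn_search(far, target, k, heap)
    return heap


def nearest_indices(tree, target, k):
    """Return the indices of the k nearest points, closest first."""
    heap = knn_search(tree, target, k)
    return [idx for _, idx in sorted(heap, reverse=True)]


# Hypothetical usage on random 2-D points
random.seed(0)
points = [(random.random(), random.random()) for _ in range(200)]
tree = build_kdtree([(p, i) for i, p in enumerate(points)])
print(nearest_indices(tree, (0.5, 0.5), 3))
```

The returned indices can be used to look up training labels in the same way as `index` in `predict` above. The pruning step is what saves work: whole subtrees are skipped without computing any distance to the points inside them. The benefit tends to shrink as the number of features grows.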
