KNN Python Implementation
The k-nearest neighbor (kNN) algorithm works in a simple way: under a chosen distance measure, find the k training samples closest to a given sample, then predict from those k training samples.
Classification: the class that occurs most often among the k neighbors is assigned to the test sample.
Regression: the average of the k neighbors' target values is usually used as the prediction for the test sample.
The three elements of a kNN model: the distance measure, the choice of k, and the decision rule for classification or regression.
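The two decision rules above can be sketched in a few lines of NumPy. This is a minimal illustration, not the listing from this post: the helper names (k_nearest_indices, knn_classify, knn_regress), the Minkowski exponent p, and the toy data are all assumptions made for the example.

```python
import numpy as np

def k_nearest_indices(X_train, x_query, k, p=2):
    """Indices of the k training rows closest to x_query.

    Uses the Minkowski distance: p=2 is Euclidean, p=1 is Manhattan.
    """
    diffs = np.abs(np.asarray(X_train, dtype=float) - np.asarray(x_query, dtype=float))
    dists = (diffs ** p).sum(axis=1) ** (1.0 / p)
    return np.argsort(dists, kind="stable")[:k]

def knn_classify(X_train, y_train, x_query, k=3):
    """Classification rule: most frequent label among the k neighbors."""
    idx = k_nearest_indices(X_train, x_query, k)
    labels, counts = np.unique(np.asarray(y_train)[idx], return_counts=True)
    return labels[np.argmax(counts)]

def knn_regress(X_train, y_train, x_query, k=3):
    """Regression rule: mean target value of the k neighbors."""
    idx = k_nearest_indices(X_train, x_query, k)
    return float(np.mean(np.asarray(y_train, dtype=float)[idx]))

# Hypothetical one-dimensional toy data, chosen only for illustration.
X_train = np.array([[0.0], [1.0], [2.0], [10.0], [11.0]])
y_class = np.array([0, 0, 0, 1, 1])
y_value = np.array([1.0, 2.0, 3.0, 20.0, 22.0])

print(knn_classify(X_train, y_class, [1.2], k=3))
print(knn_regress(X_train, y_value, [9.0], k=3))
```

In this sketch a tie in the vote goes to the smallest label, because np.unique returns labels in sorted order. Other tie-breaking rules are equally valid.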
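import numpy as np


class KNNClassfier(object):

    def __init__(self, k=5, distance='euc'):
        self.k = k
        self.distance = distance
        self.x = None
        self.y = None

    def fit(self, X, Y):
        '''
        X : array-like [n_samples, n_features]
        Y : array-like [n_samples]
        '''
        # kNN is a lazy learner: fitting only stores the training data.
        self.x = np.asarray(X)
        self.y = np.asarray(Y).ravel()

    def predict(self, X_test):
        '''
        X_test : array-like [n_samples, n_features]
        output : array-like [n_samples, 1]
        '''
        X_test = np.asarray(X_test)
        output = np.zeros((X_test.shape[0], 1))
        for i in range(X_test.shape[0]):
            # Distance from test sample i to every training sample.
            dis = []
            for j in range(self.x.shape[0]):
                if self.distance == 'euc':  # Euclidean distance
                    dis.append(np.linalg.norm(X_test[i] - self.x[j]))
                else:
                    raise ValueError('unsupported distance: %r' % self.distance)
            # Labels of the k nearest training samples.
            labels = []
            index = np.argsort(dis)
            for j in range(self.k):
                labels.append(self.y[index[j]])
            # Majority vote: count how often each neighbor's label occurs.
            counts = []
            for label in labels:
                counts.append(labels.count(label))
            output[i] = labels[np.argmax(counts)]
        return output

    def score(self, x, y):
        # Accuracy = 1 - error rate on (x, y).
        pred = self.predict(x)
        y = np.asarray(y).ravel()
        err = 0.0
        for i in range(x.shape[0]):
            if pred[i, 0] != y[i]:
                err = err + 1
        return 1 - float(err / x.shape[0])


if __name__ == '__main__':
    from sklearn import datasets
    iris = datasets.load_iris()
    x = iris.data
    y = iris.target
    # x = np.array([[0.5,0.4],[0.1,0.2],[0.7,0.8],[0.2,0.1],[0.4,0.6],[0.9,0.9],[1,1]]).reshape(-1,2)
    # y = np.array([0,1,0,1,0,1,1]).reshape(-1,1)
    # The fit/score calls below are assumed; they are missing from the source listing.
    clf = KNNClassfier(k=3)
    clf.fit(x, y)
    print('myknn score:', clf.score(x, y))
    from sklearn.neighbors import KNeighborsClassifier
    clf_sklearn = KNeighborsClassifier(n_neighbors=3)
    clf_sklearn.fit(x, y)
    print('sklearn score:', clf_sklearn.score(x, y))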
Handwritten digit recognition
from sklearn import datasets
from KNN import KNNClassfier
import matplotlib.pyplot as plt
import numpy as np
import time

digits = datasets.load_digits()
x = digits.data
y = digits.target

# The fit/score calls inside the timed regions are assumed; the source listing omits them.
myknn_start_time = time.time()
clf = KNNClassfier(k=5)
clf.fit(x, y)
print('myknn score:', clf.score(x, y))
myknn_end_time = time.time()

from sklearn.neighbors import KNeighborsClassifier
sklearnknn_start_time = time.time()
clf_sklearn = KNeighborsClassifier(n_neighbors=5)
clf_sklearn.fit(x, y)
print('sklearn score:', clf_sklearn.score(x, y))
sklearnknn_end_time = time.time()

print('myknn uses time:', myknn_end_time - myknn_start_time)
print('sklearn uses time:', sklearnknn_end_time - sklearnknn_start_time)
As you can see, the hand-written kNN is very slow on a larger data set. The reason is that every search for the k nearest neighbors scans the entire training set, which requires a large number of distance computations.
A kNN implementation therefore also needs to consider how to find the k nearest neighbors efficiently. Building a kd-tree reduces the number of distance computations, because most points never have to be searched or compared. For how a kd-tree is constructed, see 《Statistical Learning Methods》.
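The kd-tree idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the construction from the book and not scikit-learn's implementation: the names KDNode, build_kdtree and query_kdtree are hypothetical, the tree splits on the median while cycling through the axes, and the distance is Euclidean.

```python
import heapq
import numpy as np

class KDNode:
    def __init__(self, index, axis, left, right):
        self.index = index  # row of the training point stored at this node
        self.axis = axis    # coordinate used to split at this node
        self.left = left    # subtree with smaller-or-equal coordinates
        self.right = right  # subtree with larger-or-equal coordinates

def build_kdtree(X, indices=None, depth=0):
    """Recursively split on the median of one coordinate, cycling through axes."""
    if indices is None:
        indices = np.arange(X.shape[0])
    if len(indices) == 0:
        return None
    axis = depth % X.shape[1]
    order = indices[np.argsort(X[indices, axis], kind="stable")]
    mid = len(order) // 2
    return KDNode(order[mid], axis,
                  build_kdtree(X, order[:mid], depth + 1),
                  build_kdtree(X, order[mid + 1:], depth + 1))

def query_kdtree(root, X, q, k):
    """Return (distances, indices) of the k rows of X nearest to q, nearest first."""
    heap = []  # max-heap via negated distances: entries are (-distance, index)

    def visit(node):
        if node is None:
            return
        d = float(np.linalg.norm(X[node.index] - q))
        if len(heap) < k:
            heapq.heappush(heap, (-d, int(node.index)))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, int(node.index)))
        gap = q[node.axis] - X[node.index, node.axis]
        near, far = (node.left, node.right) if gap < 0 else (node.right, node.left)
        visit(near)
        # Cross the splitting plane only if a closer point could lie beyond it.
        if len(heap) < k or abs(gap) < -heap[0][0]:
            visit(far)

    visit(root)
    best = sorted((-neg_d, i) for neg_d, i in heap)
    return [d for d, _ in best], [i for _, i in best]

# Hypothetical random data, used only to compare against a brute-force scan.
rng = np.random.default_rng(0)
X_demo = rng.random((500, 3))
q_demo = rng.random(3)
tree = build_kdtree(X_demo)
kd_dists, kd_idx = query_kdtree(tree, X_demo, q_demo, k=5)
brute_dists = np.sort(np.linalg.norm(X_demo - q_demo, axis=1))[:5]
print(np.allclose(kd_dists, brute_dists))
```

The pruning test in visit is what saves work: a whole subtree is skipped when the splitting plane is farther away than the current k-th best distance. In practice, scikit-learn's KNeighborsClassifier exposes an algorithm parameter ('auto', 'ball_tree', 'kd_tree' or 'brute') that selects such a structure.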