simple kNN implementation

There are already a number of kNN implementations in R, and GBeliakov and I have already used his algorithms with a Choquet integral aggregation of the k nearest neighbours. However, since any kNN I use in future will need to be flexible, starting from basic code that I can then manipulate seemed like a reasonable idea.

It needs a function that returns the statistical mode, which strangely isn’t in R already (the built-in mode() reports an object's storage type instead). With kNN you usually need a random tie-break when there are multiple modes, so it’s best for the function to return all of them. The distances also need to be sorted once they have been found.
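As a quick illustration of the tie-break point, here is a minimal sketch. The helper name `all_modes_demo` and the label vector are made up for this example; the idea is simply to return every value that reaches the highest count and then pick one of them at random.

```r
# Hypothetical helper (name chosen for this example): return every value
# that attains the highest count, as character strings.
all_modes_demo <- function(x) {
  counts <- table(as.vector(x))
  names(counts)[counts == max(counts)]
}

labels <- c(1, 2, 2, 3, 3)        # made-up neighbour labels: 2 and 3 tie
tied <- all_modes_demo(labels)    # both tied values are returned

# Random tie-break: pick one of the tied modes by position.
set.seed(1)                       # only to make the example repeatable
winner <- as.numeric(tied[sample(length(tied), 1)])
```

Because `table()` stores the values as names, the modes come back as character strings, hence the `as.numeric()` conversion at the end.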


Required input: training data (train1), test data (test1) and a value of k. The code treats the last column of each data set as the class label.

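# write the mode function: return every most-frequent value so ties stay visible
all.modes <- function(x) {
  z <- table(as.vector(x))
  names(z)[z == max(z)]
}
# read the data (train1 and test1 must already be loaded as numeric data,
# with the class label in the last column)
k <- 5
output1 <- cbind(test1, NA)
for (h in 1:nrow(test1)) {
  # create working train file to calculate distances
  # column 1 = class label, column 2 = distance to test point h
  trainwork <- matrix(NA, nrow = nrow(train1), ncol = 2)
  trainwork[, 1] <- train1[, ncol(train1)]
  # find squared distances from test point h to every training point,
  # subtracting the contribution of the label column
  for (i in 1:nrow(trainwork)) {
    trainwork[i, 2] <- sum((test1[h, ] - train1[i, ])^2) - (test1[h, ncol(test1)] - train1[i, ncol(train1)])^2
  }
  # reorder by distances
  ordtrain <- trainwork[order(trainwork[, 2]), ]
  # create short matrix of nrow=k and copy in the k nearest rows
  ktrainwork <- matrix(NA, nrow = k, ncol = 2)
  for (i in 1:k) {
    ktrainwork[i, ] <- ordtrain[i, ]
  }
  # take the modes with random selection and add to output file
  output1[h, ncol(output1)] <- as.numeric(sample(all.modes(ktrainwork[, 1]), 1))
}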
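Since the point of writing this from scratch is flexibility, the same steps can also be wrapped in a function. The following is only a sketch under stated assumptions: the name `knn_sketch`, its arguments and the toy data are all made up for illustration, the data are assumed to be numeric matrices, and the label is assumed to sit in the last column of the training data. Only the first p columns of the test data are used, so a trailing label column there is ignored.

```r
# Hypothetical wrapper (not from the post): classify each row of `test`
# from the k nearest rows of `train`. The last column of `train` is the label.
knn_sketch <- function(train, test, k) {
  p <- ncol(train) - 1                      # number of feature columns
  features <- train[, 1:p, drop = FALSE]
  labels <- train[, p + 1]
  pred <- numeric(nrow(test))
  for (h in seq_len(nrow(test))) {
    # squared Euclidean distance from test row h to every training row
    diffs <- sweep(features, 2, test[h, 1:p], "-")
    d <- rowSums(diffs^2)
    # labels of the k closest training rows
    nearest <- labels[order(d)[1:k]]
    # all modes among those labels, then a random tie-break by position
    counts <- table(nearest)
    modes <- names(counts)[counts == max(counts)]
    pred[h] <- as.numeric(modes[sample(length(modes), 1)])
  }
  pred
}

# Made-up toy data: two well-separated clusters labelled 1 and 2
train_toy <- rbind(c(0, 0, 1), c(0, 1, 1), c(1, 0, 1),
                   c(5, 5, 2), c(5, 6, 2), c(6, 5, 2))
test_toy <- rbind(c(0.2, 0.3), c(5.4, 5.1))
pred_toy <- knn_sketch(train_toy, test_toy, k = 3)
```

Keeping the distance, ordering and voting steps as separate lines inside the loop makes it easy to swap any one of them out later, for example to replace the majority vote with a different aggregation of the k neighbours.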
