Likelihood inference in nearest-neighbour classification models
Traditionally, the neighbourhood size k in the k-nearest-neighbour algorithm is either fixed at the first nearest neighbour or selected on the basis of a cross-validation study. In this paper we present an alternative approach that develops the k-nearest-neighbour algorithm using likelihood-based inference. Our method takes the form of a generalised linear regression on a set of k-nearest-neighbour autocovariates. Defining the k-nearest-neighbour algorithm in this way allows us to extend the method to accommodate the original predictor variables as possible linear effects, as well as to include multiple nearest-neighbour terms. The final model is chosen via a stepwise regression procedure. It is shown that our method incorporates a conventional generalised linear model and a conventional k-nearest-neighbour algorithm as special cases. Empirical results suggest that the method outperforms the standard k-nearest-neighbour method in terms of misclassification rate on a wide variety of datasets. Copyright Biometrika Trust 2003, Oxford University Press.
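The core idea in the abstract can be illustrated with a minimal sketch: build a k-nearest-neighbour autocovariate (here, simply the fraction of a point's k nearest training neighbours with label 1) and fit a logistic regression on it by Newton-Raphson, so that the classifier is estimated by likelihood rather than by majority vote. This is an assumed simplification for illustration, not the paper's exact construction (the ridge term, the toy data, and the function names are all hypothetical):

```python
import numpy as np

def knn_autocovariate(X_train, y_train, X, k):
    # Fraction of each point's k nearest training neighbours (Euclidean
    # distance) that carry label 1: a simple stand-in for a
    # k-nearest-neighbour autocovariate. Note that training points count
    # themselves among their own neighbours in this naive version.
    d = np.linalg.norm(X[:, None, :] - X_train[None, :, :], axis=2)
    nn = np.argsort(d, axis=1)[:, :k]
    return y_train[nn].mean(axis=1)

# Toy data: two overlapping Gaussian clusters (assumed for illustration).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(50, 2)),
               rng.normal(1.5, 1.0, size=(50, 2))])
y = np.repeat([0, 1], 50)

# Logistic regression on [intercept, autocovariate at k = 5], fitted by
# Newton-Raphson with a small ridge term for numerical stability.
a = knn_autocovariate(X, y, X, 5)
Z = np.column_stack([np.ones(len(a)), a])
beta, lam = np.zeros(2), 0.05
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-Z @ beta))
    grad = Z.T @ (y - p) - lam * beta
    H = Z.T @ (Z * (p * (1 - p))[:, None]) + lam * np.eye(2)
    beta += np.linalg.solve(H, grad)

pred = (1.0 / (1.0 + np.exp(-Z @ beta)) > 0.5).astype(int)
```

In this formulation the fitted coefficient on the autocovariate plays the role that the vote threshold plays in ordinary k-NN, and further columns (for other values of k, or the raw predictors themselves) could be appended to `Z` and selected stepwise, which is the extension the abstract describes.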
Year of publication: 2003
Authors: Holmes, Christopher C.
Published in: Biometrika. - Biometrika Trust, ISSN 0006-3444. - Vol. 90.2003, 1, p. 99-112
Publisher: Biometrika Trust
Similar items by person
- Interacting sequential Monte Carlo samplers for trans-dimensional simulation
  Jasra, Ajay, (2008)
- Heard, Nicholas A., (2006)
- Population-Based Reversible Jump Markov Chain Monte Carlo
  Jasra, Ajay, (2007)
- More ...