Author : Upinder Kaur 1
Date of Publication :15th August 2017
Abstract: The goal of Nearest Neighbour (NN) search is to find the objects in a dataset A that are closest to a query point q. Existing algorithms presume that the dataset is indexed by an R-tree and searching a query point q in a large volume of a dataset, is a tedious task that effects the quality and usefulness of the NNQ processing algorithms which determined by the time as well as space complexity. The simplest solution to the NNS problem is to compute the distance from the query point to every other point in the database. However, due to these complexities issue, the various research techniques have been proposed. It is a technique which has applications in various fields such as pattern recognition, moving object recognition etc. In this paper, a comprehensive analysis on data structures, processing techniques and variety of algorithms in this field is done along with different way to categorize the NNS techniques is presented. This different category such as weighted, additive, reductional, continuous, reverse, principal axis, which are analyzed and compared in this paper. Complexity of different data structures used in different NNS algorithms is also discussed.
Reference :
-
1. Andoni, Nearest Neighbor Search - the Old, the New, and the Impossible, Ph.D. dissertation, Electrical Engineering and Computer Science, Massachusetts Institute of Technology, 2009.
2. G. Shakhnarovich, T. Darrell, and P. Indyk, Nearest Neighbor Methods in Learning and Vision : Theory and Practice, March 2006.
3. N. Bhatia and V. Ashev, Survey of Nearest Neighbor Techniques, International Journal of Computer Science and Information Security, 8(2), 2010, pp. 1-4.
4. S. Dhanabal and S. Chandramathi, A Review of various k-Nearest Neighbor Query Processing Techniques, Computer Applications. 31(7), 2011, pp. 14- 22.
5. A. Rajaraman and J. D. Ullman. Mining of Massive Datasets, December 2011.
6. R. F. Sproull, Refinements to Nearest-Neighbor Searching in k Dimensional Trees, Algorithmica. 6(4), 1991, pp. 579-589.
7. J. L. Bentley, Multidimensional binary search trees Used for Associative Searching, Communications of the ACM. 18(9), 1975, pp. 509-517.
8. R. Panigrahy, An Improved Algorithm Finding Nearest Neighbor Using Kd-Trees, in 8th Latin American conference on Theoretical informatics, Bzios, Brazil, 2008, pp. 387-398.
9. A. Nuchter, K. Lingemann, and J. Hertzberg, Cached KD Tree Search for ICP Algorithms, in 6th International Conference on 3 -D Digital Imaging and Modeling, Montreal, Quebec, Canada, 2007, pp. 419-426.
10. V. Gaede and O. Gunther, Multidimensional access methods, ACM Computing Surveys. 30(2), 1998, pp. 170-231.
11. H. Samet, The Quadtree and Related Hierarchical Data Structures, ACM Computing Surveys. 16(2), 1984, pp. 187-260.
12. J. Tayeb, O. Ulusoy, and O. Wolfson, A Quadtree Based Dynamic Attribute Indexing Method, The Computer Journal. 41(3), 1998,pp. 185-200.
13. C. A. Shaffer and H. Samet, Optimal Quadtree Construction Algorithms, Computer Vision, Graphics, and Image Processing. 37(3), 1987, pp. 402-419.
14. D. Meagher, Geometric Modeling Using Octree Encoding, Computer Graphics and Image Processing. 19(2), 1982, pp. 129-147.
15. T. Liu, A. W. Moore, and A. Gray, New Algorithms for Efficient High Dimensional Nonparametric Classification, The Journal of Machine Learning Research. 7(1), 2006, pp. 1135–1158.
16. S. M. Omohundro, Five Balltree Construction Algorithms, International Computer Science Institute, Berkeley, California, USA, Tech, 1989.
17. Y. Manolopoulos, A. Nanopoulos, A. N. Papadopoulos, and Y. Theodoridis, R-Trees: Theory and Applications, November 2005.
18. A. Guttman, R-Trees: A dynamic index structure for spatial searching, ACM SIGMOD Record. 14(2), 1984, pp. 47-57.
19. M. K. Wu, Evaluation of R-trees for Nearest Neighbor Search, M.Sc. thesis, Computer Science, University of Houston, 2006.
20. N. Roussopoulos, S. Kelley, and F. Vincent, Nearest Neighbor Queries, ACM SIGMOD Record. 24(2), 1995, pp.71-79.
21. M. Adler and B. Heeringa, Search Space Reductions for Nearest Neighbor Queries, in 5th international conference on Theory and applications of models of computation, Xian, China, 2008, pp. 554-568.
22. P. Ciaccia, M. Patella, and P. Zezula, M-tree: An Efficient Access Method for Similarity Search in Metric Spaces, in 23rd Very Large Data Bases Conference, Athens, Greece, 1997.
23. T. Skopal, Pivoting M-tree: A Metric Access Method for Efficient Similarity Search, in Dateso 2004 Annual International Workshop on Databases, Texts, Specifications and Objects, Desna, Czech Republic, 2004, pp. 27-37
24. T. M. Cover and P. E. Hart, Nearest Neighbor Pattern Classification, IEEE Transactions on Information Theory. 13(1), 1967, pp. 21-27.
25. S. A. Dudani, The Distance-Weighted k-NearestNeighbor Rule, IEEE Transactions on Systems, Man and Cybernetics. 6(4), 1976, pp. 325-327.
26. T. Bailey and A. K. Jain, A Note on DistanceWeighted k-Nearest Neighbor Rules, IEEE Transactions on Systems, Man and Cybernetics. 8(4), 1978, pp. 311- 313.
27. S. Tan, Neighbor-weighted K-nearest neighbor for unbalanced text corpus, Expert Systems with Applications. 28(4), 2005, pp. 667-671.
28. K. Hechenbichler and K. Schliep, Weighted kNearest-Neighbor Techniques and Ordinal Classification, Collaborative Research Center, LMU University, Munich, Germany, Tech, 2006.
29. V. Lobo, Ship Noise Classification: A Contribution to Prototype Based Classifier Design, Ph.D. dissertation, College of Science and Technology, New University of Lisbon, 2002.
30. P. E. Hart, The Condensed Nearest Neighbor Rule, IEEE Transactions on Information Theory. 14(3), 1968, pp. 515-516.
31. F. Angiulli, Fast Condensed Nearest Neighbor Rule, in 22nd International Conference on Machine Learning, Bonn, Germany, 2005, pp. 25-32.
32. G. W. Gates, The Reduced Nearest Neighbor Rule, IEEE Transactions on Information Theory. 18(3), 1972, pp. 431-433.
33. G. Guo, H. Wang, D. Bell, Y. Bi, and K. Greer, KNN Model-Based Approach in Classification, in OTM Confederated International Conferences on the Move to Meaningful Internet Systems, Catania, Sicily, Italy, 2003, pp. 986-996
34. G. Guo, H. Wang, D. Bell, Y. Bi, and K. Greer, An KNN Model-Based Approach and its Application in Classification in Text Categorization, in 5th International Conference on Computational Linguistics and Intelligent Text Processing, Seoul, Korea, 2004, pp. 559-570.
35. Z. Yong, L. Youwen, and X. Shixiong, An Improved KNN Text Classification Algorithm Based on Clustering, Journal of Computers. 4(3), 2009, pp. 230- 237
36. S. Z. Li and J. Lu, Face Recognition Using the Nearest Feature Line Method, IEEE Transactions on Neural Networks. 10(2), 1999, pp. 439-443.
37. S. Z. Li, K. L. Chan, and C. Wang, Performance Evaluation of the Nearest Feature Line Method in Image Classification and Retrieval, IEEE Transactions on Pattern Analysis and Machine Intelligence. 22(11), 2000, pp. 1335-1339.
38. W. Zheng, L. Zhao, and C. Zou, Locally nearest neighbor classifiers for pattern classification, Pattern Recognition. 37(6), 2004, pp. 1307-1309.
40. Y. Zhou, C. Zhang, and J. Wang, Tunable Nearest Neighbor Classifier, in 26th DAGM Symposium on Artificial Intelligence, Tubingen, Germany, 2004, pp. 204-211.
41. Y. Zhou, C. Zhang, and J. Wang, Extended Nearest Feature Line Classifier, in 8th Pacific Rim International Conference on Artificial Intelligence, Auckland, New Zealand, 2004, pp. 183-190.
42. F. Korn and S. M. ukrishnan, Influence Sets Based on Reverse Nearest Neighbor Queries, ACM SIGMOD Record. 29(2), 2000, pp. 201-212.
43. C. Yang and K. I. Lin, An Index Structure for Efficient Reverse Nearest Neighbor Queries, in 17th International Conference on Data Engineering, Heidelberg, Germany, 2001, pp. 485-492.
44. R. Benetis, C. S. Jensen, G. Karciauskas, and S. Saltenis, Nearest Neighbor and Reverse Nearest Neighbor Queries for Moving Objects, The VLDB Journal. 15(3), 2006, pp. 229-249
45. Y. Tao, M. L. Yiu, and N. Mamoulis, Reverse Nearest Neighbor Search in Metric Spaces, IEEE Transactions on Knowledge and Data Engineering. 18(9), 2006, pp. 1239-1252
-
- 1. Andoni, Nearest Neighbor Search - the Old, the New, and the Impossible, Ph.D. dissertation, Electrical Engineering and Computer Science, Massachusetts Institute of Technology, 2009.
- 2. G. Shakhnarovich, T. Darrell, and P. Indyk, Nearest Neighbor Methods in Learning and Vision : Theory and Practice, March 2006.
- 3. N. Bhatia and V. Ashev, Survey of Nearest Neighbor Techniques, International Journal of Computer Science and Information Security, 8(2), 2010, pp. 1-4.
- 4. S. Dhanabal and S. Chandramathi, A Review of various k-Nearest Neighbor Query Processing Techniques, Computer Applications. 31(7), 2011, pp. 14- 22.
- 5. A. Rajaraman and J. D. Ullman. Mining of Massive Datasets, December 2011.
- 6. R. F. Sproull, Refinements to Nearest-Neighbor Searching in k Dimensional Trees, Algorithmica. 6(4), 1991, pp. 579-589.
- 7. J. L. Bentley, Multidimensional binary search trees Used for Associative Searching, Communications of the ACM. 18(9), 1975, pp. 509-517.
- 8. R. Panigrahy, An Improved Algorithm Finding Nearest Neighbor Using Kd-Trees, in 8th Latin American conference on Theoretical informatics, Bzios, Brazil, 2008, pp. 387-398.
- 9. A. Nuchter, K. Lingemann, and J. Hertzberg, Cached KD Tree Search for ICP Algorithms, in 6th International Conference on 3 -D Digital Imaging and Modeling, Montreal, Quebec, Canada, 2007, pp. 419-426.
- 10. V. Gaede and O. Gunther, Multidimensional access methods, ACM Computing Surveys. 30(2), 1998, pp. 170-231.
- 11. H. Samet, The Quadtree and Related Hierarchical Data Structures, ACM Computing Surveys. 16(2), 1984, pp. 187-260.
- 12. J. Tayeb, O. Ulusoy, and O. Wolfson, A Quadtree Based Dynamic Attribute Indexing Method, The Computer Journal. 41(3), 1998,pp. 185-200.
- 13. C. A. Shaffer and H. Samet, Optimal Quadtree Construction Algorithms, Computer Vision, Graphics, and Image Processing. 37(3), 1987, pp. 402-419.
- 14. D. Meagher, Geometric Modeling Using Octree Encoding, Computer Graphics and Image Processing. 19(2), 1982, pp. 129-147.
- 15. T. Liu, A. W. Moore, and A. Gray, New Algorithms for Efficient High Dimensional Nonparametric Classification, The Journal of Machine Learning Research. 7(1), 2006, pp. 1135–1158.
- 16. S. M. Omohundro, Five Balltree Construction Algorithms, International Computer Science Institute, Berkeley, California, USA, Tech, 1989.
- 17. Y. Manolopoulos, A. Nanopoulos, A. N. Papadopoulos, and Y. Theodoridis, R-Trees: Theory and Applications, November 2005.
- 18. A. Guttman, R-Trees: A dynamic index structure for spatial searching, ACM SIGMOD Record. 14(2), 1984, pp. 47-57.
- 19. M. K. Wu, Evaluation of R-trees for Nearest Neighbor Search, M.Sc. thesis, Computer Science, University of Houston, 2006.
- 20. N. Roussopoulos, S. Kelley, and F. Vincent, Nearest Neighbor Queries, ACM SIGMOD Record. 24(2), 1995, pp.71-79.
- 21. M. Adler and B. Heeringa, Search Space Reductions for Nearest Neighbor Queries, in 5th international conference on Theory and applications of models of computation, Xian, China, 2008, pp. 554-568.
- 22. P. Ciaccia, M. Patella, and P. Zezula, M-tree: An Efficient Access Method for Similarity Search in Metric Spaces, in 23rd Very Large Data Bases Conference, Athens, Greece, 1997.
- 23. T. Skopal, Pivoting M-tree: A Metric Access Method for Efficient Similarity Search, in Dateso 2004 Annual International Workshop on Databases, Texts, Specifications and Objects, Desna, Czech Republic, 2004, pp. 27-37
- 24. T. M. Cover and P. E. Hart, Nearest Neighbor Pattern Classification, IEEE Transactions on Information Theory. 13(1), 1967, pp. 21-27.
- 25. S. A. Dudani, The Distance-Weighted k-NearestNeighbor Rule, IEEE Transactions on Systems, Man and Cybernetics. 6(4), 1976, pp. 325-327.
- 26. T. Bailey and A. K. Jain, A Note on DistanceWeighted k-Nearest Neighbor Rules, IEEE Transactions on Systems, Man and Cybernetics. 8(4), 1978, pp. 311- 313.
- 27. S. Tan, Neighbor-weighted K-nearest neighbor for unbalanced text corpus, Expert Systems with Applications. 28(4), 2005, pp. 667-671.
- 28. K. Hechenbichler and K. Schliep, Weighted kNearest-Neighbor Techniques and Ordinal Classification, Collaborative Research Center, LMU University, Munich, Germany, Tech, 2006.
- 29. V. Lobo, Ship Noise Classification: A Contribution to Prototype Based Classifier Design, Ph.D. dissertation, College of Science and Technology, New University of Lisbon, 2002.
- 30. P. E. Hart, The Condensed Nearest Neighbor Rule, IEEE Transactions on Information Theory. 14(3), 1968, pp. 515-516.
- 31. F. Angiulli, Fast Condensed Nearest Neighbor Rule, in 22nd International Conference on Machine Learning, Bonn, Germany, 2005, pp. 25-32.
- 32. G. W. Gates, The Reduced Nearest Neighbor Rule, IEEE Transactions on Information Theory. 18(3), 1972, pp. 431-433.
- 33. G. Guo, H. Wang, D. Bell, Y. Bi, and K. Greer, KNN Model-Based Approach in Classification, in OTM Confederated International Conferences on the Move to Meaningful Internet Systems, Catania, Sicily, Italy, 2003, pp. 986-996
- 34. G. Guo, H. Wang, D. Bell, Y. Bi, and K. Greer, An KNN Model-Based Approach and its Application in Classification in Text Categorization, in 5th International Conference on Computational Linguistics and Intelligent Text Processing, Seoul, Korea, 2004, pp. 559-570.
- 35. Z. Yong, L. Youwen, and X. Shixiong, An Improved KNN Text Classification Algorithm Based on Clustering, Journal of Computers. 4(3), 2009, pp. 230- 237
- 36. S. Z. Li and J. Lu, Face Recognition Using the Nearest Feature Line Method, IEEE Transactions on Neural Networks. 10(2), 1999, pp. 439-443.
- 37. S. Z. Li, K. L. Chan, and C. Wang, Performance Evaluation of the Nearest Feature Line Method in Image Classification and Retrieval, IEEE Transactions on Pattern Analysis and Machine Intelligence. 22(11), 2000, pp. 1335-1339.
- 38. W. Zheng, L. Zhao, and C. Zou, Locally nearest neighbor classifiers for pattern classification, Pattern Recognition. 37(6), 2004, pp. 1307-1309.
- 40. Y. Zhou, C. Zhang, and J. Wang, Tunable Nearest Neighbor Classifier, in 26th DAGM Symposium on Artificial Intelligence, Tubingen, Germany, 2004, pp. 204-211.
- 41. Y. Zhou, C. Zhang, and J. Wang, Extended Nearest Feature Line Classifier, in 8th Pacific Rim International Conference on Artificial Intelligence, Auckland, New Zealand, 2004, pp. 183-190.
- 42. F. Korn and S. M. ukrishnan, Influence Sets Based on Reverse Nearest Neighbor Queries, ACM SIGMOD Record. 29(2), 2000, pp. 201-212.
- 43. C. Yang and K. I. Lin, An Index Structure for Efficient Reverse Nearest Neighbor Queries, in 17th International Conference on Data Engineering, Heidelberg, Germany, 2001, pp. 485-492.
- 44. R. Benetis, C. S. Jensen, G. Karciauskas, and S. Saltenis, Nearest Neighbor and Reverse Nearest Neighbor Queries for Moving Objects, The VLDB Journal. 15(3), 2006, pp. 229-249
- 45. Y. Tao, M. L. Yiu, and N. Mamoulis, Reverse Nearest Neighbor Search in Metric Spaces, IEEE Transactions on Knowledge and Data Engineering. 18(9), 2006, pp. 1239-1252
- 46. I. Stanoi, D. Agrawal, and A. E. Abbadi, Reverse Nearest Neighbor Queries for Dynamic Databases, in ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, Dallas, Texas, USA, 2000, pp.44-53.
- 47. E. Achtert, C. Bohm, P. Kroger, P. Kunath, M. Renz, and A. Pryakhin, Efficient Reverse k-Nearest Neighbor Search in Arbitrary Metric Spaces, in ACM SIGMOD International Conference on Management of Data, Chicago, Illinois, USA, 2006, pp. 515-526.
- 48. Y. Gao, B. Zheng, G. Chen, and Q. Li, On Efficient Mutual Nearest Neighbor Query Processing in Spatial Databases, Data & Knowledge Engineering. 68(8), 2009, pp. 705-727.
- 49. Y. Tao, D. Papadias, and Q. Shen, Continuous Nearest Neighbor Search, in 28th international conference on Very Large Data Bases, Hong Kong, China, 2002, pp. 287-298.
- 50. H. Hu and D. L. Lee, Range Nearest-Neighbor Query, IEEE Transactions on Knowledge and Data Engineering. 18(1), 2006, pp. 78 -91.