Abstract
This article provides an overview of methods used to cluster data, that is, to discover unknown subgroups and allocate objects to them. We review cluster analysis techniques for hierarchical, optimization, and model-based clustering. To arrive at such techniques, we first introduce the concept of proximity and then describe commonly used techniques for creating dendrograms, such as linkage methods and Ward's method, and for searching for globally optimal partitions, such as the popular k-means algorithm. Special attention is given to the issues of determining the number of clusters and checking cluster validity.
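As a rough illustration of the optimization clustering mentioned in the abstract, the following is a minimal sketch of the standard k-means iteration (alternating assignment and centroid update). It is not taken from the article: the function names `squared_distance` and `k_means`, the use of squared Euclidean distance as the proximity measure, the toy data, and the hand-picked starting centroids are all assumptions made for the example. The iteration only guarantees a locally optimal partition, so the result can depend on the starting centroids.

```python
def squared_distance(a, b):
    """Squared Euclidean distance, assumed here as the proximity measure."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, initial_centroids, max_iter=100):
    """Alternate assignment and update steps until the labels stop changing."""
    centroids = [tuple(c) for c in initial_centroids]
    k = len(centroids)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # no point changed cluster, so the partition is stable
        labels = new_labels
        # Update step: move each centroid to the mean of its current members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Hypothetical toy data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.1, 4.9), (4.8, 5.2)]
# Starting centroids chosen by hand (one point from each group) for determinism.
labels, centroids = k_means(points, initial_centroids=[points[0], points[3]])
print(labels)
```

In practice the algorithm is usually run from several different starting centroids and the partition with the smallest within-cluster sum of squares is kept.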
Original language | English |
---|---|
Title of host publication | International Encyclopedia of Education, Third Edition |
Publisher | Elsevier |
Pages | 72-83 |
Number of pages | 12 |
ISBN (Electronic) | 9780080448947 |
Publication status | Published - 1 Jan 2009 |
Keywords
- Agglomerative hierarchical clustering
- Cluster number
- Dendrogram
- k-Means algorithm
- Model-based clustering
- Optimization clustering
- Proximities
- Validity check