Preface.
Acknowledgement.
1 An introduction to classification and clustering.
1.1 Introduction.
1.2 Reasons for classifying.
1.3 Numerical methods of classification: cluster analysis.
1.4 What is a cluster?
1.5 Examples of the use of clustering.
1.5.1 Market research.
1.5.2 Astronomy.
1.5.3 Psychiatry.
1.5.4 Weather classification.
1.5.5 Archaeology.
1.5.6 Bioinformatics and genetics.
1.6 Summary.
2 Detecting clusters graphically.
2.1 Introduction.
2.2 Detecting clusters with univariate and bivariate plots of data.
2.2.1 Histograms.
2.2.2 Scatterplots.
2.2.3 Density estimation.
2.2.4 Scatterplot matrices.
2.3 Using lower-dimensional projections of multivariate data for graphical representations.
2.3.1 Principal components analysis of multivariate data.
2.3.2 Exploratory projection pursuit.
2.3.3 Multidimensional scaling.
2.4 Three-dimensional plots and trellis graphics.
2.5 Summary.
3 Measurement of proximity.
3.1 Introduction.
3.2 Similarity measures for categorical data.
3.2.1 Similarity measures for binary data.
3.2.2 Similarity measures for categorical data with more than two levels.
3.3 Dissimilarity and distance measures for continuous data.
3.4 Similarity measures for data containing both continuous and categorical variables.
3.5 Proximity measures for structured data.
3.6 Inter-group proximity measures.
3.6.1 Inter-group proximity derived from the proximity matrix.
3.6.2 Inter-group proximity based on group summaries for continuous data.
3.6.3 Inter-group proximity based on group summaries for categorical data.
3.7 Weighting variables.
3.8 Standardization.
3.9 Choice of proximity measure.
3.10 Summary.
4 Hierarchical clustering.
4.1 Introduction.
4.2 Agglomerative methods.
4.2.1 Illustrative examples of agglomerative methods.
4.2.2 The standard agglomerative methods.
4.2.3 Recurrence formula for agglomerative methods.
4.2.4 Problems of agglomerative hierarchical methods.
4.2.5 Empirical studies of hierarchical agglomerative methods.
4.3 Divisive methods.
4.3.1 Monothetic divisive methods.
4.3.2 Polythetic divisive methods.
4.4 Applying the hierarchical clustering process.
4.4.1 Dendrograms and other tree representations.
4.4.2 Comparing dendrograms and measuring their distortion.
4.4.3 Mathematical properties of hierarchical methods.
4.4.4 Choice of partition: the problem of the number of groups.
4.4.5 Hierarchical algorithms.
4.4.6 Methods for large data sets.
4.5 Applications of hierarchical methods.
4.5.1 Dolphin whistles: agglomerative clustering.
4.5.2 Needs of psychiatric patients: monothetic divisive clustering.
4.5.3 Globalization of cities: polythetic divisive method.
4.5.4 Women's life histories: divisive clustering of sequence data.
4.5.5 Composition of mammals' milk: exemplars, dendrogram seriation and choice of partition.
4.6 Summary.
5 Optimization clustering techniques.
5.1 Introduction.
5.2 Clustering criteria derived from the dissimilarity matrix.
5.3 Clustering criteria derived from continuous data.
5.3.1 Minimization of trace(W).
5.3.2 Minimization of det(W).
5.3.3 Maximization of trace(BW^{-1}).
5.3.4 Properties of the clustering criteria.
5.3.5 Alternative criteria for clusters of different shapes and sizes.
5.4 Optimization algorithms.
5.4.1 Numerical example.
5.4.2 More on k-means.
5.4.3 Software implementations of optimization clustering.
5.5 Choosing the number of clusters.
5.6 Applications of optimization methods.
5.6.1 Survey of student attitudes towards video games.
5.6.2 Air pollution indicators for US cities.
5.6.3 Aesthetic judgement of painters.
5.6.4 Classification of nonspecific back pain.
5.7 Summary.
6 Finite mixture densities as models for cluster analysis.
6.1 Introduction.
6.2 Finite mixture densities.
6.2.1 Maximum likelihood estimation.
6.2.2 Maximum likelihood estimation of mixtures of multivariate normal densities.
6.2.3 Problems with maximum likelihood estimation of finite mixture models using the EM algorithm.
6.3 Other finite mixture densities.
6.3.1 Mixtures of multivariate t-distributions.
6.3.2 Mixtures for categorical data: latent class analysis.
6.3.3 Mixture models for mixed-mode data.
6.4 Bayesian analysis of mixtures.
6.4.1 Choosing a prior distribution.
6.4.2 Label switching.
6.4.3 Markov chain Monte Carlo samplers.
6.5 Inference for mixture models with unknown number of components and model structure.
6.5.1 Log-likelihood ratio test statistics.
6.5.2 Information criteria.
6.5.3 Bayes factors.
6.5.4 Markov chain Monte Carlo methods.
6.6 Dimension reduction: variable selection in finite mixture modelling.
6.7 Finite regression mixtures.
6.8 Software for finite mixture modelling.
6.9 Some examples of the application of finite mixture densities.
6.9.1 Finite mixture densities with univariate Gaussian components.
6.9.2 Finite mixture densities with multivariate Gaussian components.
6.9.3 Applications of latent class analysis.
6.9.4 Application of a mixture model with different component densities.
6.10 Summary.
7 Model-based cluster analysis for structured data.
7.1 Introduction.
7.2 Finite mixture models for structured data.
7.3 Finite mixtures of factor models.
7.4 Finite mixtures of longitudinal models.
7.5 Applications of finite mixture models for structured data.
7.5.1 Application of finite mixture factor analysis to the categorical versus dimensional representation debate.
7.5.2 Application of finite mixture confirmatory factor analysis to cluster genes using replicated microarray experiments.
7.5.3 Application of finite mixture exploratory factor analysis to cluster Italian wines.
7.5.4 Application of growth mixture modelling to identify distinct developmental trajectories.
7.5.5 Application of growth mixture modelling to identify trajectories of perinatal depressive symptomatology.
7.6 Summary.
8 Miscellaneous clustering methods.
8.1 Introduction.
8.2 Density search clustering techniques.
8.2.1 Mode analysis.
8.2.2 Nearest-neighbour clustering procedures.
8.3 Density-based spatial clustering of applications with noise.
8.4 Techniques which allow overlapping clusters.
8.4.1 Clumping and related techniques.
8.4.2 Additive clustering.
8.4.3 Application of MAPCLUS to data on social relations in a monastery.
8.4.4 Pyramids.
8.4.5 Application of pyramid clustering to gene sequences of yeasts.
8.5 Simultaneous clustering of objects and variables.
8.5.1 Hierarchical classes.
8.5.2 Application of hierarchical classes to psychiatric symptoms.
8.5.3 The error variance technique.
8.5.4 Application of the error variance technique to appropriateness of behaviour data.
8.6 Clustering with constraints.
8.6.1 Contiguity constraints.
8.6.2 Application of contiguity-constrained clustering.
8.7 Fuzzy clustering.
8.7.1 Methods for fuzzy cluster analysis.
8.7.2 The assessment of fuzzy clustering.
8.7.3 Application of fuzzy cluster analysis to Roman glass composition.
8.8 Clustering and artificial neural networks.
8.8.1 Components of a neural network.
8.8.2 The Kohonen self-organizing map.
8.8.3 Application of neural nets to brainstorming sessions.
8.9 Summary.
9 Some final comments and guidelines.
9.1 Introduction.
9.2 Using clustering techniques in practice.
9.3 Testing for absence of structure.
9.4 Methods for comparing cluster solutions.
9.4.1 Comparing partitions.
9.4.2 Comparing dendrograms.
9.4.3 Comparing proximity matrices.
9.5 Internal cluster quality, influence and robustness.
9.5.1 Internal cluster quality.
9.5.2 Robustness: split-sample validation and consensus trees.
9.5.3 Influence of individual points.
9.6 Displaying cluster solutions graphically.
9.7 Illustrative examples.
9.7.1 Indo-European languages: a consensus tree in linguistics.
9.7.2 Scotch whisky tasting: cophenetic matrices for comparing clusterings.
9.7.3 Chemical compounds in the pharmaceutical industry.
9.7.4 Evaluating clustering algorithms for gene expression data.
9.8 Summary.
Bibliography.
Index.