.BG
.EQ
delim $$
.EN
.FN hclust
.TL
Hierarchical Clustering
.CS
hclust(dist, method="compact", sim)
.AG dist
a distance structure or distance matrix.  Normally this will be the
result of the function `dist', but it can be any data of the
form returned by `dist', or a full, symmetric matrix.
.AG method
character string giving the clustering method.  The three
methods currently implemented are "average", "connected"
(single linkage) and "compact" (complete linkage).
(The first three characters of the
method are sufficient.)
.AG sim=
optional structure replacing `dist', but giving similarities
rather than distances. Exactly one of `sim' or `dist' must
be given.
.RT
a "tree" representing the clustering, consisting of the
following components:
.RC merge
an ($n~-~1$) by 2 matrix, if there were $n$ objects in the
original data.  Row $i$ of `merge' describes the
merging of clusters at step $i$ of the clustering.
If an element $j$ in the row is negative, then object $-j$
was merged at this stage.  If $j$ is positive,
then the merge was with the cluster formed at
the (earlier) stage $j$ of the algorithm.
.RC height
the clustering "height"; that is, the distance between
clusters merged at the successive stages.
.RC order
a vector giving a permutation of the original objects
suitable for plotting, in the sense that a cluster plot
using this ordering will not have crossings of the branches.
.PP
In hierarchical cluster displays, a decision is needed
at each merge to specify which subtree should go on the left
and which on the right.
Since, for `n' individuals, there are `n\-1' merges, there are
`2^(n\-1)' possible orderings for the leaves in a cluster tree.
By default, `hclust' orders the
subtrees so that the tighter cluster is on the left
(that is, the last merge of the left subtree is at a lower value than
the last merge of the right subtree).
Individuals are the tightest clusters possible, and merges involving
two individuals place them in order by their observation number.
.SA
The functions `plclust' and `labclust' are used for plotting
the result of a hierarchical clustering. `dist' computes distance matrices.
Functions `cutree', `clorder', and `subtree' can be used
to manipulate the tree data structure.
.EX
h <- hclust(dist(x))
plclust(h)

hclust(dist(x),"ave")
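
# A sketch of inspecting the components of the returned tree.
# As in the examples above, `x' is assumed to be a numeric data
# matrix with one row per object.
h <- hclust(dist(x), "ave")
h$merge[1,]   # first merge; negative entries are original objects
h$height      # distances at which the successive merges occurred
h$order       # ordering of the objects that avoids crossing branches

# Clustering from similarities instead of distances.  `s' is a
# hypothetical similarity structure (not defined here); it must be
# passed by name, in place of `dist'.
hclust(sim=s)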
.KW cluster
.KW multivariate
.KW array
.EQ
delim off
.EN
.WR
