dendrogram {stats}        R Documentation

General Tree Structures

Description

Class "dendrogram" provides general functions for handling tree-like structures. It is intended as a replacement for similar functions in hierarchical clustering and classification/regression trees, such that all of these can use the same engine for plotting or cutting trees.

The code is still in a testing stage, and the API may change in the future.

Usage

as.dendrogram(object, ...)
## S3 method for class 'hclust':
as.dendrogram(object, hang = -1, ...)

## S3 method for class 'dendrogram':
plot(x, type = c("rectangle", "triangle"),
      center = FALSE, edge.root = isLeaf(x) || !is.null(attr(x,"edgetext")),
      nodePar = NULL, edgePar = list(), xlab = "", ylab = "",
      horiz = FALSE, frame.plot = FALSE, ...)

## S3 method for class 'dendrogram':
cut(x, h, ...)

## S3 method for class 'dendrogram':
print(x, digits, ...)

## S3 method for class 'dendrogram':
rev(x)

## S3 method for class 'dendrogram':
str(object, max.level = 0, digits.d = 3,
    give.attr = FALSE, wid = getOption("width"),
    nest.lev = 0, indent.str = "", ...)

Arguments

object any R object that can be made into one of class "dendrogram".
x object of class "dendrogram".
hang numeric scalar indicating how the height of leaves should be computed from the heights of their parents; see plot.hclust.
type type of plot.
center logical; if TRUE, nodes are plotted centered with respect to the leaves in the branch. Otherwise (default), they are plotted in the middle of all direct child nodes.
edge.root logical; if TRUE, draw an edge to the root node.
nodePar a list of plotting parameters to use for the nodes (see points), or NULL (the default), in which case no symbols are drawn at the nodes. The list may contain components named pch, cex, col, and/or bg, each of which can have length two for specifying separate attributes for inner nodes and leaves.
edgePar a list of plotting parameters to use for the edge (see lines). The list may contain components named col, lty and/or lwd.
horiz logical indicating if the dendrogram should be drawn horizontally or not.
frame.plot logical indicating if a box around the plot should be drawn, see plot.default.
h height at which the tree is cut.
..., xlab, ylab graphical parameters, or arguments for other methods.
digits integer specifying the precision for printing, see print.default.
max.level, digits.d, give.attr, wid, nest.lev, indent.str arguments to str, see str.default(). Note that even with the default give.attr = FALSE, the height and members attributes are still shown for each node.

Details

Warning: This documentation is preliminary.

The dendrogram is directly represented as a nested list where each component corresponds to a branch of the tree. Hence, the first branch of tree z is z[[1]], the second branch of the corresponding subtree is z[[1]][[2]], etc. Each node of the tree carries some information needed for efficient plotting or cutting as attributes:

members
total number of leaves in the branch
height
numeric non-negative height at which the node is plotted.
midpoint
numeric horizontal distance of the node from the left border (the leftmost leaf) of the branch (unit 1 between all leaves). This is used for plot(*, center=FALSE).
label
character; the label of the node
edgetext
character; the label for the edge leading to the node
nodePar
a named list of length-one vectors specifying node-specific attributes for points plotting, see the nodePar argument above.
edgePar
a named list of length-one vectors specifying attributes for segments plotting of the edge leading to the node, see the edgePar argument above.
leaf
logical; if TRUE, the node is a leaf of the tree.
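
The nested-list layout and the node attributes described above can be inspected directly. The following is a minimal sketch, not part of the original examples; it assumes the built-in USArrests data and "average" linkage, and all variable names are illustrative only.

```r
# Small tree from the first six rows of the built-in USArrests data
hc <- hclust(dist(USArrests[1:6, ]), "average")
z  <- as.dendrogram(hc)

# Attributes stored on the root node
n_root <- attr(z, "members")   # total number of leaves in the tree
h_root <- attr(z, "height")    # height at which the root is plotted

# The members of the direct child branches add up to the members of the parent
n_children <- sum(sapply(seq_along(z), function(i) attr(z[[i]], "members")))

# Walk down the first branch until a leaf is reached, relying only on the
# documented "leaf" attribute
node <- z
while (!isTRUE(attr(node, "leaf"))) node <- node[[1]]
leaf_label <- attr(node, "label")

print(c(root = n_root, children = n_children))
print(leaf_label)
```

Testing the "leaf" attribute with isTRUE() avoids depending on any helper function, since inner nodes typically do not carry that attribute at all.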

cut.dendrogram() returns a list with components $upper and $lower. The first is a truncated version of the original tree, also of class dendrogram; the second is a list of the branches obtained from cutting the tree, each a dendrogram.
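
As a rough illustration of the return value, the sketch below cuts a tree and checks that every leaf of the original ends up in exactly one of the lower branches. The cut height of 70 is borrowed from the Examples section; the variable names are hypothetical.

```r
hc   <- hclust(dist(USArrests), "average")
dend <- as.dendrogram(hc)

# Cut the tree at height 70
parts <- cut(dend, h = 70)

# $upper is a single dendrogram; $lower is a plain list of dendrograms
upper_ok <- inherits(parts$upper, "dendrogram")
lower_ok <- all(sapply(parts$lower, inherits, what = "dendrogram"))

# Leaf counts of the lower branches sum to the leaf count of the original tree
n_lower <- sapply(parts$lower, function(b) attr(b, "members"))
print(n_lower)
print(sum(n_lower) == attr(dend, "members"))
```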

There are [[, print, and str methods for "dendrogram" objects, where the first one (extraction) ensures that selecting sub-branches keeps the class.
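
Because extraction keeps the class, a sub-branch can be passed straight to the other dendrogram methods. A short sketch, assuming the built-in USArrests data (the object names are illustrative):

```r
hc   <- hclust(dist(USArrests[1:10, ]), "average")
dend <- as.dendrogram(hc)

# Extract the second branch; the "[[" method keeps the class
sub <- dend[[2]]
print(class(sub))

# The extracted branch works with the dendrogram methods, e.g. rev() and str()
sub_rev <- rev(sub)
str(sub_rev)
```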

Objects of class "hclust" can be converted to class "dendrogram" using the as.dendrogram method.
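
The effect of the hang argument on the conversion can be checked by collecting the leaf heights. This is a sketch only: leaf_heights() is a hypothetical helper written for this illustration, not a function in the package, and the value hang = 0.1 is an arbitrary choice.

```r
hc <- hclust(dist(USArrests), "average")

# Hypothetical helper: collect the "height" attribute of every leaf
leaf_heights <- function(node) {
  if (isTRUE(attr(node, "leaf"))) return(attr(node, "height"))
  unlist(lapply(seq_along(node), function(i) leaf_heights(node[[i]])))
}

# Default hang = -1: all leaves are placed at height 0
d_flat <- as.dendrogram(hc)
print(range(leaf_heights(d_flat)))

# Non-negative hang: leaf heights are derived from the heights of their parents
d_hang <- as.dendrogram(hc, hang = 0.1)
print(range(leaf_heights(d_hang)))
```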

isLeaf(), plotNode() and plotNodeLimit() are helper functions.

Note

When using type = "triangle", center = TRUE often looks better.

Examples

data(USArrests)
hc <- hclust(dist(USArrests), "ave")
(dend1 <- as.dendrogram(hc)) # "print()" method
str(dend1) # "str()" method

op <- par(mfrow= c(2,2), mar = c(3,3,1,1))
plot(dend1)
## "triangle" type and show inner nodes:
plot(dend1, nodePar=list(pch = c(1,NA),cex=0.8), type = "t", center=TRUE)
plot(dend1, edgePar=list(col = 1:2, lty = 2:3), edge.root = TRUE)
plot(dend1, nodePar=list(pch = 2:1,cex=.4*2:1, col = 2:3), horiz = TRUE)

dend2 <- cut(dend1, h=70)
plot(dend2$upper)
## leaves are wrong horizontally:
plot(dend2$upper, nodePar=list(pch = c(1,7), col = 2:1))
##  dend2$lower is *NOT* a dendrogram, but a list of .. :
plot(dend2$lower[[3]], nodePar=list(col=4), horiz = TRUE, type = "tr")
## "inner" and "leaf" edges in different type & color :
plot(dend2$lower[[2]], nodePar=list(col=1),# non empty list
     edgePar = list(lty=1:2, col=2:1), edge.root=TRUE)
par(op)
