CA, provide annotation tools in planes
Some additional drawings in the CA planes can be seen as automatic annotation tools:
- confidence ellipses
- connection lines between ordered CA points
- convex hulls around points of HCPC hierarchy nodes or clusters
These annotations serve as interpretation aids, in the same way as the existing interpretation aids for visualising CA factorial planes.
Drawing confidence ellipses
See the FactoMineR plotellipses() function.
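To make the idea concrete without depending on FactoMineR, here is a minimal base-R sketch of one confidence ellipse. It assumes that replicated coordinates of a single CA point (for example from a bootstrap) are available as a two-column matrix; the replicates below are simulated, and the helper name confidence_ellipse() is hypothetical. It is not the method used by plotellipses(); for CA results, FactoMineR's ellipseCA() is probably the closer match.

```r
# Hypothetical helper: ellipse covering `level` of a bivariate normal
# fitted to the rows of `xy` (two-column matrix of replicated coordinates).
confidence_ellipse <- function(xy, level = 0.95, n = 100) {
  centre <- colMeans(xy)
  sigma <- cov(xy)
  radius <- sqrt(qchisq(level, df = 2))
  theta <- seq(0, 2 * pi, length.out = n)
  circle <- cbind(cos(theta), sin(theta))
  # chol(sigma) is the upper-triangular R with t(R) %*% R == sigma, so
  # circle %*% R maps the unit circle onto the covariance ellipse.
  pts <- radius * circle %*% chol(sigma)
  sweep(pts, 2, centre, "+")
}

# Simulated replicates of one point (assumption, not real CA output)
set.seed(1)
replicates <- cbind(rnorm(200, 0.3, 0.05), rnorm(200, -0.1, 0.02))
ell <- confidence_ellipse(replicates)

plot(replicates, cex = 0.4, asp = 1, xlab = "Axis 1", ylab = "Axis 2")
lines(ell, lty = 2)
```

Every point of the returned outline lies at the same Mahalanobis distance from the centre, which is what makes it a constant-confidence contour under the normal approximation.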
Drawing connection lines between CA points
Connection lines between CA points can follow the order given by the values of a property of the points (for example, age ranges).
A representative, moderately complex example is given in (Volle, 1997) with the CA of the table of the population of French regions by 'sex x age range' in 1968:
- women (F) are connected from row-point 'F 0-4' to 'F 75 et +' in increasing age order, with a dashed line
- men (H) are connected from row-point 'H 0-4' to 'H 75 et +' in increasing age order, with a solid line
The two polylines show that the age-range row-points of the two sexes follow parallel trajectories.
Style: lines can be oriented (e.g. with arrowheads) to visualize the order.
The data table on which this CA plane is based is currently being reconstructed from INSEE public data and will be attached to this ticket as soon as it is available. It will be used for the acceptance testing (recette) of this type of annotation (contact SLH in the meantime).
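Until that table is available, the drawing itself can be sketched in base R. The coordinates below are made up for illustration only (they are not the Volle data), and the helper name connect_ordered() is hypothetical. The sketch sorts the points by an ordering property, then joins consecutive points with arrows (oriented) or plain segments.

```r
# Hypothetical helper: join consecutive rows of `xy` (already sorted by the
# ordering property) on the current plot; returns the number of links drawn.
connect_ordered <- function(xy, lty = 1, arrow = TRUE) {
  n <- nrow(xy)
  if (n < 2) return(invisible(0L))
  from <- xy[-n, , drop = FALSE]
  to <- xy[-1, , drop = FALSE]
  if (arrow) {
    # Arrowheads make the direction of the ordering visible
    arrows(from[, 1], from[, 2], to[, 1], to[, 2], length = 0.08, lty = lty)
  } else {
    segments(from[, 1], from[, 2], to[, 1], to[, 2], lty = lty)
  }
  invisible(n - 1L)
}

# Made-up coordinates for 7 ordered age ranges per sex (assumption)
age_rank <- 1:7
women <- cbind(x = seq(-1, 1, length.out = 7), y = sin(age_rank / 2) + 0.15)
men   <- cbind(x = seq(-1, 1, length.out = 7), y = sin(age_rank / 2) - 0.15)

plot(rbind(women, men), type = "n", asp = 1, xlab = "Axis 1", ylab = "Axis 2")
points(women, pch = 1)
points(men, pch = 16)
n_f <- connect_ordered(women[order(age_rank), ], lty = 2)  # dashed: women
n_h <- connect_ordered(men[order(age_rank), ], lty = 1)    # solid: men
```

Sorting outside the helper keeps the ordering property (age rank here, 'text@annee' in another corpus) independent of the drawing code.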
Functional validation
For the VOEUX corpus, lines could connect 'text@loc' or 'text@annee' column-points (which are all ordered chronologically).
Drawing convex hulls of HCPC hierarchy nodes or clusters
In the previous CA plane, one can draw convex hulls around the column-points under each node of an HCPC hierarchy, and around the column-points of the chosen clusters:
In this example:
- solid (smoothed) convex hulls are drawn around each HCPC hierarchy node, down to each leaf node
- 5 hulls corresponding to 5 HCPC clusters, chosen by cutting the hierarchy, are labeled 1, 2, 3, 4 and 5
- 3 dashed hulls corresponding to 3 HCPC clusters, chosen by cutting the hierarchy, are labeled 'I', 'II' and 'III'
Remarks:
- this example clearly shows how some column-points enter the HCPC hierarchy alone, or with only a few companions:
  - the hull of cluster '3' (or 'II') around the 'Paris' column-point stays the same across the selected node levels, which shows that 'Paris' forms a group on its own
  - 'Alsace' and 'Rhone Alpes' are grouped together
  - etc.
- in this paper-based example, the hulls of all node groups are drawn. In an interactive UI, only some of them could be drawn, depending for example on the value of the 'number of clusters' parameter.
Implementation
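For convex hull computation, see for example the chull() function in R. It belongs to the grDevices package, which ships with base R, so no installation is needed:
# chull() comes from grDevices, attached by default in R
X <- matrix(rnorm(2000), ncol = 2)  # 1000 random 2D points
plot(X, cex = 0.5)
hpts <- chull(X)                    # row indices of the hull vertices
hpts <- c(hpts, hpts[1])            # repeat the first vertex to close the hull
lines(X[hpts, ])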
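The same function can be applied per cluster, with the 'number of clusters' parameter deciding which hulls are drawn. The sketch below is an assumption-laden illustration: it uses base-R hclust() and cutree() as a stand-in for the HCPC hierarchy, random coordinates instead of real column-points, and a hypothetical helper name draw_cluster_hulls().

```r
# Hypothetical helper: cut `tree` into k clusters and draw one convex hull
# per cluster on the current plot; returns the hull vertex indices by cluster.
draw_cluster_hulls <- function(xy, tree, k, lty = 1) {
  groups <- cutree(tree, k = k)
  lapply(split(seq_len(nrow(xy)), groups), function(idx) {
    # chull() returns positions within the subset; map them back to rows of xy
    hull <- idx[chull(xy[idx, , drop = FALSE])]
    polygon(xy[hull, , drop = FALSE], border = "black", lty = lty)
    hull
  })
}

# Made-up coordinates standing in for column-points (assumption)
set.seed(7)
xy <- matrix(rnorm(60), ncol = 2)
# hclust() is used here instead of HCPC (assumption)
tree <- hclust(dist(xy), method = "ward.D2")

plot(xy, asp = 1, xlab = "Axis 1", ylab = "Axis 2")
hulls5 <- draw_cluster_hulls(xy, tree, k = 5)           # solid hulls, 5 clusters
hulls3 <- draw_cluster_hulls(xy, tree, k = 3, lty = 2)  # dashed hulls, 3 clusters
```

Because both cuts come from the same tree, each of the 5 solid hulls falls inside one of the 3 dashed hulls, which mirrors the nesting visible in the paper example.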
For smoothed convex hull computation, see for example this gist based on D3.polygonHull: https://gist.github.com/hollasch/9d3c098022f5524220bd84aae7623478