ABSTRACT
This paper views clusterings as elements of a lattice. Distances between clusterings are analyzed in their relationship to the lattice. From this vantage point, we first give an axiomatic characterization of some criteria for comparing clusterings, including the variation of information and the unadjusted Rand index. Then we study other distances between partitions w.r.t these axioms and prove an impossibility result: there is no "sensible" criterion for comparing clusterings that is simultaneously (1) aligned with the lattice of partitions, (2) convexely additive, and (3) bounded.
References
- Ben-Hur, A., Elisseeff, A.,
& Guyon, I. (2002). A stability based method for discovering
structure in clustered data. Pacific Symposium on Biocomputing (pp.
6--17).]]Google Scholar
- Cover, T. M., & Thomas, J. A. (1991). Elements of information theory. Wiley.]] Google Scholar
Digital Library
- Fowlkes, E. B., &
Mallows, C. L. (1983). A method for comparing two hierarchical
clusterings. Journal of the American Statistical Association, 78,
553--569.]]Google Scholar
Cross Ref
- Golumbic, M. (1980). Algorithmic graph theory and perfect graphs. Academic Press, New York.]] Google Scholar
Digital Library
- Hubert, L., & Arabie, P. (1985). Comparing partitions. Journal of Classification, 2, 193--218.]]Google Scholar
Cross Ref
- Kleinberg,
J. (2002). An impossibility theorem for clustering. Advances in Neural
Information Processing Systems. Cambridge, MA: MIT Press.]]Google Scholar
- Meilǎ,
M. (2003). Comparing clusterings by the variation of information.
Proceedings of the Sixteenth Annual Conference ofn Computational
Learning Theory (COLT). Springer.]]Google Scholar
Cross Ref
- Mirkin, B. (1996). Mathematical classification and clustering. Kluwer Academic Press.]]Google Scholar
- Rand,
W. M. (1971). Objective criteria for the evaluation of clustering
methods. Journal of the American Statistical Association, 66,
846--850.]]Google Scholar
Cross Ref
- Rènyi, A. (1970). Probability theory. North-Holland.]]Google Scholar
- Stanley, R. P. (1997). Enumerative combinatorics. Cambridge University Press.]] Google Scholar
Digital Library
- van
Dongen, S. (2000). Performance criteria for graph clustering and Markov
cluster experiments (Technical Report INS-R0012). Centrum voor Wiskunde
en Informatica.]] Google Scholar
Digital Library
- Wallace, D. L. (1983). Comment. Journal of the American Statistical Association, 78, 569--576.]]Google Scholar
Index Terms
(auto-classified)Comparing clusterings
Comments