Clustering Concepts
External
Internal
Overview
We talk about "clustering" when we have a set of n "points", which we may think of as points in space in the geometric sense. The underlying problem is rarely intrinsically geometric: usually the points represent something else we care about (web pages, genome sequence fragments, etc.) and we want to group them into coherent clusters. In machine learning, clustering is a form of "unsupervised learning": the data is unlabeled, and we look for patterns in it without the help of annotations.
Similarity Measure
The similarity measure is a function that, for any two objects, returns a numerical result expressing how similar (or dissimilar) those objects are.
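As an illustration, here is a minimal sketch of two common measures. The function names are hypothetical and chosen for this example only. Euclidean distance expresses dissimilarity between geometric points, while Jaccard similarity expresses similarity between objects represented as sets (for example, the set of words on a web page).

```python
import math


def euclidean_distance(p, q):
    """Dissimilarity of two points: 0 when identical, larger when further apart."""
    if len(p) != len(q):
        raise ValueError("points must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def jaccard_similarity(a, b):
    """Similarity of two sets: 1.0 when identical, 0.0 when disjoint."""
    a, b = set(a), set(b)
    if not a and not b:
        # Convention assumed here: two empty sets count as identical.
        return 1.0
    return len(a & b) / len(a | b)
```

Note the opposite orientation of the two functions: a small distance means "similar", while a small Jaccard value means "dissimilar". A clustering algorithm needs to know which convention its measure follows.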
The Clustering Problem
Organizatorium
- Some well-known clustering algorithms are greedy.
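One example of a greedy approach is single-link clustering, which repeatedly merges the two closest clusters until k clusters remain. The sketch below is illustrative only: the function name is hypothetical, Euclidean distance is assumed as the dissimilarity measure, and the page does not say which greedy algorithms it has in mind.

```python
import math
from itertools import combinations


def single_link_clusters(points, k):
    """Greedily merge the closest pair of clusters until k clusters remain.

    Returns a sorted list of clusters, each a list of indices into `points`.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    # Union-find forest: parent[i] == i means i is the root of its cluster.
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Greedy order: consider point pairs from closest to furthest.
    pairs = sorted(
        combinations(range(len(points)), 2),
        key=lambda ij: math.dist(points[ij[0]], points[ij[1]]),
    )

    remaining = len(points)
    for i, j in pairs:
        if remaining == k:
            break
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # the pair spans two clusters, so merge them
            parent[root_i] = root_j
            remaining -= 1

    groups = {}
    for idx in range(len(points)):
        groups.setdefault(find(idx), []).append(idx)
    return sorted(groups.values())
```

For instance, `single_link_clusters([(0, 0), (0, 1), (10, 10), (10, 11)], 2)` groups the first two points together and the last two together.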