Tk-merge: Computationally Efficient Robust Clustering Under General Assumptions

August 2022

Abstract

We address general-shaped clustering problems under very weak parametric assumptions with a two-step hybrid robust clustering algorithm based on trimmed k-means and hierarchical agglomeration. The algorithm has low computational complexity and identifies the clusters effectively even in the presence of data contamination. We also present generalizations of the algorithm and an adaptive procedure to estimate the amount of contamination.
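The two-step idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the Lloyd-style iteration, the trimming fraction `alpha`, and the use of a centroid distance `threshold` as the merging rule are all assumptions made for illustration. The sketch first runs a trimmed k-means with more components than expected clusters, then merges nearby components with single-linkage agglomeration.

```python
import math
import random


def trimmed_kmeans(points, k, alpha, n_iter=50, seed=0):
    """Lloyd-style trimmed k-means (illustrative): each iteration ignores
    the ceil(alpha * n) points farthest from their nearest centroid."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    n_keep = len(points) - math.ceil(alpha * len(points))
    labels = [-1] * len(points)
    for _ in range(n_iter):
        # Nearest centroid as a (distance, index) pair for every point.
        nearest = [min((math.dist(p, c), j) for j, c in enumerate(centroids))
                   for p in points]
        # Keep the n_keep closest points; the rest are trimmed (label -1).
        order = sorted(range(len(points)), key=lambda i: nearest[i][0])
        new_labels = [-1] * len(points)
        for i in order[:n_keep]:
            new_labels[i] = nearest[i][1]
        # Recompute each centroid from its untrimmed members only.
        for j in range(k):
            members = [points[i] for i in range(len(points))
                       if new_labels[i] == j]
            if members:
                centroids[j] = tuple(sum(x) / len(members)
                                     for x in zip(*members))
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


def merge_components(centroids, threshold):
    """Single-linkage agglomeration of centroids (assumed merging rule):
    components closer than `threshold` end up in the same group."""
    parent = list(range(len(centroids)))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for a in range(len(centroids)):
        for b in range(a + 1, len(centroids)):
            if math.dist(centroids[a], centroids[b]) < threshold:
                parent[find(a)] = find(b)
    return [find(a) for a in range(len(centroids))]


def tk_merge_sketch(points, k, alpha, threshold, seed=0):
    """Step 1: trimmed k-means with k components. Step 2: merge components.
    Trimmed points keep the label -1."""
    labels, centroids = trimmed_kmeans(points, k, alpha, seed=seed)
    groups = merge_components(centroids, threshold)
    return [-1 if lab == -1 else groups[lab] for lab in labels]


# Two elongated groups plus two gross outliers (synthetic data).
data = ([(0.5 * i, 0.0) for i in range(7)]
        + [(0.5 * i, 10.0) for i in range(7)]
        + [(100.0, 100.0), (-100.0, 50.0)])
final_labels = tk_merge_sketch(data, k=4, alpha=0.125, threshold=3.0)
```

Choosing `k` larger than the expected number of clusters lets several small components trace one non-spherical cluster before they are merged. The result depends on the random initialization, so in practice several restarts would be compared.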

Type

Book section

Publication

Part of the Advances in Intelligent Systems and Computing book series

Luca Insolia

Postdoctoral Researcher

My primary research interests concern robust statistics and high-dimensional modeling. During my PhD, I developed statistical methodologies for analyzing sparse regression problems affected by different forms of adversarial data contamination. The developed methodologies encompass continuous optimization methods as well as mixed-integer programming techniques. I applied these tools to analyze biomedical data and to investigate the main possible drivers of honey bee colony loss.

Domenico Perrotta

Researcher