Automatic topography of multidimensional probability densities

A Seminar of the NOMAD Laboratory

  • Online Seminar of the NOMAD Laboratory
  • Date: Feb 4, 2021
  • Time: 02:15 PM (Local Time Germany)
  • Speaker: Prof. Alessandro Laio
  • International School for Advanced Studies (SISSA), Trieste
  • Location: https://us02web.zoom.us/j/88500983651?pwd=M01ZRGJiVzFKRkhYRmN5ZkJqeUxKQT09
  • Room: Meeting ID: 885 0098 3651 | Password: NOMAD
  • Host: NOMAD Laboratory
Unsupervised methods in data analysis aim at obtaining a synthetic description of high-dimensional data landscapes, revealing their structure and their salient features. We will describe an approach for charting complex and heterogeneous data spaces, providing a topography of the high-dimensional probability density from which the data are harvested.

We obtain information on the number and height of the probability peaks, the depth of the "valleys" separating them, the relative location of the peaks, and their hierarchical organization. The topography is reconstructed using an unsupervised variant of Density Peak clustering[1,2] that exploits a non-parametric density estimator[3], which automatically measures the density in the manifold containing the data[4]. Importantly, the density estimator also provides an estimate of its own error. This is a key feature: it makes it possible to distinguish genuine probability peaks from density fluctuations due to finite sampling. We show that this approach identifies the Markov states explored during a protein-folding molecular dynamics trajectory directly from the shape of the multidimensional probability density, that is, without exploiting any kinetic information[5].
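To make the idea of locating probability peaks concrete, the following is a minimal sketch of plain Density Peak clustering in Python with NumPy. It is an illustration under stated assumptions, not the method presented in the talk: the density is a simple k-nearest-neighbour estimate computed in the embedding dimension (not the estimator of [3] or the intrinsic dimension of [4]), the number of clusters is supplied by hand instead of being determined automatically, and no error estimate is used to merge spurious peaks. The function name `density_peaks` and its parameters `k` and `n_clusters` are hypothetical.

```python
import numpy as np

def density_peaks(X, k=10, n_clusters=2):
    """Illustrative Density Peak clustering with a k-NN density estimate.

    Assumptions (not from the talk): Euclidean metric, density measured in
    the embedding dimension, number of clusters fixed by the caller.
    """
    X = np.asarray(X, dtype=float)
    n, dim = X.shape

    # Pairwise Euclidean distances (O(n^2) memory; fine for small data sets).
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

    # Distance to the k-th neighbour (column 0 is the point itself).
    r_k = np.sort(d, axis=1)[:, k]

    # Unnormalised k-NN density: a small neighbourhood means a high density.
    rho = 1.0 / r_k**dim

    # For each point, find the nearest point of higher density.
    order = np.argsort(-rho)              # indices by decreasing density
    delta = np.empty(n)
    parent = np.full(n, -1)
    delta[order[0]] = d[order[0]].max()   # convention for the densest point
    for rank in range(1, n):
        i = order[rank]
        higher = order[:rank]
        j = higher[np.argmin(d[i, higher])]
        delta[i], parent[i] = d[i, j], j

    # Peaks are points that are both dense and far from any denser point.
    centers = np.argsort(-(rho * delta))[:n_clusters]

    # Every other point inherits the label of its nearest denser neighbour.
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers


# Usage: two well-separated Gaussian blobs in two dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (100, 2)),
               rng.normal(10.0, 0.5, (100, 2))])
labels, centers = density_peaks(X, k=10, n_clusters=2)
```

For a k-NN estimate of this kind, the statistical uncertainty on the logarithm of the density is roughly 1/sqrt(k). An error estimate of that sort is what would allow two neighbouring peaks to be merged when the density at the saddle between them is not significantly lower than at the peaks; that merging step is left out of this sketch.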

[1] Science, 1492, vol 344 (2014)
[2] Inf. Sci., doi.org/10.1016/j.ins.2021.01.010 (2021)
[3] JCTC, 1206, vol 14 (2018)
[4] Sci. Rep., 12140, vol 7 (2017)
[5] JCTC, 80, vol 1 (2020)
