Large-scale Machine Learning

Data sets are growing rapidly in both volume and dimensionality, in part because they are increasingly gathered by cheap and numerous information-sensing mobile devices, aerial remote-sensing platforms, software logs, cameras, microphones, radio-frequency identification (RFID) readers and wireless sensor networks. This growth poses new algorithmic challenges for machine learning tasks, and we have worked on developing new algorithms that address them.

Approximating Spectral Sums of Large-scale Matrices

Computing the trace of a matrix function plays an important role in many scientific computing applications, including machine learning, computational physics (e.g., lattice quantum chromodynamics), network analysis and computational biology (e.g., protein folding). We have worked on developing fast algorithms for approximating the trace of matrix functions of large symmetric matrices with tens of millions of dimensions.
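
As a rough illustration of the idea named in the paper titles listed in this section, the trace of a matrix function can be estimated by combining a Chebyshev polynomial approximation of the function with a randomized (Hutchinson-style) trace estimator, so that only matrix-vector products are needed. The Python sketch below is a minimal version written for this page under stated assumptions, not the authors' released code: the function names `chebyshev_coefficients` and `stochastic_chebyshev_trace` are hypothetical, the matrix is assumed symmetric with all eigenvalues inside a known interval [a, b], and the small dense test matrix only stands in for a large sparse one.

```python
import numpy as np


def chebyshev_coefficients(f, a, b, degree):
    """Chebyshev interpolation coefficients of f on [a, b] (hypothetical helper)."""
    k = np.arange(degree + 1)
    theta = np.pi * (k + 0.5) / (degree + 1)
    nodes = np.cos(theta)                      # Chebyshev nodes in [-1, 1]
    values = f(0.5 * (b - a) * nodes + 0.5 * (b + a))
    # T_j(nodes_k) = cos(j * theta_k)
    T = np.cos(np.outer(np.arange(degree + 1), theta))
    coeffs = (2.0 / (degree + 1)) * (T @ values)
    coeffs[0] *= 0.5
    return coeffs


def stochastic_chebyshev_trace(matvec, dim, f, a, b,
                               degree=30, num_probes=100, seed=0):
    """Estimate tr(f(A)) using only products with A (degree must be >= 1).

    Assumes A is symmetric with eigenvalues in [a, b]; matvec(X) returns A @ X.
    """
    rng = np.random.default_rng(seed)
    coeffs = chebyshev_coefficients(f, a, b, degree)

    def scaled_matvec(x):
        # Affine map that sends the spectrum of A from [a, b] to [-1, 1].
        return (2.0 * matvec(x) - (b + a) * x) / (b - a)

    # Rademacher probe vectors, one per column.
    V = rng.choice([-1.0, 1.0], size=(dim, num_probes))

    # Three-term recurrence: T_{j+1}(x) = 2 x T_j(x) - T_{j-1}(x).
    w_prev = V
    w_curr = scaled_matvec(V)
    acc = coeffs[0] * w_prev + coeffs[1] * w_curr
    for j in range(2, degree + 1):
        w_next = 2.0 * scaled_matvec(w_curr) - w_prev
        acc = acc + coeffs[j] * w_next
        w_prev, w_curr = w_curr, w_next

    # Average of v^T p(A) v over all probes.
    return float(np.sum(V * acc) / num_probes)


# Usage: log-determinant of a small symmetric positive definite test matrix,
# using log det(A) = tr(log(A)). Eigenvalues are placed in [1, 4] by construction.
rng = np.random.default_rng(1)
dim = 200
Q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
A = (Q * np.linspace(1.0, 4.0, dim)) @ Q.T

estimate = stochastic_chebyshev_trace(lambda X: A @ X, dim, np.log, 1.0, 4.0)
exact = np.linalg.slogdet(A)[1]
print("estimate:", estimate, "exact:", exact)
```

The estimator never forms log(A) explicitly; its cost is `degree` products with A per block of probe vectors, which is what makes this kind of approach attractive for large sparse matrices. The accuracy depends on the polynomial degree, the number of probes and how tight the eigenvalue bounds are.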

Faster Greedy MAP Inference for Determinantal Point Processes: Insu Han, Prabhanjan Kambadur, Kyoungsoo Park and Jinwoo Shin
submitted
Approximating Spectral Sums of Large-scale Matrices using Stochastic Chebyshev Approximations: Insu Han, Dmitry Malioutov, Haim Avron and Jinwoo Shin
SIAM Journal on Scientific Computing, 2017 (accepted, to appear)
Large-scale Log-determinant Computation through Stochastic Chebyshev Expansions (code): Insu Han, Dmitry Malioutov and Jinwoo Shin
ICML 2015 (edited by Jinwoo Shin, Mar 2017)