The availability of big data in materials science offers new routes for analyzing materials properties and functions and for achieving scientific understanding. Finding structure in these data that is not directly visible with standard tools, and exploiting the scientific information they contain, requires new and dedicated methodology based on approaches from statistical learning, compressed sensing, and other recent methods from applied mathematics, computer science, statistics, signal processing, and information science. In this paper, we explain and demonstrate a compressed-sensing-based methodology for feature selection, specifically for discovering physical descriptors, i.e., physical parameters that describe the material and its properties of interest, and associated equations that explicitly and quantitatively describe those relevant properties.
As a showcase application and proof of concept, we describe how to build a physical model for the quantitative prediction of the crystal structure of binary compound semiconductors.
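
To make the idea of compressed-sensing-based feature selection concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the selection step can be illustrated by l1-regularized least squares (LASSO), solved here with plain cyclic coordinate descent. The function names, the toy candidate features, the target values, and the regularization strength `lam` are all hypothetical. The four toy features are chosen to be mutually orthogonal so that the result can be checked by hand.

```python
# Illustrative sketch only: LASSO (l1-regularized least squares) solved by
# cyclic coordinate descent, used as a stand-in for a compressed-sensing
# feature-selection step. All names and data below are hypothetical.

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0 inside the dead zone."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(columns, target, lam, n_sweeps=50):
    """Minimize 0.5 * ||target - sum_j c_j * columns[j]||^2 + lam * sum_j |c_j|."""
    coefs = [0.0] * len(columns)
    residual = list(target)  # residual = target - current model prediction
    for _ in range(n_sweeps):
        for j, col in enumerate(columns):
            norm_sq = sum(x * x for x in col)
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(x * r for x, r in zip(col, residual)) + norm_sq * coefs[j]
            new_coef = soft_threshold(rho, lam) / norm_sq
            # Keep the residual consistent with the updated coefficient.
            delta = new_coef - coefs[j]
            residual = [r - delta * x for r, x in zip(residual, col)]
            coefs[j] = new_coef
    return coefs


def select_features(coefs, tol=1e-12):
    """Return the indices of features with a nonzero coefficient."""
    return [j for j, c in enumerate(coefs) if abs(c) > tol]


# Four hypothetical candidate features evaluated on eight hypothetical materials.
candidate_features = [
    [1, -1, 1, -1, 1, -1, 1, -1],
    [1, 1, -1, -1, 1, 1, -1, -1],
    [1, -1, -1, 1, 1, -1, -1, 1],
    [1, 1, 1, 1, -1, -1, -1, -1],
]

# Hypothetical target property that depends only on features 0 and 2.
true_coefs = [3.0, 0.0, -2.0, 0.0]
property_values = [
    sum(c * col[i] for c, col in zip(true_coefs, candidate_features))
    for i in range(8)
]

coefs = lasso_coordinate_descent(candidate_features, property_values, lam=4.0)
selected = select_features(coefs)
print("coefficients:", coefs)
print("selected feature indices:", selected)
```

In this toy setting the l1 penalty sets the coefficients of the two irrelevant features exactly to zero and shrinks the two relevant ones toward zero (2.5 and -1.5 instead of 3 and -2). Because of this shrinkage, a common follow-up is to refit the selected features by ordinary least squares.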
Learning physical descriptors for materials science by compressed sensing
Luca M. Ghiringhelli (1), Jan Vybiral (2), Emre Ahmetcik (1), Runhai Ouyang (1), Sergey V. Levchenko (1), Claudia Draxl (1,3), Matthias Scheffler (1,4) ((1) Fritz-Haber-Institut der Max-Planck-Gesellschaft, Berlin-Dahlem, Germany, (2) Charles University, Department of Mathematical Analysis, Prague, Czech Republic, (3) Humboldt-Universität zu Berlin, Institut für Physik and IRIS Adlershof, Berlin, Germany, (4) University of California - Santa Barbara, Department of Chemistry and Biochemistry and Materials Department, Santa Barbara, CA, USA)
https://arxiv.org/abs/1612.04285
(Submitted on 13 Dec 2016)