Martin Uhrin

In this work, we apply methods for describing three-dimensional images to the problem of encoding atomic environments in a way that is invariant to rotations, translations, and permutations of the atoms. Crucially, the encoding can be decoded back into the original environment, modulo global orientation, without the need for training a model. From the point of view of decoding, the descriptor is optimally complete and can be extended to arbitrary order, allowing for a systematic convergence of the fidelity of the description. In experiments on molecules ranging from 3 to 29 atoms in size, we demonstrate that positions can be decoded with a 97% success rate and positions plus species with a 70% success rate, rising to 95% if a second fingerprint is used. In all cases, consistent recovery is observed for molecules with 17 or fewer atoms. Additionally, we evaluate the descriptor's performance in predicting the energies and forces of bulk Ni, Cu, Li, Mo, Si, and Ge by means of a neural network model trained on DFT data. Compared with six machine learning interaction potential methods that use various descriptors and regression schemes, our descriptor is found to be competitive, in several cases outperforming well-established methods. The combined ability to both decode and make property predictions from a representation that does not need to be learned lays the foundations for a novel way of building generative models tasked with solving the inverse problem of predicting atomic arrangements that are statistically likely to have certain desired properties.