A multidimensional dataset for structure-based machine learning - Nature Computational Science