The structures and training features are stored in each structure folder. The training energies (labels) are listed in a separate .csv file.

Criegee dataset (criegee.zip)
-'criegee.csv': Contains the RHF and MRCI+Q energies computed with the cc-pVTZ basis.
-'criegee_diagnostic.csv': Contains all the T1 and D1 diagnostics computed at CCSD/cc-pVTZ.
-For each structure folder (800 in total):
--'geo.xyz': Configuration structure
--'features_tz.hdf5': Diagonal MOB features used for KA-GPR
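The folder layout above can be traversed with a few lines of standard-library Python. This is a minimal sketch, assuming that 'geo.xyz' follows the usual XYZ convention (atom count, comment line, then one "symbol x y z" line per atom) and that the archive has already been extracted; the function names read_xyz and iter_structures are illustrative and not part of the dataset.

```python
from pathlib import Path


def read_xyz(path):
    """Parse an XYZ file into (symbols, coordinates).

    Assumes the standard XYZ layout: atom count, comment line, atom lines.
    """
    lines = Path(path).read_text().splitlines()
    n_atoms = int(lines[0].split()[0])        # first line: number of atoms
    symbols, coords = [], []
    for line in lines[2:2 + n_atoms]:         # second line is a free-form comment
        parts = line.split()
        symbols.append(parts[0])
        coords.append(tuple(float(x) for x in parts[1:4]))
    return symbols, coords


def iter_structures(root):
    """Yield (folder name, symbols, coords) for each subfolder containing geo.xyz."""
    for folder in sorted(Path(root).iterdir()):
        if not folder.is_dir():
            continue                          # skip the .csv files next to the folders
        xyz = folder / "geo.xyz"
        if xyz.is_file():
            symbols, coords = read_xyz(xyz)
            yield folder.name, symbols, coords
```

Calling list(iter_structures("criegee")) on the extracted archive (the path is an assumption) would be expected to return one entry per structure folder.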

H10 chain dataset (h10.zip)
-'h10.csv': Contains the RHF and MRCI+Q energies computed with the cc-pVTZ-F12 basis.
-For each structure folder:
--'geo.xyz': Configuration structure
--'features_tz.hdf5': Diagonal MOB features used for KA-GPR

Small free radical dataset (small_radicals.zip)
-For each radical x (9 in total):
--'x.csv': Contains the ROHF and MRCI+Q energies computed with the cc-pVTZ basis.
--For each thermalized structure folder (200 in total):
---'geo.xyz': Structure
---'features_alpha.hdf5': Diagonal MOB features for alpha spin orbital (or up spin) used for KA-GPR
---'features_beta.hdf5': Diagonal MOB features for beta spin orbital (or down spin) used for KA-GPR
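For the open-shell datasets, each structure folder carries a pair of feature files, one per spin channel. The sketch below shows one way to locate the pair and read the contents. The names of the datasets inside the HDF5 files are not documented here, so the reader simply collects every dataset it finds. It relies on the third-party h5py package, which is an assumption about your environment; the helper names are illustrative.

```python
from pathlib import Path


def spin_feature_paths(structure_dir):
    """Return {'alpha': path, 'beta': path} for one open-shell structure folder."""
    folder = Path(structure_dir)
    paths = {spin: folder / f"features_{spin}.hdf5" for spin in ("alpha", "beta")}
    missing = [p.name for p in paths.values() if not p.is_file()]
    if missing:
        raise FileNotFoundError(f"{folder}: missing {missing}")
    return paths


def read_hdf5_datasets(path):
    """Read every dataset in an HDF5 file into a {name: array} dict."""
    import h5py  # third-party; imported lazily so the path helper works without it

    arrays = {}
    with h5py.File(path, "r") as f:
        def collect(name, obj):
            # visititems walks groups and datasets; keep only the datasets
            if isinstance(obj, h5py.Dataset):
                arrays[name] = obj[()]
        f.visititems(collect)
    return arrays
```

A typical use would be to call read_hdf5_datasets on both paths returned by spin_feature_paths and inspect the keys before deciding how to assemble the alpha and beta features.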

Water bond dissociation dataset (h2o_dissociation.zip)
-'h2o_dissociation.csv': Contains the initial conformer ID, the bond length factor relative to the initial OH bond length on the dissociation PES, and the ROHF and MRCI+Q energies computed with the aug-cc-pVTZ basis.
-For each initial conformer folder (50 in total):
--For each structure on the dissociation PES (20 in total):
---'geo.xyz': Structure
---'features_alpha.hdf5': Diagonal MOB features for alpha spin orbital (or up spin) used for KA-GPR
---'features_beta.hdf5': Diagonal MOB features for beta spin orbital (or down spin) used for KA-GPR
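Because the CSV file lists a conformer ID and a bond length factor for every point, the rows can be regrouped into one dissociation curve per initial conformer. The following is a sketch only: the column headers "conformer_id", "bond_length_factor", and "mrci_q" are hypothetical placeholders, since the actual header names are not given here, and should be replaced with the ones found in the file.

```python
import csv
from collections import defaultdict


def dissociation_curves(csv_path,
                        id_col="conformer_id",             # hypothetical header
                        factor_col="bond_length_factor",   # hypothetical header
                        energy_col="mrci_q"):              # hypothetical header
    """Group rows by conformer and sort each curve by bond length factor.

    Returns {conformer ID: [(factor, energy), ...]} with factors ascending.
    """
    curves = defaultdict(list)
    with open(csv_path, newline="") as fh:
        for row in csv.DictReader(fh):
            point = (float(row[factor_col]), float(row[energy_col]))
            curves[row[id_col]].append(point)
    return {cid: sorted(points) for cid, points in curves.items()}
```

With the layout described above, the result would be expected to hold 50 curves of 20 points each.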

QMSpin dataset (qmspin.zip)
-'qmspin.csv': Contains the conformer ID, the RHF energies for singlets, the ROHF energies for triplets, and the MRCI+Q energies computed with the cc-pVDZ basis. The spin state is listed as 0 for singlet and 2 for triplet.
-geometries_singlet: Contains the structures optimized in the singlet state
-geometries_triplet: Contains the structures optimized in the triplet state
-singlet:
--For each structure folder:
---'geo.xyz': Structure
---'features_dz_singlet.hdf5': Diagonal MOB features for singlet energies
---'features_alpha_dz_triplet.hdf5': Diagonal MOB features for alpha spin orbital (or up spin) used for triplet energies
---'features_beta_dz_triplet.hdf5': Diagonal MOB features for beta spin orbital (or down spin) used for triplet energies
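The spin state code in 'qmspin.csv' determines which feature files belong to a given label. A small lookup, sketched below, makes that mapping explicit. The file names and the 0/2 codes come from the description above; the helper name and the idea of passing the code straight from a CSV row are illustrative assumptions.

```python
from pathlib import Path

# Spin state code (as listed in qmspin.csv) -> feature files in a structure folder
FEATURE_FILES = {
    0: ("features_dz_singlet.hdf5",),
    2: ("features_alpha_dz_triplet.hdf5", "features_beta_dz_triplet.hdf5"),
}


def qmspin_feature_paths(structure_dir, spin_state):
    """Return the feature file paths for a structure folder and spin state code.

    spin_state may be an int or a numeric string as read from a CSV row.
    """
    code = int(spin_state)
    if code not in FEATURE_FILES:
        raise ValueError(f"unknown spin state {spin_state!r}; expected 0 or 2")
    return [Path(structure_dir) / name for name in FEATURE_FILES[code]]
```

A singlet label maps to a single closed-shell feature file, while a triplet label maps to the alpha and beta files, in that order.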