The PubChemQC PM6 datasets
License and Copyright
Copyright © 2019, 2020, 2021 NAKATA Maho, MAEDA Toshiyuki, SHIMAZAKI Tomomi, HASHIMOTO Masatomo
The PubChemQC PM6 datasets are licensed under a Creative Commons Attribution 4.0 International License.
News
2021-08-20: PubChemQC PM6 ver2.0.0 is available. We added experimental databases using Docker Compose, smaller subsets CHON300noSalt and CHNOPSFCl300noSalt, and raw Gaussian output files. With the databases, you can query molecules very easily.
Downloads
How to use docker databases
Please refer to this page.
Older versions
The PubChemQC PM6 dataset (ver.1.0.3.3) can be downloaded from here.
The PubChemQC PM6 dataset (ver.1.0.3.2) can be downloaded from here.
The PubChemQC PM6 dataset (ver.1.0.3.1) can be downloaded from here.
The PubChemQC PM6 dataset (ver.1.0.3) can be downloaded from here.
The PubChemQC PM6 dataset (ver.1.0.0) can be downloaded from here.
History
2020-10-26: PubChemQC PM6: Data Sets of 221 Million Molecules with Optimized Molecular Geometries and Electronic Properties is now published.
2020-09-08: Updated to 1.0.3.3. Salts are now included in the CHNOPSFClNaKMgCa datasets.
2020-08-19: An essential part of the job scripts has been uploaded. These scripts are for reference only.
2020-06-24: Updated to 1.0.3.2. The sub-datasets were rebuilt to use mnemonic names such as CHON and CHNOPS. Nothing changed except sub-dataset 4: we added Mg to it so that it covers the most common elements of the human body, except for fluorine.
2020-06-21: Four sub-datasets were added. The full set is unchanged.
(1) contains C, H, N and O elements, molecular weight less than 500, and no salt.
(2) contains C, H, N, O, S and P elements, molecular weight less than 500, and no salt.
(3) contains C, H, N, F, Cl, O, S and P elements, molecular weight less than 500, and no salt.
(4) contains C, H, N, F, Cl, O, S, P, K, Na and Ca elements, molecular weight less than 500.
2019-05-29: Ver.1.0.3 is released
2019-02-28: Ver.1.0 is released
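The element and molecular-weight criteria listed in the 2020-06-21 entry can be illustrated with a short Python sketch. This is not the authors' job script. The function names, the `SUBSETS` labels, and the rounded atomic weights are assumptions made for illustration, and the "no salt" criterion is not checked because this page does not say how salts are detected.

```python
import re

# Rounded standard atomic weights for the elements named in the sub-dataset
# descriptions (illustrative values, not taken from the dataset itself).
ATOMIC_WEIGHT = {
    "H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999,
    "F": 18.998, "Na": 22.990, "Mg": 24.305, "P": 30.974,
    "S": 32.06, "Cl": 35.45, "K": 39.098, "Ca": 40.078,
}

# Allowed element sets for sub-datasets (1) to (3); the keys are labels
# chosen for this sketch.
SUBSETS = {
    "CHON": {"C", "H", "O", "N"},
    "CHNOPS": {"C", "H", "N", "O", "P", "S"},
    "CHNOPSFCl": {"C", "H", "N", "O", "P", "S", "F", "Cl"},
}


def parse_formula(formula):
    """Parse a simple formula such as 'C6H12O6' into {element: count}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def molecular_weight(counts):
    """Sum the atomic weights of all atoms in the parsed formula."""
    return sum(ATOMIC_WEIGHT[element] * n for element, n in counts.items())


def matching_subsets(formula, max_weight=500.0):
    """Return the labels whose element and weight criteria the formula meets.

    Salt detection is deliberately left out (see the note above the code).
    """
    counts = parse_formula(formula)
    if any(element not in ATOMIC_WEIGHT for element in counts):
        return []  # element outside the table used in this sketch
    if molecular_weight(counts) >= max_weight:
        return []  # the criterion is "molecular weight less than 500"
    return [name for name, allowed in SUBSETS.items() if set(counts) <= allowed]


if __name__ == "__main__":
    for formula in ("C6H12O6", "C2H6S", "C2H5Cl", "C60"):
        print(formula, matching_subsets(formula))
```

The subsets nest, so a molecule made only of C, H and O matches all three labels, while one containing chlorine matches only the last.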
Reference
(published version) PubChemQC PM6: Data Sets of 221 Million Molecules with Optimized Molecular Geometries and Electronic Properties
(arXiv version) PubChemQC PM6: A dataset of 221 million molecules with optimized molecular geometries and electronic properties
Nakata Maho