The SPICE dataset [1] has a PubChem subset of 14643 molecules (731856 conformers) with <50 atoms, drawn from the elements {H, C, N, O, F, P, S, Cl, Br, I}.
Goal: Reproduce the PubChem subset of the SPICE dataset using different triple-zeta basis sets [2]: {cc-pVTZ, 6-311G, def2-TZVP, def2-TZVPD, def2-TZVPP, def2-TZVPPD}.
Subgoal: Compare data science error between basis sets.
Subgoal: Try pretraining on a cheaper basis set and fine-tuning on a more expensive basis set.
[1] https://github.com/openmm/spice-dataset
[2] https://en.wikipedia.org/wiki/Basis_set_(chemistry)#Split-valence_basis_sets
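A minimal sketch of how the subset criteria and the basis-set sweep above could be turned into a job list. It assumes molecules are available as plain lists of element symbols; the names `in_subset` and `plan_jobs` are hypothetical helpers for illustration, not part of the SPICE tooling, and no DFT calculation is run here.

```python
# Elements allowed in the subset, as listed in the description above.
ALLOWED_ELEMENTS = frozenset({"H", "C", "N", "O", "F", "P", "S", "Cl", "Br", "I"})

# Strict upper bound: keep molecules with fewer than 50 atoms.
MAX_ATOMS = 50

# Triple-zeta basis sets named in the goal.
BASIS_SETS = [
    "cc-pVTZ",
    "6-311G",
    "def2-TZVP",
    "def2-TZVPD",
    "def2-TZVPP",
    "def2-TZVPPD",
]


def in_subset(symbols):
    """Return True if a molecule (list of element symbols) meets the criteria."""
    return 0 < len(symbols) < MAX_ATOMS and set(symbols) <= ALLOWED_ELEMENTS


def plan_jobs(molecules):
    """Build one (molecule name, basis set) job per qualifying molecule and basis.

    `molecules` is assumed to be a dict mapping a name to a list of symbols.
    """
    return [
        (name, basis)
        for name, symbols in molecules.items()
        if in_subset(symbols)
        for basis in BASIS_SETS
    ]
```

Each returned pair would then be handed to whatever DFT driver is used; with six basis sets, every qualifying molecule produces six jobs per conformer.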
Simulate the largest molecule in the PCQ dataset with DFT at the B3LYP/6-31G* level. This should give roughly N=280.