arxiv:2605.14973

THEMol dataset: Torsion, Hessian, and Energy of Molecules

Published on May 14

Abstract

AI-generated summary: THEMol is a large-scale quantum mechanical dataset featuring diverse molecular systems with comprehensive conformational sampling and detailed energy landscape information.

We present THEMol (Torsion, Hessian, Energy of Molecules), a massive open-source collection of quantum mechanical properties for closed-shell organic molecules with up to 50 heavy atoms. THEMol includes a Hessian subset with more than 3 million relaxed geometries and their Hessian matrices, a TorsionScan subset with nearly 100 million constrained relaxed geometries with energies and forces, and relaxation-trajectory subsets (HessianRelax and TorsionScanRelax) that together comprise about 3 billion DFT calculations. The sampled chemical space spans twelve essential elements and diverse molecular architectures relevant to drug discovery, electrolytes, ionic liquids, and beyond. Conformational sampling is exhaustive: the TorsionScan and TorsionScanRelax subsets cover both in-ring and non-ring torsional scans. The Hessian matrices, computed at relaxed geometries, capture second-derivative information about the potential energy landscape. We also supply electron density-derived atomic multipoles computed via the Minimal Basis Iterative Stockholder (MBIS) partition scheme. Organized into five distinct subsets (Hessian, TorsionScan, HessianRelax, TorsionScanRelax, and MBIS), the data encompasses optimized geometries, relaxation trajectories, and derived molecular properties. We anticipate that this dataset will significantly aid the development of highly accurate and transferable molecular potentials.
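For readers unfamiliar with torsional scans, the sketch below shows the quantity such a scan constrains: the dihedral angle defined by four atoms. It is a minimal illustration only, not code from THEMol. The function names `dihedral_deg` and `scan_grid` and the 15-degree step are hypothetical choices made here; the abstract does not state the dataset's actual scan resolution, file format, or sign convention.

```python
import math


def _sub(a, b):
    """Component-wise difference a - b of two 3-vectors."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    """Dot product of two 3-vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    """Cross product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_deg(p0, p1, p2, p3):
    """Dihedral angle in degrees for atoms p0-p1-p2-p3 (rotation about p1-p2)."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    # atan2 form is numerically stable and returns a signed angle in (-180, 180].
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def scan_grid(step_deg=15.0):
    """Hypothetical target angles for a scan; the step size is an assumption."""
    n = int(round(360.0 / step_deg))
    return [-180.0 + i * step_deg for i in range(n)]


if __name__ == "__main__":
    # Toy geometry with the central bond along z (not taken from the dataset).
    a, b, c = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
    print(dihedral_deg(a, b, c, (1.0, 0.0, 1.0)))   # eclipsed (cis) arrangement
    print(dihedral_deg(a, b, c, (-1.0, 0.0, 1.0)))  # anti (trans) arrangement
    print(len(scan_grid()))
```

In a constrained relaxation of the kind the abstract describes, the dihedral would be held at each target value while the remaining coordinates are optimized. That optimization step requires a quantum chemistry package and is not shown here.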
