This file was generated on 2020-08-06 by Marcos Calegari Andrade


GENERAL INFORMATION

1. Title of Dataset: Deep Potential training data for subcritical and supercritical water

2. Author Information
    A. Principal Investigator Contact Information
        Name: Roberto Car
        Institution: Princeton University
        Address: Frick Chemistry Lab, Princeton, NJ, USA
        Email: rcar@princeton.edu

    B. Associate or Co-investigator Contact Information
        Name: Marcos F. Calegari Andrade
        Institution: Princeton University
        Address: Frick Chemistry Lab, Princeton, NJ, USA
        Email: mandrade@princeton.edu

    C. Alternate Contact Information
        Name: Hsin-Yu Ko
        Institution: Princeton University
        Address: Frick Chemistry Lab, Princeton, NJ, USA
        Email: hsinyu@princeton.edu

3. Date of data collection (YYYY-MM): 2020-02

4. Geographic location of data collection: Princeton, NJ, USA

5. Information about funding sources that supported the collection of the data: U.S. Department of Energy (DOE) under Grant No. DE-SC0019394


SHARING/ACCESS INFORMATION

1. Licenses/restrictions placed on the data: no restrictions.

2. Was data derived from another source? no

3. Recommended citation for this dataset: Carla Andreani, Giovanni Romanelli, Alexandra Parmentier, Roberto Senesi, Alexander I. Kolesnikov, Hsin-Yu Ko, Marcos F. Calegari Andrade and Roberto Car.


DATA & FILE OVERVIEW

1. File List:
    - raw_files/24_atoms/box.raw
    - raw_files/24_atoms/coord.raw
    - raw_files/24_atoms/energy.raw
    - raw_files/24_atoms/force.raw
    - raw_files/24_atoms/type.raw
    - raw_files/96_atoms/box.raw
    - raw_files/96_atoms/coord.raw
    - raw_files/96_atoms/energy.raw
    - raw_files/96_atoms/force.raw
    - raw_files/96_atoms/type.raw
    - raw_files/raw_to_set.sh
    - train/1/graph1.pb
    - train/1/input.json
    - train/1/lcurve.out
    - train/1/sub_gpu
    - train/1/train.log
    - train/2/graph2.pb
    - train/2/input.json
    - train/2/lcurve.out
    - train/2/sub_gpu
    - train/2/train.log
    - train/3/graph3.pb
    - train/3/input.json
    - train/3/lcurve.out
    - train/3/sub_gpu
    - train/3/train.log


METHODOLOGICAL INFORMATION

1. Description of methods used for collection/generation of data:
    Data set used to train a Deep Potential (DP) model for subcritical and supercritical water. The training data contain atomic forces, potential energies, atomic coordinates and cell tensors. Energies and forces were evaluated with the SCAN density functional. Atomic configurations were extracted from DP molecular dynamics at P = 250 bar and T = 553, 623, 663, 733 and 823 K. The input files used to train the DP model are also provided.

2. Methods for processing the data:
    To train a deep neural network (DNN) potential, execute the following commands:

        cd raw_files/24_atoms
        ../raw_to_set.sh
        cd ../96_atoms
        ../raw_to_set.sh
        cd ../../train/1
        python -m deepmd train input.json

    Frozen graphs are already provided in the train/1, train/2 and train/3 folders. You can use the LAMMPS code to run DPMD simulations with these potentials. For more details, please visit: https://github.com/deepmodeling/deepmd-kit

3. Instrument- or software-specific information needed to interpret the data:
    DeePMD-kit: https://github.com/deepmodeling/deepmd-kit
    LAMMPS: https://lammps.sandia.gov/


DATA-SPECIFIC INFORMATION FOR: box.raw

1. Number of variables: 9 per line

2. Number of cases/rows: the number of lines is equal to the total number of atomic configurations in the data.

3. Variable List: each line contains the cell tensor (3x3 matrix) in the following order:
    a(1,1) a(1,2) a(1,3) a(2,1) a(2,2) a(2,3) a(3,1) a(3,2) a(3,3)
    All values are in angstrom units.


DATA-SPECIFIC INFORMATION FOR: coord.raw

1. Number of variables: 3N numbers per line. N is the total number of atoms.

2. Number of cases/rows: the number of lines is equal to the total number of atomic configurations in the data. Same number of lines as in box.raw.

3. Variable List: each line contains the Cartesian coordinates of all atoms in the system, in the following order:
    P(1,1) P(1,2) P(1,3) P(2,1) ... P(N,1) P(N,2) P(N,3)
    The coordinates are in angstrom units. Atom indices 1 to N must follow the same order as in the file type.raw.


DATA-SPECIFIC INFORMATION FOR: force.raw

1. Number of variables: 3N numbers per line. N is the total number of atoms.

2. Number of cases/rows: the number of lines is equal to the total number of atomic configurations in the data. Same number of lines as in box.raw.

3. Variable List: each line contains the atomic forces of all atoms in the system, computed with the SCAN density functional, in the following order:
    F(1,1) F(1,2) F(1,3) F(2,1) ... F(N,1) F(N,2) F(N,3)
    The forces are in eV/angstrom units. Atom indices 1 to N must follow the same order as in the file type.raw.


DATA-SPECIFIC INFORMATION FOR: energy.raw

1. Number of variables: 1 value per line.

2. Number of cases/rows: the number of lines is equal to the total number of atomic configurations in the data. Same number of lines as in box.raw.

3. Variable List: each line contains the potential energy of the system computed with the SCAN density functional. Energy is in eV units.


DATA-SPECIFIC INFORMATION FOR: type.raw

1. Number of variables: N integers in one line. N is the total number of atoms.

2. Number of cases/rows: one line.

3. Variable List: contains the order of atomic types. This order has to be followed exactly in coord.raw and force.raw. Oxygen is represented as "0" and hydrogen is represented as "1".


DATA-SPECIFIC INFORMATION FOR: input.json

DeePMD-kit input file used to train a DNN potential. More information about this file can be found at https://github.com/deepmodeling/deepmd-kit.


DATA-SPECIFIC INFORMATION FOR: lcurve.out

DeePMD-kit output file obtained during a DNN potential training. More information about this file can be found at https://github.com/deepmodeling/deepmd-kit.


DATA-SPECIFIC INFORMATION FOR: train.log

DeePMD-kit output file obtained during a DNN potential training. More information about this file can be found at https://github.com/deepmodeling/deepmd-kit.


DATA-SPECIFIC INFORMATION FOR: graph?.pb (? = 1,2,3)

Deep Potential model (frozen graph) used to run DPMD simulations. More information about this file can be found at https://github.com/deepmodeling/deepmd-kit.


DATA-SPECIFIC INFORMATION FOR: sub_gpu

Example submission script we use to run the Deep Potential training.
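

EXAMPLE: CHECKING THE RAW FILE LAYOUT

The layouts of box.raw, coord.raw, force.raw, energy.raw and type.raw described above can be checked before running raw_to_set.sh. The following is a minimal sketch, not part of the dataset or of DeePMD-kit. It assumes the files are whitespace-separated plain text, and the helper names read_rows and load_raw_dir are hypothetical, introduced here only for illustration.

```python
import os


def read_rows(path, cast=float):
    """Return one list of numbers per non-empty line of a whitespace-separated file."""
    with open(path) as handle:
        return [[cast(tok) for tok in line.split()] for line in handle if line.strip()]


def load_raw_dir(directory):
    """Load the five .raw files of one folder and check the layout described above.

    Assumption: type.raw is a single line of N integers, as stated in this README.
    """
    types = read_rows(os.path.join(directory, "type.raw"), cast=int)
    if len(types) != 1:
        raise ValueError("type.raw must contain exactly one line")
    atom_types = types[0]
    n_atoms = len(atom_types)

    # Expected number of values per line: 9 for the cell tensor,
    # 3N for coordinates and forces, 1 for the energy.
    expected_width = {
        "box": 9,
        "coord": 3 * n_atoms,
        "force": 3 * n_atoms,
        "energy": 1,
    }

    data = {"type": atom_types}
    for name, width in expected_width.items():
        rows = read_rows(os.path.join(directory, name + ".raw"))
        for i, row in enumerate(rows, start=1):
            if len(row) != width:
                raise ValueError(
                    f"{name}.raw line {i}: expected {width} values, got {len(row)}"
                )
        data[name] = rows

    # Every file must describe the same number of atomic configurations.
    n_frames = len(data["box"])
    for name in ("coord", "force", "energy"):
        if len(data[name]) != n_frames:
            raise ValueError(
                f"{name}.raw has {len(data[name])} lines, box.raw has {n_frames}"
            )
    return data
```

Under these assumptions, calling load_raw_dir("raw_files/24_atoms") from the top-level folder should return a dictionary with one entry per file, or raise a ValueError naming the first file and line that does not match the documented layout.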