This DPMDMBpolreadme.txt was generated on 2023-09-10

GENERAL INFORMATION

1. Title of Dataset: Data from "A Neural Network Water Model Based on the MB-pol Many-Body Potential"

2. Author Information

	Name: Maria Carolina Muniz
	Institution: Princeton University Department of Chemical and Biological Engineering
	Address: Princeton, NJ 08544, USA
	Email: mcmuniz@princeton.edu

	Name: Roberto Car
	Institution: Princeton University Department of Chemistry
	Address: Princeton, NJ 08544, USA
	Email: rcar@princeton.edu

	Name: Athanassios Z. Panagiotopoulos
	Institution: Princeton University Department of Chemical and Biological Engineering
	Address: Princeton, NJ 08544, USA
	Email: azp@princeton.edu

3. Date of data collection: 2021-04 to 2023-09

4. Geographic location of data collection: Princeton, NJ, USA

5. Information about funding sources that supported the collection of the data: Computational resources were provided by Princeton Research Computing, a consortium of groups including the Princeton Institute for Computational Science and Engineering (PICSciE) and the Office of Information Technology's High Performance Computing Center and Visualization Laboratory at Princeton University. This work was supported by the "Chemistry in Solution and at Interfaces" (CSI) Center funded by the U.S. Department of Energy Award DE-SC001934. Additional support was provided by U.S. Department of Energy Award DE-SC0002128.

SHARING/ACCESS INFORMATION

1. Licenses/restrictions placed on the data: CC-BY 4.0

2. Links to publications that cite or use the data: M. C. Muniz, R. Car and A. Z. Panagiotopoulos. A Neural Network Water Model Based on the MB-pol Many-Body Potential. Submitted to Journal of Physical Chemistry B. (2023). Link to the article will be added upon publication.

3. Links to other publicly accessible locations of the data: N/A

4. Links/relationships to ancillary data sets: N/A

5. Recommended citation for this dataset: M. C. Muniz, R. Car and A. Z. Panagiotopoulos.
Data from "A Neural Network Water Model Based on the MB-pol Many-Body Potential". Princeton DataSpace. Deposited September 2023.

DATA & FILE OVERVIEW

1. File List:

	potentials -- protobuf (.pb) files that represent each of the models trained in this project and can be used in simulations
	simulations -- input files and data files to reproduce simulations performed in this project
	training -- input files and training data used for training of each machine learning model

2. Relationship between files, if important: N/A

3. Additional related data collected that was not included in the current data package: N/A

4. Are there multiple versions of the dataset? No

METHODOLOGICAL INFORMATION

1. Description of methods used for collection/generation of data: Molecular simulations were performed using the DeePMD-kit (https://github.com/deepmodeling/deepmd-kit) interfaced with the LAMMPS molecular simulation software (https://lammps.sandia.gov/). Detailed methods are as described in M. C. Muniz, R. Car and A. Z. Panagiotopoulos. A Neural Network Water Model Based on the MB-pol Many-Body Potential. Submitted to Journal of Physical Chemistry B. (2023). Link to the article will be added upon publication.

2. Methods for processing the data: Data can be analyzed with Python scripts

3. Instrument- or software-specific information needed to interpret the data:

	Python v.2.7.15
	NumPy v.1.16.5
	SciPy v.1.2.1
	Matplotlib v.2.2.5
	MDAnalysis v.0.20.1

4. Standards and calibration information, if appropriate: N/A

5. Environmental/experimental conditions: N/A

6. Describe any quality-assurance procedures performed on the data: N/A

7. People involved with sample collection, processing, analysis and/or submission: N/A

DATA-SPECIFIC INFORMATION FOR EACH DIRECTORY:

** OVERVIEW **

Each directory contains potential files, input files, and training data sets from simulations performed in this work.

** potentials **

The files graph-model1.pb and graph-model2.pb are the protobuf (.pb) files of models 1 and 2, respectively.

** simulations **

This directory is divided into 2 subdirectories: liquid_density and vle. Each one contains input files and data files to initiate simulations discussed in this project.

** training **

This directory is divided into 2 subdirectories: model1 and model2. Each of these directories contains the training data and input files used to train the machine learning models described in this project.
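
As a convenience, the directory layout described above can be checked after download with a short Python script. This is only a minimal sketch: it assumes that potentials, simulations, and training sit directly under the download root, and the names EXPECTED_PATHS and missing_paths are hypothetical helpers, not part of the dataset. It uses only the standard library and checks only the files and subdirectories named in this README.

```python
import os

# Paths named in this README, relative to the dataset root.
# Assumption: the three top-level directories sit directly under the root.
EXPECTED_PATHS = [
    os.path.join("potentials", "graph-model1.pb"),
    os.path.join("potentials", "graph-model2.pb"),
    os.path.join("simulations", "liquid_density"),
    os.path.join("simulations", "vle"),
    os.path.join("training", "model1"),
    os.path.join("training", "model2"),
]


def missing_paths(root):
    """Return the expected files/directories that do not exist under root."""
    return [
        path
        for path in EXPECTED_PATHS
        if not os.path.exists(os.path.join(root, path))
    ]
```

Calling missing_paths(".") from the download root returns an empty list when every item listed in this README is present; otherwise it returns the relative paths that were not found.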