Welcome to qmflows’ documentation!
Contents:

QMFlows
See documentation for tutorials and documentation.
Motivation
Research on modern computational quantum chemistry relies on a set of computational tools to carry out calculations. The complexity of the calculations usually requires intercommunication between the aforementioned tools, such communication is usually done through shell scripts that try to automate input/output actions like: launching the computations in a cluster, reading the resulting output and feeding the relevant numerical result to another program. Such scripts are difficult to maintain and extend, requiring a significant programming expertise to work with them. Being then desirable a set of automatic and extensible tools that allows to perform complex simulations in heterogeneous hardware platforms.
This library tackles the construction and efficient execution of computational chemistry workflows. This allows computational chemists to use the emerging massively parallel compute environments in an easy manner and focus on interpretation of scientific data rather than on tedious job submission procedures and manual data processing.
Description
This library consists of a set of modules written in Python3 to automate the following tasks:
Tutorial and Examples
A tutorial written as a jupyter-notebook is available from: tutorial-qmflows. You can also access direclty more advanced examples.
Installation
Download miniconda for python3: miniconda (also you can install the complete anaconda version).
Install according to: installConda.
Create a new virtual environment using the following commands:
conda create -n qmflows
Activate the new virtual environment
source activate qmflows
To exit the virtual environment type source deactivate
.
Dependencies installation
Type in your terminal:
conda activate qmflows
Using the conda environment the following packages should be installed:
Package installation
Finally install the package:
Install QMFlows using pip: -
pip install qmflows
Now you are ready to use qmflows.
Notes:
Once the libraries and the virtual environment are installed, you only need to type
conda activate qmflows
each time that you want to use the software.
Starting the environment
Once QMflows has been installed the user should run the following command to initialize the environment:
[user@int1 ~]$ source activate qmflows
discarding /home/user/anaconda3/bin from PATH
prepending /home/user/anaconda3/envs/qmflows/bin to PATH
(qmflows)[user@int1 ~]$ python --version
Python 3.5.2 :: Anaconda custom (64-bit)
To leave the environment the following command is used
(qmflows)[user@int1 ~]$ source deactivate
discarding /home/user/anaconda3/envs/qmflows/bin from PATH
To finalize preparations before running QMflows: if you don’t want the results to end up in the current work directory, create a new results folder.
[26]:
mkdir tutorial_results
What is QMflows?
QMflows is a python library that enables executing complicated workflows of interdependent quantum chemical (QM) calculations in python. It aims at providing a common interface to multiple QM packages, enabling easy and systematic generation of the calculation inputs, as well as facilitating automatic analysis of the results. Furthermore it is build on top of the powerful Noodles framework for executing the calculations in parallel where possible.
The basics: calling packages
Currently QMFLOWS offers an interface with the following simulation software:
SCM (ADF and DTFB)
CP2K
ORCA
Note:
Please make sure that the packages you want to use in QMflows are installed and active; in most supercomputer the simulation package are available using a command like (consult your system administrator):
load module superAwesomeQuantumPackage/3.1421
Also some simulation packages required that you configure a scratch folder. For instance Orca requires a SCR folder to be defnied while ADF called it SCM_TMPDIR.
With qmflows
you can write a python script that simply calls one of the package objects adf, dftb, cp2k, or orca
. As arguments to the call, you need to provide a settings
objects defining the input of a calculation, a molecular geometry, and, optionally, a job name that enables you to find back the “raw” data of the calculation later on.
Let’s see how this works:
First we define a molecule, for example by reading one from an xyz file:
[7]:
from scm.plams import Molecule
acetonitrile = Molecule("../test/test_files/molecules/acetonitrile.xyz")
print(acetonitrile)
Atoms:
1 C 2.366998 0.579794 -0.000000
2 C 1.660642 1.834189 0.000000
3 N 1.089031 2.847969 0.000000
4 H 2.100157 0.010030 0.887206
5 H 2.100157 0.010030 -0.887206
6 H 3.439065 0.764079 -0.000000
Then we can perform geometry optimization on the molecule by a call to the dftb package object:
[8]:
from qmflows import dftb, templates, run
job = dftb(templates.geometry, acetonitrile, job_name="dftb_geometry_optimization")
print(job)
<noodles.interface.decorator.PromisedObject object at 0x7ff8fe871da0>
As you can see, “job” is a so-called “promised object”. It means it first needs to be “run” by the Noodles scheduler to return a normal python object.
[29]:
result = run(job, path="tutorial_results", folder="run_one", cache="tutorial_cache.json")
print(result)
[09:14:04] PLAMS working folder: /home/lars/workspace/qmflows/jupyterNotebooks/tutorial_results/run_one
╭─(running jobs)
│ Running dftb dftb_geometry_optimization...
╰(✔)─(success)
<qmflows.packages.SCM.DFTB_Result object at 0x7f6c8e30bcf8>
We can easily retrieve the calculated properties from the DFTB calculation such as the dipole or the optimized geometry for use in subsequent calculations.
[30]:
print("Dipole: ", result.dipole)
print(result.molecule)
Dipole: [1.0864213029, -1.9278296041, -0.0]
Atoms:
1 C 2.366998 0.579794 -0.000000
2 C 1.660642 1.834189 0.000000
3 N 1.089031 2.847969 0.000000
4 H 2.100157 0.010030 0.887206
5 H 2.100157 0.010030 -0.887206
6 H 3.439065 0.764079 -0.000000
Settings and templates
In the above example templates.geometry
was actually a predefined Settings object. You can define and manipulate Settings in a completely flexible manner as will be explained in this section. To facilitate combining different packages in one script, QMflows defines a set of commonly used generic keywords, which can be combined with package specific keywords, to provide maximum flexibility.
[10]:
from qmflows import Settings
s = Settings()
s.basis = "DZP"
s.specific.adf.basis.core = "large"
s.freeze = [1,2,3]
print(s)
basis: DZP
specific:
adf:
basis:
core: large
freeze: [1, 2, 3]
This code snippet illustrates that the Settings
can be specified in two ways, using generic or specific keywords. Generic keywords represent input properties that are present in most simulation packages like a basis set while specific keywords allow the user to apply specific keywords for a package that are not in a generic dictionary.
Expert info:
Settings are a subclass of python dictionaries to represent herarchical structures, like
[3]:
from IPython.display import Image
Image(filename="files/simpleTree.png")
[3]:

In QMflows/PLAMS multiple settings objects can be combined using the
overlay
function.
[11]:
merged_settings = templates.geometry.overlay(s)
print(merged_settings)
specific:
adf:
basis:
type: SZ
core: large
geometry:
optim: delocal
numericalquality: good
scf:
converge: 1e-6
iterations: 100
ams:
ams:
GeometryOptimization:
MaxIterations: 500
Task: GeometryOptimization
cp2k:
force_eval:
dft:
mgrid:
cutoff: 400
ngrids: 4
qs:
method: gpw
scf:
OT:
N_DIIS: 7
minimizer: DIIS
preconditioner: full_single_inverse
eps_scf: 1e-06
max_scf: 200
scf_guess: atomic
subsys:
cell:
periodic: xyz
global:
print_level: low
project: cp2k
run_type: geometry_optimization
motion:
geo_opt:
max_iter: 500
optimizer: bfgs
type: minimization
cp2k_mm:
force_eval:
method: FIST
mm:
forcefield:
do_nonbonded:
ei_scale14: 1.0
ignore_missing_critical_params:
shift_cutoff: .TRUE.
spline:
r0_nb: 0.25
vdw_scale14: 1.0
poisson:
ewald:
ewald_type: NONE
periodic: NONE
print:
ff_info low:
spline_data: .FALSE.
spline_info: .FALSE.
subsys:
cell:
abc: [angstrom] 50.0 50.0 50.0
periodic: NONE
topology:
center_coordinates:
center_point: 0.0 0.0 0.0
coord_file_format: OFF
global:
print_level: low
project: cp2k
run_type: geometry_optimization
motion:
geo_opt:
max_iter: 500
optimizer: lbfgs
type: minimization
dftb:
dftb:
resourcesdir: DFTB.org/3ob-3-1
task:
runtype: GO
orca:
basis:
basis: sto_sz
method:
functional: lda
method: dft
runtyp: opt
basis: DZP
freeze: [1, 2, 3]
The overlay method merged the template containing default settings for geometry optimizations with different packages with the arguments provided by the user
[2]:
Image(filename="files/merged.png")
[2]:

resulting in:
[3]:
Image(filename="files/result_merged.png")
[3]:

Note that the generic and specific keywords still exist next to each other and may not be consistent (e.g. different basis sets are defined in generic and specific keywords). Upon calling a package with a Settings object, the generic keywords are first translated into package specific keywords and combined with the relevant user defined specific keywords. In this step, the settings defined in generic keywords take preference. Subsequently, the input file(s) for the given package is/are generated, based on the keywords after specific.[package] based on the PLAMS software.
[12]:
from qmflows import adf
print(adf.generic2specific(merged_settings))
basis: DZP
freeze: [1, 2, 3]
specific:
adf:
basis:
core: large
type: DZP
geometry:
optim: cartesian
numericalquality: good
scf:
converge: 1e-6
iterations: 100
constraints:
atom 1:
atom 2:
atom 3:
ams:
ams:
GeometryOptimization:
MaxIterations: 500
Task: GeometryOptimization
cp2k:
force_eval:
dft:
mgrid:
cutoff: 400
ngrids: 4
qs:
method: gpw
scf:
OT:
N_DIIS: 7
minimizer: DIIS
preconditioner: full_single_inverse
eps_scf: 1e-06
max_scf: 200
scf_guess: atomic
subsys:
cell:
periodic: xyz
global:
print_level: low
project: cp2k
run_type: geometry_optimization
motion:
geo_opt:
max_iter: 500
optimizer: bfgs
type: minimization
cp2k_mm:
force_eval:
method: FIST
mm:
forcefield:
do_nonbonded:
ei_scale14: 1.0
ignore_missing_critical_params:
shift_cutoff: .TRUE.
spline:
r0_nb: 0.25
vdw_scale14: 1.0
poisson:
ewald:
ewald_type: NONE
periodic: NONE
print:
ff_info low:
spline_data: .FALSE.
spline_info: .FALSE.
subsys:
cell:
abc: [angstrom] 50.0 50.0 50.0
periodic: NONE
topology:
center_coordinates:
center_point: 0.0 0.0 0.0
coord_file_format: OFF
global:
print_level: low
project: cp2k
run_type: geometry_optimization
motion:
geo_opt:
max_iter: 500
optimizer: lbfgs
type: minimization
dftb:
dftb:
resourcesdir: DFTB.org/3ob-3-1
task:
runtype: GO
orca:
basis:
basis: sto_sz
method:
functional: lda
method: dft
runtyp: opt
In the case of adf the above keywords result in the following input file for ADF package:
[34]:
adf_job = adf(merged_settings, acetonitrile, job_name='adf_acetonitrile')
result = run(adf_job, path="tutorial_results",
folder="run_two", cache="tutorial_cache.json")
print(open('tutorial_results/run_two/adf_acetonitrile/adf_acetonitrile.in').read())
[09:14:04] PLAMS working folder: /home/lars/workspace/qmflows/jupyterNotebooks/tutorial_results/run_two
╭─(running jobs)
│ Running adf adf_acetonitrile...
(✔)╰─(success)
atoms
1 C 2.419290 0.606560 0.000000
2 C 1.671470 1.829570 0.000000
3 N 1.065290 2.809960 0.000000
4 H 2.000000 0.000000 1.000000
5 H 2.000000 0.000000 -1.000000
6 H 3.600000 0.800000 0.000000
end
basis
core large
type DZP
end
constraints
atom 2
atom 3
atom 4
end
geometry
optim cartesian
end
integration
accint 6.0
end
scf
converge 1e-06
iterations 100
end
xc
lda
end
end input
Combining multiple jobs
Multiple jobs can be combined, while calling the run function only once. The script below combines components outlined above:
[35]:
from scm.plams import Molecule
from qmflows import dftb, adf, templates, run, Settings
acetonitrile = Molecule("files/acetonitrile.xyz")
dftb_opt = dftb(templates.geometry, acetonitrile, job_name="dftb_opt")
s = Settings()
s.basis = "DZP"
s.specific.adf.basis.core = "large"
adf_single = adf(templates.singlepoint.overlay(s), dftb_opt.molecule, job_name="adf_single")
adf_result = run(adf_single, path="tutorial_results", folder="workflow", cache="tutorial_cache.json")
print(adf_result.molecule)
print(adf_result.energy)
[09:15:08] PLAMS working folder: /home/lars/workspace/qmflows/jupyterNotebooks/tutorial_results/workflow
╭─(running jobs)
│ Running dftb dftb_opt...
(✔)│ Running adf adf_single...
(✔)╰─(success)
Atoms:
1 C 0.000000 0.000000 0.656511
2 C 0.000000 0.000000 -0.783088
3 N 0.000000 0.000000 -1.946913
4 H -0.512221 -0.887193 1.022016
5 H 1.024442 0.000000 1.022016
6 H -0.512221 0.887193 1.022016
-1.4094874734528888
In this case the second task adf_single reads the molecule optimized in the first job dftb_opt. Note that dftb_opt as well as dftb_opt.molecule are promised objects. When run is applied to the adf_single job, noodles builds a graph of dependencies and makes sure all the calculations required to obtain adf_result are performed.
All data related to the calculations, i.e. input files generated by QMflows and the resulting output files generated by the QM packages are stored in folders named after the job_names, residing inside a results folder:
[36]:
ls tutorial_results
run_one/ run_two/ workflow/
[37]:
ls tutorial_results/workflow
adf_single/ dftb_opt/ workflow.log
[38]:
ls tutorial_results/workflow/adf_single
adf_single.dill adf_single.in adf_single.run* logfile t21.H
adf_single.err adf_single.out adf_single.t21 t21.C t21.N
[ ]:
Further Reading
Generic Keywords
Quantum chemistry packages use gaussian type orbitals (GTO) or slater type orbitals (STO) to perform the simulation. The packages use the same standards for the basis set and it will be really handy if we can defined a “generic” keyword for basis sets. Fortunately qmflows
already offers such keyword that can be used among the packages that use the same basis standard,
[11]:
from qmflows import Settings
s = Settings()
s.basis = "DZP"
Internally QMflows will create a hierarchical structure representing basis DZP for the packages that can handle that basis set. Other generic keyowrds like: functional
, inithess
, etc. have been implemented.
The Following table describes some of the available generic keywords and the Packages where the funcionality is implemented
Generic Keyword |
Packages |
Description |
---|---|---|
basis |
ADF, CP2K, Orca |
Set the basis set |
potential |
CP2K |
Set the pseudo potential |
cell_angles |
CP2K |
Specified the angles of the unit cell |
cell_parameters |
CP2K |
Specified the vectors of the unit cell |
constraint |
ADF, Orca |
Constrain the distance, angle or dihedral angle for a set of molecules |
freeze |
ADF, Orca |
Freeze a set of atoms indicated by their indexes or symbols |
functional |
ADF, CP2K |
Set the DFT functional |
inithess |
ADF, Orca |
Provide an initial Hessian matrix |
optimize |
ADF, DFTB, Orca |
Perform a molecular optimization |
selected_atoms |
ADF, Orca |
Optimize the given set of atoms while keeping the rest fixed |
ts |
ADF, DFTB, Orca |
Carry out a transition state optimization |
Note: QMflows Does not have chemical intuition and if you provide a meaningless keyword, like a wrong basis set it will not warm you.
Templates
As has been shown so far, Settings can be specified in two ways: generic or specific. Generic keywords represent input properties that are present in most simulation packages like a basis set while specific keywords resemble the input structure of a given package.
Generic and Specific Settings can express both simple and complex simulation inputs, but it would be nice if we can pre-defined a set of templates for the most common quantum chemistry simulations like: single point calculations, geometry optimizations, transition state optimization, frequency calculations, etc. qmflows already has a pre-defined set of templates containing some defaults that the user can modify for her/his own purpose. Templates
are stored inside the
qmflows.templates
module and are load from JSON files. A JSON file is basically a nested dictionary that is translated to a Settings
object by qmflows.
Below it is shown the defaults for single point calculation
[1]:
from qmflows import templates
templates.singlepoint
RDKit WARNING: [16:43:01] Enabling RDKit 2019.09.1 jupyter extensions
[1]:
specific:
adf:
basis:
type: SZ
numericalquality: normal
scf:
converge: 1e-6
iterations: 100
ams:
ams:
Task: SinglePoint
dftb:
dftb:
resourcesdir: DFTB.org/3ob-3-1
task:
runtype: SP
cp2k:
force_eval:
dft:
mgrid:
cutoff: 400
ngrids: 4
print:
mo:
add_last: numeric
each:
qs_scf: 0
eigenvalues:
eigenvectors:
filename: ./mo.data
ndigits: 36
occupation_numbers:
qs:
method: gpw
scf:
eps_scf: 1e-06
max_scf: 200
scf_guess: restart
subsys:
cell:
periodic: xyz
global:
print_level: low
project: cp2k
run_type: energy
cp2k_mm:
force_eval:
method: FIST
mm:
print:
ff_info low:
spline_data: .FALSE.
spline_info: .FALSE.
forcefield:
ei_scale14: 1.0
vdw_scale14: 1.0
ignore_missing_critical_params:
do_nonbonded:
shift_cutoff: .TRUE.
spline:
r0_nb: 0.25
poisson:
periodic: NONE
ewald:
ewald_type: NONE
subsys:
cell:
abc: [angstrom] 50.0 50.0 50.0
periodic: NONE
topology:
coord_file_format: OFF
center_coordinates:
center_point: 0.0 0.0 0.0
global:
print_level: low
project: cp2k
run_type: energy
orca:
method:
method: dft
functional: lda
basis:
basis: sto_sz
_ipython_canary_method_should_not_exist_:
The question is then, how I can modify a template with my own changes?
Suppose you are perfoming a bunch of constrained DFT optimizations using ADF
. You need first to define a basis set and the constrains.
[3]:
from qmflows import Settings
s = Settings()
# Basis
s.basis = "DZP"
s.specific.adf.basis.core = "Large"
# Constrain
s.freeze = [1, 2, 3]
We use two generic keywords: freeze
to indicate a constrain and basis
to provide the basis set. Also, we introduce an specific ADF
keywords core = Large
. Now you merge your Settings with the correspoding template to carry out molecular geometry optimizations, using a method called overlay
.
[4]:
from qmflows import templates
inp = templates.geometry.overlay(s)
The overlay
method takes as input a template containing a default set for different packages and also takes the arguments provided by the user, as shown schematically
This overlay
method merged the defaults for a given packages (ADF in this case) with the input supplied by the user, always given preference to the user input
Below it is shown a combination of templates, generic and specific keywords to generate the input for a CP2K
job
[2]:
from qmflows import templates
# Template
s = templates.singlepoint
# Generic keywords
s.cell_angles = [90.0, 90.0, 90.0]
s.cell_parameters= 38.0
s.basis = 'DZVP-MOLOPT-SR-GTH'
s.potential ='GTH-PBE'
# Specific Keywords
s.specific.cp2k.force_eval.dft.scf.max_scf = 100
s.specific.cp2k.force_eval.subsys.cell.periodic = 'None'
print(s)
specific:
adf:
basis:
type: SZ
numericalquality: normal
scf:
converge: 1e-6
iterations: 100
ams:
ams:
Task: SinglePoint
dftb:
dftb:
resourcesdir: DFTB.org/3ob-3-1
task:
runtype: SP
cp2k:
force_eval:
dft:
mgrid:
cutoff: 400
ngrids: 4
print:
mo:
add_last: numeric
each:
qs_scf: 0
eigenvalues:
eigenvectors:
filename: ./mo.data
ndigits: 36
occupation_numbers:
qs:
method: gpw
scf:
eps_scf: 1e-06
max_scf: 100
scf_guess: restart
subsys:
cell:
periodic: None
global:
print_level: low
project: cp2k
run_type: energy
cp2k_mm:
force_eval:
method: FIST
mm:
print:
ff_info low:
spline_data: .FALSE.
spline_info: .FALSE.
forcefield:
ei_scale14: 1.0
vdw_scale14: 1.0
ignore_missing_critical_params:
do_nonbonded:
shift_cutoff: .TRUE.
spline:
r0_nb: 0.25
poisson:
periodic: NONE
ewald:
ewald_type: NONE
subsys:
cell:
abc: [angstrom] 50.0 50.0 50.0
periodic: NONE
topology:
coord_file_format: OFF
center_coordinates:
center_point: 0.0 0.0 0.0
global:
print_level: low
project: cp2k
run_type: energy
orca:
method:
method: dft
functional: lda
basis:
basis: sto_sz
_ipython_canary_method_should_not_exist_:
cell_angles: [90.0, 90.0, 90.0]
cell_parameters: 38.0
basis: DZVP-MOLOPT-SR-GTH
potential: GTH-PBE
Molecule
The next component to carry out a simulation is a molecular geometry. qmflows offers a convinient way to read Molecular geometries using the Plams library in several formats like: xyz , pdb, mol, etc.
[6]:
from scm.plams import Molecule
acetonitrile = Molecule("../test/test_files/acetonitrile.xyz")
print(acetonitrile)
Atoms:
1 C 2.366998 0.579794 -0.000000
2 C 1.660642 1.834189 0.000000
3 N 1.089031 2.847969 0.000000
4 H 2.100157 0.010030 0.887206
5 H 2.100157 0.010030 -0.887206
6 H 3.439065 0.764079 -0.000000
You can also create the molecule one atom at a time
[7]:
from scm.plams import (Atom, Molecule)
m = Molecule()
m.add_atom(Atom(symbol='C', coords=(2.41929, 0.60656 , 0.0)))
m.add_atom(Atom(symbol='C', coords=(1.67147, 1.82957, 0.0)))
m.add_atom(Atom(symbol='N', coords=(1.06529, 2.80996, 0.0)))
m.add_atom(Atom(symbol='H', coords=(2.0, 0.0, 1.0)))
m.add_atom(Atom(symbol='H', coords=(2.0, 0.0, -1.0)))
m.add_atom(Atom(symbol='H', coords=(3.6, 0.8, 0.0)))
print(m)
Atoms:
1 C 2.419290 0.606560 0.000000
2 C 1.671470 1.829570 0.000000
3 N 1.065290 2.809960 0.000000
4 H 2.000000 0.000000 1.000000
5 H 2.000000 0.000000 -1.000000
6 H 3.600000 0.800000 0.000000
QMflows Can also handle smiles as shown below
[8]:
from scm.plams import from_smiles
# String representing the smile
smile = 'C1CC2(CCCCC2)C=C1'
#Molecule creation
mol = from_smiles(smile)
print(mol)
Atoms:
1 C -2.532311 0.731744 0.709758
2 C -1.368476 0.367070 1.231109
3 C -0.433052 -0.045396 0.145886
4 C -1.425943 -0.476305 -0.942143
5 C -2.482346 0.596323 -0.777123
6 C 0.340730 -1.284527 0.522195
7 C 1.554856 -1.363207 -0.361120
8 C 2.541783 -0.246962 -0.011749
9 C 1.783591 1.008298 0.379075
10 C 0.441973 1.070128 -0.323205
11 H -3.396188 1.077206 1.265628
12 H -1.110278 0.360527 2.283834
13 H -1.876171 -1.425792 -0.595758
14 H -0.977017 -0.510609 -1.929188
15 H -2.195853 1.551129 -1.261514
16 H -3.436929 0.219830 -1.185449
17 H 0.683363 -1.246115 1.584294
18 H -0.240309 -2.217667 0.374878
19 H 1.304291 -1.253517 -1.418776
20 H 2.005028 -2.361315 -0.225630
21 H 3.183193 -0.002291 -0.890513
22 H 3.139644 -0.569212 0.868506
23 H 1.573985 0.978425 1.469548
24 H 2.392337 1.915661 0.175061
25 H -0.024235 2.038641 -0.029064
26 H 0.554336 1.087931 -1.429810
Bonds:
(1)--2.0--(2)
(2)--1.0--(3)
(3)--1.0--(4)
(4)--1.0--(5)
(3)--1.0--(6)
(6)--1.0--(7)
(7)--1.0--(8)
(8)--1.0--(9)
(9)--1.0--(10)
(5)--1.0--(1)
(10)--1.0--(3)
(1)--1.0--(11)
(2)--1.0--(12)
(4)--1.0--(13)
(4)--1.0--(14)
(5)--1.0--(15)
(5)--1.0--(16)
(6)--1.0--(17)
(6)--1.0--(18)
(7)--1.0--(19)
(7)--1.0--(20)
(8)--1.0--(21)
(8)--1.0--(22)
(9)--1.0--(23)
(9)--1.0--(24)
(10)--1.0--(25)
(10)--1.0--(26)
The Molecule class has an extensive functionally to carry out molecular manipulations, for a comprenhesive disccusion about it have a look at the molecule documentation. Also the module scm.plams.interfaces.molecule.molkit
contains an extensive functionality to apply transformation over a molecule using the RDKit library.
Runinng a quantum mechanics simulation
We now have our components to perform a calculation: Settings and Molecule. We can now invoke a quantum chemistry package to perform the computation,
[9]:
from qmflows import adf
optmized_mol_adf = adf(inp, acetonitrile, job_name='acetonitrile_opt')
the previous code snippet does not execute the code immediatly, instead the simulation is started when the user invokes the run function, as shown below
from plams import Molecule
from qmflows import (adf, run, Settings, templates)
# Settings
s = templates.geometry
s.basis = "DZP"
s.specific.adf.basis.core = "Large"
s.freeze = [1, 2, 3]
# molecule
from plams import Molecule
acetonitrile = Molecule("acetonitrile.xyz")
# Job
optimized_mol_adf = adf(s, acetonitrile, job_name='acetonitrile_opt')
# run the job
result = run(optimized_mol_adf.molecule, folder='acetonitrile')
you can run the previous script by saving it in a file called acetonitrile_opt.py
and typing the following command in your console:
(qmflows)[user@int1 ~]$ python acetonitrile_opt.py
you will then see in your current work directory something similar to the following
(qmflows)[user@int1 ~]$ ls
acetonitrile acetonitrile_opt.py cache.json acetonitrile.xyz
acetonitrile
is the folder containing the output from the quantum package call (ADF
in this case). The cache.json
file contains all the information required to perform a restart, as we will explore below. Inside the acetonitrile
you can find the input/output files resulting from the simulation
(qmflows)[user@int1 ~]$ ls acetonitrile
acetonitrile.log acetonitrile_opt
(qmflows)[user@int1 ~]$ ls acetonitrile/acetonitrile_opt
acetonitrile_opt.dill acetonitrile_opt.out logfile t21.N
acetonitrile_opt.err acetonitrile_opt.run t21.C
acetonitrile_opt.in acetonitrile_opt.t21 t21.H
Extracting Properties
In general, properties are extracted using the standard Object.attribute
notation in python, as shown below.
result = optmized_mol_adf.molecule
Some of the available properties are shown in the following table,
Property |
type |
Description |
---|---|---|
dipole |
Double |
Molecular dipole mopment |
energy |
Double |
Total energy |
enthalpy |
Double |
Enthalpy |
gradient |
Numpy array |
First derivatives of the energy |
hessian |
Numpy array |
Second derivative of the energy |
molecule |
Molecule |
Object representing a physical entity |
runtime |
Double |
Time spent in the simulation |
On the background QMflows has a mechanism to read the properties from the output files and make them available inside Python.
Communicating different packages
We can use the previous optimized geometry for further calculations using for instance another package like Orca to run a frequencies calculation,
[20]:
from qmflows import orca
s2 = Settings()
s2.specific.orca.main = "freq"
s2.specific.orca.basis.basis = 'sto_sz'
s2.specific.orca.method.functional = 'lda'
s2.specific.orca.method.method = 'dft'
job_freq = orca(s2, optmized_mol_adf)
frequencies = job_freq.frequencies
The whole script is
from qmflows import (adf, orca, run, templates, Settings)
from plams import Molecule
import plams
def main():
s = templates.geometry
s.basis = "DZP"
s.specific.adf.basis.core = "large"
acetonitrile = Molecule("files/acetonitrile.xyz")
job = adf(inp, acetonitrile)
optmized_mol_adf = job.molecule
s2 = Settings()
s2.specific.orca.main = "freq"
s2.specific.orca.basis.basis = 'sto_sz'
s2.specific.orca.method.functional = 'lda'
s2.specific.orca.method.method = 'dft'
job_freq = orca(s2, optmized_mol_adf)
frequencies = job_freq.frequencies
print(run(frequencies))
Once you run the script an input file for the ADF and Orca jobs are created. The ADF input looks like
Atoms
1 C 2.419290 0.606560 0.000000
2 C 1.671470 1.829570 0.000000
3 N 1.065290 2.809960 0.000000
4 H 2.000000 0.000000 1.000000
5 H 2.000000 0.000000 -1.000000
6 H 3.600000 0.800000 0.000000
End
Basis
Type DZP
End
Constraints
Atom 1
Atom 2
Atom 3
End
Geometry
Optim cartesian
End
Integration
Accint 6.0
End
Scf
Converge 1e-06
Iterations 100
End
Running in a supercomputer
Running in Cartesius or Bazis through the Slurm resource manager can be done using and script like
#!/bin/bash
#SBATCH -t 00:10:00
#SBATCH -N 1
#SBATCH -n 8
module load orca
module load adf/2016.102
source activate qmflows
python optimization_ADF_freq_ORCA.py
The Slurm output looks like:
[21]:
"""
load orca/3.0.3 (PATH)
discarding /home/user/anaconda3/envs/qmflows/bin from PATH
prepending /home/user/anaconda3/envs/qmflows/bin to PATH
[11:17:59] PLAMS working folder: /nfs/home/user/orca/Opt/example/plams.23412
+-(running jobs)
| Running adf ...
[11:17:59] Job ADFjob started
[11:18:18] Job ADFjob finished with status 'successful'
[11:18:18] Job ORCAjob started
[11:18:26] Job ORCAjob finished with status 'successful'
[ 0. 0. 0. 0. 0. 0.
-360.547382 -360.14986 953.943089 954.3062 1049.2305
1385.756519 1399.961717 1399.979552 2602.599662 3080.45671
3175.710785 3177.612274]
"""
[21]:
"\nload orca/3.0.3 (PATH)\ndiscarding /home/user/anaconda3/envs/qmflows/bin from PATH\nprepending /home/user/anaconda3/envs/qmflows/bin to PATH\n[11:17:59] PLAMS working folder: /nfs/home/user/orca/Opt/example/plams.23412\n+-(running jobs)\n| Running adf ...\n[11:17:59] Job ADFjob started\n[11:18:18] Job ADFjob finished with status 'successful' \n[11:18:18] Job ORCAjob started\n[11:18:26] Job ORCAjob finished with status 'successful' \n\n[ 0. 0. 0. 0. 0. 0.\n -360.547382 -360.14986 953.943089 954.3062 1049.2305\n 1385.756519 1399.961717 1399.979552 2602.599662 3080.45671\n 3175.710785 3177.612274]\n"
How the run function works?
A little discussion about graphs
qmflows is meant to be used for both workflow generation and execution. When you write a python script representing a workflow you are explicitly declaring set of computations and their dependencies. For instance the following workflow represent ADF and Orca computations of the aforementioned example. In this graph the octagons represent quantum simulation using a package, while the ovals represent both user input or data extracted from a
simulation. Finally, the arrows (called edges) represent the dependencies between all these objects.
QMflows automatically identify the dependencies between computations and run them in the correct order (if possible in parallel).
Restarting a simulation
If you are running many computationally expensive calculations in a supercomputer, it can happen that the computations take more time than that allowed by the resource manager in your supercomputer and the workflows gets cancel. But do not worry, you do not need to re-run all the computations. Fortunately, QMflows offers a mechanism to restart the workflow computations.
When running a workflow you will see that QMflows creates a set of files called cache
. These files contain the information about the workflow and its calculation. In order to restart a workflow you only need to relaunch it, that’s it!
Advanced Examples
Conditional Workflows
from noodles import gather
from qmflows import dftb, adf, orca, run, Settings, templates, molkit, find_first_job
# This examples illustrates the possibility to use different packages interchangeably.
# Analytical frequencies are not available for B3LYP in ADF
# This workflow captures the resulting error and submits the same job to ORCA.
# Define the condition for a successful calculation
def is_successful(result):
return result.status not in ["failed", "crashed"]
# Generate water molecule
water = molkit.from_smiles('[OH2]', forcefield='mmff')
# Pre-optimize the water molecule
opt_water = dftb(
templates.geometry, water, job_name="dftb_geometry")
jobs = []
# Generate freq jobs for 3 functionals
for functional in ['pbe', 'b3lyp', 'blyp']:
s=Settings()
s.basis = 'DZ'
s.functional = functional
# Try to perform the jobs with adf or orca
# take result from first successful calculation
freqjob = find_first_job(
is_successful, [adf, orca], templates.freq.overlay(s),
opt_water.molecule, job_name=functional)
jobs.append(freqjob)
# Run workflow
results = run(gather(*jobs), n_processes=1)
After running the above script you have a table like
pbe 1533.267 3676.165 3817.097
b3lyp 1515.799 3670.390 3825.813
blyp 1529.691 3655.573 3794.110
Non-adiabatic couplings
qmflows-namd is a package based on QMflows to compute the Non-adiabatic couplings for large system involving thr use of QMflows, Cython and Numpy.
Exception Handling
Suppose you have a set of non-dependent calculations, for example single point calculations coming from a molecular dynamic trajectory, as shown in the figure below
If one of the single point calculations fails, the rest of the point in the workflow will keep on running and the failed job will return a None value for the requested property.
If the single point calculation would be the dependency of another quantum calculation then the computation will crash.
Library Documentation
For a more detailed description of QMFlows read the documentation
Settings
Settings
is a dict
subclass implemented in PLAMS and modified in Qmflows.
This class represents the data in a hierarchical tree-like structure. for example:
>>> from qmflows import Settings, templates
>>> s = Settings() # (1)
# generic keyword
>>> s.basis = "DZP" # (2)
# "specific" allows the user to apply specific keywords for a package
>>> s.specific.adf.basis.core = "large" # (3)
>>> input_settings = templates.singlepoint.overlay(s) # (4)
The above code snippet shows how to create a Settings
instance object in (1),
then in (2) the generic keyword basis declares that the “DZP” should be used together with the large keyword
of ADF as shown at (3).
Finally in line (4) the user’s keywords are merged with the defaults resultin in a input like:
basis
Core large
Type DZP
end
integration
accint 4.0
end
scf
converge 1e-06
iterations 100
wnd
xc
lda
end
API
- class qmflows.Settings(*args, **kwargs)[source]
A subclass of
plams.Settings
.The difference with respect to plams’ Settings are:
settings['a.b']
is equivalent tosettings['a']['b'] = settings.a.b
- __deepcopy__(_)[source]
Implement
copy.deepcopy(self)
.Serves as an alias for
Settings.copy()
.
- __delitem__(name)[source]
Implement
del self[name]
.
- __getitem__(name)[source]
Implement
self[name]
.
- __setitem__(name, value)[source]
Implement
self[name] = value
.Like its counterpart in
dict
but passed dictionaries are converted into instances oftype(self)
.
Templates
The input generations consist of two parts: chosing a template (see templates) for the kind of calculation to perform and adding some settings to that template. Notice that the user can either pick a specific package template or provides only generic keywords.
YAML
The YAML markdown format is used together with the yaml
module to implement the mechanism to load the templates.
For Example, the default parameter for a single point calculation for several package is:
specific:
adf:
basis:
type: SZ
numericalquality:
normal
scf:
converge: 1e-6
iterations: 100
ams:
ams:
Task: SinglePoint
dftb:
dftb:
resourcesdir:
"DFTB.org/3ob-3-1"
task:
runtype: SP
cp2k:
force_eval:
dft:
mgrid:
cutoff: 400
ngrids: 4
print:
mo:
add_last: numeric
each:
qs_scf: 0
eigenvalues: ""
eigenvectors: ""
filename: "./mo.data"
ndigits: 36
occupation_numbers: ""
qs:
method: gpw
scf:
eps_scf: 1e-06
max_scf: 200
scf_guess: restart
subsys:
cell:
periodic: xyz
global:
print_level: low
project: cp2k
run_type: energy
orca:
method:
method: dft
functional: lda
basis:
basis: sto_sz
Dictionaries
While templates
are used as defaults, the YAML files stored in the dictionaries folder are use for two mainly two purposes:
translation of the generic keywords and properties extraction.
to translate the generic keywords provided by the user to specific keywords used for each package.
For instance these files are used by the Package
class.
Packages
The base class Package
is the library core, it provides the general scaffold to call a quantum code.
On top of this infrastructure it has been created several subclasses that contain the specific details for each quantum code.
The available interfaces to quantum codes are:
This class must not be call directly, instead the correspoding class for the quantum package should be called or in case that there is not an interface to your quantum code, you can make a new subclass that implement the following method:
run_job
– This methods takes aSettings
object a molecule and call a function to create the input automatically and takes cares of the bookkeeping associated with creating new folders, calling the package and retrieving and Object-result.
Instead of implementing all the runners and the bookkeeping functions ourselves, we use the plams library.
For example to carry out a simulation using ADF we call plams as follows:
>>> from scm import plams
>>> mol = plams.Molecule(...)
>>> adf_settings = plams.Settings(...)
>>> result = plams.ADFJob(molecule=mol, settings=adf_settings).run()
Running a workflow
A workflow in QMFlows consist of a set of computations and the dependencies between them, explicitly declared by the user in the python script. This dependecies and the relation between them form an graph (specifically an acyclic direct graph) that represent these relations.
QMFlows Builds the aforemention graph in order to realize the workflow evaluation order.
For instance the figure below represent a simulation where firstly a molecular geometry optimization is carried out using the ADF package and
some user defined Settings
for the ADF simulation package.
Subsequently, using the optimized molecular geometry from the previous step and
another Settings
for an orca simulation a job to compute the molecular frequencies is carried out.

A python script corresponding with this graph can be
>>> from plams import Molecule
>>> from qmflows import (adf, orca, run, Settings)
>>> inp = Settings(...)
>>> acetonitrile = Molecule(...)
# ADF optimization
>>> optmized_mol_adf = adf(inp, acetonitrile, job_name='acetonitrile_opt')
# Orca Settings definition
>>> s2 = Settings()
>>> s2.specific.orca.main = "freq"
>>> s2.specific.orca.basis.basis = 'sto_sz'
>>> s2.specific.orca.method.functional = 'lda'
>>> s2.specific.orca.method.method = 'dft'
# Orca Frequencies
>>> job_freq = orca(s2, optmized_mol_adf)
# Extract the frequencies from the Orca job
>>> frequencies = job_freq.frequencies
# Run the graph
>>> result = run(frequencies)
>>> print(result)
Up to the invocation of the run()
function none of the computations have been executed,
it is the run()
function which builds and executes the dependencies.
Since QMFlows needs to figure out all the dependecies in the script,
the run()
function takes as argument last dependency (or inner most dependy),
which in this case are the frequencies. The reason behind this, is that from the last dependency it is possible to
retrace all the dependecies.
QMFlows uses the noodles library under the hook to takes care of the construction and execution of the dependecy graph.
Extracting numerical properties from output files
Quantum packages simulations generate output file in different formats. For examples the SCM simulation suite
(ADF
and DFTB
in QMFlows) generate binary outputs,
while other packages like CP2K
and ORCA
generate ascii text files.
QMFlows abstract away all the different commmunication protocols with the different output formats, allowing the user to extract the desire property by using the convention:
>>> result = job.property
where job is the simulation perform with a given package and property is the numerical value of interest (scalar or array).
The QMFlows implementation of the aforemention mechanism search in the YAML files located at qmflows/data/dictionaries/
for instructions about how to read that given property from the output file. Nevertheless, Not all the properties for a given
pacakge are implemented. If the property of your interest is not available you can request it in the Qmflows issues page.
Parsers
Internally QMFlows uses different mechanism to extract different properties from the output files. In the case of the ADF
and
DFTB
packages, QMFlows take advantages of the python interface to kftools files developed by the SCM. In the case of XML output,
QMFlows direcltly uses the python built-in xml reader. For the output files in text format Qmflows uses a mixuture of awk and
parsers.
Parsers are a robust alternative to regular expressions, parsers are modular and reusable, while
re
tends to be abstruse and difficult to reuse. A parser is a function that decomposes a string (or binary) into its syntactic components using some predefined rules or grammar.
The library pyparsing offers all the functionality to parse strings, some detail explanation about the library can be found at docs.
HDF5
HDF5 is a data model to store and represent complex numerical data in a hierarchical way. HDF5 is extremely optimized to perform fast I/O operations in large data set and it is implemented as a library with APIs to Python, C, C++, etc. The python interface is called pyh5 and it was designed to run along with Numpy as Scipy.
PackageWrapper
A module which adds the PackageWrapper
class.
The herein implemented class serves as a wrapper around the qmflows
Package
class,
taking a single plams.Job
type as argument
and, upon calling pkg = PackageWrapper(...); pkg()
an
appropiate instance of Package subclas instance is called.
For example, passing plams.ADFJob
will
automatically call adf
,
plams.Cp2kJob
will
call cp2k
, etc.
When no appropiate Package is found, let’s say after passing the MyFancyJob
type,
the PackageWrapper class will still run the job as usual and return the matching
ResultWrapper
object.
There are however three caveats:
No generic keywords are implemented for such jobs.
Specialized warning message will not be available.
The availability of property-extraction methods is limited.
The therein embedded results can be still extracted by calling the
plams.Results
methods appropiate to the passed job type,
e.g. plams.AMSResults
.
For example:
>>> from scm.plams import AMSJob, Molecule, Settings
>>> from qmflows import run, PackageWrapper
>>> from qmflows.packages import ResultWrapper
>>> mol = Molecule(...)
>>> settings = Settings(...)
>>> pkg = PackageWrapper(AMSJob)
>>> job = pkg(settings, mol, name='amsjob')
>>> result: ResultWrapper = run(job)
>>> energy = result.get_energy()
>>> mol = result.get_molecule()
>>> freq = result.get_frequencies()
Index
|
A |
|
Initialize this instance. |
|
(scheduled) If possible, call |
|
Run the job and pass the resulting |
Method not implemented. |
|
|
The matching |
Placeholder docstring for sphinx. |
API
- class qmflows.packages.PackageWrapper(job_type, name=None)[source]
A
Package
subclass for processing arbitraryplams.Job
types.Will automatically convert the passed Job type into the appropiate Package instance upon calling
PackageWrapper.__call__()
.Examples
>>> from scm.plams import ADFJob, AMSJob >>> from qmflows import PackageWrapper, run >>> from qmflows.packages import ResultWrapper, ADF_Result # Start of with two PackageWrapper instances >>> pkg_adf = PackageWrapper(ADFJob) >>> pkg_ams = PackageWrapper(AMSJob) # End up with two different Result instances >>> result_adf: ADF_Result = run(pkg_adf(...), ...) >>> result_ams: ResultWrapper = run(pkg_ams(...), ...)
- job_type
The to-be executed
plams.Job
type. Will be automatically translated, when possible, into to the appropiatePackage
instance upon callingPackageWrapper.__call__()
. If not, default to the more bare-bones implementation within this class and the matchingResultWrapper
instance.- Type:
See also
JOB_MAP
A dictionary mapping PLAMS Job types to appropiate QMFlows Package instances.
- PackageWrapper.__init__(job_type, name=None)[source]
Initialize this instance.
- Parameters:
job_type (
type[plams.Job]
) – The to-be executedplams.Job
type. Will be automatically translated, when possible, into to the appropiatePackage
instance upon callingPackageWrapper.__call__()
. If not, default to the more bare-bones implementation within this class and the matchingResultWrapper
instance. See alsoPackageWrapper.job_type
.
- PackageWrapper.__call__(settings, mol, job_name='', **kwargs)[source]
(scheduled) If possible, call
__call__()
of the Package instance appropiate toPackageWrapper.job_type
.If not, default to the base
__call__()
method.
- PackageWrapper.run_job(settings, mol, job_name='job', work_dir=None, validate_output=True, **kwargs)[source]
Run the job and pass the resulting
plams.Results
object toResultWrapper
.
- static PackageWrapper.handle_special_keywords(settings, key, value, mol)[source]
Method not implemented.
- class qmflows.packages.ResultWrapper(settings, molecule, job_name, dill_path=None, plams_dir=None, work_dir=None, status='successful', warnings=None)[source]
The matching
Result
subclass forPackageWrapper
.
- qmflows.packages.JOB_MAP : dict[type[plams.Job], Package]
A dictionary mapping PLAMS
Job
types to appropiate QMFlowsPackage
instance.>>> from __future__ import annotations >>> from qmflows.packages import Package, cp2k, adf, dftb, orca >>> from scm import plams >>> plams.Job = plams.core.basejob.Job >>> plams.ORCAJob = plams.interfaces.thirdparty.orca.ORCAJob >>> JOB_MAP: dict[type[plams.Job], Package] = { ... plams.Cp2kJob: cp2k, ... plams.ADFJob: adf, ... plams.DFTBJob: dftb, ... plams.ORCAJob: orca ... }
qmflows.packages
The Package
(sub-)classes of QMFlows.
The Package Class
|
|
|
Initialize a |
|
(scheduled) Perform a job with the package specified by |
|
Run a set of tasks before running the actual job. |
|
Abstract method; should be implemented by the child class. |
|
Run a set of tasks after running the actual job. |
|
Traverse settings and convert generic into package specific keys. |
|
Abstract method; should be implemented by the child class. |
|
|
|
|
|
A Package subclass for running CP2K Jobs. |
|
A Package subclass for running CP2K Jobs for classical forcefield calculations. |
|
A class for preparing the input for running a Orca job using both PLAMS and templates. |
|
A |
Package Instances
|
(scheduled) Perform a job with the package specified by |
|
(scheduled) Perform a job with the package specified by |
|
(scheduled) Perform a job with the package specified by |
|
(scheduled) Perform a job with the package specified by |
|
(scheduled) Perform a job with the package specified by |
The Result Class
|
Class containing the results associated with a quantum chemistry simulation. |
|
Initialize a |
|
Look for the optional arguments to parse a property, which are stored in the properties dictionary. |
Getter for |
|
Class providing access to PLAMS ADFJob result results. |
|
Class providing access to PLAMS DFTBJob result results. |
|
Class providing access to CP2K result. |
|
A class providing access to CP2KMM result. |
|
Class providing access to PLAMS OrcaJob results. |
|
The matching |
API
- qmflows.packages.run(job, runner=None, path=None, folder=None, load_jobs=False, **kwargs)[source]
Pickup a runner and initialize it.
Serves as a wrapper around
noodles.run_parallel()
.- Parameters:
job (
noodles.PromisedObject
) – The computation to run as constructed byPackage.__call__()
.runner (
str
, optional) – The job runner. Note that this value should be left atNone
.path (
str
orPathLike
, optional) – The path where the PLAMS working directory will be created. Will default to the current working directory ifNone
.folder (
str
orPathLike
, optional) – The name of the new PLAMS working directory. Will default to"plams_workdir"
ifNone
.load_jobs (
bool
) – Load all pre-existing Jobs (contained within the working directory) into memory. Note that this can be quite slow if a large number of pre-existing jobs is present.**kwargs (
Any
) – Further keyword arguments to-be passed tocall_default()
.
- Returns:
A new Result instance. The exact type will depend on job.
- Return type:
See also
noodles.run_parallel()
Run a workflow in parallel threads, storing results in a Sqlite3 database.
- class qmflows.packages.Package(pkg_name)[source]
Package
is the base class to handle the invocation to different quantum package.The only relevant (instance) attribute of this class is
Package.pkg_name
which is a string representing the quantum package name that is going to be used to carry out the compuation.The life-cycle of
Package
consists of 5 general steps:Initializing an instance:
Package.__init__()
.Starting the job:
Package.__call__()
. This method handles the task distribution between the instance’s various methods.Converting all generic into specific settings:
Package.generic2specific()
.Running the actual
plams.Job
(including pre- and post-processing):Package.run_job()
.Returning the final
Result
instance at the end ofPackage.__call__()
.
- generic_mapping = NotImplemented
A class variable with the name of the generic .yaml file. Should be set when creating a subclass.
- pkg_name
An instance variable with the name of the respective quantum chemical package.
- Package.__init__(pkg_name)[source]
Initialize a
Package
instance.- Parameters:
pkg_name (
str
) – The name of the respective quantum chemical package. SeePackage.pkg_name
.
- Package.__call__(settings, mol, job_name='', validate_output=True, **kwargs)[source]
(scheduled) Perform a job with the package specified by
Package.pkg_name
.- Parameters:
settings (
qmflows.Settings
) – The user settings.mol (
plams.Molecule
orrdkit.Mol
) – A PLAMS or RDKit molecule to-be passed to the calculation.job_name (
str
) – The name of the job.validate_output (
bool
) – IfTrue
, perform a package-specific validation of the output files’ content. Only relevant if the particularPackage
subclass has actually implemented output validation.**kwargs (
Any
) – Further keyword arguments to-be passed toPackage.prerun()
,Package.run_job()
andPackage.post_run()
.
- Returns:
A new Result instance.
- Return type:
- Package.prerun(settings, mol, **kwargs)[source]
Run a set of tasks before running the actual job.
- Parameters:
settings (
qmflows.Settings
) – Settings provided by the user. Note that these settings can still contain generic keywords.mol (
plams.Molecule
, optional) – A PLAMS molecule to-be passed to the calculation.**kwargs (
Any
) – Further keyword arguments to-be passed toPackage.run_job()
.
See also
Package.run_job()
A method which handles the running of the actual
plams.Job
.
- abstract classmethod Package.run_job(settings, mol, job_name='job', work_dir=None, validate_output=False, **kwargs)[source]
Abstract method; should be implemented by the child class.
A method which handles the running of the actual
plams.Job
.- Parameters:
settings (
qmflows.Settings
, optional) – User-provided Settings as processed byPackage.generic2specific()
.mol (
plams.Molecule
) – A PLAMS molecule to-be passed to the calculation.job_name (
str
) – The name of the job.workdir (
str
orPathLike
, optional) – The path+folder name of the PLAMS working directory.validate_output (
bool
) – IfTrue
, perform a package-specific validation of the output files’ content. Only relevant if the particularPackage
subclass has actually implemented output validation.**kwargs (
Any
) – Further keyword arguments.
- Returns:
A new Result instance.
- Return type:
- Package.postrun(result, output_warnings=None, settings=None, mol=None, **kwargs)[source]
Run a set of tasks after running the actual job.
- Parameters:
result (
Result
) – A Result instance.output_warnings (
Mapping
[str
,type
[Warning
]], optional) – A Mapping which maps an error messages to Warning types.settings (
qmflows.Settings
, optional) – User-provided Settings as processed byPackage.generic2specific()
. Will beNone
if an error occured before this point.mol (
plams.Molecule
, optional) – A PLAMS molecule as passed to the calculation. Will beNone
if an error occured before the molecule was parsed inPackage.__call__()
.**kwargs (
Any
) – Further keyword arguments that were passed toPackage.run_job()
.
See also
Package.run_job()
A method which handles the running of the actual
plams.Job
.
- Package.generic2specific(settings, mol=None)[source]
Traverse settings and convert generic into package specific keys.
Traverse all the key, value pairs of the settings, translating the generic keys into package specific keys as defined in the specific dictionary. If one key is not in the specific dictionary an error is raised. These new specific settings take preference over existing specific settings.
- Parameters:
settings (
qmflows.Settings
) – Settings provided by the user.mol (
plams.Molecule
, optional) – A PLAMS molecule to-be passed to the calculation.
- Returns:
A new settings instance without any generic keys.
- Return type:
- abstract static Package.handle_special_keywords(settings, key, value, mol)[source]
Abstract method; should be implemented by the child class.
A method providing additional processing for
Package
dependant generic keywords.- Parameters:
settings (
qmflows.Settings
, optional) – User-provided Settings as being processed byPackage.generic2specific()
.key (
str
) – The key associated with the special keywordvalue (
Any
) – The value associated with the special key.mol (
plams.Molecule
) – A PLAMS molecule to-be passed to the calculation.
See also
Package.generic2specific()
Traverse settings and convert generic into package specific keys.
- class qmflows.packages.ADF(pkg_name='adf')[source]
Package
subclass for ADF.This class takes care of calling the ADF quantum package. it uses both Plams and the Templates module to create the input files, while Plams takes care of running the
plams.ADFJob
. It returns aADF_Result
instance containing the output data.
- class qmflows.packages.CP2K(pkg_name='cp2k')[source]
A Package subclass for running CP2K Jobs.
It uses plams together with the templates to generate the stucture input and also uses Plams to invoke the binary CP2K code. This class is not intended to be called directly by the user, instead the
cp2k
function should be called.
- class qmflows.packages.CP2KMM(pkg_name='cp2k')[source]
A Package subclass for running CP2K Jobs for classical forcefield calculations.
It uses plams together with the templates to generate the stucture input and also uses Plams to invoke the binary CP2K code. This class is not intended to be called directly by the user, instead the
cp2k_mm
function should be called.
- class qmflows.packages.ORCA(pkg_name='orca')[source]
A class for preparing the input for running a Orca job using both PLAMS and templates.
It also does the manangement of the input/output files resulting from running Orca and returns a Results object that containing the methods and data required to retrieve the output.
- class qmflows.packages.PackageWrapper(job_type, name=None)[source]
A
Package
subclass for processing arbitraryplams.Job
types.Will automatically convert the passed Job type into the appropiate Package instance upon calling
PackageWrapper.__call__()
.Examples
>>> from scm.plams import ADFJob, AMSJob >>> from qmflows import PackageWrapper, run >>> from qmflows.packages import ResultWrapper, ADF_Result # Start of with two PackageWrapper instances >>> pkg_adf = PackageWrapper(ADFJob) >>> pkg_ams = PackageWrapper(AMSJob) # End up with two different Result instances >>> result_adf: ADF_Result = run(pkg_adf(...), ...) >>> result_ams: ResultWrapper = run(pkg_ams(...), ...)
- job_type
The to-be executed
plams.Job
type. Will be automatically translated, when possible, into to the appropiatePackage
instance upon callingPackageWrapper.__call__()
. If not, default to the more bare-bones implementation within this class and the matchingResultWrapper
instance.- Type:
See also
JOB_MAP
A dictionary mapping PLAMS Job types to appropiate QMFlows Package instances.
- class qmflows.packages.Result(settings, molecule, job_name, dill_path=None, plams_dir=None, work_dir=None, status='successful', warnings=None)[source]
Class containing the results associated with a quantum chemistry simulation.
- Result.__init__(settings, molecule, job_name, dill_path=None, plams_dir=None, work_dir=None, status='successful', warnings=None)[source]
Initialize a
Result
instance.- Parameters:
settings (
qmflows.Settings
) – Job Settings.molecule (
plams.Molecule
) – molecular Geometryjob_name (str) – Name of the computations
dill_path (str) – The absolute path to the pickled .dill file.
plams_dir (str) – path to the
Plams
folder.work_dir – scratch or another directory different from the
plams_dir
.
- Type:
work_dir: str
- Result.get_property(prop)[source]
Look for the optional arguments to parse a property, which are stored in the properties dictionary.
- property Result.results
Getter for
Result.results
.Get will load the .dill file and add all of its class attributes to this instance, barring the following three exceptions:
Private attributes/methods.
Magic methods.
Methods/attributes with names already contained within this instance.
This attribute’s value is set to
None
if the unpickling process fails.
- class qmflows.packages.ADF_Result(settings, molecule, job_name, dill_path=None, plams_dir=None, work_dir=None, status='successful', warnings=None)[source]
Class providing access to PLAMS ADFJob result results.
- class qmflows.packages.DFTB_Result(settings, molecule, job_name, dill_path=None, plams_dir=None, work_dir=None, status='successful', warnings=None)[source]
Class providing access to PLAMS DFTBJob result results.
- class qmflows.packages.CP2K_Result(settings, molecule, job_name, dill_path=None, plams_dir=None, work_dir=None, status='successful', warnings=None)[source]
Class providing access to CP2K result.
- class qmflows.packages.CP2KMM_Result(settings, molecule, job_name, dill_path=None, plams_dir=None, work_dir=None, status='successful', warnings=None)[source]
A class providing access to CP2KMM result.
- class qmflows.packages.ORCA_Result(settings, molecule, job_name, dill_path=None, plams_dir=None, work_dir=None, status='successful', warnings=None)[source]
Class providing access to PLAMS OrcaJob results.
- class qmflows.packages.ResultWrapper(settings, molecule, job_name, dill_path=None, plams_dir=None, work_dir=None, status='successful', warnings=None)[source]
The matching
Result
subclass forPackageWrapper
.
- qmflows.adf(self, settings, mol, job_name='', validate_output=True, **kwargs)
(scheduled) Perform a job with the package specified by
Package.pkg_name
.- Parameters:
settings (
qmflows.Settings
) – The user settings.mol (
plams.Molecule
orrdkit.Mol
) – A PLAMS or RDKit molecule to-be passed to the calculation.job_name (
str
) – The name of the job.validate_output (
bool
) – IfTrue
, perform a package-specific validation of the output files’ content. Only relevant if the particularPackage
subclass has actually implemented output validation.**kwargs (
Any
) – Further keyword arguments to-be passed toPackage.prerun()
,Package.run_job()
andPackage.post_run()
.
- Returns:
A new Result instance.
- Return type:
Result
- qmflows.dftb(self, settings, mol, job_name='', validate_output=True, **kwargs)
(scheduled) Perform a job with the package specified by
Package.pkg_name
.- Parameters:
settings (
qmflows.Settings
) – The user settings.mol (
plams.Molecule
orrdkit.Mol
) – A PLAMS or RDKit molecule to-be passed to the calculation.job_name (
str
) – The name of the job.validate_output (
bool
) – IfTrue
, perform a package-specific validation of the output files’ content. Only relevant if the particularPackage
subclass has actually implemented output validation.**kwargs (
Any
) – Further keyword arguments to-be passed toPackage.prerun()
,Package.run_job()
andPackage.post_run()
.
- Returns:
A new Result instance.
- Return type:
Result
- qmflows.cp2k(self, settings, mol, job_name='', validate_output=True, **kwargs)
(scheduled) Perform a job with the package specified by
Package.pkg_name
.- Parameters:
settings (
qmflows.Settings
) – The user settings.mol (
plams.Molecule
orrdkit.Mol
) – A PLAMS or RDKit molecule to-be passed to the calculation.job_name (
str
) – The name of the job.validate_output (
bool
) – IfTrue
, perform a package-specific validation of the output files’ content. Only relevant if the particularPackage
subclass has actually implemented output validation.**kwargs (
Any
) – Further keyword arguments to-be passed toPackage.prerun()
,Package.run_job()
andPackage.post_run()
.
- Returns:
A new Result instance.
- Return type:
Result
- qmflows.cp2k_mm(self, settings, mol, job_name='', validate_output=True, **kwargs)
(scheduled) Perform a job with the package specified by
Package.pkg_name
.- Parameters:
settings (
qmflows.Settings
) – The user settings.mol (
plams.Molecule
orrdkit.Mol
) – A PLAMS or RDKit molecule to-be passed to the calculation.job_name (
str
) – The name of the job.validate_output (
bool
) – IfTrue
, perform a package-specific validation of the output files’ content. Only relevant if the particularPackage
subclass has actually implemented output validation.**kwargs (
Any
) – Further keyword arguments to-be passed toPackage.prerun()
,Package.run_job()
andPackage.post_run()
.
- Returns:
A new Result instance.
- Return type:
Result
- qmflows.orca(self, settings, mol, job_name='', validate_output=True, **kwargs)
(scheduled) Perform a job with the package specified by
Package.pkg_name
.- Parameters:
settings (
qmflows.Settings
) – The user settings.mol (
plams.Molecule
orrdkit.Mol
) – A PLAMS or RDKit molecule to-be passed to the calculation.job_name (
str
) – The name of the job.validate_output (
bool
) – IfTrue
, perform a package-specific validation of the output files’ content. Only relevant if the particularPackage
subclass has actually implemented output validation.**kwargs (
Any
) – Further keyword arguments to-be passed toPackage.prerun()
,Package.run_job()
andPackage.post_run()
.
- Returns:
A new Result instance.
- Return type:
Result