The CAROL workflow
15 Aug 2018
- Dr Ardita Shkurti



CAROL (Calculation And extRaction Of Logp and other properties) is an object-oriented, modular Python implementation of a generalized simulation-analysis pattern framework for materials chemistry simulations.




Workflow details

CAROL has been specifically designed to glue together into the same workflow a series of operations that can be summarized into the following steps:

  •   User inputs gathering for configuration of the specific use case;
  •   Set up of simulation input files;
  •   Generation of submission scripts and launches of a simulator;
  •   Analysis of simulation outputs;
  •   Extraction of data of interest.

The whole process runs for a pre-selected number of iterations, each of which performs all of the above steps in order. At each iteration, CAROL retains information on the current and previous system state, so that a failure of the specific model does not mean losing the whole history of how far the simulations have progressed.
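
The iteration pattern described above can be sketched roughly as follows. This is a minimal illustration only, not CAROL's actual code: the step functions, the `run_workflow` name and the JSON state file are all hypothetical, chosen to show how saving state after each iteration lets an interrupted run resume where it stopped.

```python
import json
from pathlib import Path


# All step functions are hypothetical placeholders, not CAROL's API.
def set_up_inputs(config, iteration):
    return {"iteration": iteration, "params": config["params"]}


def run_simulation(inputs):
    return {"trajectory": f"run_{inputs['iteration']}"}


def analyse(outputs):
    return {"analysed": outputs["trajectory"]}


def extract(analysis):
    return {"source": analysis["analysed"]}


def run_workflow(config, n_iterations, state_file):
    """Run the steps in order for each iteration, saving state as we go."""
    state_file = Path(state_file)
    # Resume from the saved state if an earlier run was interrupted.
    if state_file.exists():
        state = json.loads(state_file.read_text())
    else:
        state = {"completed": 0, "history": []}

    for iteration in range(state["completed"], n_iterations):
        inputs = set_up_inputs(config, iteration)
        outputs = run_simulation(inputs)
        result = extract(analyse(outputs))
        state["history"].append(result)
        state["completed"] = iteration + 1
        # Persist after every iteration so a failure loses at most one.
        state_file.write_text(json.dumps(state))
    return state
```

Calling `run_workflow` a second time with the same state file and a larger iteration count continues from the last completed iteration rather than starting again.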

CAROL has been designed as a workflow platform that enables the extraction of several materials chemistry ensemble properties and helps investigate them. Such investigations are part of a long-term plan to implement an automated, self-contained force-field parametrisation server.

CAROL application to materials chemistry simulations

CAROL automatically configures the necessary inputs for an equilibration stage, carried out once at the beginning of each new set of simulations, with global bonds set on. In this equilibration stage, a simulation cell comprising two solvents (usually octanol and water) and one solute is set up for each given solute. DL_MESO CONFIG files are generated prior to the simulation to ensure that the simulation cell is partitioned into two bulk solvent regions. Solute molecules are then placed at random throughout the box and allowed to equilibrate between the two bulk phases.
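
The initial layout of such a cell might look like the sketch below. It is an illustration under simplifying assumptions and does not reproduce CAROL's code or the DL_MESO CONFIG file format: every molecule is reduced to a single point, the two solvents are split along the z axis, and the `build_cell` function and its arguments are hypothetical.

```python
import random


def build_cell(box, n_water, n_octanol, n_solute, seed=0):
    """Return a list of (species, x, y, z) tuples for a two-phase cell.

    Assumption: each molecule is a single point and the solvents are
    separated at the mid-plane of the box along z.
    """
    rng = random.Random(seed)
    lx, ly, lz = box
    half = lz / 2.0
    beads = []

    def place(species, count, z_min, z_max):
        for _ in range(count):
            beads.append((species,
                          rng.uniform(0.0, lx),
                          rng.uniform(0.0, ly),
                          rng.uniform(z_min, z_max)))

    place("water", n_water, 0.0, half)      # lower bulk region
    place("octanol", n_octanol, half, lz)   # upper bulk region
    place("solute", n_solute, 0.0, lz)      # anywhere in the box
    return beads
```

Starting the solute at random positions, rather than in one phase, avoids biasing the equilibration towards either solvent.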

The equilibration stage is followed by a pre-selected number of long-run simulations, which are then analyzed by means of UMMAP to calculate the LogP (partition coefficient) value. More information about LogP can be found here -
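
For reference, the octanol/water partition coefficient is conventionally defined as the base-10 logarithm of the ratio of the solute concentration in octanol to that in water. The sketch below applies that standard definition to solute counts in the two bulk regions. How UMMAP performs the calculation is not described here, so the `log_p` function and its inputs are assumptions made purely for illustration.

```python
import math


def log_p(n_solute_octanol, n_solute_water, volume_octanol, volume_water):
    """LogP from (average) solute counts in the two bulk solvent regions."""
    c_octanol = n_solute_octanol / volume_octanol
    c_water = n_solute_water / volume_water
    if c_octanol <= 0 or c_water <= 0:
        raise ValueError("both phases must contain solute to define LogP")
    return math.log10(c_octanol / c_water)
```

With equal bulk volumes, ten times as much solute in octanol as in water gives a LogP of 1, and the reverse gives -1.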

CAROL combined with a Bayesian optimisation logic for force-field parameterisation

In one of our use cases, the CAROL workflow is coupled with a cognitive logic based on Bayesian optimisation (the software for such logic was provided by our IBM colleagues). At the end of each CAROL+optimiser workflow iteration, the optimiser provides a new set of interaction parameters for the DL_MESO force field that is then used as a simulation input in the next iteration.

In particular, the values of the property of interest extracted in the last step of the CAROL workflow are given as input to the cognitive optimiser. In our use case this property is the partition coefficient, but in principle the approach can be applied to any other property of interest. Initially, the optimisation logic trains on a few observations of how differing force-field interaction parameters affect the prediction of the partition coefficient compared to experimental data. Then, based on these observations, the optimisation logic proposes interaction parameters for the force field that would minimise the error between predicted and experimental values of the partition coefficient. At the end of a user-defined number of iterations, the interaction parameters with the minimal error are recommended for simulations of the given chemical systems.
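
The coupling between workflow and optimiser can be sketched as a suggest/observe loop. The example below is deliberately not Bayesian: a random-search stand-in replaces the IBM optimiser, whose interface is not described here, and a toy formula replaces a full CAROL iteration. All class, function and parameter names are hypothetical.

```python
import random


class RandomSearchOptimiser:
    """Stand-in for the Bayesian optimiser: samples parameters at random."""

    def __init__(self, bounds, seed=0):
        self.bounds = bounds              # {name: (low, high)}
        self.rng = random.Random(seed)
        self.observations = []            # list of (params, error)

    def suggest(self):
        return {name: self.rng.uniform(low, high)
                for name, (low, high) in self.bounds.items()}

    def observe(self, params, error):
        self.observations.append((params, error))

    def best(self):
        return min(self.observations, key=lambda obs: obs[1])


def predict_log_p(params):
    # Toy surrogate for one full workflow iteration (assumption, not physics).
    return 0.1 * (params["a_solute_water"] - params["a_solute_octanol"])


def optimise(optimiser, experimental_log_p, n_iterations):
    """Feed prediction errors to the optimiser and return the best result."""
    for _ in range(n_iterations):
        params = optimiser.suggest()
        error = abs(predict_log_p(params) - experimental_log_p)
        optimiser.observe(params, error)
    return optimiser.best()
```

A Bayesian optimiser would differ only inside `suggest`, where it would use the stored observations to choose the next parameters instead of sampling blindly.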

Further information

For further information on CAROL, including downloading the code and opportunities to couple it with differing optimisation logics or to apply it in a field other than materials chemistry and DPD simulations, please use the contact below.

Contact: Shkurti, Ardita (STFC,DL,HC)