Changelog#
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
unreleased#
3.7.3 - 2024.07.05#
Changed#
Name from MolDrug to moldrug (mainly for documentation).
3.7.2 - 2024.01.22#
Fix#
It was possible to generate the same children in the same generations. Now there is a filter to avoid repeated evaluations.
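A filter of this kind can be sketched with a set of identifiers that were already evaluated. This is only an illustration under assumptions: the helper name filter_new_children and the use of plain SMILES strings as identifiers are hypothetical and are not taken from the moldrug source.

```python
from typing import Iterable, List, Set


def filter_new_children(children: Iterable[str], seen: Set[str]) -> List[str]:
    """Return only the children that were not evaluated before.

    Hypothetical helper: children are identified by their SMILES string,
    and `seen` is updated in place so later generations skip them too.
    """
    new_children = []
    for smiles in children:
        if smiles in seen:
            continue  # already evaluated, or already queued in this batch
        seen.add(smiles)
        new_children.append(smiles)
    return new_children


seen_smiles = {"CCO"}
batch = ["CCO", "CCN", "CCN", "c1ccccc1"]
unique_batch = filter_new_children(batch, seen_smiles)
```

Only the molecules in unique_batch would be sent to the cost function; the duplicates are dropped before any evaluation takes place.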
3.7.1 - 2024.01.14#
Fix#
Remove an unnecessary print statement.
3.7.0 - 2024.01.14#
Fixed#
Calculate SA_score with the molecule without explicit hydrogens for all built-in fitness functions. Based on: rdkit/rdkit#7047
Added#
The moldrug.constraintconf.clashes_present function.
Changed#
The bio.PDB object is no longer used to filter conformations that clash with the protein. Now the coordinates are retrieved from the RDKit molecule object and the distances are calculated with NumPy.
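The idea of a distance-based clash check can be sketched as follows. This is not the implementation of moldrug.constraintconf.clashes_present: the function name has_clash, the plain coordinate tuples and the 2.0 angstrom cutoff are assumptions made for illustration, and the standard library is used instead of NumPy to keep the example dependency-free.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def has_clash(ligand_xyz: Sequence[Point], protein_xyz: Sequence[Point],
              cutoff: float = 2.0) -> bool:
    """Return True if any ligand atom is closer than `cutoff` to a protein atom.

    Hypothetical helper; the cutoff value is an assumed example, not
    necessarily the one used by moldrug.
    """
    return any(
        math.dist(lig, prot) < cutoff
        for lig in ligand_xyz
        for prot in protein_xyz
    )


protein = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
close_ligand = [(0.0, 0.0, 1.5)]     # 1.5 from the first protein atom
far_ligand = [(10.0, 10.0, 10.0)]    # far away from every protein atom
clash_close = has_clash(close_ligand, protein)
clash_far = has_clash(far_ligand, protein)
```

With NumPy the same pairwise distances could be computed in one vectorized step, which is the approach the entry above describes.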
Removed#
bio dependency for constraint conformer generation.
3.6.1 - 2023.12.11#
Fix#
LICENCE on PyPi.
Add#
Extra metadata to the package.
3.6.0 - 2023.12.07#
Added#
vina_seed argument for: all the cost functions of moldrug.fitness, moldrug.fitness._vinadock.
randomseed: moldrug.utils.confgen, moldrug.utils.Individual, moldrug.utils.Local, moldrug.utils.GA, moldrug.constraintconf.constraintconf as well as its CLI with the flag --seed.
-V and --verbose flag in the CLI of moldrug.
moldrug.utils.softmax function.
moldrug.utils.deep_update function.
moldrug.utils.__get_default_desirability to store the default desirability values.
vina_score_type = 'ensemble'. This is used by the CostMultiReceptors* functions. It is meant to account for flexibility in the receptor and is equivalent to performing ensemble docking.
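For readers unfamiliar with softmax, a generic, numerically stable version is sketched below. It is not claimed to match the signature or behavior of moldrug.utils.softmax; it only illustrates the technique of subtracting the maximum before exponentiating, which keeps large inputs from overflowing.

```python
import math
from typing import List, Sequence


def softmax(values: Sequence[float]) -> List[float]:
    """Generic softmax sketch (not the moldrug implementation).

    Subtracting the maximum does not change the result but prevents
    overflow in math.exp for large inputs.
    """
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


large_inputs = softmax([1000.0, 1000.0])   # would overflow without the shift
probabilities = softmax([1.0, 2.0, 3.0])
```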
Removed#
outdir flag in moldrug CLI.
Changed#
Use random.random instead of the numpy.random.rand function for some small cases.
AllChem.MMFFOptimizeMolecule is only used internally for moldrug.utils.confgen and moldrug.constraintconf.gen_aligned_conf if randomseed is not set and with maxIters = 500.
The pickle module is replaced by dill, which handles users’ custom fitness functions better.
Data is now retrieved with the moldrug.data.get_data function.
It is no longer necessary to input the whole desirability definition if the intention is to change only one part of the default desirability used internally by the cost functions of moldrug.fitness. For example, if you would like to change the Target value of vina_score from its default value.
Before:
    ga = GA(
        ...
        costfunc = moldrug.fitness.Cost,
        costfunc_kwargs = {
            ...
            "desirability": {
                "qed": {
                    "w": 1,
                    "LargerTheBest": {
                        "LowerLimit": 0.1,
                        "Target": 0.75,
                        "r": 1
                    }
                },
                "sa_score": {
                    "w": 1,
                    "SmallerTheBest": {
                        "Target": 3,
                        "UpperLimit": 7,
                        "r": 1
                    }
                },
                "vina_score": {
                    "w": 1,
                    "SmallerTheBest": {
                        "Target": -12,
                        "UpperLimit": -6,
                        "r": 1
                    }
                }
            },
            ...
        }
        ...
    )
Now:
    ga = GA(
        ...
        costfunc = moldrug.fitness.Cost,
        costfunc_kwargs = {
            ...
            "desirability": {
                "vina_score": {
                    "SmallerTheBest": {
                        "Target": -12,
                    }
                }
            },
            ...
        }
        ...
    )
The same from the CLI.
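How a partial desirability definition can be combined with a default one is sketched below as a recursive dictionary merge. The helper merge_nested is a hypothetical stand-in and is not the code of moldrug.utils.deep_update; the numbers in default_desirability are illustrative and are not necessarily moldrug's real defaults.

```python
import copy
from typing import Any, Dict


def merge_nested(default: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `default` with the keys of `override` merged in recursively.

    Hypothetical helper: nested dictionaries are merged key by key, any
    other value in `override` replaces the default one.
    """
    merged = copy.deepcopy(default)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_nested(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Illustrative values only (assumed, not read from moldrug).
default_desirability = {
    "vina_score": {
        "w": 1,
        "SmallerTheBest": {"Target": -10, "UpperLimit": -6, "r": 1},
    },
}
user_desirability = {"vina_score": {"SmallerTheBest": {"Target": -12}}}

merged = merge_nested(default_desirability, user_desirability)
```

Only the Target value changes in merged; the weight, UpperLimit and r keep the values of the default dictionary, and the default dictionary itself is left untouched.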
Fixed#
Small bug when the population has Individuals with the same cost. Better reproducibility.
Refactored changes.
Handled properly in case receptor_pdbqt_path = None.
Convert self.crem_db_path to absolute path in moldrug.utils.Local and moldrug.utils.GA.
3.5.0 - 2023.10.26#
Fixed#
Refactored changes for meeko-0.5.0.
Update docs.
Added#
A new environment variable MOLDRUG_VERBOSE can be set to ask for more debug information.
Changed#
From cfg to toml package configuration.
Structure of the repo. Now src/moldrug.
Improve moldrug-Dashboard.
Removed#
Support for conda package. Some dependencies had not been installed properly. To be fixed in the future.
3.4.0 - 2023.03.10#
Added#
Continuation option to the command line.
moldrug.cli.CommandLineHelper class to work with the parameters passed through the command line.
checkpoint option to moldrug.utils.GA.
moldrug-Dashboard add-on. This is not included in the package itself, but could be used online or locally. To use it locally you must check Streamlit, the requirements.txt and the app script.
return_mol option to utils.to_dataframe
Changed#
The warnings are not printed for: moldrug.fitness._vinadock and moldrug.constraintconf.generate_conformers. Now, only at the end of a moldrug simulation, a note will be printed if the error.tar.gz file is created.
moldrug.utils.run does not print extra info if the command fails. It only raises the corresponding RuntimeError.
moldrug.fitness.__vinadock replaced by moldrug.fitness._vinadock.
Remove conformers that clash with the protein in case of score_only; for local_only, vina will handle the clash.
Fixed#
Small bug during the initialization of the population with multiple seed_mol. Now seed_mol with the same number of elements as popsize are not submitted to mutations, only evaluation.
Problems with default parameters definition on the command line. Parameters with default values by the type of run defined in the configuration file do not need to be redefined anymore; moldrug.cli will guess those.
3.3.0 - 2022.12.21#
Changed#
moldrug.utils.__tar_errors__ replaced by moldrug.utils.tar_errors.
The default value of moldrug.utils.tar_errors is error instead of .error.
moldrug.constraintconf.generate_conformers outputs warnings and errors to error instead of .error.
moldrug.fitness._vinadock outputs warnings and errors to error instead of .error.
Fixed#
The use of moldrug.utils.tar_errors inside of moldrug.utils.Local and moldrug.utils.GA.
Clean code.
3.2.5 - 2022.12.20#
Fixed#
Improve docs.
Cleaning of temporary files in the /tmp directory. Now temporary directories are created in the working directory with the pattern: .costfunc_moldrug_XXXXXXXX.
Cleaning errors. Now all warnings and errors are saved in the .error directory and at the end they are compressed to error.tar.gz.
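The two behaviors described above can be sketched with the standard library. The helper names make_workdir and compress_errors are hypothetical, and this is an assumed approximation rather than moldrug's own code; only the directory prefix, the .error directory and the error.tar.gz name come from the entries above.

```python
import os
import shutil
import tarfile
import tempfile


def make_workdir(parent: str) -> str:
    """Create a temporary directory named .costfunc_moldrug_<random> inside `parent`."""
    return tempfile.mkdtemp(prefix=".costfunc_moldrug_", dir=parent)


def compress_errors(error_dir: str, archive_path: str) -> bool:
    """Compress `error_dir` into `archive_path` and remove the directory.

    Returns False (and writes nothing) if the directory is missing or empty.
    """
    if not os.path.isdir(error_dir) or not os.listdir(error_dir):
        return False
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(error_dir, arcname=os.path.basename(os.path.normpath(error_dir)))
    shutil.rmtree(error_dir)
    return True


# Demonstration inside a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as root:
    workdir_name = os.path.basename(make_workdir(root))
    error_dir = os.path.join(root, ".error")
    os.makedirs(error_dir)
    with open(os.path.join(error_dir, "idx_0_error.log"), "w") as handle:
        handle.write("example error\n")
    archive = os.path.join(root, "error.tar.gz")
    created = compress_errors(error_dir, archive)
    archive_exists = os.path.isfile(archive)
    error_dir_removed = not os.path.exists(error_dir)
```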
3.2.2 - 2022.12.12#
Fixed#
Bug: The initial individual was not printed properly.
Removed#
Redundant code in moldrug.utils.GA
3.2.0 - 2022.11.28#
Fixed#
Bug: The output error name when constraint fails has an idx prefix. E.g. 33_conf_27_error.pbz2 now is: idx_33_conf_27_error.pbz2. Now it is easy to delete all of these files at the end of the simulation if they are not needed. (In the previous version the naming had not actually changed.)
Removed#
moldrug.fitness.is_inside_box
Bug: constraint_check_inside_box option for the cost functions of moldrug.fitness
3.1.0 - 2022.11.28#
Added#
moldrug.constraintconf.gen_aligned_conf
In case that moldrug.constraintconf.generate_conformers fails with rdkit.Chem.AllChem.ConstrainedEmbed, it will try with moldrug.constraintconf.gen_aligned_conf.
moldrug.fitness.is_inside_box
constraint_check_inside_box argument to the cost functions of moldrug.fitness. If the coordinates of the constraint conformation are outside the box, always use local_only. By default False.
Changed#
The output error name when constraint fails has an idx prefix. E.g. 33_conf_27_error.pbz2 now is: idx_33_conf_27_error.pbz2. Now it is easy to delete all of these files at the end of the simulation if they are not needed.
Fixed#
Clean code.
Improve docs.
3.0.3 - 2022.11.26#
Added#
Warning in case moldrug.utils.Local or moldrug.utils.GA are called with a different moldrug version than the one they were initialized with.
Changed#
Convert receptor_pdbqt_path and vina_executable (in case that it points to a file) to absolute paths inside of moldrug.fitness.__vinadock.
Fixed#
Bug for hydrogen coordinates when constraint docking was used.
Improve docs.
3.0.1 - 2022.11.24#
Fixed#
Cleaning code.
Sort the initial population based on the cost attribute when it is saved on disk.
Improve docs.
3.0.0 - 2022.11.23#
Changed#
Name of moldrug.fitness.get_mol_cost to moldrug.fitness.__get_mol_cost function.
The class moldrug.utils.GA does not have the method roulette_wheel_selection anymore; now it is part of a function that could be called from moldrug.utils
The argument max is now max_conf in the moldrug.constraintconf.constraintconf() function.
Entrance point constraintconf was changed to constraintconf_moldrug and now it is linked to moldrug.cli.__constraintconf_cmd instead of moldrug.constrainconf.constraintconf_cmd.
Name of the function moldrug.fitness.vinadock now is moldrug.fitness.__vinadock.
Name of the function moldrug.cli.moldrug_cmd now is moldrug.cli.__moldrug_cmd.
Fixed#
Cleaning the code.
If vina_executable is provided (to any cost function) it represents a path. It will try to convert to an absolute path. Previously relative paths to the executable were not understood properly.
Improve docs.
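The path handling described for vina_executable can be sketched like this. The helper to_absolute_if_file is hypothetical and only assumes the rule stated in this changelog: a value that points to an existing file is converted to an absolute path, anything else (for example a command name found through PATH) is left unchanged.

```python
import os
import tempfile


def to_absolute_if_file(vina_executable: str) -> str:
    """Return an absolute path if `vina_executable` points to a file.

    Hypothetical helper; other values are returned untouched.
    """
    if os.path.isfile(vina_executable):
        return os.path.abspath(vina_executable)
    return vina_executable


with tempfile.TemporaryDirectory() as tmp:
    exe_path = os.path.join(tmp, "vina_mock")
    with open(exe_path, "w") as handle:
        handle.write("")                      # empty placeholder file
    relative = os.path.relpath(exe_path)      # path relative to the current directory
    resolved_is_abs = os.path.isabs(to_absolute_if_file(relative))

# A name that is not an existing file is returned as given.
resolved_name = to_absolute_if_file("vina_command_on_path")
```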
Added#
ad4map in all the cost functions of the moldrug.fitness module. This parameter specifies the path where the ad4 map files are. To use this feature you must have AutoDock Vina v1.2.3 or above. Now you can use the force fields of AD4 inside of Vina. Future releases will extend the integration with these versions.
moldrug.utils.to_dataframe. This function was previously isolated as a method of the class moldrug.utils.GA; now it could also be called as a function.
kept_gens attribute to the Individuals inside of moldrug.utils.GA. This is a set that contains the generations for which the Individual was conserved.
acceptance attribute to moldrug.utils.GA. This is a dictionary that has as a keyword the generation ID, and as values a dictionary with keywords: accepted (number of generated molecules accepted on the current generation) and generated (number of total molecules generated).
Print Accepted rate= accepted / generated during running.
Add hydrogens before creating the pdbqt file with meeko when constraint docking is used.
seed_mol of moldrug.utils.GA now could be a list (or iterable in a general way) of RDKit molecules. This feature could be used to combine several moldrug runs and create a final run with this combined population.
seed_mol from the command line could be a valid SMILES, a list of valid SMILES or a list of paths to the _pop.pbz2 binary files. In the last case, all the populations will be combined and sorted based on the cost attribute. If the resulting population is smaller than popsize, new structures will be generated to complete the initial population. The individuals of this initial population will be reinitialized and the cost function will be calculated.
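The acceptance bookkeeping described in this release can be sketched as a nested dictionary. The dictionary layout follows the description above; the helper record_generation and the guard against a generation with zero generated molecules are assumptions added for illustration.

```python
from typing import Dict


def record_generation(acceptance: Dict[int, Dict[str, int]], gen_id: int,
                      accepted: int, generated: int) -> float:
    """Store the counts for one generation and return accepted / generated.

    Hypothetical helper; returns 0.0 when nothing was generated.
    """
    acceptance[gen_id] = {"accepted": accepted, "generated": generated}
    return accepted / generated if generated else 0.0


acceptance: Dict[int, Dict[str, int]] = {}
rate = record_generation(acceptance, gen_id=1, accepted=5, generated=20)
empty_rate = record_generation(acceptance, gen_id=2, accepted=0, generated=0)
```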
2.1.12 - 2022.09.29#
Added#
score_only bool parameter to moldrug.fitness.get_mol_cost.
Print the starting date when moldrug is called from the command line.
Removed#
Type Hints int for attribute idx on moldrug.utils.Individual.
Changed#
If AutoDock-Vina fails inside moldrug.fitness.vinadock; give as pdbqt the string “VinaFailed”.
If the molecule has a molecular weight higher than wt_cutoff when moldrug.fitness.CostOnlyVina (or moldrug.fitness.CostMultiReceptorsOnlyVina) is called; the pdbqt attribute of the returned Individual will be the string “TooHeavy” (or the list of strings List[“TooHeavy”])
2.1.7 - 2022.09.02#
Fixed#
Bug on moldrug.fitness.vinadock during searching of MCS between Individual.mol and constraint_ref. Before, it was necessary to manually specify the atom IDs of the seed_mol that match constraint_ref; now it is not needed anymore.
Bug during handling exception in moldrug.constraintconf.generate_conformers.
Bug during handling exception moldrug.constraintconf.generate_conformers in moldrug.fitness.vinadock.
Bug(s) when constraint docking is used on different Vina versions. The output of vina is not the same and therefore moldrug.fitness.vinadock failed.
Changed#
In case constraint = True in moldrug.fitness.vinadock, ref_smi will be the MCS between individual.mol and constraint_ref instead of the SMILES string of constraint_ref when moldrug.constrainconf.generate_conformers is internally called.
Added#
moldrug.fitness.get_mol_cost function.
Attribute genID to the generated individuals during a moldrug.utils.GA run.
2.1.0 - 2022.08.30#
Fixed#
Bug during the calculation of probabilities when costs are larger numbers.
Expose hidden error if some Exception occurred during the parallel run.
Added#
moldrug.constrainconf module.
Raise ValueError if ref_smi is invalid in moldrug.utils.constrainconf.generate_conformers.
Changed#
In case constraint = True in moldrug.fitness.vinadock, ref_smi will be the SMILES string of constraint_ref when moldrug.constrainconf.generate_conformers is internally called. This is in order to avoid errors when moldrug.utils.constrainconf.generate_conformers tries to guess ref_smi based on MCS and fails, see this RDKit bug. The workaround for constraint docking is explained here: Constraint Docking.
moldrug.fitness.generate_conformers does not fail. In case of Exception, it returns the same mol without conformers and writes the error in a log file into the working directory.
The attribute name bestcost replaced by best_cost of moldrug.utils.GA.
The functions duplicate_conformers, get_mcs, generate_conformers, constraintconf and constraintconf_cmd and the class ProteinLigandClashFilter were moved from moldrug.fitness module to moldrug.constrainconf module.
Entrance point constraintconf now it is linked to moldrug.constrainconf.constraintconf_cmd instead of moldrug.fitness.constraintconf_cmd.
2.0.0 - 2022.08.25#
Added#
The functions duplicate_conformers, get_mcs, generate_conformers, constraintconf and constraintconf_cmd and the class ProteinLigandClashFilter. The code was borrowed from Pat Walters. It is used if constraint docking is needed.
constraintconf can be called from the command line.
moldrug.fitness.vinadock() a simple wrapper around vina. This function will be used for all the implemented cost functions inside of the module moldrug.fitness. It could be used for constraint docking.
moldrug.data.constraintref. This module is used for testing in case constraint docking is needed. It has two MolBlock strings: r_6lu7 and r_x0161, which could be easily converted into RDKit molecules:

    from rdkit import Chem
    from moldrug.data import constraintref

    mol = Chem.MolFromMolBlock(constraintref.r_x0161)

This molecule is needed for the keyword argument constraint_ref of the functions of the moldrug.fitness module in case constraint docking is used.
Constraint docking capability in all implemented cost functions of the module moldrug.fitness.
moldrug.data.receptor_pdb. This module is similar to moldrug.data.receptor_pdbqt but in pdb format.
Documentation and tutorials.
Changed#
moldrug.utils.make_sdf only will create the sdf file based on the pdbqt attribute. If pdbqt is a list, it will work as the previous version works with pdbqts attribute.
Name of the module moldrug.data.receptors to moldrug.data.receptor_pdbqt.
Name of keyword argument receptor_path to receptor_pdbqt_path on the cost functions: moldrug.fitness.Cost and moldrug.fitness.CostOnlyVina.
Name of keyword arguments receptor_path, vina_score_types, boxcenters and boxsizes to receptor_pdbqt_path, vina_score_type, boxcenter and boxsize respectively on the cost functions: moldrug.fitness.CostMultiReceptors and moldrug.fitness.CostMultiReceptorsOnlyVina.
smiles attribute in moldrug.utils.Individual now it is always without explicit Hs, regardless of whether the mol attribute has them.
1.1.0 - 2022.08.23#
Changed#
moldrug.utils.Individual now is a hashable object.
moldrug.utils.GA.SawIndividuals now is a set instead of a list
moldrug.utils.update_reactant_zone sets the keywords matchValences and ringMatchesRingOnly to True on rdFMCS.FindMCS. This prevents undesired effects. E.g.:

    from moldrug.utils import update_reactant_zone
    from rdkit import Chem

    mol1 = Chem.MolFromSmiles('c1ccccc1')
    mol2 = Chem.MolFromSmiles('CCCN')
    update_reactant_zone(parent=mol1, offspring=mol2, parent_replace_ids=[0, 2])

Before the result was: ([0, 2, 3], []). Now it is: ([0, 1, 2, 3], []). The behavior was because ringMatchesRingOnly is set to False by default inside RDKit.
Removed#
moldrug.utils.timeit. No longer needed.
Added#
Documentation.
1.0.2 - 2022.08.08#
Removed#
Popen option in moldrug.utils.run.
Changed#
RuntimeError replaced by warnings.warn when a vina run fails; every error is saved as idx_error.pbz2, where idx is the index of the failed individual.
Print format when mutate fails inside of moldrug.utils.GA
Added#
Print moldrug’s version when the command line is used.
1.0.0 - 2022.07.30#
Fixed#
Hidden RuntimeError in moldrug.fitness module.
Bug printing number of generations in final info.
Removed#
Unused code in moldrug.home module.
3D conformation in the mol attribute of the Individual during initialization.
The use of grow_mol in the initialization of the population when get_similar = True. Now the population is initialized with mutate_mol and the same set of crem parameters used during the searching.
The automatic addition of Hs in the case where min_size and/or max_size were equal to zero. Now if your intention is to work with the hydrogens, you must provide a SMILES with the explicit Hs. In the future, the input could be just an RDKit mol. Now you must specify if you would like to add explicit Hs to the molecule with the keyword AddHs; the default is False and is used for both moldrug.utils.GA and moldrug.utils.Local.
Added#
Handling vina RuntimeError and keeping track of debug. This feature is used to identify what the error is. It will be removed in the future.
Two new fitness functions: moldrug.fitness.CostOnlyVina and moldrug.fitness.CostMultiReceptorsOnlyVina. They only use the information of Vina scoring function. See the docs for more info about it.
Tracking of atom indexes during generations in order to use protected_ids and replace_ids options of mutate_mol function of CReM. Before it was not possible; the use of these features generated undesired solutions because the indexes are not static over generations. Even so, there are still some problems for symmetric molecules. We are working on it.
Changed#
The whole moldrug works based on the RDKit mol instead of the SMILES string:
moldrug.utils.Individual is now initialized with mol instead of smiles. Now the SMILES string is generated internally; it is still used as an identifier for the instance.
moldrug.utils.GA changed smiles for seed_mol and the mol variable is not needed anymore.
moldrug.utils.Local changed mol for seed_mol in the initialization variables.
moldrug.utils.confgen changed smiles for mol variable.
0.1.0 - 2022-07-25#
Fixed#
Minor code cleaning.
Bug during the import of user custom cost function.
Added#
outdir option for the command line.
User custom desirability.
0.0.4 - 2022-07-21#
Fixed#
Minor compatibility issue with Python 3.8 (issue #4).
Problem with the user’s custom cost function supplied on the command line.
Local class compatible with the command line.
Minor code cleaning.
Better code coverage during testing.