Dr. Thomas Engel - Faculty for Chemistry and Pharmacy

Chemoinformatics - Basic Concepts and Methods

Contents

Foreword
List of Contributors
1. Introduction
2. Principles of Molecular Representations
3. Computer Processing of Chemical Structure Information
4. Representation of Chemical Reactions
5. The Data
6. Databases and Data Sources in Chemistry
7. Searching Chemical Structures
8. Computational Chemistry
8.1 Empirical Approaches to the Calculation of Properties
8.2 Molecular Mechanics
8.3 Molecular Dynamics
8.4 Quantum Mechanics
9. Modeling and Prediction of Properties (QSPR/QSAR)
10. Calculation of Structure Descriptors
11. Data Analysis and Data Handling (QSPR/QSAR)
11.1 Methods for Multivariate Data Analysis
11.2 Artificial Neural Networks (ANNs)
11.3 Deep and Shallow Neural Networks
12. QSAR/QSPR Revisited
13. Bioinformatics
14. Future Directions
Answers Section
Index


1. Introduction

(Thomas Engel and Johann Gasteiger)

1.1 The Rationale for the Books
1.2 The Objectives of Chemoinformatics
1.3 Learning in Chemoinformatics
1.4 Outline of the Book
1.5 The Scope of the Book
1.6 Teaching Chemoinformatics

2. Principles of Molecular Representations

(Thomas Engel)

2.1 Introduction
2.2 Chemical Nomenclature
2.3 Chemical Notations
2.4 Mathematical Notations
2.5 Specific Types of Chemical Structures
2.6 Spatial Representation of Structures
2.7 Molecular Surfaces

3. Computer Processing of Chemical Structure Information

(Thomas Engel)

3.1 Introduction
3.2 Standard File Formats for Chemical Structure Information
3.3 Input and Output of Chemical Structures
3.4 Processing Constitutional Information
3.5 Processing 3D Structure Information
3.6 Visualization of Molecular Models
3.7 Calculation of Molecular Surfaces
3.8 Chemoinformatic Toolkits and Workflow Environments

4. Representation of Chemical Reactions

(Oliver Sacher and Johann Gasteiger)

4.1 Introduction
4.2 Reaction Equation
4.3 Reaction Types
4.4 Reaction Center and Reaction Mechanisms
4.5 Chemical Reactivity
4.6 Learning from Reaction Information
4.7 Building of Reaction Databases
4.8 Reaction Center Perception
4.9 Reaction Classification
4.10 Stereochemistry of Reactions
4.11 Reaction Networks

5. The Data

(Jarosław Tomczak and Giorgi Lekishvili)

5.1 Introduction
5.2 Data Types
5.3 Storage and Manipulation of Data
5.4 Conclusions

6. Databases and Data Sources in Chemistry

(Engelbert Zass and Thomas Engel)

6.1 Introduction
6.2 Chemical Literature and Databases
6.3 Major Chemical Database Systems
6.4 Compound Databases
6.5 Databases with Properties of Compounds
6.6 Reaction Databases
6.7 Bibliographic and Citation Databases
6.8 Full-Text Databases
6.9 Architecture of a Structure-Searchable Database

7. Searching Chemical Structures

(Nikolay Kochev, Valentin Monev, and Ivan Bangov)

7.1 Introduction
7.2 Full Structure Search
7.3 Substructure Search
7.4 Similarity Search
7.5 Three-Dimensional Structure Search Methods
7.6 Sequence Searching in Protein and Nucleic Acid Databases
7.7 Summary

8. Computational Chemistry

8.1 Empirical Approaches to the Calculation of Properties

(Johann Gasteiger)

8.1.1 Introduction
8.1.2 Additivity of Atomic Contributions
8.1.3 Attenuation Models


8.2 Molecular Mechanics

(Harald Lanig)

8.2.1 Introduction
8.2.2 No Force Field Calculation without Atom Types
8.2.3 The Functional Form of Common Force Fields
8.2.4 Available Force Fields

8.3 Molecular Dynamics

(Harald Lanig)

8.3.1 Introduction
8.3.2 The Continuous Movement of Molecules
8.3.3 Methods
8.3.4 Constant Energy, Temperature, or Pressure?
8.3.5 Long-Range Forces
8.3.6 Application of Molecular Dynamics Techniques
8.3.7 Future Perspectives

8.4 Quantum Mechanics

(Tim Clark)

8.4.1 Hückel Molecular Orbital Theory
8.4.2 Semiempirical MO Theory
8.4.3 Ab Initio Molecular Orbital Theory
8.4.4 Density Functional Theory
8.4.5 Properties from Quantum Mechanical Calculations
8.4.6 Quantum Mechanical Techniques for Very Large Molecules
8.4.7 The Future of Quantum Mechanical Methods in Chemoinformatics

9. Modeling and Prediction of Properties (QSPR/QSAR)

(Johann Gasteiger)

10. Calculation of Structure Descriptors

(Lothar Terfloth and Johann Gasteiger)

10.1 Introduction
10.2 Structure Descriptors for Classification and Similarity Searching
10.3 Structure Descriptors for Quantitative Modeling
10.4 Descriptors That Are Not Calculated from the Chemical Structure
10.5 Summary and Outlook

11. Data Analysis and Data Handling (QSPR/QSAR)

11.1 Methods for Multivariate Data Analysis

(Kurt Varmuza)

11.1.1 Introduction into Multivariate Data Analysis
11.1.2 Basics of Statistical Data Evaluation
11.1.3 Multivariate Data
11.1.4 Evaluation of Empirical Models
11.1.5 Exploration: Analyzing the Independent Variables
11.1.6 Calibration: Building a Quantitative Model
11.1.7 Classification: Discriminating Samples

11.2 Artificial Neural Networks (ANNs)

(Jure Zupan)

11.2.1 How to Learn a New Method?
11.2.2 Multivariate Representation of Data
11.2.3 Overview of Artificial Neural Networks (ANNs)
11.2.4 Error Back-Propagation ANNs
11.2.5 Kohonen and Counter-Propagation ANN
11.2.6 Training of the ANN: Adapting the Weights
11.2.7 Controlling Model Complexity and Optimizing Predictivity
11.2.8 A Few General Remarks about ANNs

11.3 Deep and Shallow Neural Networks

(Dave Winkler)

11.3.1 Drug Design in the Era of Big Data and Artificial Intelligence (AI)
11.3.2 Deep Learning
11.3.3 Controlling Model Complexity and Optimizing Predictivity Using Regularization
11.3.4 Universal Approximation Theorem
11.3.5 Do QSAR Models Generated by Neural Networks Meet the Requirements of the Universal Approximation Theorem?
11.3.6 Comparison of the Performance of Deep and Shallow Regularized Neural Networks on Drug Datasets
11.3.7 A Few General Remarks about Neural Networks for Drug Discovery

12. QSAR/QSPR Revisited

(Alexander Golbraikh and Alexander Tropsha)

12.1 Best Practices of QSAR Modeling
12.2 The Data Science of QSAR Modeling

13. Bioinformatics

(Heinrich Sticht)

13.1 Introduction
13.2 Sequence Databases
13.3 Searching Sequence Databases
13.4 Characterization of Protein Families
13.5 Homology Modeling

14. Future Directions

(Johann Gasteiger)

14.1 Access to Chemical Information
14.2 Representation of Chemical Compounds
14.3 Representation of Chemical Reactions
14.4 Learning from Chemical Information
14.5 Training in Chemoinformatics