ETKDG Refinement
The Experimental-Torsion Knowledge Distance Geometry (ETKDG) refinement is what distinguishes this algorithm from plain distance geometry. It uses experimentally observed torsion angle preferences from the Cambridge Structural Database (CSD) to steer conformers toward chemically realistic geometries.
The Key Insight
Plain distance geometry produces geometrically valid structures, but the torsion angles around rotatable bonds are essentially random. The ETKDG approach adds a torsion preference force field that biases the conformer toward experimentally observed dihedral angles.
CSD Torsion Pattern Library
sci-form includes 837 SMARTS patterns with associated Fourier coefficients, derived from the Cambridge Structural Database analysis by Guba et al.
Pattern Categories
| Category | Count | Description |
|---|---|---|
| v2 patterns | 365 | General torsion patterns |
| Macrocycle patterns | 472 | Patterns specific to macrocyclic systems |
Fourier Representation
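Each pattern encodes the preferred torsion angle distribution as a 6-term Fourier series:

$$V(\phi) = \sum_{k=1}^{6} V_k \left(1 + s_k \cos(k\phi)\right)$$

where:
- $V_k$ is the amplitude for the $k$-th Fourier component
- $s_k \in \{-1, +1\}$ is the sign
- $\phi$ is the dihedral angle

The coefficients $V_k$ and signs $s_k$ are stored alongside each SMARTS pattern.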
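A minimal sketch of how such a series could be evaluated, assuming the usual ETKDG torsion form $V(\phi) = \sum_k V_k (1 + s_k \cos k\phi)$. The function name `torsion_energy` and the example coefficients are illustrative and are not taken from sci-form:

```python
import math

def torsion_energy(phi, amplitudes, signs):
    """Evaluate sum_k V_k * (1 + s_k * cos(k * phi)) for k = 1..6.

    phi is in radians; amplitudes and signs are 6-element sequences.
    """
    if len(amplitudes) != 6 or len(signs) != 6:
        raise ValueError("expected 6 amplitudes and 6 signs")
    return sum(
        v * (1.0 + s * math.cos(k * phi))
        for k, (v, s) in enumerate(zip(amplitudes, signs), start=1)
    )

# Hypothetical coefficients: a single 2-fold term with negative sign,
# which is lowest at 0 and 180 degrees (a planar preference).
V = [0.0, 5.0, 0.0, 0.0, 0.0, 0.0]
S = [1, -1, 1, 1, 1, 1]

planar = torsion_energy(0.0, V, S)
perpendicular = torsion_energy(math.pi / 2, V, S)
```

With these made-up values the energy vanishes for the planar arrangement and peaks at 90°, which is the qualitative behaviour a planar-preference pattern would encode.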
Pattern Matching Priority
For each rotatable bond, patterns are matched in order:
- CSD patterns — first-match-wins among the 837 SMARTS
- Basic knowledge — fallback rules for common chemical environments
TIP
Pattern matching uses a first-match-wins strategy. More specific patterns are listed before general ones, so a pattern for "amide C-N" will match before a generic "any C-N" pattern.
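The first-match-wins lookup with a fallback can be sketched as below. This is only an illustration of the ordering logic: real matching runs SMARTS queries against the molecule, whereas here each pattern is a hypothetical Python predicate over a simple dict, and all names and parameter values are placeholders.

```python
def first_match(bond, patterns, fallback):
    """Return (source, params) for the first pattern whose predicate accepts the bond.

    patterns is an ordered list of (name, predicate, params) tuples,
    with more specific entries placed before general ones.
    """
    for name, matches, params in patterns:
        if matches(bond):
            return name, params
    # No library pattern matched: use the basic-knowledge rule instead.
    return "basic-knowledge", fallback(bond)

# Hypothetical stand-ins for SMARTS patterns, ordered specific-to-general.
patterns = [
    ("amide C-N", lambda b: b["atoms"] == ("C", "N") and b.get("amide", False), "amide-params"),
    ("any C-N", lambda b: b["atoms"] == ("C", "N"), "generic-CN-params"),
]

def fallback(bond):
    return "default-params"

amide_hit = first_match({"atoms": ("C", "N"), "amide": True}, patterns, fallback)
amine_hit = first_match({"atoms": ("C", "N")}, patterns, fallback)
alkane_hit = first_match({"atoms": ("C", "C")}, patterns, fallback)
```

Because the amide entry precedes the generic C-N entry, an amide bond never reaches the generic rule, and a bond with no match at all falls through to the basic-knowledge default.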
Basic Knowledge Torsion Rules
When no CSD pattern matches a rotatable bond, these rules provide reasonable defaults:
| Environment | Rule |
|---|---|
| Ring bond (4-member) | Flat |
| Ring bond (5-member) | Flat |
| Ring bond (6-member) | Flat |
| Double bond | Planar |
| Amide C-N | Planar preference |
| Ester C-O | Planar preference |
| Aromatic-X | Semi-planar |
| SP3-SP3 | Staggered |
| Ether/Amine | Soft staggered |
| Biaryl | Semi-planar |
ETKDG 3D Force Field Components
The complete ETKDG 3D force field combines torsion preferences with structural constraints:
1. Torsion Contributions (from CSD or basic knowledge)
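For each matched torsion, the energy is the 6-term Fourier series of the matched pattern, with amplitudes $V_k$, signs $s_k$, and dihedral angle $\phi$:

$$E_{\text{torsion}}(\phi) = \sum_{k=1}^{6} V_k \left(1 + s_k \cos(k\phi)\right)$$

The $\cos(k\phi)$ terms are computed via the Chebyshev recurrence for efficiency, so only one $\cos\phi$ evaluation is needed per torsion.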
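The recurrence in question is $\cos(k\phi) = 2\cos\phi\,\cos((k-1)\phi) - \cos((k-2)\phi)$. A small sketch of how the six cosine multiples can be generated from a single $\cos\phi$ value follows; the helper name `cos_multiples` is illustrative and is not part of sci-form:

```python
import math

def cos_multiples(cos_phi, n=6):
    """Return [cos(1*phi), ..., cos(n*phi)] given only cos(phi).

    Uses cos(k*phi) = 2*cos(phi)*cos((k-1)*phi) - cos((k-2)*phi).
    """
    values = []
    prev, curr = 1.0, cos_phi  # cos(0*phi) and cos(1*phi)
    for _ in range(n):
        values.append(curr)
        prev, curr = curr, 2.0 * cos_phi * curr - prev
    return values

phi = 0.7
from_recurrence = cos_multiples(math.cos(phi))
direct = [math.cos(k * phi) for k in range(1, 7)]
```

The two lists agree to floating-point precision, while the recurrence needs one trigonometric call instead of six.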
2. UFF Inversions (Out-of-Plane)
For SP2 centers with 3 heavy neighbors, a UFF inversion term penalizes out-of-plane displacement of the central atom. Three permutations are evaluated per improper center, cycling through the neighbor triple.
3. Distance Constraints
Maintain bond lengths and angles via flat-bottom potentials:
| Pair type | Force constant | Tolerance |
|---|---|---|
| 1-2 (bonds) | 100 | 0.01 Å |
| 1-3 (improper) | 100 | varies |
| Long-range | 10 | varies |
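A flat-bottom potential is zero inside a tolerance window around the target distance and rises outside it. The sketch below assumes a quadratic wall of the form $k \cdot \text{excess}^2$; the exact wall shape and prefactor used by sci-form are not given in this section, so treat both as assumptions. The function name and the 1.54 Å target are illustrative.

```python
def flat_bottom_energy(d, d0, tol, k):
    """Energy of a flat-bottom distance restraint.

    Zero while d0 - tol <= d <= d0 + tol; quadratic in the excess outside.
    The k * excess**2 form is an assumed convention.
    """
    lower, upper = d0 - tol, d0 + tol
    if d < lower:
        excess = lower - d
    elif d > upper:
        excess = d - upper
    else:
        return 0.0
    return k * excess * excess

# A 1-2 restraint using the table values k = 100 and tolerance 0.01 Angstrom,
# with a hypothetical target bond length of 1.54 Angstrom.
inside = flat_bottom_energy(1.545, 1.54, 0.01, 100.0)
stretched = flat_bottom_energy(1.60, 1.54, 0.01, 100.0)
```

Small deviations within the tolerance cost nothing, so the restraint only acts once the refinement starts to distort the geometry produced by the embedding step.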
4. Linear Angle Constraints
For SP atoms (triple bonds, allenes), an angle constraint maintains the 180° geometry.
Optimization
The ETKDG 3D force field is minimized with a single BFGS pass:
- Maximum iterations: 300
- No restarts — this is a refinement step, not a global optimization
- Early skip: if the initial energy is already below a threshold, skip the minimization entirely
- Result: final 3D coordinates ready for validation
Example: Butane Torsion
For butane (CCCC), the central C-C bond matches a CSD pattern with a staggered preference.
The torsion force field naturally drives the dihedral toward the anti (180°) or gauche (±60°) conformations, which match experimental observation.
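To illustrate where a staggered preference places its minima, the sketch below scans a single 3-fold term $V_3(1 + \cos 3\phi)$. This is an assumed, simplified stand-in for the matched pattern: the actual CSD coefficients for butane are not listed here, and a pure 3-fold term treats anti and gauche as equally favourable, which a real pattern with additional terms need not do.

```python
import math

def threefold(phi_deg, v3=1.0):
    """Single 3-fold torsion term V3 * (1 + cos(3*phi)), phi in degrees."""
    return v3 * (1.0 + math.cos(3.0 * math.radians(phi_deg)))

# Scan the dihedral in 10-degree steps and collect the zero-energy angles.
scan = {angle: threefold(angle) for angle in range(0, 360, 10)}
minima = sorted(angle for angle, energy in scan.items() if energy < 1e-9)
# minima == [60, 180, 300]
```

The angles 60° and 300° are the two gauche conformations (300° is equivalent to −60°) and 180° is the anti conformation, while the eclipsed arrangements at 0°, 120°, and 240° sit at the energy maxima.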