Validation

After minimization, each candidate conformer must pass a series of geometric validation checks. If any check fails, the conformer is rejected and the pipeline retries with a new random embedding.

Validation Pipeline

Check 1: Energy Per Atom

After bounds force field minimization, the energy per atom must be below a threshold:

$$\frac{E_{\text{total}}}{N} < 0.05$$

High energy indicates the optimizer couldn't satisfy the distance constraints — typically from a bad initial embedding where atoms are placed too close together or too far apart.
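A minimal sketch of this check in Python. The function name, argument names, and the choice to pass the threshold as a parameter are illustrative assumptions, not the pipeline's actual API.

```python
def passes_energy_check(total_energy: float, n_atoms: int,
                        threshold: float = 0.05) -> bool:
    """Return True if the minimized energy per atom is below the threshold."""
    if n_atoms <= 0:
        raise ValueError("n_atoms must be positive")
    return total_energy / n_atoms < threshold
```

For a 30-atom molecule, a total energy of 1.0 gives about 0.033 per atom and passes, while a total energy of 3.0 gives 0.1 per atom and is rejected.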

Check 2: Tetrahedral Centers

For each carbon or nitrogen atom that:

  • Has degree 4 (4 neighbors)
  • Is in 2 or more SSSR rings
  • Is NOT in any 3-membered ring

the volume of the tetrahedron formed by its 4 neighbors must exceed a minimum threshold, so that the center is not flattened:

Volume Computation

For center atom C with neighbors A, B, D, E:

  1. Compute normalized direction vectors from each neighbor to the center:
$$\hat{v}_{AC} = \frac{x_C - x_A}{|x_C - x_A|}$$
  2. Compute 4 triple products (one for each triple of neighbors):
$$V_1 = \hat{v}_{AC} \cdot (\hat{v}_{BC} \times \hat{v}_{DC})$$
  3. All 4 triple products must exceed the threshold:
$$|V_k| > \begin{cases} 0.50 & \text{normal atoms} \\ 0.25 & \text{atoms in small rings} \end{cases}$$
  4. Additionally, the center must be inside the tetrahedron: all 4 face normals must point outward relative to the center (tolerance: 0.30).
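Steps 1 to 3 can be sketched in plain Python as follows. The helper names and the coordinate representation (3-tuples) are assumptions made for illustration. The inside-the-tetrahedron test from step 4 is deliberately left out, since its exact formulation is not spelled out here.

```python
from itertools import combinations
import math

def unit_vector(start, end):
    """Unit vector pointing from `start` to `end`."""
    d = [e - s for s, e in zip(start, end)]
    norm = math.sqrt(sum(c * c for c in d))
    return [c / norm for c in d]

def triple_product(a, b, c):
    """Scalar triple product a . (b x c)."""
    cross = [
        b[1] * c[2] - b[2] * c[1],
        b[2] * c[0] - b[0] * c[2],
        b[0] * c[1] - b[1] * c[0],
    ]
    return sum(x * y for x, y in zip(a, cross))

def passes_volume_check(center, neighbors, in_small_ring=False):
    """Check that all 4 neighbor triples span a sufficiently large volume."""
    threshold = 0.25 if in_small_ring else 0.50
    # Step 1: normalized vectors from each neighbor to the center.
    vecs = [unit_vector(n, center) for n in neighbors]
    # Steps 2-3: one triple product per triple of neighbors, all above threshold.
    return all(
        abs(triple_product(a, b, c)) > threshold
        for a, b, c in combinations(vecs, 3)
    )
```

For an ideal tetrahedron each triple product has magnitude 4/(3√3) ≈ 0.77, comfortably above 0.50, whereas four coplanar neighbors give a volume of 0 and fail.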

Check 3: Chiral Volume Signs

For atoms with @ or @@ stereo specification:

$$\operatorname{sign}(V_{\text{actual}}) = \operatorname{sign}(V_{\text{expected}})$$

where:

  • @ (counterclockwise) → volume should be positive
  • @@ (clockwise) → volume should be negative

The volume is $V = v_1 \cdot (v_2 \times v_3)$, with vectors computed from the chiral center to its neighbors, in the order specified by the SMILES.

A 20% tolerance is applied — the absolute volume must be at least 20% of the target:

$$|V| > 0.2\,|V_{\text{target}}|$$
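The sign and magnitude tests can be combined in a short sketch. It assumes the first three neighbors (in SMILES order) define the volume and that the caller supplies the expected sign (+1 for @, -1 for @@) and the target volume. All names are hypothetical.

```python
def triple_product(a, b, c):
    """Scalar triple product a . (b x c)."""
    cross = [
        b[1] * c[2] - b[2] * c[1],
        b[2] * c[0] - b[0] * c[2],
        b[0] * c[1] - b[1] * c[0],
    ]
    return sum(x * y for x, y in zip(a, cross))

def passes_chiral_check(center, neighbors, expected_sign, target_volume,
                        tolerance=0.2):
    """Check the chiral volume sign and the 20% magnitude tolerance."""
    # Vectors from the chiral center to its first three neighbors (assumption).
    v1, v2, v3 = ([n[i] - center[i] for i in range(3)] for n in neighbors[:3])
    volume = triple_product(v1, v2, v3)
    # The sign must match: +1 for @, -1 for @@.
    if volume * expected_sign <= 0:
        return False
    # The magnitude must exceed 20% of the target volume.
    return abs(volume) > tolerance * abs(target_volume)
```

Swapping any two neighbors flips the sign of the volume, which is why the neighbor order taken from the SMILES matters.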

Check 4: Planarity

For SP2 centers (C=C, C=O, aromatic C, amide N), the out-of-plane energy is computed using UFF inversions:

$$E_{\text{oop}} = K \sum_{\text{impropers}} (1 - \sin Y)$$

where Y is the Wilson out-of-plane angle.

Rejection criterion:

$$E_{\text{oop}} > N_{\text{improper}} \times 0.7$$

This accepts conformers where most SP2 centers are planar, rejecting only those with severely distorted geometry.

Additionally, SP-hybridized atoms (triple bonds, allenes, linear) are checked for linearity:

$$\theta_{\text{actual}} > 175°$$
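The linearity test reduces to a bond-angle computation at the SP center. A minimal sketch, with hypothetical function names and 3-tuple coordinates:

```python
import math

def bond_angle_degrees(a, center, b):
    """Angle a-center-b in degrees."""
    u = [a[i] - center[i] for i in range(3)]
    w = [b[i] - center[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(u, w))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_w = math.sqrt(sum(x * x for x in w))
    # Clamp to [-1, 1] to guard against rounding error before acos.
    cos_t = max(-1.0, min(1.0, dot / (norm_u * norm_w)))
    return math.degrees(math.acos(cos_t))

def is_linear(a, center, b, min_angle=175.0):
    """True if the a-center-b angle exceeds the linearity threshold."""
    return bond_angle_degrees(a, center, b) > min_angle
```

A perfectly straight arrangement gives 180° and passes. A 120° angle, typical of an SP2 center, fails.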

Check 5: Double-Bond Geometry

For each double bond, the substituents on each side must not be linear with the bond axis:

$$\cos\theta + 1 > 10^{-3}$$

where θ is the angle between a substituent and the double bond vector. A value near 180° (cos θ ≈ −1) would mean the substituent is collinear with the double bond — a physically impossible configuration.
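A sketch of this test for a single substituent. It assumes both vectors originate at the double-bond atom that carries the substituent, so that a collinear arrangement gives cos θ = −1. The function and parameter names are illustrative.

```python
import math

def substituent_not_collinear(bond_atom, other_bond_atom, substituent,
                              tol=1e-3):
    """True if the substituent is not collinear with the double bond axis."""
    # Vector along the double bond, starting at the substituted atom.
    u = [other_bond_atom[i] - bond_atom[i] for i in range(3)]
    # Vector from the substituted atom to its substituent.
    w = [substituent[i] - bond_atom[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(u, w))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_w = math.sqrt(sum(x * x for x in w))
    cos_theta = dot / (norm_u * norm_w)
    return cos_theta + 1.0 > tol
```

With a substituent at the usual 120° angle, cos θ + 1 = 0.5 and the check passes. A substituent directly opposite the bond gives cos θ + 1 = 0 and fails.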

Summary of Thresholds

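| Check | Threshold | What It Catches |
|---|---|---|
| Energy/atom | < 0.05 | Bad embedding, stuck optimization |
| Tetrahedral volume | > 0.50 (0.25 small ring) | Collapsed ring junctions |
| Chiral sign | Sign match; volume at least 20% of target | Wrong stereochemistry |
| Planarity | $E_{\text{oop}} < 0.7 \times N_{\text{improper}}$ | Puckered SP2 centers |
| Double bond | $\cos\theta + 1 > 10^{-3}$ | Linear substituents |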

Retry Budget

The pipeline allows up to 10N total retry iterations, where N is the number of atoms. For a typical drug-like molecule with 30 atoms, this gives 300 attempts — which is almost always sufficient.

In practice, most molecules succeed within 1–5 attempts. Molecules with many chiral centers or strained ring systems may need 10–50 attempts.
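The retry budget can be pictured as a simple bounded loop. In this sketch, `embed` and `validate` are hypothetical callables standing in for the random embedding plus minimization step and the validation checks, respectively. They are not names from the actual pipeline.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

def generate_conformer(n_atoms: int,
                       embed: Callable[[], T],
                       validate: Callable[[T], bool]) -> Optional[T]:
    """Try up to 10 * n_atoms embeddings; return the first valid conformer."""
    max_attempts = 10 * n_atoms
    for _ in range(max_attempts):
        candidate = embed()
        if validate(candidate):
            return candidate
    # Budget exhausted without a conformer that passes every check.
    return None
```

Returning `None` when the budget runs out leaves the decision about what to do next (for example, reporting a failure) to the caller.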

Released under the MIT License.