
NMR Spectroscopy — HOSE Codes & Karplus Equation

sci-form predicts ¹H and ¹³C NMR chemical shifts from the molecular graph using HOSE codes (Hierarchical Organization of Spherical Environments), and estimates vicinal J-coupling constants via the Karplus equation.

Pipeline

Theory

HOSE Codes — Chemical Environment Fingerprint

A HOSE code encodes the spherical chemical environment around each atom by performing a breadth-first traversal of the molecular graph out to radius R (default 4 bonds):

  1. Start at the central atom (sphere 0).
  2. At each sphere r = 1, 2, …, R, record the neighbor atoms sorted by atomic number, bond order, and connectivity count.
  3. Concatenate these strings separated by sphere delimiters ( and ).

Example for the carbonyl carbon in acetone (CH₃COCH₃):

C-6;C;=O,C/C,C//H,H,H,H,H,H/

Reading: central C → sphere 1 has C=O and C–C → sphere 2 has C and C → sphere 3 has H×3 and H×3.
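The sphere-by-sphere traversal can be sketched in a few lines of Python. This is an illustrative sketch only: the atom/bond list representation, the `hose_spheres` helper, and the simplified output string are assumptions made for the example, not sci-form's internal format. The sort key uses atomic number (heaviest first) and bond order only, and connectivity count is left out for brevity. Ethanol (`CCO`) serves as the test molecule.

```python
ATOMIC_NUMBER = {"H": 1, "C": 6, "N": 7, "O": 8}
BOND_PREFIX = {1: "", 2: "=", 3: "#"}

def hose_spheres(elements, bonds, center, radius=4):
    """Return one list of neighbor labels per sphere around `center`.

    elements: element symbols indexed by atom
    bonds:    (i, j, order) tuples
    """
    adjacency = {i: [] for i in range(len(elements))}
    for i, j, order in bonds:
        adjacency[i].append((j, order))
        adjacency[j].append((i, order))

    visited = {center}
    frontier = [center]
    spheres = []
    for _ in range(radius):
        found = []
        for atom in frontier:
            for nbr, order in adjacency[atom]:
                if nbr not in visited:
                    visited.add(nbr)
                    found.append((nbr, order))
        if not found:
            break  # the molecule is smaller than the requested radius
        # Assumed ordering: heaviest element first, then higher bond order.
        found.sort(key=lambda t: (-ATOMIC_NUMBER[elements[t[0]]], -t[1]))
        spheres.append([BOND_PREFIX[o] + elements[n] for n, o in found])
        frontier = [n for n, _ in found]
    return spheres

# Ethanol: C0 (CH3), C1 (CH2), O2, then the hydrogens.
ETHANOL_ELEMENTS = ["C", "C", "O", "H", "H", "H", "H", "H", "H"]
ETHANOL_BONDS = [(0, 1, 1), (1, 2, 1), (0, 3, 1), (0, 4, 1), (0, 5, 1),
                 (1, 6, 1), (1, 7, 1), (2, 8, 1)]

spheres = hose_spheres(ETHANOL_ELEMENTS, ETHANOL_BONDS, center=1)
print("/".join(",".join(s) for s in spheres))  # O,C,H,H/H,H,H,H
```

Each atom is recorded only once, in the first sphere that reaches it, which is what keeps ring systems from being counted twice.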

The resulting string is matched against a reference database of >10,000 shifts to find similar chemical environments and predict the shift by analogy.
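One common way to predict by analogy is to look up the full code first and, if it is missing, retry with the outer spheres dropped. The sketch below assumes that strategy; it is not confirmed to be how sci-form searches its database. The `REFERENCE` table, its shift values, the `lookup_shift` helper, and the depth-based confidence are hypothetical placeholders for illustration, not measured data or library API.

```python
# Keys are tuples of per-sphere strings; values are shifts in ppm.
# Placeholder numbers for illustration only.
REFERENCE = {
    ("O,C,H,H", "H,H,H,H"): 58.0,
    ("O,C,H,H",): 60.0,
}

def lookup_shift(spheres, reference, default=None):
    """Try the longest code first, then drop outer spheres one at a time.

    Returns (shift, confidence); confidence is the matched fraction of spheres.
    """
    key = tuple(",".join(s) for s in spheres)
    for depth in range(len(key), 0, -1):
        hit = reference.get(key[:depth])
        if hit is not None:
            return hit, depth / len(key)
    # No match at any radius: fall back to a caller-supplied default.
    return default, 0.0

print(lookup_shift([["O", "C", "H", "H"], ["H", "H", "H", "H"]], REFERENCE))
print(lookup_shift([["O", "C", "H", "H"], ["C"]], REFERENCE))
```

The `default` argument stands in for a hybridization-based fallback value when no environment matches.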

Chemical Shift Prediction

The predicted shift for atom i is a combination of:

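  1. Hybridization base value: sp³, sp², sp (alkyne), and aromatic carbons/hydrogens have distinct chemical shift ranges.
  2. Electronegativity correction: adjacent electronegative atoms (O, N, F, and other halogens) deshield via inductive effects:

     $$\delta_{\text{ind}} = \sum_{j \in \text{neighbors}} k_\alpha \, (\chi_j - \chi_{\text{ref}})$$

     where χ_j is the electronegativity of neighbor j, χ_ref is a reference electronegativity, and k_α is an empirical scaling constant.
  3. Ring current effect: aromatic rings cause through-space shielding or deshielding:

     $$\delta_{\text{ring}} = \begin{cases} +1.5\ \text{ppm} & \text{H in aromatic ring} \\ -0.5\ \text{ppm} & \text{H above aromatic ring (anisotropy)} \end{cases}$$

  4. Functional group corrections: carbonyl (C=O), carboxylic acid, aldehyde, ether, and amine corrections are applied by pattern matching.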
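To make the electronegativity term concrete, here is a minimal sketch of the inductive sum. The Pauling electronegativities are standard values. The choice of carbon as the reference, the scale factor `k_alpha = 1.0`, and the `inductive_correction` helper are assumptions for illustration; the constants sci-form actually uses are not stated here.

```python
# Pauling electronegativities (standard tabulated values).
PAULING = {"H": 2.20, "C": 2.55, "N": 3.04, "O": 3.44,
           "F": 3.98, "Cl": 3.16, "Br": 2.96}

def inductive_correction(neighbors, k_alpha=1.0, chi_ref=PAULING["C"]):
    """Sum k_alpha * (chi_j - chi_ref) over the neighbor element symbols.

    k_alpha and chi_ref are placeholder choices, not sci-form's parameters.
    """
    return sum(k_alpha * (PAULING[el] - chi_ref) for el in neighbors)

# One oxygen neighbor pushes the shift downfield; carbon neighbors add nothing.
print(round(inductive_correction(["O"]), 2))       # 0.89
print(round(inductive_correction(["C", "C"]), 2))  # 0.0
```

With carbon as the reference, hydrogen neighbors contribute a small negative term, so the sign of each contribution follows directly from whether the neighbor is more or less electronegative than the reference.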

Reference Ranges

| Nucleus | Environment | δ range (ppm) |
| --- | --- | --- |
| ¹H | sp³ C–H (methyl) | 0.8–1.0 |
| ¹H | sp³ C–H (CH₂, CH) | 1.2–2.0 |
| ¹H | sp² C–H (alkene) | 4.8–6.5 |
| ¹H | aromatic C–H | 6.5–8.5 |
| ¹H | aldehyde C–H | 9.0–10.5 |
| ¹H | O–H (alcohol) | 1.0–5.0 |
| ¹H | O–H (carboxylic acid) | 10–13 |
| ¹³C | sp³ (alkyl) | 10–40 |
| ¹³C | sp³ with O/N | 40–90 |
| ¹³C | sp² alkene/aromatic | 100–160 |
| ¹³C | carbonyl C=O (ketone) | 195–215 |
| ¹³C | carboxylic COOH | 160–185 |
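The ranges above also work as a quick sanity check on a predicted or observed shift. The sketch below encodes the table as data and returns every environment whose range contains a given shift; the `candidate_environments` helper is hypothetical and not part of sci-form.

```python
# (nucleus, environment, low ppm, high ppm), copied from the reference table.
REFERENCE_RANGES = [
    ("1H", "sp3 C-H (methyl)", 0.8, 1.0),
    ("1H", "sp3 C-H (CH2, CH)", 1.2, 2.0),
    ("1H", "sp2 C-H (alkene)", 4.8, 6.5),
    ("1H", "aromatic C-H", 6.5, 8.5),
    ("1H", "aldehyde C-H", 9.0, 10.5),
    ("1H", "O-H (alcohol)", 1.0, 5.0),
    ("1H", "O-H (carboxylic acid)", 10.0, 13.0),
    ("13C", "sp3 (alkyl)", 10.0, 40.0),
    ("13C", "sp3 with O/N", 40.0, 90.0),
    ("13C", "sp2 alkene/aromatic", 100.0, 160.0),
    ("13C", "carbonyl C=O (ketone)", 195.0, 215.0),
    ("13C", "carboxylic COOH", 160.0, 185.0),
]

def candidate_environments(nucleus, shift_ppm):
    """List every tabulated environment whose range contains the shift."""
    return [env for nuc, env, lo, hi in REFERENCE_RANGES
            if nuc == nucleus and lo <= shift_ppm <= hi]

print(candidate_environments("13C", 205.0))  # ['carbonyl C=O (ketone)']
print(candidate_environments("1H", 1.5))     # two overlapping candidates
```

Ranges overlap (an alcohol O–H can sit anywhere from 1.0 to 5.0 ppm), so the function returns a list rather than a single label.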

Karplus Equation — ³J(H,H) Vicinal Coupling

Vicinal (three-bond) ¹H–¹H coupling constants are predicted using the Karplus equation:

$${}^3J(\phi) = A\cos^2\phi - B\cos\phi + C$$

with parameters A=7.76, B=1.10, C=1.40 Hz (Altona–Sundaralingam, JACS 1972).

When 3D coordinates are available, the actual H–C–C–H dihedral angle ϕ is computed from the positions. When only topology is available, a free-rotation average is used:

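$$\langle {}^3J \rangle = \frac{1}{2\pi}\int_0^{2\pi} \left(A\cos^2\phi - B\cos\phi + C\right) d\phi = \frac{A}{2} + C \approx 5.3\ \text{Hz}$$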
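As a worked check of both paths, the sketch below computes a dihedral angle from four 3D points, evaluates the Karplus curve with the parameters quoted above, and confirms numerically that the free-rotation average equals A/2 + C. The helper names (`dihedral`, `karplus`, `free_rotation_average`) are illustrative and not sci-form functions.

```python
import math

A, B, C = 7.76, 1.10, 1.40  # Hz, the parameters quoted in the text

def karplus(phi):
    """Vicinal coupling in Hz for a dihedral angle phi in radians."""
    c = math.cos(phi)
    return A * c * c - B * c + C

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (radians) for points p0-p1-p2-p3, e.g. H-C-C-H."""
    def sub(a, b):
        return [a[i] - b[i] for i in range(3)]
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    def cross(a, b):
        return [a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0]]
    b0, b1, b2 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b0, b1), cross(b1, b2)  # normals of the two planes
    norm_b1 = math.sqrt(dot(b1, b1))
    m1 = cross(n1, [x / norm_b1 for x in b1])
    return math.atan2(dot(m1, n2), dot(n1, n2))

def free_rotation_average(n=360):
    """Average the Karplus curve over n evenly spaced angles in [0, 2*pi)."""
    return sum(karplus(2 * math.pi * k / n) for k in range(n)) / n

# Idealized anti (180 degree) arrangement along the z axis.
phi = dihedral([1, 0, 0], [0, 0, 0], [0, 0, 1], [-1, 0, 1])
print(f"anti: {karplus(phi):.2f} Hz")                       # 10.26 Hz
print(f"gauche (60 deg): {karplus(math.radians(60)):.2f} Hz")  # 2.79 Hz
print(f"average: {free_rotation_average():.2f} Hz")         # 5.28 Hz
```

The cos φ term integrates to zero over a full rotation, which is why B does not appear in the averaged value.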

Geminal ²J and Long-Range ⁴J Couplings

  • ²J(H,H): −10 to +15 Hz depending on hybridization and substituents; sci-form uses topology-based estimates (sp³ ~−12 Hz, sp² ~+3 Hz).
  • ⁴J(H,H): typically < 3 Hz; included for zig-zag (W-shaped) paths.

Spectrum Generation — Pascal's Triangle Splitting

For ¹H nuclei with adjacent vicinal couplings J₁, J₂, …, the multiplet pattern is built by iteratively convolving the peak position with first-order splittings:

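$$
\begin{aligned}
\text{doublet}(\delta, J) &: \left\{\delta - \tfrac{J}{2},\ \delta + \tfrac{J}{2}\right\} && \text{relative heights } 1:1 \\
\text{triplet}(\delta, J) &: \left\{\delta - J,\ \delta,\ \delta + J\right\} && \text{relative heights } 1:2:1 \\
\text{quartet}(\delta, J) &: \left\{\delta - \tfrac{3J}{2},\ \delta - \tfrac{J}{2},\ \delta + \tfrac{J}{2},\ \delta + \tfrac{3J}{2}\right\} && \text{relative heights } 1:3:3:1
\end{aligned}
$$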
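The iterative splitting can be sketched as repeated 1:1 doubling with coincident lines merged. Equal couplings then reproduce the Pascal's triangle heights, and unequal couplings give patterns such as a doublet of doublets. The `split` and `multiplet` helpers are illustrative, not sci-form functions. The sketch keeps the line position and J in the same unit (Hz); to place lines on a ppm axis, a coupling in Hz is divided by the spectrometer frequency in MHz.

```python
def split(lines, j):
    """Split each (position, weight) line into a 1:1 doublet of spacing j."""
    merged = {}
    for pos, w in lines:
        for new_pos in (pos - j / 2, pos + j / 2):
            key = round(new_pos, 6)  # merge lines that coincide numerically
            merged[key] = merged.get(key, 0.0) + w / 2
    return sorted(merged.items())

def multiplet(center, couplings):
    """Apply one first-order splitting per coupling constant, in turn."""
    lines = [(center, 1.0)]
    for j in couplings:
        lines = split(lines, j)
    return lines

# Two equal 7 Hz couplings give a 1:2:1 triplet.
print(multiplet(0.0, [7.0, 7.0]))
# [(-7.0, 0.25), (0.0, 0.5), (7.0, 0.25)]

# Unequal couplings give a doublet of doublets with four equal lines.
print(multiplet(0.0, [10.0, 4.0]))
# [(-7.0, 0.25), (-3.0, 0.25), (3.0, 0.25), (7.0, 0.25)]
```

Weights are halved at each split, so the total intensity of a multiplet always sums to the intensity of the original line.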

Lorentzian Broadening

Each multiplet line is broadened by a Lorentzian:

$$S(\delta) = \sum_k w_k \, \frac{\gamma/\pi}{(\delta - \delta_k)^2 + \gamma^2}$$

where $w_k$ are the Pascal's triangle weights, $\delta_k$ is the position of line k, and γ is the half-width at half maximum (default 0.01 ppm for ¹H, 0.5 ppm for ¹³C).
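A minimal pure-Python sketch of the broadening step follows. It evaluates the sum of Lorentzians on a grid and checks two properties of the line shape: a single line of unit weight peaks at 1/(πγ), and the intensity falls to half that value at a distance γ from the center. The `lorentzian_spectrum` and `linspace` helpers are illustrative names, not sci-form functions.

```python
import math

def linspace(start, stop, n):
    """n evenly spaced points from start to stop inclusive (n >= 2)."""
    step = (stop - start) / (n - 1)
    return [start + i * step for i in range(n)]

def lorentzian_spectrum(lines, gamma, grid):
    """Sum of area-normalized Lorentzians; lines are (position, weight)."""
    return [
        sum(w * (gamma / math.pi) / ((x - pos) ** 2 + gamma ** 2)
            for pos, w in lines)
        for x in grid
    ]

gamma = 0.01  # ppm, the 1H default quoted above
peak, half = lorentzian_spectrum([(1.0, 1.0)], gamma, [1.0, 1.0 + gamma])
print(f"peak height: {peak:.2f}")      # 31.83, i.e. 1 / (pi * gamma)
print(f"ratio at +gamma: {half / peak:.2f}")  # 0.50

# A 1:2:1 triplet on a 2000-point grid spanning -2 to 12 ppm.
grid = linspace(-2.0, 12.0, 2000)
triplet = [(1.10, 0.25), (1.20, 0.5), (1.30, 0.25)]
intensities = lorentzian_spectrum(triplet, gamma, grid)
```

Because each Lorentzian integrates to its weight, narrowing γ makes peaks taller rather than changing their area, so grid resolution should be chosen with γ in mind.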

Parameters

predict_nmr_shifts

| Parameter | Type | Description |
| --- | --- | --- |
| `smiles` | `&str` | SMILES string |

predict_nmr_couplings

| Parameter | Type | Description |
| --- | --- | --- |
| `smiles` | `&str` | SMILES string |
| `positions` | `&[[f64;3]]` | 3D coordinates (Å); pass `&[]` for a topological estimate |

compute_nmr_spectrum

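| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `smiles` | `&str` | | SMILES string |
| `nucleus` | `&str` | | `"1H"` or `"13C"` |
| `gamma` | `f64` | `0.01` | Lorentzian HWHM in ppm |
| `ppm_min` | `f64` | `-2.0` (`"1H"`) | Spectral window start |
| `ppm_max` | `f64` | `14.0` (`"1H"`) | Spectral window end |
| `n_points` | `usize` | `2000` | Grid resolution |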

Output Types

NmrShiftResult

| Field | Type | Description |
| --- | --- | --- |
| `h_shifts` | `Vec<ChemicalShift>` | Predicted ¹H shifts |
| `c_shifts` | `Vec<ChemicalShift>` | Predicted ¹³C shifts |
| `notes` | `Vec<String>` | Caveats |

ChemicalShift

| Field | Type | Description |
| --- | --- | --- |
| `atom_index` | `usize` | Atom index in molecule |
| `element` | `u8` | Atomic number (1 or 6) |
| `shift_ppm` | `f64` | Predicted chemical shift (ppm) |
| `environment` | `String` | Environment classification |
| `confidence` | `f64` | Confidence 0.0–1.0 |

JCoupling

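| Field | Type | Description |
| --- | --- | --- |
| `h1_index` | `usize` | First H atom index |
| `h2_index` | `usize` | Second H atom index |
| `j_hz` | `f64` | Coupling constant (Hz) |
| `n_bonds` | `usize` | Number of bonds (2 for ²J, 3 for ³J, 4 for ⁴J) |
| `coupling_type` | `String` | Type name (e.g., `"vicinal_3J"`) |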

NmrSpectrum

| Field | Type | Description |
| --- | --- | --- |
| `ppm_axis` | `Vec<f64>` | Chemical shift grid (ppm) |
| `intensities` | `Vec<f64>` | Broadened spectrum intensities |
| `peaks` | `Vec<NmrPeak>` | Discrete multiplet lines |
| `nucleus` | `String` | `"1H"` or `"13C"` |
| `gamma` | `f64` | Broadening HWHM (ppm) |

NmrPeak

| Field | Type | Description |
| --- | --- | --- |
| `shift_ppm` | `f64` | Peak position (ppm) |
| `intensity` | `f64` | Relative intensity |
| `atom_index` | `usize` | Source atom index |
| `multiplicity` | `String` | e.g., `"s"`, `"d"`, `"t"`, `"q"`, `"m"` |

API

Rust

```rust
use sci_form::{embed, predict_nmr_shifts, predict_nmr_couplings, compute_nmr_spectrum};

// Shifts only (no 3D needed)
let shifts = predict_nmr_shifts("CCO").unwrap();
for h in &shifts.h_shifts {
    println!("H#{}: {:.2} ppm  ({})", h.atom_index, h.shift_ppm, h.environment);
}

// With 3D coordinates for accurate Karplus coupling
let conf = embed("CC", 42);
// Regroup the flat coordinate array into [x, y, z] triples
let pos: Vec<[f64; 3]> = conf.coords.chunks(3).map(|c| [c[0], c[1], c[2]]).collect();
let couplings = predict_nmr_couplings("CC", &pos).unwrap();
for j in &couplings {
    println!("{}J(H{},H{}): {:.2} Hz  ({})",
        j.n_bonds, j.h1_index, j.h2_index, j.j_hz, j.coupling_type);
}

// Full ¹H spectrum
let spec = compute_nmr_spectrum("CCO", "1H", 0.01, -2.0, 12.0, 2000).unwrap();
println!("{} ¹H peaks", spec.peaks.len());
```

Python

```python
import matplotlib.pyplot as plt
from sci_form import nmr_shifts, nmr_couplings, nmr_spectrum, embed

# Chemical shifts
shifts = nmr_shifts("CCO")
for h in shifts.h_shifts:
    print(f"H#{h.atom_index}: {h.shift_ppm:.2f} ppm  [{h.environment}]")
for c in shifts.c_shifts:
    print(f"C#{c.atom_index}: {c.shift_ppm:.2f} ppm  [{c.environment}]")

# Optional: accurate J with 3D
conf = embed("CC", seed=42)
couplings = nmr_couplings("CC", conf.coords)
for j in couplings:
    print(f"{j.n_bonds}J(H{j.h1_index},H{j.h2_index}) = {j.j_hz:.2f} Hz")

# Full ¹H spectrum
spec = nmr_spectrum("CCO", nucleus="1H", gamma=0.01, ppm_min=-2.0, ppm_max=12.0)

# Plot with the conventional NMR axis direction (ppm decreasing left to right)
plt.plot(spec.ppm_axis, spec.intensities)
plt.xlabel("δ (ppm)")
plt.gca().invert_xaxis()
plt.show()
```

TypeScript/WASM

```typescript
import init, { predict_nmr_shifts, compute_nmr_spectrum } from 'sci-form-wasm';
await init();

// Results are returned as JSON strings
const shifts = JSON.parse(predict_nmr_shifts('CCO'));
console.log('1H shifts:', shifts.h_shifts.map((h: any) =>
  `H#${h.atom_index}: ${h.shift_ppm.toFixed(2)} ppm`).join(', '));

const spec = JSON.parse(compute_nmr_spectrum('CCO', '1H', 0.01, -2.0, 12.0, 2000));
// Locate the grid point with the highest intensity
const maxI = Math.max(...spec.intensities);
const ppm  = spec.ppm_axis[spec.intensities.indexOf(maxI)];
console.log(`Most intense peak at δ ${ppm.toFixed(2)} ppm`);
```

Limitations and Caveats

  • Predictions are empirical; typical accuracy is ±0.3 ppm for ¹H and ±5 ppm for ¹³C.
  • Stereochemical effects (axial/equatorial, diastereotopic protons) are partially handled through 3D Karplus.
  • Solvent and temperature effects are not modeled.
  • ¹⁵N, ³¹P, ¹⁹F, and other heteronuclear shifts are not supported.
  • For HOSE code matching, rare environments with no database match fall back to hybridization-based defaults.

Released under the MIT License.