YAML-based training¶
The agedi train command accepts either a trajectory file or a YAML
configuration file as its first argument, so you can choose whichever workflow
suits your project.
Quick start¶
Copy the bundled template:
cp $(python -c "import agedi; import pathlib; print(pathlib.Path(agedi.__file__).parent / 'conf' / 'train.yaml')") my_train.yaml
Edit
my_train.yaml(at minimum setdata_path).Run training:
agedi train my_train.yaml
Override individual keys without editing the file:
agedi train my_train.yaml feature_size=128 epochs=200 noisers=CellPositions
Configuration file reference¶
A fully annotated template is reproduced below. Every key has a sensible default so you only need to set the values that differ from those defaults.
# ---------------------------------------------------------------------------
# Data
# ---------------------------------------------------------------------------
data_path: /path/to/train.traj # Required – ASE-readable file
# Optional separate dataset used exclusively for regressor (force-field) training.
# Structures here are only forwarded through the regressor loss, never the diffusion
# loss. Useful for non-equilibrium structures (e.g. from MD or NEB).
regressor_data_path: null # Optional – ASE-readable file
# ---------------------------------------------------------------------------
# Score-model architecture
# ---------------------------------------------------------------------------
model: PaiNN # Currently only PaiNN is supported
cutoff: 6.0 # Neighbour-list cutoff in Å
feature_size: 64 # Embedding / feature dimension
n_blocks: 4 # Number of interaction blocks
n_rbf: 30 # Number of radial basis functions
# ---------------------------------------------------------------------------
# Diffusion / noiser configuration
# ---------------------------------------------------------------------------
noisers:
- CellPositions # One or more of:
# Positions : StandardNormal prior + Normal (gas-phase clusters)
# CellPositions : UniformCell prior + Normal (periodic bulk/surface)
# ConfinedCellPositions : UniformCellConfined prior + TruncatedNormal (Z-confined)
# Types : discrete atom-type diffusion
# SDE for position noisers.
# ve : Variance-Exploding SDE (default)
# vp : Variance-Preserving SDE
sde: ve
# Property conditioning (optional). Set to "none" to disable.
conditioning: none
conditioning_type: scalar # scalar | integer
# Z-confinement range [z_min, z_max] in Å – null to disable.
# Required when using the 'ConfinedCellPositions' noiser.
confinement: null
# Train a force-field alongside the diffusion score.
# Set to true when training data contains per-atom DFT forces and you want to
# use force-field guidance during sampling.
force_field: false
# Number of element-type classes for the Types noiser (excluding the absorbing
# state at index 0). When null, all distinct element types in the training data
# are used automatically. Only relevant when 'Types' is in noisers.
n_classes: null
# ---------------------------------------------------------------------------
# Dataset splits and augmentation
# ---------------------------------------------------------------------------
batch_size: 64
train_split: 0.9 # Fraction (float) or absolute count (int) for training
val_split: 0.1 # Fraction (float) or absolute count (int) for validation
mask: none # Masking strategy: none | MaskFixed
canonical_cell: false # Store cells in canonical lower-triangular form
repeat: null # Number of repetition levels (null = disabled)
repeat_epoch: null # Epochs between repetition-level increases
# ---------------------------------------------------------------------------
# Optimiser
# ---------------------------------------------------------------------------
lr: 0.0001
lr_factor: 0.95
lr_patience: 100
weight_decay: 0.0
eps: 0.00001
guidance_weight: -1.0
# ---------------------------------------------------------------------------
# Trainer / logging
# ---------------------------------------------------------------------------
epochs: -1 # -1 = unlimited (stop by max_time or manually)
max_time: 24 # Wall-clock limit in hours (null = no limit)
gradient_clip_val: 10.0
logger: tensorboard # tensorboard | wandb
log_dir: logs
project: agedi # WandB project name
name: agedi # WandB run display name
log_interval: 10
progress_bar: false
Noiser selection¶
The noisers list controls what is diffused. Choose based on your system:
Noiser |
Prior |
Distribution |
Use case |
|---|---|---|---|
|
StandardNormal |
Normal |
Gas-phase (molecules, clusters) |
|
UniformCell |
Normal |
Periodic bulk / surface (default) |
|
UniformCellConfined |
TruncatedNormal |
Surface overlayer/adsorbate |
You can combine position and type noisers, e.g.:
noisers:
- CellPositions
- Types
Example: surface system with Z-confinement¶
data_path: PdO_training_data.traj
noisers:
- ConfinedCellPositions
confinement: [2.0, 10.0]
mask: MaskFixed
max_time: 3 # hours
feature_size: 64
n_blocks: 4
Train with:
agedi train surface.yaml
Override the time limit on the fly:
agedi train surface.yaml max_time=6
Using train_from_config from Python¶
The same YAML file can be used directly from Python:
from agedi import train_from_config
diffusion, dataset, trainer = train_from_config("my_train.yaml")
Programmatic overrides are also supported by passing a dict:
from agedi import train_from_config
cfg = {
"data_path": "train.traj",
"noisers": ["CellPositions"],
"feature_size": 128,
"max_time": 6,
}
diffusion, dataset, trainer = train_from_config(cfg)