YAML-based training
===================

The ``agedi train`` command accepts either a trajectory file **or** a YAML
configuration file as its first argument, so you can choose whichever workflow
suits your project.

Quick start
-----------

1. Copy the bundled template::

       cp $(python -c "import agedi; import pathlib; print(pathlib.Path(agedi.__file__).parent / 'conf' / 'train.yaml')") my_train.yaml

2. Edit ``my_train.yaml`` (at minimum set ``data_path``).

3. Run training::

       agedi train my_train.yaml

4. Override individual keys without editing the file::

       agedi train my_train.yaml feature_size=128 epochs=200 noisers=CellPositions

Configuration file reference
-----------------------------

A fully annotated template is reproduced below. Every key has a sensible
default so you only need to set the values that differ from those defaults.

.. code-block:: yaml

   # ---------------------------------------------------------------------------
   # Data
   # ---------------------------------------------------------------------------
   data_path: /path/to/train.traj   # Required – ASE-readable file

   # Optional separate dataset used exclusively for regressor (force-field) training.
   # Structures here are only forwarded through the regressor loss, never the diffusion
   # loss. Useful for non-equilibrium structures (e.g. from MD or NEB).
   regressor_data_path: null        # Optional – ASE-readable file

   # ---------------------------------------------------------------------------
   # Score-model architecture
   # ---------------------------------------------------------------------------
   model: PaiNN          # Currently only PaiNN is supported
   cutoff: 6.0           # Neighbour-list cutoff in Å
   feature_size: 64      # Embedding / feature dimension
   n_blocks: 4           # Number of interaction blocks
   n_rbf: 30             # Number of radial basis functions

   # ---------------------------------------------------------------------------
   # Diffusion / noiser configuration
   # ---------------------------------------------------------------------------
   noisers:
     - CellPositions
   # One or more of:
   #   Positions             : StandardNormal prior + Normal (gas-phase clusters)
   #   CellPositions         : UniformCell prior + Normal (periodic bulk/surface)
   #   ConfinedCellPositions : UniformCellConfined prior + TruncatedNormal (Z-confined)
   #   Types                 : discrete atom-type diffusion

   # SDE for position noisers.
   #   ve : Variance-Exploding SDE (default)
   #   vp : Variance-Preserving SDE
   sde: ve

   # Property conditioning (optional). Set to "none" to disable.
   conditioning: none
   conditioning_type: scalar   # scalar | integer

   # Z-confinement range [z_min, z_max] in Å – null to disable.
   # Required when using the 'ConfinedCellPositions' noiser.
   confinement: null

   # Train a force-field alongside the diffusion score.
   # Set to true when training data contains per-atom DFT forces and you want to
   # use force-field guidance during sampling.
   force_field: false

   # Number of element-type classes for the Types noiser (excluding the absorbing
   # state at index 0). When null, all distinct element types in the training data
   # are used automatically. Only relevant when 'Types' is in noisers.
   n_classes: null

   # ---------------------------------------------------------------------------
   # Dataset splits and augmentation
   # ---------------------------------------------------------------------------
   batch_size: 64
   train_split: 0.9        # Fraction (float) or absolute count (int) for training
   val_split: 0.1          # Fraction (float) or absolute count (int) for validation
   mask: none              # Masking strategy: none | MaskFixed
   canonical_cell: false   # Store cells in canonical lower-triangular form
   repeat: null            # Number of repetition levels (null = disabled)
   repeat_epoch: null      # Epochs between repetition-level increases

   # ---------------------------------------------------------------------------
   # Optimiser
   # ---------------------------------------------------------------------------
   lr: 0.0001
   lr_factor: 0.95
   lr_patience: 100
   weight_decay: 0.0
   eps: 0.00001
   guidance_weight: -1.0

   # ---------------------------------------------------------------------------
   # Trainer / logging
   # ---------------------------------------------------------------------------
   epochs: -1              # -1 = unlimited (stop by max_time or manually)
   max_time: 24            # Wall-clock limit in hours (null = no limit)
   gradient_clip_val: 10.0
   logger: tensorboard     # tensorboard | wandb
   log_dir: logs
   project: agedi          # WandB project name
   name: agedi             # WandB run display name
   log_interval: 10
   progress_bar: false

Noiser selection
----------------

The ``noisers`` list controls what is diffused. Choose based on your system:

.. list-table::
   :header-rows: 1
   :widths: 35 25 25 25

   * - Noiser
     - Prior
     - Distribution
     - Use case
   * - ``Positions``
     - StandardNormal
     - Normal
     - Gas-phase (molecules, clusters)
   * - ``CellPositions``
     - UniformCell
     - Normal
     - Periodic bulk / surface (default)
   * - ``ConfinedCellPositions``
     - UniformCellConfined
     - TruncatedNormal
     - Surface overlayer/adsorbate

You can combine position and type noisers, e.g.:

.. code-block:: yaml

   noisers:
     - CellPositions
     - Types

Example: surface system with Z-confinement
------------------------------------------

.. code-block:: yaml

   data_path: PdO_training_data.traj
   noisers:
     - ConfinedCellPositions
   confinement: [2.0, 10.0]
   mask: MaskFixed
   max_time: 3   # hours
   feature_size: 64
   n_blocks: 4

Train with:

.. code-block:: console

   agedi train surface.yaml

Override the time limit on the fly:

.. code-block:: console

   agedi train surface.yaml max_time=6

Using ``train_from_config`` from Python
----------------------------------------

The same YAML file can be used directly from Python:

.. code-block:: python

   from agedi import train_from_config

   diffusion, dataset, trainer = train_from_config("my_train.yaml")

Programmatic overrides are also supported by passing a dict:

.. code-block:: python

   from agedi import train_from_config

   cfg = {
       "data_path": "train.traj",
       "noisers": ["CellPositions"],
       "feature_size": 128,
       "max_time": 6,
   }
   diffusion, dataset, trainer = train_from_config(cfg)
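
Because the configuration can be an ordinary dict, it may be worth checking it
before a long run starts. The sketch below is a hypothetical pre-flight helper:
``check_config`` and ``KNOWN_NOISERS`` are illustrative names and are not part
of agedi. It encodes only the rules stated in the template comments
(``data_path`` is required, and ``ConfinedCellPositions`` needs a
``confinement`` range). It also assumes that ``noisers`` is given as a list and
that ``z_min`` must be smaller than ``z_max``. agedi may apply different or
additional checks of its own.

.. code-block:: python

   # Hypothetical pre-flight helper; not part of agedi.
   KNOWN_NOISERS = {"Positions", "CellPositions", "ConfinedCellPositions", "Types"}


   def check_config(cfg):
       """Return a list of problems found in a training config dict."""
       problems = []

       # data_path is the only key the template marks as required.
       if not cfg.get("data_path"):
           problems.append("data_path is required")

       # Fall back to the noiser list shown in the template.
       noisers = cfg.get("noisers") or ["CellPositions"]
       unknown = sorted(set(noisers) - KNOWN_NOISERS)
       if unknown:
           problems.append(f"unknown noisers: {unknown}")

       # ConfinedCellPositions needs a [z_min, z_max] range in Å.
       if "ConfinedCellPositions" in noisers:
           confinement = cfg.get("confinement")
           if confinement is None:
               problems.append("ConfinedCellPositions requires confinement")
           elif len(confinement) != 2 or confinement[0] >= confinement[1]:
               # Assumption: the range must be ordered, z_min < z_max.
               problems.append("confinement must be [z_min, z_max] with z_min < z_max")

       return problems


   cfg = {
       "data_path": "PdO_training_data.traj",
       "noisers": ["ConfinedCellPositions"],
       "confinement": None,
   }
   for problem in check_config(cfg):
       print(problem)

If the returned list is empty, the dict can then be handed to
``train_from_config(cfg)`` as shown above.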