Command-line interface
======================

AGeDi installs a CLI entrypoint named ``agedi``.

Discover commands
-----------------

.. code-block:: console

   agedi --help

Main commands
-------------

- ``agedi train``: train a diffusion model from a trajectory file or YAML config
- ``agedi sample``: sample structures from a saved training run
- ``agedi predict``: predict energies and forces for input structures (requires ``--force_field`` training)
- ``agedi inspect``: print ``hparams.yaml`` from a run directory

To get information about options for each use

.. code-block:: console

   agedi train --help

for ``train`` and likewise for ``sample``, ``predict``, and ``inspect``.

Training
--------

Choose one of the three position noisers to match your system type:

.. list-table:: Position noisers
   :header-rows: 1
   :widths: 30 25 25 20

   * - Noiser
     - Prior
     - Distribution
     - Use case
   * - ``Positions``
     - StandardNormal
     - Normal
     - Gas-phase (molecules, clusters)
   * - ``CellPositions``
     - UniformCell
     - Normal
     - Periodic bulk / surface (default)
   * - ``ConfinedCellPositions``
     - UniformCellConfined
     - TruncatedNormal
     - Surface overlayer/adsorbate

Minimal training example for a **surface system with Z-confinement**:

.. code-block:: console

   agedi train --noisers ConfinedCellPositions --mask MaskFixed --confinement 2 10 training_data.traj

Minimal training example for a **periodic bulk or surface** system:

.. code-block:: console

   agedi train --noisers CellPositions training_data.traj

Minimal training example for a **periodic bulk system with atomic types diffusion**:

.. code-block:: console

   agedi train --noisers CellPositions,Types training_data.traj

Minimal training example for a **gas-phase cluster**:

.. code-block:: console

   agedi train --noisers Positions training_data.traj

Important options:

- ``--max_time_minutes/-t`` or ``--epochs/-e``: stopping criteria (use ``-T`` to specify time in hours instead of minutes)
- ``--noisers``: ``CellPositions`` (default), ``ConfinedCellPositions``, ``Positions``, ``Types``.
  Accepts a comma-separated list to specify multiple noisers in one flag (e.g. ``--noisers ConfinedCellPositions,Types``),
  or repeat the flag (e.g. ``--noisers ConfinedCellPositions --noisers Types``).
- ``--sde``: ``ve`` (default), ``vp``
- ``--mask MaskFixed``: freezes atoms tagged with ASE ``FixAtoms``
- ``--confinement zmin zmax``: z-direction confinement bounds (required for ``ConfinedCellPositions``)
- ``--n_classes N``: restrict the ``Types`` noiser vocabulary to the first *N* element types
  (sorted by atomic number); defaults to all distinct types found in the training data
- ``--canonical_cell``: store unit cells in canonical lower-triangular form
- ``--force_field``: train a force-field head jointly with the diffusion score (see below)

Continue training from a checkpoint
-------------------------------------

To resume an interrupted run or continue fine-tuning on new data, pass
``--checkpoint`` with either a run directory or a specific ``.ckpt`` file:

.. code-block:: console

   # Resume the last checkpoint of a previous run (same data)
   agedi train training_data.traj --checkpoint logs/agedi/version_0

   # Resume from a specific checkpoint file
   agedi train training_data.traj --checkpoint logs/agedi/version_0/checkpoints/best_model.ckpt

   # Fine-tune on new data starting from a previous checkpoint
   agedi train new_data.traj --checkpoint logs/agedi/version_0

In all cases the model architecture and weights are loaded from the checkpoint,
and the full training state (optimiser, LR-scheduler, epoch counter) is
restored.  Combine with ``--epochs`` or ``--max_time`` to control how long
the continued run should train.

When using a config file, set the ``checkpoint`` key:

.. code-block:: yaml

   data_path: training_data.traj
   checkpoint: logs/agedi/version_0   # or a .ckpt file path

From Python:

.. code-block:: python

   from agedi import train_from_atoms

   diffusion, dataset, trainer = train_from_atoms(
       data,
       checkpoint="logs/agedi/version_0",
       epochs=100,
   )


Sampling
--------

.. code-block:: console

   agedi sample logs/agedi/version_0 -f Pd2O2 --template_path template.traj --steps 500 --confinement 2 10

This samples using the ``last_model.ckpt`` checkpoint found in
``logs/agedi/version_0``. If you want to use a different checkpoint, you can
specify the exact path to it.

Important options:

- ``-f/--formula`` or ``-a/--n_atoms``
- ``--template_path`` for template-guided generation
- ``--steps``, ``--eps`` for reverse diffusion resolution
- ``--save_trajectory``: save the full reverse-diffusion trajectory for each sample
  (one file per sample rather than only the final structures)
- ``--print_timings``: print a per-stage timing breakdown after each sampling batch
  (useful for profiling GPU bottlenecks)
- ``--compile``: compile the reverse-diffusion step with ``torch.compile`` for faster
  GPU sampling; neighbor-list buffer sizes are estimated automatically (requires
  NVIDIA nvalchemiops)

Force-field guided training and sampling
-----------------------------------------

To also train a forces prediction head alongside the diffusion model, add the
``--force_field`` flag during training:

.. code-block:: console

   agedi train --noisers ConfinedCellPositions --mask MaskFixed --confinement 2 10 --force_field training_data.traj

The training data must contain DFT (or other source) per-atom forces
and total energy (e.g. loaded from a
VASP/GPAW calculation via ASE).  The force field is trained jointly with the
diffusion score.

**Regressor-only dataset**

You can optionally supply a second dataset that is used *exclusively* to train
the force-field head — its structures are never passed through the diffusion
loss.  This is useful when you have non-equilibrium structures (e.g. from MD
or NEB calculations) that would be unsuitable as diffusion training targets
but contain valuable force/energy information for the regressor:

.. code-block:: console

   agedi train --noisers ConfinedCellPositions --mask MaskFixed --confinement 2 10 --force_field training_data.traj

and in ``train.yaml``:

.. code-block:: yaml

   data_path: training_data.traj
   force_field: true
   regressor_data_path: nonequilibrium_data.traj

Or from Python:

.. code-block:: python

   from agedi import train_from_atoms

   diffusion, dataset, trainer = train_from_atoms(
       equilibrium_structures,
       force_field=True,
       regressor_data=nonequilibrium_structures,
   )

Once training is complete, force-field guidance can be used during sampling
via the ``--ff_guidance`` option:

.. code-block:: console

   agedi sample logs/agedi/version_0 -f Pd2O2 --ff_guidance 5.0

- ``--ff_guidance``: guidance scale (``0`` = disabled, ``> 0`` enables guidance).
  Higher values increase the influence of the predicted forces on the generated structures.
- ``--ff_zeta``: time-weight exponent (default ``3.0``).
  Higher values concentrate guidance near the end of the reverse trajectory.

In Python this is equivalent to:

.. code-block:: python

   from agedi.functional import load_diffusion, sample
   from agedi.diffusion import ForcefieldGuidanceConfig

   diffusion = load_diffusion("logs/agedi/version_0")
   structures = sample(
       diffusion,
       n_samples=10,
       formula="Pd2O2",
       ff_guidance=ForcefieldGuidanceConfig(
           guidance=5.0,
           zeta=3.0,
           force_threshold=0.05,   # max per-atom force (eV/Å) for post-diffusion relaxation
           max_extra_steps=0,      # number of extra relaxation steps after the trajectory
       ),
   )

``ForcefieldGuidanceConfig`` fields:

- ``guidance`` (float): guidance scale; ``0.0`` disables guidance entirely.
- ``zeta`` (float): time-weight exponent ``(1-t)**zeta``; default ``3.0``.
- ``force_threshold`` (float): convergence criterion (max per-atom force in eV/Å)
  for the optional post-diffusion relaxation; default ``0.05``.
- ``max_extra_steps`` (int): maximum extra relaxation steps performed after the main
  diffusion trajectory when ``guidance > 0``; default ``0`` (disabled).

Predicting energies and forces
-------------------------------

When the model has been trained with ``--force_field``, you can run energy
and force predictions on existing structures with ``agedi predict``:

.. code-block:: console

   agedi predict logs/agedi/version_0 structures.traj

This reads all structures from ``structures.traj``, runs the force-field
regressor, and writes the results (with predicted energies and forces
attached as an ASE ``SinglePointCalculator``) to ``predicted.traj`` in
the current directory.

Important options:

- ``-o/--output``: directory to save the output file (default: ``.``)
- ``--name``: base name for the output file (default: ``predicted``)
- ``-b/--batch_size``: number of structures per inference batch (default: ``64``)

In Python this is equivalent to:

.. code-block:: python

   from ase.io import read, write
   from agedi import load_diffusion, predict

   diffusion = load_diffusion("logs/agedi/version_0")
   structures = read("structures.traj", index=":")
   predicted = predict(diffusion, structures)
   write("predicted.traj", predicted)

Inspect run metadata
--------------------

.. code-block:: console

   agedi inspect logs/agedi/version_0

This prints the saved hyperparameters from the run directory (for example, the parsed contents of ``hparams.yaml``).

Logging options
--------------------

AGeDi saves TensorBoard logs by default. WandB can be saved instead
using the ``--logger wandb`` option when training.

To follow training use

.. code-block:: console

   tensorboard --logdir .

This hosts TensorBoard at localhost. Remember to forward a specific
port to your local machine if using HPC. You can use the ``--port
xxxx`` option for TensorBoard to host at this specific port.