Data science is a science because it involves experimentation. The best machine learning models, hyperparameters, and architectures cannot (typically) be known *a priori*. The data scientist must experiment to find the best approach.
The first requirement then is an evaluation framework. Before even starting to code, the data scientist will select one or more evaluation metrics. If a simple metric is not available, an evaluation approach should be defined. All experiments will be measured against this metric or approach.
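As a minimal sketch of such a framework (assuming a classification task with macro-F1 chosen as the metric; the function name is illustrative), every experiment can report its results through one shared function:

```python
def macro_f1(y_true: list, y_pred: list) -> float:
    """Macro-averaged F1: the unweighted mean of per-class F1 scores."""
    labels = set(y_true) | set(y_pred)
    f1s = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return sum(f1s) / len(f1s)
```

In practice you would call an equivalent from `scikit-learn`; the point is that every experiment is scored by the same function, so numbers are comparable across runs.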
Next, the data scientist must set up a system for tracking experiments conducted. Which model was used, with which hyperparameters, in which architectures? Configuration files can help both track experiments and quickly adjust the machine learning approach.
Configuration files are typically written in YAML (the most popular in ML), TOML, or JSON.
For more power, consider one of these libraries or services.
- Hydra: composes configs, supports command-line overrides, and creates a working directory per run
- OmegaConf: typed, hierarchical configs; the configuration backbone of Hydra
- MLflow: tracks configs, metrics, and artifacts automatically
- [[Weights and Biases]]: a hosted alternative to MLflow, popular for deep learning
- Sacred: lightweight experiment tracker
## YAML configuration file
[[YAML]] is a popular choice for writing configuration files.
Create a `config.yaml` file with your configuration settings in the root of your project.
```yaml
experiment:
  name: "bert-ft-lr2e-5"
  seed: 42

model:
  pretrained: "bert-base-uncased"
  num_labels: 20

training:
  batch_size: 16
  epochs: 4
  learning_rate: 2.0e-5  # PyYAML needs the decimal point to parse this as a float
  eval_strategy: "epoch"

data:
  train_file: "./data/train.csv"
  val_file: "./data/val.csv"
```
In your `train.py` file, load the configuration with the `yaml` library and read in configuration settings.
```python
import yaml

def load_config(path: str) -> dict:
    """Load a YAML configuration file into a dictionary."""
    with open(path, "r") as f:
        return yaml.safe_load(f)

cfg = load_config("config.yaml")
lr = cfg["training"]["learning_rate"]
```
At the start of each run:
1. Copy the config into the run's results folder, so you always know exactly which parameters produced those results.
2. Archive a copy under `configs/`, renamed with the timestamp and experiment name.

Each archived config is a snapshot of how you trained, making it fast to repeat a previous experiment.
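Both copies can be made with a few lines of standard-library Python at the start of the training script (the helper name and folder layout below are illustrative, matching the tree that follows):

```python
import shutil
from datetime import date
from pathlib import Path

def snapshot_config(config_path: str, experiment_name: str) -> Path:
    """Copy the config into the run's results folder and archive it under configs/."""
    run_id = f"{date.today().isoformat()}-{experiment_name}"

    # 1. copy into the results folder for this run
    results_dir = Path("results") / run_id
    results_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy(config_path, results_dir / "config.yaml")

    # 2. archive under configs/, renamed with timestamp and experiment name
    Path("configs").mkdir(exist_ok=True)
    shutil.copy(config_path, Path("configs") / f"{run_id}.yaml")
    return results_dir
```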
```
.
+-- config.yaml                         # current run
+-- results/
|   +-- 2025-08-28-bert-ft-lr2e-5/
|       +-- config.yaml
|       +-- metrics.json
|       +-- logs.txt
|       +-- checkpoint.pt
+-- configs/
    +-- 2025-08-28-bert-ft-lr2e-5.yaml  # renamed and archived
```
Use `argparse` to run `train.py` from the command line and easily swap configs.
```python
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("--config", required=True)
args = parser.parse_args()
cfg = load_config(args.config)
```
Run the script with
```bash
python train.py --config config.yaml
```
## TOML configuration file
[[TOML]] is another popular format for configuration files.
Create a `config.toml` file in the root of your project.
```toml
[llm]
model = "gemma3"
base_url = "http://localhost:11434/v1/chat/completions"

[chunking]
chunk_size = 100  # number of words per chunk
```
In `train.py`, read the configuration file with the `tomli` library (available in the standard library as `tomllib` from Python 3.11).
```python
import tomli  # on Python 3.11+: import tomllib as tomli

config_file = "config.toml"

# TOML files must be opened in binary mode
with open(config_file, "rb") as f:
    config = tomli.load(f)

# typical configuration lookups
model = config["llm"]["model"]
base_url = config["llm"]["base_url"]

# configuration with a fallback value
chunk_size = config.get("chunking", {}).get("chunk_size", 500)
```
The same patterns used for YAML configuration files (copying the config at the start of each run, passing the config path on the command line) also apply to TOML configuration files.
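A step beyond raw dictionary lookups, and equally applicable to YAML or TOML, is to validate the loaded settings with a dataclass. This is a sketch of that idea; the field names mirror the `[chunking]` table above, and the `from_dict` helper is an assumption, not a library API:

```python
from dataclasses import dataclass

@dataclass
class ChunkingConfig:
    chunk_size: int = 500  # default mirrors the dictionary .get() fallback

    @classmethod
    def from_dict(cls, raw: dict) -> "ChunkingConfig":
        # fail fast on typos instead of silently ignoring unknown keys
        unknown = set(raw) - {"chunk_size"}
        if unknown:
            raise KeyError(f"Unknown chunking settings: {unknown}")
        return cls(**raw)

chunking = ChunkingConfig.from_dict({"chunk_size": 100})
```

Misspelled or missing keys now surface at startup rather than mid-training, which matters when runs take hours.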