Skip to content

Configuration

For repeatable pipelines, collect your parameters into a YAML config file. Commands can be invoked with just a config file, just CLI options, or both. When both are provided, CLI options override config file values.

Config file structure

The config file has working_dir at the top level and parameters nested under command-specific keys:

# crowd.yaml
working_dir: "./scratch/crowd"

search:
  terms:
    - crowd
    - street crowd
    - urban crowd
  suffixes:
    - photography
    - candid
  engines:
    - flickr
    - brave

fetch:
  to: images/search1
  min_size: 1024

extract_frames:
  from: videos
  to: images/frames
  keyframes: 3

extract_faces:
  from: images/*
  to: faces
  skip_partial: true
  engine: dlib

cluster:
  from: faces
  to: cluster
  min_cluster_size: 20
  min_samples: 5
  top: 10
  clean: true

select:
  from:
    - cluster/000
    - cluster/001
    - cluster/003
  to: curated

analyze:
  from: curated
  phash: true
  blur: true

detect:
  from: curated
  classes:
    - microphone
    - chair
  threshold: 0.2

review:
  from: curated

dedup:
  from: curated
  threshold: 8

augment:
  from: curated
  to: final/1024
  flip_x: true

frame:
  from: final/1024
  to: final/512
  width: 512
  height: 512

Running with a config file

Pass the config file as the first argument to any command:

dtst search crowd.yaml
dtst fetch crowd.yaml
dtst extract-frames crowd.yaml
dtst extract-faces crowd.yaml
dtst cluster crowd.yaml
dtst select crowd.yaml
dtst analyze crowd.yaml
dtst detect crowd.yaml
dtst review crowd.yaml
dtst dedup crowd.yaml
dtst augment crowd.yaml
dtst frame crowd.yaml

CLI overrides

CLI options override the corresponding config values. This is useful for one-off adjustments:

# Use dlib instead of the config's mediapipe
dtst extract-faces crowd.yaml --engine dlib

# Stricter dedup threshold than the config
dtst dedup crowd.yaml --threshold 4

# Preview any command without executing
dtst select crowd.yaml --dry-run

Command-specific keys use underscores (e.g. extract_faces, extract_frames, flip_x), matching Python parameter names. The CLI uses hyphens (e.g. extract-faces, --flip-x). See the CLI reference for the complete list of options per command.