ENH: Suggestions for supplementing the configuration file to achieve the best model. #787
Thank you @milaXT for raising this issue. Very much agree, after working with you, that we can do a better job of documenting which config options give better models. I think a way we can achieve both is to add a callout box in the tutorial that says something like: "look at these config files (link to download) if you want to train better models, but know that these configs will take longer to run." Thanks also for documenting the key things we need to highlight for better performance.
Right, so we should have:

```toml
# SPECT_PARAMS: parameters for computing spectrograms
[vak.prep.spect_params]
# fft_size: size of window used for Fast Fourier Transform, in number of samples
fft_size = 1024
# step_size: size of step to take when computing spectra with FFT for spectrogram
# also known as hop size
step_size = 64
transform_type = "log_spect"
freq_cutoffs = [500, 8000]
```

And then:

```toml
[vak.train.dataset.params]
# bigger windows = better.
# For frame classification models, prefer smaller batch sizes with bigger windows
window_size = 2000
```

I'm going to paste in the full TOML we ended up using below, for reference:

```toml
# PREP: options for preparing dataset
[vak.prep]
# dataset_type: corresponds to the model family such as "frame classification" or "parametric umap"
dataset_type = "frame classification"
# input_type: input to model, either audio ("audio") or spectrogram ("spect")
input_type = "spect"
# data_dir: directory with data to use when preparing dataset
data_dir = "./fortrain2"
# output_dir: directory where dataset will be created (as a sub-directory within output_dir)
output_dir = "./prep"
# audio_format: format of audio, either wav or cbin
audio_format = "wav"
# annot_format: format of annotations
annot_format = "simple-seq"
# labelset: string or array with unique set of labels used in annotations
labelset = "abcde"
# train_dur: duration of training split in dataset, in seconds
train_dur = 1200
# val_dur: duration of validation split in dataset, in seconds
val_dur = 170
# test_dur: duration of test split in dataset, in seconds
test_dur = 350
# SPECT_PARAMS: parameters for computing spectrograms
[vak.prep.spect_params]
# fft_size: size of window used for Fast Fourier Transform, in number of samples
fft_size = 1024
# step_size: size of step to take when computing spectra with FFT for spectrogram
# also known as hop size
step_size = 64
transform_type = "log_spect"
# TRAIN: options for training model
[vak.train]
# root_results_dir: directory where results should be saved, as a sub-directory within `root_results_dir`
root_results_dir = "./result"
# batch_size: number of samples from dataset per batch fed into network
batch_size = 16
# num_epochs: number of training epochs, where an epoch is one iteration through all samples in training split
num_epochs = 10
# standardize_frames: if true, standardize (normalize) frames (input to neural network) per frequency bin, so mean of each is 0.0 and std is 1.0
# across the entire training split
standardize_frames = true
# val_step: step number on which to compute metrics with validation set, every time step % val_step == 0
# (a step is one batch fed through the network)
# saves a checkpoint if the monitored evaluation metric improves (which is model specific)
val_step = 1000
# ckpt_step: step number on which to save a checkpoint (as a backup, regardless of validation metrics)
ckpt_step = 500
# patience: number of validation steps to wait before stopping training early
# if the monitored evaluation metric does not improve after `patience` validation steps,
# then we stop training
patience = 6
# num_workers: number of workers to use when loading data with multiprocessing
num_workers = 4
# device: name of device to run model on, one of "cuda", "cpu"
# dataset_path : path to dataset created by prep. This will be added when you run `vak prep`, you don't have to add it
# dataset.params = parameters used for datasets
# for a frame classification model, we use dataset classes with a specific `window_size`
[vak.train.dataset]
path = "prep/fortrain2-vak-frame-classification-dataset-generated-250305_124445"
[vak.train.dataset.params]
window_size = 2000
# To indicate the model to train, we use a "dotted key" with `model` followed by the string name of the model.
# This name must be a name within `vak.models` or added e.g. with `vak.model.decorators.model`
# We use another dotted key to indicate options for configuring the model, e.g. `TweetyNet.optimizer`
[vak.train.model.TweetyNet.optimizer]
# vak.train.model.TweetyNet.optimizer: we specify options for the model's optimizer in this table
# lr: the learning rate
lr = 0.001
# TweetyNet.network: we specify options for the model's network in this table
[vak.train.model.TweetyNet.network]
# hidden_size: the number of elements in the hidden state in the recurrent layer of the network
hidden_size = 256
# this sub-table configures the `lightning.pytorch.Trainer`
[vak.train.trainer]
# setting to 'gpu' means "train models on 'gpu' (not 'cpu')"
accelerator = "gpu"
# use the first GPU (numbering starts from 0)
devices = [0]
```
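To make the "bigger windows, smaller batches" advice above concrete, here is a rough sizing sketch (not part of the original comment) showing how these settings translate into time resolution and seconds of audio per training window. The sample rate is an assumption for illustration; substitute the actual rate of your recordings.

```python
# Rough arithmetic relating the spectrogram and training-window settings above.
# The sample rate is assumed for illustration only.
sample_rate = 32000   # Hz -- assumption; use the rate of your own recordings
fft_size = 1024       # samples per FFT window
step_size = 64        # samples between successive FFT windows (hop size)
window_size = 2000    # spectrogram time bins per training window
batch_size = 16       # training windows per batch

seconds_per_bin = step_size / sample_rate         # 0.002 s of audio per time bin
window_duration = window_size * seconds_per_bin   # 4.0 s of audio per window
bins_per_batch = batch_size * window_size         # 32,000 labeled time bins per batch

print(f"{seconds_per_bin * 1e3:.1f} ms per time bin")
print(f"{window_duration:.1f} s of audio per training window")
print(f"{bins_per_batch} labeled time bins per batch")
```

Seen this way, doubling `window_size` doubles the audio (and memory) that each sample covers, which is one reason to pair bigger windows with a smaller `batch_size`.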
@milaXT would you be willing to make a pull request adding this callout to the autoannotate.md page? The steps you'd follow are in the contributor's guide: https://vak.readthedocs.io/en/latest/development/contributors.html. You'd want to add the callout right after the set-up steps, at line 79 of vak/doc/get_started/autoannotate.md (as of commit afc3f0c). It should say something like this:
And then you'd add that file in the …
While training the model, David @NickleDave helped me discover some flexible adjustments that could be made in the config file, which ultimately led to the model performing better. We believe that these adjustable parameters might need to be added to the config file to help others achieve better model performance.
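For anyone comparing the tutorial config with a "better models" config like the one above, here is a minimal sketch (not from the issue) that loads a config with Python's standard-library `tomllib` and prints the options discussed; the filename is hypothetical, and it assumes the table layout shown in the TOML earlier.

```python
# Minimal sketch: inspect the key options of a vak config file.
# Assumes Python >= 3.11 (for tomllib) and the table layout shown above.
import tomllib

def show_key_options(config_path: str) -> None:
    """Print the spectrogram and training-window options from a vak config."""
    with open(config_path, "rb") as fp:
        config = tomllib.load(fp)
    spect_params = config["vak"]["prep"]["spect_params"]
    train = config["vak"]["train"]
    print("fft_size:", spect_params["fft_size"])
    print("step_size:", spect_params["step_size"])
    print("window_size:", train["dataset"]["params"]["window_size"])
    print("batch_size:", train["batch_size"])

show_key_options("better_models_train.toml")  # hypothetical filename
```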