How to reproduce training and evaluation as done in the paper? #150
Comments
Hello! First of all, thank you for your great work! I want to replicate your fine-tuning results, where you fine-tuned the T5-small model independently on each zero-shot dataset. However, I don't see an option in the training configuration to fine-tune on one of the zero-shot datasets by simply providing a reference to it, as is done in the evaluation configuration. So I assume I need to preprocess each dataset and convert it to Arrow files to make it suitable for the training pipeline, correct? If so, did you use the same hyperparameters (prediction horizon, etc.) for these datasets during fine-tuning as in the evaluation configuration? Thanks in advance for your response!
Hi @ChernovAndrey! That's correct. You will need to preprocess the dataset yourself. For per-dataset fine-tuning, we used the same parameters for prediction length as the evaluation config. Note that for training, you would need the
Here the
This code seems to work to build the TSMixup data locally for training:

import datasets
from pathlib import Path
from typing import List, Optional, Union

import numpy as np
from gluonts.dataset.arrow import ArrowWriter


def convert_to_arrow(
    path: Union[str, Path],
    time_series: Union[List[np.ndarray], np.ndarray],
    start_times: Optional[Union[List[np.datetime64], np.ndarray]] = None,
    compression: str = "lz4",
):
    if start_times is None:
        # Set an arbitrary start time
        start_times = [np.datetime64("2000-01-01 00:00", "s")] * len(time_series)

    assert len(time_series) == len(start_times)

    dataset = [
        {"start": start, "target": ts} for ts, start in zip(time_series, start_times)
    ]
    ArrowWriter(compression=compression).write_to_file(
        dataset,
        path=path,
    )


# Get the HF dataset in their format
ds = datasets.load_dataset("autogluon/chronos_datasets", "m4_daily", split="train")
ds.set_format("numpy")

# Extract values
# start_times = [ds[i]['timestamp'] for i in range(len(ds))]
time_series_values = [ds[i]['target'] for i in range(len(ds))]
assert len(time_series_values) == len(ds)

convert_to_arrow("./tsmixup-data.arrow", time_series=time_series_values, start_times=None)
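
For per-dataset fine-tuning with the same prediction length as the evaluation config, one option is to trim each series before writing the Arrow file so that the evaluation window never appears in the training data. The sketch below assumes that evaluation scores the last `prediction_length` observations of each series; this thread does not confirm that, so check it against the evaluation config. The helper name `drop_eval_window` is hypothetical and not part of the Chronos scripts. It relies only on slicing, so it works on plain lists as well as NumPy arrays.

```python
from typing import List, Sequence


def drop_eval_window(
    time_series: Sequence[Sequence[float]],
    prediction_length: int,
) -> List[Sequence[float]]:
    """Drop the last `prediction_length` points of every series.

    Assumption: the evaluation window is the final `prediction_length`
    observations of each series. Series that are not longer than the
    window are skipped because nothing would remain for training.
    """
    if prediction_length <= 0:
        raise ValueError("prediction_length must be positive")

    trimmed = []
    for ts in time_series:
        if len(ts) > prediction_length:
            trimmed.append(ts[:-prediction_length])
    return trimmed


# Toy data: one series of length 10 and one that is too short to keep.
series = [
    [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0],
    [0.0, 1.0, 2.0],
]
train_series = drop_eval_window(series, prediction_length=4)
```

The trimmed list could then be passed as `time_series` to `convert_to_arrow` in place of `time_series_values`.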
Please check the updated README. We have also released an evaluation script and backtest configs to compute the WQL and MASE numbers as reported in the paper.
The scripts for training and evaluating Chronos models are included in the scripts folder, see also the README therein. The data used is available on the HuggingFace Hub.