kanojikajino/luigine

Luigine

Luigine is a library for developing an engine on top of luigi. Key features:

  • Auto-naming: the output file name is derived from the task parameters.
  • load_output: an interface for loading the output object of a parent task.
  • A hyperparameter tuning task.
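
To illustrate the idea behind auto-naming, here is a minimal sketch in plain Python. It is not luigine's implementation: the function auto_name, the hashing scheme, and the pklz extension are assumptions made for this example only. It shows how a file name can be derived deterministically from a task's parameters, so that the same parameters always map to the same output file.

```python
import hashlib
import json


def auto_name(task_name, params, ext="pklz"):
    """Build a deterministic file name from a task name and its parameters.

    Hypothetical helper for illustration; luigine's real naming rule may differ.
    """
    # Sort the keys so that equal parameter sets always serialize identically.
    canonical = json.dumps(params, sort_keys=True)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    return f"{task_name}_{digest}.{ext}"


name_a = auto_name("Train", {"alpha": 0.1, "seed": 0})
name_b = auto_name("Train", {"seed": 0, "alpha": 0.1})  # same parameters, other order
name_c = auto_name("Train", {"alpha": 1.0, "seed": 0})  # different alpha
```

With this scheme, name_a and name_b are identical while name_c differs, so rerunning a task with unchanged parameters can reuse its existing output file.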

Dependencies

  • luigi
  • numpy
  • scikit-learn (imported as sklearn)

Installation

pip install .

Example

See example.py.

There are four tasks in example.py:

  1. DataPreprocessing task prepares training and validation data sets.
  2. Train task, given the result of DataPreprocessing, trains a ridge regression model using the training data.
  3. PerformanceEvaluation task, given the results of the above two tasks, evaluates the model on the validation set.
  4. HyperparameterOptimization task optimizes the parameters included in PerformanceEvaluation task so as to minimize the validation loss.
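
The data flow between the four tasks can be sketched with plain functions. This is only an illustration under stated assumptions: example.py implements these steps as luigi tasks, whereas the functions, the toy data, and the closed-form one-dimensional ridge fit below are all made up for this sketch.

```python
def preprocess():
    """Split a small fixed data set into training and validation parts."""
    xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    ys = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]  # roughly y = 2x
    data = list(zip(xs, ys))
    return data[:4], data[4:]


def train(train_set, alpha):
    """Fit one-dimensional ridge regression without an intercept.

    Minimizes sum((w * x - y)^2) + alpha * w^2, whose solution is
    w = sum(x * y) / (sum(x * x) + alpha).
    """
    sxy = sum(x * y for x, y in train_set)
    sxx = sum(x * x for x, _ in train_set)
    return sxy / (sxx + alpha)


def evaluate(weight, valid_set):
    """Mean squared error of the fitted weight on the validation set."""
    return sum((weight * x - y) ** 2 for x, y in valid_set) / len(valid_set)


def optimize(alphas):
    """Return the (alpha, loss) pair with the smallest validation loss."""
    train_set, valid_set = preprocess()
    losses = {a: evaluate(train(train_set, a), valid_set) for a in alphas}
    best = min(losses, key=losses.get)
    return best, losses[best]


candidates = [0.0, 0.1, 1.0, 10.0]
best_alpha, best_loss = optimize(candidates)
```

Each function consumes the result of the previous one, which mirrors how each task in the list above receives the outputs of the tasks it depends on.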

The following command is used to run HyperparameterOptimization and evaluate the test score:

python example.py TestPerformanceEvaluation --working-dir example_working_dir

The results (an SQLite database) are stored under example_working_dir/OUTPUT/TestPerformanceEvaluation, and the log is written to ENGLOG/engine.log.

How To Use AutoNamingTask

  1. Every task must inherit from AutoNamingTask instead of luigi.Task.
  2. Describe task dependencies in requires, as in luigi, except that requires must return a list of task instances even when the task depends on a single task.
  3. Describe the task's processing in run_task (instead of run), whose
    • input is input_list, a list of the output Python objects of the tasks it depends on, and
    • output is the Python object(s) holding this task's computation results.
  4. save_output then pickles and gzips these Python objects.
    • To use another data format, implement both save_output and load_output: save_output saves the output objects of run_task, and load_output loads them back as Python objects for further processing.
    • The class variable output_ext specifies the file extension.
    • Set it as output_ext = luigi.Parameter('[your file extension]')
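
The contract described in the steps above can be sketched with a toy stand-in that uses only the standard library. This is not luigine's code and does not import luigi: the class ToyTask, its run and output_path methods, and the pklz default extension are assumptions for illustration. It shows requires returning a list, run_task receiving input_list, the default pickle-and-gzip save_output, and a subclass that switches to JSON by overriding save_output and load_output.

```python
import gzip
import json
import os
import pickle
import tempfile


class ToyTask:
    """Toy stand-in for the contract described above (not luigine's code)."""

    output_ext = "pklz"  # hypothetical default extension

    def __init__(self, out_dir):
        self.out_dir = out_dir

    def requires(self):
        return []  # always a list, even for zero or one dependency

    def output_path(self):
        return os.path.join(self.out_dir, f"{type(self).__name__}.{self.output_ext}")

    def save_output(self, obj):
        # Default format: pickle the object and gzip it.
        with gzip.open(self.output_path(), "wb") as f:
            pickle.dump(obj, f)

    def load_output(self):
        with gzip.open(self.output_path(), "rb") as f:
            return pickle.load(f)

    def run(self):
        # Run dependencies first, then pass their loaded outputs to run_task.
        input_list = []
        for dep in self.requires():
            dep.run()
            input_list.append(dep.load_output())
        self.save_output(self.run_task(input_list))


class Numbers(ToyTask):
    def run_task(self, input_list):
        return [1, 2, 3]


class Total(ToyTask):
    output_ext = "json"  # custom format, so both save and load are overridden

    def requires(self):
        return [Numbers(self.out_dir)]  # a single dependency, still a list

    def run_task(self, input_list):
        return {"total": sum(input_list[0])}

    def save_output(self, obj):
        with open(self.output_path(), "w") as f:
            json.dump(obj, f)

    def load_output(self):
        with open(self.output_path()) as f:
            return json.load(f)


with tempfile.TemporaryDirectory() as tmp:
    task = Total(tmp)
    task.run()
    result = task.load_output()
    saved_files = sorted(os.listdir(tmp))
```

In this sketch the user-facing code only defines requires and run_task; saving and loading stay in the base class unless a different file format is needed.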

Collaborators

  • Hiroshi Kajino
  • Takeshi Teshima
