diff --git a/dev/.documenter-siteinfo.json b/dev/.documenter-siteinfo.json index 96d07a60e..631211cd9 100644 --- a/dev/.documenter-siteinfo.json +++ b/dev/.documenter-siteinfo.json @@ -1 +1 @@ -{"documenter":{"julia_version":"1.10.3","generation_timestamp":"2024-05-22T04:53:38","documenter_version":"1.4.1"}} \ No newline at end of file +{"documenter":{"julia_version":"1.10.4","generation_timestamp":"2024-06-06T02:27:12","documenter_version":"1.4.1"}} \ No newline at end of file diff --git a/dev/about_mlj/index.html b/dev/about_mlj/index.html index 55bdb1bd3..68ec3da4f 100644 --- a/dev/about_mlj/index.html +++ b/dev/about_mlj/index.html @@ -1,5 +1,5 @@ -About MLJ · MLJ

About MLJ

MLJ (Machine Learning in Julia) is a toolbox written in Julia providing a common interface and meta-algorithms for selecting, tuning, evaluating, composing and comparing over 180 machine learning models written in Julia and other languages. In particular, MLJ wraps a large number of scikit-learn models.

MLJ is released under the MIT license.

Lightning tour

For help learning to use MLJ, see Learning MLJ.

A self-contained notebook and Julia script of this demonstration are also available here.

The first code snippet below creates a new Julia environment MLJ_tour and installs just those packages needed for the tour. See Installation for more on creating a Julia environment for use with MLJ.

Julia installation instructions are here.

using Pkg
+About MLJ · MLJ

About MLJ

MLJ (Machine Learning in Julia) is a toolbox written in Julia providing a common interface and meta-algorithms for selecting, tuning, evaluating, composing and comparing over 180 machine learning models written in Julia and other languages. In particular, MLJ wraps a large number of scikit-learn models.

MLJ is released under the MIT license.

Lightning tour

For help learning to use MLJ, see Learning MLJ.

A self-contained notebook and Julia script of this demonstration are also available here.

The first code snippet below creates a new Julia environment MLJ_tour and installs just those packages needed for the tour. See Installation for more on creating a Julia environment for use with MLJ.

Julia installation instructions are here.

using Pkg
 Pkg.activate("MLJ_tour", shared=true)
 Pkg.add("MLJ")
 Pkg.add("MLJIteration")
@@ -54,4 +54,4 @@
       eprint={2012.15505},
       archivePrefix={arXiv},
       primaryClass={cs.LG}
-}
+}
diff --git a/dev/acceleration_and_parallelism/index.html b/dev/acceleration_and_parallelism/index.html index edc7ad4b8..97f76ced3 100644 --- a/dev/acceleration_and_parallelism/index.html +++ b/dev/acceleration_and_parallelism/index.html @@ -1,2 +1,2 @@ -Acceleration and Parallelism · MLJ

Acceleration and Parallelism

User-facing interface

To enable composable, extensible acceleration of core MLJ methods, ComputationalResources.jl is utilized to provide some basic types and functions to make implementing acceleration easy. However, ambitious users or package authors have the option to define their own types to be passed as resources to acceleration, which must be <:ComputationalResources.AbstractResource.

Methods which support some form of acceleration support the acceleration keyword argument, which can be passed a "resource" from ComputationalResources. For example, passing acceleration=CPUProcesses() will utilize Distributed's multiprocessing functionality to accelerate the computation, while acceleration=CPUThreads() will use Julia's PARTR threading model to perform acceleration.

The default computational resource is CPU1(), which is simply serial processing via CPU. The default resource can be changed as in this example: MLJ.default_resource(CPUProcesses()). The argument must always have type <:ComputationalResources.AbstractResource. To inspect the current default, use MLJ.default_resource().

Note

You cannot use CPUThreads() with models wrapping Python code.

+Acceleration and Parallelism · MLJ

Acceleration and Parallelism

User-facing interface

To enable composable, extensible acceleration of core MLJ methods, ComputationalResources.jl is utilized to provide some basic types and functions to make implementing acceleration easy. However, ambitious users or package authors have the option to define their own types to be passed as resources to acceleration, which must be <:ComputationalResources.AbstractResource.

Methods which support some form of acceleration support the acceleration keyword argument, which can be passed a "resource" from ComputationalResources. For example, passing acceleration=CPUProcesses() will utilize Distributed's multiprocessing functionality to accelerate the computation, while acceleration=CPUThreads() will use Julia's PARTR threading model to perform acceleration.

The default computational resource is CPU1(), which is simply serial processing via CPU. The default resource can be changed as in this example: MLJ.default_resource(CPUProcesses()). The argument must always have type <:ComputationalResources.AbstractResource. To inspect the current default, use MLJ.default_resource().

Note

You cannot use CPUThreads() with models wrapping Python code.
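
For example (a minimal sketch, assuming a supervised model tree and compatible data X, y are already defined), the default can be changed for the session, or a resource can be passed to any method supporting the acceleration keyword, such as evaluate:

using MLJ

MLJ.default_resource()                 # inspect the current default (initially CPU1())
MLJ.default_resource(CPUProcesses())   # use Distributed-based multiprocessing by default

# request multithreading for just this call:
evaluate(tree, X, y,
         resampling=CV(nfolds=6),
         measure=log_loss,
         acceleration=CPUThreads())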

diff --git a/dev/adding_models_for_general_use/index.html b/dev/adding_models_for_general_use/index.html index bcf1f9811..36b4e8024 100644 --- a/dev/adding_models_for_general_use/index.html +++ b/dev/adding_models_for_general_use/index.html @@ -1,2 +1,2 @@ -Adding Models for General Use · MLJ
+Adding Models for General Use · MLJ
diff --git a/dev/api/index.html b/dev/api/index.html index ca5710ca6..498fee261 100644 --- a/dev/api/index.html +++ b/dev/api/index.html @@ -1,2 +1,2 @@ -Index of Methods · MLJ

Index of Methods

+Index of Methods · MLJ

Index of Methods

diff --git a/dev/benchmarking/index.html b/dev/benchmarking/index.html index 8bd7dd0ed..b021b6b3b 100644 --- a/dev/benchmarking/index.html +++ b/dev/benchmarking/index.html @@ -1,2 +1,2 @@ -Benchmarking · MLJ
+Benchmarking · MLJ
diff --git a/dev/common_mlj_workflows/index.html b/dev/common_mlj_workflows/index.html index 63748d3c5..d590c323e 100644 --- a/dev/common_mlj_workflows/index.html +++ b/dev/common_mlj_workflows/index.html @@ -1,10 +1,10 @@ -Common MLJ Workflows · MLJ

Common MLJ Workflows

This demo assumes you have certain packages in your active package environment. To activate a new environment, "MyNewEnv", with just these packages, do this in a new REPL session:

using Pkg
+Common MLJ Workflows · MLJ

Common MLJ Workflows

This demo assumes you have certain packages in your active package environment. To activate a new environment, "MyNewEnv", with just these packages, do this in a new REPL session:

using Pkg
 Pkg.activate("MyNewEnv")
 Pkg.add(["MLJ", "RDatasets", "DataFrames", "MLJDecisionTreeInterface",
     "MLJMultivariateStatsInterface", "NearestNeighborModels", "MLJGLMInterface",
     "Plots"])

The following starts MLJ and shows the current version of MLJ (you can also use Pkg.status()):

using MLJ
-MLJ_VERSION
v"0.20.5"

Data ingestion

import RDatasets
+MLJ_VERSION
v"0.20.6"

Data ingestion

import RDatasets
 channing = RDatasets.dataset("boot", "channing")
first(channing, 4) |> pretty
┌──────────────────────────────────┬───────┬───────┬───────┬───────┐
 │ Sex                              │ Entry │ Exit  │ Time  │ Cens  │
 │ CategoricalValue{String, UInt32} │ Int32 │ Int32 │ Int32 │ Int32 │
@@ -61,7 +61,7 @@
  "setosa"
  "setosa"
  "setosa"

Splitting data vertically after row shuffling:

channing_train, channing_test = partition(channing, 0.6, rng=123);

Or, if already horizontally split:

(Xtrain, Xtest), (ytrain, ytest) = partition((X, y), 0.6, multi=true, rng=123)
(((sepal_length = [6.7, 5.7, 7.2, 4.4, 5.6, 6.5, 4.4, 6.1, 5.4, 4.9  …  6.4, 5.5, 5.4, 4.8, 6.5, 4.9, 6.5, 6.7, 5.6, 6.4], sepal_width = [3.3, 2.8, 3.0, 2.9, 2.5, 3.0, 3.0, 2.9, 3.9, 2.5  …  3.1, 2.3, 3.7, 3.1, 3.0, 2.4, 2.8, 3.3, 2.9, 2.8], petal_length = [5.7, 4.1, 5.8, 1.4, 3.9, 5.2, 1.3, 4.7, 1.7, 4.5  …  5.5, 4.0, 1.5, 1.6, 5.5, 3.3, 4.6, 5.7, 3.6, 5.6], petal_width = [2.1, 1.3, 1.6, 0.2, 1.1, 2.0, 0.2, 1.4, 0.4, 1.7  …  1.8, 1.3, 0.2, 0.2, 1.8, 1.0, 1.5, 2.5, 1.3, 2.2]), (sepal_length = [6.0, 5.8, 6.7, 5.1, 5.0, 6.3, 5.7, 6.4, 6.1, 5.0  …  6.4, 6.8, 6.9, 6.1, 6.7, 5.0, 7.6, 6.3, 5.1, 5.0], sepal_width = [2.7, 2.6, 3.0, 3.8, 3.4, 2.8, 2.5, 3.2, 2.8, 3.5  …  2.7, 3.2, 3.1, 2.8, 2.5, 3.5, 3.0, 2.5, 3.8, 3.6], petal_length = [5.1, 4.0, 5.2, 1.9, 1.5, 5.1, 5.0, 4.5, 4.7, 1.6  …  5.3, 5.9, 5.4, 4.0, 5.8, 1.3, 6.6, 5.0, 1.6, 1.4], petal_width = [1.6, 1.2, 2.3, 0.4, 0.2, 1.5, 2.0, 1.5, 1.2, 0.6  …  1.9, 2.3, 2.1, 1.3, 1.8, 0.3, 2.1, 1.9, 0.2, 0.2])), (CategoricalValue{String, UInt32}["virginica", "versicolor", "virginica", "setosa", "versicolor", "virginica", "setosa", "versicolor", "setosa", "virginica"  …  "virginica", "versicolor", "setosa", "setosa", "virginica", "versicolor", "versicolor", "virginica", "versicolor", "virginica"], CategoricalValue{String, UInt32}["versicolor", "versicolor", "virginica", "setosa", "setosa", "virginica", "virginica", "versicolor", "versicolor", "setosa"  …  "virginica", "virginica", "virginica", "versicolor", "virginica", "setosa", "virginica", "virginica", "setosa", "setosa"]))

Reference: Model Search

Searching for a supervised model:

X, y = @load_boston
-ms = models(matching(X, y))
70-element Vector{NamedTuple{(:name, :package_name, :is_supervised, :abstract_type, :deep_properties, :docstring, :fit_data_scitype, :human_name, :hyperparameter_ranges, :hyperparameter_types, :hyperparameters, :implemented_methods, :inverse_transform_scitype, :is_pure_julia, :is_wrapper, :iteration_parameter, :load_path, :package_license, :package_url, :package_uuid, :predict_scitype, :prediction_type, :reporting_operations, :reports_feature_importances, :supports_class_weights, :supports_online, :supports_training_losses, :supports_weights, :transform_scitype, :input_scitype, :target_scitype, :output_scitype)}}:
+ms = models(matching(X, y))
70-element Vector{NamedTuple{(:name, :package_name, :is_supervised, :abstract_type, :constructor, :deep_properties, :docstring, :fit_data_scitype, :human_name, :hyperparameter_ranges, :hyperparameter_types, :hyperparameters, :implemented_methods, :inverse_transform_scitype, :is_pure_julia, :is_wrapper, :iteration_parameter, :load_path, :package_license, :package_url, :package_uuid, :predict_scitype, :prediction_type, :reporting_operations, :reports_feature_importances, :supports_class_weights, :supports_online, :supports_training_losses, :supports_weights, :transform_scitype, :input_scitype, :target_scitype, :output_scitype)}}:
  (name = ARDRegressor, package_name = MLJScikitLearnInterface, ... )
  (name = AdaBoostRegressor, package_name = MLJScikitLearnInterface, ... )
  (name = BaggingRegressor, package_name = MLJScikitLearnInterface, ... )
@@ -85,6 +85,7 @@
  package_name = "MLJModels",
  is_supervised = true,
  abstract_type = Probabilistic,
+ constructor = nothing,
  deep_properties = (),
  docstring = "```\nConstantRegressor\n```\n\nThis \"dummy\" probabilis...",
  fit_data_scitype = Tuple{Table, AbstractVector{Continuous}},
@@ -99,7 +100,7 @@
  iteration_parameter = nothing,
  load_path = "MLJModels.ConstantRegressor",
  package_license = "MIT",
- package_url = "https://github.com/alan-turing-institute/MLJModels.jl",
+ package_url = "https://github.com/JuliaAI/MLJModels.jl",
  package_uuid = "d491faf4-2d78-11e9-2867-c94bc002c0b7",
  predict_scitype = AbstractVector{ScientificTypesBase.Density{Continuous}},
  prediction_type = :probabilistic,
@@ -112,7 +113,7 @@
  transform_scitype = Unknown,
  input_scitype = Table,
  target_scitype = AbstractVector{Continuous},
- output_scitype = Unknown)
models("Tree")
28-element Vector{NamedTuple{(:name, :package_name, :is_supervised, :abstract_type, :deep_properties, :docstring, :fit_data_scitype, :human_name, :hyperparameter_ranges, :hyperparameter_types, :hyperparameters, :implemented_methods, :inverse_transform_scitype, :is_pure_julia, :is_wrapper, :iteration_parameter, :load_path, :package_license, :package_url, :package_uuid, :predict_scitype, :prediction_type, :reporting_operations, :reports_feature_importances, :supports_class_weights, :supports_online, :supports_training_losses, :supports_weights, :transform_scitype, :input_scitype, :target_scitype, :output_scitype)}}:
+ output_scitype = Unknown)
models("Tree")
28-element Vector{NamedTuple{(:name, :package_name, :is_supervised, :abstract_type, :constructor, :deep_properties, :docstring, :fit_data_scitype, :human_name, :hyperparameter_ranges, :hyperparameter_types, :hyperparameters, :implemented_methods, :inverse_transform_scitype, :is_pure_julia, :is_wrapper, :iteration_parameter, :load_path, :package_license, :package_url, :package_uuid, :predict_scitype, :prediction_type, :reporting_operations, :reports_feature_importances, :supports_class_weights, :supports_online, :supports_training_losses, :supports_weights, :transform_scitype, :input_scitype, :target_scitype, :output_scitype)}}:
  (name = ABODDetector, package_name = OutlierDetectionNeighbors, ... )
  (name = AdaBoostStumpClassifier, package_name = DecisionTree, ... )
  (name = COFDetector, package_name = OutlierDetectionNeighbors, ... )
@@ -136,7 +137,7 @@
     matching(model, X, y) &&
     model.prediction_type == :deterministic &&
     model.is_pure_julia
-end;

Searching for an unsupervised model:

models(matching(X))
63-element Vector{NamedTuple{(:name, :package_name, :is_supervised, :abstract_type, :deep_properties, :docstring, :fit_data_scitype, :human_name, :hyperparameter_ranges, :hyperparameter_types, :hyperparameters, :implemented_methods, :inverse_transform_scitype, :is_pure_julia, :is_wrapper, :iteration_parameter, :load_path, :package_license, :package_url, :package_uuid, :predict_scitype, :prediction_type, :reporting_operations, :reports_feature_importances, :supports_class_weights, :supports_online, :supports_training_losses, :supports_weights, :transform_scitype, :input_scitype, :target_scitype, :output_scitype)}}:
+end;

Searching for an unsupervised model:

models(matching(X))
63-element Vector{NamedTuple{(:name, :package_name, :is_supervised, :abstract_type, :constructor, :deep_properties, :docstring, :fit_data_scitype, :human_name, :hyperparameter_ranges, :hyperparameter_types, :hyperparameters, :implemented_methods, :inverse_transform_scitype, :is_pure_julia, :is_wrapper, :iteration_parameter, :load_path, :package_license, :package_url, :package_uuid, :predict_scitype, :prediction_type, :reporting_operations, :reports_feature_importances, :supports_class_weights, :supports_online, :supports_training_losses, :supports_weights, :transform_scitype, :input_scitype, :target_scitype, :output_scitype)}}:
  (name = ABODDetector, package_name = OutlierDetectionNeighbors, ... )
  (name = ABODDetector, package_name = OutlierDetectionPython, ... )
  (name = AffinityPropagation, package_name = MLJScikitLearnInterface, ... )
@@ -161,6 +162,7 @@
  package_name = "MultivariateStats",
  is_supervised = true,
  abstract_type = Deterministic,
+ constructor = nothing,
  deep_properties = (),
  docstring = "```\nRidgeRegressor\n```\n\nA model type for construct...",
  fit_data_scitype =
@@ -254,8 +256,8 @@
   rng = Random._GLOBAL_RNG())

Bind the model and data together in a machine, which will additionally store the learned parameters (fitresults) when fit:

mach = machine(tree, X, y)
untrained Machine; caches model-specific representations of data
   model: DecisionTreeClassifier(max_depth = 2, …)
   args: 
-    1:	Source @161 ⏎ Table{AbstractVector{Continuous}}
-    2:	Source @320 ⏎ AbstractVector{Multiclass{2}}
+    1:	Source @906 ⏎ Table{AbstractVector{Continuous}}
+    2:	Source @359 ⏎ AbstractVector{Multiclass{2}}
 

Split row indices into training and evaluation rows:

train, test = partition(eachindex(y), 0.7); # 70:30 split
([1, 2, 3, 4, 5, 6, 7, 8, 9, 10  …  131, 132, 133, 134, 135, 136, 137, 138, 139, 140], [141, 142, 143, 144, 145, 146, 147, 148, 149, 150  …  191, 192, 193, 194, 195, 196, 197, 198, 199, 200])

Fit on the train data set and evaluate on the test data set:

fit!(mach, rows=train)
 yhat = predict(mach, X[test,:])
 LogLoss(tol=1e-4)(yhat, y[test])
1.0788055664326648

Note LogLoss() has aliases log_loss and cross_entropy.

Predict on the new data set:

Xnew = (FL = rand(3), RW = rand(3), CL = rand(3), CW = rand(3), BD = rand(3))
@@ -328,14 +330,14 @@
 ┌───┬──────────────────────┬──────────────┬─────────────┐
 │   │ measure              │ operation    │ measurement │
 ├───┼──────────────────────┼──────────────┼─────────────┤
-│ A │ LogLoss(             │ predict      │ 4.79        │
+│ A │ LogLoss(             │ predict      │ 4.81        │
 │   │   tol = 2.22045e-16) │              │             │
 │ B │ Accuracy()           │ predict_mode │ 0.736       │
 └───┴──────────────────────┴──────────────┴─────────────┘
 ┌───┬───────────────────────┬─────────┐
 │   │ per_fold              │ 1.96*SE │
 ├───┼───────────────────────┼─────────┤
-│ A │ [5.1, 6.48, 3.01]     │ 2.42    │
+│ A │ [5.1, 6.48, 3.07]     │ 2.38    │
 │ B │ [0.696, 0.739, 0.769] │ 0.0513  │
 └───┴───────────────────────┴─────────┘
 

Changing a hyperparameter and re-evaluating:

tree.max_depth = 3
@@ -371,20 +373,20 @@
 mach =  machine(ols, X, y) |> fit!
trained Machine; caches model-specific representations of data
   model: LinearRegressor(fit_intercept = true, …)
   args: 
-    1:	Source @404 ⏎ Table{AbstractVector{Continuous}}
-    2:	Source @045 ⏎ AbstractVector{Continuous}
+    1:	Source @645 ⏎ Table{AbstractVector{Continuous}}
+    2:	Source @770 ⏎ AbstractVector{Continuous}
 

Get a named tuple representing the learned parameters, human-readable if appropriate:

fitted_params(mach)
(features = [:x1, :x2],
- coef = [0.9888952836446506, -2.004049829561238],
- intercept = 0.05769536670984801,)

Get other training-related information:

report(mach)
(stderror = [0.007604565930774463, 0.010189538030015845, 0.010246024389886282],
+ coef = [1.0019484491170485, -2.0124583022683673],
+ intercept = 0.05839955441025016,)

Get other training-related information:

report(mach)
(stderror = [0.007289046266540614, 0.009314702321351547, 0.009664751997931865],
  dof_residual = 97.0,
- vcov = [5.7829422995495684e-5 -4.839517622172482e-5 -4.824455318623773e-5; -4.839517622172482e-5 0.0001038266852651392 -9.317757798200271e-6; -4.824455318623773e-5 -9.317757798200271e-6 0.00010498101579814456],
- deviance = 0.08456127941358121,
+ vcov = [5.3130195475769666e-5 -4.737168085144333e-5 -4.924311372223852e-5; -4.737168085144333e-5 8.676367933539189e-5 1.3161433444949447e-5; -4.924311372223852e-5 1.3161433444949447e-5 9.340743118152798e-5],
+ deviance = 0.07678443409575168,
  coef_table = ──────────────────────────────────────────────────────────────────────────────
                   Coef.  Std. Error        t  Pr(>|t|)   Lower 95%   Upper 95%
 ──────────────────────────────────────────────────────────────────────────────
-(Intercept)   0.0576954  0.00760457     7.59    <1e-10   0.0426024   0.0727883
-x1            0.988895   0.0101895     97.05    <1e-97   0.968672    1.00912
-x2           -2.00405    0.010246    -195.59    <1e-99  -2.02439    -1.98371
+(Intercept)   0.0583996  0.00728905     8.01    <1e-11   0.0439328   0.0728663
+x1            1.00195    0.0093147    107.57    <1e-99   0.983461    1.02044
+x2           -2.01246    0.00966475  -208.23    <1e-99  -2.03164    -1.99328
 ──────────────────────────────────────────────────────────────────────────────,)

Basic fit/transform for unsupervised models

Load data:

X, y = @load_iris  # a table and a vector
 train, test = partition(eachindex(y), 0.97, shuffle=true, rng=123)
([125, 100, 130, 9, 70, 148, 39, 64, 6, 107  …  110, 59, 139, 21, 112, 144, 140, 72, 109, 41], [106, 147, 47, 5])

Instantiate and fit the model/machine:

PCA = @load PCA
 pca = PCA(maxoutdim=2)
@@ -392,12 +394,12 @@
 fit!(mach, rows=train)
trained Machine; caches model-specific representations of data
   model: PCA(maxoutdim = 2, …)
   args: 
-    1:	Source @151 ⏎ Table{AbstractVector{Continuous}}
+    1:	Source @053 ⏎ Table{AbstractVector{Continuous}}
 

Transform selected data bound to the machine:

transform(mach, rows=test);
(x1 = [-3.394282685448322, -1.5219827578765053, 2.53824745518522, 2.7299639893931382],
  x2 = [0.547245022374522, -0.36842368617126425, 0.5199299511335688, 0.3448466122232349],)

Transform new data:

Xnew = (sepal_length=rand(3), sepal_width=rand(3),
         petal_length=rand(3), petal_width=rand(3));
-transform(mach, Xnew)
(x1 = [4.4240113088483985, 4.903489857969179, 4.71363023881634],
- x2 = [-4.548271408319666, -5.023677889350307, -4.738428955775531],)

Inverting learned transformations

y = rand(100);
+transform(mach, Xnew)
(x1 = [4.60254619833418, 4.963408439322138, 4.73352667809396],
+ x2 = [-4.450747224690028, -4.340052887208079, -4.323758570369482],)

Inverting learned transformations

y = rand(100);
 stand = Standardizer()
 mach = machine(stand, y)
 fit!(mach)
@@ -460,13 +462,13 @@
   logger = nothing)

Bind the wrapped model to data:

mach = machine(tuned_forest, X, y)
untrained Machine; does not cache data
   model: ProbabilisticTunedModel(model = ProbabilisticEnsembleModel(model = DecisionTreeClassifier(max_depth = -1, …), …), …)
   args: 
-    1:	Source @157 ⏎ Table{AbstractVector{Continuous}}
-    2:	Source @318 ⏎ AbstractVector{Multiclass{3}}
+    1:	Source @313 ⏎ Table{AbstractVector{Continuous}}
+    2:	Source @689 ⏎ AbstractVector{Multiclass{3}}
 

Fitting the resultant machine optimizes the hyperparameters specified in range, using the specified tuning and resampling strategies and performance measure (possibly a vector of measures), and retrains on all data bound to the machine:

fit!(mach)
trained Machine; does not cache data
   model: ProbabilisticTunedModel(model = ProbabilisticEnsembleModel(model = DecisionTreeClassifier(max_depth = -1, …), …), …)
   args: 
-    1:	Source @157 ⏎ Table{AbstractVector{Continuous}}
-    2:	Source @318 ⏎ AbstractVector{Multiclass{3}}
+    1:	Source @313 ⏎ Table{AbstractVector{Continuous}}
+    2:	Source @689 ⏎ AbstractVector{Multiclass{3}}
 

Inspecting the optimal model:

F = fitted_params(mach)
(best_model = ProbabilisticEnsembleModel(model = DecisionTreeClassifier(max_depth = -1, …), …),
  best_fitted_params = (fitresult = WrappedEnsemble(atom = DecisionTreeClassifier(max_depth = -1, …), …),),)
F.best_model
ProbabilisticEnsembleModel(
   model = DecisionTreeClassifier(
@@ -474,7 +476,7 @@
         min_samples_leaf = 1, 
         min_samples_split = 2, 
         min_purity_increase = 0.0, 
-        n_subfeatures = 3, 
+        n_subfeatures = 4, 
         post_prune = false, 
         merge_purity_threshold = 1.0, 
         display_depth = 5, 
@@ -487,12 +489,12 @@
   acceleration = CPU1{Nothing}(nothing), 
   out_of_bag_measure = Any[])

Inspecting details of tuning procedure:

r = report(mach);
 keys(r)
(:best_model, :best_history_entry, :history, :best_report, :plotting)
r.history[[1,end]]
2-element Vector{@NamedTuple{model::MLJEnsembles.ProbabilisticEnsembleModel{MLJDecisionTreeInterface.DecisionTreeClassifier}, measure::Vector{StatisticalMeasuresBase.RobustMeasure{StatisticalMeasuresBase.FussyMeasure{StatisticalMeasuresBase.RobustMeasure{StatisticalMeasures._BrierLossType}, typeof(StatisticalMeasures.l2_check)}}}, measurement::Vector{Float64}, per_fold::Vector{Vector{Float64}}, evaluation::CompactPerformanceEvaluation{MLJEnsembles.ProbabilisticEnsembleModel{MLJDecisionTreeInterface.DecisionTreeClassifier}, Vector{StatisticalMeasuresBase.RobustMeasure{StatisticalMeasuresBase.FussyMeasure{StatisticalMeasuresBase.RobustMeasure{StatisticalMeasures._BrierLossType}, typeof(StatisticalMeasures.l2_check)}}}, Vector{Float64}, Vector{typeof(predict)}, Vector{Vector{Float64}}, Vector{Vector{Vector{Float64}}}, CV}}}:
- (model = ProbabilisticEnsembleModel(model = DecisionTreeClassifier(max_depth = -1, …), …), measure = [BrierLoss()], measurement = [0.10485007407407392], per_fold = [[-0.0, -0.0, 0.134687111111111, 0.1530017777777773, 0.13752799999999973, 0.20388355555555532]], evaluation = CompactPerformanceEvaluation(0.105,))
- (model = ProbabilisticEnsembleModel(model = DecisionTreeClassifier(max_depth = -1, …), …), measure = [BrierLoss()], measurement = [0.11550755555555536], per_fold = [[0.008363555555555645, 0.0002675555555555853, 0.15716799999999978, 0.15550933333333286, 0.15517244444444409, 0.21656444444444425]], evaluation = CompactPerformanceEvaluation(0.116,))

Visualizing these results:

using Plots
+ (model = ProbabilisticEnsembleModel(model = DecisionTreeClassifier(max_depth = -1, …), …), measure = [BrierLoss()], measurement = [0.11061688888888872], per_fold = [[0.008769777777777862, 0.00018311111111112943, 0.13994577777777764, 0.15614133333333288, 0.14898399999999967, 0.20967733333333313]], evaluation = CompactPerformanceEvaluation(0.111,))
+ (model = ProbabilisticEnsembleModel(model = DecisionTreeClassifier(max_depth = -1, …), …), measure = [BrierLoss()], measurement = [0.12125549176954746], per_fold = [[0.02781777777777793, 0.007603555555555701, 0.19223187037037057, 0.1535252222222222, 0.1663280555555555, 0.18002646913580272]], evaluation = CompactPerformanceEvaluation(0.121,))

Visualizing these results:

using Plots
 plot(mach)

Predicting on new data using the optimized model trained on all data:

predict(mach, Xnew)
3-element UnivariateFiniteVector{Multiclass{3}, String, UInt32, Float64}:
  UnivariateFinite{Multiclass{3}}(setosa=>1.0, versicolor=>0.0, virginica=>0.0)
  UnivariateFinite{Multiclass{3}}(setosa=>1.0, versicolor=>0.0, virginica=>0.0)
- UnivariateFinite{Multiclass{3}}(setosa=>1.0, versicolor=>0.0, virginica=>0.0)

Constructing linear pipelines

Reference: Linear Pipelines

Constructing a linear (unbranching) pipeline with a learned target transformation/inverse transformation:

X, y = @load_reduced_ames
+ UnivariateFinite{Multiclass{3}}(setosa=>0.767, versicolor=>0.213, virginica=>0.02)

Constructing linear pipelines

Reference: Linear Pipelines

Constructing a linear (unbranching) pipeline with a learned target transformation/inverse transformation:

X, y = @load_reduced_ames
 KNN = @load KNNRegressor
 knn_with_target = TransformedTargetModel(model=KNN(K=3), transformer=Standardizer())
TransformedTargetModelDeterministic(
   model = KNNRegressor(
@@ -556,13 +558,13 @@
 ┌──────────────────────┬───────────┬─────────────┐
 │ measure              │ operation │ measurement │
 ├──────────────────────┼───────────┼─────────────┤
-│ LogLoss(             │ predict   │ 0.626       │
+│ LogLoss(             │ predict   │ 0.428       │
 │   tol = 2.22045e-16) │           │             │
 └──────────────────────┴───────────┴─────────────┘
 ┌────────────────────────────────────────────────┬─────────┐
 │ per_fold                                       │ 1.96*SE │
 ├────────────────────────────────────────────────┼─────────┤
-│ [3.89e-15, 3.89e-15, 0.278, 1.62, 1.56, 0.302] │ 0.663   │
+│ [3.89e-15, 3.89e-15, 0.294, 0.41, 1.56, 0.299] │ 0.51    │
 └────────────────────────────────────────────────┴─────────┘
 

Performance curves

Generate a plot of performance as a function of some hyperparameter (building on the preceding example)

Single performance curve:

r = range(forest, :n, lower=1, upper=1000, scale=:log10)
 curve = learning_curve(mach,
@@ -573,7 +575,7 @@
                        verbosity=0)
(parameter_name = "n",
  parameter_scale = :log10,
  parameter_values = [1, 2, 3, 4, 5, 6, 7, 8, 10, 11  …  281, 324, 373, 429, 494, 569, 655, 754, 869, 1000],
- measurements = [16.820371581588002, 5.237112030897364, 2.0597068716412203, 2.0134506566622394, 2.011478973790339, 2.0469387705862574, 1.9635257776734945, 1.9175424519659632, 1.8807692408608805, 1.8956267402240952  …  1.227922996942082, 1.2317750828445333, 1.2242819711464847, 1.2278722009771288, 1.224882436137922, 1.229692343012722, 1.2290129800152094, 1.230548543545263, 1.2331632457059212, 1.2423449607020516],)
using Plots
+ measurements = [4.004850376568572, 4.1126732713223415, 4.067922726718731, 4.123999873775369, 4.150105956717014, 2.688089225524209, 2.715285824731319, 2.7309139415415857, 2.7444858783511297, 2.7476450089856033  …  1.269185619048552, 1.2786364928754186, 1.2725212042652867, 1.2789570911204242, 1.2797130430389276, 1.2768033472128724, 1.2644056972193418, 1.2598962094386172, 1.2612790173706743, 1.2557508210679436],)
using Plots
 plot(curve.parameter_values, curve.measurements,
      xlab=curve.parameter_name, xscale=curve.parameter_scale)

Multiple curves:

curve = learning_curve(mach,
                        range=r,
@@ -585,5 +587,5 @@
                        verbosity=0)
(parameter_name = "n",
  parameter_scale = :log10,
  parameter_values = [1, 2, 3, 4, 5, 6, 7, 8, 10, 11  …  281, 324, 373, 429, 494, 569, 655, 754, 869, 1000],
- measurements = [4.004850376568572 8.009700753137146 16.820371581588002 8.009700753137146; 4.004850376568572 8.040507294495367 9.087929700674836 8.040507294495367; … ; 1.2032433035747263 1.2341483592529663 1.2651934430346028 1.2739529214818293; 1.2088161637388406 1.2322191736815042 1.2677284135372628 1.2752290384276348],)
plot(curve.parameter_values, curve.measurements,
-     xlab=curve.parameter_name, xscale=curve.parameter_scale)

+ measurements = [4.004850376568572 8.009700753137146 16.820371581588002 9.611640903764574; 4.004850376568572 8.009700753137146 9.087929700674836 9.611640903764574; … ; 1.2099979316961877 1.2316766858863117 1.266241881645686 1.274322191002287; 1.214989736207193 1.2334567682916915 1.2684272251885533 1.2728908797309264],)
plot(curve.parameter_values, curve.measurements,
+     xlab=curve.parameter_name, xscale=curve.parameter_scale)

diff --git a/dev/composing_models/index.html b/dev/composing_models/index.html index 3c87b1fb8..208e8be61 100644 --- a/dev/composing_models/index.html +++ b/dev/composing_models/index.html @@ -1,2 +1,2 @@ -Composing Models · MLJ

Composing Models

Three common ways of combining multiple models together have out-of-the-box implementations in MLJ:

  • Linear Pipelines (Pipeline) - for unbranching chains that take the output of one model (e.g., dimension reduction, such as PCA) and make it the input of the next model in the chain (e.g., a classification model, such as EvoTreeClassifier). To include transformations of the target variable in a supervised pipeline model, see Target Transformations.
  • Homogeneous Ensembles (EnsembleModel) - for blending the predictions of multiple supervised models all of the same type, but which receive different views of the training data to reduce overall variance. The technique implemented here is known as observation bagging.
  • Model Stacking (Stack) - for combining the predictions of a smaller number of models of possibly different types, with the help of an adjudicating model.

Additionally, more complicated model compositions are possible using:

  • Learning Networks - "blueprints" for combining models in flexible ways; these are simple transformations of your existing workflows which can be "exported" to define new, stand-alone model types.
+Composing Models · MLJ

Composing Models

Three common ways of combining multiple models together have out-of-the-box implementations in MLJ:

  • Linear Pipelines (Pipeline) - for unbranching chains that take the output of one model (e.g., dimension reduction, such as PCA) and make it the input of the next model in the chain (e.g., a classification model, such as EvoTreeClassifier). To include transformations of the target variable in a supervised pipeline model, see Target Transformations. (A minimal pipeline and ensemble sketch follows this list.)
  • Homogeneous Ensembles (EnsembleModel) - for blending the predictions of multiple supervised models all of the same type, but which receive different views of the training data to reduce overall variance. The technique implemented here is known as observation bagging.
  • Model Stacking (Stack) - for combining the predictions of a smaller number of models of possibly different types, with the help of an adjudicating model.
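
As a minimal sketch of the first two approaches (assuming the MLJMultivariateStatsInterface and MLJDecisionTreeInterface packages are in the active environment; the hyperparameter values are illustrative):

using MLJ
PCA = @load PCA pkg=MultivariateStats verbosity=0
Tree = @load DecisionTreeClassifier pkg=DecisionTree verbosity=0

# a linear pipeline: standardize, reduce dimension, then classify
pipe = Standardizer() |> PCA(maxoutdim=2) |> Tree()

# a bagged (observation-resampling) ensemble of 100 trees
forest = EnsembleModel(model=Tree(), n=100)

Both pipe and forest are ordinary models, which can be bound to data in a machine, evaluated, or tuned like any other model.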

Additionally, more complicated model compositions are possible using:

  • Learning Networks - "blueprints" for combining models in flexible ways; these are simple transformations of your existing workflows which can be "exported" to define new, stand-alone model types.
diff --git a/dev/controlling_iterative_models/index.html b/dev/controlling_iterative_models/index.html index e1be01dd6..89bdf8582 100644 --- a/dev/controlling_iterative_models/index.html +++ b/dev/controlling_iterative_models/index.html @@ -1,5 +1,5 @@ -Controlling Iterative Models · MLJ

Controlling Iterative Models

Iterative supervised machine learning models are usually trained until an out-of-sample estimate of the performance satisfies some stopping criterion, such as k consecutive deteriorations of the performance (see Patience below). A more sophisticated kind of control might dynamically mutate parameters, such as a learning rate, in response to the behavior of these estimates.

Some iterative model implementations enable some form of automated control, with the method and options for doing so varying from model to model. But sometimes it is up to the user to arrange control, which in the crudest case reduces to manually experimenting with the iteration parameter.

In response to this ad hoc state of affairs, MLJ provides a uniform and feature-rich interface for controlling any iterative model that exposes its iteration parameter as a hyper-parameter, and which implements the "warm restart" behavior described in Machines.

Basic use

As in Tuning Models, iteration control in MLJ is implemented as a model wrapper, which allows composition with other meta-algorithms. Ordinarily, the wrapped model behaves just like the original model, but with the training occurring on a subset of the provided data (to allow computation of an out-of-sample loss) and with the iteration parameter automatically determined by the controls specified in the wrapper.

By setting retrain=true one can ask that the wrapped model retrain on all supplied data, after learning the appropriate number of iterations from the controlled training phase:

using MLJ
+Controlling Iterative Models · MLJ

Controlling Iterative Models

Iterative supervised machine learning models are usually trained until an out-of-sample estimate of the performance satisfies some stopping criterion, such as k consecutive deteriorations of the performance (see Patience below). A more sophisticated kind of control might dynamically mutate parameters, such as a learning rate, in response to the behavior of these estimates.

Some iterative model implementations enable some form of automated control, with the method and options for doing so varying from model to model. But sometimes it is up to the user to arrange control, which in the crudest case reduces to manually experimenting with the iteration parameter.

In response to this ad hoc state of affairs, MLJ provides a uniform and feature-rich interface for controlling any iterative model that exposes its iteration parameter as a hyper-parameter, and which implements the "warm restart" behavior described in Machines.

Basic use

As in Tuning Models, iteration control in MLJ is implemented as a model wrapper, which allows composition with other meta-algorithms. Ordinarily, the wrapped model behaves just like the original model, but with the training occurring on a subset of the provided data (to allow computation of an out-of-sample loss) and with the iteration parameter automatically determined by the controls specified in the wrapper.

By setting retrain=true one can ask that the wrapped model retrain on all supplied data, after learning the appropriate number of iterations from the controlled training phase:

using MLJ
 
 X, y = make_moons(100, rng=123, noise=0.5)
 EvoTreeClassifier = @load EvoTreeClassifier verbosity=0
@@ -49,8 +49,8 @@
  - rng: Random.MersenneTwister(123)
 , …)
   args:
-    1:	Source @278 ⏎ Table{AbstractVector{Continuous}}
-    2:	Source @320 ⏎ AbstractVector{Multiclass{2}}

As detailed under IteratedModel below, the specified controls are repeatedly applied in sequence to a training machine, constructed under the hood, until one of the controls triggers a stop. Here Step(5) means "Compute 5 more iterations" (in this case starting from none); Patience(2) means "Stop at the end of the control cycle if there have been 2 consecutive drops in the log loss"; and NumberLimit(100) is a safeguard ensuring a stop after 100 control cycles (500 iterations). See Controls provided below for a complete list.

Because iteration is implemented as a wrapper, the "self-iterating" model can be evaluated using cross-validation, say, and the number of iterations on each fold will generally be different:

e = evaluate!(mach, resampling=CV(nfolds=3), measure=log_loss, verbosity=0);
+    1:	Source @303 ⏎ Table{AbstractVector{Continuous}}
+    2:	Source @103 ⏎ AbstractVector{Multiclass{2}}

As detailed under IteratedModel below, the specified controls are repeatedly applied in sequence to a training machine, constructed under the hood, until one of the controls triggers a stop. Here Step(5) means "Compute 5 more iterations" (in this case starting from none); Patience(2) means "Stop at the end of the control cycle if there have been 2 consecutive drops in the log loss"; and NumberLimit(100) is a safeguard ensuring a stop after 100 control cycles (500 iterations). See Controls provided below for a complete list.

Because iteration is implemented as a wrapper, the "self-iterating" model can be evaluated using cross-validation, say, and the number of iterations on each fold will generally be different:

e = evaluate!(mach, resampling=CV(nfolds=3), measure=log_loss, verbosity=0);
 map(e.report_per_fold) do r
     r.n_iterations
 end
3-element Vector{Int64}:
@@ -79,8 +79,8 @@
 trained Machine; does not cache data
   model: DeterministicIteratedModel(model = DeterministicTunedModel(model = RidgeRegressor(lambda = 1.0, …), …), …)
   args:
-    1:	Source @848 ⏎ Table{AbstractVector{Continuous}}
-    2:	Source @345 ⏎ AbstractVector{Continuous}
julia> report(mach).model_report.best_modelRidgeRegressor(
+    1:	Source @948 ⏎ Table{AbstractVector{Continuous}}
+    2:	Source @268 ⏎ AbstractVector{Continuous}
julia> report(mach).model_report.best_modelRidgeRegressor(
   lambda = 0.4243170708090101,
   fit_intercept = true,
   penalize_intercept = false,
@@ -150,16 +150,18 @@
     verbosity > 1 && @info "learning rate: $r"
     wrapper.model.iteration_control = r
     return (learning_rates = rates,)
-end

API Reference

MLJIteration.IteratedModelFunction
IteratedModel(model=nothing,
-              controls=Any[Step(1), Patience(5), GL(2.0), TimeLimit(Dates.Millisecond(108000)), InvalidValue()],
-              retrain=false,
-              resampling=Holdout(),
-              measure=nothing,
-              weights=nothing,
-              class_weights=nothing,
-              operation=predict,
-              verbosity=1,
-              check_measure=true,
-              iteration_parameter=nothing,
-              cache=true)

Wrap the specified model <: Supervised in the specified iteration controls. Training a machine bound to the wrapper iterates a corresponding machine bound to model. Here model should support iteration.

To list all controls, do MLJIteration.CONTROLS. Controls are summarized at https://alan-turing-institute.github.io/MLJ.jl/dev/getting_started/ but query individual doc-strings for details and advanced options. For creating your own controls, refer to the documentation just cited.

To make out-of-sample losses available to the controls, the machine bound to model is only trained on part of the data, as iteration proceeds. See details on training below. Specify retrain=true to ensure the model is retrained on all available data, using the same number of iterations, once controlled iteration has stopped.

Specify resampling=nothing if all data is to be used for controlled iteration, with each out-of-sample loss replaced by the most recent training loss, assuming this is made available by the model (supports_training_losses(model) == true). Otherwise, resampling must have type Holdout (eg, Holdout(fraction_train=0.8, rng=123)).

Assuming retrain=true or resampling=nothing, iterated_model behaves exactly like the original model but with the iteration parameter automatically selected. If retrain=false (default) and resampling is not nothing, then iterated_model behaves like the original model trained on a subset of the provided data.

Controlled iteration can be continued with new fit! calls (warm restart) by mutating a control, or by mutating the iteration parameter of model, which is otherwise ignored.

Training

Given an instance iterated_model of IteratedModel, calling fit!(mach) on a machine mach = machine(iterated_model, data...) performs the following actions:

  • Assuming resampling !== nothing, the data is split into train and test sets, according to the specified resampling strategy, which must have type Holdout.

  • A clone of the wrapped model, iterated_model.model, is bound to the train data in an internal machine, train_mach. If resampling === nothing, all data is used instead. This machine is the object to which controls are applied. For example, Callback(fitted_params |> print) will print the value of fitted_params(train_mach).

  • The iteration parameter of the clone is set to 0.

  • The specified controls are repeatedly applied to train_mach in sequence, until one of the controls triggers a stop. Loss-based controls (eg, Patience(), GL(), Threshold(0.001)) use an out-of-sample loss, obtained by applying measure to predictions and the test target values. (Specifically, these predictions are those returned by operation(train_mach).) If resampling === nothing then the most recent training loss is used instead. Some controls require both out-of-sample and training losses (eg, PQ()).

  • Once a stop has been triggered, a clone of model is bound to all data in a machine called mach_production below, unless retrain == false or resampling === nothing, in which case mach_production coincides with train_mach.

Prediction

Calling predict(mach, Xnew) returns predict(mach_production, Xnew). Similar statements hold for predict_mean, predict_mode, predict_median.

Controls

A control is permitted to mutate the fields (hyper-parameters) of train_mach.model (the clone of model). For example, to mutate a learning rate one might use the control

Callback(mach -> mach.model.eta = 1.05*mach.model.eta)

However, unless model supports warm restarts with respect to changes in that parameter, this will trigger retraining of train_mach from scratch, with a different training outcome, which is not recommended.

Warm restarts

If iterated_model is mutated and fit!(mach) is called again, then a warm restart is attempted if the only parameters to change are model or controls or both.

Specifically, train_mach.model is mutated to match the current value of iterated_model.model and the iteration parameter of the latter is updated to the last value used in the preceding fit!(mach) call. Then repeated application of the (updated) controls begins anew.
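
For example (a sketch, assuming mach = machine(iterated_model, X, y) has already been fit once):

iterated_model.controls = [Step(5), NumberLimit(10)]  # ask for up to 50 further iterations
fit!(mach)  # warm restart: iteration resumes from the current iteration count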

source

Controls

IterationControl.StepType
Step(; n=1)

An iteration control, as in, Step(2).

Train for n more iterations. Will never trigger a stop.

source
EarlyStopping.TimeLimitType
TimeLimit(; t=0.5)

An early stopping criterion for loss-reporting iterative algorithms.

Stopping is triggered after t hours have elapsed since the stopping criterion was initiated.

Any Julia built-in Real type can be used for t. Subtypes of Period may also be used, as in TimeLimit(t=Minute(30)).

Internally, t is rounded to the nearest millisecond.

source
EarlyStopping.NumberLimitType
NumberLimit(; n=100)

An early stopping criterion for loss-reporting iterative algorithms.

A stop is triggered by n consecutive loss updates, excluding "training" loss updates.

If wrapped in a stopper::EarlyStopper, this is the number of calls to done!(stopper).

source
EarlyStopping.NumberSinceBestType
NumberSinceBest(; n=6)

An early stopping criterion for loss-reporting iterative algorithms.

A stop is triggered when the number of calls to the control, since the lowest value of the loss so far, is n.

For a customizable loss-based stopping criterion, use WithLossDo or WithTrainingLossesDo with the stop_if_true=true option.

source
EarlyStopping.InvalidValueType
InvalidValue()

An early stopping criterion for loss-reporting iterative algorithms.

Stop if a loss (or training loss) is NaN, Inf or -Inf (or, more precisely, if isnan(loss) or isinf(loss) is true).

For a customizable loss-based stopping criterion, use WithLossDo or WithTrainingLossesDo with the stop_if_true=true option.

source
EarlyStopping.ThresholdType
Threshold(; value=0.0)

An early stopping criterion for loss-reporting iterative algorithms.

A stop is triggered as soon as the loss drops below value.

For a customizable loss-based stopping criterion, use WithLossDo or WithTrainingLossesDo with the stop_if_true=true option.

source
EarlyStopping.GLType
GL(; alpha=2.0)

An early stopping criterion for loss-reporting iterative algorithms.

A stop is triggered when the (rescaled) generalization loss exceeds the threshold alpha.

Terminology. Suppose $E_1, E_2, ..., E_t$ are a sequence of losses, for example, out-of-sample estimates of the loss associated with some iterative machine learning algorithm. Then the generalization loss at time t, is given by

$GL_t = 100 \frac{E_t - E_{opt}}{|E_{opt}|}$

where $E_{opt}$ is the minimum value of the sequence.

Reference: Prechelt, Lutz (1998): "Early Stopping - But When?", in Neural Networks: Tricks of the Trade, ed. G. Orr, Springer.

source
EarlyStopping.PQType
PQ(; alpha=0.75, k=5, tol=eps(Float64))

A stopping criterion for training iterative supervised learners.

A stop is triggered when Prechelt's progress-modified generalization loss exceeds the threshold $PQ_T > alpha$, or if the training progress drops below $P_j ≤ tol$. Here k is the number of training (in-sample) losses used to estimate the training progress.

Context and explanation of terminology

The training progress at time $j$ is defined by

$P_j = 1000 |M - m|/|m|$

where $M$ is the mean of the last k training losses $F_1, F_2, …, F_k$ and $m$ is the minimum value of those losses.

The progress-modified generalization loss at time $t$ is then given by

$PQ_t = GL_t / P_t$

where $GL_t$ is the generalization loss at time $t$; see GL.

PQ will stop when the following are true:

  1. At least k training samples have been collected via done!(c::PQ, loss; training = true) or update_training(c::PQ, loss, state)
  2. The last update was an out-of-sample update. (done!(::PQ, loss; training=true) is always false)
  3. The progress-modified generalization loss exceeds the threshold $PQ_t > alpha$ OR the training progress stalls $P_j ≤ tol$.

Reference: Prechelt, Lutz (1998): "Early Stopping - But When?", in Neural Networks: Tricks of the Trade, ed. G. Orr, Springer.

source
IterationControl.InfoType
Info(f=identity)

An iteration control, as in, Info(my_loss_function).

Log to Info the value of f(m), where m is the object being iterated. If IterativeControl.expose(m) has been overloaded, then log f(expose(m)) instead.

Can be suppressed by setting the global verbosity level sufficiently low.

See also Warn, Error.

source
IterationControl.WarnType
Warn(predicate; f="")

An iteration control, as in, Warn(m -> length(m.cache) > 100, f="Memory low").

If predicate(m) is true, then log to Warn the value of f (or f(IterationControl.expose(m)) if f is a function). Here m is the object being iterated.

Can be suppressed by setting the global verbosity level sufficiently low.

See also Info, Error.

source
IterationControl.ErrorType
Error(predicate; f="", exception=nothing))

An iteration control, as in, Error(m -> isnan(m.bias), f="Bias overflow!").

If predicate(m) is true, then log at the Error level the value of f (or f(IterationControl.expose(m)) if f is a function) and stop iteration at the end of the current control cycle. Here m is the object being iterated.

Specify exception=... to throw an immediate exception, without waiting until the end of the control cycle.

See also Info, Warn.

source
IterationControl.CallbackType
Callback(f=_->nothing, stop_if_true=false, stop_message=nothing, raw=false)

An iteration control, as in, Callback(m->put!(v, my_loss_function(m))).

Call f(IterationControl.expose(m)), where m is the object being iterated, unless raw=true, in which case call f(m) (guaranteed if expose has not been overloaded). If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.

source
IterationControl.WithNumberDoType
WithNumberDo(f=n->@info("number: $n"), stop_if_true=false, stop_message=nothing)

An iteration control, as in, WithNumberDo(n->put!(my_channel, n)).

Call f(n + 1), where n is the number of complete control cycles of the control (so, n = 1, 2, 3, ..., unless the control is wrapped in IterationControl.skip).

If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.

source
MLJIteration.WithIterationsDoType
WithIterationsDo(f=x->@info("iterations: $x"), stop_if_true=false, stop_message=nothing)

An iteration control, as in, WithIterationsDo(x->put!(my_channel, x)).

Call f(x), where x is the current number of model iterations (generally more than the number of control cycles). If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.

source
IterationControl.WithLossDoType
WithLossDo(f=x->@info("loss: $x"), stop_if_true=false, stop_message=nothing)

An iteration control, as in, WithLossDo(x->put!(my_losses, x)).

Call f(loss), where loss is the current loss.

If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.

source
IterationControl.WithTrainingLossesDoType
WithTrainingLossesDo(f=v->@info("training: $v"), stop_if_true=false, stop_message=nothing)

An iteration control, as in, WithTrainingLossesDo(v->put!(my_losses, last(v))).

Call f(training_losses), where training_losses is the vector of the most recent batch of training losses.

If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.

source
MLJIteration.WithEvaluationDoType
WithEvaluationDo(f=x->@info("evaluation: $x"), stop_if_true=false, stop_message=nothing)

An iteration control, as in, WithEvaluationDo(x->put!(my_channel, x)).

Call f(x), where x is the latest performance evaluation, as returned by evaluate!(train_mach, resampling=..., ...). Not valid if resampling=nothing. If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.

source
MLJIteration.WithFittedParamsDoType
WithFittedParamsDo(f=x->@info("fitted_params: $x"), stop_if_true=false, stop_message=nothing)

An iteration control, as in, WithFittedParamsDo(x->put!(my_channel, x)).

Call f(x), where x = fitted_params(mach) is the fitted parameters of the training machine, mach, in its current state. If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.

source
MLJIteration.WithReportDoType
WithReportDo(f=x->@info("report: $x"), stop_if_true=false, stop_message=nothing)

An iteration control, as in, WithReportDo(x->put!(my_channel, x)).

Call f(x), where x = report(mach) is the report associated with the training machine, mach, in its current state. If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.

source
MLJIteration.WithModelDoType
WithModelDo(f=x->@info("model: $x"), stop_if_true=false, stop_message=nothing)

An iteration control, as in, WithModelDo(x->put!(my_channel, x)).

Call f(x), where x is the model associated with the training machine; f may mutate x, as in f(x) = (x.learning_rate *= 0.9). If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.

source
MLJIteration.WithMachineDoType
WithMachineDo(f=x->@info("machine: $x"), stop_if_true=false, stop_message=nothing)

An iteration control, as in, WithMachineDo(x->put!(my_channel, x)).

Call f(x), where x is the training machine in its current state. If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.

source
MLJIteration.SaveType
Save(filename="machine.jls")

An iteration control, as in, Save("run3/machine.jls").

Save the current state of the machine being iterated to disk, using the provided filename, decorated with a number, as in "run3/machine42.jls". The default behaviour uses the Serialization module but this can be changed by setting the method=save_fn(::String, ::Any) argument where save_fn is any serialization method. For more on what is meant by "the machine being iterated", see IteratedModel.

source
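
A machine saved this way can be restored later; as a sketch (the filename is illustrative, and this assumes the default Serialization-based method was used and that MLJ and the relevant model interface package are loaded in the restoring session):

using Serialization
mach2 = deserialize("run3/machine42.jls")  # recover a machine written by Save
predict(mach2, Xnew)                       # Xnew: new input data, assumed defined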

Control wrappers

IterationControl.skipFunction
IterationControl.skip(control, predicate=1)

An iteration control wrapper.

If predicate is an integer, k: Apply control only on every kth call of the wrapper, starting with the kth call.

If predicate is a function: Apply control as usual when predicate(n + 1) is true but otherwise skip. Here n is the number of control cycles applied so far.

source
IterationControl.louderFunction
IterationControl.louder(control, by=1)

Wrap control to make it more (or less) verbose. The same as control, but as if the global verbosity were increased by the value by.

source
IterationControl.with_state_doFunction
IterationControl.with_state_do(control,
-                              f=x->@info "$(typeof(control)) state: $x")

Wrap control to give access to its internal state. Acts exactly like control except that f is called on the internal state of control. If f is not specified, the control type and state are logged to Info at every update (useful for debugging new controls).

Warning. The internal state of a control is not yet considered part of the public interface and could change in any pre-1.0 release of IterationControl.jl.

source
+end

API Reference

MLJIteration.IteratedModelFunction
IteratedModel(model;
+    controls=MLJIteration.DEFAULT_CONTROLS,
+    resampling=Holdout(),
+    measure=nothing,
+    retrain=false,
+    advanced_options...,
+)

Wrap the specified supervised model in the specified iteration controls. Here model should support iteration, which is true if iteration_parameter(model) is different from nothing.

Available controls: Step(), Info(), Warn(), Error(), Callback(), WithLossDo(), WithTrainingLossesDo(), WithNumberDo(), Data(), Disjunction(), GL(), InvalidValue(), Never(), NotANumber(), NumberLimit(), NumberSinceBest(), PQ(), Patience(), Threshold(), TimeLimit(), Warmup(), WithIterationsDo(), WithEvaluationDo(), WithFittedParamsDo(), WithReportDo(), WithMachineDo(), WithModelDo(), CycleLearningRate() and Save().

Important

To make out-of-sample losses available to the controls, the wrapped model is only trained on part of the data, as iteration proceeds. The user may want to force retraining on all data after controlled iteration has finished by specifying retrain=true. See also "Training", and the retrain option, under "Extended help" below.

Extended help

Options

  • controls=Any[Step(1), Patience(5), GL(2.0), TimeLimit(Dates.Millisecond(108000)), InvalidValue()]: Controls are summarized at https://JuliaAI.github.io/MLJ.jl/dev/getting_started/ but query individual doc-strings for details and advanced options. For creating your own controls, refer to the documentation just cited. (A combined example follows this options list.)

  • resampling=Holdout(fraction_train=0.7): The default resampling holds back 30% of data for computing an out-of-sample estimate of performance (the "loss") for loss-based controls such as WithLossDo. Specify resampling=nothing if all data is to be used for controlled iteration, with each out-of-sample loss replaced by the most recent training loss, assuming this is made available by the model (supports_training_losses(model) == true). If the model does not report a training loss, you can use resampling=InSample() instead. Otherwise, resampling must have type Holdout or be a vector with one element of the form (train_indices, test_indices).

  • measure=nothing: StatisticalMeasures.jl compatible measure for estimating model performance (the "loss", but the orientation is immaterial - i.e., this could be a score). Inferred by default. Ignored if resampling=nothing.

  • retrain=false: If retrain=true or resampling=nothing, iterated_model behaves exactly like the original model but with the iteration parameter automatically selected ("learned"). That is, the model is retrained on all available data, using the same number of iterations, once controlled iteration has stopped. This is typically desired if wrapping the iterated model further, or when inserting in a pipeline or other composite model. If retrain=false (default) and resampling isa Holdout, then iterated_model behaves like the original model trained on a subset of the provided data.

  • weights=nothing: per-observation weights to be passed to measure where supported; if unspecified, these are understood to be uniform.

  • class_weights=nothing: class-weights to be passed to measure where supported; if unspecified, these are understood to be uniform.

  • operation=nothing: Operation, such as predict or predict_mode, for computing target values, or proxy target values, for consumption by measure; automatically inferred by default.

  • check_measure=true: Specify false to override checks on measure for compatibility with the training data.

  • iteration_parameter=nothing: A symbol, such as :epochs, naming the iteration parameter of model; inferred by default. Note that the actual value of the iteration parameter in the supplied model is ignored; only the value of an internal clone is mutated during training the wrapped model.

  • cache=true: Whether or not model-specific representations of data are cached in between iteration parameter increments; specify cache=false to prioritize memory over speed.

Training

Training an instance iterated_model of IteratedModel on some data (by binding to a machine and calling fit!, for example) performs the following actions:

  • Assuming resampling !== nothing, the data is split into train and test sets, according to the specified resampling strategy.

  • A clone of the wrapped model, model, is bound to the train data in an internal machine, train_mach. If resampling === nothing, all data is used instead. This machine is the object to which controls are applied. For example, Callback(fitted_params |> print) will print the value of fitted_params(train_mach).

  • The iteration parameter of the clone is set to 0.

  • The specified controls are repeatedly applied to train_mach in sequence, until one of the controls triggers a stop. Loss-based controls (eg, Patience(), GL(), Threshold(0.001)) use an out-of-sample loss, obtained by applying measure to predictions and the test target values. (Specifically, these predictions are those returned by operation(train_mach).) If resampling === nothing then the most recent training loss is used instead. Some controls require both out-of-sample and training losses (eg, PQ()).

  • Once a stop has been triggered, a clone of model is bound to all data in a machine called mach_production below, unless retrain == false (the default) or resampling === nothing, in which case mach_production coincides with train_mach.

Prediction

Calling predict(mach, Xnew) in the example above returns predict(mach_production, Xnew). Similar statements hold for predict_mean, predict_mode, predict_median.

Controls that mutate parameters

A control is permitted to mutate the fields (hyper-parameters) of train_mach.model (the clone of model). For example, to mutate a learning rate one might use the control

Callback(mach -> mach.model.eta = 1.05*mach.model.eta)

However, unless model supports warm restarts with respect to changes in that parameter, this will trigger retraining of train_mach from scratch, with a different training outcome, which is not recommended.

Warm restarts

In the following example, the second fit! call will not restart training of the internal train_mach, assuming model supports warm restarts:

iterated_model = IteratedModel(
+    model,
+    controls = [Step(1), NumberLimit(100)],
+)
+mach = machine(iterated_model, X, y)
+fit!(mach) # train for 100 iterations
+iterated_model.controls = [Step(1), NumberLimit(50)]
+fit!(mach) # train for an *extra* 50 iterations

More generally, if iterated_model is mutated and fit!(mach) is called again, then a warm restart is attempted if the only parameters to change are model or controls or both.

Specifically, train_mach.model is mutated to match the current value of iterated_model.model and the iteration parameter of the latter is updated to the last value used in the preceding fit!(mach) call. Then repeated application of the (updated) controls begins anew.

source

Controls

IterationControl.StepType
Step(; n=1)

An iteration control, as in, Step(2).

Train for n more iterations. Will never trigger a stop.

source
EarlyStopping.TimeLimitType
TimeLimit(; t=0.5)

An early stopping criterion for loss-reporting iterative algorithms.

Stopping is triggered after t hours have elapsed since the stopping criterion was initiated.

Any Julia built-in Real type can be used for t. Subtypes of Period may also be used, as in TimeLimit(t=Minute(30)).

Internally, t is rounded to the nearest millisecond.

source
EarlyStopping.NumberLimitType
NumberLimit(; n=100)

An early stopping criterion for loss-reporting iterative algorithms.

A stop is triggered by n consecutive loss updates, excluding "training" loss updates.

If wrapped in a stopper::EarlyStopper, this is the number of calls to done!(stopper).

source
EarlyStopping.NumberSinceBestType
NumberSinceBest(; n=6)

An early stopping criterion for loss-reporting iterative algorithms.

A stop is triggered when the number of calls to the control, since the lowest value of the loss so far, is n.

For a customizable loss-based stopping criterion, use WithLossDo or WithTrainingLossesDo with the stop_if_true=true option.

source
EarlyStopping.InvalidValueType
InvalidValue()

An early stopping criterion for loss-reporting iterative algorithms.

Stop if a loss (or training loss) is NaN, Inf or -Inf (or, more precisely, if isnan(loss) or isinf(loss) is true).

For a customizable loss-based stopping criterion, use WithLossDo or WithTrainingLossesDo with the stop_if_true=true option.

source
EarlyStopping.ThresholdType
Threshold(; value=0.0)

An early stopping criterion for loss-reporting iterative algorithms.

A stop is triggered as soon as the loss drops below value.

For a customizable loss-based stopping criterion, use WithLossDo or WithTrainingLossesDo with the stop_if_true=true option.

source
EarlyStopping.GLType
GL(; alpha=2.0)

An early stopping criterion for loss-reporting iterative algorithms.

A stop is triggered when the (rescaled) generalization loss exceeds the threshold alpha.

Terminology. Suppose $E_1, E_2, ..., E_t$ are a sequence of losses, for example, out-of-sample estimates of the loss associated with some iterative machine learning algorithm. Then the generalization loss at time t, is given by

$GL_t = \frac{100\,(E_t - E_{opt})}{|E_{opt}|}$

where $E_{opt}$ is the minimum value of the sequence.
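For example, if the lowest loss recorded so far is $E_{opt} = 0.5$ and the current loss is $E_t = 0.55$, then $GL_t = 100(0.55 - 0.5)/0.5 = 10$, which exceeds the default threshold alpha = 2.0 and so triggers a stop.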

Reference: Prechelt, Lutz (1998): "Early Stopping - But When?", in Neural Networks: Tricks of the Trade, ed. G. Orr, Springer.

source
EarlyStopping.PQType
PQ(; alpha=0.75, k=5, tol=eps(Float64))

A stopping criterion for training iterative supervised learners.

A stop is triggered when Prechelt's progress-modified generalization loss exceeds the threshold $PQ_t > alpha$, or if the training progress drops below $P_j ≤ tol$. Here k is the number of training (in-sample) losses used to estimate the training progress.

Context and explanation of terminology

The training progress at time $j$ is defined by

$P_j = 1000 |M - m|/|m|$

where $M$ is the mean of the last k training losses $F_1, F_2, …, F_k$ and $m$ is the minimum value of those losses.

The progress-modified generalization loss at time $t$ is then given by

$PQ_t = GL_t / P_t$

where $GL_t$ is the generalization loss at time $t$; see GL.

PQ will stop when the following are true:

  1. At least k training samples have been collected via done!(c::PQ, loss; training = true) or update_training(c::PQ, loss, state)
  2. The last update was an out-of-sample update. (done!(::PQ, loss; training=true) is always false)
  3. The progress-modified generalization loss exceeds the threshold $PQ_t > alpha$ OR the training progress stalls $P_j ≤ tol$.

Reference: Prechelt, Lutz (1998): "Early Stopping - But When?", in Neural Networks: Tricks of the Trade, ed. G. Orr, Springer.

source
IterationControl.InfoType
Info(f=identity)

An iteration control, as in, Info(my_loss_function).

Log to Info the value of f(m), where m is the object being iterated. If IterationControl.expose(m) has been overloaded, then log f(expose(m)) instead.

Can be suppressed by setting the global verbosity level sufficiently low.

See also Warn, Error.

source
IterationControl.WarnType
Warn(predicate; f="")

An iteration control, as in, Warn(m -> length(m.cache) > 100, f="Memory low").

If predicate(m) is true, then log to Warn the value of f (or f(IterationControl.expose(m)) if f is a function). Here m is the object being iterated.

Can be suppressed by setting the global verbosity level sufficiently low.

See also Info, Error.

source
IterationControl.ErrorType
Error(predicate; f="", exception=nothing)

An iteration control, as in, Error(m -> isnan(m.bias), f="Bias overflow!").

If predicate(m) is true, then log at the Error level the value of f (or f(IterationControl.expose(m)) if f is a function) and stop iteration at the end of the current control cycle. Here m is the object being iterated.

Specify exception=... to throw an immediate exception, without waiting until the end of the current control cycle.

See also Info, Warn.

source
IterationControl.CallbackType
Callback(f=_->nothing, stop_if_true=false, stop_message=nothing, raw=false)

An iteration control, as in, Callback(m->put!(v, my_loss_function(m))).

Call f(IterationControl.expose(m)), where m is the object being iterated, unless raw=true, in which case call f(m) (guaranteed if expose has not been overloaded). If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.

source
IterationControl.WithNumberDoType
WithNumberDo(f=n->@info("number: $n"), stop_if_true=false, stop_message=nothing)

An iteration control, as in, WithNumberDo(n->put!(my_channel, n)).

Call f(n + 1), where n is the number of complete control cycles of the control (so n = 1, 2, 3, ..., unless the control is wrapped in IterationControl.skip).

If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.

source
MLJIteration.WithIterationsDoType
WithIterationsDo(f=x->@info("iterations: $x"), stop_if_true=false, stop_message=nothing)

An iteration control, as in, WithIterationsDo(x->put!(my_channel, x)).

Call f(x), where x is the current number of model iterations (generally more than the number of control cycles). If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.

source
IterationControl.WithLossDoType
WithLossDo(f=x->@info("loss: $x"), stop_if_true=false, stop_message=nothing)

An iteration control, as in, WithLossDo(x->put!(my_losses, x)).

Call f(loss), where loss is the current loss.

If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.
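For example (a sketch, assuming the keyword arguments shown in the signature above), one control might record losses while another stops training once the loss falls below a hypothetical threshold:

losses = Float64[]
record_losses = WithLossDo(loss -> push!(losses, loss))
stop_on_small_loss = WithLossDo(loss -> loss < 0.01,
                                stop_if_true=true,
                                stop_message="Loss dropped below 0.01.")
# include both in the `controls` vector passed to IteratedModel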

source
IterationControl.WithTrainingLossesDoType
WithTrainingLossesDo(f=v->@info("training: $v"), stop_if_true=false, stop_message=nothing)

An iteration control, as in, WithTrainingLossesDo(v->put!(my_losses, last(v))).

Call f(training_losses), where training_losses is the vector of the most recent batch of training losses.

If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.

source
MLJIteration.WithEvaluationDoType
WithEvaluationDo(f=x->@info("evaluation: $x"), stop_if_true=false, stop_message=nothing)

An iteration control, as in, WithEvaluationDo(x->put!(my_channel, x)).

Call f(x), where x is the latest performance evaluation, as returned by evaluate!(train_mach, resampling=..., ...). Not valid if resampling=nothing. If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.

source
MLJIteration.WithFittedParamsDoType
WithFittedParamsDo(f=x->@info("fitted_params: $x"), stop_if_true=false, stop_message=nothing)

An iteration control, as in, WithFittedParamsDo(x->put!(my_channel, x)).

Call f(x), where x = fitted_params(mach) is the fitted parameters of the training machine, mach, in its current state. If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.

source
MLJIteration.WithReportDoType
WithReportDo(f=x->@info("report: $x"), stop_if_true=false, stop_message=nothing)

An iteration control, as in, WithReportDo(x->put!(my_channel, x)).

Call f(x), where x = report(mach) is the report associated with the training machine, mach, in its current state. If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.

source
MLJIteration.WithModelDoType
WithModelDo(f=x->@info("model: $x"), stop_if_true=false, stop_message=nothing)

An iteration control, as in, WithModelDo(x->put!(my_channel, x)).

Call f(x), where x is the model associated with the training machine; f may mutate x, as in f(x) = (x.learning_rate *= 0.9). If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.

source
MLJIteration.WithMachineDoType
WithMachineDo(f=x->@info("machine: $x"), stop_if_true=false, stop_message=nothing)

An iteration control, as in, WithMachineDo(x->put!(my_channel, x)).

Call f(x), where x is the training machine in its current state. If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.

source
MLJIteration.SaveType
Save(filename="machine.jls")

An iteration control, as in, Save("run3/machine.jls").

Save the current state of the machine being iterated to disk, using the provided filename, decorated with a number, as in "run3/machine42.jls". The default behaviour uses the Serialization module but this can be changed by setting the method=save_fn(::String, ::Any) argument where save_fn is any serialization method. For more on what is meant by "the machine being iterated", see IteratedModel.
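For example, to checkpoint only every tenth step rather than at every control cycle, Save can be combined with the IterationControl.skip wrapper documented under "Control wrappers" below (a sketch; the directory "checkpoints/" is hypothetical and must already exist):

controls = [
    Step(1),
    NumberLimit(100),
    IterationControl.skip(Save("checkpoints/machine.jls"), predicate=10),
]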

source

Control wrappers

IterationControl.skipFunction
IterationControl.skip(control, predicate=1)

An iteration control wrapper.

If predicate is an integer, k: Apply control on every k calls to apply the wrapped control, starting with the kth call.

If predicate is a function: Apply control as usual when predicate(n + 1) is true but otherwise skip. Here n is the number of control cycles applied so far.
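For example (a sketch, assuming the keyword form of predicate shown above), to log the current loss only on every tenth control cycle:

control = IterationControl.skip(WithLossDo(), predicate=10)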

source
IterationControl.louderFunction
IterationControl.louder(control, by=1)

Wrap control to make it more (or less) verbose. The same as control, but as if the global verbosity were increased by the value by.

source
IterationControl.with_state_doFunction
IterationControl.with_state_do(control,
+                              f=x->@info "$(typeof(control)) state: $x")

Wrap control to give access to its internal state. Acts exactly like control except that f is called on the internal state of control. If f is not specified, the control type and state are logged to Info at every update (useful for debugging new controls).

Warning. The internal state of a control is not yet considered part of the public interface and could change in any pre-1.0 release of IterationControl.jl.

source
diff --git a/dev/correcting_class_imbalance/index.html b/dev/correcting_class_imbalance/index.html index e425a30bc..1910fcb9a 100644 --- a/dev/correcting_class_imbalance/index.html +++ b/dev/correcting_class_imbalance/index.html @@ -1,6 +1,6 @@ -Correcting Class Imbalance · MLJ

Correcting Class Imbalance

Oversampling and undersampling methods

Models providing oversampling or undersampling methods, to correct for class imbalance, are listed under Class Imbalance. In particular, several popular algorithms are provided by the Imbalance.jl package, which includes detailed documentation and tutorials.

Incorporating class imbalance in supervised learning pipelines

One or more oversampling/undersampling algorithms can be fused with an MLJ classifier using the BalancedModel wrapper. This creates a new classifier which can be treated like any other; resampling to correct for class imbalance, relevant only for training of the atomic classifier, is then carried out internally. If, for example, one applies cross-validation to the wrapped classifier (using evaluate!, say), then over/undersampling is repeated for each training fold automatically.

Refer to the MLJBalancing.jl documentation for further details.

MLJBalancing.BalancedModelFunction
BalancedModel(; model=nothing, balancer1=balancer_model1, balancer2=balancer_model2, ...)
-BalancedModel(model;  balancer1=balancer_model1, balancer2=balancer_model2, ...)

Given a classification model, and one or more balancer models that all implement the MLJModelInterface, BalancedModel allows constructing a sequential pipeline that wraps an arbitrary number of balancing models and a classifier together.

Operation

  • During training, data is first passed to balancer1 and the result is passed to balancer2 and so on, the result from the final balancer is then passed to the classifier for training.
  • During prediction, the balancers have no effect.

Arguments

  • model::Supervised: A classification model that implements the MLJModelInterface.
  • balancer1::Static=...: The first balancer model to pass the data to. This keyword argument can have any name.
  • balancer2::Static=...: The second balancer model to pass the data to. This keyword argument can have any name.
  • and so on for an arbitrary number of balancers.

Returns

  • An instance of type ProbabilisticBalancedModel or DeterministicBalancedModel, depending on the prediction type of model.

Example

using MLJ
+Correcting Class Imbalance · MLJ

Correcting Class Imbalance

Oversampling and undersampling methods

Models providing oversampling or undersampling methods, to correct for class imbalance, are listed under Class Imbalance. In particular, several popular algorithms are provided by the Imbalance.jl package, which includes detailed documentation and tutorials.

Incorporating class imbalance in supervised learning pipelines

One or more oversampling/undersampling algorithms can be fused with an MLJ classifier using the BalancedModel wrapper. This creates a new classifier which can be treated like any other; resampling to correct for class imbalance, relevant only for training of the atomic classifier, is then carried out internally. If, for example, one applies cross-validation to the wrapped classifier (using evaluate!, say), then over/undersampling is repeated for each training fold automatically.

Refer to the MLJBalancing.jl documentation for further details.

MLJBalancing.BalancedModelFunction
BalancedModel(; model=nothing, balancer1=balancer_model1, balancer2=balancer_model2, ...)
+BalancedModel(model;  balancer1=balancer_model1, balancer2=balancer_model2, ...)

Given a classification model, and one or more balancer models that all implement the MLJModelInterface, BalancedModel allows constructing a sequential pipeline that wraps an arbitrary number of balancing models and a classifier together.

Operation

  • During training, data is first passed to balancer1 and the result is passed to balancer2 and so on, the result from the final balancer is then passed to the classifier for training.
  • During prediction, the balancers have no effect.

Arguments

  • model::Supervised: A classification model that implements the MLJModelInterface.
  • balancer1::Static=...: The first balancer model to pass the data to. This keyword argument can have any name.
  • balancer2::Static=...: The second balancer model to pass the data to. This keyword argument can have any name.
  • and so on for an arbitrary number of balancers.

Returns

  • An instance of type ProbabilisticBalancedModel or DeterministicBalancedModel, depending on the prediction type of model.

Example

using MLJ
 using Imbalance
 
 # generate data
@@ -20,4 +20,4 @@
 
 # now this behaves as a unified model that can be trained, validated, fine-tuned, etc.
 mach = machine(balanced_model, X, y)
-fit!(mach)
source
+fit!(mach)
source
diff --git a/dev/evaluating_model_performance/index.html b/dev/evaluating_model_performance/index.html index 8c9f85598..2929f9259 100644 --- a/dev/evaluating_model_performance/index.html +++ b/dev/evaluating_model_performance/index.html @@ -1,5 +1,5 @@ -Evaluating Model Performance · MLJ

Evaluating Model Performance

MLJ allows quick evaluation of a supervised model's performance against a battery of selected losses or scores. For more on available performance measures, see Performance Measures.

In addition to hold-out and cross-validation, the user can specify an explicit list of train/test pairs of row indices for resampling, or define new resampling strategies.

For simultaneously evaluating multiple models, see Comparing models of different type and nested cross-validation.

For externally logging the outcomes of performance evaluation experiments, see Logging Workflows.

Evaluating against a single measure

julia> using MLJ
julia> X = (a=rand(12), b=rand(12), c=rand(12));
julia> y = X.a + 2X.b + 0.05*rand(12);
julia> model = (@load RidgeRegressor pkg=MultivariateStats verbosity=0)()RidgeRegressor( +Evaluating Model Performance · MLJ

Evaluating Model Performance

MLJ allows quick evaluation of a supervised model's performance against a battery of selected losses or scores. For more on available performance measures, see Performance Measures.

In addition to hold-out and cross-validation, the user can specify an explicit list of train/test pairs of row indices for resampling, or define new resampling strategies.

For simultaneously evaluating multiple models, see Comparing models of different type and nested cross-validation.

For externally logging the outcomes of performance evaluation experiments, see Logging Workflows.

Evaluating against a single measure

julia> using MLJ
julia> X = (a=rand(12), b=rand(12), c=rand(12));
julia> y = X.a + 2X.b + 0.05*rand(12);
julia> model = (@load RidgeRegressor pkg=MultivariateStats verbosity=0)()RidgeRegressor( lambda = 1.0, bias = true)
julia> cv = CV(nfolds=3)CV( nfolds = 3, @@ -13,18 +13,18 @@ ┌──────────┬───────────┬─────────────┐ │ measure │ operation │ measurement │ ├──────────┼───────────┼─────────────┤ -│ LPLoss( │ predict │ 0.198 │ +│ LPLoss( │ predict │ 0.2 │ │ p = 2) │ │ │ └──────────┴───────────┴─────────────┘ ┌───────────────────────┬─────────┐ │ per_fold │ 1.96*SE │ ├───────────────────────┼─────────┤ -│ [0.104, 0.318, 0.172] │ 0.152 │ +│ [0.249, 0.133, 0.219] │ 0.0837 │ └───────────────────────┴─────────┘

Alternatively, instead of applying evaluate to a model + data, one may call evaluate! on an existing machine wrapping the model in data:

julia> mach = machine(model, X, y)untrained Machine; caches model-specific representations of data
   model: RidgeRegressor(lambda = 1.0, …)
   args:
-    1:	Source @270 ⏎ Table{AbstractVector{Continuous}}
-    2:	Source @221 ⏎ AbstractVector{Continuous}
julia> evaluate!(mach, resampling=cv, measure=l2, verbosity=0)PerformanceEvaluation object with these fields: + 1: Source @196 ⏎ Table{AbstractVector{Continuous}} + 2: Source @870 ⏎ AbstractVector{Continuous}
julia> evaluate!(mach, resampling=cv, measure=l2, verbosity=0)PerformanceEvaluation object with these fields: model, measure, operation, measurement, per_fold, per_observation, fitted_params_per_fold, report_per_fold, @@ -33,13 +33,13 @@ ┌──────────┬───────────┬─────────────┐ │ measure │ operation │ measurement │ ├──────────┼───────────┼─────────────┤ -│ LPLoss( │ predict │ 0.198 │ +│ LPLoss( │ predict │ 0.2 │ │ p = 2) │ │ │ └──────────┴───────────┴─────────────┘ ┌───────────────────────┬─────────┐ │ per_fold │ 1.96*SE │ ├───────────────────────┼─────────┤ -│ [0.104, 0.318, 0.172] │ 0.152 │ +│ [0.249, 0.133, 0.219] │ 0.0837 │ └───────────────────────┴─────────┘

(The latter call is a mutating call, as the learned parameters stored in the machine potentially change.)

Multiple measures

Multiple measures are specified as a vector:

julia> evaluate!(
            mach,
            resampling=cv,
@@ -54,18 +54,18 @@
 ┌───┬──────────────────────────────────────┬───────────┬─────────────┐
 │   │ measure                              │ operation │ measurement │
 ├───┼──────────────────────────────────────┼───────────┼─────────────┤
-│ A │ LPLoss(                              │ predict   │ 0.402       │
+│ A │ LPLoss(                              │ predict   │ 0.396       │
 │   │   p = 1)                             │           │             │
-│ B │ RootMeanSquaredError()               │ predict   │ 0.445       │
-│ C │ RootMeanSquaredLogProportionalError( │ predict   │ 0.215       │
+│ B │ RootMeanSquaredError()               │ predict   │ 0.447       │
+│ C │ RootMeanSquaredLogProportionalError( │ predict   │ 0.201       │
 │   │   offset = 1)                        │           │             │
 └───┴──────────────────────────────────────┴───────────┴─────────────┘
 ┌───┬───────────────────────┬─────────┐
 │   │ per_fold              │ 1.96*SE │
 ├───┼───────────────────────┼─────────┤
-│ A │ [0.299, 0.533, 0.373] │ 0.166   │
-│ B │ [0.322, 0.564, 0.415] │ 0.169   │
-│ C │ [0.117, 0.297, 0.19]  │ 0.126   │
+│ A │ [0.419, 0.337, 0.43]  │ 0.0702  │
+│ B │ [0.499, 0.364, 0.468] │ 0.0978  │
+│ C │ [0.254, 0.168, 0.168] │ 0.0692  │
 └───┴───────────────────────┴─────────┘

Custom measures can also be provided.

Specifying weights

Per-observation weights can be passed to measures. If a measure does not support weights, the weights are ignored:

julia> holdout = Holdout(fraction_train=0.8)Holdout(
   fraction_train = 0.8,
   shuffle = false,
@@ -76,7 +76,7 @@
            weights=weights,
        )┌ Warning: Sample weights ignored in evaluations of the following measures, as unsupported: 
 │ RSquared() 
-└ @ MLJBase ~/.julia/packages/MLJBase/hoZmq/src/resampling.jl:946
+└ @ MLJBase ~/.julia/packages/MLJBase/QyZZM/src/resampling.jl:946
 
Evaluating over 3 folds:  67%[================>        ]  ETA: 0:00:00
Evaluating over 3 folds: 100%[=========================] Time: 0:00:00
 PerformanceEvaluation object with these fields:
   model, measure, operation,
@@ -87,17 +87,17 @@
 ┌───┬────────────┬───────────┬─────────────┐
 │   │ measure    │ operation │ measurement │
 ├───┼────────────┼───────────┼─────────────┤
-│ A │ LPLoss(    │ predict   │ 0.295       │
+│ A │ LPLoss(    │ predict   │ 0.308       │
 │   │   p = 2)   │           │             │
-│ B │ RSquared() │ predict   │ 0.547       │
+│ B │ RSquared() │ predict   │ 0.226       │
 └───┴────────────┴───────────┴─────────────┘
 ┌───┬───────────────────────┬─────────┐
 │   │ per_fold              │ 1.96*SE │
 ├───┼───────────────────────┼─────────┤
-│ A │ [0.112, 0.429, 0.344] │ 0.227   │
-│ B │ [0.415, 0.563, 0.665] │ 0.174   │
+│ A │ [0.382, 0.231, 0.311] │ 0.105   │
+│ B │ [0.294, 0.495, -0.11] │ 0.427   │
 └───┴───────────────────────┴─────────┘

In classification problems, use class_weights=... to specify a class weight dictionary.

MLJBase.evaluate!Function
evaluate!(mach; resampling=CV(), measure=nothing, options...)

Estimate the performance of a machine mach wrapping a supervised model in data, using the specified resampling strategy (defaulting to 6-fold cross-validation) and measure, which can be a single measure or vector. Returns a PerformanceEvaluation object.

Available resampling strategies are CV, Holdout, InSample, StratifiedCV and TimeSeriesCV. If resampling is not an instance of one of these, then a vector of tuples of the form (train_rows, test_rows) is expected. For example, setting

resampling = [((1:100), (101:200)),
-              ((101:200), (1:100))]

gives two-fold cross-validation using the first 200 rows of data.

Any measure conforming to the StatisticalMeasuresBase.jl API can be provided, assuming it can consume multiple observations.

Although evaluate! is mutating, mach.model and mach.args are not mutated.

Additional keyword options

  • rows - vector of observation indices from which both train and test folds are constructed (default is all observations)

  • operation/operations=nothing - One of predict, predict_mean, predict_mode, predict_median, or predict_joint, or a vector of these of the same length as measure/measures. Automatically inferred if left unspecified. For example, predict_mode will be used for a Multiclass target, if model is a probabilistic predictor but the measure expects literal (point) target predictions. Operations actually applied can be inspected from the operation field of the object returned.

  • weights - per-sample Real weights for measures that support them (not to be confused with weights used in training, such as the w in mach = machine(model, X, y, w)).

  • class_weights - dictionary of Real per-class weights for use with measures that support these, in classification problems (not to be confused with weights used in training, such as the w in mach = machine(model, X, y, w)).

  • repeats::Int=1: set to a higher value for repeated (Monte Carlo) resampling. For example, if repeats = 10, then resampling = CV(nfolds=5, shuffle=true) generates a total of 50 (train, test) pairs for evaluation and subsequent aggregation.

  • acceleration=CPU1(): acceleration/parallelization option; can be any instance of CPU1 (single-threaded computation), CPUThreads (multi-threaded computation) or CPUProcesses (multi-process computation); default is default_resource(). These types are owned by ComputationalResources.jl.

  • force=false: set to true to force cold-restart of each training event

  • verbosity::Int=1 logging level; can be negative

  • check_measure=true: whether to screen measures for possible incompatibility with the model. Will not catch all incompatibilities.

  • per_observation=true: whether to calculate estimates for individual observations; if false the per_observation field of the returned object is populated with missings. Setting to false may reduce compute time and allocations.

  • logger - a logger object (see MLJBase.log_evaluation)

  • compact=false - if true, the returned evaluation object excludes these fields: fitted_params_per_fold, report_per_fold, train_test_rows.

See also evaluate, PerformanceEvaluation, CompactPerformanceEvaluation.

source
MLJBase.PerformanceEvaluationType
PerformanceEvaluation <: AbstractPerformanceEvaluation

Type of object returned by evaluate (for models plus data) or evaluate! (for machines). Such objects encode estimates of the performance (generalization error) of a supervised model or outlier detection model, and store other information ancillary to the computation.

If evaluate or evaluate! is called with the compact=true option, then a CompactPerformanceEvaluation object is returned instead.

When evaluate/evaluate! is called, a number of train/test pairs ("folds") of row indices are generated, according to the options provided, which are discussed in the evaluate! doc-string. Rows correspond to observations. The generated train/test pairs are recorded in the train_test_rows field of the PerformanceEvaluation struct, and the corresponding estimates, aggregated over all train/test pairs, are recorded in measurement, a vector with one entry for each measure (metric) recorded in measure.

When displayed, a PerformanceEvaluation object includes a value under the heading 1.96*SE, derived from the standard error of the per_fold entries. This value is suitable for constructing a formal 95% confidence interval for the given measurement. Such intervals should be interpreted with caution. See, for example, Bates et al. (2021).

Fields

These fields are part of the public API of the PerformanceEvaluation struct.

  • model: model used to create the performance evaluation. In the case of a tuning model, this is the best model found.

  • measure: vector of measures (metrics) used to evaluate performance

  • measurement: vector of measurements - one for each element of measure - aggregating the performance measurements over all train/test pairs (folds). The aggregation method applied for a given measure m is StatisticalMeasuresBase.external_aggregation_mode(m) (commonly Mean() or Sum())

  • operation (e.g., predict_mode): the operations applied for each measure to generate predictions to be evaluated. Possibilities are: predict, predict_mean, predict_mode, predict_median, or predict_joint.

  • per_fold: a vector of vectors of individual test fold evaluations (one vector per measure). Useful for obtaining a rough estimate of the variance of the performance estimate.

  • per_observation: a vector of vectors of vectors containing individual per-observation measurements: for an evaluation e, e.per_observation[m][f][i] is the measurement for the ith observation in the fth test fold, evaluated using the mth measure. Useful for some forms of hyper-parameter optimization. Note that an aggregated measurement for some measure measure is repeated across all observations in a fold if StatisticalMeasures.can_report_unaggregated(measure) == false. If e has been computed with the per_observation=false option, then e.per_observation is a vector of missings.

  • fitted_params_per_fold: a vector containing fitted_params(mach) for each machine mach trained during resampling - one machine per train/test pair. Use this to extract the learned parameters for each individual training event.

  • report_per_fold: a vector containing report(mach) for each machine mach trained during resampling - one machine per train/test pair.

  • train_test_rows: a vector of tuples, each of the form (train, test), where train and test are vectors of row (observation) indices for training and evaluation respectively.

  • resampling: the user-specified resampling strategy to generate the train/test pairs (or literal train/test pairs if that was directly specified).

  • repeats: the number of times the resampling strategy was repeated.

See also CompactPerformanceEvaluation.

source

User-specified train/test sets

Users can either provide an explicit list of train/test pairs of row indices for resampling, as in this example:

julia> fold1 = 1:6; fold2 = 7:12;
julia> evaluate!( + ((101:200), (1:100))]

gives two-fold cross-validation using the first 200 rows of data.

Any measure conforming to the StatisticalMeasuresBase.jl API can be provided, assuming it can consume multiple observations.

Although evaluate! is mutating, mach.model and mach.args are not mutated.

Additional keyword options

  • rows - vector of observation indices from which both train and test folds are constructed (default is all observations)

  • operation/operations=nothing - One of predict, predict_mean, predict_mode, predict_median, or predict_joint, or a vector of these of the same length as measure/measures. Automatically inferred if left unspecified. For example, predict_mode will be used for a Multiclass target, if model is a probabilistic predictor but the measure expects literal (point) target predictions. Operations actually applied can be inspected from the operation field of the object returned.

  • weights - per-sample Real weights for measures that support them (not to be confused with weights used in training, such as the w in mach = machine(model, X, y, w)).

  • class_weights - dictionary of Real per-class weights for use with measures that support these, in classification problems (not to be confused with weights used in training, such as the w in mach = machine(model, X, y, w)).

  • repeats::Int=1: set to a higher value for repeated (Monte Carlo) resampling. For example, if repeats = 10, then resampling = CV(nfolds=5, shuffle=true) generates a total of 50 (train, test) pairs for evaluation and subsequent aggregation.

  • acceleration=CPU1(): acceleration/parallelization option; can be any instance of CPU1 (single-threaded computation), CPUThreads (multi-threaded computation) or CPUProcesses (multi-process computation); default is default_resource(). These types are owned by ComputationalResources.jl.

  • force=false: set to true to force cold-restart of each training event

  • verbosity::Int=1 logging level; can be negative

  • check_measure=true: whether to screen measures for possible incompatibility with the model. Will not catch all incompatibilities.

  • per_observation=true: whether to calculate estimates for individual observations; if false the per_observation field of the returned object is populated with missings. Setting to false may reduce compute time and allocations.

  • logger - a logger object (see MLJBase.log_evaluation)

  • compact=false - if true, the returned evaluation object excludes these fields: fitted_params_per_fold, report_per_fold, train_test_rows.
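As a sketch combining several of the options above (the measure names, rng seed and mach are illustrative placeholders):

evaluate!(
    mach,
    resampling=CV(nfolds=5, shuffle=true, rng=123),
    measures=[l1, l2],
    repeats=2,
    acceleration=CPUThreads(),
    verbosity=0,
)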

See also evaluate, PerformanceEvaluation, CompactPerformanceEvaluation.

source
MLJBase.PerformanceEvaluationType
PerformanceEvaluation <: AbstractPerformanceEvaluation

Type of object returned by evaluate (for models plus data) or evaluate! (for machines). Such objects encode estimates of the performance (generalization error) of a supervised model or outlier detection model, and store other information ancillary to the computation.

If evaluate or evaluate! is called with the compact=true option, then a CompactPerformanceEvaluation object is returned instead.

When evaluate/evaluate! is called, a number of train/test pairs ("folds") of row indices are generated, according to the options provided, which are discussed in the evaluate! doc-string. Rows correspond to observations. The generated train/test pairs are recorded in the train_test_rows field of the PerformanceEvaluation struct, and the corresponding estimates, aggregated over all train/test pairs, are recorded in measurement, a vector with one entry for each measure (metric) recorded in measure.

When displayed, a PerformanceEvaluation object includes a value under the heading 1.96*SE, derived from the standard error of the per_fold entries. This value is suitable for constructing a formal 95% confidence interval for the given measurement. Such intervals should be interpreted with caution. See, for example, Bates et al. (2021).

Fields

These fields are part of the public API of the PerformanceEvaluation struct.

  • model: model used to create the performance evaluation. In the case of a tuning model, this is the best model found.

  • measure: vector of measures (metrics) used to evaluate performance

  • measurement: vector of measurements - one for each element of measure - aggregating the performance measurements over all train/test pairs (folds). The aggregation method applied for a given measure m is StatisticalMeasuresBase.external_aggregation_mode(m) (commonly Mean() or Sum())

  • operation (e.g., predict_mode): the operations applied for each measure to generate predictions to be evaluated. Possibilities are: predict, predict_mean, predict_mode, predict_median, or predict_joint.

  • per_fold: a vector of vectors of individual test fold evaluations (one vector per measure). Useful for obtaining a rough estimate of the variance of the performance estimate.

  • per_observation: a vector of vectors of vectors containing individual per-observation measurements: for an evaluation e, e.per_observation[m][f][i] is the measurement for the ith observation in the fth test fold, evaluated using the mth measure. Useful for some forms of hyper-parameter optimization. Note that an aggregated measurement for some measure measure is repeated across all observations in a fold if StatisticalMeasures.can_report_unaggregated(measure) == false. If e has been computed with the per_observation=false option, then e.per_observation is a vector of missings.

  • fitted_params_per_fold: a vector containing fitted_params(mach) for each machine mach trained during resampling - one machine per train/test pair. Use this to extract the learned parameters for each individual training event.

  • report_per_fold: a vector containing report(mach) for each machine mach trained during resampling - one machine per train/test pair.

  • train_test_rows: a vector of tuples, each of the form (train, test), where train and test are vectors of row (observation) indices for training and evaluation respectively.

  • resampling: the user-specified resampling strategy to generate the train/test pairs (or literal train/test pairs if that was directly specified).

  • repeats: the number of times the resampling strategy was repeated.

See also CompactPerformanceEvaluation.

source

User-specified train/test sets

Users can either provide an explicit list of train/test pairs of row indices for resampling, as in this example:

julia> fold1 = 1:6; fold2 = 7:12;
julia> evaluate!( mach, resampling = [(fold1, fold2), (fold2, fold1)], measures=[l1, l2], @@ -111,19 +111,19 @@ ┌───┬──────────┬───────────┬─────────────┐ │ │ measure │ operation │ measurement │ ├───┼──────────┼───────────┼─────────────┤ -│ A │ LPLoss( │ predict │ 0.492 │ +│ A │ LPLoss( │ predict │ 0.663 │ │ │ p = 1) │ │ │ -│ B │ LPLoss( │ predict │ 0.32 │ +│ B │ LPLoss( │ predict │ 0.57 │ │ │ p = 2) │ │ │ └───┴──────────┴───────────┴─────────────┘ ┌───┬────────────────┬─────────┐ │ │ per_fold │ 1.96*SE │ ├───┼────────────────┼─────────┤ -│ A │ [0.58, 0.404] │ 0.244 │ -│ B │ [0.438, 0.201] │ 0.329 │ -└───┴────────────────┴─────────┘

Or the user can define their own re-usable ResamplingStrategy objects; see Custom resampling strategies below.

Built-in resampling strategies

MLJBase.HoldoutType
holdout = Holdout(; fraction_train=0.7, shuffle=nothing, rng=nothing)

Instantiate a Holdout resampling strategy, for use in evaluate!, evaluate and in tuning.

train_test_pairs(holdout, rows)

Returns the pair [(train, test)], where train and test are vectors such that rows=vcat(train, test) and length(train)/length(rows) is approximately equal to fraction_train.

Pre-shuffling of rows is controlled by rng and shuffle. If rng is an integer, then the Holdout keyword constructor resets it to MersenneTwister(rng). Otherwise some AbstractRNG object is expected.

If rng is left unspecified, rng is reset to Random.GLOBAL_RNG, in which case rows are only pre-shuffled if shuffle=true is specified.

source
MLJBase.CVType
cv = CV(; nfolds=6,  shuffle=nothing, rng=nothing)

Cross-validation resampling strategy, for use in evaluate!, evaluate and tuning.

train_test_pairs(cv, rows)

Returns an nfolds-length iterator of (train, test) pairs of vectors (row indices), where each train and test is a sub-vector of rows. The test vectors are mutually exclusive and exhaust rows. Each train vector is the complement of the corresponding test vector. With no row pre-shuffling, the order of rows is preserved, in the sense that rows coincides precisely with the concatenation of the test vectors, in the order they are generated. The first r test vectors have length n + 1, where n, r = divrem(length(rows), nfolds), and the remaining test vectors have length n.

Pre-shuffling of rows is controlled by rng and shuffle. If rng is an integer, then the CV keyword constructor resets it to MersenneTwister(rng). Otherwise some AbstractRNG object is expected.

If rng is left unspecified, rng is reset to Random.GLOBAL_RNG, in which case rows are only pre-shuffled if shuffle=true is explicitly specified.

source
MLJBase.StratifiedCVType
stratified_cv = StratifiedCV(; nfolds=6,
+│ A │ [0.621, 0.705] │ 0.115   │
+│ B │ [0.459, 0.68]  │ 0.306   │
+└───┴────────────────┴─────────┘

Or the user can define their own re-usable ResamplingStrategy objects; see Custom resampling strategies below.

Built-in resampling strategies

MLJBase.HoldoutType
holdout = Holdout(; fraction_train=0.7, shuffle=nothing, rng=nothing)

Instantiate a Holdout resampling strategy, for use in evaluate!, evaluate and in tuning.

train_test_pairs(holdout, rows)

Returns the pair [(train, test)], where train and test are vectors such that rows=vcat(train, test) and length(train)/length(rows) is approximately equal to fraction_train.

Pre-shuffling of rows is controlled by rng and shuffle. If rng is an integer, then the Holdout keyword constructor resets it to MersenneTwister(rng). Otherwise some AbstractRNG object is expected.

If rng is left unspecified, rng is reset to Random.GLOBAL_RNG, in which case rows are only pre-shuffled if shuffle=true is specified.

source
MLJBase.CVType
cv = CV(; nfolds=6,  shuffle=nothing, rng=nothing)

Cross-validation resampling strategy, for use in evaluate!, evaluate and tuning.

train_test_pairs(cv, rows)

Returns an nfolds-length iterator of (train, test) pairs of vectors (row indices), where each train and test is a sub-vector of rows. The test vectors are mutually exclusive and exhaust rows. Each train vector is the complement of the corresponding test vector. With no row pre-shuffling, the order of rows is preserved, in the sense that rows coincides precisely with the concatenation of the test vectors, in the order they are generated. The first r test vectors have length n + 1, where n, r = divrem(length(rows), nfolds), and the remaining test vectors have length n.

Pre-shuffling of rows is controlled by rng and shuffle. If rng is an integer, then the CV keyword constructor resets it to MersenneTwister(rng). Otherwise some AbstractRNG object is expected.

If rng is left unspecified, rng is reset to Random.GLOBAL_RNG, in which case rows are only pre-shuffled if shuffle=true is explicitly specified.
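For example, to inspect the generated folds directly (a sketch; train_test_pairs is not exported, so it is qualified here):

cv = CV(nfolds=3, shuffle=true, rng=1234)
MLJBase.train_test_pairs(cv, 1:12)  # 3-element vector of (train, test) row-index pairs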

source
MLJBase.StratifiedCVType
stratified_cv = StratifiedCV(; nfolds=6,
                                shuffle=false,
-                               rng=Random.GLOBAL_RNG)

Stratified cross-validation resampling strategy, for use in evaluate!, evaluate and in tuning. Applies only to classification problems (OrderedFactor or Multiclass targets).

train_test_pairs(stratified_cv, rows, y)

Returns an nfolds-length iterator of (train, test) pairs of vectors (row indices) where each train and test is a sub-vector of rows. The test vectors are mutually exclusive and exhaust rows. Each train vector is the complement of the corresponding test vector.

Unlike regular cross-validation, the distribution of the levels of the target y corresponding to each train and test is constrained, as far as possible, to replicate that of y[rows] as a whole.

The stratified train_test_pairs algorithm is invariant to label renaming. For example, if you run replace!(y, 'a' => 'b', 'b' => 'a') and then re-run train_test_pairs, the returned (train, test) pairs will be the same.

Pre-shuffling of rows is controlled by rng and shuffle. If rng is an integer, then the StratifiedCV keyword constructor resets it to MersenneTwister(rng). Otherwise some AbstractRNG object is expected.

If rng is left unspecified, rng is reset to Random.GLOBAL_RNG, in which case rows are only pre-shuffled if shuffle=true is explicitly specified.

source
MLJBase.TimeSeriesCVType
tscv = TimeSeriesCV(; nfolds=4)

Cross-validation resampling strategy, for use in evaluate!, evaluate and tuning, when observations are chronological and not expected to be independent.

train_test_pairs(tscv, rows)

Returns an nfolds-length iterator of (train, test) pairs of vectors (row indices), where each train and test is a sub-vector of rows. The rows are partitioned sequentially into nfolds + 1 approximately equal length partitions, where the first partition is the first train set, and the second partition is the first test set. The second train set consists of the first two partitions, and the second test set consists of the third partition, and so on for each fold.

The first partition (which is the first train set) has length n + r, where n, r = divrem(length(rows), nfolds + 1), and the remaining partitions (all of the test folds) have length n.

Examples

julia> MLJBase.train_test_pairs(TimeSeriesCV(nfolds=3), 1:10)
+                               rng=Random.GLOBAL_RNG)

Stratified cross-validation resampling strategy, for use in evaluate!, evaluate and in tuning. Applies only to classification problems (OrderedFactor or Multiclass targets).

train_test_pairs(stratified_cv, rows, y)

Returns an nfolds-length iterator of (train, test) pairs of vectors (row indices) where each train and test is a sub-vector of rows. The test vectors are mutually exclusive and exhaust rows. Each train vector is the complement of the corresponding test vector.

Unlike regular cross-validation, the distribution of the levels of the target y corresponding to each train and test is constrained, as far as possible, to replicate that of y[rows] as a whole.

The stratified train_test_pairs algorithm is invariant to label renaming. For example, if you run replace!(y, 'a' => 'b', 'b' => 'a') and then re-run train_test_pairs, the returned (train, test) pairs will be the same.

Pre-shuffling of rows is controlled by rng and shuffle. If rng is an integer, then the StratifiedCV keyword constructor resets it to MersenneTwister(rng). Otherwise some AbstractRNG object is expected.

If rng is left unspecified, rng is reset to Random.GLOBAL_RNG, in which case rows are only pre-shuffled if shuffle=true is explicitly specified.

source
MLJBase.TimeSeriesCVType
tscv = TimeSeriesCV(; nfolds=4)

Cross-validation resampling strategy, for use in evaluate!, evaluate and tuning, when observations are chronological and not expected to be independent.

train_test_pairs(tscv, rows)

Returns an nfolds-length iterator of (train, test) pairs of vectors (row indices), where each train and test is a sub-vector of rows. The rows are partitioned sequentially into nfolds + 1 approximately equal length partitions, where the first partition is the first train set, and the second partition is the first test set. The second train set consists of the first two partitions, and the second test set consists of the third partition, and so on for each fold.

The first partition (which is the first train set) has length n + r, where n, r = divrem(length(rows), nfolds + 1), and the remaining partitions (all of the test folds) have length n.

Examples

julia> MLJBase.train_test_pairs(TimeSeriesCV(nfolds=3), 1:10)
 3-element Vector{Tuple{UnitRange{Int64}, UnitRange{Int64}}}:
  (1:4, 5:6)
  (1:6, 7:8)
@@ -149,7 +149,7 @@
 _.per_observation = [missing]
 _.fitted_params_per_fold = [ … ]
 _.report_per_fold = [ … ]
-_.train_test_rows = [ … ]
source

Custom resampling strategies

To define a new resampling strategy, make relevant parameters of your strategy the fields of a new type MyResamplingStrategy <: MLJ.ResamplingStrategy, and implement one of the following methods:

MLJ.train_test_pairs(my_strategy::MyResamplingStrategy, rows)
+_.train_test_rows = [ … ]
source

Custom resampling strategies

To define a new resampling strategy, make relevant parameters of your strategy the fields of a new type MyResamplingStrategy <: MLJ.ResamplingStrategy, and implement one of the following methods:

MLJ.train_test_pairs(my_strategy::MyResamplingStrategy, rows)
 MLJ.train_test_pairs(my_strategy::MyResamplingStrategy, rows, y)
 MLJ.train_test_pairs(my_strategy::MyResamplingStrategy, rows, X, y)

Each method takes a vector of indices rows and returns a vector [(t1, e1), (t2, e2), ... (tk, ek)] of train/test pairs of row indices selected from rows. Here X, y are the input and target data (ignored in simple strategies, such as Holdout and CV).
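Before looking at Holdout itself, here is a sketch of a hypothetical Monte Carlo strategy (the name MonteCarloCV and its fields are illustrative) returning n random train/test splits:

struct MonteCarloCV <: MLJ.ResamplingStrategy
    n::Int
    fraction_train::Float64
end
MonteCarloCV(; n=5, fraction_train=0.7) = MonteCarloCV(n, fraction_train)

function MLJ.train_test_pairs(strategy::MonteCarloCV, rows)
    # each call to partition returns one (train, test) pair of row indices
    return [partition(rows, strategy.fraction_train, shuffle=true)
            for _ in 1:strategy.n]
end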

Here is the code for the Holdout strategy as an example:

struct Holdout <: ResamplingStrategy
     fraction_train::Float64
@@ -181,4 +181,4 @@
     train, test = partition(rows, holdout.fraction_train,
                           shuffle=holdout.shuffle, rng=holdout.rng)
     return [(train, test),]
-end
+end diff --git a/dev/frequently_asked_questions/index.html b/dev/frequently_asked_questions/index.html index 4097aef98..ae63fd806 100644 --- a/dev/frequently_asked_questions/index.html +++ b/dev/frequently_asked_questions/index.html @@ -1,2 +1,2 @@ -FAQ · MLJ

Frequently Asked Questions

Julia already has a great machine learning toolbox, ScikitLearn.jl. Why MLJ?

An alternative machine learning toolbox for Julia users is ScikitLearn.jl. Initially intended as a Julia wrapper for the popular Python library scikit-learn, it also allows ML algorithms written in Julia to implement its API. Meta-algorithms (systematic tuning, pipelining, etc.) remain Python-wrapped code, however.

While ScikitLearn.jl provides the Julia user with access to a mature and large library of machine learning models, the scikit-learn API on which it is modeled, dating back to 2007, is not likely to evolve significantly in the future. MLJ enjoys (or will enjoy) several features that should make it an attractive alternative in the longer term:

  • One language. ScikitLearn.jl wraps Python code, which in turn wraps C code for performance-critical routines. A Julia machine learning algorithm that implements the MLJ model interface is 100% Julia. Writing code in Julia is almost as fast as Python and well-written Julia code runs almost as fast as C. Additionally, a single language design provides superior interoperability. For example, one can implement: (i) gradient-descent tuning of hyperparameters, using automatic differentiation libraries such as Flux.jl; and (ii) GPU performance boosts without major code refactoring, using CuArrays.jl.

  • Registry for model metadata. In ScikitLearn.jl the list of available models, as well as model metadata (whether a model handles categorical inputs, whether it can make probabilistic predictions, etc) must be gleaned from the documentation. In MLJ, this information is more structured and is accessible to MLJ via a searchable model registry (without the models needing to be loaded).

  • Flexible API for model composition. Pipelines in scikit-learn are more of an afterthought than an integral part of the original design. By contrast, MLJ's user-interaction API was predicated on the requirements of a flexible "learning network" API, one that allows models to be connected in essentially arbitrary ways (such as Wolpert model stacks). Networks can be built and tested in stages before being exported as first-class stand-alone models. Networks feature "smart" training (only necessary components are retrained after parameter changes) and will eventually be trainable using a DAG scheduler.

  • Clean probabilistic API. The scikit-learn API does not specify a universal standard for the form of probabilistic predictions. By fixing a probabilistic API along the lines of the skpro project, MLJ aims to improve support for Bayesian statistics and probabilistic graphical models.

  • Universal adoption of categorical data types. Python's scientific array library NumPy has no dedicated data type for representing categorical data (i.e., no type that tracks the pool of all possible classes). Generally, scikit-learn models deal with this by requiring data to be relabeled as integers. However, the naive user trains a model on relabeled categorical data only to discover that evaluation on a test set crashes their code because a categorical feature takes on a value not observed in training. MLJ mitigates such issues by insisting on the use of categorical data types, and by insisting that MLJ model implementations preserve the class pools. If, for example, a training target contains classes in the pool that do not appear in the training set, a probabilistic prediction will nevertheless predict a distribution whose support includes the missing class, but which is appropriately weighted with probability zero.

Finally, we note that a large number of ScikitLearn.jl models are now wrapped for use in MLJ.

+FAQ · MLJ

Frequently Asked Questions

Julia already has a great machine learning toolbox, ScikitLearn.jl. Why MLJ?

An alternative machine learning toolbox for Julia users is ScikitLearn.jl. Initially intended as a Julia wrapper for the popular Python library scikit-learn, it also allows ML algorithms written in Julia to implement its API. Meta-algorithms (systematic tuning, pipelining, etc.) remain Python-wrapped code, however.

While ScikitLearn.jl provides the Julia user with access to a mature and large library of machine learning models, the scikit-learn API on which it is modeled, dating back to 2007, is not likely to evolve significantly in the future. MLJ enjoys (or will enjoy) several features that should make it an attractive alternative in the longer term:

  • One language. ScikitLearn.jl wraps Python code, which in turn wraps C code for performance-critical routines. A Julia machine learning algorithm that implements the MLJ model interface is 100% Julia. Writing code in Julia is almost as fast as Python and well-written Julia code runs almost as fast as C. Additionally, a single language design provides superior interoperability. For example, one can implement: (i) gradient-descent tuning of hyperparameters, using automatic differentiation libraries such as Flux.jl; and (ii) GPU performance boosts without major code refactoring, using CuArrays.jl.

  • Registry for model metadata. In ScikitLearn.jl the list of available models, as well as model metadata (whether a model handles categorical inputs, whether it can make probabilistic predictions, etc) must be gleaned from the documentation. In MLJ, this information is more structured and is accessible to MLJ via a searchable model registry (without the models needing to be loaded).

  • Flexible API for model composition. Pipelines in scikit-learn are more of an afterthought than an integral part of the original design. By contrast, MLJ's user-interaction API was predicated on the requirements of a flexible "learning network" API, one that allows models to be connected in essentially arbitrary ways (such as Wolpert model stacks). Networks can be built and tested in stages before being exported as first-class stand-alone models. Networks feature "smart" training (only necessary components are retrained after parameter changes) and will eventually be trainable using a DAG scheduler.

  • Clean probabilistic API. The scikit-learn API does not specify a universal standard for the form of probabilistic predictions. By fixing a probabilistic API along the lines of the skpro project, MLJ aims to improve support for Bayesian statistics and probabilistic graphical models.

  • Universal adoption of categorical data types. Python's scientific array library NumPy has no dedicated data type for representing categorical data (i.e., no type that tracks the pool of all possible classes). Generally, scikit-learn models deal with this by requiring data to be relabeled as integers. However, the naive user trains a model on relabeled categorical data only to discover that evaluation on a test set crashes their code because a categorical feature takes on a value not observed in training. MLJ mitigates such issues by insisting on the use of categorical data types, and by insisting that MLJ model implementations preserve the class pools. If, for example, a training target contains classes in the pool that do not appear in the training set, a probabilistic prediction will nevertheless predict a distribution whose support includes the missing class, but which is appropriately weighted with probability zero.
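For instance, a minimal sketch of a class pool that remembers a level never observed in the data (using the categorical function from CategoricalArrays.jl; variable names are illustrative):

using CategoricalArrays
y = categorical(["setosa", "versicolor"], levels=["setosa", "versicolor", "virginica"])
levels(y)   # includes "virginica", even though it never appears in y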

Finally, we note that a large number of ScikitLearn.jl models are now wrapped for use in MLJ.

diff --git a/dev/generating_synthetic_data/index.html b/dev/generating_synthetic_data/index.html index a4badfa4a..8330b02eb 100644 --- a/dev/generating_synthetic_data/index.html +++ b/dev/generating_synthetic_data/index.html @@ -1,21 +1,21 @@ -Generating Synthetic Data · MLJ

+Generating Synthetic Data · MLJ

Generating Synthetic Data

Here synthetic data means artificially generated data, with no reference to a "real world" data set. Not to be confused with "fake data" obtained by resampling from a distribution fit to some actual real data.

MLJ has a set of functions - make_blobs, make_circles, make_moons and make_regression (closely resembling functions in scikit-learn of the same name) - for generating synthetic data sets. These are useful for testing machine learning models (e.g., testing user-defined composite models; see Composing Models).

Generating Gaussian blobs

MLJBase.make_blobsFunction
X, y = make_blobs(n=100, p=2; kwargs...)

Generate Gaussian blobs for clustering and classification problems.

Return value

By default, a table X with p columns (features) and n rows (observations), together with a corresponding vector of n Multiclass target observations y, indicating blob membership.

Keyword arguments

  • shuffle=true: whether to shuffle the resulting points,

  • centers=3: either a number of centers or a c x p matrix with c pre-determined centers,

  • cluster_std=1.0: the standard deviation(s) of each blob,

  • center_box=(-10. => 10.): the limits of the p-dimensional cube within which the cluster centers are drawn if they are not provided,

  • eltype=Float64: machine type of points (any subtype of AbstractFloat).

  • rng=Random.GLOBAL_RNG: any AbstractRNG object, or integer to seed a MersenneTwister (for reproducibility).

  • as_table=true: whether to return the points as a table (true) or a matrix (false). If false the target y has integer element type.

Example

X, y = make_blobs(100, 3; centers=2, cluster_std=[1.0, 3.0])
source
using MLJ, DataFrames
 X, y = make_blobs(100, 3; centers=2, cluster_std=[1.0, 3.0])
 dfBlobs = DataFrame(X)
 dfBlobs.y = y
+first(dfBlobs, 3)
3×4 DataFrame
 Row │ x1       x2       x3       y
     │ Float64  Float64  Float64  Cat…
─────┼─────────────────────────────────
   1 │ 3.79477  6.2184   8.19074  1
   2 │ 2.48278  6.83726  10.2252  1
   3 │ 2.11843  1.24768  5.35417  2
using VegaLite
+dfBlobs |> @vlplot(:point, x=:x1, y=:x2, color = :"y:n") 

svg

dfBlobs |> @vlplot(:point, x=:x1, y=:x3, color = :"y:n") 

svg

Generating concentric circles

MLJBase.make_circlesFunction
X, y = make_circles(n=100; kwargs...)

Generate n labeled points close to two concentric circles for classification and clustering models.

Return value

By default, a table X with 2 columns and n rows (observations), together with a corresponding vector of n Multiclass target observations y. The target is either 0 or 1, corresponding to membership to the smaller or larger circle, respectively.

Keyword arguments

  • shuffle=true: whether to shuffle the resulting points,

  • noise=0: standard deviation of the Gaussian noise added to the data,

  • factor=0.8: ratio of the smaller radius over the larger one,

  • eltype=Float64: machine type of points (any subtype of AbstractFloat).

  • rng=Random.GLOBAL_RNG: any AbstractRNG object, or integer to seed a MersenneTwister (for reproducibility).

  • as_table=true: whether to return the points as a table (true) or a matrix (false). If false the target y has integer element type.

Example

X, y = make_circles(100; noise=0.5, factor=0.3)
source
using MLJ, DataFrames
 X, y = make_circles(100; noise=0.05, factor=0.3)
 dfCircles = DataFrame(X)
 dfCircles.y = y
+first(dfCircles, 3)
3×3 DataFrame
 Row │ x1         x2         y
     │ Float64    Float64    Cat…
─────┼────────────────────────────
   1 │ -0.866122   0.409665  1
   2 │  0.211735  -0.983394  1
   3 │  0.493587  -0.95824   1
using VegaLite
+dfCircles |> @vlplot(:circle, x=:x1, y=:x2, color = :"y:n") 

svg

Sampling from two interleaved half-circles

MLJBase.make_moonsFunction
make_moons(n::Int=100; kwargs...)

Generates labeled two-dimensional points lying close to two interleaved semi-circles, for use with classification and clustering models.

Return value

By default, a table X with 2 columns and n rows (observations), together with a corresponding vector of n Multiclass target observations y. The target is either 0 or 1, corresponding to membership to the left or right semi-circle.

Keyword arguments

  • shuffle=true: whether to shuffle the resulting points,

  • noise=0.1: standard deviation of the Gaussian noise added to the data,

  • xshift=1.0: horizontal translation of the second center with respect to the first one.

  • yshift=0.3: vertical translation of the second center with respect to the first one.

  • eltype=Float64: machine type of points (any subtype of AbstractFloat).

  • rng=Random.GLOBAL_RNG: any AbstractRNG object, or integer to seed a MersenneTwister (for reproducibility).

  • as_table=true: whether to return the points as a table (true) or a matrix (false). If false the target y has integer element type.

Example

X, y = make_moons(100; noise=0.5)
source
using MLJ, DataFrames
 X, y = make_moons(100; noise=0.05)
 dfHalfCircles = DataFrame(X)
 dfHalfCircles.y = y
+first(dfHalfCircles, 3)
3×3 DataFrame
 Row │ x1          x2         y
     │ Float64     Float64    Cat…
─────┼─────────────────────────────
   1 │ 0.849042    -0.649109  1
   2 │ 0.2489      -0.297656  1
   3 │ 0.00246866   0.131994  1
using VegaLite
+dfHalfCircles |> @vlplot(:circle, x=:x1, y=:x2, color = :"y:n") 

svg

Regression data generated from noisy linear models

MLJBase.make_regressionFunction
make_regression(n, p; kwargs...)

Generate Gaussian input features and a linear response with Gaussian noise, for use with regression models.

Return value

By default, a tuple (X, y) where table X has p columns and n rows (observations), together with a corresponding vector of n Continuous target observations y.

Keywords

  • intercept=true: Whether to generate data from a model with intercept.

  • n_targets=1: Number of columns in the target.

  • sparse=0: Proportion of the generating weight vector that is sparse.

  • noise=0.1: Standard deviation of the Gaussian noise added to the response (target).

  • outliers=0: Proportion of the response vector to make as outliers by adding a random quantity with high variance. (Only applied if binary is false.)

  • as_table=true: Whether X (and y, if n_targets > 1) should be a table or a matrix.

  • eltype=Float64: Element type for X and y. Must subtype AbstractFloat.

  • binary=false: Whether the target should be binarized (via a sigmoid).

  • eltype=Float64: machine type of points (any subtype of AbstractFloat).

  • rng=Random.GLOBAL_RNG: any AbstractRNG object, or integer to seed a MersenneTwister (for reproducibility).

  • as_table=true: whether to return the points as a table (true) or a matrix (false).

Example

X, y = make_regression(100, 5; noise=0.5, sparse=0.2, outliers=0.1)
source
using MLJ, DataFrames
 X, y = make_regression(100, 5; noise=0.5, sparse=0.2, outliers=0.1)
 dfRegression = DataFrame(X)
 dfRegression.y = y
+first(dfRegression, 3)
3×6 DataFrame
 Row │ x1         x2         x3        x4         x5        y
     │ Float64    Float64    Float64   Float64    Float64   Float64
─────┼──────────────────────────────────────────────────────────────
   1 │ -0.102032   0.358267  0.298044   1.50412   0.414506  0.189413
   2 │  1.36867   -1.50025   1.79527   -0.161216  0.447436  4.02934
   3 │  0.618891  -0.451066  0.786466   0.17342   0.744419  2.1948
diff --git a/dev/getting_started/index.html b/dev/getting_started/index.html index 47d3c1caf..b1649c62b 100644 --- a/dev/getting_started/index.html +++ b/dev/getting_started/index.html @@ -1,5 +1,5 @@ -Getting Started · MLJ

+Getting Started · MLJ

Getting Started

For an outline of MLJ's goals and features, see About MLJ.

This page introduces some MLJ basics, assuming some familiarity with machine learning. For a complete list of other MLJ learning resources, see Learning MLJ.

MLJ collects together the functionality provided by multiple packages. To learn how to install components separately, run using MLJ; @doc MLJ.

This section introduces only the most basic MLJ operations and concepts. It assumes MLJ has been successfully installed. See Installation if this is not the case.

Choosing and evaluating a model

The following code loads Fisher's famous iris data set as a named tuple of column vectors:

julia> using MLJ
julia> iris = load_iris();
julia> selectrows(iris, 1:3) |> pretty
┌──────────────┬─────────────┬──────────────┬─────────────┬──────────────────────────────────┐
│ sepal_length │ sepal_width │ petal_length │ petal_width │ target                           │
│ Float64      │ Float64     │ Float64      │ Float64     │ CategoricalValue{String, UInt32} │
│ Continuous   │ Continuous  │ Continuous   │ Continuous  │ Multiclass{3}                    │
@@ -24,7 +24,7 @@
 │ 6.7 │ 3.3 │ 5.7 │ 2.1 │
 │ 5.7 │ 2.8 │ 4.1 │ 1.3 │
 │ 7.2 │ 3.0 │ 5.8 │ 1.6 │
-└──────────────┴─────────────┴──────────────┴─────────────┘

+└──────────────┴─────────────┴──────────────┴─────────────┘

This call to unpack splits off any column with name equal to :target into something called y, and all the remaining columns into X.
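The unpack call itself is not visible in this diff; a minimal sketch of it, with iris the named tuple loaded above, would be:

y, X = unpack(iris, ==(:target));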

To list all models available in MLJ's model registry do models(). Listing the models compatible with the present data:

julia> models(matching(X,y))54-element Vector{NamedTuple{(:name, :package_name, :is_supervised, :abstract_type, :constructor, :deep_properties, :docstring, :fit_data_scitype, :human_name, :hyperparameter_ranges, :hyperparameter_types, :hyperparameters, :implemented_methods, :inverse_transform_scitype, :is_pure_julia, :is_wrapper, :iteration_parameter, :load_path, :package_license, :package_url, :package_uuid, :predict_scitype, :prediction_type, :reporting_operations, :reports_feature_importances, :supports_class_weights, :supports_online, :supports_training_losses, :supports_weights, :transform_scitype, :input_scitype, :target_scitype, :output_scitype)}}:
  (name = AdaBoostClassifier, package_name = MLJScikitLearnInterface, ... )
  (name = AdaBoostStumpClassifier, package_name = DecisionTree, ... )
  (name = BaggingClassifier, package_name = MLJScikitLearnInterface, ... )
@@ -82,8 +82,8 @@
  OrderedFactor

We use the scitype function to check how MLJ is going to interpret given data. Our choice of encoding for y works for DecisionTreeClassifier, because we have:

julia> scitype(y)AbstractVector{Multiclass{3}} (alias for AbstractArray{Multiclass{3}, 1})

and Multiclass{3} <: Finite. If we encode with integers instead, we obtain:

julia> yint = int.(y);
julia> scitype(yint)AbstractVector{Count} (alias for AbstractArray{Count, 1})

and using yint in place of y in classification problems will fail. See also Working with Categorical Data.

For more on scientific types, see Data containers and scientific types below.

Fit and predict

To illustrate MLJ's fit and predict interface, let's perform our performance evaluations by hand, but using a simple holdout set, instead of cross-validation.

Wrapping the model in data creates a machine which will store training outcomes:

julia> mach = machine(tree, X, y)untrained Machine; caches model-specific representations of data
   model: DecisionTreeClassifier(max_depth = -1, …)
   args:
-    1:	Source @556 ⏎ Table{AbstractVector{Continuous}}
-    2:	Source @539 ⏎ AbstractVector{Multiclass{3}}

+    1:	Source @751 ⏎ Table{AbstractVector{Continuous}}
+    2:	Source @001 ⏎ AbstractVector{Multiclass{3}}

Training and testing on a hold-out set:

julia> train, test = partition(eachindex(y), 0.7); # 70:30 split
julia> fit!(mach, rows=train);[ Info: Training machine(DecisionTreeClassifier(max_depth = -1, …), …).
julia> yhat = predict(mach, X[test,:]);
julia> yhat[3:5]3-element UnivariateFiniteVector{Multiclass{3}, String, UInt32, Float64}: UnivariateFinite{Multiclass{3}}(setosa=>1.0, versicolor=>0.0, virginica=>0.0) UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.0, virginica=>1.0) UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.0, virginica=>1.0)
julia> log_loss(yhat, y[test])2.4029102259411435

Note that log_loss and cross_entropy are aliases for LogLoss() (which can be passed an optional keyword parameter, as in LogLoss(tol=0.001)). For a list of all losses and scores, and their aliases, run measures().
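For instance, the first of the following sketches computes the same quantity as the log_loss call above; the second adjusts the optional keyword:

LogLoss()(yhat, y[test])            # measure object, called directly
LogLoss(tol=0.001)(yhat, y[test])   # same measure with the optional keyword adjusted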

Notice that yhat is a vector of Distribution objects, because DecisionTreeClassifier makes probabilistic predictions. The methods of the Distributions.jl package can be applied to such distributions:

julia> broadcast(pdf, yhat[3:5], "virginica") # predicted probabilities of virginica3-element Vector{Float64}:
@@ -115,11 +115,11 @@
   count = false)
julia> mach2 = machine(stand, v)untrained Machine; caches model-specific representations of data model: Standardizer(features = Symbol[], …) args: - 1: Source @294 ⏎ AbstractVector{Continuous}
julia> fit!(mach2)[ Info: Training machine(Standardizer(features = Symbol[], …), …). + 1: Source @679 ⏎ AbstractVector{Continuous}
julia> fit!(mach2)[ Info: Training machine(Standardizer(features = Symbol[], …), …). trained Machine; caches model-specific representations of data model: Standardizer(features = Symbol[], …) args: - 1: Source @294 ⏎ AbstractVector{Continuous}
julia> w = transform(mach2, v)4-element Vector{Float64}: + 1: Source @679 ⏎ AbstractVector{Continuous}
julia> w = transform(mach2, v)4-element Vector{Float64}: -1.161895003862225 -0.3872983346207417 0.3872983346207417 @@ -174,6 +174,7 @@ package_name = "DecisionTree", is_supervised = true, abstract_type = Probabilistic, + constructor = nothing, deep_properties = (), docstring = "```\nDecisionTreeClassifier\n```\n\nA model type for c...", fit_data_scitype = @@ -237,4 +238,4 @@ input_scitype = Table{<:Union{AbstractVector{<:Continuous}, AbstractVector{<:Count}, AbstractVector{<:OrderedFactor}}}, target_scitype = AbstractVector{<:Finite}, - output_scitype = Unknown)
+ output_scitype = Unknown)

julia> i.input_scitypeTable{<:Union{AbstractVector{<:Continuous}, AbstractVector{<:Count}, AbstractVector{<:OrderedFactor}}}
julia> i.target_scitypeAbstractVector{<:Finite} (alias for AbstractArray{<:Finite, 1})

This output indicates that any table with Continuous, Count or OrderedFactor columns is acceptable as the input X, and that any vector with element scitype <: Finite is acceptable as the target y.

For more on matching models to data, see Model Search.

Scalar scientific types

Models in MLJ will always apply the MLJ convention described in ScientificTypes.jl to decide how to interpret the elements of your container types. Here are the key features of that convention:

  • Any AbstractFloat is interpreted as Continuous.

  • Any Integer is interpreted as Count.

  • Any CategoricalValue x is interpreted as Multiclass or OrderedFactor, depending on the value of isordered(x).

  • Strings and Chars are not interpreted as Multiclass or OrderedFactor (they have scitypes Textual and Unknown respectively).

  • In particular, integers (including Bools) cannot be used to represent categorical data. Use the preceding coerce operations to coerce to a Finite scitype.

  • The scientific types of nothing and missing are Nothing and Missing, native types we also regard as scientific.

Use coerce(v, OrderedFactor) or coerce(v, Multiclass) to coerce a vector v of integers, strings or characters to a vector with an appropriate Finite (categorical) scitype. See also Working with Categorical Data, and the ScientificTypes.jl documentation.
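For example, a minimal sketch:

using MLJ
v = [1, 2, 2, 3]
scitype(v)                    # AbstractVector{Count}
w = coerce(v, OrderedFactor)
scitype(w)                    # AbstractVector{OrderedFactor{3}}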

diff --git a/dev/glossary/index.html b/dev/glossary/index.html index d52f62ea4..b21425eb3 100644 --- a/dev/glossary/index.html +++ b/dev/glossary/index.html @@ -1,2 +1,2 @@ -Glossary · MLJ

+Glossary · MLJ

Glossary

Note: This glossary includes some detail intended mainly for MLJ developers.

Basics

hyperparameters

Parameters on which some learning algorithm depends, specified before the algorithm is applied, and where learning is interpreted in the broadest sense. For example, PCA feature reduction is a "preprocessing" transformation "learning" a projection from training data, governed by a dimension hyperparameter. Hyperparameters in our sense may specify configuration (eg, number of parallel processes) even when this does not affect the end-product of learning. (But we exclude verbosity level.)

model (object of abstract type Model)

Object collecting together hyperparameters of a single algorithm. Models are classified either as supervised or unsupervised models (eg, "transformers"), with corresponding subtypes Supervised <: Model and Unsupervised <: Model.
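A minimal sketch (assuming the DecisionTree interface package is installed; names are illustrative):

using MLJ
Tree = @load DecisionTreeClassifier pkg=DecisionTree verbosity=0
tree = Tree(max_depth=3)   # a model: a mutable container of hyperparameters
tree isa Supervised        # true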

fitresult (type generally defined outside of MLJ)

Also known as "learned" or "fitted" parameters, these are "weights", "coefficients", or similar parameters learned by an algorithm, after adopting the prescribed hyper-parameters. For example, decision trees of a random forest, the coefficients and intercept of a linear model, or the projection matrices of a PCA dimension-reduction algorithm.

operation

Data-manipulating operations (methods) using some fitresult. For supervised learners, the predict, predict_mean, predict_median, or predict_mode methods; for transformers, the transform or inverse_transform method. An operation may also refer to an ordinary data-manipulating method that does not depend on a fit-result (e.g., a broadcasted logarithm), which is then called a static operation for clarity. An operation that is not static is dynamic.

machine (object of type Machine)

An object consisting of:

  1. A model

  2. A fit-result (undefined until training)

  3. Training arguments (one for each data argument of the model's associated fit method). A training argument is data used for training (subsampled by specifying rows=... in fit!) but also in evaluation (subsampled by specifying rows=... in predict, predict_mean, etc). Generally, there are two training arguments for supervised models, and just one for unsupervised models. Each argument is either a Source node, wrapping concrete data supplied to the machine constructor, or a Node, in the case of a learning network (see below). Both kinds of nodes can be called with an optional rows=... keyword argument to (lazily) return concrete data.

In addition, machines store "report" metadata, for recording algorithm-specific statistics of training (eg, an internal estimate of generalization error, feature importances); and they cache information allowing the fit-result to be updated without unnecessary repetition of computations.

Machines are trained by calls to a fit! method which may be passed an optional argument specifying the rows of data to be used in training.

For more, see the Machines section.
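A minimal sketch, assuming some model tree and data X, y as in Getting Started:

mach = machine(tree, X, y)   # model + training arguments; no fit-result yet
fit!(mach, rows=1:100)       # train, using only the specified rows
fitted_params(mach)          # the learned parameters (fit-result)
report(mach)                 # algorithm-specific training metadata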

Learning Networks and Composite Models

Note: Multiple machines in a learning network may share the same model, and multiple learning nodes may share the same machine.

source node (object of type Source)

A container for training data and point of entry for new data in a learning network (see below).

node (object of type Node)

Essentially a machine (whose arguments are possibly other nodes) wrapped in an associated operation (e.g., predict or inverse_transform). It consists primarily of:

  1. An operation, static or dynamic.
  2. A machine, or nothing if the operation is static.
  3. Upstream connections to other nodes, specified by a list of arguments (one for each argument of the operation). These are the arguments on which the operation "acts" when the node N is called, as in N().

learning network

A directed acyclic graph implicit in the connections of a collection of source(s) and nodes.

wrapper

Any model with one or more other models as hyper-parameters.

composite model

Any wrapper, or any learning network, "exported" as a model (see Composing Models).

diff --git a/dev/homogeneous_ensembles/index.html b/dev/homogeneous_ensembles/index.html index a93cb493a..71146e8b5 100644 --- a/dev/homogeneous_ensembles/index.html +++ b/dev/homogeneous_ensembles/index.html @@ -1,8 +1,8 @@ -Homogeneous Ensembles · MLJ

+Homogeneous Ensembles · MLJ

Homogeneous Ensembles

Although an ensemble of models sharing a common set of hyperparameters can be defined using the learning network API, MLJ's EnsembleModel model wrapper is preferred, for convenience and best performance. Examples of using EnsembleModel are given in this Data Science Tutorial.

When bagging decision trees, further randomness is normally introduced by subsampling features, when training each node of each tree (Ho (1995), Breiman and Cutler (2001)). A bagged ensemble of such trees is known as a Random Forest. You can see an example of using EnsembleModel to build a random forest in this Data Science Tutorial. However, you may also want to use a canned random forest model. Run models("RandomForest") to list such models.

MLJEnsembles.EnsembleModelFunction
EnsembleModel(model,
               atomic_weights=Float64[],
               bagging_fraction=0.8,
               n=100,
               rng=GLOBAL_RNG,
               acceleration=CPU1(),
-              out_of_bag_measure=[])

+ out_of_bag_measure=[])

Create a model for training an ensemble of n clones of model, with optional bagging. Ensembling is useful if fit!(machine(atom, data...)) does not create identical models on repeated calls (ie, is a stochastic model, such as a decision tree with randomized node selection criteria), or if bagging_fraction is set to a value less than 1.0, or both.

Here the atomic model must support targets with scitype AbstractVector{<:Finite} (single-target classifiers) or AbstractVector{<:Continuous} (single-target regressors).

If rng is an integer, then MersenneTwister(rng) is the random number generator used for bagging. Otherwise some AbstractRNG object is expected.

The atomic predictions are optionally weighted according to the vector atomic_weights (to allow for external optimization) except in the case that model is a Deterministic classifier, in which case atomic_weights are ignored.

The ensemble model is Deterministic or Probabilistic, according to the corresponding supertype of atom. In the case of deterministic classifiers (target_scitype(atom) <: AbstractVector{<:Finite}), the predictions are majority votes, and for regressors (target_scitype(atom) <: AbstractVector{<:Continuous}) they are ordinary averages. Probabilistic predictions are obtained by averaging the atomic probability distribution/mass functions; in particular, for regressors, the ensemble prediction on each input pattern has the type MixtureModel{VF,VS,D} from the Distributions.jl package, where D is the type of predicted distribution for atom.

Specify acceleration=CPUProcesses() for distributed computing, or CPUThreads() for multithreading.

If a single measure or non-empty vector of measures is specified by out_of_bag_measure, then out-of-bag estimates of performance are written to the training report (call report on the trained machine wrapping the ensemble model).

Important: If per-observation or class weights w (not to be confused with atomic weights) are specified when constructing a machine for the ensemble model, as in mach = machine(ensemble_model, X, y, w), then w is used by any measures specified in out_of_bag_measure that support them.

source
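A usage sketch, assuming the DecisionTree interface package is installed (here the atomic model is passed via the model keyword; the positional form shown in the signature above may also apply, depending on the MLJEnsembles version):

using MLJ
Tree = @load DecisionTreeClassifier pkg=DecisionTree verbosity=0
forest = EnsembleModel(model=Tree(), n=100, bagging_fraction=0.7,
                       out_of_bag_measure=[log_loss])
X, y = @load_iris
mach = machine(forest, X, y)
fit!(mach)
report(mach)   # includes the out-of-bag estimate of log_loss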
diff --git a/dev/index.html b/dev/index.html index c52ea5b56..332228c44 100644 --- a/dev/index.html +++ b/dev/index.html @@ -1,5 +1,5 @@ -Home · MLJ
+Home · MLJ

Model Browser

Reference Manual

Basics

Getting Started | Working with Categorical Data | Common MLJ Workflows | Machines | MLJ Cheatsheet

Data

Working with Categorical Data | Preparing Data | Generating Synthetic Data | OpenML Integration | Correcting Class Imbalance

Models

Model Search | Loading Model Code | Transformers and Other Unsupervised Models | Simple User Defined Models | List of Supported Models | Third Party Packages

Meta-algorithms

Evaluating Model Performance | Tuning Models | Composing Models | Controlling Iterative Models | Learning Curves | Correcting Class Imbalance | Thresholding Probabilistic Predictors

Composition

Composing Models | Linear Pipelines | Target Transformations | Homogeneous Ensembles | Model Stacking | Learning Networks | Correcting Class Imbalance

Integration

Logging Workflows | OpenML Integration

Customization and Extension

Simple User Defined Models | Quick-Start Guide to Adding Models | Adding Models for General Use | Composing Models | Internals | Modifying Behavior

Miscellaneous

Weights | Acceleration and Parallelism | Performance Measures

diff --git a/dev/internals/index.html b/dev/internals/index.html index 8ad50e52a..c1e7a3264 100644 --- a/dev/internals/index.html +++ b/dev/internals/index.html @@ -1,5 +1,5 @@ -Internals · MLJ

Internals

The machine interface, simplified

The following is a simplified description of the Machine interface. It predates the introduction of an optional data front-end for models (see Implementing a data front-end). See also the Glossary.

The Machine type

mutable struct Machine{M<:Model}
+Internals · MLJ
+end
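The body of the struct does not survive in this diff's rendering; a hedged sketch of what such a simplified Machine might contain (field names are illustrative, not the package's exact definition):

mutable struct Machine{M<:Model}
    model::M
    fitresult      # learned parameters; undefined until fit! is called
    cache          # information allowing cheap updates of the fit-result
    args::Tuple    # training arguments, e.g. source nodes wrapping (X, y)
    report         # algorithm-specific training metadata
end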
diff --git a/dev/learning_curves/index.html b/dev/learning_curves/index.html index 18705cb5f..44bb6f1db 100644 --- a/dev/learning_curves/index.html +++ b/dev/learning_curves/index.html @@ -1,5 +1,5 @@ -Learning Curves · MLJ

+Learning Curves · MLJ

Learning Curves

A learning curve in MLJ is a plot of some performance estimate, as a function of some model hyperparameter. This can be useful when tuning a single model hyperparameter, or when deciding how many iterations are required for some iterative model. The learning_curve method does not actually generate a plot but generates the data needed to do so.

To generate learning curves you can bind data to a model by instantiating a machine. You can choose to supply all available data, as performance estimates are computed using a resampling strategy, defaulting to Holdout(fraction_train=0.7).

using MLJ
 X, y = @load_boston;
 
 atom = (@load RidgeRegressor pkg=MLJLinearModels)()
@@ -62,4 +62,4 @@
      ylab="Holdout estimate of RMS error")
 
 
learning_curve(model::Supervised, X, y; kwargs...)
-learning_curve(model::Supervised, X, y, w; kwargs...)

Plot a learning curve (or curves) directly, without first constructing a machine.

Summary of key-word options

  • resolution - number of points generated from range (number of model evaluations); default is 30

  • acceleration - parallelization option for passing to evaluate!; an instance of CPU1, CPUProcesses or CPUThreads from ComputationalResources.jl; default is default_resource()

  • acceleration_grid - parallelization option for distributing each performance evaluation

  • rngs - for specifying random number generator(s) to be passed to the model (see above)

  • rng_name - name of the model hyper-parameter representing a random number generator (see above); possibly nested

Other key-word options are documented at TunedModel.

source
+learning_curve(model::Supervised, X, y, w; kwargs...)
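A sketch of such a direct call, re-using the RidgeRegressor atom and data from the example above (the range and measure shown are illustrative):

r_lambda = range(atom, :lambda, lower=1e-3, upper=10.0, scale=:log10)
curve = learning_curve(atom, X, y;
                       range=r_lambda,
                       resampling=Holdout(fraction_train=0.7),
                       measure=rms,
                       resolution=30)
curve.parameter_values, curve.measurements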

source diff --git a/dev/learning_mlj/index.html b/dev/learning_mlj/index.html index 0f2e76f3e..b7f737678 100644 --- a/dev/learning_mlj/index.html +++ b/dev/learning_mlj/index.html @@ -1,2 +1,2 @@ -Learning MLJ · MLJ

+Learning MLJ · MLJ

Learning MLJ

MLJ Cheatsheet

See also Getting help and reporting problems.

The present document, although littered with examples, is primarily intended as a complete reference.

Where to start?

Completely new to Julia?

Julia's learning resources page | Learn X in Y minutes | HelloJulia

New to data science?

Julia Data Science

New to machine learning?

Introduction to Statistical Learning with Julia versions of the R labs here

Know some ML and just want MLJ basics?

Getting Started | Common MLJ Workflows

An ML practitioner transitioning from another platform?

MLJ for Data Scientists in Two Hours | MLJTutorial

Other resources

diff --git a/dev/learning_networks/index.html b/dev/learning_networks/index.html index a9e5a0bbd..639e985c4 100644 --- a/dev/learning_networks/index.html +++ b/dev/learning_networks/index.html @@ -1,5 +1,5 @@ -Learning Networks · MLJ

+Learning Networks · MLJ

Learning Networks

Below is a practical guide to the MLJ implementation of learning networks, which have been described more abstractly in the article:

Anthony D. Blaom and Sebastian J. Voller (2020): Flexible model composition in machine learning and its implementation in MLJ. Preprint, arXiv:2012.15505.

Learning networks, an advanced but powerful MLJ feature, are "blueprints" for combining models in flexible ways, beyond ordinary linear pipelines and simple model ensembles. They are simple transformations of your existing workflows which can be "exported" to define new, re-usable composite model types (models which typically have other models as hyperparameters).

Pipeline models (see Pipeline), and model stacks (see Stack) are both implemented internally as exported learning networks.

Note

While learning networks can be used for complex machine learning workflows, their main purpose is for defining new stand-alone model types, which behave just like any other model type: Instances can be evaluated, tuned, inserted into pipelines, etc. In serious applications, users are encouraged to export their learning networks, as explained under Exporting a learning network as a new model type below, after testing the network using a small training dataset.

Learning networks by example

Learning networks are best explained by way of example.

Lazy computation

The core idea of a learning network is delayed or lazy computation. Instead of

X = 4
 Y = 3
 Z = 2*X
 W = Y + Z
@@ -10,10 +10,10 @@
 Z = 2*X
 W = Y + Z
 W()
11

In the first computation X, Y, Z and W are all bound to ordinary data. In the second, they are bound to objects called nodes. The special nodes X and Y constitute "entry points" for data, and are called source nodes. As the terminology suggests, we can imagine these objects as part of a "network" (a directed acyclic graph), which can aid conceptualization (but is less useful in more complicated examples).
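The lines defining these source nodes fall outside the hunk shown above; a sketch of the lazy version being described is:

X = source(4)
Y = source(3)
Z = 2*X
W = Y + Z
W()   # 11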

The origin of a node

The source nodes on which a given node depends are called the origins of the node:

os = origins(W)
2-element Vector{Source}:
- Source @882 ⏎ `Count`
- Source @862 ⏎ `Count`
+ Source @317 ⏎ `Count`
+ Source @622 ⏎ `Count`
X in os
true

Re-using a network

The advantage of lazy evaluation is that we can change data at a source node to repeat the calculation with new data. One way to do this (discouraged in practice) is to use rebind!:

Z()
8
rebind!(X, 6) # demonstration only!
 Z()
12

However, if a node has a unique origin, then one instead calls the node on the new data one would like to rebind to that origin:

origins(Z)
1-element Vector{Source}:
- Source @862 ⏎ `Count`
+ Source @622 ⏎ `Count`
Z(6)
12
Z(4)
8

This has the advantage that you don't need to locate the origin and rebind data directly, and the unique-origin restriction turns out to be sufficient for the applications to learning we have in mind.

Overloading functions for use on nodes

Several built-in functions like * and + above are overloaded in MLJBase to work on nodes, as illustrated above. Others that work out-of-the-box include: MLJBase.matrix, MLJBase.table, vcat, hcat, mean, median, mode, first, last, as well as broadcasted versions of log, exp, mean, mode and median. A function like sqrt is not overloaded, so that Q = sqrt(Z) will throw an error. Instead, we do

Q = node(sqrt, Z)
 Z()
12
Q()
3.4641016151377544

You can learn more about the node function under More on defining new nodes.

A network that learns

To incorporate learning in a network of nodes, MLJ:

  • Allows binding of machines to nodes instead of data

  • Generates "operation" nodes when calling an operation like predict or transform on a machine and node input data. Such nodes point to both a machine (storing learned parameters) and the node from which to fetch data for applying the operation (which, unlike the nodes seen so far, depend on learned parameters to generate output).

For an example of a learning network that actually learns, we first synthesize some training data X, y, and production data Xnew:

using MLJ
 X, y = make_blobs(cluster_std=10.0, rng=123)  # `X` is a table, `y` a vector
 Xnew, _ = make_blobs(3) # `Xnew` is a table with the same number of columns

We choose a model to do some dimension reduction, and another to perform classification:

pca = (@load PCA pkg=MultivariateStats verbosity=0)()
@@ -22,40 +22,40 @@
 x = transform(mach1, Xs) # defines a new node because `Xs` is a node
 
 mach2 = machine(tree, x, ys)
-yhat = predict(mach2, x) # defines a new node because `x` is a node
Node @561 → DecisionTreeClassifier(…)
+yhat = predict(mach2, x) # defines a new node because `x` is a node
Node @416 → DecisionTreeClassifier(…)
   args:
-    1:	Node @215 → PCA(…)
+    1:	Node @807 → PCA(…)
   formula:
     predict(
       machine(DecisionTreeClassifier(max_depth = -1, …), …), 
       transform(
         machine(PCA(maxoutdim = 0, …), …), 
-        Source @840))

Note that mach1 and mach2 are not themselves nodes. They point to the nodes they need to call to get training data and they are in turn pointed to by other nodes. In fact, an interesting implementation detail is that an "ordinary" machine is not actually bound directly to data, but bound to data wrapped in source nodes.

machine(pca, Xnew).args[1] # `Xnew` is ordinary data
Source @877 ⏎ `Table{AbstractVector{Continuous}}`

Before calling a node, we need to fit! the node, to trigger training of all the machines on which it depends:

julia> fit!(yhat)   # can include same keyword options for `fit!(::Machine, ...)`
[ Info: Training machine(PCA(maxoutdim = 0, …), …).
+        Source @900))

Note that mach1 and mach2 are not themselves nodes. They point to the nodes they need to call to get training data and they are in turn pointed to by other nodes. In fact, an interesting implementation detail is that an "ordinary" machine is not actually bound directly to data, but bound to data wrapped in source nodes.

machine(pca, Xnew).args[1] # `Xnew` is ordinary data
Source @621 ⏎ `Table{AbstractVector{Continuous}}`

Before calling a node, we need to fit! the node, to trigger training of all the machines on which it depends:

julia> fit!(yhat)   # can include same keyword options for `fit!(::Machine, ...)`
[ Info: Training machine(PCA(maxoutdim = 0, …), …).
 [ Info: Training machine(DecisionTreeClassifier(max_depth = -1, …), …).
-Node @561 → DecisionTreeClassifier(…)
+Node @416 → DecisionTreeClassifier(…)
   args:
-    1:	Node @215 → PCA(…)
+    1:	Node @807 → PCA(…)
   formula:
     predict(
       machine(DecisionTreeClassifier(max_depth = -1, …), …),
       transform(
         machine(PCA(maxoutdim = 0, …), …),
-        Source @840))
julia> yhat()[1:2] # or `yhat(rows=2)`2-element UnivariateFiniteVector{Multiclass{3}, Int64, UInt32, Float64}: + Source @900))
julia> yhat()[1:2] # or `yhat(rows=2)`2-element UnivariateFiniteVector{Multiclass{3}, Int64, UInt32, Float64}: UnivariateFinite{Multiclass{3}}(1=>1.0, 2=>0.0, 3=>0.0) UnivariateFinite{Multiclass{3}}(1=>1.0, 2=>0.0, 3=>0.0)

This last represents the prediction on the training data, because that's what resides at our source nodes. However, yhat has the unique origin X (because "training edges" in the complete associated directed graph are excluded for this purpose). We can therefore call yhat on our production data to get the corresponding predictions:

yhat(Xnew)
3-element UnivariateFiniteVector{Multiclass{3}, Int64, UInt32, Float64}:
  UnivariateFinite{Multiclass{3}}(1=>0.0, 2=>0.0, 3=>1.0)
  UnivariateFinite{Multiclass{3}}(1=>0.0, 2=>0.0, 3=>1.0)
  UnivariateFinite{Multiclass{3}}(1=>1.0, 2=>0.0, 3=>0.0)

Training is smart, in the sense that mutating a hyper-parameter of some component model does not force retraining of upstream machines:

julia> tree.max_depth = 11
julia> fit!(yhat)[ Info: Not retraining machine(PCA(maxoutdim = 0, …), …). Use `force=true` to force. [ Info: Updating machine(DecisionTreeClassifier(max_depth = 1, …), …). -Node @561 → DecisionTreeClassifier(…) +Node @416 → DecisionTreeClassifier(…) args: - 1: Node @215 → PCA(…) + 1: Node @807 → PCA(…) formula: predict( machine(DecisionTreeClassifier(max_depth = 1, …), …), transform( machine(PCA(maxoutdim = 0, …), …), - Source @840))
julia> yhat(Xnew)3-element UnivariateFiniteVector{Multiclass{3}, Int64, UInt32, Float64}: + Source @900))
julia> yhat(Xnew)3-element UnivariateFiniteVector{Multiclass{3}, Int64, UInt32, Float64}: UnivariateFinite{Multiclass{3}}(1=>0.357, 2=>0.4, 3=>0.243) UnivariateFinite{Multiclass{3}}(1=>0.357, 2=>0.4, 3=>0.243) UnivariateFinite{Multiclass{3}}(1=>0.357, 2=>0.4, 3=>0.243)

Multithreaded training

A more complicated learning network may contain machines that can be trained in parallel. In that case, a call like the following may speed up training:

tree.max_depth = 2
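The call itself falls outside this hunk; a minimal sketch of the kind of call intended, assuming the yhat node trained above:

fit!(yhat, acceleration=CPUThreads())   # fit machines in the network in parallel where possible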
@@ -67,15 +67,15 @@
 NetworkComposite
NetworkComposite (alias for Union{AnnotatorNetworkComposite, DeterministicNetworkComposite, DeterministicSupervisedDetectorNetworkComposite, DeterministicUnsupervisedDetectorNetworkComposite, IntervalNetworkComposite, JointProbabilisticNetworkComposite, ProbabilisticNetworkComposite, ProbabilisticSetNetworkComposite, ProbabilisticSupervisedDetectorNetworkComposite, ProbabilisticUnsupervisedDetectorNetworkComposite, StaticNetworkComposite, SupervisedAnnotatorNetworkComposite, SupervisedDetectorNetworkComposite, SupervisedNetworkComposite, UnsupervisedAnnotatorNetworkComposite, UnsupervisedDetectorNetworkComposite, UnsupervisedNetworkComposite})
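The struct declaration itself is elided by the hunk above; a sketch of the kind of declaration meant, assuming the field names preprocessor and classifier used below:

mutable struct CompositeA <: ProbabilisticNetworkComposite
    preprocessor
    classifier
end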

We next make our learning network model-generic by substituting each model instance with the corresponding symbol representing a property (field) of the new model struct:

mach1 = machine(:preprocessor, Xs)   # <---- `pca` swapped out for `:preprocessor`
 x = transform(mach1, Xs)
 mach2 = machine(:classifier, x, ys)  # <---- `tree` swapped out for `:classifier`
-yhat = predict(mach2, x)
Node @449 → :classifier
+yhat = predict(mach2, x)
Node @090 → :classifier
   args:
-    1:	Node @707 → :preprocessor
+    1:	Node @544 → :preprocessor
   formula:
     predict(
       machine(:classifier, …), 
       transform(
         machine(:preprocessor, …), 
-        Source @840))

Incidentally, this network can be used as before except we must provide an instance of CompositeA in our fit! calls, to indicate what actual models the symbols are being substituted with:

composite_a = CompositeA(pca, ConstantClassifier())
+        Source @900))

Incidentally, this network can be used as before except we must provide an instance of CompositeA in our fit! calls, to indicate what actual models the symbols are being substituted with:

composite_a = CompositeA(pca, ConstantClassifier())
 fit!(yhat, composite=composite_a)
 yhat(Xnew)
3-element UnivariateFiniteVector{Multiclass{3}, Int64, UInt32, Float64}:
  UnivariateFinite{Multiclass{3}}(1=>0.33, 2=>0.33, 3=>0.34)
@@ -114,7 +114,7 @@
         weights = NearestNeighborModels.Uniform()))
mach = machine(composite_a, X, y) |> fit!
 predict(mach, X)[1:2]
2-element UnivariateFiniteVector{Multiclass{3}, String, UInt32, Float64}:
  UnivariateFinite{Multiclass{3}}(setosa=>1.0, versicolor=>0.0, virginica=>0.0)
- UnivariateFinite{Multiclass{3}}(setosa=>1.0, versicolor=>0.0, virginica=>0.0)
report(mach).preprocessor
(features_fit = [:sepal_length, :petal_width, :petal_length, :sepal_width],)
fitted_params(mach).classifier
(tree = NearestNeighbors.KDTree{StaticArraysCore.SVector{4, Float64}, Distances.Euclidean, Float64, StaticArraysCore.SVector{4, Float64}}
+ UnivariateFinite{Multiclass{3}}(setosa=>1.0, versicolor=>0.0, virginica=>0.0)
report(mach).preprocessor
(features_fit = [:sepal_length, :sepal_width, :petal_length, :petal_width],)
fitted_params(mach).classifier
(tree = NearestNeighbors.KDTree{StaticArraysCore.SVector{4, Float64}, Distances.Euclidean, Float64, StaticArraysCore.SVector{4, Float64}}
   Number of points: 150
   Dimensions: 4
   Metric: Distances.Euclidean(0.0)
@@ -385,16 +385,16 @@
 mach = machine(lasso_cv, X, y) |> fit!
 report(mach)
(lambda = 3.876061458625438e-5,
  lambda_range = NumericRange(3.876e-5 ≤ lambda ≤ 3.876; origin=1.938, unit=1.938; on log10 scale),)
fitted_params(mach)
(coefs = [:x1 => -0.0007356866733566412, :x2 => -0.0006394063840804443, :x3 => 0.005212228133484562],
- intercept = 0.02542322487622568,)

The learning network API

Two new julia types are part of learning networks: Source and Node, which share a common abstract supertype AbstractNode.

Formally, a learning network defines two labeled directed acyclic graphs (DAGs) whose nodes are Node or Source objects, and whose labels are Machine objects. We obtain the first DAG from directed edges of the form $N1 -> N2$ whenever $N1$ is an argument of $N2$ (see below). Only this DAG is relevant when calling a node, as discussed in the examples above and below. To form the second DAG (relevant when calling fit! on a node) one adds edges for which $N1$ is a training argument of the machine which labels $N2$. We call the second, larger DAG, the completed learning network (but note only edges of the smaller network are explicitly drawn in diagrams, for simplicity).
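For instance, a sketch assuming model is some supervised model instance and X, y suitable training data: a prediction node sees the target only through a training edge, so the target's source appears among its sources but not among its origins.

Xs = source(X); ys = source(y)
mach = machine(model, Xs, ys)
yhat = predict(mach, Xs)
origins(yhat)   # just the source `Xs` (training edges excluded)
sources(yhat)   # both `Xs` and `ys` (training edges included)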

Source nodes

Only source nodes can reference concrete data. A Source object has a single field, data.

MLJBase.sourceMethod
Xs = source(X=nothing)

Define a learning network Source object, wrapping some input data X, which can be nothing for purposes of exporting the network as a stand-alone model. For training and testing the unexported network, appropriate vectors, tables, or other data containers are expected.

The calling behaviour of a Source object is this:

Xs() = X
+ intercept = 0.02542322487622568,)

The learning network API

Two new julia types are part of learning networks: Source and Node, which share a common abstract supertype AbstractNode.

Formally, a learning network defines two labeled directed acyclic graphs (DAGs) whose nodes are Node or Source objects, and whose labels are Machine objects. We obtain the first DAG from directed edges of the form $N1 -> N2$ whenever $N1$ is an argument of $N2$ (see below). Only this DAG is relevant when calling a node, as discussed in the examples above and below. To form the second DAG (relevant when calling fit! on a node) one adds edges for which $N1$ is a training argument of the machine which labels $N2$. We call the second, larger DAG, the completed learning network (but note only edges of the smaller network are explicitly drawn in diagrams, for simplicity).

Source nodes

Only source nodes can reference concrete data. A Source object has a single field, data.

MLJBase.sourceMethod
Xs = source(X=nothing)

Define a learning network Source object, wrapping some input data X, which can be nothing for purposes of exporting the network as a stand-alone model. For training and testing the unexported network, appropriate vectors, tables, or other data containers are expected.

The calling behaviour of a Source object is this:

Xs() = X
 Xs(rows=r) = selectrows(X, r)  # eg, X[r,:] for a DataFrame
-Xs(Xnew) = Xnew

See also: MLJBase.prefit, sources, origins, node.

source
MLJBase.rebind!Function
rebind!(s, X)

Attach new data X to an existing source node s. Not a public method.

source
MLJBase.sourcesFunction
sources(N::AbstractNode)

A vector of all sources referenced by calls N() and fit!(N). These are the sources of the ancestor graph of N when including training edges.

Not to be confused with origins(N), in which training edges are excluded.

See also: origins, source.

source
MLJBase.originsFunction
origins(N)

Return a list of all origins of a node N accessed by a call N(). These are the source nodes of the ancestor graph of N when edges corresponding to training arguments are excluded. A Node object cannot be called on new data unless it has a unique origin.

Not to be confused with sources(N) which refers to the same graph but without the training edge deletions.

See also: node, source.

source

Nodes

MLJBase.NodeType
Node{T<:Union{Machine,Nothing}}

Type for nodes in a learning network that are not Source nodes.

The key components of a Node are:

  • An operation, which will either be static (a fixed function) or dynamic (such as predict or transform).

  • A Machine object, on which to dispatch the operation (nothing if the operation is static). The training arguments of the machine are generally other nodes, including Source nodes.

  • Upstream connections to other nodes, called its arguments, possibly including Source nodes, one for each data argument of the operation (typically there's just one).

When a node N is called, as in N(), it applies the operation on the machine (if there is one) together with the outcome of calls to its node arguments, to compute the return value. For details on a node's calling behavior, see node.

See also node, Source, origins, sources, fit!.

source
MLJBase.nodeFunction
J = node(f, mach::Machine, args...)

Defines a dynamic Node object J wrapping a dynamic operation f (predict, predict_mean, transform, etc), a nodal machine mach and arguments args. Its calling behaviour, which depends on the outcome of training mach (and, implicitly, on training outcomes affecting its arguments) is this:

J() = f(mach, args[1](), args[2](), ..., args[n]())
+Xs(Xnew) = Xnew

See also: MLJBase.prefit, sources, origins, node.

source
MLJBase.rebind!Function
rebind!(s, X)

Attach new data X to an existing source node s. Not a public method.

source
MLJBase.sourcesFunction
sources(N::AbstractNode)

A vector of all sources referenced by calls N() and fit!(N). These are the sources of the ancestor graph of N when including training edges.

Not to be confused with origins(N), in which training edges are excluded.

See also: origins, source.

source
MLJBase.originsFunction
origins(N)

Return a list of all origins of a node N accessed by a call N(). These are the source nodes of the ancestor graph of N when edges corresponding to training arguments are excluded. A Node object cannot be called on new data unless it has a unique origin.

Not to be confused with sources(N) which refers to the same graph but without the training edge deletions.

See also: node, source.

source

Nodes

MLJBase.NodeType
Node{T<:Union{Machine,Nothing}}

Type for nodes in a learning network that are not Source nodes.

The key components of a Node are:

  • An operation, which will either be static (a fixed function) or dynamic (such as predict or transform).

  • A Machine object, on which to dispatch the operation (nothing if the operation is static). The training arguments of the machine are generally other nodes, including Source nodes.

  • Upstream connections to other nodes, called its arguments, possibly including Source nodes, one for each data argument of the operation (typically there's just one).

When a node N is called, as in N(), it applies the operation on the machine (if there is one) together with the outcome of calls to its node arguments, to compute the return value. For details on a node's calling behavior, see node.

See also node, Source, origins, sources, fit!.

source
MLJBase.nodeFunction
J = node(f, mach::Machine, args...)

Defines a dynamic Node object J wrapping a dynamic operation f (predict, predict_mean, transform, etc), a nodal machine mach and arguments args. Its calling behaviour, which depends on the outcome of training mach (and, implicitly, on training outcomes affecting its arguments) is this:

J() = f(mach, args[1](), args[2](), ..., args[n]())
 J(rows=r) = f(mach, args[1](rows=r), args[2](rows=r), ..., args[n](rows=r))
 J(X) = f(mach, args[1](X), args[2](X), ..., args[n](X))

Generally n=1 or n=2 in this latter case.

predict(mach, X::AbstractNode, y::AbstractNode)
 predict_mean(mach, X::AbstractNode, y::AbstractNode)
 predict_median(mach, X::AbstractNode, y::AbstractNode)
 predict_mode(mach, X::AbstractNode, y::AbstractNode)
 transform(mach, X::AbstractNode)
-inverse_transform(mach, X::AbstractNode)

Shortcuts for J = node(predict, mach, X, y), etc.

Calling a node is a recursive operation which terminates in the call to a source node (or nodes). Calling nodes on new data X fails unless the number of such nodes is one.
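A sketch of this restriction, using the built-in overloading of vcat for nodes noted earlier:

X1s = source([1, 2]); X2s = source([3, 4])
W = vcat(X1s, X2s)   # a node with two origins
W()                  # [1, 2, 3, 4]
W([5, 6])            # errors, because length(origins(W)) == 2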

See also: Node, @node, source, origins.

source
MLJBase.@nodeMacro
@node f(...)

Construct a new node that applies the function f to some combination of nodes, sources and other arguments.

Important. An argument not in global scope is assumed to be a node or source.

Examples

julia> X = source(π)
+inverse_transform(mach, X::AbstractNode)

Shortcuts for J = node(predict, mach, X, y), etc.

Calling a node is a recursive operation which terminates in the call to a source node (or nodes). Calling nodes on new data X fails unless the number of such nodes is one.

See also: Node, @node, source, origins.

source
MLJBase.@nodeMacro
@node f(...)

Construct a new node that applies the function f to some combination of nodes, sources and other arguments.

Important. An argument not in global scope is assumed to be a node or source.

Examples

julia> X = source(π)
 julia> W = @node sin(X)
 julia> W()
 0
@@ -415,6 +415,6 @@
 julia> N = @node add(X1, 1, X2)
 julia> N()
 10
-

See also node

source
MLJBase.prefitFunction
MLJBase.prefit(model, verbosity, data...)

Returns a learning network interface (see below) for a learning network with source nodes that wrap data.

A user overloads MLJBase.prefit when exporting a learning network as a new stand-alone model type, of which model above will be an instance. See the MLJ reference manual for details.

A learning network interface is a named tuple declaring certain interface points in a learning network, to be used when "exporting" the network as a new stand-alone model type. Examples are

 (predict=yhat,)
+

See also node

source
MLJBase.prefitFunction
MLJBase.prefit(model, verbosity, data...)

Returns a learning network interface (see below) for a learning network with source nodes that wrap data.

A user overloads MLJBase.prefit when exporting a learning network as a new stand-alone model type, of which model above will be an instance. See the MLJ reference manual for details.

A learning network interface is a named tuple declaring certain interface points in a learning network, to be used when "exporting" the network as a new stand-alone model type. Examples are

 (predict=yhat,)
  (transform=Xsmall, acceleration=CPUThreads())
- (predict=yhat, transform=W, report=(loss=loss_node,))

Here yhat, Xsmall, W and loss_node are nodes in the network.

The keys of the learning network interface are always one of the following:

  • The name of an operation, such as :predict, :predict_mode, :transform, :inverse_transform. See "Operation keys" below.

  • :report, for exposing results of calling a node with no arguments in the composite model report. See "Including report nodes" below.

  • :fitted_params, for exposing results of calling a node with no arguments as fitted parameters of the composite model. See "Including fitted parameter nodes" below.

  • :acceleration, for articulating acceleration mode for training the network, e.g., CPUThreads(). Corresponding value must be an AbstractResource. If not included, CPU1() is used.

Operation keys

If the key is an operation, then the value must be a node n in the network with a unique origin (length(origins(n)) === 1). The intention of a declaration such as predict=yhat is that the exported model type implements predict, which, when applied to new data Xnew, should return yhat(Xnew).

Including report nodes

If the key is :report, then the corresponding value must be a named tuple

 (k1=n1, k2=n2, ...)

whose values are all nodes. For each k=n pair, the key k will appear as a key in the composite model report, with a corresponding value of deepcopy(n()), called immediately after training or updating the network. For examples, refer to the "Learning Networks" section of the MLJ manual.

Including fitted parameter nodes

If the key is :fitted_params, then the behaviour is as for report nodes but results are exposed as fitted parameters of the composite model instead of the report.

source

See more on fitting nodes at fit! and fit_only!.

+ (predict=yhat, transform=W, report=(loss=loss_node,))

Here yhat, Xsmall, W and loss_node are nodes in the network.

The keys of the learning network interface are always one of the following:

  • The name of an operation, such as :predict, :predict_mode, :transform, :inverse_transform. See "Operation keys" below.

  • :report, for exposing results of calling a node with no arguments in the composite model report. See "Including report nodes" below.

  • :fitted_params, for exposing results of calling a node with no arguments as fitted parameters of the composite model. See "Including fitted parameter nodes" below.

  • :acceleration, for articulating acceleration mode for training the network, e.g., CPUThreads(). Corresponding value must be an AbstractResource. If not included, CPU1() is used.

Operation keys

If the key is an operation, then the value must be a node n in the network with a unique origin (length(origins(n)) === 1). The intention of a declaration such as predict=yhat is that the exported model type implements predict, which, when applied to new data Xnew, should return yhat(Xnew).

Including report nodes

If the key is :report, then the corresponding value must be a named tuple

 (k1=n1, k2=n2, ...)

whose values are all nodes. For each k=n pair, the key k will appear as a key in the composite model report, with a corresponding value of deepcopy(n()), called immediately after training or updating the network. For examples, refer to the "Learning Networks" section of the MLJ manual.

Including fitted parameter nodes

If the key is :fitted_params, then the behaviour is as for report nodes but results are exposed as fitted parameters of the composite model instead of the report.

source

See more on fitting nodes at fit! and fit_only!.
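Putting the pieces above together, a minimal sketch of a prefit definition for the CompositeA example from earlier in this section; the network is rebuilt on source nodes, machines are bound to field names as symbols, and the interface exposes the prediction node:

function MLJBase.prefit(composite::CompositeA, verbosity, X, y)
    Xs = source(X)
    ys = source(y)
    mach1 = machine(:preprocessor, Xs)
    x = transform(mach1, Xs)
    mach2 = machine(:classifier, x, ys)
    yhat = predict(mach2, x)
    return (predict=yhat,)   # the learning network interface
end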

diff --git a/dev/linear_pipelines/index.html b/dev/linear_pipelines/index.html index 6b59f17ad..b39e23d28 100644 --- a/dev/linear_pipelines/index.html +++ b/dev/linear_pipelines/index.html @@ -1,5 +1,5 @@ -Linear Pipelines · MLJ

Linear Pipelines

In MLJ a pipeline is a composite model in which models are chained together in a linear (non-branching) chain. For other arrangements, including custom architectures via learning networks, see Composing Models.

For purposes of illustration, consider a supervised learning problem with the following toy data:

using MLJ
+Linear Pipelines · MLJ

Linear Pipelines

In MLJ a pipeline is a composite model in which models are chained together in a linear (non-branching) chain. For other arrangements, including custom architectures via learning networks, see Composing Models.

For purposes of illustration, consider a supervised learning problem with the following toy data:

using MLJ
 X = (age    = [23, 45, 34, 25, 67],
      gender = categorical(['m', 'm', 'f', 'm', 'f']));
 y = [67.0, 81.5, 55.6, 90.0, 61.1]

We would like to train using a K-nearest neighbor model, but the model type KNNRegressor assumes the features are all Continuous. This can be fixed by first:

  • coercing the :age feature to have Continuous type by replacing X with coerce(X, :age=>Continuous)
  • standardizing continuous features and one-hot encoding the Multiclass features using the ContinuousEncoder model

However, we can avoid separately applying these preprocessing steps (two of which require fit! steps) by combining them with the supervised KNNRegressor model in a new pipeline model, using Julia's |> syntax:

KNNRegressor = @load KNNRegressor pkg=NearestNeighborModels
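The construction itself is elided by the hunk that follows; a sketch of the kind of pipeline meant, assuming the KNNRegressor type just loaded (the coercion step is supplied as a plain function, so it needs no fit!):

pipe = (X -> coerce(X, :age=>Continuous)) |> ContinuousEncoder() |> KNNRegressor()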
@@ -41,4 +41,4 @@
 
 pipe1 = MLJBase.table |> ContinuousEncoder |> Standardizer
 pipe2 = PCA |> LinearRegressor
-pipe1 |> pipe2

At most one of the components may be a supervised model, but this model can appear in any position. A pipeline with a Supervised component is itself Supervised and implements the predict operation. It is otherwise Unsupervised (possibly Static) and implements transform.

Special operations

If all the components are invertible unsupervised models (ie, implement inverse_transform) then inverse_transform is implemented for the pipeline. If there are no supervised models, then predict is nevertheless implemented, assuming the last component is a model that implements it (some clustering models). Similarly, calling transform on a supervised pipeline calls transform on the supervised component.
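A sketch, assuming t1 and t2 are instances of invertible unsupervised models and X is suitable input data:

pipe = t1 |> t2
mach = machine(pipe, X) |> fit!
W = transform(mach, X)
inverse_transform(mach, W)   # approximately recovers X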

Optional key-word arguments

  • prediction_type - prediction type of the pipeline; possible values: :deterministic, :probabilistic, :interval (default=:deterministic if not inferable)

  • operation - operation applied to the supervised component model, when present; possible values: predict, predict_mean, predict_median, predict_mode (default=predict)

  • cache - whether the internal machines created for component models should cache model-specific representations of data (see machine) (default=true)

Warning

Set cache=false to guarantee data anonymization.

To build more complicated non-branching pipelines, refer to the MLJ manual sections on composing models.

source
+pipe1 |> pipe2

At most one of the components may be a supervised model, but this model can appear in any position. A pipeline with a Supervised component is itself Supervised and implements the predict operation. It is otherwise Unsupervised (possibly Static) and implements transform.

Special operations

If all the components are invertible unsupervised models (ie, implement inverse_transform) then inverse_transform is implemented for the pipeline. If there are no supervised models, then predict is nevertheless implemented, assuming the last component is a model that implements it (some clustering models). Similarly, calling transform on a supervised pipeline calls transform on the supervised component.

Optional key-word arguments

Warning

Set cache=false to guarantee data anonymization.

To build more complicated non-branching pipelines, refer to the MLJ manual sections on composing models.

source diff --git a/dev/list_of_supported_models/index.html b/dev/list_of_supported_models/index.html index 21f1e3b88..8a7ea1650 100644 --- a/dev/list_of_supported_models/index.html +++ b/dev/list_of_supported_models/index.html @@ -1,2 +1,2 @@ -List of Supported Models · MLJ

List of Supported Models

For a list of models organized around function ("classification", "regression", etc.), see the Model Browser.

MLJ provides access to a wide variety of machine learning models. We are always looking for help adding new models or testing existing ones. Currently available models are listed below; for the most up-to-date list, run using MLJ; models().

Indications of "maturity" in the table below are approximate, surjective, and possibly out-of-date. A decision to use or not use a model in a critical application should be based on a user's independent assessment.

  • experimental: indicates the package is fairly new and/or is under active development; you can help by testing these packages and making them more robust,
  • low: indicates a package that has reached a roughly stable form in terms of interface and which is unlikely to contain serious bugs; it may be missing some functionality found in similar packages and has not yet benefited from a high level of use,
  • medium: indicates the package is fairly mature but may benefit from optimizations and/or extra features; you can help by suggesting either,
  • high: indicates the package is very mature and its functionality is expected to have been well optimized and tested.
Package | Interface Pkg | Models | Maturity | Note
BetaML.jl-DecisionTreeClassifier, RandomForestClassifier, NeuralNetworkClassifier, PerceptronClassifier, KernelPerceptronClassifier, PegasosClassifier, DecisionTreeRegressor, RandomForestRegressor, NeuralNetworkRegressor, MultitargetNeuralNetworkRegressor, GaussianMixtureRegressor, MultitargetGaussianMixtureRegressor, KMeansClusterer, KMedoidsClusterer, GaussianMixtureClusterer, SimpleImputer, GaussianMixtureImputer, RandomForestImputer, GeneralImputer, AutoEncodermedium
CatBoost.jl-CatBoostRegressor, CatBoostClassifierhigh
Clustering.jlMLJClusteringInterface.jlKMeans, KMedoids, DBSCAN, HierarchicalClusteringhigh²
DecisionTree.jlMLJDecisionTreeInterface.jlDecisionTreeClassifier, DecisionTreeRegressor, AdaBoostStumpClassifier, RandomForestClassifier, RandomForestRegressorhigh
EvoTrees.jl-EvoTreeRegressor, EvoTreeClassifier, EvoTreeCount, EvoTreeGaussian, EvoTreeMLEmediumtree-based gradient boosting models
EvoLinear.jl-EvoLinearRegressormediumlinear boosting models
GLM.jlMLJGLMInterface.jlLinearRegressor, LinearBinaryClassifier, LinearCountRegressormedium²
Imbalance.jl-RandomOversampler, RandomWalkOversampler, ROSE, SMOTE, BorderlineSMOTE1, SMOTEN, SMOTENC, RandomUndersampler, ClusterUndersampler, ENNUndersampler, TomekUndersampler,low
LIBSVM.jlMLJLIBSVMInterface.jlLinearSVC, SVC, NuSVC, NuSVR, EpsilonSVR, OneClassSVMhighalso via ScikitLearn.jl
LightGBM.jl-LGBMClassifier, LGBMRegressorhigh
Flux.jlMLJFlux.jlNeuralNetworkRegressor, NeuralNetworkClassifier, MultitargetNeuralNetworkRegressor, ImageClassifierlow
MLJBalancing.jl-BalancedBaggingClassifierlow
MLJLinearModels.jl-LinearRegressor, RidgeRegressor, LassoRegressor, ElasticNetRegressor, QuantileRegressor, HuberRegressor, RobustRegressor, LADRegressor, LogisticClassifier, MultinomialClassifiermedium
MLJModels.jl (built-in)-ConstantClassifier, ConstantRegressor, ContinuousEncoder, DeterministicConstantClassifier, DeterministicConstantRegressor, FeatureSelector, FillImputer, InteractionTransformer, OneHotEncoder, Standardizer, UnivariateBoxCoxTransformer, UnivariateDiscretizer, UnivariateFillImputer, UnivariateTimeTypeToContinuous, Standardizer, BinaryThreshholdPredictormedium
MLJText.jl-TfidfTransformer, BM25Transformer, CountTransformerlow
MultivariateStats.jlMLJMultivariateStatsInterface.jlLinearRegressor, MultitargetLinearRegressor, RidgeRegressor, MultitargetRidgeRegressor, PCA, KernelPCA, ICA, LDA, BayesianLDA, SubspaceLDA, BayesianSubspaceLDA, FactorAnalysis, PPCAhigh
NaiveBayes.jlMLJNaiveBayesInterface.jlGaussianNBClassifier, MultinomialNBClassifier, HybridNBClassifierlow
NearestNeighborModels.jl-KNNClassifier, KNNRegressor, MultitargetKNNClassifier, MultitargetKNNRegressorhigh
OneRule.jl-OneRuleClassifierexperimental
OutlierDetectionNeighbors.jl-ABODDetector, COFDetector, DNNDetector, KNNDetector, LOFDetectormedium
OutlierDetectionNetworks.jl-AEDetector, DSADDetector, ESADDetectormedium
OutlierDetectionPython.jl-ABODDetector, CBLOFDetector, CDDetector, COFDetector, COPODDetector, ECODDetector, GMMDetector, HBOSDetector, IForestDetector, INNEDetector, KDEDetector, KNNDetector, LMDDDetector, LOCIDetector, LODADetector, LOFDetector, MCDDetector, OCSVMDetector, PCADetector, RODDetector, SODDetector, SOSDetectorhigh
ParallelKMeans.jl-KMeansexperimental
PartialLeastSquaresRegressor.jl-PLSRegressor, KPLSRegressorexperimental
PartitionedLS.jl-PartLSlow
ScikitLearn.jlMLJScikitLearnInterface.jlARDRegressor, AdaBoostClassifier, AdaBoostRegressor, AffinityPropagation, AgglomerativeClustering, BaggingClassifier, BaggingRegressor, BayesianLDA, BayesianQDA, BayesianRidgeRegressor, BernoulliNBClassifier, Birch, ComplementNBClassifier, DBSCAN, DummyClassifier, DummyRegressor, ElasticNetCVRegressor, ElasticNetRegressor, ExtraTreesClassifier, ExtraTreesRegressor, FeatureAgglomeration, GaussianNBClassifier, GaussianProcessClassifier, GaussianProcessRegressor, GradientBoostingClassifier, GradientBoostingRegressor, HuberRegressor, KMeans, KNeighborsClassifier, KNeighborsRegressor, LarsCVRegressor, LarsRegressor, LassoCVRegressor, LassoLarsCVRegressor, LassoLarsICRegressor, LassoLarsRegressor, LassoRegressor, LinearRegressor, LogisticCVClassifier, LogisticClassifier, MeanShift, MiniBatchKMeans, MultiTaskElasticNetCVRegressor, MultiTaskElasticNetRegressor, MultiTaskLassoCVRegressor, MultiTaskLassoRegressor, MultinomialNBClassifier, OPTICS, OrthogonalMatchingPursuitCVRegressor, OrthogonalMatchingPursuitRegressor, PassiveAggressiveClassifier, PassiveAggressiveRegressor, PerceptronClassifier, ProbabilisticSGDClassifier, RANSACRegressor, RandomForestClassifier, RandomForestRegressor, RidgeCVClassifier, RidgeCVRegressor, RidgeClassifier, RidgeRegressor, SGDClassifier, SGDRegressor, SVMClassifier, SVMLClassifier, SVMLRegressor, SVMNuClassifier, SVMNuRegressor, SVMRegressor, SpectralClustering, TheilSenRegressorhigh²
SIRUS.jl-StableForestClassifier, StableForestRegressor, StableRulesClassifier, StableRulesRegressorlow
SymbolicRegression.jl-MultitargetSRRegressor, SRRegressorexperimental
TSVD.jlMLJTSVDInterface.jlTSVDTransformerhigh
XGBoost.jlMLJXGBoostInterface.jlXGBoostRegressor, XGBoostClassifier, XGBoostCounthigh

Notes

¹Models not in the MLJ registry are not included in integration tests. Consult package documentation to see how to load them. There may be issues loading these models simultaneously with other registered models.

²Some models are missing and assistance is welcome to complete the interface. Post a message on the Julia #mlj Slack channel if you would like to help, thanks!

+List of Supported Models · MLJ

List of Supported Models

For a list of models organized around function ("classification", "regression", etc.), see the Model Browser.

MLJ provides access to a wide variety of machine learning models. We are always looking for help adding new models or testing existing ones. Currently available models are listed below; for the most up-to-date list, run using MLJ; models().

Indications of "maturity" in the table below are approximate, surjective, and possibly out-of-date. A decision to use or not use a model in a critical application should be based on a user's independent assessment.

  • experimental: indicates the package is fairly new and/or is under active development; you can help by testing these packages and making them more robust,
  • low: indicates a package that has reached a roughly stable form in terms of interface and which is unlikely to contain serious bugs; it may be missing some functionality found in similar packages and has not yet benefited from a high level of use,
  • medium: indicates the package is fairly mature but may benefit from optimizations and/or extra features; you can help by suggesting either,
  • high: indicates the package is very mature and its functionality is expected to have been well optimized and tested.
Package | Interface Pkg | Models | Maturity | Note
BetaML.jl-DecisionTreeClassifier, RandomForestClassifier, NeuralNetworkClassifier, PerceptronClassifier, KernelPerceptronClassifier, PegasosClassifier, DecisionTreeRegressor, RandomForestRegressor, NeuralNetworkRegressor, MultitargetNeuralNetworkRegressor, GaussianMixtureRegressor, MultitargetGaussianMixtureRegressor, KMeansClusterer, KMedoidsClusterer, GaussianMixtureClusterer, SimpleImputer, GaussianMixtureImputer, RandomForestImputer, GeneralImputer, AutoEncodermedium
CatBoost.jl-CatBoostRegressor, CatBoostClassifierhigh
Clustering.jlMLJClusteringInterface.jlKMeans, KMedoids, DBSCAN, HierarchicalClusteringhigh²
DecisionTree.jlMLJDecisionTreeInterface.jlDecisionTreeClassifier, DecisionTreeRegressor, AdaBoostStumpClassifier, RandomForestClassifier, RandomForestRegressorhigh
EvoTrees.jl-EvoTreeRegressor, EvoTreeClassifier, EvoTreeCount, EvoTreeGaussian, EvoTreeMLEmediumtree-based gradient boosting models
EvoLinear.jl-EvoLinearRegressormediumlinear boosting models
GLM.jlMLJGLMInterface.jlLinearRegressor, LinearBinaryClassifier, LinearCountRegressormedium²
Imbalance.jl-RandomOversampler, RandomWalkOversampler, ROSE, SMOTE, BorderlineSMOTE1, SMOTEN, SMOTENC, RandomUndersampler, ClusterUndersampler, ENNUndersampler, TomekUndersampler,low
LIBSVM.jlMLJLIBSVMInterface.jlLinearSVC, SVC, NuSVC, NuSVR, EpsilonSVR, OneClassSVMhighalso via ScikitLearn.jl
LightGBM.jl-LGBMClassifier, LGBMRegressorhigh
FeatureSelector.jl-FeatureSelector, RecursiveFeatureEliminationlow
Flux.jlMLJFlux.jlNeuralNetworkRegressor, NeuralNetworkClassifier, MultitargetNeuralNetworkRegressor, ImageClassifierlow
MLJBalancing.jl-BalancedBaggingClassifierlow
MLJLinearModels.jl-LinearRegressor, RidgeRegressor, LassoRegressor, ElasticNetRegressor, QuantileRegressor, HuberRegressor, RobustRegressor, LADRegressor, LogisticClassifier, MultinomialClassifiermedium
MLJModels.jl (built-in)-ConstantClassifier, ConstantRegressor, ContinuousEncoder, DeterministicConstantClassifier, DeterministicConstantRegressor, FillImputer, InteractionTransformer, OneHotEncoder, Standardizer, UnivariateBoxCoxTransformer, UnivariateDiscretizer, UnivariateFillImputer, UnivariateTimeTypeToContinuous, Standardizer, BinaryThreshholdPredictormedium
MLJText.jl-TfidfTransformer, BM25Transformer, CountTransformerlow
MultivariateStats.jlMLJMultivariateStatsInterface.jlLinearRegressor, MultitargetLinearRegressor, RidgeRegressor, MultitargetRidgeRegressor, PCA, KernelPCA, ICA, LDA, BayesianLDA, SubspaceLDA, BayesianSubspaceLDA, FactorAnalysis, PPCAhigh
NaiveBayes.jlMLJNaiveBayesInterface.jlGaussianNBClassifier, MultinomialNBClassifier, HybridNBClassifierlow
NearestNeighborModels.jl-KNNClassifier, KNNRegressor, MultitargetKNNClassifier, MultitargetKNNRegressorhigh
OneRule.jl-OneRuleClassifierexperimental
OutlierDetectionNeighbors.jl-ABODDetector, COFDetector, DNNDetector, KNNDetector, LOFDetectormedium
OutlierDetectionNetworks.jl-AEDetector, DSADDetector, ESADDetectormedium
OutlierDetectionPython.jl-ABODDetector, CBLOFDetector, CDDetector, COFDetector, COPODDetector, ECODDetector, GMMDetector, HBOSDetector, IForestDetector, INNEDetector, KDEDetector, KNNDetector, LMDDDetector, LOCIDetector, LODADetector, LOFDetector, MCDDetector, OCSVMDetector, PCADetector, RODDetector, SODDetector, SOSDetectorhigh
ParallelKMeans.jl-KMeansexperimental
PartialLeastSquaresRegressor.jl-PLSRegressor, KPLSRegressorexperimental
PartitionedLS.jl-PartLSlow
ScikitLearn.jlMLJScikitLearnInterface.jlARDRegressor, AdaBoostClassifier, AdaBoostRegressor, AffinityPropagation, AgglomerativeClustering, BaggingClassifier, BaggingRegressor, BayesianLDA, BayesianQDA, BayesianRidgeRegressor, BernoulliNBClassifier, Birch, ComplementNBClassifier, DBSCAN, DummyClassifier, DummyRegressor, ElasticNetCVRegressor, ElasticNetRegressor, ExtraTreesClassifier, ExtraTreesRegressor, FeatureAgglomeration, GaussianNBClassifier, GaussianProcessClassifier, GaussianProcessRegressor, GradientBoostingClassifier, GradientBoostingRegressor, HuberRegressor, KMeans, KNeighborsClassifier, KNeighborsRegressor, LarsCVRegressor, LarsRegressor, LassoCVRegressor, LassoLarsCVRegressor, LassoLarsICRegressor, LassoLarsRegressor, LassoRegressor, LinearRegressor, LogisticCVClassifier, LogisticClassifier, MeanShift, MiniBatchKMeans, MultiTaskElasticNetCVRegressor, MultiTaskElasticNetRegressor, MultiTaskLassoCVRegressor, MultiTaskLassoRegressor, MultinomialNBClassifier, OPTICS, OrthogonalMatchingPursuitCVRegressor, OrthogonalMatchingPursuitRegressor, PassiveAggressiveClassifier, PassiveAggressiveRegressor, PerceptronClassifier, ProbabilisticSGDClassifier, RANSACRegressor, RandomForestClassifier, RandomForestRegressor, RidgeCVClassifier, RidgeCVRegressor, RidgeClassifier, RidgeRegressor, SGDClassifier, SGDRegressor, SVMClassifier, SVMLClassifier, SVMLRegressor, SVMNuClassifier, SVMNuRegressor, SVMRegressor, SpectralClustering, TheilSenRegressorhigh²
SIRUS.jl-StableForestClassifier, StableForestRegressor, StableRulesClassifier, StableRulesRegressorlow
SymbolicRegression.jl-MultitargetSRRegressor, SRRegressorexperimental
TSVD.jlMLJTSVDInterface.jlTSVDTransformerhigh
XGBoost.jlMLJXGBoostInterface.jlXGBoostRegressor, XGBoostClassifier, XGBoostCounthigh

Notes

¹Models not in the MLJ registry are not included in integration tests. Consult package documentation to see how to load them. There may be issues loading these models simultaneously with other registered models.

²Some models are missing and assistance is welcome to complete the interface. Post a message on the Julia #mlj Slack channel if you would like to help, thanks!

diff --git a/dev/loading_model_code/index.html b/dev/loading_model_code/index.html index 701333456..cb945c4e7 100644 --- a/dev/loading_model_code/index.html +++ b/dev/loading_model_code/index.html @@ -1,12 +1,12 @@ -Loading Model Code · MLJ

Loading Model Code

Once the name of a model, and the package providing that model, have been identified (see Model Search) one can either import the model type interactively with @iload, as shown under Installation, or use @load as shown below. The @load macro works from within a module, a package or a function, provided the relevant package providing the MLJ interface has been added to your package environment. It will attempt to load the model type into the global namespace of the module in which @load is invoked (Main if invoked at the REPL).

In general, the code providing core functionality for the model (living in a package you should consult for documentation) may be different from the package providing the MLJ interface. Since the core package is a dependency of the interface package, only the interface package needs to be added to your environment.

For instance, suppose you have activated a Julia package environment my_env that you wish to use for your MLJ project; for example, you have run:

using Pkg
+Loading Model Code · MLJ

Loading Model Code

Once the name of a model, and the package providing that model, have been identified (see Model Search) one can either import the model type interactively with @iload, as shown under Installation, or use @load as shown below. The @load macro works from within a module, a package or a function, provided the relevant package providing the MLJ interface has been added to your package environment. It will attempt to load the model type into the global namespace of the module in which @load is invoked (Main if invoked at the REPL).

In general, the code providing core functionality for the model (living in a package you should consult for documentation) may be different from the package providing the MLJ interface. Since the core package is a dependency of the interface package, only the interface package needs to be added to your environment.

For instance, suppose you have activated a Julia package environment my_env that you wish to use for your MLJ project; for example, you have run:

using Pkg
 Pkg.activate("my_env", shared=true)

Furthermore, suppose you want to use DecisionTreeClassifier, provided by the DecisionTree.jl package. Then, to determine which package provides the MLJ interface you call load_path:

julia> load_path("DecisionTreeClassifier", pkg="DecisionTree")
 "MLJDecisionTreeInterface.DecisionTreeClassifier"

In this case, we see that the package required is MLJDecisionTreeInterface.jl. If this package is not in my_env (do Pkg.status() to check) you add it by running

julia> Pkg.add("MLJDecisionTreeInterface")

So long as my_env is the active environment, this action need never be repeated (unless you run Pkg.rm("MLJDecisionTreeInterface")). You are now ready to instantiate a decision tree classifier:

julia> Tree = @load DecisionTreeClassifier pkg=DecisionTree
 julia> tree = Tree()

which is equivalent to

julia> import MLJDecisionTreeInterface.DecisionTreeClassifier
 julia> Tree = MLJDecisionTreeInterface.DecisionTreeClassifier
-julia> tree = Tree()

Tip. The specification pkg=... above can be dropped for the many models that are provided by only a single package.
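For example, according to the model table in this manual, EvoTreeRegressor is provided by a single package, so (assuming its interface package is in your environment) the following suffices:

EvoTree = @load EvoTreeRegressor   # no `pkg=...` needed
booster = EvoTree()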

API

StatisticalTraits.load_pathFunction
load_path(model_name::String, pkg=nothing)

Return the load path for the model type with name model_name, specifying the algorithm-providing package name pkg to resolve name conflicts, if necessary.

load_path(proxy::NamedTuple)

Return the load path for the model whose name is proxy.name and whose algorithm-providing package has name proxy.package_name. For example, proxy could be any element of the vector returned by models().

load_path(model)

Return the load path of a model instance or type. Usually requires necessary model code to have been separately loaded. Supply strings as above if code is not loaded.

source
MLJModels.@loadMacro
@load ModelName pkg=nothing verbosity=0 add=false

Import the model type named in the first argument into the calling module, specifying pkg in the case of an ambiguous name (that is, when more than one package provides a model type with that name). Returns the model type.

Warning In older versions of MLJ/MLJModels, @load returned an instance instead.

To automatically add required interface packages to the current environment, specify add=true. For interactive loading, use @iload instead.

Examples

Tree = @load DecisionTreeRegressor
+julia> tree = Tree()

Tip. The specification pkg=... above can be dropped for the many models that are provided by only a single package.

API

StatisticalTraits.load_pathFunction
load_path(model_name::String, pkg=nothing)

Return the load path for the model type with name model_name, specifying the algorithm-providing package name pkg to resolve name conflicts, if necessary.

load_path(proxy::NamedTuple)

Return the load path for the model whose name is proxy.name and whose algorithm-providing package has name proxy.package_name. For example, proxy could be any element of the vector returned by models().

load_path(model)

Return the load path of a model instance or type. Usually requires necessary model code to have been separately loaded. Supply strings as above if code is not loaded.

source
MLJModels.@loadMacro
@load ModelName pkg=nothing verbosity=0 add=false

Import the model type named in the first argument into the calling module, specifying pkg in the case of an ambiguous name (that is, when more than one package provides a model type with that name). Returns the model type.

Warning In older versions of MLJ/MLJModels, @load returned an instance instead.

To automatically add required interface packages to the current environment, specify add=true. For interactive loading, use @iload instead.

Examples

Tree = @load DecisionTreeRegressor
 tree = Tree()
 tree2 = Tree(min_samples_split=6)
 
 SVM = @load SVC pkg=LIBSVM
-svm = SVM()

See also @iload

source
MLJModels.@iloadMacro
@iload ModelName

Interactive alternative to @load. Provides the user with an option to install (add) the required interface package to the current environment, and to choose the relevant model-providing package in ambiguous cases. See @load.

source
+svm = SVM()

See also @iload

source
MLJModels.@iloadMacro
@iload ModelName

Interactive alternative to @load. Provides the user with an option to install (add) the required interface package to the current environment, and to choose the relevant model-providing package in ambiguous cases. See @load.

source
diff --git a/dev/logging_workflows/index.html b/dev/logging_workflows/index.html index fbce6b150..9fe7c2d58 100644 --- a/dev/logging_workflows/index.html +++ b/dev/logging_workflows/index.html @@ -1,2 +1,2 @@ -Logging Workflows · MLJ

Logging Workflows

MLflow integration

MLflow is a popular, language-agnostic tool for externally logging the outcomes of machine learning experiments, including those carried out using MLJ.

MLJ logging examples are given in the MLJFlow.jl documentation. MLJ includes and re-exports all the methods of MLJFlow.jl, so there is no need to import MLJFlow.jl if using MLJ.

Warning

MLJFlow.jl is a new package still under active development and should be regarded as experimental. At this time, breaking changes to MLJFlow.jl will not necessarily trigger new breaking releases of MLJ.jl.

+Logging Workflows using MLflow · MLJ

Logging Workflows

MLflow integration

MLflow is a popular, language-agnostic tool for externally logging the outcomes of machine learning experiments, including those carried out using MLJ.

MLJ logging examples are given in the MLJFlow.jl documentation. MLJ includes and re-exports all the methods of MLJFlow.jl, so there is no need to import MLJFlow.jl if using MLJ.

Warning

MLJFlow.jl is a new package still under active development and should be regarded as experimental. At this time, breaking changes to MLJFlow.jl will not necessarily trigger new breaking releases of MLJ.jl.

diff --git a/dev/machines/index.html b/dev/machines/index.html index 8ebcda5e9..3aa837e8f 100644 --- a/dev/machines/index.html +++ b/dev/machines/index.html @@ -1,13 +1,13 @@ -Machines · MLJ

Machines

Recall from Getting Started that a machine binds a model (i.e., a choice of algorithm + hyperparameters) to data (see more at Constructing machines below). A machine is also the object storing learned parameters. Under the hood, calling fit! on a machine calls either MLJBase.fit or MLJBase.update, depending on the machine's internal state (as recorded in private fields old_model and old_rows). These lower-level fit and update methods, which are not ordinarily called directly by the user, dispatch on the model and a view of the data defined by the optional rows keyword argument of fit! (all rows by default).

Warm restarts

If a model update method has been implemented for the model, calls to fit! will avoid redundant calculations for certain kinds of model mutations. The main use-case is increasing an iteration parameter, such as the number of epochs in a neural network. To test if SomeIterativeModel supports this feature, check iteration_parameter(SomeIterativeModel) is different from nothing.

tree = (@load DecisionTreeClassifier pkg=DecisionTree verbosity=0)()
+Machines · MLJ

Machines

Recall from Getting Started that a machine binds a model (i.e., a choice of algorithm + hyperparameters) to data (see more at Constructing machines below). A machine is also the object storing learned parameters. Under the hood, calling fit! on a machine calls either MLJBase.fit or MLJBase.update, depending on the machine's internal state (as recorded in private fields old_model and old_rows). These lower-level fit and update methods, which are not ordinarily called directly by the user, dispatch on the model and a view of the data defined by the optional rows keyword argument of fit! (all rows by default).

Warm restarts

If a model update method has been implemented for the model, calls to fit! will avoid redundant calculations for certain kinds of model mutations. The main use-case is increasing an iteration parameter, such as the number of epochs in a neural network. To test if SomeIterativeModel supports this feature, check iteration_parameter(SomeIterativeModel) is different from nothing.

tree = (@load DecisionTreeClassifier pkg=DecisionTree verbosity=0)()
 forest = EnsembleModel(model=tree, n=10);
 X, y = @load_iris;
 mach = machine(forest, X, y)
 fit!(mach, verbosity=2);
trained Machine; caches model-specific representations of data
   model: ProbabilisticEnsembleModel(model = DecisionTreeClassifier(max_depth = -1, …), …)
   args: 
-    1:	Source @731 ⏎ Table{AbstractVector{Continuous}}
-    2:	Source @957 ⏎ AbstractVector{Multiclass{3}}
+    1:	Source @750 ⏎ Table{AbstractVector{Continuous}}
+    2:	Source @992 ⏎ AbstractVector{Multiclass{3}}
 

Generally, changing a hyperparameter triggers retraining on calls to subsequent fit!:

julia> forest.bagging_fraction = 0.5;
julia> fit!(mach, verbosity=2);[ Info: Updating machine(ProbabilisticEnsembleModel(model = DecisionTreeClassifier(max_depth = -1, …), …), …). [ Info: Truncating existing ensemble.

However, for this iterative model, increasing the iteration parameter only adds models to the existing ensemble:

julia> forest.n = 15;
julia> fit!(mach, verbosity=2);[ Info: Updating machine(ProbabilisticEnsembleModel(model = DecisionTreeClassifier(max_depth = -1, …), …), …). [ Info: Building on existing ensemble of length 10 @@ -18,7 +18,7 @@ fit!(mach)
trained Machine; caches model-specific representations of data
   model: PCA(maxoutdim = 0, …)
   args: 
-    1:	Source @251 ⏎ Table{AbstractVector{Continuous}}
+    1:	Source @666 ⏎ Table{AbstractVector{Continuous}}
 
julia> fitted_params(mach)(projection = [-0.36158967738145 0.6565398832858296 0.5809972798276162; 0.08226888989221415 0.7297123713264985 -0.5964180879380994; -0.8565721052905275 -0.175767403428653 -0.07252407548695988; -0.3588439262482158 -0.07470647013503479 -0.5490609107266099],)
julia> report(mach)(indim = 4, outdim = 3, tprincipalvar = 4.545608248041779, @@ -35,7 +35,7 @@ julia> fitted_params(mach).logistic_classifier (classes = CategoricalArrays.CategoricalValue{String,UInt32}["B", "O"], coefs = Pair{Symbol,Float64}[:FL => 3.7095037897680405, :RW => 0.1135739140854546, :CL => -1.6036892745322038, :CW => -4.415667573486482, :BD => 3.238476051092471], - intercept = 0.0883301599726305,)

See also report

source
MLJBase.reportMethod
report(mach)

Return the report for a machine mach that has been fit!, for example the coefficients in a linear model.

This is a named tuple and human-readable if possible.

If mach is a machine for a composite model, such as a model constructed using the pipeline syntax model1 |> model2 |> ..., then the returned named tuple has the composite type's field names as keys. The corresponding value is the report for the machine in the underlying learning network bound to that model. (If multiple machines share the same model, then the value is a vector.)

julia> using MLJ
+ intercept = 0.0883301599726305,)

See also report

source
MLJBase.reportMethod
report(mach)

Return the report for a machine mach that has been fit!, for example the coefficients in a linear model.

This is a named tuple and human-readable if possible.

If mach is a machine for a composite model, such as a model constructed using the pipeline syntax model1 |> model2 |> ..., then the returned named tuple has the composite type's field names as keys. The corresponding value is the report for the machine in the underlying learning network bound to that model. (If multiple machines share the same model, then the value is a vector.)

julia> using MLJ
 julia> @load LinearBinaryClassifier pkg=GLM
 julia> X, y = @load_crabs;
 julia> pipe = Standardizer() |> LinearBinaryClassifier();
@@ -46,7 +46,7 @@
  dof_residual = 195.0,
  stderror = [18954.83496713119, 6502.845740757159, 48484.240246060406, 34971.131004997274, 20654.82322484894, 2111.1294584763386],
  vcov = [3.592857686311793e8 9.122732393971942e6 … -8.454645589364915e7 5.38856837634321e6; 9.122732393971942e6 4.228700272808351e7 … -4.978433790526467e7 -8.442545425533723e6; … ; -8.454645589364915e7 -4.978433790526467e7 … 4.2662172244975924e8 2.1799125705781363e7; 5.38856837634321e6 -8.442545425533723e6 … 2.1799125705781363e7 4.456867590446599e6],)
-

See also fitted_params

source

Training losses and feature importances

Training losses and feature importances, if reported by a model, will be available in the machine's report (see above). However, there are also direct access methods where supported:

training_losses(mach::Machine) -> vector_of_losses

Here vector_of_losses will be in historical order (most recent loss last). This kind of access is supported for model = mach.model if supports_training_losses(model) == true.

feature_importances(mach::Machine) -> vector_of_pairs

Here a vector_of_pairs is a vector of elements of the form feature => importance_value, where feature is a symbol. For example, vector_of_pairs = [:gender => 0.23, :height => 0.7, :weight => 0.1]. If a model does not support feature importances for some model hyperparameters, every importance_value will be zero. This kind of access is supported for model = mach.model if reports_feature_importances(model) == true.

If a model can report multiple types of feature importances, then there will be a model hyper-parameter controlling the active type.
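A usage sketch, where some_model is a hypothetical model instance with reports_feature_importances(some_model) == true, bound to data X, y:

mach = machine(some_model, X, y) |> fit!
feature_importances(mach)   # e.g. [:gender => 0.23, :height => 0.7, :weight => 0.1]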

Constructing machines

A machine is constructed with the syntax machine(model, args...) where the possibilities for args (called training arguments) are summarized in the table below. Here X and y represent inputs and target, respectively, and Xout is the output of a transform call. Machines for supervised models may have additional training arguments, such as a vector of per-observation weights (in which case supports_weights(model) == true).

model supertype | machine constructor calls | operation calls (first compulsory)
Deterministic <: Supervised | machine(model, X, y, extras...) | predict(mach, Xnew), transform(mach, Xnew), inverse_transform(mach, Xout)
Probabilistic <: Supervised | machine(model, X, y, extras...) | predict(mach, Xnew), predict_mean(mach, Xnew), predict_median(mach, Xnew), predict_mode(mach, Xnew), transform(mach, Xnew), inverse_transform(mach, Xout)
Unsupervised (except Static) | machine(model, X) | transform(mach, Xnew), inverse_transform(mach, Xout), predict(mach, Xnew)
Static | machine(model) | transform(mach, Xnews...), inverse_transform(mach, Xout)

All operations on machines (predict, transform, etc) have exactly one argument (Xnew or Xout above) after mach, the machine instance. An exception is a machine bound to a Static model, which can have any number of arguments after mach. For more on Static transformers (which have no training arguments) see Static transformers.

A machine is reconstructed from a file using the syntax machine("my_machine.jlso"), or machine("my_machine.jlso", args...) if retraining using new data. See Saving machines below.

Lowering memory demands

For large data sets, you may be able to save memory by suppressing data caching that some models perform to increase speed. To do this, specify cache=false, as in

machine(model, X, y, cache=false)

Constructing machines in learning networks

Instead of data X, y, etc, the machine constructor is provided Node or Source objects ("dynamic data") when building a learning network. See Learning Networks for more on this advanced feature.

Saving machines

Users can save and restore MLJ machines using any external serialization package by suitably preparing their Machine object, and applying a post-processing step to the deserialized object. This is explained under Using an arbitrary serializer below.

However, if a user is happy to use Julia's standard library Serialization module, there is a simplified workflow described first.

The usual serialization provisos apply. For example, when deserializing, you also need to have loaded all code on which the serialized object depends. If a hyper-parameter happens to be a user-defined function, then that function must be defined at deserialization. And you should only deserialize objects from trusted sources.

Using Julia's native serializer

MLJModelInterface.save - Function
MLJ.save(filename, mach::Machine)
 MLJ.save(io, mach::Machine)
 
 MLJBase.save(filename, mach::Machine)
 MLJ.save(io, mach)
 seekstart(io)
 predict_only_mach = machine(io)
predict(predict_only_mach, X)
Only load files from trusted sources

Maliciously constructed JLS files, like pickles, and most other general purpose serialization formats, can allow for arbitrary code execution during loading. This means it is possible for someone to use a JLS file that looks like a serialized MLJ machine as a Trojan horse.

See also serializable, machine.

source

Using an arbitrary serializer

Since machines contain training data, serializing a machine directly is not recommended. Also, the learned parameters of models implemented in a language other than Julia may not have persistent representations, which means serializing them is useless. To address these two issues, users first call serializable on the machine they wish to save, and then apply a serializer of their choice to the object returned.

To restore the original machine (minus training data), they deserialize and then call restore! on the result:

MLJBase.serializable - Function
serializable(mach::Machine)

Returns a shallow copy of the machine to make it serializable. In particular, all training data is removed and, if necessary, learned parameters are replaced with persistent representations.

Any general purpose Julia serializer may be applied to the output of serializable (eg, JLSO, BSON, JLD) but you must call restore!(mach) on the deserialised object mach before using it. See the example below.

If using Julia's standard Serialization library, a shorter workflow is available using the MLJBase.save (or MLJ.save) method.

A machine returned by serializable is characterized by the property mach.state == -1.

Example using JLSO

using MLJ
 using JLSO
 Tree = @load DecisionTreeClassifier
 tree = Tree()
 restore!(loaded_mach)
 
 predict(loaded_mach, X)
predict(mach, X)

See also restore!, MLJBase.save.

source
MLJBase.restore! - Function
restore!(mach::Machine)

Restore the state of a machine that is currently serializable but which may not be otherwise usable. For such a machine, mach, one has mach.state == -1. Intended for restoring deserialized machine objects to a usable form.

For an example see serializable.

source

Internals

For a supervised machine, the predict method calls a lower-level MLJBase.predict method, dispatched on the underlying model and the fitresult (see below). To see predict in action, as well as its unsupervised cousins transform and inverse_transform, see Getting Started.

Except for model, a Machine instance has several fields which the user should not directly access; these include:

The interested reader can learn more about machine internals by examining the simplified code excerpt in Internals.
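
Schematically (this is a sketch, not the actual source code), for a fitted supervised machine the operation reduces to something like the following, where fitresult is one of the internal fields mentioned above:

yhat = predict(mach, Xnew)
# is, in essence,
yhat = MLJBase.predict(mach.model, mach.fitresult, Xnew)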

API Reference

MLJBase.machine - Function
machine(model, args...; cache=true, scitype_check_level=1)

Construct a Machine object binding a model (storing hyper-parameters of some machine learning algorithm) to some data, args. Calling fit! on a Machine instance mach stores outcomes of applying the algorithm in mach, which can be inspected using fitted_params(mach) (learned parameters) and report(mach) (other outcomes). This in turn enables generalization to new data using operations such as predict or transform:

using MLJModels
 X, y = make_regression()
 
 PCA = @load PCA pkg=MultivariateStats
 X, y = make_blobs()
 mach = machine(:classifier, X, y)
 fit!(mach, composite=my_composite)

The last two lines are equivalent to

mach = machine(ConstantClassifier(), X, y)
fit!(mach)

Delaying model specification is used when exporting learning networks as new stand-alone model types. See prefit and the MLJ documentation on learning networks.

See also fit!, default_scitype_check_level, MLJBase.save, serializable.

source
StatsAPI.fit! - Function
fit!(mach::Machine, rows=nothing, verbosity=1, force=false, composite=nothing)

Fit the machine mach. In the case that mach has Node arguments, first train all other machines on which mach depends.

To attempt to fit a machine without touching any other machine, use fit_only!. For more on options and the internal logic of fitting, see fit_only!.

source
fit!(N::Node;
      rows=nothing,
      verbosity=1,
      force=false,
     acceleration=CPU1())

Train all machines required to call the node N, in an appropriate order, but parallelizing where possible using specified acceleration mode. These machines are those returned by machines(N).

Supported modes of acceleration: CPU1(), CPUThreads().

source
MLJBase.fit_only! - Function
MLJBase.fit_only!(
     mach::Machine;
     rows=nothing,
     verbosity=1,
     force=false,
     composite=nothing,
)

Without mutating any other machine on which it may depend, perform one of the following actions to the machine mach, using the data and model bound to it, and restricting the data to rows if specified:

  • Ab initio training. Ignoring any previous learned parameters and cache, compute and store new learned parameters. Increment mach.state.

  • Training update. Making use of previous learned parameters and/or cache, replace or mutate existing learned parameters. The effect is the same (or nearly the same) as in ab initio training, but may be faster or use less memory, assuming the model supports an update option (implements MLJBase.update). Increment mach.state.

  • No-operation. Leave existing learned parameters untouched. Do not increment mach.state.

If the model, model, bound to mach is a symbol, then instead perform the action using the true model given by getproperty(composite, model). See also machine.

Training action logic

For the action to be a no-operation, either mach.frozen == true or none of the following apply:

  • (i) mach has never been trained (mach.state == 0).

  • (ii) force == true.

  • (iii) The state of some other machine on which mach depends has changed since the last time mach was trained (i.e., since mach.state was last incremented).

  • (iv) The specified rows have changed since the last retraining and mach.model does not have Static type.

  • (v) mach.model is a model and different from the last model used for training, but has the same type.

  • (vi) mach.model is a model but has a type different from the last model used for training.

  • (vii) mach.model is a symbol and getproperty(composite, mach.model) is different from the last model used for training, but has the same type.

  • (viii) mach.model is a symbol and getproperty(composite, mach.model) has a different type from the last model used for training.

In any of the cases (i) - (iv), (vi), or (viii), mach is trained ab initio. If (v) or (vii) is true, then a training update is applied.

To freeze or unfreeze mach, use freeze!(mach) or thaw!(mach).
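
The following sketch, using a built-in model and synthetic data, illustrates a few of these cases (the comments indicate the expected action):

using MLJ
X, y = make_blobs(50, 2)
mach = machine(ConstantClassifier(), X, y)
fit!(mach)              # case (i): never trained, so trains ab initio
fit!(mach)              # no-operation: nothing relevant has changed
fit!(mach, rows=1:30)   # case (iv): the rows changed, so retrains
freeze!(mach)
fit!(mach)              # no-operation: the machine is frozen
thaw!(mach)
fit!(mach, force=true)  # case (ii): forced retraining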

Implementation details

The data to which a machine is bound is stored in mach.args. Each element of args is either a Node object, or, in the case that concrete data was bound to the machine, it is concrete data wrapped in a Source node. In all cases, to obtain concrete data for actual training, each argument N is called, as in N() or N(rows=rows), and either MLJBase.fit (ab initio training) or MLJBase.update (training update) is dispatched on mach.model and this data. See the "Adding models for general use" section of the MLJ documentation for more on these lower-level training methods.

source

MLJ Cheatsheet

Starting an interactive MLJ session

julia> using MLJ
julia> MLJ_VERSION # version of MLJ for this cheatsheet
v"0.20.6"

Model search and code loading

info("PCA") retrieves registry metadata for the model called "PCA"

info("RidgeRegressor", pkg="MultivariateStats") retrieves metadata for "RidgeRegresssor", which is provided by multiple packages

doc("DecisionTreeClassifier", pkg="DecisionTree") retrieves the model document string for the classifier, without loading model code

models() lists metadata of every registered model.

models("Tree") lists models with "Tree" in the model or package name.

models(x -> x.is_supervised && x.is_pure_julia) lists all supervised models written in pure julia.

models(matching(X)) lists all unsupervised models compatible with input X.

models(matching(X, y)) lists all supervised models compatible with input/target X/y.

With additional conditions:

models() do model
     matching(model, X, y) &&
     model.prediction_type == :probabilistic &&
     model.is_pure_julia
               !=(:Time);
               rng=123)

Here, y is assigned the :Exit column, and X is assigned the rest, except :Time.

Splitting row indices into train/validation/test, with seeded shuffling:

train, valid, test = partition(eachindex(y), 0.7, 0.2, rng=1234) # for 70:20:10 ratio

For a stratified split:

train, test = partition(eachindex(y), 0.8, stratify=y)

Split a table or matrix X, instead of indices:

Xtrain, Xvalid, Xtest = partition(X, 0.5, 0.3, rng=123)

Simultaneous splitting (needs multi=true):

(Xtrain, Xtest), (ytrain, ytest) = partition((X, y), 0.8, rng=123, multi=true)

Getting data from OpenML:

table = OpenML.load(91)

Creating synthetic classification data:

X, y = make_blobs(100, 2)

(also: make_moons, make_circles, make_regression)

Creating synthetic regression data:

X, y = make_regression(100, 2)

Machine construction

Supervised case:

model = KNNRegressor(K=1)
 mach = machine(model, X, y)

Unsupervised case:

model = OneHotEncoder()
mach = machine(model, X)

Fitting

The fit! function can be used to fit a machine (defaults shown):

fit!(mach, rows=1:100, verbosity=1, force=false)

Prediction

  • Supervised case: predict(mach, Xnew) or predict(mach, rows=1:100)

    For probabilistic models: predict_mode, predict_mean and predict_median.

  • Unsupervised case: W = transform(mach, Xnew) or inverse_transform(mach, W), etc.

Inspecting objects

info(ConstantRegressor()), info("PCA"), info("RidgeRegressor", pkg="MultivariateStats") gets all properties (aka traits) of registered models

schema(X) gets the column names, types, scitypes, and nrows of a table X

scitype(X) gets the scientific type of X

fitted_params(mach) gets learned parameters of the fitted machine

report(mach) gets other training results (e.g. feature rankings)

Saving and retrieving machines using Julia serializer

MLJ.save("my_machine.jls", mach) to save machine mach (without data)

predict_only_mach = machine("my_machine.jls") to deserialize.

Performance estimation

evaluate(model, X, y, resampling=CV(), measure=rms)
evaluate!(mach, resampling=Holdout(), measure=[rms, mav])
evaluate!(mach, resampling=[(fold1, fold2), (fold2, fold1)], measure=rms)

Resampling strategies (resampling=...)

Holdout(fraction_train=0.7, rng=1234) for simple holdout

CV(nfolds=6, rng=1234) for cross-validation

StratifiedCV(nfolds=6, rng=1234) for stratified cross-validation

TimeSeriesCV(nfolds=4) for time-series cross-validation

InSample(): test set = train set

or a list of pairs of row indices:

[(train1, eval1), (train2, eval2), ... (traink, evalk)]

Tuning model wrapper

tuned_model = TunedModel(model; tuning=RandomSearch(), resampling=Holdout(), measure=…, range=…)

Ranges for tuning (range=...)

If r = range(KNNRegressor(), :K, lower=1, upper = 20, scale=:log)

then Grid() search uses iterator(r, 6) == [1, 2, 3, 6, 11, 20].

lower=-Inf and upper=Inf are allowed.

Non-numeric ranges: r = range(model, :parameter, values=…)

Instead of model, declare type: r = range(Char, :c; values=['a', 'b'])

Nested ranges: Use dot syntax, as in r = range(EnsembleModel(atom=tree), :(atom.max_depth), ...)

Specify multiple ranges, as in range=[r1, r2, r3]. For more range options do ?Grid or ?RandomSearch
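
For instance, here is a sketch combining a range with the TunedModel wrapper (assumes NearestNeighborModels is installed and that X, y are a compatible table/target pair):

KNNRegressor = @load KNNRegressor pkg=NearestNeighborModels
knn = KNNRegressor()
r = range(knn, :K, lower=1, upper=20, scale=:log)
tuned_knn = TunedModel(knn; tuning=Grid(resolution=6), resampling=CV(nfolds=3), measure=rms, range=r)
mach = machine(tuned_knn, X, y) |> fit!
fitted_params(mach).best_model       # model with the best K found
report(mach).best_history_entry      # its evaluation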

Tuning strategies

RandomSearch(rng=1234) for basic random search

Grid(resolution=10) or Grid(goal=50) for basic grid search

Also available: LatinHyperCube, Explicit (built-in), MLJTreeParzenTuning, ParticleSwarm, AdaptiveParticleSwarm (3rd-party packages)

Learning curves

For generating a plot of performance against parameter specified by range:

curve = learning_curve(mach, resolution=30, resampling=Holdout(), measure=…, range=…, n=1)
curve = learning_curve(model, X, y, resolution=30, resampling=Holdout(), measure=…, range=…, n=1)

If using Plots.jl:

plot(curve.parameter_values, curve.measurements, xlab=curve.parameter_name, xscale=curve.parameter_scale)

Controlling iterative models

Requires: using MLJIteration

iterated_model = IteratedModel(model=…, resampling=Holdout(), measure=…, controls=…, retrain=false)

Controls

Increment training: Step(n=1)

Stopping: TimeLimit(t=0.5) (in hours), NumberLimit(n=100), NumberSinceBest(n=6), NotANumber(), Threshold(value=0.0), GL(alpha=2.0), PQ(alpha=0.75, k=5), Patience(n=5)

Logging: Info(f=identity), Warn(f=""), Error(predicate, f="")

Callbacks: Callback(f=mach->nothing), WithNumberDo(f=n->@info(n)), WithIterationsDo(f=i->@info("num iterations: $i")), WithLossDo(f=x->@info("loss: $x")), WithTrainingLossesDo(f=v->@info(v))

Snapshots: Save(filename="machine.jlso")

Wraps: MLJIteration.skip(control, predicate=1), IterationControl.with_state_do(control)
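
Putting a few of these pieces together, a sketch only (assumes EvoTrees and MLJIteration are in the active environment, and that X, y are suitable regression data):

using MLJ, MLJIteration
EvoTreeRegressor = @load EvoTreeRegressor pkg=EvoTrees
iterated = IteratedModel(model=EvoTreeRegressor(),
                         resampling=Holdout(fraction_train=0.8),
                         measure=rms,
                         controls=[Step(5), Patience(3), NumberLimit(100)])
mach = machine(iterated, X, y) |> fit!   # iterates until a stopping control triggers
predict(mach, X)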

Performance measures (metrics)

Do measures() to get full list.

Do measures("log") to list measures with "log" in doc-string.

Transformers

Built-ins include: Standardizer, OneHotEncoder, UnivariateBoxCoxTransformer, FeatureSelector, FillImputer, UnivariateDiscretizer, ContinuousEncoder, UnivariateTimeTypeToContinuous

Externals include: PCA (in MultivariateStats), KMeans, KMedoids (in Clustering).

models(m -> !m.is_supervised) to get full list

Ensemble model wrapper

EnsembleModel(model; weights=Float64[], bagging_fraction=0.8, rng=GLOBAL_RNG, n=100, parallel=true, out_of_bag_measure=[])

Target transformation wrapper

TransformedTargetModel(model; target=Standardizer())

Pipelines

pipe = (X -> coerce(X, :height=>Continuous)) |> OneHotEncoder |> KNNRegressor(K=3)
  • Unsupervised:

    pipe = Standardizer |> OneHotEncoder

  • Concatenation:

    pipe1 |> pipe2 or model |> pipe or pipe |> model, etc.

Advanced model composition techniques

See the Composing Models section of the MLJ manual.


Model Browser

Models may appear under multiple categories.

Below, an encoder is any transformer that does not fall under another category, such as "Missing Value Imputation" or "Dimension Reduction".

Categories

Regression | Classification | Outlier Detection | Iterative Models | Ensemble Models | Dimension Reduction | Clustering | Bayesian Models | Class Imbalance | Encoders | Meta Algorithms | Neural networks | Static Models | Missing Value Imputation | Distribution Fitter | Feature Engineering | Text Analysis | Image Processing

Regression

Classification

Outlier Detection

Iterative Models

Ensemble Models

Dimension Reduction

Clustering

Bayesian Models

Class Imbalance

Encoders

Meta Algorithms

Neural networks

Static Models

Missing Value Imputation

Distribution Fitter

Feature Engineering

Text Analysis

Image Processing


Model Search

MLJ has a model registry, allowing the user to search models and their properties, without loading all the packages containing model code. In turn, this allows one to efficiently find all models solving a given machine learning task. The task itself is specified with the help of the matching method, and the search executed with the models methods, as detailed below.

For commonly encountered problems with model search, see also Preparing Data.

A table of all models is also given at List of Supported Models.

Model metadata

Terminology. In this section the word "model" refers to a metadata entry in the model registry, as opposed to an actual model struct that such an entry represents. One can obtain such an entry with the info command:

julia> info("PCA")(name = "PCA",
  package_name = "MultivariateStats",
  is_supervised = false,
  abstract_type = Unsupervised,
  constructor = nothing,
  deep_properties = (),
  docstring = "```\nPCA\n```\n\nA model type for constructing a pca, ...",
  fit_data_scitype = Tuple{Table{<:AbstractVector{<:Continuous}}},
  transform_scitype = Table{<:AbstractVector{<:Continuous}},
  input_scitype = Table{<:AbstractVector{<:Continuous}},
  target_scitype = Unknown,
  output_scitype = Table{<:AbstractVector{<:Continuous}})

So a "model" in the present context is just a named tuple containing metadata, and not an actual model type or instance. If two models with the same name occur in different packages, the package name must be specified, as in info("LinearRegressor", pkg="GLM").

Model document strings can be retrieved, without importing the defining code, using the doc function:

doc("DecisionTreeClassifier", pkg="DecisionTree")

General model queries

We list all models (named tuples) using models(), and list the models for which code is already loaded with localmodels():

julia> localmodels()
59-element Vector{NamedTuple{(:name, :package_name, :is_supervised, :abstract_type, :constructor, :deep_properties, :docstring, :fit_data_scitype, :human_name, :hyperparameter_ranges, :hyperparameter_types, :hyperparameters, :implemented_methods, :inverse_transform_scitype, :is_pure_julia, :is_wrapper, :iteration_parameter, :load_path, :package_license, :package_url, :package_uuid, :predict_scitype, :prediction_type, :reporting_operations, :reports_feature_importances, :supports_class_weights, :supports_online, :supports_training_losses, :supports_weights, :transform_scitype, :input_scitype, :target_scitype, :output_scitype)}}:
  (name = AdaBoostStumpClassifier, package_name = DecisionTree, ... )
  (name = BayesianLDA, package_name = MultivariateStats, ... )
  (name = BayesianSubspaceLDA, package_name = MultivariateStats, ... )
  package_name = "MultivariateStats",
  is_supervised = true,
  abstract_type = Probabilistic,
  constructor = nothing,
  deep_properties = (),
  docstring = "```\nBayesianLDA\n```\n\nA model type for constructing...",
  fit_data_scitype =
  transform_scitype = Unknown,
  input_scitype = Table{<:AbstractVector{<:Continuous}},
  target_scitype = AbstractVector{<:Finite},
  output_scitype = Table{<:AbstractVector{<:Continuous}})

One can search for models containing specified strings or regular expressions in their docstring attributes, as in

julia> models("forest")12-element Vector{NamedTuple{(:name, :package_name, :is_supervised, :abstract_type, :constructor, :deep_properties, :docstring, :fit_data_scitype, :human_name, :hyperparameter_ranges, :hyperparameter_types, :hyperparameters, :implemented_methods, :inverse_transform_scitype, :is_pure_julia, :is_wrapper, :iteration_parameter, :load_path, :package_license, :package_url, :package_uuid, :predict_scitype, :prediction_type, :reporting_operations, :reports_feature_importances, :supports_class_weights, :supports_online, :supports_training_losses, :supports_weights, :transform_scitype, :input_scitype, :target_scitype, :output_scitype)}}:
  (name = GeneralImputer, package_name = BetaML, ... )
  (name = IForestDetector, package_name = OutlierDetectionPython, ... )
  (name = RandomForestClassifier, package_name = DecisionTree, ... )
  (name = StableRulesRegressor, package_name = SIRUS, ... )

or by specifying a filter (Bool-valued function):

julia> filter(model) = model.is_supervised &&
                        model.input_scitype >: MLJ.Table(Continuous) &&
                        model.target_scitype >: AbstractVector{<:Multiclass{3}} &&
                       model.prediction_type == :deterministic
filter (generic function with 1 method)
julia> models(filter)
12-element Vector{NamedTuple{(:name, :package_name, :is_supervised, :abstract_type, :constructor, :deep_properties, :docstring, :fit_data_scitype, :human_name, :hyperparameter_ranges, :hyperparameter_types, :hyperparameters, :implemented_methods, :inverse_transform_scitype, :is_pure_julia, :is_wrapper, :iteration_parameter, :load_path, :package_license, :package_url, :package_uuid, :predict_scitype, :prediction_type, :reporting_operations, :reports_feature_importances, :supports_class_weights, :supports_online, :supports_training_losses, :supports_weights, :transform_scitype, :input_scitype, :target_scitype, :output_scitype)}}:
 (name = DeterministicConstantClassifier, package_name = MLJModels, ... )
 (name = LinearSVC, package_name = LIBSVM, ... )
 (name = NuSVC, package_name = LIBSVM, ... )
 (name = SVMNuClassifier, package_name = MLJScikitLearnInterface, ... )

Multiple test arguments may be passed to models, which are applied conjunctively.

Matching models to data

Common searches are streamlined with the help of the matching command, defined as follows:

  • matching(model, X, y) == true exactly when model is supervised and admits inputs and targets with the scientific types of X and y, respectively

  • matching(model, X) == true exactly when model is unsupervised and admits inputs with the scientific types of X.

So, to search for all supervised probabilistic models handling input X and target y, one can define the testing function task by

task(model) = matching(model, X, y) && model.prediction_type == :probabilistic

And execute the search with

models(task)

Also defined are Bool-valued callable objects matching(model), matching(X, y) and matching(X), with obvious behavior. For example, matching(X, y)(model) = matching(model, X, y).

So, to search for all models compatible with input X and target y, for example, one executes

models(matching(X, y))

while the preceding search can also be written

models() do model
     matching(model, X, y) &&
     model.prediction_type == :probabilistic
end
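
A concrete end-to-end sketch with synthetic data (the models actually returned depend on which interface packages are registered):

using MLJ
X, y = make_blobs(200, 4)                  # table X, Multiclass target y
candidates = models(matching(X, y))        # supervised models compatible with X, y
probabilistic = models() do m
    matching(m, X, y) && m.prediction_type == :probabilistic
end
first(probabilistic).name                  # each entry is a metadata named tuple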

API

MLJModels.models - Function
models(; wrappers=false)

List all models in the MLJ registry. Here and below model means the registry metadata entry for a genuine model type (a proxy for types whose defining code may not be loaded). To include wrappers and other composite models, such as TunedModel and Stack, specify wrappers=true.

models(filters...; wrappers=false)

List all models m for which filter(m) is true, for each filter in filters.

models(matching(X, y); wrappers=false)

List all supervised models compatible with training data X, y.

models(matching(X); wrappers=false)

List all unsupervised models compatible with training data X.

Example

If

task(model) = model.is_supervised && model.is_probabilistic

then models(task) lists all supervised models making probabilistic predictions.

See also: localmodels.

source
models(needle::Union{AbstractString,Regex}; wrappers=false)

List all models whose name or docstring matches a given needle.

source
MLJModels.localmodels - Function
localmodels(; modl=Main, wrappers=false)
localmodels(filters...; modl=Main, wrappers=false)
localmodels(needle::Union{AbstractString,Regex}; modl=Main, wrappers=false)

List all models currently available to the user from the module modl without importing a package, and which additionally pass through the specified filters. Here a filter is a Bool-valued function on models.

Use load_path to get the path to some model returned, as in these examples:

ms = localmodels()
 model = ms[1]
load_path(model)

See also models, load_path.

source

Model Stacking

In a model stack, as introduced by Wolpert (1992), an adjudicating model learns the best way to combine the predictions of multiple base models. In MLJ, such models are constructed using the Stack constructor. To learn more about stacking and to see how to construct a stack "by hand" using Learning Networks, see this Data Science in Julia tutorial

MLJBase.Stack - Type
Stack(; metalearner=nothing, name1=model1, name2=model2, ..., keyword_options...)

Implements the two-layer generalized stack algorithm introduced by Wolpert (1992) and generalized by Van der Laan et al (2007). Returns an instance of type ProbabilisticStack or DeterministicStack, depending on the prediction type of metalearner.

When training a machine bound to such an instance:

  • The data is split into training/validation sets according to the specified resampling strategy.

  • Each base model model1, model2, ... is trained on each training subset and outputs predictions on the corresponding validation sets. The multi-fold predictions are spliced together into a so-called out-of-sample prediction for each model.

  • The adjudicating model, metalearner, is subsequently trained on the out-of-sample predictions to learn the best combination of base model predictions.

  • Each base model is retrained on all supplied data, so that its predictions on new production data can be passed on to the adjudicator when making new predictions.

Arguments

  • metalearner::Supervised: The model that will optimize the desired criterion based on its internals. For instance, a LinearRegression model will optimize the squared error.

  • resampling: The resampling strategy used to prepare out-of-sample predictions of the base learners.

  • measures: A measure or iterable over measures, to perform an internal evaluation of the learners in the Stack while training. This is not for the evaluation of the Stack itself.

  • cache: Whether machines created in the learning network will cache data or not.

  • acceleration: A supported AbstractResource to define the training parallelization mode of the stack.

  • name1=model1, name2=model2, ...: the Supervised model instances to be used as base learners. The provided names become properties of the created instance, to allow hyper-parameter access.

Example

The following code defines a DeterministicStack instance for learning a Continuous target, and demonstrates that:

  • Base models can be Probabilistic models even if the stack itself is Deterministic (predict_mean is applied in such cases).

  • As an alternative to hyperparameter optimization, one can stack multiple copies of a given model, mutating the hyper-parameter used in each copy.

using MLJ
 
 DecisionTreeRegressor = @load DecisionTreeRegressor pkg=DecisionTree
 EvoTreeRegressor = @load EvoTreeRegressor
 
 mach = machine(stack, X, y)
 evaluate!(mach; resampling=Holdout(), measure=rmse)

The internal evaluation report can be accessed like this and provides a PerformanceEvaluation object for each model:

report(mach).cv_report
source

ABODDetector

ABODDetector(k = 5,
              metric = Euclidean(),
              algorithm = :kdtree,
              static = :auto,
 detector = ABODDetector()
 X = rand(10, 100)
 model, result = fit(detector, X; verbosity=0)
test_scores = transform(detector, model, X)

References

[1] Kriegel, Hans-Peter; Schubert, Matthias; Zimek, Arthur (2008): Angle-based outlier detection in high-dimensional data.

[2] Li, Xiaojie; Lv, Jian Cheng; Cheng, Dongdong (2015): Angle-Based Outlier Detection Algorithm with More Stable Relationships.


ARDRegressor

ARDRegressor

A model type for constructing a Bayesian ARD regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ARDRegressor = @load ARDRegressor pkg=MLJScikitLearnInterface

Do model = ARDRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ARDRegressor(max_iter=...).

Hyper-parameters

  • max_iter = 300
  • tol = 0.001
  • alpha_1 = 1.0e-6
  • alpha_2 = 1.0e-6
  • lambda_1 = 1.0e-6
  • lambda_2 = 1.0e-6
  • compute_score = false
  • threshold_lambda = 10000.0
  • fit_intercept = true
  • copy_X = true
  • verbose = false

AdaBoostClassifier

AdaBoostClassifier

A model type for constructing an ada boost classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

AdaBoostClassifier = @load AdaBoostClassifier pkg=MLJScikitLearnInterface

Do model = AdaBoostClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in AdaBoostClassifier(estimator=...).

An AdaBoost classifier is a meta-estimator that begins by fitting a classifier on the original dataset and then fits additional copies of the classifier on the same dataset but where the weights of incorrectly classified instances are adjusted such that subsequent classifiers focus more on difficult cases.

This class implements the algorithm known as AdaBoost-SAMME.


AdaBoostRegressor

AdaBoostRegressor

A model type for constructing an AdaBoost ensemble regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

AdaBoostRegressor = @load AdaBoostRegressor pkg=MLJScikitLearnInterface

Do model = AdaBoostRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in AdaBoostRegressor(estimator=...).

An AdaBoost regressor is a meta-estimator that begins by fitting a regressor on the original dataset and then fits additional copies of the regressor on the same dataset but where the weights of instances are adjusted according to the error of the current prediction. As such, subsequent regressors focus more on difficult cases.

This class implements the algorithm known as AdaBoost.R2.

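A minimal usage sketch (not part of the original docstring), assuming MLJScikitLearnInterface and scikit-learn are installed, and using MLJ's make_regression helper for synthetic data:

using MLJ

AdaBoostRegressor = @load AdaBoostRegressor pkg=MLJScikitLearnInterface
model = AdaBoostRegressor()

X, y = make_regression(200, 4)     ## synthetic regression data
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)
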
diff --git a/dev/models/AdaBoostStumpClassifier_DecisionTree/index.html b/dev/models/AdaBoostStumpClassifier_DecisionTree/index.html index cb6355c27..01879b128 100644 --- a/dev/models/AdaBoostStumpClassifier_DecisionTree/index.html +++ b/dev/models/AdaBoostStumpClassifier_DecisionTree/index.html @@ -1,5 +1,5 @@ -AdaBoostStumpClassifier · MLJ

AdaBoostStumpClassifier

AdaBoostStumpClassifier

A model type for constructing an Ada-boosted stump classifier, based on DecisionTree.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

AdaBoostStumpClassifier = @load AdaBoostStumpClassifier pkg=DecisionTree

Do model = AdaBoostStumpClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in AdaBoostStumpClassifier(n_iter=...).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyperparameters

  • n_iter=10: number of iterations of AdaBoost
  • feature_importance: method to use for computing feature importances. One of (:impurity, :split)
  • rng=Random.GLOBAL_RNG: random number generator or seed

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above. Predictions are probabilistic, but uncalibrated.
  • predict_mode(mach, Xnew): instead return the mode of each prediction above.

Fitted Parameters

The fields of fitted_params(mach) are:

  • stumps: the Ensemble object returned by the core DecisionTree.jl algorithm.
  • coefficients: the stump coefficients (one per stump)

Report

The fields of report(mach) are:

  • features: the names of the features encountered in training

Accessor functions

  • feature_importances(mach) returns a vector of (feature::Symbol => importance) pairs; the type of importance is determined by the hyperparameter feature_importance (see above)

Examples

using MLJ
+AdaBoostStumpClassifier · MLJ

AdaBoostStumpClassifier

AdaBoostStumpClassifier

A model type for constructing an Ada-boosted stump classifier, based on DecisionTree.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

AdaBoostStumpClassifier = @load AdaBoostStumpClassifier pkg=DecisionTree

Do model = AdaBoostStumpClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in AdaBoostStumpClassifier(n_iter=...).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyperparameters

  • n_iter=10: number of iterations of AdaBoost
  • feature_importance: method to use for computing feature importances. One of (:impurity, :split)
  • rng=Random.GLOBAL_RNG: random number generator or seed

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above. Predictions are probabilistic, but uncalibrated.
  • predict_mode(mach, Xnew): instead return the mode of each prediction above.

Fitted Parameters

The fields of fitted_params(mach) are:

  • stumps: the Ensemble object returned by the core DecisionTree.jl algorithm.
  • coefficients: the stump coefficients (one per stump)

Report

The fields of report(mach) are:

  • features: the names of the features encountered in training

Accessor functions

  • feature_importances(mach) returns a vector of (feature::Symbol => importance) pairs; the type of importance is determined by the hyperparameter feature_importance (see above)

Examples

using MLJ
 Booster = @load AdaBoostStumpClassifier pkg=DecisionTree
 booster = Booster(n_iter=15)
 
@@ -16,4 +16,4 @@
 
 fitted_params(mach).stumps ## raw `Ensemble` object from DecisionTree.jl
 fitted_params(mach).coefs  ## coefficient associated with each stump
-feature_importances(mach)

See also DecisionTree.jl and the unwrapped model type MLJDecisionTreeInterface.DecisionTree.AdaBoostStumpClassifier.

+feature_importances(mach)

See also DecisionTree.jl and the unwrapped model type MLJDecisionTreeInterface.DecisionTree.AdaBoostStumpClassifier.

diff --git a/dev/models/AffinityPropagation_MLJScikitLearnInterface/index.html b/dev/models/AffinityPropagation_MLJScikitLearnInterface/index.html index 4315d803f..315bc1fc6 100644 --- a/dev/models/AffinityPropagation_MLJScikitLearnInterface/index.html +++ b/dev/models/AffinityPropagation_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -AffinityPropagation · MLJ

AffinityPropagation

AffinityPropagation

A model type for constructing an Affinity Propagation clustering of data, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

AffinityPropagation = @load AffinityPropagation pkg=MLJScikitLearnInterface

Do model = AffinityPropagation() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in AffinityPropagation(damping=...).

Hyper-parameters

  • damping = 0.5
  • max_iter = 200
  • convergence_iter = 15
  • copy = true
  • preference = nothing
  • affinity = euclidean
  • verbose = false
+AffinityPropagation · MLJ

AffinityPropagation

AffinityPropagation

A model type for constructing an Affinity Propagation clustering of data, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

AffinityPropagation = @load AffinityPropagation pkg=MLJScikitLearnInterface

Do model = AffinityPropagation() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in AffinityPropagation(damping=...).

Hyper-parameters

  • damping = 0.5
  • max_iter = 200
  • convergence_iter = 15
  • copy = true
  • preference = nothing
  • affinity = euclidean
  • verbose = false
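
A minimal usage sketch (not part of the original docstring), assuming MLJScikitLearnInterface and scikit-learn are installed; make_blobs is MLJ's synthetic clustering-data helper, and fitted_params is used generically to inspect the learned clustering:

using MLJ

AffinityPropagation = @load AffinityPropagation pkg=MLJScikitLearnInterface
model = AffinityPropagation(damping=0.6)   ## override the default damping of 0.5

X, _ = make_blobs(100, 2; centers=3)       ## synthetic clustering data
mach = machine(model, X) |> fit!
fitted_params(mach)                        ## inspect the learned clustering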
diff --git a/dev/models/AgglomerativeClustering_MLJScikitLearnInterface/index.html b/dev/models/AgglomerativeClustering_MLJScikitLearnInterface/index.html index 6bf4a10fe..367adb579 100644 --- a/dev/models/AgglomerativeClustering_MLJScikitLearnInterface/index.html +++ b/dev/models/AgglomerativeClustering_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -AgglomerativeClustering · MLJ

AgglomerativeClustering

AgglomerativeClustering

A model type for constructing an agglomerative clustering, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

AgglomerativeClustering = @load AgglomerativeClustering pkg=MLJScikitLearnInterface

Do model = AgglomerativeClustering() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in AgglomerativeClustering(n_clusters=...).

Recursively merges the pair of clusters that minimally increases a given linkage distance. Note: there is no predict or transform. Instead, inspect the fitted_params.

+AgglomerativeClustering · MLJ

AgglomerativeClustering

AgglomerativeClustering

A model type for constructing an agglomerative clustering, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

AgglomerativeClustering = @load AgglomerativeClustering pkg=MLJScikitLearnInterface

Do model = AgglomerativeClustering() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in AgglomerativeClustering(n_clusters=...).

Recursively merges the pair of clusters that minimally increases a given linkage distance. Note: there is no predict or transform. Instead, inspect the fitted_params.

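A minimal usage sketch (not part of the original docstring), assuming MLJScikitLearnInterface and scikit-learn are installed; as noted above, there is no predict or transform, so fitted_params is inspected after fitting:

using MLJ

AgglomerativeClustering = @load AgglomerativeClustering pkg=MLJScikitLearnInterface
model = AgglomerativeClustering(n_clusters=3)

X, _ = make_blobs(150, 2; centers=3)   ## synthetic clustering data
mach = machine(model, X) |> fit!
fitted_params(mach)                    ## inspect the learned cluster assignments
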
diff --git a/dev/models/AutoEncoder_BetaML/index.html b/dev/models/AutoEncoder_BetaML/index.html index 2033b0471..05f1c24eb 100644 --- a/dev/models/AutoEncoder_BetaML/index.html +++ b/dev/models/AutoEncoder_BetaML/index.html @@ -1,5 +1,5 @@ -AutoEncoder · MLJ

AutoEncoder

mutable struct AutoEncoder <: MLJModelInterface.Unsupervised

A ready-to-use AutoEncoder, from the Beta Machine Learning Toolkit (BetaML), for encoding and decoding of data using neural networks

Parameters:

  • encoded_size: The number of neurons (i.e. dimensions) of the encoded data. If the value is a float, it is considered a percentage (to be rounded) of the dimensionality of the data [def: 0.33]

  • layers_size: Inner layer dimension (i.e. number of neurons). If the value is a float, it is considered a percentage (to be rounded) of the dimensionality of the data [def: nothing, which applies a specific heuristic]. Consider that the underlying neural network is trying to predict multiple values at the same time. Normally this requires many more neurons than a scalar prediction. If e_layers or d_layers are specified, this parameter is ignored for the respective part.

  • e_layers: The layers (vector of AbstractLayers) responsible for the encoding of the data [def: nothing, i.e. two dense layers with the inner one of layers_size]. See subtypes(BetaML.AbstractLayer) for supported layers

  • d_layers: The layers (vector of AbstractLayers) responsible for the decoding of the data [def: nothing, i.e. two dense layers with the inner one of layers_size]. See subtypes(BetaML.AbstractLayer) for supported layers

  • loss: Loss (cost) function [def: BetaML.squared_cost]. It should always assume that y and ŷ are (n x d) matrices.

    Warning

    If you change the parameter loss, you need to either provide its derivative on the parameter dloss or use autodiff with dloss=nothing.

  • dloss: Derivative of the loss function [def: BetaML.dsquared_cost if loss==squared_cost, nothing otherwise, i.e. use the derivative of the squared cost or autodiff]

  • epochs: Number of epochs, i.e. passes through the whole training sample [def: 200]

  • batch_size: Size of each individual batch [def: 8]

  • opt_alg: The optimisation algorithm to update the gradient at each batch [def: BetaML.ADAM()] See subtypes(BetaML.OptimisationAlgorithm) for supported optimizers

  • shuffle: Whether to randomly shuffle the data at each iteration (epoch) [def: true]

  • tunemethod: The method - and its parameters - to employ for hyperparameter autotuning. See SuccessiveHalvingSearch for the default method. To implement automatic hyperparameter tuning during the (first) fit! call, simply set autotune=true and optionally change the default tunemethod options (including the parameter ranges, the resources to employ and the loss function to adopt).

  • descr: An optional title and/or description for this model

  • rng: Random Number Generator (see FIXEDSEED) [default: Random.GLOBAL_RNG]

Notes:

  • data must be numerical
  • use transform to obtain the encoded data, and inverse_transform to decode to the original data

Example:

julia> using MLJ
+AutoEncoder · MLJ

AutoEncoder

mutable struct AutoEncoder <: MLJModelInterface.Unsupervised

A ready-to-use AutoEncoder, from the Beta Machine Learning Toolkit (BetaML), for encoding and decoding of data using neural networks

Parameters:

  • encoded_size: The number of neurons (i.e. dimensions) of the encoded data. If the value is a float, it is considered a percentage (to be rounded) of the dimensionality of the data [def: 0.33]

  • layers_size: Inner layer dimension (i.e. number of neurons). If the value is a float, it is considered a percentage (to be rounded) of the dimensionality of the data [def: nothing, which applies a specific heuristic]. Consider that the underlying neural network is trying to predict multiple values at the same time. Normally this requires many more neurons than a scalar prediction. If e_layers or d_layers are specified, this parameter is ignored for the respective part.

  • e_layers: The layers (vector of AbstractLayers) responsible for the encoding of the data [def: nothing, i.e. two dense layers with the inner one of layers_size]. See subtypes(BetaML.AbstractLayer) for supported layers

  • d_layers: The layers (vector of AbstractLayers) responsible for the decoding of the data [def: nothing, i.e. two dense layers with the inner one of layers_size]. See subtypes(BetaML.AbstractLayer) for supported layers

  • loss: Loss (cost) function [def: BetaML.squared_cost]. It should always assume that y and ŷ are (n x d) matrices.

    Warning

    If you change the parameter loss, you need to either provide its derivative on the parameter dloss or use autodiff with dloss=nothing.

  • dloss: Derivative of the loss function [def: BetaML.dsquared_cost if loss==squared_cost, nothing otherwise, i.e. use the derivative of the squared cost or autodiff]

  • epochs: Number of epochs, i.e. passes through the whole training sample [def: 200]

  • batch_size: Size of each individual batch [def: 8]

  • opt_alg: The optimisation algorithm to update the gradient at each batch [def: BetaML.ADAM()] See subtypes(BetaML.OptimisationAlgorithm) for supported optimizers

  • shuffle: Whether to randomly shuffle the data at each iteration (epoch) [def: true]

  • tunemethod: The method - and its parameters - to employ for hyperparameter autotuning. See SuccessiveHalvingSearch for the default method. To implement automatic hyperparameter tuning during the (first) fit! call, simply set autotune=true and optionally change the default tunemethod options (including the parameter ranges, the resources to employ and the loss function to adopt).

  • descr: An optional title and/or description for this model

  • rng: Random Number Generator (see FIXEDSEED) [default: Random.GLOBAL_RNG]

Notes:

  • data must be numerical
  • use transform to obtain the encoded data, and inverse_transform to decode to the original data

Example:

julia> using MLJ
 
 julia> X, y        = @load_iris;
 
@@ -59,4 +59,4 @@
 julia> BetaML.relative_mean_error(MLJ.matrix(X),X_recovered)
 0.03387721261716176
 
-
+
diff --git a/dev/models/BM25Transformer_MLJText/index.html b/dev/models/BM25Transformer_MLJText/index.html index 8a6b87b96..a3c40615d 100644 --- a/dev/models/BM25Transformer_MLJText/index.html +++ b/dev/models/BM25Transformer_MLJText/index.html @@ -1,5 +1,5 @@ -BM25Transformer · MLJ

BM25Transformer

BM25Transformer

A model type for constructing a BM25 transformer, based on MLJText.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BM25Transformer = @load BM25Transformer pkg=MLJText

Do model = BM25Transformer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BM25Transformer(max_doc_freq=...).

The transformer converts a collection of documents, tokenized or pre-parsed as bags of words/ngrams, to a matrix of Okapi BM25 document-word statistics. The BM25 scoring function uses both term frequency (TF) and inverse document frequency (IDF, defined below), as in TfidfTransformer, but additionally adjusts for the probability that a user will consider a search result relevant based on the terms in the search query and those in each document.

In textbooks and implementations there is variation in the definition of IDF. Here two IDF definitions are available. The default, smoothed option provides the IDF for a term t as log((1 + n)/(1 + df(t))) + 1, where n is the total number of documents and df(t) the number of documents in which t appears. Setting smooth_idf = false provides an IDF of log(n/df(t)) + 1.

References:

  • http://ethen8181.github.io/machine-learning/search/bm25_intro.html
  • https://en.wikipedia.org/wiki/Okapi_BM25
  • https://nlp.stanford.edu/IR-book/html/htmledition/okapi-bm25-a-non-binary-model-1.html

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any vector whose elements are either tokenized documents or bags of words/ngrams. Specifically, each element is one of the following:

    • A vector of abstract strings (tokens), e.g., ["I", "like", "Sam", ".", "Sam", "is", "nice", "."] (scitype AbstractVector{Textual})
    • A dictionary of counts, indexed on abstract strings, e.g., Dict("I"=>1, "Sam"=>2, "Sam is"=>1) (scitype Multiset{Textual})
    • A dictionary of counts, indexed on plain ngrams, e.g., Dict(("I",)=>1, ("Sam",)=>2, ("I", "Sam")=>1) (scitype Multiset{<:NTuple{N,Textual} where N}); here a plain ngram is a tuple of abstract strings.

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • max_doc_freq=1.0: Restricts the vocabulary that the transformer will consider. Terms that occur in > max_doc_freq documents will not be considered by the transformer. For example, if max_doc_freq is set to 0.9, terms that are in more than 90% of the documents will be removed.
  • min_doc_freq=0.0: Restricts the vocabulary that the transformer will consider. Terms that occur in < min_doc_freq documents will not be considered by the transformer. A value of 0.01 means that only terms appearing in at least 1% of the documents will be included.
  • κ=2: The term frequency saturation characteristic. Higher values represent slower saturation. What we mean by saturation is the degree to which a term occurring extra times adds to the overall score.
  • β=0.75: Amplifies the particular document length compared to the average length. The bigger β is, the more document length is amplified in terms of the overall score. The default value is 0.75, and the bounds are restricted between 0 and 1.
  • smooth_idf=true: Controls which definition of IDF to use (see above).

Operations

  • transform(mach, Xnew): Based on the vocabulary, IDF, and mean word counts learned in training, return the matrix of BM25 scores for Xnew, a vector of the same form as X above. The matrix has size (n, p), where n = length(Xnew) and p the size of the vocabulary. Tokens/ngrams not appearing in the learned vocabulary are scored zero.

Fitted parameters

The fields of fitted_params(mach) are:

  • vocab: A vector containing the strings used in the transformer's vocabulary.
  • idf_vector: The transformer's calculated IDF vector.
  • mean_words_in_docs: The mean number of words in each document.

Examples

BM25Transformer accepts a variety of inputs. The example below transforms tokenized documents:

using MLJ
+BM25Transformer · MLJ

BM25Transformer

BM25Transformer

A model type for constructing a BM25 transformer, based on MLJText.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BM25Transformer = @load BM25Transformer pkg=MLJText

Do model = BM25Transformer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BM25Transformer(max_doc_freq=...).

The transformer converts a collection of documents, tokenized or pre-parsed as bags of words/ngrams, to a matrix of Okapi BM25 document-word statistics. The BM25 scoring function uses both term frequency (TF) and inverse document frequency (IDF, defined below), as in TfidfTransformer, but additionally adjusts for the probability that a user will consider a search result relevant based on the terms in the search query and those in each document.

In textbooks and implementations there is variation in the definition of IDF. Here two IDF definitions are available. The default, smoothed option provides the IDF for a term t as log((1 + n)/(1 + df(t))) + 1, where n is the total number of documents and df(t) the number of documents in which t appears. Setting smooth_idf = false provides an IDF of log(n/df(t)) + 1.

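For concreteness, the two IDF variants above can be computed directly; the counts n and df_t below are hypothetical and not part of the original docstring:

n = 100       ## total number of documents (hypothetical)
df_t = 12     ## number of documents containing term t (hypothetical)

idf_smoothed   = log((1 + n)/(1 + df_t)) + 1   ## default (smooth_idf = true)
idf_unsmoothed = log(n/df_t) + 1               ## smooth_idf = false
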
References:

  • http://ethen8181.github.io/machine-learning/search/bm25_intro.html
  • https://en.wikipedia.org/wiki/Okapi_BM25
  • https://nlp.stanford.edu/IR-book/html/htmledition/okapi-bm25-a-non-binary-model-1.html

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any vector whose elements are either tokenized documents or bags of words/ngrams. Specifically, each element is one of the following:

    • A vector of abstract strings (tokens), e.g., ["I", "like", "Sam", ".", "Sam", "is", "nice", "."] (scitype AbstractVector{Textual})
    • A dictionary of counts, indexed on abstract strings, e.g., Dict("I"=>1, "Sam"=>2, "Sam is"=>1) (scitype Multiset{Textual})
    • A dictionary of counts, indexed on plain ngrams, e.g., Dict(("I",)=>1, ("Sam",)=>2, ("I", "Sam")=>1) (scitype Multiset{<:NTuple{N,Textual} where N}); here a plain ngram is a tuple of abstract strings.

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • max_doc_freq=1.0: Restricts the vocabulary that the transformer will consider. Terms that occur in > max_doc_freq documents will not be considered by the transformer. For example, if max_doc_freq is set to 0.9, terms that are in more than 90% of the documents will be removed.
  • min_doc_freq=0.0: Restricts the vocabulary that the transformer will consider. Terms that occur in < min_doc_freq documents will not be considered by the transformer. A value of 0.01 means that only terms appearing in at least 1% of the documents will be included.
  • κ=2: The term frequency saturation characteristic. Higher values represent slower saturation. What we mean by saturation is the degree to which a term occurring extra times adds to the overall score.
  • β=0.75: Amplifies the particular document length compared to the average length. The bigger β is, the more document length is amplified in terms of the overall score. The default value is 0.75, and the bounds are restricted between 0 and 1.
  • smooth_idf=true: Controls which definition of IDF to use (see above).

Operations

  • transform(mach, Xnew): Based on the vocabulary, IDF, and mean word counts learned in training, return the matrix of BM25 scores for Xnew, a vector of the same form as X above. The matrix has size (n, p), where n = length(Xnew) and p the size of the vocabulary. Tokens/ngrams not appearing in the learned vocabulary are scored zero.

Fitted parameters

The fields of fitted_params(mach) are:

  • vocab: A vector containing the strings used in the transformer's vocabulary.
  • idf_vector: The transformer's calculated IDF vector.
  • mean_words_in_docs: The mean number of words in each document.

Examples

BM25Transformer accepts a variety of inputs. The example below transforms tokenized documents:

using MLJ
 import TextAnalysis
 
 BM25Transformer = @load BM25Transformer pkg=MLJText
@@ -43,4 +43,4 @@
 MLJ.fit!(mach)
 fitted_params(mach)
 
-tfidf_mat = transform(mach, ngram_docs)

See also TfidfTransformer, CountTransformer

+tfidf_mat = transform(mach, ngram_docs)

See also TfidfTransformer, CountTransformer

diff --git a/dev/models/BaggingClassifier_MLJScikitLearnInterface/index.html b/dev/models/BaggingClassifier_MLJScikitLearnInterface/index.html index 6d587b450..e9ee6db73 100644 --- a/dev/models/BaggingClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/BaggingClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -BaggingClassifier · MLJ

BaggingClassifier

BaggingClassifier

A model type for constructing a bagging ensemble classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BaggingClassifier = @load BaggingClassifier pkg=MLJScikitLearnInterface

Do model = BaggingClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BaggingClassifier(estimator=...).

A Bagging classifier is an ensemble meta-estimator that fits base classifiers, each on a random subset of the original dataset, and then aggregates their individual predictions (either by voting or by averaging) to form a final prediction. Such a meta-estimator can typically be used as a way to reduce the variance of a black-box estimator (e.g., a decision tree), by introducing randomization into its construction procedure and then making an ensemble out of it.

+BaggingClassifier · MLJ

BaggingClassifier

BaggingClassifier

A model type for constructing a bagging ensemble classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BaggingClassifier = @load BaggingClassifier pkg=MLJScikitLearnInterface

Do model = BaggingClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BaggingClassifier(estimator=...).

A Bagging classifier is an ensemble meta-estimator that fits base classifiers, each on a random subset of the original dataset, and then aggregates their individual predictions (either by voting or by averaging) to form a final prediction. Such a meta-estimator can typically be used as a way to reduce the variance of a black-box estimator (e.g., a decision tree), by introducing randomization into its construction procedure and then making an ensemble out of it.

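A minimal usage sketch (not part of the original docstring), assuming MLJScikitLearnInterface and scikit-learn are installed and that the default base estimator (a decision tree in scikit-learn) is used:

using MLJ

BaggingClassifier = @load BaggingClassifier pkg=MLJScikitLearnInterface
model = BaggingClassifier()        ## default base estimator and settings

X, y = @load_iris
mach = machine(model, X, y) |> fit!
predict_mode(mach, X)              ## point predictions
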
diff --git a/dev/models/BaggingRegressor_MLJScikitLearnInterface/index.html b/dev/models/BaggingRegressor_MLJScikitLearnInterface/index.html index 7bbc690bd..5135980c4 100644 --- a/dev/models/BaggingRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/BaggingRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -BaggingRegressor · MLJ

BaggingRegressor

BaggingRegressor

A model type for constructing a bagging ensemble regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BaggingRegressor = @load BaggingRegressor pkg=MLJScikitLearnInterface

Do model = BaggingRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BaggingRegressor(estimator=...).

A Bagging regressor is an ensemble meta-estimator that fits base regressors, each on a random subset of the original dataset, and then aggregates their individual predictions (either by voting or by averaging) to form a final prediction. Such a meta-estimator can typically be used as a way to reduce the variance of a black-box estimator (e.g., a decision tree), by introducing randomization into its construction procedure and then making an ensemble out of it.

+BaggingRegressor · MLJ

BaggingRegressor

BaggingRegressor

A model type for constructing a bagging ensemble regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BaggingRegressor = @load BaggingRegressor pkg=MLJScikitLearnInterface

Do model = BaggingRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BaggingRegressor(estimator=...).

A Bagging regressor is an ensemble meta-estimator that fits base regressors, each on a random subset of the original dataset, and then aggregates their individual predictions (either by voting or by averaging) to form a final prediction. Such a meta-estimator can typically be used as a way to reduce the variance of a black-box estimator (e.g., a decision tree), by introducing randomization into its construction procedure and then making an ensemble out of it.

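A minimal usage sketch (not part of the original docstring), assuming MLJScikitLearnInterface and scikit-learn are installed, using MLJ's make_regression helper:

using MLJ

BaggingRegressor = @load BaggingRegressor pkg=MLJScikitLearnInterface
model = BaggingRegressor()

X, y = make_regression(150, 3)     ## synthetic regression data
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)
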
diff --git a/dev/models/BalancedBaggingClassifier_MLJBalancing/index.html b/dev/models/BalancedBaggingClassifier_MLJBalancing/index.html new file mode 100644 index 000000000..abe05faa8 --- /dev/null +++ b/dev/models/BalancedBaggingClassifier_MLJBalancing/index.html @@ -0,0 +1,27 @@ + +BalancedBaggingClassifier · MLJ

BalancedBaggingClassifier

BalancedBaggingClassifier

A model type for constructing a balanced bagging classifier, based on MLJBalancing.jl.

From MLJ, the type can be imported using

BalancedBaggingClassifier = @load BalancedBaggingClassifier pkg=MLJBalancing

Construct an instance with default hyper-parameters using the syntax bagging_model = BalancedBaggingClassifier(model=...)

Given a probabilistic classifier, BalancedBaggingClassifier performs bagging by undersampling only the majority-class data in each bag, so that each bag includes as many samples as the minority class. This approach was proposed, with an AdaBoost classifier whose output scores are averaged, in the paper Xu-Ying Liu, Jianxin Wu, & Zhi-Hua Zhou (2009). Exploratory Undersampling for Class-Imbalance Learning. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 39(2), 539–550.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: input features of a form supported by the model being wrapped (typically a table, e.g., a DataFrame; at a minimum, tables with Continuous columns are supported)
  • y: the binary target, which can be any AbstractVector where length(unique(y)) == 2

Train the machine with fit!(mach, rows=...).

Hyperparameters

  • model::Probabilistic: The classifier to use to train on each bag.
  • T::Integer=0: The number of bags to be used in the ensemble. If not given, will be set as the ratio between the frequency of the majority and minority classes. Can be later found in report(mach).
  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION>=1.7; otherwise, MersenneTwister is used.

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above. Predictions are probabilistic, but uncalibrated.

  • predict_mode(mach, Xnew): return the mode of each prediction above.

Example

using MLJ
+using Imbalance
+
+## Load base classifier and BalancedBaggingClassifier
+BalancedBaggingClassifier = @load BalancedBaggingClassifier pkg=MLJBalancing
+LogisticClassifier = @load LogisticClassifier pkg=MLJLinearModels verbosity=0
+
+## Construct the base classifier and use it to construct a BalancedBaggingClassifier
+logistic_model = LogisticClassifier()
+model = BalancedBaggingClassifier(model=logistic_model, T=5)
+
+## Load the data and train the BalancedBaggingClassifier
+X, y = Imbalance.generate_imbalanced_data(100, 5; num_vals_per_category = [3, 2],
+                                            class_probs = [0.9, 0.1],
+                                            type = "ColTable",
+                                            rng=42)
+julia> Imbalance.checkbalance(y)
+1: ▇▇▇▇▇▇▇▇▇▇ 16 (19.0%)
+0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 84 (100.0%)
+
+mach = machine(model, X, y) |> fit!
+
+## Predict using the trained model
+
+yhat = predict(mach, X)     ## probabilistic predictions
+predict_mode(mach, X)       ## point predictions
diff --git a/dev/models/BalancedModel_MLJBalancing/index.html b/dev/models/BalancedModel_MLJBalancing/index.html new file mode 100644 index 000000000..955418b02 --- /dev/null +++ b/dev/models/BalancedModel_MLJBalancing/index.html @@ -0,0 +1,23 @@ + +BalancedModel · MLJ

BalancedModel

BalancedModel(; model=nothing, balancer1=balancer_model1, balancer2=balancer_model2, ...)
+BalancedModel(model;  balancer1=balancer_model1, balancer2=balancer_model2, ...)

Given a classification model, and one or more balancer models that all implement the MLJModelInterface, BalancedModel wraps the balancers and the classifier together in a single sequential pipeline.

Operation

  • During training, data is first passed to balancer1 and the result is passed to balancer2 and so on; the result from the final balancer is then passed to the classifier for training.
  • During prediction, the balancers have no effect.

Arguments

  • model::Supervised: A classification model that implements the MLJModelInterface.
  • balancer1::Static=...: The first balancer model to pass the data to. This keyword argument can have any name.
  • balancer2::Static=...: The second balancer model to pass the data to. This keyword argument can have any name.
  • and so on for an arbitrary number of balancers.

Returns

  • An instance of type ProbabilisticBalancedModel or DeterministicBalancedModel, depending on the prediction type of model.

Example

using MLJ
+using Imbalance
+
+## generate data
+X, y = Imbalance.generate_imbalanced_data(1000, 5; class_probs=[0.2, 0.3, 0.5])
+
+## prepare classification and balancing models
+SMOTENC = @load SMOTENC pkg=Imbalance verbosity=0
+TomekUndersampler = @load TomekUndersampler pkg=Imbalance verbosity=0
+LogisticClassifier = @load LogisticClassifier pkg=MLJLinearModels verbosity=0
+
+oversampler = SMOTENC(k=5, ratios=1.0, rng=42)
+undersampler = TomekUndersampler(min_ratios=0.5, rng=42)
+logistic_model = LogisticClassifier()
+
+## wrap them in a BalancedModel
+balanced_model = BalancedModel(model=logistic_model, balancer1=oversampler, balancer2=undersampler)
+
+## now this behaves as a unified model that can be trained, validated, fine-tuned, etc.
+mach = machine(balanced_model, X, y)
+fit!(mach)
diff --git a/dev/models/BayesianLDA_MLJScikitLearnInterface/index.html b/dev/models/BayesianLDA_MLJScikitLearnInterface/index.html index 7d1b2ff61..6f75c63a8 100644 --- a/dev/models/BayesianLDA_MLJScikitLearnInterface/index.html +++ b/dev/models/BayesianLDA_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -BayesianLDA · MLJ

BayesianLDA

BayesianLDA

A model type for constructing a Bayesian linear discriminant analysis, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BayesianLDA = @load BayesianLDA pkg=MLJScikitLearnInterface

Do model = BayesianLDA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BayesianLDA(solver=...).

Hyper-parameters

  • solver = svd
  • shrinkage = nothing
  • priors = nothing
  • n_components = nothing
  • store_covariance = false
  • tol = 0.0001
  • covariance_estimator = nothing
+BayesianLDA · MLJ

BayesianLDA

BayesianLDA

A model type for constructing a Bayesian linear discriminant analysis, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BayesianLDA = @load BayesianLDA pkg=MLJScikitLearnInterface

Do model = BayesianLDA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BayesianLDA(solver=...).

Hyper-parameters

  • solver = svd
  • shrinkage = nothing
  • priors = nothing
  • n_components = nothing
  • store_covariance = false
  • tol = 0.0001
  • covariance_estimator = nothing
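
A minimal usage sketch (not part of the original docstring), assuming MLJScikitLearnInterface and scikit-learn are installed and that the wrapped model yields probabilistic predictions:

using MLJ

BayesianLDA = @load BayesianLDA pkg=MLJScikitLearnInterface
model = BayesianLDA()              ## default solver (svd)

X, y = @load_iris
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)            ## probabilistic predictions
predict_mode(mach, X)              ## point predictions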
diff --git a/dev/models/BayesianLDA_MultivariateStats/index.html b/dev/models/BayesianLDA_MultivariateStats/index.html index a0b6fef18..694c3a6a8 100644 --- a/dev/models/BayesianLDA_MultivariateStats/index.html +++ b/dev/models/BayesianLDA_MultivariateStats/index.html @@ -1,5 +1,5 @@ -BayesianLDA · MLJ

BayesianLDA

BayesianLDA

A model type for constructing a Bayesian LDA model, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BayesianLDA = @load BayesianLDA pkg=MultivariateStats

Do model = BayesianLDA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BayesianLDA(method=...).

The Bayesian multiclass LDA algorithm learns a projection matrix as described in ordinary LDA. Predicted class posterior probability distributions are derived by applying Bayes' rule with a multivariate Gaussian class-conditional distribution. A prior class distribution can be specified by the user or inferred from training data class frequency.

See also the package documentation. For more information about the algorithm, see Li, Zhu and Ogihara (2006): Using Discriminant Analysis for Multi-class Classification: An Experimental Investigation.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any AbstractVector whose element scitype is OrderedFactor or Multiclass; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • method::Symbol=:gevd: choice of solver, one of :gevd or :whiten methods.
  • cov_w::StatsBase.SimpleCovariance(): An estimator for the within-class covariance (used in computing the within-class scatter matrix, Sw). Any robust estimator from CovarianceEstimation.jl can be used.
  • cov_b::StatsBase.SimpleCovariance(): The same as cov_w but for the between-class covariance (used in computing the between-class scatter matrix, Sb).
  • outdim::Int=0: The output dimension, i.e., dimension of the transformed space, automatically set to min(indim, nclasses-1) if equal to 0.
  • regcoef::Float64=1e-6: The regularization coefficient. A positive value regcoef*eigmax(Sw) where Sw is the within-class scatter matrix, is added to the diagonal of Sw to improve numerical stability. This can be useful if using the standard covariance estimator.
  • priors::Union{Nothing, UnivariateFinite{<:Any, <:Any, <:Any, <:Real}, Dict{<:Any, <:Real}} = nothing: For use in prediction with Bayes rule. If priors = nothing then priors are estimated from the class proportions in the training data. Otherwise it requires a Dict or UnivariateFinite object specifying the classes with non-zero probabilities in the training target.

Operations

  • transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.
  • predict(mach, Xnew): Return predictions of the target given features Xnew, which should have the same scitype as X above. Predictions are probabilistic but uncalibrated.
  • predict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • classes: The classes seen during model fitting.
  • projection_matrix: The learned projection matrix, of size (indim, outdim), where indim and outdim are the input and output dimensions respectively (See Report section below).
  • priors: The class priors for classification. As inferred from training target y, if not user-specified. A UnivariateFinite object with levels consistent with levels(y).

Report

The fields of report(mach) are:

  • indim: The dimension of the input space, i.e., the number of training features.
  • outdim: The dimension of the transformed space the model is projected to.
  • mean: The mean of the untransformed training data. A vector of length indim.
  • nclasses: The number of classes directly observed in the training data (which can be less than the total number of classes in the class pool).
  • class_means: The class-specific means of the training data. A matrix of size (indim, nclasses) with the ith column being the class-mean of the ith class in classes (See fitted params section above).
  • class_weights: The weights (class counts) of each class. A vector of length nclasses with the ith element being the class weight of the ith class in classes. (See fitted params section above.)
  • Sb: The between class scatter matrix.
  • Sw: The within class scatter matrix.

Examples

using MLJ
+BayesianLDA · MLJ

BayesianLDA

BayesianLDA

A model type for constructing a Bayesian LDA model, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BayesianLDA = @load BayesianLDA pkg=MultivariateStats

Do model = BayesianLDA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BayesianLDA(method=...).

The Bayesian multiclass LDA algorithm learns a projection matrix as described in ordinary LDA. Predicted class posterior probability distributions are derived by applying Bayes' rule with a multivariate Gaussian class-conditional distribution. A prior class distribution can be specified by the user or inferred from training data class frequency.

See also the package documentation. For more information about the algorithm, see Li, Zhu and Ogihara (2006): Using Discriminant Analysis for Multi-class Classification: An Experimental Investigation.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any AbstractVector whose element scitype is OrderedFactor or Multiclass; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • method::Symbol=:gevd: choice of solver, one of :gevd or :whiten methods.
  • cov_w::StatsBase.SimpleCovariance(): An estimator for the within-class covariance (used in computing the within-class scatter matrix, Sw). Any robust estimator from CovarianceEstimation.jl can be used.
  • cov_b::StatsBase.SimpleCovariance(): The same as cov_w but for the between-class covariance (used in computing the between-class scatter matrix, Sb).
  • outdim::Int=0: The output dimension, i.e., dimension of the transformed space, automatically set to min(indim, nclasses-1) if equal to 0.
  • regcoef::Float64=1e-6: The regularization coefficient. A positive value regcoef*eigmax(Sw) where Sw is the within-class scatter matrix, is added to the diagonal of Sw to improve numerical stability. This can be useful if using the standard covariance estimator.
  • priors::Union{Nothing, UnivariateFinite{<:Any, <:Any, <:Any, <:Real}, Dict{<:Any, <:Real}} = nothing: For use in prediction with Bayes rule. If priors = nothing then priors are estimated from the class proportions in the training data. Otherwise it requires a Dict or UnivariateFinite object specifying the classes with non-zero probabilities in the training target.

Operations

  • transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.
  • predict(mach, Xnew): Return predictions of the target given features Xnew, which should have the same scitype as X above. Predictions are probabilistic but uncalibrated.
  • predict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • classes: The classes seen during model fitting.
  • projection_matrix: The learned projection matrix, of size (indim, outdim), where indim and outdim are the input and output dimensions respectively (See Report section below).
  • priors: The class priors for classification. As inferred from training target y, if not user-specified. A UnivariateFinite object with levels consistent with levels(y).

Report

The fields of report(mach) are:

  • indim: The dimension of the input space, i.e., the number of training features.
  • outdim: The dimension of the transformed space the model is projected to.
  • mean: The mean of the untransformed training data. A vector of length indim.
  • nclasses: The number of classes directly observed in the training data (which can be less than the total number of classes in the class pool).
  • class_means: The class-specific means of the training data. A matrix of size (indim, nclasses) with the ith column being the class-mean of the ith class in classes (See fitted params section above).
  • class_weights: The weights (class counts) of each class. A vector of length nclasses with the ith element being the class weight of the ith class in classes. (See fitted params section above.)
  • Sb: The between class scatter matrix.
  • Sw: The within class scatter matrix.

Examples

using MLJ
 
 BayesianLDA = @load BayesianLDA pkg=MultivariateStats
 
@@ -10,4 +10,4 @@
 
 Xproj = transform(mach, X)
 y_hat = predict(mach, X)
-labels = predict_mode(mach, X)

See also LDA, SubspaceLDA, BayesianSubspaceLDA

+labels = predict_mode(mach, X)

See also LDA, SubspaceLDA, BayesianSubspaceLDA

diff --git a/dev/models/BayesianQDA_MLJScikitLearnInterface/index.html b/dev/models/BayesianQDA_MLJScikitLearnInterface/index.html index 93a5c8992..c9640c9df 100644 --- a/dev/models/BayesianQDA_MLJScikitLearnInterface/index.html +++ b/dev/models/BayesianQDA_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -BayesianQDA · MLJ

BayesianQDA

BayesianQDA

A model type for constructing a Bayesian quadratic discriminant analysis, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BayesianQDA = @load BayesianQDA pkg=MLJScikitLearnInterface

Do model = BayesianQDA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BayesianQDA(priors=...).

Hyper-parameters

  • priors = nothing
  • reg_param = 0.0
  • store_covariance = false
  • tol = 0.0001
+BayesianQDA · MLJ

BayesianQDA

BayesianQDA

A model type for constructing a Bayesian quadratic discriminant analysis, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BayesianQDA = @load BayesianQDA pkg=MLJScikitLearnInterface

Do model = BayesianQDA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BayesianQDA(priors=...).

Hyper-parameters

  • priors = nothing
  • reg_param = 0.0
  • store_covariance = false
  • tol = 0.0001
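
A minimal usage sketch (not part of the original docstring), assuming MLJScikitLearnInterface and scikit-learn are installed:

using MLJ

BayesianQDA = @load BayesianQDA pkg=MLJScikitLearnInterface
model = BayesianQDA()

X, y = @load_iris
mach = machine(model, X, y) |> fit!
predict_mode(mach, X)              ## point predictions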
diff --git a/dev/models/BayesianRidgeRegressor_MLJScikitLearnInterface/index.html b/dev/models/BayesianRidgeRegressor_MLJScikitLearnInterface/index.html index a0ac97db5..9c93d27f1 100644 --- a/dev/models/BayesianRidgeRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/BayesianRidgeRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -BayesianRidgeRegressor · MLJ

BayesianRidgeRegressor

BayesianRidgeRegressor

A model type for constructing a Bayesian ridge regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BayesianRidgeRegressor = @load BayesianRidgeRegressor pkg=MLJScikitLearnInterface

Do model = BayesianRidgeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BayesianRidgeRegressor(n_iter=...).

Hyper-parameters

  • n_iter = 300
  • tol = 0.001
  • alpha_1 = 1.0e-6
  • alpha_2 = 1.0e-6
  • lambda_1 = 1.0e-6
  • lambda_2 = 1.0e-6
  • compute_score = false
  • fit_intercept = true
  • copy_X = true
  • verbose = false
+BayesianRidgeRegressor · MLJ

BayesianRidgeRegressor

BayesianRidgeRegressor

A model type for constructing a Bayesian ridge regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BayesianRidgeRegressor = @load BayesianRidgeRegressor pkg=MLJScikitLearnInterface

Do model = BayesianRidgeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BayesianRidgeRegressor(max_iter=...).

Hyper-parameters

  • max_iter = 300
  • tol = 0.001
  • alpha_1 = 1.0e-6
  • alpha_2 = 1.0e-6
  • lambda_1 = 1.0e-6
  • lambda_2 = 1.0e-6
  • compute_score = false
  • fit_intercept = true
  • copy_X = true
  • verbose = false
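
A minimal usage sketch (not part of the original docstring), assuming MLJScikitLearnInterface and scikit-learn are installed, using MLJ's make_regression helper:

using MLJ

BayesianRidgeRegressor = @load BayesianRidgeRegressor pkg=MLJScikitLearnInterface
model = BayesianRidgeRegressor(max_iter=500)   ## override the default of 300

X, y = make_regression(100, 3)
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)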
diff --git a/dev/models/BayesianSubspaceLDA_MultivariateStats/index.html b/dev/models/BayesianSubspaceLDA_MultivariateStats/index.html index d3fb05a47..9e8c57508 100644 --- a/dev/models/BayesianSubspaceLDA_MultivariateStats/index.html +++ b/dev/models/BayesianSubspaceLDA_MultivariateStats/index.html @@ -1,5 +1,5 @@ -BayesianSubspaceLDA · MLJ

BayesianSubspaceLDA

BayesianSubspaceLDA

A model type for constructing a Bayesian subspace LDA model, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BayesianSubspaceLDA = @load BayesianSubspaceLDA pkg=MultivariateStats

Do model = BayesianSubspaceLDA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BayesianSubspaceLDA(normalize=...).

The Bayesian multiclass subspace linear discriminant analysis algorithm learns a projection matrix as described in SubspaceLDA. The posterior class probability distribution is derived as in BayesianLDA.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any AbstractVector whose element scitype is OrderedFactor or Multiclass; check the scitype with scitype(y).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • normalize=true: Option to normalize the between class variance for the number of observations in each class, one of true or false.

outdim: the output dimension, automatically set to min(indim, nclasses-1) if equal to 0. If a non-zero outdim is passed, then the actual output dimension used is min(rank, outdim) where rank is the rank of the within-class covariance matrix.

  • priors::Union{Nothing, UnivariateFinite{<:Any, <:Any, <:Any, <:Real}, Dict{<:Any, <:Real}} = nothing: For use in prediction with Bayes rule. If priors = nothing then priors are estimated from the class proportions in the training data. Otherwise it requires a Dict or UnivariateFinite object specifying the classes with non-zero probabilities in the training target.

Operations

  • transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.
  • predict(mach, Xnew): Return predictions of the target given features Xnew, which should have same scitype as X above. Predictions are probabilistic but uncalibrated.
  • predict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • classes: The classes seen during model fitting.
  • projection_matrix: The learned projection matrix, of size (indim, outdim), where indim and outdim are the input and output dimensions respectively (See Report section below).
  • priors: The class priors for classification. As inferred from training target y, if not user-specified. A UnivariateFinite object with levels consistent with levels(y).

Report

The fields of report(mach) are:

  • indim: The dimension of the input space, i.e., the number of training features.
  • outdim: The dimension of the transformed space the model is projected to.
  • mean: The overall mean of the training data.
  • nclasses: The number of classes directly observed in the training data (which can be less than the total number of classes in the class pool).

class_means: The class-specific means of the training data. A matrix of size (indim, nclasses) with the ith column being the class-mean of the ith class in classes (See fitted params section above).

  • class_weights: The weights (class counts) of each class. A vector of length nclasses with the ith element being the class weight of the ith class in classes. (See fitted params section above.)
  • explained_variance_ratio: The ratio of explained variance to total variance. Each dimension corresponds to an eigenvalue.

Examples

using MLJ
+BayesianSubspaceLDA · MLJ

BayesianSubspaceLDA

BayesianSubspaceLDA

A model type for constructing a Bayesian subspace LDA model, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BayesianSubspaceLDA = @load BayesianSubspaceLDA pkg=MultivariateStats

Do model = BayesianSubspaceLDA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BayesianSubspaceLDA(normalize=...).

The Bayesian multiclass subspace linear discriminant analysis algorithm learns a projection matrix as described in SubspaceLDA. The posterior class probability distribution is derived as in BayesianLDA.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any AbstractVector whose element scitype is OrderedFactor or Multiclass; check the scitype with scitype(y).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • normalize=true: Option to normalize the between class variance for the number of observations in each class, one of true or false.

outdim: the output dimension, automatically set to min(indim, nclasses-1) if equal to 0. If a non-zero outdim is passed, then the actual output dimension used is min(rank, outdim) where rank is the rank of the within-class covariance matrix.

  • priors::Union{Nothing, UnivariateFinite{<:Any, <:Any, <:Any, <:Real}, Dict{<:Any, <:Real}} = nothing: For use in prediction with Bayes rule. If priors = nothing then priors are estimated from the class proportions in the training data. Otherwise it requires a Dict or UnivariateFinite object specifying the classes with non-zero probabilities in the training target.

Operations

  • transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.
  • predict(mach, Xnew): Return predictions of the target given features Xnew, which should have same scitype as X above. Predictions are probabilistic but uncalibrated.
  • predict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • classes: The classes seen during model fitting.
  • projection_matrix: The learned projection matrix, of size (indim, outdim), where indim and outdim are the input and output dimensions respectively (See Report section below).
  • priors: The class priors for classification. As inferred from training target y, if not user-specified. A UnivariateFinite object with levels consistent with levels(y).

Report

The fields of report(mach) are:

  • indim: The dimension of the input space, i.e., the number of training features.
  • outdim: The dimension of the transformed space the model is projected to.
  • mean: The overall mean of the training data.
  • nclasses: The number of classes directly observed in the training data (which can be less than the total number of classes in the class pool).

class_means: The class-specific means of the training data. A matrix of size (indim, nclasses) with the ith column being the class-mean of the ith class in classes (See fitted params section above).

  • class_weights: The weights (class counts) of each class. A vector of length nclasses with the ith element being the class weight of the ith class in classes. (See fitted params section above.)
  • explained_variance_ratio: The ratio of explained variance to total variance. Each dimension corresponds to an eigenvalue.

Examples

using MLJ
 
 BayesianSubspaceLDA = @load BayesianSubspaceLDA pkg=MultivariateStats
 
@@ -10,4 +10,4 @@
 
 Xproj = transform(mach, X)
 y_hat = predict(mach, X)
-labels = predict_mode(mach, X)

See also LDA, BayesianLDA, SubspaceLDA

diff --git a/dev/models/BernoulliNBClassifier_MLJScikitLearnInterface/index.html b/dev/models/BernoulliNBClassifier_MLJScikitLearnInterface/index.html index f9a9ad8e4..27a8e3b1e 100644 --- a/dev/models/BernoulliNBClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/BernoulliNBClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -BernoulliNBClassifier · MLJ

BernoulliNBClassifier

BernoulliNBClassifier

A model type for constructing a Bernoulli naive Bayes classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BernoulliNBClassifier = @load BernoulliNBClassifier pkg=MLJScikitLearnInterface

Do model = BernoulliNBClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BernoulliNBClassifier(alpha=...).

Bernoulli (binomial) naive Bayes classifier, suitable for classification with binary features. Features will be binarized according to the binarize keyword (unless binarize is nothing, in which case the features are assumed to be already binary).

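A minimal usage sketch, with synthetic data invented purely for illustration (it assumes the scikit-learn backend used by MLJScikitLearnInterface is installed):

using MLJ

BernoulliNBClassifier = @load BernoulliNBClassifier pkg=MLJScikitLearnInterface

## two continuous features; the model binarizes them at the binarize threshold
X = (x1 = rand(100), x2 = rand(100))
y = coerce(rand(["ham", "spam"], 100), Multiclass)

model = BernoulliNBClassifier(alpha=1.0, binarize=0.5)
mach = machine(model, X, y) |> fit!

yhat = predict(mach, X)      ## probabilistic predictions
predict_mode(mach, X)        ## point predictions
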
diff --git a/dev/models/BinaryThresholdPredictor_MLJModels/index.html b/dev/models/BinaryThresholdPredictor_MLJModels/index.html new file mode 100644 index 000000000..757c5c2fc --- /dev/null +++ b/dev/models/BinaryThresholdPredictor_MLJModels/index.html @@ -0,0 +1,26 @@ + +BinaryThresholdPredictor · MLJ

BinaryThresholdPredictor

BinaryThresholdPredictor(model; threshold=0.5)

Wrap the Probabilistic model, model, assumed to support binary classification, as a Deterministic model, by applying the specified threshold to the positive class probability. In addition to conventional supervised classifiers, it can also be applied to outlier detection models that predict normalized scores - in the form of appropriate UnivariateFinite distributions - that is, models that subtype AbstractProbabilisticUnsupervisedDetector or AbstractProbabilisticSupervisedDetector.

By convention the positive class is the second class returned by levels(y), where y is the target.

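For instance, to inspect (and, if necessary, reorder) the target levels so that the intended class is treated as positive, a small sketch using CategoricalArrays:

using CategoricalArrays

y = categorical(["healthy", "sick", "healthy", "sick"])
levels(y)                        ## ["healthy", "sick"]; "sick" is the positive class
levels!(y, ["sick", "healthy"])  ## now "healthy" is the positive (second) class
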
If threshold=0.5 then calling predict on the wrapped model is equivalent to calling predict_mode on the atomic model.

Example

Below is an application to the well-known Pima Indian diabetes dataset, including optimization of the threshold parameter, with balanced accuracy as the objective. The target class distribution is 500 positives to 268 negatives.

Loading the data:

using MLJ, Random
+rng = Xoshiro(123)
+
+diabetes = OpenML.load(43582)
+outcome, X = unpack(diabetes, ==(:Outcome), rng=rng);
+y = coerce(Int.(outcome), OrderedFactor);

Choosing a probabilistic classifier:

EvoTreesClassifier = @load EvoTreesClassifier
+prob_predictor = EvoTreesClassifier()

Wrapping in BinaryThresholdPredictor to get a deterministic classifier with threshold as a new hyperparameter:

point_predictor = BinaryThresholdPredictor(prob_predictor, threshold=0.6)
+Xnew, _ = make_moons(3, rng=rng)
+mach = machine(point_predictor, X, y) |> fit!
+predict(mach, X)[1:3] ## [0, 0, 0]

Estimating performance:

balanced = BalancedAccuracy(adjusted=true)
+e = evaluate!(mach, resampling=CV(nfolds=6), measures=[balanced, accuracy])
+e.measurement[1] ## 0.405 ± 0.089

Wrapping in tuning strategy to learn threshold that maximizes balanced accuracy:

r = range(point_predictor, :threshold, lower=0.1, upper=0.9)
+tuned_point_predictor = TunedModel(
+    point_predictor,
+    tuning=RandomSearch(rng=rng),
+    resampling=CV(nfolds=6),
+    range = r,
+    measure=balanced,
+    n=30,
+)
+mach2 = machine(tuned_point_predictor, X, y) |> fit!
+optimized_point_predictor = report(mach2).best_model
+optimized_point_predictor.threshold ## 0.260
+predict(mach2, X)[1:3] ## [1, 1, 0]

Estimating the performance of the auto-thresholding model (nested resampling here):

e = evaluate!(mach2, resampling=CV(nfolds=6), measure=[balanced, accuracy])
+e.measurement[1] ## 0.477 ± 0.110
diff --git a/dev/models/Birch_MLJScikitLearnInterface/index.html b/dev/models/Birch_MLJScikitLearnInterface/index.html index 6448978cc..703236b90 100644 --- a/dev/models/Birch_MLJScikitLearnInterface/index.html +++ b/dev/models/Birch_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -Birch · MLJ

Birch

Birch

A model type for constructing a Birch clustering model, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

Birch = @load Birch pkg=MLJScikitLearnInterface

Do model = Birch() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in Birch(threshold=...).

Memory-efficient, online-learning algorithm provided as an alternative to MiniBatchKMeans. Note: noisy samples are given the label -1.

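No example is included above; the following minimal sketch uses synthetic data and illustrative hyper-parameter values, and assumes that, as for other scikit-learn clusterers wrapped by MLJ, predict returns the cluster assignments:

using MLJ

Birch = @load Birch pkg=MLJScikitLearnInterface

X, _ = make_blobs(300, 2; centers=3, rng=123)   ## synthetic data for illustration

model = Birch(n_clusters=3, threshold=0.5)
mach = machine(model, X) |> fit!
labels = predict(mach, X)   ## cluster assignments (assumed predict behavior)
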
diff --git a/dev/models/BisectingKMeans_MLJScikitLearnInterface/index.html b/dev/models/BisectingKMeans_MLJScikitLearnInterface/index.html index 7a52d7d1d..c6dcda0fc 100644 --- a/dev/models/BisectingKMeans_MLJScikitLearnInterface/index.html +++ b/dev/models/BisectingKMeans_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -BisectingKMeans · MLJ

BisectingKMeans

BisectingKMeans

A model type for constructing a bisecting k-means clustering model, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BisectingKMeans = @load BisectingKMeans pkg=MLJScikitLearnInterface

Do model = BisectingKMeans() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BisectingKMeans(n_clusters=...).

Bisecting K-Means clustering.

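As above, a minimal sketch with synthetic data (hyper-parameter values illustrative; the same assumption about predict as for Birch applies):

using MLJ

BisectingKMeans = @load BisectingKMeans pkg=MLJScikitLearnInterface

X, _ = make_blobs(300, 2; centers=4, rng=123)

model = BisectingKMeans(n_clusters=4)
mach = machine(model, X) |> fit!
labels = predict(mach, X)   ## cluster assignments (assumed predict behavior)
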
diff --git a/dev/models/BorderlineSMOTE1_Imbalance/index.html b/dev/models/BorderlineSMOTE1_Imbalance/index.html index 928da5a54..7579c6bdd 100644 --- a/dev/models/BorderlineSMOTE1_Imbalance/index.html +++ b/dev/models/BorderlineSMOTE1_Imbalance/index.html @@ -1,5 +1,5 @@ -BorderlineSMOTE1 · MLJ

BorderlineSMOTE1

Initiate a BorderlineSMOTE1 model with the given hyper-parameters.

BorderlineSMOTE1

A model type for constructing a BorderlineSMOTE1 oversampler, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BorderlineSMOTE1 = @load BorderlineSMOTE1 pkg=Imbalance

Do model = BorderlineSMOTE1() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BorderlineSMOTE1(m=...).

BorderlineSMOTE1 implements the BorderlineSMOTE1 algorithm to correct for class imbalance as in Han, H., Wang, W.-Y., & Mao, B.-H. (2005). Borderline-SMOTE: A new over-sampling method in imbalanced data sets learning. In D.S. Huang, X.-P. Zhang, & G.-B. Huang (Eds.), Advances in Intelligent Computing (pp. 878-887). Springer.

Training data

In MLJ or MLJBase, wrap the model in a machine by

mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by

model = BorderlineSMOTE1()

Hyperparameters

  • m::Integer=5: The number of neighbors to consider while checking the BorderlineSMOTE1 condition. Should be within the range 0 < m < N where N is the number of observations in the data. It will be automatically set to N-1 if N ≤ m.

  • k::Integer=5: Number of nearest neighbors to consider in the SMOTE part of the algorithm. Should be within the range 0 < k < n where n is the number of observations in the smallest class. It will be automatically set to l-1 for any class with l points where l ≤ k.

  • ratios=1.0: A parameter that controls the amount of oversampling to be done for each class (see the sketch following this list)

    • Can be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class
    • Can be a dictionary mapping each class label to the float ratio for that class
  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed, to be used with Xoshiro if the Julia VERSION supports it; otherwise, MersenneTwister is used.

  • verbosity::Integer=1: When greater than 0, information about the points that will participate in oversampling is logged.

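To make the ratios options concrete, here is a small constructor sketch; the class labels and ratios are invented for illustration:

using MLJ

BorderlineSMOTE1 = @load BorderlineSMOTE1 pkg=Imbalance

## oversample class "a" to 100% of the majority-class size and class "b" to 80% of it
oversampler = BorderlineSMOTE1(m=5, k=5, ratios=Dict("a" => 1.0, "b" => 0.8), rng=42)
mach = machine(oversampler)
## Xover, yover = transform(mach, X, y)   ## X, y as described under "Transform Inputs"
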
Transform Inputs

  • X: A matrix or table of floats where each row is an observation from the dataset
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • Xover: A matrix or table (depending on whether the input X is a matrix or table, respectively) that includes the original data and the new observations generated by oversampling
  • yover: An abstract vector of labels corresponding to Xover

Operations

  • transform(mach, X, y): resample the data X and y using BorderlineSMOTE1, returning both the new and original observations

Example

using MLJ
 import Imbalance
 
 ## set probability of each class
@@ -28,4 +28,4 @@
 julia> Imbalance.checkbalance(yover)
 2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 392 (80.0%) 
 1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 441 (90.0%) 
-0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 490 (100.0%) 
+0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 490 (100.0%)
diff --git a/dev/models/CBLOFDetector_OutlierDetectionPython/index.html b/dev/models/CBLOFDetector_OutlierDetectionPython/index.html index 568ef83a0..48525f422 100644 --- a/dev/models/CBLOFDetector_OutlierDetectionPython/index.html +++ b/dev/models/CBLOFDetector_OutlierDetectionPython/index.html @@ -1,7 +1,7 @@ -CBLOFDetector · MLJ

CBLOFDetector

CBLOFDetector(n_clusters = 8,
              ...
              n_jobs = 1)

https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.cblof

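Only the constructor and a link are given above. As a rough usage sketch (it assumes the PyOD backend is installed, and that, as for other OutlierDetectionPython detectors used via MLJ, transform returns the raw outlier scores):

using MLJ

CBLOFDetector = @load CBLOFDetector pkg=OutlierDetectionPython

X = MLJ.table(rand(100, 5))              ## 100 observations of 5 features, for illustration
detector = CBLOFDetector(n_clusters = 8)
mach = machine(detector, X) |> fit!
scores = transform(mach, X)              ## higher score = more anomalous (assumed convention)
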
diff --git a/dev/models/CDDetector_OutlierDetectionPython/index.html b/dev/models/CDDetector_OutlierDetectionPython/index.html index 66ebbff77..7251495cb 100644 --- a/dev/models/CDDetector_OutlierDetectionPython/index.html +++ b/dev/models/CDDetector_OutlierDetectionPython/index.html @@ -1,3 +1,3 @@ -CDDetector · MLJ
+CDDetector · MLJ
diff --git a/dev/models/COFDetector_OutlierDetectionNeighbors/index.html b/dev/models/COFDetector_OutlierDetectionNeighbors/index.html index b0398b6fa..26bbbb037 100644 --- a/dev/models/COFDetector_OutlierDetectionNeighbors/index.html +++ b/dev/models/COFDetector_OutlierDetectionNeighbors/index.html @@ -1,5 +1,5 @@ -COFDetector · MLJ

COFDetector

COFDetector(k = 5,
             metric = Euclidean(),
             algorithm = :kdtree,
             leafsize = 10,
@@ -8,4 +8,4 @@
 detector = COFDetector()
 X = rand(10, 100)
 model, result = fit(detector, X; verbosity=0)
-test_scores = transform(detector, model, X)

References

[1] Tang, Jian; Chen, Zhixiang; Fu, Ada Wai-Chee; Cheung, David Wai-Lok (2002): Enhancing Effectiveness of Outlier Detections for Low Density Patterns.

diff --git a/dev/models/COFDetector_OutlierDetectionPython/index.html b/dev/models/COFDetector_OutlierDetectionPython/index.html index ec66d6b31..b64880066 100644 --- a/dev/models/COFDetector_OutlierDetectionPython/index.html +++ b/dev/models/COFDetector_OutlierDetectionPython/index.html @@ -1,3 +1,3 @@ -COFDetector · MLJ
+COFDetector · MLJ
diff --git a/dev/models/COPODDetector_OutlierDetectionPython/index.html b/dev/models/COPODDetector_OutlierDetectionPython/index.html index 45e9c29a8..309193805 100644 --- a/dev/models/COPODDetector_OutlierDetectionPython/index.html +++ b/dev/models/COPODDetector_OutlierDetectionPython/index.html @@ -1,2 +1,2 @@ -COPODDetector · MLJ
+COPODDetector · MLJ
diff --git a/dev/models/CatBoostClassifier_CatBoost/index.html b/dev/models/CatBoostClassifier_CatBoost/index.html index 515549967..fc5254229 100644 --- a/dev/models/CatBoostClassifier_CatBoost/index.html +++ b/dev/models/CatBoostClassifier_CatBoost/index.html @@ -1,5 +1,5 @@ -CatBoostClassifier · MLJ

CatBoostClassifier

CatBoostClassifier

A model type for constructing a CatBoost classifier, based on CatBoost.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

CatBoostClassifier = @load CatBoostClassifier pkg=CatBoost

Do model = CatBoostClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in CatBoostClassifier(iterations=...).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, Finite, Textual; check column scitypes with schema(X). Textual columns will be passed to catboost as text_features, Multiclass columns will be passed to catboost as cat_features, and OrderedFactor columns will be converted to integers.
  • y: the target, which can be any AbstractVector whose element scitype is Finite; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyper-parameters

For more detail on the CatBoost hyperparameters, see the Python documentation: https://catboost.ai/en/docs/concepts/python-reference_catboostclassifier#parameters

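For instance, a constructor sketch using a few common CatBoost parameters (names follow the Python API; values are only illustrative):

CatBoostClassifier = @load CatBoostClassifier pkg=CatBoost
model = CatBoostClassifier(iterations=100, learning_rate=0.1, depth=6)
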
Operations

  • predict(mach, Xnew): probabilistic predictions of the target given new features Xnew having the same scitype as X above.
  • predict_mode(mach, Xnew): returns the mode of each of the predictions above.

Accessor functions

  • feature_importances(mach): return vector of feature importances, in the form of feature::Symbol => importance::Real pairs

Fitted parameters

The fields of fitted_params(mach) are:

  • model: The Python CatBoostClassifier model

Report

The fields of report(mach) are:

  • feature_importances: Vector{Pair{Symbol, Float64}} of feature importances

Examples

using CatBoost.MLJCatBoostInterface
 using MLJ
 
 X = (
@@ -13,4 +13,4 @@
 mach = machine(model, X, y)
 fit!(mach)
 probs = predict(mach, X)
-preds = predict_mode(mach, X)

See also catboost and the unwrapped model type CatBoost.CatBoostClassifier.

diff --git a/dev/models/CatBoostRegressor_CatBoost/index.html b/dev/models/CatBoostRegressor_CatBoost/index.html index 025f4e601..c53a4a28f 100644 --- a/dev/models/CatBoostRegressor_CatBoost/index.html +++ b/dev/models/CatBoostRegressor_CatBoost/index.html @@ -1,5 +1,5 @@ -CatBoostRegressor · MLJ

CatBoostRegressor

CatBoostRegressor

A model type for constructing a CatBoost regressor, based on CatBoost.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

CatBoostRegressor = @load CatBoostRegressor pkg=CatBoost

Do model = CatBoostRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in CatBoostRegressor(iterations=...).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, Finite, Textual; check column scitypes with schema(X). Textual columns will be passed to catboost as text_features, Multiclass columns will be passed to catboost as cat_features, and OrderedFactor columns will be converted to integers.
  • y: the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyper-parameters

For more detail on the CatBoost hyperparameters, see the Python documentation: https://catboost.ai/en/docs/concepts/python-reference_catboostclassifier#parameters

Operations

  • predict(mach, Xnew): probabilistic predictions of the target given new features Xnew having the same scitype as X above.

Accessor functions

  • feature_importances(mach): return vector of feature importances, in the form of feature::Symbol => importance::Real pairs

Fitted parameters

The fields of fitted_params(mach) are:

  • model: The Python CatBoostRegressor model

Report

The fields of report(mach) are:

  • feature_importances: Vector{Pair{Symbol, Float64}} of feature importances

Examples

using CatBoost.MLJCatBoostInterface
 using MLJ
 
 X = (
@@ -12,4 +12,4 @@
 model = CatBoostRegressor(iterations=5)
 mach = machine(model, X, y)
 fit!(mach)
-preds = predict(mach, X)

See also catboost and the unwrapped model type CatBoost.CatBoostRegressor.

diff --git a/dev/models/ClusterUndersampler_Imbalance/index.html b/dev/models/ClusterUndersampler_Imbalance/index.html index b88cb7ab8..8c4a74df8 100644 --- a/dev/models/ClusterUndersampler_Imbalance/index.html +++ b/dev/models/ClusterUndersampler_Imbalance/index.html @@ -1,5 +1,5 @@ -ClusterUndersampler · MLJ

ClusterUndersampler

Initiate a cluster undersampling model with the given hyper-parameters.

ClusterUndersampler

A model type for constructing a cluster undersampler, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ClusterUndersampler = @load ClusterUndersampler pkg=Imbalance

Do model = ClusterUndersampler() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ClusterUndersampler(mode=...).

ClusterUndersampler implements clustering undersampling as presented in Wei-Chao, L., Chih-Fong, T., Ya-Han, H., & Jing-Shang, J. (2017). Clustering-based undersampling in class-imbalanced data. Information Sciences, 409–410, 17–26, with K-means as the clustering algorithm.

Training data

In MLJ or MLJBase, wrap the model in a machine by mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed with model = ClusterUndersampler().

Hyperparameters

  • mode::AbstractString="nearest: If "center" then the undersampled data will consist of the centriods of
each cluster found; if `"nearest"` then it will consist of the nearest neighbor of each centroid.
  • ratios=1.0: A parameter that controls the amount of undersampling to be done for each class

    • Can be a float and in this case each class will be undersampled to the size of the minority class times the float. By default, all classes are undersampled to the size of the minority class
    • Can be a dictionary mapping each class label to the float ratio for that class
  • maxiter::Integer=100: Maximum number of iterations to run K-means

  • rng::Integer=42: Random number generator seed. Must be an integer.

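A small constructor sketch contrasting the two modes (class labels and ratios invented for illustration):

using MLJ

ClusterUndersampler = @load ClusterUndersampler pkg=Imbalance

## keep the observation nearest to each K-means centroid:
undersampler = ClusterUndersampler(mode="nearest", ratios=1.0, rng=42)

## or replace each cluster by its centroid (synthetic points):
undersampler = ClusterUndersampler(mode="center", ratios=Dict("a" => 1.0), rng=42)

mach = machine(undersampler)
## X_under, y_under = transform(mach, X, y)
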
Transform Inputs

  • X: A matrix or table of floats where each row is an observation from the dataset
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • X_under: A matrix or table (depending on whether the input X is a matrix or table, respectively) that includes the data after undersampling
  • y_under: An abstract vector of labels corresponding to X_under

Operations

  • transform(mach, X, y): resample the data X and y using ClusterUndersampler, returning the undersampled versions

Example

using MLJ
 import Imbalance
 
 ## set probability of each class
@@ -29,4 +29,4 @@
 julia> Imbalance.checkbalance(y_under; ref="minority")
 0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%) 
 2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%) 
-1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%)
+1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%)
diff --git a/dev/models/ComplementNBClassifier_MLJScikitLearnInterface/index.html b/dev/models/ComplementNBClassifier_MLJScikitLearnInterface/index.html index dbd09bcc1..8133be99a 100644 --- a/dev/models/ComplementNBClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/ComplementNBClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -ComplementNBClassifier · MLJ

ComplementNBClassifier

ComplementNBClassifier

A model type for constructing a Complement naive Bayes classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ComplementNBClassifier = @load ComplementNBClassifier pkg=MLJScikitLearnInterface

Do model = ComplementNBClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ComplementNBClassifier(alpha=...).

Similar to MultinomialNBClassifier but with more robust assumptions. Suited for imbalanced datasets.

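No example is given above; a minimal sketch with invented count-like data (the complement variant, like the multinomial one, expects non-negative features):

using MLJ

ComplementNBClassifier = @load ComplementNBClassifier pkg=MLJScikitLearnInterface

## toy term-count features for an imbalanced two-class problem
X = coerce((word_a = rand(0:10, 120), word_b = rand(0:10, 120)), Count => Continuous)
y = coerce(vcat(fill("rare", 20), fill("common", 100)), Multiclass)

model = ComplementNBClassifier(alpha=1.0)
mach = machine(model, X, y) |> fit!
predict_mode(mach, X)
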
diff --git a/dev/models/ConstantClassifier_MLJModels/index.html b/dev/models/ConstantClassifier_MLJModels/index.html index 845ee3e97..e473746ca 100644 --- a/dev/models/ConstantClassifier_MLJModels/index.html +++ b/dev/models/ConstantClassifier_MLJModels/index.html @@ -1,5 +1,5 @@ -ConstantClassifier · MLJ

ConstantClassifier

ConstantClassifier

This "dummy" probabilistic predictor always returns the same distribution, irrespective of the provided input pattern. The distribution d returned is the UnivariateFinite distribution based on frequency of classes observed in the training target data. So, pdf(d, level) is the number of times the training target takes on the value level. Use predict_mode instead of predict to obtain the training target mode instead. For more on the UnivariateFinite type, see the CategoricalDistributions.jl package.

Almost any reasonable model is expected to outperform ConstantClassifier, which is used almost exclusively for testing and establishing performance baselines.

In MLJ (or MLJModels) do model = ConstantClassifier() to construct an instance.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame)
  • y is the target, which can be any AbstractVector whose element scitype is Finite; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

None.

Operations

  • predict(mach, Xnew): Return predictions of the target given features Xnew (which for this model are ignored). Predictions are probabilistic.
  • predict_mode(mach, Xnew): Return the mode of the probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • target_distribution: The distribution fit to the supplied target data.

Examples

using MLJ
 
 clf = ConstantClassifier()
 
@@ -26,4 +26,4 @@
 pdf(yhat, L)
 
 ## point predictions:
-predict_mode(mach, Xnew)

See also ConstantRegressor

diff --git a/dev/models/ConstantRegressor_MLJModels/index.html b/dev/models/ConstantRegressor_MLJModels/index.html index 3bf1eedca..3e55a162b 100644 --- a/dev/models/ConstantRegressor_MLJModels/index.html +++ b/dev/models/ConstantRegressor_MLJModels/index.html @@ -1,5 +1,5 @@ -ConstantRegressor · MLJ

ConstantRegressor

ConstantRegressor

This "dummy" probabilistic predictor always returns the same distribution, irrespective of the provided input pattern. The distribution returned is the one of the type specified that best fits the training target data. Use predict_mean or predict_median to predict the mean or median values instead. If not specified, a normal distribution is fit.

Almost any reasonable model is expected to outperform ConstantRegressor which is used almost exclusively for testing and establishing performance baselines.

In MLJ (or MLJModels) do model = ConstantRegressor() or model = ConstantRegressor(distribution=...) to construct a model instance.

Training data

In MLJ (or MLJBase) bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame)
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • distribution_type=Distributions.Normal: The distribution to be fit to the target data. Must be a subtype of Distributions.ContinuousUnivariateDistribution.

Operations

  • predict(mach, Xnew): Return predictions of the target given features Xnew (which for this model are ignored). Predictions are probabilistic.
  • predict_mean(mach, Xnew): Return instead the means of the probabilistic predictions returned above.
  • predict_median(mach, Xnew): Return instead the medians of the probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • target_distribution: The distribution fit to the supplied target data.

Examples

using MLJ
 
 X, y = make_regression(10, 2) ## synthetic data: a table and vector
 regressor = ConstantRegressor()
@@ -10,4 +10,4 @@
 Xnew, _ = make_regression(3, 2)
 predict(mach, Xnew)
 predict_mean(mach, Xnew)
-

See also ConstantClassifier

diff --git a/dev/models/ContinuousEncoder_MLJModels/index.html b/dev/models/ContinuousEncoder_MLJModels/index.html index 27abef6c5..b37936bec 100644 --- a/dev/models/ContinuousEncoder_MLJModels/index.html +++ b/dev/models/ContinuousEncoder_MLJModels/index.html @@ -1,5 +1,5 @@ -ContinuousEncoder · MLJ

ContinuousEncoder

ContinuousEncoder

A model type for constructing a continuous encoder, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ContinuousEncoder = @load ContinuousEncoder pkg=MLJModels

Do model = ContinuousEncoder() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ContinuousEncoder(drop_last=...).

Use this model to arrange all features (columns) of a table to have Continuous element scitype, by applying the following protocol to each feature ftr:

  • If ftr is already Continuous retain it.
  • If ftr is Multiclass, one-hot encode it.
  • If ftr is OrderedFactor, replace it with coerce(ftr, Continuous) (vector of floating point integers), unless one_hot_ordered_factors=true is specified, in which case one-hot encode it.
  • If ftr is Count, replace it with coerce(ftr, Continuous).
  • If ftr has some other element scitype, or was not observed in fitting the encoder, drop it from the table.

Warning: This transformer assumes that levels(col) for any Multiclass or OrderedFactor column, col, is the same for training data and new data to be transformed.

To selectively one-hot-encode categorical features (without dropping columns) use OneHotEncoder instead.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any Tables.jl compatible table. Columns can be of mixed type but only those with element scitype Multiclass or OrderedFactor can be encoded. Check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • drop_last=true: whether to drop the column corresponding to the final class of one-hot encoded features. For example, a three-class feature is spawned into three new features if drop_last=false, but just two features otherwise (see the sketch following this list).
  • one_hot_ordered_factors=false: whether to one-hot any feature with OrderedFactor element scitype, or to instead coerce it directly to a (single) Continuous feature using the order

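A small sketch of the drop_last effect on a single three-level Multiclass feature (data invented for illustration):

using MLJ

X = (color = categorical(["red", "green", "blue", "green"]),)

mach = machine(ContinuousEncoder(drop_last=false), X) |> fit!
transform(mach, X)   ## one Continuous column per level of color

mach = machine(ContinuousEncoder(drop_last=true), X) |> fit!
transform(mach, X)   ## the column for the last level is dropped
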
Fitted parameters

The fields of fitted_params(mach) are:

  • features_to_keep: names of features that will not be dropped from the table
  • one_hot_encoder: the OneHotEncoder model instance for handling the one-hot encoding
  • one_hot_encoder_fitresult: the fitted parameters of the OneHotEncoder model

Report

  • features_to_keep: names of input features that will not be dropped from the table
  • new_features: names of all output features

Example

X = (name=categorical(["Danesh", "Lee", "Mary", "John"]),
      grade=categorical(["A", "B", "A", "C"], ordered=true),
      height=[1.85, 1.67, 1.5, 1.67],
      n_devices=[3, 2, 4, 3],
@@ -35,4 +35,4 @@
 julia> setdiff(schema(X).names, report(mach).features_to_keep) ## dropped features
 1-element Vector{Symbol}:
  :comments
-

See also OneHotEncoder

diff --git a/dev/models/CountTransformer_MLJText/index.html b/dev/models/CountTransformer_MLJText/index.html index 37fb17baa..927fa5ca4 100644 --- a/dev/models/CountTransformer_MLJText/index.html +++ b/dev/models/CountTransformer_MLJText/index.html @@ -1,5 +1,5 @@ -CountTransformer · MLJ

CountTransformer

CountTransformer

A model type for constructing a count transformer, based on MLJText.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

CountTransformer = @load CountTransformer pkg=MLJText

Do model = CountTransformer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in CountTransformer(max_doc_freq=...).

The transformer converts a collection of documents, tokenized or pre-parsed as bags of words/ngrams, to a matrix of term counts.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any vector whose elements are either tokenized documents or bags of words/ngrams. Specifically, each element is one of the following:

    • A vector of abstract strings (tokens), e.g., ["I", "like", "Sam", ".", "Sam", "is", "nice", "."] (scitype AbstractVector{Textual})
    • A dictionary of counts, indexed on abstract strings, e.g., Dict("I"=>1, "Sam"=>2, "Sam is"=>1) (scitype Multiset{Textual})
    • A dictionary of counts, indexed on plain ngrams, e.g., Dict(("I",)=>1, ("Sam",)=>2, ("I", "Sam")=>1) (scitype Multiset{<:NTuple{N,Textual} where N}); here a plain ngram is a tuple of abstract strings.

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • max_doc_freq=1.0: Restricts the vocabulary that the transformer will consider. Terms that occur in > max_doc_freq documents will not be considered by the transformer. For example, if max_doc_freq is set to 0.9, terms that are in more than 90% of the documents will be removed.
  • min_doc_freq=0.0: Restricts the vocabulary that the transformer will consider. Terms that occur in < min_doc_freq documents will not be considered by the transformer. A value of 0.01 means that only terms appearing in at least 1% of the documents will be included (see the sketch after this list).

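For instance, to keep only terms appearing in at least 1% and at most 90% of documents (values illustrative):

CountTransformer = @load CountTransformer pkg=MLJText
transformer = CountTransformer(max_doc_freq=0.9, min_doc_freq=0.01)
## mach = machine(transformer, docs) |> fit!   ## docs: a vector of tokenized documents
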
Operations

  • transform(mach, Xnew): Based on the vocabulary learned in training, return the matrix of counts for Xnew, a vector of the same form as X above. The matrix has size (n, p), where n = length(Xnew) and p the size of the vocabulary. Tokens/ngrams not appearing in the learned vocabulary are scored zero.

Fitted parameters

The fields of fitted_params(mach) are:

  • vocab: A vector containing the strings used in the transformer's vocabulary.

Examples

CountTransformer accepts a variety of inputs. The example below transforms tokenized documents:

using MLJ
 import TextAnalysis
 
 CountTransformer = @load CountTransformer pkg=MLJText
@@ -43,4 +43,4 @@
 MLJ.fit!(mach)
 fitted_params(mach)
 
-tfidf_mat = transform(mach, ngram_docs)

See also TfidfTransformer, BM25Transformer

diff --git a/dev/models/DBSCAN_Clustering/index.html b/dev/models/DBSCAN_Clustering/index.html index 04735daab..38f62eba6 100644 --- a/dev/models/DBSCAN_Clustering/index.html +++ b/dev/models/DBSCAN_Clustering/index.html @@ -1,5 +1,5 @@ -DBSCAN · MLJ

DBSCAN

DBSCAN

A model type for constructing a DBSCAN clusterer (density-based spatial clustering of applications with noise), based on Clustering.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

DBSCAN = @load DBSCAN pkg=Clustering

Do model = DBSCAN() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in DBSCAN(radius=...).

DBSCAN is a clustering algorithm that groups together points that are closely packed together (points with many nearby neighbors), marking as outliers points that lie alone in low-density regions (whose nearest neighbors are too far away). More information is available at the Clustering.jl documentation. Use predict to get cluster assignments. Point types - core, boundary or noise - are accessed from the machine report (see below).

This is a static implementation, i.e., it does not generalize to new data instances, and there is no training data. For clusterers that do generalize, see KMeans or KMedoids.

In MLJ or MLJBase, create a machine with

mach = machine(model)

Hyper-parameters

  • radius=1.0: query radius.
  • leafsize=20: number of points binned in each leaf node of the nearest neighbor k-d tree.
  • min_neighbors=1: minimum number of neighbors required for a core point.
  • min_cluster_size=1: minimum number of points in a valid cluster.

Operations

  • predict(mach, X): return cluster label assignments, as an unordered CategoricalVector. Here X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). Note that points of type noise will always get a label of 0.

Report

After calling predict(mach), the fields of report(mach) are:

  • point_types: A CategoricalVector with the DBSCAN point type classification, one element per row of X. Elements are either 'C' (core), 'B' (boundary), or 'N' (noise).

  • nclusters: The number of clusters (excluding the noise "cluster")

  • cluster_labels: The unique list of cluster labels

  • clusters: A vector of Clustering.DbscanCluster objects from Clustering.jl, which have these fields:

    • size: number of points in a cluster (core + boundary)
    • core_indices: indices of points in the cluster core
    • boundary_indices: indices of points on the cluster boundary

Examples

using MLJ
 
 X, labels  = make_moons(400, noise=0.09, rng=1) ## synthetic data with 2 clusters; X
 y = map(labels) do label
@@ -32,4 +32,4 @@
    :black
 end
 using Plots
 scatter(points, color=colors)
diff --git a/dev/models/DBSCAN_MLJScikitLearnInterface/index.html b/dev/models/DBSCAN_MLJScikitLearnInterface/index.html

DBSCAN

DBSCAN

A model type for constructing a dbscan, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

DBSCAN = @load DBSCAN pkg=MLJScikitLearnInterface

Do model = DBSCAN() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in DBSCAN(eps=...).

Density-Based Spatial Clustering of Applications with Noise. Finds core samples of high density and expands clusters from them. Good for data which contains clusters of similar density.
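
As a hedged construction sketch (the eps value is an illustrative assumption; eps is the keyword shown in the construction note above):

using MLJ
DBSCAN = @load DBSCAN pkg=MLJScikitLearnInterface
model = DBSCAN(eps=0.5)   ## eps: neighborhood radius used when finding core samples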

diff --git a/dev/models/DNNDetector_OutlierDetectionNeighbors/index.html b/dev/models/DNNDetector_OutlierDetectionNeighbors/index.html

DNNDetector

DNNDetector(d = 0,
             metric = Euclidean(),
             algorithm = :kdtree,
             leafsize = 10,
@@ -8,4 +8,4 @@
 detector = DNNDetector()
 X = rand(10, 100)
 model, result = fit(detector, X; verbosity=0)
 test_scores = transform(detector, model, X)
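
The snippet above uses the raw detector API. A hedged sketch of the corresponding MLJ machine workflow (the d value, table layout, and use of transform for scores are illustrative assumptions, not taken from this docstring):

using MLJ
DNNDetector = @load DNNDetector pkg=OutlierDetectionNeighbors
X = MLJ.table(rand(100, 3))            ## 100 observations with 3 Continuous features
mach = machine(DNNDetector(d=1.5), X)  ## d as in the signature above (default 0); 1.5 is illustrative
fit!(mach)
scores = transform(mach, X)            ## raw outlier scores, one per observation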

References

[1] Knorr, Edwin M.; Ng, Raymond T. (1998): Algorithms for Mining Distance-Based Outliers in Large Datasets.

diff --git a/dev/models/DecisionTreeClassifier_BetaML/index.html b/dev/models/DecisionTreeClassifier_BetaML/index.html

DecisionTreeClassifier

mutable struct DecisionTreeClassifier <: MLJModelInterface.Probabilistic

A simple Decision Tree model for classification with support for Missing data, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • max_depth::Int64: The maximum depth the tree is allowed to reach. When this is reached the node is forced to become a leaf [def: 0, i.e. no limits]
  • min_gain::Float64: The minimum information gain to allow for a node's partition [def: 0]
  • min_records::Int64: The minimum number of records a node must hold for a partition of it to be considered [def: 2]
  • max_features::Int64: The maximum number of (random) features to consider at each partitioning [def: 0, i.e. look at all features]
  • splitting_criterion::Function: This is the name of the function to be used to compute the information gain of a specific partition. This is done by measuring the difference between the "impurity" of the labels of the parent node and those of the two child nodes, weighted by the respective number of items. [def: gini]. Either gini, entropy or a custom function. It can also be an anonymous function.
  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]
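
Before the docstring's own example below, here is a minimal fit/predict sketch (the hyper-parameter value is an illustrative assumption):

using MLJ
DecisionTreeClassifier = @load DecisionTreeClassifier pkg=BetaML
X, y = @load_iris
mach = machine(DecisionTreeClassifier(max_depth=5), X, y)
fit!(mach)
yhat = predict(mach, X)      ## probabilistic predictions (UnivariateFinite)
predict_mode(mach, X)        ## point predictions (class labels)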

Example:

julia> using MLJ
 
 julia> X, y        = @load_iris;
 
@@ -27,4 +27,4 @@
  ⋮
  UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.0, virginica=>1.0)
  UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.0, virginica=>1.0)
 UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.0, virginica=>1.0)
diff --git a/dev/models/DecisionTreeClassifier_DecisionTree/index.html b/dev/models/DecisionTreeClassifier_DecisionTree/index.html

DecisionTreeClassifier

DecisionTreeClassifier

A model type for constructing a CART decision tree classifier, based on DecisionTree.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

DecisionTreeClassifier = @load DecisionTreeClassifier pkg=DecisionTree

Do model = DecisionTreeClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in DecisionTreeClassifier(max_depth=...).

DecisionTreeClassifier implements the CART algorithm, originally published in Breiman, Leo; Friedman, J. H.; Olshen, R. A.; Stone, C. J. (1984): "Classification and regression trees". Monterey, CA: Wadsworth & Brooks/Cole Advanced Books & Software..

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • max_depth=-1: max depth of the decision tree (-1=any)
  • min_samples_leaf=1: min number of samples each leaf needs to have
  • min_samples_split=2: min number of samples needed for a split
  • min_purity_increase=0: min purity needed for a split
  • n_subfeatures=0: number of features to select at random (0 for all)
  • post_prune=false: set to true for post-fit pruning
  • merge_purity_threshold=1.0: (post-pruning) merge leaves having combined purity >= merge_purity_threshold
  • display_depth=5: max depth to show when displaying the tree
  • feature_importance: method to use for computing feature importances. One of (:impurity, :split)
  • rng=Random.GLOBAL_RNG: random number generator or seed

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above. Predictions are probabilistic, but uncalibrated.
  • predict_mode(mach, Xnew): instead return the mode of each prediction above.

Fitted parameters

The fields of fitted_params(mach) are:

  • raw_tree: the raw Node, Leaf or Root object returned by the core DecisionTree.jl algorithm
  • tree: a visualizable, wrapped version of raw_tree implementing the AbstractTrees.jl interface; see "Examples" below
  • encoding: dictionary of target classes keyed on integers used internally by DecisionTree.jl
  • features: the names of the features encountered in training, in an order consistent with the output of print_tree (see below)

Report

The fields of report(mach) are:

  • classes_seen: list of target classes actually observed in training
  • print_tree: alternative method to print the fitted tree, with single argument the tree depth; interpretation requires internal integer-class encoding (see "Fitted parameters" above).
  • features: the names of the features encountered in training, in an order consistent with the output of print_tree (see below)

Accessor functions

  • feature_importances(mach) returns a vector of (feature::Symbol => importance) pairs; the type of importance is determined by the hyperparameter feature_importance (see above)
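
A brief sketch (illustrative hyper-parameter values; not part of the original docstring, whose fuller example follows below) of inspecting the fitted parameters, report, and importances described above:

using MLJ
DecisionTreeClassifier = @load DecisionTreeClassifier pkg=DecisionTree
X, y = @load_iris
mach = machine(DecisionTreeClassifier(max_depth=3), X, y) |> fit!
fitted_params(mach).tree      ## wrapped tree implementing the AbstractTrees.jl interface
report(mach).classes_seen     ## target classes observed in training
feature_importances(mach)     ## feature => importance pairs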

Examples

using MLJ
 DecisionTreeClassifier = @load DecisionTreeClassifier pkg=DecisionTree
 model = DecisionTreeClassifier(max_depth=3, min_samples_split=3)
 
@@ -28,4 +28,4 @@
 using Plots, TreeRecipe
 plot(tree) ## for a graphical representation of the tree
 
feature_importances(mach)

See also DecisionTree.jl and the unwrapped model type MLJDecisionTreeInterface.DecisionTree.DecisionTreeClassifier.

diff --git a/dev/models/DecisionTreeRegressor_BetaML/index.html b/dev/models/DecisionTreeRegressor_BetaML/index.html

DecisionTreeRegressor

mutable struct DecisionTreeRegressor <: MLJModelInterface.Deterministic

A simple Decision Tree model for regression with support for Missing data, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • max_depth::Int64: The maximum depth the tree is allowed to reach. When this is reached the node is forced to become a leaf [def: 0, i.e. no limits]
  • min_gain::Float64: The minimum information gain to allow for a node's partition [def: 0]
  • min_records::Int64: The minimum number of records a node must hold for a partition of it to be considered [def: 2]
  • max_features::Int64: The maximum number of (random) features to consider at each partitioning [def: 0, i.e. look at all features]
  • splitting_criterion::Function: This is the name of the function to be used to compute the information gain of a specific partition. This is done by measuring the difference between the "impurity" of the labels of the parent node and those of the two child nodes, weighted by the respective number of items. [def: variance]. Either variance or a custom function. It can also be an anonymous function.
  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]
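
A minimal fit/predict sketch (illustrative assumptions only; the docstring's own example follows below):

using MLJ
DecisionTreeRegressor = @load DecisionTreeRegressor pkg=BetaML
X, y = make_regression(100, 4)    ## synthetic Continuous features and target
mach = machine(DecisionTreeRegressor(max_depth=5), X, y)
fit!(mach)
yhat = predict(mach, X)           ## deterministic point predictions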

Example:

julia> using MLJ
 
 julia> X, y        = @load_boston;
 
@@ -30,4 +30,4 @@
   ⋮    
  23.9  23.75
  22.0  22.2
 11.9  13.2
diff --git a/dev/models/DecisionTreeRegressor_DecisionTree/index.html b/dev/models/DecisionTreeRegressor_DecisionTree/index.html

DecisionTreeRegressor

DecisionTreeRegressor

A model type for constructing a CART decision tree regressor, based on DecisionTree.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

DecisionTreeRegressor = @load DecisionTreeRegressor pkg=DecisionTree

Do model = DecisionTreeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in DecisionTreeRegressor(max_depth=...).

DecisionTreeRegressor implements the CART algorithm, originally published in Breiman, Leo; Friedman, J. H.; Olshen, R. A.; Stone, C. J. (1984): "Classification and regression trees". Monterey, CA: Wadsworth & Brooks/Cole Advanced Books & Software..

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyperparameters

  • max_depth=-1: max depth of the decision tree (-1=any)
  • min_samples_leaf=1: min number of samples each leaf needs to have
  • min_samples_split=2: min number of samples needed for a split
  • min_purity_increase=0: min purity needed for a split
  • n_subfeatures=0: number of features to select at random (0 for all)
  • post_prune=false: set to true for post-fit pruning
  • merge_purity_threshold=1.0: (post-pruning) merge leaves having combined purity >= merge_purity_threshold
  • feature_importance: method to use for computing feature importances. One of (:impurity, :split)
  • rng=Random.GLOBAL_RNG: random number generator or seed

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew having the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • tree: the tree or stump object returned by the core DecisionTree.jl algorithm
  • features: the names of the features encountered in training

Report

The fields of report(mach) are:

  • features: the names of the features encountered in training

Accessor functions

  • feature_importances(mach) returns a vector of (feature::Symbol => importance) pairs; the type of importance is determined by the hyperparameter feature_importance (see above)

Examples

using MLJ
 DecisionTreeRegressor = @load DecisionTreeRegressor pkg=DecisionTree
 model = DecisionTreeRegressor(max_depth=3, min_samples_split=3)
 
@@ -24,4 +24,4 @@
       ├─ -2.931299926506291 (0/11)
       └─ -4.726518740473489 (0/8)
 
feature_importances(mach) ## get feature importances

See also DecisionTree.jl and the unwrapped model type MLJDecisionTreeInterface.DecisionTree.DecisionTreeRegressor.

diff --git a/dev/models/DeterministicConstantClassifier_MLJModels/index.html b/dev/models/DeterministicConstantClassifier_MLJModels/index.html

DeterministicConstantClassifier

DeterministicConstantClassifier

A model type for constructing a deterministic constant classifier, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

DeterministicConstantClassifier = @load DeterministicConstantClassifier pkg=MLJModels

Do model = DeterministicConstantClassifier() to construct an instance with default hyper-parameters.
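
A hedged usage sketch (the "always predicts the most frequent training class" behavior is the standard description of a deterministic constant classifier, assumed here rather than stated in this docstring):

using MLJ
DeterministicConstantClassifier = @load DeterministicConstantClassifier pkg=MLJModels
X, y = @load_iris
mach = machine(DeterministicConstantClassifier(), X, y) |> fit!
predict(mach, X)    ## the same class label (assumed: the training mode) for every row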

diff --git a/dev/models/DeterministicConstantRegressor_MLJModels/index.html b/dev/models/DeterministicConstantRegressor_MLJModels/index.html

DeterministicConstantRegressor

DeterministicConstantRegressor

A model type for constructing a deterministic constant regressor, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

DeterministicConstantRegressor = @load DeterministicConstantRegressor pkg=MLJModels

Do model = DeterministicConstantRegressor() to construct an instance with default hyper-parameters.

diff --git a/dev/models/DummyClassifier_MLJScikitLearnInterface/index.html b/dev/models/DummyClassifier_MLJScikitLearnInterface/index.html

DummyClassifier

DummyClassifier

A model type for constructing a dummy classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

DummyClassifier = @load DummyClassifier pkg=MLJScikitLearnInterface

Do model = DummyClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in DummyClassifier(strategy=...).

DummyClassifier is a classifier that makes predictions using simple rules.
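
A hedged construction sketch (the "most_frequent" strategy name comes from scikit-learn's DummyClassifier and is assumed here rather than taken from this docstring):

using MLJ
DummyClassifier = @load DummyClassifier pkg=MLJScikitLearnInterface
model = DummyClassifier(strategy="most_frequent")   ## always predict the most common training class
X, y = @load_iris
mach = machine(model, X, y) |> fit!
predict(mach, X)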

diff --git a/dev/models/DummyRegressor_MLJScikitLearnInterface/index.html b/dev/models/DummyRegressor_MLJScikitLearnInterface/index.html

DummyRegressor

DummyRegressor

A model type for constructing a dummy regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

DummyRegressor = @load DummyRegressor pkg=MLJScikitLearnInterface

Do model = DummyRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in DummyRegressor(strategy=...).

DummyRegressor is a regressor that makes predictions using simple rules.

diff --git a/dev/models/ECODDetector_OutlierDetectionPython/index.html b/dev/models/ECODDetector_OutlierDetectionPython/index.html
diff --git a/dev/models/ENNUndersampler_Imbalance/index.html b/dev/models/ENNUndersampler_Imbalance/index.html

ENNUndersampler

Initiate an ENN undersampling model with the given hyper-parameters.

ENNUndersampler

A model type for constructing an ENN undersampler, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ENNUndersampler = @load ENNUndersampler pkg=Imbalance

Do model = ENNUndersampler() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ENNUndersampler(k=...).

ENNUndersampler undersamples a dataset by removing ("cleaning") points that violate a certain condition such as having a different class compared to the majority of the neighbors as proposed in Dennis L Wilson. Asymptotic properties of nearest neighbor rules using edited data. IEEE Transactions on Systems, Man, and Cybernetics, pages 408–421, 1972.

Training data

In MLJ or MLJBase, wrap the model in a machine by mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by model = ENNUndersampler()

Hyperparameters

  • k::Integer=5: Number of nearest neighbors to consider in the algorithm. Should be within the range 0 < k < n where n is the number of observations in the smallest class.
  • keep_condition::AbstractString="mode": The condition that leads to cleaning a point upon violation. Takes one of "exists", "mode", "only mode" and "all"
- `"exists"`: the point has at least one neighbor from the same class
 - `"mode"`: the class of the point is one of the most frequent classes of the neighbors (there may be many)
 - `"only mode"`: the class of the point is the single most frequent class of the neighbors
 - `"all"`: the class of the point is the same as all the neighbors
  • min_ratios=1.0: A parameter that controls the maximum amount of undersampling to be done for each class. If this algorithm cleans the data to an extent that this is violated, some of the cleaned points will be revived randomly so that it is satisfied.

    • Can be a float and in this case each class will be at most undersampled to the size of the minority class times the float. By default, all classes are undersampled to the size of the minority class
    • Can be a dictionary mapping each class label to the float minimum ratio for that class
  • force_min_ratios=false: If true, and this algorithm cleans the data such that the ratios for each class exceed those specified in min_ratios, then further undersampling will be performed so that the final ratios are equal to min_ratios.

  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister.

  • try_preserve_type::Bool=true: When true, the function will try to not change the type of the input table (e.g., DataFrame). However, for some tables, this may not succeed, and in this case, the table returned will be a column table (named-tuple of vectors). This parameter is ignored if the input is a matrix.

Transform Inputs

  • X: A matrix or table of floats where each row is an observation from the dataset
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • X_under: A matrix or table that includes the data after undersampling depending on whether the input X is a matrix or table respectively
  • y_under: An abstract vector of labels corresponding to X_under

Operations

  • transform(mach, X, y): resample the data X and y using ENNUndersampler, returning the undersampled versions
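
A minimal transform sketch (illustrative hyper-parameter values; X and y stand for any table or matrix of floats and its label vector, as described under "Transform Inputs" above). The docstring's own example follows below.

using MLJ
ENNUndersampler = @load ENNUndersampler pkg=Imbalance
undersampler = ENNUndersampler(k=3, keep_condition="mode")
mach = machine(undersampler)                ## static transformer: no data to bind, no fit! needed
X_under, y_under = transform(mach, X, y)    ## undersampled data and matching labels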

Example

using MLJ
@@ -28,4 +28,4 @@
 julia> Imbalance.checkbalance(y_under; ref="minority")
 2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 10 (100.0%) 
 1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 10 (100.0%) 
0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 24 (240.0%) 
diff --git a/dev/models/ElasticNetCVRegressor_MLJScikitLearnInterface/index.html b/dev/models/ElasticNetCVRegressor_MLJScikitLearnInterface/index.html

ElasticNetCVRegressor

ElasticNetCVRegressor

A model type for constructing an elastic net regression with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ElasticNetCVRegressor = @load ElasticNetCVRegressor pkg=MLJScikitLearnInterface

Do model = ElasticNetCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ElasticNetCVRegressor(l1_ratio=...).

Hyper-parameters

  • l1_ratio = 0.5
  • eps = 0.001
  • n_alphas = 100
  • alphas = nothing
  • fit_intercept = true
  • precompute = auto
  • max_iter = 1000
  • tol = 0.0001
  • cv = 5
  • copy_X = true
  • verbose = 0
  • n_jobs = nothing
  • positive = false
  • random_state = nothing
  • selection = cyclic
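
A minimal usage sketch (illustrative only; l1_ratio is the keyword named in the construction note above, and make_regression is used as stand-in data):

using MLJ
ElasticNetCVRegressor = @load ElasticNetCVRegressor pkg=MLJScikitLearnInterface
X, y = make_regression(100, 5)
mach = machine(ElasticNetCVRegressor(l1_ratio=0.7), X, y)
fit!(mach)
predict(mach, X)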
diff --git a/dev/models/ElasticNetRegressor_MLJLinearModels/index.html b/dev/models/ElasticNetRegressor_MLJLinearModels/index.html

ElasticNetRegressor

ElasticNetRegressor

A model type for constructing an elastic net regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ElasticNetRegressor = @load ElasticNetRegressor pkg=MLJLinearModels

Do model = ElasticNetRegressor() to construct an instance with default hyper-parameters.

Elastic net is a linear model with objective function

$|Xθ - y|₂²/2 + n⋅λ|θ|₂²/2 + n⋅γ|θ|₁$

where $n$ is the number of observations.

If scale_penalty_with_samples = false the objective function is instead

$|Xθ - y|₂²/2 + λ|θ|₂²/2 + γ|θ|₁$.

Different solver options exist, as indicated under "Hyperparameters" below.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • lambda::Real: strength of the L2 regularization. Default: 1.0

  • gamma::Real: strength of the L1 regularization. Default: 0.0

  • fit_intercept::Bool: whether to fit the intercept or not. Default: true

  • penalize_intercept::Bool: whether to penalize the intercept. Default: false

  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true

  • solver::Union{Nothing, MLJLinearModels.Solver}: any instance of MLJLinearModels.ProxGrad.

    If solver=nothing (default) then ProxGrad(accel=true) (FISTA) is used.

    Solver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...). Default: nothing
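
A hedged sketch of overriding the penalties and solver described above (the numeric values are illustrative; FISTA is the alias named in the solver description, and importing MLJLinearModels directly assumes it is installed in the active environment):

using MLJ
import MLJLinearModels
ElasticNetRegressor = @load ElasticNetRegressor pkg=MLJLinearModels
model = ElasticNetRegressor(lambda=0.5, gamma=0.1, solver=MLJLinearModels.FISTA())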

Example

using MLJ
 X, y = make_regression()
 mach = fit!(machine(ElasticNetRegressor(), X, y))
 predict(mach, X)
fitted_params(mach)

See also LassoRegressor.

diff --git a/dev/models/ElasticNetRegressor_MLJScikitLearnInterface/index.html b/dev/models/ElasticNetRegressor_MLJScikitLearnInterface/index.html

ElasticNetRegressor

ElasticNetRegressor

A model type for constructing an elastic net regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ElasticNetRegressor = @load ElasticNetRegressor pkg=MLJScikitLearnInterface

Do model = ElasticNetRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ElasticNetRegressor(alpha=...).

Hyper-parameters

  • alpha = 1.0
  • l1_ratio = 0.5
  • fit_intercept = true
  • precompute = false
  • max_iter = 1000
  • copy_X = true
  • tol = 0.0001
  • warm_start = false
  • positive = false
  • random_state = nothing
  • selection = cyclic
diff --git a/dev/models/EnsembleModel_MLJEnsembles/index.html b/dev/models/EnsembleModel_MLJEnsembles/index.html
new file mode 100644

EnsembleModel

EnsembleModel(model,
              atomic_weights=Float64[],
              bagging_fraction=0.8,
              n=100,
              rng=GLOBAL_RNG,
              acceleration=CPU1(),
              out_of_bag_measure=[])

Create a model for training an ensemble of n clones of model, with optional bagging. Ensembling is useful if fit!(machine(atom, data...)) does not create identical models on repeated calls (ie, is a stochastic model, such as a decision tree with randomized node selection criteria), or if bagging_fraction is set to a value less than 1.0, or both.

Here the atomic model must support targets with scitype AbstractVector{<:Finite} (single-target classifiers) or AbstractVector{<:Continuous} (single-target regressors).

If rng is an integer, then MersenneTwister(rng) is the random number generator used for bagging. Otherwise some AbstractRNG object is expected.

The atomic predictions are optionally weighted according to the vector atomic_weights (to allow for external optimization) except in the case that model is a Deterministic classifier, in which case atomic_weights are ignored.

The ensemble model is Deterministic or Probabilistic, according to the corresponding supertype of atom. In the case of deterministic classifiers (target_scitype(atom) <: AbstractVector{<:Finite}), the predictions are majority votes, and for regressors (target_scitype(atom) <: AbstractVector{<:Continuous}) they are ordinary averages. Probabilistic predictions are obtained by averaging the atomic probability distribution/mass functions; in particular, for regressors, the ensemble prediction on each input pattern has the type MixtureModel{VF,VS,D} from the Distributions.jl package, where D is the type of predicted distribution for atom.

Specify acceleration=CPUProcesses() for distributed computing, or CPUThreads() for multithreading.

If a single measure or non-empty vector of measures is specified by out_of_bag_measure, then out-of-bag estimates of performance are written to the training report (call report on the trained machine wrapping the ensemble model).

Important: If per-observation or class weights w (not to be confused with atomic weights) are specified when constructing a machine for the ensemble model, as in mach = machine(ensemble_model, X, y, w), then w is used by any measures specified in out_of_bag_measure that support them.
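
A short sketch (following the signature shown above; the atomic model choice and hyper-parameter values are illustrative assumptions) of a bagged ensemble with an out-of-bag performance estimate:

using MLJ
DecisionTreeClassifier = @load DecisionTreeClassifier pkg=DecisionTree
atom = DecisionTreeClassifier()
ensemble = EnsembleModel(atom, n=50, bagging_fraction=0.7, out_of_bag_measure=[log_loss])
X, y = @load_iris
mach = machine(ensemble, X, y) |> fit!
report(mach)       ## includes the out-of-bag estimate for log_loss
predict(mach, X)   ## averaged probabilistic predictions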

diff --git a/dev/models/EpsilonSVR_LIBSVM/index.html b/dev/models/EpsilonSVR_LIBSVM/index.html

EpsilonSVR

EpsilonSVR

A model type for constructing a ϵ-support vector regressor, based on LIBSVM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

EpsilonSVR = @load EpsilonSVR pkg=LIBSVM

Do model = EpsilonSVR() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EpsilonSVR(kernel=...).

Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): "LIBSVM: a library for support vector machines." ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf.

This model is an adaptation of the classifier SVC to regression, but has an additional parameter epsilon (denoted $ϵ$ in the cited reference).

Training data

In MLJ or MLJBase, bind an instance model to data with:

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see "Examples" below).

    • LIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2
    • LIBSVM.Kernel.Polynomial: (x1, x2) -> (gamma*x1'*x2 + coef0)^degree
    • LIBSVM.Kernel.RadialBasis: (x1, x2) -> (exp(-gamma*norm(x1 - x2)^2))
    • LIBSVM.Kernel.Sigmoid: (x1, x2) -> tanh(gamma*x1'*x2 + coef0)

    Here gamma, coef0, degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions. See LIBSVM.jl issue 91.

  • gamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. Actual value used appears in the report (see below).

  • coef0 = 0.0: kernel parameter (see above)

  • degree::Int32 = Int32(3): degree in polynomial kernel (see above)

  • cost=1.0 (range (0, Inf)): the parameter denoted $C$ in the cited reference; for greater regularization, decrease cost

  • epsilon=0.1 (range (0, Inf)): the parameter denoted $ϵ$ in the cited reference; epsilon is the thickness of the penalty-free neighborhood of the graph of the prediction function ("slab" or "tube"). Specifically, a data point (x, y) incurs no training loss unless it is outside this neighborhood; the further away it is from this neighborhood, the greater the loss penalty.

  • cachesize=200.0: cache memory size in MB

  • tolerance=0.001: tolerance for the stopping criterion

  • shrinking=true: whether to use shrinking heuristics

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • libsvm_model: the trained model object created by the LIBSVM.jl package

Report

The fields of report(mach) are:

  • gamma: actual value of the kernel parameter gamma used in training
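
Since the kernel can be any callable of two feature vectors (see the kernel hyper-parameter above), a user-defined kernel might be passed as in this hedged sketch (the kernel definition and cost value are illustrative assumptions; the docstring's built-in kernel example follows below):

using MLJ
using LinearAlgebra
EpsilonSVR = @load EpsilonSVR pkg=LIBSVM
k(x1, x2) = exp(-norm(x1 - x2)/2)    ## any callable of two feature vectors
model = EpsilonSVR(kernel=k, cost=2.0)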

Examples

Using a built-in kernel

using MLJ
 import LIBSVM
 
 EpsilonSVR = @load EpsilonSVR pkg=LIBSVM            ## model type
@@ -22,4 +22,4 @@
 3-element Vector{Float64}:
   1.1121225361666656
   0.04667702229741916
  -0.6958148424680672

See also NuSVR, LIBSVM.jl and the original C implementation documentation.

diff --git a/dev/models/EvoLinearRegressor_EvoLinear/index.html b/dev/models/EvoLinearRegressor_EvoLinear/index.html

EvoLinearRegressor

EvoLinearRegressor(; kwargs...)

A model type for constructing an EvoLinearRegressor, based on EvoLinear.jl, and implementing both an internal API and the MLJ model interface.

Keyword arguments

  • loss=:mse: loss function to be minimised. Can be one of:

    • :mse
    • :logistic
    • :poisson
    • :gamma
    • :tweedie
  • nrounds=10: maximum number of training rounds.

  • eta=1: Learning rate. Typically in the range [1e-2, 1].

  • L1=0: Regularization penalty applied by shrinking the weight update to 0 when the update is < L1; no penalty when the update is > L1. Results in sparse feature selection. Typically in the [0, 1] range on normalized features.

  • L2=0: Regularization penalty applied to the square of the weight update value. Restricts large parameter values. Typically in the [0, 1] range on normalized features.

  • rng=123: random seed. Not used at the moment.

  • updater=:all: training method. Only :all is supported at the moment. Gradients for each feature are computed simultaneously, then bias is updated based on all features update.

  • device=:cpu: Only :cpu is supported at the moment.

Internal API

Do config = EvoLinearRegressor() to construct a hyper-parameter struct with default hyper-parameters. Provide keyword arguments as listed above to override defaults, for example:

EvoLinearRegressor(loss=:logistic, L1=1e-3, L2=1e-2, nrounds=100)

Training model

A model is built using fit:

config = EvoLinearRegressor()
m = fit(config; x, y, w)

Inference

The fitted result is an EvoLinearModel, which acts as a prediction function when passed a feature matrix as argument.

preds = m(x)

MLJ Interface

From MLJ, the type can be imported using:

EvoLinearRegressor = @load EvoLinearRegressor pkg=EvoLinear

Do model = EvoLinearRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoLinearRegressor(loss=...).

Training model

In MLJ or MLJBase, bind an instance model to data with mach = machine(model, X, y) where:

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above. Predictions are deterministic.

Fitted parameters

The fields of fitted_params(mach) are:

  • :fitresult: the EvoLinearModel object returned by the EvoLinear.jl fitting algorithm.

Report

The fields of report(mach) are:

  • :coef: Vector of coefficients (βs) associated with each of the features.
  • :bias: Value of the bias.
  • :names: Names of each of the features.
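
A minimal MLJ-interface sketch (illustrative hyper-parameter values and synthetic make_regression data; the accessed fields are those documented above):

using MLJ
EvoLinearRegressor = @load EvoLinearRegressor pkg=EvoLinear
X, y = make_regression(200, 5)
mach = machine(EvoLinearRegressor(nrounds=50, eta=0.5), X, y) |> fit!
predict(mach, X)
fitted_params(mach).fitresult   ## the underlying EvoLinearModel
report(mach).coef               ## fitted coefficients, one per feature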
+EvoLinearRegressor · MLJ

EvoLinearRegressor

EvoLinearRegressor(; kwargs...)

A model type for constructing a EvoLinearRegressor, based on EvoLinear.jl, and implementing both an internal API and the MLJ model interface.

Keyword arguments

  • loss=:mse: loss function to be minimised. Can be one of:

    • :mse
    • :logistic
    • :poisson
    • :gamma
    • :tweedie
  • nrounds=10: maximum number of training rounds.

  • eta=1: Learning rate. Typically in the range [1e-2, 1].

  • L1=0: Regularization penalty applied by shrinking a weight update to 0 when its magnitude is smaller than L1; no penalty is applied when the update exceeds L1. Results in sparse feature selection. Typically in the [0, 1] range on normalized features.

  • L2=0: Regularization penalty applied to the square of the weight update value. Restricts large parameter values. Typically in the [0, 1] range on normalized features.

  • rng=123: random seed. Not used at the moment.

  • updater=:all: training method. Only :all is supported at the moment. Gradients for each feature are computed simultaneously, then the bias is updated based on all feature updates.

  • device=:cpu: Only :cpu is supported at the moment.

Internal API

Do config = EvoLinearRegressor() to construct a hyper-parameter struct with default hyper-parameters. Provide keyword arguments as listed above to override defaults, for example:

EvoLinearRegressor(loss=:logistic, L1=1e-3, L2=1e-2, nrounds=100)

Training model

A model is built using fit:

config = EvoLinearRegressor()
+m = fit(config; x, y, w)

Inference

The fitted result is an EvoLinearModel, which acts as a prediction function when passed a feature matrix as argument.

preds = m(x)

MLJ Interface

From MLJ, the type can be imported using:

EvoLinearRegressor = @load EvoLinearRegressor pkg=EvoLinear

Do model = EvoLinearRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoLinearRegressor(loss=...).

Training model

In MLJ or MLJBase, bind an instance model to data with mach = machine(model, X, y) where:

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above. Predictions are deterministic.

Fitted parameters

The fields of fitted_params(mach) are:

  • :fitresult: the EvoLinearModel object returned by the EvoLinear.jl fitting algorithm.

Report

The fields of report(mach) are:

  • :coef: Vector of coefficients (βs) associated to each of the features.
  • :bias: Value of the bias.
  • :names: Names of each of the features.
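
For readers who want to try the MLJ workflow described above, here is a minimal, self-contained sketch using synthetic data. It assumes only that MLJ and EvoLinear are installed; the feature names, data and hyper-parameter values are invented for illustration:

 using MLJ
 EvoLinearRegressor = @load EvoLinearRegressor pkg=EvoLinear
 model = EvoLinearRegressor(nrounds=50, eta=0.5)
 X = (x1 = rand(100), x2 = rand(100))          # any Tables.jl table with Continuous columns
 y = 2 .* X.x1 .- X.x2 .+ 0.1 .* randn(100)    # Continuous target
 mach = machine(model, X, y) |> fit!
 preds = predict(mach, X)                      # deterministic predictions
 fitted_params(mach).fitresult                 # the underlying EvoLinearModel
 report(mach).coef, report(mach).bias          # learned coefficients and bias
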
diff --git a/dev/models/EvoSplineRegressor_EvoLinear/index.html b/dev/models/EvoSplineRegressor_EvoLinear/index.html index da91b7512..a664d4618 100644 --- a/dev/models/EvoSplineRegressor_EvoLinear/index.html +++ b/dev/models/EvoSplineRegressor_EvoLinear/index.html @@ -1,3 +1,3 @@ -EvoSplineRegressor · MLJ

EvoSplineRegressor

EvoSplineRegressor(; kwargs...)

A model type for constructing an EvoSplineRegressor, based on EvoLinear.jl, and implementing both an internal API and the MLJ model interface.

Keyword arguments

  • loss=:mse: loss function to be minimised. Can be one of:

    • :mse
    • :logistic
    • :poisson
    • :gamma
    • :tweedie
  • nrounds=10: maximum number of training rounds.

  • eta=1: Learning rate. Typically in the range [1e-2, 1].

  • L1=0: Regularization penalty applied by shrinking a weight update to 0 when its magnitude is smaller than L1; no penalty is applied when the update exceeds L1. Results in sparse feature selection. Typically in the [0, 1] range on normalized features.

  • L2=0: Regularization penalty applied to the square of the weight update value. Restricts large parameter values. Typically in the [0, 1] range on normalized features.

  • rng=123: random seed. Not used at the moment.

  • updater=:all: training method. Only :all is supported at the moment. Gradients for each feature are computed simultaneously, then the bias is updated based on all feature updates.

  • device=:cpu: Only :cpu is supported at the moment.

Internal API

Do config = EvoSplineRegressor() to construct a hyper-parameter struct with default hyper-parameters. Provide keyword arguments as listed above to override defaults, for example:

EvoSplineRegressor(loss=:logistic, L1=1e-3, L2=1e-2, nrounds=100)

Training model

A model is built using fit:

config = EvoSplineRegressor()
-m = fit(config; x, y, w)

Inference

The fitted result is a SplineModel, which acts as a prediction function when passed a feature matrix as argument.

preds = m(x)

MLJ Interface

From MLJ, the type can be imported using:

EvoSplineRegressor = @load EvoSplineRegressor pkg=EvoLinear

Do model = EvoSplineRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoSplineRegressor(loss=...).

Training model

In MLJ or MLJBase, bind an instance model to data with mach = machine(model, X, y) where:

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above. Predictions are deterministic.

Fitted parameters

The fields of fitted_params(mach) are:

  • :fitresult: the SplineModel object returned by the EvoSplineRegressor fitting algorithm.

Report

The fields of report(mach) are:

  • :coef: Vector of coefficients (βs) associated to each of the features.
  • :bias: Value of the bias.
  • :names: Names of each of the features.
+EvoSplineRegressor · MLJ

EvoSplineRegressor

EvoSplineRegressor(; kwargs...)

A model type for constructing an EvoSplineRegressor, based on EvoLinear.jl, and implementing both an internal API and the MLJ model interface.

Keyword arguments

  • loss=:mse: loss function to be minimised. Can be one of:

    • :mse
    • :logistic
    • :poisson
    • :gamma
    • :tweedie
  • nrounds=10: maximum number of training rounds.

  • eta=1: Learning rate. Typically in the range [1e-2, 1].

  • L1=0: Regularization penalty applied by shrinking a weight update to 0 when its magnitude is smaller than L1; no penalty is applied when the update exceeds L1. Results in sparse feature selection. Typically in the [0, 1] range on normalized features.

  • L2=0: Regularization penalty applied to the square of the weight update value. Restricts large parameter values. Typically in the [0, 1] range on normalized features.

  • rng=123: random seed. Not used at the moment.

  • updater=:all: training method. Only :all is supported at the moment. Gradients for each feature are computed simultaneously, then the bias is updated based on all feature updates.

  • device=:cpu: Only :cpu is supported at the moment.

Internal API

Do config = EvoSplineRegressor() to construct a hyper-parameter struct with default hyper-parameters. Provide keyword arguments as listed above to override defaults, for example:

EvoSplineRegressor(loss=:logistic, L1=1e-3, L2=1e-2, nrounds=100)

Training model

A model is built using fit:

config = EvoSplineRegressor()
+m = fit(config; x, y, w)

Inference

The fitted result is a SplineModel, which acts as a prediction function when passed a feature matrix as argument.

preds = m(x)

MLJ Interface

From MLJ, the type can be imported using:

EvoSplineRegressor = @load EvoSplineRegressor pkg=EvoLinear

Do model = EvoSplineRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoSplineRegressor(loss=...).

Training model

In MLJ or MLJBase, bind an instance model to data with mach = machine(model, X, y) where:

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above. Predictions are deterministic.

Fitted parameters

The fields of fitted_params(mach) are:

  • :fitresult: the SplineModel object returned by the EvoSplineRegressor fitting algorithm.

Report

The fields of report(mach) are:

  • :coef: Vector of coefficients (βs) associated to each of the features.
  • :bias: Value of the bias.
  • :names: Names of each of the features.
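
The internal-API snippet above uses pre-existing x, y and w; purely for illustration, a self-contained version following the documented fit(config; x, y, w) call, with synthetic data (a plain feature matrix, a Continuous target and unit weights, all invented here), might look as follows:

 using EvoLinear
 config = EvoSplineRegressor(nrounds=100)
 x = randn(1_000, 5)                   # feature matrix (invented)
 y = x[:, 1] .+ 0.1 .* randn(1_000)    # synthetic target
 w = ones(1_000)                       # observation weights
 m = fit(config; x, y, w)              # returns the fitted model
 preds = m(x)                          # the model acts as a prediction function
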
diff --git a/dev/models/EvoTreeClassifier_EvoTrees/index.html b/dev/models/EvoTreeClassifier_EvoTrees/index.html index 74b1b1f3f..46906fe6c 100644 --- a/dev/models/EvoTreeClassifier_EvoTrees/index.html +++ b/dev/models/EvoTreeClassifier_EvoTrees/index.html @@ -1,5 +1,5 @@ -EvoTreeClassifier · MLJ

EvoTreeClassifier

EvoTreeClassifier(;kwargs...)

A model type for constructing an EvoTreeClassifier, based on EvoTrees.jl, and implementing both an internal API and the MLJ model interface. EvoTreeClassifier is used to perform multi-class classification, using cross-entropy loss.

Hyper-parameters

  • nrounds=100: Number of rounds. It corresponds to the number of trees that will be sequentially stacked. Must be >= 1.

  • eta=0.1: Learning rate. Each tree's raw predictions are scaled by eta prior to being added to the stack of predictions. Must be > 0. A lower eta results in slower learning, requiring a higher nrounds, but typically improves model performance.

  • L2::T=0.0: L2 regularization factor on aggregate gain. Must be >= 0. Higher L2 can result in a more robust model.

  • lambda::T=0.0: L2 regularization factor on individual gain. Must be >= 0. Higher lambda can result in a more robust model.

  • gamma::T=0.0: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model. Must be >= 0.

  • max_depth=6: Maximum depth of a tree. Must be >= 1. A tree of depth 1 is made of a single prediction leaf. A complete tree of depth N contains 2^(N - 1) terminal leaves and 2^(N - 1) - 1 split nodes. Compute cost is proportional to 2^max_depth. Typical optimal values are in the 3 to 9 range.

  • min_weight=1.0: Minimum weight needed in a node to perform a split. Matches the number of observations by default or the sum of weights as provided by the weights vector. Must be > 0.

  • rowsample=1.0: Proportion of rows that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • colsample=1.0: Proportion of columns / features that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • nbins=64: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins. Should be between 2 and 255.

  • tree_type="binary": Tree structure to be used. One of:

    • binary: Each node of a tree is grown independently. Trees are built depthwise until max depth is reached or until min weight or gain (see gamma) stops further node splits.
    • oblivious: A common splitting condition is imposed on all nodes of a given depth.
  • rng=123: Either an integer used as a seed to the random number generator or an actual random number generator (::Random.AbstractRNG).

Internal API

Do config = EvoTreeClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeClassifier(max_depth=...).

Training model

A model is built using fit_evotree:

model = fit_evotree(config; x_train, y_train, kwargs...)

Inference

Predictions are obtained using predict which returns a Matrix of size [nobs, K] where K is the number of classes:

EvoTrees.predict(model, X)

Alternatively, models act as a functor, returning predictions when called as a function with features as argument:

model(X)

MLJ

From MLJ, the type can be imported using:

EvoTreeClassifier = @load EvoTreeClassifier pkg=EvoTrees

Do model = EvoTreeClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeClassifier(loss=...).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:Multiclass or <:OrderedFactor; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above. Predictions are probabilistic.
  • predict_mode(mach, Xnew): returns the mode of each of the predictions above.

Fitted parameters

The fields of fitted_params(mach) are:

  • :fitresult: The GBTree object returned by the EvoTrees.jl fitting algorithm.

Report

The fields of report(mach) are:

  • :features: The names of the features encountered in training.

Examples

## Internal API
+EvoTreeClassifier · MLJ

EvoTreeClassifier

EvoTreeClassifier(;kwargs...)

A model type for constructing an EvoTreeClassifier, based on EvoTrees.jl, and implementing both an internal API and the MLJ model interface. EvoTreeClassifier is used to perform multi-class classification, using cross-entropy loss.

Hyper-parameters

  • nrounds=100: Number of rounds. It corresponds to the number of trees that will be sequentially stacked. Must be >= 1.

  • eta=0.1: Learning rate. Each tree's raw predictions are scaled by eta prior to being added to the stack of predictions. Must be > 0. A lower eta results in slower learning, requiring a higher nrounds, but typically improves model performance.

  • L2::T=0.0: L2 regularization factor on aggregate gain. Must be >= 0. Higher L2 can result in a more robust model.

  • lambda::T=0.0: L2 regularization factor on individual gain. Must be >= 0. Higher lambda can result in a more robust model.

  • gamma::T=0.0: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model. Must be >= 0.

  • max_depth=6: Maximum depth of a tree. Must be >= 1. A tree of depth 1 is made of a single prediction leaf. A complete tree of depth N contains 2^(N - 1) terminal leaves and 2^(N - 1) - 1 split nodes. Compute cost is proportional to 2^max_depth. Typical optimal values are in the 3 to 9 range.

  • min_weight=1.0: Minimum weight needed in a node to perform a split. Matches the number of observations by default or the sum of weights as provided by the weights vector. Must be > 0.

  • rowsample=1.0: Proportion of rows that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • colsample=1.0: Proportion of columns / features that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • nbins=64: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins. Should be between 2 and 255.

  • tree_type="binary": Tree structure to be used. One of:

    • binary: Each node of a tree is grown independently. Trees are built depthwise until max depth is reached or until min weight or gain (see gamma) stops further node splits.
    • oblivious: A common splitting condition is imposed on all nodes of a given depth.
  • rng=123: Either an integer used as a seed to the random number generator or an actual random number generator (::Random.AbstractRNG).

Internal API

Do config = EvoTreeClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeClassifier(max_depth=...).

Training model

A model is built using fit_evotree:

model = fit_evotree(config; x_train, y_train, kwargs...)

Inference

Predictions are obtained using predict which returns a Matrix of size [nobs, K] where K is the number of classes:

EvoTrees.predict(model, X)

Alternatively, models act as a functor, returning predictions when called as a function with features as argument:

model(X)

MLJ

From MLJ, the type can be imported using:

EvoTreeClassifier = @load EvoTreeClassifier pkg=EvoTrees

Do model = EvoTreeClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeClassifier(loss=...).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:Multiclass or <:OrderedFactor; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above. Predictions are probabilistic.
  • predict_mode(mach, Xnew): returns the mode of each of the predictions above.

Fitted parameters

The fields of fitted_params(mach) are:

  • :fitresult: The GBTree object returned by the EvoTrees.jl fitting algorithm.

Report

The fields of report(mach) are:

  • :features: The names of the features encountered in training.

Examples

## Internal API
 using EvoTrees
 config = EvoTreeClassifier(max_depth=5, nbins=32, nrounds=100)
 nobs, nfeats = 1_000, 5
@@ -12,4 +12,4 @@
 X, y = @load_iris
 mach = machine(model, X, y) |> fit!
 preds = predict(mach, X)
-preds = predict_mode(mach, X)

See also EvoTrees.jl.

+preds = predict_mode(mach, X)
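
Because predict returns probabilistic predictions, individual class probabilities can be extracted with pdf, a standard MLJ idiom. Continuing the example above, and assuming the usual iris class names, a short sketch:

 probs = predict(mach, X)                             # vector of UnivariateFinite distributions
 pdf.(probs, "virginica")                             # probability of class "virginica" per observation
 pdf(probs, ["setosa", "versicolor", "virginica"])    # nobs × 3 matrix of class probabilities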

See also EvoTrees.jl.

diff --git a/dev/models/EvoTreeCount_EvoTrees/index.html b/dev/models/EvoTreeCount_EvoTrees/index.html index d746b78c2..6c1d3125a 100644 --- a/dev/models/EvoTreeCount_EvoTrees/index.html +++ b/dev/models/EvoTreeCount_EvoTrees/index.html @@ -1,5 +1,5 @@ -EvoTreeCount · MLJ

EvoTreeCount

EvoTreeCount(;kwargs...)

A model type for constructing an EvoTreeCount, based on EvoTrees.jl, and implementing both an internal API and the MLJ model interface. EvoTreeCount is used to perform Poisson probabilistic regression on count targets.

Hyper-parameters

  • nrounds=100: Number of rounds. It corresponds to the number of trees that will be sequentially stacked. Must be >= 1.

  • eta=0.1: Learning rate. Each tree's raw predictions are scaled by eta prior to being added to the stack of predictions. Must be > 0. A lower eta results in slower learning, requiring a higher nrounds, but typically improves model performance.

  • L2::T=0.0: L2 regularization factor on aggregate gain. Must be >= 0. Higher L2 can result in a more robust model.

  • lambda::T=0.0: L2 regularization factor on individual gain. Must be >= 0. Higher lambda can result in a more robust model.

  • gamma::T=0.0: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model.

  • max_depth=6: Maximum depth of a tree. Must be >= 1. A tree of depth 1 is made of a single prediction leaf. A complete tree of depth N contains 2^(N - 1) terminal leaves and 2^(N - 1) - 1 split nodes. Compute cost is proportional to 2^max_depth. Typical optimal values are in the 3 to 9 range.

  • min_weight=1.0: Minimum weight needed in a node to perform a split. Matches the number of observations by default or the sum of weights as provided by the weights vector. Must be > 0.

  • rowsample=1.0: Proportion of rows that are sampled at each iteration to build the tree. Should be ]0, 1].

  • colsample=1.0: Proportion of columns / features that are sampled at each iteration to build the tree. Should be ]0, 1].

  • nbins=64: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins. Should be between 2 and 255.

  • monotone_constraints=Dict{Int, Int}(): Specify monotonic constraints using a dict where the key is the feature index and the value the applicable constraint (-1=decreasing, 0=none, 1=increasing).

  • tree_type="binary": Tree structure to be used. One of:

    • binary: Each node of a tree is grown independently. Trees are built depthwise until max depth is reached or until min weight or gain (see gamma) stops further node splits.
    • oblivious: A common splitting condition is imposed on all nodes of a given depth.
  • rng=123: Either an integer used as a seed to the random number generator or an actual random number generator (::Random.AbstractRNG).

Internal API

Do config = EvoTreeCount() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeCount(max_depth=...).

Training model

A model is built using fit_evotree:

model = fit_evotree(config; x_train, y_train, kwargs...)

Inference

Predictions are obtained using predict which returns a Vector of length nobs:

EvoTrees.predict(model, X)

Alternatively, models act as a functor, returning predictions when called as a function with features as argument:

model(X)

MLJ

From MLJ, the type can be imported using:

EvoTreeCount = @load EvoTreeCount pkg=EvoTrees

Do model = EvoTreeCount() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeCount(loss=...).

Training data

In MLJ or MLJBase, bind an instance model to data with mach = machine(model, X, y) where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:Count; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Operations

  • predict(mach, Xnew): returns a vector of Poisson distributions given features Xnew having the same scitype as X above. Predictions are probabilistic.

Specific metrics can also be predicted using:

  • predict_mean(mach, Xnew)
  • predict_mode(mach, Xnew)
  • predict_median(mach, Xnew)

Fitted parameters

The fields of fitted_params(mach) are:

  • :fitresult: The GBTree object returned by the EvoTrees.jl fitting algorithm.

Report

The fields of report(mach) are:

  • :features: The names of the features encountered in training.

Examples

## Internal API
+EvoTreeCount · MLJ

EvoTreeCount

EvoTreeCount(;kwargs...)

A model type for constructing an EvoTreeCount, based on EvoTrees.jl, and implementing both an internal API and the MLJ model interface. EvoTreeCount is used to perform Poisson probabilistic regression on count targets.

Hyper-parameters

  • nrounds=100: Number of rounds. It corresponds to the number of trees that will be sequentially stacked. Must be >= 1.

  • eta=0.1: Learning rate. Each tree's raw predictions are scaled by eta prior to being added to the stack of predictions. Must be > 0. A lower eta results in slower learning, requiring a higher nrounds, but typically improves model performance.

  • L2::T=0.0: L2 regularization factor on aggregate gain. Must be >= 0. Higher L2 can result in a more robust model.

  • lambda::T=0.0: L2 regularization factor on individual gain. Must be >= 0. Higher lambda can result in a more robust model.

  • gamma::T=0.0: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model.

  • max_depth=6: Maximum depth of a tree. Must be >= 1. A tree of depth 1 is made of a single prediction leaf. A complete tree of depth N contains 2^(N - 1) terminal leaves and 2^(N - 1) - 1 split nodes. Compute cost is proportional to 2^max_depth. Typical optimal values are in the 3 to 9 range.

  • min_weight=1.0: Minimum weight needed in a node to perform a split. Matches the number of observations by default or the sum of weights as provided by the weights vector. Must be > 0.

  • rowsample=1.0: Proportion of rows that are sampled at each iteration to build the tree. Should be ]0, 1].

  • colsample=1.0: Proportion of columns / features that are sampled at each iteration to build the tree. Should be ]0, 1].

  • nbins=64: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins. Should be between 2 and 255.

  • monotone_constraints=Dict{Int, Int}(): Specify monotonic constraints using a dict where the key is the feature index and the value the applicable constraint (-1=decreasing, 0=none, 1=increasing).

  • tree_type="binary": Tree structure to be used. One of:

    • binary: Each node of a tree is grown independently. Trees are built depthwise until max depth is reached or until min weight or gain (see gamma) stops further node splits.
    • oblivious: A common splitting condition is imposed on all nodes of a given depth.
  • rng=123: Either an integer used as a seed to the random number generator or an actual random number generator (::Random.AbstractRNG).

Internal API

Do config = EvoTreeCount() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeCount(max_depth=...).

Training model

A model is built using fit_evotree:

model = fit_evotree(config; x_train, y_train, kwargs...)

Inference

Predictions are obtained using predict which returns a Vector of length nobs:

EvoTrees.predict(model, X)

Alternatively, models act as a functor, returning predictions when called as a function with features as argument:

model(X)

MLJ

From MLJ, the type can be imported using:

EvoTreeCount = @load EvoTreeCount pkg=EvoTrees

Do model = EvoTreeCount() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeCount(loss=...).

Training data

In MLJ or MLJBase, bind an instance model to data with mach = machine(model, X, y) where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:Count; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Operations

  • predict(mach, Xnew): returns a vector of Poisson distributions given features Xnew having the same scitype as X above. Predictions are probabilistic.

Specific metrics can also be predicted using:

  • predict_mean(mach, Xnew)
  • predict_mode(mach, Xnew)
  • predict_median(mach, Xnew)

Fitted parameters

The fields of fitted_params(mach) are:

  • :fitresult: The GBTree object returned by the EvoTrees.jl fitting algorithm.

Report

The fields of report(mach) are:

  • :features: The names of the features encountered in training.

Examples

## Internal API
 using EvoTrees
 config = EvoTreeCount(max_depth=5, nbins=32, nrounds=100)
 nobs, nfeats = 1_000, 5
@@ -15,4 +15,4 @@
 preds = predict_mean(mach, X)
 preds = predict_mode(mach, X)
 preds = predict_median(mach, X)
-

See also EvoTrees.jl.

+

See also EvoTrees.jl.

diff --git a/dev/models/EvoTreeGaussian_EvoTrees/index.html b/dev/models/EvoTreeGaussian_EvoTrees/index.html index 745355d74..53e4dede6 100644 --- a/dev/models/EvoTreeGaussian_EvoTrees/index.html +++ b/dev/models/EvoTreeGaussian_EvoTrees/index.html @@ -1,5 +1,5 @@ -EvoTreeGaussian · MLJ

EvoTreeGaussian

EvoTreeGaussian(;kwargs...)

A model type for constructing an EvoTreeGaussian, based on EvoTrees.jl, and implementing both an internal API and the MLJ model interface. EvoTreeGaussian is used to perform Gaussian probabilistic regression, fitting μ and σ parameters to maximize likelihood.

Hyper-parameters

  • nrounds=100: Number of rounds. It corresponds to the number of trees that will be sequentially stacked. Must be >= 1.

  • eta=0.1: Learning rate. Each tree's raw predictions are scaled by eta prior to being added to the stack of predictions. Must be > 0. A lower eta results in slower learning, requiring a higher nrounds, but typically improves model performance.

  • L2::T=0.0: L2 regularization factor on aggregate gain. Must be >= 0. Higher L2 can result in a more robust model.

  • lambda::T=0.0: L2 regularization factor on individual gain. Must be >= 0. Higher lambda can result in a more robust model.

  • gamma::T=0.0: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model. Must be >= 0.

  • max_depth=6: Maximum depth of a tree. Must be >= 1. A tree of depth 1 is made of a single prediction leaf. A complete tree of depth N contains 2^(N - 1) terminal leaves and 2^(N - 1) - 1 split nodes. Compute cost is proportional to 2^max_depth. Typical optimal values are in the 3 to 9 range.

  • min_weight=8.0: Minimum weight needed in a node to perform a split. Matches the number of observations by default or the sum of weights as provided by the weights vector. Must be > 0.

  • rowsample=1.0: Proportion of rows that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • colsample=1.0: Proportion of columns / features that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • nbins=64: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins. Should be between 2 and 255.

  • monotone_constraints=Dict{Int, Int}(): Specify monotonic constraints using a dict where the key is the feature index and the value the applicable constraint (-1=decreasing, 0=none, 1=increasing). !Experimental feature: note that for Gaussian regression, constraints may not be enforced systematically.

  • tree_type="binary": Tree structure to be used. One of:

    • binary: Each node of a tree is grown independently. Trees are built depthwise until max depth is reached or until min weight or gain (see gamma) stops further node splits.
    • oblivious: A common splitting condition is imposed on all nodes of a given depth.
  • rng=123: Either an integer used as a seed to the random number generator or an actual random number generator (::Random.AbstractRNG).

Internal API

Do config = EvoTreeGaussian() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeGaussian(max_depth=...).

Training model

A model is built using fit_evotree:

model = fit_evotree(config; x_train, y_train, kwargs...)

Inference

Predictions are obtained using predict, which returns a Matrix of size [nobs, 2] whose columns correspond to μ and σ respectively:

EvoTrees.predict(model, X)

Alternatively, models act as a functor, returning predictions when called as a function with features as argument:

model(X)

MLJ

From MLJ, the type can be imported using:

EvoTreeGaussian = @load EvoTreeGaussian pkg=EvoTrees

Do model = EvoTreeGaussian() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeGaussian(loss=...).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Operations

  • predict(mach, Xnew): returns a vector of Gaussian distributions given features Xnew having the same scitype as X above.

Predictions are probabilistic.

Specific metrics can also be predicted using:

  • predict_mean(mach, Xnew)
  • predict_mode(mach, Xnew)
  • predict_median(mach, Xnew)

Fitted parameters

The fields of fitted_params(mach) are:

  • :fitresult: The GBTree object returned by the EvoTrees.jl fitting algorithm.

Report

The fields of report(mach) are:

  • :features: The names of the features encountered in training.

Examples

## Internal API
+EvoTreeGaussian · MLJ

EvoTreeGaussian

EvoTreeGaussian(;kwargs...)

A model type for constructing an EvoTreeGaussian, based on EvoTrees.jl, and implementing both an internal API and the MLJ model interface. EvoTreeGaussian is used to perform Gaussian probabilistic regression, fitting μ and σ parameters to maximize likelihood.

Hyper-parameters

  • nrounds=100: Number of rounds. It corresponds to the number of trees that will be sequentially stacked. Must be >= 1.

  • eta=0.1: Learning rate. Each tree's raw predictions are scaled by eta prior to being added to the stack of predictions. Must be > 0. A lower eta results in slower learning, requiring a higher nrounds, but typically improves model performance.

  • L2::T=0.0: L2 regularization factor on aggregate gain. Must be >= 0. Higher L2 can result in a more robust model.

  • lambda::T=0.0: L2 regularization factor on individual gain. Must be >= 0. Higher lambda can result in a more robust model.

  • gamma::T=0.0: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model. Must be >= 0.

  • max_depth=6: Maximum depth of a tree. Must be >= 1. A tree of depth 1 is made of a single prediction leaf. A complete tree of depth N contains 2^(N - 1) terminal leaves and 2^(N - 1) - 1 split nodes. Compute cost is proportional to 2^max_depth. Typical optimal values are in the 3 to 9 range.

  • min_weight=8.0: Minimum weight needed in a node to perform a split. Matches the number of observations by default or the sum of weights as provided by the weights vector. Must be > 0.

  • rowsample=1.0: Proportion of rows that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • colsample=1.0: Proportion of columns / features that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • nbins=64: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins. Should be between 2 and 255.

  • monotone_constraints=Dict{Int, Int}(): Specify monotonic constraints using a dict where the key is the feature index and the value the applicable constraint (-1=decreasing, 0=none, 1=increasing). !Experimental feature: note that for Gaussian regression, constraints may not be enforced systematically.

  • tree_type="binary": Tree structure to be used. One of:

    • binary: Each node of a tree is grown independently. Trees are built depthwise until max depth is reached or until min weight or gain (see gamma) stops further node splits.
    • oblivious: A common splitting condition is imposed on all nodes of a given depth.
  • rng=123: Either an integer used as a seed to the random number generator or an actual random number generator (::Random.AbstractRNG).

Internal API

Do config = EvoTreeGaussian() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeGaussian(max_depth=...).

Training model

A model is built using fit_evotree:

model = fit_evotree(config; x_train, y_train, kwargs...)

Inference

Predictions are obtained using predict, which returns a Matrix of size [nobs, 2] whose columns correspond to μ and σ respectively:

EvoTrees.predict(model, X)

Alternatively, models act as a functor, returning predictions when called as a function with features as argument:

model(X)

MLJ

From MLJ, the type can be imported using:

EvoTreeGaussian = @load EvoTreeGaussian pkg=EvoTrees

Do model = EvoTreeGaussian() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeGaussian(loss=...).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Operations

  • predict(mach, Xnew): returns a vector of Gaussian distributions given features Xnew having the same scitype as X above.

Predictions are probabilistic.

Specific metrics can also be predicted using:

  • predict_mean(mach, Xnew)
  • predict_mode(mach, Xnew)
  • predict_median(mach, Xnew)

Fitted parameters

The fields of fitted_params(mach) are:

  • :fitresult: The GBTree object returned by the EvoTrees.jl fitting algorithm.

Report

The fields of report(mach) are:

  • :features: The names of the features encountered in training.

Examples

## Internal API
 using EvoTrees
 params = EvoTreeGaussian(max_depth=5, nbins=32, nrounds=100)
 nobs, nfeats = 1_000, 5
@@ -14,4 +14,4 @@
 preds = predict(mach, X)
 preds = predict_mean(mach, X)
 preds = predict_mode(mach, X)
-preds = predict_median(mach, X)
+preds = predict_median(mach, X)
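
The Gaussian distributions returned by predict can also be interrogated directly; for example, per-observation standard deviations are not covered by the predict_* helpers above but can be obtained as in this sketch (it assumes the Statistics standard library, whose mean and std are extended by Distributions.jl for distribution objects):

 using Statistics
 dists = predict(mach, X)     # vector of Normal distributions
 mus = mean.(dists)           # per-observation means (same as predict_mean)
 sigmas = std.(dists)         # per-observation standard deviations
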
diff --git a/dev/models/EvoTreeMLE_EvoTrees/index.html b/dev/models/EvoTreeMLE_EvoTrees/index.html index 022ec1956..728d087d6 100644 --- a/dev/models/EvoTreeMLE_EvoTrees/index.html +++ b/dev/models/EvoTreeMLE_EvoTrees/index.html @@ -1,5 +1,5 @@ -EvoTreeMLE · MLJ

EvoTreeMLE

EvoTreeMLE(;kwargs...)

A model type for constructing an EvoTreeMLE, based on EvoTrees.jl, and implementing both an internal API and the MLJ model interface. EvoTreeMLE performs maximum likelihood estimation. The assumed distribution is specified through the loss kwarg. Both Gaussian and Logistic distributions are supported.

Hyper-parameters

  • loss=:gaussian: Loss to be minimized during training. One of:

    • :gaussian / :gaussian_mle
    • :logistic / :logistic_mle
  • nrounds=100: Number of rounds. It corresponds to the number of trees that will be sequentially stacked. Must be >= 1.

  • eta=0.1: Learning rate. Each tree's raw predictions are scaled by eta prior to being added to the stack of predictions. Must be > 0. A lower eta results in slower learning, requiring a higher nrounds, but typically improves model performance.

  • L2::T=0.0: L2 regularization factor on aggregate gain. Must be >= 0. Higher L2 can result in a more robust model.

  • lambda::T=0.0: L2 regularization factor on individual gain. Must be >= 0. Higher lambda can result in a more robust model.

  • gamma::T=0.0: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model. Must be >= 0.

  • max_depth=6: Maximum depth of a tree. Must be >= 1. A tree of depth 1 is made of a single prediction leaf. A complete tree of depth N contains 2^(N - 1) terminal leaves and 2^(N - 1) - 1 split nodes. Compute cost is proportional to 2^max_depth. Typical optimal values are in the 3 to 9 range.

  • min_weight=8.0: Minimum weight needed in a node to perform a split. Matches the number of observations by default or the sum of weights as provided by the weights vector. Must be > 0.

  • rowsample=1.0: Proportion of rows that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • colsample=1.0: Proportion of columns / features that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • nbins=64: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins. Should be between 2 and 255.

  • monotone_constraints=Dict{Int, Int}(): Specify monotonic constraints using a dict where the key is the feature index and the value the applicable constraint (-1=decreasing, 0=none, 1=increasing). !Experimental feature: note that for MLE regression, constraints may not be enforced systematically.

  • tree_type="binary": Tree structure to be used. One of:

    • binary: Each node of a tree is grown independently. Trees are built depthwise until max depth is reached or until min weight or gain (see gamma) stops further node splits.
    • oblivious: A common splitting condition is imposed on all nodes of a given depth.
  • rng=123: Either an integer used as a seed to the random number generator or an actual random number generator (::Random.AbstractRNG).

Internal API

Do config = EvoTreeMLE() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeMLE(max_depth=...).

Training model

A model is built using fit_evotree:

model = fit_evotree(config; x_train, y_train, kwargs...)

Inference

Predictions are obtained using predict, which returns a Matrix of size [nobs, nparams] whose columns correspond to μ and σ for Normal/Gaussian, and to μ and s for Logistic.

EvoTrees.predict(model, X)

Alternatively, models act as a functor, returning predictions when called as a function with features as argument:

model(X)

MLJ

From MLJ, the type can be imported using:

EvoTreeMLE = @load EvoTreeMLE pkg=EvoTrees

Do model = EvoTreeMLE() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeMLE(loss=...).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Operations

  • predict(mach, Xnew): returns a vector of Gaussian or Logistic distributions (according to provided loss) given features Xnew having the same scitype as X above.

Predictions are probabilistic.

Specific metrics can also be predicted using:

  • predict_mean(mach, Xnew)
  • predict_mode(mach, Xnew)
  • predict_median(mach, Xnew)

Fitted parameters

The fields of fitted_params(mach) are:

  • :fitresult: The GBTree object returned by the EvoTrees.jl fitting algorithm.

Report

The fields of report(mach) are:

  • :features: The names of the features encountered in training.

Examples

## Internal API
+EvoTreeMLE · MLJ

EvoTreeMLE

EvoTreeMLE(;kwargs...)

A model type for constructing an EvoTreeMLE, based on EvoTrees.jl, and implementing both an internal API and the MLJ model interface. EvoTreeMLE performs maximum likelihood estimation. The assumed distribution is specified through the loss kwarg. Both Gaussian and Logistic distributions are supported.

Hyper-parameters

  • loss=:gaussian: Loss to be minimized during training. One of:

    • :gaussian / :gaussian_mle
    • :logistic / :logistic_mle
  • nrounds=100: Number of rounds. It corresponds to the number of trees that will be sequentially stacked. Must be >= 1.

  • eta=0.1: Learning rate. Each tree's raw predictions are scaled by eta prior to being added to the stack of predictions. Must be > 0. A lower eta results in slower learning, requiring a higher nrounds, but typically improves model performance.

  • L2::T=0.0: L2 regularization factor on aggregate gain. Must be >= 0. Higher L2 can result in a more robust model.

  • lambda::T=0.0: L2 regularization factor on individual gain. Must be >= 0. Higher lambda can result in a more robust model.

  • gamma::T=0.0: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model. Must be >= 0.

  • max_depth=6: Maximum depth of a tree. Must be >= 1. A tree of depth 1 is made of a single prediction leaf. A complete tree of depth N contains 2^(N - 1) terminal leaves and 2^(N - 1) - 1 split nodes. Compute cost is proportional to 2^max_depth. Typical optimal values are in the 3 to 9 range.

  • min_weight=8.0: Minimum weight needed in a node to perform a split. Matches the number of observations by default or the sum of weights as provided by the weights vector. Must be > 0.

  • rowsample=1.0: Proportion of rows that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • colsample=1.0: Proportion of columns / features that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • nbins=64: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins. Should be between 2 and 255.

  • monotone_constraints=Dict{Int, Int}(): Specify monotonic constraints using a dict where the key is the feature index and the value the applicable constraint (-1=decreasing, 0=none, 1=increasing). !Experimental feature: note that for MLE regression, constraints may not be enforced systematically.

  • tree_type="binary": Tree structure to be used. One of:

    • binary: Each node of a tree is grown independently. Trees are built depthwise until max depth is reached or until min weight or gain (see gamma) stops further node splits.
    • oblivious: A common splitting condition is imposed on all nodes of a given depth.
  • rng=123: Either an integer used as a seed to the random number generator or an actual random number generator (::Random.AbstractRNG).

Internal API

Do config = EvoTreeMLE() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeMLE(max_depth=...).

Training model

A model is built using fit_evotree:

model = fit_evotree(config; x_train, y_train, kwargs...)

Inference

Predictions are obtained using predict, which returns a Matrix of size [nobs, nparams] whose columns correspond to μ and σ for Normal/Gaussian, and to μ and s for Logistic.

EvoTrees.predict(model, X)

Alternatively, models act as a functor, returning predictions when called as a function with features as argument:

model(X)

MLJ

From MLJ, the type can be imported using:

EvoTreeMLE = @load EvoTreeMLE pkg=EvoTrees

Do model = EvoTreeMLE() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeMLE(loss=...).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Operations

  • predict(mach, Xnew): returns a vector of Gaussian or Logistic distributions (according to provided loss) given features Xnew having the same scitype as X above.

Predictions are probabilistic.

Specific metrics can also be predicted using:

  • predict_mean(mach, Xnew)
  • predict_mode(mach, Xnew)
  • predict_median(mach, Xnew)

Fitted parameters

The fields of fitted_params(mach) are:

  • :fitresult: The GBTree object returned by the EvoTrees.jl fitting algorithm.

Report

The fields of report(mach) are:

  • :features: The names of the features encountered in training.

Examples

## Internal API
 using EvoTrees
 config = EvoTreeMLE(max_depth=5, nbins=32, nrounds=100)
 nobs, nfeats = 1_000, 5
@@ -14,4 +14,4 @@
 preds = predict(mach, X)
 preds = predict_mean(mach, X)
 preds = predict_mode(mach, X)
-preds = predict_median(mach, X)
+preds = predict_median(mach, X)
diff --git a/dev/models/EvoTreeRegressor_EvoTrees/index.html b/dev/models/EvoTreeRegressor_EvoTrees/index.html index 49d431461..0b6a3e0ac 100644 --- a/dev/models/EvoTreeRegressor_EvoTrees/index.html +++ b/dev/models/EvoTreeRegressor_EvoTrees/index.html @@ -1,5 +1,5 @@ -EvoTreeRegressor · MLJ

EvoTreeRegressor

EvoTreeRegressor(;kwargs...)

A model type for constructing an EvoTreeRegressor, based on EvoTrees.jl, and implementing both an internal API and the MLJ model interface.

Hyper-parameters

  • loss=:mse: Loss to be minimized during training. One of:

    • :mse
    • :logloss
    • :gamma
    • :tweedie
    • :quantile
    • :l1
  • nrounds=100: Number of rounds. It corresponds to the number of trees that will be sequentially stacked. Must be >= 1.

  • eta=0.1: Learning rate. Each tree's raw predictions are scaled by eta prior to being added to the stack of predictions. Must be > 0. A lower eta results in slower learning, requiring a higher nrounds, but typically improves model performance.

  • L2::T=0.0: L2 regularization factor on aggregate gain. Must be >= 0. Higher L2 can result in a more robust model.

  • lambda::T=0.0: L2 regularization factor on individual gain. Must be >= 0. Higher lambda can result in a more robust model.

  • gamma::T=0.0: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model. Must be >= 0.

  • alpha::T=0.5: Loss-specific parameter in the [0, 1] range:

    • :quantile: target quantile for the regression.
    • :l1: weighting parameter between positive and negative residuals. Positive residual weights = alpha; negative residual weights = (1 - alpha).

  • max_depth=6: Maximum depth of a tree. Must be >= 1. A tree of depth 1 is made of a single prediction leaf. A complete tree of depth N contains 2^(N - 1) terminal leaves and 2^(N - 1) - 1 split nodes. Compute cost is proportional to 2^max_depth. Typical optimal values are in the 3 to 9 range.

  • min_weight=1.0: Minimum weight needed in a node to perform a split. Matches the number of observations by default or the sum of weights as provided by the weights vector. Must be > 0.

  • rowsample=1.0: Proportion of rows that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • colsample=1.0: Proportion of columns / features that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • nbins=64: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins. Should be between 2 and 255.

  • monotone_constraints=Dict{Int, Int}(): Specify monotonic constraints using a dict where the key is the feature index and the value the applicable constraint (-1=decreasing, 0=none, 1=increasing). Only the :linear, :logistic, :gamma and :tweedie losses are supported at the moment.

  • tree_type="binary": Tree structure to be used. One of:

    • binary: Each node of a tree is grown independently. Trees are built depthwise until max depth is reached or until min weight or gain (see gamma) stops further node splits.
    • oblivious: A common splitting condition is imposed on all nodes of a given depth.
  • rng=123: Either an integer used as a seed to the random number generator or an actual random number generator (::Random.AbstractRNG).

Internal API

Do config = EvoTreeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeRegressor(loss=...).

Training model

A model is built using fit_evotree:

model = fit_evotree(config; x_train, y_train, kwargs...)

Inference

Predictions are obtained using predict which returns a Vector of length nobs:

EvoTrees.predict(model, X)

Alternatively, models act as a functor, returning predictions when called as a function with features as argument:

model(X)

MLJ Interface

From MLJ, the type can be imported using:

EvoTreeRegressor = @load EvoTreeRegressor pkg=EvoTrees

Do model = EvoTreeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeRegressor(loss=...).

Training model

In MLJ or MLJBase, bind an instance model to data with mach = machine(model, X, y) where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above. Predictions are deterministic.

Fitted parameters

The fields of fitted_params(mach) are:

  • :fitresult: The GBTree object returned by the EvoTrees.jl fitting algorithm.

Report

The fields of report(mach) are:

  • :features: The names of the features encountered in training.

Examples

## Internal API
+EvoTreeRegressor · MLJ

EvoTreeRegressor

EvoTreeRegressor(;kwargs...)

A model type for constructing an EvoTreeRegressor, based on EvoTrees.jl, and implementing both an internal API and the MLJ model interface.

Hyper-parameters

  • loss=:mse: Loss to be minimized during training. One of:

    • :mse
    • :logloss
    • :gamma
    • :tweedie
    • :quantile
    • :l1
  • nrounds=100: Number of rounds. It corresponds to the number of trees that will be sequentially stacked. Must be >= 1.

  • eta=0.1: Learning rate. Each tree's raw predictions are scaled by eta prior to being added to the stack of predictions. Must be > 0. A lower eta results in slower learning, requiring a higher nrounds, but typically improves model performance.

  • L2::T=0.0: L2 regularization factor on aggregate gain. Must be >= 0. Higher L2 can result in a more robust model.

  • lambda::T=0.0: L2 regularization factor on individual gain. Must be >= 0. Higher lambda can result in a more robust model.

  • gamma::T=0.0: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model. Must be >= 0.

  • alpha::T=0.5: Loss-specific parameter in the [0, 1] range:

    • :quantile: target quantile for the regression.
    • :l1: weighting parameter between positive and negative residuals. Positive residual weights = alpha; negative residual weights = (1 - alpha).

  • max_depth=6: Maximum depth of a tree. Must be >= 1. A tree of depth 1 is made of a single prediction leaf. A complete tree of depth N contains 2^(N - 1) terminal leaves and 2^(N - 1) - 1 split nodes. Compute cost is proportional to 2^max_depth. Typical optimal values are in the 3 to 9 range.

  • min_weight=1.0: Minimum weight needed in a node to perform a split. Matches the number of observations by default or the sum of weights as provided by the weights vector. Must be > 0.

  • rowsample=1.0: Proportion of rows that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • colsample=1.0: Proportion of columns / features that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • nbins=64: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins. Should be between 2 and 255.

  • monotone_constraints=Dict{Int, Int}(): Specify monotonic constraints using a dict where the key is the feature index and the value the applicable constraint (-1=decreasing, 0=none, 1=increasing). Only the :linear, :logistic, :gamma and :tweedie losses are supported at the moment.

  • tree_type="binary": Tree structure to be used. One of:

    • binary: Each node of a tree is grown independently. Trees are built depthwise until max depth is reached or until min weight or gain (see gamma) stops further node splits.
    • oblivious: A common splitting condition is imposed on all nodes of a given depth.
  • rng=123: Either an integer used as a seed to the random number generator or an actual random number generator (::Random.AbstractRNG).

Internal API

Do config = EvoTreeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeRegressor(loss=...).

Training model

A model is built using fit_evotree:

model = fit_evotree(config; x_train, y_train, kwargs...)

Inference

Predictions are obtained using predict which returns a Vector of length nobs:

EvoTrees.predict(model, X)

Alternatively, models act as a functor, returning predictions when called as a function with features as argument:

model(X)

MLJ Interface

From MLJ, the type can be imported using:

EvoTreeRegressor = @load EvoTreeRegressor pkg=EvoTrees

Do model = EvoTreeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeRegressor(loss=...).

Training model

In MLJ or MLJBase, bind an instance model to data with mach = machine(model, X, y) where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above. Predictions are deterministic.

Fitted parameters

The fields of fitted_params(mach) are:

  • :fitresult: The GBTree object returned by the EvoTrees.jl fitting algorithm.

Report

The fields of report(mach) are:

  • :features: The names of the features encountered in training.

Examples

## Internal API
 using EvoTrees
 config = EvoTreeRegressor(max_depth=5, nbins=32, nrounds=100)
 nobs, nfeats = 1_000, 5
@@ -11,4 +11,4 @@
 model = EvoTreeRegressor(max_depth=5, nbins=32, nrounds=100)
 X, y = @load_boston
 mach = machine(model, X, y) |> fit!
-preds = predict(mach, X)
+preds = predict(mach, X)
diff --git a/dev/models/ExtraTreesClassifier_MLJScikitLearnInterface/index.html b/dev/models/ExtraTreesClassifier_MLJScikitLearnInterface/index.html index c3c04ee09..73d414b73 100644 --- a/dev/models/ExtraTreesClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/ExtraTreesClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -ExtraTreesClassifier · MLJ

ExtraTreesClassifier

ExtraTreesClassifier

A model type for constructing an extra trees classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ExtraTreesClassifier = @load ExtraTreesClassifier pkg=MLJScikitLearnInterface

Do model = ExtraTreesClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ExtraTreesClassifier(n_estimators=...).

An extra trees classifier fits a number of randomized decision trees on various sub-samples of the dataset and uses averaging to improve predictive accuracy and control over-fitting.

+ExtraTreesClassifier · MLJ

ExtraTreesClassifier

ExtraTreesClassifier

A model type for constructing an extra trees classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ExtraTreesClassifier = @load ExtraTreesClassifier pkg=MLJScikitLearnInterface

Do model = ExtraTreesClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ExtraTreesClassifier(n_estimators=...).

An extra trees classifier fits a number of randomized decision trees on various sub-samples of the dataset and uses averaging to improve predictive accuracy and control over-fitting.
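
The following is a hedged usage sketch, not part of the original docstring; it assumes the scikit-learn Python dependency is available and that, as for other classifiers wrapped from scikit-learn, predictions are probabilistic:

using MLJ
ExtraTreesClassifier = @load ExtraTreesClassifier pkg=MLJScikitLearnInterface

X, y = @load_iris
model = ExtraTreesClassifier(n_estimators=100)
mach = machine(model, X, y) |> fit!

yhat = predict(mach, X)   ## probabilistic predictions (assumed)
predict_mode(mach, X)     ## point predictions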

diff --git a/dev/models/ExtraTreesRegressor_MLJScikitLearnInterface/index.html b/dev/models/ExtraTreesRegressor_MLJScikitLearnInterface/index.html index 54725a0c5..5a882ae2a 100644 --- a/dev/models/ExtraTreesRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/ExtraTreesRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -ExtraTreesRegressor · MLJ

ExtraTreesRegressor

ExtraTreesRegressor

A model type for constructing an extra trees regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ExtraTreesRegressor = @load ExtraTreesRegressor pkg=MLJScikitLearnInterface

Do model = ExtraTreesRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ExtraTreesRegressor(n_estimators=...).

An extra trees regressor fits a number of randomized decision trees on various sub-samples of the dataset and uses averaging to improve predictive accuracy and control over-fitting.

+ExtraTreesRegressor · MLJ

ExtraTreesRegressor

ExtraTreesRegressor

A model type for constructing an extra trees regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ExtraTreesRegressor = @load ExtraTreesRegressor pkg=MLJScikitLearnInterface

Do model = ExtraTreesRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ExtraTreesRegressor(n_estimators=...).

An extra trees regressor fits a number of randomized decision trees on various sub-samples of the dataset and uses averaging to improve predictive accuracy and control over-fitting.
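
A hedged usage sketch, not part of the original docstring; it assumes the scikit-learn Python dependency is available:

using MLJ
ExtraTreesRegressor = @load ExtraTreesRegressor pkg=MLJScikitLearnInterface

X, y = make_regression(100, 4)  ## synthetic regression data
model = ExtraTreesRegressor(n_estimators=100)
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)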

diff --git a/dev/models/FactorAnalysis_MultivariateStats/index.html b/dev/models/FactorAnalysis_MultivariateStats/index.html index b2c4edc32..496add57b 100644 --- a/dev/models/FactorAnalysis_MultivariateStats/index.html +++ b/dev/models/FactorAnalysis_MultivariateStats/index.html @@ -1,5 +1,5 @@ -FactorAnalysis · MLJ

FactorAnalysis

FactorAnalysis

A model type for constructing a factor analysis model, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

FactorAnalysis = @load FactorAnalysis pkg=MultivariateStats

Do model = FactorAnalysis() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in FactorAnalysis(method=...).

Factor analysis is a linear-Gaussian latent variable model that is closely related to probabilistic PCA. In contrast to the probabilistic PCA model, the covariance of the conditional distribution of the observed variables given the latent variables is diagonal rather than isotropic.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • method::Symbol=:cm: Method to use to solve the problem, one of :ml, :em, :bayes.
  • maxoutdim=0: Controls the dimension (number of columns) of the output, outdim. Specifically, outdim = min(n, indim, maxoutdim), where n is the number of observations and indim the input dimension.
  • maxiter::Int=1000: Maximum number of iterations.
  • tol::Real=1e-6: Convergence tolerance.
  • eta::Real=tol: Variance lower bound.
  • mean::Union{Nothing, Real, Vector{Float64}}=nothing: If nothing, centering will be computed and applied; if set to 0 no centering is applied (data is assumed pre-centered); if a vector, the centering is done with that vector.

Operations

  • transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.
  • inverse_transform(mach, Xsmall): For a dimension-reduced table Xsmall, such as returned by transform, reconstruct a table, having the same number of columns as the original training data X, that transforms to Xsmall. Mathematically, inverse_transform is a right-inverse for the PCA projection map, whose image is orthogonal to the kernel of that map. In particular, if Xsmall = transform(mach, Xnew), then inverse_transform(Xsmall) is only an approximation to Xnew.

Fitted parameters

The fields of fitted_params(mach) are:

  • projection: Returns the projection matrix, which has size (indim, outdim), where indim and outdim are the number of features of the input and output respectively. Each column of the projection matrix corresponds to a factor.

Report

The fields of report(mach) are:

  • indim: Dimension (number of columns) of the training data and new data to be transformed.
  • outdim: Dimension of transformed data (number of factors).
  • variance: The variance of the factors.
  • covariance_matrix: The estimated covariance matrix.
  • mean: The mean of the untransformed training data, of length indim.
  • loadings: The factor loadings. A matrix of size (indim, outdim) where indim and outdim are as defined above.

Examples

using MLJ
+FactorAnalysis · MLJ

FactorAnalysis

FactorAnalysis

A model type for constructing a factor analysis model, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

FactorAnalysis = @load FactorAnalysis pkg=MultivariateStats

Do model = FactorAnalysis() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in FactorAnalysis(method=...).

Factor analysis is a linear-Gaussian latent variable model that is closely related to probabilistic PCA. In contrast to the probabilistic PCA model, the covariance of the conditional distribution of the observed variables given the latent variables is diagonal rather than isotropic.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • method::Symbol=:cm: Method to use to solve the problem, one of :ml, :em, :bayes.
  • maxoutdim=0: Controls the dimension (number of columns) of the output, outdim. Specifically, outdim = min(n, indim, maxoutdim), where n is the number of observations and indim the input dimension.
  • maxiter::Int=1000: Maximum number of iterations.
  • tol::Real=1e-6: Convergence tolerance.
  • eta::Real=tol: Variance lower bound.
  • mean::Union{Nothing, Real, Vector{Float64}}=nothing: If nothing, centering will be computed and applied; if set to 0 no centering is applied (data is assumed pre-centered); if a vector, the centering is done with that vector.

Operations

  • transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.
  • inverse_transform(mach, Xsmall): For a dimension-reduced table Xsmall, such as returned by transform, reconstruct a table, having the same number of columns as the original training data X, that transforms to Xsmall. Mathematically, inverse_transform is a right-inverse for the PCA projection map, whose image is orthogonal to the kernel of that map. In particular, if Xsmall = transform(mach, Xnew), then inverse_transform(Xsmall) is only an approximation to Xnew.

Fitted parameters

The fields of fitted_params(mach) are:

  • projection: Returns the projection matrix, which has size (indim, outdim), where indim and outdim are the number of features of the input and output respectively. Each column of the projection matrix corresponds to a factor.

Report

The fields of report(mach) are:

  • indim: Dimension (number of columns) of the training data and new data to be transformed.
  • outdim: Dimension of transformed data (number of factors).
  • variance: The variance of the factors.
  • covariance_matrix: The estimated covariance matrix.
  • mean: The mean of the untransformed training data, of length indim.
  • loadings: The factor loadings. A matrix of size (indim, outdim) where indim and outdim are as defined above.

Examples

using MLJ
 
 FactorAnalysis = @load FactorAnalysis pkg=MultivariateStats
 
@@ -8,4 +8,4 @@
 model = FactorAnalysis(maxoutdim=2)
 mach = machine(model, X) |> fit!
 
-Xproj = transform(mach, X)

See also KernelPCA, ICA, PPCA, PCA

+Xproj = transform(mach, X)
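
As noted under "Operations" above, the projection can also be approximately inverted; continuing the example (a small hedged extension, not in the original):

Xapprox = inverse_transform(mach, Xproj)  ## approximate reconstruction of X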

See also KernelPCA, ICA, PPCA, PCA

diff --git a/dev/models/FeatureAgglomeration_MLJScikitLearnInterface/index.html b/dev/models/FeatureAgglomeration_MLJScikitLearnInterface/index.html index 265714403..b69eca028 100644 --- a/dev/models/FeatureAgglomeration_MLJScikitLearnInterface/index.html +++ b/dev/models/FeatureAgglomeration_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -FeatureAgglomeration · MLJ

FeatureAgglomeration

FeatureAgglomeration

A model type for constructing a feature agglomeration, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

FeatureAgglomeration = @load FeatureAgglomeration pkg=MLJScikitLearnInterface

Do model = FeatureAgglomeration() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in FeatureAgglomeration(n_clusters=...).

Similar to AgglomerativeClustering, but recursively merges features instead of samples.

+FeatureAgglomeration · MLJ

FeatureAgglomeration

FeatureAgglomeration

A model type for constructing a feature agglomeration, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

FeatureAgglomeration = @load FeatureAgglomeration pkg=MLJScikitLearnInterface

Do model = FeatureAgglomeration() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in FeatureAgglomeration(n_clusters=...).

Similar to AgglomerativeClustering, but recursively merges features instead of samples.
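
A hedged usage sketch, not part of the original docstring; it assumes the scikit-learn Python dependency is available and the usual MLJ transformer workflow:

using MLJ
FeatureAgglomeration = @load FeatureAgglomeration pkg=MLJScikitLearnInterface

X, _ = @load_iris
model = FeatureAgglomeration(n_clusters=2)
mach = machine(model, X) |> fit!
Xsmall = transform(mach, X)  ## table with the original features merged into 2 clusters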

diff --git a/dev/models/FeatureSelector_FeatureSelection/index.html b/dev/models/FeatureSelector_FeatureSelection/index.html new file mode 100644 index 000000000..4dd225235 --- /dev/null +++ b/dev/models/FeatureSelector_FeatureSelection/index.html @@ -0,0 +1,17 @@ + +FeatureSelector · MLJ

FeatureSelector

FeatureSelector

A model type for constructing a feature selector, based on FeatureSelection.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

FeatureSelector = @load FeatureSelector pkg=FeatureSelection

Do model = FeatureSelector() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in FeatureSelector(features=...).

Use this model to select features (columns) of a table, usually as part of a model Pipeline.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any table of input features, where "table" is in the sense of Tables.jl

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • features: one of the following, with the behavior indicated:

    • [] (empty, the default): filter out all features (columns) which were not encountered in training
    • non-empty vector of feature names (symbols): keep only the specified features (ignore=false) or keep only unspecified features (ignore=true)
    • function or other callable: keep a feature if the callable returns true on its name. For example, specifying FeatureSelector(features = name -> name in [:x1, :x3], ignore = true) has the same effect as FeatureSelector(features = [:x1, :x3], ignore = true), namely to select all features, with the exception of :x1 and :x3.
  • ignore: whether to ignore or keep specified features, as explained above

Operations

  • transform(mach, Xnew): select features from the table Xnew as specified by the model, taking features seen during training into account, if relevant

Fitted parameters

The fields of fitted_params(mach) are:

  • features_to_keep: the features that will be selected

Example

using MLJ
+
+X = (ordinal1 = [1, 2, 3],
+     ordinal2 = coerce(["x", "y", "x"], OrderedFactor),
+     ordinal3 = [10.0, 20.0, 30.0],
+     ordinal4 = [-20.0, -30.0, -40.0],
+     nominal = coerce(["Your father", "he", "is"], Multiclass));
+
+selector = FeatureSelector(features=[:ordinal3, ], ignore=true);
+
+julia> transform(fit!(machine(selector, X)), X)
+(ordinal1 = [1, 2, 3],
+ ordinal2 = CategoricalValue{Symbol,UInt32}["x", "y", "x"],
+ ordinal4 = [-20.0, -30.0, -40.0],
+ nominal = CategoricalValue{String,UInt32}["Your father", "he", "is"],)
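
The callable form of features described above can be used in the same way; a minimal sketch (illustrative only), reusing the table X defined above:

selector2 = FeatureSelector(features = name -> name in [:ordinal1, :nominal], ignore = true)
transform(fit!(machine(selector2, X)), X)  ## keeps :ordinal2, :ordinal3 and :ordinal4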
+
diff --git a/dev/models/FeatureSelector_MLJModels/index.html b/dev/models/FeatureSelector_MLJModels/index.html deleted file mode 100644 index 499e1e5f5..000000000 --- a/dev/models/FeatureSelector_MLJModels/index.html +++ /dev/null @@ -1,17 +0,0 @@ - -FeatureSelector · MLJ

FeatureSelector

FeatureSelector

A model type for constructing a feature selector, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

FeatureSelector = @load FeatureSelector pkg=MLJModels

Do model = FeatureSelector() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in FeatureSelector(features=...).

Use this model to select features (columns) of a table, usually as part of a model Pipeline.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any table of input features, where "table" is in the sense of Tables.jl

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • features: one of the following, with the behavior indicated:

    • [] (empty, the default): filter out all features (columns) which were not encountered in training
    • non-empty vector of feature names (symbols): keep only the specified features (ignore=false) or keep only unspecified features (ignore=true)
    • function or other callable: keep a feature if the callable returns true on its name. For example, specifying FeatureSelector(features = name -> name in [:x1, :x3], ignore = true) has the same effect as FeatureSelector(features = [:x1, :x3], ignore = true), namely to select all features, with the exception of :x1 and :x3.
  • ignore: whether to ignore or keep specified features, as explained above

Operations

  • transform(mach, Xnew): select features from the table Xnew as specified by the model, taking features seen during training into account, if relevant

Fitted parameters

The fields of fitted_params(mach) are:

  • features_to_keep: the features that will be selected

Example

using MLJ
-
-X = (ordinal1 = [1, 2, 3],
-     ordinal2 = coerce(["x", "y", "x"], OrderedFactor),
-     ordinal3 = [10.0, 20.0, 30.0],
-     ordinal4 = [-20.0, -30.0, -40.0],
-     nominal = coerce(["Your father", "he", "is"], Multiclass));
-
-selector = FeatureSelector(features=[:ordinal3, ], ignore=true);
-
-julia> transform(fit!(machine(selector, X)), X)
-(ordinal1 = [1, 2, 3],
- ordinal2 = CategoricalValue{Symbol,UInt32}["x", "y", "x"],
- ordinal4 = [-20.0, -30.0, -40.0],
- nominal = CategoricalValue{String,UInt32}["Your father", "he", "is"],)
-
diff --git a/dev/models/FillImputer_MLJModels/index.html b/dev/models/FillImputer_MLJModels/index.html index 010f23465..dc306f6b0 100644 --- a/dev/models/FillImputer_MLJModels/index.html +++ b/dev/models/FillImputer_MLJModels/index.html @@ -1,5 +1,5 @@ -FillImputer · MLJ

FillImputer

FillImputer

A model type for constructing a fill imputer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

FillImputer = @load FillImputer pkg=MLJModels

Do model = FillImputer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in FillImputer(features=...).

Use this model to impute missing values in tabular data. A fixed "filler" value is learned from the training data, one for each column of the table.

For imputing missing values in a vector, use UnivariateFillImputer instead.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have element scitypes Union{Missing, T}, where T is a subtype of Continuous, Multiclass, OrderedFactor or Count. Check scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • features: a vector of names of features (symbols) for which imputation is to be attempted; default is empty, which is interpreted as "impute all".
  • continuous_fill: function or other callable to determine value to be imputed in the case of Continuous (abstract float) data; default is to apply median after skipping missing values
  • count_fill: function or other callable to determine value to be imputed in the case of Count (integer) data; default is to apply rounded median after skipping missing values
  • finite_fill: function or other callable to determine value to be imputed in the case of Multiclass or OrderedFactor data (categorical vectors); default is to apply mode after skipping missing values

Operations

  • transform(mach, Xnew): return Xnew with missing values imputed with the fill values learned when fitting mach

Fitted parameters

The fields of fitted_params(mach) are:

  • features_seen_in_fit: the names of features (columns) encountered during training
  • univariate_transformer: the univariate model applied to determine the fillers (its fields contain the functions defining the filler computations)
  • filler_given_feature: dictionary of filler values, keyed on feature (column) names

Examples

using MLJ
+FillImputer · MLJ

FillImputer

FillImputer

A model type for constructing a fill imputer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

FillImputer = @load FillImputer pkg=MLJModels

Do model = FillImputer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in FillImputer(features=...).

Use this model to impute missing values in tabular data. A fixed "filler" value is learned from the training data, one for each column of the table.

For imputing missing values in a vector, use UnivariateFillImputer instead.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have element scitypes Union{Missing, T}, where T is a subtype of Continuous, Multiclass, OrderedFactor or Count. Check scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • features: a vector of names of features (symbols) for which imputation is to be attempted; default is empty, which is interpreted as "impute all".
  • continuous_fill: function or other callable to determine value to be imputed in the case of Continuous (abstract float) data; default is to apply median after skipping missing values
  • count_fill: function or other callable to determine value to be imputed in the case of Count (integer) data; default is to apply rounded median after skipping missing values
  • finite_fill: function or other callable to determine value to be imputed in the case of Multiclass or OrderedFactor data (categorical vectors); default is to apply mode after skipping missing values

Operations

  • transform(mach, Xnew): return Xnew with missing values imputed with the fill values learned when fitting mach

Fitted parameters

The fields of fitted_params(mach) are:

  • features_seen_in_fit: the names of features (columns) encountered during training
  • univariate_transformer: the univariate model applied to determine the fillers (its fields contain the functions defining the filler computations)
  • filler_given_feature: dictionary of filler values, keyed on feature (column) names

Examples

using MLJ
 imputer = FillImputer()
 
 X = (a = [1.0, 2.0, missing, 3.0, missing],
@@ -31,4 +31,4 @@
 julia> transform(mach, X)
 (a = [1.0, 2.0, 2.0, 3.0, 2.0],
  b = CategoricalValue{String, UInt32}["y", "n", "y", "y", "y"],
- c = [1, 1, 2, 2, 3],)

See also UnivariateFillImputer.

+ c = [1, 1, 2, 2, 3],)

See also UnivariateFillImputer.

diff --git a/dev/models/GMMDetector_OutlierDetectionPython/index.html b/dev/models/GMMDetector_OutlierDetectionPython/index.html index 296f84765..e72bf8181 100644 --- a/dev/models/GMMDetector_OutlierDetectionPython/index.html +++ b/dev/models/GMMDetector_OutlierDetectionPython/index.html @@ -1,5 +1,5 @@ -GMMDetector · MLJ

GMMDetector

GMMDetector(n_components=1,
+GMMDetector · MLJ
+               warm_start=False)

https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.gmm
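
A heavily hedged usage sketch, not from the original page; it assumes the PyOD Python dependency is available and that, as for other OutlierDetectionPython detectors, transform returns raw outlier scores:

using MLJ
GMMDetector = @load GMMDetector pkg=OutlierDetectionPython

X, _ = @load_iris
detector = GMMDetector(n_components=3)
mach = machine(detector, X) |> fit!
scores = transform(mach, X)  ## raw outlier scores (assumed convention)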

diff --git a/dev/models/GaussianMixtureClusterer_BetaML/index.html b/dev/models/GaussianMixtureClusterer_BetaML/index.html index cef1c61fb..700daab8e 100644 --- a/dev/models/GaussianMixtureClusterer_BetaML/index.html +++ b/dev/models/GaussianMixtureClusterer_BetaML/index.html @@ -1,5 +1,5 @@ -GaussianMixtureClusterer · MLJ

GaussianMixtureClusterer

mutable struct GaussianMixtureClusterer <: MLJModelInterface.Unsupervised

An Expectation-Maximisation clustering algorithm with customisable mixtures, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • n_classes::Int64: Number of mixtures (latent classes) to consider [def: 3]

  • initial_probmixtures::AbstractVector{Float64}: Initial probabilities of the categorical distribution (n_classes x 1) [default: []]

  • mixtures::Union{Type, Vector{<:BetaML.GMM.AbstractMixture}}: An array (of length n_classes) of the mixtures to employ (see the ?GMM module). Each mixture object can be provided with or without its parameters (e.g. mean and variance for the gaussian ones). Fully qualified mixtures are useful only if the initialisation_strategy parameter is set to "given". This parameter can also be given simply in terms of a type; in this case it is automatically extended to a vector of n_classes mixtures of the specified type. Note that mixing of different mixture types is not currently supported. [def: [DiagonalGaussian() for i in 1:n_classes]]

  • tol::Float64: Tolerance to stop the algorithm [default: 10^(-6)]

  • minimum_variance::Float64: Minimum variance for the mixtures [default: 0.05]

  • minimum_covariance::Float64: Minimum covariance for the mixtures with full covariance matrix [default: 0]. This should be set to a value different from minimum_variance (see notes).

  • initialisation_strategy::String: The computation method of the vector of the initial mixtures. One of the following:

    • "grid": using a grid approach
    • "given": using the mixture provided in the fully qualified mixtures parameter
    • "kmeans": use first kmeans (itself initialised with a "grid" strategy) to set the initial mixture centers [default]

    Note that currently "random" and "shuffle" initialisations are not supported in gmm-based algorithms.

  • maximum_iterations::Int64: Maximum number of iterations [def: typemax(Int64), i.e. ∞]

  • rng::Random.AbstractRNG: Random Number Generator [default: Random.GLOBAL_RNG]

Example:


+GaussianMixtureClusterer · MLJ

GaussianMixtureClusterer

mutable struct GaussianMixtureClusterer <: MLJModelInterface.Unsupervised

An Expectation-Maximisation clustering algorithm with customisable mixtures, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • n_classes::Int64: Number of mixtures (latent classes) to consider [def: 3]

  • initial_probmixtures::AbstractVector{Float64}: Initial probabilities of the categorical distribution (n_classes x 1) [default: []]

  • mixtures::Union{Type, Vector{<:BetaML.GMM.AbstractMixture}}: An array (of length n_classes) of the mixtures to employ (see the ?GMM module). Each mixture object can be provided with or without its parameters (e.g. mean and variance for the gaussian ones). Fully qualified mixtures are useful only if the initialisation_strategy parameter is set to "given". This parameter can also be given simply in terms of a type (see the sketch after this list); in this case it is automatically extended to a vector of n_classes mixtures of the specified type. Note that mixing of different mixture types is not currently supported. [def: [DiagonalGaussian() for i in 1:n_classes]]

  • tol::Float64: Tolerance to stop the algorithm [default: 10^(-6)]

  • minimum_variance::Float64: Minimum variance for the mixtures [default: 0.05]

  • minimum_covariance::Float64: Minimum covariance for the mixtures with full covariance matrix [default: 0]. This should be set to a value different from minimum_variance (see notes).

  • initialisation_strategy::String: The computation method of the vector of the initial mixtures. One of the following:

    • "grid": using a grid approach
    • "given": using the mixture provided in the fully qualified mixtures parameter
    • "kmeans": use first kmeans (itself initialised with a "grid" strategy) to set the initial mixture centers [default]

    Note that currently "random" and "shuffle" initialisations are not supported in gmm-based algorithms.

  • maximum_iterations::Int64: Maximum number of iterations [def: typemax(Int64), i.e. ∞]

  • rng::Random.AbstractRNG: Random Number Generator [default: Random.GLOBAL_RNG]
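
As sketched here (illustrative only; assumes BetaML is installed and that its mixture types are accessible as shown), the mixtures hyper-parameter can be supplied simply as a type:

using MLJ
import BetaML

GaussianMixtureClusterer = @load GaussianMixtureClusterer pkg=BetaML
model = GaussianMixtureClusterer(n_classes=3, mixtures=BetaML.GMM.SphericalGaussian)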

Example:


 julia> using MLJ
 
 julia> X, y        = @load_iris;
@@ -34,4 +34,4 @@
  ⋮
  UnivariateFinite{Multiclass{3}}(1=>5.39e-25, 2=>0.0167, 3=>0.983)
  UnivariateFinite{Multiclass{3}}(1=>7.5e-29, 2=>0.000106, 3=>1.0)
- UnivariateFinite{Multiclass{3}}(1=>1.6e-20, 2=>0.594, 3=>0.406)
+ UnivariateFinite{Multiclass{3}}(1=>1.6e-20, 2=>0.594, 3=>0.406)
diff --git a/dev/models/GaussianMixtureImputer_BetaML/index.html b/dev/models/GaussianMixtureImputer_BetaML/index.html index e70f41b39..398f39060 100644 --- a/dev/models/GaussianMixtureImputer_BetaML/index.html +++ b/dev/models/GaussianMixtureImputer_BetaML/index.html @@ -1,5 +1,5 @@ -GaussianMixtureImputer · MLJ

GaussianMixtureImputer

mutable struct GaussianMixtureImputer <: MLJModelInterface.Unsupervised

Impute missing values using a probabilistic approach (Gaussian Mixture Models) fitted using the Expectation-Maximisation algorithm, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • n_classes::Int64: Number of mixtures (latent classes) to consider [def: 3]

  • initial_probmixtures::Vector{Float64}: Initial probabilities of the categorical distribution (n_classes x 1) [default: []]

  • mixtures::Union{Type, Vector{<:BetaML.GMM.AbstractMixture}}: An array (of length n_classes) of the mixtures to employ (see the ?GMM module in BetaML). Each mixture object can be provided with or without its parameters (e.g. mean and variance for the gaussian ones). Fully qualified mixtures are useful only if the initialisation_strategy parameter is set to "given". This parameter can also be given simply in terms of a type; in this case it is automatically extended to a vector of n_classes mixtures of the specified type. Note that mixing of different mixture types is not currently supported and that the currently implemented mixtures are SphericalGaussian, DiagonalGaussian and FullGaussian. [def: DiagonalGaussian]

  • tol::Float64: Tolerance to stop the algorithm [default: 10^(-6)]

  • minimum_variance::Float64: Minimum variance for the mixtures [default: 0.05]

  • minimum_covariance::Float64: Minimum covariance for the mixtures with full covariance matrix [default: 0]. This should be set to a value different from minimum_variance.

  • initialisation_strategy::String: The computation method of the vector of the initial mixtures. One of the following:

    • "grid": using a grid approach
    • "given": using the mixture provided in the fully qualified mixtures parameter
    • "kmeans": use first kmeans (itself initialised with a "grid" strategy) to set the initial mixture centers [default]

    Note that currently "random" and "shuffle" initialisations are not supported in gmm-based algorithms.

  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]

Example :

julia> using MLJ
+GaussianMixtureImputer · MLJ

GaussianMixtureImputer

mutable struct GaussianMixtureImputer <: MLJModelInterface.Unsupervised

Impute missing values using a probabilistic approach (Gaussian Mixture Models) fitted using the Expectation-Maximisation algorithm, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • n_classes::Int64: Number of mixtures (latent classes) to consider [def: 3]

  • initial_probmixtures::Vector{Float64}: Initial probabilities of the categorical distribution (n_classes x 1) [default: []]

  • mixtures::Union{Type, Vector{<:BetaML.GMM.AbstractMixture}}: An array (of length n_classes) of the mixtures to employ (see the ?GMM module in BetaML). Each mixture object can be provided with or without its parameters (e.g. mean and variance for the gaussian ones). Fully qualified mixtures are useful only if the initialisation_strategy parameter is set to "given". This parameter can also be given simply in terms of a type; in this case it is automatically extended to a vector of n_classes mixtures of the specified type. Note that mixing of different mixture types is not currently supported and that the currently implemented mixtures are SphericalGaussian, DiagonalGaussian and FullGaussian. [def: DiagonalGaussian]

  • tol::Float64: Tolerance to stop the algorithm [default: 10^(-6)]

  • minimum_variance::Float64: Minimum variance for the mixtures [default: 0.05]

  • minimum_covariance::Float64: Minimum covariance for the mixtures with full covariance matrix [default: 0]. This should be set to a value different from minimum_variance.

  • initialisation_strategy::String: The computation method of the vector of the initial mixtures. One of the following:

    • "grid": using a grid approach
    • "given": using the mixture provided in the fully qualified mixtures parameter
    • "kmeans": use first kmeans (itself initialised with a "grid" strategy) to set the initial mixture centers [default]

    Note that currently "random" and "shuffle" initialisations are not supported in gmm-based algorithms.

  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]

Example :

julia> using MLJ
 
 julia> X = [1 10.5;1.5 missing; 1.8 8; 1.7 15; 3.2 40; missing missing; 3.3 38; missing -2.3; 5.2 -2.4] |> table ;
 
@@ -33,4 +33,4 @@
  2.51842  15.1747
  3.3      38.0
  2.47412  -2.3
- 5.2      -2.4
+ 5.2 -2.4
diff --git a/dev/models/GaussianMixtureRegressor_BetaML/index.html b/dev/models/GaussianMixtureRegressor_BetaML/index.html index 6b3a67117..bc0c0f4fd 100644 --- a/dev/models/GaussianMixtureRegressor_BetaML/index.html +++ b/dev/models/GaussianMixtureRegressor_BetaML/index.html @@ -1,5 +1,5 @@ -GaussianMixtureRegressor · MLJ

GaussianMixtureRegressor

mutable struct GaussianMixtureRegressor <: MLJModelInterface.Deterministic

A non-linear regressor derived from fitting the data on a probabilistic model (Gaussian Mixture Model). Relatively fast but generally not very precise, except for data with a structure matching the chosen underlying mixture.

This is the single-target version of the model. If you want to predict several labels (y) at once, use the MLJ model MultitargetGaussianMixtureRegressor.

Hyperparameters:

  • n_classes::Int64: Number of mixtures (latent classes) to consider [def: 3]

  • initial_probmixtures::Vector{Float64}: Initial probabilities of the categorical distribution (n_classes x 1) [default: []]

  • mixtures::Union{Type, Vector{<:BetaML.GMM.AbstractMixture}}: An array (of length n_classes) of the mixtures to employ (see the ?GMM module). Each mixture object can be provided with or without its parameters (e.g. mean and variance for the gaussian ones). Fully qualified mixtures are useful only if the initialisation_strategy parameter is set to "given". This parameter can also be given simply in terms of a type; in this case it is automatically extended to a vector of n_classes mixtures of the specified type. Note that mixing of different mixture types is not currently supported. [def: [DiagonalGaussian() for i in 1:n_classes]]

  • tol::Float64: Tolerance to stop the algorithm [default: 10^(-6)]

  • minimum_variance::Float64: Minimum variance for the mixtures [default: 0.05]

  • minimum_covariance::Float64: Minimum covariance for the mixtures with full covariance matrix [default: 0]. This should be set to a value different from minimum_variance (see notes).

  • initialisation_strategy::String: The computation method of the vector of the initial mixtures. One of the following:

    • "grid": using a grid approach
    • "given": using the mixture provided in the fully qualified mixtures parameter
    • "kmeans": use first kmeans (itself initialised with a "grid" strategy) to set the initial mixture centers [default]

    Note that currently "random" and "shuffle" initialisations are not supported in gmm-based algorithms.

  • maximum_iterations::Int64: Maximum number of iterations [def: typemax(Int64), i.e. ∞]

  • rng::Random.AbstractRNG: Random Number Generator [default: Random.GLOBAL_RNG]

Example:

julia> using MLJ
+GaussianMixtureRegressor · MLJ

GaussianMixtureRegressor

mutable struct GaussianMixtureRegressor <: MLJModelInterface.Deterministic

A non-linear regressor derived from fitting the data on a probabilistic model (Gaussian Mixture Model). Relatively fast but generally not very precise, except for data with a structure matching the chosen underlying mixture.

This is the single-target version of the model. If you want to predict several labels (y) at once, use the MLJ model MultitargetGaussianMixtureRegressor.

Hyperparameters:

  • n_classes::Int64: Number of mixtures (latent classes) to consider [def: 3]

  • initial_probmixtures::Vector{Float64}: Initial probabilities of the categorical distribution (n_classes x 1) [default: []]

  • mixtures::Union{Type, Vector{<:BetaML.GMM.AbstractMixture}}: An array (of length n_classes) of the mixtures to employ (see the ?GMM module). Each mixture object can be provided with or without its parameters (e.g. mean and variance for the gaussian ones). Fully qualified mixtures are useful only if the initialisation_strategy parameter is set to "given". This parameter can also be given simply in terms of a type; in this case it is automatically extended to a vector of n_classes mixtures of the specified type. Note that mixing of different mixture types is not currently supported. [def: [DiagonalGaussian() for i in 1:n_classes]]

  • tol::Float64: Tolerance to stop the algorithm [default: 10^(-6)]

  • minimum_variance::Float64: Minimum variance for the mixtures [default: 0.05]

  • minimum_covariance::Float64: Minimum covariance for the mixtures with full covariance matrix [default: 0]. This should be set to a value different from minimum_variance (see notes).

  • initialisation_strategy::String: The computation method of the vector of the initial mixtures. One of the following:

    • "grid": using a grid approach
    • "given": using the mixture provided in the fully qualified mixtures parameter
    • "kmeans": use first kmeans (itself initialised with a "grid" strategy) to set the initial mixture centers [default]

    Note that currently "random" and "shuffle" initialisations are not supported in gmm-based algorithms.

  • maximum_iterations::Int64: Maximum number of iterations [def: typemax(Int64), i.e. ∞]

  • rng::Random.AbstractRNG: Random Number Generator [default: Random.GLOBAL_RNG]

Example:

julia> using MLJ
 
 julia> X, y      = @load_boston;
 
@@ -30,4 +30,4 @@
  24.70344283512716
   ⋮
  17.172486989759676
- 17.172486989759644
+ 17.172486989759644
diff --git a/dev/models/GaussianNBClassifier_MLJScikitLearnInterface/index.html b/dev/models/GaussianNBClassifier_MLJScikitLearnInterface/index.html index 42f0f184f..245276544 100644 --- a/dev/models/GaussianNBClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/GaussianNBClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -GaussianNBClassifier · MLJ

GaussianNBClassifier

GaussianNBClassifier

A model type for constructing a Gaussian naive Bayes classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

GaussianNBClassifier = @load GaussianNBClassifier pkg=MLJScikitLearnInterface

Do model = GaussianNBClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in GaussianNBClassifier(priors=...).

Hyper-parameters

  • priors = nothing
  • var_smoothing = 1.0e-9
+GaussianNBClassifier · MLJ

GaussianNBClassifier

GaussianNBClassifier

A model type for constructing a Gaussian naive Bayes classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

GaussianNBClassifier = @load GaussianNBClassifier pkg=MLJScikitLearnInterface

Do model = GaussianNBClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in GaussianNBClassifier(priors=...).

Hyper-parameters

  • priors = nothing
  • var_smoothing = 1.0e-9
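
A hedged usage sketch, not part of the original docstring; it assumes the scikit-learn Python dependency is available and that predictions are probabilistic, as for the NaiveBayes.jl variant documented below:

using MLJ
GaussianNBClassifier = @load GaussianNBClassifier pkg=MLJScikitLearnInterface

X, y = @load_iris
model = GaussianNBClassifier()
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)  ## probabilistic predictions (assumed)
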
diff --git a/dev/models/GaussianNBClassifier_NaiveBayes/index.html b/dev/models/GaussianNBClassifier_NaiveBayes/index.html index b46aa6110..ddb403547 100644 --- a/dev/models/GaussianNBClassifier_NaiveBayes/index.html +++ b/dev/models/GaussianNBClassifier_NaiveBayes/index.html @@ -1,5 +1,5 @@ -GaussianNBClassifier · MLJ

GaussianNBClassifier

GaussianNBClassifier

A model type for constructing a Gaussian naive Bayes classifier, based on NaiveBayes.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

GaussianNBClassifier = @load GaussianNBClassifier pkg=NaiveBayes

Do model = GaussianNBClassifier() to construct an instance with default hyper-parameters.

Given each class taken on by the target variable y, it is supposed that the conditional probability distribution for the input variables X is a multivariate Gaussian. The mean and covariance of these Gaussian distributions are estimated using maximum likelihood, and a probability distribution for y given X is deduced by applying Bayes' rule. The required marginal for y is estimated using class frequency in the training data.

Important. The name "naive Bayes classifier" is perhaps misleading. Since we are learning the full multivariate Gaussian distributions for X given y, we are not applying the usual naive Bayes independence condition, which would amount to forcing the covariance matrix to be diagonal.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check the column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Finite; check the scitype with schema(y)

Train the machine using fit!(mach, rows=...).

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above. Predictions are probabilistic.
  • predict_mode(mach, Xnew): Return the mode of above predictions.

Fitted parameters

The fields of fitted_params(mach) are:

  • c_counts: A dictionary containing the observed count of each input class.

  • c_stats: A dictionary containing observed statistics on each input class. Each class is represented by a DataStats object, with the following fields:

    • n_vars: The number of variables used to describe the class's behavior.
    • n_obs: The number of times the class is observed.
    • obs_axis: The axis along which the observations were computed.
  • gaussians: A per class dictionary of Gaussians, each representing the distribution of the class. Represented with type Distributions.MvNormal from the Distributions.jl package.

  • n_obs: The total number of observations in the training data.

Examples

using MLJ
+GaussianNBClassifier · MLJ

GaussianNBClassifier

GaussianNBClassifier

A model type for constructing a Gaussian naive Bayes classifier, based on NaiveBayes.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

GaussianNBClassifier = @load GaussianNBClassifier pkg=NaiveBayes

Do model = GaussianNBClassifier() to construct an instance with default hyper-parameters.

Given each class taken on by the target variable y, it is supposed that the conditional probability distribution for the input variables X is a multivariate Gaussian. The mean and covariance of these Gaussian distributions are estimated using maximum likelihood, and a probability distribution for y given X is deduced by applying Bayes' rule. The required marginal for y is estimated using class frequency in the training data.

Important. The name "naive Bayes classifier" is perhaps misleading. Since we are learning the full multivariate Gaussian distributions for X given y, we are not applying the usual naive Bayes independence condition, which would amount to forcing the covariance matrix to be diagonal.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check the column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Finite; check the scitype with schema(y)

Train the machine using fit!(mach, rows=...).

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above. Predictions are probabilistic.
  • predict_mode(mach, Xnew): Return the mode of above predictions.

Fitted parameters

The fields of fitted_params(mach) are:

  • c_counts: A dictionary containing the observed count of each input class.

  • c_stats: A dictionary containing observed statistics on each input class. Each class is represented by a DataStats object, with the following fields:

    • n_vars: The number of variables used to describe the class's behavior.
    • n_obs: The number of times the class is observed.
    • obs_axis: The axis along which the observations were computed.
  • gaussians: A per class dictionary of Gaussians, each representing the distribution of the class. Represented with type Distributions.MvNormal from the Distributions.jl package.

  • n_obs: The total number of observations in the training data.

Examples

using MLJ
 GaussianNB = @load GaussianNBClassifier pkg=NaiveBayes
 
 X, y = @load_iris
@@ -10,4 +10,4 @@
 
 preds = predict(mach, X) ## probabilistic predictions
 preds[1]
-predict_mode(mach, X) ## point predictions

See also MultinomialNBClassifier

+predict_mode(mach, X) ## point predictions

See also MultinomialNBClassifier

diff --git a/dev/models/GaussianProcessClassifier_MLJScikitLearnInterface/index.html b/dev/models/GaussianProcessClassifier_MLJScikitLearnInterface/index.html index 4594b0ae7..f11771932 100644 --- a/dev/models/GaussianProcessClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/GaussianProcessClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -GaussianProcessClassifier · MLJ

GaussianProcessClassifier

GaussianProcessClassifier

A model type for constructing a Gaussian process classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

GaussianProcessClassifier = @load GaussianProcessClassifier pkg=MLJScikitLearnInterface

Do model = GaussianProcessClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in GaussianProcessClassifier(kernel=...).

Hyper-parameters

  • kernel = nothing
  • optimizer = fmin_l_bfgs_b
  • n_restarts_optimizer = 0
  • copy_X_train = true
  • random_state = nothing
  • max_iter_predict = 100
  • warm_start = false
  • multi_class = one_vs_rest
+GaussianProcessClassifier · MLJ

GaussianProcessClassifier

GaussianProcessClassifier

A model type for constructing a Gaussian process classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

GaussianProcessClassifier = @load GaussianProcessClassifier pkg=MLJScikitLearnInterface

Do model = GaussianProcessClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in GaussianProcessClassifier(kernel=...).

Hyper-parameters

  • kernel = nothing
  • optimizer = fmin_l_bfgs_b
  • n_restarts_optimizer = 0
  • copy_X_train = true
  • random_state = nothing
  • max_iter_predict = 100
  • warm_start = false
  • multi_class = one_vs_rest
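
A hedged usage sketch, not part of the original docstring; it assumes the scikit-learn Python dependency is available:

using MLJ
GaussianProcessClassifier = @load GaussianProcessClassifier pkg=MLJScikitLearnInterface

X, y = @load_iris
model = GaussianProcessClassifier(max_iter_predict=100)
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)
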
diff --git a/dev/models/GaussianProcessRegressor_MLJScikitLearnInterface/index.html b/dev/models/GaussianProcessRegressor_MLJScikitLearnInterface/index.html index 88043737e..6ee869238 100644 --- a/dev/models/GaussianProcessRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/GaussianProcessRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -GaussianProcessRegressor · MLJ

GaussianProcessRegressor

GaussianProcessRegressor

A model type for constructing a Gaussian process regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

GaussianProcessRegressor = @load GaussianProcessRegressor pkg=MLJScikitLearnInterface

Do model = GaussianProcessRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in GaussianProcessRegressor(kernel=...).

Hyper-parameters

  • kernel = nothing
  • alpha = 1.0e-10
  • optimizer = fmin_l_bfgs_b
  • n_restarts_optimizer = 0
  • normalize_y = false
  • copy_X_train = true
  • random_state = nothing
+GaussianProcessRegressor · MLJ

GaussianProcessRegressor

GaussianProcessRegressor

A model type for constructing a Gaussian process regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

GaussianProcessRegressor = @load GaussianProcessRegressor pkg=MLJScikitLearnInterface

Do model = GaussianProcessRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in GaussianProcessRegressor(kernel=...).

Hyper-parameters

  • kernel = nothing
  • alpha = 1.0e-10
  • optimizer = fmin_l_bfgs_b
  • n_restarts_optimizer = 0
  • normalize_y = false
  • copy_X_train = true
  • random_state = nothing
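
A hedged usage sketch, not part of the original docstring; it assumes the scikit-learn Python dependency is available:

using MLJ
GaussianProcessRegressor = @load GaussianProcessRegressor pkg=MLJScikitLearnInterface

X, y = make_regression(50, 3)  ## synthetic regression data
model = GaussianProcessRegressor(alpha=1.0e-8)
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)
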
diff --git a/dev/models/GeneralImputer_BetaML/index.html b/dev/models/GeneralImputer_BetaML/index.html index bce810ecd..d1d4c9ae9 100644 --- a/dev/models/GeneralImputer_BetaML/index.html +++ b/dev/models/GeneralImputer_BetaML/index.html @@ -1,5 +1,5 @@ -GeneralImputer · MLJ

GeneralImputer

mutable struct GeneralImputer <: MLJModelInterface.Unsupervised

Impute missing values using arbitrary learning models, from the Beta Machine Learning Toolkit (BetaML).

Impute missing values using a vector (one per column) of arbitrary learning models (classifiers/regressors, not necessarily from BetaML) that implement the interface m = Model([options]), train!(m,X,Y) and predict(m,X).

Hyperparameters:

  • cols_to_impute::Union{String, Vector{Int64}}: Columns in the matrix for which to create an imputation model, i.e. to impute. It can be a vector of column IDs (positions), or the keyword "auto" (default) or "all". With "auto" the model automatically detects the columns with missing data and imputes only them. You may manually specify the columns, or use "all" if you want an imputation model to be created for those columns during training even when all training data are non-missing, so that the trained model can then be applied to further data with possibly missing values.
  • estimator::Any: An estimator model (regressor or classifier), optionally with its options (hyper-parameters), to be used to impute the various columns of the matrix. It can also be a cols_to_impute-length vector of different estimators, allowing a different estimator for each column (dimension) to impute, for example when some columns are categorical (and hence require a classifier) and some others are numerical (hence requiring a regressor). [default: nothing, i.e. use BetaML random forests, handling classification and regression jobs automatically].
  • missing_supported::Union{Bool, Vector{Bool}}: Whether the estimator(s) used to predict the missing data themselves support missing data in the training features (X). If not, when the model for a certain dimension is fitted, dimensions with missing data in the same rows as those where imputation is needed are dropped, and then only non-missing rows in the other remaining dimensions are considered. It can be a vector of boolean values to specify this property for each individual estimator, or a single boolean value to apply to all the estimators [default: false]
  • fit_function::Union{Function, Vector{Function}}: The function used by the estimator(s) to fit the model. It should take as first argument the model itself, as second argument a matrix representing the features, and as third argument a vector representing the labels. This parameter is mandatory for non-BetaML estimators and can be a single value or a vector (one per estimator) in case different estimator packages are used. [default: BetaML.fit!]
  • predict_function::Union{Function, Vector{Function}}: The function used by the estimator(s) to predict the labels. It should take as first argument the model itself and as second argument a matrix representing the features. This parameter is mandatory for non-BetaML estimators and can be a single value or a vector (one per estimator) in case different estimator packages are used. [default: BetaML.predict]
  • recursive_passages::Int64: Defines the number of times to go through the various columns to impute their data. Useful when there are data to impute in multiple columns. The order of the first passage is given by the decreasing number of missing values per column; the other passages are random [default: 1].
  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]. Note that this influences only the GeneralImputer code itself; the individual estimators may have their own rng (or similar) parameter.

Examples :

  • Using BetaML models:
julia> using MLJ;
+GeneralImputer · MLJ

GeneralImputer

mutable struct GeneralImputer <: MLJModelInterface.Unsupervised

Impute missing values using arbitrary learning models, from the Beta Machine Learning Toolkit (BetaML).

Impute missing values using a vector (one per column) of arbitrary learning models (classifiers/regressors, not necessarily from BetaML) that implement the interface m = Model([options]), train!(m,X,Y) and predict(m,X).

Hyperparameters:

  • cols_to_impute::Union{String, Vector{Int64}}: Columns in the matrix for which to create an imputation model, i.e. to impute. It can be a vector of column IDs (positions), or the keyword "auto" (default) or "all". With "auto" the model automatically detects the columns with missing data and imputes only them. You may manually specify the columns, or use "all" if you want an imputation model to be created for those columns during training even when all training data are non-missing, so that the trained model can then be applied to further data with possibly missing values.
  • estimator::Any: An estimator model (regressor or classifier), optionally with its options (hyper-parameters), to be used to impute the various columns of the matrix. It can also be a cols_to_impute-length vector of different estimators, allowing a different estimator for each column (dimension) to impute, for example when some columns are categorical (and hence require a classifier) and some others are numerical (hence requiring a regressor). [default: nothing, i.e. use BetaML random forests, handling classification and regression jobs automatically].
  • missing_supported::Union{Bool, Vector{Bool}}: Whether the estimator(s) used to predict the missing data themselves support missing data in the training features (X). If not, when the model for a certain dimension is fitted, dimensions with missing data in the same rows as those where imputation is needed are dropped, and then only non-missing rows in the other remaining dimensions are considered. It can be a vector of boolean values to specify this property for each individual estimator, or a single boolean value to apply to all the estimators [default: false]
  • fit_function::Union{Function, Vector{Function}}: The function used by the estimator(s) to fit the model. It should take as first argument the model itself, as second argument a matrix representing the features, and as third argument a vector representing the labels. This parameter is mandatory for non-BetaML estimators and can be a single value or a vector (one per estimator) in case different estimator packages are used. [default: BetaML.fit!]
  • predict_function::Union{Function, Vector{Function}}: The function used by the estimator(s) to predict the labels. It should take as first argument the model itself and as second argument a matrix representing the features. This parameter is mandatory for non-BetaML estimators and can be a single value or a vector (one per estimator) in case different estimator packages are used. [default: BetaML.predict]
  • recursive_passages::Int64: Defines the number of times to go through the various columns to impute their data. Useful when there are data to impute in multiple columns. The order of the first passage is given by the decreasing number of missing values per column; the other passages are random [default: 1].
  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]. Note that this influences only the GeneralImputer code itself; the individual estimators may have their own rng (or similar) parameter.

Examples:

  • Using BetaML models:
julia> using MLJ;
 julia> import BetaML ## The library from which to get the individual estimators to be used for each column imputation
 julia> X = ["a"         8.2;
             "a"     missing;
@@ -57,4 +57,4 @@
  "b"  20
  "c"  -1.8
  "c"  -2.3
- "c"  -2.4
+ "c" -2.4
diff --git a/dev/models/GradientBoostingClassifier_MLJScikitLearnInterface/index.html b/dev/models/GradientBoostingClassifier_MLJScikitLearnInterface/index.html index 5e7084eee..54203d13d 100644 --- a/dev/models/GradientBoostingClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/GradientBoostingClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -GradientBoostingClassifier · MLJ

GradientBoostingClassifier

GradientBoostingClassifier

A model type for constructing a gradient boosting classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

GradientBoostingClassifier = @load GradientBoostingClassifier pkg=MLJScikitLearnInterface

Do model = GradientBoostingClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in GradientBoostingClassifier(loss=...).

This algorithm builds an additive model in a forward stage-wise fashion; it allows for the optimization of arbitrary differentiable loss functions. In each stage n_classes_ regression trees are fit on the negative gradient of the loss function, e.g. binary or multiclass log loss. Binary classification is a special case where only a single regression tree is induced.

HistGradientBoostingClassifier is a much faster variant of this algorithm for intermediate datasets (n_samples >= 10_000).

+GradientBoostingClassifier · MLJ

GradientBoostingClassifier

GradientBoostingClassifier

A model type for constructing a gradient boosting classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

GradientBoostingClassifier = @load GradientBoostingClassifier pkg=MLJScikitLearnInterface

Do model = GradientBoostingClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in GradientBoostingClassifier(loss=...).

This algorithm builds an additive model in a forward stage-wise fashion; it allows for the optimization of arbitrary differentiable loss functions. In each stage n_classes_ regression trees are fit on the negative gradient of the loss function, e.g. binary or multiclass log loss. Binary classification is a special case where only a single regression tree is induced.

HistGradientBoostingClassifier is a much faster variant of this algorithm for intermediate datasets (n_samples >= 10_000).
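
A minimal usage sketch (not part of the generated docstring above) is given below; it assumes the scikit-learn keyword n_estimators is exposed by the wrapper and uses the built-in iris dataset purely for illustration:

using MLJ
GradientBoostingClassifier = @load GradientBoostingClassifier pkg=MLJScikitLearnInterface

X, y = @load_iris                                     ## small built-in demonstration dataset
model = GradientBoostingClassifier(n_estimators=50)   ## n_estimators assumed to be exposed
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)                               ## probabilistic predictions
predict_mode(mach, X)                                 ## point predictions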

diff --git a/dev/models/GradientBoostingRegressor_MLJScikitLearnInterface/index.html b/dev/models/GradientBoostingRegressor_MLJScikitLearnInterface/index.html index 7ea57c37a..aa12148a5 100644 --- a/dev/models/GradientBoostingRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/GradientBoostingRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -GradientBoostingRegressor · MLJ

GradientBoostingRegressor

GradientBoostingRegressor

A model type for constructing a gradient boosting ensemble regression, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

GradientBoostingRegressor = @load GradientBoostingRegressor pkg=MLJScikitLearnInterface

Do model = GradientBoostingRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in GradientBoostingRegressor(loss=...).

This estimator builds an additive model in a forward stage-wise fashion; it allows for the optimization of arbitrary differentiable loss functions. In each stage a regression tree is fit on the negative gradient of the given loss function.

HistGradientBoostingRegressor is a much faster variant of this algorithm for intermediate datasets (n_samples >= 10_000).

+GradientBoostingRegressor · MLJ

GradientBoostingRegressor

GradientBoostingRegressor

A model type for constructing a gradient boosting ensemble regression, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

GradientBoostingRegressor = @load GradientBoostingRegressor pkg=MLJScikitLearnInterface

Do model = GradientBoostingRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in GradientBoostingRegressor(loss=...).

This estimator builds an additive model in a forward stage-wise fashion; it allows for the optimization of arbitrary differentiable loss functions. In each stage a regression tree is fit on the negative gradient of the given loss function.

HistGradientBoostingRegressor is a much faster variant of this algorithm for intermediate datasets (n_samples >= 10_000).
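
A minimal usage sketch (not part of the generated docstring above), assuming the scikit-learn keyword learning_rate is exposed by the wrapper:

using MLJ
GradientBoostingRegressor = @load GradientBoostingRegressor pkg=MLJScikitLearnInterface

X, y = make_regression(100, 5)                        ## synthetic regression data
model = GradientBoostingRegressor(learning_rate=0.1)  ## learning_rate assumed to be exposed
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)                               ## point predictions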

diff --git a/dev/models/HBOSDetector_OutlierDetectionPython/index.html b/dev/models/HBOSDetector_OutlierDetectionPython/index.html index 1ef06b524..13c0d32a5 100644 --- a/dev/models/HBOSDetector_OutlierDetectionPython/index.html +++ b/dev/models/HBOSDetector_OutlierDetectionPython/index.html @@ -1,4 +1,4 @@ -HBOSDetector · MLJ

HBOSDetector

HBOSDetector(n_bins = 10,
+HBOSDetector · MLJ
+                tol = 0.5)

https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.hbos

diff --git a/dev/models/HDBSCAN_MLJScikitLearnInterface/index.html b/dev/models/HDBSCAN_MLJScikitLearnInterface/index.html index b763c9aae..9619d70bc 100644 --- a/dev/models/HDBSCAN_MLJScikitLearnInterface/index.html +++ b/dev/models/HDBSCAN_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -HDBSCAN · MLJ

HDBSCAN

HDBSCAN

A model type for constructing an HDBSCAN clusterer, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

HDBSCAN = @load HDBSCAN pkg=MLJScikitLearnInterface

Do model = HDBSCAN() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in HDBSCAN(min_cluster_size=...).

Hierarchical Density-Based Spatial Clustering of Applications with Noise. Performs DBSCAN over varying epsilon values and integrates the result to find a clustering that gives the best stability over epsilon. This allows HDBSCAN to find clusters of varying densities (unlike DBSCAN), and be more robust to parameter selection.

+HDBSCAN · MLJ

HDBSCAN

HDBSCAN

A model type for constructing an HDBSCAN clusterer, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

HDBSCAN = @load HDBSCAN pkg=MLJScikitLearnInterface

Do model = HDBSCAN() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in HDBSCAN(min_cluster_size=...).

Hierarchical Density-Based Spatial Clustering of Applications with Noise. Performs DBSCAN over varying epsilon values and integrates the result to find a clustering that gives the best stability over epsilon. This allows HDBSCAN to find clusters of varying densities (unlike DBSCAN), and be more robust to parameter selection.
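
A minimal usage sketch (not part of the generated docstring above), assuming the wrapper behaves as a standard unsupervised MLJ model bound with machine(model, X); the exact fields of fitted_params and report depend on the wrapper:

using MLJ
HDBSCAN = @load HDBSCAN pkg=MLJScikitLearnInterface

X, _ = make_moons(200, noise=0.1)    ## synthetic data with two banana-shaped clusters
model = HDBSCAN(min_cluster_size=10)
mach = machine(model, X) |> fit!
fitted_params(mach)                  ## inspect the fitted clustering
report(mach)                         ## wrapper-specific training report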

diff --git a/dev/models/HierarchicalClustering_Clustering/index.html b/dev/models/HierarchicalClustering_Clustering/index.html index b0d38520e..97943c69d 100644 --- a/dev/models/HierarchicalClustering_Clustering/index.html +++ b/dev/models/HierarchicalClustering_Clustering/index.html @@ -1,5 +1,5 @@ -HierarchicalClustering · MLJ

HierarchicalClustering

HierarchicalClustering

A model type for constructing a hierarchical clusterer, based on Clustering.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

HierarchicalClustering = @load HierarchicalClustering pkg=Clustering

Do model = HierarchicalClustering() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in HierarchicalClustering(linkage=...).

Hierarchical Clustering is a clustering algorithm that organizes the data in a dendrogram based on distances between groups of points and computes cluster assignments by cutting the dendrogram at a given height. More information is available at the Clustering.jl documentation. Use predict to get cluster assignments. The dendrogram and the dendrogram cutter are accessed from the machine report (see below).

This is a static implementation, i.e., it does not generalize to new data instances, and there is no training data. For clusterers that do generalize, see KMeans or KMedoids.

In MLJ or MLJBase, create a machine with

mach = machine(model)

Hyper-parameters

  • linkage = :single: linkage method (:single, :average, :complete, :ward, :ward_presquared)
  • metric = SqEuclidean: metric (see Distances.jl for available metrics)
  • branchorder = :r: branchorder (:r, :barjoseph, :optimal)
  • h = nothing: height at which the dendrogram is cut
  • k = 3: number of clusters.

If both k and h are specified, it is guaranteed that the number of clusters is not less than k and their height is not above h.

Operations

  • predict(mach, X): return cluster label assignments, as an unordered CategoricalVector. Here X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).

Report

After calling predict(mach), the fields of report(mach) are:

  • dendrogram: the dendrogram that was computed when calling predict.
  • cutter: a dendrogram cutter that can be called with a height h or a number of clusters k, to obtain a new assignment of the data points to clusters (see example below).

Examples

using MLJ
+HierarchicalClustering · MLJ

HierarchicalClustering

HierarchicalClustering

A model type for constructing a hierarchical clusterer, based on Clustering.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

HierarchicalClustering = @load HierarchicalClustering pkg=Clustering

Do model = HierarchicalClustering() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in HierarchicalClustering(linkage=...).

Hierarchical Clustering is a clustering algorithm that organizes the data in a dendrogram based on distances between groups of points and computes cluster assignments by cutting the dendrogram at a given height. More information is available at the Clustering.jl documentation. Use predict to get cluster assignments. The dendrogram and the dendrogram cutter are accessed from the machine report (see below).

This is a static implementation, i.e., it does not generalize to new data instances, and there is no training data. For clusterers that do generalize, see KMeans or KMedoids.

In MLJ or MLJBase, create a machine with

mach = machine(model)

Hyper-parameters

  • linkage = :single: linkage method (:single, :average, :complete, :ward, :ward_presquared)
  • metric = SqEuclidean: metric (see Distances.jl for available metrics)
  • branchorder = :r: branchorder (:r, :barjoseph, :optimal)
  • h = nothing: height at which the dendrogram is cut
  • k = 3: number of clusters.

If both k and h are specified, it is guaranteed that the number of clusters is not less than k and their height is not above h.

Operations

  • predict(mach, X): return cluster label assignments, as an unordered CategoricalVector. Here X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).

Report

After calling predict(mach), the fields of report(mach) are:

  • dendrogram: the dendrogram that was computed when calling predict.
  • cutter: a dendrogram cutter that can be called with a height h or a number of clusters k, to obtain a new assignment of the data points to clusters (see example below).

Examples

using MLJ
 
 X, labels  = make_moons(400, noise=0.09, rng=1) ## synthetic data with 2 clusters; X
 
@@ -15,4 +15,4 @@
 plot(report(mach).dendrogram)
 
 ## make new predictions by cutting the dendrogram at another height
-report(mach).cutter(h = 2.5)
+report(mach).cutter(h = 2.5)
diff --git a/dev/models/HistGradientBoostingClassifier_MLJScikitLearnInterface/index.html b/dev/models/HistGradientBoostingClassifier_MLJScikitLearnInterface/index.html index 4d2256040..1220c180f 100644 --- a/dev/models/HistGradientBoostingClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/HistGradientBoostingClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -HistGradientBoostingClassifier · MLJ

HistGradientBoostingClassifier

HistGradientBoostingClassifier

A model type for constructing a hist gradient boosting classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

HistGradientBoostingClassifier = @load HistGradientBoostingClassifier pkg=MLJScikitLearnInterface

Do model = HistGradientBoostingClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in HistGradientBoostingClassifier(loss=...).

This algorithm builds an additive model in a forward stage-wise fashion; it allows for the optimization of arbitrary differentiable loss functions. In each stage n_classes_ regression trees are fit on the negative gradient of the loss function, e.g. binary or multiclass log loss. Binary classification is a special case where only a single regression tree is induced.

HistGradientBoostingClassifier is a much faster variant of this algorithm for intermediate datasets (n_samples >= 10_000).

+HistGradientBoostingClassifier · MLJ

HistGradientBoostingClassifier

HistGradientBoostingClassifier

A model type for constructing a hist gradient boosting classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

HistGradientBoostingClassifier = @load HistGradientBoostingClassifier pkg=MLJScikitLearnInterface

Do model = HistGradientBoostingClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in HistGradientBoostingClassifier(loss=...).

This algorithm builds an additive model in a forward stage-wise fashion; it allows for the optimization of arbitrary differentiable loss functions. In each stage n_classes_ regression trees are fit on the negative gradient of the loss function, e.g. binary or multiclass log loss. Binary classification is a special case where only a single regression tree is induced.

HistGradientBoostingClassifier is a much faster variant of this algorithm for intermediate datasets (n_samples >= 10_000).
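
A minimal evaluation sketch (not part of the generated docstring above), using synthetic data from make_blobs:

using MLJ
HistGradientBoostingClassifier = @load HistGradientBoostingClassifier pkg=MLJScikitLearnInterface

X, y = make_blobs(500, 3)            ## synthetic 3-class data with 3 features
model = HistGradientBoostingClassifier()
evaluate(model, X, y, resampling=CV(nfolds=3), measure=log_loss)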

diff --git a/dev/models/HistGradientBoostingRegressor_MLJScikitLearnInterface/index.html b/dev/models/HistGradientBoostingRegressor_MLJScikitLearnInterface/index.html index 4dbbafd45..db47b50ba 100644 --- a/dev/models/HistGradientBoostingRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/HistGradientBoostingRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -HistGradientBoostingRegressor · MLJ

HistGradientBoostingRegressor

HistGradientBoostingRegressor

A model type for constructing a gradient boosting ensemble regression, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

HistGradientBoostingRegressor = @load HistGradientBoostingRegressor pkg=MLJScikitLearnInterface

Do model = HistGradientBoostingRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in HistGradientBoostingRegressor(loss=...).

This estimator builds an additive model in a forward stage-wise fashion; it allows for the optimization of arbitrary differentiable loss functions. In each stage a regression tree is fit on the negative gradient of the given loss function.

HistGradientBoostingRegressor is a much faster variant of this algorithm for intermediate datasets (n_samples >= 10_000).

+HistGradientBoostingRegressor · MLJ

HistGradientBoostingRegressor

HistGradientBoostingRegressor

A model type for constructing a gradient boosting ensemble regression, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

HistGradientBoostingRegressor = @load HistGradientBoostingRegressor pkg=MLJScikitLearnInterface

Do model = HistGradientBoostingRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in HistGradientBoostingRegressor(loss=...).

This estimator builds an additive model in a forward stage-wise fashion; it allows for the optimization of arbitrary differentiable loss functions. In each stage a regression tree is fit on the negative gradient of the given loss function.

HistGradientBoostingRegressor is a much faster variant of this algorithm for intermediate datasets (n_samples >= 10_000).
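
A minimal evaluation sketch (not part of the generated docstring above), assuming the scikit-learn keyword max_iter is exposed by the wrapper:

using MLJ
HistGradientBoostingRegressor = @load HistGradientBoostingRegressor pkg=MLJScikitLearnInterface

X, y = make_regression(200, 4)                       ## synthetic regression data
model = HistGradientBoostingRegressor(max_iter=50)   ## max_iter assumed to be exposed
evaluate(model, X, y, resampling=Holdout(fraction_train=0.7), measure=l2)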

diff --git a/dev/models/HuberRegressor_MLJLinearModels/index.html b/dev/models/HuberRegressor_MLJLinearModels/index.html index fb771ba9e..41ca98631 100644 --- a/dev/models/HuberRegressor_MLJLinearModels/index.html +++ b/dev/models/HuberRegressor_MLJLinearModels/index.html @@ -1,6 +1,6 @@ -HuberRegressor · MLJ

HuberRegressor

HuberRegressor

A model type for constructing a Huber regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

HuberRegressor = @load HuberRegressor pkg=MLJLinearModels

Do model = HuberRegressor() to construct an instance with default hyper-parameters.

This model coincides with RobustRegressor, with the exception that the robust loss, rho, is fixed to HuberRho(delta), where delta is a new hyperparameter.

Different solver options exist, as indicated under "Hyperparameters" below.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • delta::Real: parameterizes the HuberRho function (radius of the ball within which the loss is a quadratic loss) Default: 0.5

  • lambda::Real: strength of the regularizer if penalty is :l2 or :l1. Strength of the L2 regularizer if penalty is :en. Default: 1.0

  • gamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0

  • penalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2

  • fit_intercept::Bool: whether to fit the intercept or not. Default: true

  • penalize_intercept::Bool: whether to penalize the intercept. Default: false

  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true

  • solver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S where S is one of: LBFGS, IWLSCG, Newton, NewtonCG, if penalty = :l2, and ProxGrad otherwise.

    If solver = nothing (default) then LBFGS() is used, if penalty = :l2, and otherwise ProxGrad(accel=true) (FISTA) is used.

    Solver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) Default: nothing

Example

using MLJ
+HuberRegressor · MLJ

HuberRegressor

HuberRegressor

A model type for constructing a Huber regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

HuberRegressor = @load HuberRegressor pkg=MLJLinearModels

Do model = HuberRegressor() to construct an instance with default hyper-parameters.

This model coincides with RobustRegressor, with the exception that the robust loss, rho, is fixed to HuberRho(delta), where delta is a new hyperparameter.

Different solver options exist, as indicated under "Hyperparameters" below.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • delta::Real: parameterizes the HuberRho function (radius of the ball within which the loss is a quadratic loss) Default: 0.5

  • lambda::Real: strength of the regularizer if penalty is :l2 or :l1. Strength of the L2 regularizer if penalty is :en. Default: 1.0

  • gamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0

  • penalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2

  • fit_intercept::Bool: whether to fit the intercept or not. Default: true

  • penalize_intercept::Bool: whether to penalize the intercept. Default: false

  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true

  • solver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S where S is one of: LBFGS, IWLSCG, Newton, NewtonCG, if penalty = :l2, and ProxGrad otherwise.

    If solver = nothing (default) then LBFGS() is used, if penalty = :l2, and otherwise ProxGrad(accel=true) (FISTA) is used.

    Solver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) Default: nothing

Example

using MLJ
 X, y = make_regression()
 mach = fit!(machine(HuberRegressor(), X, y))
 predict(mach, X)
-fitted_params(mach)

See also RobustRegressor, QuantileRegressor.

+fitted_params(mach)

See also RobustRegressor, QuantileRegressor.

diff --git a/dev/models/HuberRegressor_MLJScikitLearnInterface/index.html b/dev/models/HuberRegressor_MLJScikitLearnInterface/index.html index 4712b7bee..f945674e5 100644 --- a/dev/models/HuberRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/HuberRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -HuberRegressor · MLJ

HuberRegressor

HuberRegressor

A model type for constructing a Huber regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

HuberRegressor = @load HuberRegressor pkg=MLJScikitLearnInterface

Do model = HuberRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in HuberRegressor(epsilon=...).

Hyper-parameters

  • epsilon = 1.35
  • max_iter = 100
  • alpha = 0.0001
  • warm_start = false
  • fit_intercept = true
  • tol = 1.0e-5
+HuberRegressor · MLJ

HuberRegressor

HuberRegressor

A model type for constructing a Huber regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

HuberRegressor = @load HuberRegressor pkg=MLJScikitLearnInterface

Do model = HuberRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in HuberRegressor(epsilon=...).

Hyper-parameters

  • epsilon = 1.35
  • max_iter = 100
  • alpha = 0.0001
  • warm_start = false
  • fit_intercept = true
  • tol = 1.0e-5
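
A minimal usage sketch (not part of the generated docstring above), setting the epsilon hyper-parameter listed above:

using MLJ
HuberRegressor = @load HuberRegressor pkg=MLJScikitLearnInterface

X, y = make_regression(100, 3)       ## synthetic regression data
model = HuberRegressor(epsilon=1.5)  ## epsilon as listed in the hyper-parameters above
mach = machine(model, X, y) |> fit!
predict(mach, X)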
diff --git a/dev/models/ICA_MultivariateStats/index.html b/dev/models/ICA_MultivariateStats/index.html index cf223bde3..9d36674d8 100644 --- a/dev/models/ICA_MultivariateStats/index.html +++ b/dev/models/ICA_MultivariateStats/index.html @@ -1,5 +1,5 @@ -ICA · MLJ

ICA

ICA

A model type for constructing an independent component analysis model, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ICA = @load ICA pkg=MultivariateStats

Do model = ICA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ICA(outdim=...).

Independent component analysis is a computational technique for separating a multivariate signal into additive subcomponents, with the assumption that the subcomponents are non-Gaussian and independent from each other.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • outdim::Int=0: The number of independent components to recover, set automatically if 0.
  • alg::Symbol=:fastica: The algorithm to use (only :fastica is supported at the moment).
  • fun::Symbol=:tanh: The approximate neg-entropy function, one of :tanh, :gaus.
  • do_whiten::Bool=true: Whether or not to perform pre-whitening.
  • maxiter::Int=100: The maximum number of iterations.
  • tol::Real=1e-6: The convergence tolerance for change in the unmixing matrix W.
  • mean::Union{Nothing, Real, Vector{Float64}}=nothing: mean to use, if nothing (default) centering is computed and applied, if zero, no centering; otherwise a vector of means can be passed.
  • winit::Union{Nothing,Matrix{<:Real}}=nothing: Initial guess for the unmixing matrix W: either an empty matrix (for random initialization of W), a matrix of size m × k (if do_whiten is true), or a matrix of size m × k. Here m is the number of components (columns) of the input.

Operations

  • transform(mach, Xnew): Return the component-separated version of input Xnew, which should have the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • projection: The estimated component matrix.
  • mean: The estimated mean vector.

Report

The fields of report(mach) are:

  • indim: Dimension (number of columns) of the training data and new data to be transformed.
  • outdim: Dimension of transformed data.
  • mean: The mean of the untransformed training data, of length indim.

Examples

using MLJ
+ICA · MLJ

ICA

ICA

A model type for constructing an independent component analysis model, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ICA = @load ICA pkg=MultivariateStats

Do model = ICA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ICA(outdim=...).

Independent component analysis is a computational technique for separating a multivariate signal into additive subcomponents, with the assumption that the subcomponents are non-Gaussian and independent from each other.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • outdim::Int=0: The number of independent components to recover, set automatically if 0.
  • alg::Symbol=:fastica: The algorithm to use (only :fastica is supported at the moment).
  • fun::Symbol=:tanh: The approximate neg-entropy function, one of :tanh, :gaus.
  • do_whiten::Bool=true: Whether or not to perform pre-whitening.
  • maxiter::Int=100: The maximum number of iterations.
  • tol::Real=1e-6: The convergence tolerance for change in the unmixing matrix W.
  • mean::Union{Nothing, Real, Vector{Float64}}=nothing: mean to use, if nothing (default) centering is computed and applied, if zero, no centering; otherwise a vector of means can be passed.
  • winit::Union{Nothing,Matrix{<:Real}}=nothing: Initial guess for the unmixing matrix W: either an empty matrix (for random initialization of W), a matrix of size m × k (if do_whiten is true), or a matrix of size m × k. Here m is the number of components (columns) of the input.

Operations

  • transform(mach, Xnew): Return the component-separated version of input Xnew, which should have the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • projection: The estimated component matrix.
  • mean: The estimated mean vector.

Report

The fields of report(mach) are:

  • indim: Dimension (number of columns) of the training data and new data to be transformed.
  • outdim: Dimension of transformed data.
  • mean: The mean of the untransformed training data, of length indim.

Examples

using MLJ
 
 ICA = @load ICA pkg=MultivariateStats
 
@@ -28,4 +28,4 @@
 plot(X_unmixed.x1)
 plot(X_unmixed.x2)
 plot(X_unmixed.x3)
-

See also PCA, KernelPCA, FactorAnalysis, PPCA

+

See also PCA, KernelPCA, FactorAnalysis, PPCA

diff --git a/dev/models/IForestDetector_OutlierDetectionPython/index.html b/dev/models/IForestDetector_OutlierDetectionPython/index.html index 124cfbe2e..23624855f 100644 --- a/dev/models/IForestDetector_OutlierDetectionPython/index.html +++ b/dev/models/IForestDetector_OutlierDetectionPython/index.html @@ -1,8 +1,8 @@ -IForestDetector · MLJ

IForestDetector

IForestDetector(n_estimators = 100,
+IForestDetector · MLJ
+                   n_jobs = 1)

https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.iforest

diff --git a/dev/models/INNEDetector_OutlierDetectionPython/index.html b/dev/models/INNEDetector_OutlierDetectionPython/index.html index 484b6a619..b2ff2dc70 100644 --- a/dev/models/INNEDetector_OutlierDetectionPython/index.html +++ b/dev/models/INNEDetector_OutlierDetectionPython/index.html @@ -1,4 +1,4 @@ -INNEDetector · MLJ

INNEDetector

INNEDetector(n_estimators=200,
+INNEDetector · MLJ
+                random_state=None)

https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.inne

diff --git a/dev/models/ImageClassifier_MLJFlux/index.html b/dev/models/ImageClassifier_MLJFlux/index.html index 7575ef687..a7736c581 100644 --- a/dev/models/ImageClassifier_MLJFlux/index.html +++ b/dev/models/ImageClassifier_MLJFlux/index.html @@ -1,5 +1,5 @@ -ImageClassifier · MLJ

ImageClassifier

ImageClassifier

A model type for constructing an image classifier, based on MLJFlux.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ImageClassifier = @load ImageClassifier pkg=MLJFlux

Do model = ImageClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ImageClassifier(builder=...).

ImageClassifier classifies images using a neural network adapted to the type of images provided (color or gray scale). Predictions are probabilistic. Users provide a recipe for constructing the network, based on properties of the image encountered, by specifying an appropriate builder. See MLJFlux documentation for more on builders.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any AbstractVector of images with ColorImage or GrayImage scitype; check the scitype with scitype(X) and refer to ScientificTypes.jl documentation on coercing typical image formats into an appropriate type.
  • y is the target, which can be any AbstractVector whose element scitype is Multiclass; check the scitype with scitype(y).

Train the machine with fit!(mach, rows=...).

Hyper-parameters

  • builder: An MLJFlux builder that constructs the neural network. The fallback builds a depth-16 VGG architecture adapted to the image size and number of target classes, with no batch normalization; see the Metalhead.jl documentation for details. See the example below for a user-specified builder. A convenience macro @builder is also available. See also finaliser below.

  • optimiser::Flux.Adam(): A Flux.Optimise optimiser. The optimiser performs the updating of the weights of the network. For further reference, see the Flux optimiser documentation. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.

  • loss=Flux.crossentropy: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. For a classification task, the most natural loss functions are:

    • Flux.crossentropy: Standard multiclass classification loss, also known as the log loss.
    • Flux.logitcrossentropy: Mathematically equal to crossentropy, but numerically more stable than finalising the outputs with softmax and then calculating crossentropy. You will need to specify finaliser=identity to remove MLJFlux's default softmax finaliser, and understand that the output of predict is then unnormalized (no longer probabilistic).
    • Flux.tversky_loss: Used with imbalanced data to give more weight to false negatives.
    • Flux.focal_loss: Used with highly imbalanced data. Weights harder examples more than easier examples.

    Currently MLJ measures are not supported values of loss.

  • epochs::Int=10: The duration of training, in epochs. Typically, one epoch represents one pass through the complete training dataset.

  • batch_size::int=1: the batch size to be used for training, representing the number of samples per update of the network weights. Typically, batch size is between 8 and 512. Increasing batch size may accelerate training if acceleration=CUDALibs() and a GPU is available.

  • lambda::Float64=0: The strength of the weight regularization penalty. Can be any value in the range [0, ∞).

  • alpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.

  • rng::Union{AbstractRNG, Int64}: The random number generator or seed used during training.

  • optimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.

  • acceleration::AbstractResource=CPU1(): Defines on what hardware training is done. For Training on GPU, use CUDALibs().

  • finaliser=Flux.softmax: The final activation function of the neural network (applied after the network defined by builder). Defaults to Flux.softmax.

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above. Predictions are probabilistic but uncalibrated.
  • predict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • chain: The trained "chain" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network. This includes the final layer specified by finaliser (eg, softmax).

Report

The fields of report(mach) are:

  • training_losses: A vector of training losses (penalised if lambda != 0) in historical order, of length epochs + 1. The first element is the pre-training loss.

Examples

In this example we use MLJFlux and a custom builder to classify the MNIST image dataset.

using MLJ
+ImageClassifier · MLJ

ImageClassifier

ImageClassifier

A model type for constructing an image classifier, based on MLJFlux.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ImageClassifier = @load ImageClassifier pkg=MLJFlux

Do model = ImageClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ImageClassifier(builder=...).

ImageClassifier classifies images using a neural network adapted to the type of images provided (color or gray scale). Predictions are probabilistic. Users provide a recipe for constructing the network, based on properties of the image encountered, by specifying an appropriate builder. See MLJFlux documentation for more on builders.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any AbstractVector of images with ColorImage or GrayImage scitype; check the scitype with scitype(X) and refer to ScientificTypes.jl documentation on coercing typical image formats into an appropriate type.
  • y is the target, which can be any AbstractVector whose element scitype is Multiclass; check the scitype with scitype(y).

Train the machine with fit!(mach, rows=...).

Hyper-parameters

  • builder: An MLJFlux builder that constructs the neural network. The fallback builds a depth-16 VGG architecture adapted to the image size and number of target classes, with no batch normalization; see the Metalhead.jl documentation for details. See the example below for a user-specified builder. A convenience macro @builder is also available. See also finaliser below.

  • optimiser::Flux.Adam(): A Flux.Optimise optimiser. The optimiser performs the updating of the weights of the network. For further reference, see the Flux optimiser documentation. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.

  • loss=Flux.crossentropy: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. For a classification task, the most natural loss functions are:

    • Flux.crossentropy: Standard multiclass classification loss, also known as the log loss.
    • Flux.logitcrossentropy: Mathematically equal to crossentropy, but numerically more stable than finalising the outputs with softmax and then calculating crossentropy. You will need to specify finaliser=identity to remove MLJFlux's default softmax finaliser, and understand that the output of predict is then unnormalized (no longer probabilistic).
    • Flux.tversky_loss: Used with imbalanced data to give more weight to false negatives.
    • Flux.focal_loss: Used with highly imbalanced data. Weights harder examples more than easier examples.

    Currently MLJ measures are not supported values of loss.

  • epochs::Int=10: The duration of training, in epochs. Typically, one epoch represents one pass through the complete training dataset.

  • batch_size::int=1: the batch size to be used for training, representing the number of samples per update of the network weights. Typically, batch size is between 8 and 512. Increasing batch size may accelerate training if acceleration=CUDALibs() and a GPU is available.

  • lambda::Float64=0: The strength of the weight regularization penalty. Can be any value in the range [0, ∞).

  • alpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.

  • rng::Union{AbstractRNG, Int64}: The random number generator or seed used during training.

  • optimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.

  • acceleration::AbstractResource=CPU1(): Defines on what hardware training is done. For Training on GPU, use CUDALibs().

  • finaliser=Flux.softmax: The final activation function of the neural network (applied after the network defined by builder). Defaults to Flux.softmax.

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above. Predictions are probabilistic but uncalibrated.
  • predict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • chain: The trained "chain" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network. This includes the final layer specified by finaliser (eg, softmax).

Report

The fields of report(mach) are:

  • training_losses: A vector of training losses (penalised if lambda != 0) in historical order, of length epochs + 1. The first element is the pre-training loss.

Examples

In this example we use MLJFlux and a custom builder to classify the MNIST image dataset.

using MLJ
 using Flux
 import MLJFlux
 import MLJIteration ## for `skip` control

First we want to download the MNIST dataset, and unpack into images and labels:

import MLDatasets: MNIST
@@ -45,4 +45,4 @@
           resampling=Holdout(fraction_train=0.5),
           measure=cross_entropy,
           rows=1:1000,
-          verbosity=0)

See also NeuralNetworkClassifier.

+ verbosity=0)

See also NeuralNetworkClassifier.

diff --git a/dev/models/InteractionTransformer_MLJModels/index.html b/dev/models/InteractionTransformer_MLJModels/index.html index 6e519c3e9..37cc8393e 100644 --- a/dev/models/InteractionTransformer_MLJModels/index.html +++ b/dev/models/InteractionTransformer_MLJModels/index.html @@ -1,5 +1,5 @@ -InteractionTransformer · MLJ

InteractionTransformer

InteractionTransformer

A model type for constructing an interaction transformer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

InteractionTransformer = @load InteractionTransformer pkg=MLJModels

Do model = InteractionTransformer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in InteractionTransformer(order=...).

Generates all polynomial interaction terms up to the given order for the subset of chosen columns. Any column that contains elements with scitype <:Infinite is a valid basis to generate interactions. If features is not specified, all such columns with scitype <:Infinite in the table are used as a basis.

In MLJ or MLJBase, you can transform features X with the single call

transform(machine(model), X)

See also the example below.

Hyper-parameters

  • order: Maximum order of interactions to be generated.
  • features: Restricts interaction generation to those columns.

Operations

  • transform(machine(model), X): Generates polynomial interaction terms out of table X using the hyper-parameters specified in model.

Example

using MLJ
+InteractionTransformer · MLJ

InteractionTransformer

InteractionTransformer

A model type for constructing an interaction transformer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

InteractionTransformer = @load InteractionTransformer pkg=MLJModels

Do model = InteractionTransformer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in InteractionTransformer(order=...).

Generates all polynomial interaction terms up to the given order for the subset of chosen columns. Any column that contains elements with scitype <:Infinite is a valid basis to generate interactions. If features is not specified, all such columns with scitype <:Infinite in the table are used as a basis.

In MLJ or MLJBase, you can transform features X with the single call

transform(machine(model), X)

See also the example below.

Hyper-parameters

  • order: Maximum order of interactions to be generated.
  • features: Restricts interaction generation to those columns.

Operations

  • transform(machine(model), X): Generates polynomial interaction terms out of table X using the hyper-parameters specified in model.

Example

using MLJ
 
 X = (
     A = [1, 2, 3],
@@ -29,4 +29,4 @@
  C = [7, 8, 9],
  D = ["x₁", "x₂", "x₃"],
  A_B = [4, 10, 18],)
-
+
diff --git a/dev/models/IteratedModel_MLJIteration/index.html b/dev/models/IteratedModel_MLJIteration/index.html new file mode 100644 index 000000000..7b3b0f7f6 --- /dev/null +++ b/dev/models/IteratedModel_MLJIteration/index.html @@ -0,0 +1,15 @@ + +IteratedModel · MLJ

IteratedModel

IteratedModel(model;
+    controls=MLJIteration.DEFAULT_CONTROLS,
+    resampling=Holdout(),
+    measure=nothing,
+    retrain=false,
+    advanced_options...,
+)

Wrap the specified supervised model in the specified iteration controls. Here model should support iteration, which is true if iteration_parameter(model) is different from nothing.

Available controls: Step(), Info(), Warn(), Error(), Callback(), WithLossDo(), WithTrainingLossesDo(), WithNumberDo(), Data(), Disjunction(), GL(), InvalidValue(), Never(), NotANumber(), NumberLimit(), NumberSinceBest(), PQ(), Patience(), Threshold(), TimeLimit(), Warmup(), WithIterationsDo(), WithEvaluationDo(), WithFittedParamsDo(), WithReportDo(), WithMachineDo(), WithModelDo(), CycleLearningRate() and Save().

Important

To make out-of-sample losses available to the controls, the wrapped model is only trained on part of the data, as iteration proceeds. The user may want to force retraining on all data after controlled iteration has finished by specifying retrain=true. See also "Training", and the retrain option, under "Extended help" below.

Extended help

Options

  • controls=Any[IterationControl.Step(1), EarlyStopping.Patience(5), EarlyStopping.GL(2.0), EarlyStopping.TimeLimit(Dates.Millisecond(108000)), EarlyStopping.InvalidValue()]: Controls are summarized at https://JuliaAI.github.io/MLJ.jl/dev/getting_started/ but query individual doc-strings for details and advanced options. For creating your own controls, refer to the documentation just cited.
  • resampling=Holdout(fraction_train=0.7): The default resampling holds back 30% of data for computing an out-of-sample estimate of performance (the "loss") for loss-based controls such as WithLossDo. Specify resampling=nothing if all data is to be used for controlled iteration, with each out-of-sample loss replaced by the most recent training loss, assuming this is made available by the model (supports_training_losses(model) == true). If the model does not report a training loss, you can use resampling=InSample() instead. Otherwise, resampling must have type Holdout or be a vector with one element of the form (train_indices, test_indices).
  • measure=nothing: StatisticalMeasures.jl compatible measure for estimating model performance (the "loss", but the orientation is immaterial - i.e., this could be a score). Inferred by default. Ignored if resampling=nothing.
  • retrain=false: If retrain=true or resampling=nothing, iterated_model behaves exactly like the original model but with the iteration parameter automatically selected ("learned"). That is, the model is retrained on all available data, using the same number of iterations, once controlled iteration has stopped. This is typically desired if wrapping the iterated model further, or when inserting in a pipeline or other composite model. If retrain=false (default) and resampling isa Holdout, then iterated_model behaves like the original model trained on a subset of the provided data.
  • weights=nothing: per-observation weights to be passed to measure where supported; if unspecified, these are understood to be uniform.
  • class_weights=nothing: class-weights to be passed to measure where supported; if unspecified, these are understood to be uniform.
  • operation=nothing: Operation, such as predict or predict_mode, for computing target values, or proxy target values, for consumption by measure; automatically inferred by default.
  • check_measure=true: Specify false to override checks on measure for compatibility with the training data.
  • iteration_parameter=nothing: A symbol, such as :epochs, naming the iteration parameter of model; inferred by default. Note that the actual value of the iteration parameter in the supplied model is ignored; only the value of an internal clone is mutated during training the wrapped model.
  • cache=true: Whether or not model-specific representations of data are cached in between iteration parameter increments; specify cache=false to prioritize memory over speed.

Training

Training an instance iterated_model of IteratedModel on some data (by binding to a machine and calling fit!, for example) performs the following actions:

  • Assuming resampling !== nothing, the data is split into train and test sets, according to the specified resampling strategy.
  • A clone of the wrapped model, model, is bound to the train data in an internal machine, train_mach. If resampling === nothing, all data is used instead. This machine is the object to which controls are applied. For example, Callback(fitted_params |> print) will print the value of fitted_params(train_mach).
  • The iteration parameter of the clone is set to 0.
  • The specified controls are repeatedly applied to train_mach in sequence, until one of the controls triggers a stop. Loss-based controls (eg, Patience(), GL(), Threshold(0.001)) use an out-of-sample loss, obtained by applying measure to predictions and the test target values. (Specifically, these predictions are those returned by operation(train_mach).) If resampling === nothing then the most recent training loss is used instead. Some controls require both out-of-sample and training losses (eg, PQ()).
  • Once a stop has been triggered, a clone of model is bound to all data in a machine called mach_production below, unless retrain == false (the default) or resampling === nothing, in which case mach_production coincides with train_mach.

Prediction

Calling predict(mach, Xnew) in the example above returns predict(mach_production, Xnew). Similar statements hold for predict_mean, predict_mode, predict_median.
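
For orientation, a hedged end-to-end sketch (not part of the original docstring) follows; it assumes the EvoTrees package is installed, though any model with a non-nothing iteration_parameter would do:

using MLJ
EvoTreeRegressor = @load EvoTreeRegressor pkg=EvoTrees  ## assumes EvoTrees is installed

X, y = make_regression(500, 4)
iterated_model = IteratedModel(
    EvoTreeRegressor(),
    resampling = Holdout(fraction_train=0.7),
    measure = l2,
    controls = [Step(10), Patience(5), NumberLimit(50)],
    retrain = true,
)
mach = machine(iterated_model, X, y) |> fit!
predict(mach, X)   ## served by the machine retrained on all data (retrain=true)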

Controls that mutate parameters

A control is permitted to mutate the fields (hyper-parameters) of train_mach.model (the clone of model). For example, to mutate a learning rate one might use the control

Callback(mach -> mach.model.eta = 1.05*mach.model.eta)

However, unless model supports warm restarts with respect to changes in that parameter, this will trigger retraining of train_mach from scratch, with a different training outcome, which is not recommended.

Warm restarts

In the following example, the second fit! call will not restart training of the internal train_mach, assuming model supports warm restarts:

iterated_model = IteratedModel(
+    model,
+    controls = [Step(1), NumberLimit(100)],
+)
+mach = machine(iterated_model, X, y)
+fit!(mach) ## train for 100 iterations
+iterated_model.controls = [Step(1), NumberLimit(50)]
+fit!(mach) ## train for an *extra* 50 iterations

More generally, if iterated_model is mutated and fit!(mach) is called again, then a warm restart is attempted if the only parameters to change are model or controls or both.

Specifically, train_mach.model is mutated to match the current value of iterated_model.model and the iteration parameter of the latter is updated to the last value used in the preceding fit!(mach) call. Then repeated application of the (updated) controls begin anew.

diff --git a/dev/models/KDEDetector_OutlierDetectionPython/index.html b/dev/models/KDEDetector_OutlierDetectionPython/index.html index e387cdd38..b5c4bccd2 100644 --- a/dev/models/KDEDetector_OutlierDetectionPython/index.html +++ b/dev/models/KDEDetector_OutlierDetectionPython/index.html @@ -1,6 +1,6 @@ -KDEDetector · MLJ

KDEDetector

KDEDetector(bandwidth=1.0,
+KDEDetector · MLJ
+               metric_params=None)

https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.kde

diff --git a/dev/models/KMeansClusterer_BetaML/index.html b/dev/models/KMeansClusterer_BetaML/index.html index 6cf554fcd..54fa8bc9a 100644 --- a/dev/models/KMeansClusterer_BetaML/index.html +++ b/dev/models/KMeansClusterer_BetaML/index.html @@ -1,5 +1,5 @@ -KMeansClusterer · MLJ

KMeansClusterer

mutable struct KMeansClusterer <: MLJModelInterface.Unsupervised

The classical KMeansClusterer clustering algorithm, from the Beta Machine Learning Toolkit (BetaML).

Parameters:

  • n_classes::Int64: Number of classes to discriminate the data [def: 3]

  • dist::Function: Function to employ as distance. Defaults to the Euclidean distance. Can be one of the predefined distances (l1_distance, l2_distance, l2squared_distance, cosine_distance), any user-defined function accepting two vectors and returning a scalar, or an anonymous function with the same characteristics. Note that, contrary to KMedoidsClusterer, the KMeansClusterer algorithm is not guaranteed to converge with distances other than the Euclidean one.

  • initialisation_strategy::String: The computation method of the vector of the initial representatives. One of the following:

    • "random": randomly in the X space
    • "grid": using a grid approach
    • "shuffle": selecting randomly within the available points [default]
    • "given": using a provided set of initial representatives provided in the initial_representatives parameter
  • initial_representatives::Union{Nothing, Matrix{Float64}}: Provided (K x D) matrix of initial representatives (useful only with initialisation_strategy="given") [default: nothing]

  • rng::Random.AbstractRNG: Random Number Generator [default: Random.GLOBAL_RNG]

Notes:

  • data must be numerical
  • online fitting (re-fitting with new data) is supported

Example:

julia> using MLJ
+KMeansClusterer · MLJ

KMeansClusterer

mutable struct KMeansClusterer <: MLJModelInterface.Unsupervised

The classical KMeansClusterer clustering algorithm, from the Beta Machine Learning Toolkit (BetaML).

Parameters:

  • n_classes::Int64: Number of classes to discriminate the data [def: 3]

  • dist::Function: Function to employ as distance. Defaults to the Euclidean distance. Can be one of the predefined distances (l1_distance, l2_distance, l2squared_distance, cosine_distance), any user-defined function accepting two vectors and returning a scalar, or an anonymous function with the same characteristics. Note that, contrary to KMedoidsClusterer, the KMeansClusterer algorithm is not guaranteed to converge with distances other than the Euclidean one.

  • initialisation_strategy::String: The computation method of the vector of the initial representatives. One of the following:

    • "random": randomly in the X space
    • "grid": using a grid approach
    • "shuffle": selecting randomly within the available points [default]
    • "given": using a provided set of initial representatives provided in the initial_representatives parameter
  • initial_representatives::Union{Nothing, Matrix{Float64}}: Provided (K x D) matrix of initial representatives (useful only with initialisation_strategy="given") [default: nothing]

  • rng::Random.AbstractRNG: Random Number Generator [default: Random.GLOBAL_RNG]

Notes:

  • data must be numerical
  • online fitting (re-fitting with new data) is supported

Example:

julia> using MLJ
 
 julia> X, y        = @load_iris;
 
@@ -29,4 +29,4 @@
  ⋮            
  "virginica"  3
  "virginica"  3
- "virginica"  1
+ "virginica" 1
diff --git a/dev/models/KMeans_Clustering/index.html b/dev/models/KMeans_Clustering/index.html index 4f3a51489..fedd099c8 100644 --- a/dev/models/KMeans_Clustering/index.html +++ b/dev/models/KMeans_Clustering/index.html @@ -1,5 +1,5 @@ -KMeans · MLJ

KMeans

KMeans

A model type for constructing a K-means clusterer, based on Clustering.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

KMeans = @load KMeans pkg=Clustering

Do model = KMeans() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KMeans(k=...).

K-means is a classical method for clustering or vector quantization. It produces a fixed number of clusters, each associated with a center (also known as a prototype), and each data point is assigned to a cluster with the nearest center.

From a mathematical standpoint, K-means is a coordinate descent algorithm that solves the following optimization problem:

$$\text{minimize} \ \sum_{i=1}^n \| \mathbf{x}_i - \boldsymbol{\mu}_{z_i} \|^2 \quad \text{w.r.t.} \ (\boldsymbol{\mu}, z)$$

Here, $\boldsymbol{\mu}_k$ is the center of the $k$-th cluster, and $z_i$ is an index of the cluster for $i$-th point $\mathbf{x}_i$.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • k=3: The number of centroids to use in clustering.

  • metric::SemiMetric=Distances.SqEuclidean: The metric used to calculate the clustering. Must have type PreMetric from Distances.jl.

  • init = :kmpp: One of the following options to indicate how cluster seeds should be initialized:

    • :kmpp: KMeans++
    • :kmenc: K-medoids initialization based on centrality
    • :rand: random
    • an instance of Clustering.SeedingAlgorithm from Clustering.jl
    • an integer vector of length k that provides the indices of points to use as initial cluster centers.

    See documentation of Clustering.jl.

Operations

  • predict(mach, Xnew): return cluster label assignments, given new features Xnew having the same Scitype as X above.
  • transform(mach, Xnew): instead return the mean pairwise distances from new samples to the cluster centers.

Fitted parameters

The fields of fitted_params(mach) are:

  • centers: The coordinates of the cluster centers.

Report

The fields of report(mach) are:

  • assignments: The cluster assignments of each point in the training data.
  • cluster_labels: The labels assigned to each cluster.

Examples

using MLJ
+KMeans · MLJ

KMeans

KMeans

A model type for constructing a K-means clusterer, based on Clustering.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

KMeans = @load KMeans pkg=Clustering

Do model = KMeans() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KMeans(k=...).

K-means is a classical method for clustering or vector quantization. It produces a fixed number of clusters, each associated with a center (also known as a prototype), and each data point is assigned to a cluster with the nearest center.

From a mathematical standpoint, K-means is a coordinate descent algorithm that solves the following optimization problem:

$$\text{minimize} \ \sum_{i=1}^n \| \mathbf{x}_i - \boldsymbol{\mu}_{z_i} \|^2 \ \text{w.r.t.} \ (\boldsymbol{\mu}, z)$$

Here, $\boldsymbol{\mu}_k$ is the center of the $k$-th cluster, and $z_i$ is an index of the cluster for $i$-th point $\mathbf{x}_i$.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • k=3: The number of centroids to use in clustering.

  • metric::SemiMetric=Distances.SqEuclidean: The metric used to calculate the clustering. Must have type PreMetric from Distances.jl.

  • init = :kmpp: One of the following options to indicate how cluster seeds should be initialized:

    • :kmpp: KMeans++
    • :kmenc: K-medoids initialization based on centrality
    • :rand: random
    • an instance of Clustering.SeedingAlgorithm from Clustering.jl
    • an integer vector of length k that provides the indices of points to use as initial cluster centers.

    See documentation of Clustering.jl.

Operations

  • predict(mach, Xnew): return cluster label assignments, given new features Xnew having the same Scitype as X above.
  • transform(mach, Xnew): instead return the mean pairwise distances from new samples to the cluster centers.

Fitted parameters

The fields of fitted_params(mach) are:

  • centers: The coordinates of the cluster centers.

Report

The fields of report(mach) are:

  • assignments: The cluster assignments of each point in the training data.
  • cluster_labels: The labels assigned to each cluster.

Examples

using MLJ
 KMeans = @load KMeans pkg=Clustering
 
 table = load_iris()
@@ -17,4 +17,4 @@
 
 @assert center_dists[1][1] == 0.0
 @assert center_dists[2][2] == 0.0
-@assert center_dists[3][3] == 0.0

See also KMedoids

+@assert center_dists[3][3] == 0.0

See also KMedoids

diff --git a/dev/models/KMeans_MLJScikitLearnInterface/index.html b/dev/models/KMeans_MLJScikitLearnInterface/index.html index 44a77622a..58b37f355 100644 --- a/dev/models/KMeans_MLJScikitLearnInterface/index.html +++ b/dev/models/KMeans_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -KMeans · MLJ

KMeans

KMeans

A model type for constructing a k means, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

KMeans = @load KMeans pkg=MLJScikitLearnInterface

Do model = KMeans() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KMeans(n_clusters=...).

K-Means algorithm: find K centroids corresponding to K clusters in the data.

+KMeans · MLJ

KMeans

KMeans

A model type for constructing a k means, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

KMeans = @load KMeans pkg=MLJScikitLearnInterface

Do model = KMeans() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KMeans(n_clusters=...).

K-Means algorithm: find K centroids corresponding to K clusters in the data.
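
Since no worked example accompanies this interface, here is a hedged sketch of the standard MLJ workflow (make_blobs is MLJ's synthetic-data helper; the scikit-learn backend is assumed to be installed via this interface's dependencies):

using MLJ
KMeans = @load KMeans pkg=MLJScikitLearnInterface
X, _ = make_blobs(100, 3)            ## synthetic table with 3 continuous features
model = KMeans(n_clusters=4)
mach = machine(model, X) |> fit!
fitted_params(mach)                  ## inspect the fitted parameters (e.g. centroids)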

diff --git a/dev/models/KMeans_ParallelKMeans/index.html b/dev/models/KMeans_ParallelKMeans/index.html index 69baa9380..38ca0741a 100644 --- a/dev/models/KMeans_ParallelKMeans/index.html +++ b/dev/models/KMeans_ParallelKMeans/index.html @@ -1,2 +1,2 @@ -KMeans · MLJ

KMeans

Parallel & lightning fast implementation of all available variants of the KMeans clustering algorithm in native Julia. Compatible with Julia 1.3+

+KMeans · MLJ

KMeans

Parallel & lightning fast implementation of all available variants of the KMeans clustering algorithm in native Julia. Compatible with Julia 1.3+
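
No usage example is given on this page; a minimal hedged sketch of loading this implementation through the generic MLJ workflow (hyper-parameter names are omitted, as they are not documented here):

using MLJ
KMeans = @load KMeans pkg=ParallelKMeans
X, _ = make_blobs(500, 10)           ## synthetic table of continuous features
mach = machine(KMeans(), X) |> fit!
report(mach)                         ## training report (contents depend on the implementation)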

diff --git a/dev/models/KMedoidsClusterer_BetaML/index.html b/dev/models/KMedoidsClusterer_BetaML/index.html index 386037fb4..7c832a6eb 100644 --- a/dev/models/KMedoidsClusterer_BetaML/index.html +++ b/dev/models/KMedoidsClusterer_BetaML/index.html @@ -1,5 +1,5 @@ -KMedoidsClusterer · MLJ

KMedoidsClusterer

mutable struct KMedoidsClusterer <: MLJModelInterface.Unsupervised

Parameters:

  • n_classes::Int64: Number of classes to discriminate the data [def: 3]

  • dist::Function: Function to employ as distance. Defaults to the Euclidean distance. Can be one of the predefined distances (l1_distance, l2_distance, l2squared_distance, cosine_distance), any user-defined function accepting two vectors and returning a scalar, or an anonymous function with the same characteristics.

  • initialisation_strategy::String: The computation method of the vector of the initial representatives. One of the following:

    • "random": randomly in the X space
    • "grid": using a grid approach
    • "shuffle": selecting randomly within the available points [default]
    • "given": using a provided set of initial representatives provided in the initial_representatives parameter
  • initial_representatives::Union{Nothing, Matrix{Float64}}: Provided (K x D) matrix of initial representatives (useful only with initialisation_strategy="given") [default: nothing]

  • rng::Random.AbstractRNG: Random Number Generator [default: Random.GLOBAL_RNG]

The K-medoids clustering algorithm with customisable distance function, from the Beta Machine Learning Toolkit (BetaML).

Similar to K-Means, but the "representatives" (the centroids) are guaranteed to be one of the training points. The algorithm works with any arbitrary distance measure.

Notes:

  • data must be numerical
  • online fitting (re-fitting with new data) is supported

Example:

julia> using MLJ
+KMedoidsClusterer · MLJ

KMedoidsClusterer

mutable struct KMedoidsClusterer <: MLJModelInterface.Unsupervised

Parameters:

  • n_classes::Int64: Number of classes to discriminate the data [def: 3]

  • dist::Function: Function to employ as distance. Defaults to the Euclidean distance. Can be one of the predefined distances (l1_distance, l2_distance, l2squared_distance, cosine_distance), any user-defined function accepting two vectors and returning a scalar, or an anonymous function with the same characteristics.

  • initialisation_strategy::String: The computation method of the vector of the initial representatives. One of the following:

    • "random": randomly in the X space
    • "grid": using a grid approach
    • "shuffle": selecting randomly within the available points [default]
    • "given": using a provided set of initial representatives provided in the initial_representatives parameter
  • initial_representatives::Union{Nothing, Matrix{Float64}}: Provided (K x D) matrix of initial representatives (useful only with initialisation_strategy="given") [default: nothing]

  • rng::Random.AbstractRNG: Random Number Generator [default: Random.GLOBAL_RNG]

The K-medoids clustering algorithm with customisable distance function, from the Beta Machine Learning Toolkit (BetaML).

Similar to K-Means, but the "representatives" (the centroids) are guaranteed to be one of the training points. The algorithm works with any arbitrary distance measure.

Notes:

  • data must be numerical
  • online fitting (re-fitting with new data) is supported

Example:

julia> using MLJ
 
 julia> X, y        = @load_iris;
 
@@ -29,4 +29,4 @@
  ⋮            
  "virginica"  1
  "virginica"  1
- "virginica"  2
+ "virginica" 2
diff --git a/dev/models/KMedoids_Clustering/index.html b/dev/models/KMedoids_Clustering/index.html index e8f8fcdb2..69fe853a6 100644 --- a/dev/models/KMedoids_Clustering/index.html +++ b/dev/models/KMedoids_Clustering/index.html @@ -1,5 +1,5 @@ -KMedoids · MLJ

KMedoids

KMedoids

A model type for constructing a K-medoids clusterer, based on Clustering.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

KMedoids = @load KMedoids pkg=Clustering

Do model = KMedoids() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KMedoids(k=...).

K-medoids is a clustering algorithm that works by finding $k$ data points (called medoids) such that the total distance between each data point and the closest medoid is minimal.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • k=3: The number of centroids to use in clustering.

  • metric::SemiMetric=Distances.SqEuclidean: The metric used to calculate the clustering. Must have type PreMetric from Distances.jl.

  • init (defaults to :kmpp): how medoids should be initialized, could be one of the following:

    • :kmpp: KMeans++
    • :kmenc: K-medoids initialization based on centrality
    • :rand: random
    • an instance of Clustering.SeedingAlgorithm from Clustering.jl
    • an integer vector of length k that provides the indices of points to use as initial medoids.

    See documentation of Clustering.jl.

Operations

  • predict(mach, Xnew): return cluster label assignments, given new features Xnew having the same Scitype as X above.
  • transform(mach, Xnew): instead return the mean pairwise distances from new samples to the cluster centers.

Fitted parameters

The fields of fitted_params(mach) are:

  • medoids: The coordinates of the cluster medoids.

Report

The fields of report(mach) are:

  • assignments: The cluster assignments of each point in the training data.
  • cluster_labels: The labels assigned to each cluster.

Examples

using MLJ
+KMedoids · MLJ

KMedoids

KMedoids

A model type for constructing a K-medoids clusterer, based on Clustering.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

KMedoids = @load KMedoids pkg=Clustering

Do model = KMedoids() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KMedoids(k=...).

K-medoids is a clustering algorithm that works by finding $k$ data points (called medoids) such that the total distance between each data point and the closest medoid is minimal.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • k=3: The number of centroids to use in clustering.

  • metric::SemiMetric=Distances.SqEuclidean: The metric used to calculate the clustering. Must have type PreMetric from Distances.jl.

  • init (defaults to :kmpp): how medoids should be initialized, could be one of the following:

    • :kmpp: KMeans++
    • :kmenc: K-medoids initialization based on centrality
    • :rand: random
    • an instance of Clustering.SeedingAlgorithm from Clustering.jl
    • an integer vector of length k that provides the indices of points to use as initial medoids.

    See documentation of Clustering.jl.

Operations

  • predict(mach, Xnew): return cluster label assignments, given new features Xnew having the same Scitype as X above.
  • transform(mach, Xnew): instead return the mean pairwise distances from new samples to the cluster centers.
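
A hedged sketch of the two operations above on synthetic data (make_blobs is the MLJ data generator; everything else follows the signatures just listed):

using MLJ
KMedoids = @load KMedoids pkg=Clustering
X, _ = make_blobs(200, 2)
mach = machine(KMedoids(k=3), X) |> fit!
labels = predict(mach, X)     ## cluster label assignments
dists  = transform(mach, X)   ## distances from each point to the cluster medoids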

Fitted parameters

The fields of fitted_params(mach) are:

  • medoids: The coordinates of the cluster medoids.

Report

The fields of report(mach) are:

  • assignments: The cluster assignments of each point in the training data.
  • cluster_labels: The labels assigned to each cluster.

Examples

using MLJ
 KMedoids = @load KMedoids pkg=Clustering
 
 table = load_iris()
@@ -17,4 +17,4 @@
 
 @assert center_dists[1][1] == 0.0
 @assert center_dists[2][2] == 0.0
-@assert center_dists[3][3] == 0.0

See also KMeans

+@assert center_dists[3][3] == 0.0

See also KMeans

diff --git a/dev/models/KNNClassifier_NearestNeighborModels/index.html b/dev/models/KNNClassifier_NearestNeighborModels/index.html index 5476df758..4e26cb9ac 100644 --- a/dev/models/KNNClassifier_NearestNeighborModels/index.html +++ b/dev/models/KNNClassifier_NearestNeighborModels/index.html @@ -1,5 +1,5 @@ -KNNClassifier · MLJ

KNNClassifier

KNNClassifier

A model type for constructing a K-nearest neighbor classifier, based on NearestNeighborModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

KNNClassifier = @load KNNClassifier pkg=NearestNeighborModels

Do model = KNNClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KNNClassifier(K=...).

KNNClassifier implements the K-nearest neighbors classifier, a non-parametric algorithm that predicts a discrete class distribution associated with a new point by taking a vote over the classes of the k nearest points. Each neighbor's vote is assigned a weight based on the proximity of that neighbor to the test point, according to a specified distance metric.

For more information about the weighting kernels, see the paper by Geler et al., Comparison of different weighting schemes for the kNN classifier on time-series data.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

OR

mach = machine(model, X, y, w)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any AbstractVector whose element scitype is <:Finite (<:Multiclass or <:OrderedFactor will do); check the scitype with scitype(y)
  • w is the observation weights which can either be nothing (default) or an AbstractVector whose element scitype is Count or Continuous. This is different from weights kernel which is a model hyperparameter, see below.

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • K::Int=5 : number of neighbors
  • algorithm::Symbol = :kdtree : one of (:kdtree, :brutetree, :balltree)
  • metric::Metric = Euclidean() : any Metric from Distances.jl for the distance between points. For algorithm = :kdtree only metrics which are instances of Union{Distances.Chebyshev, Distances.Cityblock, Distances.Euclidean, Distances.Minkowski, Distances.WeightedCityblock, Distances.WeightedEuclidean, Distances.WeightedMinkowski} are supported.
  • leafsize::Int = 10 : determines the number of points at which to stop splitting the tree. This option is ignored and always taken as 0 for algorithm = :brutetree, since brutetree isn't actually a tree.
  • reorder::Bool = true : if true then points which are close in distance are placed close in memory. In this case, a copy of the original data will be made so that the original data is left unmodified. Setting this to true can significantly improve performance of the specified algorithm (except :brutetree). This option is ignored and always taken as false for algorithm = :brutetree.
  • weights::KNNKernel=Uniform() : kernel used in assigning weights to the k-nearest neighbors for each observation. An instance of one of the types in list_kernels(). User-defined weighting functions can be passed by wrapping the function in a UserDefinedKernel kernel (do ?NearestNeighborModels.UserDefinedKernel for more info). If observation weights w are passed during machine construction then the weight assigned to each neighbor vote is the product of the kernel generated weight for that neighbor and the corresponding observation weight.

Operations

  • predict(mach, Xnew): Return predictions of the target given features Xnew, which should have same scitype as X above. Predictions are probabilistic but uncalibrated.
  • predict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • tree: An instance of either KDTree, BruteTree or BallTree, depending on the value of the algorithm hyperparameter (see the hyper-parameters section above). These are data structures that store the training data so as to make nearest neighbor searches on test data points quicker.

Examples

using MLJ
+KNNClassifier · MLJ

KNNClassifier

KNNClassifier

A model type for constructing a K-nearest neighbor classifier, based on NearestNeighborModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

KNNClassifier = @load KNNClassifier pkg=NearestNeighborModels

Do model = KNNClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KNNClassifier(K=...).

KNNClassifier implements the K-nearest neighbors classifier, a non-parametric algorithm that predicts a discrete class distribution associated with a new point by taking a vote over the classes of the k nearest points. Each neighbor's vote is assigned a weight based on the proximity of that neighbor to the test point, according to a specified distance metric.

For more information about the weighting kernels, see the paper by Geler et al., Comparison of different weighting schemes for the kNN classifier on time-series data.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

OR

mach = machine(model, X, y, w)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any AbstractVector whose element scitype is <:Finite (<:Multiclass or <:OrderedFactor will do); check the scitype with scitype(y)
  • w is the observation weights which can either be nothing (default) or an AbstractVector whose element scitype is Count or Continuous. This is different from weights kernel which is a model hyperparameter, see below.

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • K::Int=5 : number of neighbors
  • algorithm::Symbol = :kdtree : one of (:kdtree, :brutetree, :balltree)
  • metric::Metric = Euclidean() : any Metric from Distances.jl for the distance between points. For algorithm = :kdtree only metrics which are instances of Union{Distances.Chebyshev, Distances.Cityblock, Distances.Euclidean, Distances.Minkowski, Distances.WeightedCityblock, Distances.WeightedEuclidean, Distances.WeightedMinkowski} are supported.
  • leafsize::Int = 10 : determines the number of points at which to stop splitting the tree. This option is ignored and always taken as 0 for algorithm = :brutetree, since brutetree isn't actually a tree.
  • reorder::Bool = true : if true then points which are close in distance are placed close in memory. In this case, a copy of the original data will be made so that the original data is left unmodified. Setting this to true can significantly improve performance of the specified algorithm (except :brutetree). This option is ignored and always taken as false for algorithm = :brutetree.
  • weights::KNNKernel=Uniform() : kernel used in assigning weights to the k-nearest neighbors for each observation. An instance of one of the types in list_kernels(). User-defined weighting functions can be passed by wrapping the function in a UserDefinedKernel kernel (do ?NearestNeighborModels.UserDefinedKernel for more info). If observation weights w are passed during machine construction then the weight assigned to each neighbor vote is the product of the kernel generated weight for that neighbor and the corresponding observation weight.
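
As a hedged illustration of a non-uniform weighting kernel (Inverse is one of the kernels provided by NearestNeighborModels, as used in the KNNRegressor example elsewhere in these docs; the rest follows the interface described above):

using MLJ
import NearestNeighborModels
KNNClassifier = @load KNNClassifier pkg=NearestNeighborModels
X, y = @load_crabs
model = KNNClassifier(K=7, weights=NearestNeighborModels.Inverse())
mach = machine(model, X, y) |> fit!
probs  = predict(mach, X)       ## probabilistic (uncalibrated) predictions
labels = predict_mode(mach, X)  ## point predictions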

Operations

  • predict(mach, Xnew): Return predictions of the target given features Xnew, which should have same scitype as X above. Predictions are probabilistic but uncalibrated.
  • predict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • tree: An instance of either KDTree, BruteTree or BallTree, depending on the value of the algorithm hyperparameter (see the hyper-parameters section above). These are data structures that store the training data so as to make nearest neighbor searches on test data points quicker.

Examples

using MLJ
 KNNClassifier = @load KNNClassifier pkg=NearestNeighborModels
 X, y = @load_crabs; ## a table and a vector from the crabs dataset
 ## view possible kernels
@@ -9,4 +9,4 @@
 mach = machine(model, X, y) |> fit! ## wrap model and required data in an MLJ machine and fit
 y_hat = predict(mach, X)
 labels = predict_mode(mach, X)
-

See also MultitargetKNNClassifier

+

See also MultitargetKNNClassifier

diff --git a/dev/models/KNNDetector_OutlierDetectionNeighbors/index.html b/dev/models/KNNDetector_OutlierDetectionNeighbors/index.html index 74c2f6b60..3c1a43955 100644 --- a/dev/models/KNNDetector_OutlierDetectionNeighbors/index.html +++ b/dev/models/KNNDetector_OutlierDetectionNeighbors/index.html @@ -1,5 +1,5 @@ -KNNDetector · MLJ

KNNDetector

KNNDetector(k=5,
+KNNDetector · MLJ

KNNDetector

KNNDetector(k=5,
             metric=Euclidean,
             algorithm=:kdtree,
             leafsize=10,
@@ -8,4 +8,4 @@
 detector = KNNDetector()
 X = rand(10, 100)
 model, result = fit(detector, X; verbosity=0)
-test_scores = transform(detector, model, X)

References

[1] Ramaswamy, Sridhar; Rastogi, Rajeev; Shim, Kyuseok (2000): Efficient Algorithms for Mining Outliers from Large Data Sets.

[2] Angiulli, Fabrizio; Pizzuti, Clara (2002): Fast Outlier Detection in High Dimensional Spaces.

+test_scores = transform(detector, model, X)

References

[1] Ramaswamy, Sridhar; Rastogi, Rajeev; Shim, Kyuseok (2000): Efficient Algorithms for Mining Outliers from Large Data Sets.

[2] Angiulli, Fabrizio; Pizzuti, Clara (2002): Fast Outlier Detection in High Dimensional Spaces.

diff --git a/dev/models/KNNDetector_OutlierDetectionPython/index.html b/dev/models/KNNDetector_OutlierDetectionPython/index.html index 852993c4e..f13c99cdd 100644 --- a/dev/models/KNNDetector_OutlierDetectionPython/index.html +++ b/dev/models/KNNDetector_OutlierDetectionPython/index.html @@ -1,5 +1,5 @@ -KNNDetector · MLJ

KNNDetector

KNNDetector(n_neighbors = 5,
+KNNDetector · MLJ
+               n_jobs = 1)

https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.knn
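
No MLJ-side example accompanies this entry; as a hedged sketch (assuming the PyOD backend is installed and that, as for the native Julia detector above, transform returns raw outlier scores):

using MLJ
KNNDetector = @load KNNDetector pkg=OutlierDetectionPython
detector = KNNDetector(n_neighbors=5, n_jobs=1)
X, _ = make_blobs(100, 3)
mach = machine(detector, X) |> fit!
scores = transform(mach, X)   ## raw outlier scores (assumption, see lead-in)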

diff --git a/dev/models/KNNRegressor_NearestNeighborModels/index.html b/dev/models/KNNRegressor_NearestNeighborModels/index.html index 0e981607f..f8f6fa731 100644 --- a/dev/models/KNNRegressor_NearestNeighborModels/index.html +++ b/dev/models/KNNRegressor_NearestNeighborModels/index.html @@ -1,5 +1,5 @@ -KNNRegressor · MLJ

KNNRegressor

KNNRegressor

A model type for constructing a K-nearest neighbor regressor, based on NearestNeighborModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

KNNRegressor = @load KNNRegressor pkg=NearestNeighborModels

Do model = KNNRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KNNRegressor(K=...).

KNNRegressor implements the K-nearest neighbors regressor, a non-parametric algorithm that predicts the response associated with a new point by taking a weighted average of the responses of the K nearest points.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

OR

mach = machine(model, X, y, w)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any table of responses whose element scitype is Continuous; check the scitype with scitype(y).
  • w is the observation weights which can either be nothing (default) or an AbstractVector whose element scitype is Count or Continuous. This is different from the weights kernel, which is a model hyperparameter; see below.

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • K::Int=5 : number of neighbors
  • algorithm::Symbol = :kdtree : one of (:kdtree, :brutetree, :balltree)
  • metric::Metric = Euclidean() : any Metric from Distances.jl for the distance between points. For algorithm = :kdtree only metrics which are instances of Union{Distances.Chebyshev, Distances.Cityblock, Distances.Euclidean, Distances.Minkowski, Distances.WeightedCityblock, Distances.WeightedEuclidean, Distances.WeightedMinkowski} are supported.
  • leafsize::Int = 10 : determines the number of points at which to stop splitting the tree. This option is ignored and always taken as 0 for algorithm = :brutetree, since brutetree isn't actually a tree.
  • reorder::Bool = true : if true then points which are close in distance are placed close in memory. In this case, a copy of the original data will be made so that the original data is left unmodified. Setting this to true can significantly improve performance of the specified algorithm (except :brutetree). This option is ignored and always taken as false for algorithm = :brutetree.
  • weights::KNNKernel=Uniform() : kernel used in assigning weights to the k-nearest neighbors for each observation. An instance of one of the types in list_kernels(). User-defined weighting functions can be passed by wrapping the function in a UserDefinedKernel kernel (do ?NearestNeighborModels.UserDefinedKernel for more info). If observation weights w are passed during machine construction then the weight assigned to each neighbor vote is the product of the kernel generated weight for that neighbor and the corresponding observation weight.

Operations

  • predict(mach, Xnew): Return predictions of the target given features Xnew, which should have same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • tree: An instance of either KDTree, BruteTree or BallTree, depending on the value of the algorithm hyperparameter (see the hyper-parameters section above). These are data structures that store the training data so as to make nearest neighbor searches on test data points quicker.

Examples

using MLJ
+KNNRegressor · MLJ

KNNRegressor

KNNRegressor

A model type for constructing a K-nearest neighbor regressor, based on NearestNeighborModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

KNNRegressor = @load KNNRegressor pkg=NearestNeighborModels

Do model = KNNRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KNNRegressor(K=...).

KNNRegressor implements the K-nearest neighbors regressor, a non-parametric algorithm that predicts the response associated with a new point by taking a weighted average of the responses of the K nearest points.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

OR

mach = machine(model, X, y, w)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any table of responses whose element scitype is Continuous; check the scitype with scitype(y).
  • w is the observation weights which can either be nothing (default) or an AbstractVector whose element scitype is Count or Continuous. This is different from the weights kernel, which is a model hyperparameter; see below.

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • K::Int=5 : number of neighbors
  • algorithm::Symbol = :kdtree : one of (:kdtree, :brutetree, :balltree)
  • metric::Metric = Euclidean() : any Metric from Distances.jl for the distance between points. For algorithm = :kdtree only metrics which are instances of Union{Distances.Chebyshev, Distances.Cityblock, Distances.Euclidean, Distances.Minkowski, Distances.WeightedCityblock, Distances.WeightedEuclidean, Distances.WeightedMinkowski} are supported.
  • leafsize::Int = 10 : determines the number of points at which to stop splitting the tree. This option is ignored and always taken as 0 for algorithm = :brutetree, since brutetree isn't actually a tree.
  • reorder::Bool = true : if true then points which are close in distance are placed close in memory. In this case, a copy of the original data will be made so that the original data is left unmodified. Setting this to true can significantly improve performance of the specified algorithm (except :brutetree). This option is ignored and always taken as false for algorithm = :brutetree.
  • weights::KNNKernel=Uniform() : kernel used in assigning weights to the k-nearest neighbors for each observation. An instance of one of the types in list_kernels(). User-defined weighting functions can be passed by wrapping the function in a UserDefinedKernel kernel (do ?NearestNeighborModels.UserDefinedKernel for more info). If observation weights w are passed during machine construction then the weight assigned to each neighbor vote is the product of the kernel generated weight for that neighbor and the corresponding observation weight.

Operations

  • predict(mach, Xnew): Return predictions of the target given features Xnew, which should have same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • tree: An instance of either KDTree, BruteTree or BallTree, depending on the value of the algorithm hyperparameter (see the hyper-parameters section above). These are data structures that store the training data so as to make nearest neighbor searches on test data points quicker.

Examples

using MLJ
 KNNRegressor = @load KNNRegressor pkg=NearestNeighborModels
 X, y = @load_boston; ## loads the Boston housing dataset from MLJBase
 ## view possible kernels
@@ -7,4 +7,4 @@
 model = KNNRegressor(weights = NearestNeighborModels.Inverse()) #KNNRegressor instantiation
 mach = machine(model, X, y) |> fit! ## wrap model and required data in an MLJ machine and fit
 y_hat = predict(mach, X)
-

See also MultitargetKNNRegressor

+

See also MultitargetKNNRegressor

diff --git a/dev/models/KNeighborsClassifier_MLJScikitLearnInterface/index.html b/dev/models/KNeighborsClassifier_MLJScikitLearnInterface/index.html index 991b2859b..bc22f5b3b 100644 --- a/dev/models/KNeighborsClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/KNeighborsClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -KNeighborsClassifier · MLJ

KNeighborsClassifier

KNeighborsClassifier

A model type for constructing a K-nearest neighbors classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

KNeighborsClassifier = @load KNeighborsClassifier pkg=MLJScikitLearnInterface

Do model = KNeighborsClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KNeighborsClassifier(n_neighbors=...).

Hyper-parameters

  • n_neighbors = 5
  • weights = uniform
  • algorithm = auto
  • leaf_size = 30
  • p = 2
  • metric = minkowski
  • metric_params = nothing
  • n_jobs = nothing
+KNeighborsClassifier · MLJ

KNeighborsClassifier

KNeighborsClassifier

A model type for constructing a K-nearest neighbors classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

KNeighborsClassifier = @load KNeighborsClassifier pkg=MLJScikitLearnInterface

Do model = KNeighborsClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KNeighborsClassifier(n_neighbors=...).

Hyper-parameters

  • n_neighbors = 5
  • weights = uniform
  • algorithm = auto
  • leaf_size = 30
  • p = 2
  • metric = minkowski
  • metric_params = nothing
  • n_jobs = nothing
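
The defaults above mirror scikit-learn's; a hedged construction sketch follows (the "distance" weighting string is an assumption carried over from scikit-learn's documented options, not from this page):

using MLJ
KNeighborsClassifier = @load KNeighborsClassifier pkg=MLJScikitLearnInterface
X, y = @load_iris
model = KNeighborsClassifier(n_neighbors=7, weights="distance")
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)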
diff --git a/dev/models/KNeighborsRegressor_MLJScikitLearnInterface/index.html b/dev/models/KNeighborsRegressor_MLJScikitLearnInterface/index.html index 3097b801c..5961feb05 100644 --- a/dev/models/KNeighborsRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/KNeighborsRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -KNeighborsRegressor · MLJ

KNeighborsRegressor

KNeighborsRegressor

A model type for constructing a K-nearest neighbors regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

KNeighborsRegressor = @load KNeighborsRegressor pkg=MLJScikitLearnInterface

Do model = KNeighborsRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KNeighborsRegressor(n_neighbors=...).

Hyper-parameters

  • n_neighbors = 5
  • weights = uniform
  • algorithm = auto
  • leaf_size = 30
  • p = 2
  • metric = minkowski
  • metric_params = nothing
  • n_jobs = nothing
+KNeighborsRegressor · MLJ

KNeighborsRegressor

KNeighborsRegressor

A model type for constructing a K-nearest neighbors regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

KNeighborsRegressor = @load KNeighborsRegressor pkg=MLJScikitLearnInterface

Do model = KNeighborsRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KNeighborsRegressor(n_neighbors=...).

Hyper-parameters

  • n_neighbors = 5
  • weights = uniform
  • algorithm = auto
  • leaf_size = 30
  • p = 2
  • metric = minkowski
  • metric_params = nothing
  • n_jobs = nothing
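
Similarly, a hedged sketch for the regressor (make_regression is the MLJ synthetic-data helper; hyper-parameters not shown are left at the defaults listed above):

using MLJ
KNeighborsRegressor = @load KNeighborsRegressor pkg=MLJScikitLearnInterface
X, y = make_regression(100, 4)
model = KNeighborsRegressor(n_neighbors=10)
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)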
diff --git a/dev/models/KPLSRegressor_PartialLeastSquaresRegressor/index.html b/dev/models/KPLSRegressor_PartialLeastSquaresRegressor/index.html index a6c55b585..f52705962 100644 --- a/dev/models/KPLSRegressor_PartialLeastSquaresRegressor/index.html +++ b/dev/models/KPLSRegressor_PartialLeastSquaresRegressor/index.html @@ -1,2 +1,2 @@ -KPLSRegressor · MLJ

KPLSRegressor

A Kernel Partial Least Squares Regressor, implementing a kernel PLS2 NIPALS algorithm. Can be used mainly for regression.

+KPLSRegressor · MLJ

KPLSRegressor

A Kernel Partial Least Squares Regressor, implementing a kernel PLS2 NIPALS algorithm. Can be used mainly for regression.
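
As no example is given on this page, here is a minimal hedged sketch of fitting this model through the standard MLJ workflow (hyper-parameters are left at their defaults, since they are not documented here):

using MLJ
KPLSRegressor = @load KPLSRegressor pkg=PartialLeastSquaresRegressor
X, y = make_regression(100, 6)
mach = machine(KPLSRegressor(), X, y) |> fit!
yhat = predict(mach, X)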

diff --git a/dev/models/KernelPCA_MultivariateStats/index.html b/dev/models/KernelPCA_MultivariateStats/index.html index 0cd5bf08f..0774dc1b0 100644 --- a/dev/models/KernelPCA_MultivariateStats/index.html +++ b/dev/models/KernelPCA_MultivariateStats/index.html @@ -1,5 +1,5 @@ -KernelPCA · MLJ

KernelPCA

KernelPCA

A model type for constructing a kernel principal component analysis model, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

KernelPCA = @load KernelPCA pkg=MultivariateStats

Do model = KernelPCA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KernelPCA(maxoutdim=...).

In kernel PCA the linear operations of ordinary principal component analysis are performed in a reproducing Hilbert space.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • maxoutdim=0: Controls the dimension (number of columns) of the output, outdim. Specifically, outdim = min(n, indim, maxoutdim), where n is the number of observations and indim the input dimension.
  • kernel::Function=(x,y)->x'y: The kernel function, takes in 2 vector arguments x and y, returns a scalar value. Defaults to the dot product of x and y.
  • solver::Symbol=:eig: solver to use for the eigenvalues, one of :eig(default, uses LinearAlgebra.eigen), :eigs(uses Arpack.eigs).
  • inverse::Bool=true: perform calculations needed for inverse transform
  • beta::Real=1.0: strength of the ridge regression that learns the inverse transform when inverse is true.
  • tol::Real=0.0: Convergence tolerance for eigenvalue solver.
  • maxiter::Int=300: maximum number of iterations for eigenvalue solver.

Operations

  • transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.
  • inverse_transform(mach, Xsmall): For a dimension-reduced table Xsmall, such as returned by transform, reconstruct a table, having same the number of columns as the original training data X, that transforms to Xsmall. Mathematically, inverse_transform is a right-inverse for the PCA projection map, whose image is orthogonal to the kernel of that map. In particular, if Xsmall = transform(mach, Xnew), then inverse_transform(Xsmall) is only an approximation to Xnew.

Fitted parameters

The fields of fitted_params(mach) are:

  • projection: Returns the projection matrix, which has size (indim, outdim), where indim and outdim are the number of features of the input and output respectively.

Report

The fields of report(mach) are:

  • indim: Dimension (number of columns) of the training data and new data to be transformed.
  • outdim: Dimension of transformed data.
  • principalvars: The variance of the principal components.

Examples

using MLJ
+KernelPCA · MLJ

KernelPCA

KernelPCA

A model type for constructing a kernel principal component analysis model, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

KernelPCA = @load KernelPCA pkg=MultivariateStats

Do model = KernelPCA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KernelPCA(maxoutdim=...).

In kernel PCA the linear operations of ordinary principal component analysis are performed in a reproducing Hilbert space.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • maxoutdim=0: Controls the dimension (number of columns) of the output, outdim. Specifically, outdim = min(n, indim, maxoutdim), where n is the number of observations and indim the input dimension.
  • kernel::Function=(x,y)->x'y: The kernel function, takes in 2 vector arguments x and y, returns a scalar value. Defaults to the dot product of x and y.
  • solver::Symbol=:eig: solver to use for the eigenvalues, one of :eig(default, uses LinearAlgebra.eigen), :eigs(uses Arpack.eigs).
  • inverse::Bool=true: perform calculations needed for inverse transform
  • beta::Real=1.0: strength of the ridge regression that learns the inverse transform when inverse is true.
  • tol::Real=0.0: Convergence tolerance for eigenvalue solver.
  • maxiter::Int=300: maximum number of iterations for eigenvalue solver.

Operations

  • transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.
  • inverse_transform(mach, Xsmall): For a dimension-reduced table Xsmall, such as returned by transform, reconstruct a table, having same the number of columns as the original training data X, that transforms to Xsmall. Mathematically, inverse_transform is a right-inverse for the PCA projection map, whose image is orthogonal to the kernel of that map. In particular, if Xsmall = transform(mach, Xnew), then inverse_transform(Xsmall) is only an approximation to Xnew.
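
A hedged sketch of the round trip described above (the default linear kernel and inverse=true are assumed):

using MLJ
KernelPCA = @load KernelPCA pkg=MultivariateStats
X, _ = @load_iris
mach = machine(KernelPCA(maxoutdim=2), X) |> fit!
Xsmall  = transform(mach, X)               ## 2-column projection
Xapprox = inverse_transform(mach, Xsmall)  ## approximate reconstruction with the original number of columns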

Fitted parameters

The fields of fitted_params(mach) are:

  • projection: Returns the projection matrix, which has size (indim, outdim), where indim and outdim are the number of features of the input and output respectively.

Report

The fields of report(mach) are:

  • indim: Dimension (number of columns) of the training data and new data to be transformed.
  • outdim: Dimension of transformed data.
  • principalvars: The variance of the principal components.

Examples

using MLJ
 using LinearAlgebra
 
 KernelPCA = @load KernelPCA pkg=MultivariateStats
@@ -13,4 +13,4 @@
 model = KernelPCA(maxoutdim=2, kernel=rbf_kernel(1))
 mach = machine(model, X) |> fit!
 
-Xproj = transform(mach, X)

See also PCA, ICA, FactorAnalysis, PPCA

+Xproj = transform(mach, X)

See also PCA, ICA, FactorAnalysis, PPCA

diff --git a/dev/models/KernelPerceptronClassifier_BetaML/index.html b/dev/models/KernelPerceptronClassifier_BetaML/index.html index a0cdc9fe2..14874bc39 100644 --- a/dev/models/KernelPerceptronClassifier_BetaML/index.html +++ b/dev/models/KernelPerceptronClassifier_BetaML/index.html @@ -1,5 +1,5 @@ -KernelPerceptronClassifier · MLJ

KernelPerceptronClassifier

mutable struct KernelPerceptronClassifier <: MLJModelInterface.Probabilistic

The kernel perceptron algorithm using one-vs-one for multiclass, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • kernel::Function: Kernel function to employ. See ?radial_kernel or ?polynomial_kernel (once the BetaML package is loaded) for details, or check ?BetaML.Utils to verify whether other kernels are defined (you can always define your own kernel) [def: radial_kernel]
  • epochs::Int64: Maximum number of epochs, i.e. passes through the whole training sample [def: 100]
  • initial_errors::Union{Nothing, Vector{Vector{Int64}}}: Initial distribution of the number of errors [def: nothing, i.e. zeros]. If provided, this should be a vector of length nModels, whose elements are vectors of nRecords integer values, where nModels is computed as (n_classes * (n_classes - 1)) / 2
  • shuffle::Bool: Whether to randomly shuffle the data at each iteration (epoch) [def: true]
  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]

Example:

julia> using MLJ
+KernelPerceptronClassifier · MLJ

KernelPerceptronClassifier

mutable struct KernelPerceptronClassifier <: MLJModelInterface.Probabilistic

The kernel perceptron algorithm using one-vs-one for multiclass, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • kernel::Function: Kernel function to employ. See ?radial_kernel or ?polynomial_kernel (once the BetaML package is loaded) for details, or check ?BetaML.Utils to verify whether other kernels are defined (you can always define your own kernel) [def: radial_kernel]
  • epochs::Int64: Maximum number of epochs, i.e. passes through the whole training sample [def: 100]
  • initial_errors::Union{Nothing, Vector{Vector{Int64}}}: Initial distribution of the number of errors [def: nothing, i.e. zeros]. If provided, this should be a vector of length nModels, whose elements are vectors of nRecords integer values, where nModels is computed as (n_classes * (n_classes - 1)) / 2
  • shuffle::Bool: Whether to randomly shuffle the data at each iteration (epoch) [def: true]
  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]

Example:

julia> using MLJ
 
 julia> X, y        = @load_iris;
 
@@ -26,4 +26,4 @@
  UnivariateFinite{Multiclass{3}}(setosa=>0.665, versicolor=>0.245, virginica=>0.09)
  ⋮
  UnivariateFinite{Multiclass{3}}(setosa=>0.09, versicolor=>0.245, virginica=>0.665)
- UnivariateFinite{Multiclass{3}}(setosa=>0.09, versicolor=>0.665, virginica=>0.245)
+ UnivariateFinite{Multiclass{3}}(setosa=>0.09, versicolor=>0.665, virginica=>0.245)
diff --git a/dev/models/LADRegressor_MLJLinearModels/index.html b/dev/models/LADRegressor_MLJLinearModels/index.html index 47c4ece57..76903426f 100644 --- a/dev/models/LADRegressor_MLJLinearModels/index.html +++ b/dev/models/LADRegressor_MLJLinearModels/index.html @@ -1,6 +1,6 @@ -LADRegressor · MLJ

LADRegressor

LADRegressor

A model type for constructing a lad regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LADRegressor = @load LADRegressor pkg=MLJLinearModels

Do model = LADRegressor() to construct an instance with default hyper-parameters.

Least absolute deviation regression is a linear model with objective function

$∑ρ(Xθ - y) + n⋅λ|θ|₂² + n⋅γ|θ|₁$

where $ρ$ is the absolute loss and $n$ is the number of observations.

If scale_penalty_with_samples = false the objective function is instead

$∑ρ(Xθ - y) + λ|θ|₂² + γ|θ|₁$.

Different solver options exist, as indicated under "Hyperparameters" below.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

See also RobustRegressor.

Parameters

  • lambda::Real: strength of the regularizer if penalty is :l2 or :l1. Strength of the L2 regularizer if penalty is :en. Default: 1.0

  • gamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0

  • penalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2

  • fit_intercept::Bool: whether to fit the intercept or not. Default: true

  • penalize_intercept::Bool: whether to penalize the intercept. Default: false

  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true

  • solver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S where S is one of: LBFGS, IWLSCG, if penalty = :l2, and ProxGrad otherwise.

    If solver = nothing (default) then LBFGS() is used, if penalty = :l2, and otherwise ProxGrad(accel=true) (FISTA) is used.

    Solver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) Default: nothing

Example

using MLJ
+LADRegressor · MLJ

LADRegressor

LADRegressor

A model type for constructing a lad regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LADRegressor = @load LADRegressor pkg=MLJLinearModels

Do model = LADRegressor() to construct an instance with default hyper-parameters.

Least absolute deviation regression is a linear model with objective function

$∑ρ(Xθ - y) + n⋅λ|θ|₂² + n⋅γ|θ|₁$

where $ρ$ is the absolute loss and $n$ is the number of observations.

If scale_penalty_with_samples = false the objective function is instead

$∑ρ(Xθ - y) + λ|θ|₂² + γ|θ|₁$.

Different solver options exist, as indicated under "Hyperparameters" below.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

See also RobustRegressor.

Parameters

  • lambda::Real: strength of the regularizer if penalty is :l2 or :l1. Strength of the L2 regularizer if penalty is :en. Default: 1.0

  • gamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0

  • penalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2

  • fit_intercept::Bool: whether to fit the intercept or not. Default: true

  • penalize_intercept::Bool: whether to penalize the intercept. Default: false

  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true

  • solver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S where S is one of: LBFGS, IWLSCG, if penalty = :l2, and ProxGrad otherwise.

    If solver = nothing (default) then LBFGS() is used, if penalty = :l2, and otherwise ProxGrad(accel=true) (FISTA) is used.

    Solver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) Default: nothing
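
A hedged sketch of selecting a non-default penalty and solver via the aliases just mentioned (compare with the Example below, which uses the defaults):

using MLJ
import MLJLinearModels
LADRegressor = @load LADRegressor pkg=MLJLinearModels
X, y = make_regression(100, 3)
model = LADRegressor(lambda=0.1, penalty=:l1, solver=MLJLinearModels.FISTA())
mach = machine(model, X, y) |> fit!
fitted_params(mach)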

Example

using MLJ
 X, y = make_regression()
 mach = fit!(machine(LADRegressor(), X, y))
 predict(mach, X)
-fitted_params(mach)
+fitted_params(mach)
diff --git a/dev/models/LDA_MultivariateStats/index.html b/dev/models/LDA_MultivariateStats/index.html index d742d8355..be5eaaa58 100644 --- a/dev/models/LDA_MultivariateStats/index.html +++ b/dev/models/LDA_MultivariateStats/index.html @@ -1,5 +1,5 @@ -LDA · MLJ

LDA

LDA

A model type for constructing a linear discriminant analysis model, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LDA = @load LDA pkg=MultivariateStats

Do model = LDA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LDA(method=...).

Multiclass linear discriminant analysis learns a projection in a space of features to a lower dimensional space, in a way that attempts to preserve as much as possible the degree to which the classes of a discrete target variable can be discriminated. This can be used either for dimension reduction of the features (see transform below) or for probabilistic classification of the target (see predict below).

In the case of prediction, the class probability for a new observation reflects the proximity of that observation to training observations associated with that class, and how far away the observation is from observations associated with other classes. Specifically, the distances, in the transformed (projected) space, of a new observation from the centroid of each target class are computed; the resulting vector of distances, multiplied by minus one, is passed to a softmax function to obtain a class probability prediction. Here "distance" is computed using a user-specified distance function.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any AbstractVector whose element scitype is OrderedFactor or Multiclass; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • method::Symbol=:gevd: The solver, one of :gevd or :whiten methods.
  • cov_w::StatsBase.SimpleCovariance(): An estimator for the within-class covariance (used in computing the within-class scatter matrix, Sw). Any robust estimator from CovarianceEstimation.jl can be used.
  • cov_b::StatsBase.SimpleCovariance(): The same as cov_w but for the between-class covariance (used in computing the between-class scatter matrix, Sb).
  • outdim::Int=0: The output dimension, i.e dimension of the transformed space, automatically set to min(indim, nclasses-1) if equal to 0.
  • regcoef::Float64=1e-6: The regularization coefficient. A positive value regcoef*eigmax(Sw) where Sw is the within-class scatter matrix, is added to the diagonal of Sw to improve numerical stability. This can be useful if using the standard covariance estimator.
  • dist=Distances.SqEuclidean(): The distance metric to use when performing classification (to compare the distance between a new point and centroids in the transformed space); must be a subtype of Distances.SemiMetric from Distances.jl, e.g., Distances.CosineDist.

Operations

  • transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.
  • predict(mach, Xnew): Return predictions of the target given features Xnew having the same scitype as X above. Predictions are probabilistic but uncalibrated.
  • predict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • classes: The classes seen during model fitting.
  • projection_matrix: The learned projection matrix, of size (indim, outdim), where indim and outdim are the input and output dimensions respectively (See Report section below).

Report

The fields of report(mach) are:

  • indim: The dimension of the input space i.e the number of training features.
  • outdim: The dimension of the transformed space the model is projected to.
  • mean: The mean of the untransformed training data. A vector of length indim.
  • nclasses: The number of classes directly observed in the training data (which can be less than the total number of classes in the class pool).
  • class_means: The class-specific means of the training data. A matrix of size (indim, nclasses) with the ith column being the class-mean of the ith class in classes (See fitted params section above).
  • class_weights: The weights (class counts) of each class. A vector of length nclasses with the ith element being the class weight of the ith class in classes. (See fitted params section above.)
  • Sb: The between class scatter matrix.
  • Sw: The within class scatter matrix.

Examples

using MLJ
+LDA · MLJ

LDA

LDA

A model type for constructing a linear discriminant analysis model, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LDA = @load LDA pkg=MultivariateStats

Do model = LDA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LDA(method=...).

Multiclass linear discriminant analysis learns a projection in a space of features to a lower dimensional space, in a way that attempts to preserve as much as possible the degree to which the classes of a discrete target variable can be discriminated. This can be used either for dimension reduction of the features (see transform below) or for probabilistic classification of the target (see predict below).

In the case of prediction, the class probability for a new observation reflects the proximity of that observation to training observations associated with that class, and how far away the observation is from observations associated with other classes. Specifically, the distances, in the transformed (projected) space, of a new observation from the centroid of each target class are computed; the resulting vector of distances, multiplied by minus one, is passed to a softmax function to obtain a class probability prediction. Here "distance" is computed using a user-specified distance function.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any AbstractVector whose element scitype is OrderedFactor or Multiclass; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • method::Symbol=:gevd: The solver; either the :gevd or :whiten method.
  • cov_w::StatsBase.SimpleCovariance(): An estimator for the within-class covariance (used in computing the within-class scatter matrix, Sw). Any robust estimator from CovarianceEstimation.jl can be used.
  • cov_b::StatsBase.SimpleCovariance(): The same as cov_w but for the between-class covariance (used in computing the between-class scatter matrix, Sb).
  • outdim::Int=0: The output dimension, i.e., the dimension of the transformed space, automatically set to min(indim, nclasses-1) if equal to 0.
  • regcoef::Float64=1e-6: The regularization coefficient. A positive value, regcoef*eigmax(Sw), where Sw is the within-class scatter matrix, is added to the diagonal of Sw to improve numerical stability. This can be useful if using the standard covariance estimator.
  • dist=Distances.SqEuclidean(): The distance metric to use when performing classification (to compare the distance between a new point and centroids in the transformed space); must be a subtype of Distances.SemiMetric from Distances.jl, e.g., Distances.CosineDist.

Operations

  • transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.
  • predict(mach, Xnew): Return predictions of the target given features Xnew having the same scitype as X above. Predictions are probabilistic but uncalibrated.
  • predict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • classes: The classes seen during model fitting.
  • projection_matrix: The learned projection matrix, of size (indim, outdim), where indim and outdim are the input and output dimensions respectively (See Report section below).

Report

The fields of report(mach) are:

  • indim: The dimension of the input space, i.e., the number of training features.
  • outdim: The dimension of the transformed space the model is projected to.
  • mean: The mean of the untransformed training data. A vector of length indim.
  • nclasses: The number of classes directly observed in the training data (which can be less than the total number of classes in the class pool).
  • class_means: The class-specific means of the training data. A matrix of size (indim, nclasses) with the ith column being the class-mean of the ith class in classes (See fitted params section above).
  • class_weights: The weights (class counts) of each class. A vector of length nclasses with the ith element being the class weight of the ith class in classes. (See fitted params section above.)
  • Sb: The between class scatter matrix.
  • Sw: The within class scatter matrix.

Examples

using MLJ
 
 LDA = @load LDA pkg=MultivariateStats
 
@@ -11,4 +11,4 @@
 Xproj = transform(mach, X)
 y_hat = predict(mach, X)
 labels = predict_mode(mach, X)
-

See also BayesianLDA, SubspaceLDA, BayesianSubspaceLDA

+

See also BayesianLDA, SubspaceLDA, BayesianSubspaceLDA

diff --git a/dev/models/LGBMClassifier_LightGBM/index.html b/dev/models/LGBMClassifier_LightGBM/index.html index 89666d30c..5aac49fe2 100644 --- a/dev/models/LGBMClassifier_LightGBM/index.html +++ b/dev/models/LGBMClassifier_LightGBM/index.html @@ -1,2 +1,2 @@ -LGBMClassifier · MLJ
+LGBMClassifier · MLJ
diff --git a/dev/models/LGBMRegressor_LightGBM/index.html b/dev/models/LGBMRegressor_LightGBM/index.html index 23eaccc4e..95f37ff69 100644 --- a/dev/models/LGBMRegressor_LightGBM/index.html +++ b/dev/models/LGBMRegressor_LightGBM/index.html @@ -1,2 +1,2 @@ -LGBMRegressor · MLJ
+LGBMRegressor · MLJ
diff --git a/dev/models/LMDDDetector_OutlierDetectionPython/index.html b/dev/models/LMDDDetector_OutlierDetectionPython/index.html index 301bd3dc8..2b89ba5da 100644 --- a/dev/models/LMDDDetector_OutlierDetectionPython/index.html +++ b/dev/models/LMDDDetector_OutlierDetectionPython/index.html @@ -1,4 +1,4 @@ -LMDDDetector · MLJ

LMDDDetector

LMDDDetector(n_iter = 50,
+LMDDDetector · MLJ
+                random_state = nothing)

https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.lmdd

diff --git a/dev/models/LOCIDetector_OutlierDetectionPython/index.html b/dev/models/LOCIDetector_OutlierDetectionPython/index.html index e9f0e756b..eb42d2b30 100644 --- a/dev/models/LOCIDetector_OutlierDetectionPython/index.html +++ b/dev/models/LOCIDetector_OutlierDetectionPython/index.html @@ -1,3 +1,3 @@ -LOCIDetector · MLJ
+LOCIDetector · MLJ
diff --git a/dev/models/LODADetector_OutlierDetectionPython/index.html b/dev/models/LODADetector_OutlierDetectionPython/index.html index ffbff7ab8..f061dc7f4 100644 --- a/dev/models/LODADetector_OutlierDetectionPython/index.html +++ b/dev/models/LODADetector_OutlierDetectionPython/index.html @@ -1,3 +1,3 @@ -LODADetector · MLJ
+LODADetector · MLJ
diff --git a/dev/models/LOFDetector_OutlierDetectionNeighbors/index.html b/dev/models/LOFDetector_OutlierDetectionNeighbors/index.html index 43edeafa9..793098696 100644 --- a/dev/models/LOFDetector_OutlierDetectionNeighbors/index.html +++ b/dev/models/LOFDetector_OutlierDetectionNeighbors/index.html @@ -1,5 +1,5 @@ -LOFDetector · MLJ

LOFDetector

LOFDetector(k = 5,
+LOFDetector · MLJ

LOFDetector

LOFDetector(k = 5,
             metric = Euclidean(),
             algorithm = :kdtree,
             leafsize = 10,
@@ -8,4 +8,4 @@
 detector = LOFDetector()
 X = rand(10, 100)
 model, result = fit(detector, X; verbosity=0)
-test_scores = transform(detector, model, X)

References

[1] Breunig, Markus M.; Kriegel, Hans-Peter; Ng, Raymond T.; Sander, Jörg (2000): LOF: Identifying Density-Based Local Outliers.

+test_scores = transform(detector, model, X)

References

[1] Breunig, Markus M.; Kriegel, Hans-Peter; Ng, Raymond T.; Sander, Jörg (2000): LOF: Identifying Density-Based Local Outliers.

diff --git a/dev/models/LOFDetector_OutlierDetectionPython/index.html b/dev/models/LOFDetector_OutlierDetectionPython/index.html index 2e9e70593..1a858b2e0 100644 --- a/dev/models/LOFDetector_OutlierDetectionPython/index.html +++ b/dev/models/LOFDetector_OutlierDetectionPython/index.html @@ -1,9 +1,9 @@ -LOFDetector · MLJ

LOFDetector

LOFDetector(n_neighbors = 5,
+LOFDetector · MLJ
+               novelty = true)

https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.lof

diff --git a/dev/models/LarsCVRegressor_MLJScikitLearnInterface/index.html b/dev/models/LarsCVRegressor_MLJScikitLearnInterface/index.html index 4188dc743..5fc3fe41c 100644 --- a/dev/models/LarsCVRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/LarsCVRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -LarsCVRegressor · MLJ

LarsCVRegressor

LarsCVRegressor

A model type for constructing a least angle regressor with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LarsCVRegressor = @load LarsCVRegressor pkg=MLJScikitLearnInterface

Do model = LarsCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LarsCVRegressor(fit_intercept=...).

Hyper-parameters

  • fit_intercept = true
  • verbose = false
  • max_iter = 500
  • normalize = false
  • precompute = auto
  • cv = 5
  • max_n_alphas = 1000
  • n_jobs = nothing
  • eps = 2.220446049250313e-16
  • copy_X = true
+LarsCVRegressor · MLJ

LarsCVRegressor

LarsCVRegressor

A model type for constructing a least angle regressor with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LarsCVRegressor = @load LarsCVRegressor pkg=MLJScikitLearnInterface

Do model = LarsCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LarsCVRegressor(fit_intercept=...).

Hyper-parameters

  • fit_intercept = true
  • verbose = false
  • max_iter = 500
  • precompute = auto
  • cv = 5
  • max_n_alphas = 1000
  • n_jobs = nothing
  • eps = 2.220446049250313e-16
  • copy_X = true
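
A minimal usage sketch (not part of the docstring above; the synthetic data from make_regression and the cv override are illustrative assumptions):

using MLJ
LarsCVRegressor = @load LarsCVRegressor pkg=MLJScikitLearnInterface
model = LarsCVRegressor(cv=10)        ## override the default number of folds, for illustration
X, y = make_regression(100, 5)        ## synthetic table of Continuous features and target
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)
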
diff --git a/dev/models/LarsRegressor_MLJScikitLearnInterface/index.html b/dev/models/LarsRegressor_MLJScikitLearnInterface/index.html index 4529f03a3..32f3b35f0 100644 --- a/dev/models/LarsRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/LarsRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -LarsRegressor · MLJ

LarsRegressor

LarsRegressor

A model type for constructing a least angle regressor (LARS), based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LarsRegressor = @load LarsRegressor pkg=MLJScikitLearnInterface

Do model = LarsRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LarsRegressor(fit_intercept=...).

Hyper-parameters

  • fit_intercept = true
  • verbose = false
  • normalize = false
  • precompute = auto
  • n_nonzero_coefs = 500
  • eps = 2.220446049250313e-16
  • copy_X = true
  • fit_path = true
+LarsRegressor · MLJ

LarsRegressor

LarsRegressor

A model type for constructing a least angle regressor (LARS), based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LarsRegressor = @load LarsRegressor pkg=MLJScikitLearnInterface

Do model = LarsRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LarsRegressor(fit_intercept=...).

Hyper-parameters

  • fit_intercept = true
  • verbose = false
  • precompute = auto
  • n_nonzero_coefs = 500
  • eps = 2.220446049250313e-16
  • copy_X = true
  • fit_path = true
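
For instance, out-of-sample error can be estimated directly with MLJ's evaluate (a sketch; the n_nonzero_coefs value and the synthetic data are illustrative assumptions):

using MLJ
LarsRegressor = @load LarsRegressor pkg=MLJScikitLearnInterface
X, y = make_regression(100, 5)        ## synthetic regression data
evaluate(LarsRegressor(n_nonzero_coefs=3), X, y,
         resampling=CV(nfolds=3), measure=rms)
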
diff --git a/dev/models/LassoCVRegressor_MLJScikitLearnInterface/index.html b/dev/models/LassoCVRegressor_MLJScikitLearnInterface/index.html index 526c9c18e..393aec573 100644 --- a/dev/models/LassoCVRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/LassoCVRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -LassoCVRegressor · MLJ

LassoCVRegressor

LassoCVRegressor

A model type for constructing a lasso regressor with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LassoCVRegressor = @load LassoCVRegressor pkg=MLJScikitLearnInterface

Do model = LassoCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LassoCVRegressor(eps=...).

Hyper-parameters

  • eps = 0.001
  • n_alphas = 100
  • alphas = nothing
  • fit_intercept = true
  • precompute = auto
  • max_iter = 1000
  • tol = 0.0001
  • copy_X = true
  • cv = 5
  • verbose = false
  • n_jobs = nothing
  • positive = false
  • random_state = nothing
  • selection = cyclic
+LassoCVRegressor · MLJ

LassoCVRegressor

LassoCVRegressor

A model type for constructing a lasso regressor with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LassoCVRegressor = @load LassoCVRegressor pkg=MLJScikitLearnInterface

Do model = LassoCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LassoCVRegressor(eps=...).

Hyper-parameters

  • eps = 0.001
  • n_alphas = 100
  • alphas = nothing
  • fit_intercept = true
  • precompute = auto
  • max_iter = 1000
  • tol = 0.0001
  • copy_X = true
  • cv = 5
  • verbose = false
  • n_jobs = nothing
  • positive = false
  • random_state = nothing
  • selection = cyclic
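
A minimal usage sketch (the overridden hyper-parameters and the synthetic data are illustrative assumptions, not recommendations):

using MLJ
LassoCVRegressor = @load LassoCVRegressor pkg=MLJScikitLearnInterface
model = LassoCVRegressor(n_alphas=50, max_iter=5_000)   ## illustrative overrides
X, y = make_regression(200, 10)
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)
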
diff --git a/dev/models/LassoLarsCVRegressor_MLJScikitLearnInterface/index.html b/dev/models/LassoLarsCVRegressor_MLJScikitLearnInterface/index.html index 449594c77..347e4dfd7 100644 --- a/dev/models/LassoLarsCVRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/LassoLarsCVRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -LassoLarsCVRegressor · MLJ

LassoLarsCVRegressor

LassoLarsCVRegressor

A model type for constructing a Lasso model fit with least angle regression (LARS) with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LassoLarsCVRegressor = @load LassoLarsCVRegressor pkg=MLJScikitLearnInterface

Do model = LassoLarsCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LassoLarsCVRegressor(fit_intercept=...).

Hyper-parameters

  • fit_intercept = true
  • verbose = false
  • max_iter = 500
  • normalize = false
  • precompute = auto
  • cv = 5
  • max_n_alphas = 1000
  • n_jobs = nothing
  • eps = 2.220446049250313e-16
  • copy_X = true
  • positive = false
+LassoLarsCVRegressor · MLJ

LassoLarsCVRegressor

LassoLarsCVRegressor

A model type for constructing a Lasso model fit with least angle regression (LARS) with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LassoLarsCVRegressor = @load LassoLarsCVRegressor pkg=MLJScikitLearnInterface

Do model = LassoLarsCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LassoLarsCVRegressor(fit_intercept=...).

Hyper-parameters

  • fit_intercept = true
  • verbose = false
  • max_iter = 500
  • precompute = auto
  • cv = 5
  • max_n_alphas = 1000
  • n_jobs = nothing
  • eps = 2.220446049250313e-16
  • copy_X = true
  • positive = false
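
As a hedged sketch, the model can be evaluated on a simple holdout split (the positive=true override and synthetic data are illustrative assumptions):

using MLJ
LassoLarsCVRegressor = @load LassoLarsCVRegressor pkg=MLJScikitLearnInterface
X, y = make_regression(150, 8)
evaluate(LassoLarsCVRegressor(positive=true), X, y,
         resampling=Holdout(fraction_train=0.7), measure=l2)
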
diff --git a/dev/models/LassoLarsICRegressor_MLJScikitLearnInterface/index.html b/dev/models/LassoLarsICRegressor_MLJScikitLearnInterface/index.html index 4d131bb77..854279565 100644 --- a/dev/models/LassoLarsICRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/LassoLarsICRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -LassoLarsICRegressor · MLJ

LassoLarsICRegressor

LassoLarsICRegressor

A model type for constructing a Lasso model with LARS using BIC or AIC for model selection, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LassoLarsICRegressor = @load LassoLarsICRegressor pkg=MLJScikitLearnInterface

Do model = LassoLarsICRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LassoLarsICRegressor(criterion=...).

Hyper-parameters

  • criterion = aic
  • fit_intercept = true
  • verbose = false
  • normalize = false
  • precompute = auto
  • max_iter = 500
  • eps = 2.220446049250313e-16
  • copy_X = true
  • positive = false
+LassoLarsICRegressor · MLJ

LassoLarsICRegressor

LassoLarsICRegressor

A model type for constructing a Lasso model with LARS using BIC or AIC for model selection, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LassoLarsICRegressor = @load LassoLarsICRegressor pkg=MLJScikitLearnInterface

Do model = LassoLarsICRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LassoLarsICRegressor(criterion=...).

Hyper-parameters

  • criterion = aic
  • fit_intercept = true
  • verbose = false
  • precompute = auto
  • max_iter = 500
  • eps = 2.220446049250313e-16
  • copy_X = true
  • positive = false
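
A minimal usage sketch with default hyper-parameters (the synthetic data are an illustrative assumption):

using MLJ
LassoLarsICRegressor = @load LassoLarsICRegressor pkg=MLJScikitLearnInterface
model = LassoLarsICRegressor()        ## defaults, including AIC-based model selection
X, y = make_regression(100, 5)
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)
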
diff --git a/dev/models/LassoLarsRegressor_MLJScikitLearnInterface/index.html b/dev/models/LassoLarsRegressor_MLJScikitLearnInterface/index.html index 40e385054..84ab29902 100644 --- a/dev/models/LassoLarsRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/LassoLarsRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -LassoLarsRegressor · MLJ

LassoLarsRegressor

LassoLarsRegressor

A model type for constructing a Lasso model fit with least angle regression (LARS), based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LassoLarsRegressor = @load LassoLarsRegressor pkg=MLJScikitLearnInterface

Do model = LassoLarsRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LassoLarsRegressor(alpha=...).

Hyper-parameters

  • alpha = 1.0
  • fit_intercept = true
  • verbose = false
  • normalize = false
  • precompute = auto
  • max_iter = 500
  • eps = 2.220446049250313e-16
  • copy_X = true
  • fit_path = true
  • positive = false
+LassoLarsRegressor · MLJ

LassoLarsRegressor

LassoLarsRegressor

A model type for constructing a Lasso model fit with least angle regression (LARS), based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LassoLarsRegressor = @load LassoLarsRegressor pkg=MLJScikitLearnInterface

Do model = LassoLarsRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LassoLarsRegressor(alpha=...).

Hyper-parameters

  • alpha = 1.0
  • fit_intercept = true
  • verbose = false
  • precompute = auto
  • max_iter = 500
  • eps = 2.220446049250313e-16
  • copy_X = true
  • fit_path = true
  • positive = false
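
A minimal usage sketch (the alpha override and the synthetic data are illustrative assumptions):

using MLJ
LassoLarsRegressor = @load LassoLarsRegressor pkg=MLJScikitLearnInterface
model = LassoLarsRegressor(alpha=0.1)   ## weaker penalty than the default alpha = 1.0, for illustration
X, y = make_regression(100, 5)
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)
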
diff --git a/dev/models/LassoRegressor_MLJLinearModels/index.html b/dev/models/LassoRegressor_MLJLinearModels/index.html index a2e498201..1fa04006e 100644 --- a/dev/models/LassoRegressor_MLJLinearModels/index.html +++ b/dev/models/LassoRegressor_MLJLinearModels/index.html @@ -1,6 +1,6 @@ -LassoRegressor · MLJ

LassoRegressor

LassoRegressor

A model type for constructing a lasso regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LassoRegressor = @load LassoRegressor pkg=MLJLinearModels

Do model = LassoRegressor() to construct an instance with default hyper-parameters.

Lasso regression is a linear model with objective function

$|Xθ - y|₂²/2 + n⋅λ|θ|₁$

where $n$ is the number of observations.

If scale_penalty_with_samples = false the objective function is

$|Xθ - y|₂²/2 + λ|θ|₁$.

Different solver options exist, as indicated under "Hyperparameters" below.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • lambda::Real: strength of the L1 regularization. Default: 1.0
  • fit_intercept::Bool: whether to fit the intercept or not. Default: true
  • penalize_intercept::Bool: whether to penalize the intercept. Default: false
  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true
  • solver::Union{Nothing, MLJLinearModels.Solver}: any instance of MLJLinearModels.ProxGrad. If solver=nothing (default) then ProxGrad(accel=true) (FISTA) is used. Solver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...). Default: nothing

Example

using MLJ
+LassoRegressor · MLJ

LassoRegressor

LassoRegressor

A model type for constructing a lasso regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LassoRegressor = @load LassoRegressor pkg=MLJLinearModels

Do model = LassoRegressor() to construct an instance with default hyper-parameters.

Lasso regression is a linear model with objective function

$|Xθ - y|₂²/2 + n⋅λ|θ|₁$

where $n$ is the number of observations.

If scale_penalty_with_samples = false the objective function is

$|Xθ - y|₂²/2 + λ|θ|₁$.

Different solver options exist, as indicated under "Hyperparameters" below.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • lambda::Real: strength of the L1 regularization. Default: 1.0
  • fit_intercept::Bool: whether to fit the intercept or not. Default: true
  • penalize_intercept::Bool: whether to penalize the intercept. Default: false
  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true
  • solver::Union{Nothing, MLJLinearModels.Solver}: any instance of MLJLinearModels.ProxGrad. If solver=nothing (default) then ProxGrad(accel=true) (FISTA) is used. Solver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...). Default: nothing

Example

using MLJ
 X, y = make_regression()
 mach = fit!(machine(LassoRegressor(), X, y))
 predict(mach, X)
-fitted_params(mach)

See also ElasticNetRegressor.

+fitted_params(mach)

See also ElasticNetRegressor.

diff --git a/dev/models/LassoRegressor_MLJScikitLearnInterface/index.html b/dev/models/LassoRegressor_MLJScikitLearnInterface/index.html index 5d0129f51..3f8a17b38 100644 --- a/dev/models/LassoRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/LassoRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -LassoRegressor · MLJ

LassoRegressor

LassoRegressor

A model type for constructing a lasso regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LassoRegressor = @load LassoRegressor pkg=MLJScikitLearnInterface

Do model = LassoRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LassoRegressor(alpha=...).

Hyper-parameters

  • alpha = 1.0
  • fit_intercept = true
  • precompute = false
  • copy_X = true
  • max_iter = 1000
  • tol = 0.0001
  • warm_start = false
  • positive = false
  • random_state = nothing
  • selection = cyclic
+LassoRegressor · MLJ

LassoRegressor

LassoRegressor

A model type for constructing a lasso regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LassoRegressor = @load LassoRegressor pkg=MLJScikitLearnInterface

Do model = LassoRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LassoRegressor(alpha=...).

Hyper-parameters

  • alpha = 1.0
  • fit_intercept = true
  • precompute = false
  • copy_X = true
  • max_iter = 1000
  • tol = 0.0001
  • warm_start = false
  • positive = false
  • random_state = nothing
  • selection = cyclic
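
A minimal usage sketch (the alpha override and the synthetic data are illustrative assumptions):

using MLJ
LassoRegressor = @load LassoRegressor pkg=MLJScikitLearnInterface
X, y = make_regression(100, 5)
mach = machine(LassoRegressor(alpha=0.5), X, y) |> fit!
yhat = predict(mach, X)
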
diff --git a/dev/models/LinearBinaryClassifier_GLM/index.html b/dev/models/LinearBinaryClassifier_GLM/index.html index e6d448f52..6a802f64e 100644 --- a/dev/models/LinearBinaryClassifier_GLM/index.html +++ b/dev/models/LinearBinaryClassifier_GLM/index.html @@ -1,5 +1,5 @@ -LinearBinaryClassifier · MLJ

LinearBinaryClassifier

LinearBinaryClassifier

A model type for constructing a linear binary classifier, based on GLM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LinearBinaryClassifier = @load LinearBinaryClassifier pkg=GLM

Do model = LinearBinaryClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearBinaryClassifier(fit_intercept=...).

LinearBinaryClassifier is a generalized linear model, specialised to the case of a binary target variable, with a user-specified link function. Options exist to specify an intercept or offset feature.

Training data

In MLJ or MLJBase, bind an instance model to data with one of:

mach = machine(model, X, y)
+LinearBinaryClassifier · MLJ

LinearBinaryClassifier

LinearBinaryClassifier

A model type for constructing a linear binary classifier, based on GLM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LinearBinaryClassifier = @load LinearBinaryClassifier pkg=GLM

Do model = LinearBinaryClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearBinaryClassifier(fit_intercept=...).

LinearBinaryClassifier is a generalized linear model, specialised to the case of a binary target variable, with a user-specified link function. Options exist to specify an intercept or offset feature.

Training data

In MLJ or MLJBase, bind an instance model to data with one of:

mach = machine(model, X, y)
 mach = machine(model, X, y, w)

Here

  • X: is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check the scitype with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor(2) or <:Multiclass(2); check the scitype with schema(y)
  • w: is a vector of Real per-observation weights

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • fit_intercept=true: Whether to calculate the intercept for this model. If set to false, no intercept will be calculated (e.g. the data is expected to be centered)
  • link=GLM.LogitLink: The function which links the linear prediction function to the probability of a particular outcome or class. This must have type GLM.Link01. Options include GLM.LogitLink(), GLM.ProbitLink(), GLM.CloglogLink(), GLM.CauchitLink().
  • offsetcol=nothing: Name of the column to be used as an offset, if any. An offset is a variable which is known to have a coefficient of 1.
  • maxiter::Integer=30: The maximum number of iterations allowed to achieve convergence.
  • atol::Real=1e-6: Absolute threshold for convergence. Convergence is achieved when the relative change in deviance is less than max(rtol*dev, atol). This term exists to avoid failure when deviance is unchanged except for rounding errors.
  • rtol::Real=1e-6: Relative threshold for convergence. Convergence is achieved when the relative change in deviance is less than max(rtol*dev, atol). This term exists to avoid failure when deviance is unchanged except for rounding errors.
  • minstepfac::Real=0.001: Minimum step fraction. Must be between 0 and 1. Lower bound for the factor used to update the linear fit.
  • report_keys: Vector of keys for the report. Possible keys are: :deviance, :dof_residual, :stderror, :vcov, :coef_table and :glm_model. By default only :glm_model is excluded.
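
For example, the link and the report contents can be customized at construction. This is a hedged sketch only; the Probit link and the restricted key list are illustrative choices, not defaults, and the vector-of-symbols form for report_keys is assumed from the description above:

using MLJ
import GLM
LinearBinaryClassifier = @load LinearBinaryClassifier pkg=GLM
model = LinearBinaryClassifier(link=GLM.ProbitLink(),
                               report_keys=[:deviance, :coef_table])  ## keep only these report fields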

Operations

  • predict(mach, Xnew): Return predictions of the target given features Xnew having the same scitype as X above. Predictions are probabilistic.
  • predict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • features: The names of the features used during model fitting.
  • coef: The linear coefficients determined by the model.
  • intercept: The intercept determined by the model.

Report

The fields of report(mach) are:

  • deviance: Measure of deviance of fitted model with respect to a perfectly fitted model. For a linear model, this is the weighted residual sum of squares
  • dof_residual: The degrees of freedom for residuals, when meaningful.
  • stderror: The standard errors of the coefficients.
  • vcov: The estimated variance-covariance matrix of the coefficient estimates.
  • coef_table: Table which displays coefficients and summarizes their significance and confidence intervals.
  • glm_model: The raw fitted model returned by GLM.lm. Note this points to training data. Refer to the GLM.jl documentation for usage.

Examples

using MLJ
 import GLM ## namespace must be available
 
@@ -25,4 +25,4 @@
 fitted_params(mach).coef
 fitted_params(mach).intercept
 
-report(mach)

See also LinearRegressor, LinearCountRegressor

+report(mach)

See also LinearRegressor, LinearCountRegressor

diff --git a/dev/models/LinearCountRegressor_GLM/index.html b/dev/models/LinearCountRegressor_GLM/index.html index 0f5502692..43d8dc1d9 100644 --- a/dev/models/LinearCountRegressor_GLM/index.html +++ b/dev/models/LinearCountRegressor_GLM/index.html @@ -1,5 +1,5 @@ -LinearCountRegressor · MLJ

LinearCountRegressor

LinearCountRegressor

A model type for constructing a linear count regressor, based on GLM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LinearCountRegressor = @load LinearCountRegressor pkg=GLM

Do model = LinearCountRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearCountRegressor(fit_intercept=...).

LinearCountRegressor is a generalized linear model, specialised to the case of a Count target variable (non-negative, unbounded integer) with user-specified link function. Options exist to specify an intercept or offset feature.

Training data

In MLJ or MLJBase, bind an instance model to data with one of:

mach = machine(model, X, y)
+LinearCountRegressor · MLJ

LinearCountRegressor

LinearCountRegressor

A model type for constructing a linear count regressor, based on GLM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LinearCountRegressor = @load LinearCountRegressor pkg=GLM

Do model = LinearCountRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearCountRegressor(fit_intercept=...).

LinearCountRegressor is a generalized linear model, specialised to the case of a Count target variable (non-negative, unbounded integer) with user-specified link function. Options exist to specify an intercept or offset feature.

Training data

In MLJ or MLJBase, bind an instance model to data with one of:

mach = machine(model, X, y)
 mach = machine(model, X, y, w)

Here

  • X: is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check the scitype with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is Count; check the scitype with schema(y)
  • w: is a vector of Real per-observation weights

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • fit_intercept=true: Whether to calculate the intercept for this model. If set to false, no intercept will be calculated (e.g. the data is expected to be centered)
  • distribution=Distributions.Poisson(): The distribution which the residuals/errors of the model should fit.
  • link=GLM.LogLink(): The function which links the linear prediction function to the probability of a particular outcome or class. This should be one of the following: GLM.IdentityLink(), GLM.InverseLink(), GLM.InverseSquareLink(), GLM.LogLink(), GLM.SqrtLink().
  • offsetcol=nothing: Name of the column to be used as an offset, if any. An offset is a variable which is known to have a coefficient of 1.
  • maxiter::Integer=30: The maximum number of iterations allowed to achieve convergence.
  • atol::Real=1e-6: Absolute threshold for convergence. Convergence is achieved when the relative change in deviance is less than max(rtol*dev, atol). This term exists to avoid failure when deviance is unchanged except for rounding errors.
  • rtol::Real=1e-6: Relative threshold for convergence. Convergence is achieved when the relative change in deviance is less than max(rtol*dev, atol). This term exists to avoid failure when deviance is unchanged except for rounding errors.
  • minstepfac::Real=0.001: Minimum step fraction. Must be between 0 and 1. Lower bound for the factor used to update the linear fit.
  • report_keys: Vector of keys for the report. Possible keys are: :deviance, :dof_residual, :stderror, :vcov, :coef_table and :glm_model. By default only :glm_model is excluded.

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew having the same Scitype as X above. Predictions are probabilistic.
  • predict_mean(mach, Xnew): instead return the mean of each prediction above
  • predict_median(mach, Xnew): instead return the median of each prediction above.

Fitted parameters

The fields of fitted_params(mach) are:

  • features: The names of the features encountered during model fitting.
  • coef: The linear coefficients determined by the model.
  • intercept: The intercept determined by the model.

Report

The fields of report(mach) are:

  • deviance: Measure of deviance of fitted model with respect to a perfectly fitted model. For a linear model, this is the weighted residual sum of squares
  • dof_residual: The degrees of freedom for residuals, when meaningful.
  • stderror: The standard errors of the coefficients.
  • vcov: The estimated variance-covariance matrix of the coefficient estimates.
  • coef_table: Table which displays coefficients and summarizes their significance and confidence intervals.
  • glm_model: The raw fitted model returned by GLM.lm. Note this points to training data. Refer to the GLM.jl documentation for usage.

Examples

using MLJ
 import MLJ.Distributions.Poisson
 
@@ -31,4 +31,4 @@
  -2.0255901752504775
   3.014407534033522
 
-report(mach)

See also LinearRegressor, LinearBinaryClassifier

+report(mach)

See also LinearRegressor, LinearBinaryClassifier

diff --git a/dev/models/LinearRegressor_GLM/index.html b/dev/models/LinearRegressor_GLM/index.html index 40676a6c0..9547f5d7b 100644 --- a/dev/models/LinearRegressor_GLM/index.html +++ b/dev/models/LinearRegressor_GLM/index.html @@ -1,5 +1,5 @@ -LinearRegressor · MLJ

LinearRegressor

LinearRegressor

A model type for constructing a linear regressor, based on GLM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LinearRegressor = @load LinearRegressor pkg=GLM

Do model = LinearRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearRegressor(fit_intercept=...).

LinearRegressor assumes the target is a continuous variable whose conditional distribution is normal with constant variance, and whose expected value is a linear combination of the features (identity link function). Options exist to specify an intercept or offset feature.

Training data

In MLJ or MLJBase, bind an instance model to data with one of:

mach = machine(model, X, y)
+LinearRegressor · MLJ

LinearRegressor

LinearRegressor

A model type for constructing a linear regressor, based on GLM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LinearRegressor = @load LinearRegressor pkg=GLM

Do model = LinearRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearRegressor(fit_intercept=...).

LinearRegressor assumes the target is a continuous variable whose conditional distribution is normal with constant variance, and whose expected value is a linear combination of the features (identity link function). Options exist to specify an intercept or offset feature.

Training data

In MLJ or MLJBase, bind an instance model to data with one of:

mach = machine(model, X, y)
 mach = machine(model, X, y, w)

Here

  • X: is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check the scitype with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)
  • w: is a vector of Real per-observation weights

Hyper-parameters

  • fit_intercept=true: Whether to calculate the intercept for this model. If set to false, no intercept will be calculated (e.g. the data is expected to be centered)
  • dropcollinear=false: Whether to drop features in the training data to ensure linear independence. If true, only the first of each set of linearly-dependent features is used. The coefficient for redundant linearly dependent features is 0.0 and all associated statistics are set to NaN.
  • offsetcol=nothing: Name of the column to be used as an offset, if any. An offset is a variable which is known to have a coefficient of 1.
  • report_keys: Vector of keys for the report. Possible keys are: :deviance, :dof_residual, :stderror, :vcov, :coef_table and :glm_model. By default only :glm_model is excluded.

Train the machine using fit!(mach, rows=...).

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew having the same Scitype as X above. Predictions are probabilistic.
  • predict_mean(mach, Xnew): instead return the mean of each prediction above
  • predict_median(mach, Xnew): instead return the median of each prediction above.

Fitted parameters

The fields of fitted_params(mach) are:

  • features: The names of the features encountered during model fitting.
  • coef: The linear coefficients determined by the model.
  • intercept: The intercept determined by the model.

Report

When all keys are enabled in report_keys, the following fields are available in report(mach):

  • deviance: Measure of deviance of fitted model with respect to a perfectly fitted model. For a linear model, this is the weighted residual sum of squares
  • dof_residual: The degrees of freedom for residuals, when meaningful.
  • stderror: The standard errors of the coefficients.
  • vcov: The estimated variance-covariance matrix of the coefficient estimates.
  • coef_table: Table which displays coefficients and summarizes their significance and confidence intervals.
  • glm_model: The raw fitted model returned by GLM.lm. Note this points to training data. Refer to the GLM.jl documentation for usage.

Examples

using MLJ
 LinearRegressor = @load LinearRegressor pkg=GLM
 glm = LinearRegressor()
@@ -15,4 +15,4 @@
 fitted_params(mach).coef ## x1, x2, intercept
 fitted_params(mach).intercept
 
-report(mach)

See also LinearCountRegressor, LinearBinaryClassifier

+report(mach)

See also LinearCountRegressor, LinearBinaryClassifier

diff --git a/dev/models/LinearRegressor_MLJLinearModels/index.html b/dev/models/LinearRegressor_MLJLinearModels/index.html index 2b95687d8..d1ac50834 100644 --- a/dev/models/LinearRegressor_MLJLinearModels/index.html +++ b/dev/models/LinearRegressor_MLJLinearModels/index.html @@ -1,6 +1,6 @@ -LinearRegressor · MLJ

LinearRegressor

LinearRegressor

A model type for constructing a linear regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LinearRegressor = @load LinearRegressor pkg=MLJLinearModels

Do model = LinearRegressor() to construct an instance with default hyper-parameters.

This model provides standard linear regression with objective function

$|Xθ - y|₂²/2$

Different solver options exist, as indicated under "Hyperparameters" below.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • fit_intercept::Bool: whether to fit the intercept or not. Default: true

  • solver::Union{Nothing, MLJLinearModels.Solver}: any instance of MLJLinearModels.Analytical. Use Analytical() for Cholesky and CG()=Analytical(iterative=true) for conjugate-gradient.

    If solver = nothing (default) then Analytical() is used. Default: nothing

Example

using MLJ
+LinearRegressor · MLJ

LinearRegressor

LinearRegressor

A model type for constructing a linear regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LinearRegressor = @load LinearRegressor pkg=MLJLinearModels

Do model = LinearRegressor() to construct an instance with default hyper-parameters.

This model provides standard linear regression with objective function

$|Xθ - y|₂²/2$

Different solver options exist, as indicated under "Hyperparameters" below.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • fit_intercept::Bool: whether to fit the intercept or not. Default: true

  • solver::Union{Nothing, MLJLinearModels.Solver}: any instance of MLJLinearModels.Analytical. Use Analytical() for Cholesky and CG()=Analytical(iterative=true) for conjugate-gradient.

    If solver = nothing (default) then Analytical() is used. Default: nothing

Example

using MLJ
 X, y = make_regression()
 mach = fit!(machine(LinearRegressor(), X, y))
 predict(mach, X)
-fitted_params(mach)
+fitted_params(mach)
diff --git a/dev/models/LinearRegressor_MLJScikitLearnInterface/index.html b/dev/models/LinearRegressor_MLJScikitLearnInterface/index.html index 934ea02fc..049064a3c 100644 --- a/dev/models/LinearRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/LinearRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -LinearRegressor · MLJ

LinearRegressor

LinearRegressor

A model type for constructing an ordinary least-squares regressor (OLS), based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LinearRegressor = @load LinearRegressor pkg=MLJScikitLearnInterface

Do model = LinearRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearRegressor(fit_intercept=...).

Hyper-parameters

  • fit_intercept = true
  • copy_X = true
  • n_jobs = nothing
+LinearRegressor · MLJ

LinearRegressor

LinearRegressor

A model type for constructing an ordinary least-squares regressor (OLS), based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LinearRegressor = @load LinearRegressor pkg=MLJScikitLearnInterface

Do model = LinearRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearRegressor(fit_intercept=...).

Hyper-parameters

  • fit_intercept = true
  • copy_X = true
  • n_jobs = nothing
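
A minimal usage sketch (the synthetic data from make_regression are an illustrative assumption):

using MLJ
LinearRegressor = @load LinearRegressor pkg=MLJScikitLearnInterface
X, y = make_regression(100, 3)
mach = machine(LinearRegressor(), X, y) |> fit!
yhat = predict(mach, X)
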
diff --git a/dev/models/LinearRegressor_MultivariateStats/index.html b/dev/models/LinearRegressor_MultivariateStats/index.html index a1f360946..930053b10 100644 --- a/dev/models/LinearRegressor_MultivariateStats/index.html +++ b/dev/models/LinearRegressor_MultivariateStats/index.html @@ -1,5 +1,5 @@ -LinearRegressor · MLJ

LinearRegressor

LinearRegressor

A model type for constructing a linear regressor, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LinearRegressor = @load LinearRegressor pkg=MultivariateStats

Do model = LinearRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearRegressor(bias=...).

LinearRegressor assumes the target is a Continuous variable and trains a linear prediction function using the least squares algorithm. Options exist to specify a bias term.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check the column scitypes with schema(X).
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • bias=true: Include the bias term if true, otherwise fit without bias term.

Operations

  • predict(mach, Xnew): Return predictions of the target given new features Xnew, which should have the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • coefficients: The linear coefficients determined by the model.
  • intercept: The intercept determined by the model.

Examples

using MLJ
+LinearRegressor · MLJ

LinearRegressor

LinearRegressor

A model type for constructing a linear regressor, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LinearRegressor = @load LinearRegressor pkg=MultivariateStats

Do model = LinearRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearRegressor(bias=...).

LinearRegressor assumes the target is a Continuous variable and trains a linear prediction function using the least squares algorithm. Options exist to specify a bias term.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check the column scitypes with schema(X).
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • bias=true: Include the bias term if true, otherwise fit without bias term.

Operations

  • predict(mach, Xnew): Return predictions of the target given new features Xnew, which should have the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • coefficients: The linear coefficients determined by the model.
  • intercept: The intercept determined by the model.

Examples

using MLJ
 
 LinearRegressor = @load LinearRegressor pkg=MultivariateStats
 linear_regressor = LinearRegressor()
@@ -8,4 +8,4 @@
 mach = machine(linear_regressor, X, y) |> fit!
 
 Xnew, _ = make_regression(3, 2)
-yhat = predict(mach, Xnew) ## new predictions

See also MultitargetLinearRegressor, RidgeRegressor, MultitargetRidgeRegressor

+yhat = predict(mach, Xnew) ## new predictions

See also MultitargetLinearRegressor, RidgeRegressor, MultitargetRidgeRegressor

diff --git a/dev/models/LinearSVC_LIBSVM/index.html b/dev/models/LinearSVC_LIBSVM/index.html index 84e95f313..9487a5a1c 100644 --- a/dev/models/LinearSVC_LIBSVM/index.html +++ b/dev/models/LinearSVC_LIBSVM/index.html @@ -1,5 +1,5 @@ -LinearSVC · MLJ

LinearSVC

LinearSVC

A model type for constructing a linear support vector classifier, based on LIBSVM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LinearSVC = @load LinearSVC pkg=LIBSVM

Do model = LinearSVC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearSVC(solver=...).

Reference for algorithm and core C-library: Rong-En Fan et al (2008): "LIBLINEAR: A Library for Large Linear Classification." Journal of Machine Learning Research 9 1871-1874. Available at https://www.csie.ntu.edu.tw/~cjlin/papers/liblinear.pdf.

This model type is similar to SVC from the same package with the setting kernel=LIBSVM.Kernel.Linear, but is optimized for the linear case.

Training data

In MLJ or MLJBase, bind an instance model to data with one of:

mach = machine(model, X, y)
+LinearSVC · MLJ

LinearSVC

LinearSVC

A model type for constructing a linear support vector classifier, based on LIBSVM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LinearSVC = @load LinearSVC pkg=LIBSVM

Do model = LinearSVC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearSVC(solver=...).

Reference for algorithm and core C-library: Rong-En Fan et al (2008): "LIBLINEAR: A Library for Large Linear Classification." Journal of Machine Learning Research 9 1871-1874. Available at https://www.csie.ntu.edu.tw/~cjlin/papers/liblinear.pdf.

This model type is similar to SVC from the same package with the setting kernel=LIBSVM.Kernel.Linear, but is optimized for the linear case.

Training data

In MLJ or MLJBase, bind an instance model to data with one of:

mach = machine(model, X, y)
 mach = machine(model, X, y, w)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)
  • w: a dictionary of class weights, keyed on levels(y).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • solver=LIBSVM.Linearsolver.L2R_L2LOSS_SVC_DUAL: linear solver, which must be one of the following from the LIBSVM.jl package:

    • LIBSVM.Linearsolver.L2R_LR: L2-regularized logistic regression (primal)
    • LIBSVM.Linearsolver.L2R_L2LOSS_SVC_DUAL: L2-regularized L2-loss support vector classification (dual)
    • LIBSVM.Linearsolver.L2R_L2LOSS_SVC: L2-regularized L2-loss support vector classification (primal)
    • LIBSVM.Linearsolver.L2R_L1LOSS_SVC_DUAL: L2-regularized L1-loss support vector classification (dual)
    • LIBSVM.Linearsolver.MCSVM_CS: multi-class support vector classification by Crammer and Singer
    • LIBSVM.Linearsolver.L1R_L2LOSS_SVC: L1-regularized L2-loss support vector classification
    • LIBSVM.Linearsolver.L1R_LR: L1-regularized logistic regression
    • LIBSVM.Linearsolver.L2R_LR_DUAL: L2-regularized logistic regression (dual)
  • tolerance::Float64=Inf: tolerance for the stopping criterion;

  • cost=1.0 (range (0, Inf)): the parameter denoted $C$ in the cited reference; for greater regularization, decrease cost

  • bias= -1.0: if bias >= 0, instance x becomes [x; bias]; if bias < 0, no bias term added (default -1)

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • libsvm_model: the trained model object created by the LIBSVM.jl package
  • encoding: class encoding used internally by libsvm_model - a dictionary of class labels keyed on the internal integer representation

Examples

using MLJ
 import LIBSVM
 
@@ -25,4 +25,4 @@
 3-element CategoricalArrays.CategoricalArray{String,1,UInt32}:
  "versicolor"
  "versicolor"
- "versicolor"

See also the SVC and NuSVC classifiers, and LIBSVM.jl and the original C implementation documentation.

+ "versicolor"

See also the SVC and NuSVC classifiers, and LIBSVM.jl and the original C implementation documentation.

diff --git a/dev/models/LogisticCVClassifier_MLJScikitLearnInterface/index.html b/dev/models/LogisticCVClassifier_MLJScikitLearnInterface/index.html index 2257de071..7a15fbb7b 100644 --- a/dev/models/LogisticCVClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/LogisticCVClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -LogisticCVClassifier · MLJ

LogisticCVClassifier

LogisticCVClassifier

A model type for constructing a logistic regression classifier with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LogisticCVClassifier = @load LogisticCVClassifier pkg=MLJScikitLearnInterface

Do model = LogisticCVClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LogisticCVClassifier(Cs=...).

Hyper-parameters

  • Cs = 10
  • fit_intercept = true
  • cv = 5
  • dual = false
  • penalty = l2
  • scoring = nothing
  • solver = lbfgs
  • tol = 0.0001
  • max_iter = 100
  • class_weight = nothing
  • n_jobs = nothing
  • verbose = 0
  • refit = true
  • intercept_scaling = 1.0
  • multi_class = auto
  • random_state = nothing
  • l1_ratios = nothing
+LogisticCVClassifier · MLJ

LogisticCVClassifier

LogisticCVClassifier

A model type for constructing a logistic regression classifier with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LogisticCVClassifier = @load LogisticCVClassifier pkg=MLJScikitLearnInterface

Do model = LogisticCVClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LogisticCVClassifier(Cs=...).

Hyper-parameters

  • Cs = 10
  • fit_intercept = true
  • cv = 5
  • dual = false
  • penalty = l2
  • scoring = nothing
  • solver = lbfgs
  • tol = 0.0001
  • max_iter = 100
  • class_weight = nothing
  • n_jobs = nothing
  • verbose = 0
  • refit = true
  • intercept_scaling = 1.0
  • multi_class = auto
  • random_state = nothing
  • l1_ratios = nothing
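
A minimal usage sketch (the Cs override and the synthetic two-class data are illustrative assumptions; as with other MLJ classifiers, predictions are assumed to be probabilistic, so predict_mode returns labels):

using MLJ
LogisticCVClassifier = @load LogisticCVClassifier pkg=MLJScikitLearnInterface
X, y = make_blobs(centers=2)               ## synthetic two-class data
mach = machine(LogisticCVClassifier(Cs=20), X, y) |> fit!
yhat = predict_mode(mach, X)               ## modes of the probabilistic predictions
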
diff --git a/dev/models/LogisticClassifier_MLJLinearModels/index.html b/dev/models/LogisticClassifier_MLJLinearModels/index.html index 125ba2d94..b93a55327 100644 --- a/dev/models/LogisticClassifier_MLJLinearModels/index.html +++ b/dev/models/LogisticClassifier_MLJLinearModels/index.html @@ -1,6 +1,6 @@ -LogisticClassifier · MLJ

LogisticClassifier

LogisticClassifier

A model type for constructing a logistic classifier, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LogisticClassifier = @load LogisticClassifier pkg=MLJLinearModels

Do model = LogisticClassifier() to construct an instance with default hyper-parameters.

This model is more commonly known as "logistic regression". It is a standard classifier for both binary and multiclass classification. The objective function applies either a logistic loss (binary target) or multinomial (softmax) loss, and has a mixed L1/L2 penalty:

$L(y, Xθ) + n⋅λ|θ|₂²/2 + n⋅γ|θ|₁$.

Here $L$ is either MLJLinearModels.LogisticLoss or MLJLinearModels.MultiClassLoss, $λ$ and $γ$ indicate the strength of the L2 (resp. L1) regularization components and $n$ is the number of training observations.

With scale_penalty_with_samples = false the objective function is instead

$L(y, Xθ) + λ|θ|₂²/2 + γ|θ|₁$.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • lambda::Real: strength of the regularizer if penalty is :l2 or :l1 and strength of the L2 regularizer if penalty is :en. Default: eps()

  • gamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0

  • penalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2

  • fit_intercept::Bool: whether to fit the intercept or not. Default: true

  • penalize_intercept::Bool: whether to penalize the intercept. Default: false

  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of samples. Default: true

  • solver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S where S is one of: LBFGS, Newton, NewtonCG, ProxGrad; but subject to the following restrictions:

    • If penalty = :l2, ProxGrad is disallowed. Otherwise, ProxGrad is the only option.
    • Unless scitype(y) <: Finite{2} (binary target) Newton is disallowed.

    If solver = nothing (default) then ProxGrad(accel=true) (FISTA) is used, unless gamma = 0, in which case LBFGS() is used.

    Solver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) Default: nothing

Example

using MLJ
+LogisticClassifier · MLJ

LogisticClassifier

LogisticClassifier

A model type for constructing a logistic classifier, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LogisticClassifier = @load LogisticClassifier pkg=MLJLinearModels

Do model = LogisticClassifier() to construct an instance with default hyper-parameters.

This model is more commonly known as "logistic regression". It is a standard classifier for both binary and multiclass classification. The objective function applies either a logistic loss (binary target) or multinomial (softmax) loss, and has a mixed L1/L2 penalty:

$L(y, Xθ) + n⋅λ|θ|₂²/2 + n⋅γ|θ|₁$.

Here $L$ is either MLJLinearModels.LogisticLoss or MLJLinearModels.MultiClassLoss, $λ$ and $γ$ indicate the strength of the L2 (resp. L1) regularization components and $n$ is the number of training observations.

With scale_penalty_with_samples = false the objective function is instead

$L(y, Xθ) + λ|θ|₂²/2 + γ|θ|₁$.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • lambda::Real: strength of the regularizer if penalty is :l2 or :l1 and strength of the L2 regularizer if penalty is :en. Default: eps()

  • gamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0

  • penalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2

  • fit_intercept::Bool: whether to fit the intercept or not. Default: true

  • penalize_intercept::Bool: whether to penalize the intercept. Default: false

  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of samples. Default: true

  • solver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S where S is one of: LBFGS, Newton, NewtonCG, ProxGrad; but subject to the following restrictions:

    • If penalty = :l2, ProxGrad is disallowed. Otherwise, ProxGrad is the only option.
    • Unless scitype(y) <: Finite{2} (binary target) Newton is disallowed.

    If solver = nothing (default) then ProxGrad(accel=true) (FISTA) is used, unless gamma = 0, in which case LBFGS() is used.

    Solver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) Default: nothing

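As an illustration of these hyper-parameters, the following sketch (values are illustrative only, and assume MLJLinearModels is installed) configures an elastic-net penalty and turns off per-sample scaling of the penalty:

using MLJ
LogisticClassifier = @load LogisticClassifier pkg=MLJLinearModels
## elastic net: lambda scales the L2 term, gamma the L1 term
model = LogisticClassifier(penalty=:en, lambda=0.1, gamma=0.01,
                           scale_penalty_with_samples=false)
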
Example

using MLJ
 X, y = make_blobs(centers = 2)
 mach = fit!(machine(LogisticClassifier(), X, y))
 predict(mach, X)
-fitted_params(mach)

See also MultinomialClassifier.

+fitted_params(mach)

See also MultinomialClassifier.

diff --git a/dev/models/LogisticClassifier_MLJScikitLearnInterface/index.html b/dev/models/LogisticClassifier_MLJScikitLearnInterface/index.html index 2f2e8ca35..646264ccc 100644 --- a/dev/models/LogisticClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/LogisticClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -LogisticClassifier · MLJ

LogisticClassifier

LogisticClassifier

A model type for constructing a logistic regression classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LogisticClassifier = @load LogisticClassifier pkg=MLJScikitLearnInterface

Do model = LogisticClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LogisticClassifier(penalty=...).

Hyper-parameters

  • penalty = l2
  • dual = false
  • tol = 0.0001
  • C = 1.0
  • fit_intercept = true
  • intercept_scaling = 1.0
  • class_weight = nothing
  • random_state = nothing
  • solver = lbfgs
  • max_iter = 100
  • multi_class = auto
  • verbose = 0
  • warm_start = false
  • n_jobs = nothing
  • l1_ratio = nothing
+LogisticClassifier · MLJ

LogisticClassifier

LogisticClassifier

A model type for constructing a logistic regression classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LogisticClassifier = @load LogisticClassifier pkg=MLJScikitLearnInterface

Do model = LogisticClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LogisticClassifier(penalty=...).

Hyper-parameters

  • penalty = l2
  • dual = false
  • tol = 0.0001
  • C = 1.0
  • fit_intercept = true
  • intercept_scaling = 1.0
  • class_weight = nothing
  • random_state = nothing
  • solver = lbfgs
  • max_iter = 100
  • multi_class = auto
  • verbose = 0
  • warm_start = false
  • n_jobs = nothing
  • l1_ratio = nothing
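
For example, a minimal sketch that overrides a couple of these defaults (the particular values are illustrative only):

using MLJ
LogisticClassifier = @load LogisticClassifier pkg=MLJScikitLearnInterface
## smaller C means stronger regularization; allow more iterations
model = LogisticClassifier(C=0.5, max_iter=200)
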
diff --git a/dev/models/MCDDetector_OutlierDetectionPython/index.html b/dev/models/MCDDetector_OutlierDetectionPython/index.html index f7be9574e..265ca7dd0 100644 --- a/dev/models/MCDDetector_OutlierDetectionPython/index.html +++ b/dev/models/MCDDetector_OutlierDetectionPython/index.html @@ -1,5 +1,5 @@ -MCDDetector · MLJ

MCDDetector

MCDDetector(store_precision = true,
+MCDDetector · MLJ
+               random_state = nothing)

https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.mcd

diff --git a/dev/models/MeanShift_MLJScikitLearnInterface/index.html b/dev/models/MeanShift_MLJScikitLearnInterface/index.html index 1f80b0561..6ddfc1370 100644 --- a/dev/models/MeanShift_MLJScikitLearnInterface/index.html +++ b/dev/models/MeanShift_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -MeanShift · MLJ

MeanShift

MeanShift

A model type for constructing a mean shift, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MeanShift = @load MeanShift pkg=MLJScikitLearnInterface

Do model = MeanShift() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MeanShift(bandwidth=...).

Mean shift clustering using a flat kernel. Mean shift clustering aims to discover "blobs" in a smooth density of samples. It is a centroid-based algorithm, which works by updating candidates for centroids to be the mean of the points within a given region. These candidates are then filtered in a post-processing stage to eliminate near-duplicates to form the final set of centroids.

+MeanShift · MLJ

MeanShift

MeanShift

A model type for constructing a mean shift, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MeanShift = @load MeanShift pkg=MLJScikitLearnInterface

Do model = MeanShift() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MeanShift(bandwidth=...).

Mean shift clustering using a flat kernel. Mean shift clustering aims to discover "blobs" in a smooth density of samples. It is a centroid-based algorithm, which works by updating candidates for centroids to be the mean of the points within a given region. These candidates are then filtered in a post-processing stage to eliminate near-duplicates to form the final set of centroids.

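A minimal usage sketch, assuming MLJScikitLearnInterface is installed (the bandwidth value is illustrative):

using MLJ
MeanShift = @load MeanShift pkg=MLJScikitLearnInterface
X, _ = make_blobs(100, 2; centers=3)            ## synthetic data with three blobs
mach = machine(MeanShift(bandwidth=2.0), X) |> fit!
labels = predict(mach, X)                       ## assumed to return cluster assignments
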
diff --git a/dev/models/MiniBatchKMeans_MLJScikitLearnInterface/index.html b/dev/models/MiniBatchKMeans_MLJScikitLearnInterface/index.html index 8c7da0c23..49f8c7f27 100644 --- a/dev/models/MiniBatchKMeans_MLJScikitLearnInterface/index.html +++ b/dev/models/MiniBatchKMeans_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -MiniBatchKMeans · MLJ

MiniBatchKMeans

MiniBatchKMeans

A model type for constructing a Mini-Batch K-Means clustering model, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MiniBatchKMeans = @load MiniBatchKMeans pkg=MLJScikitLearnInterface

Do model = MiniBatchKMeans() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MiniBatchKMeans(n_clusters=...).

Hyper-parameters

  • n_clusters = 8
  • max_iter = 100
  • batch_size = 100
  • verbose = 0
  • compute_labels = true
  • random_state = nothing
  • tol = 0.0
  • max_no_improvement = 10
  • init_size = nothing
  • n_init = 3
  • init = k-means++
  • reassignment_ratio = 0.01
+MiniBatchKMeans · MLJ

MiniBatchKMeans

MiniBatchKMeans

A model type for constructing a Mini-Batch K-Means clustering model, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MiniBatchKMeans = @load MiniBatchKMeans pkg=MLJScikitLearnInterface

Do model = MiniBatchKMeans() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MiniBatchKMeans(n_clusters=...).

Hyper-parameters

  • n_clusters = 8
  • max_iter = 100
  • batch_size = 100
  • verbose = 0
  • compute_labels = true
  • random_state = nothing
  • tol = 0.0
  • max_no_improvement = 10
  • init_size = nothing
  • n_init = 3
  • init = k-means++
  • reassignment_ratio = 0.01
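
By way of illustration, a hypothetical construction and fit (hyper-parameter values are illustrative; assumes the package is installed):

using MLJ
MiniBatchKMeans = @load MiniBatchKMeans pkg=MLJScikitLearnInterface
model = MiniBatchKMeans(n_clusters=3, batch_size=50)
X, _ = make_blobs(200, 2; centers=3)
mach = machine(model, X) |> fit!
labels = predict(mach, X)   ## cluster assignments for the training points
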
diff --git a/dev/models/MultiTaskElasticNetCVRegressor_MLJScikitLearnInterface/index.html b/dev/models/MultiTaskElasticNetCVRegressor_MLJScikitLearnInterface/index.html index 8a57e64c9..274f2aac5 100644 --- a/dev/models/MultiTaskElasticNetCVRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/MultiTaskElasticNetCVRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -MultiTaskElasticNetCVRegressor · MLJ

MultiTaskElasticNetCVRegressor

MultiTaskElasticNetCVRegressor

A model type for constructing a multi-target elastic net regressor with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultiTaskElasticNetCVRegressor = @load MultiTaskElasticNetCVRegressor pkg=MLJScikitLearnInterface

Do model = MultiTaskElasticNetCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultiTaskElasticNetCVRegressor(l1_ratio=...).

Hyper-parameters

  • l1_ratio = 0.5
  • eps = 0.001
  • n_alphas = 100
  • alphas = nothing
  • fit_intercept = true
  • max_iter = 1000
  • tol = 0.0001
  • cv = 5
  • copy_X = true
  • verbose = 0
  • n_jobs = nothing
  • random_state = nothing
  • selection = cyclic
+MultiTaskElasticNetCVRegressor · MLJ

MultiTaskElasticNetCVRegressor

MultiTaskElasticNetCVRegressor

A model type for constructing a multi-target elastic net regressor with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultiTaskElasticNetCVRegressor = @load MultiTaskElasticNetCVRegressor pkg=MLJScikitLearnInterface

Do model = MultiTaskElasticNetCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultiTaskElasticNetCVRegressor(l1_ratio=...).

Hyper-parameters

  • l1_ratio = 0.5
  • eps = 0.001
  • n_alphas = 100
  • alphas = nothing
  • fit_intercept = true
  • max_iter = 1000
  • tol = 0.0001
  • cv = 5
  • copy_X = true
  • verbose = 0
  • n_jobs = nothing
  • random_state = nothing
  • selection = cyclic
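
A minimal usage sketch (hyper-parameter values are illustrative; assumes MLJScikitLearnInterface is installed):

using MLJ
MultiTaskElasticNetCVRegressor = @load MultiTaskElasticNetCVRegressor pkg=MLJScikitLearnInterface
X, y = make_regression(100, 5, n_targets=2)   ## table of features, table of two targets
model = MultiTaskElasticNetCVRegressor(l1_ratio=0.7, cv=3)
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)
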
diff --git a/dev/models/MultiTaskElasticNetRegressor_MLJScikitLearnInterface/index.html b/dev/models/MultiTaskElasticNetRegressor_MLJScikitLearnInterface/index.html index b36a83600..b749fb85c 100644 --- a/dev/models/MultiTaskElasticNetRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/MultiTaskElasticNetRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -MultiTaskElasticNetRegressor · MLJ

MultiTaskElasticNetRegressor

MultiTaskElasticNetRegressor

A model type for constructing a multi-target elastic net regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultiTaskElasticNetRegressor = @load MultiTaskElasticNetRegressor pkg=MLJScikitLearnInterface

Do model = MultiTaskElasticNetRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultiTaskElasticNetRegressor(alpha=...).

Hyper-parameters

  • alpha = 1.0
  • l1_ratio = 0.5
  • fit_intercept = true
  • copy_X = true
  • max_iter = 1000
  • tol = 0.0001
  • warm_start = false
  • random_state = nothing
  • selection = cyclic
+MultiTaskElasticNetRegressor · MLJ

MultiTaskElasticNetRegressor

MultiTaskElasticNetRegressor

A model type for constructing a multi-target elastic net regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultiTaskElasticNetRegressor = @load MultiTaskElasticNetRegressor pkg=MLJScikitLearnInterface

Do model = MultiTaskElasticNetRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultiTaskElasticNetRegressor(alpha=...).

Hyper-parameters

  • alpha = 1.0
  • l1_ratio = 0.5
  • fit_intercept = true
  • copy_X = true
  • max_iter = 1000
  • tol = 0.0001
  • warm_start = false
  • random_state = nothing
  • selection = cyclic
diff --git a/dev/models/MultiTaskLassoCVRegressor_MLJScikitLearnInterface/index.html b/dev/models/MultiTaskLassoCVRegressor_MLJScikitLearnInterface/index.html index 34f497c9f..fd166ab9f 100644 --- a/dev/models/MultiTaskLassoCVRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/MultiTaskLassoCVRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -MultiTaskLassoCVRegressor · MLJ

MultiTaskLassoCVRegressor

MultiTaskLassoCVRegressor

A model type for constructing a multi-target lasso regressor with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultiTaskLassoCVRegressor = @load MultiTaskLassoCVRegressor pkg=MLJScikitLearnInterface

Do model = MultiTaskLassoCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultiTaskLassoCVRegressor(eps=...).

Hyper-parameters

  • eps = 0.001
  • n_alphas = 100
  • alphas = nothing
  • fit_intercept = true
  • max_iter = 300
  • tol = 0.0001
  • copy_X = true
  • cv = 5
  • verbose = false
  • n_jobs = 1
  • random_state = nothing
  • selection = cyclic
+MultiTaskLassoCVRegressor · MLJ

MultiTaskLassoCVRegressor

MultiTaskLassoCVRegressor

A model type for constructing a multi-target lasso regressor with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultiTaskLassoCVRegressor = @load MultiTaskLassoCVRegressor pkg=MLJScikitLearnInterface

Do model = MultiTaskLassoCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultiTaskLassoCVRegressor(eps=...).

Hyper-parameters

  • eps = 0.001
  • n_alphas = 100
  • alphas = nothing
  • fit_intercept = true
  • max_iter = 300
  • tol = 0.0001
  • copy_X = true
  • cv = 5
  • verbose = false
  • n_jobs = 1
  • random_state = nothing
  • selection = cyclic
diff --git a/dev/models/MultiTaskLassoRegressor_MLJScikitLearnInterface/index.html b/dev/models/MultiTaskLassoRegressor_MLJScikitLearnInterface/index.html index 5d90523d9..341edacbd 100644 --- a/dev/models/MultiTaskLassoRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/MultiTaskLassoRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -MultiTaskLassoRegressor · MLJ

MultiTaskLassoRegressor

MultiTaskLassoRegressor

A model type for constructing a multi-target lasso regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultiTaskLassoRegressor = @load MultiTaskLassoRegressor pkg=MLJScikitLearnInterface

Do model = MultiTaskLassoRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultiTaskLassoRegressor(alpha=...).

Hyper-parameters

  • alpha = 1.0
  • fit_intercept = true
  • max_iter = 1000
  • tol = 0.0001
  • copy_X = true
  • random_state = nothing
  • selection = cyclic
+MultiTaskLassoRegressor · MLJ

MultiTaskLassoRegressor

MultiTaskLassoRegressor

A model type for constructing a multi-target lasso regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultiTaskLassoRegressor = @load MultiTaskLassoRegressor pkg=MLJScikitLearnInterface

Do model = MultiTaskLassoRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultiTaskLassoRegressor(alpha=...).

Hyper-parameters

  • alpha = 1.0
  • fit_intercept = true
  • max_iter = 1000
  • tol = 0.0001
  • copy_X = true
  • random_state = nothing
  • selection = cyclic
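
As a sketch (values illustrative; assumes the package is installed):

using MLJ
MultiTaskLassoRegressor = @load MultiTaskLassoRegressor pkg=MLJScikitLearnInterface
X, y = make_regression(100, 5, n_targets=2)
mach = machine(MultiTaskLassoRegressor(alpha=0.5), X, y) |> fit!
yhat = predict(mach, X)
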
diff --git a/dev/models/MultinomialClassifier_MLJLinearModels/index.html b/dev/models/MultinomialClassifier_MLJLinearModels/index.html index be40153cb..d7283cee5 100644 --- a/dev/models/MultinomialClassifier_MLJLinearModels/index.html +++ b/dev/models/MultinomialClassifier_MLJLinearModels/index.html @@ -1,6 +1,6 @@ -MultinomialClassifier · MLJ

MultinomialClassifier

MultinomialClassifier

A model type for constructing a multinomial classifier, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultinomialClassifier = @load MultinomialClassifier pkg=MLJLinearModels

Do model = MultinomialClassifier() to construct an instance with default hyper-parameters.

This model coincides with LogisticClassifier, except certain optimizations possible in the special binary case will not be applied. Its hyperparameters are identical.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • lambda::Real: strength of the regularizer if penalty is :l2 or :l1. Strength of the L2 regularizer if penalty is :en. Default: eps()

  • gamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0

  • penalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2

  • fit_intercept::Bool: whether to fit the intercept or not. Default: true

  • penalize_intercept::Bool: whether to penalize the intercept. Default: false

  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of samples. Default: true

  • solver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S where S is one of: LBFGS, NewtonCG, ProxGrad; but subject to the following restrictions:

    • If penalty = :l2, ProxGrad is disallowed. Otherwise, ProxGrad is the only option.
    • Unless scitype(y) <: Finite{2} (binary target) Newton is disallowed.

    If solver = nothing (default) then ProxGrad(accel=true) (FISTA) is used, unless gamma = 0, in which case LBFGS() is used.

    Solver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) Default: nothing

Example

using MLJ
+MultinomialClassifier · MLJ

MultinomialClassifier

MultinomialClassifier

A model type for constructing a multinomial classifier, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultinomialClassifier = @load MultinomialClassifier pkg=MLJLinearModels

Do model = MultinomialClassifier() to construct an instance with default hyper-parameters.

This model coincides with LogisticClassifier, except certain optimizations possible in the special binary case will not be applied. Its hyperparameters are identical.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • lambda::Real: strength of the regularizer if penalty is :l2 or :l1. Strength of the L2 regularizer if penalty is :en. Default: eps()

  • gamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0

  • penalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2

  • fit_intercept::Bool: whether to fit the intercept or not. Default: true

  • penalize_intercept::Bool: whether to penalize the intercept. Default: false

  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of samples. Default: true

  • solver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S where S is one of: LBFGS, NewtonCG, ProxGrad; but subject to the following restrictions:

    • If penalty = :l2, ProxGrad is disallowed. Otherwise, ProxGrad is the only option.
    • Unless scitype(y) <: Finite{2} (binary target) Newton is disallowed.

    If solver = nothing (default) then ProxGrad(accel=true) (FISTA) is used, unless gamma = 0, in which case LBFGS() is used.

    Solver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) Default: nothing

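For example, a solver can be specified explicitly; the following sketch (illustrative only) uses LBFGS with the default L2 penalty:

using MLJ
import MLJLinearModels
MultinomialClassifier = @load MultinomialClassifier pkg=MLJLinearModels
model = MultinomialClassifier(lambda=1e-3, solver=MLJLinearModels.LBFGS())
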
Example

using MLJ
 X, y = make_blobs(centers = 3)
 mach = fit!(machine(MultinomialClassifier(), X, y))
 predict(mach, X)
-fitted_params(mach)

See also LogisticClassifier.

+fitted_params(mach)

See also LogisticClassifier.

diff --git a/dev/models/MultinomialNBClassifier_MLJScikitLearnInterface/index.html b/dev/models/MultinomialNBClassifier_MLJScikitLearnInterface/index.html index 32e33e5d4..6d718ad0a 100644 --- a/dev/models/MultinomialNBClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/MultinomialNBClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -MultinomialNBClassifier · MLJ

MultinomialNBClassifier

MultinomialNBClassifier

A model type for constructing a multinomial naive Bayes classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultinomialNBClassifier = @load MultinomialNBClassifier pkg=MLJScikitLearnInterface

Do model = MultinomialNBClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultinomialNBClassifier(alpha=...).

Multinomial naive Bayes classifier. It is suitable for classification with discrete features (e.g. word counts for text classification).

+MultinomialNBClassifier · MLJ

MultinomialNBClassifier

MultinomialNBClassifier

A model type for constructing a multinomial naive Bayes classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultinomialNBClassifier = @load MultinomialNBClassifier pkg=MLJScikitLearnInterface

Do model = MultinomialNBClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultinomialNBClassifier(alpha=...).

Multinomial naive Bayes classifier. It is suitable for classification with discrete features (e.g. word counts for text classification).

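A minimal construction sketch (the alpha value is illustrative):

using MLJ
MultinomialNBClassifier = @load MultinomialNBClassifier pkg=MLJScikitLearnInterface
model = MultinomialNBClassifier(alpha=0.5)   ## additive (Lidstone) smoothing parameter
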
diff --git a/dev/models/MultinomialNBClassifier_NaiveBayes/index.html b/dev/models/MultinomialNBClassifier_NaiveBayes/index.html index 386b2ea94..9338e84a5 100644 --- a/dev/models/MultinomialNBClassifier_NaiveBayes/index.html +++ b/dev/models/MultinomialNBClassifier_NaiveBayes/index.html @@ -1,5 +1,5 @@ -MultinomialNBClassifier · MLJ

MultinomialNBClassifier

MultinomialNBClassifier

A model type for constructing a multinomial naive Bayes classifier, based on NaiveBayes.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultinomialNBClassifier = @load MultinomialNBClassifier pkg=NaiveBayes

Do model = MultinomialNBClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultinomialNBClassifier(alpha=...).

The multinomial naive Bayes classifier is often applied when input features consist of counts (scitype Count) and when observations for a fixed target class are generated from a multinomial distribution with fixed probability vector, but whose sample length varies from observation to observation. For example, features might represent word counts in text documents being classified by sentiment.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Count; check the column scitypes with schema(X).
  • y is the target, which can be any AbstractVector whose element scitype is Finite; check the scitype with schema(y).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • alpha=1: Lidstone smoothing in estimation of multinomial probability vectors from training histograms (default corresponds to Laplacian smoothing).

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above.
  • predict_mode(mach, Xnew): Return the mode of above predictions.

Fitted parameters

The fields of fitted_params(mach) are:

  • c_counts: A dictionary containing the observed count of each input class.
  • x_counts: A dictionary containing the categorical counts of each input class.
  • x_totals: The sum of each count (input feature), ungrouped.
  • n_obs: The total number of observations in the training data.

Examples

using MLJ
+MultinomialNBClassifier · MLJ

MultinomialNBClassifier

MultinomialNBClassifier

A model type for constructing a multinomial naive Bayes classifier, based on NaiveBayes.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultinomialNBClassifier = @load MultinomialNBClassifier pkg=NaiveBayes

Do model = MultinomialNBClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultinomialNBClassifier(alpha=...).

The multinomial naive Bayes classifier is often applied when input features consist of counts (scitype Count) and when observations for a fixed target class are generated from a multinomial distribution with fixed probability vector, but whose sample length varies from observation to observation. For example, features might represent word counts in text documents being classified by sentiment.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Count; check the column scitypes with schema(X).
  • y is the target, which can be any AbstractVector whose element scitype is Finite; check the scitype with schema(y).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • alpha=1: Lidstone smoothing in estimation of multinomial probability vectors from training histograms (default corresponds to Laplacian smoothing).

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above.
  • predict_mode(mach, Xnew): Return the mode of above predictions.

Fitted parameters

The fields of fitted_params(mach) are:

  • c_counts: A dictionary containing the observed count of each input class.
  • x_counts: A dictionary containing the categorical counts of each input class.
  • x_totals: The sum of each count (input feature), ungrouped.
  • n_obs: The total number of observations in the training data.

Examples

using MLJ
 import TextAnalysis
 
 CountTransformer = @load CountTransformer pkg=MLJText
@@ -41,4 +41,4 @@
 log_loss(y_prob, y[5:6])
 
 ## point predictions:
-yhat = mode.(y_prob) ## or `predict_mode(mach2, rows=5:6)`

See also GaussianNBClassifier

+yhat = mode.(y_prob) ## or `predict_mode(mach2, rows=5:6)`

See also GaussianNBClassifier

diff --git a/dev/models/MultitargetGaussianMixtureRegressor_BetaML/index.html b/dev/models/MultitargetGaussianMixtureRegressor_BetaML/index.html index 17649fdfb..b38cf1bfc 100644 --- a/dev/models/MultitargetGaussianMixtureRegressor_BetaML/index.html +++ b/dev/models/MultitargetGaussianMixtureRegressor_BetaML/index.html @@ -1,5 +1,5 @@ -MultitargetGaussianMixtureRegressor · MLJ

MultitargetGaussianMixtureRegressor

mutable struct MultitargetGaussianMixtureRegressor <: MLJModelInterface.Deterministic

A non-linear regressor derived from fitting the data on a probabilistic model (Gaussian Mixture Model). Relatively fast but generally not very precise, except for data with a structure matching the chosen underlying mixture.

This is the multi-target version of the model. If you want to predict a single label (y), use the MLJ model GaussianMixtureRegressor.

Hyperparameters:

  • n_classes::Int64: Number of mixtures (latent classes) to consider [def: 3]

  • initial_probmixtures::Vector{Float64}: Initial probabilities of the categorical distribution (n_classes x 1) [default: []]

  • mixtures::Union{Type, Vector{<:BetaML.GMM.AbstractMixture}}: An array (of length n_classes) of the mixtures to employ (see the GMM module). Each mixture object can be provided with or without its parameters (e.g. mean and variance for the Gaussian ones). Fully qualified mixtures are useful only if the initialisation_strategy parameter is set to "given". This parameter can also be given simply as a type, in which case it is automatically extended to a vector of n_classes mixtures of the specified type. Note that mixing of different mixture types is not currently supported. [def: [DiagonalGaussian() for i in 1:n_classes]]

  • tol::Float64: Tolerance to stop the algorithm [default: 10^(-6)]

  • minimum_variance::Float64: Minimum variance for the mixtures [default: 0.05]

  • minimum_covariance::Float64: Minimum covariance for the mixtures with full covariance matrix [default: 0]. This should be set differently from minimum_variance (see notes).

  • initialisation_strategy::String: The computation method of the vector of the initial mixtures. One of the following:

    • "grid": using a grid approach
    • "given": using the mixture provided in the fully qualified mixtures parameter
    • "kmeans": use first kmeans (itself initialised with a "grid" strategy) to set the initial mixture centers [default]

    Note that currently "random" and "shuffle" initialisations are not supported in gmm-based algorithms.

  • maximum_iterations::Int64: Maximum number of iterations [def: typemax(Int64), i.e. ∞]

  • rng::Random.AbstractRNG: Random Number Generator [default: Random.GLOBAL_RNG]

Example:

julia> using MLJ
+MultitargetGaussianMixtureRegressor · MLJ

MultitargetGaussianMixtureRegressor

mutable struct MultitargetGaussianMixtureRegressor <: MLJModelInterface.Deterministic

A non-linear regressor derived from fitting the data on a probabilistic model (Gaussian Mixture Model). Relatively fast but generally not very precise, except for data with a structure matching the chosen underlying mixture.

This is the multi-target version of the model. If you want to predict a single label (y), use the MLJ model GaussianMixtureRegressor.

Hyperparameters:

  • n_classes::Int64: Number of mixtures (latent classes) to consider [def: 3]

  • initial_probmixtures::Vector{Float64}: Initial probabilities of the categorical distribution (n_classes x 1) [default: []]

  • mixtures::Union{Type, Vector{<:BetaML.GMM.AbstractMixture}}: An array (of length n_classes) of the mixtures to employ (see the GMM module). Each mixture object can be provided with or without its parameters (e.g. mean and variance for the Gaussian ones). Fully qualified mixtures are useful only if the initialisation_strategy parameter is set to "given". This parameter can also be given simply as a type, in which case it is automatically extended to a vector of n_classes mixtures of the specified type. Note that mixing of different mixture types is not currently supported. [def: [DiagonalGaussian() for i in 1:n_classes]]

  • tol::Float64: Tolerance to stop the algorithm [default: 10^(-6)]

  • minimum_variance::Float64: Minimum variance for the mixtures [default: 0.05]

  • minimum_covariance::Float64: Minimum covariance for the mixtures with full covariance matrix [default: 0]. This should be set differently from minimum_variance (see notes).

  • initialisation_strategy::String: The computation method of the vector of the initial mixtures. One of the following:

    • "grid": using a grid approach
    • "given": using the mixture provided in the fully qualified mixtures parameter
    • "kmeans": use first kmeans (itself initialised with a "grid" strategy) to set the initial mixture centers [default]

    Note that currently "random" and "shuffle" initialisations are not supported in gmm-based algorithms.

  • maximum_iterations::Int64: Maximum number of iterations [def: typemax(Int64), i.e. ∞]

  • rng::Random.AbstractRNG: Random Number Generator [default: Random.GLOBAL_RNG]

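For instance, the mixtures hyper-parameter can be given simply as a type; here is a sketch under the assumption that SphericalGaussian is among the mixture types exported by BetaML:

using MLJ
import BetaML
MultitargetGaussianMixtureRegressor = @load MultitargetGaussianMixtureRegressor pkg=BetaML
model = MultitargetGaussianMixtureRegressor(
    n_classes = 4,
    mixtures  = BetaML.SphericalGaussian,   ## automatically expanded to a vector of 4 mixtures
    tol       = 1e-8)
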
Example:

julia> using MLJ
 
 julia> X, y        = @load_boston;
 
@@ -32,4 +32,4 @@
  23.3358  51.6717
   ⋮       
  16.6843  38.3686
- 16.6843  38.3686
+ 16.6843 38.3686
diff --git a/dev/models/MultitargetKNNClassifier_NearestNeighborModels/index.html b/dev/models/MultitargetKNNClassifier_NearestNeighborModels/index.html index ec498600d..9afad905d 100644 --- a/dev/models/MultitargetKNNClassifier_NearestNeighborModels/index.html +++ b/dev/models/MultitargetKNNClassifier_NearestNeighborModels/index.html @@ -1,5 +1,5 @@ -MultitargetKNNClassifier · MLJ

MultitargetKNNClassifier

MultitargetKNNClassifier

A model type for constructing a multitarget K-nearest neighbor classifier, based on NearestNeighborModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultitargetKNNClassifier = @load MultitargetKNNClassifier pkg=NearestNeighborModels

Do model = MultitargetKNNClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetKNNClassifier(K=...).

Multi-target K-Nearest Neighbors Classifier (MultitargetKNNClassifier) is a variation of KNNClassifier that assumes the target variable is vector-valued with Multiclass or OrderedFactor components. (Target data must be presented as a table, however.)

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

OR

mach = machine(model, X, y, w)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any table of responses whose element scitype is either <:Finite (<:Multiclass or <:OrderedFactor will do); check the column scitypes with schema(y). Each column of y is assumed to belong to a common categorical pool.
  • w is the observation weights, which can be either nothing (default) or an AbstractVector whose element scitype is Count or Continuous. This is different from the weights kernel, which is a model hyperparameter; see below.

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • K::Int=5 : number of neighbors
  • algorithm::Symbol = :kdtree : one of (:kdtree, :brutetree, :balltree)
  • metric::Metric = Euclidean() : any Metric from Distances.jl for the distance between points. For algorithm = :kdtree only metrics which are instances of Union{Distances.Chebyshev, Distances.Cityblock, Distances.Euclidean, Distances.Minkowski, Distances.WeightedCityblock, Distances.WeightedEuclidean, Distances.WeightedMinkowski} are supported.
  • leafsize::Int = algorithm == 10 : determines the number of points at which to stop splitting the tree. This option is ignored and always taken as 0 for algorithm = :brutetree, since brutetree isn't actually a tree.
  • reorder::Bool = true : if true then points which are close in distance are placed close in memory. In this case, a copy of the original data will be made so that the original data is left unmodified. Setting this to true can significantly improve performance of the specified algorithm (except :brutetree). This option is ignored and always taken as false for algorithm = :brutetree.
  • weights::KNNKernel=Uniform() : kernel used in assigning weights to the k-nearest neighbors for each observation. An instance of one of the types in list_kernels(). User-defined weighting functions can be passed by wrapping the function in a UserDefinedKernel kernel (do ?NearestNeighborModels.UserDefinedKernel for more info). If observation weights w are passed during machine construction then the weight assigned to each neighbor vote is the product of the kernel generated weight for that neighbor and the corresponding observation weight.
  • output_type::Type{<:MultiUnivariateFinite}=DictTable : One of (ColumnTable, DictTable). The type of table type to use for predictions. Setting to ColumnTable might improve performance for narrow tables while setting to DictTable improves performance for wide tables.

Operations

  • predict(mach, Xnew): Return predictions of the target given features Xnew, which should have same scitype as X above. Predictions are either a ColumnTable or DictTable of UnivariateFiniteVector columns depending on the value set for the output_type parameter discussed above. The probabilistic predictions are uncalibrated.
  • predict_mode(mach, Xnew): Return the modes of each column of the table of probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • tree: An instance of either KDTree, BruteTree or BallTree depending on the value of the algorithm hyperparameter (see the hyper-parameters section above). These are data structures that store the training data with a view to making nearest neighbor searches on test data points quicker.

Examples

using MLJ, StableRNGs
+MultitargetKNNClassifier · MLJ

MultitargetKNNClassifier

MultitargetKNNClassifier

A model type for constructing a multitarget K-nearest neighbor classifier, based on NearestNeighborModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultitargetKNNClassifier = @load MultitargetKNNClassifier pkg=NearestNeighborModels

Do model = MultitargetKNNClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetKNNClassifier(K=...).

Multi-target K-Nearest Neighbors Classifier (MultitargetKNNClassifier) is a variation of KNNClassifier that assumes the target variable is vector-valued with Multiclass or OrderedFactor components. (Target data must be presented as a table, however.)

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

OR

mach = machine(model, X, y, w)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any table of responses whose element scitype is either <:Finite (<:Multiclass or <:OrderedFactor will do); check the column scitypes with schema(y). Each column of y is assumed to belong to a common categorical pool.
  • w is the observation weights, which can be either nothing (default) or an AbstractVector whose element scitype is Count or Continuous. This is different from the weights kernel, which is a model hyperparameter; see below.

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • K::Int=5 : number of neighbors
  • algorithm::Symbol = :kdtree : one of (:kdtree, :brutetree, :balltree)
  • metric::Metric = Euclidean() : any Metric from Distances.jl for the distance between points. For algorithm = :kdtree only metrics which are instances of Union{Distances.Chebyshev, Distances.Cityblock, Distances.Euclidean, Distances.Minkowski, Distances.WeightedCityblock, Distances.WeightedEuclidean, Distances.WeightedMinkowski} are supported.
  • leafsize::Int = algorithm == 10 : determines the number of points at which to stop splitting the tree. This option is ignored and always taken as 0 for algorithm = :brutetree, since brutetree isn't actually a tree.
  • reorder::Bool = true : if true then points which are close in distance are placed close in memory. In this case, a copy of the original data will be made so that the original data is left unmodified. Setting this to true can significantly improve performance of the specified algorithm (except :brutetree). This option is ignored and always taken as false for algorithm = :brutetree.
  • weights::KNNKernel=Uniform() : kernel used in assigning weights to the k-nearest neighbors for each observation. An instance of one of the types in list_kernels(). User-defined weighting functions can be passed by wrapping the function in a UserDefinedKernel kernel (do ?NearestNeighborModels.UserDefinedKernel for more info). If observation weights w are passed during machine construction then the weight assigned to each neighbor vote is the product of the kernel generated weight for that neighbor and the corresponding observation weight.
  • output_type::Type{<:MultiUnivariateFinite}=DictTable : One of (ColumnTable, DictTable). The type of table type to use for predictions. Setting to ColumnTable might improve performance for narrow tables while setting to DictTable improves performance for wide tables.

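For example, these hyper-parameters can be overridden at construction; a minimal sketch using the documented Uniform kernel:

using MLJ
import NearestNeighborModels
MultitargetKNNClassifier = @load MultitargetKNNClassifier pkg=NearestNeighborModels
model = MultitargetKNNClassifier(K=3, algorithm=:balltree,
                                 weights=NearestNeighborModels.Uniform())
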
Operations

  • predict(mach, Xnew): Return predictions of the target given features Xnew, which should have same scitype as X above. Predictions are either a ColumnTable or DictTable of UnivariateFiniteVector columns depending on the value set for the output_type parameter discussed above. The probabilistic predictions are uncalibrated.
  • predict_mode(mach, Xnew): Return the modes of each column of the table of probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • tree: An instance of either KDTree, BruteTree or BallTree depending on the value of the algorithm hyperparameter (see the hyper-parameters section above). These are data structures that store the training data with a view to making nearest neighbor searches on test data points quicker.

Examples

using MLJ, StableRNGs
 
 ## set rng for reproducibility
 rng = StableRNG(10)
@@ -28,4 +28,4 @@
 ## predict
 y_hat = predict(mach, X)
 labels = predict_mode(mach, X)
-

See also KNNClassifier

+

See also KNNClassifier

diff --git a/dev/models/MultitargetKNNRegressor_NearestNeighborModels/index.html b/dev/models/MultitargetKNNRegressor_NearestNeighborModels/index.html index acf1db9c7..b7615e131 100644 --- a/dev/models/MultitargetKNNRegressor_NearestNeighborModels/index.html +++ b/dev/models/MultitargetKNNRegressor_NearestNeighborModels/index.html @@ -1,5 +1,5 @@ -MultitargetKNNRegressor · MLJ

MultitargetKNNRegressor

MultitargetKNNRegressor

A model type for constructing a multitarget K-nearest neighbor regressor, based on NearestNeighborModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultitargetKNNRegressor = @load MultitargetKNNRegressor pkg=NearestNeighborModels

Do model = MultitargetKNNRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetKNNRegressor(K=...).

Multi-target K-Nearest Neighbors regressor (MultitargetKNNRegressor) is a variation of KNNRegressor that assumes the target variable is vector-valued with Continuous components. (Target data must be presented as a table, however.)

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

OR

mach = machine(model, X, y, w)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any table of responses whose element scitype is Continuous; check column scitypes with schema(y).
  • w is the observation weights, which can be either nothing (default) or an AbstractVector whose element scitype is Count or Continuous. This is different from the weights kernel, which is a model hyperparameter; see below.

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • K::Int=5 : number of neighbors
  • algorithm::Symbol = :kdtree : one of (:kdtree, :brutetree, :balltree)
  • metric::Metric = Euclidean() : any Metric from Distances.jl for the distance between points. For algorithm = :kdtree only metrics which are instances of Union{Distances.Chebyshev, Distances.Cityblock, Distances.Euclidean, Distances.Minkowski, Distances.WeightedCityblock, Distances.WeightedEuclidean, Distances.WeightedMinkowski} are supported.
  • leafsize::Int = algorithm == 10 : determines the number of points at which to stop splitting the tree. This option is ignored and always taken as 0 for algorithm = :brutetree, since brutetree isn't actually a tree.
  • reorder::Bool = true : if true then points which are close in distance are placed close in memory. In this case, a copy of the original data will be made so that the original data is left unmodified. Setting this to true can significantly improve performance of the specified algorithm (except :brutetree). This option is ignored and always taken as false for algorithm = :brutetree.
  • weights::KNNKernel=Uniform() : kernel used in assigning weights to the k-nearest neighbors for each observation. An instance of one of the types in list_kernels(). User-defined weighting functions can be passed by wrapping the function in a UserDefinedKernel kernel (do ?NearestNeighborModels.UserDefinedKernel for more info). If observation weights w are passed during machine construction then the weight assigned to each neighbor vote is the product of the kernel generated weight for that neighbor and the corresponding observation weight.

Operations

  • predict(mach, Xnew): Return predictions of the target given features Xnew, which should have same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • tree: An instance of either KDTree, BruteTree or BallTree depending on the value of the algorithm hyperparameter (see the hyper-parameters section above). These are data structures that store the training data with a view to making nearest neighbor searches on test data points quicker.

Examples

using MLJ
+MultitargetKNNRegressor · MLJ

MultitargetKNNRegressor

MultitargetKNNRegressor

A model type for constructing a multitarget K-nearest neighbor regressor, based on NearestNeighborModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultitargetKNNRegressor = @load MultitargetKNNRegressor pkg=NearestNeighborModels

Do model = MultitargetKNNRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetKNNRegressor(K=...).

Multi-target K-Nearest Neighbors regressor (MultitargetKNNRegressor) is a variation of KNNRegressor that assumes the target variable is vector-valued with Continuous components. (Target data must be presented as a table, however.)

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

OR

mach = machine(model, X, y, w)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any table of responses whose element scitype is Continuous; check column scitypes with schema(y).
  • w is the observation weights, which can be either nothing (default) or an AbstractVector whose element scitype is Count or Continuous. This is different from the weights kernel, which is a model hyperparameter; see below.

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • K::Int=5 : number of neighbors
  • algorithm::Symbol = :kdtree : one of (:kdtree, :brutetree, :balltree)
  • metric::Metric = Euclidean() : any Metric from Distances.jl for the distance between points. For algorithm = :kdtree only metrics which are instances of Union{Distances.Chebyshev, Distances.Cityblock, Distances.Euclidean, Distances.Minkowski, Distances.WeightedCityblock, Distances.WeightedEuclidean, Distances.WeightedMinkowski} are supported.
  • leafsize::Int = algorithm == 10 : determines the number of points at which to stop splitting the tree. This option is ignored and always taken as 0 for algorithm = :brutetree, since brutetree isn't actually a tree.
  • reorder::Bool = true : if true then points which are close in distance are placed close in memory. In this case, a copy of the original data will be made so that the original data is left unmodified. Setting this to true can significantly improve performance of the specified algorithm (except :brutetree). This option is ignored and always taken as false for algorithm = :brutetree.
  • weights::KNNKernel=Uniform() : kernel used in assigning weights to the k-nearest neighbors for each observation. An instance of one of the types in list_kernels(). User-defined weighting functions can be passed by wrapping the function in a UserDefinedKernel kernel (do ?NearestNeighborModels.UserDefinedKernel for more info). If observation weights w are passed during machine construction then the weight assigned to each neighbor vote is the product of the kernel generated weight for that neighbor and the corresponding observation weight.

Operations

  • predict(mach, Xnew): Return predictions of the target given features Xnew, which should have same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • tree: An instance of either KDTree, BruteTree or BallTree depending on the value of the algorithm hyperparameter (see the hyper-parameters section above). These are data structures that store the training data with a view to making nearest neighbor searches on test data points quicker.

Examples

using MLJ
 
 ## Create Data
 X, y = make_regression(10, 5, n_targets=2)
@@ -18,4 +18,4 @@
 
 ## Predict
 y_hat = predict(mach, X)
-

See also KNNRegressor

+

See also KNNRegressor

diff --git a/dev/models/MultitargetLinearRegressor_MultivariateStats/index.html b/dev/models/MultitargetLinearRegressor_MultivariateStats/index.html index d24099803..5154f93b8 100644 --- a/dev/models/MultitargetLinearRegressor_MultivariateStats/index.html +++ b/dev/models/MultitargetLinearRegressor_MultivariateStats/index.html @@ -1,5 +1,5 @@ -MultitargetLinearRegressor · MLJ

MultitargetLinearRegressor

MultitargetLinearRegressor

A model type for constructing a multitarget linear regressor, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultitargetLinearRegressor = @load MultitargetLinearRegressor pkg=MultivariateStats

Do model = MultitargetLinearRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetLinearRegressor(bias=...).

MultitargetLinearRegressor assumes the target variable is vector-valued with continuous components. It trains a linear prediction function using the least squares algorithm. Options exist to specify a bias term.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any table of responses whose element scitype is Continuous; check the scitype with scitype(y).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • bias=true: Include the bias term if true, otherwise fit without bias term.

Operations

  • predict(mach, Xnew): Return predictions of the target given new features Xnew, which should have the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • coefficients: The linear coefficients determined by the model.
  • intercept: The intercept determined by the model.

Examples

using MLJ
+MultitargetLinearRegressor · MLJ

MultitargetLinearRegressor

MultitargetLinearRegressor

A model type for constructing a multitarget linear regressor, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultitargetLinearRegressor = @load MultitargetLinearRegressor pkg=MultivariateStats

Do model = MultitargetLinearRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetLinearRegressor(bias=...).

MultitargetLinearRegressor assumes the target variable is vector-valued with continuous components. It trains a linear prediction function using the least squares algorithm. Options exist to specify a bias term.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any table of responses whose element scitype is Continuous; check the scitype with scitype(y).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • bias=true: Include the bias term if true, otherwise fit without bias term.

Operations

  • predict(mach, Xnew): Return predictions of the target given new features Xnew, which should have the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • coefficients: The linear coefficients determined by the model.
  • intercept: The intercept determined by the model.

Examples

using MLJ
 using DataFrames
 
 LinearRegressor = @load MultitargetLinearRegressor pkg=MultivariateStats
@@ -10,4 +10,4 @@
 mach = machine(linear_regressor, X, y) |> fit!
 
 Xnew, _ = make_regression(3, 9)
-yhat = predict(mach, Xnew) ## new predictions

See also LinearRegressor, RidgeRegressor, MultitargetRidgeRegressor

+yhat = predict(mach, Xnew) ## new predictions

See also LinearRegressor, RidgeRegressor, MultitargetRidgeRegressor

diff --git a/dev/models/MultitargetNeuralNetworkRegressor_BetaML/index.html b/dev/models/MultitargetNeuralNetworkRegressor_BetaML/index.html index fd884fed7..d75614bc8 100644 --- a/dev/models/MultitargetNeuralNetworkRegressor_BetaML/index.html +++ b/dev/models/MultitargetNeuralNetworkRegressor_BetaML/index.html @@ -1,5 +1,5 @@ -MultitargetNeuralNetworkRegressor · MLJ

MultitargetNeuralNetworkRegressor

mutable struct MultitargetNeuralNetworkRegressor <: MLJModelInterface.Deterministic

A simple but flexible Feedforward Neural Network, from the Beta Machine Learning Toolkit (BetaML), for regression of multi-dimensional targets.

Parameters:

  • layers: Array of layer objects [def: nothing, i.e. basic network]. See subtypes(BetaML.AbstractLayer) for supported layers

  • loss: Loss (cost) function [def: BetaML.squared_cost]. Should always assume y and ŷ as matrices.

    Warning

    If you change the parameter loss, you need to either provide its derivative on the parameter dloss or use autodiff with dloss=nothing.

  • dloss: Derivative of the loss function [def: BetaML.dsquared_cost, i.e. use the derivative of the squared cost]. Use nothing for autodiff.

  • epochs: Number of epochs, i.e. passes through the whole training sample [def: 300]

  • batch_size: Size of each individual batch [def: 16]

  • opt_alg: The optimisation algorithm to update the gradient at each batch [def: BetaML.ADAM()]. See subtypes(BetaML.OptimisationAlgorithm) for supported optimizers

  • shuffle: Whether to randomly shuffle the data at each iteration (epoch) [def: true]

  • descr: An optional title and/or description for this model

  • cb: A call back function to provide information during training [def: BetaML.fitting_info]

  • rng: Random Number Generator (see FIXEDSEED) [default: Random.GLOBAL_RNG]

Notes:

  • data must be numerical
  • the label should be an n-records by n-dimensions matrix
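
As an illustration of overriding some of the defaults listed above, here is a minimal construction sketch (the particular values are arbitrary and purely illustrative):

using MLJ
import BetaML

MultitargetNeuralNetworkRegressor = @load MultitargetNeuralNetworkRegressor pkg=BetaML

model = MultitargetNeuralNetworkRegressor(
    epochs     = 100,
    batch_size = 32,
    opt_alg    = BetaML.ADAM(),
)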

Example:

julia> using MLJ
+MultitargetNeuralNetworkRegressor · MLJ

MultitargetNeuralNetworkRegressor

mutable struct MultitargetNeuralNetworkRegressor <: MLJModelInterface.Deterministic

A simple but flexible Feedforward Neural Network, from the Beta Machine Learning Toolkit (BetaML) for regression of multiple dimensional targets.

Parameters:

  • layers: Array of layer objects [def: nothing, i.e. basic network]. See subtypes(BetaML.AbstractLayer) for supported layers (an explicit layer stack is sketched after the Notes below)

  • loss: Loss (cost) function [def: BetaML.squared_cost]. Should always assume y and ŷ as matrices.

    Warning

    If you change the parameter loss, you need to either provide its derivative on the parameter dloss or use autodiff with dloss=nothing.

  • dloss: Derivative of the loss function [def: BetaML.dsquared_cost, i.e. use the derivative of the squared cost]. Use nothing for autodiff.

  • epochs: Number of epochs, i.e. passes through the whole training sample [def: 300]

  • batch_size: Size of each individual batch [def: 16]

  • opt_alg: The optimisation algorithm to update the gradient at each batch [def: BetaML.ADAM()]. See subtypes(BetaML.OptimisationAlgorithm) for supported optimizers

  • shuffle: Whether to randomly shuffle the data at each iteration (epoch) [def: true]

  • descr: An optional title and/or description for this model

  • cb: A callback function to provide information during training [def: BetaML.fitting_info]

  • rng: Random Number Generator (see FIXEDSEED) [default: Random.GLOBAL_RNG]

Notes:

  • data must be numerical
  • the label should be an n-records by n-dimensions matrix
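
An explicit architecture can be supplied through the layers hyper-parameter, as sketched below. The DenseLayer constructor and relu activation used here are assumed from BetaML's own documentation, and the layer sizes are purely illustrative (9 input features, 2 targets):

using MLJ
import BetaML

MultitargetNeuralNetworkRegressor = @load MultitargetNeuralNetworkRegressor pkg=BetaML

layers = [BetaML.DenseLayer(9, 32, f=BetaML.relu),   ## assumed constructor: (n_in, n_out; f=activation)
          BetaML.DenseLayer(32, 16, f=BetaML.relu),
          BetaML.DenseLayer(16, 2)]

model = MultitargetNeuralNetworkRegressor(layers=layers, epochs=150, batch_size=32)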

Example:

julia> using MLJ
 
 julia> X, y        = @load_boston;
 
@@ -38,4 +38,4 @@
   ⋮                   
  23.9  52.8  23.3573  50.654
  22.0  49.0  22.1141  48.5926
- 11.9  28.8  19.9639  45.5823
+ 11.9 28.8 19.9639 45.5823
diff --git a/dev/models/MultitargetNeuralNetworkRegressor_MLJFlux/index.html b/dev/models/MultitargetNeuralNetworkRegressor_MLJFlux/index.html index 9ff256659..9c31ac6d9 100644 --- a/dev/models/MultitargetNeuralNetworkRegressor_MLJFlux/index.html +++ b/dev/models/MultitargetNeuralNetworkRegressor_MLJFlux/index.html @@ -1,5 +1,5 @@ -MultitargetNeuralNetworkRegressor · MLJ

MultitargetNeuralNetworkRegressor

MultitargetNeuralNetworkRegressor

A model type for constructing a multitarget neural network regressor, based on MLJFlux.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultitargetNeuralNetworkRegressor = @load MultitargetNeuralNetworkRegressor pkg=MLJFlux

Do model = MultitargetNeuralNetworkRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetNeuralNetworkRegressor(builder=...).

MultitargetNeuralNetworkRegressor is for training a data-dependent Flux.jl neural network to predict a multi-valued Continuous target, represented as a table, given a table of Continuous features. Users provide a recipe for constructing the network, based on properties of the data that is encountered, by specifying an appropriate builder. See MLJFlux documentation for more on builders.
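
For instance, a builder can be declared with the MLJFlux.@builder convenience macro, in which n_in (number of input features) and n_out (number of targets) are filled in by MLJFlux at fit! time. This is only a minimal sketch; the layer widths are illustrative:

using MLJ
import MLJFlux
using Flux

MultitargetNeuralNetworkRegressor = @load MultitargetNeuralNetworkRegressor pkg=MLJFlux

builder = MLJFlux.@builder Chain(Dense(n_in, 64, relu),
                                 Dense(64, 32, relu),
                                 Dense(32, n_out))

model = MultitargetNeuralNetworkRegressor(builder=builder, epochs=20, batch_size=8)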

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is either a Matrix or any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). If X is a Matrix, it is assumed to have columns corresponding to features and rows corresponding to observations.
  • y is the target, which can be any table or matrix of output targets whose element scitype is Continuous; check column scitypes with schema(y). If y is a Matrix, it is assumed to have columns corresponding to variables and rows corresponding to observations.

Hyper-parameters

  • builder=MLJFlux.Linear(σ=Flux.relu): An MLJFlux builder that constructs a neural network. Possible builders include: Linear, Short, and MLP. See MLJFlux documentation for more on builders, and the example below for using the @builder convenience macro.

  • optimiser::Flux.Adam(): A Flux.Optimise optimiser. The optimiser performs the updating of the weights of the network. For further reference, see the Flux optimiser documentation. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.

  • loss=Flux.mse: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. For a regression task, natural loss functions are:

    • Flux.mse
    • Flux.mae
    • Flux.msle
    • Flux.huber_loss

    Currently MLJ measures are not supported as loss functions here.

  • epochs::Int=10: The duration of training, in epochs. Typically, one epoch represents one pass through the complete training dataset.

  • batch_size::int=1: the batch size to be used for training, representing the number of samples per update of the network weights. Typically, batch size is between 8 and 512. Increasing batch size may accelerate training if acceleration=CUDALibs() and a GPU is available.

  • lambda::Float64=0: The strength of the weight regularization penalty. Can be any value in the range [0, ∞).

  • alpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.

  • rng::Union{AbstractRNG, Int64}: The random number generator or seed used during training.

  • optimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.

  • acceleration::AbstractResource=CPU1(): Defines on what hardware training is done. For training on a GPU, use CUDALibs().

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew having the same scitype as X above. Predictions are deterministic.

Fitted parameters

The fields of fitted_params(mach) are:

  • chain: The trained "chain" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network.

Report

The fields of report(mach) are:

  • training_losses: A vector of training losses (penalised if lambda != 0) in historical order, of length epochs + 1. The first element is the pre-training loss.

Examples

In this example we apply a multi-target regression model to synthetic data:

using MLJ
+MultitargetNeuralNetworkRegressor · MLJ

MultitargetNeuralNetworkRegressor

MultitargetNeuralNetworkRegressor

A model type for constructing a multitarget neural network regressor, based on MLJFlux.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultitargetNeuralNetworkRegressor = @load MultitargetNeuralNetworkRegressor pkg=MLJFlux

Do model = MultitargetNeuralNetworkRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetNeuralNetworkRegressor(builder=...).

MultitargetNeuralNetworkRegressor is for training a data-dependent Flux.jl neural network to predict a multi-valued Continuous target, represented as a table, given a table of Continuous features. Users provide a recipe for constructing the network, based on properties of the data that is encountered, by specifying an appropriate builder. See MLJFlux documentation for more on builders.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is either a Matrix or any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). If X is a Matrix, it is assumed to have columns corresponding to features and rows corresponding to observations.
  • y is the target, which can be any table or matrix of output targets whose element scitype is Continuous; check column scitypes with schema(y). If y is a Matrix, it is assumed to have columns corresponding to variables and rows corresponding to observations.

Hyper-parameters

  • builder=MLJFlux.Linear(σ=Flux.relu): An MLJFlux builder that constructs a neural network. Possible builders include: Linear, Short, and MLP. See MLJFlux documentation for more on builders, and the example below for using the @builder convenience macro.

  • optimiser::Flux.Adam(): A Flux.Optimise optimiser. The optimiser performs the updating of the weights of the network. For further reference, see the Flux optimiser documentation. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.

  • loss=Flux.mse: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. For a regression task, natural loss functions are:

    • Flux.mse
    • Flux.mae
    • Flux.msle
    • Flux.huber_loss

    Currently MLJ measures are not supported as loss functions here.

  • epochs::Int=10: The duration of training, in epochs. Typically, one epoch represents one pass through the complete training dataset.

  • batch_size::int=1: the batch size to be used for training, representing the number of samples per update of the network weights. Typically, batch size is between 8 and 512. Increasing batch size may accelerate training if acceleration=CUDALibs() and a GPU is available.

  • lambda::Float64=0: The strength of the weight regularization penalty. Can be any value in the range [0, ∞).

  • alpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.

  • rng::Union{AbstractRNG, Int64}: The random number generator or seed used during training.

  • optimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.

  • acceleration::AbstractResource=CPU1(): Defines on what hardware training is done. For training on a GPU, use CUDALibs().

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew having the same scitype as X above. Predictions are deterministic.

Fitted parameters

The fields of fitted_params(mach) are:

  • chain: The trained "chain" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network.

Report

The fields of report(mach) are:

  • training_losses: A vector of training losses (penalised if lambda != 0) in historical order, of length epochs + 1. The first element is the pre-training loss (a sketch of retrieving these losses follows below).
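
A minimal sketch of retrieving the training losses after fitting on synthetic data (the data and epoch count are illustrative):

using MLJ
import MLJFlux

MultitargetNeuralNetworkRegressor = @load MultitargetNeuralNetworkRegressor pkg=MLJFlux

X, y = make_regression(100, 9; n_targets=2)    ## synthetic tables
model = MultitargetNeuralNetworkRegressor(epochs=10)
mach  = machine(model, X, y) |> fit!

losses = report(mach).training_losses          ## vector of length epochs + 1
losses[1]                                      ## the pre-training loss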

Examples

In this example we apply a multi-target regression model to synthetic data:

using MLJ
 import MLJFlux
 using Flux

First, we generate some synthetic data (needs MLJBase 0.20.16 or higher):

X, y = make_regression(100, 9; n_targets = 2) ## both tables
 schema(y)
@@ -24,4 +24,4 @@
 ## loss for `(Xtest, test)`:
 fit!(mach) ## trains on all data `(X, y)`
 yhat = predict(mach, Xtest)
-multi_loss(yhat, ytest)

See also NeuralNetworkRegressor

+multi_loss(yhat, ytest)

See also NeuralNetworkRegressor

diff --git a/dev/models/MultitargetRidgeRegressor_MultivariateStats/index.html b/dev/models/MultitargetRidgeRegressor_MultivariateStats/index.html index f2632cd2a..0a22c60f1 100644 --- a/dev/models/MultitargetRidgeRegressor_MultivariateStats/index.html +++ b/dev/models/MultitargetRidgeRegressor_MultivariateStats/index.html @@ -1,5 +1,5 @@ -MultitargetRidgeRegressor · MLJ

MultitargetRidgeRegressor

MultitargetRidgeRegressor

A model type for constructing a multitarget ridge regressor, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultitargetRidgeRegressor = @load MultitargetRidgeRegressor pkg=MultivariateStats

Do model = MultitargetRidgeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetRidgeRegressor(lambda=...).

Multi-target ridge regression adds a quadratic penalty term to multi-target least squares regression, for regularization. Ridge regression is particularly useful in the case of multicollinearity. In this case, the output represents a response vector. Options exist to specify a bias term, and to adjust the strength of the penalty term.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any table of responses whose element scitype is Continuous; check the scitype with scitype(y).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • lambda=1.0: The non-negative parameter controlling the regularization strength. If lambda is 0, ridge regression is equivalent to linear least squares regression, and as lambda approaches infinity, all the linear coefficients approach 0 (see the sketch after this list).
  • bias=true: Include the bias term if true, otherwise fit without bias term.
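
A minimal sketch of the effect of lambda, on synthetic data (the particular lambda values are illustrative):

using MLJ

MultitargetRidgeRegressor = @load MultitargetRidgeRegressor pkg=MultivariateStats

X, y = make_regression(100, 6; n_targets=2)

for lambda in (0.0, 1.0, 100.0)
    mach = machine(MultitargetRidgeRegressor(lambda=lambda), X, y) |> fit!
    ## larger lambda shrinks the fitted coefficients towards zero
    println("lambda = ", lambda, ":  sum of |coefficients| = ",
            sum(abs, fitted_params(mach).coefficients))
end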

Operations

  • predict(mach, Xnew): Return predictions of the target given new features Xnew, which should have the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • coefficients: The linear coefficients determined by the model.
  • intercept: The intercept determined by the model.

Examples

using MLJ
+MultitargetRidgeRegressor · MLJ

MultitargetRidgeRegressor

MultitargetRidgeRegressor

A model type for constructing a multitarget ridge regressor, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultitargetRidgeRegressor = @load MultitargetRidgeRegressor pkg=MultivariateStats

Do model = MultitargetRidgeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetRidgeRegressor(lambda=...).

Multi-target ridge regression adds a quadratic penalty term to multi-target least squares regression, for regularization. Ridge regression is particularly useful in the case of multicollinearity. In this case, the output represents a response vector. Options exist to specify a bias term, and to adjust the strength of the penalty term.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any table of responses whose element scitype is Continuous; check the scitype with scitype(y).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • lambda=1.0: The non-negative parameter controlling the regularization strength. If lambda is 0, ridge regression is equivalent to linear least squares regression, and as lambda approaches infinity, all the linear coefficients approach 0.
  • bias=true: Include the bias term if true, otherwise fit without bias term.

Operations

  • predict(mach, Xnew): Return predictions of the target given new features Xnew, which should have the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • coefficients: The linear coefficients determined by the model.
  • intercept: The intercept determined by the model.

Examples

using MLJ
 using DataFrames
 
 RidgeRegressor = @load MultitargetRidgeRegressor pkg=MultivariateStats
@@ -10,4 +10,4 @@
 mach = machine(ridge_regressor, X, y) |> fit!
 
 Xnew, _ = make_regression(3, 6)
-yhat = predict(mach, Xnew) ## new predictions

See also LinearRegressor, MultitargetLinearRegressor, RidgeRegressor

+yhat = predict(mach, Xnew) ## new predictions

See also LinearRegressor, MultitargetLinearRegressor, RidgeRegressor

diff --git a/dev/models/MultitargetSRRegressor_SymbolicRegression/index.html b/dev/models/MultitargetSRRegressor_SymbolicRegression/index.html index bfc353c00..0825af3f1 100644 --- a/dev/models/MultitargetSRRegressor_SymbolicRegression/index.html +++ b/dev/models/MultitargetSRRegressor_SymbolicRegression/index.html @@ -1,11 +1,11 @@ -MultitargetSRRegressor · MLJ

MultitargetSRRegressor

MultitargetSRRegressor

A model type for constructing a Multi-Target Symbolic Regression via Evolutionary Search, based on SymbolicRegression.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultitargetSRRegressor = @load MultitargetSRRegressor pkg=SymbolicRegression

Do model = MultitargetSRRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetSRRegressor(binary_operators=...).

Multi-target Symbolic Regression regressor (MultitargetSRRegressor) conducts several searches for expressions that predict each target variable from a set of input variables. All data is assumed to be Continuous. The search is performed using an evolutionary algorithm. This algorithm is described in the paper https://arxiv.org/abs/2305.01582.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

OR

mach = machine(model, X, y, w)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). Variable names in discovered expressions will be taken from the column names of X, if available. Units in columns of X (use DynamicQuantities for units) will trigger dimensional analysis to be used.

  • y is the target, which can be any table of target variables whose element scitype is Continuous; check the scitype with schema(y). Units in columns of y (use DynamicQuantities for units) will trigger dimensional analysis to be used.
  • w is the observation weights, which can either be nothing (default) or an AbstractVector whose element scitype is Count or Continuous. The same weights are used for all targets.

Train the machine using fit!(mach), inspect the discovered expressions with report(mach), and predict on new data with predict(mach, Xnew). Note that unlike other regressors, symbolic regression stores a list of lists of trained models. The model chosen from each of these lists is defined by the selection_method keyword argument (a function), which by default balances accuracy and complexity. You can override this at prediction time by passing a named tuple with keys data and idx.
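
A minimal sketch of this workflow (the target functions, operator set and niterations are illustrative, and the idx override uses the named-tuple form described above):

using MLJ

MultitargetSRRegressor = @load MultitargetSRRegressor pkg=SymbolicRegression

X = (a=rand(100), b=rand(100))
Y = (y1=(@. 2.0 * X.a - X.b), y2=(@. X.a * X.b))     ## two synthetic targets

model = MultitargetSRRegressor(binary_operators=[+, -, *], niterations=20)
mach  = machine(model, X, Y) |> fit!

yhat = predict(mach, X)                    ## expressions chosen by selection_method
r = report(mach)
predict(mach, (data=X, idx=r.best_idx))    ## override the per-target expression choice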

Hyper-parameters

  • binary_operators: Vector of binary operators (functions) to use. Each operator should be defined for two input scalars, and one output scalar. All operators need to be defined over the entire real line (excluding infinity - these are stopped before they are input), or return NaN where not defined. For speed, define it so it takes two reals of the same type as input, and outputs the same type. For the SymbolicUtils simplification backend, you will need to define a generic method of the operator so it takes arbitrary types.

  • unary_operators: Same, but for unary operators (one input scalar, gives an output scalar).

  • constraints: Array of pairs specifying size constraints for each operator. The constraints for a binary operator should be a 2-tuple (e.g., (-1, -1)) and the constraints for a unary operator should be an Int. A size constraint is a limit to the size of the subtree in each argument of an operator. e.g., [(^)=>(-1, 3)] means that the ^ operator can have arbitrary size (-1) in its left argument, but a maximum size of 3 in its right argument. Default is no constraints.

  • batching: Whether to evolve based on small mini-batches of data, rather than the entire dataset.

  • batch_size: What batch size to use if using batching.

  • elementwise_loss: What elementwise loss function to use. Can be one of the following losses, or any other loss of type SupervisedLoss. You can also pass a function that takes a scalar target (left argument), and scalar predicted (right argument), and returns a scalar. This will be averaged over the predicted data. If weights are supplied, your function should take a third argument for the weight scalar. Included losses: Regression: - LPDistLoss{P}(), - L1DistLoss(), - L2DistLoss() (mean square), - LogitDistLoss(), - HuberLoss(d), - L1EpsilonInsLoss(ϵ), - L2EpsilonInsLoss(ϵ), - PeriodicLoss(c), - QuantileLoss(τ), Classification: - ZeroOneLoss(), - PerceptronLoss(), - L1HingeLoss(), - SmoothedL1HingeLoss(γ), - ModifiedHuberLoss(), - L2MarginLoss(), - ExpLoss(), - SigmoidLoss(), - DWDMarginLoss(q).

  • loss_function: Alternatively, you may redefine the loss used as any function of tree::AbstractExpressionNode{T}, dataset::Dataset{T}, and options::Options, so long as you output a non-negative scalar of type T. This is useful if you want to use a loss that takes into account derivatives, or correlations across the dataset. This also means you could use a custom evaluation for a particular expression. If you are using batching=true, then your function should accept a fourth argument idx, which is either nothing (indicating that the full dataset should be used), or a vector of indices to use for the batch. For example,

      function my_loss(tree, dataset::Dataset{T,L}, options)::L where {T,L}
    +MultitargetSRRegressor · MLJ

    MultitargetSRRegressor

    MultitargetSRRegressor

    A model type for constructing a Multi-Target Symbolic Regression via Evolutionary Search, based on SymbolicRegression.jl, and implementing the MLJ model interface.

    From MLJ, the type can be imported using

    MultitargetSRRegressor = @load MultitargetSRRegressor pkg=SymbolicRegression

    Do model = MultitargetSRRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetSRRegressor(binary_operators=...).

    Multi-target Symbolic Regression regressor (MultitargetSRRegressor) conducts several searches for expressions that predict each target variable from a set of input variables. All data is assumed to be Continuous. The search is performed using an evolutionary algorithm. This algorithm is described in the paper https://arxiv.org/abs/2305.01582.

    Training data

    In MLJ or MLJBase, bind an instance model to data with

    mach = machine(model, X, y)

    OR

    mach = machine(model, X, y, w)

    Here:

    • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). Variable names in discovered expressions will be taken from the column names of X, if available. Units in columns of X (use DynamicQuantities for units) will trigger dimensional analysis to be used.

    • y is the target, which can be any table of target variables whose element scitype is Continuous; check the scitype with schema(y). Units in columns of y (use DynamicQuantities for units) will trigger dimensional analysis to be used.
    • w is the observation weights, which can either be nothing (default) or an AbstractVector whose element scitype is Count or Continuous. The same weights are used for all targets.

    Train the machine using fit!(mach), inspect the discovered expressions with report(mach), and predict on new data with predict(mach, Xnew). Note that unlike other regressors, symbolic regression stores a list of lists of trained models. The model chosen from each of these lists is defined by the selection_method keyword argument (a function), which by default balances accuracy and complexity. You can override this at prediction time by passing a named tuple with keys data and idx.

    Hyper-parameters

    • binary_operators: Vector of binary operators (functions) to use. Each operator should be defined for two input scalars, and one output scalar. All operators need to be defined over the entire real line (excluding infinity - these are stopped before they are input), or return NaN where not defined. For speed, define it so it takes two reals of the same type as input, and outputs the same type. For the SymbolicUtils simplification backend, you will need to define a generic method of the operator so it takes arbitrary types.

    • unary_operators: Same, but for unary operators (one input scalar, gives an output scalar).

    • constraints: Array of pairs specifying size constraints for each operator. The constraints for a binary operator should be a 2-tuple (e.g., (-1, -1)) and the constraints for a unary operator should be an Int. A size constraint is a limit to the size of the subtree in each argument of an operator. e.g., [(^)=>(-1, 3)] means that the ^ operator can have arbitrary size (-1) in its left argument, but a maximum size of 3 in its right argument. Default is no constraints.

    • batching: Whether to evolve based on small mini-batches of data, rather than the entire dataset.

    • batch_size: What batch size to use if using batching.

    • elementwise_loss: What elementwise loss function to use. Can be one of the following losses, or any other loss of type SupervisedLoss. You can also pass a function that takes a scalar target (left argument), and scalar predicted (right argument), and returns a scalar. This will be averaged over the predicted data. If weights are supplied, your function should take a third argument for the weight scalar. Included losses: Regression: - LPDistLoss{P}(), - L1DistLoss(), - L2DistLoss() (mean square), - LogitDistLoss(), - HuberLoss(d), - L1EpsilonInsLoss(ϵ), - L2EpsilonInsLoss(ϵ), - PeriodicLoss(c), - QuantileLoss(τ), Classification: - ZeroOneLoss(), - PerceptronLoss(), - L1HingeLoss(), - SmoothedL1HingeLoss(γ), - ModifiedHuberLoss(), - L2MarginLoss(), - ExpLoss(), - SigmoidLoss(), - DWDMarginLoss(q).

    • loss_function: Alternatively, you may redefine the loss used as any function of tree::Node{T}, dataset::Dataset{T}, and options::Options, so long as you output a non-negative scalar of type T. This is useful if you want to use a loss that takes into account derivatives, or correlations across the dataset. This also means you could use a custom evaluation for a particular expression. If you are using batching=true, then your function should accept a fourth argument idx, which is either nothing (indicating that the full dataset should be used), or a vector of indices to use for the batch. For example,

        function my_loss(tree, dataset::Dataset{T,L}, options)::L where {T,L}
             prediction, flag = eval_tree_array(tree, dataset.X, options)
             if !flag
                 return L(Inf)
             end
             return sum((prediction .- dataset.y) .^ 2) / dataset.n
      -  end
    • node_type::Type{N}=Node: The type of node to use for the search. For example, Node or GraphNode.

    • populations: How many populations of equations to use.

    • population_size: How many equations in each population.

    • ncycles_per_iteration: How many generations to consider per iteration.

    • tournament_selection_n: Number of expressions considered in each tournament.

    • tournament_selection_p: The fittest expression in a tournament is to be selected with probability p, the next fittest with probability p*(1-p), and so forth.

    • topn: Number of equations to return to the host process, and to consider for the hall of fame.

    • complexity_of_operators: What complexity should be assigned to each operator, and the occurrence of a constant or variable. By default, this is 1 for all operators. Can be a real number as well, in which case the complexity of an expression will be rounded to the nearest integer. Input this in the form of, e.g., [(^) => 3, sin => 2].

    • complexity_of_constants: What complexity should be assigned to use of a constant. By default, this is 1.

    • complexity_of_variables: What complexity should be assigned to each variable. By default, this is 1.

    • alpha: The probability of accepting an equation mutation during regularized evolution is given by exp(-delta_loss/(alpha * T)), where T goes from 1 to 0. Thus, alpha=infinite is the same as no annealing.

    • maxsize: Maximum size of equations during the search.

    • maxdepth: Maximum depth of equations during the search, by default this is set equal to the maxsize.

    • parsimony: A multiplicative factor for how much complexity is punished.

    • dimensional_constraint_penalty: An additive factor if the dimensional constraint is violated.

    • use_frequency: Whether to use a parsimony that adapts to the relative proportion of equations at each complexity; this will ensure that there are a balanced number of equations considered for every complexity.

    • use_frequency_in_tournament: Whether to use the adaptive parsimony described above inside the score, rather than just at the mutation accept/reject stage.

    • adaptive_parsimony_scaling: How much to scale the adaptive parsimony term in the loss. Increase this if the search is spending too much time optimizing the most complex equations.

    • turbo: Whether to use LoopVectorization.@turbo to evaluate expressions. This can be significantly faster, but is only compatible with certain operators. Experimental!

    • bumper: Whether to use Bumper.jl for faster evaluation. Experimental!

    • migration: Whether to migrate equations between processes.

    • hof_migration: Whether to migrate equations from the hall of fame to processes.

    • fraction_replaced: What fraction of each population to replace with migrated equations at the end of each cycle.

    • fraction_replaced_hof: What fraction to replace with hall of fame equations at the end of each cycle.

    • should_simplify: Whether to simplify equations. If you pass a custom objective, this will be set to false.

    • should_optimize_constants: Whether to use an optimization algorithm to periodically optimize constants in equations.

    • optimizer_algorithm: Select algorithm to use for optimizing constants. Default is Optim.BFGS(linesearch=LineSearches.BackTracking()).

    • optimizer_nrestarts: How many different random starting positions to consider for optimization of constants.

    • optimizer_probability: Probability of performing optimization of constants at the end of a given iteration.

    • optimizer_iterations: How many optimization iterations to perform. This gets passed to Optim.Options as iterations. The default is 8.

    • optimizer_f_calls_limit: How many function calls to allow during optimization. This gets passed to Optim.Options as f_calls_limit. The default is 0 which means no limit.

    • optimizer_options: General options for the constant optimization. For details we refer to the documentation on Optim.Options from the Optim.jl package. Options can be provided here as NamedTuple, e.g. (iterations=16,), as a Dict, e.g. Dict(:x_tol => 1.0e-32,), or as an Optim.Options instance.

    • output_file: What file to store equations to, as a backup.

    • perturbation_factor: When mutating a constant, either multiply or divide by (1+perturbation_factor)^(rand()+1).

    • probability_negate_constant: Probability of negating a constant in the equation when mutating it.

    • mutation_weights: Relative probabilities of the mutations. The struct MutationWeights should be passed to these options. See its documentation on MutationWeights for the different weights.

    • crossover_probability: Probability of performing crossover.

    • annealing: Whether to use simulated annealing.

    • warmup_maxsize_by: Whether to slowly increase the max size from 5 up to maxsize. If nonzero, specifies the fraction through the search at which the maxsize should be reached.

    • verbosity: Whether to print debugging statements or not.

    • print_precision: How many digits to print when printing equations. By default, this is 5.

    • save_to_file: Whether to save equations to a file during the search.

    • bin_constraints: See constraints. This is the same, but specified for binary operators only (for example, if you have an operator that is both a binary and unary operator).

    • una_constraints: Likewise, for unary operators.

    • seed: What random seed to use. nothing uses no seed.

    • progress: Whether to use a progress bar output (verbosity will have no effect).

    • early_stop_condition: Float - whether to stop early if the mean loss gets below this value. Function - a function taking (loss, complexity) as arguments and returning true or false.

    • timeout_in_seconds: Float64 - the time in seconds after which to exit (as an alternative to the number of iterations).

    • max_evals: Int (or Nothing) - the maximum number of evaluations of expressions to perform.

    • skip_mutation_failures: Whether to simply skip over mutations that fail or are rejected, rather than to replace the mutated expression with the original expression and proceed normally.

    • nested_constraints: Specifies how many times a combination of operators can be nested. For example, [sin => [cos => 0], cos => [cos => 2]] specifies that cos may never appear within a sin, but sin can be nested with itself an unlimited number of times. The second term specifies that cos can be nested up to 2 times within a cos, so that cos(cos(cos(x))) is allowed (as well as any combination of + or - within it), but cos(cos(cos(cos(x)))) is not allowed. When an operator is not specified, it is assumed that it can be nested an unlimited number of times. This requires that there is no operator which is used both in the unary operators and the binary operators (e.g., - could be both subtract, and negation). For binary operators, both arguments are treated the same way, and the max of each argument is constrained.

    • deterministic: Use a global counter for the birth time, rather than calls to time(). This gives perfect resolution, and is therefore deterministic. However, it is not thread safe, and must be used in serial mode.

    • define_helper_functions: Whether to define helper functions for constructing and evaluating trees.

    • niterations::Int=10: The number of iterations to perform the search. More iterations will improve the results.

    • parallelism=:multithreading: What parallelism mode to use. The options are :multithreading, :multiprocessing, and :serial. By default, multithreading will be used. Multithreading uses less memory, but multiprocessing can handle multi-node compute. If using :multithreading mode, the number of threads available to julia are used. If using :multiprocessing, numprocs processes will be created dynamically if procs is unset. If you have already allocated processes, pass them to the procs argument and they will be used. You may also pass a string instead of a symbol, like "multithreading".

    • numprocs::Union{Int, Nothing}=nothing: The number of processes to use, if you want equation_search to set this up automatically. By default this will be 4, but can be any number (you should pick a number <= the number of cores available).

    • procs::Union{Vector{Int}, Nothing}=nothing: If you have set up a distributed run manually with procs = addprocs() and @everywhere, pass the procs to this keyword argument.

    • addprocs_function::Union{Function, Nothing}=nothing: If using multiprocessing (parallelism=:multiprocessing), and you are not passing procs manually, then they will be allocated dynamically using addprocs. However, you may also pass a custom function to use instead of addprocs. This function should take a single positional argument, which is the number of processes to use, as well as the lazy keyword argument. For example, if set up on a slurm cluster, you could pass addprocs_function = addprocs_slurm, which will set up slurm processes.

    • heap_size_hint_in_bytes::Union{Int,Nothing}=nothing: On Julia 1.9+, you may set the --heap-size-hint flag on Julia processes, recommending garbage collection once a process is close to the recommended size. This is important for long-running distributed jobs where each process has an independent memory, and can help avoid out-of-memory errors. By default, this is set to Sys.free_memory() / numprocs.

    • runtests::Bool=true: Whether to run (quick) tests before starting the search, to see if there will be any problems during the equation search related to the host environment.

    • loss_type::Type=Nothing: If you would like to use a different type for the loss than for the data you passed, specify the type here. Note that if you pass complex data ::Complex{L}, then the loss type will automatically be set to L.

    • selection_method::Function: Function to select the expression from the Pareto frontier for use in predict. See SymbolicRegression.MLJInterfaceModule.choose_best for an example. This function should return a single integer specifying the index of the expression to use. By default, this maximizes the score (a pound-for-pound rating) of expressions reaching the threshold of 1.5x the minimum loss. To override this at prediction time, you can pass a named tuple with keys data and idx to predict. See the Operations section for details.

    • dimensions_type::AbstractDimensions: The type of dimensions to use when storing the units of the data. By default this is DynamicQuantities.SymbolicDimensions.

    Operations

    • predict(mach, Xnew): Return predictions of the target given features Xnew, which should have same scitype as X above. The expression used for prediction is defined by the selection_method function, which can be seen by viewing report(mach).best_idx.
    • predict(mach, (data=Xnew, idx=i)): Return predictions of the target given features Xnew, which should have same scitype as X above. By passing a named tuple with keys data and idx, you are able to specify the equation you wish to evaluate in idx.

    Fitted parameters

    The fields of fitted_params(mach) are:

    • best_idx::Vector{Int}: The index of the best expression in each Pareto frontier, as determined by the selection_method function. Override in predict by passing a named tuple with keys data and idx.
    • equations::Vector{Vector{Node{T}}}: The expressions discovered by the search, represented in a dominating Pareto frontier (i.e., the best expressions found for each complexity). The outer vector is indexed by target variable, and the inner vector is ordered by increasing complexity. T is equal to the element type of the passed data.
    • equation_strings::Vector{Vector{String}}: The expressions discovered by the search, represented as strings for easy inspection.

    Report

    The fields of report(mach) are:

    • best_idx::Vector{Int}: The index of the best expression in each Pareto frontier, as determined by the selection_method function. Override in predict by passing a named tuple with keys data and idx.
    • equations::Vector{Vector{Node{T}}}: The expressions discovered by the search, represented in a dominating Pareto frontier (i.e., the best expressions found for each complexity). The outer vector is indexed by target variable, and the inner vector is ordered by increasing complexity.
    • equation_strings::Vector{Vector{String}}: The expressions discovered by the search, represented as strings for easy inspection.
    • complexities::Vector{Vector{Int}}: The complexity of each expression in each Pareto frontier.
    • losses::Vector{Vector{L}}: The loss of each expression in each Pareto frontier, according to the loss function specified in the model. The type L is the loss type, which is usually the same as the element type of data passed (i.e., T), but can differ if complex data types are passed.
    • scores::Vector{Vector{L}}: A metric which considers both the complexity and loss of an expression, equal to the change in the log-loss divided by the change in complexity, relative to the previous expression along the Pareto frontier. A larger score aims to indicate an expression is more likely to be the true expression generating the data, but this is very problem-dependent and generally several other factors should be considered.

    Examples

    using MLJ
    +  end
  • populations: How many populations of equations to use.

  • population_size: How many equations in each population.

  • ncycles_per_iteration: How many generations to consider per iteration.

  • tournament_selection_n: Number of expressions considered in each tournament.

  • tournament_selection_p: The fittest expression in a tournament is to be selected with probability p, the next fittest with probability p*(1-p), and so forth.

  • topn: Number of equations to return to the host process, and to consider for the hall of fame.

  • complexity_of_operators: What complexity should be assigned to each operator, and the occurrence of a constant or variable. By default, this is 1 for all operators. Can be a real number as well, in which case the complexity of an expression will be rounded to the nearest integer. Input this in the form of, e.g., [(^) => 3, sin => 2].

  • complexity_of_constants: What complexity should be assigned to use of a constant. By default, this is 1.

  • complexity_of_variables: What complexity should be assigned to each variable. By default, this is 1.

  • alpha: The probability of accepting an equation mutation during regularized evolution is given by exp(-delta_loss/(alpha * T)), where T goes from 1 to 0. Thus, alpha=infinite is the same as no annealing.

  • maxsize: Maximum size of equations during the search.

  • maxdepth: Maximum depth of equations during the search, by default this is set equal to the maxsize.

  • parsimony: A multiplicative factor for how much complexity is punished.

  • dimensional_constraint_penalty: An additive factor if the dimensional constraint is violated.

  • use_frequency: Whether to use a parsimony that adapts to the relative proportion of equations at each complexity; this will ensure that there are a balanced number of equations considered for every complexity.

  • use_frequency_in_tournament: Whether to use the adaptive parsimony described above inside the score, rather than just at the mutation accept/reject stage.

  • adaptive_parsimony_scaling: How much to scale the adaptive parsimony term in the loss. Increase this if the search is spending too much time optimizing the most complex equations.

  • turbo: Whether to use LoopVectorization.@turbo to evaluate expressions. This can be significantly faster, but is only compatible with certain operators. Experimental!

  • migration: Whether to migrate equations between processes.

  • hof_migration: Whether to migrate equations from the hall of fame to processes.

  • fraction_replaced: What fraction of each population to replace with migrated equations at the end of each cycle.

  • fraction_replaced_hof: What fraction to replace with hall of fame equations at the end of each cycle.

  • should_simplify: Whether to simplify equations. If you pass a custom objective, this will be set to false.

  • should_optimize_constants: Whether to use an optimization algorithm to periodically optimize constants in equations.

  • optimizer_nrestarts: How many different random starting positions to consider for optimization of constants.

  • optimizer_algorithm: Select algorithm to use for optimizing constants. Default is "BFGS", but "NelderMead" is also supported.

  • optimizer_options: General options for the constant optimization. For details we refer to the documentation on Optim.Options from the Optim.jl package. Options can be provided here as NamedTuple, e.g. (iterations=16,), as a Dict, e.g. Dict(:x_tol => 1.0e-32,), or as an Optim.Options instance.

  • output_file: What file to store equations to, as a backup.

  • perturbation_factor: When mutating a constant, either multiply or divide by (1+perturbation_factor)^(rand()+1).

  • probability_negate_constant: Probability of negating a constant in the equation when mutating it.

  • mutation_weights: Relative probabilities of the mutations. The struct MutationWeights should be passed to these options. See its documentation on MutationWeights for the different weights.

  • crossover_probability: Probability of performing crossover.

  • annealing: Whether to use simulated annealing.

  • warmup_maxsize_by: Whether to slowly increase the max size from 5 up to maxsize. If nonzero, specifies the fraction through the search at which the maxsize should be reached.

  • verbosity: Whether to print debugging statements or not.

  • print_precision: How many digits to print when printing equations. By default, this is 5.

  • save_to_file: Whether to save equations to a file during the search.

  • bin_constraints: See constraints. This is the same, but specified for binary operators only (for example, if you have an operator that is both a binary and unary operator).

  • una_constraints: Likewise, for unary operators.

  • seed: What random seed to use. nothing uses no seed.

  • progress: Whether to use a progress bar output (verbosity will have no effect).

  • early_stop_condition: Float - whether to stop early if the mean loss gets below this value. Function - a function taking (loss, complexity) as arguments and returning true or false.

  • timeout_in_seconds: Float64 - the time in seconds after which to exit (as an alternative to the number of iterations).

  • max_evals: Int (or Nothing) - the maximum number of evaluations of expressions to perform.

  • skip_mutation_failures: Whether to simply skip over mutations that fail or are rejected, rather than to replace the mutated expression with the original expression and proceed normally.

  • enable_autodiff: Whether to enable automatic differentiation functionality. This is turned off by default. If turned on, this will be turned off if one of the operators does not have well-defined gradients.

  • nested_constraints: Specifies how many times a combination of operators can be nested. For example, [sin => [cos => 0], cos => [cos => 2]] specifies that cos may never appear within a sin, but sin can be nested with itself an unlimited number of times. The second term specifies that cos can be nested up to 2 times within a cos, so that cos(cos(cos(x))) is allowed (as well as any combination of + or - within it), but cos(cos(cos(cos(x)))) is not allowed. When an operator is not specified, it is assumed that it can be nested an unlimited number of times. This requires that there is no operator which is used both in the unary operators and the binary operators (e.g., - could be both subtract, and negation). For binary operators, both arguments are treated the same way, and the max of each argument is constrained.

  • deterministic: Use a global counter for the birth time, rather than calls to time(). This gives perfect resolution, and is therefore deterministic. However, it is not thread safe, and must be used in serial mode.

  • define_helper_functions: Whether to define helper functions for constructing and evaluating trees.

  • niterations::Int=10: The number of iterations to perform the search. More iterations will improve the results.

  • parallelism=:multithreading: What parallelism mode to use. The options are :multithreading, :multiprocessing, and :serial. By default, multithreading will be used. Multithreading uses less memory, but multiprocessing can handle multi-node compute. If using :multithreading mode, the number of threads available to julia are used. If using :multiprocessing, numprocs processes will be created dynamically if procs is unset. If you have already allocated processes, pass them to the procs argument and they will be used. You may also pass a string instead of a symbol, like "multithreading".

  • numprocs::Union{Int, Nothing}=nothing: The number of processes to use, if you want equation_search to set this up automatically. By default this will be 4, but can be any number (you should pick a number <= the number of cores available).

  • procs::Union{Vector{Int}, Nothing}=nothing: If you have set up a distributed run manually with procs = addprocs() and @everywhere, pass the procs to this keyword argument.

  • addprocs_function::Union{Function, Nothing}=nothing: If using multiprocessing (parallelism=:multiprocessing), and you are not passing procs manually, then they will be allocated dynamically using addprocs. However, you may also pass a custom function to use instead of addprocs. This function should take a single positional argument, which is the number of processes to use, as well as the lazy keyword argument. For example, if set up on a slurm cluster, you could pass addprocs_function = addprocs_slurm, which will set up slurm processes.

  • heap_size_hint_in_bytes::Union{Int,Nothing}=nothing: On Julia 1.9+, you may set the --heap-size-hint flag on Julia processes, recommending garbage collection once a process is close to the recommended size. This is important for long-running distributed jobs where each process has an independent memory, and can help avoid out-of-memory errors. By default, this is set to Sys.free_memory() / numprocs.

  • runtests::Bool=true: Whether to run (quick) tests before starting the search, to see if there will be any problems during the equation search related to the host environment.

  • loss_type::Type=Nothing: If you would like to use a different type for the loss than for the data you passed, specify the type here. Note that if you pass complex data ::Complex{L}, then the loss type will automatically be set to L.

  • selection_method::Function: Function to select the expression from the Pareto frontier for use in predict. See SymbolicRegression.MLJInterfaceModule.choose_best for an example. This function should return a single integer specifying the index of the expression to use. By default, this maximizes the score (a pound-for-pound rating) of expressions reaching the threshold of 1.5x the minimum loss. To override this at prediction time, you can pass a named tuple with keys data and idx to predict. See the Operations section for details.

  • dimensions_type::AbstractDimensions: The type of dimensions to use when storing the units of the data. By default this is DynamicQuantities.SymbolicDimensions.

Operations

  • predict(mach, Xnew): Return predictions of the target given features Xnew, which should have same scitype as X above. The expression used for prediction is defined by the selection_method function, which can be seen by viewing report(mach).best_idx.
  • predict(mach, (data=Xnew, idx=i)): Return predictions of the target given features Xnew, which should have same scitype as X above. By passing a named tuple with keys data and idx, you are able to specify the equation you wish to evaluate in idx.

Fitted parameters

The fields of fitted_params(mach) are:

  • best_idx::Vector{Int}: The index of the best expression in each Pareto frontier, as determined by the selection_method function. Override in predict by passing a named tuple with keys data and idx.
  • equations::Vector{Vector{Node{T}}}: The expressions discovered by the search, represented in a dominating Pareto frontier (i.e., the best expressions found for each complexity). The outer vector is indexed by target variable, and the inner vector is ordered by increasing complexity. T is equal to the element type of the passed data.
  • equation_strings::Vector{Vector{String}}: The expressions discovered by the search, represented as strings for easy inspection.

Report

The fields of report(mach) are:

  • best_idx::Vector{Int}: The index of the best expression in each Pareto frontier, as determined by the selection_method function. Override in predict by passing a named tuple with keys data and idx.
  • equations::Vector{Vector{Node{T}}}: The expressions discovered by the search, represented in a dominating Pareto frontier (i.e., the best expressions found for each complexity). The outer vector is indexed by target variable, and the inner vector is ordered by increasing complexity.
  • equation_strings::Vector{Vector{String}}: The expressions discovered by the search, represented as strings for easy inspection.
  • complexities::Vector{Vector{Int}}: The complexity of each expression in each Pareto frontier.
  • losses::Vector{Vector{L}}: The loss of each expression in each Pareto frontier, according to the loss function specified in the model. The type L is the loss type, which is usually the same as the element type of data passed (i.e., T), but can differ if complex data types are passed.
  • scores::Vector{Vector{L}}: A metric which considers both the complexity and loss of an expression, equal to the change in the log-loss divided by the change in complexity, relative to the previous expression along the Pareto frontier. A larger score aims to indicate an expression is more likely to be the true expression generating the data, but this is very problem-dependent and generally several other factors should be considered.

Examples

using MLJ
 MultitargetSRRegressor = @load MultitargetSRRegressor pkg=SymbolicRegression
 X = (a=rand(100), b=rand(100), c=rand(100))
 Y = (y1=(@. cos(X.c) * 2.1 - 0.9), y2=(@. X.a * X.b + X.c))
@@ -17,4 +17,4 @@
 r = report(mach)
 for (output_index, (eq, i)) in enumerate(zip(r.equation_strings, r.best_idx))
     println("Equation used for ", output_index, ": ", eq[i])
-end

See also SRRegressor.

+end

See also SRRegressor.

diff --git a/dev/models/NeuralNetworkClassifier_BetaML/index.html b/dev/models/NeuralNetworkClassifier_BetaML/index.html index 36e7db51d..2bdbf1d2e 100644 --- a/dev/models/NeuralNetworkClassifier_BetaML/index.html +++ b/dev/models/NeuralNetworkClassifier_BetaML/index.html @@ -1,5 +1,5 @@ -NeuralNetworkClassifier · MLJ

NeuralNetworkClassifier

mutable struct NeuralNetworkClassifier <: MLJModelInterface.Probabilistic

A simple but flexible Feedforward Neural Network, from the Beta Machine Learning Toolkit (BetaML) for classification problems.

Parameters:

  • layers: Array of layer objects [def: nothing, i.e. basic network]. See subtypes(BetaML.AbstractLayer) for supported layers. The last "softmax" layer is automatically added.

  • loss: Loss (cost) function [def: BetaML.crossentropy]. Should always assume y and ŷ as matrices.

    Warning

    If you change the parameter loss, you need to either provide its derivative on the parameter dloss or use autodiff with dloss=nothing.

  • dloss: Derivative of the loss function [def: BetaML.dcrossentropy, i.e. the derivative of the cross-entropy]. Use nothing for autodiff.

  • epochs: Number of epochs, i.e. passes through the whole training sample [def: 200]

  • batch_size: Size of each individual batch [def: 16]

  • opt_alg: The optimisation algorithm to update the gradient at each batch [def: BetaML.ADAM()]. See subtypes(BetaML.OptimisationAlgorithm) for supported optimizers

  • shuffle: Whether to randomly shuffle the data at each iteration (epoch) [def: true]

  • descr: An optional title and/or description for this model

  • cb: A callback function to provide information during training [def: BetaML.fitting_info]

  • categories: The categories to represent as columns. [def: nothing, i.e. unique training values].

  • handle_unknown: How to handle categories not seen in training or not present in the provided categories array? "error" (default) raises an error, "infrequent" adds a specific column for these categories (a construction sketch using this option is given before the example below).

  • other_categories_name: Which value during prediction to assign to this "other" category (i.e. categories not seen in training or not present in the provided categories array)? [def: nothing, i.e. typemax(Int64) for integer vectors and "other" for other types]. This setting is active only if handle_unknown="infrequent" and in that case it MUST be specified if Y is neither integer nor string.

  • rng: Random Number Generator [default: Random.GLOBAL_RNG]

Notes:

  • data must be numerical
  • the label should be an n-records by n-dimensions matrix (e.g. one-hot-encoded data for classification), where the output columns should be interpreted as the probabilities for each category.
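
A minimal construction sketch using the "infrequent" strategy for unseen categories (the values shown are illustrative; all other hyper-parameters keep the defaults listed above):

using MLJ
import BetaML

NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=BetaML

model = NeuralNetworkClassifier(
    epochs                = 100,
    handle_unknown        = "infrequent",
    other_categories_name = "other",   ## the label assigned to unseen categories at prediction time
)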

Example:

julia> using MLJ
+NeuralNetworkClassifier · MLJ

NeuralNetworkClassifier

mutable struct NeuralNetworkClassifier <: MLJModelInterface.Probabilistic

A simple but flexible Feedforward Neural Network, from the Beta Machine Learning Toolkit (BetaML) for classification problems.

Parameters:

  • layers: Array of layer objects [def: nothing, i.e. basic network]. See subtypes(BetaML.AbstractLayer) for supported layers. The last "softmax" layer is automatically added.

  • loss: Loss (cost) function [def: BetaML.crossentropy]. Should always assume y and ŷ as matrices.

    Warning

    If you change the parameter loss, you need to either provide its derivative on the parameter dloss or use autodiff with dloss=nothing.

  • dloss: Derivative of the loss function [def: BetaML.dcrossentropy, i.e. the derivative of the cross-entropy]. Use nothing for autodiff.

  • epochs: Number of epochs, i.e. passes through the whole training sample [def: 200]

  • batch_size: Size of each individual batch [def: 16]

  • opt_alg: The optimisation algorithm to update the gradient at each batch [def: BetaML.ADAM()]. See subtypes(BetaML.OptimisationAlgorithm) for supported optimizers

  • shuffle: Whether to randomly shuffle the data at each iteration (epoch) [def: true]

  • descr: An optional title and/or description for this model

  • cb: A callback function to provide information during training [def: BetaML.fitting_info]

  • categories: The categories to represent as columns. [def: nothing, i.e. unique training values].

  • handle_unknown: How to handle categories not seen in training or not present in the provided categories array. "error" (default) raises an error, "infrequent" adds a specific column for these categories.

  • other_categories_name: Which value to assign during prediction to this "other" category (i.e. categories not seen in training or not present in the provided categories array) [def: nothing, i.e. typemax(Int64) for integer vectors and "other" for other types]. This setting is active only if handle_unknown="infrequent", in which case it MUST be specified if Y is neither integer nor string.

  • rng: Random Number Generator [default: Random.GLOBAL_RNG]

Notes:

  • data must be numerical
  • the label should be an n-records by n-dimensions matrix (e.g. one-hot-encoded data for classification), where the output columns should be interpreted as the probabilities of the respective categories.
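
A plain-Julia sketch of that label format (no package API assumed; the values are illustrative):

labels = ["no", "yes", "no", "maybe"]
cats   = unique(labels)
## n-records × n-categories matrix; each row reads as per-category probabilities
Y = [labels[i] == cats[j] ? 1.0 : 0.0 for i in eachindex(labels), j in eachindex(cats)]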

Example:

julia> using MLJ
 
 julia> X, y        = @load_iris;
 
@@ -34,4 +34,4 @@
  UnivariateFinite{Multiclass{3}}(setosa=>0.573, versicolor=>0.213, virginica=>0.213)
  ⋮
  UnivariateFinite{Multiclass{3}}(setosa=>0.236, versicolor=>0.236, virginica=>0.529)
- UnivariateFinite{Multiclass{3}}(setosa=>0.254, versicolor=>0.254, virginica=>0.492)
+ UnivariateFinite{Multiclass{3}}(setosa=>0.254, versicolor=>0.254, virginica=>0.492)
diff --git a/dev/models/NeuralNetworkClassifier_MLJFlux/index.html b/dev/models/NeuralNetworkClassifier_MLJFlux/index.html index 334ea3e84..8a52ab876 100644 --- a/dev/models/NeuralNetworkClassifier_MLJFlux/index.html +++ b/dev/models/NeuralNetworkClassifier_MLJFlux/index.html @@ -1,5 +1,5 @@ -NeuralNetworkClassifier · MLJ

NeuralNetworkClassifier

NeuralNetworkClassifier

A model type for constructing a neural network classifier, based on MLJFlux.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux

Do model = NeuralNetworkClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in NeuralNetworkClassifier(builder=...).

NeuralNetworkClassifier is for training a data-dependent Flux.jl neural network for making probabilistic predictions of a Multiclass or OrderedFactor target, given a table of Continuous features. Users provide a recipe for constructing the network, based on properties of the data that is encountered, by specifying an appropriate builder. See MLJFlux documentation for more on builders.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is either a Matrix or any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). If X is a Matrix, it is assumed to have columns corresponding to features and rows corresponding to observations.
  • y is the target, which can be any AbstractVector whose element scitype is Multiclass or OrderedFactor; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyper-parameters

  • builder=MLJFlux.Short(): An MLJFlux builder that constructs a neural network. Possible builders include: MLJFlux.Linear, MLJFlux.Short, and MLJFlux.MLP. See MLJFlux.jl documentation for examples of user-defined builders. See also finaliser below.

  • optimiser::Flux.Adam(): A Flux.Optimise optimiser. The optimiser performs the updating of the weights of the network. For further reference, see the Flux optimiser documentation. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.

  • loss=Flux.crossentropy: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. For a classification task, the most natural loss functions are:

    • Flux.crossentropy: Standard multiclass classification loss, also known as the log loss.
    • Flux.logitcrossentropy: Mathematically equivalent to crossentropy, but numerically more stable than finalising the outputs with softmax and then computing crossentropy. You will need to specify finaliser=identity to remove MLJFlux's default softmax finaliser, and understand that the output of predict is then unnormalized (no longer probabilistic).
    • Flux.tversky_loss: Used with imbalanced data to give more weight to false negatives.
    • Flux.focal_loss: Used with highly imbalanced data. Weights harder examples more than easier examples.

    Currently MLJ measures are not supported values of loss.

  • epochs::Int=10: The duration of training, in epochs. Typically, one epoch represents one pass through the complete training dataset.

  • batch_size::Int=1: the batch size to be used for training, representing the number of samples per update of the network weights. Typically, batch size is between 8 and 512. Increasing batch size may accelerate training if acceleration=CUDALibs() and a GPU is available.

  • lambda::Float64=0: The strength of the weight regularization penalty. Can be any value in the range [0, ∞).

  • alpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.

  • rng::Union{AbstractRNG, Int64}: The random number generator or seed used during training.

  • optimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.

  • acceleration::AbstractResource=CPU1(): Defines on what hardware training is done. For training on a GPU, use CUDALibs().

  • finaliser=Flux.softmax: The final activation function of the neural network (applied after the network defined by builder). Defaults to Flux.softmax.

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above. Predictions are probabilistic but uncalibrated.
  • predict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • chain: The trained "chain" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network. This includes the final layer specified by finaliser (eg, softmax).

Report

The fields of report(mach) are:

  • training_losses: A vector of training losses (penalised if lambda != 0) in historical order, of length epochs + 1. The first element is the pre-training loss.

Examples

In this example we build a classification model using the Iris dataset. This is a very basic example, using a default builder and no standardization. For a more advanced illustration, see NeuralNetworkRegressor or ImageClassifier, and examples in the MLJFlux.jl documentation.

using MLJ
+NeuralNetworkClassifier · MLJ

NeuralNetworkClassifier

NeuralNetworkClassifier

A model type for constructing a neural network classifier, based on MLJFlux.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux

Do model = NeuralNetworkClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in NeuralNetworkClassifier(builder=...).

NeuralNetworkClassifier is for training a data-dependent Flux.jl neural network for making probabilistic predictions of a Multiclass or OrderedFactor target, given a table of Continuous features. Users provide a recipe for constructing the network, based on properties of the data that is encountered, by specifying an appropriate builder. See MLJFlux documentation for more on builders.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is either a Matrix or any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). If X is a Matrix, it is assumed to have columns corresponding to features and rows corresponding to observations.
  • y is the target, which can be any AbstractVector whose element scitype is Multiclass or OrderedFactor; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyper-parameters

  • builder=MLJFlux.Short(): An MLJFlux builder that constructs a neural network. Possible builders include: MLJFlux.Linear, MLJFlux.Short, and MLJFlux.MLP. See MLJFlux.jl documentation for examples of user-defined builders. See also finaliser below.

  • optimiser::Flux.Adam(): A Flux.Optimise optimiser. The optimiser performs the updating of the weights of the network. For further reference, see the Flux optimiser documentation. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.

  • loss=Flux.crossentropy: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. For a classification task, the most natural loss functions are:

    • Flux.crossentropy: Standard multiclass classification loss, also known as the log loss.
    • Flux.logitcrossentropy: Mathematically equivalent to crossentropy, but numerically more stable than finalising the outputs with softmax and then computing crossentropy. You will need to specify finaliser=identity to remove MLJFlux's default softmax finaliser, and understand that the output of predict is then unnormalized (no longer probabilistic).
    • Flux.tversky_loss: Used with imbalanced data to give more weight to false negatives.
    • Flux.focal_loss: Used with highly imbalanced data. Weights harder examples more than easier examples.

    Currently MLJ measures are not supported values of loss.

  • epochs::Int=10: The duration of training, in epochs. Typically, one epoch represents one pass through the complete training dataset.

  • batch_size::Int=1: the batch size to be used for training, representing the number of samples per update of the network weights. Typically, batch size is between 8 and 512. Increasing batch size may accelerate training if acceleration=CUDALibs() and a GPU is available.

  • lambda::Float64=0: The strength of the weight regularization penalty. Can be any value in the range [0, ∞).

  • alpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.

  • rng::Union{AbstractRNG, Int64}: The random number generator or seed used during training.

  • optimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.

  • acceleration::AbstractResource=CPU1(): Defines on what hardware training is done. For training on a GPU, use CUDALibs().

  • finaliser=Flux.softmax: The final activation function of the neural network (applied after the network defined by builder). Defaults to Flux.softmax.

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above. Predictions are probabilistic but uncalibrated.
  • predict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.
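
For example (a sketch only; mach is a machine trained as in the iris example below, and Xnew a table of new observations, with the class label purely illustrative):

yhat = predict(mach, Xnew)       ## vector of UnivariateFinite distributions
pdf.(yhat, "virginica")          ## probability assigned to one class, per observation
predict_mode(mach, Xnew)         ## most probable class, per observation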

Fitted parameters

The fields of fitted_params(mach) are:

  • chain: The trained "chain" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network. This includes the final layer specified by finaliser (eg, softmax).

Report

The fields of report(mach) are:

  • training_losses: A vector of training losses (penalised if lambda != 0) in historical order, of length epochs + 1. The first element is the pre-training loss.
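
For instance, after training one might inspect these losses as follows (sketch only; mach is a trained machine):

losses = report(mach).training_losses
length(losses) == mach.model.epochs + 1   ## true; the first entry is the pre-training loss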

Examples

In this example we build a classification model using the Iris dataset. This is a very basic example, using a default builder and no standardization. For a more advanced illustration, see NeuralNetworkRegressor or ImageClassifier, and examples in the MLJFlux.jl documentation.

using MLJ
 using Flux
 import RDatasets

First, we can load the data:

iris = RDatasets.dataset("datasets", "iris");
 y, X = unpack(iris, ==(:Species), rng=123); ## a vector and a table
@@ -19,4 +19,4 @@
      xlab=curve.parameter_name,
      xscale=curve.parameter_scale,
      ylab = "Cross Entropy")
-

See also ImageClassifier.

+

See also ImageClassifier.

diff --git a/dev/models/NeuralNetworkRegressor_BetaML/index.html b/dev/models/NeuralNetworkRegressor_BetaML/index.html index 91db61bf9..91e8e395e 100644 --- a/dev/models/NeuralNetworkRegressor_BetaML/index.html +++ b/dev/models/NeuralNetworkRegressor_BetaML/index.html @@ -1,5 +1,5 @@ -NeuralNetworkRegressor · MLJ

NeuralNetworkRegressor

mutable struct NeuralNetworkRegressor <: MLJModelInterface.Deterministic

A simple but flexible Feedforward Neural Network, from the Beta Machine Learning Toolkit (BetaML) for regression of a single dimensional target.

Parameters:

  • layers: Array of layer objects [def: nothing, i.e. basic network]. See subtypes(BetaML.AbstractLayer) for supported layers

  • loss: Loss (cost) function [def: BetaML.squared_cost]. Should always assume y and ŷ as matrices, even if the regression task is 1-D

    Warning

    If you change the parameter loss, you need to either provide its derivative on the parameter dloss or use autodiff with dloss=nothing.

  • dloss: Derivative of the loss function [def: BetaML.dsquared_cost, i.e. use the derivative of the squared cost]. Use nothing for autodiff.

  • epochs: Number of epochs, i.e. passes through the whole training sample [def: 200]

  • batch_size: Size of each individual batch [def: 16]

  • opt_alg: The optimisation algorithm to update the gradient at each batch [def: BetaML.ADAM()]. See subtypes(BetaML.OptimisationAlgorithm) for supported optimizers

  • shuffle: Whether to randomly shuffle the data at each iteration (epoch) [def: true]

  • descr: An optional title and/or description for this model

  • cb: A callback function to provide information during training [def: fitting_info]

  • rng: Random Number Generator (see FIXEDSEED) [default: Random.GLOBAL_RNG]

Notes:

  • data must be numerical
  • the label should be an n-records vector.

Example:

julia> using MLJ
+NeuralNetworkRegressor · MLJ

NeuralNetworkRegressor

mutable struct NeuralNetworkRegressor <: MLJModelInterface.Deterministic

A simple but flexible Feedforward Neural Network, from the Beta Machine Learning Toolkit (BetaML) for regression of a single dimensional target.

Parameters:

  • layers: Array of layer objects [def: nothing, i.e. basic network]. See subtypes(BetaML.AbstractLayer) for supported layers

  • loss: Loss (cost) function [def: BetaML.squared_cost]. Should always assume y and ŷ as matrices, even if the regression task is 1-D

    Warning

    If you change the parameter loss, you need to either provide its derivative on the parameter dloss or use autodiff with dloss=nothing.

  • dloss: Derivative of the loss function [def: BetaML.dsquared_cost, i.e. use the derivative of the squared cost]. Use nothing for autodiff.

  • epochs: Number of epochs, i.e. passes through the whole training sample [def: 200]

  • batch_size: Size of each individual batch [def: 16]

  • opt_alg: The optimisation algorithm to update the gradient at each batch [def: BetaML.ADAM()]. See subtypes(BetaML.OptimisationAlgorithm) for supported optimizers

  • shuffle: Whether to randomly shuffle the data at each iteration (epoch) [def: true]

  • descr: An optional title and/or description for this model

  • cb: A callback function to provide information during training [def: fitting_info]

  • rng: Random Number Generator (see FIXEDSEED) [default: Random.GLOBAL_RNG]

Notes:

  • data must be numerical
  • the label should be an n-records vector.
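
As the loss and dloss entries above indicate, a custom loss can be paired with autodiff for its derivative. A hedged sketch (myloss is illustrative only; NeuralNetworkRegressor assumed loaded as in the example below):

myloss(y, ŷ) = sum(abs.(y .- ŷ)) / size(y, 1)   ## both arguments arrive as matrices
model = NeuralNetworkRegressor(loss = myloss, dloss = nothing, epochs = 100)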

Example:

julia> using MLJ
 
 julia> X, y        = @load_boston;
 
@@ -35,4 +35,4 @@
   ⋮    
  23.9  30.9032
  22.0  29.49
- 11.9  27.2438
+ 11.9 27.2438
diff --git a/dev/models/NeuralNetworkRegressor_MLJFlux/index.html b/dev/models/NeuralNetworkRegressor_MLJFlux/index.html index 17b756c63..6e6be21a5 100644 --- a/dev/models/NeuralNetworkRegressor_MLJFlux/index.html +++ b/dev/models/NeuralNetworkRegressor_MLJFlux/index.html @@ -1,5 +1,5 @@ -NeuralNetworkRegressor · MLJ

NeuralNetworkRegressor

NeuralNetworkRegressor

A model type for constructing a neural network regressor, based on MLJFlux.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

NeuralNetworkRegressor = @load NeuralNetworkRegressor pkg=MLJFlux

Do model = NeuralNetworkRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in NeuralNetworkRegressor(builder=...).

NeuralNetworkRegressor is for training a data-dependent Flux.jl neural network to predict a Continuous target, given a table of Continuous features. Users provide a recipe for constructing the network, based on properties of the data that is encountered, by specifying an appropriate builder. See MLJFlux documentation for more on builders.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is either a Matrix or any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). If X is a Matrix, it is assumed to have columns corresponding to features and rows corresponding to observations.
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyper-parameters

  • builder=MLJFlux.Linear(σ=Flux.relu): An MLJFlux builder that constructs a neural network. Possible builders include: MLJFlux.Linear, MLJFlux.Short, and MLJFlux.MLP. See MLJFlux documentation for more on builders, and the example below for using the @builder convenience macro.

  • optimiser::Flux.Adam(): A Flux.Optimise optimiser. The optimiser performs the updating of the weights of the network. For further reference, see the Flux optimiser documentation. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.

  • loss=Flux.mse: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. For a regression task, natural loss functions are:

    • Flux.mse
    • Flux.mae
    • Flux.msle
    • Flux.huber_loss

    Currently MLJ measures are not supported as loss functions here.

  • epochs::Int=10: The duration of training, in epochs. Typically, one epoch represents one pass through the complete training dataset.

  • batch_size::Int=1: the batch size to be used for training, representing the number of samples per update of the network weights. Typically, batch size is between 8 and 512. Increasing batch size may accelerate training if acceleration=CUDALibs() and a GPU is available.

  • lambda::Float64=0: The strength of the weight regularization penalty. Can be any value in the range [0, ∞).

  • alpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.

  • rng::Union{AbstractRNG, Int64}: The random number generator or seed used during training.

  • optimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.

  • acceleration::AbstractResource=CPU1(): Defines on what hardware training is done. For training on a GPU, use CUDALibs().

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • chain: The trained "chain" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network.

Report

The fields of report(mach) are:

  • training_losses: A vector of training losses (penalized if lambda != 0) in historical order, of length epochs + 1. The first element is the pre-training loss.

Examples

In this example we build a regression model for the Boston house price dataset.

using MLJ
+NeuralNetworkRegressor · MLJ

NeuralNetworkRegressor

NeuralNetworkRegressor

A model type for constructing a neural network regressor, based on MLJFlux.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

NeuralNetworkRegressor = @load NeuralNetworkRegressor pkg=MLJFlux

Do model = NeuralNetworkRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in NeuralNetworkRegressor(builder=...).

NeuralNetworkRegressor is for training a data-dependent Flux.jl neural network to predict a Continuous target, given a table of Continuous features. Users provide a recipe for constructing the network, based on properties of the data that is encountered, by specifying an appropriate builder. See MLJFlux documentation for more on builders.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is either a Matrix or any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). If X is a Matrix, it is assumed to have columns corresponding to features and rows corresponding to observations.
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).
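
For example (a sketch only; X and y are as described above, the builder and data sizes are illustrative, and MLJFlux has been imported as in the example below):

import MLJFlux
model = NeuralNetworkRegressor(builder = MLJFlux.MLP(hidden = (16, 16)), epochs = 20)
mach  = machine(model, X, y)
fit!(mach, rows = 1:100)   ## train on the first 100 rows only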

Hyper-parameters

  • builder=MLJFlux.Linear(σ=Flux.relu): An MLJFlux builder that constructs a neural network. Possible builders include: MLJFlux.Linear, MLJFlux.Short, and MLJFlux.MLP. See MLJFlux documentation for more on builders, and the example below for using the @builder convenience macro.

  • optimiser::Flux.Adam(): A Flux.Optimise optimiser. The optimiser performs the updating of the weights of the network. For further reference, see the Flux optimiser documentation. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.

  • loss=Flux.mse: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. For a regression task, natural loss functions are:

    • Flux.mse
    • Flux.mae
    • Flux.msle
    • Flux.huber_loss

    Currently MLJ measures are not supported as loss functions here.

  • epochs::Int=10: The duration of training, in epochs. Typically, one epoch represents one pass through the complete training dataset.

  • batch_size::Int=1: the batch size to be used for training, representing the number of samples per update of the network weights. Typically, batch size is between 8 and 512. Increasing batch size may accelerate training if acceleration=CUDALibs() and a GPU is available.

  • lambda::Float64=0: The strength of the weight regularization penalty. Can be any value in the range [0, ∞).

  • alpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.

  • rng::Union{AbstractRNG, Int64}: The random number generator or seed used during training.

  • optimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.

  • acceleration::AbstractResource=CPU1(): Defines on what hardware training is done. For training on a GPU, use CUDALibs().

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • chain: The trained "chain" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network.

Report

The fields of report(mach) are:

  • training_losses: A vector of training losses (penalized if lambda != 0) in historical order, of length epochs + 1. The first element is the pre-training loss.

Examples

In this example we build a regression model for the Boston house price dataset.

using MLJ
 import MLJFlux
 using Flux

First, we load in the data: The :MEDV column becomes the target vector y, and all remaining columns go into a table X, with the exception of :CHAS:

data = OpenML.load(531); ## Loads from https://www.openml.org/d/531
 y, X = unpack(data, ==(:MEDV), !=(:CHAS); rng=123);
@@ -42,4 +42,4 @@
 ## loss for `(Xtest, test)`:
 fit!(mach) ## train on `(X, y)`
 yhat = predict(mach, Xtest)
-l2(yhat, ytest)  |> mean

These losses, for the pipeline model, refer to the target on the original, unstandardized, scale.

For implementing stopping criterion and other iteration controls, refer to examples linked from the MLJFlux documentation.

See also MultitargetNeuralNetworkRegressor

+l2(yhat, ytest) |> mean

These losses, for the pipeline model, refer to the target on the original, unstandardized, scale.

For implementing stopping criterion and other iteration controls, refer to examples linked from the MLJFlux documentation.

See also MultitargetNeuralNetworkRegressor

diff --git a/dev/models/NuSVC_LIBSVM/index.html b/dev/models/NuSVC_LIBSVM/index.html index 9c50c2a83..dc188db60 100644 --- a/dev/models/NuSVC_LIBSVM/index.html +++ b/dev/models/NuSVC_LIBSVM/index.html @@ -1,5 +1,5 @@ -NuSVC · MLJ

NuSVC

NuSVC

A model type for constructing a ν-support vector classifier, based on LIBSVM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

NuSVC = @load NuSVC pkg=LIBSVM

Do model = NuSVC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in NuSVC(kernel=...).

This model is a re-parameterization of the SVC classifier, where nu replaces cost, and is mathematically equivalent to it. The parameter nu allows more direct control over the number of support vectors (see under "Hyper-parameters").

This model always predicts actual class labels. For probabilistic predictions, use instead ProbabilisticNuSVC.

Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): "LIBSVM: a library for support vector machines." ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf.

Training data

In MLJ or MLJBase, bind an instance model to data with:

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see "Examples" below).

    • LIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2
    • LIBSVM.Kernel.Polynomial: (x1, x2) -> (gamma*x1'*x2 + coef0)^degree
    • LIBSVM.Kernel.RadialBasis: (x1, x2) -> exp(-gamma*norm(x1 - x2)^2)
    • LIBSVM.Kernel.Sigmoid: (x1, x2) -> tanh(gamma*x1'*x2 + coef0)

    Here gamma, coef0, and degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions. See LIBSVM.jl issue 91.

  • gamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. Actual value used appears in the report (see below).

  • coef0 = 0.0: kernel parameter (see above)

  • degree::Int32 = Int32(3): degree in polynomial kernel (see above)

  • nu=0.5 (range (0, 1]): An upper bound on the fraction of margin errors and a lower bound of the fraction of support vectors. Denoted ν in the cited paper. Changing nu changes the thickness of the margin (a neighborhood of the decision surface) and a margin error is said to have occurred if a training observation lies on the wrong side of the surface or within the margin.

  • cachesize=200.0: cache memory size in MB

  • tolerance=0.001: tolerance for the stopping criterion

  • shrinking=true: whether to use shrinking heuristics

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • libsvm_model: the trained model object created by the LIBSVM.jl package
  • encoding: class encoding used internally by libsvm_model - a dictionary of class labels keyed on the internal integer representation

Report

The fields of report(mach) are:

  • gamma: actual value of the kernel parameter gamma used in training

Examples

Using a built-in kernel

using MLJ
+NuSVC · MLJ

NuSVC

NuSVC

A model type for constructing a ν-support vector classifier, based on LIBSVM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

NuSVC = @load NuSVC pkg=LIBSVM

Do model = NuSVC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in NuSVC(kernel=...).

This model is a re-parameterization of the SVC classifier, where nu replaces cost, and is mathematically equivalent to it. The parameter nu allows more direct control over the number of support vectors (see under "Hyper-parameters").

This model always predicts actual class labels. For probabilistic predictions, use instead ProbabilisticNuSVC.

Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): "LIBSVM: a library for support vector machines." ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf.

Training data

In MLJ or MLJBase, bind an instance model to data with:

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see "Examples" below).

    • LIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2
    • LIBSVM.Kernel.Polynomial: (x1, x2) -> (gamma*x1'*x2 + coef0)^degree
    • LIBSVM.Kernel.RadialBasis: (x1, x2) -> exp(-gamma*norm(x1 - x2)^2)
    • LIBSVM.Kernel.Sigmoid: (x1, x2) -> tanh(gamma*x1'*x2 + coef0)

    Here gamma, coef0, and degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions. See LIBSVM.jl issue 91.

  • gamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. Actual value used appears in the report (see below).

  • coef0 = 0.0: kernel parameter (see above)

  • degree::Int32 = Int32(3): degree in polynomial kernel (see above)

  • nu=0.5 (range (0, 1]): An upper bound on the fraction of margin errors and a lower bound of the fraction of support vectors. Denoted ν in the cited paper. Changing nu changes the thickness of the margin (a neighborhood of the decision surface) and a margin error is said to have occurred if a training observation lies on the wrong side of the surface or within the margin.

  • cachesize=200.0: cache memory size in MB

  • tolerance=0.001: tolerance for the stopping criterion

  • shrinking=true: whether to use shrinking heuristics
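
As noted for the kernel hyper-parameter at the top of this list, any callable accepting two feature vectors can serve as a kernel. A hedged sketch (my_kernel is illustrative only; NuSVC assumed loaded as above):

my_kernel(x1, x2) = exp(-0.5 * sum(abs2, x1 .- x2))   ## a hand-rolled RBF-style kernel
model = NuSVC(kernel = my_kernel, nu = 0.5)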

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • libsvm_model: the trained model object created by the LIBSVM.jl package
  • encoding: class encoding used internally by libsvm_model - a dictionary of class labels keyed on the internal integer representation

Report

The fields of report(mach) are:

  • gamma: actual value of the kernel parameter gamma used in training

Examples

Using a built-in kernel

using MLJ
 import LIBSVM
 
 NuSVC = @load NuSVC pkg=LIBSVM                 ## model type
@@ -25,4 +25,4 @@
 3-element CategoricalArrays.CategoricalArray{String,1,UInt32}:
  "virginica"
  "virginica"
- "virginica"

See also the classifiers SVC and LinearSVC, LIBSVM.jl, and the original C implementation documentation.

+ "virginica"

See also the classifiers SVC and LinearSVC, LIBSVM.jl, and the original C implementation documentation.

diff --git a/dev/models/NuSVR_LIBSVM/index.html b/dev/models/NuSVR_LIBSVM/index.html index ba851c420..2fad5a6f1 100644 --- a/dev/models/NuSVR_LIBSVM/index.html +++ b/dev/models/NuSVR_LIBSVM/index.html @@ -1,5 +1,5 @@ -NuSVR · MLJ

NuSVR

NuSVR

A model type for constructing a ν-support vector regressor, based on LIBSVM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

NuSVR = @load NuSVR pkg=LIBSVM

Do model = NuSVR() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in NuSVR(kernel=...).

Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): "LIBSVM: a library for support vector machines." ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf.

This model is a re-parameterization of EpsilonSVR in which the epsilon hyper-parameter is replaced with a new parameter nu (denoted $ν$ in the cited reference) which attempts to control the number of support vectors directly.

Training data

In MLJ or MLJBase, bind an instance model to data with:

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see "Examples" below).

    • LIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2
    • LIBSVM.Kernel.Polynomial: (x1, x2) -> (gamma*x1'*x2 + coef0)^degree
    • LIBSVM.Kernel.RadialBasis: (x1, x2) -> exp(-gamma*norm(x1 - x2)^2)
    • LIBSVM.Kernel.Sigmoid: (x1, x2) -> tanh(gamma*x1'*x2 + coef0)

    Here gamma, coef0, and degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions. See LIBSVM.jl issue 91.

  • gamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. Actual value used appears in the report (see below).

  • coef0 = 0.0: kernel parameter (see above)

  • degree::Int32 = Int32(3): degree in polynomial kernel (see above)

  • cost=1.0 (range (0, Inf)): the parameter denoted $C$ in the cited reference; for greater regularization, decrease cost

  • nu=0.5 (range (0, 1]): An upper bound on the fraction of training errors and a lower bound of the fraction of support vectors. Denoted $ν$ in the cited paper. Changing nu changes the thickness of some neighborhood of the graph of the prediction function ("tube" or "slab") and a training error is said to occur when a data point (x, y) lies outside of that neighborhood.

  • cachesize=200.0: cache memory size in MB

  • tolerance=0.001: tolerance for the stopping criterion

  • shrinking=true: whether to use shrinking heuristics

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • libsvm_model: the trained model object created by the LIBSVM.jl package

Report

The fields of report(mach) are:

  • gamma: actual value of the kernel parameter gamma used in training

Examples

Using a built-in kernel

using MLJ
+NuSVR · MLJ

NuSVR

NuSVR

A model type for constructing a ν-support vector regressor, based on LIBSVM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

NuSVR = @load NuSVR pkg=LIBSVM

Do model = NuSVR() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in NuSVR(kernel=...).

Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): "LIBSVM: a library for support vector machines." ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf.

This model is a re-parameterization of EpsilonSVR in which the epsilon hyper-parameter is replaced with a new parameter nu (denoted $ν$ in the cited reference) which attempts to control the number of support vectors directly.

Training data

In MLJ or MLJBase, bind an instance model to data with:

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see "Examples" below).

    • LIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2
    • LIBSVM.Kernel.Polynomial: (x1, x2) -> (gamma*x1'*x2 + coef0)^degree
    • LIBSVM.Kernel.RadialBasis: (x1, x2) -> exp(-gamma*norm(x1 - x2)^2)
    • LIBSVM.Kernel.Sigmoid: (x1, x2) -> tanh(gamma*x1'*x2 + coef0)

    Here gamma, coef0, and degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions. See LIBSVM.jl issue 91.

  • gamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. Actual value used appears in the report (see below).

  • coef0 = 0.0: kernel parameter (see above)

  • degree::Int32 = Int32(3): degree in polynomial kernel (see above)

  • cost=1.0 (range (0, Inf)): the parameter denoted $C$ in the cited reference; for greater regularization, decrease cost

  • nu=0.5 (range (0, 1]): An upper bound on the fraction of training errors and a lower bound of the fraction of support vectors. Denoted $ν$ in the cited paper. Changing nu changes the thickness of some neighborhood of the graph of the prediction function ("tube" or "slab") and a training error is said to occur when a data point (x, y) lies outside of that neighborhood.

  • cachesize=200.0: cache memory size in MB

  • tolerance=0.001: tolerance for the stopping criterion

  • shrinking=true: whether to use shrinking heuristics
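
The default gamma heuristic described above can be reproduced by hand as follows (sketch only; X is the training table):

import Tables
using Statistics
Xmat  = Tables.matrix(X)
gamma = 1 / (var(Xmat) * size(Xmat, 2))   ## the value used when the gamma field is 0.0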

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • libsvm_model: the trained model object created by the LIBSVM.jl package

Report

The fields of report(mach) are:

  • gamma: actual value of the kernel parameter gamma used in training

Examples

Using a built-in kernel

using MLJ
 import LIBSVM
 
 NuSVR = @load NuSVR pkg=LIBSVM                 ## model type
@@ -22,4 +22,4 @@
 3-element Vector{Float64}:
   1.1211558175964662
   0.06677125944808422
- -0.6817578942749346

See also EpsilonSVR, LIBSVM.jl, and the original C implementation documentation.

+ -0.6817578942749346

See also EpsilonSVR, LIBSVM.jl, and the original C implementation documentation.

diff --git a/dev/models/OCSVMDetector_OutlierDetectionPython/index.html b/dev/models/OCSVMDetector_OutlierDetectionPython/index.html index 08e7915ff..8c39a9822 100644 --- a/dev/models/OCSVMDetector_OutlierDetectionPython/index.html +++ b/dev/models/OCSVMDetector_OutlierDetectionPython/index.html @@ -1,5 +1,5 @@ -OCSVMDetector · MLJ

OCSVMDetector

OCSVMDetector(kernel = "rbf",
+OCSVMDetector · MLJ
+                 max_iter = -1)

https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.ocsvm

diff --git a/dev/models/OPTICS_MLJScikitLearnInterface/index.html b/dev/models/OPTICS_MLJScikitLearnInterface/index.html index a01feb2ac..ae9ec3002 100644 --- a/dev/models/OPTICS_MLJScikitLearnInterface/index.html +++ b/dev/models/OPTICS_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -OPTICS · MLJ

OPTICS

OPTICS

A model type for constructing an OPTICS clustering model, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

OPTICS = @load OPTICS pkg=MLJScikitLearnInterface

Do model = OPTICS() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in OPTICS(min_samples=...).

OPTICS (Ordering Points To Identify the Clustering Structure) is closely related to DBSCAN: it finds core samples of high density and expands clusters from them. Unlike DBSCAN, it keeps the cluster hierarchy for a variable neighborhood radius, making it better suited to large datasets than the current scikit-learn implementation of DBSCAN.

+OPTICS · MLJ

OPTICS

OPTICS

A model type for constructing an OPTICS clustering model, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

OPTICS = @load OPTICS pkg=MLJScikitLearnInterface

Do model = OPTICS() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in OPTICS(min_samples=...).

OPTICS (Ordering Points To Identify the Clustering Structure) is closely related to DBSCAN: it finds core samples of high density and expands clusters from them. Unlike DBSCAN, it keeps the cluster hierarchy for a variable neighborhood radius, making it better suited to large datasets than the current scikit-learn implementation of DBSCAN.
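
A minimal construction sketch (X is any table whose columns have Continuous scitype; the hyper-parameter value is illustrative):

model = OPTICS(min_samples = 5)
mach  = machine(model, X)   ## bind to data and train, as for other MLJ models
fit!(mach)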

diff --git a/dev/models/OneClassSVM_LIBSVM/index.html b/dev/models/OneClassSVM_LIBSVM/index.html index 159e50a8f..7c294218d 100644 --- a/dev/models/OneClassSVM_LIBSVM/index.html +++ b/dev/models/OneClassSVM_LIBSVM/index.html @@ -1,5 +1,5 @@ -OneClassSVM · MLJ

OneClassSVM

OneClassSVM

A model type for constructing a one-class support vector machine, based on LIBSVM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

OneClassSVM = @load OneClassSVM pkg=LIBSVM

Do model = OneClassSVM() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in OneClassSVM(kernel=...).

Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): "LIBSVM: a library for support vector machines." ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf.

This model is an outlier detection model delivering raw scores based on the decision function of a support vector machine. Like the NuSVC classifier, it uses the nu re-parameterization of the cost parameter appearing in standard support vector classification SVC.

To extract normalized scores ("probabilities") wrap the model using ProbabilisticDetector from OutlierDetection.jl. For threshold-based classification, wrap the probabilistic model using MLJ's BinaryThresholdPredictor. Examples of wrapping appear below.

Training data

In MLJ or MLJBase, bind an instance model to data with:

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see "Examples" below).

    • LIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2
    • LIBSVM.Kernel.Polynomial: (x1, x2) -> (gamma*x1'*x2 + coef0)^degree
    • LIBSVM.Kernel.RadialBasis: (x1, x2) -> exp(-gamma*norm(x1 - x2)^2)
    • LIBSVM.Kernel.Sigmoid: (x1, x2) -> tanh(gamma*x1'*x2 + coef0)

    Here gamma, coef0, and degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions. See LIBSVM.jl issue 91.

  • gamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. Actual value used appears in the report (see below).

  • coef0 = 0.0: kernel parameter (see above)

  • degree::Int32 = Int32(3): degree in polynomial kernel (see above)

  • nu=0.5 (range (0, 1]): An upper bound on the fraction of margin errors and a lower bound of the fraction of support vectors. Denoted ν in the cited paper. Changing nu changes the thickness of the margin (a neighborhood of the decision surface) and a margin error is said to have occurred if a training observation lies on the wrong side of the surface or within the margin.

  • cachesize=200.0: cache memory size in MB

  • tolerance=0.001: tolerance for the stopping criterion

  • shrinking=true: whether to use shrinking heuristics

Operations

  • transform(mach, Xnew): return scores for outlierness, given features Xnew having the same scitype as X above. The greater the score, the more likely it is an outlier. This score is based on the SVM decision function. For normalized scores, wrap model using ProbabilisticDetector from OutlierDetection.jl and call predict instead, and for threshold-based classification, wrap again using BinaryThresholdPredictor. See the examples below.

Fitted parameters

The fields of fitted_params(mach) are:

  • libsvm_model: the trained model object created by the LIBSVM.jl package
  • orientation: this equals 1 if the decision function for libsvm_model is increasing with increasing outlierness, and -1 if it is decreasing instead. Correspondingly, the libsvm_model attaches true to outliers in the first case, and false in the second. (The scores given in the MLJ report and generated by MLJ.transform already correct for this ambiguity, which is therefore only an issue for users directly accessing libsvm_model.)

Report

The fields of report(mach) are:

  • gamma: actual value of the kernel parameter gamma used in training

Examples

Generating raw scores for outlierness

using MLJ
+OneClassSVM · MLJ

OneClassSVM

OneClassSVM

A model type for constructing a one-class support vector machine, based on LIBSVM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

OneClassSVM = @load OneClassSVM pkg=LIBSVM

Do model = OneClassSVM() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in OneClassSVM(kernel=...).

Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): "LIBSVM: a library for support vector machines." ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf.

This model is an outlier detection model delivering raw scores based on the decision function of a support vector machine. Like the NuSVC classifier, it uses the nu re-parameterization of the cost parameter appearing in standard support vector classification SVC.

To extract normalized scores ("probabilities") wrap the model using ProbabilisticDetector from OutlierDetection.jl. For threshold-based classification, wrap the probabilistic model using MLJ's BinaryThresholdPredictor. Examples of wrapping appear below.

Training data

In MLJ or MLJBase, bind an instance model to data with:

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see "Examples" below).

    • LIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2
    • LIBSVM.Kernel.Polynomial: (x1, x2) -> (gamma*x1'*x2 + coef0)^degree
    • LIBSVM.Kernel.RadialBasis: (x1, x2) -> exp(-gamma*norm(x1 - x2)^2)
    • LIBSVM.Kernel.Sigmoid: (x1, x2) -> tanh(gamma*x1'*x2 + coef0)

    Here gamma, coef0, and degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions. See LIBSVM.jl issue 91.

  • gamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. Actual value used appears in the report (see below).

  • coef0 = 0.0: kernel parameter (see above)

  • degree::Int32 = Int32(3): degree in polynomial kernel (see above)

  • nu=0.5 (range (0, 1]): An upper bound on the fraction of margin errors and a lower bound of the fraction of support vectors. Denoted ν in the cited paper. Changing nu changes the thickness of the margin (a neighborhood of the decision surface) and a margin error is said to have occurred if a training observation lies on the wrong side of the surface or within the margin.

  • cachesize=200.0: cache memory size in MB

  • tolerance=0.001: tolerance for the stopping criterion

  • shrinking=true: whether to use shrinking heuristics

Operations

  • transform(mach, Xnew): return scores for outlierness, given features Xnew having the same scitype as X above. The greater the score, the more likely it is an outlier. This score is based on the SVM decision function. For normalized scores, wrap model using ProbabilisticDetector from OutlierDetection.jl and call predict instead, and for threshold-based classification, wrap again using BinaryThresholdPredictor. See the examples below.
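
For example (sketch only; assumes a trained machine mach, as in the examples below, and a compatible table Xnew of new observations):

scores = transform(mach, Xnew)        ## raw outlier scores: larger means more outlier-like
perm   = sortperm(scores, rev=true)   ## observations ranked from most to least anomalous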

Fitted parameters

The fields of fitted_params(mach) are:

  • libsvm_model: the trained model object created by the LIBSVM.jl package
  • orientation: this equals 1 if the decision function for libsvm_model is increasing with increasing outlierness, and -1 if it is decreasing instead. Correspondingly, the libsvm_model attaches true to outliers in the first case, and false in the second. (The scores given in the MLJ report and generated by MLJ.transform already correct for this ambiguity, which is therefore only an issue for users directly accessing libsvm_model.)

Report

The fields of report(mach) are:

  • gamma: actual value of the kernel parameter gamma used in training

Examples

Generating raw scores for outlierness

using MLJ
 import LIBSVM
 import StableRNGs.StableRNG
 
@@ -64,4 +64,4 @@
 julia> yhat = transform(mach, Xnew)
 2-element Vector{Float64}:
  -0.4825363352732942
- -0.4848772169720227

See also LIBSVM.jl and the original C implementation documentation. For an alternative source of outlier detection models with an MLJ interface, see OutlierDetection.jl.

+ -0.4848772169720227

See also LIBSVM.jl and the original C implementation documentation. For an alternative source of outlier detection models with an MLJ interface, see OutlierDetection.jl.

diff --git a/dev/models/OneHotEncoder_MLJModels/index.html b/dev/models/OneHotEncoder_MLJModels/index.html index 3a210d49f..c89ba0737 100644 --- a/dev/models/OneHotEncoder_MLJModels/index.html +++ b/dev/models/OneHotEncoder_MLJModels/index.html @@ -1,5 +1,5 @@ -OneHotEncoder · MLJ

OneHotEncoder

OneHotEncoder

A model type for constructing a one-hot encoder, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

OneHotEncoder = @load OneHotEncoder pkg=MLJModels

Do model = OneHotEncoder() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in OneHotEncoder(features=...).

Use this model to one-hot encode the Multiclass and OrderedFactor features (columns) of some table, leaving other columns unchanged.

New data to be transformed may lack features present in the fit data, but no new features can be present.

Warning: This transformer assumes that levels(col) for any Multiclass or OrderedFactor column, col, is the same for training data and new data to be transformed.

To ensure all features are transformed into Continuous features, or dropped, use ContinuousEncoder instead.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any Tables.jl compatible table. Columns can be of mixed type but only those with element scitype Multiclass or OrderedFactor can be encoded. Check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • features: a vector of symbols (column names). If empty (default) then all Multiclass and OrderedFactor features are encoded. Otherwise, encoding is further restricted to the specified features (ignore=false) or the unspecified features (ignore=true). This default behavior can be modified by the ordered_factor flag.
  • ordered_factor=false: when true, OrderedFactor features are universally excluded
  • drop_last=true: whether to drop the column corresponding to the final class of encoded features. For example, a three-class feature is spawned into three new features if drop_last=false, but just two features otherwise.

Fitted parameters

The fields of fitted_params(mach) are:

  • all_features: names of all features encountered in training
  • fitted_levels_given_feature: dictionary of the levels associated with each feature encoded, keyed on the feature name
  • ref_name_pairs_given_feature: dictionary of pairs r => ftr (such as 0x00000001 => :grad__A) where r is a CategoricalArrays.jl reference integer representing a level, and ftr the corresponding new feature name; the dictionary is keyed on the names of features that are encoded

Report

The fields of report(mach) are:

  • features_to_be_encoded: names of input features to be encoded
  • new_features: names of all output features

Example

using MLJ
+OneHotEncoder · MLJ

OneHotEncoder

OneHotEncoder

A model type for constructing a one-hot encoder, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

OneHotEncoder = @load OneHotEncoder pkg=MLJModels

Do model = OneHotEncoder() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in OneHotEncoder(features=...).

Use this model to one-hot encode the Multiclass and OrderedFactor features (columns) of some table, leaving other columns unchanged.

New data to be transformed may lack features present in the fit data, but no new features can be present.

Warning: This transformer assumes that levels(col) for any Multiclass or OrderedFactor column, col, is the same for training data and new data to be transformed.

To ensure all features are transformed into Continuous features, or dropped, use ContinuousEncoder instead.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any Tables.jl compatible table. Columns can be of mixed type but only those with element scitype Multiclass or OrderedFactor can be encoded. Check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • features: a vector of symbols (column names). If empty (default) then all Multiclass and OrderedFactor features are encoded. Otherwise, encoding is further restricted to the specified features (ignore=false) or the unspecified features (ignore=true). This default behavior can be modified by the ordered_factor flag.
  • ordered_factor=false: when true, OrderedFactor features are universally excluded
  • drop_last=true: whether to drop the column corresponding to the final class of encoded features. For example, a three-class feature is spawned into three new features if drop_last=false, but just two features otherwise.
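
For example, to restrict encoding to a single column and retain a column for every level, one might construct (a minimal sketch; the column name :grade matches the example further below):

using MLJ

OneHotEncoder = @load OneHotEncoder pkg=MLJModels

encoder = OneHotEncoder(features=[:grade], ignore=false, drop_last=false)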

Fitted parameters

The fields of fitted_params(mach) are:

  • all_features: names of all features encountered in training
  • fitted_levels_given_feature: dictionary of the levels associated with each feature encoded, keyed on the feature name
  • ref_name_pairs_given_feature: dictionary of pairs r => ftr (such as 0x00000001 => :grad__A) where r is a CategoricalArrays.jl reference integer representing a level, and ftr the corresponding new feature name; the dictionary is keyed on the names of features that are encoded

Report

The fields of report(mach) are:

  • features_to_be_encoded: names of input features to be encoded
  • new_features: names of all output features

Example

using MLJ
 
 X = (name=categorical(["Danesh", "Lee", "Mary", "John"]),
      grade=categorical(["A", "B", "A", "C"], ordered=true),
@@ -31,4 +31,4 @@
 │ grade__B     │ Continuous │
 │ height       │ Continuous │
 │ n_devices    │ Count      │
-└──────────────┴────────────┘

See also ContinuousEncoder.

+└──────────────┴────────────┘

See also ContinuousEncoder.

diff --git a/dev/models/OneRuleClassifier_OneRule/index.html b/dev/models/OneRuleClassifier_OneRule/index.html index ba93beb79..172f1be20 100644 --- a/dev/models/OneRuleClassifier_OneRule/index.html +++ b/dev/models/OneRuleClassifier_OneRule/index.html @@ -1,5 +1,5 @@ -OneRuleClassifier · MLJ

OneRuleClassifier

OneRuleClassifier

A model type for constructing a one rule classifier, based on OneRule.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

OneRuleClassifier = @load OneRuleClassifier pkg=OneRule

Do model = OneRuleClassifier() to construct an instance with default hyper-parameters.

OneRuleClassifier implements the OneRule method for classification by Robert Holte ("Very simple classification rules perform well on most commonly used datasets" in: Machine Learning 11.1 (1993), pp. 63-90).

For more information see:
+OneRuleClassifier · MLJ

OneRuleClassifier

OneRuleClassifier

A model type for constructing a one rule classifier, based on OneRule.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

OneRuleClassifier = @load OneRuleClassifier pkg=OneRule

Do model = OneRuleClassifier() to construct an instance with default hyper-parameters.

OneRuleClassifier implements the OneRule method for classification by Robert Holte ("Very simple classification rules perform well on most commonly used datasets" in: Machine Learning 11.1 (1993), pp. 63-90).

For more information see:
 
 - Witten, Ian H., Eibe Frank, and Mark A. Hall. 
   Data Mining Practical Machine Learning Tools and Techniques Third Edition. 
@@ -27,4 +27,4 @@
 
 yhat = MLJ.predict(mach, weather)       ## in a real context 'new' `weather` data would be used
 one_tree = fitted_params(mach).tree
-report(mach).error_rate

See also OneRule.jl.

+report(mach).error_rate

See also OneRule.jl.

diff --git a/dev/models/OrthogonalMatchingPursuitCVRegressor_MLJScikitLearnInterface/index.html b/dev/models/OrthogonalMatchingPursuitCVRegressor_MLJScikitLearnInterface/index.html index 9461f7dac..096405ae8 100644 --- a/dev/models/OrthogonalMatchingPursuitCVRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/OrthogonalMatchingPursuitCVRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -OrthogonalMatchingPursuitCVRegressor · MLJ

OrthogonalMatchingPursuitCVRegressor

OrthogonalMatchingPursuitCVRegressor

A model type for constructing an orthogonal matching pursuit (OMP) model with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

OrthogonalMatchingPursuitCVRegressor = @load OrthogonalMatchingPursuitCVRegressor pkg=MLJScikitLearnInterface

Do model = OrthogonalMatchingPursuitCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in OrthogonalMatchingPursuitCVRegressor(copy=...).

Hyper-parameters

  • copy = true
  • fit_intercept = true
  • normalize = false
  • max_iter = nothing
  • cv = 5
  • n_jobs = 1
  • verbose = false
+OrthogonalMatchingPursuitCVRegressor · MLJ

OrthogonalMatchingPursuitCVRegressor

OrthogonalMatchingPursuitCVRegressor

A model type for constructing an orthogonal matching pursuit (OMP) model with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

OrthogonalMatchingPursuitCVRegressor = @load OrthogonalMatchingPursuitCVRegressor pkg=MLJScikitLearnInterface

Do model = OrthogonalMatchingPursuitCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in OrthogonalMatchingPursuitCVRegressor(copy=...).

Hyper-parameters

  • copy = true
  • fit_intercept = true
  • max_iter = nothing
  • cv = 5
  • n_jobs = 1
  • verbose = false
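
A minimal usage sketch, with synthetic data generated by MLJ's make_regression (assumes a working scikit-learn installation via MLJScikitLearnInterface.jl):

using MLJ

OMPCVRegressor = @load OrthogonalMatchingPursuitCVRegressor pkg=MLJScikitLearnInterface

X, y = make_regression(100, 5)                     ## synthetic table and target
mach = machine(OMPCVRegressor(cv=3), X, y) |> fit!
yhat = predict(mach, X)
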
diff --git a/dev/models/OrthogonalMatchingPursuitRegressor_MLJScikitLearnInterface/index.html b/dev/models/OrthogonalMatchingPursuitRegressor_MLJScikitLearnInterface/index.html index 2a5342bee..e848bc881 100644 --- a/dev/models/OrthogonalMatchingPursuitRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/OrthogonalMatchingPursuitRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -OrthogonalMatchingPursuitRegressor · MLJ

OrthogonalMatchingPursuitRegressor

OrthogonalMatchingPursuitRegressor

A model type for constructing an orthogonal matching pursuit regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

OrthogonalMatchingPursuitRegressor = @load OrthogonalMatchingPursuitRegressor pkg=MLJScikitLearnInterface

Do model = OrthogonalMatchingPursuitRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in OrthogonalMatchingPursuitRegressor(n_nonzero_coefs=...).

Hyper-parameters

  • n_nonzero_coefs = nothing
  • tol = nothing
  • fit_intercept = true
  • normalize = false
  • precompute = auto
+OrthogonalMatchingPursuitRegressor · MLJ

OrthogonalMatchingPursuitRegressor

OrthogonalMatchingPursuitRegressor

A model type for constructing an orthogonal matching pursuit regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

OrthogonalMatchingPursuitRegressor = @load OrthogonalMatchingPursuitRegressor pkg=MLJScikitLearnInterface

Do model = OrthogonalMatchingPursuitRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in OrthogonalMatchingPursuitRegressor(n_nonzero_coefs=...).

Hyper-parameters

  • n_nonzero_coefs = nothing
  • tol = nothing
  • fit_intercept = true
  • precompute = auto
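
For example, to restrict the solution to at most two non-zero coefficients (a minimal sketch on synthetic data):

using MLJ

OMPRegressor = @load OrthogonalMatchingPursuitRegressor pkg=MLJScikitLearnInterface

X, y = make_regression(100, 5)
mach = machine(OMPRegressor(n_nonzero_coefs=2), X, y) |> fit!
yhat = predict(mach, X)
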
diff --git a/dev/models/PCADetector_OutlierDetectionPython/index.html b/dev/models/PCADetector_OutlierDetectionPython/index.html index fc4fb139d..3cdba43e5 100644 --- a/dev/models/PCADetector_OutlierDetectionPython/index.html +++ b/dev/models/PCADetector_OutlierDetectionPython/index.html @@ -1,5 +1,5 @@ -PCADetector · MLJ

PCADetector

PCADetector(n_components = nothing,
               random_state = nothing)

https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.pca
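
A minimal usage sketch (assumes the underlying Python package pyod is installed via OutlierDetectionPython.jl, and that X is a table of Continuous features):

using MLJ

PCADetector = @load PCADetector pkg=OutlierDetectionPython

mach = machine(PCADetector(), X) |> fit!
scores = transform(mach, X)       ## raw outlier scores; wrap with ProbabilisticDetector for probabilities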

diff --git a/dev/models/PCA_MultivariateStats/index.html b/dev/models/PCA_MultivariateStats/index.html index 4e52441eb..66b43e47c 100644 --- a/dev/models/PCA_MultivariateStats/index.html +++ b/dev/models/PCA_MultivariateStats/index.html @@ -1,5 +1,5 @@ -PCA · MLJ

PCA

PCA

A model type for constructing a pca, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

PCA = @load PCA pkg=MultivariateStats

Do model = PCA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in PCA(maxoutdim=...).

Principal component analysis learns a linear projection onto a lower dimensional space while preserving most of the initial variance seen in the training data.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • maxoutdim=0: Together with variance_ratio, controls the output dimension outdim chosen by the model. Specifically, suppose that k is the smallest integer such that retaining the k most significant principal components accounts for variance_ratio of the total variance in the training data. Then outdim = min(k, maxoutdim). If maxoutdim=0 (default) then the effective maxoutdim is min(n, indim - 1) where n is the number of observations and indim the number of features in the training data.

  • variance_ratio::Float64=0.99: The ratio of variance preserved after the transformation

  • method=:auto: The method to use to solve the problem. Choices are

    • :svd: Singular Value Decomposition of the matrix.
    • :cov: Covariance matrix decomposition.
    • :auto: Use :cov if the matrix's first dimension is smaller than its second dimension and otherwise use :svd
  • mean=nothing: if nothing, centering will be computed and applied, if set to 0 no centering (data is assumed pre-centered); if a vector is passed, the centering is done with that vector.

Operations

  • transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.
  • inverse_transform(mach, Xsmall): For a dimension-reduced table Xsmall, such as returned by transform, reconstruct a table, having the same number of columns as the original training data X, that transforms to Xsmall. Mathematically, inverse_transform is a right-inverse for the PCA projection map, whose image is orthogonal to the kernel of that map. In particular, if Xsmall = transform(mach, Xnew), then inverse_transform(Xsmall) is only an approximation to Xnew.

Fitted parameters

The fields of fitted_params(mach) are:

  • projection: Returns the projection matrix, which has size (indim, outdim), where indim and outdim are the number of features of the input and output respectively.

Report

The fields of report(mach) are:

  • indim: Dimension (number of columns) of the training data and new data to be transformed.
  • outdim = min(n, indim, maxoutdim) is the output dimension; here n is the number of observations.
  • tprincipalvar: Total variance of the principal components.
  • tresidualvar: Total residual variance.
  • tvar: Total observation variance (principal + residual variance).
  • mean: The mean of the untransformed training data, of length indim.
  • principalvars: The variance of the principal components. An AbstractVector of length outdim
  • loadings: The model's loadings, weights for each variable used when calculating principal components. A matrix of size (indim, outdim) where indim and outdim are as defined above.

Examples

using MLJ
+PCA · MLJ

PCA

PCA

A model type for constructing a pca, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

PCA = @load PCA pkg=MultivariateStats

Do model = PCA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in PCA(maxoutdim=...).

Principal component analysis learns a linear projection onto a lower dimensional space while preserving most of the initial variance seen in the training data.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • maxoutdim=0: Together with variance_ratio, controls the output dimension outdim chosen by the model. Specifically, suppose that k is the smallest integer such that retaining the k most significant principal components accounts for variance_ratio of the total variance in the training data. Then outdim = min(k, maxoutdim). If maxoutdim=0 (default) then the effective maxoutdim is min(n, indim - 1) where n is the number of observations and indim the number of features in the training data.

  • variance_ratio::Float64=0.99: The ratio of variance preserved after the transformation

  • method=:auto: The method to use to solve the problem. Choices are

    • :svd: Singular Value Decomposition of the matrix.
    • :cov: Covariance matrix decomposition.
    • :auto: Use :cov if the matrix's first dimension is smaller than its second dimension and otherwise use :svd
  • mean=nothing: if nothing, centering will be computed and applied, if set to 0 no centering (data is assumed pre-centered); if a vector is passed, the centering is done with that vector.

Operations

  • transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.
  • inverse_transform(mach, Xsmall): For a dimension-reduced table Xsmall, such as returned by transform, reconstruct a table, having the same number of columns as the original training data X, that transforms to Xsmall. Mathematically, inverse_transform is a right-inverse for the PCA projection map, whose image is orthogonal to the kernel of that map. In particular, if Xsmall = transform(mach, Xnew), then inverse_transform(Xsmall) is only an approximation to Xnew.
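
A minimal sketch of the projection and its approximate inverse, using the iris features:

using MLJ

PCA = @load PCA pkg=MultivariateStats

X, _ = @load_iris
mach = machine(PCA(maxoutdim=2), X) |> fit!

Xproj = transform(mach, X)                 ## table with 2 columns
Xapprox = inverse_transform(mach, Xproj)   ## approximation to X, with the original 4 columns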

Fitted parameters

The fields of fitted_params(mach) are:

  • projection: Returns the projection matrix, which has size (indim, outdim), where indim and outdim are the number of features of the input and output respectively.

Report

The fields of report(mach) are:

  • indim: Dimension (number of columns) of the training data and new data to be transformed.
  • outdim = min(n, indim, maxoutdim) is the output dimension; here n is the number of observations.
  • tprincipalvar: Total variance of the principal components.
  • tresidualvar: Total residual variance.
  • tvar: Total observation variance (principal + residual variance).
  • mean: The mean of the untransformed training data, of length indim.
  • principalvars: The variance of the principal components. An AbstractVector of length outdim
  • loadings: The model's loadings, weights for each variable used when calculating principal components. A matrix of size (indim, outdim) where indim and outdim are as defined above.

Examples

using MLJ
 
 PCA = @load PCA pkg=MultivariateStats
 
@@ -8,4 +8,4 @@
 model = PCA(maxoutdim=2)
 mach = machine(model, X) |> fit!
 
-Xproj = transform(mach, X)

See also KernelPCA, ICA, FactorAnalysis, PPCA

+Xproj = transform(mach, X)

See also KernelPCA, ICA, FactorAnalysis, PPCA

diff --git a/dev/models/PLSRegressor_PartialLeastSquaresRegressor/index.html b/dev/models/PLSRegressor_PartialLeastSquaresRegressor/index.html index f4b7eb875..4edce036a 100644 --- a/dev/models/PLSRegressor_PartialLeastSquaresRegressor/index.html +++ b/dev/models/PLSRegressor_PartialLeastSquaresRegressor/index.html @@ -1,2 +1,2 @@ -PLSRegressor · MLJ

PLSRegressor

A Partial Least Squares Regressor. Contains PLS1, PLS2 (multi target) algorithms. Can be used mainly for regression.

+PLSRegressor · MLJ

PLSRegressor

A Partial Least Squares Regressor. Contains PLS1, PLS2 (multi target) algorithms. Can be used mainly for regression.

diff --git a/dev/models/PPCA_MultivariateStats/index.html b/dev/models/PPCA_MultivariateStats/index.html index 42cfff5ae..926b28ae6 100644 --- a/dev/models/PPCA_MultivariateStats/index.html +++ b/dev/models/PPCA_MultivariateStats/index.html @@ -1,5 +1,5 @@ -PPCA · MLJ

PPCA

PPCA

A model type for constructing a probabilistic PCA model, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

PPCA = @load PPCA pkg=MultivariateStats

Do model = PPCA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in PPCA(maxoutdim=...).

Probabilistic principal component analysis is a dimension-reduction algorithm which represents a constrained form of the Gaussian distribution in which the number of free parameters can be restricted while still allowing the model to capture the dominant correlations in a data set. It is expressed as the maximum likelihood solution of a probabilistic latent variable model. For details, see Bishop (2006): C. M. Pattern Recognition and Machine Learning.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • maxoutdim=0: Controls the dimension (number of columns) of the output, outdim. Specifically, outdim = min(n, indim, maxoutdim), where n is the number of observations and indim the input dimension.
  • method::Symbol=:ml: The method to use to solve the problem, one of :ml, :em, :bayes.
  • maxiter::Int=1000: The maximum number of iterations.
  • tol::Real=1e-6: The convergence tolerance.
  • mean::Union{Nothing, Real, Vector{Float64}}=nothing: If nothing, centering will be computed and applied; if set to 0 no centering is applied (data is assumed pre-centered); if a vector, the centering is done with that vector.

Operations

  • transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.
  • inverse_transform(mach, Xsmall): For a dimension-reduced table Xsmall, such as returned by transform, reconstruct a table, having the same number of columns as the original training data X, that transforms to Xsmall. Mathematically, inverse_transform is a right-inverse for the PCA projection map, whose image is orthogonal to the kernel of that map. In particular, if Xsmall = transform(mach, Xnew), then inverse_transform(Xsmall) is only an approximation to Xnew.

Fitted parameters

The fields of fitted_params(mach) are:

  • projection: Returns the projection matrix, which has size (indim, outdim), where indim and outdim are the number of features of the input and output respectively. Each column of the projection matrix corresponds to a principal component.

Report

The fields of report(mach) are:

  • indim: Dimension (number of columns) of the training data and new data to be transformed.
  • outdim: Dimension of transformed data.
  • tvat: The variance of the components.
  • loadings: The model's loadings matrix. A matrix of size (indim, outdim) where indim and outdim are as defined above.

Examples

using MLJ
+PPCA · MLJ

PPCA

PPCA

A model type for constructing a probabilistic PCA model, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

PPCA = @load PPCA pkg=MultivariateStats

Do model = PPCA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in PPCA(maxoutdim=...).

Probabilistic principal component analysis is a dimension-reduction algorithm which represents a constrained form of the Gaussian distribution in which the number of free parameters can be restricted while still allowing the model to capture the dominant correlations in a data set. It is expressed as the maximum likelihood solution of a probabilistic latent variable model. For details, see Bishop (2006): C. M. Pattern Recognition and Machine Learning.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • maxoutdim=0: Controls the dimension (number of columns) of the output, outdim. Specifically, outdim = min(n, indim, maxoutdim), where n is the number of observations and indim the input dimension.
  • method::Symbol=:ml: The method to use to solve the problem, one of :ml, :em, :bayes.
  • maxiter::Int=1000: The maximum number of iterations.
  • tol::Real=1e-6: The convergence tolerance.
  • mean::Union{Nothing, Real, Vector{Float64}}=nothing: If nothing, centering will be computed and applied; if set to 0 no centering is applied (data is assumed pre-centered); if a vector, the centering is done with that vector.
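
For example, to fit with the EM algorithm and a looser convergence tolerance (a minimal sketch):

using MLJ

PPCA = @load PPCA pkg=MultivariateStats

X, _ = @load_iris
model = PPCA(maxoutdim=2, method=:em, tol=1e-4)
mach = machine(model, X) |> fit!
Xproj = transform(mach, X)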

Operations

  • transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.
  • inverse_transform(mach, Xsmall): For a dimension-reduced table Xsmall, such as returned by transform, reconstruct a table, having the same number of columns as the original training data X, that transforms to Xsmall. Mathematically, inverse_transform is a right-inverse for the PCA projection map, whose image is orthogonal to the kernel of that map. In particular, if Xsmall = transform(mach, Xnew), then inverse_transform(Xsmall) is only an approximation to Xnew.

Fitted parameters

The fields of fitted_params(mach) are:

  • projection: Returns the projection matrix, which has size (indim, outdim), where indim and outdim are the number of features of the input and output respectively. Each column of the projection matrix corresponds to a principal component.

Report

The fields of report(mach) are:

  • indim: Dimension (number of columns) of the training data and new data to be transformed.
  • outdim: Dimension of transformed data.
  • tvat: The variance of the components.
  • loadings: The model's loadings matrix. A matrix of size (indim, outdim) where indim and outdim are as defined above.

Examples

using MLJ
 
 PPCA = @load PPCA pkg=MultivariateStats
 
@@ -8,4 +8,4 @@
 model = PPCA(maxoutdim=2)
 mach = machine(model, X) |> fit!
 
-Xproj = transform(mach, X)

See also KernelPCA, ICA, FactorAnalysis, PCA

+Xproj = transform(mach, X)

See also KernelPCA, ICA, FactorAnalysis, PCA

diff --git a/dev/models/PartLS_PartitionedLS/index.html b/dev/models/PartLS_PartitionedLS/index.html index a0094377b..5bcc6fb6a 100644 --- a/dev/models/PartLS_PartitionedLS/index.html +++ b/dev/models/PartLS_PartitionedLS/index.html @@ -1,5 +1,5 @@ -PartLS · MLJ

PartLS

PartLS

A model type for fitting a partitioned least squares model to data. Both an MLJ and native interface are provided.

MLJ Interface

From MLJ, the type can be imported using

PartLS = @load PartLS pkg=PartitionedLS

Construct an instance with default hyper-parameters using the syntax model = PartLS(). Provide keyword arguments to override hyper-parameter defaults, as in model = PartLS(P=...).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any matrix or table with Continuous element scitype. Check column scitypes of a table X with schema(X).

Train the machine using fit!(mach).

Hyper-parameters

  • Optimizer: the optimization algorithm to use. It can be Opt, Alt or BnB (names exported by PartitionedLS.jl).

  • P: the partition matrix. It is a binary matrix where each row corresponds to a partition and each column corresponds to a feature. The element P_{k, i} = 1 if feature i belongs to partition k.

  • η: the regularization parameter. It controls the strength of the regularization.

  • ϵ: the tolerance parameter. It is used to determine when the Alt optimization algorithm has converged. Only used by the Alt algorithm.

  • T: the maximum number of iterations, used to determine when to stop the Alt optimization algorithm. Only used by the Alt algorithm.

  • rng: the random number generator to use.

    • If nothing, the global random number generator rand is used.
    • If an integer, the global number generator rand is used after seeding it with the given integer.
    • If an object of type AbstractRNG, the given random number generator is used.

Operations

  • predict(mach, Xnew): return the predictions of the model on new data Xnew

Fitted parameters

The fields of fitted_params(mach) are:

  • α: the values of the α variables. For each partition k, the values of the α variables are such that $\sum_{i \in P_k} \alpha_{k} = 1$.
  • β: the values of the β variables. For each partition k, β_k is the coefficient that multiplies the features in the k-th partition.
  • t: the intercept term of the model.
  • P: the partition matrix. It is a binary matrix where each row corresponds to a partition and each column corresponds to a feature. The element P_{k, i} = 1 if feature i belongs to partition k.

Examples

PartLS = @load PartLS pkg=PartitionedLS
+PartLS · MLJ

PartLS

PartLS

A model type for fitting a partitioned least squares model to data. Both an MLJ and native interface are provided.

MLJ Interface

From MLJ, the type can be imported using

PartLS = @load PartLS pkg=PartitionedLS

Construct an instance with default hyper-parameters using the syntax model = PartLS(). Provide keyword arguments to override hyper-parameter defaults, as in model = PartLS(P=...).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any matrix or table with Continuous element scitype. Check column scitypes of a table X with schema(X).
  • y: any vector with Continuous element scitype. Check scitype with scitype(y).

Train the machine using fit!(mach).

Hyper-parameters

  • Optimizer: the optimization algorithm to use. It can be Opt, Alt or BnB (names exported by PartitionedLS.jl).

  • P: the partition matrix. It is a binary matrix where each row corresponds to a partition and each column corresponds to a feature. The element P_{k, i} = 1 if feature i belongs to partition k.

  • η: the regularization parameter. It controls the strength of the regularization.

  • ϵ: the tolerance parameter. It is used to determine when the Alt optimization algorithm has converged. Only used by the Alt algorithm.

  • T: the maximum number of iterations, used to determine when to stop the Alt optimization algorithm. Only used by the Alt algorithm.

  • rng: the random number generator to use.

    • If nothing, the global random number generator rand is used.
    • If an integer, the global number generator rand is used after seeding it with the given integer.
    • If an object of type AbstractRNG, the given random number generator is used.
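
To make the role of P concrete, here is a minimal construction sketch (the partition matrix is hypothetical and follows the row-per-partition convention described above):

using MLJ

PartLS = @load PartLS pkg=PartitionedLS

## two partitions over three features: features 1 and 2 in the first, feature 3 in the second
P = [1 1 0;
     0 0 1]

model = PartLS(P=P, η=0.1)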

Operations

  • predict(mach, Xnew): return the predictions of the model on new data Xnew

Fitted parameters

The fields of fitted_params(mach) are:

  • α: the values of the α variables. For each partition k, the values of the α variables are such that $\sum_{i \in P_k} \alpha_{k} = 1$.
  • β: the values of the β variables. For each partition k, β_k is the coefficient that multiplies the features in the k-th partition.
  • t: the intercept term of the model.
  • P: the partition matrix. It is a binary matrix where each row corresponds to a partition and each column corresponds to a feature. The element P_{k, i} = 1 if feature i belongs to partition k.

Examples

PartLS = @load PartLS pkg=PartitionedLS
 
 X = [[1. 2. 3.];
      [3. 3. 4.];
@@ -40,4 +40,4 @@
 
 ## fit using the optimal algorithm
 result = fit(Opt, X, y, P, η = 0.0)
-y_hat = predict(result.model, X)

For other fit keyword options, refer to the "Hyper-parameters" section for the MLJ interface.

+y_hat = predict(result.model, X)

For other fit keyword options, refer to the "Hyper-parameters" section for the MLJ interface.

diff --git a/dev/models/PassiveAggressiveClassifier_MLJScikitLearnInterface/index.html b/dev/models/PassiveAggressiveClassifier_MLJScikitLearnInterface/index.html index 7093cb6fd..55ddd0b5b 100644 --- a/dev/models/PassiveAggressiveClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/PassiveAggressiveClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -PassiveAggressiveClassifier · MLJ

PassiveAggressiveClassifier

PassiveAggressiveClassifier

A model type for constructing a passive aggressive classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

PassiveAggressiveClassifier = @load PassiveAggressiveClassifier pkg=MLJScikitLearnInterface

Do model = PassiveAggressiveClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in PassiveAggressiveClassifier(C=...).

Hyper-parameters

  • C = 1.0
  • fit_intercept = true
  • max_iter = 100
  • tol = 0.001
  • early_stopping = false
  • validation_fraction = 0.1
  • n_iter_no_change = 5
  • shuffle = true
  • verbose = 0
  • loss = hinge
  • n_jobs = nothing
  • random_state = 0
  • warm_start = false
  • class_weight = nothing
  • average = false
+PassiveAggressiveClassifier · MLJ

PassiveAggressiveClassifier

PassiveAggressiveClassifier

A model type for constructing a passive aggressive classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

PassiveAggressiveClassifier = @load PassiveAggressiveClassifier pkg=MLJScikitLearnInterface

Do model = PassiveAggressiveClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in PassiveAggressiveClassifier(C=...).

Hyper-parameters

  • C = 1.0
  • fit_intercept = true
  • max_iter = 100
  • tol = 0.001
  • early_stopping = false
  • validation_fraction = 0.1
  • n_iter_no_change = 5
  • shuffle = true
  • verbose = 0
  • loss = hinge
  • n_jobs = nothing
  • random_state = 0
  • warm_start = false
  • class_weight = nothing
  • average = false
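
A minimal usage sketch, with synthetic two-class data from MLJ's make_blobs (assumes a working scikit-learn installation):

using MLJ

PAClassifier = @load PassiveAggressiveClassifier pkg=MLJScikitLearnInterface

X, y = make_blobs(100, 3; centers=2)      ## synthetic table and categorical target
mach = machine(PAClassifier(C=0.5), X, y) |> fit!
yhat = predict(mach, X)
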
diff --git a/dev/models/PassiveAggressiveRegressor_MLJScikitLearnInterface/index.html b/dev/models/PassiveAggressiveRegressor_MLJScikitLearnInterface/index.html index 536b49d00..02c87cc58 100644 --- a/dev/models/PassiveAggressiveRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/PassiveAggressiveRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -PassiveAggressiveRegressor · MLJ

PassiveAggressiveRegressor

PassiveAggressiveRegressor

A model type for constructing a passive aggressive regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

PassiveAggressiveRegressor = @load PassiveAggressiveRegressor pkg=MLJScikitLearnInterface

Do model = PassiveAggressiveRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in PassiveAggressiveRegressor(C=...).

Hyper-parameters

  • C = 1.0
  • fit_intercept = true
  • max_iter = 1000
  • tol = 0.0001
  • early_stopping = false
  • validation_fraction = 0.1
  • n_iter_no_change = 5
  • shuffle = true
  • verbose = 0
  • loss = epsilon_insensitive
  • epsilon = 0.1
  • random_state = nothing
  • warm_start = false
  • average = false
+PassiveAggressiveRegressor · MLJ

PassiveAggressiveRegressor

PassiveAggressiveRegressor

A model type for constructing a passive aggressive regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

PassiveAggressiveRegressor = @load PassiveAggressiveRegressor pkg=MLJScikitLearnInterface

Do model = PassiveAggressiveRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in PassiveAggressiveRegressor(C=...).

Hyper-parameters

  • C = 1.0
  • fit_intercept = true
  • max_iter = 1000
  • tol = 0.0001
  • early_stopping = false
  • validation_fraction = 0.1
  • n_iter_no_change = 5
  • shuffle = true
  • verbose = 0
  • loss = epsilon_insensitive
  • epsilon = 0.1
  • random_state = nothing
  • warm_start = false
  • average = false
diff --git a/dev/models/PegasosClassifier_BetaML/index.html b/dev/models/PegasosClassifier_BetaML/index.html index f609328d9..322cbcf17 100644 --- a/dev/models/PegasosClassifier_BetaML/index.html +++ b/dev/models/PegasosClassifier_BetaML/index.html @@ -1,5 +1,5 @@ -PegasosClassifier · MLJ

PegasosClassifier

mutable struct PegasosClassifier <: MLJModelInterface.Probabilistic

The gradient-based linear "pegasos" classifier using one-vs-all for multiclass, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • initial_coefficients::Union{Nothing, Matrix{Float64}}: N-classes by D-dimensions matrix of initial linear coefficients [def: nothing, i.e. zeros]
  • initial_constant::Union{Nothing, Vector{Float64}}: N-classes vector of initial constant terms [def: nothing, i.e. zeros]
  • learning_rate::Function: Learning rate [def: (epoch -> 1/sqrt(epoch))]
  • learning_rate_multiplicative::Float64: Multiplicative term of the learning rate [def: 0.5]
  • epochs::Int64: Maximum number of epochs, i.e. passes through the whole training sample [def: 1000]
  • shuffle::Bool: Whether to randomly shuffle the data at each iteration (epoch) [def: true]
  • force_origin::Bool: Whether to force the parameter associated with the constant term to remain zero [def: false]
  • return_mean_hyperplane::Bool: Whether to return the average hyperplane coefficients instead of the final ones [def: false]
  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]

Example:

julia> using MLJ
+PegasosClassifier · MLJ

PegasosClassifier

mutable struct PegasosClassifier <: MLJModelInterface.Probabilistic

The gradient-based linear "pegasos" classifier using one-vs-all for multiclass, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • initial_coefficients::Union{Nothing, Matrix{Float64}}: N-classes by D-dimensions matrix of initial linear coefficients [def: nothing, i.e. zeros]
  • initial_constant::Union{Nothing, Vector{Float64}}: N-classes vector of initial constant terms [def: nothing, i.e. zeros]
  • learning_rate::Function: Learning rate [def: (epoch -> 1/sqrt(epoch))]
  • learning_rate_multiplicative::Float64: Multiplicative term of the learning rate [def: 0.5]
  • epochs::Int64: Maximum number of epochs, i.e. passes through the whole training sample [def: 1000]
  • shuffle::Bool: Whether to randomly shuffle the data at each iteration (epoch) [def: true]
  • force_origin::Bool: Whether to force the parameter associated with the constant term to remain zero [def: false]
  • return_mean_hyperplane::Bool: Whether to return the average hyperplane coefficients instead of the final ones [def: false]
  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]
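
For example, to train for fewer epochs with a faster-decaying learning rate (a minimal sketch):

using MLJ

PegasosClassifier = @load PegasosClassifier pkg=BetaML

model = PegasosClassifier(epochs=500, learning_rate = epoch -> 1/epoch)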

Example:

julia> using MLJ
 
 julia> X, y        = @load_iris;
 
@@ -28,4 +28,4 @@
  UnivariateFinite{Multiclass{3}}(setosa=>0.791, versicolor=>0.177, virginica=>0.0318)
  ⋮
  UnivariateFinite{Multiclass{3}}(setosa=>0.254, versicolor=>0.5, virginica=>0.246)
- UnivariateFinite{Multiclass{3}}(setosa=>0.283, versicolor=>0.51, virginica=>0.207)
+ UnivariateFinite{Multiclass{3}}(setosa=>0.283, versicolor=>0.51, virginica=>0.207)
diff --git a/dev/models/PerceptronClassifier_BetaML/index.html b/dev/models/PerceptronClassifier_BetaML/index.html index a0f7b16f6..645b51bb4 100644 --- a/dev/models/PerceptronClassifier_BetaML/index.html +++ b/dev/models/PerceptronClassifier_BetaML/index.html @@ -1,5 +1,5 @@ -PerceptronClassifier · MLJ

PerceptronClassifier

mutable struct PerceptronClassifier <: MLJModelInterface.Probabilistic

The classical perceptron algorithm using one-vs-all for multiclass, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • initial_coefficients::Union{Nothing, Matrix{Float64}}: N-classes by D-dimensions matrix of initial linear coefficients [def: nothing, i.e. zeros]
  • initial_constant::Union{Nothing, Vector{Float64}}: N-classes vector of initial constant terms [def: nothing, i.e. zeros]
  • epochs::Int64: Maximum number of epochs, i.e. passes through the whole training sample [def: 1000]
  • shuffle::Bool: Whether to randomly shuffle the data at each iteration (epoch) [def: true]
  • force_origin::Bool: Whether to force the parameter associated with the constant term to remain zero [def: false]
  • return_mean_hyperplane::Bool: Whether to return the average hyperplane coefficients instead of the final ones [def: false]
  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]

Example:

julia> using MLJ
+PerceptronClassifier · MLJ

PerceptronClassifier

mutable struct PerceptronClassifier <: MLJModelInterface.Probabilistic

The classical perceptron algorithm using one-vs-all for multiclass, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • initial_coefficients::Union{Nothing, Matrix{Float64}}: N-classes by D-dimensions matrix of initial linear coefficients [def: nothing, i.e. zeros]
  • initial_constant::Union{Nothing, Vector{Float64}}: N-classes vector of initial constant terms [def: nothing, i.e. zeros]
  • epochs::Int64: Maximum number of epochs, i.e. passes through the whole training sample [def: 1000]
  • shuffle::Bool: Whether to randomly shuffle the data at each iteration (epoch) [def: true]
  • force_origin::Bool: Whether to force the parameter associated with the constant term to remain zero [def: false]
  • return_mean_hyperplane::Bool: Whether to return the average hyperplane coefficients instead of the final ones [def: false]
  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]

Example:

julia> using MLJ
 
 julia> X, y        = @load_iris;
 
@@ -29,4 +29,4 @@
  UnivariateFinite{Multiclass{3}}(setosa=>1.0, versicolor=>1.27e-18, virginica=>1.86e-310)
  ⋮
  UnivariateFinite{Multiclass{3}}(setosa=>2.77e-57, versicolor=>1.1099999999999999e-82, virginica=>1.0)
- UnivariateFinite{Multiclass{3}}(setosa=>3.09e-22, versicolor=>4.03e-25, virginica=>1.0)
+ UnivariateFinite{Multiclass{3}}(setosa=>3.09e-22, versicolor=>4.03e-25, virginica=>1.0)
diff --git a/dev/models/PerceptronClassifier_MLJScikitLearnInterface/index.html b/dev/models/PerceptronClassifier_MLJScikitLearnInterface/index.html index d067a020b..39b75e648 100644 --- a/dev/models/PerceptronClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/PerceptronClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -PerceptronClassifier · MLJ

PerceptronClassifier

PerceptronClassifier

A model type for constructing a perceptron classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

PerceptronClassifier = @load PerceptronClassifier pkg=MLJScikitLearnInterface

Do model = PerceptronClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in PerceptronClassifier(penalty=...).

Hyper-parameters

  • penalty = nothing
  • alpha = 0.0001
  • fit_intercept = true
  • max_iter = 1000
  • tol = 0.001
  • shuffle = true
  • verbose = 0
  • eta0 = 1.0
  • n_jobs = nothing
  • random_state = 0
  • early_stopping = false
  • validation_fraction = 0.1
  • n_iter_no_change = 5
  • class_weight = nothing
  • warm_start = false
+PerceptronClassifier · MLJ

PerceptronClassifier

PerceptronClassifier

A model type for constructing a perceptron classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

PerceptronClassifier = @load PerceptronClassifier pkg=MLJScikitLearnInterface

Do model = PerceptronClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in PerceptronClassifier(penalty=...).

Hyper-parameters

  • penalty = nothing
  • alpha = 0.0001
  • fit_intercept = true
  • max_iter = 1000
  • tol = 0.001
  • shuffle = true
  • verbose = 0
  • eta0 = 1.0
  • n_jobs = nothing
  • random_state = 0
  • early_stopping = false
  • validation_fraction = 0.1
  • n_iter_no_change = 5
  • class_weight = nothing
  • warm_start = false
diff --git a/dev/models/Pipeline_MLJBase/index.html b/dev/models/Pipeline_MLJBase/index.html new file mode 100644 index 000000000..231a9e401 --- /dev/null +++ b/dev/models/Pipeline_MLJBase/index.html @@ -0,0 +1,10 @@ + +Pipeline · MLJ

Pipeline

Pipeline(component1, component2, ... , componentk; options...)
+Pipeline(name1=component1, name2=component2, ..., namek=componentk; options...)
+component1 |> component2 |> ... |> componentk

Create an instance of a composite model type which sequentially composes the specified components in order. This means component1 receives the input data, component1's output is passed to component2, and so forth. A "component" is either a Model instance, a model type (converted immediately to its default instance) or any callable object. Here the "output" of a model is what predict returns if it is Supervised, or what transform returns if it is Unsupervised.

Names for the component fields are automatically generated unless explicitly specified, as in

Pipeline(encoder=ContinuousEncoder(drop_last=false),
+         stand=Standardizer())

The Pipeline constructor accepts keyword options discussed further below.

Ordinary functions (and other callables) may be inserted in the pipeline as shown in the following example:

Pipeline(X->coerce(X, :age=>Continuous), OneHotEncoder, ConstantClassifier)

Syntactic sugar

The |> operator is overloaded to construct pipelines out of models, callables, and existing pipelines:

LinearRegressor = @load LinearRegressor pkg=MLJLinearModels add=true
+PCA = @load PCA pkg=MultivariateStats add=true
+
+pipe1 = MLJBase.table |> ContinuousEncoder |> Standardizer
+pipe2 = PCA |> LinearRegressor
+pipe1 |> pipe2

At most one of the components may be a supervised model, but this model can appear in any position. A pipeline with a Supervised component is itself Supervised and implements the predict operation. It is otherwise Unsupervised (possibly Static) and implements transform.
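
A minimal sketch of a supervised pipeline (assuming NearestNeighborModels.jl is available):

using MLJ

KNNClassifier = @load KNNClassifier pkg=NearestNeighborModels add=true

pipe = Standardizer() |> KNNClassifier(K=5)

X, y = @load_iris
mach = machine(pipe, X, y) |> fit!
yhat = predict(mach, X)          ## probabilistic predictions from the supervised component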

Special operations

If all the components are invertible unsupervised models (ie, implement inverse_transform) then inverse_transform is implemented for the pipeline. If there are no supervised models, then predict is nevertheless implemented, assuming the last component is a model that implements it (some clustering models). Similarly, calling transform on a supervised pipeline calls transform on the supervised component.

Optional key-word arguments

  • prediction_type - prediction type of the pipeline; possible values: :deterministic, :probabilistic, :interval (default=:deterministic if not inferable)
  • operation - operation applied to the supervised component model, when present; possible values: predict, predict_mean, predict_median, predict_mode (default=predict)
  • cache - whether the internal machines created for component models should cache model-specific representations of data (see machine) (default=true)
Warning

Set cache=false to guarantee data anonymization.

To build more complicated non-branching pipelines, refer to the MLJ manual sections on composing models.

diff --git a/dev/models/ProbabilisticNuSVC_LIBSVM/index.html b/dev/models/ProbabilisticNuSVC_LIBSVM/index.html index 163ffd268..50fc3d7ac 100644 --- a/dev/models/ProbabilisticNuSVC_LIBSVM/index.html +++ b/dev/models/ProbabilisticNuSVC_LIBSVM/index.html @@ -1,5 +1,5 @@ -ProbabilisticNuSVC · MLJ

ProbabilisticNuSVC

ProbabilisticNuSVC

A model type for constructing a probabilistic ν-support vector classifier, based on LIBSVM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ProbabilisticNuSVC = @load ProbabilisticNuSVC pkg=LIBSVM

Do model = ProbabilisticNuSVC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ProbabilisticNuSVC(kernel=...).

This model is identical to NuSVC with the exception that it predicts probabilities, instead of actual class labels. Probabilities are computed using Platt scaling, which will add to total computation time.

Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): "LIBSVM: a library for support vector machines." ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf.

Platt, John (1999): "Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods."

Training data

In MLJ or MLJBase, bind an instance model to data with:

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see "Examples" below).

    • LIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2
    • LIBSVM.Kernel.Polynomial: (x1, x2) -> (gamma*x1'*x2 + coef0)^degree
    • LIBSVM.Kernel.RadialBasis: (x1, x2) -> (exp(-gamma*norm(x1 - x2)^2))
    • LIBSVM.Kernel.Sigmoid: (x1, x2) -> tanh(gamma*x1'*x2 + coef0)

    Here gamma, coef0, degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions. See LIBSVM.jl issue 91.

  • gamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. Actual value used appears in the report (see below).

  • coef0 = 0.0: kernel parameter (see above)

  • degree::Int32 = Int32(3): degree in polynomial kernel (see above)

  • nu=0.5 (range (0, 1]): An upper bound on the fraction of margin errors and a lower bound of the fraction of support vectors. Denoted ν in the cited paper. Changing nu changes the thickness of the margin (a neighborhood of the decision surface) and a margin error is said to have occurred if a training observation lies on the wrong side of the surface or within the margin.

  • cachesize=200.0 cache memory size in MB

  • tolerance=0.001: tolerance for the stopping criterion

  • shrinking=true: whether to use shrinking heuristics

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • libsvm_model: the trained model object created by the LIBSVM.jl package
  • encoding: class encoding used internally by libsvm_model - a dictionary of class labels keyed on the internal integer representation

Report

The fields of report(mach) are:

  • gamma: actual value of the kernel parameter gamma used in training

Examples

Using a built-in kernel

using MLJ
+ProbabilisticNuSVC · MLJ

ProbabilisticNuSVC

ProbabilisticNuSVC

A model type for constructing a probabilistic ν-support vector classifier, based on LIBSVM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ProbabilisticNuSVC = @load ProbabilisticNuSVC pkg=LIBSVM

Do model = ProbabilisticNuSVC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ProbabilisticNuSVC(kernel=...).

This model is identical to NuSVC with the exception that it predicts probabilities, instead of actual class labels. Probabilities are computed using Platt scaling, which will add to total computation time.

Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): "LIBSVM: a library for support vector machines." ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf.

Platt, John (1999): "Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods."

Training data

In MLJ or MLJBase, bind an instance model to data with:

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see "Examples" below).

    • LIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2
    • LIBSVM.Kernel.Polynomial: (x1, x2) -> (gamma*x1'*x2 + coef0)^degree
    • LIBSVM.Kernel.RadialBasis: (x1, x2) -> (exp(-gamma*norm(x1 - x2)^2))
    • LIBSVM.Kernel.Sigmoid: (x1, x2) -> tanh(gamma*x1'*x2 + coef0)

    Here gamma, coef0, degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions. See LIBSVM.jl issue 91. A sketch of passing a user-defined kernel is given after this list.

  • gamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. Actual value used appears in the report (see below).

  • coef0 = 0.0: kernel parameter (see above)

  • degree::Int32 = Int32(3): degree in polynomial kernel (see above)

  • nu=0.5 (range (0, 1]): An upper bound on the fraction of margin errors and a lower bound of the fraction of support vectors. Denoted ν in the cited paper. Changing nu changes the thickness of the margin (a neighborhood of the decision surface) and a margin error is said to have occurred if a training observation lies on the wrong side of the surface or within the margin.

  • cachesize=200.0 cache memory size in MB

  • tolerance=0.001: tolerance for the stopping criterion

  • shrinking=true: whether to use shrinking heuristics
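
As referenced in the kernel hyper-parameter above, a user-defined kernel is just a callable of two vectors. A minimal sketch (the function k below is hypothetical):

using MLJ
import LIBSVM

ProbabilisticNuSVC = @load ProbabilisticNuSVC pkg=LIBSVM

k(x1, x2) = x1'*x2                ## a simple linear kernel
model = ProbabilisticNuSVC(kernel=k)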

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • libsvm_model: the trained model object created by the LIBSVM.jl package
  • encoding: class encoding used internally by libsvm_model - a dictionary of class labels keyed on the internal integer representation

Report

The fields of report(mach) are:

  • gamma: actual value of the kernel parameter gamma used in training

Examples

Using a built-in kernel

using MLJ
 import LIBSVM
 
 ProbabilisticNuSVC = @load ProbabilisticNuSVC pkg=LIBSVM    ## model type
@@ -27,4 +27,4 @@
 model = ProbabilisticNuSVC(kernel=k)
 mach = machine(model, X, y) |> fit!
 
-probs = predict(mach, Xnew)

See also the classifiers NuSVC, SVC, ProbabilisticSVC and LinearSVC. And see LIBSVM.jl and the original C implementation documentation.

+probs = predict(mach, Xnew)

See also the classifiers NuSVC, SVC, ProbabilisticSVC and LinearSVC. And see LIBSVM.jl and the original C implementation documentation.

diff --git a/dev/models/ProbabilisticSGDClassifier_MLJScikitLearnInterface/index.html b/dev/models/ProbabilisticSGDClassifier_MLJScikitLearnInterface/index.html index 53a738781..fcecffa3c 100644 --- a/dev/models/ProbabilisticSGDClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/ProbabilisticSGDClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -ProbabilisticSGDClassifier · MLJ

ProbabilisticSGDClassifier

ProbabilisticSGDClassifier

A model type for constructing a probabilistic sgd classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ProbabilisticSGDClassifier = @load ProbabilisticSGDClassifier pkg=MLJScikitLearnInterface

Do model = ProbabilisticSGDClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ProbabilisticSGDClassifier(loss=...).

Hyper-parameters

  • loss = log_loss
  • penalty = l2
  • alpha = 0.0001
  • l1_ratio = 0.15
  • fit_intercept = true
  • max_iter = 1000
  • tol = 0.001
  • shuffle = true
  • verbose = 0
  • epsilon = 0.1
  • n_jobs = nothing
  • random_state = nothing
  • learning_rate = optimal
  • eta0 = 0.0
  • power_t = 0.5
  • early_stopping = false
  • validation_fraction = 0.1
  • n_iter_no_change = 5
  • class_weight = nothing
  • warm_start = false
  • average = false
+ProbabilisticSGDClassifier · MLJ

ProbabilisticSGDClassifier

ProbabilisticSGDClassifier

A model type for constructing a probabilistic sgd classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ProbabilisticSGDClassifier = @load ProbabilisticSGDClassifier pkg=MLJScikitLearnInterface

Do model = ProbabilisticSGDClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ProbabilisticSGDClassifier(loss=...).

Hyper-parameters

  • loss = log_loss
  • penalty = l2
  • alpha = 0.0001
  • l1_ratio = 0.15
  • fit_intercept = true
  • max_iter = 1000
  • tol = 0.001
  • shuffle = true
  • verbose = 0
  • epsilon = 0.1
  • n_jobs = nothing
  • random_state = nothing
  • learning_rate = optimal
  • eta0 = 0.0
  • power_t = 0.5
  • early_stopping = false
  • validation_fraction = 0.1
  • n_iter_no_change = 5
  • class_weight = nothing
  • warm_start = false
  • average = false
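
This page lists no worked example. The following minimal sketch (not part of the original documentation) shows the usual MLJ workflow; the synthetic data from make_blobs and the particular hyper-parameter overrides are illustrative assumptions only.

using MLJ
ProbabilisticSGDClassifier = @load ProbabilisticSGDClassifier pkg=MLJScikitLearnInterface
X, y = make_blobs(200, 3; centers=2)     ## synthetic two-class data (illustrative)
model = ProbabilisticSGDClassifier(alpha=0.001, max_iter=500)
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)                  ## probabilistic predictions
predict_mode(mach, X)                    ## hard class labels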
diff --git a/dev/models/ProbabilisticSVC_LIBSVM/index.html b/dev/models/ProbabilisticSVC_LIBSVM/index.html index fe2772524..084988f2f 100644 --- a/dev/models/ProbabilisticSVC_LIBSVM/index.html +++ b/dev/models/ProbabilisticSVC_LIBSVM/index.html @@ -1,5 +1,5 @@ -ProbabilisticSVC · MLJ

ProbabilisticSVC

ProbabilisticSVC

A model type for constructing a probabilistic C-support vector classifier, based on LIBSVM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ProbabilisticSVC = @load ProbabilisticSVC pkg=LIBSVM

Do model = ProbabilisticSVC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ProbabilisticSVC(kernel=...).

This model is identical to SVC with the exception that it predicts probabilities, instead of actual class labels. Probabilities are computed using Platt scaling, which will add to the total computation time.

Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): "LIBSVM: a library for support vector machines." ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf.

Platt, John (1999): "Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods."

Training data

In MLJ or MLJBase, bind an instance model to data with one of:

mach = machine(model, X, y)
+ProbabilisticSVC · MLJ

ProbabilisticSVC

ProbabilisticSVC

A model type for constructing a probabilistic C-support vector classifier, based on LIBSVM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ProbabilisticSVC = @load ProbabilisticSVC pkg=LIBSVM

Do model = ProbabilisticSVC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ProbabilisticSVC(kernel=...).

This model is identical to SVC with the exception that it predicts probabilities, instead of actual class labels. Probabilities are computed using Platt scaling, which will add to the total computation time.

Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): "LIBSVM: a library for support vector machines." ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf.

Platt, John (1999): "Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods."

Training data

In MLJ or MLJBase, bind an instance model to data with one of:

mach = machine(model, X, y)
 mach = machine(model, X, y, w)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)
  • y: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)
  • w: a dictionary of class weights, keyed on levels(y).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see "Examples" below, and the sketch after this hyper-parameter list).

    • LIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2
    • LIBSVM.Kernel.Polynomial: (x1, x2) -> (gamma*x1'*x2 + coef0)^degree
    • LIBSVM.Kernel.RadialBasis: (x1, x2) -> exp(-gamma*norm(x1 - x2)^2)
    • LIBSVM.Kernel.Sigmoid: (x1, x2) -> tanh(gamma*x1'*x2 + coef0)

    Here gamma, coef0 and degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions; see LIBSVM.jl issue 91.

  • gamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. Actual value used appears in the report (see below).

  • coef0 = 0.0: kernel parameter (see above)

  • degree::Int32 = Int32(3): degree in polynomial kernel (see above)

  • cost=1.0 (range (0, Inf)): the parameter denoted C in the cited reference; for greater regularization, decrease cost

  • cachesize=200.0: cache memory size in MB

  • tolerance=0.001: tolerance for the stopping criterion

  • shrinking=true: whether to use shrinking heuristics
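
As a sketch of the user-defined kernel option described in the kernel entry above (this example is not from the original page; the Laplacian-style kernel k is an illustrative choice only):

using MLJ, LinearAlgebra
import LIBSVM
ProbabilisticSVC = @load ProbabilisticSVC pkg=LIBSVM
X, y = @load_iris
k(x1, x2) = exp(-norm(x1 - x2))       ## any callable k(x1, x2) acting on feature vectors
model = ProbabilisticSVC(kernel=k)
mach = machine(model, X, y) |> fit!
probs = predict(mach, X)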

Operations

  • predict(mach, Xnew): return probabilistic predictions of the target given features Xnew having the same scitype as X above.
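
Because these predictions are probabilistic (a vector of UnivariateFinite distributions), class probabilities and modal labels can be extracted as in the following sketch (the class name "virginica" is illustrative and not part of the original page):

probs = predict(mach, Xnew)        ## vector of UnivariateFinite distributions
pdf.(probs, "virginica")           ## probability assigned to one class
predict_mode(mach, Xnew)           ## most probable class labels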

Fitted parameters

The fields of fitted_params(mach) are:

  • libsvm_model: the trained model object created by the LIBSVM.jl package
  • encoding: class encoding used internally by libsvm_model - a dictionary of class labels keyed on the internal integer representation

Report

The fields of report(mach) are:

  • gamma: actual value of the kernel parameter gamma used in training

Examples

Using a built-in kernel

using MLJ
 import LIBSVM
 
@@ -32,4 +32,4 @@
 probs = predict(mach, Xnew)

Incorporating class weights

In either scenario above, we can do:

weights = Dict("virginica" => 1, "versicolor" => 20, "setosa" => 1)
 mach = machine(model, X, y, weights) |> fit!
 
-probs = predict(mach, Xnew)

See also the classifiers SVC, NuSVC and LinearSVC, and LIBSVM.jl and the documentation for the original C implementation.

+probs = predict(mach, Xnew)

See also the classifiers SVC, NuSVC and LinearSVC, and LIBSVM.jl and the documentation for the original C implementation.

diff --git a/dev/models/QuantileRegressor_MLJLinearModels/index.html b/dev/models/QuantileRegressor_MLJLinearModels/index.html index 1ad0ca682..6af43fb7e 100644 --- a/dev/models/QuantileRegressor_MLJLinearModels/index.html +++ b/dev/models/QuantileRegressor_MLJLinearModels/index.html @@ -1,6 +1,6 @@ -QuantileRegressor · MLJ

QuantileRegressor

QuantileRegressor

A model type for constructing a quantile regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

QuantileRegressor = @load QuantileRegressor pkg=MLJLinearModels

Do model = QuantileRegressor() to construct an instance with default hyper-parameters.

This model coincides with RobustRegressor, with the exception that the robust loss, rho, is fixed to QuantileRho(delta), where delta is a new hyperparameter.

Different solver options exist, as indicated under "Hyperparameters" below.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • delta::Real: parameterizes the QuantileRho function, indicating the quantile to use (the default, 0.5, corresponds to median regression). Default: 0.5

  • lambda::Real: strength of the regularizer if penalty is :l2 or :l1. Strength of the L2 regularizer if penalty is :en. Default: 1.0

  • gamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0

  • penalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2

  • fit_intercept::Bool: whether to fit the intercept or not. Default: true

  • penalize_intercept::Bool: whether to penalize the intercept. Default: false

  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true

  • solver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S, where S is one of LBFGS or IWLSCG if penalty = :l2, and ProxGrad otherwise.

    If solver = nothing (default) then LBFGS() is used, if penalty = :l2, and otherwise ProxGrad(accel=true) (FISTA) is used.

    Solver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) Default: nothing

Example

using MLJ
+QuantileRegressor · MLJ

QuantileRegressor

QuantileRegressor

A model type for constructing a quantile regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

QuantileRegressor = @load QuantileRegressor pkg=MLJLinearModels

Do model = QuantileRegressor() to construct an instance with default hyper-parameters.

This model coincides with RobustRegressor, with the exception that the robust loss, rho, is fixed to QuantileRho(delta), where delta is a new hyperparameter.

Different solver options exist, as indicated under "Hyperparameters" below.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • delta::Real: parameterizes the QuantileRho function, indicating the quantile to use (the default, 0.5, corresponds to median regression). Default: 0.5

  • lambda::Real: strength of the regularizer if penalty is :l2 or :l1. Strength of the L2 regularizer if penalty is :en. Default: 1.0

  • gamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0

  • penalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2

  • fit_intercept::Bool: whether to fit the intercept or not. Default: true

  • penalize_intercept::Bool: whether to penalize the intercept. Default: false

  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true

  • solver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S, where S is one of LBFGS or IWLSCG if penalty = :l2, and ProxGrad otherwise.

    If solver = nothing (default) then LBFGS() is used, if penalty = :l2, and otherwise ProxGrad(accel=true) (FISTA) is used.

    Solver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) Default: nothing

Example

using MLJ
 X, y = make_regression()
 mach = fit!(machine(QuantileRegressor(), X, y))
 predict(mach, X)
-fitted_params(mach)

See also RobustRegressor, HuberRegressor.

+fitted_params(mach)
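
Not part of the original page: a sketch showing how delta selects the quantile being fitted, by training two regressors to obtain a rough 10%–90% prediction band (the data and the quantile choices are illustrative):

using MLJ
QuantileRegressor = @load QuantileRegressor pkg=MLJLinearModels
X, y = make_regression()
lower = fit!(machine(QuantileRegressor(delta=0.1), X, y))
upper = fit!(machine(QuantileRegressor(delta=0.9), X, y))
predict(lower, X), predict(upper, X)   ## approximate 10th and 90th percentile predictions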

See also RobustRegressor, HuberRegressor.

diff --git a/dev/models/RANSACRegressor_MLJScikitLearnInterface/index.html b/dev/models/RANSACRegressor_MLJScikitLearnInterface/index.html index b8b56f4a1..817ae1fba 100644 --- a/dev/models/RANSACRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/RANSACRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -RANSACRegressor · MLJ

RANSACRegressor

RANSACRegressor

A model type for constructing a ransac regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RANSACRegressor = @load RANSACRegressor pkg=MLJScikitLearnInterface

Do model = RANSACRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RANSACRegressor(estimator=...).

Hyper-parameters

  • estimator = nothing
  • min_samples = 5
  • residual_threshold = nothing
  • is_data_valid = nothing
  • is_model_valid = nothing
  • max_trials = 100
  • max_skips = 9223372036854775807
  • stop_n_inliers = 9223372036854775807
  • stop_score = Inf
  • stop_probability = 0.99
  • loss = absolute_error
  • random_state = nothing
+RANSACRegressor · MLJ

RANSACRegressor

RANSACRegressor

A model type for constructing a ransac regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RANSACRegressor = @load RANSACRegressor pkg=MLJScikitLearnInterface

Do model = RANSACRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RANSACRegressor(estimator=...).

Hyper-parameters

  • estimator = nothing
  • min_samples = 5
  • residual_threshold = nothing
  • is_data_valid = nothing
  • is_model_valid = nothing
  • max_trials = 100
  • max_skips = 9223372036854775807
  • stop_n_inliers = 9223372036854775807
  • stop_score = Inf
  • stop_probability = 0.99
  • loss = absolute_error
  • random_state = nothing
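
The page gives no worked example; here is a minimal sketch (not from the original documentation; the synthetic data and the injected outliers are illustrative only):

using MLJ
RANSACRegressor = @load RANSACRegressor pkg=MLJScikitLearnInterface
X, y = make_regression(100, 3)
y[1:5] .+= 100.0                         ## corrupt a few targets to mimic outliers
model = RANSACRegressor(min_samples=10, max_trials=200)
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)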
diff --git a/dev/models/RODDetector_OutlierDetectionPython/index.html b/dev/models/RODDetector_OutlierDetectionPython/index.html index 4e7923c17..46cf1feae 100644 --- a/dev/models/RODDetector_OutlierDetectionPython/index.html +++ b/dev/models/RODDetector_OutlierDetectionPython/index.html @@ -1,2 +1,2 @@ -RODDetector · MLJ
+RODDetector · MLJ
diff --git a/dev/models/ROSE_Imbalance/index.html b/dev/models/ROSE_Imbalance/index.html index 29dede12b..2a337ad7a 100644 --- a/dev/models/ROSE_Imbalance/index.html +++ b/dev/models/ROSE_Imbalance/index.html @@ -1,5 +1,5 @@ -ROSE · MLJ

ROSE

Initiate a ROSE model with the given hyper-parameters.

ROSE

A model type for constructing a rose, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ROSE = @load ROSE pkg=Imbalance

Do model = ROSE() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ROSE(s=...).

ROSE implements the ROSE (Random Oversampling Examples) algorithm to correct for class imbalance as in G Menardi, N. Torelli, “Training and assessing classification rules with imbalanced data,” Data Mining and Knowledge Discovery, 28(1), pp.92-122, 2014.

Training data

In MLJ or MLJBase, wrap the model in a machine by mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by model = ROSE()

Hyperparameters

  • s::float: A parameter that proportionally controls the bandwidth of the Gaussian kernel

  • ratios=1.0: A parameter that controls the amount of oversampling to be done for each class

    • Can be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class
    • Can be a dictionary mapping each class label to the float ratio for that class
  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it; otherwise MersenneTwister is used.

Transform Inputs

  • X: A matrix or table of floats where each row is an observation from the dataset
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • Xover: A matrix or table that includes the original data and the new observations due to oversampling, depending on whether the input X is a matrix or a table, respectively
  • yover: An abstract vector of labels corresponding to Xover

Operations

  • transform(mach, X, y): resample the data X and y using ROSE, returning both the new and original observations

Example

using MLJ
+ROSE · MLJ

ROSE

Initiate a ROSE model with the given hyper-parameters.

ROSE

A model type for constructing a rose, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ROSE = @load ROSE pkg=Imbalance

Do model = ROSE() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ROSE(s=...).

ROSE implements the ROSE (Random Oversampling Examples) algorithm to correct for class imbalance as in G Menardi, N. Torelli, “Training and assessing classification rules with imbalanced data,” Data Mining and Knowledge Discovery, 28(1), pp.92-122, 2014.

Training data

In MLJ or MLJBase, wrap the model in a machine by mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by model = ROSE()

Hyperparameters

  • s::float: A parameter that proportionally controls the bandwidth of the Gaussian kernel

  • ratios=1.0: A parameter that controls the amount of oversampling to be done for each class (see the sketch after this list)

    • Can be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class
    • Can be a dictionary mapping each class label to the float ratio for that class
  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it; otherwise MersenneTwister is used.
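
As a sketch of how the ratios hyper-parameter above can be specified (not part of the original page; the class label "rare" and the chosen values are illustrative):

using MLJ
ROSE = @load ROSE pkg=Imbalance
oversampler = ROSE(s=0.5, ratios=Dict("rare" => 1.0), rng=42)  ## oversample class "rare" to the majority-class size
mach = machine(oversampler)               ## static transformer: no data needed at construction
Xover, yover = transform(mach, X, y)      ## X, y as described under "Transform Inputs" below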

Transform Inputs

  • X: A matrix or table of floats where each row is an observation from the dataset
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • Xover: A matrix or table that includes the original data and the new observations due to oversampling, depending on whether the input X is a matrix or a table, respectively
  • yover: An abstract vector of labels corresponding to Xover

Operations

  • transform(mach, X, y): resample the data X and y using ROSE, returning both the new and original observations

Example

using MLJ
 import Imbalance
 
 ## set probability of each class
@@ -27,4 +27,4 @@
 julia> Imbalance.checkbalance(yover)
 2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 38 (79.2%) 
 1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 43 (89.6%) 
-0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%) 
+0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%)
diff --git a/dev/models/RandomForestClassifier_BetaML/index.html b/dev/models/RandomForestClassifier_BetaML/index.html index dbeb03f97..8ab63ab3b 100644 --- a/dev/models/RandomForestClassifier_BetaML/index.html +++ b/dev/models/RandomForestClassifier_BetaML/index.html @@ -1,5 +1,5 @@ -RandomForestClassifier · MLJ

RandomForestClassifier

mutable struct RandomForestClassifier <: MLJModelInterface.Probabilistic

A simple Random Forest model for classification with support for Missing data, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • n_trees::Int64
  • max_depth::Int64: The maximum depth the tree is allowed to reach. When this is reached the node is forced to become a leaf [def: 0, i.e. no limits]
  • min_gain::Float64: The minimum information gain to allow for a node's partition [def: 0]
  • min_records::Int64: The minimum number of records a node must hold to be considered for partitioning [def: 2]
  • max_features::Int64: The maximum number of (random) features to consider at each partitioning [def: 0, i.e. square root of the data dimensions]
  • splitting_criterion::Function: The name of the function used to compute the information gain of a specific partition. This is done by measuring the difference between the "impurity" of the labels of the parent node and those of the two child nodes, weighted by the respective number of items. [def: gini]. Either gini, entropy or a custom function; it can also be an anonymous function.
  • β::Float64: Parameter that regulates the weights of the scoring of each tree, to be (optionally) used in prediction based on the error of the individual trees computed on the records on which the trees have not been trained. Higher values favour "better" trees, but too high values will cause overfitting [def: 0, i.e. uniform weights]
  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]

Example :

julia> using MLJ
+RandomForestClassifier · MLJ

RandomForestClassifier

mutable struct RandomForestClassifier <: MLJModelInterface.Probabilistic

A simple Random Forest model for classification with support for Missing data, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • n_trees::Int64
  • max_depth::Int64: The maximum depth the tree is allowed to reach. When this is reached the node is forced to become a leaf [def: 0, i.e. no limits]
  • min_gain::Float64: The minimum information gain to allow for a node's partition [def: 0]
  • min_records::Int64: The minimum number of records a node must hold to be considered for partitioning [def: 2]
  • max_features::Int64: The maximum number of (random) features to consider at each partitioning [def: 0, i.e. square root of the data dimensions]
  • splitting_criterion::Function: The name of the function used to compute the information gain of a specific partition. This is done by measuring the difference between the "impurity" of the labels of the parent node and those of the two child nodes, weighted by the respective number of items. [def: gini]. Either gini, entropy or a custom function; it can also be an anonymous function.
  • β::Float64: Parameter that regulates the weights of the scoring of each tree, to be (optionally) used in prediction based on the error of the individual trees computed on the records on which the trees have not been trained. Higher values favour "better" trees, but too high values will cause overfitting [def: 0, i.e. uniform weights]
  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]

Example :

julia> using MLJ
 
 julia> X, y        = @load_iris;
 
@@ -28,4 +28,4 @@
  UnivariateFinite{Multiclass{3}}(setosa=>1.0, versicolor=>0.0, virginica=>0.0)
  ⋮
  UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.0, virginica=>1.0)
- UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.0667, virginica=>0.933)
+ UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.0667, virginica=>0.933)
diff --git a/dev/models/RandomForestClassifier_DecisionTree/index.html b/dev/models/RandomForestClassifier_DecisionTree/index.html index f465bbcab..6706682d3 100644 --- a/dev/models/RandomForestClassifier_DecisionTree/index.html +++ b/dev/models/RandomForestClassifier_DecisionTree/index.html @@ -1,5 +1,5 @@ -RandomForestClassifier · MLJ

RandomForestClassifier

RandomForestClassifier

A model type for constructing a CART random forest classifier, based on DecisionTree.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RandomForestClassifier = @load RandomForestClassifier pkg=DecisionTree

Do model = RandomForestClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomForestClassifier(max_depth=...).

RandomForestClassifier implements the standard Random Forest algorithm, originally published in Breiman, L. (2001): "Random Forests.", Machine Learning, vol. 45, pp. 5–32.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyperparameters

  • max_depth=-1: max depth of the decision tree (-1=any)
  • min_samples_leaf=1: min number of samples each leaf needs to have
  • min_samples_split=2: min number of samples needed for a split
  • min_purity_increase=0: min purity needed for a split
  • n_subfeatures=-1: number of features to select at random (0 for all, -1 for square root of number of features)
  • n_trees=10: number of trees to train
  • sampling_fraction=0.7: fraction of samples to train each tree on
  • feature_importance: method to use for computing feature importances. One of (:impurity, :split)
  • rng=Random.GLOBAL_RNG: random number generator or seed

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above. Predictions are probabilistic, but uncalibrated.
  • predict_mode(mach, Xnew): instead return the mode of each prediction above.

Fitted parameters

The fields of fitted_params(mach) are:

  • forest: the Ensemble object returned by the core DecisionTree.jl algorithm

Report

The fields of report(mach) are:

  • features: the names of the features encountered in training

Accessor functions

  • feature_importances(mach) returns a vector of (feature::Symbol => importance) pairs; the type of importance is determined by the hyperparameter feature_importance (see above)

Examples

using MLJ
+RandomForestClassifier · MLJ

RandomForestClassifier

RandomForestClassifier

A model type for constructing a CART random forest classifier, based on DecisionTree.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RandomForestClassifier = @load RandomForestClassifier pkg=DecisionTree

Do model = RandomForestClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomForestClassifier(max_depth=...).

RandomForestClassifier implements the standard Random Forest algorithm, originally published in Breiman, L. (2001): "Random Forests.", Machine Learning, vol. 45, pp. 5–32.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyperparameters

  • max_depth=-1: max depth of the decision tree (-1=any)
  • min_samples_leaf=1: min number of samples each leaf needs to have
  • min_samples_split=2: min number of samples needed for a split
  • min_purity_increase=0: min purity needed for a split
  • n_subfeatures=-1: number of features to select at random (0 for all, -1 for square root of number of features)
  • n_trees=10: number of trees to train
  • sampling_fraction=0.7: fraction of samples to train each tree on
  • feature_importance: method to use for computing feature importances. One of (:impurity, :split)
  • rng=Random.GLOBAL_RNG: random number generator or seed

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above. Predictions are probabilistic, but uncalibrated.
  • predict_mode(mach, Xnew): instead return the mode of each prediction above.
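
Not part of the original page: since predictions are probabilistic, probabilistic and deterministic measures can be estimated together, as in this sketch (the resampling strategy and measures shown are illustrative):

using MLJ
Forest = @load RandomForestClassifier pkg=DecisionTree
X, y = @load_iris
mach = machine(Forest(n_trees=50), X, y)
evaluate!(mach, resampling=CV(nfolds=5), measures=[log_loss, accuracy])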

Fitted parameters

The fields of fitted_params(mach) are:

  • forest: the Ensemble object returned by the core DecisionTree.jl algorithm

Report

The fields of report(mach) are:

  • features: the names of the features encountered in training

Accessor functions

  • feature_importances(mach) returns a vector of (feature::Symbol => importance) pairs; the type of importance is determined by the hyperparameter feature_importance (see above)

Examples

using MLJ
 Forest = @load RandomForestClassifier pkg=DecisionTree
 forest = Forest(min_samples_split=6, n_subfeatures=3)
 
@@ -19,4 +19,4 @@
 feature_importances(mach)  ## `:impurity` feature importances
 forest.feature_importance = :split
 feature_importances(mach)  ## `:split` feature importances
-

See also DecisionTree.jl and the unwrapped model type MLJDecisionTreeInterface.DecisionTree.RandomForestClassifier.

+

See also DecisionTree.jl and the unwrapped model type MLJDecisionTreeInterface.DecisionTree.RandomForestClassifier.

diff --git a/dev/models/RandomForestClassifier_MLJScikitLearnInterface/index.html b/dev/models/RandomForestClassifier_MLJScikitLearnInterface/index.html index 2d3dc5fc0..93aec4c69 100644 --- a/dev/models/RandomForestClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/RandomForestClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -RandomForestClassifier · MLJ

RandomForestClassifier

RandomForestClassifier

A model type for constructing a random forest classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RandomForestClassifier = @load RandomForestClassifier pkg=MLJScikitLearnInterface

Do model = RandomForestClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomForestClassifier(n_estimators=...).

A random forest is a meta estimator that fits a number of classifying decision trees on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting. The sub-sample size is controlled with the max_samples parameter if bootstrap=True (default), otherwise the whole dataset is used to build each tree.

+RandomForestClassifier · MLJ

RandomForestClassifier

RandomForestClassifier

A model type for constructing a random forest classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RandomForestClassifier = @load RandomForestClassifier pkg=MLJScikitLearnInterface

Do model = RandomForestClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomForestClassifier(n_estimators=...).

A random forest is a meta estimator that fits a number of classifying decision trees on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting. The sub-sample size is controlled with the max_samples parameter if bootstrap=True (default), otherwise the whole dataset is used to build each tree.
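
A minimal usage sketch, not in the original page; the names n_estimators, max_samples and bootstrap are those mentioned above, and the values chosen are illustrative:

using MLJ
RandomForestClassifier = @load RandomForestClassifier pkg=MLJScikitLearnInterface
X, y = @load_iris
model = RandomForestClassifier(n_estimators=200, max_samples=0.8)  ## bootstrap=true is the default
mach = machine(model, X, y) |> fit!
predict(mach, X)     ## probabilistic predictions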

diff --git a/dev/models/RandomForestImputer_BetaML/index.html b/dev/models/RandomForestImputer_BetaML/index.html index ad2fed99b..fa589efbc 100644 --- a/dev/models/RandomForestImputer_BetaML/index.html +++ b/dev/models/RandomForestImputer_BetaML/index.html @@ -1,5 +1,5 @@ -RandomForestImputer · MLJ

RandomForestImputer

mutable struct RandomForestImputer <: MLJModelInterface.Unsupervised

Impute missing values using Random Forests, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • n_trees::Int64: Number of (decision) trees in the forest [def: 30]
  • max_depth::Union{Nothing, Int64}: The maximum depth the tree is allowed to reach. When this is reached the node is forced to become a leaf [def: nothing, i.e. no limits]
  • min_gain::Float64: The minimum information gain to allow for a node's partition [def: 0]
  • min_records::Int64: The minimum number of records a node must hold to be considered for partitioning [def: 2]
  • max_features::Union{Nothing, Int64}: The maximum number of (random) features to consider at each partitioning [def: nothing, i.e. square root of the data dimension]
  • forced_categorical_cols::Vector{Int64}: Specify the positions of the integer columns to treat as categorical instead of cardinal. [Default: empty vector (all numerical cols are treated as cardinal by default and the others as categorical)]
  • splitting_criterion::Union{Nothing, Function}: Either gini, entropy or variance. The name of the function used to compute the information gain of a specific partition. This is done by measuring the difference between the "impurity" of the labels of the parent node and those of the two child nodes, weighted by the respective number of items. [def: nothing, i.e. gini for categorical labels (classification task) and variance for numerical labels (regression task)]. It can be an anonymous function.
  • recursive_passages::Int64: The number of times to pass through the various columns when imputing their data. Useful when there are data to impute in multiple columns. The order of the first passage is given by the decreasing number of missing values per column; the other passages are random [default: 1].
  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]

Example:

julia> using MLJ
+RandomForestImputer · MLJ

RandomForestImputer

mutable struct RandomForestImputer <: MLJModelInterface.Unsupervised

Impute missing values using Random Forests, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • n_trees::Int64: Number of (decision) trees in the forest [def: 30]
  • max_depth::Union{Nothing, Int64}: The maximum depth the tree is allowed to reach. When this is reached the node is forced to become a leaf [def: nothing, i.e. no limits]
  • min_gain::Float64: The minimum information gain to allow for a node's partition [def: 0]
  • min_records::Int64: The minimum number of records a node must hold to be considered for partitioning [def: 2]
  • max_features::Union{Nothing, Int64}: The maximum number of (random) features to consider at each partitioning [def: nothing, i.e. square root of the data dimension]
  • forced_categorical_cols::Vector{Int64}: Specify the positions of the integer columns to treat as categorical instead of cardinal. [Default: empty vector (all numerical cols are treated as cardinal by default and the others as categorical)]
  • splitting_criterion::Union{Nothing, Function}: Either gini, entropy or variance. The name of the function used to compute the information gain of a specific partition. This is done by measuring the difference between the "impurity" of the labels of the parent node and those of the two child nodes, weighted by the respective number of items. [def: nothing, i.e. gini for categorical labels (classification task) and variance for numerical labels (regression task)]. It can be an anonymous function.
  • recursive_passages::Int64: The number of times to pass through the various columns when imputing their data. Useful when there are data to impute in multiple columns. The order of the first passage is given by the decreasing number of missing values per column; the other passages are random [default: 1].
  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]

Example:

julia> using MLJ
 
 julia> X = [1 10.5;1.5 missing; 1.8 8; 1.7 15; 3.2 40; missing missing; 3.3 38; missing -2.3; 5.2 -2.4] |> table ;
 
@@ -33,4 +33,4 @@
  2.88375   8.66125
  3.3      38.0
  3.98125  -2.3
- 5.2      -2.4
+ 5.2 -2.4
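
Not part of the original page: a sketch of the standard machine workflow for this imputer, using recursive_passages as described above (X stands for any table containing missing entries):

using MLJ
RandomForestImputer = @load RandomForestImputer pkg=BetaML
imputer = RandomForestImputer(n_trees=40, recursive_passages=2)
mach = machine(imputer, X) |> fit!        ## X: a table with `missing` entries
X_imputed = transform(mach, X)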
diff --git a/dev/models/RandomForestRegressor_BetaML/index.html b/dev/models/RandomForestRegressor_BetaML/index.html index 5516aad50..0dcc4c0c1 100644 --- a/dev/models/RandomForestRegressor_BetaML/index.html +++ b/dev/models/RandomForestRegressor_BetaML/index.html @@ -1,5 +1,5 @@ -RandomForestRegressor · MLJ

RandomForestRegressor

mutable struct RandomForestRegressor <: MLJModelInterface.Deterministic

A simple Random Forest model for regression with support for Missing data, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • n_trees::Int64: Number of (decision) trees in the forest [def: 30]
  • max_depth::Int64: The maximum depth the tree is allowed to reach. When this is reached the node is forced to become a leaf [def: 0, i.e. no limits]
  • min_gain::Float64: The minimum information gain to allow for a node's partition [def: 0]
  • min_records::Int64: The minimum number of records a node must hold to be considered for partitioning [def: 2]
  • max_features::Int64: The maximum number of (random) features to consider at each partitioning [def: 0, i.e. square root of the data dimension]
  • splitting_criterion::Function: The name of the function used to compute the information gain of a specific partition. This is done by measuring the difference between the "impurity" of the labels of the parent node and those of the two child nodes, weighted by the respective number of items. [def: variance]. Either variance or a custom function; it can also be an anonymous function.
  • β::Float64: Parameter that regulates the weights of the scoring of each tree, to be (optionally) used in prediction based on the error of the individual trees computed on the records on which the trees have not been trained. Higher values favour "better" trees, but too high values will cause overfitting [def: 0, i.e. uniform weights]
  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]

Example:

julia> using MLJ
+RandomForestRegressor · MLJ

RandomForestRegressor

mutable struct RandomForestRegressor <: MLJModelInterface.Deterministic

A simple Random Forest model for regression with support for Missing data, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • n_trees::Int64: Number of (decision) trees in the forest [def: 30]
  • max_depth::Int64: The maximum depth the tree is allowed to reach. When this is reached the node is forced to become a leaf [def: 0, i.e. no limits]
  • min_gain::Float64: The minimum information gain to allow for a node's partition [def: 0]
  • min_records::Int64: The minimum number of records a node must hold to be considered for partitioning [def: 2]
  • max_features::Int64: The maximum number of (random) features to consider at each partitioning [def: 0, i.e. square root of the data dimension]
  • splitting_criterion::Function: The name of the function used to compute the information gain of a specific partition. This is done by measuring the difference between the "impurity" of the labels of the parent node and those of the two child nodes, weighted by the respective number of items. [def: variance]. Either variance or a custom function; it can also be an anonymous function.
  • β::Float64: Parameter that regulates the weights of the scoring of each tree, to be (optionally) used in prediction based on the error of the individual trees computed on the records on which the trees have not been trained. Higher values favour "better" trees, but too high values will cause overfitting [def: 0, i.e. uniform weights]
  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]

Example:

julia> using MLJ
 
 julia> X, y        = @load_boston;
 
@@ -33,4 +33,4 @@
   ⋮    
  23.9  24.42
  22.0  22.4433
- 11.9  15.5833
+ 11.9 15.5833
diff --git a/dev/models/RandomForestRegressor_DecisionTree/index.html b/dev/models/RandomForestRegressor_DecisionTree/index.html index 72f5c5b2a..5d7113735 100644 --- a/dev/models/RandomForestRegressor_DecisionTree/index.html +++ b/dev/models/RandomForestRegressor_DecisionTree/index.html @@ -1,5 +1,5 @@ -RandomForestRegressor · MLJ

RandomForestRegressor

RandomForestRegressor

A model type for constructing a CART random forest regressor, based on DecisionTree.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RandomForestRegressor = @load RandomForestRegressor pkg=DecisionTree

Do model = RandomForestRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomForestRegressor(max_depth=...).

RandomForestRegressor implements the standard Random Forest algorithm, originally published in Breiman, L. (2001): "Random Forests.", Machine Learning, vol. 45, pp. 5–32.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyperparameters

  • max_depth=-1: max depth of the decision tree (-1=any)
  • min_samples_leaf=1: min number of samples each leaf needs to have
  • min_samples_split=2: min number of samples needed for a split
  • min_purity_increase=0: min purity needed for a split
  • n_subfeatures=-1: number of features to select at random (0 for all, -1 for square root of number of features)
  • n_trees=10: number of trees to train
  • sampling_fraction=0.7: fraction of samples to train each tree on
  • feature_importance: method to use for computing feature importances. One of (:impurity, :split)
  • rng=Random.GLOBAL_RNG: random number generator or seed

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew having the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • forest: the Ensemble object returned by the core DecisionTree.jl algorithm

Report

The fields of report(mach) are:

  • features: the names of the features encountered in training

Accessor functions

  • feature_importances(mach) returns a vector of (feature::Symbol => importance) pairs; the type of importance is determined by the hyperparameter feature_importance (see above)

Examples

using MLJ
+RandomForestRegressor · MLJ

RandomForestRegressor

RandomForestRegressor

A model type for constructing a CART random forest regressor, based on DecisionTree.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RandomForestRegressor = @load RandomForestRegressor pkg=DecisionTree

Do model = RandomForestRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomForestRegressor(max_depth=...).

RandomForestRegressor implements the standard Random Forest algorithm, originally published in Breiman, L. (2001): "Random Forests.", Machine Learning, vol. 45, pp. 5–32.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyperparameters

  • max_depth=-1: max depth of the decision tree (-1=any)
  • min_samples_leaf=1: min number of samples each leaf needs to have
  • min_samples_split=2: min number of samples needed for a split
  • min_purity_increase=0: min purity needed for a split
  • n_subfeatures=-1: number of features to select at random (0 for all, -1 for square root of number of features)
  • n_trees=10: number of trees to train
  • sampling_fraction=0.7: fraction of samples to train each tree on
  • feature_importance: method to use for computing feature importances. One of (:impurity, :split)
  • rng=Random.GLOBAL_RNG: random number generator or seed

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew having the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • forest: the Ensemble object returned by the core DecisionTree.jl algorithm

Report

The fields of report(mach) are:

  • features: the names of the features encountered in training

Accessor functions

  • feature_importances(mach) returns a vector of (feature::Symbol => importance) pairs; the type of importance is determined by the hyperparameter feature_importance (see above)

Examples

using MLJ
 Forest = @load RandomForestRegressor pkg=DecisionTree
 forest = Forest(max_depth=4, min_samples_split=3)
 
@@ -10,4 +10,4 @@
 yhat = predict(mach, Xnew) ## new predictions
 
 fitted_params(mach).forest ## raw `Ensemble` object from DecisionTree.jl
-feature_importances(mach)

See also DecisionTree.jl and the unwrapped model type MLJDecisionTreeInterface.DecisionTree.RandomForestRegressor.

+feature_importances(mach)

See also DecisionTree.jl and the unwrapped model type MLJDecisionTreeInterface.DecisionTree.RandomForestRegressor.

diff --git a/dev/models/RandomForestRegressor_MLJScikitLearnInterface/index.html b/dev/models/RandomForestRegressor_MLJScikitLearnInterface/index.html index ef96d01fe..6c2626574 100644 --- a/dev/models/RandomForestRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/RandomForestRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -RandomForestRegressor · MLJ

RandomForestRegressor

RandomForestRegressor

A model type for constructing a random forest regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RandomForestRegressor = @load RandomForestRegressor pkg=MLJScikitLearnInterface

Do model = RandomForestRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomForestRegressor(n_estimators=...).

A random forest is a meta estimator that fits a number of decision trees on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting. The sub-sample size is controlled with the max_samples parameter if bootstrap=True (default), otherwise the whole dataset is used to build each tree.

+RandomForestRegressor · MLJ

RandomForestRegressor

RandomForestRegressor

A model type for constructing a random forest regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RandomForestRegressor = @load RandomForestRegressor pkg=MLJScikitLearnInterface

Do model = RandomForestRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomForestRegressor(n_estimators=...).

A random forest is a meta estimator that fits a number of decision trees on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting. The sub-sample size is controlled with the max_samples parameter if bootstrap=True (default), otherwise the whole dataset is used to build each tree.

diff --git a/dev/models/RandomOversampler_Imbalance/index.html b/dev/models/RandomOversampler_Imbalance/index.html index d72eed1b1..a10949a3a 100644 --- a/dev/models/RandomOversampler_Imbalance/index.html +++ b/dev/models/RandomOversampler_Imbalance/index.html @@ -1,5 +1,5 @@ -RandomOversampler · MLJ

RandomOversampler

Initiate a random oversampling model with the given hyper-parameters.

RandomOversampler

A model type for constructing a random oversampler, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RandomOversampler = @load RandomOversampler pkg=Imbalance

Do model = RandomOversampler() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomOversampler(ratios=...).

RandomOversampler implements naive oversampling by repeating existing observations with replacement.

Training data

In MLJ or MLJBase, wrap the model in a machine by mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by model = RandomOversampler()

Hyperparameters

  • ratios=1.0: A parameter that controls the amount of oversampling to be done for each class

    • Can be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class
    • Can be a dictionary mapping each class label to the float ratio for that class
  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it; otherwise MersenneTwister is used.

Transform Inputs

  • X: A matrix of real numbers or a table with element scitypes that subtype Union{Finite, Infinite}. Elements in nominal columns should subtype Finite (i.e., have scitype OrderedFactor or Multiclass) and elements in continuous columns should subtype Infinite (i.e., have scitype Count or Continuous).
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • Xover: A matrix or table that includes the original data and the new observations due to oversampling, depending on whether the input X is a matrix or a table, respectively
  • yover: An abstract vector of labels corresponding to Xover

Operations

  • transform(mach, X, y): resample the data X and y using RandomOversampler, returning both the new and original observations

Example

using MLJ
+RandomOversampler · MLJ

RandomOversampler

Initiate a random oversampling model with the given hyper-parameters.

RandomOversampler

A model type for constructing a random oversampler, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RandomOversampler = @load RandomOversampler pkg=Imbalance

Do model = RandomOversampler() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomOversampler(ratios=...).

RandomOversampler implements naive oversampling by repeating existing observations with replacement.

Training data

In MLJ or MLJBase, wrap the model in a machine by mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by model = RandomOversampler()

Hyperparameters

  • ratios=1.0: A parameter that controls the amount of oversampling to be done for each class

    • Can be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class
    • Can be a dictionary mapping each class label to the float ratio for that class
  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it; otherwise MersenneTwister is used.

Transform Inputs

  • X: A matrix of real numbers or a table with element scitypes that subtype Union{Finite, Infinite}. Elements in nominal columns should subtype Finite (i.e., have scitype OrderedFactor or Multiclass) and elements in continuous columns should subtype Infinite (i.e., have scitype Count or Continuous).
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • Xover: A matrix or table that includes the original data and the new observations due to oversampling, depending on whether the input X is a matrix or a table, respectively
  • yover: An abstract vector of labels corresponding to Xover

Operations

  • transform(mach, X, y): resample the data X and y using RandomOversampler, returning both the new and original observations

Example

using MLJ
 import Imbalance
 
 ## set probability of each class
@@ -27,4 +27,4 @@
 julia> Imbalance.checkbalance(yover)
 2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 38 (79.2%) 
 1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 43 (89.6%) 
-0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%) 
+0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%)
diff --git a/dev/models/RandomUndersampler_Imbalance/index.html b/dev/models/RandomUndersampler_Imbalance/index.html index d25bc6b01..0e0a889d1 100644 --- a/dev/models/RandomUndersampler_Imbalance/index.html +++ b/dev/models/RandomUndersampler_Imbalance/index.html @@ -1,5 +1,5 @@ -RandomUndersampler · MLJ

RandomUndersampler

Initiate a random undersampling model with the given hyper-parameters.

RandomUndersampler

A model type for constructing a random undersampler, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RandomUndersampler = @load RandomUndersampler pkg=Imbalance

Do model = RandomUndersampler() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomUndersampler(ratios=...).

RandomUndersampler implements naive undersampling by randomly removing existing observations.

Training data

In MLJ or MLJBase, wrap the model in a machine by mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by model = RandomUndersampler()

Hyperparameters

  • ratios=1.0: A parameter that controls the amount of undersampling to be done for each class

    • Can be a float and in this case each class will be undersampled to the size of the minority class times the float. By default, all classes are undersampled to the size of the minority class
    • Can be a dictionary mapping each class label to the float ratio for that class
  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it; otherwise MersenneTwister is used.

Transform Inputs

  • X: A matrix of real numbers or a table with element scitypes that subtype Union{Finite, Infinite}. Elements in nominal columns should subtype Finite (i.e., have scitype OrderedFactor or Multiclass) and elements in continuous columns should subtype Infinite (i.e., have scitype Count or Continuous).
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • X_under: A matrix or table that includes the data after undersampling, depending on whether the input X is a matrix or a table, respectively
  • y_under: An abstract vector of labels corresponding to X_under

Operations

  • transform(mach, X, y): resample the data X and y using RandomUndersampler, returning both the new and original observations

Example

using MLJ
+RandomUndersampler · MLJ

RandomUndersampler

Initiate a random undersampling model with the given hyper-parameters.

RandomUndersampler

A model type for constructing a random undersampler, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RandomUndersampler = @load RandomUndersampler pkg=Imbalance

Do model = RandomUndersampler() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomUndersampler(ratios=...).

RandomUndersampler implements naive undersampling by randomly removing existing observations.

Training data

In MLJ or MLJBase, wrap the model in a machine by mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by model = RandomUndersampler()

Hyperparameters

  • ratios=1.0: A parameter that controls the amount of undersampling to be done for each class (see the sketch after this list)

    • Can be a float and in this case each class will be undersampled to the size of the minority class times the float. By default, all classes are undersampled to the size of the minority class
    • Can be a dictionary mapping each class label to the float ratio for that class
  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it; otherwise MersenneTwister is used.
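
As a sketch of how the ratios hyper-parameter above is used for undersampling (not from the original page; the value 1.2 is illustrative):

using MLJ
RandomUndersampler = @load RandomUndersampler pkg=Imbalance
undersampler = RandomUndersampler(ratios=1.2, rng=42)  ## keep each class at 1.2 × the minority-class size
mach = machine(undersampler)                           ## static transformer
X_under, y_under = transform(mach, X, y)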

Transform Inputs

  • X: A matrix of real numbers or a table with element scitypes that subtype Union{Finite, Infinite}. Elements in nominal columns should subtype Finite (i.e., have scitype OrderedFactor or Multiclass) and elements in continuous columns should subtype Infinite (i.e., have scitype Count or Continuous).
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • X_under: A matrix or table that includes the data after undersampling depending on whether the input X is a matrix or table respectively
  • y_under: An abstract vector of labels corresponding to X_under

Operations

  • transform(mach, X, y): resample the data X and y using RandomUndersampler, returning both the new and original observations

Example

using MLJ
 import Imbalance
 
 ## set probability of each class
@@ -28,4 +28,4 @@
 julia> Imbalance.checkbalance(y_under; ref="minority")
 0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%) 
 2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%) 
-1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%) 
+1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%)
diff --git a/dev/models/RandomWalkOversampler_Imbalance/index.html b/dev/models/RandomWalkOversampler_Imbalance/index.html index 04bdbfefc..4f7380e93 100644 --- a/dev/models/RandomWalkOversampler_Imbalance/index.html +++ b/dev/models/RandomWalkOversampler_Imbalance/index.html @@ -1,5 +1,5 @@ -RandomWalkOversampler · MLJ

RandomWalkOversampler

Initiate a RandomWalkOversampler model with the given hyper-parameters.

RandomWalkOversampler

A model type for constructing a random walk oversampler, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RandomWalkOversampler = @load RandomWalkOversampler pkg=Imbalance

Do model = RandomWalkOversampler() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomWalkOversampler(ratios=...).

RandomWalkOversampler implements the random walk oversampling algorithm to correct for class imbalance as in Zhang, H., & Li, M. (2014). RWO-Sampling: A random walk over-sampling approach to imbalanced data classification. Information Fusion, 25, 4-20.

Training data

In MLJ or MLJBase, wrap the model in a machine by

mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by

model = RandomWalkOversampler()

Hyperparameters

  • ratios=1.0: A parameter that controls the amount of oversampling to be done for each class

    • Can be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class
    • Can be a dictionary mapping each class label to the float ratio for that class
  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister.

Transform Inputs

  • X: A table with element scitypes that subtype Union{Finite, Infinite}. Elements in nominal columns should subtype Finite (i.e., have scitype OrderedFactor or Multiclass) and elements in continuous columns should subtype Infinite (i.e., have scitype Count or Continuous).
+RandomWalkOversampler · MLJ

RandomWalkOversampler

Initiate a RandomWalkOversampler model with the given hyper-parameters.

RandomWalkOversampler

A model type for constructing a random walk oversampler, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RandomWalkOversampler = @load RandomWalkOversampler pkg=Imbalance

Do model = RandomWalkOversampler() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomWalkOversampler(ratios=...).

RandomWalkOversampler implements the random walk oversampling algorithm to correct for class imbalance as in Zhang, H., & Li, M. (2014). RWO-Sampling: A random walk over-sampling approach to imbalanced data classification. Information Fusion, 25, 4-20.

Training data

In MLJ or MLJBase, wrap the model in a machine by

mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by

model = RandomWalkOversampler()

Hyperparameters

  • ratios=1.0: A parameter that controls the amount of oversampling to be done for each class

    • Can be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class
    • Can be a dictionary mapping each class label to the float ratio for that class
  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister.

Transform Inputs

  • X: A table with element scitypes that subtype Union{Finite, Infinite}. Elements in nominal columns should subtype Finite (i.e., have scitype OrderedFactor or Multiclass) and elements in continuous columns should subtype Infinite (i.e., have scitype Count or Continuous).
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • Xover: A matrix or table that includes the original data and the new observations due to oversampling, depending on whether the input X is a matrix or a table, respectively
  • yover: An abstract vector of labels corresponding to Xover

Operations

  • transform(mach, X, y): resample the data X and y using RandomWalkOversampler, returning both the new and original observations

Example

using MLJ
 using ScientificTypes
 import Imbalance
@@ -36,4 +36,4 @@
 julia> Imbalance.checkbalance(yover)
 2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 38 (79.2%) 
 1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 43 (89.6%) 
-0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%)
+0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%)
diff --git a/dev/models/RecursiveFeatureElimination_FeatureSelection/index.html b/dev/models/RecursiveFeatureElimination_FeatureSelection/index.html new file mode 100644 index 000000000..04733b448 --- /dev/null +++ b/dev/models/RecursiveFeatureElimination_FeatureSelection/index.html @@ -0,0 +1,25 @@ + +RecursiveFeatureElimination · MLJ

RecursiveFeatureElimination

RecursiveFeatureElimination(model, n_features, step)

This model implements a recursive feature elimination algorithm for feature selection. It recursively removes features, training a base model on the remaining features and evaluating their importance until the desired number of features is selected.

Construct an instance with default hyper-parameters using the syntax rfe_model = RecursiveFeatureElimination(model=...). Provide keyword arguments to override hyper-parameter defaults.

Training data

In MLJ or MLJBase, bind an instance rfe_model to data with

mach = machine(rfe_model, X, y)

OR, if the base model supports weights, as

mach = machine(rfe_model, X, y, w)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns have the scitype required by the base model; check column scitypes with schema(X) and the column scitypes required by the base model with input_scitype(basemodel).
  • y is the target, which can be any table of responses whose element scitype is Continuous or Finite depending on the target_scitype required by the base model; check the scitype with scitype(y).
  • w is the observation weights, which can be either nothing (default) or an AbstractVector whose element scitype is Count or Continuous. This is different from any weights hyper-parameter of the base model; see below.

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • model: A base model with a fit method that provides information on feature importance (i.e. reports_feature_importances(model) == true)
  • n_features::Real = 0: The number of features to select. If 0, half of the features are selected. If a positive integer, the parameter is the absolute number of features to select. If a real number between 0 and 1, it is the fraction of features to select.
  • step::Real=1: If the value of step is at least 1, it signifies the quantity of features to eliminate in each iteration. Conversely, if step falls strictly within the range of 0.0 to 1.0, it denotes the proportion (rounded down) of features to remove during each iteration.

Operations

  • transform(mach, X): transform the input table X into a new table containing only the columns corresponding to the features selected by the RFE algorithm.
  • predict(mach, X): transform the input table X as in transform(mach, X) above, then predict using the fitted base model on the transformed table.

Fitted parameters

The fields of fitted_params(mach) are:

  • features_left: names of features remaining after recursive feature elimination.
  • model_fitresult: fitted parameters of the base model.

Report

The fields of report(mach) are:

  • ranking: The feature ranking of each feature in the training dataset.
  • model_report: report for the fitted base model.
  • features: names of features seen during the training process.

Examples

using FeatureSelection, MLJ, StableRNGs
+
+## define the RNG used below (an assumed seed; any StableRNG will do)
+rng = StableRNG(123)
+RandomForestRegressor = @load RandomForestRegressor pkg=DecisionTree
+
+## Creates a dataset where the target only depends on the first 5 columns of the input table.
+A = rand(rng, 50, 10);
+y = 10 .* sin.(
+        pi .* A[:, 1] .* A[:, 2]
+    ) + 20 .* (A[:, 3] .- 0.5).^ 2 .+ 10 .* A[:, 4] .+ 5 * A[:, 5];
+X = MLJ.table(A);
+
+## fit a rfe model
+rf = RandomForestRegressor()
+selector = RecursiveFeatureElimination(model = rf)
+mach = machine(selector, X, y)
+fit!(mach)
+
+## view the feature importances
+feature_importances(mach)
+
+## predict using the base model
+Xnew = MLJ.table(rand(rng, 50, 10));
+predict(mach, Xnew)
+
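Continuing the example above (a sketch using only accessors documented in this section), the selection can be inspected and applied to new tables:

## names of the surviving features and the full feature ranking
fitted_params(mach).features_left
report(mach).ranking

## drop the eliminated columns from a new table
Xnew_selected = transform(mach, Xnew)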
diff --git a/dev/models/Resampler_MLJBase/index.html b/dev/models/Resampler_MLJBase/index.html new file mode 100644 index 000000000..1e3441dc2 --- /dev/null +++ b/dev/models/Resampler_MLJBase/index.html @@ -0,0 +1,15 @@ + +Resampler · MLJ

Resampler

resampler = Resampler(
+    model=ConstantRegressor(),
+    resampling=CV(),
+    measure=nothing,
+    weights=nothing,
+    class_weights=nothing,
+    operation=predict,
+    repeats = 1,
+    acceleration=default_resource(),
+    check_measure=true,
+    per_observation=true,
+    logger=nothing,
+    compact=false,
+)

Private method. Use at own risk.

Resampling model wrapper, used internally by the fit method of TunedModel instances and IteratedModel instances. See evaluate! for meaning of the options. Not intended for use by general user, who will ordinarily use evaluate! directly.

Given a machine mach = machine(resampler, args...) one obtains a performance evaluation of the specified model, performed according to the prescribed resampling strategy and other parameters, using data args..., by calling fit!(mach) followed by evaluate(mach).

On subsequent calls to fit!(mach) new train/test pairs of row indices are only regenerated if resampling, repeats or cache fields of resampler have changed. The evolution of an RNG field of resampler does not constitute a change (== for MLJType objects is not sensitive to such changes; see is_same_except).

If there is a single train/test pair, then the warm-restart behavior of the wrapped model resampler.model extends to warm-restart behavior of the wrapper resampler, with respect to mutations of the wrapped model.

The sample weights are passed to the specified performance measures that support weights for evaluation. These weights are not to be confused with any weights bound to a Resampler instance in a machine, used for training the wrapped model when supported.

The sample class_weights are passed to the specified performance measures that support per-class weights for evaluation. These weights are not to be confused with any weights bound to a Resampler instance in a machine, used for training the wrapped model when supported.
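As a rough sketch of this workflow (assuming Resampler is accessed via MLJBase, and with an arbitrary model, resampling strategy and measure):

using MLJ
import MLJBase

X, y = make_regression()

resampler = MLJBase.Resampler(
    model=ConstantRegressor(),
    resampling=CV(nfolds=3),
    measure=rms,
)
mach = machine(resampler, X, y)
fit!(mach)
evaluate(mach)   ## the resulting performance evaluation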

diff --git a/dev/models/RidgeCVClassifier_MLJScikitLearnInterface/index.html b/dev/models/RidgeCVClassifier_MLJScikitLearnInterface/index.html index 5d6dda046..047886696 100644 --- a/dev/models/RidgeCVClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/RidgeCVClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -RidgeCVClassifier · MLJ

RidgeCVClassifier

RidgeCVClassifier

A model type for constructing a ridge regression classifier with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RidgeCVClassifier = @load RidgeCVClassifier pkg=MLJScikitLearnInterface

Do model = RidgeCVClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RidgeCVClassifier(alphas=...).

Hyper-parameters

  • alphas = [0.1, 1.0, 10.0]
  • fit_intercept = true
  • scoring = nothing
  • cv = 5
  • class_weight = nothing
  • store_cv_values = false
+RidgeCVClassifier · MLJ

RidgeCVClassifier

RidgeCVClassifier

A model type for constructing a ridge regression classifier with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RidgeCVClassifier = @load RidgeCVClassifier pkg=MLJScikitLearnInterface

Do model = RidgeCVClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RidgeCVClassifier(alphas=...).

Hyper-parameters

  • alphas = [0.1, 1.0, 10.0]
  • fit_intercept = true
  • scoring = nothing
  • cv = 5
  • class_weight = nothing
  • store_cv_values = false
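A minimal usage sketch, assuming MLJScikitLearnInterface and its scikit-learn dependency are installed (the data and the alpha grid are illustrative):

using MLJ
RidgeCVClassifier = @load RidgeCVClassifier pkg=MLJScikitLearnInterface

X, y = @load_iris
model = RidgeCVClassifier(alphas=[0.01, 0.1, 1.0, 10.0], cv=5)
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)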
diff --git a/dev/models/RidgeCVRegressor_MLJScikitLearnInterface/index.html b/dev/models/RidgeCVRegressor_MLJScikitLearnInterface/index.html index 15e1a4ee9..3bfe43b45 100644 --- a/dev/models/RidgeCVRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/RidgeCVRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -RidgeCVRegressor · MLJ

RidgeCVRegressor

RidgeCVRegressor

A model type for constructing a ridge regressor with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RidgeCVRegressor = @load RidgeCVRegressor pkg=MLJScikitLearnInterface

Do model = RidgeCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RidgeCVRegressor(alphas=...).

Hyper-parameters

  • alphas = (0.1, 1.0, 10.0)
  • fit_intercept = true
  • scoring = nothing
  • cv = 5
  • gcv_mode = nothing
  • store_cv_values = false
+RidgeCVRegressor · MLJ

RidgeCVRegressor

RidgeCVRegressor

A model type for constructing a ridge regressor with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RidgeCVRegressor = @load RidgeCVRegressor pkg=MLJScikitLearnInterface

Do model = RidgeCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RidgeCVRegressor(alphas=...).

Hyper-parameters

  • alphas = (0.1, 1.0, 10.0)
  • fit_intercept = true
  • scoring = nothing
  • cv = 5
  • gcv_mode = nothing
  • store_cv_values = false
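A minimal usage sketch, assuming MLJScikitLearnInterface and its scikit-learn dependency are installed (the synthetic data and alpha grid are illustrative):

using MLJ
RidgeCVRegressor = @load RidgeCVRegressor pkg=MLJScikitLearnInterface

X, y = make_regression()
model = RidgeCVRegressor(alphas=(0.01, 0.1, 1.0, 10.0))
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)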
diff --git a/dev/models/RidgeClassifier_MLJScikitLearnInterface/index.html b/dev/models/RidgeClassifier_MLJScikitLearnInterface/index.html index fc4c3574d..fa734ed04 100644 --- a/dev/models/RidgeClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/RidgeClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -RidgeClassifier · MLJ

RidgeClassifier

RidgeClassifier

A model type for constructing a ridge regression classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RidgeClassifier = @load RidgeClassifier pkg=MLJScikitLearnInterface

Do model = RidgeClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RidgeClassifier(alpha=...).

Hyper-parameters

  • alpha = 1.0
  • fit_intercept = true
  • copy_X = true
  • max_iter = nothing
  • tol = 0.001
  • class_weight = nothing
  • solver = auto
  • random_state = nothing
+RidgeClassifier · MLJ

RidgeClassifier

RidgeClassifier

A model type for constructing a ridge regression classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RidgeClassifier = @load RidgeClassifier pkg=MLJScikitLearnInterface

Do model = RidgeClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RidgeClassifier(alpha=...).

Hyper-parameters

  • alpha = 1.0
  • fit_intercept = true
  • copy_X = true
  • max_iter = nothing
  • tol = 0.001
  • class_weight = nothing
  • solver = auto
  • random_state = nothing
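A minimal usage sketch, assuming MLJScikitLearnInterface and its scikit-learn dependency are installed (the data and alpha value are illustrative):

using MLJ
RidgeClassifier = @load RidgeClassifier pkg=MLJScikitLearnInterface

X, y = @load_iris
model = RidgeClassifier(alpha=0.5)
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)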
diff --git a/dev/models/RidgeRegressor_MLJLinearModels/index.html b/dev/models/RidgeRegressor_MLJLinearModels/index.html index f115c702a..f37f3ec93 100644 --- a/dev/models/RidgeRegressor_MLJLinearModels/index.html +++ b/dev/models/RidgeRegressor_MLJLinearModels/index.html @@ -1,6 +1,6 @@ -RidgeRegressor · MLJ

RidgeRegressor

RidgeRegressor

A model type for constructing a ridge regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RidgeRegressor = @load RidgeRegressor pkg=MLJLinearModels

Do model = RidgeRegressor() to construct an instance with default hyper-parameters.

Ridge regression is a linear model with objective function

$|Xθ - y|₂²/2 + n⋅λ|θ|₂²/2$

where $n$ is the number of observations.

If scale_penalty_with_samples = false then the objective function is instead

$|Xθ - y|₂²/2 + λ|θ|₂²/2$.

Different solver options exist, as indicated under "Hyperparameters" below.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • lambda::Real: strength of the L2 regularization. Default: 1.0
  • fit_intercept::Bool: whether to fit the intercept or not. Default: true
  • penalize_intercept::Bool: whether to penalize the intercept. Default: false
  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true
  • solver::Union{Nothing, MLJLinearModels.Solver}: any instance of MLJLinearModels.Analytical. Use Analytical() for Cholesky and CG()=Analytical(iterative=true) for conjugate-gradient. If solver = nothing (default) then Analytical() is used. Default: nothing

Example

using MLJ
+RidgeRegressor · MLJ

RidgeRegressor

RidgeRegressor

A model type for constructing a ridge regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RidgeRegressor = @load RidgeRegressor pkg=MLJLinearModels

Do model = RidgeRegressor() to construct an instance with default hyper-parameters.

Ridge regression is a linear model with objective function

$|Xθ - y|₂²/2 + n⋅λ|θ|₂²/2$

where $n$ is the number of observations.

If scale_penalty_with_samples = false then the objective function is instead

$|Xθ - y|₂²/2 + λ|θ|₂²/2$.

Different solver options exist, as indicated under "Hyperparameters" below.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • lambda::Real: strength of the L2 regularization. Default: 1.0
  • fit_intercept::Bool: whether to fit the intercept or not. Default: true
  • penalize_intercept::Bool: whether to penalize the intercept. Default: false
  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true
  • solver::Union{Nothing, MLJLinearModels.Solver}: any instance of MLJLinearModels.Analytical. Use Analytical() for Cholesky and CG()=Analytical(iterative=true) for conjugate-gradient. If solver = nothing (default) then Analytical() is used. Default: nothing

Example

using MLJ
 X, y = make_regression()
 mach = fit!(machine(RidgeRegressor(), X, y))
 predict(mach, X)
-fitted_params(mach)

See also ElasticNetRegressor.

+fitted_params(mach)

See also ElasticNetRegressor.

diff --git a/dev/models/RidgeRegressor_MLJScikitLearnInterface/index.html b/dev/models/RidgeRegressor_MLJScikitLearnInterface/index.html index e7cacf47c..80b721820 100644 --- a/dev/models/RidgeRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/RidgeRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -RidgeRegressor · MLJ

RidgeRegressor

RidgeRegressor

A model type for constructing a ridge regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RidgeRegressor = @load RidgeRegressor pkg=MLJScikitLearnInterface

Do model = RidgeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RidgeRegressor(alpha=...).

Hyper-parameters

  • alpha = 1.0
  • fit_intercept = true
  • copy_X = true
  • max_iter = 1000
  • tol = 0.0001
  • solver = auto
  • random_state = nothing
+RidgeRegressor · MLJ

RidgeRegressor

RidgeRegressor

A model type for constructing a ridge regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RidgeRegressor = @load RidgeRegressor pkg=MLJScikitLearnInterface

Do model = RidgeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RidgeRegressor(alpha=...).

Hyper-parameters

  • alpha = 1.0
  • fit_intercept = true
  • copy_X = true
  • max_iter = 1000
  • tol = 0.0001
  • solver = auto
  • random_state = nothing
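A minimal usage sketch, assuming MLJScikitLearnInterface and its scikit-learn dependency are installed (the synthetic data and hyper-parameter values are illustrative):

using MLJ
RidgeRegressor = @load RidgeRegressor pkg=MLJScikitLearnInterface

X, y = make_regression()
model = RidgeRegressor(alpha=0.5, max_iter=2000)
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)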
diff --git a/dev/models/RidgeRegressor_MultivariateStats/index.html b/dev/models/RidgeRegressor_MultivariateStats/index.html index 903fd5c7f..29480fdbf 100644 --- a/dev/models/RidgeRegressor_MultivariateStats/index.html +++ b/dev/models/RidgeRegressor_MultivariateStats/index.html @@ -1,5 +1,5 @@ -RidgeRegressor · MLJ

RidgeRegressor

RidgeRegressor

A model type for constructing a ridge regressor, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RidgeRegressor = @load RidgeRegressor pkg=MultivariateStats

Do model = RidgeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RidgeRegressor(lambda=...).

RidgeRegressor adds a quadratic penalty term to least squares regression, for regularization. Ridge regression is particularly useful in the case of multicollinearity. Options exist to specify a bias term, and to adjust the strength of the penalty term.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • lambda=1.0: The non-negative parameter controlling the regularization strength. If lambda is 0, ridge regression is equivalent to linear least squares regression, and as lambda approaches infinity, all the linear coefficients approach 0.
  • bias=true: Include the bias term if true, otherwise fit without bias term.

Operations

  • predict(mach, Xnew): Return predictions of the target given new features Xnew, which should have the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • coefficients: The linear coefficients determined by the model.
  • intercept: The intercept determined by the model.

Examples

using MLJ
+RidgeRegressor · MLJ

RidgeRegressor

RidgeRegressor

A model type for constructing a ridge regressor, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RidgeRegressor = @load RidgeRegressor pkg=MultivariateStats

Do model = RidgeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RidgeRegressor(lambda=...).

RidgeRegressor adds a quadratic penalty term to least squares regression, for regularization. Ridge regression is particularly useful in the case of multicollinearity. Options exist to specify a bias term, and to adjust the strength of the penalty term.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • lambda=1.0: The non-negative parameter controlling the regularization strength. If lambda is 0, ridge regression is equivalent to linear least squares regression, and as lambda approaches infinity, all the linear coefficients approach 0.
  • bias=true: Include the bias term if true, otherwise fit without bias term.

Operations

  • predict(mach, Xnew): Return predictions of the target given new features Xnew, which should have the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • coefficients: The linear coefficients determined by the model.
  • intercept: The intercept determined by the model.

Examples

using MLJ
 
 RidgeRegressor = @load RidgeRegressor pkg=MultivariateStats
 pipe = Standardizer() |> RidgeRegressor(lambda=10)
@@ -8,4 +8,4 @@
 
 mach = machine(pipe, X, y) |> fit!
 yhat = predict(mach, X)
-training_error = l1(yhat, y) |> mean

See also LinearRegressor, MultitargetLinearRegressor, MultitargetRidgeRegressor

+training_error = l1(yhat, y) |> mean

See also LinearRegressor, MultitargetLinearRegressor, MultitargetRidgeRegressor

diff --git a/dev/models/RobustRegressor_MLJLinearModels/index.html b/dev/models/RobustRegressor_MLJLinearModels/index.html index 5223c73e4..34cce57fd 100644 --- a/dev/models/RobustRegressor_MLJLinearModels/index.html +++ b/dev/models/RobustRegressor_MLJLinearModels/index.html @@ -1,6 +1,6 @@ -RobustRegressor · MLJ

RobustRegressor

RobustRegressor

A model type for constructing a robust regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RobustRegressor = @load RobustRegressor pkg=MLJLinearModels

Do model = RobustRegressor() to construct an instance with default hyper-parameters.

Robust regression is a linear model with objective function

$∑ρ(Xθ - y) + n⋅λ|θ|₂² + n⋅γ|θ|₁$

where $ρ$ is a robust loss function (e.g. the Huber function) and $n$ is the number of observations.

If scale_penalty_with_samples = false the objective function is instead

$∑ρ(Xθ - y) + λ|θ|₂² + γ|θ|₁$.

Different solver options exist, as indicated under "Hyperparameters" below.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • rho::MLJLinearModels.RobustRho: the type of robust loss, which can be any instance of MLJLinearModels.L where L is one of: AndrewsRho, BisquareRho, FairRho, HuberRho, LogisticRho, QuantileRho, TalwarRho. Default: HuberRho(0.1)

  • lambda::Real: strength of the regularizer if penalty is :l2 or :l1. Strength of the L2 regularizer if penalty is :en. Default: 1.0

  • gamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0

  • penalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2

  • fit_intercept::Bool: whether to fit the intercept or not. Default: true

  • penalize_intercept::Bool: whether to penalize the intercept. Default: false

  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true

  • solver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S where S is one of: LBFGS, IWLSCG, Newton, NewtonCG, if penalty = :l2, and ProxGrad otherwise.

    If solver = nothing (default) then LBFGS() is used, if penalty = :l2, and otherwise ProxGrad(accel=true) (FISTA) is used.

    Solver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) Default: nothing

Example

using MLJ
+RobustRegressor · MLJ

RobustRegressor

RobustRegressor

A model type for constructing a robust regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RobustRegressor = @load RobustRegressor pkg=MLJLinearModels

Do model = RobustRegressor() to construct an instance with default hyper-parameters.

Robust regression is a linear model with objective function

$∑ρ(Xθ - y) + n⋅λ|θ|₂² + n⋅γ|θ|₁$

where $ρ$ is a robust loss function (e.g. the Huber function) and $n$ is the number of observations.

If scale_penalty_with_samples = false the objective function is instead

$∑ρ(Xθ - y) + λ|θ|₂² + γ|θ|₁$.

Different solver options exist, as indicated under "Hyperparameters" below.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • rho::MLJLinearModels.RobustRho: the type of robust loss, which can be any instance of MLJLinearModels.L where L is one of: AndrewsRho, BisquareRho, FairRho, HuberRho, LogisticRho, QuantileRho, TalwarRho. Default: HuberRho(0.1)

  • lambda::Real: strength of the regularizer if penalty is :l2 or :l1. Strength of the L2 regularizer if penalty is :en. Default: 1.0

  • gamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0

  • penalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2

  • fit_intercept::Bool: whether to fit the intercept or not. Default: true

  • penalize_intercept::Bool: whether to penalize the intercept. Default: false

  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true

  • solver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S where S is one of: LBFGS, IWLSCG, Newton, NewtonCG, if penalty = :l2, and ProxGrad otherwise.

    If solver = nothing (default) then LBFGS() is used, if penalty = :l2, and otherwise ProxGrad(accel=true) (FISTA) is used.

    Solver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) Default: nothing
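For illustration (the values are arbitrary, not recommendations), a non-default robust loss can be combined with an elastic-net penalty using the names documented above; see also the example below:

using MLJ
import MLJLinearModels

RobustRegressor = @load RobustRegressor pkg=MLJLinearModels
model = RobustRegressor(
    rho=MLJLinearModels.HuberRho(0.5),  ## robust loss with an arbitrary threshold
    penalty=:en,                        ## elastic net: both L2 and L1 terms
    lambda=0.5,
    gamma=0.1,
)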

Example

using MLJ
 X, y = make_regression()
 mach = fit!(machine(RobustRegressor(), X, y))
 predict(mach, X)
-fitted_params(mach)

See also HuberRegressor, QuantileRegressor.

+fitted_params(mach)

See also HuberRegressor, QuantileRegressor.

diff --git a/dev/models/SGDClassifier_MLJScikitLearnInterface/index.html b/dev/models/SGDClassifier_MLJScikitLearnInterface/index.html index 58a9d9aca..97cfbb863 100644 --- a/dev/models/SGDClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/SGDClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -SGDClassifier · MLJ

SGDClassifier

SGDClassifier

A model type for constructing an SGD classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SGDClassifier = @load SGDClassifier pkg=MLJScikitLearnInterface

Do model = SGDClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SGDClassifier(loss=...).

Hyper-parameters

  • loss = hinge
  • penalty = l2
  • alpha = 0.0001
  • l1_ratio = 0.15
  • fit_intercept = true
  • max_iter = 1000
  • tol = 0.001
  • shuffle = true
  • verbose = 0
  • epsilon = 0.1
  • n_jobs = nothing
  • random_state = nothing
  • learning_rate = optimal
  • eta0 = 0.0
  • power_t = 0.5
  • early_stopping = false
  • validation_fraction = 0.1
  • n_iter_no_change = 5
  • class_weight = nothing
  • warm_start = false
  • average = false
+SGDClassifier · MLJ

SGDClassifier

SGDClassifier

A model type for constructing an SGD classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SGDClassifier = @load SGDClassifier pkg=MLJScikitLearnInterface

Do model = SGDClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SGDClassifier(loss=...).

Hyper-parameters

  • loss = hinge
  • penalty = l2
  • alpha = 0.0001
  • l1_ratio = 0.15
  • fit_intercept = true
  • max_iter = 1000
  • tol = 0.001
  • shuffle = true
  • verbose = 0
  • epsilon = 0.1
  • n_jobs = nothing
  • random_state = nothing
  • learning_rate = optimal
  • eta0 = 0.0
  • power_t = 0.5
  • early_stopping = false
  • validation_fraction = 0.1
  • n_iter_no_change = 5
  • class_weight = nothing
  • warm_start = false
  • average = false
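A minimal usage sketch, assuming MLJScikitLearnInterface and its scikit-learn dependency are installed (the data and the overridden numeric hyper-parameters are illustrative):

using MLJ
SGDClassifier = @load SGDClassifier pkg=MLJScikitLearnInterface

X, y = @load_iris
model = SGDClassifier(alpha=1e-4, max_iter=2000, tol=1e-4)
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)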
diff --git a/dev/models/SGDRegressor_MLJScikitLearnInterface/index.html b/dev/models/SGDRegressor_MLJScikitLearnInterface/index.html index 875e007f6..888fac941 100644 --- a/dev/models/SGDRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/SGDRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -SGDRegressor · MLJ

SGDRegressor

SGDRegressor

A model type for constructing a stochastic gradient descent-based regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SGDRegressor = @load SGDRegressor pkg=MLJScikitLearnInterface

Do model = SGDRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SGDRegressor(loss=...).

Hyper-parameters

  • loss = squared_error
  • penalty = l2
  • alpha = 0.0001
  • l1_ratio = 0.15
  • fit_intercept = true
  • max_iter = 1000
  • tol = 0.001
  • shuffle = true
  • verbose = 0
  • epsilon = 0.1
  • random_state = nothing
  • learning_rate = invscaling
  • eta0 = 0.01
  • power_t = 0.25
  • early_stopping = false
  • validation_fraction = 0.1
  • n_iter_no_change = 5
  • warm_start = false
  • average = false
+SGDRegressor · MLJ

SGDRegressor

SGDRegressor

A model type for constructing a stochastic gradient descent-based regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SGDRegressor = @load SGDRegressor pkg=MLJScikitLearnInterface

Do model = SGDRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SGDRegressor(loss=...).

Hyper-parameters

  • loss = squared_error
  • penalty = l2
  • alpha = 0.0001
  • l1_ratio = 0.15
  • fit_intercept = true
  • max_iter = 1000
  • tol = 0.001
  • shuffle = true
  • verbose = 0
  • epsilon = 0.1
  • random_state = nothing
  • learning_rate = invscaling
  • eta0 = 0.01
  • power_t = 0.25
  • early_stopping = false
  • validation_fraction = 0.1
  • n_iter_no_change = 5
  • warm_start = false
  • average = false
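A minimal usage sketch, assuming MLJScikitLearnInterface and its scikit-learn dependency are installed (the synthetic data and hyper-parameter values are illustrative):

using MLJ
SGDRegressor = @load SGDRegressor pkg=MLJScikitLearnInterface

X, y = make_regression()
model = SGDRegressor(alpha=1e-4, max_iter=2000, eta0=0.01)
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)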
diff --git a/dev/models/SMOTENC_Imbalance/index.html b/dev/models/SMOTENC_Imbalance/index.html index 9f9420b6b..57cdda7b2 100644 --- a/dev/models/SMOTENC_Imbalance/index.html +++ b/dev/models/SMOTENC_Imbalance/index.html @@ -1,5 +1,5 @@ -SMOTENC · MLJ

SMOTENC

Initiate a SMOTENC model with the given hyper-parameters.

SMOTENC

A model type for constructing a smotenc, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SMOTENC = @load SMOTENC pkg=Imbalance

Do model = SMOTENC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SMOTENC(k=...).

SMOTENC implements the SMOTENC algorithm to correct for class imbalance as in N. V. Chawla, K. W. Bowyer, L. O. Hall, W. P. Kegelmeyer, “SMOTE: synthetic minority over-sampling technique,” Journal of artificial intelligence research, 321-357, 2002.

Training data

In MLJ or MLJBase, wrap the model in a machine by

mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by

model = SMOTENC()

Hyperparameters

  • k=5: Number of nearest neighbors to consider in the SMOTENC algorithm. Should be within the range [1, n - 1], where n is the number of observations; otherwise set to the nearest of these two values.

  • ratios=1.0: A parameter that controls the amount of oversampling to be done for each class

    • Can be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class
    • Can be a dictionary mapping each class label to the float ratio for that class
  • knn_tree: Decides the tree used in KNN computations. Either "Brute" or "Ball". BallTree can be much faster but may lead to inaccurate results.

  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister.

Transform Inputs

  • X: A table with element scitypes that subtype Union{Finite, Infinite}. Elements in nominal columns should subtype Finite (i.e., have scitype OrderedFactor or Multiclass) and elements in continuous columns should subtype Infinite (i.e., have scitype Count or Continuous).
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • Xover: A matrix or table that includes the original data and the new observations due to oversampling, depending on whether the input X is a matrix or a table, respectively
  • yover: An abstract vector of labels corresponding to Xover

Operations

  • transform(mach, X, y): resample the data X and y using SMOTENC, returning both the new and original observations

Example

using MLJ
+SMOTENC · MLJ

SMOTENC

Initiate a SMOTENC model with the given hyper-parameters.

SMOTENC

A model type for constructing a smotenc, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SMOTENC = @load SMOTENC pkg=Imbalance

Do model = SMOTENC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SMOTENC(k=...).

SMOTENC implements the SMOTENC algorithm to correct for class imbalance as in N. V. Chawla, K. W. Bowyer, L. O. Hall, W. P. Kegelmeyer, “SMOTE: synthetic minority over-sampling technique,” Journal of artificial intelligence research, 321-357, 2002.

Training data

In MLJ or MLJBase, wrap the model in a machine by

mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by

model = SMOTENC()

Hyperparameters

  • k=5: Number of nearest neighbors to consider in the SMOTENC algorithm. Should be within the range [1, n - 1], where n is the number of observations; otherwise set to the nearest of these two values.

  • ratios=1.0: A parameter that controls the amount of oversampling to be done for each class

    • Can be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class
    • Can be a dictionary mapping each class label to the float ratio for that class
  • knn_tree: Decides the tree used in KNN computations. Either "Brute" or "Ball". BallTree can be much faster but may lead to inaccurate results.

  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister.
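For illustration (the values are arbitrary), the neighbor count, KNN backend and RNG seed can all be set at construction:

using MLJ
SMOTENC = @load SMOTENC pkg=Imbalance

## ball-tree KNN backend (faster, possibly less exact) with a fixed seed
model = SMOTENC(k=3, knn_tree="Ball", rng=42)
mach = machine(model)   ## static transformer: no data is bound here
## with data X, y in scope: Xover, yover = transform(mach, X, y)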

Transform Inputs

  • X: A table with element scitypes that subtype Union{Finite, Infinite}. Elements in nominal columns should subtype Finite (i.e., have scitype OrderedFactor or Multiclass) and elements in continuous columns should subtype Infinite (i.e., have scitype Count or Continuous).
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • Xover: A matrix or table that includes the original data and the new observations due to oversampling, depending on whether the input X is a matrix or a table, respectively
  • yover: An abstract vector of labels corresponding to Xover

Operations

  • transform(mach, X, y): resample the data X and y using SMOTENC, returning both the new and original observations

Example

using MLJ
 using ScientificTypes
 import Imbalance
 
@@ -36,4 +36,4 @@
 julia> Imbalance.checkbalance(yover)
 2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 38 (79.2%) 
 1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 43 (89.6%) 
-0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%) 
+0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%)
diff --git a/dev/models/SMOTEN_Imbalance/index.html b/dev/models/SMOTEN_Imbalance/index.html index 3f38c4d0b..8ac5b7ea4 100644 --- a/dev/models/SMOTEN_Imbalance/index.html +++ b/dev/models/SMOTEN_Imbalance/index.html @@ -1,5 +1,5 @@ -SMOTEN · MLJ

SMOTEN

Initiate a SMOTEN model with the given hyper-parameters.

SMOTEN

A model type for constructing a smoten, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SMOTEN = @load SMOTEN pkg=Imbalance

Do model = SMOTEN() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SMOTEN(k=...).

SMOTEN implements the SMOTEN algorithm to correct for class imbalance as in N. V. Chawla, K. W. Bowyer, L. O. Hall, W. P. Kegelmeyer, “SMOTE: synthetic minority over-sampling technique,” Journal of artificial intelligence research, 321-357, 2002.

Training data

In MLJ or MLJBase, wrap the model in a machine by

mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by

model = SMOTEN()

Hyperparameters

  • k=5: Number of nearest neighbors to consider in the SMOTEN algorithm. Should be within the range [1, n - 1], where n is the number of observations; otherwise set to the nearest of these two values.

  • ratios=1.0: A parameter that controls the amount of oversampling to be done for each class

    • Can be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class
    • Can be a dictionary mapping each class label to the float ratio for that class
  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister.

Transform Inputs

  • X: A matrix of integers or a table with element scitypes that subtype Finite. That is, for table inputs each column should have either OrderedFactor or Multiclass as the element scitype.
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • Xover: A matrix or table that includes the original data and the new observations due to oversampling, depending on whether the input X is a matrix or a table, respectively
  • yover: An abstract vector of labels corresponding to Xover

Operations

  • transform(mach, X, y): resample the data X and y using SMOTEN, returning both the new and original observations

Example

using MLJ
+SMOTEN · MLJ

SMOTEN

Initiate a SMOTEN model with the given hyper-parameters.

SMOTEN

A model type for constructing a smoten, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SMOTEN = @load SMOTEN pkg=Imbalance

Do model = SMOTEN() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SMOTEN(k=...).

SMOTEN implements the SMOTEN algorithm to correct for class imbalance as in N. V. Chawla, K. W. Bowyer, L. O. Hall, W. P. Kegelmeyer, “SMOTE: synthetic minority over-sampling technique,” Journal of artificial intelligence research, 321-357, 2002.

Training data

In MLJ or MLJBase, wrap the model in a machine by

mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by

model = SMOTEN()

Hyperparameters

  • k=5: Number of nearest neighbors to consider in the SMOTEN algorithm. Should be within the range [1, n - 1], where n is the number of observations; otherwise set to the nearest of these two values.

  • ratios=1.0: A parameter that controls the amount of oversampling to be done for each class

    • Can be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class
    • Can be a dictionary mapping each class label to the float ratio for that class
  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister.

Transform Inputs

  • X: A matrix of integers or a table with element scitypes that subtype Finite. That is, for table inputs each column should have either OrderedFactor or Multiclass as the element scitype.
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • Xover: A matrix or table that includes the original data and the new observations due to oversampling, depending on whether the input X is a matrix or a table, respectively
  • yover: An abstract vector of labels corresponding to Xover

Operations

  • transform(mach, X, y): resample the data X and y using SMOTEN, returning both the new and original observations

Example

using MLJ
 using ScientificTypes
 import Imbalance
 
@@ -37,4 +37,4 @@
 julia> Imbalance.checkbalance(yover)
 2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 38 (79.2%) 
 1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 43 (89.6%) 
-0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%) 
+0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%)
diff --git a/dev/models/SMOTE_Imbalance/index.html b/dev/models/SMOTE_Imbalance/index.html index 5ade5dc59..559c2f9c1 100644 --- a/dev/models/SMOTE_Imbalance/index.html +++ b/dev/models/SMOTE_Imbalance/index.html @@ -1,5 +1,5 @@ -SMOTE · MLJ

SMOTE

Initiate a SMOTE model with the given hyper-parameters.

SMOTE

A model type for constructing a smote, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SMOTE = @load SMOTE pkg=Imbalance

Do model = SMOTE() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SMOTE(k=...).

SMOTE implements the SMOTE algorithm to correct for class imbalance as in N. V. Chawla, K. W. Bowyer, L. O. Hall, W. P. Kegelmeyer, “SMOTE: synthetic minority over-sampling technique,” Journal of artificial intelligence research, 321-357, 2002.

Training data

In MLJ or MLJBase, wrap the model in a machine by

mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by

model = SMOTE()

Hyperparameters

  • k=5: Number of nearest neighbors to consider in the SMOTE algorithm. Should be within the range [1, n - 1], where n is the number of observations; otherwise set to the nearest of these two values.

  • ratios=1.0: A parameter that controls the amount of oversampling to be done for each class

    • Can be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class
    • Can be a dictionary mapping each class label to the float ratio for that class
  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister.

Transform Inputs

  • X: A matrix or table of floats where each row is an observation from the dataset
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • Xover: A matrix or table that includes the original data and the new observations due to oversampling, depending on whether the input X is a matrix or a table, respectively
  • yover: An abstract vector of labels corresponding to Xover

Operations

  • transform(mach, X, y): resample the data X and y using SMOTE, returning both the new and original observations

Example

using MLJ
+SMOTE · MLJ

SMOTE

Initiate a SMOTE model with the given hyper-parameters.

SMOTE

A model type for constructing a smote, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SMOTE = @load SMOTE pkg=Imbalance

Do model = SMOTE() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SMOTE(k=...).

SMOTE implements the SMOTE algorithm to correct for class imbalance as in N. V. Chawla, K. W. Bowyer, L. O. Hall, W. P. Kegelmeyer, “SMOTE: synthetic minority over-sampling technique,” Journal of artificial intelligence research, 321-357, 2002.

Training data

In MLJ or MLJBase, wrap the model in a machine by

mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by

model = SMOTE()

Hyperparameters

  • k=5: Number of nearest neighbors to consider in the SMOTE algorithm. Should be within the range [1, n - 1], where n is the number of observations; otherwise set to the nearest of these two values.

  • ratios=1.0: A parameter that controls the amount of oversampling to be done for each class

    • Can be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class
    • Can be a dictionary mapping each class label to the float ratio for that class
  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister.

Transform Inputs

  • X: A matrix or table of floats where each row is an observation from the dataset
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • Xover: A matrix or table that includes the original data and the new observations due to oversampling, depending on whether the input X is a matrix or a table, respectively
  • yover: An abstract vector of labels corresponding to Xover

Operations

  • transform(mach, X, y): resample the data X and y using SMOTE, returning both the new and original observations

Example

using MLJ
 import Imbalance
 
 ## set probability of each class
@@ -28,4 +28,4 @@
 2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 38 (79.2%) 
 1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 43 (89.6%) 
 0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%) 
-
+
diff --git a/dev/models/SODDetector_OutlierDetectionPython/index.html b/dev/models/SODDetector_OutlierDetectionPython/index.html index a4668d7f2..f00ae6a03 100644 --- a/dev/models/SODDetector_OutlierDetectionPython/index.html +++ b/dev/models/SODDetector_OutlierDetectionPython/index.html @@ -1,4 +1,4 @@ -SODDetector · MLJ

SODDetector

SODDetector(n_neighbors = 5,
+SODDetector · MLJ
+               alpha = 0.8)

https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.sod

diff --git a/dev/models/SOSDetector_OutlierDetectionPython/index.html b/dev/models/SOSDetector_OutlierDetectionPython/index.html index 1b42fef17..938cf893b 100644 --- a/dev/models/SOSDetector_OutlierDetectionPython/index.html +++ b/dev/models/SOSDetector_OutlierDetectionPython/index.html @@ -1,4 +1,4 @@ -SOSDetector · MLJ

SOSDetector

SOSDetector(perplexity = 4.5,
+SOSDetector · MLJ
+               eps = 1e-5)

https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.sos

diff --git a/dev/models/SRRegressor_SymbolicRegression/index.html b/dev/models/SRRegressor_SymbolicRegression/index.html index 50db2aeb4..02d3ab807 100644 --- a/dev/models/SRRegressor_SymbolicRegression/index.html +++ b/dev/models/SRRegressor_SymbolicRegression/index.html @@ -1,11 +1,11 @@ -SRRegressor · MLJ

SRRegressor

SRRegressor

A model type for constructing a Symbolic Regression via Evolutionary Search, based on SymbolicRegression.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SRRegressor = @load SRRegressor pkg=SymbolicRegression

Do model = SRRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SRRegressor(binary_operators=...).

Single-target Symbolic Regression regressor (SRRegressor) searches for symbolic expressions that predict a single target variable from a set of input variables. All data is assumed to be Continuous. The search is performed using an evolutionary algorithm. This algorithm is described in the paper https://arxiv.org/abs/2305.01582.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

OR

mach = machine(model, X, y, w)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). Variable names in discovered expressions will be taken from the column names of X, if available. Units in columns of X (use DynamicQuantities for units) will trigger dimensional analysis to be used.
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y). Units in y (use DynamicQuantities for units) will trigger dimensional analysis to be used.
  • w is the observation weights, which can be either nothing (default) or an AbstractVector whose element scitype is Count or Continuous.

Train the machine using fit!(mach), inspect the discovered expressions with report(mach), and predict on new data with predict(mach, Xnew). Note that unlike other regressors, symbolic regression stores a list of trained models. The model chosen from this list is defined by the function selection_method keyword argument, which by default balances accuracy and complexity. You can override this at prediction time by passing a named tuple with keys data and idx.
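For example (a sketch, with mach a fitted machine and Xnew a table of new inputs):

## predict with the expression chosen by selection_method
predict(mach, Xnew)

## override the selection, using the expression at index 2 of the report
predict(mach, (data=Xnew, idx=2))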

Hyper-parameters

  • binary_operators: Vector of binary operators (functions) to use. Each operator should be defined for two input scalars, and one output scalar. All operators need to be defined over the entire real line (excluding infinity - these are stopped before they are input), or return NaN where not defined. For speed, define it so it takes two reals of the same type as input, and outputs the same type. For the SymbolicUtils simplification backend, you will need to define a generic method of the operator so it takes arbitrary types.

  • unary_operators: Same, but for unary operators (one input scalar, gives an output scalar).

  • constraints: Array of pairs specifying size constraints for each operator. The constraints for a binary operator should be a 2-tuple (e.g., (-1, -1)) and the constraints for a unary operator should be an Int. A size constraint is a limit to the size of the subtree in each argument of an operator. e.g., [(^)=>(-1, 3)] means that the ^ operator can have arbitrary size (-1) in its left argument, but a maximum size of 3 in its right argument. Default is no constraints.

  • batching: Whether to evolve based on small mini-batches of data, rather than the entire dataset.

  • batch_size: What batch size to use if using batching.

  • elementwise_loss: What elementwise loss function to use. Can be one of the following losses, or any other loss of type SupervisedLoss. You can also pass a function that takes a scalar target (left argument), and scalar predicted (right argument), and returns a scalar. This will be averaged over the predicted data. If weights are supplied, your function should take a third argument for the weight scalar. Included losses: Regression: - LPDistLoss{P}(), - L1DistLoss(), - L2DistLoss() (mean square), - LogitDistLoss(), - HuberLoss(d), - L1EpsilonInsLoss(ϵ), - L2EpsilonInsLoss(ϵ), - PeriodicLoss(c), - QuantileLoss(τ), Classification: - ZeroOneLoss(), - PerceptronLoss(), - L1HingeLoss(), - SmoothedL1HingeLoss(γ), - ModifiedHuberLoss(), - L2MarginLoss(), - ExpLoss(), - SigmoidLoss(), - DWDMarginLoss(q).

  • loss_function: Alternatively, you may redefine the loss used as any function of tree::AbstractExpressionNode{T}, dataset::Dataset{T}, and options::Options, so long as you output a non-negative scalar of type T. This is useful if you want to use a loss that takes into account derivatives, or correlations across the dataset. This also means you could use a custom evaluation for a particular expression. If you are using batching=true, then your function should accept a fourth argument idx, which is either nothing (indicating that the full dataset should be used), or a vector of indices to use for the batch. For example,

      function my_loss(tree, dataset::Dataset{T,L}, options)::L where {T,L}
    +SRRegressor · MLJ

    SRRegressor

    SRRegressor

    A model type for constructing a Symbolic Regression via Evolutionary Search, based on SymbolicRegression.jl, and implementing the MLJ model interface.

    From MLJ, the type can be imported using

    SRRegressor = @load SRRegressor pkg=SymbolicRegression

    Do model = SRRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SRRegressor(binary_operators=...).

    Single-target Symbolic Regression regressor (SRRegressor) searches for symbolic expressions that predict a single target variable from a set of input variables. All data is assumed to be Continuous. The search is performed using an evolutionary algorithm. This algorithm is described in the paper https://arxiv.org/abs/2305.01582.

    Training data

    In MLJ or MLJBase, bind an instance model to data with

    mach = machine(model, X, y)

    OR

    mach = machine(model, X, y, w)

    Here:

    • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). Variable names in discovered expressions will be taken from the column names of X, if available. Units in columns of X (use DynamicQuantities for units) will trigger dimensional analysis to be used.
    • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y). Units in y (use DynamicQuantities for units) will trigger dimensional analysis to be used.
    • w is the observation weights, which can be either nothing (default) or an AbstractVector whose element scitype is Count or Continuous.

    Train the machine using fit!(mach), inspect the discovered expressions with report(mach), and predict on new data with predict(mach, Xnew). Note that, unlike other regressors, symbolic regression stores a list of trained models. The model chosen from this list is determined by the selection_method keyword argument (a function), which by default balances accuracy and complexity. You can override this choice at prediction time by passing a named tuple with keys data and idx.

    Hyper-parameters

    • binary_operators: Vector of binary operators (functions) to use. Each operator should be defined for two input scalars, and one output scalar. All operators need to be defined over the entire real line (excluding infinity - these are stopped before they are input), or return NaN where not defined. For speed, define it so it takes two reals of the same type as input, and outputs the same type. For the SymbolicUtils simplification backend, you will need to define a generic method of the operator so it takes arbitrary types.

    • unary_operators: Same, but for unary operators (one input scalar, gives an output scalar).

    • constraints: Array of pairs specifying size constraints for each operator. The constraints for a binary operator should be a 2-tuple (e.g., (-1, -1)) and the constraints for a unary operator should be an Int. A size constraint is a limit to the size of the subtree in each argument of an operator. e.g., [(^)=>(-1, 3)] means that the ^ operator can have arbitrary size (-1) in its left argument, but a maximum size of 3 in its right argument. Default is no constraints.

    • batching: Whether to evolve based on small mini-batches of data, rather than the entire dataset.

    • batch_size: What batch size to use if using batching.

    • elementwise_loss: What elementwise loss function to use. Can be one of the following losses, or any other loss of type SupervisedLoss. You can also pass a function that takes a scalar target (left argument), and scalar predicted (right argument), and returns a scalar. This will be averaged over the predicted data. If weights are supplied, your function should take a third argument for the weight scalar. Included losses: Regression: - LPDistLoss{P}(), - L1DistLoss(), - L2DistLoss() (mean square), - LogitDistLoss(), - HuberLoss(d), - L1EpsilonInsLoss(ϵ), - L2EpsilonInsLoss(ϵ), - PeriodicLoss(c), - QuantileLoss(τ), Classification: - ZeroOneLoss(), - PerceptronLoss(), - L1HingeLoss(), - SmoothedL1HingeLoss(γ), - ModifiedHuberLoss(), - L2MarginLoss(), - ExpLoss(), - SigmoidLoss(), - DWDMarginLoss(q).

    • loss_function: Alternatively, you may redefine the loss used as any function of tree::Node{T}, dataset::Dataset{T}, and options::Options, so long as you output a non-negative scalar of type T. This is useful if you want to use a loss that takes into account derivatives, or correlations across the dataset. This also means you could use a custom evaluation for a particular expression. If you are using batching=true, then your function should accept a fourth argument idx, which is either nothing (indicating that the full dataset should be used), or a vector of indices to use for the batch. For example,

        function my_loss(tree, dataset::Dataset{T,L}, options)::L where {T,L}
             prediction, flag = eval_tree_array(tree, dataset.X, options)
             if !flag
                 return L(Inf)
             end
             return sum((prediction .- dataset.y) .^ 2) / dataset.n
      -  end
    • node_type::Type{N}=Node: The type of node to use for the search. For example, Node or GraphNode.

    • populations: How many populations of equations to use.

    • population_size: How many equations in each population.

    • ncycles_per_iteration: How many generations to consider per iteration.

    • tournament_selection_n: Number of expressions considered in each tournament.

    • tournament_selection_p: The fittest expression in a tournament is to be selected with probability p, the next fittest with probability p*(1-p), and so forth.

    • topn: Number of equations to return to the host process, and to consider for the hall of fame.

    • complexity_of_operators: What complexity should be assigned to each operator, and the occurrence of a constant or variable. By default, this is 1 for all operators. Can be a real number as well, in which case the complexity of an expression will be rounded to the nearest integer. Input this in the form of, e.g., [(^) => 3, sin => 2].

    • complexity_of_constants: What complexity should be assigned to use of a constant. By default, this is 1.

    • complexity_of_variables: What complexity should be assigned to each variable. By default, this is 1.

    • alpha: The probability of accepting an equation mutation during regularized evolution is given by exp(-delta_loss/(alpha * T)), where T goes from 1 to 0. Thus, alpha=infinite is the same as no annealing.

    • maxsize: Maximum size of equations during the search.

    • maxdepth: Maximum depth of equations during the search, by default this is set equal to the maxsize.

    • parsimony: A multiplicative factor for how much complexity is punished.

    • dimensional_constraint_penalty: An additive factor if the dimensional constraint is violated.

    • use_frequency: Whether to use a parsimony that adapts to the relative proportion of equations at each complexity; this will ensure that there are a balanced number of equations considered for every complexity.

    • use_frequency_in_tournament: Whether to use the adaptive parsimony described above inside the score, rather than just at the mutation accept/reject stage.

    • adaptive_parsimony_scaling: How much to scale the adaptive parsimony term in the loss. Increase this if the search is spending too much time optimizing the most complex equations.

    • turbo: Whether to use LoopVectorization.@turbo to evaluate expressions. This can be significantly faster, but is only compatible with certain operators. Experimental!

    • bumper: Whether to use Bumper.jl for faster evaluation. Experimental!

    • migration: Whether to migrate equations between processes.

    • hof_migration: Whether to migrate equations from the hall of fame to processes.

    • fraction_replaced: What fraction of each population to replace with migrated equations at the end of each cycle.

    • fraction_replaced_hof: What fraction to replace with hall of fame equations at the end of each cycle.

    • should_simplify: Whether to simplify equations. If you pass a custom objective, this will be set to false.

    • should_optimize_constants: Whether to use an optimization algorithm to periodically optimize constants in equations.

    • optimizer_algorithm: Select algorithm to use for optimizing constants. Default is Optim.BFGS(linesearch=LineSearches.BackTracking()).

    • optimizer_nrestarts: How many different random starting positions to consider for optimization of constants.

    • optimizer_probability: Probability of performing optimization of constants at the end of a given iteration.

    • optimizer_iterations: How many optimization iterations to perform. This gets passed to Optim.Options as iterations. The default is 8.

    • optimizer_f_calls_limit: How many function calls to allow during optimization. This gets passed to Optim.Options as f_calls_limit. The default is 0 which means no limit.

    • optimizer_options: General options for the constant optimization. For details we refer to the documentation on Optim.Options from the Optim.jl package. Options can be provided here as NamedTuple, e.g. (iterations=16,), as a Dict, e.g. Dict(:x_tol => 1.0e-32,), or as an Optim.Options instance.

    • output_file: What file to store equations to, as a backup.

    • perturbation_factor: When mutating a constant, either multiply or divide by (1+perturbation_factor)^(rand()+1).

    • probability_negate_constant: Probability of negating a constant in the equation when mutating it.

    • mutation_weights: Relative probabilities of the mutations. The struct MutationWeights should be passed to these options. See its documentation on MutationWeights for the different weights.

    • crossover_probability: Probability of performing crossover.

    • annealing: Whether to use simulated annealing.

    • warmup_maxsize_by: Whether to slowly increase the max size from 5 up to maxsize. If nonzero, specifies the fraction through the search at which the maxsize should be reached.

    • verbosity: Whether to print debugging statements or not.

    • print_precision: How many digits to print when printing equations. By default, this is 5.

    • save_to_file: Whether to save equations to a file during the search.

    • bin_constraints: See constraints. This is the same, but specified for binary operators only (for example, if you have an operator that is both a binary and unary operator).

    • una_constraints: Likewise, for unary operators.

    • seed: What random seed to use. nothing uses no seed.

    • progress: Whether to use a progress bar output (verbosity will have no effect).

    • early_stop_condition: Float - whether to stop early if the mean loss gets below this value. Function - a function taking (loss, complexity) as arguments and returning true or false.

    • timeout_in_seconds: Float64 - the time in seconds after which to exit (as an alternative to the number of iterations).

    • max_evals: Int (or Nothing) - the maximum number of evaluations of expressions to perform.

    • skip_mutation_failures: Whether to simply skip over mutations that fail or are rejected, rather than to replace the mutated expression with the original expression and proceed normally.

    • nested_constraints: Specifies how many times a combination of operators can be nested. For example, [sin => [cos => 0], cos => [cos => 2]] specifies that cos may never appear within a sin, but sin can be nested with itself an unlimited number of times. The second term specifies that cos can be nested up to 2 times within a cos, so that cos(cos(cos(x))) is allowed (as well as any combination of + or - within it), but cos(cos(cos(cos(x)))) is not allowed. When an operator is not specified, it is assumed that it can be nested an unlimited number of times. This requires that there is no operator which is used both in the unary operators and the binary operators (e.g., - could be both subtract, and negation). For binary operators, both arguments are treated the same way, and the max of each argument is constrained.

    • deterministic: Use a global counter for the birth time, rather than calls to time(). This gives perfect resolution, and is therefore deterministic. However, it is not thread safe, and must be used in serial mode.

    • define_helper_functions: Whether to define helper functions for constructing and evaluating trees.

    • niterations::Int=10: The number of iterations to perform the search. More iterations will improve the results.

    • parallelism=:multithreading: What parallelism mode to use. The options are :multithreading, :multiprocessing, and :serial. By default, multithreading will be used. Multithreading uses less memory, but multiprocessing can handle multi-node compute. If using :multithreading mode, all threads available to julia are used. If using :multiprocessing, numprocs processes will be created dynamically if procs is unset. If you have already allocated processes, pass them to the procs argument and they will be used. You may also pass a string instead of a symbol, like "multithreading".

    • numprocs::Union{Int, Nothing}=nothing: The number of processes to use, if you want equation_search to set this up automatically. By default this will be 4, but can be any number (you should pick a number <= the number of cores available).

    • procs::Union{Vector{Int}, Nothing}=nothing: If you have set up a distributed run manually with procs = addprocs() and @everywhere, pass the procs to this keyword argument.

    • addprocs_function::Union{Function, Nothing}=nothing: If using multiprocessing (parallelism=:multiprocessing), and you are not passing procs manually, then processes will be allocated dynamically using addprocs. However, you may also pass a custom function to use instead of addprocs. This function should take a single positional argument, which is the number of processes to use, as well as the lazy keyword argument. For example, if set up on a slurm cluster, you could pass addprocs_function = addprocs_slurm, which will set up slurm processes.

    • heap_size_hint_in_bytes::Union{Int,Nothing}=nothing: On Julia 1.9+, you may set the --heap-size-hint flag on Julia processes, recommending garbage collection once a process is close to the recommended size. This is important for long-running distributed jobs where each process has an independent memory, and can help avoid out-of-memory errors. By default, this is set to Sys.free_memory() / numprocs.

    • runtests::Bool=true: Whether to run (quick) tests before starting the search, to see if there will be any problems during the equation search related to the host environment.

    • loss_type::Type=Nothing: If you would like to use a different type for the loss than for the data you passed, specify the type here. Note that if you pass complex data ::Complex{L}, then the loss type will automatically be set to L.

    • selection_method::Function: Function to select an expression from the Pareto frontier for use in predict. See SymbolicRegression.MLJInterfaceModule.choose_best for an example. This function should return a single integer specifying the index of the expression to use. By default, this maximizes the score (a pound-for-pound rating) of expressions reaching the threshold of 1.5x the minimum loss. To override this at prediction time, you can pass a named tuple with keys data and idx to predict. See the Operations section for details.

    • dimensions_type::AbstractDimensions: The type of dimensions to use when storing the units of the data. By default this is DynamicQuantities.SymbolicDimensions.

    Operations

    • predict(mach, Xnew): Return predictions of the target given features Xnew, which should have the same scitype as X above. The expression used for prediction is chosen by the selection_method function; the chosen index can be seen by viewing report(mach).best_idx.
    • predict(mach, (data=Xnew, idx=i)): Return predictions of the target given features Xnew, which should have the same scitype as X above. By passing a named tuple with keys data and idx, you are able to specify the equation you wish to evaluate in idx.

    Fitted parameters

    The fields of fitted_params(mach) are:

    • best_idx::Int: The index of the best expression in the Pareto frontier, as determined by the selection_method function. Override in predict by passing a named tuple with keys data and idx.
    • equations::Vector{Node{T}}: The expressions discovered by the search, represented in a dominating Pareto frontier (i.e., the best expressions found for each complexity). T is equal to the element type of the passed data.
    • equation_strings::Vector{String}: The expressions discovered by the search, represented as strings for easy inspection.

    Report

    The fields of report(mach) are:

    • best_idx::Int: The index of the best expression in the Pareto frontier, as determined by the selection_method function. Override in predict by passing a named tuple with keys data and idx.
    • equations::Vector{Node{T}}: The expressions discovered by the search, represented in a dominating Pareto frontier (i.e., the best expressions found for each complexity).
    • equation_strings::Vector{String}: The expressions discovered by the search, represented as strings for easy inspection.
    • complexities::Vector{Int}: The complexity of each expression in the Pareto frontier.
    • losses::Vector{L}: The loss of each expression in the Pareto frontier, according to the loss function specified in the model. The type L is the loss type, which is usually the same as the element type of data passed (i.e., T), but can differ if complex data types are passed.
    • scores::Vector{L}: A metric which considers both the complexity and loss of an expression, equal to the change in the log-loss divided by the change in complexity, relative to the previous expression along the Pareto frontier. A larger score aims to indicate an expression is more likely to be the true expression generating the data, but this is very problem-dependent and generally several other factors should be considered.

    Examples

    using MLJ
    +  end
  • populations: How many populations of equations to use.

  • population_size: How many equations in each population.

  • ncycles_per_iteration: How many generations to consider per iteration.

  • tournament_selection_n: Number of expressions considered in each tournament.

  • tournament_selection_p: The fittest expression in a tournament is to be selected with probability p, the next fittest with probability p*(1-p), and so forth.

  • topn: Number of equations to return to the host process, and to consider for the hall of fame.

  • complexity_of_operators: What complexity should be assigned to each operator, and the occurrence of a constant or variable. By default, this is 1 for all operators. Can be a real number as well, in which case the complexity of an expression will be rounded to the nearest integer. Input this in the form of, e.g., [(^) => 3, sin => 2].

  • complexity_of_constants: What complexity should be assigned to use of a constant. By default, this is 1.

  • complexity_of_variables: What complexity should be assigned to each variable. By default, this is 1.

  • alpha: The probability of accepting an equation mutation during regularized evolution is given by exp(-delta_loss/(alpha * T)), where T goes from 1 to 0. Thus, alpha=infinite is the same as no annealing.

  • maxsize: Maximum size of equations during the search.

  • maxdepth: Maximum depth of equations during the search, by default this is set equal to the maxsize.

  • parsimony: A multiplicative factor for how much complexity is punished.

  • dimensional_constraint_penalty: An additive factor if the dimensional constraint is violated.

  • use_frequency: Whether to use a parsimony that adapts to the relative proportion of equations at each complexity; this will ensure that there are a balanced number of equations considered for every complexity.

  • use_frequency_in_tournament: Whether to use the adaptive parsimony described above inside the score, rather than just at the mutation accept/reject stage.

  • adaptive_parsimony_scaling: How much to scale the adaptive parsimony term in the loss. Increase this if the search is spending too much time optimizing the most complex equations.

  • turbo: Whether to use LoopVectorization.@turbo to evaluate expressions. This can be significantly faster, but is only compatible with certain operators. Experimental!

  • migration: Whether to migrate equations between processes.

  • hof_migration: Whether to migrate equations from the hall of fame to processes.

  • fraction_replaced: What fraction of each population to replace with migrated equations at the end of each cycle.

  • fraction_replaced_hof: What fraction to replace with hall of fame equations at the end of each cycle.

  • should_simplify: Whether to simplify equations. If you pass a custom objective, this will be set to false.

  • should_optimize_constants: Whether to use an optimization algorithm to periodically optimize constants in equations.

  • optimizer_nrestarts: How many different random starting positions to consider for optimization of constants.

  • optimizer_algorithm: Select algorithm to use for optimizing constants. Default is "BFGS", but "NelderMead" is also supported.

  • optimizer_options: General options for the constant optimization. For details we refer to the documentation on Optim.Options from the Optim.jl package. Options can be provided here as NamedTuple, e.g. (iterations=16,), as a Dict, e.g. Dict(:x_tol => 1.0e-32,), or as an Optim.Options instance.

  • output_file: What file to store equations to, as a backup.

  • perturbation_factor: When mutating a constant, either multiply or divide by (1+perturbation_factor)^(rand()+1).

  • probability_negate_constant: Probability of negating a constant in the equation when mutating it.

  • mutation_weights: Relative probabilities of the mutations. The struct MutationWeights should be passed to these options. See its documentation on MutationWeights for the different weights.

  • crossover_probability: Probability of performing crossover.

  • annealing: Whether to use simulated annealing.

  • warmup_maxsize_by: Whether to slowly increase the max size from 5 up to maxsize. If nonzero, specifies the fraction through the search at which the maxsize should be reached.

  • verbosity: Whether to print debugging statements or not.

  • print_precision: How many digits to print when printing equations. By default, this is 5.

  • save_to_file: Whether to save equations to a file during the search.

  • bin_constraints: See constraints. This is the same, but specified for binary operators only (for example, if you have an operator that is both a binary and unary operator).

  • una_constraints: Likewise, for unary operators.

  • seed: What random seed to use. nothing uses no seed.

  • progress: Whether to use a progress bar output (verbosity will have no effect).

  • early_stop_condition: Float - whether to stop early if the mean loss gets below this value. Function - a function taking (loss, complexity) as arguments and returning true or false.

  • timeout_in_seconds: Float64 - the time in seconds after which to exit (as an alternative to the number of iterations).

  • max_evals: Int (or Nothing) - the maximum number of evaluations of expressions to perform.

  • skip_mutation_failures: Whether to simply skip over mutations that fail or are rejected, rather than to replace the mutated expression with the original expression and proceed normally.

  • enable_autodiff: Whether to enable automatic differentiation functionality. This is turned off by default. If turned on, this will be turned off if one of the operators does not have well-defined gradients.

  • nested_constraints: Specifies how many times a combination of operators can be nested. For example, [sin => [cos => 0], cos => [cos => 2]] specifies that cos may never appear within a sin, but sin can be nested with itself an unlimited number of times. The second term specifies that cos can be nested up to 2 times within a cos, so that cos(cos(cos(x))) is allowed (as well as any combination of + or - within it), but cos(cos(cos(cos(x)))) is not allowed. When an operator is not specified, it is assumed that it can be nested an unlimited number of times. This requires that there is no operator which is used both in the unary operators and the binary operators (e.g., - could be both subtract, and negation). For binary operators, both arguments are treated the same way, and the max of each argument is constrained.

  • deterministic: Use a global counter for the birth time, rather than calls to time(). This gives perfect resolution, and is therefore deterministic. However, it is not thread safe, and must be used in serial mode.

  • define_helper_functions: Whether to define helper functions for constructing and evaluating trees.

  • niterations::Int=10: The number of iterations to perform the search. More iterations will improve the results.

  • parallelism=:multithreading: What parallelism mode to use. The options are :multithreading, :multiprocessing, and :serial. By default, multithreading will be used. Multithreading uses less memory, but multiprocessing can handle multi-node compute. If using :multithreading mode, all threads available to julia are used. If using :multiprocessing, numprocs processes will be created dynamically if procs is unset. If you have already allocated processes, pass them to the procs argument and they will be used. You may also pass a string instead of a symbol, like "multithreading".

  • numprocs::Union{Int, Nothing}=nothing: The number of processes to use, if you want equation_search to set this up automatically. By default this will be 4, but can be any number (you should pick a number <= the number of cores available).

  • procs::Union{Vector{Int}, Nothing}=nothing: If you have set up a distributed run manually with procs = addprocs() and @everywhere, pass the procs to this keyword argument.

  • addprocs_function::Union{Function, Nothing}=nothing: If using multiprocessing (parallelism=:multiprocessing), and you are not passing procs manually, then processes will be allocated dynamically using addprocs. However, you may also pass a custom function to use instead of addprocs. This function should take a single positional argument, which is the number of processes to use, as well as the lazy keyword argument. For example, if set up on a slurm cluster, you could pass addprocs_function = addprocs_slurm, which will set up slurm processes.

  • heap_size_hint_in_bytes::Union{Int,Nothing}=nothing: On Julia 1.9+, you may set the --heap-size-hint flag on Julia processes, recommending garbage collection once a process is close to the recommended size. This is important for long-running distributed jobs where each process has an independent memory, and can help avoid out-of-memory errors. By default, this is set to Sys.free_memory() / numprocs.

  • runtests::Bool=true: Whether to run (quick) tests before starting the search, to see if there will be any problems during the equation search related to the host environment.

  • loss_type::Type=Nothing: If you would like to use a different type for the loss than for the data you passed, specify the type here. Note that if you pass complex data ::Complex{L}, then the loss type will automatically be set to L.

  • selection_method::Function: Function to select an expression from the Pareto frontier for use in predict. See SymbolicRegression.MLJInterfaceModule.choose_best for an example. This function should return a single integer specifying the index of the expression to use. By default, this maximizes the score (a pound-for-pound rating) of expressions reaching the threshold of 1.5x the minimum loss. To override this at prediction time, you can pass a named tuple with keys data and idx to predict. See the Operations section for details, and the sketch after this list.

  • dimensions_type::AbstractDimensions: The type of dimensions to use when storing the units of the data. By default this is DynamicQuantities.SymbolicDimensions.
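
As an illustrative sketch only (the keyword-argument calling convention below is inferred from the referenced choose_best and is not spelled out above), a custom selection_method simply maps the Pareto frontier to a single index; for example, to always pick the lowest-loss expression regardless of complexity:

  my_selection(; losses, kwargs...) = argmin(losses)   ## ignore trees, scores, complexities
  model = SRRegressor(selection_method = my_selection)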

Operations

  • predict(mach, Xnew): Return predictions of the target given features Xnew, which should have the same scitype as X above. The expression used for prediction is chosen by the selection_method function; the chosen index can be seen by viewing report(mach).best_idx.
  • predict(mach, (data=Xnew, idx=i)): Return predictions of the target given features Xnew, which should have the same scitype as X above. By passing a named tuple with keys data and idx, you are able to specify the equation you wish to evaluate in idx.

Fitted parameters

The fields of fitted_params(mach) are:

  • best_idx::Int: The index of the best expression in the Pareto frontier, as determined by the selection_method function. Override in predict by passing a named tuple with keys data and idx.
  • equations::Vector{Node{T}}: The expressions discovered by the search, represented in a dominating Pareto frontier (i.e., the best expressions found for each complexity). T is equal to the element type of the passed data.
  • equation_strings::Vector{String}: The expressions discovered by the search, represented as strings for easy inspection.

Report

The fields of report(mach) are:

  • best_idx::Int: The index of the best expression in the Pareto frontier, as determined by the selection_method function. Override in predict by passing a named tuple with keys data and idx.
  • equations::Vector{Node{T}}: The expressions discovered by the search, represented in a dominating Pareto frontier (i.e., the best expressions found for each complexity).
  • equation_strings::Vector{String}: The expressions discovered by the search, represented as strings for easy inspection.
  • complexities::Vector{Int}: The complexity of each expression in the Pareto frontier.
  • losses::Vector{L}: The loss of each expression in the Pareto frontier, according to the loss function specified in the model. The type L is the loss type, which is usually the same as the element type of data passed (i.e., T), but can differ if complex data types are passed.
  • scores::Vector{L}: A metric which considers both the complexity and loss of an expression, equal to the change in the log-loss divided by the change in complexity, relative to the previous expression along the Pareto frontier. A larger score aims to indicate an expression is more likely to be the true expression generating the data, but this is very problem-dependent and generally several other factors should be considered.

Examples

using MLJ
 SRRegressor = @load SRRegressor pkg=SymbolicRegression
 X, y = @load_boston
 model = SRRegressor(binary_operators=[+, -, *], unary_operators=[exp], niterations=100)
@@ -26,4 +26,4 @@
 y_hat = predict(mach, X)
 ## View the equation used:
 r = report(mach)
-println("Equation used:", r.equation_strings[r.best_idx])

See also MultitargetSRRegressor.

+println("Equation used:", r.equation_strings[r.best_idx])

See also MultitargetSRRegressor.

diff --git a/dev/models/SVC_LIBSVM/index.html b/dev/models/SVC_LIBSVM/index.html index a200eaa4a..228c7989b 100644 --- a/dev/models/SVC_LIBSVM/index.html +++ b/dev/models/SVC_LIBSVM/index.html @@ -1,5 +1,5 @@ -SVC · MLJ

SVC

SVC

A model type for constructing a C-support vector classifier, based on LIBSVM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SVC = @load SVC pkg=LIBSVM

Do model = SVC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SVC(kernel=...).

This model predicts actual class labels. To predict probabilities, use instead ProbabilisticSVC.

Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): "LIBSVM: a library for support vector machines." ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf.

Training data

In MLJ or MLJBase, bind an instance model to data with one of:

mach = machine(model, X, y)
+SVC · MLJ

SVC

SVC

A model type for constructing a C-support vector classifier, based on LIBSVM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SVC = @load SVC pkg=LIBSVM

Do model = SVC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SVC(kernel=...).

This model predicts actual class labels. To predict probabilities, use instead ProbabilisticSVC.

Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): "LIBSVM: a library for support vector machines." ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf.

Training data

In MLJ or MLJBase, bind an instance model to data with one of:

mach = machine(model, X, y)
 mach = machine(model, X, y, w)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)
  • w: a dictionary of class weights, keyed on levels(y).

Train the machine using fit!(mach, rows=...).
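
A brief sketch (not part of the original docstring) of supplying class weights, using the iris data for illustration:

using MLJ
import LIBSVM
SVC = @load SVC pkg=LIBSVM
X, y = @load_iris
w = Dict("setosa" => 1.0, "versicolor" => 20.0, "virginica" => 1.0)  ## up-weight "versicolor"
mach = machine(SVC(), X, y, w) |> fit!
yhat = predict(mach, X)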

Hyper-parameters

  • kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see "Examples" below).

    • LIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2
    • LIBSVM.Kernel.Polynomial: (x1, x2) -> (gamma*x1'*x2 + coef0)^degree
    • LIBSVM.Kernel.RadialBasis: (x1, x2) -> exp(-gamma*norm(x1 - x2)^2)
    • LIBSVM.Kernel.Sigmoid: (x1, x2) -> tanh(gamma*x1'*x2 + coef0)

    Here gamma, coef0, degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions. See LIBSVM.jl issue 91.

  • gamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. Actual value used appears in the report (see below).

  • coef0 = 0.0: kernel parameter (see above)

  • degree::Int32 = Int32(3): degree in polynomial kernel (see above)

  • cost=1.0 (range (0, Inf)): the parameter denoted $C$ in the cited reference; for greater regularization, decrease cost

  • cachesize=200.0: cache memory size in MB

  • tolerance=0.001: tolerance for the stopping criterion

  • shrinking=true: whether to use shrinking heuristics

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • libsvm_model: the trained model object created by the LIBSVM.jl package
  • encoding: class encoding used internally by libsvm_model - a dictionary of class labels keyed on the internal integer representation

Report

The fields of report(mach) are:

  • gamma: actual value of the kernel parameter gamma used in training

Examples

Using a built-in kernel

using MLJ
 import LIBSVM
 
@@ -33,4 +33,4 @@
 3-element CategoricalArrays.CategoricalArray{String,1,UInt32}:
  "versicolor"
  "versicolor"
- "versicolor"

See also the classifiers ProbabilisticSVC, NuSVC and LinearSVC, as well as LIBSVM.jl and the original C implementation documentation.

+ "versicolor"

See also the classifiers ProbabilisticSVC, NuSVC and LinearSVC, as well as LIBSVM.jl and the original C implementation documentation.

diff --git a/dev/models/SVMClassifier_MLJScikitLearnInterface/index.html b/dev/models/SVMClassifier_MLJScikitLearnInterface/index.html index b77d1ae7f..34338165c 100644 --- a/dev/models/SVMClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/SVMClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -SVMClassifier · MLJ

SVMClassifier

SVMClassifier

A model type for constructing a C-support vector classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SVMClassifier = @load SVMClassifier pkg=MLJScikitLearnInterface

Do model = SVMClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SVMClassifier(C=...).

Hyper-parameters

  • C = 1.0
  • kernel = rbf
  • degree = 3
  • gamma = scale
  • coef0 = 0.0
  • shrinking = true
  • tol = 0.001
  • cache_size = 200
  • max_iter = -1
  • decision_function_shape = ovr
  • random_state = nothing
+SVMClassifier · MLJ

SVMClassifier

SVMClassifier

A model type for constructing a C-support vector classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SVMClassifier = @load SVMClassifier pkg=MLJScikitLearnInterface

Do model = SVMClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SVMClassifier(C=...).

Hyper-parameters

  • C = 1.0
  • kernel = rbf
  • degree = 3
  • gamma = scale
  • coef0 = 0.0
  • shrinking = true
  • tol = 0.001
  • cache_size = 200
  • max_iter = -1
  • decision_function_shape = ovr
  • random_state = nothing
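
For orientation only (a sketch assuming scikit-learn is available through MLJScikitLearnInterface), construction and a quick cross-validated evaluation might look like:

using MLJ
SVMClassifier = @load SVMClassifier pkg=MLJScikitLearnInterface
model = SVMClassifier(C = 1.0, kernel = "rbf")
X, y = @load_iris
evaluate(model, X, y, resampling = CV(nfolds = 5), measure = accuracy)
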
diff --git a/dev/models/SVMLinearClassifier_MLJScikitLearnInterface/index.html b/dev/models/SVMLinearClassifier_MLJScikitLearnInterface/index.html index deb9d1cf3..9d93c4a48 100644 --- a/dev/models/SVMLinearClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/SVMLinearClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -SVMLinearClassifier · MLJ

SVMLinearClassifier

SVMLinearClassifier

A model type for constructing a linear support vector classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SVMLinearClassifier = @load SVMLinearClassifier pkg=MLJScikitLearnInterface

Do model = SVMLinearClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SVMLinearClassifier(penalty=...).

Hyper-parameters

  • penalty = l2
  • loss = squared_hinge
  • dual = true
  • tol = 0.0001
  • C = 1.0
  • multi_class = ovr
  • fit_intercept = true
  • intercept_scaling = 1.0
  • random_state = nothing
  • max_iter = 1000
+SVMLinearClassifier · MLJ

SVMLinearClassifier

SVMLinearClassifier

A model type for constructing a linear support vector classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SVMLinearClassifier = @load SVMLinearClassifier pkg=MLJScikitLearnInterface

Do model = SVMLinearClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SVMLinearClassifier(penalty=...).

Hyper-parameters

  • penalty = l2
  • loss = squared_hinge
  • dual = true
  • tol = 0.0001
  • C = 1.0
  • multi_class = ovr
  • fit_intercept = true
  • intercept_scaling = 1.0
  • random_state = nothing
  • max_iter = 1000
diff --git a/dev/models/SVMLinearRegressor_MLJScikitLearnInterface/index.html b/dev/models/SVMLinearRegressor_MLJScikitLearnInterface/index.html index 79232e7c0..1f7787400 100644 --- a/dev/models/SVMLinearRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/SVMLinearRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -SVMLinearRegressor · MLJ

SVMLinearRegressor

SVMLinearRegressor

A model type for constructing a linear support vector regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SVMLinearRegressor = @load SVMLinearRegressor pkg=MLJScikitLearnInterface

Do model = SVMLinearRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SVMLinearRegressor(epsilon=...).

Hyper-parameters

  • epsilon = 0.0
  • tol = 0.0001
  • C = 1.0
  • loss = epsilon_insensitive
  • fit_intercept = true
  • intercept_scaling = 1.0
  • dual = true
  • random_state = nothing
  • max_iter = 1000
+SVMLinearRegressor · MLJ

SVMLinearRegressor

SVMLinearRegressor

A model type for constructing a linear support vector regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SVMLinearRegressor = @load SVMLinearRegressor pkg=MLJScikitLearnInterface

Do model = SVMLinearRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SVMLinearRegressor(epsilon=...).

Hyper-parameters

  • epsilon = 0.0
  • tol = 0.0001
  • C = 1.0
  • loss = epsilon_insensitive
  • fit_intercept = true
  • intercept_scaling = 1.0
  • dual = true
  • random_state = nothing
  • max_iter = 1000
diff --git a/dev/models/SVMNuClassifier_MLJScikitLearnInterface/index.html b/dev/models/SVMNuClassifier_MLJScikitLearnInterface/index.html index 7b838f105..178773256 100644 --- a/dev/models/SVMNuClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/SVMNuClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -SVMNuClassifier · MLJ

SVMNuClassifier

SVMNuClassifier

A model type for constructing a nu-support vector classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SVMNuClassifier = @load SVMNuClassifier pkg=MLJScikitLearnInterface

Do model = SVMNuClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SVMNuClassifier(nu=...).

Hyper-parameters

  • nu = 0.5
  • kernel = rbf
  • degree = 3
  • gamma = scale
  • coef0 = 0.0
  • shrinking = true
  • tol = 0.001
  • cache_size = 200
  • max_iter = -1
  • decision_function_shape = ovr
  • random_state = nothing
+SVMNuClassifier · MLJ

SVMNuClassifier

SVMNuClassifier

A model type for constructing a nu-support vector classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SVMNuClassifier = @load SVMNuClassifier pkg=MLJScikitLearnInterface

Do model = SVMNuClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SVMNuClassifier(nu=...).

Hyper-parameters

  • nu = 0.5
  • kernel = rbf
  • degree = 3
  • gamma = scale
  • coef0 = 0.0
  • shrinking = true
  • tol = 0.001
  • cache_size = 200
  • max_iter = -1
  • decision_function_shape = ovr
  • random_state = nothing
diff --git a/dev/models/SVMNuRegressor_MLJScikitLearnInterface/index.html b/dev/models/SVMNuRegressor_MLJScikitLearnInterface/index.html index 31797dd28..5d932197e 100644 --- a/dev/models/SVMNuRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/SVMNuRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -SVMNuRegressor · MLJ

SVMNuRegressor

SVMNuRegressor

A model type for constructing a nu-support vector regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SVMNuRegressor = @load SVMNuRegressor pkg=MLJScikitLearnInterface

Do model = SVMNuRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SVMNuRegressor(nu=...).

Hyper-parameters

  • nu = 0.5
  • C = 1.0
  • kernel = rbf
  • degree = 3
  • gamma = scale
  • coef0 = 0.0
  • shrinking = true
  • tol = 0.001
  • cache_size = 200
  • max_iter = -1
+SVMNuRegressor · MLJ

SVMNuRegressor

SVMNuRegressor

A model type for constructing a nu-support vector regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SVMNuRegressor = @load SVMNuRegressor pkg=MLJScikitLearnInterface

Do model = SVMNuRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SVMNuRegressor(nu=...).

Hyper-parameters

  • nu = 0.5
  • C = 1.0
  • kernel = rbf
  • degree = 3
  • gamma = scale
  • coef0 = 0.0
  • shrinking = true
  • tol = 0.001
  • cache_size = 200
  • max_iter = -1
diff --git a/dev/models/SVMRegressor_MLJScikitLearnInterface/index.html b/dev/models/SVMRegressor_MLJScikitLearnInterface/index.html index 45c5d0aa9..b62ee3258 100644 --- a/dev/models/SVMRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/SVMRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -SVMRegressor · MLJ

SVMRegressor

SVMRegressor

A model type for constructing an epsilon-support vector regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SVMRegressor = @load SVMRegressor pkg=MLJScikitLearnInterface

Do model = SVMRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SVMRegressor(kernel=...).

Hyper-parameters

  • kernel = rbf
  • degree = 3
  • gamma = scale
  • coef0 = 0.0
  • tol = 0.001
  • C = 1.0
  • epsilon = 0.1
  • shrinking = true
  • cache_size = 200
  • max_iter = -1
+SVMRegressor · MLJ

SVMRegressor

SVMRegressor

A model type for constructing an epsilon-support vector regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SVMRegressor = @load SVMRegressor pkg=MLJScikitLearnInterface

Do model = SVMRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SVMRegressor(kernel=...).

Hyper-parameters

  • kernel = rbf
  • degree = 3
  • gamma = scale
  • coef0 = 0.0
  • tol = 0.001
  • C = 1.0
  • epsilon = 0.1
  • shrinking = true
  • cache_size = 200
  • max_iter = -1
diff --git a/dev/models/SelfOrganizingMap_SelfOrganizingMaps/index.html b/dev/models/SelfOrganizingMap_SelfOrganizingMaps/index.html index ce6aa61ab..d101000db 100644 --- a/dev/models/SelfOrganizingMap_SelfOrganizingMaps/index.html +++ b/dev/models/SelfOrganizingMap_SelfOrganizingMaps/index.html @@ -1,5 +1,5 @@ -SelfOrganizingMap · MLJ

SelfOrganizingMap

SelfOrganizingMap

A model type for constructing a self organizing map, based on SelfOrganizingMaps.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SelfOrganizingMap = @load SelfOrganizingMap pkg=SelfOrganizingMaps

Do model = SelfOrganizingMap() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SelfOrganizingMap(k=...).

SelfOrganizingMaps implements Kohonen's Self-Organizing Map (Kohonen, T. (1990): "The self-organizing map", Proceedings of the IEEE).

Training data

In MLJ or MLJBase, bind an instance model to data with mach = machine(model, X) where

  • X: an AbstractMatrix or Table of input features whose columns are of scitype Continuous.

Train the machine with fit!(mach, rows=...).

Hyper-parameters

  • k=10: Number of nodes along one side of the SOM grid. There are k² total nodes.
  • η=0.5: Learning rate. Scales the adjustments made to the winning node and its neighbors during each round of training.
  • σ²=0.05: The (squared) neighbor radius. Used to determine the scale of neighbor node adjustments.
  • grid_type=:rectangular: Node grid geometry. One of (:rectangular, :hexagonal, :spherical).
  • η_decay=:exponential: Learning rate schedule function. One of (:exponential, :asymptotic).
  • σ_decay=:exponential: Neighbor radius schedule function. One of (:exponential, :asymptotic, :none).
  • neighbor_function=:gaussian: Kernel function used to make adjustments to neighbor weights. Scale is set by σ². One of (:gaussian, :mexican_hat).
  • matching_distance=euclidean: Distance function from Distances.jl used to determine the winning node.
  • Nepochs=1: Number of times to repeat training on the shuffled dataset.

Operations

  • transform(mach, Xnew): returns the coordinates of the winning SOM node for each instance of Xnew. For a SOM with grid_type :rectangular or :hexagonal, these are Cartesian coordinates. For grid_type :spherical, these are the latitude and longitude in radians.

Fitted parameters

The fields of fitted_params(mach) are:

  • coords: The coordinates of each of the SOM nodes (points in the domain of the map) with shape (k², 2)
  • weights: Array of weight vectors for the SOM nodes (corresponding points in the map's range) of shape (k², input dimension)

Report

The fields of report(mach) are:

  • classes: the index of the winning node for each instance of the training data X interpreted as a class label

Examples

using MLJ
+SelfOrganizingMap · MLJ

SelfOrganizingMap

SelfOrganizingMap

A model type for constructing a self organizing map, based on SelfOrganizingMaps.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SelfOrganizingMap = @load SelfOrganizingMap pkg=SelfOrganizingMaps

Do model = SelfOrganizingMap() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SelfOrganizingMap(k=...).

SelfOrganizingMaps implements Kohonen's Self-Organizing Map (Kohonen, T. (1990): "The self-organizing map", Proceedings of the IEEE).

Training data

In MLJ or MLJBase, bind an instance model to data with mach = machine(model, X) where

  • X: an AbstractMatrix or Table of input features whose columns are of scitype Continuous.

Train the machine with fit!(mach, rows=...).

Hyper-parameters

  • k=10: Number of nodes along one side of the SOM grid. There are k² total nodes.
  • η=0.5: Learning rate. Scales the adjustments made to the winning node and its neighbors during each round of training.
  • σ²=0.05: The (squared) neighbor radius. Used to determine the scale of neighbor node adjustments.
  • grid_type=:rectangular: Node grid geometry. One of (:rectangular, :hexagonal, :spherical).
  • η_decay=:exponential: Learning rate schedule function. One of (:exponential, :asymptotic).
  • σ_decay=:exponential: Neighbor radius schedule function. One of (:exponential, :asymptotic, :none).
  • neighbor_function=:gaussian: Kernel function used to make adjustments to neighbor weights. Scale is set by σ². One of (:gaussian, :mexican_hat).
  • matching_distance=euclidean: Distance function from Distances.jl used to determine the winning node.
  • Nepochs=1: Number of times to repeat training on the shuffled dataset.

Operations

  • transform(mach, Xnew): returns the coordinates of the winning SOM node for each instance of Xnew. For a SOM with grid_type :rectangular or :hexagonal, these are Cartesian coordinates. For grid_type :spherical, these are the latitude and longitude in radians.

Fitted parameters

The fields of fitted_params(mach) are:

  • coords: The coordinates of each of the SOM nodes (points in the domain of the map) with shape (k², 2)
  • weights: Array of weight vectors for the SOM nodes (corresponding points in the map's range) of shape (k², input dimension)

Report

The fields of report(mach) are:

  • classes: the index of the winning node for each instance of the training data X interpreted as a class label

Examples

using MLJ
 som = @load SelfOrganizingMap pkg=SelfOrganizingMaps
 model = som()
 X, y = make_regression(50, 3) ## synthetic data
@@ -7,4 +7,4 @@
 X̃ = transform(mach, X)
 
 rpt = report(mach)
-classes = rpt.classes
+classes = rpt.classes
diff --git a/dev/models/SimpleImputer_BetaML/index.html b/dev/models/SimpleImputer_BetaML/index.html index b3cd6b340..c22328ac0 100644 --- a/dev/models/SimpleImputer_BetaML/index.html +++ b/dev/models/SimpleImputer_BetaML/index.html @@ -1,5 +1,5 @@ -SimpleImputer · MLJ

SimpleImputer

mutable struct SimpleImputer <: MLJModelInterface.Unsupervised

Impute missing values using the feature (column) mean, with optional record normalisation (using an l-norm), from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • statistic::Function: The descriptive statistic of the column (feature) to use as imputed value [def: mean]
  • norm::Union{Nothing, Int64}: Normalise the feature mean by the l-norm of the records [default: nothing]. Use it (e.g. norm=1 to use the l-1 norm) if the records are highly heterogeneous (e.g. quantity of exports of different countries).

Example:

julia> using MLJ
+SimpleImputer · MLJ

SimpleImputer

mutable struct SimpleImputer <: MLJModelInterface.Unsupervised

Impute missing values using the feature (column) mean, with optional record normalisation (using an l-norm), from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • statistic::Function: The descriptive statistic of the column (feature) to use as imputed value [def: mean]
  • norm::Union{Nothing, Int64}: Normalise the feature mean by the l-norm of the records [default: nothing]. Use it (e.g. norm=1 to use the l-1 norm) if the records are highly heterogeneous (e.g. quantity of exports of different countries).

Example:

julia> using MLJ
 
 julia> X = [1 10.5;1.5 missing; 1.8 8; 1.7 15; 3.2 40; missing missing; 3.3 38; missing -2.3; 5.2 -2.4] |> table ;
 
@@ -26,4 +26,4 @@
  0.280952    1.69524
  3.3        38.0
  0.0750839  -2.3
- 5.2        -2.4
+ 5.2 -2.4
diff --git a/dev/models/SpectralClustering_MLJScikitLearnInterface/index.html b/dev/models/SpectralClustering_MLJScikitLearnInterface/index.html index 2cdde8b10..5dd586e86 100644 --- a/dev/models/SpectralClustering_MLJScikitLearnInterface/index.html +++ b/dev/models/SpectralClustering_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -SpectralClustering · MLJ

SpectralClustering

SpectralClustering

A model type for constructing a spectral clustering, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SpectralClustering = @load SpectralClustering pkg=MLJScikitLearnInterface

Do model = SpectralClustering() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SpectralClustering(n_clusters=...).

Apply clustering to a projection of the normalized Laplacian. In practice, spectral clustering is very useful when the structure of the individual clusters is highly non-convex, or, more generally, when a measure of the center and spread of the cluster is not a suitable description of the complete cluster; for instance, when clusters are nested circles on the 2D plane.

+SpectralClustering · MLJ

SpectralClustering

SpectralClustering

A model type for constructing a spectral clustering, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SpectralClustering = @load SpectralClustering pkg=MLJScikitLearnInterface

Do model = SpectralClustering() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SpectralClustering(n_clusters=...).

Apply clustering to a projection of the normalized Laplacian. In practice, spectral clustering is very useful when the structure of the individual clusters is highly non-convex, or, more generally, when a measure of the center and spread of the cluster is not a suitable description of the complete cluster; for instance, when clusters are nested circles on the 2D plane.
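
Example

A minimal sketch (not from the original docstring) of fitting this clustering model on MLJ's synthetic two-moons data, where the clusters are non-convex; the value n_clusters=2 and the dataset size are illustrative assumptions.

using MLJ

SpectralClustering = @load SpectralClustering pkg=MLJScikitLearnInterface

X, _ = make_moons(100)                    ## two interleaved half-moons (non-convex clusters)
model = SpectralClustering(n_clusters=2)
mach = machine(model, X)
fit!(mach)
fitted_params(mach)                       ## inspect the fitted scikit-learn estimator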

diff --git a/dev/models/StableForestClassifier_SIRUS/index.html b/dev/models/StableForestClassifier_SIRUS/index.html index 399cf0d02..282d742a3 100644 --- a/dev/models/StableForestClassifier_SIRUS/index.html +++ b/dev/models/StableForestClassifier_SIRUS/index.html @@ -1,2 +1,2 @@ -StableForestClassifier · MLJ

StableForestClassifier

StableForestClassifier

A model type for constructing a stable forest classifier, based on SIRUS.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

StableForestClassifier = @load StableForestClassifier pkg=SIRUS

Do model = StableForestClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in StableForestClassifier(rng=...).

StableForestClassifier implements the random forest classifier with a stabilized forest structure (Bénard et al., 2021). This stabilization increases stability when extracting rules. The impact on the predictive accuracy compared to standard random forests should be relatively small.

Note

Just like normal random forests, this model is not easily explainable. If you are interested in an explainable model, use the StableRulesClassifier or StableRulesRegressor.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyperparameters

  • rng::AbstractRNG=default_rng(): Random number generator. Using a StableRNG from StableRNGs.jl is advised.
  • partial_sampling::Float64=0.7: Ratio of samples to use in each subset of the data. The default should be fine for most cases.
  • n_trees::Int=1000: The number of trees to use. It is advisable to use at least a thousand trees for better rule selection and, in turn, better predictive performance.
  • max_depth::Int=2: The depth of the tree. A lower depth decreases model complexity and can therefore improve accuracy when the sample size is small (reduce overfitting).
  • q::Int=10: Number of cutpoints to use per feature. The default value should be fine for most situations.
  • min_data_in_leaf::Int=5: Minimum number of data points per leaf.

Fitted parameters

The fields of fitted_params(mach) are:

  • fitresult: A StableForest object.

Operations

  • predict(mach, Xnew): Return a vector of predictions for each row of Xnew.
+StableForestClassifier · MLJ

StableForestClassifier

StableForestClassifier

A model type for constructing a stable forest classifier, based on SIRUS.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

StableForestClassifier = @load StableForestClassifier pkg=SIRUS

Do model = StableForestClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in StableForestClassifier(rng=...).

StableForestClassifier implements the random forest classifier with a stabilized forest structure (Bénard et al., 2021). This stabilization increases stability when extracting rules. The impact on the predictive accuracy compared to standard random forests should be relatively small.

Note

Just like normal random forests, this model is not easily explainable. If you are interested in an explainable model, use the StableRulesClassifier or StableRulesRegressor.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyperparameters

  • rng::AbstractRNG=default_rng(): Random number generator. Using a StableRNG from StableRNGs.jl is advised.
  • partial_sampling::Float64=0.7: Ratio of samples to use in each subset of the data. The default should be fine for most cases.
  • n_trees::Int=1000: The number of trees to use. It is advisable to use at least a thousand trees for better rule selection and, in turn, better predictive performance.
  • max_depth::Int=2: The depth of the tree. A lower depth decreases model complexity and can therefore improve accuracy when the sample size is small (reduce overfitting).
  • q::Int=10: Number of cutpoints to use per feature. The default value should be fine for most situations.
  • min_data_in_leaf::Int=5: Minimum number of data points per leaf.

Fitted parameters

The fields of fitted_params(mach) are:

  • fitresult: A StableForest object.

Operations

  • predict(mach, Xnew): Return a vector of predictions for each row of Xnew.
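
Example

A minimal sketch (not from the original docstring) showing one way to fit the model on synthetic data; make_blobs is MLJ's synthetic-data helper, the hyper-parameter values are illustrative only, and StableRNGs is assumed to be installed (any AbstractRNG also works).

using MLJ
import StableRNGs: StableRNG

StableForestClassifier = @load StableForestClassifier pkg=SIRUS

X, y = make_blobs(200, 4)                 ## synthetic classification data
model = StableForestClassifier(rng=StableRNG(1), n_trees=50)  ## fewer trees than default, for speed
mach = machine(model, X, y)
fit!(mach)
yhat = predict(mach, X)                   ## predictions for the training rows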
diff --git a/dev/models/StableForestRegressor_SIRUS/index.html b/dev/models/StableForestRegressor_SIRUS/index.html index 44a065f0b..2fcef423c 100644 --- a/dev/models/StableForestRegressor_SIRUS/index.html +++ b/dev/models/StableForestRegressor_SIRUS/index.html @@ -1,2 +1,2 @@ -StableForestRegressor · MLJ

StableForestRegressor

StableForestRegressor

A model type for constructing a stable forest regressor, based on SIRUS.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

StableForestRegressor = @load StableForestRegressor pkg=SIRUS

Do model = StableForestRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in StableForestRegressor(rng=...).

StableForestRegressor implements the random forest regressor with a stabilized forest structure (Bénard et al., 2021).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: the target, which can be any AbstractVector whose element scitype is <:Continuous; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyperparameters

  • rng::AbstractRNG=default_rng(): Random number generator. Using a StableRNG from StableRNGs.jl is advised.
  • partial_sampling::Float64=0.7: Ratio of samples to use in each subset of the data. The default should be fine for most cases.
  • n_trees::Int=1000: The number of trees to use. It is advisable to use at least a thousand trees for better rule selection and, in turn, better predictive performance.
  • max_depth::Int=2: The depth of the tree. A lower depth decreases model complexity and can therefore improve accuracy when the sample size is small (reduce overfitting).
  • q::Int=10: Number of cutpoints to use per feature. The default value should be fine for most situations.
  • min_data_in_leaf::Int=5: Minimum number of data points per leaf.

Fitted parameters

The fields of fitted_params(mach) are:

  • fitresult: A StableForest object.

Operations

  • predict(mach, Xnew): Return a vector of predictions for each row of Xnew.
+StableForestRegressor · MLJ

StableForestRegressor

StableForestRegressor

A model type for constructing a stable forest regressor, based on SIRUS.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

StableForestRegressor = @load StableForestRegressor pkg=SIRUS

Do model = StableForestRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in StableForestRegressor(rng=...).

StableForestRegressor implements the random forest regressor with a stabilized forest structure (Bénard et al., 2021).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: the target, which can be any AbstractVector whose element scitype is <:Continuous; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyperparameters

  • rng::AbstractRNG=default_rng(): Random number generator. Using a StableRNG from StableRNGs.jl is advised.
  • partial_sampling::Float64=0.7: Ratio of samples to use in each subset of the data. The default should be fine for most cases.
  • n_trees::Int=1000: The number of trees to use. It is advisable to use at least a thousand trees for better rule selection and, in turn, better predictive performance.
  • max_depth::Int=2: The depth of the tree. A lower depth decreases model complexity and can therefore improve accuracy when the sample size is small (reduce overfitting).
  • q::Int=10: Number of cutpoints to use per feature. The default value should be fine for most situations.
  • min_data_in_leaf::Int=5: Minimum number of data points per leaf.

Fitted parameters

The fields of fitted_params(mach) are:

  • fitresult: A StableForest object.

Operations

  • predict(mach, Xnew): Return a vector of predictions for each row of Xnew.
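
Example

A minimal sketch (not from the original docstring) using MLJ's make_regression helper for synthetic data; the hyper-parameter values are illustrative, and StableRNGs is assumed to be installed.

using MLJ
import StableRNGs: StableRNG

StableForestRegressor = @load StableForestRegressor pkg=SIRUS

X, y = make_regression(200, 4)            ## synthetic regression data
model = StableForestRegressor(rng=StableRNG(1), n_trees=50)
mach = machine(model, X, y)
fit!(mach)
yhat = predict(mach, X)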
diff --git a/dev/models/StableRulesClassifier_SIRUS/index.html b/dev/models/StableRulesClassifier_SIRUS/index.html index 89f349782..43de939c6 100644 --- a/dev/models/StableRulesClassifier_SIRUS/index.html +++ b/dev/models/StableRulesClassifier_SIRUS/index.html @@ -1,2 +1,2 @@ -StableRulesClassifier · MLJ

StableRulesClassifier

StableRulesClassifier

A model type for constructing a stable rules classifier, based on SIRUS.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

StableRulesClassifier = @load StableRulesClassifier pkg=SIRUS

Do model = StableRulesClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in StableRulesClassifier(rng=...).

StableRulesClassifier implements the explainable rule-based model based on a random forest.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyperparameters

  • rng::AbstractRNG=default_rng(): Random number generator. Using a StableRNG from StableRNGs.jl is advised.
  • partial_sampling::Float64=0.7: Ratio of samples to use in each subset of the data. The default should be fine for most cases.
  • n_trees::Int=1000: The number of trees to use. It is advisable to use at least a thousand trees for better rule selection and, in turn, better predictive performance.
  • max_depth::Int=2: The depth of the tree. A lower depth decreases model complexity and can therefore improve accuracy when the sample size is small (reduce overfitting).
  • q::Int=10: Number of cutpoints to use per feature. The default value should be fine for most situations.
  • min_data_in_leaf::Int=5: Minimum number of data points per leaf.
  • max_rules::Int=10: This is the most important hyperparameter after lambda. The more rules, the more accurate the model should be. If this is not the case, tune lambda first. However, more rules will also decrease model interpretability. So, it is important to find a good balance here. In most cases, 10 to 40 rules should provide reasonable accuracy while remaining interpretable.
  • lambda::Float64=1.0: The weights of the final rules are determined via a regularized regression over each rule as a binary feature. This hyperparameter specifies the strength of the ridge (L2) regularizer. SIRUS is very sensitive to the choice of this hyperparameter. Ensure that you try the full range from 10^-4 to 10^4 (e.g., 0.001, 0.01, ..., 100). When trying the range, one good check is to verify that an increase in max_rules increases performance. If this is not the case, then try a different value for lambda.

Fitted parameters

The fields of fitted_params(mach) are:

  • fitresult: A StableRules object.

Operations

  • predict(mach, Xnew): Return a vector of predictions for each row of Xnew.
+StableRulesClassifier · MLJ

StableRulesClassifier

StableRulesClassifier

A model type for constructing a stable rules classifier, based on SIRUS.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

StableRulesClassifier = @load StableRulesClassifier pkg=SIRUS

Do model = StableRulesClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in StableRulesClassifier(rng=...).

StableRulesClassifier implements the explainable rule-based model based on a random forest.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyperparameters

  • rng::AbstractRNG=default_rng(): Random number generator. Using a StableRNG from StableRNGs.jl is advised.
  • partial_sampling::Float64=0.7: Ratio of samples to use in each subset of the data. The default should be fine for most cases.
  • n_trees::Int=1000: The number of trees to use. It is advisable to use at least a thousand trees for better rule selection and, in turn, better predictive performance.
  • max_depth::Int=2: The depth of the tree. A lower depth decreases model complexity and can therefore improve accuracy when the sample size is small (reduce overfitting).
  • q::Int=10: Number of cutpoints to use per feature. The default value should be fine for most situations.
  • min_data_in_leaf::Int=5: Minimum number of data points per leaf.
  • max_rules::Int=10: This is the most important hyperparameter after lambda. The more rules, the more accurate the model should be. If this is not the case, tune lambda first. However, more rules will also decrease model interpretability. So, it is important to find a good balance here. In most cases, 10 to 40 rules should provide reasonable accuracy while remaining interpretable.
  • lambda::Float64=1.0: The weights of the final rules are determined via a regularized regression over each rule as a binary feature. This hyperparameter specifies the strength of the ridge (L2) regularizer. SIRUS is very sensitive to the choice of this hyperparameter. Ensure that you try the full range from 10^-4 to 10^4 (e.g., 0.001, 0.01, ..., 100). When trying the range, one good check is to verify that an increase in max_rules increases performance. If this is not the case, then try a different value for lambda.

Fitted parameters

The fields of fitted_params(mach) are:

  • fitresult: A StableRules object.

Operations

  • predict(mach, Xnew): Return a vector of predictions for each row of Xnew.
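
Example

A minimal sketch (not from the original docstring); make_blobs, the max_rules value, and the use of StableRNGs are illustrative assumptions. The fitted rules can be inspected via fitted_params, as documented above.

using MLJ
import StableRNGs: StableRNG

StableRulesClassifier = @load StableRulesClassifier pkg=SIRUS

X, y = make_blobs(200, 4)                 ## synthetic classification data
model = StableRulesClassifier(rng=StableRNG(1), max_rules=15)
mach = machine(model, X, y)
fit!(mach)
fitted_params(mach).fitresult             ## the extracted StableRules object
yhat = predict(mach, X)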
diff --git a/dev/models/StableRulesRegressor_SIRUS/index.html b/dev/models/StableRulesRegressor_SIRUS/index.html index 9053bd03e..8c869f0a4 100644 --- a/dev/models/StableRulesRegressor_SIRUS/index.html +++ b/dev/models/StableRulesRegressor_SIRUS/index.html @@ -1,2 +1,2 @@ -StableRulesRegressor · MLJ

StableRulesRegressor

StableRulesRegressor

A model type for constructing a stable rules regressor, based on SIRUS.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

StableRulesRegressor = @load StableRulesRegressor pkg=SIRUS

Do model = StableRulesRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in StableRulesRegressor(rng=...).

StableRulesRegressor implements the explainable rule-based regression model based on a random forest.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: the target, which can be any AbstractVector whose element scitype is <:Continuous; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyperparameters

  • rng::AbstractRNG=default_rng(): Random number generator. Using a StableRNG from StableRNGs.jl is advised.
  • partial_sampling::Float64=0.7: Ratio of samples to use in each subset of the data. The default should be fine for most cases.
  • n_trees::Int=1000: The number of trees to use. It is advisable to use at least a thousand trees for better rule selection and, in turn, better predictive performance.
  • max_depth::Int=2: The depth of the tree. A lower depth decreases model complexity and can therefore improve accuracy when the sample size is small (reduce overfitting).
  • q::Int=10: Number of cutpoints to use per feature. The default value should be fine for most situations.
  • min_data_in_leaf::Int=5: Minimum number of data points per leaf.
  • max_rules::Int=10: This is the most important hyperparameter after lambda. The more rules, the more accurate the model should be. If this is not the case, tune lambda first. However, more rules will also decrease model interpretability. So, it is important to find a good balance here. In most cases, 10 to 40 rules should provide reasonable accuracy while remaining interpretable.
  • lambda::Float64=1.0: The weights of the final rules are determined via a regularized regression over each rule as a binary feature. This hyperparameter specifies the strength of the ridge (L2) regularizer. SIRUS is very sensitive to the choice of this hyperparameter. Ensure that you try the full range from 10^-4 to 10^4 (e.g., 0.001, 0.01, ..., 100). When trying the range, one good check is to verify that an increase in max_rules increases performance. If this is not the case, then try a different value for lambda.

Fitted parameters

The fields of fitted_params(mach) are:

  • fitresult: A StableRules object.

Operations

  • predict(mach, Xnew): Return a vector of predictions for each row of Xnew.
+StableRulesRegressor · MLJ

StableRulesRegressor

StableRulesRegressor

A model type for constructing a stable rules regressor, based on SIRUS.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

StableRulesRegressor = @load StableRulesRegressor pkg=SIRUS

Do model = StableRulesRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in StableRulesRegressor(rng=...).

StableRulesRegressor implements the explainable rule-based regression model based on a random forest.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: the target, which can be any AbstractVector whose element scitype is <:Continuous; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyperparameters

  • rng::AbstractRNG=default_rng(): Random number generator. Using a StableRNG from StableRNGs.jl is advised.
  • partial_sampling::Float64=0.7: Ratio of samples to use in each subset of the data. The default should be fine for most cases.
  • n_trees::Int=1000: The number of trees to use. It is advisable to use at least a thousand trees for better rule selection and, in turn, better predictive performance.
  • max_depth::Int=2: The depth of the tree. A lower depth decreases model complexity and can therefore improve accuracy when the sample size is small (reduce overfitting).
  • q::Int=10: Number of cutpoints to use per feature. The default value should be fine for most situations.
  • min_data_in_leaf::Int=5: Minimum number of data points per leaf.
  • max_rules::Int=10: This is the most important hyperparameter after lambda. The more rules, the more accurate the model should be. If this is not the case, tune lambda first. However, more rules will also decrease model interpretability. So, it is important to find a good balance here. In most cases, 10 to 40 rules should provide reasonable accuracy while remaining interpretable.
  • lambda::Float64=1.0: The weights of the final rules are determined via a regularized regression over each rule as a binary feature. This hyperparameter specifies the strength of the ridge (L2) regularizer. SIRUS is very sensitive to the choice of this hyperparameter. Ensure that you try the full range from 10^-4 to 10^4 (e.g., 0.001, 0.01, ..., 100). When trying the range, one good check is to verify that an increase in max_rules increases performance. If this is not the case, then try a different value for lambda.

Fitted parameters

The fields of fitted_params(mach) are:

  • fitresult: A StableRules object.

Operations

  • predict(mach, Xnew): Return a vector of predictions for each row of Xnew.
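
Example

A minimal sketch (not from the original docstring) using MLJ's make_regression helper; the hyper-parameter values are illustrative, and StableRNGs is assumed to be installed. The fitted StableRules object can be inspected via fitted_params.

using MLJ
import StableRNGs: StableRNG

StableRulesRegressor = @load StableRulesRegressor pkg=SIRUS

X, y = make_regression(200, 4)            ## synthetic regression data
model = StableRulesRegressor(rng=StableRNG(1), max_rules=15)
mach = machine(model, X, y)
fit!(mach)
fitted_params(mach).fitresult             ## the extracted StableRules object
yhat = predict(mach, X)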
diff --git a/dev/models/Stack_MLJBase/index.html b/dev/models/Stack_MLJBase/index.html new file mode 100644 index 000000000..3779b8e9b --- /dev/null +++ b/dev/models/Stack_MLJBase/index.html @@ -0,0 +1,12 @@ + +Stack · MLJ

Stack

Union{Types...}

A type union is an abstract type which includes all instances of any of its argument types. The empty union Union{} is the bottom type of Julia.

Examples

julia> IntOrString = Union{Int,AbstractString}
+Union{Int64, AbstractString}
+
+julia> 1 isa IntOrString
+true
+
+julia> "Hello!" isa IntOrString
+true
+
+julia> 1.0 isa IntOrString
+false
diff --git a/dev/models/Standardizer_MLJModels/index.html b/dev/models/Standardizer_MLJModels/index.html index 1bd2aa705..ca13cacde 100644 --- a/dev/models/Standardizer_MLJModels/index.html +++ b/dev/models/Standardizer_MLJModels/index.html @@ -1,5 +1,5 @@ -Standardizer · MLJ

Standardizer

Standardizer

A model type for constructing a standardizer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

Standardizer = @load Standardizer pkg=MLJModels

Do model = Standardizer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in Standardizer(features=...).

Use this model to standardize (whiten) a Continuous vector, or relevant columns of a table. The rescalings applied by this transformer to new data are always those learned during the training phase, which are generally different from what would actually standardize the new data.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any Tables.jl compatible table or any abstract vector with Continuous element scitype (any abstract float vector). Only features in a table with Continuous scitype can be standardized; check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • features: one of the following, with the behavior indicated below:

    • [] (empty, the default): standardize all features (columns) having Continuous element scitype
    • non-empty vector of feature names (symbols): standardize only the Continuous features in the vector (if ignore=false) or Continuous features not named in the vector (ignore=true).
    • function or other callable: standardize a feature if the callable returns true on its name. For example, Standardizer(features = name -> name in [:x1, :x3], ignore = true, count=true) has the same effect as Standardizer(features = [:x1, :x3], ignore = true, count=true), namely to standardize all Continuous and Count features, with the exception of :x1 and :x3.

    Note this behavior is further modified if the ordered_factor or count flags are set to true; see below

  • ignore=false: whether to ignore or standardize specified features, as explained above

  • ordered_factor=false: if true, standardize any OrderedFactor feature wherever a Continuous feature would be standardized, as described above

  • count=false: if true, standardize any Count feature wherever a Continuous feature would be standardized, as described above

Operations

  • transform(mach, Xnew): return Xnew with relevant features standardized according to the rescalings learned during fitting of mach.
  • inverse_transform(mach, Z): apply the inverse transformation to Z, so that inverse_transform(mach, transform(mach, Xnew)) is approximately the same as Xnew; unavailable if ordered_factor or count flags were set to true.

Fitted parameters

The fields of fitted_params(mach) are:

  • features_fit - the names of features that will be standardized
  • means - the corresponding untransformed mean values
  • stds - the corresponding untransformed standard deviations

Report

The fields of report(mach) are:

  • features_fit: the names of features that will be standardized

Examples

using MLJ
+Standardizer · MLJ

Standardizer

Standardizer

A model type for constructing a standardizer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

Standardizer = @load Standardizer pkg=MLJModels

Do model = Standardizer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in Standardizer(features=...).

Use this model to standardize (whiten) a Continuous vector, or relevant columns of a table. The rescalings applied by this transformer to new data are always those learned during the training phase, which are generally different from what would actually standardize the new data.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any Tables.jl compatible table or any abstract vector with Continuous element scitype (any abstract float vector). Only features in a table with Continuous scitype can be standardized; check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • features: one of the following, with the behavior indicated below:

    • [] (empty, the default): standardize all features (columns) having Continuous element scitype
    • non-empty vector of feature names (symbols): standardize only the Continuous features in the vector (if ignore=false) or Continuous features not named in the vector (ignore=true).
    • function or other callable: standardize a feature if the callable returns true on its name. For example, Standardizer(features = name -> name in [:x1, :x3], ignore = true, count=true) has the same effect as Standardizer(features = [:x1, :x3], ignore = true, count=true), namely to standardize all Continuous and Count features, with the exception of :x1 and :x3.

    Note this behavior is further modified if the ordered_factor or count flags are set to true; see below

  • ignore=false: whether to ignore or standardize specified features, as explained above

  • ordered_factor=false: if true, standardize any OrderedFactor feature wherever a Continuous feature would be standardized, as described above

  • count=false: if true, standardize any Count feature wherever a Continuous feature would be standardized, as described above

Operations

  • transform(mach, Xnew): return Xnew with relevant features standardized according to the rescalings learned during fitting of mach.
  • inverse_transform(mach, Z): apply the inverse transformation to Z, so that inverse_transform(mach, transform(mach, Xnew)) is approximately the same as Xnew; unavailable if ordered_factor or count flags were set to true.

Fitted parameters

The fields of fitted_params(mach) are:

  • features_fit - the names of features that will be standardized
  • means - the corresponding untransformed mean values
  • stds - the corresponding untransformed standard deviations

Report

The fields of report(mach) are:

  • features_fit: the names of features that will be standardized

Examples

using MLJ
 
 X = (ordinal1 = [1, 2, 3],
      ordinal2 = coerce([:x, :y, :x], OrderedFactor),
@@ -34,4 +34,4 @@
  ordinal2 = CategoricalValue{Symbol,UInt32}[:x, :y, :x],
  ordinal3 = [10.0, 20.0, 30.0],
  ordinal4 = [1.0, 0.0, -1.0],
- nominal = CategoricalValue{String,UInt32}["Your father", "he", "is"],)

See also OneHotEncoder, ContinuousEncoder.

+ nominal = CategoricalValue{String,UInt32}["Your father", "he", "is"],)

See also OneHotEncoder, ContinuousEncoder.

diff --git a/dev/models/SubspaceLDA_MultivariateStats/index.html b/dev/models/SubspaceLDA_MultivariateStats/index.html index d3bd7884b..4f5eb262c 100644 --- a/dev/models/SubspaceLDA_MultivariateStats/index.html +++ b/dev/models/SubspaceLDA_MultivariateStats/index.html @@ -1,5 +1,5 @@ -SubspaceLDA · MLJ

SubspaceLDA

SubspaceLDA

A model type for constructing a subspace LDA model, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SubspaceLDA = @load SubspaceLDA pkg=MultivariateStats

Do model = SubspaceLDA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SubspaceLDA(normalize=...).

Multiclass subspace linear discriminant analysis (LDA) is a variation on ordinary LDA suitable for high dimensional data, as it avoids storing scatter matrices. For details, refer to the MultivariateStats.jl documentation.

In addition to dimension reduction (using transform), probabilistic classification is provided (using predict). In the case of classification, the class probability for a new observation reflects the proximity of that observation to training observations associated with that class, and how far away the observation is from observations associated with other classes. Specifically, the distances, in the transformed (projected) space, of a new observation from the centroid of each target class are computed; the resulting vector of distances, multiplied by minus one, is passed to a softmax function to obtain a class probability prediction. Here "distance" is computed using a user-specified distance function.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any AbstractVector whose element scitype is OrderedFactor or Multiclass; check the scitype with scitype(y).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • normalize=true: Option to normalize the between class variance for the number of observations in each class, one of true or false.
  • outdim: the output dimension, automatically set to min(indim, nclasses-1) if equal to 0. If a non-zero outdim is passed, then the actual output dimension used is min(rank, outdim) where rank is the rank of the within-class covariance matrix.
  • dist=Distances.SqEuclidean(): The distance metric to use when performing classification (to compare the distance between a new point and centroids in the transformed space); must be a subtype of Distances.SemiMetric from Distances.jl, e.g., Distances.CosineDist.

Operations

  • transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.
  • predict(mach, Xnew): Return predictions of the target given features Xnew, which should have same scitype as X above. Predictions are probabilistic but uncalibrated.
  • predict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • classes: The classes seen during model fitting.
  • projection_matrix: The learned projection matrix, of size (indim, outdim), where indim and outdim are the input and output dimensions respectively (See Report section below).

Report

The fields of report(mach) are:

  • indim: The dimension of the input space, i.e., the number of training features.
  • outdim: The dimension of the transformed space the model is projected to.
  • mean: The mean of the untransformed training data. A vector of length indim.
  • nclasses: The number of classes directly observed in the training data (which can be less than the total number of classes in the class pool)

  • class_means: The class-specific means of the training data. A matrix of size (indim, nclasses) with the ith column being the class-mean of the ith class in classes (See fitted params section above).

  • class_weights: The weights (class counts) of each class. A vector of length nclasses with the ith element being the class weight of the ith class in classes. (See fitted params section above.)
  • explained_variance_ratio: The ratio of explained variance to total variance. Each dimension corresponds to an eigenvalue.

Examples

using MLJ
+SubspaceLDA · MLJ

SubspaceLDA

SubspaceLDA

A model type for constructing a subspace LDA model, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SubspaceLDA = @load SubspaceLDA pkg=MultivariateStats

Do model = SubspaceLDA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SubspaceLDA(normalize=...).

Multiclass subspace linear discriminant analysis (LDA) is a variation on ordinary LDA suitable for high dimensional data, as it avoids storing scatter matrices. For details, refer to the MultivariateStats.jl documentation.

In addition to dimension reduction (using transform), probabilistic classification is provided (using predict). In the case of classification, the class probability for a new observation reflects the proximity of that observation to training observations associated with that class, and how far away the observation is from observations associated with other classes. Specifically, the distances, in the transformed (projected) space, of a new observation from the centroid of each target class are computed; the resulting vector of distances, multiplied by minus one, is passed to a softmax function to obtain a class probability prediction. Here "distance" is computed using a user-specified distance function.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any AbstractVector whose element scitype is OrderedFactor or Multiclass; check the scitype with scitype(y).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • normalize=true: Option to normalize the between class variance for the number of observations in each class, one of true or false.
  • outdim: the output dimension, automatically set to min(indim, nclasses-1) if equal to 0. If a non-zero outdim is passed, then the actual output dimension used is min(rank, outdim) where rank is the rank of the within-class covariance matrix.
  • dist=Distances.SqEuclidean(): The distance metric to use when performing classification (to compare the distance between a new point and centroids in the transformed space); must be a subtype of Distances.SemiMetric from Distances.jl, e.g., Distances.CosineDist.

Operations

  • transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.
  • predict(mach, Xnew): Return predictions of the target given features Xnew, which should have same scitype as X above. Predictions are probabilistic but uncalibrated.
  • predict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • classes: The classes seen during model fitting.
  • projection_matrix: The learned projection matrix, of size (indim, outdim), where indim and outdim are the input and output dimensions respectively (See Report section below).

Report

The fields of report(mach) are:

  • indim: The dimension of the input space, i.e., the number of training features.
  • outdim: The dimension of the transformed space the model is projected to.
  • mean: The mean of the untransformed training data. A vector of length indim.
  • nclasses: The number of classes directly observed in the training data (which can be less than the total number of classes in the class pool)

  • class_means: The class-specific means of the training data. A matrix of size (indim, nclasses) with the ith column being the class-mean of the ith class in classes (See fitted params section above).

  • class_weights: The weights (class counts) of each class. A vector of length nclasses with the ith element being the class weight of the ith class in classes. (See fitted params section above.)
  • explained_variance_ratio: The ratio of explained variance to total variance. Each dimension corresponds to an eigenvalue.

Examples

using MLJ
 
 SubspaceLDA = @load SubspaceLDA pkg=MultivariateStats
 
@@ -10,4 +10,4 @@
 
 Xproj = transform(mach, X)
 y_hat = predict(mach, X)
-labels = predict_mode(mach, X)

See also LDA, BayesianLDA, BayesianSubspaceLDA

+labels = predict_mode(mach, X)

See also LDA, BayesianLDA, BayesianSubspaceLDA

diff --git a/dev/models/TSVDTransformer_TSVD/index.html b/dev/models/TSVDTransformer_TSVD/index.html index b3ec166e5..298e10570 100644 --- a/dev/models/TSVDTransformer_TSVD/index.html +++ b/dev/models/TSVDTransformer_TSVD/index.html @@ -1,2 +1,2 @@ -TSVDTransformer · MLJ
+TSVDTransformer · MLJ
diff --git a/dev/models/TfidfTransformer_MLJText/index.html b/dev/models/TfidfTransformer_MLJText/index.html index f8d514261..4a98aae51 100644 --- a/dev/models/TfidfTransformer_MLJText/index.html +++ b/dev/models/TfidfTransformer_MLJText/index.html @@ -1,5 +1,5 @@ -TfidfTransformer · MLJ

TfidfTransformer

TfidfTransformer

A model type for constructing a TF-IDF transformer, based on MLJText.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

TfidfTransformer = @load TfidfTransformer pkg=MLJText

Do model = TfidfTransformer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in TfidfTransformer(max_doc_freq=...).

The transformer converts a collection of documents, tokenized or pre-parsed as bags of words/ngrams, to a matrix of TF-IDF scores. Here "TF" means term-frequency while "IDF" means inverse document frequency (defined below). The TF-IDF score is the product of the two. This is a common term weighting scheme in information retrieval that has also found good use in document classification. The goal of using TF-IDF instead of the raw frequencies of occurrence of a token in a given document is to scale down the impact of tokens that occur very frequently in a given corpus and that are hence empirically less informative than features that occur in a small fraction of the training corpus.

In textbooks and implementations there is variation in the definition of IDF. Here two IDF definitions are available. The default, smoothed option provides the IDF for a term t as log((1 + n)/(1 + df(t))) + 1, where n is the total number of documents and df(t) the number of documents in which t appears. Setting smooth_idf = false provides an IDF of log(n/df(t)) + 1.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any vector whose elements are either tokenized documents or bags of words/ngrams. Specifically, each element is one of the following:

    • A vector of abstract strings (tokens), e.g., ["I", "like", "Sam", ".", "Sam", "is", "nice", "."] (scitype AbstractVector{Textual})
    • A dictionary of counts, indexed on abstract strings, e.g., Dict("I"=>1, "Sam"=>2, "Sam is"=>1) (scitype Multiset{Textual})
    • A dictionary of counts, indexed on plain ngrams, e.g., Dict(("I",)=>1, ("Sam",)=>2, ("I", "Sam")=>1) (scitype Multiset{<:NTuple{N,Textual} where N}); here a plain ngram is a tuple of abstract strings.

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • max_doc_freq=1.0: Restricts the vocabulary that the transformer will consider. Terms that occur in > max_doc_freq documents will not be considered by the transformer. For example, if max_doc_freq is set to 0.9, terms that are in more than 90% of the documents will be removed.
  • min_doc_freq=0.0: Restricts the vocabulary that the transformer will consider. Terms that occur in < min_doc_freq documents will not be considered by the transformer. A value of 0.01 means that only terms that appear in at least 1% of the documents will be included.
  • smooth_idf=true: Control which definition of IDF to use (see above).

Operations

  • transform(mach, Xnew): Based on the vocabulary and IDF learned in training, return the matrix of TF-IDF scores for Xnew, a vector of the same form as X above. The matrix has size (n, p), where n = length(Xnew) and p the size of the vocabulary. Tokens/ngrams not appearing in the learned vocabulary are scored zero.

Fitted parameters

The fields of fitted_params(mach) are:

  • vocab: A vector containing the strings used in the transformer's vocabulary.
  • idf_vector: The transformer's calculated IDF vector.

Examples

TfidfTransformer accepts a variety of inputs. The example below transforms tokenized documents:

using MLJ
+TfidfTransformer · MLJ

TfidfTransformer

TfidfTransformer

A model type for constructing a TF-IDF transformer, based on MLJText.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

TfidfTransformer = @load TfidfTransformer pkg=MLJText

Do model = TfidfTransformer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in TfidfTransformer(max_doc_freq=...).

The transformer converts a collection of documents, tokenized or pre-parsed as bags of words/ngrams, to a matrix of TF-IDF scores. Here "TF" means term-frequency while "IDF" means inverse document frequency (defined below). The TF-IDF score is the product of the two. This is a common term weighting scheme in information retrieval that has also found good use in document classification. The goal of using TF-IDF instead of the raw frequencies of occurrence of a token in a given document is to scale down the impact of tokens that occur very frequently in a given corpus and that are hence empirically less informative than features that occur in a small fraction of the training corpus.

In textbooks and implementations there is variation in the definition of IDF. Here two IDF definitions are available. The default, smoothed option provides the IDF for a term t as log((1 + n)/(1 + df(t))) + 1, where n is the total number of documents and df(t) the number of documents in which t appears. Setting smooth_idf = false provides an IDF of log(n/df(t)) + 1.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any vector whose elements are either tokenized documents or bags of words/ngrams. Specifically, each element is one of the following:

    • A vector of abstract strings (tokens), e.g., ["I", "like", "Sam", ".", "Sam", "is", "nice", "."] (scitype AbstractVector{Textual})
    • A dictionary of counts, indexed on abstract strings, e.g., Dict("I"=>1, "Sam"=>2, "Sam is"=>1) (scitype Multiset{Textual})
    • A dictionary of counts, indexed on plain ngrams, e.g., Dict(("I",)=>1, ("Sam",)=>2, ("I", "Sam")=>1) (scitype Multiset{<:NTuple{N,Textual} where N}); here a plain ngram is a tuple of abstract strings.

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • max_doc_freq=1.0: Restricts the vocabulary that the transformer will consider. Terms that occur in > max_doc_freq documents will not be considered by the transformer. For example, if max_doc_freq is set to 0.9, terms that are in more than 90% of the documents will be removed.
  • min_doc_freq=0.0: Restricts the vocabulary that the transformer will consider. Terms that occur in < min_doc_freq documents will not be considered by the transformer. A value of 0.01 means that only terms that appear in at least 1% of the documents will be included.
  • smooth_idf=true: Control which definition of IDF to use (see above).

Operations

  • transform(mach, Xnew): Based on the vocabulary and IDF learned in training, return the matrix of TF-IDF scores for Xnew, a vector of the same form as X above. The matrix has size (n, p), where n = length(Xnew) and p the size of the vocabulary. Tokens/ngrams not appearing in the learned vocabulary are scored zero.

Fitted parameters

The fields of fitted_params(mach) are:

  • vocab: A vector containing the strings used in the transformer's vocabulary.
  • idf_vector: The transformer's calculated IDF vector.

Examples

TfidfTransformer accepts a variety of inputs. The example below transforms tokenized documents:

using MLJ
 import TextAnalysis
 
 TfidfTransformer = @load TfidfTransformer pkg=MLJText
@@ -43,4 +43,4 @@
 MLJ.fit!(mach)
 fitted_params(mach)
 
-tfidf_mat = transform(mach, ngram_docs)

See also CountTransformer, BM25Transformer

+tfidf_mat = transform(mach, ngram_docs)

See also CountTransformer, BM25Transformer

diff --git a/dev/models/TheilSenRegressor_MLJScikitLearnInterface/index.html b/dev/models/TheilSenRegressor_MLJScikitLearnInterface/index.html index 8b5590dc0..b86506917 100644 --- a/dev/models/TheilSenRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/TheilSenRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -TheilSenRegressor · MLJ

TheilSenRegressor

TheilSenRegressor

A model type for constructing a Theil-Sen regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

TheilSenRegressor = @load TheilSenRegressor pkg=MLJScikitLearnInterface

Do model = TheilSenRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in TheilSenRegressor(fit_intercept=...).

Hyper-parameters

  • fit_intercept = true
  • copy_X = true
  • max_subpopulation = 10000
  • n_subsamples = nothing
  • max_iter = 300
  • tol = 0.001
  • random_state = nothing
  • n_jobs = nothing
  • verbose = false
+TheilSenRegressor · MLJ

TheilSenRegressor

TheilSenRegressor

A model type for constructing a Theil-Sen regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

TheilSenRegressor = @load TheilSenRegressor pkg=MLJScikitLearnInterface

Do model = TheilSenRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in TheilSenRegressor(fit_intercept=...).

Hyper-parameters

  • fit_intercept = true
  • copy_X = true
  • max_subpopulation = 10000
  • n_subsamples = nothing
  • max_iter = 300
  • tol = 0.001
  • random_state = nothing
  • n_jobs = nothing
  • verbose = false
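
Example

A minimal sketch (not from the original docstring) fitting the regressor on MLJ's synthetic make_regression data; the data dimensions and hyper-parameter choice are illustrative.

using MLJ

TheilSenRegressor = @load TheilSenRegressor pkg=MLJScikitLearnInterface

X, y = make_regression(100, 3)            ## synthetic regression data
model = TheilSenRegressor(fit_intercept=true)
mach = machine(model, X, y)
fit!(mach)
yhat = predict(mach, X)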
diff --git a/dev/models/TomekUndersampler_Imbalance/index.html b/dev/models/TomekUndersampler_Imbalance/index.html index 21ec6b81e..e00f8c042 100644 --- a/dev/models/TomekUndersampler_Imbalance/index.html +++ b/dev/models/TomekUndersampler_Imbalance/index.html @@ -1,5 +1,5 @@ -TomekUndersampler · MLJ

TomekUndersampler

Initiate a Tomek undersampling model with the given hyper-parameters.

TomekUndersampler

A model type for constructing a tomek undersampler, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

TomekUndersampler = @load TomekUndersampler pkg=Imbalance

Do model = TomekUndersampler() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in TomekUndersampler(min_ratios=...).

TomekUndersampler undersamples by removing any point that is part of a Tomek link in the data, as defined in: Ivan Tomek. Two modifications of CNN. IEEE Trans. Systems, Man and Cybernetics, 6:769–772, 1976.

Training data

In MLJ or MLJBase, wrap the model in a machine by mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by model = TomekUndersampler()

Hyperparameters

  • min_ratios=1.0: A parameter that controls the maximum amount of undersampling to be done for each class. If this algorithm cleans the data to an extent that this is violated, some of the cleaned points will be revived randomly so that it is satisfied.

    • Can be a float and in this case each class will be at most undersampled to the size of the minority class times the float. By default, all classes are undersampled to the size of the minority class
    • Can be a dictionary mapping each class label to the float minimum ratio for that class
  • force_min_ratios=false: If true, and this algorithm cleans the data such that the ratios for each class exceed those specified in min_ratios, then further undersampling will be performed so that the final ratios are equal to min_ratios.

  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister.

  • try_preserve_type::Bool=true: When true, the function will try to not change the type of the input table (e.g., DataFrame). However, for some tables, this may not succeed, and in this case, the table returned will be a column table (named-tuple of vectors). This parameter is ignored if the input is a matrix.

Transform Inputs

  • X: A matrix or table of floats where each row is an observation from the dataset
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • X_under: A matrix or table that includes the data after undersampling depending on whether the input X is a matrix or table respectively
  • y_under: An abstract vector of labels corresponding to X_under

Operations

  • transform(mach, X, y): resample the data X and y using TomekUndersampler, returning both the new and original observations

Example

using MLJ
+TomekUndersampler · MLJ

TomekUndersampler

Initiate a Tomek undersampling model with the given hyper-parameters.

TomekUndersampler

A model type for constructing a tomek undersampler, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

TomekUndersampler = @load TomekUndersampler pkg=Imbalance

Do model = TomekUndersampler() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in TomekUndersampler(min_ratios=...).

TomekUndersampler undersamples by removing any point that is part of a Tomek link in the data, as defined in: Ivan Tomek. Two modifications of CNN. IEEE Trans. Systems, Man and Cybernetics, 6:769–772, 1976.

Training data

In MLJ or MLJBase, wrap the model in a machine by mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by model = TomekUndersampler()

Hyperparameters

  • min_ratios=1.0: A parameter that controls the maximum amount of undersampling to be done for each class. If this algorithm cleans the data to an extent that this is violated, some of the cleaned points will be revived randomly so that it is satisfied.

    • Can be a float and in this case each class will be at most undersampled to the size of the minority class times the float. By default, all classes are undersampled to the size of the minority class
    • Can be a dictionary mapping each class label to the float minimum ratio for that class
  • force_min_ratios=false: If true, and this algorithm cleans the data such that the ratios for each class exceed those specified in min_ratios, then further undersampling will be performed so that the final ratios are equal to min_ratios.

  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister.

  • try_preserve_type::Bool=true: When true, the function will try to not change the type of the input table (e.g., DataFrame). However, for some tables, this may not succeed, and in this case, the table returned will be a column table (named-tuple of vectors). This parameter is ignored if the input is a matrix.

Transform Inputs

  • X: A matrix or table of floats where each row is an observation from the dataset
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • X_under: A matrix or table that includes the data after undersampling depending on whether the input X is a matrix or table respectively
  • y_under: An abstract vector of labels corresponding to X_under

Operations

  • transform(mach, X, y): resample the data X and y using TomekUndersampler, returning both the new and original observations

Example

using MLJ
 import Imbalance
 
 ## set probability of each class
@@ -25,4 +25,4 @@
 julia> Imbalance.checkbalance(y_under; ref="minority")
 1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%) 
 2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 22 (115.8%) 
-0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 36 (189.5%)
+0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 36 (189.5%)
diff --git a/dev/models/TransformedTargetModel_MLJBase/index.html b/dev/models/TransformedTargetModel_MLJBase/index.html new file mode 100644 index 000000000..6485b5c49 --- /dev/null +++ b/dev/models/TransformedTargetModel_MLJBase/index.html @@ -0,0 +1,4 @@ + +TransformedTargetModel · MLJ

TransformedTargetModel

TransformedTargetModel(model; transformer=nothing, inverse=nothing, cache=true)

Wrap the supervised or semi-supervised model in a transformation of the target variable.

Here transformer is one of the following:

  • The Unsupervised model that is to transform the training target. By default (inverse=nothing) the parameters learned by this transformer are also used to inverse-transform the predictions of model, which means transformer must implement the inverse_transform method. If this is not the case, specify inverse=identity to suppress inversion.
  • A callable object for transforming the target, such as y -> log.(y). In this case a callable inverse, such as z -> exp.(z), should be specified.

Specify cache=false to prioritize memory over speed, or to guarantee data anonymity.

Specify inverse=identity if model is a probabilistic predictor, as inverse-transforming sample spaces is not supported. Alternatively, replace model with a deterministic model, such as Pipeline(model, y -> mode.(y)).

Examples

A model that normalizes the target before applying ridge regression, with predictions returned on the original scale:

@load RidgeRegressor pkg=MLJLinearModels
+model = RidgeRegressor()
+tmodel = TransformedTargetModel(model, transformer=Standardizer())

A model that applies a static log transformation to the data, again returning predictions to the original scale:

tmodel2 = TransformedTargetModel(model, transformer=y->log.(y), inverse=z->exp.(z))
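
A minimal end-to-end sketch of using the wrapped model (the dataset is synthetic and the choice of RidgeRegressor is illustrative only; MLJLinearModels is assumed to be installed):

using MLJ
X, y = make_regression(100, 3)       # synthetic data, for illustration only
RidgeRegressor = @load RidgeRegressor pkg=MLJLinearModels
tmodel = TransformedTargetModel(RidgeRegressor(), transformer=Standardizer())
mach = machine(tmodel, X, y) |> fit!
predict(mach, X)[1:3]                # predictions are returned on the original target scale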
diff --git a/dev/models/TunedModel_MLJTuning/index.html b/dev/models/TunedModel_MLJTuning/index.html new file mode 100644 index 000000000..f1b9c51c7 --- /dev/null +++ b/dev/models/TunedModel_MLJTuning/index.html @@ -0,0 +1,14 @@ + +TunedModel · MLJ

TunedModel

tuned_model = TunedModel(; model=<model to be mutated>,
+                         tuning=RandomSearch(),
+                         resampling=Holdout(),
+                         range=nothing,
+                         measure=nothing,
+                         n=default_n(tuning, range),
+                         operation=nothing,
+                         other_options...)

Construct a model wrapper for hyper-parameter optimization of a supervised learner, specifying the tuning strategy and model whose hyper-parameters are to be mutated.

tuned_model = TunedModel(; models=<models to be compared>,
+                         resampling=Holdout(),
+                         measure=nothing,
+                         n=length(models),
+                         operation=nothing,
+                         other_options...)

Construct a wrapper for multiple models, for selection of an optimal one (equivalent to specifying tuning=Explicit() and range=models above). Elements of the iterator models need not have a common type, but they must all be Deterministic or all be Probabilistic; this is not checked, but is inferred from the first element generated.

See below for a complete list of options.
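
By way of illustration of this second form, here is a sketch comparing two probabilistic classifiers on synthetic data (the particular models, packages and data generator are illustrative only and assumed to be installed):

using MLJ
X, y = make_blobs(200, 3)            # synthetic classification data
DecisionTreeClassifier = @load DecisionTreeClassifier pkg=DecisionTree
KNNClassifier = @load KNNClassifier pkg=NearestNeighborModels
tuned = TunedModel(models=[DecisionTreeClassifier(), KNNClassifier()],
                   resampling=CV(nfolds=3),
                   measure=log_loss)
mach = machine(tuned, X, y) |> fit!
fitted_params(mach).best_model       # whichever of the two models performed best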

Training

Calling fit!(mach) on a machine mach=machine(tuned_model, X, y) or mach=machine(tuned_model, X, y, w) will:

  • Instigate a search, over clones of model, with the hyperparameter mutations specified by range, for a model optimizing the specified measure, using performance evaluations carried out using the specified tuning strategy and resampling strategy. In the case that models is explicitly listed, the search is instead over the models generated by the iterator models.
  • Fit an internal machine, based on the optimal model fitted_params(mach).best_model, wrapping the optimal model object in all the provided data X, y(, w). Calling predict(mach, Xnew) then returns predictions on Xnew of this internal machine. The final train can be suppressed by setting train_best=false.

Search space

The range objects supported depend on the tuning strategy specified. Query the strategy docstring for details. To optimize over an explicit list v of models of the same type, use strategy=Explicit() and specify model=v[1] and range=v.

The number of models searched is specified by n. If unspecified, then MLJTuning.default_n(tuning, range) is used. When n is increased and fit!(mach) called again, the old search history is re-instated and the search continues where it left off.
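
Here is a sketch of such a search over a single numeric range, including a continuation of the search after increasing n (the model, data and hyper-parameter are illustrative only):

using MLJ
X, y = make_regression(100, 2)       # synthetic data, for illustration only
DecisionTreeRegressor = @load DecisionTreeRegressor pkg=DecisionTree
tree = DecisionTreeRegressor()
r = range(tree, :min_samples_split, lower=2, upper=20)
tuned_tree = TunedModel(model=tree, tuning=RandomSearch(rng=123),
                        resampling=CV(nfolds=3), range=r, measure=l2, n=10)
mach = machine(tuned_tree, X, y) |> fit!
tuned_tree.n = 20
fit!(mach)                           # resumes the search, evaluating 10 further models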

Measures (metrics)

If more than one measure is specified, then only the first is optimized (unless strategy is multi-objective) but the performance against every measure specified will be computed and reported in report(mach).best_performance and other relevant attributes of the generated report. Options exist to pass per-observation weights or class weights to measures; see below.

Important. If a custom measure my_measure is used, and the measure is a score rather than a loss, be sure to check that MLJ.orientation(my_measure) == :score, to ensure maximization of the measure rather than minimization. Override an incorrect value with MLJ.orientation(::typeof(my_measure)) = :score.

Accessing the fitted parameters and other training (tuning) outcomes

A Plots.jl plot of performance estimates is returned by plot(mach) or heatmap(mach).

Once a tuning machine mach has been trained as above, then fitted_params(mach) has these keys/values:

key                | value
best_model         | optimal model instance
best_fitted_params | learned parameters of the optimal model

The named tuple report(mach) includes these keys/values:

key                | value
best_model         | optimal model instance
best_history_entry | corresponding entry in the history, including performance estimate
best_report        | report generated by fitting the optimal model to all data
history            | tuning strategy-specific history of all evaluations

plus other key/value pairs specific to the tuning strategy.

Each element of history is a property-accessible object with these properties:

key         | value
measure     | vector of measures (metrics)
measurement | vector of measurements, one per measure
per_fold    | vector of vectors of unaggregated per-fold measurements
evaluation  | full PerformanceEvaluation/CompactPerformanceEvaluation object
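
Continuing the earlier tuning sketch, these outcomes might be inspected as follows (property names as in the tables above):

fitted_params(mach).best_model                  # optimal model instance
report(mach).best_history_entry.measurement     # performance estimate for the optimal model
first(report(mach).history).evaluation          # full evaluation object for one history entry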

Complete list of key-word options

  • model: Supervised model prototype that is cloned and mutated to generate models for evaluation
  • models: Alternatively, an iterator of MLJ models to be explicitly evaluated. These may have varying types.
  • tuning=RandomSearch(): tuning strategy to be applied (eg, Grid()). See the Tuning Models section of the MLJ manual for a complete list of options.
  • resampling=Holdout(): resampling strategy (eg, Holdout(), CV(), StratifiedCV()) to be applied in performance evaluations
  • measure: measure or measures to be applied in performance evaluations; only the first used in optimization (unless the strategy is multi-objective) but all reported to the history
  • weights: per-observation weights to be passed to the measure(s) in performance evaluations, where supported. Check support with supports_weights(measure).
  • class_weights: class weights to be passed to the measure(s) in performance evaluations, where supported. Check support with supports_class_weights(measure).
  • repeats=1: for generating train/test sets multiple times in resampling ("Monte Carlo" resampling); see evaluate! for details
  • operation/operations: one of predict, predict_mean, predict_mode, predict_median, or predict_joint, or a vector of these of the same length as measure/measures. Automatically inferred if left unspecified.
  • range: range object; tuning strategy documentation describes supported types
  • selection_heuristic: the rule determining how the best model is decided. According to the default heuristic, NaiveSelection(), measure (or the first element of measure) is evaluated for each resample and these per-fold measurements are aggregated. The model with the lowest (resp. highest) aggregate is chosen if the measure is a :loss (resp. a :score).
  • n: number of iterations (ie, models to be evaluated); set by tuning strategy if left unspecified
  • train_best=true: whether to train the optimal model
  • acceleration=default_resource(): mode of parallelization for tuning strategies that support this
  • acceleration_resampling=CPU1(): mode of parallelization for resampling
  • check_measure=true: whether to check that measure is compatible with the specified model and operation
  • cache=true: whether to cache model-specific representations of user-supplied data; set to false to conserve memory. Speed gains likely limited to the case resampling isa Holdout.
  • compact_history=true: whether to write CompactPerformanceEvaluation or regular PerformanceEvaluation objects to the history (accessed via the :evaluation key); the compact form excludes some fields to conserve memory.
diff --git a/dev/models/UnivariateBoxCoxTransformer_MLJModels/index.html b/dev/models/UnivariateBoxCoxTransformer_MLJModels/index.html index 6185c15aa..2115f8e88 100644 --- a/dev/models/UnivariateBoxCoxTransformer_MLJModels/index.html +++ b/dev/models/UnivariateBoxCoxTransformer_MLJModels/index.html @@ -1,5 +1,5 @@ -UnivariateBoxCoxTransformer · MLJ

UnivariateBoxCoxTransformer

UnivariateBoxCoxTransformer

A model type for constructing a single variable Box-Cox transformer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

UnivariateBoxCoxTransformer = @load UnivariateBoxCoxTransformer pkg=MLJModels

Do model = UnivariateBoxCoxTransformer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateBoxCoxTransformer(n=...).

Box-Cox transformations attempt to make data look more normally distributed. This can improve performance and assist in the interpretation of models which suppose that data is generated by a normal distribution.

A Box-Cox transformation (with shift) is of the form

x -> ((x + c)^λ - 1)/λ

for some constant c and real λ, unless λ = 0, in which case the above is replaced with

x -> log(x + c)

Given user-specified hyper-parameters n::Integer and shift::Bool, the present implementation learns the parameters c and λ from the training data as follows: If shift=true and zeros are encountered in the data, then c is set to 0.2 times the data mean. If there are no zeros, then no shift is applied. Finally, n different values of λ between -0.4 and 3 are considered, with λ fixed to the value maximizing normality of the transformed data.

Reference: Wikipedia entry for power transform.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, x)

where

  • x: any abstract vector with element scitype Continuous; check the scitype with scitype(x)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • n=171: number of values of the exponent λ to try
  • shift=false: whether to include a preliminary constant translation in transformations, in the presence of zeros

Operations

  • transform(mach, xnew): apply the Box-Cox transformation learned when fitting mach
  • inverse_transform(mach, z): reconstruct the vector whose transformation, as learned by mach, is z

Fitted parameters

The fields of fitted_params(mach) are:

  • λ: the learned Box-Cox exponent
  • c: the learned shift

Examples

using MLJ
+UnivariateBoxCoxTransformer · MLJ

UnivariateBoxCoxTransformer

UnivariateBoxCoxTransformer

A model type for constructing a single variable Box-Cox transformer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

UnivariateBoxCoxTransformer = @load UnivariateBoxCoxTransformer pkg=MLJModels

Do model = UnivariateBoxCoxTransformer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateBoxCoxTransformer(n=...).

Box-Cox transformations attempt to make data look more normally distributed. This can improve performance and assist in the interpretation of models which suppose that data is generated by a normal distribution.

A Box-Cox transformation (with shift) is of the form

x -> ((x + c)^λ - 1)/λ

for some constant c and real λ, unless λ = 0, in which case the above is replaced with

x -> log(x + c)

Given user-specified hyper-parameters n::Integer and shift::Bool, the present implementation learns the parameters c and λ from the training data as follows: If shift=true and zeros are encountered in the data, then c is set to 0.2 times the data mean. If there are no zeros, then no shift is applied. Finally, n different values of λ between -0.4 and 3 are considered, with λ fixed to the value maximizing normality of the transformed data.
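
For concreteness, the transformation just described can be written as a small Julia function (an illustrative sketch only, not the package's internal code):

boxcox(x; λ, c=0.0) = λ == 0 ? log(x + c) : ((x + c)^λ - 1)/λ

boxcox(2.0, λ=0.5)   # ≈ 0.83
boxcox(2.0, λ=0)     # log(2.0) ≈ 0.69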

Reference: Wikipedia entry for power transform.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, x)

where

  • x: any abstract vector with element scitype Continuous; check the scitype with scitype(x)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • n=171: number of values of the exponent λ to try
  • shift=false: whether to include a preliminary constant translation in transformations, in the presence of zeros

Operations

  • transform(mach, xnew): apply the Box-Cox transformation learned when fitting mach
  • inverse_transform(mach, z): reconstruct the vector whose transformation, as learned by mach, is z

Fitted parameters

The fields of fitted_params(mach) are:

  • λ: the learned Box-Cox exponent
  • c: the learned shift

Examples

using MLJ
 using UnicodePlots
 using Random
 Random.seed!(123)
@@ -38,4 +38,4 @@
    [ 3.0,  4.0) ┤▎ 1
                 └                                        ┘
                                  Frequency
-
+
diff --git a/dev/models/UnivariateDiscretizer_MLJModels/index.html b/dev/models/UnivariateDiscretizer_MLJModels/index.html index cfaf956b7..7f5b542de 100644 --- a/dev/models/UnivariateDiscretizer_MLJModels/index.html +++ b/dev/models/UnivariateDiscretizer_MLJModels/index.html @@ -1,5 +1,5 @@ -UnivariateDiscretizer · MLJ

UnivariateDiscretizer

UnivariateDiscretizer

A model type for constructing a single variable discretizer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

UnivariateDiscretizer = @load UnivariateDiscretizer pkg=MLJModels

Do model = UnivariateDiscretizer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateDiscretizer(n_classes=...).

Discretization converts a Continuous vector into an OrderedFactor vector. In particular, the output is a CategoricalVector (whose reference type is optimized).

The transformation is chosen so that the vector on which the transformer is fit has, in transformed form, an approximately uniform distribution of values. Specifically, if n_classes is the level of discretization, then 2*n_classes - 1 ordered quantiles are computed, the odd quantiles being used for transforming (discretization) and the even quantiles for inverse transforming.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, x)

where

  • x: any abstract vector with Continuous element scitype; check scitype with scitype(x).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • n_classes: number of discrete classes in the output

Operations

  • transform(mach, xnew): discretize xnew according to the discretization learned when fitting mach
  • inverse_transform(mach, z): attempt to reconstruct from z a vector that transforms to give z

Fitted parameters

The fields of fitted_params(mach).fitresult include:

  • odd_quantiles: quantiles used for transforming (length is n_classes - 1)
  • even_quantiles: quantiles used for inverse transforming (length is n_classes)

Example

using MLJ
+UnivariateDiscretizer · MLJ

UnivariateDiscretizer

UnivariateDiscretizer

A model type for constructing a single variable discretizer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

UnivariateDiscretizer = @load UnivariateDiscretizer pkg=MLJModels

Do model = UnivariateDiscretizer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateDiscretizer(n_classes=...).

Discretization converts a Continuous vector into an OrderedFactor vector. In particular, the output is a CategoricalVector (whose reference type is optimized).

The transformation is chosen so that the vector on which the transformer is fit has, in transformed form, an approximately uniform distribution of values. Specifically, if n_classes is the level of discretization, then 2*n_classes - 1 ordered quantiles are computed, the odd quantiles being used for transforming (discretization) and the even quantiles for inverse transforming.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, x)

where

  • x: any abstract vector with Continuous element scitype; check scitype with scitype(x).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • n_classes: number of discrete classes in the output

Operations

  • transform(mach, xnew): discretize xnew according to the discretization learned when fitting mach
  • inverse_transform(mach, z): attempt to reconstruct from z a vector that transforms to give z

Fitted parameters

The fields of fitted_params(mach).fitresult include:

  • odd_quantiles: quantiles used for transforming (length is n_classes - 1)
  • even_quantiles: quantiles used for inverse transforming (length is n_classes)

Example

using MLJ
 using Random
 Random.seed!(123)
 
@@ -30,4 +30,4 @@
  0.012731354778359405
  0.0056265330571125816
  0.005738175684445124
- 0.006835652575801987
+ 0.006835652575801987
diff --git a/dev/models/UnivariateFillImputer_MLJModels/index.html b/dev/models/UnivariateFillImputer_MLJModels/index.html index e76655f9c..9e0773fbf 100644 --- a/dev/models/UnivariateFillImputer_MLJModels/index.html +++ b/dev/models/UnivariateFillImputer_MLJModels/index.html @@ -1,5 +1,5 @@ -UnivariateFillImputer · MLJ

UnivariateFillImputer

UnivariateFillImputer

A model type for constructing a single variable fill imputer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

UnivariateFillImputer = @load UnivariateFillImputer pkg=MLJModels

Do model = UnivariateFillImputer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateFillImputer(continuous_fill=...).

Use this model to impute missing values in a vector with a fixed value learned from the non-missing values of the training vector.

For imputing missing values in tabular data, use FillImputer instead.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, x)

where

  • x: any abstract vector with element scitype Union{Missing, T} where T is a subtype of Continuous, Multiclass, OrderedFactor or Count; check scitype using scitype(x)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • continuous_fill: function or other callable to determine value to be imputed in the case of Continuous (abstract float) data; default is to apply median after skipping missing values
  • count_fill: function or other callable to determine value to be imputed in the case of Count (integer) data; default is to apply rounded median after skipping missing values
  • finite_fill: function or other callable to determine value to be imputed in the case of Multiclass or OrderedFactor data (categorical vectors); default is to apply mode after skipping missing values

Operations

  • transform(mach, xnew): return xnew with missing values imputed with the fill values learned when fitting mach

Fitted parameters

The fields of fitted_params(mach) are:

  • filler: the fill value to be imputed in all new data

Examples

using MLJ
+UnivariateFillImputer · MLJ

UnivariateFillImputer

UnivariateFillImputer

A model type for constructing a single variable fill imputer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

UnivariateFillImputer = @load UnivariateFillImputer pkg=MLJModels

Do model = UnivariateFillImputer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateFillImputer(continuous_fill=...).

Use this model to impute missing values in a vector with a fixed value learned from the non-missing values of the training vector.

For imputing missing values in tabular data, use FillImputer instead.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, x)

where

  • x: any abstract vector with element scitype Union{Missing, T} where T is a subtype of Continuous, Multiclass, OrderedFactor or Count; check scitype using scitype(x)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • continuous_fill: function or other callable to determine the value to be imputed in the case of Continuous (abstract float) data; default is to apply median after skipping missing values (see the sketch following this list)
  • count_fill: function or other callable to determine value to be imputed in the case of Count (integer) data; default is to apply rounded median after skipping missing values
  • finite_fill: function or other callable to determine value to be imputed in the case of Multiclass or OrderedFactor data (categorical vectors); default is to apply mode after skipping missing values
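
To make the defaults above concrete, here is a small illustration of the fill values they correspond to, computed directly with Statistics (this is not the package's internal code):

using Statistics
median(skipmissing([1.0, 2.0, missing, 3.0, 4.0]))     # 2.5, the default continuous_fill value
round(Int, median(skipmissing([2, 3, missing, 7])))    # 3, matching the default count_fill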

Operations

  • transform(mach, xnew): return xnew with missing values imputed with the fill values learned when fitting mach

Fitted parameters

The fields of fitted_params(mach) are:

  • filler: the fill value to be imputed in all new data

Examples

using MLJ
 imputer = UnivariateFillImputer()
 
 x_continuous = [1.0, 2.0, missing, 3.0]
@@ -34,4 +34,4 @@
 3-element Vector{Int64}:
  2
  2
- 5

For imputing tabular data, use FillImputer.

+ 5

For imputing tabular data, use FillImputer.

diff --git a/dev/models/UnivariateStandardizer_MLJModels/index.html b/dev/models/UnivariateStandardizer_MLJModels/index.html index 957667442..4e80aab7c 100644 --- a/dev/models/UnivariateStandardizer_MLJModels/index.html +++ b/dev/models/UnivariateStandardizer_MLJModels/index.html @@ -1,2 +1,2 @@ -UnivariateStandardizer · MLJ

UnivariateStandardizer

UnivariateStandardizer()

Transformer type for standardizing (whitening) single variable data.

This model may be deprecated in the future. Consider using Standardizer, which handles both tabular and univariate data.

+UnivariateStandardizer · MLJ

UnivariateStandardizer

UnivariateStandardizer()

Transformer type for standardizing (whitening) single variable data.

This model may be deprecated in the future. Consider using Standardizer, which handles both tabular and univariate data.

diff --git a/dev/models/UnivariateTimeTypeToContinuous_MLJModels/index.html b/dev/models/UnivariateTimeTypeToContinuous_MLJModels/index.html index a73800f30..459a6df23 100644 --- a/dev/models/UnivariateTimeTypeToContinuous_MLJModels/index.html +++ b/dev/models/UnivariateTimeTypeToContinuous_MLJModels/index.html @@ -1,5 +1,5 @@ -UnivariateTimeTypeToContinuous · MLJ

UnivariateTimeTypeToContinuous

UnivariateTimeTypeToContinuous

A model type for constructing a single variable transformer that creates continuous representations of temporally typed data, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

UnivariateTimeTypeToContinuous = @load UnivariateTimeTypeToContinuous pkg=MLJModels

Do model = UnivariateTimeTypeToContinuous() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateTimeTypeToContinuous(zero_time=...).

Use this model to convert vectors with a TimeType element type to vectors of Float64 type (Continuous element scitype).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, x)

where

  • x: any abstract vector whose element type is a subtype of Dates.TimeType

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • zero_time: the time that is to correspond to 0.0 under transformations, with the type coinciding with the training data element type. If unspecified, the earliest time encountered in training is used.
  • step::Period=Hour(24): time interval to correspond to one unit under transformation

Operations

  • transform(mach, xnew): apply the encoding inferred when mach was fit

Fitted parameters

fitted_params(mach).fitresult is the tuple (zero_time, step) actually used in transformations, which may differ from the user-specified hyper-parameters.

Example

using MLJ
+UnivariateTimeTypeToContinuous · MLJ

UnivariateTimeTypeToContinuous

UnivariateTimeTypeToContinuous

A model type for constructing a single variable transformer that creates continuous representations of temporally typed data, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

UnivariateTimeTypeToContinuous = @load UnivariateTimeTypeToContinuous pkg=MLJModels

Do model = UnivariateTimeTypeToContinuous() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateTimeTypeToContinuous(zero_time=...).

Use this model to convert vectors with a TimeType element type to vectors of Float64 type (Continuous element scitype).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, x)

where

  • x: any abstract vector whose element type is a subtype of Dates.TimeType

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • zero_time: the time that is to correspond to 0.0 under transformations, with the type coinciding with the training data element type. If unspecified, the earliest time encountered in training is used.
  • step::Period=Hour(24): time interval to correspond to one unit under transformation

Operations

  • transform(mach, xnew): apply the encoding inferred when mach was fit

Fitted parameters

fitted_params(mach).fitresult is the tuple (zero_time, step) actually used in transformations, which may differ from the user-specified hyper-parameters.

Example

using MLJ
 using Dates
 
 x = [Date(2001, 1, 1) + Day(i) for i in 0:4]
@@ -15,4 +15,4 @@
  52.42857142857143
  52.57142857142857
  52.714285714285715
- 52.857142
+ 52.857142
diff --git a/dev/models/XGBoostClassifier_XGBoost/index.html b/dev/models/XGBoostClassifier_XGBoost/index.html index 4765672f2..eee371ba3 100644 --- a/dev/models/XGBoostClassifier_XGBoost/index.html +++ b/dev/models/XGBoostClassifier_XGBoost/index.html @@ -1,2 +1,2 @@ -XGBoostClassifier · MLJ

XGBoostClassifier

XGBoostClassifier

A model type for constructing an eXtreme Gradient Boosting Classifier, based on XGBoost.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

XGBoostClassifier = @load XGBoostClassifier pkg=XGBoost

Do model = XGBoostClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in XGBoostClassifier(test=...).

Univariate classification using xgboost.

Training data

In MLJ or MLJBase, bind an instance model to data with

m = machine(model, X, y)

where

  • X: any table of input features, either an AbstractMatrix or Tables.jl-compatible table.
  • y: the target, an AbstractVector with Finite elements.

Train using fit!(m, rows=...).

Hyper-parameters

See https://xgboost.readthedocs.io/en/stable/parameter.html.

+XGBoostClassifier · MLJ

XGBoostClassifier

XGBoostClassifier

A model type for constructing an eXtreme Gradient Boosting Classifier, based on XGBoost.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

XGBoostClassifier = @load XGBoostClassifier pkg=XGBoost

Do model = XGBoostClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in XGBoostClassifier(test=...).

Univariate classification using xgboost.

Training data

In MLJ or MLJBase, bind an instance model to data with

m = machine(model, X, y)

where

  • X: any table of input features, either an AbstractMatrix or Tables.jl-compatible table.
  • y: the target, an AbstractVector with Finite elements.

Train using fit!(m, rows=...).

Hyper-parameters

See https://xgboost.readthedocs.io/en/stable/parameter.html.

diff --git a/dev/models/XGBoostCount_XGBoost/index.html b/dev/models/XGBoostCount_XGBoost/index.html index f2917235a..98981c7e5 100644 --- a/dev/models/XGBoostCount_XGBoost/index.html +++ b/dev/models/XGBoostCount_XGBoost/index.html @@ -1,2 +1,2 @@ -XGBoostCount · MLJ

XGBoostCount

XGBoostCount

A model type for constructing an eXtreme Gradient Boosting Count Regressor, based on XGBoost.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

XGBoostCount = @load XGBoostCount pkg=XGBoost

Do model = XGBoostCount() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in XGBoostCount(test=...).

Univariate discrete regression using xgboost.

Training data

In MLJ or MLJBase, bind an instance model to data with

m = machine(model, X, y)

where

  • X: any table of input features, either an AbstractMatrix or Tables.jl-compatible table.
  • y: the target, an AbstractVector with continuous elements.

Train using fit!(m, rows=...).

Hyper-parameters

See https://xgboost.readthedocs.io/en/stable/parameter.html.

+XGBoostCount · MLJ

XGBoostCount

XGBoostCount

A model type for constructing an eXtreme Gradient Boosting Count Regressor, based on XGBoost.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

XGBoostCount = @load XGBoostCount pkg=XGBoost

Do model = XGBoostCount() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in XGBoostCount(test=...).

Univariate discrete regression using xgboost.

Training data

In MLJ or MLJBase, bind an instance model to data with

m = machine(model, X, y)

where

  • X: any table of input features, either an AbstractMatrix or Tables.jl-compatible table.
  • y: the target, an AbstractVector with continuous elements.

Train using fit!(m, rows=...).

Hyper-parameters

See https://xgboost.readthedocs.io/en/stable/parameter.html.

diff --git a/dev/models/XGBoostRegressor_XGBoost/index.html b/dev/models/XGBoostRegressor_XGBoost/index.html index 2e79caf8c..626bf178c 100644 --- a/dev/models/XGBoostRegressor_XGBoost/index.html +++ b/dev/models/XGBoostRegressor_XGBoost/index.html @@ -1,2 +1,2 @@ -XGBoostRegressor · MLJ

XGBoostRegressor

XGBoostRegressor

A model type for constructing an eXtreme Gradient Boosting Regressor, based on XGBoost.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

XGBoostRegressor = @load XGBoostRegressor pkg=XGBoost

Do model = XGBoostRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in XGBoostRegressor(test=...).

Univariate continuous regression using xgboost.

Training data

In MLJ or MLJBase, bind an instance model to data with

m = machine(model, X, y)

where

  • X: any table of input features whose columns have Continuous element scitype; check column scitypes with schema(X).
  • y: the target, an AbstractVector with Continuous elements; check the scitype with scitype(y).

Train using fit!(m, rows=...).

Hyper-parameters

See https://xgboost.readthedocs.io/en/stable/parameter.html.

+XGBoostRegressor · MLJ

XGBoostRegressor

XGBoostRegressor

A model type for constructing an eXtreme Gradient Boosting Regressor, based on XGBoost.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

XGBoostRegressor = @load XGBoostRegressor pkg=XGBoost

Do model = XGBoostRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in XGBoostRegressor(test=...).

Univariate continuous regression using xgboost.

Training data

In MLJ or MLJBase, bind an instance model to data with

m = machine(model, X, y)

where

  • X: any table of input features whose columns have Continuous element scitype; check column scitypes with schema(X).
  • y: the target, an AbstractVector with Continuous elements; check the scitype with scitype(y).

Train using fit!(m, rows=...).

Hyper-parameters

See https://xgboost.readthedocs.io/en/stable/parameter.html.

diff --git a/dev/modifying_behavior/index.html b/dev/modifying_behavior/index.html index da0421f4a..22615135c 100644 --- a/dev/modifying_behavior/index.html +++ b/dev/modifying_behavior/index.html @@ -1,4 +1,4 @@ -Modifying Behavior · MLJ

Modifying Behavior

To modify the behavior of MLJ you will need to clone the relevant component package (e.g., MLJBase.jl), or a fork thereof, and modify your local Julia environment to use your local clone in place of the official release. For example, you might proceed as follows:

using Pkg
+Modifying Behavior · MLJ

Modifying Behavior

To modify the behavior of MLJ you will need to clone the relevant component package (e.g., MLJBase.jl), or a fork thereof, and modify your local Julia environment to use your local clone in place of the official release. For example, you might proceed as follows:

using Pkg
 Pkg.activate("my_MLJ_enf", shared=true)
-Pkg.develop("path/to/my/local/MLJBase")

To test your local clone, do

Pkg.test("MLJBase")

For more on package management, see here.

+Pkg.develop("path/to/my/local/MLJBase")

To test your local clone, do

Pkg.test("MLJBase")

For more on package management, see here.

diff --git a/dev/more_on_probabilistic_predictors/index.html b/dev/more_on_probabilistic_predictors/index.html deleted file mode 100644 index cbf0dae9c..000000000 --- a/dev/more_on_probabilistic_predictors/index.html +++ /dev/null @@ -1,26 +0,0 @@ - -More on Probabilistic Predictors · MLJ

More on Probabilistic Predictors

Although one can call predict_mode on a probabilistic binary classifier to get deterministic predictions, a more flexible strategy is to wrap the model using BinaryThresholdPredictor, as this allows the user to specify the threshold probability for predicting a positive class. This wrapping converts a probabilistic classifier into a deterministic one.

The positive class is always the second class returned when calling levels on the training target y.

MLJModels.BinaryThresholdPredictorType
BinaryThresholdPredictor(model; threshold=0.5)

Wrap the Probabilistic model, model, assumed to support binary classification, as a Deterministic model, by applying the specified threshold to the positive class probability. In addition to conventional supervised classifiers, it can also be applied to outlier detection models that predict normalized scores - in the form of appropriate UnivariateFinite distributions - that is, models that subtype AbstractProbabilisticUnsupervisedDetector or AbstractProbabilisticSupervisedDetector.

By convention the positive class is the second class returned by levels(y), where y is the target.

If threshold=0.5 then calling predict on the wrapped model is equivalent to calling predict_mode on the atomic model.

Example

Below is an application to the well-known Pima Indian diabetes dataset, including optimization of the threshold parameter, with high balanced accuracy as the objective. The target class distribution is 500 positives to 268 negatives.

Loading the data:

using MLJ, Random
-rng = Xoshiro(123)
-
-diabetes = OpenML.load(43582)
-outcome, X = unpack(diabetes, ==(:Outcome), rng=rng);
-y = coerce(Int.(outcome), OrderedFactor);

Choosing a probabilistic classifier:

EvoTreesClassifier = @load EvoTreesClassifier
-prob_predictor = EvoTreesClassifier()

Wrapping in BinaryThresholdPredictor to get a deterministic classifier with threshold as a new hyperparameter:

point_predictor = BinaryThresholdPredictor(prob_predictor, threshold=0.6)
-Xnew, _ = make_moons(3, rng=rng)
-mach = machine(point_predictor, X, y) |> fit!
-predict(mach, X)[1:3] # [0, 0, 0]

Estimating performance:

balanced = BalancedAccuracy(adjusted=true)
-e = evaluate!(mach, resampling=CV(nfolds=6), measures=[balanced, accuracy])
-e.measurement[1] # 0.405 ± 0.089

Wrapping in a tuning strategy to learn the threshold that maximizes balanced accuracy:

r = range(point_predictor, :threshold, lower=0.1, upper=0.9)
-tuned_point_predictor = TunedModel(
-    point_predictor,
-    tuning=RandomSearch(rng=rng),
-    resampling=CV(nfolds=6),
-    range = r,
-    measure=balanced,
-    n=30,
-)
-mach2 = machine(tuned_point_predictor, X, y) |> fit!
-optimized_point_predictor = report(mach2).best_model
-optimized_point_predictor.threshold # 0.260
-predict(mach2, X)[1:3] # [1, 1, 0]

Estimating the performance of the auto-thresholding model (nested resampling here):

e = evaluate!(mach2, resampling=CV(nfolds=6), measure=[balanced, accuracy])
-e.measurement[1] # 0.477 ± 0.110
source
diff --git a/dev/objects.inv b/dev/objects.inv index c02e8f77e..ba4a1891a 100644 Binary files a/dev/objects.inv and b/dev/objects.inv differ diff --git a/dev/openml_integration/index.html b/dev/openml_integration/index.html index c922dec94..6e6082e39 100644 --- a/dev/openml_integration/index.html +++ b/dev/openml_integration/index.html @@ -1,2 +1,2 @@ -OpenML Integration · MLJ

OpenML Integration

The OpenML platform provides an environment for carrying out and comparing machine learning solutions across a broad collection of public datasets and software platforms.

Integration with the OpenML API is presently limited to querying and downloading datasets.

Documentation is here.

+OpenML Integration · MLJ

OpenML Integration

The OpenML platform provides an environment for carrying out and comparing machine learning solutions across a broad collection of public datasets and software platforms.

Integration with the OpenML API is presently limited to querying and downloading datasets.

Documentation is here.

diff --git a/dev/performance_measures/index.html b/dev/performance_measures/index.html index adc97cedc..dda99db19 100644 --- a/dev/performance_measures/index.html +++ b/dev/performance_measures/index.html @@ -1,8 +1,8 @@ -Performance Measures · MLJ

Performance Measures

Introduction

In MLJ, loss functions, scoring rules, confusion matrices, sensitivities, etc., are collectively referred to as measures. These measures are provided by the package StatisticalMeasures.jl but are immediately available to the MLJ user. Here's a simple example of direct application of the log_loss measure to compute a training loss:

using MLJ
+Performance Measures · MLJ

Performance Measures

Introduction

In MLJ, loss functions, scoring rules, confusion matrices, sensitivities, etc., are collectively referred to as measures. These measures are provided by the package StatisticalMeasures.jl but are immediately available to the MLJ user. Here's a simple example of direct application of the log_loss measure to compute a training loss:

using MLJ
 X, y = @load_iris
 DecisionTreeClassifier = @load DecisionTreeClassifier pkg=DecisionTree
 tree = DecisionTreeClassifier(max_depth=2)
 mach = machine(tree, X, y) |> fit!
 yhat = predict(mach, X)
-log_loss(yhat, y)
0.143176310291424

For more examples of direct measure usage, see the StatisticalMeasures.jl tutorial.

A list of all measures, ready to use after running using MLJ or using StatisticalMeasures, is here. Alternatively, call measures() (experimental) to generate a dictionary keyed on available measure constructors, with measure metadata as values.

Custom measures

Any measure-like object with appropriate calling behavior can be used with MLJ. To quickly build custom measures, we recommend using the package StatisticalMeasuresBase.jl, which provides this tutorial. Note, in particular, that an "atomic" measure can be transformed into a multi-target measure using this package.

Uses of measures

In MLJ, measures are specified:

and elsewhere.

Using LossFunctions.jl

In previous versions of MLJ, measures from LossFunctions.jl were also available. Now measures from that package must be explicitly imported and wrapped, as described here.

Receiver operator characteristics

A related performance evaluation tool provided by StatisticalMeasures.jl, and hence by MLJ, is the roc_curve method:

StatisticalMeasures.roc_curveFunction
roc_curve(ŷ, y) -> false_positive_rates, true_positive_rates, thresholds

Return data for plotting the receiver operator characteristic (ROC curve) for a binary classification problem.

Here ŷ is a vector of UnivariateFinite distributions (from CategoricalDistributions.jl) over the two values taken by the ground truth observations y, a CategoricalVector.

If there are k unique probabilities, then there are correspondingly k thresholds and k+1 "bins" over which the false positive and true positive rates are constant:

  • [0.0 - thresholds[1]]
  • [thresholds[1] - thresholds[2]]
  • ...
  • [thresholds[k] - 1]

Consequently, true_positive_rates and false_positive_rates have length k+1 if thresholds has length k.

To plot the curve using your favorite plotting backend, do something like plot(false_positive_rates, true_positive_rates).

Core algorithm: Functions.roc_curve

See also AreaUnderCurve.

source

Migration guide for changes to measures in MLJBase 1.0

Prior to MLJBase.jl 1.0 (respectively, MLJ.jl version 0.19.6) measures were defined in MLJBase.jl (a dependency of MLJ.jl) but now they are provided by the MLJ.jl dependency StatisticalMeasures.jl. Effects on users are detailed below:

Breaking behavior likely relevant to many users

  • If using MLJBase without MLJ, then, in Julia 1.9 or higher, StatisticalMeasures must be explicitly imported to use measures that were previously part of MLJBase. If using MLJ, then all previous measures are still available, with the exception of those corresponding to LossFunctions.jl (see below).

  • All measures return a single aggregated measurement. In other words, measures previously reporting a measurement per-observation (previously subtyping Unaggregated) no longer do so. To get per-observation measurements, use the new method StatisticalMeasures.measurements(measure, ŷ, y[, weights, class_weights]).

  • The default measure for regression models (used in evaluate/evaluate! when measures is unspecified) is changed from rms to l2=LPLoss(2) (mean sum of squares).

  • MeanAbsoluteError has been removed and instead mae is an alias for LPLoss(p=1).

  • Measures that previously skipped NaN values will now (at least by default) propagate those values. Missing value behavior is unchanged, except some measures that previously did not support missing now do.

  • Aliases for measure types have been removed. For example RMSE (alias for RootMeanSquaredError) is gone. Aliases for instances, such as rms and cross_entropy persist. The exception is precision, for which ppv can be used in its place. (This is to avoid conflict with Base.precision, which was previously pirated.)

  • info(measure) has been decommissioned; query docstrings or access the new measure traits individually instead. These traits are now provided by StatisticalMeasures.jl and are not exported. For example, to access the orientation of the measure rms, do import StatisticalMeasures as SM; SM.orientation(rms).

  • Behavior of the measures() method, to list all measures and associated traits, has changed. It now returns a dictionary instead of a vector of named tuples; measures(predicate) is decommissioned, but measures(needle) is preserved. (This method, owned by StatisticalMeasures.jl, has some other search options, but is experimental.)

  • Measures that were wrappers of losses from LossFunctions.jl are no longer exposed by MLJBase or MLJ. To use such a loss, you must explicitly import LossFunctions and wrap the loss appropriately. See Using losses from LossFunctions.jl for examples.

  • Some user-defined measures working in previous versions of MLJBase.jl may not work without modification, as they must conform to the new StatisticalMeasuresBase.jl API. See this tutorial on how to define new measures.

  • Measures with a "feature argument" X, as in some_measure(ŷ, y, X), are no longer supported. See What is a measure? for allowed signatures in measures.

Packages implementing the MLJ model interface

The migration of measures is not expected to require any changes to the source code in packages providing implementations of the MLJ model interface (MLJModelInterface.jl), such as MLJDecisionTreeInterface.jl and MLJFlux.jl, and this is confirmed by extensive integration tests. However, some current tests will fail if they use MLJBase measures. The following should generally suffice to adapt such tests:

  • Add StatisticalMeasures as test dependency, and add using StatisticalMeasures to your runtests.jl (and/or included submodules).

  • If measures are qualified, as in MLJBase.rms, then the qualification must be removed or changed to StatisticalMeasures.rms, etc.

  • Be aware that the default measure used in methods such as evaluate!, when measure is not specified, is changed from rms to l2 for regression models.

  • Be aware that all measures now report a single aggregated measurement, and never one measurement per observation. See the second point above.

Breaking behavior possibly relevant to some developers

  • The abstract measure types Aggregated, Unaggregated, Measure have been decommissioned. (A measure is now defined purely by its calling behavior.)

  • What were previously exported as measure types are now only constructors.

  • target_scitype(measure) is decommissioned. Related is StatisticalMeasures.observation_scitype(measure) which declares an upper bound on the allowed scitype of a single observation.

  • prediction_type(measure) is decommissioned. Instead use StatisticalMeasures.kind_of_proxy(measure).

  • The trait reports_each_observation is decommissioned. Related is StatisticalMeasures.can_report_unaggregated; if false the new measurements method simply returns n copies of the aggregated measurement, where n is the number of observations provided, instead of individual observation-dependent measurements.

  • aggregation(measure) has been decommissioned. Instead use StatisticalMeasures.external_mode_of_aggregation(measure).

  • instances(measure) has been decommissioned; query docstrings for measure aliases, or follow this example: aliases = measures()[RootMeanSquaredError].aliases.

  • is_feature_dependent(measure) has been decommissioned. Measures consuming feature data are no longer supported; see above.

  • distribution_type(measure) has been decommissioned.

  • docstring(measure) has been decommissioned.

  • Behavior of aggregate has changed.

  • The following traits, previously exported by MLJBase and MLJ, cannot be applied to measures: supports_weights, supports_class_weights, orientation, human_name. Instead use the traits with these names provided by StatisticalMeasures.jl (they will need to be qualified, as in import StatisticalMeasures; StatisticalMeasures.orientation(measure)).

+log_loss(yhat, y)
0.143176310291424

For more examples of direct measure usage, see the StatisticalMeasures.jl tutorial.

A list of all measures, ready to use after running using MLJ or using StatisticalMeasures, is here. Alternatively, call measures() (experimental) to generate a dictionary keyed on available measure constructors, with measure metadata as values.

Custom measures

Any measure-like object with appropriate calling behavior can be used with MLJ. To quickly build custom measures, we recommend using the package StatisticalMeasuresBase.jl, which provides this tutorial. Note, in particular, that an "atomic" measure can be transformed into a multi-target measure using this package.
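
As a minimal illustration of "appropriate calling behavior", any callable accepting predictions and ground truth and returning a number can serve as a measure (the name my_mae below is hypothetical); for support of weights, aggregation modes and measure traits, prefer the StatisticalMeasuresBase.jl route just described:

my_mae(ŷ, y) = sum(abs, ŷ .- y) / length(y)    # per-dataset mean absolute error
my_mae([1.0, 2.0, 3.0], [1.5, 2.0, 2.0])       # 0.5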

Uses of measures

In MLJ, measures are specified:

and elsewhere.

Using LossFunctions.jl

In previous versions of MLJ, measures from LossFunctions.jl were also available. Now measures from that package must be explicitly imported and wrapped, as described here.

Receiver operator characteristics

A related performance evaluation tool provided by StatisticalMeasures.jl, and hence by MLJ, is the roc_curve method:

StatisticalMeasures.roc_curveFunction
roc_curve(ŷ, y) -> false_positive_rates, true_positive_rates, thresholds

Return data for plotting the receiver operator characteristic (ROC curve) for a binary classification problem.

Here ŷ is a vector of UnivariateFinite distributions (from CategoricalDistributions.jl) over the two values taken by the ground truth observations y, a CategoricalVector.

If there are k unique probabilities, then there are correspondingly k thresholds and k+1 "bins" over which the false positive and true positive rates are constant:

  • [0.0 - thresholds[1]]
  • [thresholds[1] - thresholds[2]]
  • ...
  • [thresholds[k] - 1]

Consequently, true_positive_rates and false_positive_rates have length k+1 if thresholds has length k.

To plot the curve using your favorite plotting backend, do something like plot(false_positive_rates, true_positive_rates).
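
A usage sketch (the classifier and synthetic data are illustrative only; any probabilistic binary classifier would do):

using MLJ
X, y = make_moons(100)               # synthetic binary classification data
DecisionTreeClassifier = @load DecisionTreeClassifier pkg=DecisionTree
mach = machine(DecisionTreeClassifier(), X, y) |> fit!
ŷ = predict(mach, X)                 # vector of UnivariateFinite distributions
fprs, tprs, ts = roc_curve(ŷ, y)
# plot(fprs, tprs) with your preferred plotting backend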

Core algorithm: Functions.roc_curve

See also AreaUnderCurve.

source

Migration guide for changes to measures in MLJBase 1.0

Prior to MLJBase.jl 1.0 (respectively, MLJ.jl version 0.19.6) measures were defined in MLJBase.jl (a dependency of MLJ.jl) but now they are provided by the MLJ.jl dependency StatisticalMeasures.jl. Effects on users are detailed below:

Breaking behavior likely relevant to many users

  • If using MLJBase without MLJ, then, in Julia 1.9 or higher, StatisticalMeasures must be explicitly imported to use measures that were previously part of MLJBase. If using MLJ, then all previous measures are still available, with the exception of those corresponding to LossFunctions.jl (see below).

  • All measures return a single aggregated measurement. In other words, measures previously reporting a measurement per-observation (previously subtyping Unaggregated) no longer do so. To get per-observation measurements, use the new method StatisticalMeasures.measurements(measure, ŷ, y[, weights, class_weights]).

  • The default measure for regression models (used in evaluate/evaluate! when measures is unspecified) is changed from rms to l2=LPLoss(2) (mean sum of squares).

  • MeanAbsoluteError has been removed and instead mae is an alias for LPLoss(p=1).

  • Measures that previously skipped NaN values will now (at least by default) propagate those values. Missing value behavior is unchanged, except some measures that previously did not support missing now do.

  • Aliases for measure types have been removed. For example RMSE (alias for RootMeanSquaredError) is gone. Aliases for instances, such as rms and cross_entropy persist. The exception is precision, for which ppv can be used in its place. (This is to avoid conflict with Base.precision, which was previously pirated.)

  • info(measure) has been decommissioned; query docstrings or access the new measure traits individually instead. These traits are now provided by StatisticalMeasures.jl and are not exported. For example, to access the orientation of the measure rms, do import StatisticalMeasures as SM; SM.orientation(rms).

  • Behavior of the measures() method, to list all measures and associated traits, has changed. It now returns a dictionary instead of a vector of named tuples; measures(predicate) is decommissioned, but measures(needle) is preserved. (This method, owned by StatisticalMeasures.jl, has some other search options, but is experimental.)

  • Measures that were wrappers of losses from LossFunctions.jl are no longer exposed by MLJBase or MLJ. To use such a loss, you must explicitly import LossFunctions and wrap the loss appropriately. See Using losses from LossFunctions.jl for examples.

  • Some user-defined measures working in previous versions of MLJBase.jl may not work without modification, as they must conform to the new StatisticalMeasuresBase.jl API. See this tutorial on how to define new measures.

  • Measures with a "feature argument" X, as in some_measure(ŷ, y, X), are no longer supported. See What is a measure? for allowed signatures in measures.

Packages implementing the MLJ model interface

The migration of measures is not expected to require any changes to the source code in packages providing implementations of the MLJ model interface (MLJModelInterface.jl), such as MLJDecisionTreeInterface.jl and MLJFlux.jl, and this is confirmed by extensive integration tests. However, some current tests will fail if they use MLJBase measures. The following should generally suffice to adapt such tests:

  • Add StatisticalMeasures as test dependency, and add using StatisticalMeasures to your runtests.jl (and/or included submodules).

  • If measures are qualified, as in MLJBase.rms, then the qualification must be removed or changed to StatisticalMeasures.rms, etc.

  • Be aware that the default measure used in methods such as evaluate!, when measure is not specified, is changed from rms to l2 for regression models.

  • Be aware that all measures now report a single aggregated measurement, and never one measurement per observation. See the second point above.

Breaking behavior possibly relevant to some developers

  • The abstract measure types Aggregated, Unaggregated, Measure have been decommissioned. (A measure is now defined purely by its calling behavior.)

  • What were previously exported as measure types are now only constructors.

  • target_scitype(measure) is decommissioned. Related is StatisticalMeasures.observation_scitype(measure) which declares an upper bound on the allowed scitype of a single observation.

  • prediction_type(measure) is decommissioned. Instead use StatisticalMeasures.kind_of_proxy(measure).

  • The trait reports_each_observation is decommissioned. Related is StatisticalMeasures.can_report_unaggregated; if false the new measurements method simply returns n copies of the aggregated measurement, where n is the number of observations provided, instead of individual observation-dependent measurements.

  • aggregation(measure) has been decommissioned. Instead use StatisticalMeasures.external_mode_of_aggregation(measure).

  • instances(measure) has been decommissioned; query docstrings for measure aliases, or follow this example: aliases = measures()[RootMeanSquaredError].aliases.

  • is_feature_dependent(measure) has been decommissioned. Measures consuming feature data are no longer supported; see above.

  • distribution_type(measure) has been decommissioned.

  • docstring(measure) has been decommissioned.

  • Behavior of aggregate has changed.

  • The following traits, previously exported by MLJBase and MLJ, cannot be applied to measures: supports_weights, supports_class_weights, orientation, human_name. Instead use the traits with these names provided by StatisticalMeasures.jl (they will need to be qualified, as in import StatisticalMeasures; StatisticalMeasures.orientation(measure)).

diff --git a/dev/preparing_data/index.html b/dev/preparing_data/index.html index 357f0f561..0b101fbf9 100644 --- a/dev/preparing_data/index.html +++ b/dev/preparing_data/index.html @@ -1,5 +1,5 @@ -Preparing Data · MLJ

Preparing Data

Splitting data

MLJ has two tools for splitting data. To split data vertically (that is, to split by observations) use partition. This is commonly applied to a vector of observation indices, but can also be applied to datasets themselves, provided they are vectors, matrices or tables.

To split tabular data horizontally (i.e., break up a table based on feature names) use unpack.

MLJBase.partitionFunction
partition(X, fractions...;
+Preparing Data · MLJ

Preparing Data

Splitting data

MLJ has two tools for splitting data. To split data vertically (that is, to split by observations) use partition. This is commonly applied to a vector of observation indices, but can also be applied to datasets themselves, provided they are vectors, matrices or tables.

To split tabular data horizontally (i.e., break up a table based on feature names) use unpack.

MLJBase.partitionFunction
partition(X, fractions...;
           shuffle=nothing,
           rng=Random.GLOBAL_RNG,
           stratify=nothing,
@@ -13,7 +13,7 @@
 ([1 6], [2 7; 3 8], [4 9; 5 10])
 
 julia> X, y = make_blobs() # a table and vector
-julia> Xtrain, Xtest = partition(X, 0.8, stratify=y)

Here's an example of synchronized partitioning of multiple objects:

julia> (Xtrain, Xtest), (ytrain, ytest) = partition((X, y), 0.8, rng=123, multi=true)

Keywords

  • shuffle=nothing: if set to true, shuffles the rows before taking fractions.

  • rng=Random.GLOBAL_RNG: specifies the random number generator to be used; can be an integer seed. If rng is specified and shuffle === nothing, then shuffle is interpreted as true.

  • stratify=nothing: if a vector is specified, the partition will match the stratification of the given vector. In that case, shuffle cannot be false.

  • multi=false: if true then X is expected to be a tuple of objects sharing a common length, which are each partitioned separately using the same specified fractions and the same row shuffling. Returns a tuple of partitions (a tuple of tuples).

source
MLJBase.unpackFunction
unpack(table, f1, f2, ... fk;
+julia> Xtrain, Xtest = partition(X, 0.8, stratify=y)

Here's an example of synchronized partitioning of multiple objects:

julia> (Xtrain, Xtest), (ytrain, ytest) = partition((X, y), 0.8, rng=123, multi=true)

Keywords

  • shuffle=nothing: if set to true, shuffles the rows before taking fractions.

  • rng=Random.GLOBAL_RNG: specifies the random number generator to be used; can be an integer seed. If rng is specified and shuffle === nothing, then shuffle is interpreted as true.

  • stratify=nothing: if a vector is specified, the partition will match the stratification of the given vector. In that case, shuffle cannot be false.

  • multi=false: if true then X is expected to be a tuple of objects sharing a common length, which are each partitioned separately using the same specified fractions and the same row shuffling. Returns a tuple of partitions (a tuple of tuples).

source
MLJBase.unpackFunction
unpack(table, f1, f2, ... fk;
        wrap_singles=false,
        shuffle=false,
        rng::Union{AbstractRNG,Int,Nothing}=nothing,
@@ -42,12 +42,12 @@
 julia> W  # the column(s) left over
 2-element Vector{String}:
  "A"
- "B"

Whenever a returned table contains a single column, it is converted to a vector unless wrap_singles=true.

If coerce_options are specified then table is first replaced with coerce(table, coerce_options). See ScientificTypes.coerce for details.

If shuffle=true then the rows of table are first shuffled, using the global RNG, unless rng is specified; if rng is an integer, it specifies the seed of an automatically generated Mersenne twister. If rng is specified then shuffle=true is implicit.

source

Bridging the gap between data type and model requirements

As outlined in Getting Started, it is important that the scientific type of data matches the requirements of the model of interest. For example, while the majority of supervised learning models require input features to be Continuous, newcomers to MLJ are sometimes surprised at the disappointing results of model queries such as this one:

X = (height   = [185, 153, 163, 114, 180],
+ "B"

Whenever a returned table contains a single column, it is converted to a vector unless wrap_singles=true.

If coerce_options are specified then table is first replaced with coerce(table, coerce_options). See ScientificTypes.coerce for details.

If shuffle=true then the rows of table are first shuffled, using the global RNG, unless rng is specified; if rng is an integer, it specifies the seed of an automatically generated Mersenne twister. If rng is specified then shuffle=true is implicit.
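
For example, here is a sketch of the wrap_singles behavior described above (the table t is hypothetical):

t = (x = [1, 2, 3], w = [4.0, 5.0, 6.0], y = ["a", "b", "c"]);
y, X = unpack(t, ==(:y))                         ## y is a Vector; X is a table with columns :x, :w
ytable, X = unpack(t, ==(:y), wrap_singles=true) ## ytable is a one-column table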

source

Bridging the gap between data type and model requirements

As outlined in Getting Started, it is important that the scientific type of data matches the requirements of the model of interest. For example, while the majority of supervised learning models require input features to be Continuous, newcomers to MLJ are sometimes surprised at the disappointing results of model queries such as this one:

X = (height   = [185, 153, 163, 114, 180],
      time     = [2.3, 4.5, 4.2, 1.8, 7.1],
      mark     = ["D", "A", "C", "B", "A"],
      admitted = ["yes", "no", missing, "yes"]);
 y = [12.4, 12.5, 12.0, 31.9, 43.0]
-models(matching(X, y))
4-element Vector{NamedTuple{(:name, :package_name, :is_supervised, :abstract_type, :deep_properties, :docstring, :fit_data_scitype, :human_name, :hyperparameter_ranges, :hyperparameter_types, :hyperparameters, :implemented_methods, :inverse_transform_scitype, :is_pure_julia, :is_wrapper, :iteration_parameter, :load_path, :package_license, :package_url, :package_uuid, :predict_scitype, :prediction_type, :reporting_operations, :reports_feature_importances, :supports_class_weights, :supports_online, :supports_training_losses, :supports_weights, :transform_scitype, :input_scitype, :target_scitype, :output_scitype)}}:
+models(matching(X, y))
4-element Vector{NamedTuple{(:name, :package_name, :is_supervised, :abstract_type, :constructor, :deep_properties, :docstring, :fit_data_scitype, :human_name, :hyperparameter_ranges, :hyperparameter_types, :hyperparameters, :implemented_methods, :inverse_transform_scitype, :is_pure_julia, :is_wrapper, :iteration_parameter, :load_path, :package_license, :package_url, :package_uuid, :predict_scitype, :prediction_type, :reporting_operations, :reports_feature_importances, :supports_class_weights, :supports_online, :supports_training_losses, :supports_weights, :transform_scitype, :input_scitype, :target_scitype, :output_scitype)}}:
  (name = ConstantRegressor, package_name = MLJModels, ... )
  (name = DecisionTreeRegressor, package_name = BetaML, ... )
  (name = DeterministicConstantRegressor, package_name = MLJModels, ... )
@@ -106,4 +106,4 @@
 │ admitted__no  │ Continuous │ Float64 │
 │ admitted__yes │ Continuous │ Float64 │
 └───────────────┴────────────┴─────────┘
-

Such transformations can also be combined in a pipeline; see Linear Pipelines.

Scientific type coercion

Scientific type coercion is documented in detail at ScientificTypesBase.jl. See also the tutorial at this MLJ Workshop (specifically, here) and this Data Science in Julia tutorial.

Also relevant is the section, Working with Categorical Data.

Data transformation

MLJ's Built-in transformers are documented at Transformers and Other Unsupervised Models. The most relevant in the present context are: ContinuousEncoder, OneHotEncoder, FeatureSelector and FillImputer. A Gaussian mixture model imputer is provided by BetaML, which can be loaded with

MissingImputator = @load MissingImputator pkg=BetaML

This MLJ Workshop and the "End-to-end examples" in the Data Science in Julia tutorials give further illustrations of data preprocessing in MLJ.

+

Such transformations can also be combined in a pipeline; see Linear Pipelines.
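
For instance, a minimal sketch of such a pipeline (imputation followed by encoding; names other than FillImputer and ContinuousEncoder are made up, and X is assumed to be a table already coerced to suitable scientific types):

using MLJ

pipe = FillImputer() |> ContinuousEncoder()
mach = machine(pipe, X) |> fit!
W = transform(mach, X)    ## W has only Continuous columns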

Scientific type coercion

Scientific type coercion is documented in detail at ScientificTypesBase.jl. See also the tutorial at this MLJ Workshop (specifically, here) and this Data Science in Julia tutorial.

Also relevant is the section, Working with Categorical Data.
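
For example, the table X introduced above might be coerced as follows (a sketch; the choice of target scitypes is illustrative):

using MLJ

Xfixed = coerce(X,
                :height   => Continuous,
                :mark     => OrderedFactor,
                :admitted => Multiclass);
schema(Xfixed)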

Data transformation

MLJ's Built-in transformers are documented at Transformers and Other Unsupervised Models. The most relevant in the present context are: ContinuousEncoder, OneHotEncoder, FeatureSelector and FillImputer. A Gaussian mixture model imputer is provided by BetaML, which can be loaded with

MissingImputator = @load MissingImputator pkg=BetaML
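
Once loaded, the imputer can be used like other unsupervised MLJ models (a sketch, assuming the standard machine/fit!/transform workflow; X is a numerical table containing missing values):

imputer = MissingImputator()
mach = machine(imputer, X) |> fit!
Xfilled = transform(mach, X)    ## missings replaced by Gaussian-mixture-based estimates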

This MLJ Workshop and the "End-to-end examples" in the Data Science in Julia tutorials give further illustrations of data preprocessing in MLJ.

diff --git a/dev/quick_start_guide_to_adding_models/index.html b/dev/quick_start_guide_to_adding_models/index.html index b4bda0a81..9858cbe46 100644 --- a/dev/quick_start_guide_to_adding_models/index.html +++ b/dev/quick_start_guide_to_adding_models/index.html @@ -1,2 +1,2 @@ -Quick-Start Guide to Adding Models · MLJ
+Quick-Start Guide to Adding Models · MLJ
diff --git a/dev/search_index.js b/dev/search_index.js index 5de475047..1b4d80874 100644 --- a/dev/search_index.js +++ b/dev/search_index.js @@ -1,3 +1,3 @@ var documenterSearchIndex = {"docs": -[{"location":"models/LDA_MultivariateStats/#LDA_MultivariateStats","page":"LDA","title":"LDA","text":"","category":"section"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"LDA","category":"page"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"A model type for constructing a linear discriminant analysis model, based on MultivariateStats.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"LDA = @load LDA pkg=MultivariateStats","category":"page"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"Do model = LDA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LDA(method=...).","category":"page"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"Multiclass linear discriminant analysis learns a projection in a space of features to a lower dimensional space, in a way that attempts to preserve as much as possible the degree to which the classes of a discrete target variable can be discriminated. This can be used either for dimension reduction of the features (see transform below) or for probabilistic classification of the target (see predict below).","category":"page"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"In the case of prediction, the class probability for a new observation reflects the proximity of that observation to training observations associated with that class, and how far away the observation is from observations associated with other classes. Specifically, the distances, in the transformed (projected) space, of a new observation, from the centroid of each target class, is computed; the resulting vector of distances, multiplied by minus one, is passed to a softmax function to obtain a class probability prediction. 
Here \"distance\" is computed using a user-specified distance function.","category":"page"},{"location":"models/LDA_MultivariateStats/#Training-data","page":"LDA","title":"Training data","text":"","category":"section"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"Here:","category":"page"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).\ny is the target, which can be any AbstractVector whose element scitype is OrderedFactor or Multiclass; check the scitype with scitype(y)","category":"page"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/LDA_MultivariateStats/#Hyper-parameters","page":"LDA","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"method::Symbol=:gevd: The solver, one of :gevd or :whiten methods.\ncov_w::StatsBase.SimpleCovariance(): An estimator for the within-class covariance (used in computing the within-class scatter matrix, Sw). Any robust estimator from CovarianceEstimation.jl can be used.\ncov_b::StatsBase.SimpleCovariance(): The same as cov_w but for the between-class covariance (used in computing the between-class scatter matrix, Sb).\noutdim::Int=0: The output dimension, i.e dimension of the transformed space, automatically set to min(indim, nclasses-1) if equal to 0.\nregcoef::Float64=1e-6: The regularization coefficient. A positive value regcoef*eigmax(Sw) where Sw is the within-class scatter matrix, is added to the diagonal of Sw to improve numerical stability. This can be useful if using the standard covariance estimator.\ndist=Distances.SqEuclidean(): The distance metric to use when performing classification (to compare the distance between a new point and centroids in the transformed space); must be a subtype of Distances.SemiMetric from Distances.jl, e.g., Distances.CosineDist.","category":"page"},{"location":"models/LDA_MultivariateStats/#Operations","page":"LDA","title":"Operations","text":"","category":"section"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.\npredict(mach, Xnew): Return predictions of the target given features Xnew having the same scitype as X above. 
Predictions are probabilistic but uncalibrated.\npredict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.","category":"page"},{"location":"models/LDA_MultivariateStats/#Fitted-parameters","page":"LDA","title":"Fitted parameters","text":"","category":"section"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"classes: The classes seen during model fitting.\nprojection_matrix: The learned projection matrix, of size (indim, outdim), where indim and outdim are the input and output dimensions respectively (See Report section below).","category":"page"},{"location":"models/LDA_MultivariateStats/#Report","page":"LDA","title":"Report","text":"","category":"section"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"The fields of report(mach) are:","category":"page"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"indim: The dimension of the input space i.e the number of training features.\noutdim: The dimension of the transformed space the model is projected to.\nmean: The mean of the untransformed training data. A vector of length indim.\nnclasses: The number of classes directly observed in the training data (which can be less than the total number of classes in the class pool).\nclass_means: The class-specific means of the training data. A matrix of size (indim, nclasses) with the ith column being the class-mean of the ith class in classes (See fitted params section above).\nclass_weights: The weights (class counts) of each class. A vector of length nclasses with the ith element being the class weight of the ith class in classes. (See fitted params section above.)\nSb: The between class scatter matrix.\nSw: The within class scatter matrix.","category":"page"},{"location":"models/LDA_MultivariateStats/#Examples","page":"LDA","title":"Examples","text":"","category":"section"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"using MLJ\n\nLDA = @load LDA pkg=MultivariateStats\n\nX, y = @load_iris ## a table and a vector\n\nmodel = LDA()\nmach = machine(model, X, y) |> fit!\n\nXproj = transform(mach, X)\ny_hat = predict(mach, X)\nlabels = predict_mode(mach, X)\n","category":"page"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"See also BayesianLDA, SubspaceLDA, BayesianSubspaceLDA","category":"page"},{"location":"models/NuSVC_LIBSVM/#NuSVC_LIBSVM","page":"NuSVC","title":"NuSVC","text":"","category":"section"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"NuSVC","category":"page"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"A model type for constructing a ν-support vector classifier, based on LIBSVM.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"NuSVC = @load NuSVC pkg=LIBSVM","category":"page"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"Do model = NuSVC() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in NuSVC(kernel=...).","category":"page"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"This model is a re-parameterization of the SVC classifier, where nu replaces cost, and is mathematically equivalent to it. The parameter nu allows more direct control over the number of support vectors (see under \"Hyper-parameters\").","category":"page"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"This model always predicts actual class labels. For probabilistic predictions, use instead ProbabilisticNuSVC.","category":"page"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): \"LIBSVM: a library for support vector machines.\" ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf. ","category":"page"},{"location":"models/NuSVC_LIBSVM/#Training-data","page":"NuSVC","title":"Training data","text":"","category":"section"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"In MLJ or MLJBase, bind an instance model to data with:","category":"page"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"where","category":"page"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)\ny: is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)","category":"page"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/NuSVC_LIBSVM/#Hyper-parameters","page":"NuSVC","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see \"Examples\" below).\nLIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2\nLIBSVM.Kernel.Polynomial: (x1, x2) -> gamma*x1'*x2 + coef0)^degree\nLIBSVM.Kernel.RadialBasis: (x1, x2) -> (exp(-gamma*norm(x1 - x2)^2))\nLIBSVM.Kernel.Sigmoid: (x1, x2) - > tanh(gamma*x1'*x2 + coef0)\nHere gamma, coef0, degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions. See LIVSVM.jl issue91\ngamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. Actual value used appears in the report (see below).\ncoef0 = 0.0: kernel parameter (see above)\ndegree::Int32 = Int32(3): degree in polynomial kernel (see above)\nnu=0.5 (range (0, 1]): An upper bound on the fraction of margin errors and a lower bound of the fraction of support vectors. Denoted ν in the cited paper. 
Changing nu changes the thickness of the margin (a neighborhood of the decision surface) and a margin error is said to have occurred if a training observation lies on the wrong side of the surface or within the margin.\ncachesize=200.0 cache memory size in MB\ntolerance=0.001: tolerance for the stopping criterion\nshrinking=true: whether to use shrinking heuristics","category":"page"},{"location":"models/NuSVC_LIBSVM/#Operations","page":"NuSVC","title":"Operations","text":"","category":"section"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above.","category":"page"},{"location":"models/NuSVC_LIBSVM/#Fitted-parameters","page":"NuSVC","title":"Fitted parameters","text":"","category":"section"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"libsvm_model: the trained model object created by the LIBSVM.jl package\nencoding: class encoding used internally by libsvm_model - a dictionary of class labels keyed on the internal integer representation","category":"page"},{"location":"models/NuSVC_LIBSVM/#Report","page":"NuSVC","title":"Report","text":"","category":"section"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"The fields of report(mach) are:","category":"page"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"gamma: actual value of the kernel parameter gamma used in training","category":"page"},{"location":"models/NuSVC_LIBSVM/#Examples","page":"NuSVC","title":"Examples","text":"","category":"section"},{"location":"models/NuSVC_LIBSVM/#Using-a-built-in-kernel","page":"NuSVC","title":"Using a built-in kernel","text":"","category":"section"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"using MLJ\nimport LIBSVM\n\nNuSVC = @load NuSVC pkg=LIBSVM ## model type\nmodel = NuSVC(kernel=LIBSVM.Kernel.Polynomial) ## instance\n\nX, y = @load_iris ## table, vector\nmach = machine(model, X, y) |> fit!\n\nXnew = (sepal_length = [6.4, 7.2, 7.4],\n sepal_width = [2.8, 3.0, 2.8],\n petal_length = [5.6, 5.8, 6.1],\n petal_width = [2.1, 1.6, 1.9],)\n\njulia> yhat = predict(mach, Xnew)\n3-element CategoricalArrays.CategoricalArray{String,1,UInt32}:\n \"virginica\"\n \"virginica\"\n \"virginica\"","category":"page"},{"location":"models/NuSVC_LIBSVM/#User-defined-kernels","page":"NuSVC","title":"User-defined kernels","text":"","category":"section"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"k(x1, x2) = x1'*x2 ## equivalent to `LIBSVM.Kernel.Linear`\nmodel = NuSVC(kernel=k)\nmach = machine(model, X, y) |> fit!\n\njulia> yhat = predict(mach, Xnew)\n3-element CategoricalArrays.CategoricalArray{String,1,UInt32}:\n \"virginica\"\n \"virginica\"\n \"virginica\"","category":"page"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"See also the classifiers SVC and LinearSVC, LIVSVM.jl and the original C implementation. 
documentation.","category":"page"},{"location":"models/KMedoidsClusterer_BetaML/#KMedoidsClusterer_BetaML","page":"KMedoidsClusterer","title":"KMedoidsClusterer","text":"","category":"section"},{"location":"models/KMedoidsClusterer_BetaML/","page":"KMedoidsClusterer","title":"KMedoidsClusterer","text":"mutable struct KMedoidsClusterer <: MLJModelInterface.Unsupervised","category":"page"},{"location":"models/KMedoidsClusterer_BetaML/#Parameters:","page":"KMedoidsClusterer","title":"Parameters:","text":"","category":"section"},{"location":"models/KMedoidsClusterer_BetaML/","page":"KMedoidsClusterer","title":"KMedoidsClusterer","text":"n_classes::Int64: Number of classes to discriminate the data [def: 3]\ndist::Function: Function to employ as distance. Default to the Euclidean distance. Can be one of the predefined distances (l1_distance, l2_distance, l2squared_distance), cosine_distance), any user defined function accepting two vectors and returning a scalar or an anonymous function with the same characteristics.\ninitialisation_strategy::String: The computation method of the vector of the initial representatives. One of the following:\n\"random\": randomly in the X space\n\"grid\": using a grid approach\n\"shuffle\": selecting randomly within the available points [default]\n\"given\": using a provided set of initial representatives provided in the initial_representatives parameter\ninitial_representatives::Union{Nothing, Matrix{Float64}}: Provided (K x D) matrix of initial representatives (useful only with initialisation_strategy=\"given\") [default: nothing]\nrng::Random.AbstractRNG: Random Number Generator [deafult: Random.GLOBAL_RNG]","category":"page"},{"location":"models/KMedoidsClusterer_BetaML/","page":"KMedoidsClusterer","title":"KMedoidsClusterer","text":"The K-medoids clustering algorithm with customisable distance function, from the Beta Machine Learning Toolkit (BetaML).","category":"page"},{"location":"models/KMedoidsClusterer_BetaML/","page":"KMedoidsClusterer","title":"KMedoidsClusterer","text":"Similar to K-Means, but the \"representatives\" (the cetroids) are guaranteed to be one of the training points. 
The algorithm work with any arbitrary distance measure.","category":"page"},{"location":"models/KMedoidsClusterer_BetaML/#Notes:","page":"KMedoidsClusterer","title":"Notes:","text":"","category":"section"},{"location":"models/KMedoidsClusterer_BetaML/","page":"KMedoidsClusterer","title":"KMedoidsClusterer","text":"data must be numerical\nonline fitting (re-fitting with new data) is supported","category":"page"},{"location":"models/KMedoidsClusterer_BetaML/#Example:","page":"KMedoidsClusterer","title":"Example:","text":"","category":"section"},{"location":"models/KMedoidsClusterer_BetaML/","page":"KMedoidsClusterer","title":"KMedoidsClusterer","text":"julia> using MLJ\n\njulia> X, y = @load_iris;\n\njulia> modelType = @load KMedoidsClusterer pkg = \"BetaML\" verbosity=0\nBetaML.Clustering.KMedoidsClusterer\n\njulia> model = modelType()\nKMedoidsClusterer(\n n_classes = 3, \n dist = BetaML.Clustering.var\"#39#41\"(), \n initialisation_strategy = \"shuffle\", \n initial_representatives = nothing, \n rng = Random._GLOBAL_RNG())\n\njulia> mach = machine(model, X);\n\njulia> fit!(mach);\n[ Info: Training machine(KMedoidsClusterer(n_classes = 3, …), …).\n\njulia> classes_est = predict(mach, X);\n\njulia> hcat(y,classes_est)\n150×2 CategoricalArrays.CategoricalArray{Union{Int64, String},2,UInt32}:\n \"setosa\" 3\n \"setosa\" 3\n \"setosa\" 3\n ⋮ \n \"virginica\" 1\n \"virginica\" 1\n \"virginica\" 2","category":"page"},{"location":"benchmarking/#Benchmarking","page":"Benchmarking","title":"Benchmarking","text":"","category":"section"},{"location":"benchmarking/","page":"Benchmarking","title":"Benchmarking","text":"This feature not yet available.","category":"page"},{"location":"benchmarking/","page":"Benchmarking","title":"Benchmarking","text":"CONTRIBUTE.md","category":"page"},{"location":"weights/#Weights","page":"Weights","title":"Weights","text":"","category":"section"},{"location":"weights/","page":"Weights","title":"Weights","text":"In machine learning it is possible to assign each observation an independent significance, or weight, either in training or in performance evaluation, or both.","category":"page"},{"location":"weights/","page":"Weights","title":"Weights","text":"There are two kinds of weights in use in MLJ:","category":"page"},{"location":"weights/","page":"Weights","title":"Weights","text":"per observation weights (also just called weights) refer to weight vectors of the same length as the number of observations\nclass weights refer to dictionaries keyed on the target classes (levels) for use in classification problems","category":"page"},{"location":"weights/#Specifying-weights-in-training","page":"Weights","title":"Specifying weights in training","text":"","category":"section"},{"location":"weights/","page":"Weights","title":"Weights","text":"To specify weights in training you bind the weights to the model along with the data when constructing a machine. For supervised models the weights are specified last:","category":"page"},{"location":"weights/","page":"Weights","title":"Weights","text":"KNNRegressor = @load KNNRegressor\nmodel = KNNRegressor()\nX, y = make_regression(10, 3)\nw = rand(length(y))\n\nmach = machine(model, X, y, w) |> fit!","category":"page"},{"location":"weights/","page":"Weights","title":"Weights","text":"Note that model supports per observation weights if supports_weights(model) is true. 
To list all such models, do","category":"page"},{"location":"weights/","page":"Weights","title":"Weights","text":"models() do m\n m.supports_weights\nend","category":"page"},{"location":"weights/","page":"Weights","title":"Weights","text":"The model model supports class weights if supports_class_weights(model) is true.","category":"page"},{"location":"weights/#Specifying-weights-in-performance-evaluation","page":"Weights","title":"Specifying weights in performance evaluation","text":"","category":"section"},{"location":"weights/","page":"Weights","title":"Weights","text":"When calling a measure (metric) that supports weights, provide the weights as the last argument, as in","category":"page"},{"location":"weights/","page":"Weights","title":"Weights","text":"_, y = @load_iris\nŷ = shuffle(y)\nw = Dict(\"versicolor\" => 1, \"setosa\" => 2, \"virginica\"=> 3)\nmacro_f1score(ŷ, y, w)","category":"page"},{"location":"weights/","page":"Weights","title":"Weights","text":"Some measures also support specification of a class weight dictionary. For details see the StatisticalMeasures.jl tutorial.","category":"page"},{"location":"weights/","page":"Weights","title":"Weights","text":"To pass weights to all the measures listed in an evaluate!/evaluate call, use the keyword specifiers weights=... or class_weights=.... For details, see Evaluating Model Performance.","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/#NeuralNetworkClassifier_MLJFlux","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"","category":"section"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"NeuralNetworkClassifier","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"A model type for constructing a neural network classifier, based on MLJFlux.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"Do model = NeuralNetworkClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in NeuralNetworkClassifier(builder=...).","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"NeuralNetworkClassifier is for training a data-dependent Flux.jl neural network for making probabilistic predictions of a Multiclass or OrderedFactor target, given a table of Continuous features. Users provide a recipe for constructing the network, based on properties of the data that is encountered, by specifying an appropriate builder. 
See MLJFlux documentation for more on builders.","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/#Training-data","page":"NeuralNetworkClassifier","title":"Training data","text":"","category":"section"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"Here:","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"X is either a Matrix or any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). If X is a Matrix, it is assumed to have columns corresponding to features and rows corresponding to observations.\ny is the target, which can be any AbstractVector whose element scitype is Multiclass or OrderedFactor; check the scitype with scitype(y)","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"Train the machine with fit!(mach, rows=...).","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/#Hyper-parameters","page":"NeuralNetworkClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"builder=MLJFlux.Short(): An MLJFlux builder that constructs a neural network. Possible builders include: MLJFlux.Linear, MLJFlux.Short, and MLJFlux.MLP. See MLJFlux.jl documentation for examples of user-defined builders. See also finaliser below.\noptimiser::Flux.Adam(): A Flux.Optimise optimiser. The optimiser performs the updating of the weights of the network. For further reference, see the Flux optimiser documentation. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.\nloss=Flux.crossentropy: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. For a classification task, the most natural loss functions are:\nFlux.crossentropy: Standard multiclass classification loss, also known as the log loss.\nFlux.logitcrossentopy: Mathematically equal to crossentropy, but numerically more stable than finalising the outputs with softmax and then calculating crossentropy. You will need to specify finaliser=identity to remove MLJFlux's default softmax finaliser, and understand that the output of predict is then unnormalized (no longer probabilistic).\nFlux.tversky_loss: Used with imbalanced data to give more weight to false negatives.\nFlux.focal_loss: Used with highly imbalanced data. Weights harder examples more than easier examples.\nCurrently MLJ measures are not supported values of loss.\nepochs::Int=10: The duration of training, in epochs. 
Typically, one epoch represents one pass through the complete the training dataset.\nbatch_size::int=1: the batch size to be used for training, representing the number of samples per update of the network weights. Typically, batch size is between 8 and\nIncreassing batch size may accelerate training if acceleration=CUDALibs() and a\nGPU is available.\nlambda::Float64=0: The strength of the weight regularization penalty. Can be any value in the range [0, ∞).\nalpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.\nrng::Union{AbstractRNG, Int64}: The random number generator or seed used during training.\noptimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.\nacceleration::AbstractResource=CPU1(): Defines on what hardware training is done. For Training on GPU, use CUDALibs().\nfinaliser=Flux.softmax: The final activation function of the neural network (applied after the network defined by builder). Defaults to Flux.softmax.","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/#Operations","page":"NeuralNetworkClassifier","title":"Operations","text":"","category":"section"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"predict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above. Predictions are probabilistic but uncalibrated.\npredict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/#Fitted-parameters","page":"NeuralNetworkClassifier","title":"Fitted parameters","text":"","category":"section"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"chain: The trained \"chain\" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network. This includes the final layer specified by finaliser (eg, softmax).","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/#Report","page":"NeuralNetworkClassifier","title":"Report","text":"","category":"section"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"The fields of report(mach) are:","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"training_losses: A vector of training losses (penalised if lambda != 0) in historical order, of length epochs + 1. The first element is the pre-training loss.","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/#Examples","page":"NeuralNetworkClassifier","title":"Examples","text":"","category":"section"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"In this example we build a classification model using the Iris dataset. 
This is a very basic example, using a default builder and no standardization. For a more advanced illustration, see NeuralNetworkRegressor or ImageClassifier, and examples in the MLJFlux.jl documentation.","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"using MLJ\nusing Flux\nimport RDatasets","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"First, we can load the data:","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"iris = RDatasets.dataset(\"datasets\", \"iris\");\ny, X = unpack(iris, ==(:Species), rng=123); ## a vector and a table\nNeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux\nclf = NeuralNetworkClassifier()","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"Next, we can train the model:","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"mach = machine(clf, X, y)\nfit!(mach)","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"We can train the model in an incremental fashion, altering the learning rate as we go, provided optimizer_changes_trigger_retraining is false (the default). Here, we also change the number of (total) iterations:","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"clf.optimiser.eta = clf.optimiser.eta * 2\nclf.epochs = clf.epochs + 5\n\nfit!(mach, verbosity=2) ## trains 5 more epochs","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"We can inspect the mean training loss using the cross_entropy function:","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"training_loss = cross_entropy(predict(mach, X), y) |> mean","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"And we can access the Flux chain (model) using fitted_params:","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"chain = fitted_params(mach).chain","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"Finally, we can see how the out-of-sample performance changes over time, using MLJ's learning_curve function:","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"r = range(clf, :epochs, lower=1, upper=200, scale=:log10)\ncurve = learning_curve(clf, X, y,\n range=r,\n resampling=Holdout(fraction_train=0.7),\n measure=cross_entropy)\nusing Plots\nplot(curve.parameter_values,\n curve.measurements,\n xlab=curve.parameter_name,\n xscale=curve.parameter_scale,\n ylab = \"Cross 
Entropy\")\n","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"See also ImageClassifier.","category":"page"},{"location":"models/HBOSDetector_OutlierDetectionPython/#HBOSDetector_OutlierDetectionPython","page":"HBOSDetector","title":"HBOSDetector","text":"","category":"section"},{"location":"models/HBOSDetector_OutlierDetectionPython/","page":"HBOSDetector","title":"HBOSDetector","text":"HBOSDetector(n_bins = 10,\n alpha = 0.1,\n tol = 0.5)","category":"page"},{"location":"models/HBOSDetector_OutlierDetectionPython/","page":"HBOSDetector","title":"HBOSDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.hbos","category":"page"},{"location":"models/DBSCAN_Clustering/#DBSCAN_Clustering","page":"DBSCAN","title":"DBSCAN","text":"","category":"section"},{"location":"models/DBSCAN_Clustering/","page":"DBSCAN","title":"DBSCAN","text":"DBSCAN","category":"page"},{"location":"models/DBSCAN_Clustering/","page":"DBSCAN","title":"DBSCAN","text":"A model type for constructing a DBSCAN clusterer (density-based spatial clustering of applications with noise), based on Clustering.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/DBSCAN_Clustering/","page":"DBSCAN","title":"DBSCAN","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/DBSCAN_Clustering/","page":"DBSCAN","title":"DBSCAN","text":"DBSCAN = @load DBSCAN pkg=Clustering","category":"page"},{"location":"models/DBSCAN_Clustering/","page":"DBSCAN","title":"DBSCAN","text":"Do model = DBSCAN() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in DBSCAN(radius=...).","category":"page"},{"location":"models/DBSCAN_Clustering/","page":"DBSCAN","title":"DBSCAN","text":"DBSCAN is a clustering algorithm that groups together points that are closely packed together (points with many nearby neighbors), marking as outliers points that lie alone in low-density regions (whose nearest neighbors are too far away). More information is available at the Clustering.jl documentation. Use predict to get cluster assignments. Point types - core, boundary or noise - are accessed from the machine report (see below).","category":"page"},{"location":"models/DBSCAN_Clustering/","page":"DBSCAN","title":"DBSCAN","text":"This is a static implementation, i.e., it does not generalize to new data instances, and there is no training data. 
For clusterers that do generalize, see KMeans or KMedoids.","category":"page"},{"location":"models/DBSCAN_Clustering/","page":"DBSCAN","title":"DBSCAN","text":"In MLJ or MLJBase, create a machine with","category":"page"},{"location":"models/DBSCAN_Clustering/","page":"DBSCAN","title":"DBSCAN","text":"mach = machine(model)","category":"page"},{"location":"models/DBSCAN_Clustering/#Hyper-parameters","page":"DBSCAN","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/DBSCAN_Clustering/","page":"DBSCAN","title":"DBSCAN","text":"radius=1.0: query radius.\nleafsize=20: number of points binned in each leaf node of the nearest neighbor k-d tree.\nmin_neighbors=1: minimum number of a core point neighbors.\nmin_cluster_size=1: minimum number of points in a valid cluster.","category":"page"},{"location":"models/DBSCAN_Clustering/#Operations","page":"DBSCAN","title":"Operations","text":"","category":"section"},{"location":"models/DBSCAN_Clustering/","page":"DBSCAN","title":"DBSCAN","text":"predict(mach, X): return cluster label assignments, as an unordered CategoricalVector. Here X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). Note that points of type noise will always get a label of 0.","category":"page"},{"location":"models/DBSCAN_Clustering/#Report","page":"DBSCAN","title":"Report","text":"","category":"section"},{"location":"models/DBSCAN_Clustering/","page":"DBSCAN","title":"DBSCAN","text":"After calling predict(mach), the fields of report(mach) are:","category":"page"},{"location":"models/DBSCAN_Clustering/","page":"DBSCAN","title":"DBSCAN","text":"point_types: A CategoricalVector with the DBSCAN point type classification, one element per row of X. Elements are either 'C' (core), 'B' (boundary), or 'N' (noise).\nnclusters: The number of clusters (excluding the noise \"cluster\")\ncluster_labels: The unique list of cluster labels\nclusters: A vector of Clustering.DbscanCluster objects from Clustering.jl, which have these fields:\nsize: number of points in a cluster (core + boundary)\ncore_indices: indices of points in the cluster core\nboundary_indices: indices of points on the cluster boundary","category":"page"},{"location":"models/DBSCAN_Clustering/#Examples","page":"DBSCAN","title":"Examples","text":"","category":"section"},{"location":"models/DBSCAN_Clustering/","page":"DBSCAN","title":"DBSCAN","text":"using MLJ\n\nX, labels = make_moons(400, noise=0.09, rng=1) ## synthetic data with 2 clusters; X\ny = map(labels) do label\n label == 0 ? \"cookie\" : \"monster\"\nend;\ny = coerce(y, Multiclass);\n\nDBSCAN = @load DBSCAN pkg=Clustering\nmodel = DBSCAN(radius=0.13, min_cluster_size=5)\nmach = machine(model)\n\n## compute and output cluster assignments for observations in `X`:\nyhat = predict(mach, X)\n\n## get DBSCAN point types:\nreport(mach).point_types\nreport(mach).nclusters\n\n## compare cluster labels with actual labels:\ncompare = zip(yhat, y) |> collect;\ncompare[1:10] ## clusters align with classes\n\n## visualize clusters, noise in red:\npoints = zip(X.x1, X.x2) |> collect\ncolors = map(yhat) do i\n i == 0 ? :red :\n i == 1 ? :blue :\n i == 2 ? :green :\n i == 3 ? 
:yellow :\n :black\nend\nusing Plots\nscatter(points, color=colors)","category":"page"},{"location":"glossary/#Glossary","page":"Glossary","title":"Glossary","text":"","category":"section"},{"location":"glossary/","page":"Glossary","title":"Glossary","text":"Note: This glossary includes some detail intended mainly for MLJ developers.","category":"page"},{"location":"glossary/#Basics","page":"Glossary","title":"Basics","text":"","category":"section"},{"location":"glossary/#hyperparameters","page":"Glossary","title":"hyperparameters","text":"","category":"section"},{"location":"glossary/","page":"Glossary","title":"Glossary","text":"Parameters on which some learning algorithm depends, specified before the algorithm is applied, and where learning is interpreted in the broadest sense. For example, PCA feature reduction is a \"preprocessing\" transformation \"learning\" a projection from training data, governed by a dimension hyperparameter. Hyperparameters in our sense may specify configuration (eg, number of parallel processes) even when this does not affect the end-product of learning. (But we exclude verbosity level.)","category":"page"},{"location":"glossary/#model-(object-of-abstract-type-Model)","page":"Glossary","title":"model (object of abstract type Model)","text":"","category":"section"},{"location":"glossary/","page":"Glossary","title":"Glossary","text":"Object collecting together hyperpameters of a single algorithm. Models are classified either as supervised or unsupervised models (eg, \"transformers\"), with corresponding subtypes Supervised <: Model and Unsupervised <: Model.","category":"page"},{"location":"glossary/#fitresult-(type-generally-defined-outside-of-MLJ)","page":"Glossary","title":"fitresult (type generally defined outside of MLJ)","text":"","category":"section"},{"location":"glossary/","page":"Glossary","title":"Glossary","text":"Also known as \"learned\" or \"fitted\" parameters, these are \"weights\", \"coefficients\", or similar parameters learned by an algorithm, after adopting the prescribed hyper-parameters. For example, decision trees of a random forest, the coefficients and intercept of a linear model, or the projection matrices of a PCA dimension-reduction algorithm.","category":"page"},{"location":"glossary/#operation","page":"Glossary","title":"operation","text":"","category":"section"},{"location":"glossary/","page":"Glossary","title":"Glossary","text":"Data-manipulating operations (methods) using some fitresult. For supervised learners, the predict, predict_mean, predict_median, or predict_mode methods; for transformers, the transform or inverse_transform method. An operation may also refer to an ordinary data-manipulating method that does not depend on a fit-result (e.g., a broadcasted logarithm) which is then called static operation for clarity. An operation that is not static is dynamic.","category":"page"},{"location":"glossary/#machine-(object-of-type-Machine)","page":"Glossary","title":"machine (object of type Machine)","text":"","category":"section"},{"location":"glossary/","page":"Glossary","title":"Glossary","text":"An object consisting of:","category":"page"},{"location":"glossary/","page":"Glossary","title":"Glossary","text":"A model\nA fit-result (undefined until training)\nTraining arguments (one for each data argument of the model's associated fit method). A training argument is data used for training (subsampled by specifying rows=... in fit!) but also in evaluation (subsampled by specifying rows=... in predict, predict_mean, etc). 
Generally, there are two training arguments for supervised models, and just one for unsupervised models. Each argument is either a Source node, wrapping concrete data supplied to the machine constructor, or a Node, in the case of a learning network (see below). Both kinds of nodes can be called with an optional rows=... keyword argument to (lazily) return concrete data.","category":"page"},{"location":"glossary/","page":"Glossary","title":"Glossary","text":"In addition, machines store \"report\" metadata, for recording algorithm-specific statistics of training (eg, an internal estimate of generalization error, feature importances); and they cache information allowing the fit-result to be updated without repeating unnecessary information.","category":"page"},{"location":"glossary/","page":"Glossary","title":"Glossary","text":"Machines are trained by calls to a fit! method which may be passed an optional argument specifying the rows of data to be used in training.","category":"page"},{"location":"glossary/","page":"Glossary","title":"Glossary","text":"For more, see the Machines section.","category":"page"},{"location":"glossary/#Learning-Networks-and-Composite-Models","page":"Glossary","title":"Learning Networks and Composite Models","text":"","category":"section"},{"location":"glossary/","page":"Glossary","title":"Glossary","text":"Note: Multiple machines in a learning network may share the same model, and multiple learning nodes may share the same machine.","category":"page"},{"location":"glossary/#source-node-(object-of-type-Source)","page":"Glossary","title":"source node (object of type Source)","text":"","category":"section"},{"location":"glossary/","page":"Glossary","title":"Glossary","text":"A container for training data and point of entry for new data in a learning network (see below).","category":"page"},{"location":"glossary/#node-(object-of-type-Node)","page":"Glossary","title":"node (object of type Node)","text":"","category":"section"},{"location":"glossary/","page":"Glossary","title":"Glossary","text":"Essentially a machine (whose arguments are possibly other nodes) wrapped in an associated operation (e.g., predict or inverse_transform). It consists primarily of:","category":"page"},{"location":"glossary/","page":"Glossary","title":"Glossary","text":"An operation, static or dynamic.\nA machine, or nothing if the operation is static.\nUpstream connections to other nodes, specified by a list of arguments (one for each argument of the operation). These are the arguments on which the operation \"acts\" when the node N is called, as in N().","category":"page"},{"location":"glossary/#learning-network","page":"Glossary","title":"learning network","text":"","category":"section"},{"location":"glossary/","page":"Glossary","title":"Glossary","text":"A directed acyclic graph implicit in the connections of a collection of source(s) and nodes. 
","category":"page"},{"location":"glossary/#wrapper","page":"Glossary","title":"wrapper","text":"","category":"section"},{"location":"glossary/","page":"Glossary","title":"Glossary","text":"Any model with one or more other models as hyper-parameters.","category":"page"},{"location":"glossary/#composite-model","page":"Glossary","title":"composite model","text":"","category":"section"},{"location":"glossary/","page":"Glossary","title":"Glossary","text":"Any wrapper, or any learning network, \"exported\" as a model (see Composing Models).","category":"page"},{"location":"models/ProbabilisticSGDClassifier_MLJScikitLearnInterface/#ProbabilisticSGDClassifier_MLJScikitLearnInterface","page":"ProbabilisticSGDClassifier","title":"ProbabilisticSGDClassifier","text":"","category":"section"},{"location":"models/ProbabilisticSGDClassifier_MLJScikitLearnInterface/","page":"ProbabilisticSGDClassifier","title":"ProbabilisticSGDClassifier","text":"ProbabilisticSGDClassifier","category":"page"},{"location":"models/ProbabilisticSGDClassifier_MLJScikitLearnInterface/","page":"ProbabilisticSGDClassifier","title":"ProbabilisticSGDClassifier","text":"A model type for constructing a probabilistic sgd classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/ProbabilisticSGDClassifier_MLJScikitLearnInterface/","page":"ProbabilisticSGDClassifier","title":"ProbabilisticSGDClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/ProbabilisticSGDClassifier_MLJScikitLearnInterface/","page":"ProbabilisticSGDClassifier","title":"ProbabilisticSGDClassifier","text":"ProbabilisticSGDClassifier = @load ProbabilisticSGDClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/ProbabilisticSGDClassifier_MLJScikitLearnInterface/","page":"ProbabilisticSGDClassifier","title":"ProbabilisticSGDClassifier","text":"Do model = ProbabilisticSGDClassifier() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in ProbabilisticSGDClassifier(loss=...).","category":"page"},{"location":"models/ProbabilisticSGDClassifier_MLJScikitLearnInterface/#Hyper-parameters","page":"ProbabilisticSGDClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/ProbabilisticSGDClassifier_MLJScikitLearnInterface/","page":"ProbabilisticSGDClassifier","title":"ProbabilisticSGDClassifier","text":"loss = log_loss\npenalty = l2\nalpha = 0.0001\nl1_ratio = 0.15\nfit_intercept = true\nmax_iter = 1000\ntol = 0.001\nshuffle = true\nverbose = 0\nepsilon = 0.1\nn_jobs = nothing\nrandom_state = nothing\nlearning_rate = optimal\neta0 = 0.0\npower_t = 0.5\nearly_stopping = false\nvalidation_fraction = 0.1\nn_iter_no_change = 5\nclass_weight = nothing\nwarm_start = false\naverage = false","category":"page"},{"location":"models/HuberRegressor_MLJScikitLearnInterface/#HuberRegressor_MLJScikitLearnInterface","page":"HuberRegressor","title":"HuberRegressor","text":"","category":"section"},{"location":"models/HuberRegressor_MLJScikitLearnInterface/","page":"HuberRegressor","title":"HuberRegressor","text":"HuberRegressor","category":"page"},{"location":"models/HuberRegressor_MLJScikitLearnInterface/","page":"HuberRegressor","title":"HuberRegressor","text":"A model type for constructing a Huber regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/HuberRegressor_MLJScikitLearnInterface/","page":"HuberRegressor","title":"HuberRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/HuberRegressor_MLJScikitLearnInterface/","page":"HuberRegressor","title":"HuberRegressor","text":"HuberRegressor = @load HuberRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/HuberRegressor_MLJScikitLearnInterface/","page":"HuberRegressor","title":"HuberRegressor","text":"Do model = HuberRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in HuberRegressor(epsilon=...).","category":"page"},{"location":"models/HuberRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"HuberRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/HuberRegressor_MLJScikitLearnInterface/","page":"HuberRegressor","title":"HuberRegressor","text":"epsilon = 1.35\nmax_iter = 100\nalpha = 0.0001\nwarm_start = false\nfit_intercept = true\ntol = 1.0e-5","category":"page"},{"location":"models/KPLSRegressor_PartialLeastSquaresRegressor/#KPLSRegressor_PartialLeastSquaresRegressor","page":"KPLSRegressor","title":"KPLSRegressor","text":"","category":"section"},{"location":"models/KPLSRegressor_PartialLeastSquaresRegressor/","page":"KPLSRegressor","title":"KPLSRegressor","text":"A Kernel Partial Least Squares Regressor. A Kernel PLS2 NIPALS algorithms. 
Can be used mainly for regression.","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/#EpsilonSVR_LIBSVM","page":"EpsilonSVR","title":"EpsilonSVR","text":"","category":"section"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"EpsilonSVR","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"A model type for constructing a ϵ-support vector regressor, based on LIBSVM.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"EpsilonSVR = @load EpsilonSVR pkg=LIBSVM","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"Do model = EpsilonSVR() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EpsilonSVR(kernel=...).","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): \"LIBSVM: a library for support vector machines.\" ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf. ","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"This model is an adaptation of the classifier SVC to regression, but has an additional parameter epsilon (denoted ϵ in the cited reference).","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/#Training-data","page":"EpsilonSVR","title":"Training data","text":"","category":"section"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"In MLJ or MLJBase, bind an instance model to data with:","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"where","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)\ny: is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/#Hyper-parameters","page":"EpsilonSVR","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. 
Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see \"Examples\" below).\nLIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2\nLIBSVM.Kernel.Polynomial: (x1, x2) -> (gamma*x1'*x2 + coef0)^degree\nLIBSVM.Kernel.RadialBasis: (x1, x2) -> (exp(-gamma*norm(x1 - x2)^2))\nLIBSVM.Kernel.Sigmoid: (x1, x2) -> tanh(gamma*x1'*x2 + coef0)\nHere gamma, coef0, degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions. See LIBSVM.jl issue 91\ngamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. Actual value used appears in the report (see below).\ncoef0 = 0.0: kernel parameter (see above)\ndegree::Int32 = Int32(3): degree in polynomial kernel (see above)\ncost=1.0 (range (0, Inf)): the parameter denoted C in the cited reference; for greater regularization, decrease cost\nepsilon=0.1 (range (0, Inf)): the parameter denoted ϵ in the cited reference; epsilon is the thickness of the penalty-free neighborhood of the graph of the prediction function (\"slab\" or \"tube\"). Specifically, a data point (x, y) incurs no training loss unless it is outside this neighborhood; the further away it is from this neighborhood, the greater the loss penalty.\ncachesize=200.0: cache memory size in MB\ntolerance=0.001: tolerance for the stopping criterion\nshrinking=true: whether to use shrinking heuristics","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/#Operations","page":"EpsilonSVR","title":"Operations","text":"","category":"section"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above.","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/#Fitted-parameters","page":"EpsilonSVR","title":"Fitted parameters","text":"","category":"section"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"libsvm_model: the trained model object created by the LIBSVM.jl package","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/#Report","page":"EpsilonSVR","title":"Report","text":"","category":"section"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"The fields of report(mach) are:","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"gamma: actual value of the kernel parameter gamma used in training","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/#Examples","page":"EpsilonSVR","title":"Examples","text":"","category":"section"},{"location":"models/EpsilonSVR_LIBSVM/#Using-a-built-in-kernel","page":"EpsilonSVR","title":"Using a built-in kernel","text":"","category":"section"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"using MLJ\nimport LIBSVM\n\nEpsilonSVR = @load EpsilonSVR pkg=LIBSVM ## model type\nmodel = EpsilonSVR(kernel=LIBSVM.Kernel.Polynomial) ## instance\n\nX, y = make_regression(rng=123) ## table, vector\nmach = machine(model, X, y) |> fit!\n\nXnew, _ = make_regression(3, rng=123)\n\njulia> yhat = predict(mach, Xnew)\n3-element Vector{Float64}:\n 0.2512132502584155\n 
 0.007340201523624579\n -0.2482949812264707","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/#User-defined-kernels","page":"EpsilonSVR","title":"User-defined kernels","text":"","category":"section"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"k(x1, x2) = x1'*x2 ## equivalent to `LIBSVM.Kernel.Linear`\nmodel = EpsilonSVR(kernel=k)\nmach = machine(model, X, y) |> fit!\n\njulia> yhat = predict(mach, Xnew)\n3-element Vector{Float64}:\n 1.1121225361666656\n 0.04667702229741916\n -0.6958148424680672","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"See also NuSVR, LIBSVM.jl and the original C implementation documentation.","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/#EvoSplineRegressor_EvoLinear","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"","category":"section"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"EvoSplineRegressor(; kwargs...)","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"A model type for constructing an EvoSplineRegressor, based on EvoLinear.jl, and implementing both an internal API and the MLJ model interface.","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/#Keyword-arguments","page":"EvoSplineRegressor","title":"Keyword arguments","text":"","category":"section"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"loss=:mse: loss function to be minimised. Can be one of:\n:mse\n:logistic\n:poisson\n:gamma\n:tweedie\nnrounds=10: maximum number of training rounds.\neta=1: Learning rate. Typically in the range [1e-2, 1].\nL1=0: Regularization penalty applied by shrinking to 0 weight update if update is < L1. No penalty if update > L1. Results in sparse feature selection. Typically in the [0, 1] range on normalized features.\nL2=0: Regularization penalty applied to the square of the weight update value. Restricts large parameter values. Typically in the [0, 1] range on normalized features.\nrng=123: random seed. Not used at the moment.\nupdater=:all: training method. Only :all is supported at the moment. Gradients for each feature are computed simultaneously, then the bias is updated based on all feature updates.\ndevice=:cpu: Only :cpu is supported at the moment.","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/#Internal-API","page":"EvoSplineRegressor","title":"Internal API","text":"","category":"section"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"Do config = EvoSplineRegressor() to construct a hyper-parameter struct with default hyper-parameters. 
Provide keyword arguments as listed above to override defaults, for example:","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"EvoSplineRegressor(loss=:logistic, L1=1e-3, L2=1e-2, nrounds=100)","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/#Training-model","page":"EvoSplineRegressor","title":"Training model","text":"","category":"section"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"A model is built using fit:","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"config = EvoSplineRegressor()\nm = fit(config; x, y, w)","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/#Inference","page":"EvoSplineRegressor","title":"Inference","text":"","category":"section"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"The fitted result is an EvoLinearModel, which acts as a prediction function when passed a features matrix as argument. ","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"preds = m(x)","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/#MLJ-Interface","page":"EvoSplineRegressor","title":"MLJ Interface","text":"","category":"section"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"From MLJ, the type can be imported using:","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"EvoSplineRegressor = @load EvoSplineRegressor pkg=EvoLinear","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"Do model = EvoSplineRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in EvoSplineRegressor(loss=...).","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/#Training-model-2","page":"EvoSplineRegressor","title":"Training model","text":"","category":"section"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"In MLJ or MLJBase, bind an instance model to data with mach = machine(model, X, y) where: ","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)\ny: is the target, which can be any AbstractVector whose element scitype is <:Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/#Operations","page":"EvoSplineRegressor","title":"Operations","text":"","category":"section"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"predict(mach, Xnew): return predictions of the target given","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"features Xnew having the same scitype as X above. Predictions are deterministic.","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/#Fitted-parameters","page":"EvoSplineRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":":fitresult: the SplineModel object returned by EvoSplineRegressor fitting algorithm.","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/#Report","page":"EvoSplineRegressor","title":"Report","text":"","category":"section"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"The fields of report(mach) are:","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":":coef: Vector of coefficients (βs) associated to each of the features.\n:bias: Value of the bias.\n:names: Names of each of the features.","category":"page"},{"location":"models/RandomForestRegressor_BetaML/#RandomForestRegressor_BetaML","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"","category":"section"},{"location":"models/RandomForestRegressor_BetaML/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"mutable struct RandomForestRegressor <: MLJModelInterface.Deterministic","category":"page"},{"location":"models/RandomForestRegressor_BetaML/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"A simple Random Forest model for regression with support for Missing data, from the Beta Machine Learning Toolkit 
(BetaML).","category":"page"},{"location":"models/RandomForestRegressor_BetaML/#Hyperparameters:","page":"RandomForestRegressor","title":"Hyperparameters:","text":"","category":"section"},{"location":"models/RandomForestRegressor_BetaML/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"n_trees::Int64: Number of (decision) trees in the forest [def: 30]\nmax_depth::Int64: The maximum depth the tree is allowed to reach. When this is reached the node is forced to become a leaf [def: 0, i.e. no limits]\nmin_gain::Float64: The minimum information gain to allow for a node's partition [def: 0]\nmin_records::Int64: The minimum number of records a node must hold to be considered for partitioning [def: 2]\nmax_features::Int64: The maximum number of (random) features to consider at each partitioning [def: 0, i.e. square root of the data dimension]\nsplitting_criterion::Function: This is the name of the function to be used to compute the information gain of a specific partition. This is done by measuring the difference between the \"impurity\" of the labels of the parent node and those of the two child nodes, weighted by the respective number of items. [def: variance]. Either variance or a custom function. It can also be an anonymous function.\nβ::Float64: Parameter that regulates the weights of the scoring of each tree, to be (optionally) used in prediction based on the error of the individual trees computed on the records on which trees have not been trained. Higher values favour \"better\" trees, but too high values will cause overfitting [def: 0, i.e. uniform weights]\nrng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]","category":"page"},{"location":"models/RandomForestRegressor_BetaML/#Example:","page":"RandomForestRegressor","title":"Example:","text":"","category":"section"},{"location":"models/RandomForestRegressor_BetaML/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"julia> using MLJ\n\njulia> X, y = @load_boston;\n\njulia> modelType = @load RandomForestRegressor pkg = \"BetaML\" verbosity=0\nBetaML.Trees.RandomForestRegressor\n\njulia> model = modelType()\nRandomForestRegressor(\n n_trees = 30, \n max_depth = 0, \n min_gain = 0.0, \n min_records = 2, \n max_features = 0, \n splitting_criterion = BetaML.Utils.variance, \n β = 0.0, \n rng = Random._GLOBAL_RNG())\n\njulia> mach = machine(model, X, y);\n\njulia> fit!(mach);\n[ Info: Training machine(RandomForestRegressor(n_trees = 30, …), …).\n\njulia> ŷ = predict(mach, X);\n\njulia> hcat(y,ŷ)\n506×2 Matrix{Float64}:\n 24.0 25.8433\n 21.6 22.4317\n 34.7 35.5742\n 33.4 33.9233\n ⋮ \n 23.9 24.42\n 22.0 22.4433\n 11.9 15.5833","category":"page"},{"location":"models/KMeans_ParallelKMeans/#KMeans_ParallelKMeans","page":"KMeans","title":"KMeans","text":"","category":"section"},{"location":"models/KMeans_ParallelKMeans/","page":"KMeans","title":"KMeans","text":"Parallel & lightning fast implementation of all available variants of the KMeans clustering algorithm in native Julia. 
Compatible with Julia 1.3+","category":"page"},{"location":"models/BisectingKMeans_MLJScikitLearnInterface/#BisectingKMeans_MLJScikitLearnInterface","page":"BisectingKMeans","title":"BisectingKMeans","text":"","category":"section"},{"location":"models/BisectingKMeans_MLJScikitLearnInterface/","page":"BisectingKMeans","title":"BisectingKMeans","text":"BisectingKMeans","category":"page"},{"location":"models/BisectingKMeans_MLJScikitLearnInterface/","page":"BisectingKMeans","title":"BisectingKMeans","text":"A model type for constructing a bisecting k means, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/BisectingKMeans_MLJScikitLearnInterface/","page":"BisectingKMeans","title":"BisectingKMeans","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/BisectingKMeans_MLJScikitLearnInterface/","page":"BisectingKMeans","title":"BisectingKMeans","text":"BisectingKMeans = @load BisectingKMeans pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/BisectingKMeans_MLJScikitLearnInterface/","page":"BisectingKMeans","title":"BisectingKMeans","text":"Do model = BisectingKMeans() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BisectingKMeans(n_clusters=...).","category":"page"},{"location":"models/BisectingKMeans_MLJScikitLearnInterface/","page":"BisectingKMeans","title":"BisectingKMeans","text":"Bisecting K-Means clustering.","category":"page"},{"location":"logging_workflows/#Logging-Workflows","page":"Logging Workflows","title":"Logging Workflows","text":"","category":"section"},{"location":"logging_workflows/#MLflow-integration","page":"Logging Workflows","title":"MLflow integration","text":"","category":"section"},{"location":"logging_workflows/","page":"Logging Workflows","title":"Logging Workflows","text":"MLflow is a popular, language-agnostic, tool for externally logging the outcomes of machine learning experiments, including those carried out using MLJ.","category":"page"},{"location":"logging_workflows/","page":"Logging Workflows","title":"Logging Workflows","text":"MLJ logging examples are given in the MLJFlow.jl documentation. MLJ includes and re-exports all the methods of MLJFlow.jl, so there is no need to import MLJFlow.jl if using MLJ.","category":"page"},{"location":"logging_workflows/","page":"Logging Workflows","title":"Logging Workflows","text":"warning: Warning\nMLJFlow.jl is a new package still under active development and should be regarded as experimental. 
At this time, breaking changes to MLJFlow.jl will not necessarily trigger new breaking releases of MLJ.jl.","category":"page"},{"location":"models/ComplementNBClassifier_MLJScikitLearnInterface/#ComplementNBClassifier_MLJScikitLearnInterface","page":"ComplementNBClassifier","title":"ComplementNBClassifier","text":"","category":"section"},{"location":"models/ComplementNBClassifier_MLJScikitLearnInterface/","page":"ComplementNBClassifier","title":"ComplementNBClassifier","text":"ComplementNBClassifier","category":"page"},{"location":"models/ComplementNBClassifier_MLJScikitLearnInterface/","page":"ComplementNBClassifier","title":"ComplementNBClassifier","text":"A model type for constructing a Complement naive Bayes classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/ComplementNBClassifier_MLJScikitLearnInterface/","page":"ComplementNBClassifier","title":"ComplementNBClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/ComplementNBClassifier_MLJScikitLearnInterface/","page":"ComplementNBClassifier","title":"ComplementNBClassifier","text":"ComplementNBClassifier = @load ComplementNBClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/ComplementNBClassifier_MLJScikitLearnInterface/","page":"ComplementNBClassifier","title":"ComplementNBClassifier","text":"Do model = ComplementNBClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ComplementNBClassifier(alpha=...).","category":"page"},{"location":"models/ComplementNBClassifier_MLJScikitLearnInterface/","page":"ComplementNBClassifier","title":"ComplementNBClassifier","text":"Similar to MultinomialNBClassifier but with more robust assumptions. 
Suited for imbalanced datasets.","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/#RobustRegressor_MLJLinearModels","page":"RobustRegressor","title":"RobustRegressor","text":"","category":"section"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"RobustRegressor","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"A model type for constructing a robust regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"RobustRegressor = @load RobustRegressor pkg=MLJLinearModels","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"Do model = RobustRegressor() to construct an instance with default hyper-parameters.","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"Robust regression is a linear model with objective function","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"$","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"∑ρ(Xθ - y) + n⋅λ|θ|₂² + n⋅γ|θ|₁ $","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"where ρ is a robust loss function (e.g. the Huber function) and n is the number of observations.","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"If scale_penalty_with_samples = false the objective function is instead","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"$","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"∑ρ(Xθ - y) + λ|θ|₂² + γ|θ|₁ $","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":".","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"Different solver options exist, as indicated under \"Hyperparameters\" below. 
","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/#Training-data","page":"RobustRegressor","title":"Training data","text":"","category":"section"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"where:","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)\ny is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/#Hyperparameters","page":"RobustRegressor","title":"Hyperparameters","text":"","category":"section"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"rho::MLJLinearModels.RobustRho: the type of robust loss, which can be any instance of MLJLinearModels.L where L is one of: AndrewsRho, BisquareRho, FairRho, HuberRho, LogisticRho, QuantileRho, TalwarRho, HuberRho, TalwarRho. Default: HuberRho(0.1)\nlambda::Real: strength of the regularizer if penalty is :l2 or :l1. Strength of the L2 regularizer if penalty is :en. Default: 1.0\ngamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0\npenalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2\nfit_intercept::Bool: whether to fit the intercept or not. Default: true\npenalize_intercept::Bool: whether to penalize the intercept. Default: false\nscale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true\nsolver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S where S is one of: LBFGS, IWLSCG, Newton, NewtonCG, if penalty = :l2, and ProxGrad otherwise.\nIf solver = nothing (default) then LBFGS() is used, if penalty = :l2, and otherwise ProxGrad(accel=true) (FISTA) is used.\nSolver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) 
Default: nothing","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/#Example","page":"RobustRegressor","title":"Example","text":"","category":"section"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"using MLJ\nX, y = make_regression()\nmach = fit!(machine(RobustRegressor(), X, y))\npredict(mach, X)\nfitted_params(mach)","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"See also HuberRegressor, QuantileRegressor.","category":"page"},{"location":"controlling_iterative_models/#Controlling-Iterative-Models","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"","category":"section"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"Iterative supervised machine learning models are usually trained until an out-of-sample estimate of the performance satisfies some stopping criterion, such as k consecutive deteriorations of the performance (see Patience below). A more sophisticated kind of control might dynamically mutate parameters, such as a learning rate, in response to the behavior of these estimates.","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"Some iterative model implementations enable some form of automated control, with the method and options for doing so varying from model to model. But sometimes it is up to the user to arrange control, which in the crudest case reduces to manually experimenting with the iteration parameter.","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"In response to this ad hoc state of affairs, MLJ provides a uniform and feature-rich interface for controlling any iterative model that exposes its iteration parameter as a hyper-parameter, and which implements the \"warm restart\" behavior described in Machines.","category":"page"},{"location":"controlling_iterative_models/#Basic-use","page":"Controlling Iterative Models","title":"Basic use","text":"","category":"section"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"As in Tuning Models, iteration control in MLJ is implemented as a model wrapper, which allows composition with other meta-algorithms. 
Ordinarily, the wrapped model behaves just like the original model, but with the training occurring on a subset of the provided data (to allow computation of an out-of-sample loss) and with the iteration parameter automatically determined by the controls specified in the wrapper.","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"By setting retrain=true one can ask that the wrapped model retrain on all supplied data, after learning the appropriate number of iterations from the controlled training phase:","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"using MLJ\n\nX, y = make_moons(100, rng=123, noise=0.5)\nEvoTreeClassifier = @load EvoTreeClassifier verbosity=0\n\niterated_model = IteratedModel(model=EvoTreeClassifier(rng=123, eta=0.005),\n resampling=Holdout(),\n measures=log_loss,\n controls=[Step(5),\n Patience(2),\n NumberLimit(100)],\n retrain=true)\n\nmach = machine(iterated_model, X, y)\nnothing # hide","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"fit!(mach)","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"As detailed under IteratedModel below, the specified controls are repeatedly applied in sequence to a training machine, constructed under the hood, until one of the controls triggers a stop. Here Step(5) means \"Compute 5 more iterations\" (in this case starting from none); Patience(2) means \"Stop at the end of the control cycle if there have been 2 consecutive drops in the log loss\"; and NumberLimit(100) is a safeguard ensuring a stop after 100 control cycles (500 iterations). See Controls provided below for a complete list.","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"Because iteration is implemented as a wrapper, the \"self-iterating\" model can be evaluated using cross-validation, say, and the number of iterations on each fold will generally be different:","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"e = evaluate!(mach, resampling=CV(nfolds=3), measure=log_loss, verbosity=0);\nmap(e.report_per_fold) do r\n r.n_iterations\nend","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"Alternatively, one might wrap the self-iterating model in a tuning strategy, using TunedModel; see Tuning Models. 
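For illustration only, here is a minimal sketch of that pattern, assuming the iterated_model defined above and assuming the atomic EvoTreeClassifier exposes a max_depth hyper-parameter (the nested field name model.max_depth and the chosen bounds are assumptions, not part of the original example):

```julia
# Tune `max_depth` of the atomic model while `iterated_model` continues to
# determine the number of iterations internally.
r = range(iterated_model, :(model.max_depth), lower=3, upper=8)
self_iterating_tuned_model = TunedModel(model=iterated_model,
                                        tuning=Grid(),
                                        resampling=CV(nfolds=3),
                                        range=r,
                                        measure=log_loss)
mach = machine(self_iterating_tuned_model, X, y) |> fit!
```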
In this way, the optimization of some other hyper-parameter is realized simultaneously with that of the iteration parameter, which will frequently be more efficient than a direct two-parameter search.","category":"page"},{"location":"controlling_iterative_models/#Controls-provided","page":"Controlling Iterative Models","title":"Controls provided","text":"","category":"section"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"In the table below, mach is the training machine being iterated, constructed by binding the supplied data to the model specified in the IteratedModel wrapper, but trained in each iteration on a subset of the data, according to the value of the resampling hyper-parameter of the wrapper (using all data if resampling=nothing).","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"control description can trigger a stop\nStep(n=1) Train model for n more iterations no\nTimeLimit(t=0.5) Stop after t hours yes\nNumberLimit(n=100) Stop after n applications of the control yes\nNumberSinceBest(n=6) Stop when best loss occurred n control applications ago yes\nInvalidValue() Stop when NaN, Inf or -Inf loss/training loss encountered yes\nThreshold(value=0.0) Stop when loss < value yes\nGL(alpha=2.0) † Stop after the \"generalization loss (GL)\" exceeds alpha yes\nPQ(alpha=0.75, k=5) † Stop after \"progress-modified GL\" exceeds alpha yes\nPatience(n=5) † Stop after n consecutive loss increases yes\nWarmup(c; n=1) Wait for n loss updates before checking criteria c no\nInfo(f=identity) Log to Info the value of f(mach), where mach is current machine no\nWarn(predicate; f=\"\") Log to Warn the value of f or f(mach), if predicate(mach) holds no\nError(predicate; f=\"\") Log to Error the value of f or f(mach), if predicate(mach) holds and then stop yes\nCallback(f=mach->nothing) Call f(mach) yes\nWithNumberDo(f=n->@info(n)) Call f(n + 1) where n is the number of complete control cycles so far yes\nWithIterationsDo(f=i->@info(\"iterations: $i\")) Call f(i), where i is total number of iterations yes\nWithLossDo(f=x->@info(\"loss: $x\")) Call f(loss) where loss is the current loss yes\nWithTrainingLossesDo(f=v->@info(v)) Call f(v) where v is the current batch of training losses yes\nWithEvaluationDo(f=e->@info(\"evaluation: $e\")) Call f(e) where e is the current performance evaluation object yes\nWithFittedParamsDo(f=fp->@info(\"fitted_params: $fp\")) Call f(fp) where fp is fitted parameters of training machine yes\nWithReportDo(f=r->@info(\"report: $r\")) Call f(r) where r is the training machine report yes\nWithModelDo(f=m->@info(\"model: $m\")) Call f(m) where m is the model, which may be mutated by f yes\nWithMachineDo(f=mach->@info(\"machine: $mach\")) Call f(mach) where mach is the training machine in its current state yes\nSave(filename=\"machine.jls\") Save current training machine to machine1.jls, machine2.jls, etc yes","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"Table 1. Atomic controls. Some advanced options are omitted.","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"† For more on these controls see Prechelt, Lutz (1998): \"Early Stopping - But When?\", in Neural Networks: Tricks of the Trade, ed. G. 
Orr, Springer.","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"Stopping option. All the following controls trigger a stop if the provided function f returns true and stop_if_true=true is specified in the constructor: Callback, WithNumberDo, WithLossDo, WithTrainingLossesDo.","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"There are also three control wrappers to modify a control's behavior:","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"wrapper description\nIterationControl.skip(control, predicate=1) Apply control every predicate applications of the control wrapper (can also be a function; see doc-string)\nIterationControl.louder(control, by=1) Increase the verbosity level of control by the specified value (negative values lower verbosity)\nIterationControl.with_state_do(control; f=...) Apply control and call f(x) where x is the internal state of control; useful for debugging. Default f logs state to Info. Warning: internal control state is not yet part of the public API.\nIterationControl.composite(controls...) Apply each control in controls in sequence; used internally by IterationControl.jl","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"Table 2. Wrapped controls","category":"page"},{"location":"controlling_iterative_models/#Using-training-losses,-and-controlling-model-tuning","page":"Controlling Iterative Models","title":"Using training losses, and controlling model tuning","text":"","category":"section"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"Some iterative models report a training loss, as a byproduct of a fit! call and these can be used in two ways:","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"To supplement an out-of-sample estimate of the loss in deciding when to stop, as in the PQ stopping criterion (see Prechelt, Lutz (1998))); or\nAs a (generally less reliable) substitute for an out-of-sample loss, when wishing to train exclusively on all supplied data.","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"To have IteratedModel bind all data to the training machine and use training losses in place of an out-of-sample loss, specify resampling=nothing. To check if MyFavoriteIterativeModel reports training losses, load the model code and inspect supports_training_losses(MyFavoriteIterativeModel) (or do info(\"MyFavoriteIterativeModel\"))","category":"page"},{"location":"controlling_iterative_models/#Controlling-model-tuning","page":"Controlling Iterative Models","title":"Controlling model tuning","text":"","category":"section"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"An example of scenario 2 occurs when controlling hyperparameter optimization (model tuning). Recall that MLJ's TunedModel wrapper is implemented as an iterative model. 
Moreover, this wrapper reports, as a training loss, the lowest value of the optimization objective function so far (typically the lowest value of an out-of-sample loss, or -1 times an out-of-sample score). One may want to simply end the hyperparameter search when this value meets the NumberSinceBest stopping criterion discussed below, say, rather than introducing an extra layer of resampling to first \"learn\" the optimal value of the iteration parameter.","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"In the following example, we conduct a RandomSearch for the optimal value of the regularization parameter lambda in a RidgeRegressor using 6-fold cross-validation. By wrapping our \"self-tuning\" version of the regressor as an IteratedModel, with resampling=nothing and NumberSinceBest(20) in the controls, we terminate the search when the number of lambda values tested since the previous best cross-validation loss reaches 20.","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"using MLJ\n\nX, y = @load_boston;\nRidgeRegressor = @load RidgeRegressor pkg=MLJLinearModels verbosity=0\nmodel = RidgeRegressor()\nr = range(model, :lambda, lower=-1, upper=2, scale=x->10^x)\nself_tuning_model = TunedModel(model=model,\n tuning=RandomSearch(rng=123),\n resampling=CV(nfolds=6),\n range=r,\n measure=mae);\niterated_model = IteratedModel(model=self_tuning_model,\n resampling=nothing,\n control=[Step(1), NumberSinceBest(20), NumberLimit(1000)])\nmach = machine(iterated_model, X, y)\nnothing # hide","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"fit!(mach)","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"report(mach).model_report.best_model","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"We can use mach here to directly obtain predictions using the optimal model (trained on all data), as in","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"predict(mach, selectrows(X, 1:4))","category":"page"},{"location":"controlling_iterative_models/#Custom-controls","page":"Controlling Iterative Models","title":"Custom controls","text":"","category":"section"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"Under the hood, control in MLJIteration is implemented using IterationControl.jl. Rather than iterating a training machine directly, we iterate a wrapped version of this object, which includes other information that a control may want to access, such as the MLJ evaluation object. This information is summarized under The training machine wrapper below.","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"Controls must implement two update! 
methods, one for initializing the control's state on the first application of the control (this state being external to the control struct) and one for all subsequent control applications, which generally updates the state as well. There are two optional methods: done, for specifying conditions triggering a stop, and takedown for specifying actions to perform at the end of controlled training.","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"We summarize the training algorithm, as it relates to controls, after giving a simple example.","category":"page"},{"location":"controlling_iterative_models/#Example-1-Non-uniform-iteration-steps","page":"Controlling Iterative Models","title":"Example 1 - Non-uniform iteration steps","text":"","category":"section"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"Below we define a control, IterateFromList(list), to train, on each application of the control, until the iteration count reaches the next value in a user-specified list, triggering a stop when the list is exhausted. For example, to train on iteration counts on a log scale, one might use IterateFromList([round(Int, 10^x) for x in range(1, 2, length=10)]).","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"In the code, wrapper is an object that wraps the training machine (see above). The variable n is a counter for control cycles (unused in this example).","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"import IterationControl # or MLJ.IterationControl\n\nstruct IterateFromList\n list::Vector{<:Int} # list of iteration parameter values\n IterateFromList(v) = new(unique(sort(v)))\nend\n\nfunction IterationControl.update!(control::IterateFromList, wrapper, verbosity, n)\n Δi = control.list[1]\n verbosity > 1 && @info \"Training $Δi more iterations. \"\n MLJIteration.train!(wrapper, Δi) # trains the training machine\n return (index = 2, )\nend\n\nfunction IterationControl.update!(control::IterateFromList, wrapper, verbosity, n, state)\n index = state.index\n Δi = control.list[index] - wrapper.n_iterations\n verbosity > 1 && @info \"Training $Δi more iterations. 
\"\n MLJIteration.train!(wrapper, Δi)\n return (index = index + 1, )\nend","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"The first update method will be called the first time the control is applied, returning an initialized state = (index = 2,), which is passed to the second update method, which is called on subsequent control applications (and which returns the updated state).","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"A done method articulates the criterion for stopping:","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"IterationControl.done(control::IterateFromList, state) =\n state.index > length(control.list)","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"For the sake of illustration, we'll implement a takedown method; its return value is included in the IteratedModel report:","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"IterationControl.takedown(control::IterateFromList, verbosity, state)\n verbosity > 1 && = @info \"Stepped through these values of the \"*\n \"iteration parameter: $(control.list)\"\n return (iteration_values=control.list, )\nend","category":"page"},{"location":"controlling_iterative_models/#The-training-machine-wrapper","page":"Controlling Iterative Models","title":"The training machine wrapper","text":"","category":"section"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"A training machine wrapper has these properties:","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"wrapper.machine - the training machine, type Machine\nwrapper.model - the mutable atomic model, coinciding with wrapper.machine.model\nwrapper.n_cycles - the number IterationControl.train!(wrapper, _) calls so far; generally the current control cycle count\nwrapper.n_iterations - the total number of iterations applied to the model so far\nwrapper.Δiterations - the number of iterations applied in the last IterationControl.train!(wrapper, _) call\nwrapper.loss - the out-of-sample loss (based on the first measure in measures)\nwrapper.training_losses - the last batch of training losses (if reported by model), an abstract vector of length wrapper.Δiteration.\nwrapper.evaluation - the complete MLJ performance evaluation object, which has the following properties: measure, measurement, per_fold, per_observation, fitted_params_per_fold, report_per_fold (here there is only one fold). For further details, see Evaluating Model Performance.","category":"page"},{"location":"controlling_iterative_models/#The-training-algorithm","page":"Controlling Iterative Models","title":"The training algorithm","text":"","category":"section"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"Here now is a simplified description of the training of an IteratedModel. 
First, the atomic model is bound in a machine - the training machine above - to a subset of the supplied data, and then wrapped in an object called wrapper below. To train the training machine machine for i more iterations, and update the other data in the wrapper, requires the call MLJIteration.train!(wrapper, i). Only controls can make this call (e.g., Step(...), or IterateFromList(...) above). If we assume for simplicity there is only a single control, called control, then training proceeds as follows:","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"n = 1 # initialize control cycle counter\nstate = update!(control, wrapper, verbosity, n)\nfinished = done(control, state)\n\n# subsequent training events:\nwhile !finished\n n += 1\n state = update!(control, wrapper, verbosity, n, state)\n finished = done(control, state)\nend\n\n# finalization:\nreturn takedown(control, verbosity, state)","category":"page"},{"location":"controlling_iterative_models/#Example-2-Cyclic-learning-rates","page":"Controlling Iterative Models","title":"Example 2 - Cyclic learning rates","text":"","category":"section"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"The control below implements a triangular cyclic learning rate policy, close to that described in L. N. Smith (2019): \"Cyclical Learning Rates for Training Neural Networks,\" 2017 IEEE Winter Conference on Applications of Computer Vision (WACV), Santa Rosa, CA, USA, pp. 464-472. [In that paper learning rates are mutated (slowly) during each training iteration (epoch) while here mutations can occur once per iteration of the model, at most.]","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"For the sake of illustration, we suppose the iterative model, model, specified in the IteratedModel constructor, has a field called :learning_parameter, and that mutating this parameter does not trigger cold-restarts.","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"struct CycleLearningRate{F<:AbstractFloat}\n stepsize::Int\n lower::F\n upper::F\nend\n\n# return one cycle of learning rate values:\nfunction one_cycle(c::CycleLearningRate)\n rise = range(c.lower, c.upper, length=c.stepsize + 1)\n fall = reverse(rise)\n return vcat(rise[1:end - 1], fall[1:end - 1])\nend\n\nfunction IterationControl.update!(control::CycleLearningRate,\n wrapper,\n verbosity,\n n,\n state = (learning_rates=nothing, ))\n rates = n == 0 ? 
one_cycle(control) : state.learning_rates\n index = mod(n, length(rates)) + 1\n r = rates[index]\n verbosity > 1 && @info \"learning rate: $r\"\n wrapper.model.iteration_control = r\n return (learning_rates = rates,)\nend","category":"page"},{"location":"controlling_iterative_models/#API-Reference","page":"Controlling Iterative Models","title":"API Reference","text":"","category":"section"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"MLJIteration.IteratedModel","category":"page"},{"location":"controlling_iterative_models/#MLJIteration.IteratedModel","page":"Controlling Iterative Models","title":"MLJIteration.IteratedModel","text":"IteratedModel(model=nothing,\n controls=Any[Step(1), Patience(5), GL(2.0), TimeLimit(Dates.Millisecond(108000)), InvalidValue()],\n retrain=false,\n resampling=Holdout(),\n measure=nothing,\n weights=nothing,\n class_weights=nothing,\n operation=predict,\n verbosity=1,\n check_measure=true,\n iteration_parameter=nothing,\n cache=true)\n\nWrap the specified model <: Supervised in the specified iteration controls. Training a machine bound to the wrapper iterates a corresonding machine bound to model. Here model should support iteration.\n\nTo list all controls, do MLJIteration.CONTROLS. Controls are summarized at https://alan-turing-institute.github.io/MLJ.jl/dev/getting_started/ but query individual doc-strings for details and advanced options. For creating your own controls, refer to the documentation just cited.\n\nTo make out-of-sample losses available to the controls, the machine bound to model is only trained on part of the data, as iteration proceeds. See details on training below. Specify retrain=true to ensure the model is retrained on all available data, using the same number of iterations, once controlled iteration has stopped.\n\nSpecify resampling=nothing if all data is to be used for controlled iteration, with each out-of-sample loss replaced by the most recent training loss, assuming this is made available by the model (supports_training_losses(model) == true). Otherwise, resampling must have type Holdout (eg, Holdout(fraction_train=0.8, rng=123)).\n\nAssuming retrain=true or resampling=nothing, iterated_model behaves exactly like the original model but with the iteration parameter automatically selected. If retrain=false (default) and resampling is not nothing, then iterated_model behaves like the original model trained on a subset of the provided data.\n\nControlled iteration can be continued with new fit! calls (warm restart) by mutating a control, or by mutating the iteration parameter of model, which is otherwise ignored.\n\nTraining\n\nGiven an instance iterated_model of IteratedModel, calling fit!(mach) on a machine mach = machine(iterated_model, data...) performs the following actions:\n\nAssuming resampling !== nothing, the data is split into train and test sets, according to the specified resampling strategy, which must have type Holdout.\nA clone of the wrapped model, iterated_model.model, is bound to the train data in an internal machine, train_mach. If resampling === nothing, all data is used instead. This machine is the object to which controls are applied. For example, Callback(fitted_params |> print) will print the value of fitted_params(train_mach).\nThe iteration parameter of the clone is set to 0.\nThe specified controls are repeatedly applied to train_mach in sequence, until one of the controls triggers a stop. 
Loss-based controls (eg, Patience(), GL(), Threshold(0.001)) use an out-of-sample loss, obtained by applying measure to predictions and the test target values. (Specifically, these predictions are those returned by operation(train_mach).) If resampling === nothing then the most recent training loss is used instead. Some controls require both out-of-sample and training losses (eg, PQ()).\nOnce a stop has been triggered, a clone of model is bound to all data in a machine called mach_production below, unless retrain == false or resampling === nothing, in which case mach_production coincides with train_mach.\n\nPrediction\n\nCalling predict(mach, Xnew) returns predict(mach_production, Xnew). Similar similar statements hold for predict_mean, predict_mode, predict_median.\n\nControls\n\nA control is permitted to mutate the fields (hyper-parameters) of train_mach.model (the clone of model). For example, to mutate a learning rate one might use the control\n\nCallback(mach -> mach.model.eta = 1.05*mach.model.eta)\n\nHowever, unless model supports warm restarts with respect to changes in that parameter, this will trigger retraining of train_mach from scratch, with a different training outcome, which is not recommended.\n\nWarm restarts\n\nIf iterated_model is mutated and fit!(mach) is called again, then a warm restart is attempted if the only parameters to change are model or controls or both.\n\nSpecifically, train_mach.model is mutated to match the current value of iterated_model.model and the iteration parameter of the latter is updated to the last value used in the preceding fit!(mach) call. Then repeated application of the (updated) controls begin anew.\n\n\n\n\n\n","category":"function"},{"location":"controlling_iterative_models/#Controls","page":"Controlling Iterative Models","title":"Controls","text":"","category":"section"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"IterationControl.Step\nEarlyStopping.TimeLimit\nEarlyStopping.NumberLimit\nEarlyStopping.NumberSinceBest\nEarlyStopping.InvalidValue\nEarlyStopping.Threshold\nEarlyStopping.GL\nEarlyStopping.PQ\nEarlyStopping.Patience\nIterationControl.Info\nIterationControl.Warn\nIterationControl.Error\nIterationControl.Callback\nIterationControl.WithNumberDo\nMLJIteration.WithIterationsDo\nIterationControl.WithLossDo\nIterationControl.WithTrainingLossesDo\nMLJIteration.WithEvaluationDo\nMLJIteration.WithFittedParamsDo\nMLJIteration.WithReportDo\nMLJIteration.WithModelDo\nMLJIteration.WithMachineDo\nMLJIteration.Save","category":"page"},{"location":"controlling_iterative_models/#IterationControl.Step","page":"Controlling Iterative Models","title":"IterationControl.Step","text":"Step(; n=1)\n\nAn iteration control, as in, Step(2). \n\nTrain for n more iterations. Will never trigger a stop. \n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#EarlyStopping.TimeLimit","page":"Controlling Iterative Models","title":"EarlyStopping.TimeLimit","text":"TimeLimit(; t=0.5)\n\nAn early stopping criterion for loss-reporting iterative algorithms. \n\nStopping is triggered after t hours have elapsed since the stopping criterion was initiated.\n\nAny Julia built-in Real type can be used for t. Subtypes of Period may also be used, as in TimeLimit(t=Minute(30)).\n\nInternally, t is rounded to nearest millisecond. 
\n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#EarlyStopping.NumberLimit","page":"Controlling Iterative Models","title":"EarlyStopping.NumberLimit","text":"NumberLimit(; n=100)\n\nAn early stopping criterion for loss-reporting iterative algorithms. \n\nA stop is triggered by n consecutive loss updates, excluding \"training\" loss updates.\n\nIf wrapped in a stopper::EarlyStopper, this is the number of calls to done!(stopper).\n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#EarlyStopping.NumberSinceBest","page":"Controlling Iterative Models","title":"EarlyStopping.NumberSinceBest","text":"NumberSinceBest(; n=6)\n\nAn early stopping criterion for loss-reporting iterative algorithms. \n\nA stop is triggered when the number of calls to the control, since the lowest value of the loss so far, is n.\n\nFor a customizable loss-based stopping criterion, use WithLossDo or WithTrainingLossesDo with the stop_if_true=true option. \n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#EarlyStopping.InvalidValue","page":"Controlling Iterative Models","title":"EarlyStopping.InvalidValue","text":"InvalidValue()\n\nAn early stopping criterion for loss-reporting iterative algorithms. \n\nStop if a loss (or training loss) is NaN, Inf or -Inf (or, more precisely, if isnan(loss) or isinf(loss) is true).\n\nFor a customizable loss-based stopping criterion, use WithLossDo or WithTrainingLossesDo with the stop_if_true=true option. \n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#EarlyStopping.Threshold","page":"Controlling Iterative Models","title":"EarlyStopping.Threshold","text":"Threshold(; value=0.0)\n\nAn early stopping criterion for loss-reporting iterative algorithms. \n\nA stop is triggered as soon as the loss drops below value.\n\nFor a customizable loss-based stopping criterion, use WithLossDo or WithTrainingLossesDo with the stop_if_true=true option. \n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#EarlyStopping.GL","page":"Controlling Iterative Models","title":"EarlyStopping.GL","text":"GL(; alpha=2.0)\n\nAn early stopping criterion for loss-reporting iterative algorithms. \n\nA stop is triggered when the (rescaled) generalization loss exceeds the threshold alpha.\n\nTerminology. Suppose E_1, E_2, ..., E_t are a sequence of losses, for example, out-of-sample estimates of the loss associated with some iterative machine learning algorithm. Then the generalization loss at time t is given by\n\nGL_t = 100 (E_t - E_opt) / E_opt\n\nwhere E_opt is the minimum value of the sequence.\n\nReference: Prechelt, Lutz (1998): \"Early Stopping- But When?\", in Neural Networks: Tricks of the Trade, ed. G. Orr, Springer.\n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#EarlyStopping.PQ","page":"Controlling Iterative Models","title":"EarlyStopping.PQ","text":"PQ(; alpha=0.75, k=5, tol=eps(Float64))\n\nA stopping criterion for training iterative supervised learners.\n\nA stop is triggered when Prechelt's progress-modified generalization loss exceeds the threshold PQ_T > alpha, or if the training progress drops below P_j < tol. 
Here k is the number of training (in-sample) losses used to estimate the training progress.\n\nContext and explanation of terminology\n\nThe training progress at time j is defined by\n\nP_j = 1000 (M - m) / m\n\nwhere M is the mean of the last k training losses F_1, F_2, ..., F_k and m is the minimum value of those losses.\n\nThe progress-modified generalization loss at time t is then given by\n\nPQ_t = GL_t / P_t\n\nwhere GL_t is the generalization loss at time t; see GL.\n\nPQ will stop when the following are true:\n\nAt least k training samples have been collected via done!(c::PQ, loss; training = true) or update_training(c::PQ, loss, state)\nThe last update was an out-of-sample update. (done!(::PQ, loss; training=true) is always false)\nThe progress-modified generalization loss exceeds the threshold PQ_t > alpha OR the training progress stalls, i.e., P_j < tol.\n\nReference: Prechelt, Lutz (1998): \"Early Stopping- But When?\", in Neural Networks: Tricks of the Trade, ed. G. Orr, Springer.\n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#EarlyStopping.Patience","page":"Controlling Iterative Models","title":"EarlyStopping.Patience","text":"Patience(; n=5)\n\nAn early stopping criterion for loss-reporting iterative algorithms. \n\nA stop is triggered by n consecutive increases in the loss.\n\nDenoted \"UPs\" in Prechelt, Lutz (1998): \"Early Stopping- But When?\", in Neural Networks: Tricks of the Trade, ed. G. Orr, Springer.\n\nFor a customizable loss-based stopping criterion, use WithLossDo or WithTrainingLossesDo with the stop_if_true=true option. \n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#IterationControl.Info","page":"Controlling Iterative Models","title":"IterationControl.Info","text":"Info(f=identity)\n\nAn iteration control, as in, Info(my_loss_function). \n\nLog to Info the value of f(m), where m is the object being iterated. If IterationControl.expose(m) has been overloaded, then log f(expose(m)) instead.\n\nCan be suppressed by setting the global verbosity level sufficiently low. \n\nSee also Warn, Error. \n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#IterationControl.Warn","page":"Controlling Iterative Models","title":"IterationControl.Warn","text":"Warn(predicate; f=\"\")\n\nAn iteration control, as in, Warn(m -> length(m.cache) > 100, f=\"Memory low\"). \n\nIf predicate(m) is true, then log to Warn the value of f (or f(IterationControl.expose(m)) if f is a function). Here m is the object being iterated.\n\nCan be suppressed by setting the global verbosity level sufficiently low.\n\nSee also Info, Error. \n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#IterationControl.Error","page":"Controlling Iterative Models","title":"IterationControl.Error","text":"Error(predicate; f=\"\", exception=nothing)\n\nAn iteration control, as in, Error(m -> isnan(m.bias), f=\"Bias overflow!\"). \n\nIf predicate(m) is true, then log at the Error level the value of f (or f(IterationControl.expose(m)) if f is a function) and stop iteration at the end of the current control cycle. Here m is the object being iterated.\n\nSpecify exception=... to throw an immediate exception, without waiting until the end of the control cycle.\n\nSee also Info, Warn. 
\n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#IterationControl.Callback","page":"Controlling Iterative Models","title":"IterationControl.Callback","text":"Callback(f=_->nothing, stop_if_true=false, stop_message=nothing, raw=false)\n\nAn iteration control, as in, Callback(m->put!(v, my_loss_function(m))). \n\nCall f(IterationControl.expose(m)), where m is the object being iterated, unless raw=true, in which case call f(m) (guaranteed if expose has not been overloaded). If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified. \n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#IterationControl.WithNumberDo","page":"Controlling Iterative Models","title":"IterationControl.WithNumberDo","text":"WithNumberDo(f=n->@info(\"number: $n\"), stop_if_true=false, stop_message=nothing)\n\nAn iteration control, as in, WithNumberDo(n->put!(my_channel, n)). \n\nCall f(n + 1), where n is the number of complete control cycles of the control (so, n = 1, 2, 3, ..., unless the control is wrapped in IterationControl.skip).\n\nIf stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified. \n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#MLJIteration.WithIterationsDo","page":"Controlling Iterative Models","title":"MLJIteration.WithIterationsDo","text":"WithIterationsDo(f=x->@info(\"iterations: $x\"), stop_if_true=false, stop_message=nothing)\n\nAn iteration control, as in, WithIterationsDo(x->put!(my_channel, x)). \n\nCall f(x), where x is the current number of model iterations (generally more than the number of control cycles). If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified. \n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#IterationControl.WithLossDo","page":"Controlling Iterative Models","title":"IterationControl.WithLossDo","text":"WithLossDo(f=x->@info(\"loss: $x\"), stop_if_true=false, stop_message=nothing)\n\nAn iteration control, as in, WithLossDo(x->put!(my_losses, x)). \n\nCall f(loss), where loss is the current loss.\n\nIf stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified. \n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#IterationControl.WithTrainingLossesDo","page":"Controlling Iterative Models","title":"IterationControl.WithTrainingLossesDo","text":"WithTrainingLossesDo(f=v->@info(\"training: $v\"), stop_if_true=false, stop_message=nothing)\n\nAn iteration control, as in, WithTrainingLossesDo(v->put!(my_losses, last(v))). \n\nCall f(training_losses), where training_losses is the vector of the most recent batch of training losses.\n\nIf stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified. \n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#MLJIteration.WithEvaluationDo","page":"Controlling Iterative Models","title":"MLJIteration.WithEvaluationDo","text":"WithEvaluationDo(f=x->@info(\"evaluation: $x\"), stop_if_true=false, stop_message=nothing)\n\nAn iteration control, as in, WithEvaluationDo(x->put!(my_channel, x)). \n\nCall f(x), where x is the latest performance evaluation, as returned by evaluate!(train_mach, resampling=..., ...). Not valid if resampling=nothing. 
If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified. \n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#MLJIteration.WithFittedParamsDo","page":"Controlling Iterative Models","title":"MLJIteration.WithFittedParamsDo","text":"WithFittedParamsDo(f=x->@info(\"fitted_params: $x\"), stop_if_true=false, stop_message=nothing)\n\nAn iteration control, as in, WithFittedParamsDo(x->put!(my_channel, x)). \n\nCall f(x), where x = fitted_params(mach) is the fitted parameters of the training machine, mach, in its current state. If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified. \n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#MLJIteration.WithReportDo","page":"Controlling Iterative Models","title":"MLJIteration.WithReportDo","text":"WithReportDo(f=x->@info(\"report: $x\"), stop_if_true=false, stop_message=nothing)\n\nAn iteration control, as in, WithReportDo(x->put!(my_channel, x)). \n\nCall f(x), where x = report(mach) is the report associated with the training machine, mach, in its current state. If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified. \n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#MLJIteration.WithModelDo","page":"Controlling Iterative Models","title":"MLJIteration.WithModelDo","text":"WithModelDo(f=x->@info(\"model: $x\"), stop_if_true=false, stop_message=nothing)\n\nAn iteration control, as in, WithModelDo(x->put!(my_channel, x)). \n\nCall f(x), where x is the model associated with the training machine; f may mutate x, as in f(x) = (x.learning_rate *= 0.9). If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified. \n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#MLJIteration.WithMachineDo","page":"Controlling Iterative Models","title":"MLJIteration.WithMachineDo","text":"WithMachineDo(f=x->@info(\"machine: $x\"), stop_if_true=false, stop_message=nothing)\n\nAn iteration control, as in, WithMachineDo(x->put!(my_channel, x)). \n\nCall f(x), where x is the training machine in its current state. If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified. \n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#MLJIteration.Save","page":"Controlling Iterative Models","title":"MLJIteration.Save","text":"Save(filename=\"machine.jls\")\n\nAn iteration control, as in, Save(\"run3/machine.jls\"). \n\nSave the current state of the machine being iterated to disk, using the provided filename, decorated with a number, as in \"run3/machine42.jls\". The default behaviour uses the Serialization module but this can be changed by setting the method=save_fn(::String, ::Any) argument where save_fn is any serialization method. 
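\n\nFor example, here is a minimal checkpointing sketch, assuming model is some iterable supervised model and that the directory run3/ already exists:\n\ncontrols = [Step(10),\n Patience(5),\n Save(\"run3/machine.jls\")] ## writes run3/machine1.jls, run3/machine2.jls, ...\niterated_model = IteratedModel(model=model, controls=controls, measure=log_loss)\n\nA numbered snapshot of the training machine is then written at the end of each control cycle, that is, after every 10 iterations of model in this sketch. 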
For more on what is meant by \"the machine being iterated\", see IteratedModel.\n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#Control-wrappers","page":"Controlling Iterative Models","title":"Control wrappers","text":"","category":"section"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"IterationControl.skip\nIterationControl.louder\nIterationControl.with_state_do\nIterationControl.composite","category":"page"},{"location":"controlling_iterative_models/#IterationControl.skip","page":"Controlling Iterative Models","title":"IterationControl.skip","text":"IterationControl.skip(control, predicate=1)\n\nAn iteration control wrapper.\n\nIf predicate is an integer, k: Apply control only on every kth call to apply the wrapped control, starting with the kth call.\n\nIf predicate is a function: Apply control as usual when predicate(n + 1) is true but otherwise skip. Here n is the number of control cycles applied so far.\n\n\n\n\n\n","category":"function"},{"location":"controlling_iterative_models/#IterationControl.louder","page":"Controlling Iterative Models","title":"IterationControl.louder","text":"IterationControl.louder(control, by=1)\n\nWrap control to make it more (or less) verbose. The same as control, but as if the global verbosity were increased by the value by.\n\n\n\n\n\n","category":"function"},{"location":"controlling_iterative_models/#IterationControl.with_state_do","page":"Controlling Iterative Models","title":"IterationControl.with_state_do","text":"IterationControl.with_state_do(control,\n f=x->@info \"$(typeof(control)) state: $x\")\n\nWrap control to give access to its internal state. Acts exactly like control except that f is called on the internal state of control. If f is not specified, the control type and state are logged to Info at every update (useful for debugging new controls).\n\nWarning. 
The internal state of a control is not yet considered part of the public interface and could change in any pre-1.0 release of IterationControl.jl.\n\n\n\n\n\n","category":"function"},{"location":"controlling_iterative_models/#IterationControl.composite","page":"Controlling Iterative Models","title":"IterationControl.composite","text":"composite(controls...)\n\nConstruct an iteration control that applies the specified controls in sequence.\n\n\n\n\n\n","category":"function"},{"location":"models/SODDetector_OutlierDetectionPython/#SODDetector_OutlierDetectionPython","page":"SODDetector","title":"SODDetector","text":"","category":"section"},{"location":"models/SODDetector_OutlierDetectionPython/","page":"SODDetector","title":"SODDetector","text":"SODDetector(n_neighbors = 5,\n ref_set = 10,\n alpha = 0.8)","category":"page"},{"location":"models/SODDetector_OutlierDetectionPython/","page":"SODDetector","title":"SODDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.sod","category":"page"},{"location":"models/RandomUndersampler_Imbalance/#RandomUndersampler_Imbalance","page":"RandomUndersampler","title":"RandomUndersampler","text":"","category":"section"},{"location":"models/RandomUndersampler_Imbalance/","page":"RandomUndersampler","title":"RandomUndersampler","text":"Initiate a random undersampling model with the given hyper-parameters.","category":"page"},{"location":"models/RandomUndersampler_Imbalance/","page":"RandomUndersampler","title":"RandomUndersampler","text":"RandomUndersampler","category":"page"},{"location":"models/RandomUndersampler_Imbalance/","page":"RandomUndersampler","title":"RandomUndersampler","text":"A model type for constructing a random undersampler, based on Imbalance.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/RandomUndersampler_Imbalance/","page":"RandomUndersampler","title":"RandomUndersampler","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/RandomUndersampler_Imbalance/","page":"RandomUndersampler","title":"RandomUndersampler","text":"RandomUndersampler = @load RandomUndersampler pkg=Imbalance","category":"page"},{"location":"models/RandomUndersampler_Imbalance/","page":"RandomUndersampler","title":"RandomUndersampler","text":"Do model = RandomUndersampler() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomUndersampler(ratios=...).","category":"page"},{"location":"models/RandomUndersampler_Imbalance/","page":"RandomUndersampler","title":"RandomUndersampler","text":"RandomUndersampler implements naive undersampling by randomly removing existing observations. ","category":"page"},{"location":"models/RandomUndersampler_Imbalance/#Training-data","page":"RandomUndersampler","title":"Training data","text":"","category":"section"},{"location":"models/RandomUndersampler_Imbalance/","page":"RandomUndersampler","title":"RandomUndersampler","text":"In MLJ or MLJBase, wrap the model in a machine by mach = machine(model)","category":"page"},{"location":"models/RandomUndersampler_Imbalance/","page":"RandomUndersampler","title":"RandomUndersampler","text":"There is no need to provide any data here because the model is a static transformer.","category":"page"},{"location":"models/RandomUndersampler_Imbalance/","page":"RandomUndersampler","title":"RandomUndersampler","text":"Likewise, there is no need to fit!(mach). 
","category":"page"},{"location":"models/RandomUndersampler_Imbalance/","page":"RandomUndersampler","title":"RandomUndersampler","text":"For default values of the hyper-parameters, model can be constructed by model = RandomUndersampler()","category":"page"},{"location":"models/RandomUndersampler_Imbalance/#Hyperparameters","page":"RandomUndersampler","title":"Hyperparameters","text":"","category":"section"},{"location":"models/RandomUndersampler_Imbalance/","page":"RandomUndersampler","title":"RandomUndersampler","text":"ratios=1.0: A parameter that controls the amount of undersampling to be done for each class\nCan be a float and in this case each class will be undersampled to the size of the minority class times the float. By default, all classes are undersampled to the size of the minority class\nCan be a dictionary mapping each class label to the float ratio for that class\nrng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister`.","category":"page"},{"location":"models/RandomUndersampler_Imbalance/#Transform-Inputs","page":"RandomUndersampler","title":"Transform Inputs","text":"","category":"section"},{"location":"models/RandomUndersampler_Imbalance/","page":"RandomUndersampler","title":"RandomUndersampler","text":"X: A matrix of real numbers or a table with element scitypes that subtype Union{Finite, Infinite}. Elements in nominal columns should subtype Finite (i.e., have scitype OrderedFactor or Multiclass) and elements in continuous columns should subtype Infinite (i.e., have scitype Count or Continuous).\ny: An abstract vector of labels (e.g., strings) that correspond to the observations in X","category":"page"},{"location":"models/RandomUndersampler_Imbalance/#Transform-Outputs","page":"RandomUndersampler","title":"Transform Outputs","text":"","category":"section"},{"location":"models/RandomUndersampler_Imbalance/","page":"RandomUndersampler","title":"RandomUndersampler","text":"X_under: A matrix or table that includes the data after undersampling depending on whether the input X is a matrix or table respectively\ny_under: An abstract vector of labels corresponding to X_under","category":"page"},{"location":"models/RandomUndersampler_Imbalance/#Operations","page":"RandomUndersampler","title":"Operations","text":"","category":"section"},{"location":"models/RandomUndersampler_Imbalance/","page":"RandomUndersampler","title":"RandomUndersampler","text":"transform(mach, X, y): resample the data X and y using RandomUndersampler, returning both the new and original observations","category":"page"},{"location":"models/RandomUndersampler_Imbalance/#Example","page":"RandomUndersampler","title":"Example","text":"","category":"section"},{"location":"models/RandomUndersampler_Imbalance/","page":"RandomUndersampler","title":"RandomUndersampler","text":"using MLJ\nimport Imbalance\n\n## set probability of each class\nclass_probs = [0.5, 0.2, 0.3] \nnum_rows, num_continuous_feats = 100, 5\n## generate a table and categorical vector accordingly\nX, y = Imbalance.generate_imbalanced_data(num_rows, num_continuous_feats; \n class_probs, rng=42) \n\njulia> Imbalance.checkbalance(y; ref=\"minority\")\n 1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%) \n 2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 33 (173.7%) \n 0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (252.6%) \n\n## load RandomUndersampler\nRandomUndersampler = @load RandomUndersampler pkg=Imbalance\n\n## wrap the model in a 
machine\nundersampler = RandomUndersampler(ratios=Dict(0=>1.0, 1=> 1.0, 2=>1.0), \n rng=42)\nmach = machine(undersampler)\n\n## provide the data to transform (there is nothing to fit)\nX_under, y_under = transform(mach, X, y)\n \njulia> Imbalance.checkbalance(y_under; ref=\"minority\")\n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%) \n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%) \n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%) ","category":"page"},{"location":"models/RandomForestRegressor_MLJScikitLearnInterface/#RandomForestRegressor_MLJScikitLearnInterface","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"","category":"section"},{"location":"models/RandomForestRegressor_MLJScikitLearnInterface/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"RandomForestRegressor","category":"page"},{"location":"models/RandomForestRegressor_MLJScikitLearnInterface/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"A model type for constructing a random forest regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/RandomForestRegressor_MLJScikitLearnInterface/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/RandomForestRegressor_MLJScikitLearnInterface/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"RandomForestRegressor = @load RandomForestRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/RandomForestRegressor_MLJScikitLearnInterface/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"Do model = RandomForestRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomForestRegressor(n_estimators=...).","category":"page"},{"location":"models/RandomForestRegressor_MLJScikitLearnInterface/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"A random forest is a meta estimator that fits a number of classifying decision trees on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting. The sub-sample size is controlled with the max_samples parameter if bootstrap=True (default), otherwise the whole dataset is used to build each tree.","category":"page"},{"location":"models/FillImputer_MLJModels/#FillImputer_MLJModels","page":"FillImputer","title":"FillImputer","text":"","category":"section"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"FillImputer","category":"page"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"A model type for constructing a fill imputer, based on MLJModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"FillImputer = @load FillImputer pkg=MLJModels","category":"page"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"Do model = FillImputer() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in FillImputer(features=...).","category":"page"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"Use this model to impute missing values in tabular data. A fixed \"filler\" value is learned from the training data, one for each column of the table.","category":"page"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"For imputing missing values in a vector, use UnivariateFillImputer instead.","category":"page"},{"location":"models/FillImputer_MLJModels/#Training-data","page":"FillImputer","title":"Training data","text":"","category":"section"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"mach = machine(model, X)","category":"page"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"where","category":"page"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"X: any table of input features (eg, a DataFrame) whose columns each have element scitypes Union{Missing, T}, where T is a subtype of Continuous, Multiclass, OrderedFactor or Count. Check scitypes with schema(X).","category":"page"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/FillImputer_MLJModels/#Hyper-parameters","page":"FillImputer","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"features: a vector of names of features (symbols) for which imputation is to be attempted; default is empty, which is interpreted as \"impute all\".\ncontinuous_fill: function or other callable to determine value to be imputed in the case of Continuous (abstract float) data; default is to apply median after skipping missing values\ncount_fill: function or other callable to determine value to be imputed in the case of Count (integer) data; default is to apply rounded median after skipping missing values\nfinite_fill: function or other callable to determine value to be imputed in the case of Multiclass or OrderedFactor data (categorical vectors); default is to apply mode after skipping missing values","category":"page"},{"location":"models/FillImputer_MLJModels/#Operations","page":"FillImputer","title":"Operations","text":"","category":"section"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"transform(mach, Xnew): return Xnew with missing values imputed with the fill values learned when fitting mach","category":"page"},{"location":"models/FillImputer_MLJModels/#Fitted-parameters","page":"FillImputer","title":"Fitted parameters","text":"","category":"section"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"features_seen_in_fit: the names of features (columns) encountered during training\nunivariate_transformer: the univariate model applied to determine the fillers (it's fields contain the functions defining the filler 
computations)\nfiller_given_feature: dictionary of filler values, keyed on feature (column) names","category":"page"},{"location":"models/FillImputer_MLJModels/#Examples","page":"FillImputer","title":"Examples","text":"","category":"section"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"using MLJ\nimputer = FillImputer()\n\nX = (a = [1.0, 2.0, missing, 3.0, missing],\n b = coerce([\"y\", \"n\", \"y\", missing, \"y\"], Multiclass),\n c = [1, 1, 2, missing, 3])\n\nschema(X)\njulia> schema(X)\n┌───────┬───────────────────────────────┐\n│ names │ scitypes │\n├───────┼───────────────────────────────┤\n│ a │ Union{Missing, Continuous} │\n│ b │ Union{Missing, Multiclass{2}} │\n│ c │ Union{Missing, Count} │\n└───────┴───────────────────────────────┘\n\nmach = machine(imputer, X)\nfit!(mach)\n\njulia> fitted_params(mach).filler_given_feature\n(filler = 2.0,)\n\njulia> fitted_params(mach).filler_given_feature\nDict{Symbol, Any} with 3 entries:\n :a => 2.0\n :b => \"y\"\n :c => 2\n\njulia> transform(mach, X)\n(a = [1.0, 2.0, 2.0, 3.0, 2.0],\n b = CategoricalValue{String, UInt32}[\"y\", \"n\", \"y\", \"y\", \"y\"],\n c = [1, 1, 2, 2, 3],)","category":"page"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"See also UnivariateFillImputer.","category":"page"},{"location":"composing_models/#Composing-Models","page":"Composing Models","title":"Composing Models","text":"","category":"section"},{"location":"composing_models/","page":"Composing Models","title":"Composing Models","text":"Three common ways of combining multiple models together have out-of-the-box implementations in MLJ:","category":"page"},{"location":"composing_models/","page":"Composing Models","title":"Composing Models","text":"Linear Pipelines (Pipeline)- for unbranching chains that take the output of one model (e.g., dimension reduction, such as PCA) and make it the input of the next model in the chain (e.g., a classification model, such as EvoTreeClassifier). To include transformations of the target variable in a supervised pipeline model, see Target Transformations.\nHomogeneous Ensembles (EnsembleModel) - for blending the predictions of multiple supervised models all of the same type, but which receive different views of the training data to reduce overall variance. The technique implemented here is known as observation bagging. 
\nModel Stacking - (Stack) for combining the predictions of a smaller number of models of possibly different types, with the help of an adjudicating model.","category":"page"},{"location":"composing_models/","page":"Composing Models","title":"Composing Models","text":"Additionally, more complicated model compositions are possible using:","category":"page"},{"location":"composing_models/","page":"Composing Models","title":"Composing Models","text":"Learning Networks - \"blueprints\" for combining models in flexible ways; these are simple transformations of your existing workflows which can be \"exported\" to define new, stand-alone model types.","category":"page"},{"location":"models/OPTICS_MLJScikitLearnInterface/#OPTICS_MLJScikitLearnInterface","page":"OPTICS","title":"OPTICS","text":"","category":"section"},{"location":"models/OPTICS_MLJScikitLearnInterface/","page":"OPTICS","title":"OPTICS","text":"OPTICS","category":"page"},{"location":"models/OPTICS_MLJScikitLearnInterface/","page":"OPTICS","title":"OPTICS","text":"A model type for constructing a optics, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/OPTICS_MLJScikitLearnInterface/","page":"OPTICS","title":"OPTICS","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/OPTICS_MLJScikitLearnInterface/","page":"OPTICS","title":"OPTICS","text":"OPTICS = @load OPTICS pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/OPTICS_MLJScikitLearnInterface/","page":"OPTICS","title":"OPTICS","text":"Do model = OPTICS() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in OPTICS(min_samples=...).","category":"page"},{"location":"models/OPTICS_MLJScikitLearnInterface/","page":"OPTICS","title":"OPTICS","text":"OPTICS (Ordering Points To Identify the Clustering Structure), closely related to `DBSCAN', finds core sample of high density and expands clusters from them. Unlike DBSCAN, keeps cluster hierarchy for a variable neighborhood radius. Better suited for usage on large datasets than the current sklearn implementation of DBSCAN.","category":"page"},{"location":"models/OneHotEncoder_MLJModels/#OneHotEncoder_MLJModels","page":"OneHotEncoder","title":"OneHotEncoder","text":"","category":"section"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"OneHotEncoder","category":"page"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"A model type for constructing a one-hot encoder, based on MLJModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"OneHotEncoder = @load OneHotEncoder pkg=MLJModels","category":"page"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"Do model = OneHotEncoder() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in OneHotEncoder(features=...).","category":"page"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"Use this model to one-hot encode the Multiclass and OrderedFactor features (columns) of some table, leaving other columns unchanged.","category":"page"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"New data to be transformed may lack features present in the fit data, but no new features can be present.","category":"page"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"Warning: This transformer assumes that levels(col) for any Multiclass or OrderedFactor column, col, is the same for training data and new data to be transformed.","category":"page"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"To ensure all features are transformed into Continuous features, or dropped, use ContinuousEncoder instead.","category":"page"},{"location":"models/OneHotEncoder_MLJModels/#Training-data","page":"OneHotEncoder","title":"Training data","text":"","category":"section"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"mach = machine(model, X)","category":"page"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"where","category":"page"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"X: any Tables.jl compatible table. Columns can be of mixed type but only those with element scitype Multiclass or OrderedFactor can be encoded. Check column scitypes with schema(X).","category":"page"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/OneHotEncoder_MLJModels/#Hyper-parameters","page":"OneHotEncoder","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"features: a vector of symbols (column names). If empty (default) then all Multiclass and OrderedFactor features are encoded. Otherwise, encoding is further restricted to the specified features (ignore=false) or the unspecified features (ignore=true). This default behavior can be modified by the ordered_factor flag.\nordered_factor=false: when true, OrderedFactor features are universally excluded\ndrop_last=true: whether to drop the column corresponding to the final class of encoded features. 
For example, a three-class feature is spawned into three new features if drop_last=false, but just two features otherwise.","category":"page"},{"location":"models/OneHotEncoder_MLJModels/#Fitted-parameters","page":"OneHotEncoder","title":"Fitted parameters","text":"","category":"section"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"all_features: names of all features encountered in training\nfitted_levels_given_feature: dictionary of the levels associated with each feature encoded, keyed on the feature name\nref_name_pairs_given_feature: dictionary of pairs r => ftr (such as 0x00000001 => :grad__A) where r is a CategoricalArrays.jl reference integer representing a level, and ftr the corresponding new feature name; the dictionary is keyed on the names of features that are encoded","category":"page"},{"location":"models/OneHotEncoder_MLJModels/#Report","page":"OneHotEncoder","title":"Report","text":"","category":"section"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"The fields of report(mach) are:","category":"page"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"features_to_be_encoded: names of input features to be encoded\nnew_features: names of all output features","category":"page"},{"location":"models/OneHotEncoder_MLJModels/#Example","page":"OneHotEncoder","title":"Example","text":"","category":"section"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"using MLJ\n\nX = (name=categorical([\"Danesh\", \"Lee\", \"Mary\", \"John\"]),\n grade=categorical([\"A\", \"B\", \"A\", \"C\"], ordered=true),\n height=[1.85, 1.67, 1.5, 1.67],\n n_devices=[3, 2, 4, 3])\n\njulia> schema(X)\n┌───────────┬──────────────────┐\n│ names │ scitypes │\n├───────────┼──────────────────┤\n│ name │ Multiclass{4} │\n│ grade │ OrderedFactor{3} │\n│ height │ Continuous │\n│ n_devices │ Count │\n└───────────┴──────────────────┘\n\nhot = OneHotEncoder(drop_last=true)\nmach = fit!(machine(hot, X))\nW = transform(mach, X)\n\njulia> schema(W)\n┌──────────────┬────────────┐\n│ names │ scitypes │\n├──────────────┼────────────┤\n│ name__Danesh │ Continuous │\n│ name__John │ Continuous │\n│ name__Lee │ Continuous │\n│ grade__A │ Continuous │\n│ grade__B │ Continuous │\n│ height │ Continuous │\n│ n_devices │ Count │\n└──────────────┴────────────┘","category":"page"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"See also ContinuousEncoder.","category":"page"},{"location":"internals/#internals_section","page":"Internals","title":"Internals","text":"","category":"section"},{"location":"internals/#The-machine-interface,-simplified","page":"Internals","title":"The machine interface, simplified","text":"","category":"section"},{"location":"internals/","page":"Internals","title":"Internals","text":"The following is a simplified description of the Machine interface. It predates the introduction of an optional data front-end for models (see Implementing a data front-end). 
See also the Glossary","category":"page"},{"location":"internals/#The-Machine-type","page":"Internals","title":"The Machine type","text":"","category":"section"},{"location":"internals/","page":"Internals","title":"Internals","text":"mutable struct Machine{M fit!\n\nXnew, _ = make_regression(3, 9)\nyhat = predict(mach, Xnew) ## new predictions","category":"page"},{"location":"models/MultitargetLinearRegressor_MultivariateStats/","page":"MultitargetLinearRegressor","title":"MultitargetLinearRegressor","text":"See also LinearRegressor, RidgeRegressor, MultitargetRidgeRegressor","category":"page"},{"location":"models/CDDetector_OutlierDetectionPython/#CDDetector_OutlierDetectionPython","page":"CDDetector","title":"CDDetector","text":"","category":"section"},{"location":"models/CDDetector_OutlierDetectionPython/","page":"CDDetector","title":"CDDetector","text":"CDDetector(whitening = true,\n rule_of_thumb = false)","category":"page"},{"location":"models/CDDetector_OutlierDetectionPython/","page":"CDDetector","title":"CDDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.cd","category":"page"},{"location":"models/ConstantRegressor_MLJModels/#ConstantRegressor_MLJModels","page":"ConstantRegressor","title":"ConstantRegressor","text":"","category":"section"},{"location":"models/ConstantRegressor_MLJModels/","page":"ConstantRegressor","title":"ConstantRegressor","text":"ConstantRegressor","category":"page"},{"location":"models/ConstantRegressor_MLJModels/","page":"ConstantRegressor","title":"ConstantRegressor","text":"This \"dummy\" probabilistic predictor always returns the same distribution, irrespective of the provided input pattern. The distribution returned is the one of the type specified that best fits the training target data. Use predict_mean or predict_median to predict the mean or median values instead. If not specified, a normal distribution is fit.","category":"page"},{"location":"models/ConstantRegressor_MLJModels/","page":"ConstantRegressor","title":"ConstantRegressor","text":"Almost any reasonable model is expected to outperform ConstantRegressor which is used almost exclusively for testing and establishing performance baselines.","category":"page"},{"location":"models/ConstantRegressor_MLJModels/","page":"ConstantRegressor","title":"ConstantRegressor","text":"In MLJ (or MLJModels) do model = ConstantRegressor() or model = ConstantRegressor(distribution=...) 
to construct a model instance.","category":"page"},{"location":"models/ConstantRegressor_MLJModels/#Training-data","page":"ConstantRegressor","title":"Training data","text":"","category":"section"},{"location":"models/ConstantRegressor_MLJModels/","page":"ConstantRegressor","title":"ConstantRegressor","text":"In MLJ (or MLJBase) bind an instance model to data with","category":"page"},{"location":"models/ConstantRegressor_MLJModels/","page":"ConstantRegressor","title":"ConstantRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/ConstantRegressor_MLJModels/","page":"ConstantRegressor","title":"ConstantRegressor","text":"Here:","category":"page"},{"location":"models/ConstantRegressor_MLJModels/","page":"ConstantRegressor","title":"ConstantRegressor","text":"X is any table of input features (eg, a DataFrame)\ny is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with schema(y)","category":"page"},{"location":"models/ConstantRegressor_MLJModels/","page":"ConstantRegressor","title":"ConstantRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/ConstantRegressor_MLJModels/#Hyper-parameters","page":"ConstantRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/ConstantRegressor_MLJModels/","page":"ConstantRegressor","title":"ConstantRegressor","text":"distribution_type=Distributions.Normal: The distribution to be fit to the target data. Must be a subtype of Distributions.ContinuousUnivariateDistribution.","category":"page"},{"location":"models/ConstantRegressor_MLJModels/#Operations","page":"ConstantRegressor","title":"Operations","text":"","category":"section"},{"location":"models/ConstantRegressor_MLJModels/","page":"ConstantRegressor","title":"ConstantRegressor","text":"predict(mach, Xnew): Return predictions of the target given features Xnew (which for this model are ignored). 
Predictions are probabilistic.\npredict_mean(mach, Xnew): Return instead the means of the probabilistic predictions returned above.\npredict_median(mach, Xnew): Return instead the medians of the probabilistic predictions returned above.","category":"page"},{"location":"models/ConstantRegressor_MLJModels/#Fitted-parameters","page":"ConstantRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/ConstantRegressor_MLJModels/","page":"ConstantRegressor","title":"ConstantRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/ConstantRegressor_MLJModels/","page":"ConstantRegressor","title":"ConstantRegressor","text":"target_distribution: The distribution fit to the supplied target data.","category":"page"},{"location":"models/ConstantRegressor_MLJModels/#Examples","page":"ConstantRegressor","title":"Examples","text":"","category":"section"},{"location":"models/ConstantRegressor_MLJModels/","page":"ConstantRegressor","title":"ConstantRegressor","text":"using MLJ\n\nX, y = make_regression(10, 2) ## synthetic data: a table and vector\nregressor = ConstantRegressor()\nmach = machine(regressor, X, y) |> fit!\n\nfitted_params(mach)\n\nXnew, _ = make_regression(3, 2)\npredict(mach, Xnew)\npredict_mean(mach, Xnew)\n","category":"page"},{"location":"models/ConstantRegressor_MLJModels/","page":"ConstantRegressor","title":"ConstantRegressor","text":"See also ConstantClassifier","category":"page"},{"location":"models/ElasticNetRegressor_MLJScikitLearnInterface/#ElasticNetRegressor_MLJScikitLearnInterface","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"","category":"section"},{"location":"models/ElasticNetRegressor_MLJScikitLearnInterface/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"ElasticNetRegressor","category":"page"},{"location":"models/ElasticNetRegressor_MLJScikitLearnInterface/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"A model type for constructing a elastic net regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/ElasticNetRegressor_MLJScikitLearnInterface/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/ElasticNetRegressor_MLJScikitLearnInterface/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"ElasticNetRegressor = @load ElasticNetRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/ElasticNetRegressor_MLJScikitLearnInterface/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"Do model = ElasticNetRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in ElasticNetRegressor(alpha=...).","category":"page"},{"location":"models/ElasticNetRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"ElasticNetRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/ElasticNetRegressor_MLJScikitLearnInterface/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"alpha = 1.0\nl1_ratio = 0.5\nfit_intercept = true\nprecompute = false\nmax_iter = 1000\ncopy_X = true\ntol = 0.0001\nwarm_start = false\npositive = false\nrandom_state = nothing\nselection = cyclic","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/#SubspaceLDA_MultivariateStats","page":"SubspaceLDA","title":"SubspaceLDA","text":"","category":"section"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"SubspaceLDA","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"A model type for constructing a subpace LDA model, based on MultivariateStats.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"SubspaceLDA = @load SubspaceLDA pkg=MultivariateStats","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"Do model = SubspaceLDA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SubspaceLDA(normalize=...).","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"Multiclass subspace linear discriminant analysis (LDA) is a variation on ordinary LDA suitable for high dimensional data, as it avoids storing scatter matrices. For details, refer the MultivariateStats.jl documentation.","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"In addition to dimension reduction (using transform) probabilistic classification is provided (using predict). In the case of classification, the class probability for a new observation reflects the proximity of that observation to training observations associated with that class, and how far away the observation is from observations associated with other classes. Specifically, the distances, in the transformed (projected) space, of a new observation, from the centroid of each target class, is computed; the resulting vector of distances, multiplied by minus one, is passed to a softmax function to obtain a class probability prediction. 
Here \"distance\" is computed using a user-specified distance function.","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/#Training-data","page":"SubspaceLDA","title":"Training data","text":"","category":"section"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"Here:","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).\ny is the target, which can be any AbstractVector whose element scitype is OrderedFactor or Multiclass; check the scitype with scitype(y).","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/#Hyper-parameters","page":"SubspaceLDA","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"normalize=true: Option to normalize the between class variance for the number of observations in each class, one of true or false.\noutdim: the ouput dimension, automatically set to min(indim, nclasses-1) if equal to 0. If a non-zero outdim is passed, then the actual output dimension used is min(rank, outdim) where rank is the rank of the within-class covariance matrix.\ndist=Distances.SqEuclidean(): The distance metric to use when performing classification (to compare the distance between a new point and centroids in the transformed space); must be a subtype of Distances.SemiMetric from Distances.jl, e.g., Distances.CosineDist.","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/#Operations","page":"SubspaceLDA","title":"Operations","text":"","category":"section"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.\npredict(mach, Xnew): Return predictions of the target given features Xnew, which should have same scitype as X above. 
Predictions are probabilistic but uncalibrated.\npredict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/#Fitted-parameters","page":"SubspaceLDA","title":"Fitted parameters","text":"","category":"section"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"classes: The classes seen during model fitting.\nprojection_matrix: The learned projection matrix, of size (indim, outdim), where indim and outdim are the input and output dimensions respectively (See Report section below).","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/#Report","page":"SubspaceLDA","title":"Report","text":"","category":"section"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"The fields of report(mach) are:","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"indim: The dimension of the input space i.e the number of training features.\noutdim: The dimension of the transformed space the model is projected to.\nmean: The mean of the untransformed training data. A vector of length indim.\nnclasses: The number of classes directly observed in the training data (which can be less than the total number of classes in the class pool)","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"class_means: The class-specific means of the training data. A matrix of size (indim, nclasses) with the ith column being the class-mean of the ith class in classes (See fitted params section above).","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"class_weights: The weights (class counts) of each class. A vector of length nclasses with the ith element being the class weight of the ith class in classes. (See fitted params section above.)\nexplained_variance_ratio: The ratio of explained variance to total variance. Each dimension corresponds to an eigenvalue.","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/#Examples","page":"SubspaceLDA","title":"Examples","text":"","category":"section"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"using MLJ\n\nSubspaceLDA = @load SubspaceLDA pkg=MultivariateStats\n\nX, y = @load_iris ## a table and a vector\n\nmodel = SubspaceLDA()\nmach = machine(model, X, y) |> fit!\n\nXproj = transform(mach, X)\ny_hat = predict(mach, X)\nlabels = predict_mode(mach, X)","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"See also LDA, BayesianLDA, BayesianSubspaceLDA","category":"page"},{"location":"generating_synthetic_data/#Generating-Synthetic-Data","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"","category":"section"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"Here synthetic data means artificially generated data, with no reference to a \"real world\" data set. 
Not to be confused with \"fake data\" obtained by resampling from a distribution fit to some actual real data.","category":"page"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"MLJ has a set of functions - make_blobs, make_circles, make_moons and make_regression (closely resembling functions in scikit-learn of the same name) - for generating synthetic data sets. These are useful for testing machine learning models (e.g., testing user-defined composite models; see Composing Models).","category":"page"},{"location":"generating_synthetic_data/#Generating-Gaussian-blobs","page":"Generating Synthetic Data","title":"Generating Gaussian blobs","text":"","category":"section"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"make_blobs","category":"page"},{"location":"generating_synthetic_data/#MLJBase.make_blobs","page":"Generating Synthetic Data","title":"MLJBase.make_blobs","text":"X, y = make_blobs(n=100, p=2; kwargs...)\n\nGenerate Gaussian blobs for clustering and classification problems.\n\nReturn value\n\nBy default, a table X with p columns (features) and n rows (observations), together with a corresponding vector of n Multiclass target observations y, indicating blob membership.\n\nKeyword arguments\n\nshuffle=true: whether to shuffle the resulting points,\ncenters=3: either a number of centers or a c x p matrix with c pre-determined centers,\ncluster_std=1.0: the standard deviation(s) of each blob,\ncenter_box=(-10. => 10.): the limits of the p-dimensional cube within which the cluster centers are drawn if they are not provided,\neltype=Float64: machine type of points (any subtype of AbstractFloat).\nrng=Random.GLOBAL_RNG: any AbstractRNG object, or integer to seed a MersenneTwister (for reproducibility).\nas_table=true: whether to return the points as a table (true) or a matrix (false). If false the target y has integer element type. 
\n\nExample\n\nX, y = make_blobs(100, 3; centers=2, cluster_std=[1.0, 3.0])\n\n\n\n\n\n","category":"function"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"using MLJ, DataFrames\nX, y = make_blobs(100, 3; centers=2, cluster_std=[1.0, 3.0])\ndfBlobs = DataFrame(X)\ndfBlobs.y = y\nfirst(dfBlobs, 3)","category":"page"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"using VegaLite\ndfBlobs |> @vlplot(:point, x=:x1, y=:x2, color = :\"y:n\") ","category":"page"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"(Image: svg)","category":"page"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"dfBlobs |> @vlplot(:point, x=:x1, y=:x3, color = :\"y:n\") ","category":"page"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"(Image: svg)","category":"page"},{"location":"generating_synthetic_data/#Generating-concentric-circles","page":"Generating Synthetic Data","title":"Generating concentric circles","text":"","category":"section"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"make_circles","category":"page"},{"location":"generating_synthetic_data/#MLJBase.make_circles","page":"Generating Synthetic Data","title":"MLJBase.make_circles","text":"X, y = make_circles(n=100; kwargs...)\n\nGenerate n labeled points close to two concentric circles for classification and clustering models.\n\nReturn value\n\nBy default, a table X with 2 columns and n rows (observations), together with a corresponding vector of n Multiclass target observations y. The target is either 0 or 1, corresponding to membership to the smaller or larger circle, respectively.\n\nKeyword arguments\n\nshuffle=true: whether to shuffle the resulting points,\nnoise=0: standard deviation of the Gaussian noise added to the data,\nfactor=0.8: ratio of the smaller radius over the larger one,\n\neltype=Float64: machine type of points (any subtype of AbstractFloat).\nrng=Random.GLOBAL_RNG: any AbstractRNG object, or integer to seed a MersenneTwister (for reproducibility).\nas_table=true: whether to return the points as a table (true) or a matrix (false). If false the target y has integer element type. 
\n\nExample\n\nX, y = make_circles(100; noise=0.5, factor=0.3)\n\n\n\n\n\n","category":"function"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"using MLJ, DataFrames\nX, y = make_circles(100; noise=0.05, factor=0.3)\ndfCircles = DataFrame(X)\ndfCircles.y = y\nfirst(dfCircles, 3)","category":"page"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"using VegaLite\ndfCircles |> @vlplot(:circle, x=:x1, y=:x2, color = :\"y:n\") ","category":"page"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"(Image: svg)","category":"page"},{"location":"generating_synthetic_data/#Sampling-from-two-interleaved-half-circles","page":"Generating Synthetic Data","title":"Sampling from two interleaved half-circles","text":"","category":"section"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"make_moons","category":"page"},{"location":"generating_synthetic_data/#MLJBase.make_moons","page":"Generating Synthetic Data","title":"MLJBase.make_moons","text":"make_moons(n::Int=100; kwargs...)\n\nGenerates labeled two-dimensional points lying close to two interleaved semi-circles, for use with classification and clustering models.\n\nReturn value\n\nBy default, a table X with 2 columns and n rows (observations), together with a corresponding vector of n Multiclass target observations y. The target is either 0 or 1, corresponding to membership to the left or right semi-circle.\n\nKeyword arguments\n\nshuffle=true: whether to shuffle the resulting points,\nnoise=0.1: standard deviation of the Gaussian noise added to the data,\nxshift=1.0: horizontal translation of the second center with respect to the first one.\nyshift=0.3: vertical translation of the second center with respect to the first one. \neltype=Float64: machine type of points (any subtype of AbstractFloat).\nrng=Random.GLOBAL_RNG: any AbstractRNG object, or integer to seed a MersenneTwister (for reproducibility).\nas_table=true: whether to return the points as a table (true) or a matrix (false). If false the target y has integer element type. 
\n\nExample\n\nX, y = make_moons(100; noise=0.5)\n\n\n\n\n\n","category":"function"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"using MLJ, DataFrames\nX, y = make_moons(100; noise=0.05)\ndfHalfCircles = DataFrame(X)\ndfHalfCircles.y = y\nfirst(dfHalfCircles, 3)","category":"page"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"using VegaLite\ndfHalfCircles |> @vlplot(:circle, x=:x1, y=:x2, color = :\"y:n\") ","category":"page"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"(Image: svg)","category":"page"},{"location":"generating_synthetic_data/#Regression-data-generated-from-noisy-linear-models","page":"Generating Synthetic Data","title":"Regression data generated from noisy linear models","text":"","category":"section"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"make_regression","category":"page"},{"location":"generating_synthetic_data/#MLJBase.make_regression","page":"Generating Synthetic Data","title":"MLJBase.make_regression","text":"make_regression(n, p; kwargs...)\n\nGenerate Gaussian input features and a linear response with Gaussian noise, for use with regression models.\n\nReturn value\n\nBy default, a tuple (X, y) where table X has p columns and n rows (observations), together with a corresponding vector of n Continuous target observations y.\n\nKeywords\n\nintercept=true: Whether to generate data from a model with intercept.\nn_targets=1: Number of columns in the target.\nsparse=0: Proportion of the generating weight vector that is sparse.\nnoise=0.1: Standard deviation of the Gaussian noise added to the response (target).\noutliers=0: Proportion of the response vector to make as outliers by adding a random quantity with high variance. (Only applied if binary is false.)\nas_table=true: Whether X (and y, if n_targets > 1) should be a table or a matrix.\neltype=Float64: Element type for X and y. Must subtype AbstractFloat.\nbinary=false: Whether the target should be binarized (via a sigmoid).\neltype=Float64: machine type of points (any subtype of AbstractFloat).\nrng=Random.GLOBAL_RNG: any AbstractRNG object, or integer to seed a MersenneTwister (for reproducibility).\nas_table=true: whether to return the points as a table (true) or a matrix (false). 
\n\nExample\n\nX, y = make_regression(100, 5; noise=0.5, sparse=0.2, outliers=0.1)\n\n\n\n\n\n","category":"function"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"using MLJ, DataFrames\nX, y = make_regression(100, 5; noise=0.5, sparse=0.2, outliers=0.1)\ndfRegression = DataFrame(X)\ndfRegression.y = y\nfirst(dfRegression, 3)","category":"page"},{"location":"models/FeatureAgglomeration_MLJScikitLearnInterface/#FeatureAgglomeration_MLJScikitLearnInterface","page":"FeatureAgglomeration","title":"FeatureAgglomeration","text":"","category":"section"},{"location":"models/FeatureAgglomeration_MLJScikitLearnInterface/","page":"FeatureAgglomeration","title":"FeatureAgglomeration","text":"FeatureAgglomeration","category":"page"},{"location":"models/FeatureAgglomeration_MLJScikitLearnInterface/","page":"FeatureAgglomeration","title":"FeatureAgglomeration","text":"A model type for constructing a feature agglomeration, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/FeatureAgglomeration_MLJScikitLearnInterface/","page":"FeatureAgglomeration","title":"FeatureAgglomeration","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/FeatureAgglomeration_MLJScikitLearnInterface/","page":"FeatureAgglomeration","title":"FeatureAgglomeration","text":"FeatureAgglomeration = @load FeatureAgglomeration pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/FeatureAgglomeration_MLJScikitLearnInterface/","page":"FeatureAgglomeration","title":"FeatureAgglomeration","text":"Do model = FeatureAgglomeration() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in FeatureAgglomeration(n_clusters=...).","category":"page"},{"location":"models/FeatureAgglomeration_MLJScikitLearnInterface/","page":"FeatureAgglomeration","title":"FeatureAgglomeration","text":"Similar to AgglomerativeClustering, but recursively merges features instead of samples.","category":"page"},{"location":"models/SVMRegressor_MLJScikitLearnInterface/#SVMRegressor_MLJScikitLearnInterface","page":"SVMRegressor","title":"SVMRegressor","text":"","category":"section"},{"location":"models/SVMRegressor_MLJScikitLearnInterface/","page":"SVMRegressor","title":"SVMRegressor","text":"SVMRegressor","category":"page"},{"location":"models/SVMRegressor_MLJScikitLearnInterface/","page":"SVMRegressor","title":"SVMRegressor","text":"A model type for constructing an epsilon-support vector regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/SVMRegressor_MLJScikitLearnInterface/","page":"SVMRegressor","title":"SVMRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/SVMRegressor_MLJScikitLearnInterface/","page":"SVMRegressor","title":"SVMRegressor","text":"SVMRegressor = @load SVMRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/SVMRegressor_MLJScikitLearnInterface/","page":"SVMRegressor","title":"SVMRegressor","text":"Do model = SVMRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in SVMRegressor(kernel=...).","category":"page"},{"location":"models/SVMRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"SVMRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/SVMRegressor_MLJScikitLearnInterface/","page":"SVMRegressor","title":"SVMRegressor","text":"kernel = rbf\ndegree = 3\ngamma = scale\ncoef0 = 0.0\ntol = 0.001\nC = 1.0\nepsilon = 0.1\nshrinking = true\ncache_size = 200\nmax_iter = -1","category":"page"},{"location":"models/SimpleImputer_BetaML/#SimpleImputer_BetaML","page":"SimpleImputer","title":"SimpleImputer","text":"","category":"section"},{"location":"models/SimpleImputer_BetaML/","page":"SimpleImputer","title":"SimpleImputer","text":"mutable struct SimpleImputer <: MLJModelInterface.Unsupervised","category":"page"},{"location":"models/SimpleImputer_BetaML/","page":"SimpleImputer","title":"SimpleImputer","text":"Impute missing values using feature (column) mean, with optional record normalisation (using l-norms), from the Beta Machine Learning Toolkit (BetaML).","category":"page"},{"location":"models/SimpleImputer_BetaML/#Hyperparameters:","page":"SimpleImputer","title":"Hyperparameters:","text":"","category":"section"},{"location":"models/SimpleImputer_BetaML/","page":"SimpleImputer","title":"SimpleImputer","text":"statistic::Function: The descriptive statistic of the column (feature) to use as imputed value [def: mean]\nnorm::Union{Nothing, Int64}: Normalise the feature mean by the l-norm of the records [default: nothing]. Use it (e.g. norm=1 to use the l-1 norm) if the records are highly heterogeneous (e.g. quantity exports of different countries).","category":"page"},{"location":"models/SimpleImputer_BetaML/#Example:","page":"SimpleImputer","title":"Example:","text":"","category":"section"},{"location":"models/SimpleImputer_BetaML/","page":"SimpleImputer","title":"SimpleImputer","text":"julia> using MLJ\n\njulia> X = [1 10.5;1.5 missing; 1.8 8; 1.7 15; 3.2 40; missing missing; 3.3 38; missing -2.3; 5.2 -2.4] |> table ;\n\njulia> modelType = @load SimpleImputer pkg = \"BetaML\" verbosity=0\nBetaML.Imputation.SimpleImputer\n\njulia> model = modelType(norm=1)\nSimpleImputer(\n statistic = Statistics.mean, \n norm = 1)\n\njulia> mach = machine(model, X);\n\njulia> fit!(mach);\n[ Info: Training machine(SimpleImputer(statistic = mean, …), …).\n\njulia> X_full = transform(mach) |> MLJ.matrix\n9×2 Matrix{Float64}:\n 1.0 10.5\n 1.5 0.295466\n 1.8 8.0\n 1.7 15.0\n 3.2 40.0\n 0.280952 1.69524\n 3.3 38.0\n 0.0750839 -2.3\n 5.2 -2.4","category":"page"},{"location":"models/UnivariateDiscretizer_MLJModels/#UnivariateDiscretizer_MLJModels","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"","category":"section"},{"location":"models/UnivariateDiscretizer_MLJModels/","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"UnivariateDiscretizer","category":"page"},{"location":"models/UnivariateDiscretizer_MLJModels/","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"A model type for constructing a single variable discretizer, based on MLJModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/UnivariateDiscretizer_MLJModels/","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"From MLJ, the type can be imported 
using","category":"page"},{"location":"models/UnivariateDiscretizer_MLJModels/","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"UnivariateDiscretizer = @load UnivariateDiscretizer pkg=MLJModels","category":"page"},{"location":"models/UnivariateDiscretizer_MLJModels/","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"Do model = UnivariateDiscretizer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateDiscretizer(n_classes=...).","category":"page"},{"location":"models/UnivariateDiscretizer_MLJModels/","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"Discretization converts a Continuous vector into an OrderedFactor vector. In particular, the output is a CategoricalVector (whose reference type is optimized).","category":"page"},{"location":"models/UnivariateDiscretizer_MLJModels/","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"The transformation is chosen so that the vector on which the transformer is fit has, in transformed form, an approximately uniform distribution of values. Specifically, if n_classes is the level of discretization, then 2*n_classes - 1 ordered quantiles are computed, the odd quantiles being used for transforming (discretization) and the even quantiles for inverse transforming.","category":"page"},{"location":"models/UnivariateDiscretizer_MLJModels/#Training-data","page":"UnivariateDiscretizer","title":"Training data","text":"","category":"section"},{"location":"models/UnivariateDiscretizer_MLJModels/","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/UnivariateDiscretizer_MLJModels/","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"mach = machine(model, x)","category":"page"},{"location":"models/UnivariateDiscretizer_MLJModels/","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"where","category":"page"},{"location":"models/UnivariateDiscretizer_MLJModels/","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"x: any abstract vector with Continuous element scitype; check scitype with scitype(x).","category":"page"},{"location":"models/UnivariateDiscretizer_MLJModels/","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/UnivariateDiscretizer_MLJModels/#Hyper-parameters","page":"UnivariateDiscretizer","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/UnivariateDiscretizer_MLJModels/","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"n_classes: number of discrete classes in the output","category":"page"},{"location":"models/UnivariateDiscretizer_MLJModels/#Operations","page":"UnivariateDiscretizer","title":"Operations","text":"","category":"section"},{"location":"models/UnivariateDiscretizer_MLJModels/","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"transform(mach, xnew): discretize xnew according to the discretization learned when fitting mach\ninverse_transform(mach, z): attempt to reconstruct from z a vector that transforms to give z","category":"page"},{"location":"models/UnivariateDiscretizer_MLJModels/#Fitted-parameters","page":"UnivariateDiscretizer","title":"Fitted 
parameters","text":"","category":"section"},{"location":"models/UnivariateDiscretizer_MLJModels/","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"The fields of fitted_params(mach).fitresult include:","category":"page"},{"location":"models/UnivariateDiscretizer_MLJModels/","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"odd_quantiles: quantiles used for transforming (length is n_classes - 1)\neven_quantiles: quantiles used for inverse transforming (length is n_classes)","category":"page"},{"location":"models/UnivariateDiscretizer_MLJModels/#Example","page":"UnivariateDiscretizer","title":"Example","text":"","category":"section"},{"location":"models/UnivariateDiscretizer_MLJModels/","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"using MLJ\nusing Random\nRandom.seed!(123)\n\ndiscretizer = UnivariateDiscretizer(n_classes=100)\nmach = machine(discretizer, randn(1000))\nfit!(mach)\n\njulia> x = rand(5)\n5-element Vector{Float64}:\n 0.8585244609846809\n 0.37541692370451396\n 0.6767070590395461\n 0.9208844241267105\n 0.7064611415680901\n\njulia> z = transform(mach, x)\n5-element CategoricalArrays.CategoricalArray{UInt8,1,UInt8}:\n 0x52\n 0x42\n 0x4d\n 0x54\n 0x4e\n\nx_approx = inverse_transform(mach, z)\njulia> x - x_approx\n5-element Vector{Float64}:\n 0.008224506144777322\n 0.012731354778359405\n 0.0056265330571125816\n 0.005738175684445124\n 0.006835652575801987","category":"page"},{"location":"models/GaussianNBClassifier_MLJScikitLearnInterface/#GaussianNBClassifier_MLJScikitLearnInterface","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"","category":"section"},{"location":"models/GaussianNBClassifier_MLJScikitLearnInterface/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"GaussianNBClassifier","category":"page"},{"location":"models/GaussianNBClassifier_MLJScikitLearnInterface/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"A model type for constructing a Gaussian naive Bayes classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/GaussianNBClassifier_MLJScikitLearnInterface/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/GaussianNBClassifier_MLJScikitLearnInterface/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"GaussianNBClassifier = @load GaussianNBClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/GaussianNBClassifier_MLJScikitLearnInterface/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"Do model = GaussianNBClassifier() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in GaussianNBClassifier(priors=...).","category":"page"},{"location":"models/GaussianNBClassifier_MLJScikitLearnInterface/#Hyper-parameters","page":"GaussianNBClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/GaussianNBClassifier_MLJScikitLearnInterface/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"priors = nothing\nvar_smoothing = 1.0e-9","category":"page"},{"location":"models/GaussianNBClassifier_NaiveBayes/#GaussianNBClassifier_NaiveBayes","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"","category":"section"},{"location":"models/GaussianNBClassifier_NaiveBayes/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"GaussianNBClassifier","category":"page"},{"location":"models/GaussianNBClassifier_NaiveBayes/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"A model type for constructing a Gaussian naive Bayes classifier, based on NaiveBayes.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/GaussianNBClassifier_NaiveBayes/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/GaussianNBClassifier_NaiveBayes/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"GaussianNBClassifier = @load GaussianNBClassifier pkg=NaiveBayes","category":"page"},{"location":"models/GaussianNBClassifier_NaiveBayes/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"Do model = GaussianNBClassifier() to construct an instance with default hyper-parameters. ","category":"page"},{"location":"models/GaussianNBClassifier_NaiveBayes/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"Given each class taken on by the target variable y, it is supposed that the conditional probability distribution for the input variables X is a multivariate Gaussian. The mean and covariance of these Gaussian distributions are estimated using maximum likelihood, and a probability distribution for y given X is deduced by applying Bayes' rule. The required marginal for y is estimated using class frequency in the training data.","category":"page"},{"location":"models/GaussianNBClassifier_NaiveBayes/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"Important. The name \"naive Bayes classifier\" is perhaps misleading. 
Since we are learning the full multivariate Gaussian distributions for X given y, we are not applying the usual naive Bayes independence condition, which would amount to forcing the covariance matrix to be diagonal.","category":"page"},{"location":"models/GaussianNBClassifier_NaiveBayes/#Training-data","page":"GaussianNBClassifier","title":"Training data","text":"","category":"section"},{"location":"models/GaussianNBClassifier_NaiveBayes/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/GaussianNBClassifier_NaiveBayes/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/GaussianNBClassifier_NaiveBayes/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"Here:","category":"page"},{"location":"models/GaussianNBClassifier_NaiveBayes/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check the column scitypes with schema(X)\ny is the target, which can be any AbstractVector whose element scitype is Finite; check the scitype with schema(y)","category":"page"},{"location":"models/GaussianNBClassifier_NaiveBayes/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/GaussianNBClassifier_NaiveBayes/#Operations","page":"GaussianNBClassifier","title":"Operations","text":"","category":"section"},{"location":"models/GaussianNBClassifier_NaiveBayes/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"predict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above. Predictions are probabilistic.\npredict_mode(mach, Xnew): Return the mode of above predictions.","category":"page"},{"location":"models/GaussianNBClassifier_NaiveBayes/#Fitted-parameters","page":"GaussianNBClassifier","title":"Fitted parameters","text":"","category":"section"},{"location":"models/GaussianNBClassifier_NaiveBayes/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/GaussianNBClassifier_NaiveBayes/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"c_counts: A dictionary containing the observed count of each input class.\nc_stats: A dictionary containing observed statistics on each input class. Each class is represented by a DataStats object, with the following fields:\nn_vars: The number of variables used to describe the class's behavior.\nn_obs: The number of times the class is observed.\nobs_axis: The axis along which the observations were computed.\ngaussians: A per class dictionary of Gaussians, each representing the distribution of the class. 
Represented with type Distributions.MvNormal from the Distributions.jl package.\nn_obs: The total number of observations in the training data.","category":"page"},{"location":"models/GaussianNBClassifier_NaiveBayes/#Examples","page":"GaussianNBClassifier","title":"Examples","text":"","category":"section"},{"location":"models/GaussianNBClassifier_NaiveBayes/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"using MLJ\nGaussianNB = @load GaussianNBClassifier pkg=NaiveBayes\n\nX, y = @load_iris\nclf = GaussianNB()\nmach = machine(clf, X, y) |> fit!\n\nfitted_params(mach)\n\npreds = predict(mach, X) ## probabilistic predictions\npreds[1]\npredict_mode(mach, X) ## point predictions","category":"page"},{"location":"models/GaussianNBClassifier_NaiveBayes/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"See also MultinomialNBClassifier","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/#CatBoostRegressor_CatBoost","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"","category":"section"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"CatBoostRegressor","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"A model type for constructing a CatBoost regressor, based on CatBoost.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"CatBoostRegressor = @load CatBoostRegressor pkg=CatBoost","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"Do model = CatBoostRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in CatBoostRegressor(iterations=...).","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/#Training-data","page":"CatBoostRegressor","title":"Training data","text":"","category":"section"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"where","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, Finite, Textual; check column scitypes with schema(X). 
Textual columns will be passed to catboost as text_features, Multiclass columns will be passed to catboost as cat_features, and OrderedFactor columns will be converted to integers.\ny: the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"Train the machine with fit!(mach, rows=...).","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/#Hyper-parameters","page":"CatBoostRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"For more details on the catboost hyper-parameters, see the Python documentation: https://catboost.ai/en/docs/concepts/python-reference_catboostclassifier#parameters","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/#Operations","page":"CatBoostRegressor","title":"Operations","text":"","category":"section"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"predict(mach, Xnew): probabilistic predictions of the target given new features Xnew having the same scitype as X above.","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/#Accessor-functions","page":"CatBoostRegressor","title":"Accessor functions","text":"","category":"section"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"feature_importances(mach): return vector of feature importances, in the form of feature::Symbol => importance::Real pairs","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/#Fitted-parameters","page":"CatBoostRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"model: The Python CatBoostRegressor model","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/#Report","page":"CatBoostRegressor","title":"Report","text":"","category":"section"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"The fields of report(mach) are:","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"feature_importances: Vector{Pair{Symbol, Float64}} of feature importances","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/#Examples","page":"CatBoostRegressor","title":"Examples","text":"","category":"section"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"using CatBoost.MLJCatBoostInterface\nusing MLJ\n\nX = (\n duration = [1.5, 4.1, 5.0, 6.7], \n n_phone_calls = [4, 5, 6, 7], \n department = coerce([\"acc\", \"ops\", \"acc\", \"ops\"], Multiclass), \n)\ny = [2.0, 4.0, 6.0, 7.0]\n\nmodel = CatBoostRegressor(iterations=5)\nmach = machine(model, X, y)\nfit!(mach)\npreds = predict(mach, X)","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"See also catboost and the unwrapped model type 
CatBoost.CatBoostRegressor.","category":"page"},{"location":"models/RidgeClassifier_MLJScikitLearnInterface/#RidgeClassifier_MLJScikitLearnInterface","page":"RidgeClassifier","title":"RidgeClassifier","text":"","category":"section"},{"location":"models/RidgeClassifier_MLJScikitLearnInterface/","page":"RidgeClassifier","title":"RidgeClassifier","text":"RidgeClassifier","category":"page"},{"location":"models/RidgeClassifier_MLJScikitLearnInterface/","page":"RidgeClassifier","title":"RidgeClassifier","text":"A model type for constructing a ridge regression classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/RidgeClassifier_MLJScikitLearnInterface/","page":"RidgeClassifier","title":"RidgeClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/RidgeClassifier_MLJScikitLearnInterface/","page":"RidgeClassifier","title":"RidgeClassifier","text":"RidgeClassifier = @load RidgeClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/RidgeClassifier_MLJScikitLearnInterface/","page":"RidgeClassifier","title":"RidgeClassifier","text":"Do model = RidgeClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RidgeClassifier(alpha=...).","category":"page"},{"location":"models/RidgeClassifier_MLJScikitLearnInterface/#Hyper-parameters","page":"RidgeClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/RidgeClassifier_MLJScikitLearnInterface/","page":"RidgeClassifier","title":"RidgeClassifier","text":"alpha = 1.0\nfit_intercept = true\ncopy_X = true\nmax_iter = nothing\ntol = 0.001\nclass_weight = nothing\nsolver = auto\nrandom_state = nothing","category":"page"},{"location":"models/LassoRegressor_MLJScikitLearnInterface/#LassoRegressor_MLJScikitLearnInterface","page":"LassoRegressor","title":"LassoRegressor","text":"","category":"section"},{"location":"models/LassoRegressor_MLJScikitLearnInterface/","page":"LassoRegressor","title":"LassoRegressor","text":"LassoRegressor","category":"page"},{"location":"models/LassoRegressor_MLJScikitLearnInterface/","page":"LassoRegressor","title":"LassoRegressor","text":"A model type for constructing a lasso regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LassoRegressor_MLJScikitLearnInterface/","page":"LassoRegressor","title":"LassoRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LassoRegressor_MLJScikitLearnInterface/","page":"LassoRegressor","title":"LassoRegressor","text":"LassoRegressor = @load LassoRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/LassoRegressor_MLJScikitLearnInterface/","page":"LassoRegressor","title":"LassoRegressor","text":"Do model = LassoRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in LassoRegressor(alpha=...).","category":"page"},{"location":"models/LassoRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"LassoRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/LassoRegressor_MLJScikitLearnInterface/","page":"LassoRegressor","title":"LassoRegressor","text":"alpha = 1.0\nfit_intercept = true\nprecompute = false\ncopy_X = true\nmax_iter = 1000\ntol = 0.0001\nwarm_start = false\npositive = false\nrandom_state = nothing\nselection = cyclic","category":"page"},{"location":"models/KDEDetector_OutlierDetectionPython/#KDEDetector_OutlierDetectionPython","page":"KDEDetector","title":"KDEDetector","text":"","category":"section"},{"location":"models/KDEDetector_OutlierDetectionPython/","page":"KDEDetector","title":"KDEDetector","text":"KDEDetector(bandwidth=1.0,\n algorithm=\"auto\",\n leaf_size=30,\n metric=\"minkowski\",\n metric_params=None)","category":"page"},{"location":"models/KDEDetector_OutlierDetectionPython/","page":"KDEDetector","title":"KDEDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.kde","category":"page"},{"location":"models/ConstantClassifier_MLJModels/#ConstantClassifier_MLJModels","page":"ConstantClassifier","title":"ConstantClassifier","text":"","category":"section"},{"location":"models/ConstantClassifier_MLJModels/","page":"ConstantClassifier","title":"ConstantClassifier","text":"ConstantClassifier","category":"page"},{"location":"models/ConstantClassifier_MLJModels/","page":"ConstantClassifier","title":"ConstantClassifier","text":"This \"dummy\" probabilistic predictor always returns the same distribution, irrespective of the provided input pattern. The distribution d returned is the UnivariateFinite distribution based on frequency of classes observed in the training target data. So, pdf(d, level) is the relative frequency with which the training target takes on the value level. Use predict_mode instead of predict to obtain the training target mode. 
For more on the UnivariateFinite type, see the CategoricalDistributions.jl package.","category":"page"},{"location":"models/ConstantClassifier_MLJModels/","page":"ConstantClassifier","title":"ConstantClassifier","text":"Almost any reasonable model is expected to outperform ConstantClassifier, which is used almost exclusively for testing and establishing performance baselines.","category":"page"},{"location":"models/ConstantClassifier_MLJModels/","page":"ConstantClassifier","title":"ConstantClassifier","text":"In MLJ (or MLJModels) do model = ConstantClassifier() to construct an instance.","category":"page"},{"location":"models/ConstantClassifier_MLJModels/#Training-data","page":"ConstantClassifier","title":"Training data","text":"","category":"section"},{"location":"models/ConstantClassifier_MLJModels/","page":"ConstantClassifier","title":"ConstantClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/ConstantClassifier_MLJModels/","page":"ConstantClassifier","title":"ConstantClassifier","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/ConstantClassifier_MLJModels/","page":"ConstantClassifier","title":"ConstantClassifier","text":"Here:","category":"page"},{"location":"models/ConstantClassifier_MLJModels/","page":"ConstantClassifier","title":"ConstantClassifier","text":"X is any table of input features (eg, a DataFrame)\ny is the target, which can be any AbstractVector whose element scitype is Finite; check the scitype with schema(y)","category":"page"},{"location":"models/ConstantClassifier_MLJModels/","page":"ConstantClassifier","title":"ConstantClassifier","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/ConstantClassifier_MLJModels/#Hyper-parameters","page":"ConstantClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/ConstantClassifier_MLJModels/","page":"ConstantClassifier","title":"ConstantClassifier","text":"None.","category":"page"},{"location":"models/ConstantClassifier_MLJModels/#Operations","page":"ConstantClassifier","title":"Operations","text":"","category":"section"},{"location":"models/ConstantClassifier_MLJModels/","page":"ConstantClassifier","title":"ConstantClassifier","text":"predict(mach, Xnew): Return predictions of the target given features Xnew (which for this model are ignored). 
Predictions are probabilistic.\npredict_mode(mach, Xnew): Return the mode of the probabilistic predictions returned above.","category":"page"},{"location":"models/ConstantClassifier_MLJModels/#Fitted-parameters","page":"ConstantClassifier","title":"Fitted parameters","text":"","category":"section"},{"location":"models/ConstantClassifier_MLJModels/","page":"ConstantClassifier","title":"ConstantClassifier","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/ConstantClassifier_MLJModels/","page":"ConstantClassifier","title":"ConstantClassifier","text":"target_distribution: The distribution fit to the supplied target data.","category":"page"},{"location":"models/ConstantClassifier_MLJModels/#Examples","page":"ConstantClassifier","title":"Examples","text":"","category":"section"},{"location":"models/ConstantClassifier_MLJModels/","page":"ConstantClassifier","title":"ConstantClassifier","text":"using MLJ\n\nclf = ConstantClassifier()\n\nX, y = @load_crabs ## a table and a categorical vector\nmach = machine(clf, X, y) |> fit!\n\nfitted_params(mach)\n\nXnew = (;FL = [8.1, 24.8, 7.2],\n RW = [5.1, 25.7, 6.4],\n CL = [15.9, 46.7, 14.3],\n CW = [18.7, 59.7, 12.2],\n BD = [6.2, 23.6, 8.4],)\n\n## probabilistic predictions:\nyhat = predict(mach, Xnew)\nyhat[1]\n\n## raw probabilities:\npdf.(yhat, \"B\")\n\n## probability matrix:\nL = levels(y)\npdf(yhat, L)\n\n## point predictions:\npredict_mode(mach, Xnew)","category":"page"},{"location":"models/ConstantClassifier_MLJModels/","page":"ConstantClassifier","title":"ConstantClassifier","text":"See also ConstantRegressor","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/#ClusterUndersampler_Imbalance","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"","category":"section"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"Initiate a cluster undersampling model with the given hyper-parameters.","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"ClusterUndersampler","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"A model type for constructing a cluster undersampler, based on Imbalance.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"ClusterUndersampler = @load ClusterUndersampler pkg=Imbalance","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"Do model = ClusterUndersampler() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ClusterUndersampler(mode=...).","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"ClusterUndersampler implements clustering undersampling as presented in Wei-Chao, L., Chih-Fong, T., Ya-Han, H., & Jing-Shang, J. (2017). Clustering-based undersampling in class-imbalanced data. Information Sciences, 409–410, 17–26. 
K-means is used as the clustering algorithm.","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/#Training-data","page":"ClusterUndersampler","title":"Training data","text":"","category":"section"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"In MLJ or MLJBase, wrap the model in a machine by \tmach = machine(model)","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"There is no need to provide any data here because the model is a static transformer.","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"Likewise, there is no need to fit!(mach). ","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"For default values of the hyper-parameters, model can be constructed with model = ClusterUndersampler().","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/#Hyperparameters","page":"ClusterUndersampler","title":"Hyperparameters","text":"","category":"section"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"mode::AbstractString=\"nearest\": If \"center\" then the undersampled data will consist of the centroids of","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"each cluster found; if `\"nearest\"` then it will consist of the nearest neighbor of each centroid.","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"ratios=1.0: A parameter that controls the amount of undersampling to be done for each class\nCan be a float and in this case each class will be undersampled to the size of the minority class times the float. By default, all classes are undersampled to the size of the minority class\nCan be a dictionary mapping each class label to the float ratio for that class\nmaxiter::Integer=100: Maximum number of iterations to run K-means\nrng::Integer=42: Random number generator seed. 
Must be an integer.","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/#Transform-Inputs","page":"ClusterUndersampler","title":"Transform Inputs","text":"","category":"section"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"X: A matrix or table of floats where each row is an observation from the dataset\ny: An abstract vector of labels (e.g., strings) that correspond to the observations in X","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/#Transform-Outputs","page":"ClusterUndersampler","title":"Transform Outputs","text":"","category":"section"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"X_under: A matrix or table that includes the data after undersampling depending on whether the input X is a matrix or table respectively\ny_under: An abstract vector of labels corresponding to X_under","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/#Operations","page":"ClusterUndersampler","title":"Operations","text":"","category":"section"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"transform(mach, X, y): resample the data X and y using ClusterUndersampler, returning the undersampled versions","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/#Example","page":"ClusterUndersampler","title":"Example","text":"","category":"section"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"using MLJ\nimport Imbalance\n\n## set probability of each class\nclass_probs = [0.5, 0.2, 0.3] \nnum_rows, num_continuous_feats = 100, 5\n## generate a table and categorical vector accordingly\nX, y = Imbalance.generate_imbalanced_data(num_rows, num_continuous_feats; \n class_probs, rng=42) \n \njulia> Imbalance.checkbalance(y; ref=\"minority\")\n 1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%) \n 2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 33 (173.7%) \n 0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (252.6%) \n\n## load cluster_undersampling\nClusterUndersampler = @load ClusterUndersampler pkg=Imbalance\n\n## wrap the model in a machine\nundersampler = ClusterUndersampler(mode=\"nearest\", \n ratios=Dict(0=>1.0, 1=> 1.0, 2=>1.0), rng=42)\nmach = machine(undersampler)\n\n## provide the data to transform (there is nothing to fit)\nX_under, y_under = transform(mach, X, y)\n\n \njulia> Imbalance.checkbalance(y_under; ref=\"minority\")\n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%) \n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%) \n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%)","category":"page"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/#MultitargetRidgeRegressor_MultivariateStats","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"","category":"section"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"MultitargetRidgeRegressor","category":"page"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"A model type for constructing a multitarget ridge regressor, based on MultivariateStats.jl, and implementing the MLJ model 
interface.","category":"page"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"MultitargetRidgeRegressor = @load MultitargetRidgeRegressor pkg=MultivariateStats","category":"page"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"Do model = MultitargetRidgeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetRidgeRegressor(lambda=...).","category":"page"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"Multi-target ridge regression adds a quadratic penalty term to multi-target least squares regression, for regularization. Ridge regression is particularly useful in the case of multicollinearity. In this case, the output represents a response vector. Options exist to specify a bias term, and to adjust the strength of the penalty term.","category":"page"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/#Training-data","page":"MultitargetRidgeRegressor","title":"Training data","text":"","category":"section"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"Here:","category":"page"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).\ny is the target, which can be any table of responses whose element scitype is Continuous; check the scitype with scitype(y).","category":"page"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/#Hyper-parameters","page":"MultitargetRidgeRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"lambda=1.0: Is the non-negative parameter for the regularization strength. 
If lambda is 0, ridge regression is equivalent to linear least squares regression, and as lambda approaches infinity, all the linear coefficients approach 0.\nbias=true: Include the bias term if true, otherwise fit without bias term.","category":"page"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/#Operations","page":"MultitargetRidgeRegressor","title":"Operations","text":"","category":"section"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"predict(mach, Xnew): Return predictions of the target given new features Xnew, which should have the same scitype as X above.","category":"page"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/#Fitted-parameters","page":"MultitargetRidgeRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"coefficients: The linear coefficients determined by the model.\nintercept: The intercept determined by the model.","category":"page"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/#Examples","page":"MultitargetRidgeRegressor","title":"Examples","text":"","category":"section"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"using MLJ\nusing DataFrames\n\nRidgeRegressor = @load MultitargetRidgeRegressor pkg=MultivariateStats\n\nX, y = make_regression(100, 6; n_targets = 2) ## a table and a table (synthetic data)\n\nridge_regressor = RidgeRegressor(lambda=1.5)\nmach = machine(ridge_regressor, X, y) |> fit!\n\nXnew, _ = make_regression(3, 6)\nyhat = predict(mach, Xnew) ## new predictions","category":"page"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"See also LinearRegressor, MultitargetLinearRegressor, RidgeRegressor","category":"page"},{"location":"frequently_asked_questions/#Frequently-Asked-Questions","page":"FAQ","title":"Frequently Asked Questions","text":"","category":"section"},{"location":"frequently_asked_questions/#Julia-already-has-a-great-machine-learning-toolbox,-ScikitLearn.jl.-Why-MLJ?","page":"FAQ","title":"Julia already has a great machine learning toolbox, ScikitLearn.jl. Why MLJ?","text":"","category":"section"},{"location":"frequently_asked_questions/","page":"FAQ","title":"FAQ","text":"An alternative machine learning toolbox for Julia users is ScikitLearn.jl. Initially intended as a Julia wrapper for the popular Python library scikit-learn, ScikitLearn.jl also allows machine learning algorithms written in Julia to implement its API. Meta-algorithms (systematic tuning, pipelining, etc) remain Python-wrapped code, however.","category":"page"},{"location":"frequently_asked_questions/","page":"FAQ","title":"FAQ","text":"While ScikitLearn.jl provides the Julia user with access to a mature and large library of machine learning models, the scikit-learn API on which it is modeled, dating back to 2007, is not likely to evolve significantly in the future. 
MLJ enjoys (or will enjoy) several features that should make it an attractive alternative in the longer term:","category":"page"},{"location":"frequently_asked_questions/","page":"FAQ","title":"FAQ","text":"One language. ScikitLearn.jl wraps Python code, which in turn wraps C code for performance-critical routines. A Julia machine learning algorithm that implements the MLJ model interface is 100% Julia. Writing code in Julia is almost as fast as Python and well-written Julia code runs almost as fast as C. Additionally, a single language design provides superior interoperability. For example, one can implement: (i) gradient-descent tuning of hyperparameters, using automatic differentiation libraries such as Flux.jl; and (ii) GPU performance boosts without major code refactoring, using CuArrays.jl.\nRegistry for model metadata. In ScikitLearn.jl the list of available models, as well as model metadata (whether a model handles categorical inputs, whether it can make probabilistic predictions, etc) must be gleaned from the documentation. In MLJ, this information is more structured and is accessible to MLJ via a searchable model registry (without the models needing to be loaded).\nFlexible API for model composition. Pipelines in scikit-learn are more of an afterthought than an integral part of the original design. By contrast, MLJ's user-interaction API was predicated on the requirements of a flexible \"learning network\" API, one that allows models to be connected in essentially arbitrary ways (such as Wolpert model stacks). Networks can be built and tested in stages before being exported as first-class stand-alone models. Networks feature \"smart\" training (only necessary components are retrained after parameter changes) and will eventually be trainable using a DAG scheduler.\nClean probabilistic API. The scikit-learn API does not specify a universal standard for the form of probabilistic predictions. By fixing a probabilistic API along the lines of the skpro project, MLJ aims to improve support for Bayesian statistics and probabilistic graphical models.\nUniversal adoption of categorical data types. Python's scientific array library NumPy has no dedicated data type for representing categorical data (i.e., no type that tracks the pool of all possible classes). Generally, scikit-learn models deal with this by requiring data to be relabeled as integers. However, the naive user trains a model on relabeled categorical data only to discover that evaluation on a test set crashes their code because a categorical feature takes on a value not observed in training. MLJ mitigates such issues by insisting on the use of categorical data types, and by insisting that MLJ model implementations preserve the class pools. 
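For instance, here is a minimal sketch (using CategoricalArrays.jl, the package underlying MLJ's categorical data handling, and not drawn from the original FAQ) of how a class pool survives subsetting:\n\nusing CategoricalArrays\ny = categorical([\"a\", \"a\", \"b\", \"c\"])\nytrain = y[1:3]   # the class \"c\" does not occur in this subset\nlevels(ytrain)    # [\"a\", \"b\", \"c\"]: \"c\" is still tracked in the pool\n\n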
If, for example, a training target contains classes in the pool that do not appear in the training set, a probabilistic prediction will nevertheless predict a distribution whose support includes the missing class, but which is appropriately weighted with probability zero.","category":"page"},{"location":"frequently_asked_questions/","page":"FAQ","title":"FAQ","text":"Finally, we note that a large number of ScikitLearn.jl models are now wrapped for use in MLJ.","category":"page"},{"location":"models/AffinityPropagation_MLJScikitLearnInterface/#AffinityPropagation_MLJScikitLearnInterface","page":"AffinityPropagation","title":"AffinityPropagation","text":"","category":"section"},{"location":"models/AffinityPropagation_MLJScikitLearnInterface/","page":"AffinityPropagation","title":"AffinityPropagation","text":"AffinityPropagation","category":"page"},{"location":"models/AffinityPropagation_MLJScikitLearnInterface/","page":"AffinityPropagation","title":"AffinityPropagation","text":"A model type for constructing an Affinity Propagation Clustering of data, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/AffinityPropagation_MLJScikitLearnInterface/","page":"AffinityPropagation","title":"AffinityPropagation","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/AffinityPropagation_MLJScikitLearnInterface/","page":"AffinityPropagation","title":"AffinityPropagation","text":"AffinityPropagation = @load AffinityPropagation pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/AffinityPropagation_MLJScikitLearnInterface/","page":"AffinityPropagation","title":"AffinityPropagation","text":"Do model = AffinityPropagation() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in AffinityPropagation(damping=...).","category":"page"},{"location":"models/AffinityPropagation_MLJScikitLearnInterface/#Hyper-parameters","page":"AffinityPropagation","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/AffinityPropagation_MLJScikitLearnInterface/","page":"AffinityPropagation","title":"AffinityPropagation","text":"damping = 0.5\nmax_iter = 200\nconvergence_iter = 15\ncopy = true\npreference = nothing\naffinity = euclidean\nverbose = false","category":"page"},{"location":"more_on_probabilistic_predictors/#More-on-Probabilistic-Predictors","page":"More on Probabilistic Predictors","title":"More on Probabilistic Predictors","text":"","category":"section"},{"location":"more_on_probabilistic_predictors/","page":"More on Probabilistic Predictors","title":"More on Probabilistic Predictors","text":"Although one can call predict_mode on a probabilistic binary classifier to get deterministic predictions, a more flexible strategy is to wrap the model using BinaryThresholdPredictor, as this allows the user to specify the threshold probability for predicting a positive class. 
This wrapping converts a probabilistic classifier into a deterministic one.","category":"page"},{"location":"more_on_probabilistic_predictors/","page":"More on Probabilistic Predictors","title":"More on Probabilistic Predictors","text":"The positive class is always the second class returned when calling levels on the training target y.","category":"page"},{"location":"more_on_probabilistic_predictors/","page":"More on Probabilistic Predictors","title":"More on Probabilistic Predictors","text":"MLJModels.BinaryThresholdPredictor","category":"page"},{"location":"more_on_probabilistic_predictors/#MLJModels.BinaryThresholdPredictor","page":"More on Probabilistic Predictors","title":"MLJModels.BinaryThresholdPredictor","text":"BinaryThresholdPredictor(model; threshold=0.5)\n\nWrap the Probabilistic model, model, assumed to support binary classification, as a Deterministic model, by applying the specified threshold to the positive class probability. In addition to conventional supervised classifiers, it can also be applied to outlier detection models that predict normalized scores - in the form of appropriate UnivariateFinite distributions - that is, models that subtype AbstractProbabilisticUnsupervisedDetector or AbstractProbabilisticSupervisedDetector.\n\nBy convention the positive class is the second class returned by levels(y), where y is the target.\n\nIf threshold=0.5 then calling predict on the wrapped model is equivalent to calling predict_mode on the atomic model.\n\nExample\n\nBelow is an application to the well-known Pima Indian diabetes dataset, including optimization of the threshold parameter, with high balanced accuracy as the objective. The target class distribution is 268 positives to 500 negatives.\n\nLoading the data:\n\nusing MLJ, Random\nrng = Xoshiro(123)\n\ndiabetes = OpenML.load(43582)\noutcome, X = unpack(diabetes, ==(:Outcome), rng=rng);\ny = coerce(Int.(outcome), OrderedFactor);\n\nChoosing a probabilistic classifier:\n\nEvoTreesClassifier = @load EvoTreesClassifier\nprob_predictor = EvoTreesClassifier()\n\nWrapping in BinaryThresholdPredictor to get a deterministic classifier with threshold as a new hyperparameter:\n\npoint_predictor = BinaryThresholdPredictor(prob_predictor, threshold=0.6)\nmach = machine(point_predictor, X, y) |> fit!\npredict(mach, X)[1:3] # [0, 0, 0]\n\nEstimating performance:\n\nbalanced = BalancedAccuracy(adjusted=true)\ne = evaluate!(mach, resampling=CV(nfolds=6), measures=[balanced, accuracy])\ne.measurement[1] # 0.405 ± 0.089\n\nWrapping in tuning strategy to learn threshold that maximizes balanced accuracy:\n\nr = range(point_predictor, :threshold, lower=0.1, upper=0.9)\ntuned_point_predictor = TunedModel(\n point_predictor,\n tuning=RandomSearch(rng=rng),\n resampling=CV(nfolds=6),\n range = r,\n measure=balanced,\n n=30,\n)\nmach2 = machine(tuned_point_predictor, X, y) |> fit!\noptimized_point_predictor = report(mach2).best_model\noptimized_point_predictor.threshold # 0.260\npredict(mach2, X)[1:3] # [1, 1, 0]\n\nEstimating the performance of the auto-thresholding model (nested resampling here):\n\ne = evaluate!(mach2, resampling=CV(nfolds=6), measure=[balanced, accuracy])\ne.measurement[1] # 0.477 ± 
0.110\n\n\n\n\n\n","category":"type"},{"location":"models/LogisticCVClassifier_MLJScikitLearnInterface/#LogisticCVClassifier_MLJScikitLearnInterface","page":"LogisticCVClassifier","title":"LogisticCVClassifier","text":"","category":"section"},{"location":"models/LogisticCVClassifier_MLJScikitLearnInterface/","page":"LogisticCVClassifier","title":"LogisticCVClassifier","text":"LogisticCVClassifier","category":"page"},{"location":"models/LogisticCVClassifier_MLJScikitLearnInterface/","page":"LogisticCVClassifier","title":"LogisticCVClassifier","text":"A model type for constructing a logistic regression classifier with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LogisticCVClassifier_MLJScikitLearnInterface/","page":"LogisticCVClassifier","title":"LogisticCVClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LogisticCVClassifier_MLJScikitLearnInterface/","page":"LogisticCVClassifier","title":"LogisticCVClassifier","text":"LogisticCVClassifier = @load LogisticCVClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/LogisticCVClassifier_MLJScikitLearnInterface/","page":"LogisticCVClassifier","title":"LogisticCVClassifier","text":"Do model = LogisticCVClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LogisticCVClassifier(Cs=...).","category":"page"},{"location":"models/LogisticCVClassifier_MLJScikitLearnInterface/#Hyper-parameters","page":"LogisticCVClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/LogisticCVClassifier_MLJScikitLearnInterface/","page":"LogisticCVClassifier","title":"LogisticCVClassifier","text":"Cs = 10\nfit_intercept = true\ncv = 5\ndual = false\npenalty = l2\nscoring = nothing\nsolver = lbfgs\ntol = 0.0001\nmax_iter = 100\nclass_weight = nothing\nn_jobs = nothing\nverbose = 0\nrefit = true\nintercept_scaling = 1.0\nmulti_class = auto\nrandom_state = nothing\nl1_ratios = nothing","category":"page"},{"location":"models/ROSE_Imbalance/#ROSE_Imbalance","page":"ROSE","title":"ROSE","text":"","category":"section"},{"location":"models/ROSE_Imbalance/","page":"ROSE","title":"ROSE","text":"Initiate a ROSE model with the given hyper-parameters.","category":"page"},{"location":"models/ROSE_Imbalance/","page":"ROSE","title":"ROSE","text":"ROSE","category":"page"},{"location":"models/ROSE_Imbalance/","page":"ROSE","title":"ROSE","text":"A model type for constructing a rose, based on Imbalance.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/ROSE_Imbalance/","page":"ROSE","title":"ROSE","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/ROSE_Imbalance/","page":"ROSE","title":"ROSE","text":"ROSE = @load ROSE pkg=Imbalance","category":"page"},{"location":"models/ROSE_Imbalance/","page":"ROSE","title":"ROSE","text":"Do model = ROSE() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ROSE(s=...).","category":"page"},{"location":"models/ROSE_Imbalance/","page":"ROSE","title":"ROSE","text":"ROSE implements the ROSE (Random Oversampling Examples) algorithm to correct for class imbalance as in G Menardi, N. 
Torelli, “Training and assessing classification rules with imbalanced data,” Data Mining and Knowledge Discovery, 28(1), pp.92-122, 2014.","category":"page"},{"location":"models/ROSE_Imbalance/#Training-data","page":"ROSE","title":"Training data","text":"","category":"section"},{"location":"models/ROSE_Imbalance/","page":"ROSE","title":"ROSE","text":"In MLJ or MLJBase, wrap the model in a machine by mach = machine(model)","category":"page"},{"location":"models/ROSE_Imbalance/","page":"ROSE","title":"ROSE","text":"There is no need to provide any data here because the model is a static transformer.","category":"page"},{"location":"models/ROSE_Imbalance/","page":"ROSE","title":"ROSE","text":"Likewise, there is no need to fit!(mach). ","category":"page"},{"location":"models/ROSE_Imbalance/","page":"ROSE","title":"ROSE","text":"For default values of the hyper-parameters, model can be constructed by model = ROSE()","category":"page"},{"location":"models/ROSE_Imbalance/#Hyperparameters","page":"ROSE","title":"Hyperparameters","text":"","category":"section"},{"location":"models/ROSE_Imbalance/","page":"ROSE","title":"ROSE","text":"s::float: A parameter that proportionally controls the bandwidth of the Gaussian kernel\nratios=1.0: A parameter that controls the amount of oversampling to be done for each class\nCan be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class\nCan be a dictionary mapping each class label to the float ratio for that class\nrng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister`.","category":"page"},{"location":"models/ROSE_Imbalance/#Transform-Inputs","page":"ROSE","title":"Transform Inputs","text":"","category":"section"},{"location":"models/ROSE_Imbalance/","page":"ROSE","title":"ROSE","text":"X: A matrix or table of floats where each row is an observation from the dataset\ny: An abstract vector of labels (e.g., strings) that correspond to the observations in X","category":"page"},{"location":"models/ROSE_Imbalance/#Transform-Outputs","page":"ROSE","title":"Transform Outputs","text":"","category":"section"},{"location":"models/ROSE_Imbalance/","page":"ROSE","title":"ROSE","text":"Xover: A matrix or table that includes original data and the new observations due to oversampling. 
depending on whether the input X is a matrix or table respectively\nyover: An abstract vector of labels corresponding to Xover","category":"page"},{"location":"models/ROSE_Imbalance/#Operations","page":"ROSE","title":"Operations","text":"","category":"section"},{"location":"models/ROSE_Imbalance/","page":"ROSE","title":"ROSE","text":"transform(mach, X, y): resample the data X and y using ROSE, returning both the new and original observations","category":"page"},{"location":"models/ROSE_Imbalance/#Example","page":"ROSE","title":"Example","text":"","category":"section"},{"location":"models/ROSE_Imbalance/","page":"ROSE","title":"ROSE","text":"using MLJ\nimport Imbalance\n\n## set probability of each class\nclass_probs = [0.5, 0.2, 0.3] \nnum_rows, num_continuous_feats = 100, 5\n## generate a table and categorical vector accordingly\nX, y = Imbalance.generate_imbalanced_data(num_rows, num_continuous_feats; \n class_probs, rng=42) \n\njulia> Imbalance.checkbalance(y)\n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (39.6%) \n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 33 (68.8%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%) \n\n## load ROSE\nROSE = @load ROSE pkg=Imbalance\n\n## wrap the model in a machine\noversampler = ROSE(s=0.3, ratios=Dict(0=>1.0, 1=> 0.9, 2=>0.8), rng=42)\nmach = machine(oversampler)\n\n## provide the data to transform (there is nothing to fit)\nXover, yover = transform(mach, X, y)\n\njulia> Imbalance.checkbalance(yover)\n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 38 (79.2%) \n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 43 (89.6%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%) ","category":"page"},{"location":"simple_user_defined_models/#Simple-User-Defined-Models","page":"Simple User Defined Models","title":"Simple User Defined Models","text":"","category":"section"},{"location":"simple_user_defined_models/","page":"Simple User Defined Models","title":"Simple User Defined Models","text":"To quickly implement a new supervised model in MLJ, it suffices to:","category":"page"},{"location":"simple_user_defined_models/","page":"Simple User Defined Models","title":"Simple User Defined Models","text":"Define a mutable struct to store hyperparameters. This is either a subtype of Probabilistic or Deterministic, depending on whether probabilistic or ordinary point predictions are intended. This struct is the model.\nDefine a fit method, dispatched on the model, returning learned parameters, also known as the fitresult.\nDefine a predict method, dispatched on the model, and the fitresult, to return predictions on new patterns.","category":"page"},{"location":"simple_user_defined_models/","page":"Simple User Defined Models","title":"Simple User Defined Models","text":"In the examples below, the training input X of fit, and the new input Xnew passed to predict, are tables. Each training target y is an AbstractVector.","category":"page"},{"location":"simple_user_defined_models/","page":"Simple User Defined Models","title":"Simple User Defined Models","text":"The predictions returned by predict have the same form as y for deterministic models, but are Vectors of distributions for probabilistic models.","category":"page"},{"location":"simple_user_defined_models/","page":"Simple User Defined Models","title":"Simple User Defined Models","text":"Advanced model functionality not addressed here includes: (i) optional update method to avoid redundant calculations when calling fit! 
on machines a second time; (ii) reporting extra training-related statistics; (iii) exposing model-specific functionality; (iv) checking the scientific type of data passed to your model in machine construction; and (v) checking the validity of hyperparameter values. All this is described in Adding Models for General Use.","category":"page"},{"location":"simple_user_defined_models/","page":"Simple User Defined Models","title":"Simple User Defined Models","text":"For an unsupervised model, implement transform and, optionally, inverse_transform using the same signature as predict below.","category":"page"},{"location":"simple_user_defined_models/#A-simple-deterministic-regressor","page":"Simple User Defined Models","title":"A simple deterministic regressor","text":"","category":"section"},{"location":"simple_user_defined_models/","page":"Simple User Defined Models","title":"Simple User Defined Models","text":"Here's a quick-and-dirty implementation of a ridge regressor with no intercept:","category":"page"},{"location":"simple_user_defined_models/","page":"Simple User Defined Models","title":"Simple User Defined Models","text":"using MLJ; color_off() # hide\nimport MLJBase\nusing LinearAlgebra\n\nmutable struct MyRegressor <: MLJBase.Deterministic\n lambda::Float64\nend\nMyRegressor(; lambda=0.1) = MyRegressor(lambda)\n\n# fit returns coefficients minimizing a penalized rms loss function:\nfunction MLJBase.fit(model::MyRegressor, verbosity, X, y)\n x = MLJBase.matrix(X) # convert table to matrix\n fitresult = (x'x + model.lambda*I)\\(x'y) # the coefficients\n cache = nothing\n report = nothing\n return fitresult, cache, report\nend\n\n# predict uses coefficients to make a new prediction:\nMLJBase.predict(::MyRegressor, fitresult, Xnew) = MLJBase.matrix(Xnew) * fitresult\nnothing # hide","category":"page"},{"location":"simple_user_defined_models/","page":"Simple User Defined Models","title":"Simple User Defined Models","text":"After loading this code, all MLJ's basic meta-algorithms can be applied to MyRegressor:","category":"page"},{"location":"simple_user_defined_models/","page":"Simple User Defined Models","title":"Simple User Defined Models","text":"using MLJ # hide\nX, y = @load_boston;\nmodel = MyRegressor(lambda=1.0)\nregressor = machine(model, X, y)\nevaluate!(regressor, resampling=CV(), measure=rms, verbosity=0)","category":"page"},{"location":"simple_user_defined_models/#A-simple-probabilistic-classifier","page":"Simple User Defined Models","title":"A simple probabilistic classifier","text":"","category":"section"},{"location":"simple_user_defined_models/","page":"Simple User Defined Models","title":"Simple User Defined Models","text":"The following probabilistic model simply fits a probability distribution to the Multiclass training target (i.e., ignores X) and returns this pdf for any new pattern:","category":"page"},{"location":"simple_user_defined_models/","page":"Simple User Defined Models","title":"Simple User Defined Models","text":"using MLJ # hide\nimport MLJBase\nimport Distributions\n\nstruct MyClassifier <: MLJBase.Probabilistic\nend\n\n# `fit` ignores the inputs X and returns the training target y\n# probability distribution:\nfunction MLJBase.fit(model::MyClassifier, verbosity, X, y)\n fitresult = Distributions.fit(MLJBase.UnivariateFinite, y)\n cache = nothing\n report = nothing\n return fitresult, cache, report\nend\n\n# `predict` returns the passed fitresult (pdf) for all new patterns:\nMLJBase.predict(model::MyClassifier, fitresult, Xnew) =\n [fitresult for r in 
1:nrows(Xnew)]","category":"page"},{"location":"simple_user_defined_models/","page":"Simple User Defined Models","title":"Simple User Defined Models","text":"X, y = @load_iris;\nmach = machine(MyClassifier(), X, y) |> fit!;\npredict(mach, selectrows(X, 1:2))","category":"page"},{"location":"models/BayesianRidgeRegressor_MLJScikitLearnInterface/#BayesianRidgeRegressor_MLJScikitLearnInterface","page":"BayesianRidgeRegressor","title":"BayesianRidgeRegressor","text":"","category":"section"},{"location":"models/BayesianRidgeRegressor_MLJScikitLearnInterface/","page":"BayesianRidgeRegressor","title":"BayesianRidgeRegressor","text":"BayesianRidgeRegressor","category":"page"},{"location":"models/BayesianRidgeRegressor_MLJScikitLearnInterface/","page":"BayesianRidgeRegressor","title":"BayesianRidgeRegressor","text":"A model type for constructing a Bayesian ridge regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/BayesianRidgeRegressor_MLJScikitLearnInterface/","page":"BayesianRidgeRegressor","title":"BayesianRidgeRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/BayesianRidgeRegressor_MLJScikitLearnInterface/","page":"BayesianRidgeRegressor","title":"BayesianRidgeRegressor","text":"BayesianRidgeRegressor = @load BayesianRidgeRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/BayesianRidgeRegressor_MLJScikitLearnInterface/","page":"BayesianRidgeRegressor","title":"BayesianRidgeRegressor","text":"Do model = BayesianRidgeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BayesianRidgeRegressor(n_iter=...).","category":"page"},{"location":"models/BayesianRidgeRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"BayesianRidgeRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/BayesianRidgeRegressor_MLJScikitLearnInterface/","page":"BayesianRidgeRegressor","title":"BayesianRidgeRegressor","text":"n_iter = 300\ntol = 0.001\nalpha_1 = 1.0e-6\nalpha_2 = 1.0e-6\nlambda_1 = 1.0e-6\nlambda_2 = 1.0e-6\ncompute_score = false\nfit_intercept = true\ncopy_X = true\nverbose = false","category":"page"},{"location":"models/RidgeCVClassifier_MLJScikitLearnInterface/#RidgeCVClassifier_MLJScikitLearnInterface","page":"RidgeCVClassifier","title":"RidgeCVClassifier","text":"","category":"section"},{"location":"models/RidgeCVClassifier_MLJScikitLearnInterface/","page":"RidgeCVClassifier","title":"RidgeCVClassifier","text":"RidgeCVClassifier","category":"page"},{"location":"models/RidgeCVClassifier_MLJScikitLearnInterface/","page":"RidgeCVClassifier","title":"RidgeCVClassifier","text":"A model type for constructing a ridge regression classifier with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/RidgeCVClassifier_MLJScikitLearnInterface/","page":"RidgeCVClassifier","title":"RidgeCVClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/RidgeCVClassifier_MLJScikitLearnInterface/","page":"RidgeCVClassifier","title":"RidgeCVClassifier","text":"RidgeCVClassifier = @load RidgeCVClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/RidgeCVClassifier_MLJScikitLearnInterface/","page":"RidgeCVClassifier","title":"RidgeCVClassifier","text":"Do model = RidgeCVClassifier() to construct 
an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RidgeCVClassifier(alphas=...).","category":"page"},{"location":"models/RidgeCVClassifier_MLJScikitLearnInterface/#Hyper-parameters","page":"RidgeCVClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/RidgeCVClassifier_MLJScikitLearnInterface/","page":"RidgeCVClassifier","title":"RidgeCVClassifier","text":"alphas = [0.1, 1.0, 10.0]\nfit_intercept = true\nscoring = nothing\ncv = 5\nclass_weight = nothing\nstore_cv_values = false","category":"page"},{"location":"models/ICA_MultivariateStats/#ICA_MultivariateStats","page":"ICA","title":"ICA","text":"","category":"section"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"ICA","category":"page"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"A model type for constructing an independent component analysis model, based on MultivariateStats.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"ICA = @load ICA pkg=MultivariateStats","category":"page"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"Do model = ICA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ICA(outdim=...).","category":"page"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"Independent component analysis is a computational technique for separating a multivariate signal into additive subcomponents, with the assumption that the subcomponents are non-Gaussian and independent from each other.","category":"page"},{"location":"models/ICA_MultivariateStats/#Training-data","page":"ICA","title":"Training data","text":"","category":"section"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"mach = machine(model, X)","category":"page"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"Here:","category":"page"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).","category":"page"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/ICA_MultivariateStats/#Hyper-parameters","page":"ICA","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"outdim::Int=0: The number of independent components to recover, set automatically if 0.\nalg::Symbol=:fastica: The algorithm to use (only :fastica is supported at the moment).\nfun::Symbol=:tanh: The approximate neg-entropy function, one of :tanh, :gaus.\ndo_whiten::Bool=true: Whether or not to perform pre-whitening.\nmaxiter::Int=100: The maximum number of iterations.\ntol::Real=1e-6: The convergence tolerance for change in the unmixing matrix W.\nmean::Union{Nothing, Real, Vector{Float64}}=nothing: mean to use, if nothing 
(default) centering is computed and applied, if zero, no centering; otherwise a vector of means can be passed.\nwinit::Union{Nothing,Matrix{<:Real}}=nothing: Initial guess for the unmixing matrix W: either an empty matrix (for random initialization of W), a matrix of size m × k (if do_whiten is true), or a matrix of size m × k. Here m is the number of components (columns) of the input.","category":"page"},{"location":"models/ICA_MultivariateStats/#Operations","page":"ICA","title":"Operations","text":"","category":"section"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"transform(mach, Xnew): Return the component-separated version of input Xnew, which should have the same scitype as X above.","category":"page"},{"location":"models/ICA_MultivariateStats/#Fitted-parameters","page":"ICA","title":"Fitted parameters","text":"","category":"section"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"projection: The estimated component matrix.\nmean: The estimated mean vector.","category":"page"},{"location":"models/ICA_MultivariateStats/#Report","page":"ICA","title":"Report","text":"","category":"section"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"The fields of report(mach) are:","category":"page"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"indim: Dimension (number of columns) of the training data and new data to be transformed.\noutdim: Dimension of transformed data.\nmean: The mean of the untransformed training data, of length indim.","category":"page"},{"location":"models/ICA_MultivariateStats/#Examples","page":"ICA","title":"Examples","text":"","category":"section"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"using MLJ\n\nICA = @load ICA pkg=MultivariateStats\n\ntimes = range(0, 8, length=2000)\n\nsine_wave = sin.(2*times)\nsquare_wave = sign.(sin.(3*times))\nsawtooth_wave = map(t -> mod(2t, 2) - 1, times)\nsignals = hcat(sine_wave, square_wave, sawtooth_wave)\nnoisy_signals = signals + 0.2*randn(size(signals))\n\nmixing_matrix = [ 1 1 1; 0.5 2 1; 1.5 1 2]\nX = MLJ.table(noisy_signals*mixing_matrix)\n\nmodel = ICA(outdim = 3, tol=0.1)\nmach = machine(model, X) |> fit!\n\nX_unmixed = transform(mach, X)\n\nusing Plots\n\nplot(X.x1)\nplot(X.x2)\nplot(X.x3)\n\nplot(X_unmixed.x1)\nplot(X_unmixed.x2)\nplot(X_unmixed.x3)\n","category":"page"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"See also PCA, KernelPCA, FactorAnalysis, PPCA","category":"page"},{"location":"models/LarsCVRegressor_MLJScikitLearnInterface/#LarsCVRegressor_MLJScikitLearnInterface","page":"LarsCVRegressor","title":"LarsCVRegressor","text":"","category":"section"},{"location":"models/LarsCVRegressor_MLJScikitLearnInterface/","page":"LarsCVRegressor","title":"LarsCVRegressor","text":"LarsCVRegressor","category":"page"},{"location":"models/LarsCVRegressor_MLJScikitLearnInterface/","page":"LarsCVRegressor","title":"LarsCVRegressor","text":"A model type for constructing a least angle regressor with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LarsCVRegressor_MLJScikitLearnInterface/","page":"LarsCVRegressor","title":"LarsCVRegressor","text":"From MLJ, the type can be imported 
using","category":"page"},{"location":"models/LarsCVRegressor_MLJScikitLearnInterface/","page":"LarsCVRegressor","title":"LarsCVRegressor","text":"LarsCVRegressor = @load LarsCVRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/LarsCVRegressor_MLJScikitLearnInterface/","page":"LarsCVRegressor","title":"LarsCVRegressor","text":"Do model = LarsCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LarsCVRegressor(fit_intercept=...).","category":"page"},{"location":"models/LarsCVRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"LarsCVRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/LarsCVRegressor_MLJScikitLearnInterface/","page":"LarsCVRegressor","title":"LarsCVRegressor","text":"fit_intercept = true\nverbose = false\nmax_iter = 500\nnormalize = false\nprecompute = auto\ncv = 5\nmax_n_alphas = 1000\nn_jobs = nothing\neps = 2.220446049250313e-16\ncopy_X = true","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/#LogisticClassifier_MLJLinearModels","page":"LogisticClassifier","title":"LogisticClassifier","text":"","category":"section"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"LogisticClassifier","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"A model type for constructing a logistic classifier, based on MLJLinearModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"LogisticClassifier = @load LogisticClassifier pkg=MLJLinearModels","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"Do model = LogisticClassifier() to construct an instance with default hyper-parameters.","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"This model is more commonly known as \"logistic regression\". It is a standard classifier for both binary and multiclass classification. The objective function applies either a logistic loss (binary target) or multinomial (softmax) loss, and has a mixed L1/L2 penalty:","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"$","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"L(y, Xθ) + n⋅λ|θ|₂²/2 + n⋅γ|θ|₁ $","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":".","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"Here L is either MLJLinearModels.LogisticLoss or MLJLinearModels.MultiClassLoss, λ and γ indicate the strength of the L2 (resp. 
L1) regularization components and n is the number of training observations.","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"With scale_penalty_with_samples = false the objective function is instead","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"$","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"L(y, Xθ) + λ|θ|₂²/2 + γ|θ|₁ $","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":".","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/#Training-data","page":"LogisticClassifier","title":"Training data","text":"","category":"section"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"where:","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)\ny is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/#Hyperparameters","page":"LogisticClassifier","title":"Hyperparameters","text":"","category":"section"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"lambda::Real: strength of the regularizer if penalty is :l2 or :l1 and strength of the L2 regularizer if penalty is :en. Default: eps()\ngamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0\npenalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2\nfit_intercept::Bool: whether to fit the intercept or not. Default: true\npenalize_intercept::Bool: whether to penalize the intercept. Default: false\nscale_penalty_with_samples::Bool: whether to scale the penalty with the number of samples. Default: true\nsolver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S where S is one of: LBFGS, Newton, NewtonCG, ProxGrad; but subject to the following restrictions:\nIf penalty = :l2, ProxGrad is disallowed. Otherwise, ProxGrad is the only option.\nUnless scitype(y) <: Finite{2} (binary target) Newton is disallowed.\nIf solver = nothing (default) then ProxGrad(accel=true) (FISTA) is used, unless gamma = 0, in which case LBFGS() is used.\nSolver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) 
Default: nothing","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/#Example","page":"LogisticClassifier","title":"Example","text":"","category":"section"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"using MLJ\nX, y = make_blobs(centers = 2)\nmach = fit!(machine(LogisticClassifier(), X, y))\npredict(mach, X)\nfitted_params(mach)","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"See also MultinomialClassifier.","category":"page"},{"location":"models/BaggingRegressor_MLJScikitLearnInterface/#BaggingRegressor_MLJScikitLearnInterface","page":"BaggingRegressor","title":"BaggingRegressor","text":"","category":"section"},{"location":"models/BaggingRegressor_MLJScikitLearnInterface/","page":"BaggingRegressor","title":"BaggingRegressor","text":"BaggingRegressor","category":"page"},{"location":"models/BaggingRegressor_MLJScikitLearnInterface/","page":"BaggingRegressor","title":"BaggingRegressor","text":"A model type for constructing a bagging ensemble regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/BaggingRegressor_MLJScikitLearnInterface/","page":"BaggingRegressor","title":"BaggingRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/BaggingRegressor_MLJScikitLearnInterface/","page":"BaggingRegressor","title":"BaggingRegressor","text":"BaggingRegressor = @load BaggingRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/BaggingRegressor_MLJScikitLearnInterface/","page":"BaggingRegressor","title":"BaggingRegressor","text":"Do model = BaggingRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BaggingRegressor(estimator=...).","category":"page"},{"location":"models/BaggingRegressor_MLJScikitLearnInterface/","page":"BaggingRegressor","title":"BaggingRegressor","text":"A Bagging regressor is an ensemble meta-estimator that fits base regressors each on random subsets of the original dataset and then aggregate their individual predictions (either by voting or by averaging) to form a final prediction. 
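A minimal usage sketch follows (it assumes the scikit-learn default base estimator and the sklearn-style hyper-parameter n_estimators, and requires a working scikit-learn installation through MLJScikitLearnInterface.jl):\n\nusing MLJ\nBaggingRegressor = @load BaggingRegressor pkg=MLJScikitLearnInterface\nX, y = make_regression(100, 4)            ## synthetic table and target vector\nmodel = BaggingRegressor(n_estimators=50) ## bag 50 copies of the default base estimator\nmach = machine(model, X, y) |> fit!\nyhat = predict(mach, X)\n\n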
Such a meta-estimator can typically be used as a way to reduce the variance of a black-box estimator (e.g., a decision tree), by introducing randomization into its construction procedure and then making an ensemble out of it.","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/#KNNClassifier_NearestNeighborModels","page":"KNNClassifier","title":"KNNClassifier","text":"","category":"section"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"KNNClassifier","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"A model type for constructing a K-nearest neighbor classifier, based on NearestNeighborModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"KNNClassifier = @load KNNClassifier pkg=NearestNeighborModels","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"Do model = KNNClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KNNClassifier(K=...).","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"KNNClassifier implements the K-nearest neighbors classifier, a non-parametric algorithm that predicts a discrete class distribution associated with a new point by taking a vote over the classes of the k-nearest points. Each neighbor vote is assigned a weight based on the proximity of the neighbor point to the test point, according to a specified distance metric.","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"For more information about the weighting kernels, see the paper by Geler et al., Comparison of different weighting schemes for the kNN classifier on time-series data. 
","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/#Training-data","page":"KNNClassifier","title":"Training data","text":"","category":"section"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"OR","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"mach = machine(model, X, y, w)","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"Here:","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).\ny is the target, which can be any AbstractVector whose element scitype is <:Finite (<:Multiclass or <:OrderedFactor will do); check the scitype with scitype(y)\nw is the observation weights which can either be nothing (default) or an AbstractVector whose element scitype is Count or Continuous. This is different from weights kernel which is a model hyperparameter, see below.","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/#Hyper-parameters","page":"KNNClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"K::Int=5 : number of neighbors\nalgorithm::Symbol = :kdtree : one of (:kdtree, :brutetree, :balltree)\nmetric::Metric = Euclidean() : any Metric from Distances.jl for the distance between points. For algorithm = :kdtree only metrics which are instances of Union{Distances.Chebyshev, Distances.Cityblock, Distances.Euclidean, Distances.Minkowski, Distances.WeightedCityblock, Distances.WeightedEuclidean, Distances.WeightedMinkowski} are supported.\nleafsize::Int = algorithm == 10 : determines the number of points at which to stop splitting the tree. This option is ignored and always taken as 0 for algorithm = :brutetree, since brutetree isn't actually a tree.\nreorder::Bool = true : if true then points which are close in distance are placed close in memory. In this case, a copy of the original data will be made so that the original data is left unmodified. Setting this to true can significantly improve performance of the specified algorithm (except :brutetree). This option is ignored and always taken as false for algorithm = :brutetree.\nweights::KNNKernel=Uniform() : kernel used in assigning weights to the k-nearest neighbors for each observation. An instance of one of the types in list_kernels(). User-defined weighting functions can be passed by wrapping the function in a UserDefinedKernel kernel (do ?NearestNeighborModels.UserDefinedKernel for more info). 
If observation weights w are passed during machine construction then the weight assigned to each neighbor vote is the product of the kernel generated weight for that neighbor and the corresponding observation weight.","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/#Operations","page":"KNNClassifier","title":"Operations","text":"","category":"section"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"predict(mach, Xnew): Return predictions of the target given features Xnew, which should have same scitype as X above. Predictions are probabilistic but uncalibrated.\npredict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/#Fitted-parameters","page":"KNNClassifier","title":"Fitted parameters","text":"","category":"section"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"tree: An instance of either KDTree, BruteTree or BallTree depending on the value of the algorithm hyperparameter (See hyper-parameters section above). These are data structures that stores the training data with the view of making quicker nearest neighbor searches on test data points.","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/#Examples","page":"KNNClassifier","title":"Examples","text":"","category":"section"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"using MLJ\nKNNClassifier = @load KNNClassifier pkg=NearestNeighborModels\nX, y = @load_crabs; ## a table and a vector from the crabs dataset\n## view possible kernels\nNearestNeighborModels.list_kernels()\n## KNNClassifier instantiation\nmodel = KNNClassifier(weights = NearestNeighborModels.Inverse())\nmach = machine(model, X, y) |> fit! ## wrap model and required data in an MLJ machine and fit\ny_hat = predict(mach, X)\nlabels = predict_mode(mach, X)\n","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"See also MultitargetKNNClassifier","category":"page"},{"location":"models/KMedoids_Clustering/#KMedoids_Clustering","page":"KMedoids","title":"KMedoids","text":"","category":"section"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"KMedoids","category":"page"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"A model type for constructing a K-medoids clusterer, based on Clustering.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"KMedoids = @load KMedoids pkg=Clustering","category":"page"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"Do model = KMedoids() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in KMedoids(k=...).","category":"page"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"K-medoids is a clustering algorithm that works by finding k data points (called medoids) such that the total distance between each data point and the closest medoid is minimal.","category":"page"},{"location":"models/KMedoids_Clustering/#Training-data","page":"KMedoids","title":"Training data","text":"","category":"section"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"mach = machine(model, X)","category":"page"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"Here:","category":"page"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X)","category":"page"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/KMedoids_Clustering/#Hyper-parameters","page":"KMedoids","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"k=3: The number of centroids to use in clustering.\nmetric::SemiMetric=Distances.SqEuclidean: The metric used to calculate the clustering. Must have type PreMetric from Distances.jl.\ninit (defaults to :kmpp): how medoids should be initialized, could be one of the following:\n:kmpp: KMeans++\n:kmenc: K-medoids initialization based on centrality\n:rand: random\nan instance of Clustering.SeedingAlgorithm from Clustering.jl\nan integer vector of length k that provides the indices of points to use as initial medoids.\nSee documentation of Clustering.jl.","category":"page"},{"location":"models/KMedoids_Clustering/#Operations","page":"KMedoids","title":"Operations","text":"","category":"section"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"predict(mach, Xnew): return cluster label assignments, given new features Xnew having the same Scitype as X above.\ntransform(mach, Xnew): instead return the mean pairwise distances from new samples to the cluster centers.","category":"page"},{"location":"models/KMedoids_Clustering/#Fitted-parameters","page":"KMedoids","title":"Fitted parameters","text":"","category":"section"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"medoids: The coordinates of the cluster medoids.","category":"page"},{"location":"models/KMedoids_Clustering/#Report","page":"KMedoids","title":"Report","text":"","category":"section"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"The fields of report(mach) are:","category":"page"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"assignments: The cluster assignments of each point in the training data.\ncluster_labels: The labels assigned to each 
cluster.","category":"page"},{"location":"models/KMedoids_Clustering/#Examples","page":"KMedoids","title":"Examples","text":"","category":"section"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"using MLJ\nKMedoids = @load KMedoids pkg=Clustering\n\ntable = load_iris()\ny, X = unpack(table, ==(:target), rng=123)\nmodel = KMedoids(k=3)\nmach = machine(model, X) |> fit!\n\nyhat = predict(mach, X)\n@assert yhat == report(mach).assignments\n\ncompare = zip(yhat, y) |> collect;\ncompare[1:8] ## clusters align with classes\n\ncenter_dists = transform(mach, fitted_params(mach).medoids')\n\n@assert center_dists[1][1] == 0.0\n@assert center_dists[2][2] == 0.0\n@assert center_dists[3][3] == 0.0","category":"page"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"See also KMeans","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/#RandomWalkOversampler_Imbalance","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"","category":"section"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"Initiate a RandomWalkOversampler model with the given hyper-parameters.","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"RandomWalkOversampler","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"A model type for constructing a random walk oversampler, based on Imbalance.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"RandomWalkOversampler = @load RandomWalkOversampler pkg=Imbalance","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"Do model = RandomWalkOversampler() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomWalkOversampler(ratios=...).","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"RandomWalkOversampler implements the random walk oversampling algorithm to correct for class imbalance as in Zhang, H., & Li, M. (2014). RWO-Sampling: A random walk over-sampling approach to imbalanced data classification. 
Information Fusion, 25, 4-20.","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/#Training-data","page":"RandomWalkOversampler","title":"Training data","text":"","category":"section"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"In MLJ or MLJBase, wrap the model in a machine by","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"mach = machine(model)","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"There is no need to provide any data here because the model is a static transformer.","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"Likewise, there is no need to fit!(mach).","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"For default values of the hyper-parameters, model can be constructed by","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"model = RandomWalkOversampler()","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/#Hyperparameters","page":"RandomWalkOversampler","title":"Hyperparameters","text":"","category":"section"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"ratios=1.0: A parameter that controls the amount of oversampling to be done for each class\nCan be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class\nCan be a dictionary mapping each class label to the float ratio for that class\nrng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister`.","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/#Transform-Inputs","page":"RandomWalkOversampler","title":"Transform Inputs","text":"","category":"section"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"X: A table with element scitypes that subtype Union{Finite, Infinite}. 
Elements in nominal columns should subtype Finite (i.e., have scitype OrderedFactor or Multiclass) and","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":" elements in continuous columns should subtype `Infinite` (i.e., have \n [scitype](https://juliaai.github.io/ScientificTypes.jl/) `Count` or `Continuous`).","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"y: An abstract vector of labels (e.g., strings) that correspond to the observations in X","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/#Transform-Outputs","page":"RandomWalkOversampler","title":"Transform Outputs","text":"","category":"section"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"Xover: A matrix or table that includes original data and the new observations due to oversampling. depending on whether the input X is a matrix or table respectively\nyover: An abstract vector of labels corresponding to Xover","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/#Operations","page":"RandomWalkOversampler","title":"Operations","text":"","category":"section"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"transform(mach, X, y): resample the data X and y using RandomWalkOversampler, returning both the new and original observations","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/#Example","page":"RandomWalkOversampler","title":"Example","text":"","category":"section"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"using MLJ\nusing ScientificTypes\nimport Imbalance\n\n## set probability of each class\nclass_probs = [0.5, 0.2, 0.3] \nnum_rows = 100\nnum_continuous_feats = 3\n## want two categorical features with three and two possible values respectively\nnum_vals_per_category = [3, 2]\n\n## generate a table and categorical vector accordingly\nX, y = Imbalance.generate_imbalanced_data(num_rows, num_continuous_feats; \n class_probs, num_vals_per_category, rng=42) \njulia> Imbalance.checkbalance(y)\n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (39.6%) \n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 33 (68.8%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%) \n\n\njulia> ScientificTypes.schema(X).scitypes\n(Continuous, Continuous, Continuous, Continuous, Continuous)\n## coerce nominal columns to a finite scitype (multiclass or ordered factor)\nX = coerce(X, :Column4=>Multiclass, :Column5=>Multiclass)\n\n## load RandomWalkOversampler model type:\nRandomWalkOversampler = @load RandomWalkOversampler pkg=Imbalance\n\n## oversample the minority classes to sizes relative to the majority class:\noversampler = RandomWalkOversampler(ratios = Dict(0=>1.0, 1=> 0.9, 2=>0.8), rng = 42)\nmach = machine(oversampler)\nXover, yover = transform(mach, X, y)\n\njulia> Imbalance.checkbalance(yover)\n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 38 (79.2%) \n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 43 (89.6%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 
(100.0%)","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/#EvoTreeRegressor_EvoTrees","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"","category":"section"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"EvoTreeRegressor(;kwargs...)","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"A model type for constructing a EvoTreeRegressor, based on EvoTrees.jl, and implementing both an internal API and the MLJ model interface.","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/#Hyper-parameters","page":"EvoTreeRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"loss=:mse: Loss to be be minimized during training. One of:\n:mse\n:logloss\n:gamma\n:tweedie\n:quantile\n:l1\nnrounds=100: Number of rounds. It corresponds to the number of trees that will be sequentially stacked. Must be >= 1.\neta=0.1: Learning rate. Each tree raw predictions are scaled by eta prior to be added to the stack of predictions. Must be > 0. A lower eta results in slower learning, requiring a higher nrounds but typically improves model performance.\nL2::T=0.0: L2 regularization factor on aggregate gain. Must be >= 0. Higher L2 can result in a more robust model.\nlambda::T=0.0: L2 regularization factor on individual gain. Must be >= 0. Higher lambda can result in a more robust model.\ngamma::T=0.0: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model. Must be >= 0.\nalpha::T=0.5: Loss specific parameter in the [0, 1] range: - :quantile: target quantile for the regression. - :l1: weighting parameters to positive vs negative residuals. - Positive residual weights = alpha - Negative residual weights = (1 - alpha)\nmax_depth=6: Maximum depth of a tree. Must be >= 1. A tree of depth 1 is made of a single prediction leaf. A complete tree of depth N contains 2^(N - 1) terminal leaves and 2^(N - 1) - 1 split nodes. Compute cost is proportional to 2^max_depth. Typical optimal values are in the 3 to 9 range.\nmin_weight=1.0: Minimum weight needed in a node to perform a split. Matches the number of observations by default or the sum of weights as provided by the weights vector. Must be > 0.\nrowsample=1.0: Proportion of rows that are sampled at each iteration to build the tree. Should be in ]0, 1].\ncolsample=1.0: Proportion of columns / features that are sampled at each iteration to build the tree. Should be in ]0, 1].\nnbins=64: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins. Should be between 2 and 255.\nmonotone_constraints=Dict{Int, Int}(): Specify monotonic constraints using a dict where the key is the feature index and the value the applicable constraint (-1=decreasing, 0=none, 1=increasing). Only :linear, :logistic, :gamma and tweedie losses are supported at the moment.\ntree_type=\"binary\" Tree structure to be used. One of:\nbinary: Each node of a tree is grown independently. 
Trees are built depthwise until max depth is reached, or until min weight or gain (see gamma) stops further node splits.\noblivious: A common splitting condition is imposed to all nodes of a given depth.\nrng=123: Either an integer used as a seed to the random number generator or an actual random number generator (::Random.AbstractRNG).","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/#Internal-API","page":"EvoTreeRegressor","title":"Internal API","text":"","category":"section"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"Do config = EvoTreeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeRegressor(loss=...).","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/#Training-model","page":"EvoTreeRegressor","title":"Training model","text":"","category":"section"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"A model is built using fit_evotree:","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"model = fit_evotree(config; x_train, y_train, kwargs...)","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/#Inference","page":"EvoTreeRegressor","title":"Inference","text":"","category":"section"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"Predictions are obtained using predict, which returns a Vector of length nobs:","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"EvoTrees.predict(model, X)","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"Alternatively, models act as functors, returning predictions when called as a function with features as argument:","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"model(X)","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/#MLJ-Interface","page":"EvoTreeRegressor","title":"MLJ Interface","text":"","category":"section"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"From MLJ, the type can be imported using:","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"EvoTreeRegressor = @load EvoTreeRegressor pkg=EvoTrees","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"Do model = EvoTreeRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeRegressor(loss=...).","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/#Training-model-2","page":"EvoTreeRegressor","title":"Training model","text":"","category":"section"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"In MLJ or MLJBase, bind an instance model to data with mach = machine(model, X, y) where","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)\ny: is the target, which can be any AbstractVector whose element scitype is <:Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/#Operations","page":"EvoTreeRegressor","title":"Operations","text":"","category":"section"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above. Predictions are deterministic.","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/#Fitted-parameters","page":"EvoTreeRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":":fitresult: The GBTree object returned by EvoTrees.jl fitting algorithm.","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/#Report","page":"EvoTreeRegressor","title":"Report","text":"","category":"section"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"The fields of report(mach) are:","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":":features: The names of the features encountered in training.","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/#Examples","page":"EvoTreeRegressor","title":"Examples","text":"","category":"section"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"## Internal API\nusing EvoTrees\nconfig = EvoTreeRegressor(max_depth=5, nbins=32, nrounds=100)\nnobs, nfeats = 1_000, 5\nx_train, y_train = randn(nobs, nfeats), rand(nobs)\nmodel = fit_evotree(config; x_train, y_train)\npreds = EvoTrees.predict(model, x_train)","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"## MLJ Interface\nusing MLJ\nEvoTreeRegressor = @load EvoTreeRegressor pkg=EvoTrees\nmodel = EvoTreeRegressor(max_depth=5, nbins=32, nrounds=100)\nX, y = @load_boston\nmach = machine(model, X, y) |> fit!\npreds = predict(mach, 
X)","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/#KNNDetector_OutlierDetectionNeighbors","page":"KNNDetector","title":"KNNDetector","text":"","category":"section"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"KNNDetector(k=5,\n metric=Euclidean,\n algorithm=:kdtree,\n leafsize=10,\n reorder=true,\n reduction=:maximum)","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"Calculate the anomaly score of an instance based on the distance to its k-nearest neighbors.","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/#Parameters","page":"KNNDetector","title":"Parameters","text":"","category":"section"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"k::Integer","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"Number of neighbors (must be greater than 0).","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"metric::Metric","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"This is one of the Metric types defined in the Distances.jl package. It is possible to define your own metrics by creating new types that are subtypes of Metric.","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"algorithm::Symbol","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"One of (:kdtree, :balltree). In a kdtree, points are recursively split into groups using hyper-planes. Therefore a KDTree only works with axis aligned metrics which are: Euclidean, Chebyshev, Minkowski and Cityblock. A brutetree linearly searches all points in a brute force fashion and works with any Metric. A balltree recursively splits points into groups bounded by hyper-spheres and works with any Metric.","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"static::Union{Bool, Symbol}","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"One of (true, false, :auto). Whether the input data for fitting and transform should be statically or dynamically allocated. If true, the data is statically allocated. If false, the data is dynamically allocated. If :auto, the data is dynamically allocated if the product of all dimensions except the last is greater than 100.","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"leafsize::Int","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"Determines at what number of points to stop splitting the tree further. 
There is a trade-off between traversing the tree and having to evaluate the metric function for increasing number of points.","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"reorder::Bool","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"While building the tree this will put points close in distance close in memory since this helps with cache locality. In this case, a copy of the original data will be made so that the original data is left unmodified. This can have a significant impact on performance and is by default set to true.","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"parallel::Bool","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"Parallelize score and predict using all threads available. The number of threads can be set with the JULIA_NUM_THREADS environment variable. Note: fit is not parallel.","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"reduction::Symbol","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"One of (:maximum, :median, :mean). (reduction=:maximum) was proposed by [1]. Angiulli et al. [2] proposed sum to reduce the distances, but mean has been implemented for numerical stability.","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/#Examples","page":"KNNDetector","title":"Examples","text":"","category":"section"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"using OutlierDetection: KNNDetector, fit, transform\ndetector = KNNDetector()\nX = rand(10, 100)\nmodel, result = fit(detector, X; verbosity=0)\ntest_scores = transform(detector, model, X)","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/#References","page":"KNNDetector","title":"References","text":"","category":"section"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"[1] Ramaswamy, Sridhar; Rastogi, Rajeev; Shim, Kyuseok (2000): Efficient Algorithms for Mining Outliers from Large Data Sets.","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"[2] Angiulli, Fabrizio; Pizzuti, Clara (2002): Fast Outlier Detection in High Dimensional Spaces.","category":"page"},{"location":"models/RANSACRegressor_MLJScikitLearnInterface/#RANSACRegressor_MLJScikitLearnInterface","page":"RANSACRegressor","title":"RANSACRegressor","text":"","category":"section"},{"location":"models/RANSACRegressor_MLJScikitLearnInterface/","page":"RANSACRegressor","title":"RANSACRegressor","text":"RANSACRegressor","category":"page"},{"location":"models/RANSACRegressor_MLJScikitLearnInterface/","page":"RANSACRegressor","title":"RANSACRegressor","text":"A model type for constructing a ransac regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/RANSACRegressor_MLJScikitLearnInterface/","page":"RANSACRegressor","title":"RANSACRegressor","text":"From MLJ, the type can be imported 
using","category":"page"},{"location":"models/RANSACRegressor_MLJScikitLearnInterface/","page":"RANSACRegressor","title":"RANSACRegressor","text":"RANSACRegressor = @load RANSACRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/RANSACRegressor_MLJScikitLearnInterface/","page":"RANSACRegressor","title":"RANSACRegressor","text":"Do model = RANSACRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RANSACRegressor(estimator=...).","category":"page"},{"location":"models/RANSACRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"RANSACRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/RANSACRegressor_MLJScikitLearnInterface/","page":"RANSACRegressor","title":"RANSACRegressor","text":"estimator = nothing\nmin_samples = 5\nresidual_threshold = nothing\nis_data_valid = nothing\nis_model_valid = nothing\nmax_trials = 100\nmax_skips = 9223372036854775807\nstop_n_inliers = 9223372036854775807\nstop_score = Inf\nstop_probability = 0.99\nloss = absolute_error\nrandom_state = nothing","category":"page"},{"location":"models/NuSVR_LIBSVM/#NuSVR_LIBSVM","page":"NuSVR","title":"NuSVR","text":"","category":"section"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"NuSVR","category":"page"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"A model type for constructing a ν-support vector regressor, based on LIBSVM.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"NuSVR = @load NuSVR pkg=LIBSVM","category":"page"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"Do model = NuSVR() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in NuSVR(kernel=...).","category":"page"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): \"LIBSVM: a library for support vector machines.\" ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf. 
","category":"page"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"This model is a re-parameterization of EpsilonSVR in which the epsilon hyper-parameter is replaced with a new parameter nu (denoted ν in the cited reference) which attempts to control the number of support vectors directly.","category":"page"},{"location":"models/NuSVR_LIBSVM/#Training-data","page":"NuSVR","title":"Training data","text":"","category":"section"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"In MLJ or MLJBase, bind an instance model to data with:","category":"page"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"where","category":"page"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)\ny: is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/NuSVR_LIBSVM/#Hyper-parameters","page":"NuSVR","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"kernel=LIBSVM.Kernel.RadialBasis: either an object that can be\ncalled, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see \"Examples\" below).\nLIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2\nLIBSVM.Kernel.Polynomial: (x1, x2) -> gamma*x1'*x2 + coef0)^degree\nLIBSVM.Kernel.RadialBasis: (x1, x2) -> (exp(-gamma*norm(x1 - x2)^2))\nLIBSVM.Kernel.Sigmoid: (x1, x2) - > tanh(gamma*x1'*x2 + coef0)\nHere gamma, coef0, degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions. See LIVSVM.jl issue91\ngamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. Actual value used appears in the report (see below).\ncoef0 = 0.0: kernel parameter (see above)\ndegree::Int32 = Int32(3): degree in polynomial kernel (see above)\ncost=1.0 (range (0, Inf)): the parameter denoted C in the cited reference; for greater regularization, decrease cost\nnu=0.5 (range (0, 1]): An upper bound on the fraction of training errors and a lower bound of the fraction of support vectors. Denoted ν in the cited paper. 
Changing nu changes the thickness of some neighborhood of the graph of the prediction function (\"tube\" or \"slab\") and a training error is said to occur when a data point (x, y) lies outside of that neighborhood.\ncachesize=200.0 cache memory size in MB\ntolerance=0.001: tolerance for the stopping criterion\nshrinking=true: whether to use shrinking heuristics","category":"page"},{"location":"models/NuSVR_LIBSVM/#Operations","page":"NuSVR","title":"Operations","text":"","category":"section"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above.","category":"page"},{"location":"models/NuSVR_LIBSVM/#Fitted-parameters","page":"NuSVR","title":"Fitted parameters","text":"","category":"section"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"libsvm_model: the trained model object created by the LIBSVM.jl package","category":"page"},{"location":"models/NuSVR_LIBSVM/#Report","page":"NuSVR","title":"Report","text":"","category":"section"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"The fields of report(mach) are:","category":"page"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"gamma: actual value of the kernel parameter gamma used in training","category":"page"},{"location":"models/NuSVR_LIBSVM/#Examples","page":"NuSVR","title":"Examples","text":"","category":"section"},{"location":"models/NuSVR_LIBSVM/#Using-a-built-in-kernel","page":"NuSVR","title":"Using a built-in kernel","text":"","category":"section"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"using MLJ\nimport LIBSVM\n\nNuSVR = @load NuSVR pkg=LIBSVM ## model type\nmodel = NuSVR(kernel=LIBSVM.Kernel.Polynomial) ## instance\n\nX, y = make_regression(rng=123) ## table, vector\nmach = machine(model, X, y) |> fit!\n\nXnew, _ = make_regression(3, rng=123)\n\njulia> yhat = predict(mach, Xnew)\n3-element Vector{Float64}:\n 0.2008156459920009\n 0.1131520519131709\n -0.2076156254934889","category":"page"},{"location":"models/NuSVR_LIBSVM/#User-defined-kernels","page":"NuSVR","title":"User-defined kernels","text":"","category":"section"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"k(x1, x2) = x1'*x2 ## equivalent to `LIBSVM.Kernel.Linear`\nmodel = NuSVR(kernel=k)\nmach = machine(model, X, y) |> fit!\n\njulia> yhat = predict(mach, Xnew)\n3-element Vector{Float64}:\n 1.1211558175964662\n 0.06677125944808422\n -0.6817578942749346","category":"page"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"See also EpsilonSVR, LIVSVM.jl and the original C implementation documentation.","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/#CatBoostClassifier_CatBoost","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"","category":"section"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"CatBoostClassifier","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"A model type for constructing a CatBoost classifier, based on CatBoost.jl, and implementing the MLJ model 
interface.","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"CatBoostClassifier = @load CatBoostClassifier pkg=CatBoost","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"Do model = CatBoostClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in CatBoostClassifier(iterations=...).","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/#Training-data","page":"CatBoostClassifier","title":"Training data","text":"","category":"section"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"where","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, Finite, Textual; check column scitypes with schema(X). Textual columns will be passed to catboost as text_features, Multiclass columns will be passed to catboost as cat_features, and OrderedFactor columns will be converted to integers.\ny: the target, which can be any AbstractVector whose element scitype is Finite; check the scitype with scitype(y)","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"Train the machine with fit!(mach, rows=...).","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/#Hyper-parameters","page":"CatBoostClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"More details on the catboost hyperparameters, here are the Python docs: https://catboost.ai/en/docs/concepts/python-reference_catboostclassifier#parameters","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/#Operations","page":"CatBoostClassifier","title":"Operations","text":"","category":"section"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"predict(mach, Xnew): probabilistic predictions of the target given new features Xnew having the same scitype as X above.\npredict_mode(mach, Xnew): returns the mode of each of the prediction above.","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/#Accessor-functions","page":"CatBoostClassifier","title":"Accessor functions","text":"","category":"section"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"feature_importances(mach): return vector of feature importances, in the form of feature::Symbol => importance::Real 
pairs","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/#Fitted-parameters","page":"CatBoostClassifier","title":"Fitted parameters","text":"","category":"section"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"model: The Python CatBoostClassifier model","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/#Report","page":"CatBoostClassifier","title":"Report","text":"","category":"section"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"The fields of report(mach) are:","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"feature_importances: Vector{Pair{Symbol, Float64}} of feature importances","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/#Examples","page":"CatBoostClassifier","title":"Examples","text":"","category":"section"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"using CatBoost.MLJCatBoostInterface\nusing MLJ\n\nX = (\n duration = [1.5, 4.1, 5.0, 6.7], \n n_phone_calls = [4, 5, 6, 7], \n department = coerce([\"acc\", \"ops\", \"acc\", \"ops\"], Multiclass), \n)\ny = coerce([0, 0, 1, 1], Multiclass)\n\nmodel = CatBoostClassifier(iterations=5)\nmach = machine(model, X, y)\nfit!(mach)\nprobs = predict(mach, X)\npreds = predict_mode(mach, X)","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"See also catboost and the unwrapped model type CatBoost.CatBoostClassifier.","category":"page"},{"location":"getting_started/#Getting-Started","page":"Getting Started","title":"Getting Started","text":"","category":"section"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"For an outline of MLJ's goals and features, see About MLJ.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"This page introduces some MLJ basics, assuming some familiarity with machine learning. For a complete list of other MLJ learning resources, see Learning MLJ.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"MLJ collects together the functionality provided by multiple packages. To learn how to install components separately, run using MLJ; @doc MLJ.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"This section introduces only the most basic MLJ operations and concepts. It assumes MLJ has been successfully installed. 
See Installation if this is not the case.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"import Random.seed!\nusing MLJ\nusing InteractiveUtils\nMLJ.color_off()\nseed!(1234)","category":"page"},{"location":"getting_started/#Choosing-and-evaluating-a-model","page":"Getting Started","title":"Choosing and evaluating a model","text":"","category":"section"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"The following code loads Fisher's famous iris data set as a named tuple of column vectors:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"using MLJ\niris = load_iris();\nselectrows(iris, 1:3) |> pretty\nschema(iris)","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Because this data format is compatible with Tables.jl (and satisfies Tables.istable(iris) == true) many MLJ methods (such as selectrows, pretty and schema used above) as well as many MLJ models can work with it. However, as most new users are already familiar with the access methods particular to DataFrames (also compatible with Tables.jl) we'll put our data into that format here:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"import DataFrames\niris = DataFrames.DataFrame(iris);\nnothing # hide","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Next, let's split the data \"horizontally\" into input and target parts, and specify an RNG seed, to force observations to be shuffled:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"y, X = unpack(iris, ==(:target); rng=123);\nfirst(X, 3) |> pretty","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"This call to unpack splits off any column with name == to :target into something called y, and all the remaining columns into X.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"To list all models available in MLJ's model registry do models(). Listing the models compatible with the present data:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"models(matching(X,y))","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"In MLJ a model is a struct storing the hyperparameters of the learning algorithm indicated by the struct name (and nothing else). 
For common problems matching data to models, see Model Search and Preparing Data.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"To see the documentation for DecisionTreeClassifier (without loading its defining code) do","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"doc(\"DecisionTreeClassifier\", pkg=\"DecisionTree\")","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Assuming the MLJDecisionTreeInterface.jl package is in your load path (see Installation) we can use @load to import the DecisionTreeClassifier model type, which we will bind to Tree:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Tree = @load DecisionTreeClassifier pkg=DecisionTree","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"(In this case, we need to specify pkg=... because multiple packages provide a model type with the name DecisionTreeClassifier.) Now we can instantiate a model with default hyperparameters:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"tree = Tree()","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Important: DecisionTree.jl and most other packages implementing machine learning algorithms for use in MLJ are not MLJ dependencies. If such a package is not in your load path you will receive an error explaining how to add the package to your current environment. Alternatively, you can use the interactive macro @iload. For more on importing model types, see Loading Model Code.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Once instantiated, a model's performance can be evaluated with the evaluate method. Our classifier is a probabilistic predictor (check prediction_type(tree) == :probabilistic) which means we can specify a probabilistic measure (metric) like log_loss, as well deterministic measures like accuracy (which are applied after computing the mode of each prediction):","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"evaluate(tree, X, y,\n resampling=CV(shuffle=true),\n measures=[log_loss, accuracy],\n verbosity=0)","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Under the hood, evaluate calls lower level functions predict or predict_mode according to the type of measure, as shown in the output. 
We shall call these operations directly below.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"For more on performance evaluation, see Evaluating Model Performance for details.","category":"page"},{"location":"getting_started/#A-preview-of-data-type-specification-in-MLJ","page":"Getting Started","title":"A preview of data type specification in MLJ","text":"","category":"section"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"The target y above is a categorical vector, which is appropriate because our model is a decision tree classifier:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"typeof(y)","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"However, MLJ models do not prescribe the machine types for the data they operate on. Rather, they specify a scientific type, which refers to the way data is to be interpreted, as opposed to how it is encoded:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"target_scitype(tree)","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Here Finite is an example of a \"scalar\" scientific type with two subtypes:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"subtypes(Finite)","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"We use the scitype function to check how MLJ is going to interpret given data. Our choice of encoding for y works for DecisionTreeClassifier, because we have:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"scitype(y)","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"and Multiclass{3} <: Finite. If we would encode with integers instead, we obtain:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"yint = int.(y);\nscitype(yint)","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"and using yint in place of y in classification problems will fail. 
See also Working with Categorical Data.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"For more on scientific types, see Data containers and scientific types below.","category":"page"},{"location":"getting_started/#Fit-and-predict","page":"Getting Started","title":"Fit and predict","text":"","category":"section"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"To illustrate MLJ's fit and predict interface, let's perform our performance evaluations by hand, but using a simple holdout set, instead of cross-validation.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Wrapping the model in data creates a machine which will store training outcomes:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"mach = machine(tree, X, y)","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Training and testing on a hold-out set:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"train, test = partition(eachindex(y), 0.7); # 70:30 split\nfit!(mach, rows=train);\nyhat = predict(mach, X[test,:]);\nyhat[3:5]\nlog_loss(yhat, y[test])","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Note that log_loss and cross_entropy are aliases for LogLoss() (which can be passed an optional keyword parameter, as in LogLoss(tol=0.001)). For a list of all losses and scores, and their aliases, run measures().","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Notice that yhat is a vector of Distribution objects, because DecisionTreeClassifier makes probabilistic predictions. 
The methods of the Distributions.jl package can be applied to such distributions:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"broadcast(pdf, yhat[3:5], \"virginica\") # predicted probabilities of virginica\nbroadcast(pdf, yhat, y[test])[3:5] # predicted probability of observed class\nmode.(yhat[3:5])","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Or, one can explicitly get modes by using predict_mode instead of predict:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"predict_mode(mach, X[test[3:5],:])","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Finally, we note that pdf() is overloaded to allow the retrieval of probabilities for all levels at once:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"L = levels(y)\npdf(yhat[3:5], L)","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Unsupervised models have a transform method instead of predict, and may optionally implement an inverse_transform method:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"v = Float64[1, 2, 3, 4]\nstand = Standardizer() # this type is built-in\nmach2 = machine(stand, v)\nfit!(mach2)\nw = transform(mach2, v)\ninverse_transform(mach2, w)","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Machines have an internal state which allows them to avoid redundant calculations when retrained, in certain conditions - for example when increasing the number of trees in a random forest, or the number of epochs in a neural network. The machine-building syntax also anticipates a more general syntax for composing multiple models, an advanced feature explained in Learning Networks.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"There is a version of evaluate for machines as well as models. This time we'll use a simple holdout strategy as above. (An exclamation point is added to the method name because machines are generally mutated when trained.)","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"evaluate!(mach, resampling=Holdout(fraction_train=0.7),\n measures=[log_loss, accuracy],\n verbosity=0)","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Changing a hyperparameter and re-evaluating:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"tree.max_depth = 3;\nevaluate!(mach, resampling=Holdout(fraction_train=0.7),\n measures=[log_loss, accuracy],\n verbosity=0)","category":"page"},{"location":"getting_started/#Next-steps","page":"Getting Started","title":"Next steps","text":"","category":"section"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"For next steps, consult the Learning MLJ section. 
At the least, we recommend you read the remainder of this page before considering serious use of MLJ.","category":"page"},{"location":"getting_started/#Data-containers-and-scientific-types","page":"Getting Started","title":"Data containers and scientific types","text":"","category":"section"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"The MLJ user should acquaint themselves with some basic assumptions about the form of data expected by MLJ, as outlined below. The basic machine constructors look like this (see also Constructing machines):","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"machine(model::Unsupervised, X)\nmachine(model::Supervised, X, y)","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Each supervised model in MLJ declares the permitted scientific type of the inputs X and targets y that can be bound to it in the first constructor above, rather than specifying specific machine types (such as Array{Float32, 2}). Similar remarks apply to the input X of an unsupervised model.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Scientific types are julia types defined in the package ScientificTypesBase.jl; the package ScientificTypes.jl implements the particular convention used in the MLJ universe for assigning a specific scientific type (interpretation) to each julia object (see the scitype examples below).","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"The basic \"scalar\" scientific types are Continuous, Multiclass{N}, OrderedFactor{N}, Count and Textual. Missing and Nothing are also considered scientific types. Be sure you read Scalar scientific types below to guarantee your scalar data is interpreted correctly. Tools exist to coerce the data to have the appropriate scientific type; see ScientificTypes.jl or run ?coerce for details.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Additionally, most data containers - such as tuples, vectors, matrices and tables - have a scientific type parameterized by scitype of the elements they contain.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"(Image: )","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Figure 1. Part of the scientific type hierarchy in ScientificTypesBase.jl.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"scitype(4.6)\nscitype(42)\nx1 = coerce([\"yes\", \"no\", \"yes\", \"maybe\"], Multiclass);\nscitype(x1)\nX = (x1=x1, x2=rand(4), x3=rand(4)) # a \"column table\"\nscitype(X)","category":"page"},{"location":"getting_started/#Two-dimensional-data","page":"Getting Started","title":"Two-dimensional data","text":"","category":"section"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Generally, two-dimensional data in MLJ is expected to be tabular. 
All data containers X compatible with the Tables.jl interface and satisfying Tables.istable(X) == true (most of the formats in this list) have the scientific type Table{K}, where K depends on the scientific types of the columns, which can be individually inspected using schema:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"schema(X)","category":"page"},{"location":"getting_started/#Matrix-data","page":"Getting Started","title":"Matrix data","text":"","category":"section"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"MLJ models expecting a table do not generally accept a matrix instead. However, a matrix can be wrapped as a table, using MLJ.table:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"matrix_table = MLJ.table(rand(2,3));\nschema(matrix_table)","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"The matrix is not copied, only wrapped. To manifest a table as a matrix, use MLJ.matrix.","category":"page"},{"location":"getting_started/#Observations-correspond-to-rows,-not-columns","page":"Getting Started","title":"Observations correspond to rows, not columns","text":"","category":"section"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"When supplying models with matrices, or wrapping them in tables, each row should correspond to a different observation. That is, the matrix should be n x p, where n is the number of observations and p the number of features. However, some models may perform better if supplied the adjoint of a p x n matrix instead, and observation resampling is always more efficient in this case.","category":"page"},{"location":"getting_started/#Inputs","page":"Getting Started","title":"Inputs","text":"","category":"section"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Since an MLJ model only specifies the scientific type of data, if that type is Table - which is the case for the majority of MLJ models - then any Tables.jl container X is permitted, so long as Tables.istable(X) == true.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Specifically, the requirement for an arbitrary model's input is scitype(X) <: input_scitype(model).","category":"page"},{"location":"getting_started/#Targets","page":"Getting Started","title":"Targets","text":"","category":"section"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"The target y expected by MLJ models is generally an AbstractVector. 
A multivariate target y will generally be a table.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Specifically, the type requirement for a model target is scitype(y) <: target_scitype(model).","category":"page"},{"location":"getting_started/#Querying-a-model-for-acceptable-data-types","page":"Getting Started","title":"Querying a model for acceptable data types","text":"","category":"section"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Given a model instance, one can inspect the admissible scientific types of its input and target, and without loading the code defining the model;","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"tree = @load DecisionTreeClassifier pkg=DecisionTree","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"i = info(\"DecisionTreeClassifier\", pkg=\"DecisionTree\")\ni.input_scitype\ni.target_scitype","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"This output indicates that any table with Continuous, Count or OrderedFactor columns is acceptable as the input X, and that any vector with element scitype <: Finite is acceptable as the target y.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"For more on matching models to data, see Model Search.","category":"page"},{"location":"getting_started/#Scalar-scientific-types","page":"Getting Started","title":"Scalar scientific types","text":"","category":"section"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Models in MLJ will always apply the MLJ convention described in ScientificTypes.jl to decide how to interpret the elements of your container types. Here are the key features of that convention:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Any AbstractFloat is interpreted as Continuous.\nAny Integer is interpreted as Count.\nAny CategoricalValue x, is interpreted as Multiclass or OrderedFactor, depending on the value of isordered(x).\nStrings and Chars are not interpreted as Multiclass or OrderedFactor (they have scitypes Textual and Unknown respectively).\nIn particular, integers (including Bools) cannot be used to represent categorical data. Use the preceding coerce operations to coerce to a Finite scitype.\nThe scientific types of nothing and missing are Nothing and Missing, native types we also regard as scientific.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Use coerce(v, OrderedFactor) or coerce(v, Multiclass) to coerce a vector v of integers, strings or characters to a vector with an appropriate Finite (categorical) scitype. See also Working with Categorical Data, and the ScientificTypes.jl documentation.","category":"page"},{"location":"transformers/#Transformers-and-Other-Unsupervised-Models","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised Models","text":"","category":"section"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"Several unsupervised models used to perform common transformations, such as one-hot encoding, are available in MLJ out-of-the-box. 
These are detailed in Built-in transformers below.","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"A transformer is static if it has no learned parameters. While such a transformer is tantamount to an ordinary function, realizing it as an MLJ static transformer (a subtype of Static <: Unsupervised) can be useful, especially if the function depends on parameters the user would like to manipulate (which become hyper-parameters of the model). The necessary syntax for defining your own static transformers is described in Static transformers below.","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"Some unsupervised models, such as clustering algorithms, have a predict method in addition to a transform method. We give an example of this in Transformers that also predict","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"Finally, we note that models that fit a distribution, or more generally a sampler object, to some data, which are sometimes viewed as unsupervised, are treated in MLJ as supervised models. See Models that learn a probability distribution for an example.","category":"page"},{"location":"transformers/#Built-in-transformers","page":"Transformers and Other Unsupervised models","title":"Built-in transformers","text":"","category":"section"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"MLJModels.Standardizer\nMLJModels.OneHotEncoder\nMLJModels.ContinuousEncoder\nMLJModels.FillImputer\nMLJModels.UnivariateFillImputer\nMLJModels.FeatureSelector\nMLJModels.UnivariateBoxCoxTransformer\nMLJModels.UnivariateDiscretizer\nMLJModels.UnivariateTimeTypeToContinuous","category":"page"},{"location":"transformers/#MLJModels.Standardizer","page":"Transformers and Other Unsupervised models","title":"MLJModels.Standardizer","text":"Standardizer\n\nA model type for constructing a standardizer, based on MLJModels.jl, and implementing the MLJ model interface.\n\nFrom MLJ, the type can be imported using\n\nStandardizer = @load Standardizer pkg=MLJModels\n\nDo model = Standardizer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in Standardizer(features=...).\n\nUse this model to standardize (whiten) a Continuous vector, or relevant columns of a table. The rescalings applied by this transformer to new data are always those learned during the training phase, which are generally different from what would actually standardize the new data.\n\nTraining data\n\nIn MLJ or MLJBase, bind an instance model to data with\n\nmach = machine(model, X)\n\nwhere\n\nX: any Tables.jl compatible table or any abstract vector with Continuous element scitype (any abstract float vector). 
Only features in a table with Continuous scitype can be standardized; check column scitypes with schema(X).\n\nTrain the machine using fit!(mach, rows=...).\n\nHyper-parameters\n\nfeatures: one of the following, with the behavior indicated below:\n[] (empty, the default): standardize all features (columns) having Continuous element scitype\nnon-empty vector of feature names (symbols): standardize only the Continuous features in the vector (if ignore=false) or Continuous features not named in the vector (ignore=true).\nfunction or other callable: standardize a feature if the callable returns true on its name. For example, Standardizer(features = name -> name in [:x1, :x3], ignore = true, count=true) has the same effect as Standardizer(features = [:x1, :x3], ignore = true, count=true), namely to standardize all Continuous and Count features, with the exception of :x1 and :x3.\nNote this behavior is further modified if the ordered_factor or count flags are set to true; see below\nignore=false: whether to ignore or standardize specified features, as explained above\nordered_factor=false: if true, standardize any OrderedFactor feature wherever a Continuous feature would be standardized, as described above\ncount=false: if true, standardize any Count feature wherever a Continuous feature would be standardized, as described above\n\nOperations\n\ntransform(mach, Xnew): return Xnew with relevant features standardized according to the rescalings learned during fitting of mach.\ninverse_transform(mach, Z): apply the inverse transformation to Z, so that inverse_transform(mach, transform(mach, Xnew)) is approximately the same as Xnew; unavailable if ordered_factor or count flags were set to true.\n\nFitted parameters\n\nThe fields of fitted_params(mach) are:\n\nfeatures_fit - the names of features that will be standardized\nmeans - the corresponding untransformed mean values\nstds - the corresponding untransformed standard deviations\n\nReport\n\nThe fields of report(mach) are:\n\nfeatures_fit: the names of features that will be standardized\n\nExamples\n\nusing MLJ\n\nX = (ordinal1 = [1, 2, 3],\n ordinal2 = coerce([:x, :y, :x], OrderedFactor),\n ordinal3 = [10.0, 20.0, 30.0],\n ordinal4 = [-20.0, -30.0, -40.0],\n nominal = coerce([\"Your father\", \"he\", \"is\"], Multiclass));\n\njulia> schema(X)\n┌──────────┬──────────────────┐\n│ names │ scitypes │\n├──────────┼──────────────────┤\n│ ordinal1 │ Count │\n│ ordinal2 │ OrderedFactor{2} │\n│ ordinal3 │ Continuous │\n│ ordinal4 │ Continuous │\n│ nominal │ Multiclass{3} │\n└──────────┴──────────────────┘\n\nstand1 = Standardizer();\n\njulia> transform(fit!(machine(stand1, X)), X)\n(ordinal1 = [1, 2, 3],\n ordinal2 = CategoricalValue{Symbol,UInt32}[:x, :y, :x],\n ordinal3 = [-1.0, 0.0, 1.0],\n ordinal4 = [1.0, 0.0, -1.0],\n nominal = CategoricalValue{String,UInt32}[\"Your father\", \"he\", \"is\"],)\n\nstand2 = Standardizer(features=[:ordinal3, ], ignore=true, count=true);\n\njulia> transform(fit!(machine(stand2, X)), X)\n(ordinal1 = [-1.0, 0.0, 1.0],\n ordinal2 = CategoricalValue{Symbol,UInt32}[:x, :y, :x],\n ordinal3 = [10.0, 20.0, 30.0],\n ordinal4 = [1.0, 0.0, -1.0],\n nominal = CategoricalValue{String,UInt32}[\"Your father\", \"he\", \"is\"],)\n\nSee also OneHotEncoder, ContinuousEncoder.\n\n\n\n\n\n","category":"type"},{"location":"transformers/#MLJModels.OneHotEncoder","page":"Transformers and Other Unsupervised models","title":"MLJModels.OneHotEncoder","text":"OneHotEncoder\n\nA model type for constructing a one-hot encoder, based on 
MLJModels.jl, and implementing the MLJ model interface.\n\nFrom MLJ, the type can be imported using\n\nOneHotEncoder = @load OneHotEncoder pkg=MLJModels\n\nDo model = OneHotEncoder() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in OneHotEncoder(features=...).\n\nUse this model to one-hot encode the Multiclass and OrderedFactor features (columns) of some table, leaving other columns unchanged.\n\nNew data to be transformed may lack features present in the fit data, but no new features can be present.\n\nWarning: This transformer assumes that levels(col) for any Multiclass or OrderedFactor column, col, is the same for training data and new data to be transformed.\n\nTo ensure all features are transformed into Continuous features, or dropped, use ContinuousEncoder instead.\n\nTraining data\n\nIn MLJ or MLJBase, bind an instance model to data with\n\nmach = machine(model, X)\n\nwhere\n\nX: any Tables.jl compatible table. Columns can be of mixed type but only those with element scitype Multiclass or OrderedFactor can be encoded. Check column scitypes with schema(X).\n\nTrain the machine using fit!(mach, rows=...).\n\nHyper-parameters\n\nfeatures: a vector of symbols (column names). If empty (default) then all Multiclass and OrderedFactor features are encoded. Otherwise, encoding is further restricted to the specified features (ignore=false) or the unspecified features (ignore=true). This default behavior can be modified by the ordered_factor flag.\nordered_factor=false: when true, OrderedFactor features are universally excluded\ndrop_last=true: whether to drop the column corresponding to the final class of encoded features. For example, a three-class feature is spawned into three new features if drop_last=false, but just two features otherwise.\n\nFitted parameters\n\nThe fields of fitted_params(mach) are:\n\nall_features: names of all features encountered in training\nfitted_levels_given_feature: dictionary of the levels associated with each feature encoded, keyed on the feature name\nref_name_pairs_given_feature: dictionary of pairs r => ftr (such as 0x00000001 => :grad__A) where r is a CategoricalArrays.jl reference integer representing a level, and ftr the corresponding new feature name; the dictionary is keyed on the names of features that are encoded\n\nReport\n\nThe fields of report(mach) are:\n\nfeatures_to_be_encoded: names of input features to be encoded\nnew_features: names of all output features\n\nExample\n\nusing MLJ\n\nX = (name=categorical([\"Danesh\", \"Lee\", \"Mary\", \"John\"]),\n grade=categorical([\"A\", \"B\", \"A\", \"C\"], ordered=true),\n height=[1.85, 1.67, 1.5, 1.67],\n n_devices=[3, 2, 4, 3])\n\njulia> schema(X)\n┌───────────┬──────────────────┐\n│ names │ scitypes │\n├───────────┼──────────────────┤\n│ name │ Multiclass{4} │\n│ grade │ OrderedFactor{3} │\n│ height │ Continuous │\n│ n_devices │ Count │\n└───────────┴──────────────────┘\n\nhot = OneHotEncoder(drop_last=true)\nmach = fit!(machine(hot, X))\nW = transform(mach, X)\n\njulia> schema(W)\n┌──────────────┬────────────┐\n│ names │ scitypes │\n├──────────────┼────────────┤\n│ name__Danesh │ Continuous │\n│ name__John │ Continuous │\n│ name__Lee │ Continuous │\n│ grade__A │ Continuous │\n│ grade__B │ Continuous │\n│ height │ Continuous │\n│ n_devices │ Count │\n└──────────────┴────────────┘\n\nSee also ContinuousEncoder.\n\n\n\n\n\n","category":"type"},{"location":"transformers/#MLJModels.ContinuousEncoder","page":"Transformers and 
Other Unsupervised models","title":"MLJModels.ContinuousEncoder","text":"ContinuousEncoder\n\nA model type for constructing a continuous encoder, based on MLJModels.jl, and implementing the MLJ model interface.\n\nFrom MLJ, the type can be imported using\n\nContinuousEncoder = @load ContinuousEncoder pkg=MLJModels\n\nDo model = ContinuousEncoder() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ContinuousEncoder(drop_last=...).\n\nUse this model to arrange all features (columns) of a table to have Continuous element scitype, by applying the following protocol to each feature ftr:\n\nIf ftr is already Continuous, retain it.\nIf ftr is Multiclass, one-hot encode it.\nIf ftr is OrderedFactor, replace it with coerce(ftr, Continuous) (vector of floating point integers), unless one_hot_ordered_factors=true is specified, in which case one-hot encode it.\nIf ftr is Count, replace it with coerce(ftr, Continuous).\nIf ftr has some other element scitype, or was not observed in fitting the encoder, drop it from the table.\n\nWarning: This transformer assumes that levels(col) for any Multiclass or OrderedFactor column, col, is the same for training data and new data to be transformed.\n\nTo selectively one-hot-encode categorical features (without dropping columns) use OneHotEncoder instead.\n\nTraining data\n\nIn MLJ or MLJBase, bind an instance model to data with\n\nmach = machine(model, X)\n\nwhere\n\nX: any Tables.jl compatible table. Columns can be of mixed type but only those with element scitype Multiclass or OrderedFactor can be encoded. Check column scitypes with schema(X).\n\nTrain the machine using fit!(mach, rows=...).\n\nHyper-parameters\n\ndrop_last=true: whether to drop the column corresponding to the final class of one-hot encoded features. 
For example, a three-class feature is spawned into three new features if drop_last=false, but just two features otherwise.\none_hot_ordered_factors=false: whether to one-hot encode any feature with OrderedFactor element scitype, or to instead coerce it directly to a (single) Continuous feature using the order\n\nFitted parameters\n\nThe fields of fitted_params(mach) are:\n\nfeatures_to_keep: names of features that will not be dropped from the table\none_hot_encoder: the OneHotEncoder model instance for handling the one-hot encoding\none_hot_encoder_fitresult: the fitted parameters of the OneHotEncoder model\n\nReport\n\nfeatures_to_keep: names of input features that will not be dropped from the table\nnew_features: names of all output features\n\nExample\n\nX = (name=categorical([\"Danesh\", \"Lee\", \"Mary\", \"John\"]),\n grade=categorical([\"A\", \"B\", \"A\", \"C\"], ordered=true),\n height=[1.85, 1.67, 1.5, 1.67],\n n_devices=[3, 2, 4, 3],\n comments=[\"the force\", \"be\", \"with you\", \"too\"])\n\njulia> schema(X)\n┌───────────┬──────────────────┐\n│ names │ scitypes │\n├───────────┼──────────────────┤\n│ name │ Multiclass{4} │\n│ grade │ OrderedFactor{3} │\n│ height │ Continuous │\n│ n_devices │ Count │\n│ comments │ Textual │\n└───────────┴──────────────────┘\n\nencoder = ContinuousEncoder(drop_last=true)\nmach = fit!(machine(encoder, X))\nW = transform(mach, X)\n\njulia> schema(W)\n┌──────────────┬────────────┐\n│ names │ scitypes │\n├──────────────┼────────────┤\n│ name__Danesh │ Continuous │\n│ name__John │ Continuous │\n│ name__Lee │ Continuous │\n│ grade │ Continuous │\n│ height │ Continuous │\n│ n_devices │ Continuous │\n└──────────────┴────────────┘\n\njulia> setdiff(schema(X).names, report(mach).features_to_keep) # dropped features\n1-element Vector{Symbol}:\n :comments\n\n\nSee also OneHotEncoder\n\n\n\n\n\n","category":"type"},{"location":"transformers/#MLJModels.FillImputer","page":"Transformers and Other Unsupervised models","title":"MLJModels.FillImputer","text":"FillImputer\n\nA model type for constructing a fill imputer, based on MLJModels.jl, and implementing the MLJ model interface.\n\nFrom MLJ, the type can be imported using\n\nFillImputer = @load FillImputer pkg=MLJModels\n\nDo model = FillImputer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in FillImputer(features=...).\n\nUse this model to impute missing values in tabular data. A fixed \"filler\" value is learned from the training data, one for each column of the table.\n\nFor imputing missing values in a vector, use UnivariateFillImputer instead.\n\nTraining data\n\nIn MLJ or MLJBase, bind an instance model to data with\n\nmach = machine(model, X)\n\nwhere\n\nX: any table of input features (eg, a DataFrame) whose columns each have element scitypes Union{Missing, T}, where T is a subtype of Continuous, Multiclass, OrderedFactor or Count. 
Check scitypes with schema(X).\n\nTrain the machine using fit!(mach, rows=...).\n\nHyper-parameters\n\nfeatures: a vector of names of features (symbols) for which imputation is to be attempted; default is empty, which is interpreted as \"impute all\".\ncontinuous_fill: function or other callable to determine value to be imputed in the case of Continuous (abstract float) data; default is to apply median after skipping missing values\ncount_fill: function or other callable to determine value to be imputed in the case of Count (integer) data; default is to apply rounded median after skipping missing values\nfinite_fill: function or other callable to determine value to be imputed in the case of Multiclass or OrderedFactor data (categorical vectors); default is to apply mode after skipping missing values\n\nOperations\n\ntransform(mach, Xnew): return Xnew with missing values imputed with the fill values learned when fitting mach\n\nFitted parameters\n\nThe fields of fitted_params(mach) are:\n\nfeatures_seen_in_fit: the names of features (columns) encountered during training\nunivariate_transformer: the univariate model applied to determine the fillers (its fields contain the functions defining the filler computations)\nfiller_given_feature: dictionary of filler values, keyed on feature (column) names\n\nExamples\n\nusing MLJ\nimputer = FillImputer()\n\nX = (a = [1.0, 2.0, missing, 3.0, missing],\n b = coerce([\"y\", \"n\", \"y\", missing, \"y\"], Multiclass),\n c = [1, 1, 2, missing, 3])\n\njulia> schema(X)\n┌───────┬───────────────────────────────┐\n│ names │ scitypes │\n├───────┼───────────────────────────────┤\n│ a │ Union{Missing, Continuous} │\n│ b │ Union{Missing, Multiclass{2}} │\n│ c │ Union{Missing, Count} │\n└───────┴───────────────────────────────┘\n\nmach = machine(imputer, X)\nfit!(mach)\n\njulia> fitted_params(mach).filler_given_feature\nDict{Symbol, Any} with 3 entries:\n :a => 2.0\n :b => \"y\"\n :c => 2\n\njulia> transform(mach, X)\n(a = [1.0, 2.0, 2.0, 3.0, 2.0],\n b = CategoricalValue{String, UInt32}[\"y\", \"n\", \"y\", \"y\", \"y\"],\n c = [1, 1, 2, 2, 3],)\n\nSee also UnivariateFillImputer.\n\n\n\n\n\n","category":"type"},{"location":"transformers/#MLJModels.UnivariateFillImputer","page":"Transformers and Other Unsupervised models","title":"MLJModels.UnivariateFillImputer","text":"UnivariateFillImputer\n\nA model type for constructing a single variable fill imputer, based on MLJModels.jl, and implementing the MLJ model interface.\n\nFrom MLJ, the type can be imported using\n\nUnivariateFillImputer = @load UnivariateFillImputer pkg=MLJModels\n\nDo model = UnivariateFillImputer() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in UnivariateFillImputer(continuous_fill=...).\n\nUse this model to impute missing values in a vector with a fixed value learned from the non-missing values of the training vector.\n\nFor imputing missing values in tabular data, use FillImputer instead.\n\nTraining data\n\nIn MLJ or MLJBase, bind an instance model to data with\n\nmach = machine(model, x)\n\nwhere\n\nx: any abstract vector with element scitype Union{Missing, T} where T is a subtype of Continuous, Multiclass, OrderedFactor or Count; check scitype using scitype(x)\n\nTrain the machine using fit!(mach, rows=...).\n\nHyper-parameters\n\ncontinuous_fill: function or other callable to determine value to be imputed in the case of Continuous (abstract float) data; default is to apply median after skipping missing values\ncount_fill: function or other callable to determine value to be imputed in the case of Count (integer) data; default is to apply rounded median after skipping missing values\nfinite_fill: function or other callable to determine value to be imputed in the case of Multiclass or OrderedFactor data (categorical vectors); default is to apply mode after skipping missing values\n\nOperations\n\ntransform(mach, xnew): return xnew with missing values imputed with the fill values learned when fitting mach\n\nFitted parameters\n\nThe fields of fitted_params(mach) are:\n\nfiller: the fill value to be imputed in all new data\n\nExamples\n\nusing MLJ\nimputer = UnivariateFillImputer()\n\nx_continuous = [1.0, 2.0, missing, 3.0]\nx_multiclass = coerce([\"y\", \"n\", \"y\", missing, \"y\"], Multiclass)\nx_count = [1, 1, 1, 2, missing, 3, 3]\n\nmach = machine(imputer, x_continuous)\nfit!(mach)\n\njulia> fitted_params(mach)\n(filler = 2.0,)\n\njulia> transform(mach, [missing, missing, 101.0])\n3-element Vector{Float64}:\n 2.0\n 2.0\n 101.0\n\nmach2 = machine(imputer, x_multiclass) |> fit!\n\njulia> transform(mach2, x_multiclass)\n5-element CategoricalArray{String,1,UInt32}:\n \"y\"\n \"n\"\n \"y\"\n \"y\"\n \"y\"\n\nmach3 = machine(imputer, x_count) |> fit!\n\njulia> transform(mach3, [missing, missing, 5])\n3-element Vector{Int64}:\n 2\n 2\n 5\n\nFor imputing tabular data, use FillImputer.\n\n\n\n\n\n","category":"type"},{"location":"transformers/#MLJModels.FeatureSelector","page":"Transformers and Other Unsupervised models","title":"MLJModels.FeatureSelector","text":"FeatureSelector\n\nA model type for constructing a feature selector, based on MLJModels.jl, and implementing the MLJ model interface.\n\nFrom MLJ, the type can be imported using\n\nFeatureSelector = @load FeatureSelector pkg=MLJModels\n\nDo model = FeatureSelector() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in FeatureSelector(features=...).\n\nUse this model to select features (columns) of a table, usually as part of a model Pipeline.\n\nTraining data\n\nIn MLJ or MLJBase, bind an instance model to data with\n\nmach = machine(model, X)\n\nwhere\n\nX: any table of input features, where \"table\" is in the sense of Tables.jl\n\nTrain the machine using fit!(mach, rows=...).\n\nHyper-parameters\n\nfeatures: one of the following, with the behavior indicated:\n[] (empty, the default): filter out all features (columns) which were not encountered in training\nnon-empty vector of feature names (symbols): keep only the specified features (ignore=false) or keep only unspecified features (ignore=true)\nfunction or other callable: keep a feature if the callable returns true on its name. For example, specifying FeatureSelector(features = name -> name in [:x1, :x3], ignore = true) has the same effect as FeatureSelector(features = [:x1, :x3], ignore = true), namely to select all features, with the exception of :x1 and :x3.\nignore: whether to ignore or keep specified features, as explained above\n\nOperations\n\ntransform(mach, Xnew): select features from the table Xnew as specified by the model, taking features seen during training into account, if relevant\n\nFitted parameters\n\nThe fields of fitted_params(mach) are:\n\nfeatures_to_keep: the features that will be selected\n\nExample\n\nusing MLJ\n\nX = (ordinal1 = [1, 2, 3],\n ordinal2 = coerce([\"x\", \"y\", \"x\"], OrderedFactor),\n ordinal3 = [10.0, 20.0, 30.0],\n ordinal4 = [-20.0, -30.0, -40.0],\n nominal = coerce([\"Your father\", \"he\", \"is\"], Multiclass));\n\nselector = FeatureSelector(features=[:ordinal3, ], ignore=true);\n\njulia> transform(fit!(machine(selector, X)), X)\n(ordinal1 = [1, 2, 3],\n ordinal2 = CategoricalValue{Symbol,UInt32}[\"x\", \"y\", \"x\"],\n ordinal4 = [-20.0, -30.0, -40.0],\n nominal = CategoricalValue{String,UInt32}[\"Your father\", \"he\", \"is\"],)\n\n\n\n\n\n\n","category":"type"},{"location":"transformers/#MLJModels.UnivariateBoxCoxTransformer","page":"Transformers and Other Unsupervised models","title":"MLJModels.UnivariateBoxCoxTransformer","text":"UnivariateBoxCoxTransformer\n\nA model type for constructing a single variable Box-Cox transformer, based on MLJModels.jl, and implementing the MLJ model interface.\n\nFrom MLJ, the type can be imported using\n\nUnivariateBoxCoxTransformer = @load UnivariateBoxCoxTransformer pkg=MLJModels\n\nDo model = UnivariateBoxCoxTransformer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateBoxCoxTransformer(n=...).\n\nBox-Cox transformations attempt to make data look more normally distributed. This can improve performance and assist in the interpretation of models which suppose that data is generated by a normal distribution.\n\nA Box-Cox transformation (with shift) is of the form\n\nx -> ((x + c)^λ - 1)/λ\n\nfor some constant c and real λ, unless λ = 0, in which case the above is replaced with\n\nx -> log(x + c)\n\nGiven user-specified hyper-parameters n::Integer and shift::Bool, the present implementation learns the parameters c and λ from the training data as follows: If shift=true and zeros are encountered in the data, then c is set to 0.2 times the data mean. If there are no zeros, then no shift is applied. 
Finally, n different values of λ between -0.4 and 3 are considered, with λ fixed to the value maximizing normality of the transformed data.\n\nReference: Wikipedia entry for power transform.\n\nTraining data\n\nIn MLJ or MLJBase, bind an instance model to data with\n\nmach = machine(model, x)\n\nwhere\n\nx: any abstract vector with element scitype Continuous; check the scitype with scitype(x)\n\nTrain the machine using fit!(mach, rows=...).\n\nHyper-parameters\n\nn=171: number of values of the exponent λ to try\nshift=false: whether to include a preliminary constant translation in transformations, in the presence of zeros\n\nOperations\n\ntransform(mach, xnew): apply the Box-Cox transformation learned when fitting mach\ninverse_transform(mach, z): reconstruct the vector z whose transformation learned by mach is z\n\nFitted parameters\n\nThe fields of fitted_params(mach) are:\n\nλ: the learned Box-Cox exponent\nc: the learned shift\n\nExamples\n\nusing MLJ\nusing UnicodePlots\nusing Random\nRandom.seed!(123)\n\ntransf = UnivariateBoxCoxTransformer()\n\nx = randn(1000).^2\n\nmach = machine(transf, x)\nfit!(mach)\n\nz = transform(mach, x)\n\njulia> histogram(x)\n ┌ ┐\n [ 0.0, 2.0) ┤███████████████████████████████████ 848\n [ 2.0, 4.0) ┤████▌ 109\n [ 4.0, 6.0) ┤█▍ 33\n [ 6.0, 8.0) ┤▍ 7\n [ 8.0, 10.0) ┤▏ 2\n [10.0, 12.0) ┤ 0\n [12.0, 14.0) ┤▏ 1\n └ ┘\n Frequency\n\njulia> histogram(z)\n ┌ ┐\n [-5.0, -4.0) ┤█▎ 8\n [-4.0, -3.0) ┤████████▊ 64\n [-3.0, -2.0) ┤█████████████████████▊ 159\n [-2.0, -1.0) ┤█████████████████████████████▊ 216\n [-1.0, 0.0) ┤███████████████████████████████████ 254\n [ 0.0, 1.0) ┤█████████████████████████▊ 188\n [ 1.0, 2.0) ┤████████████▍ 90\n [ 2.0, 3.0) ┤██▊ 20\n [ 3.0, 4.0) ┤▎ 1\n └ ┘\n Frequency\n\n\n\n\n\n\n","category":"type"},{"location":"transformers/#MLJModels.UnivariateDiscretizer","page":"Transformers and Other Unsupervised models","title":"MLJModels.UnivariateDiscretizer","text":"UnivariateDiscretizer\n\nA model type for constructing a single variable discretizer, based on MLJModels.jl, and implementing the MLJ model interface.\n\nFrom MLJ, the type can be imported using\n\nUnivariateDiscretizer = @load UnivariateDiscretizer pkg=MLJModels\n\nDo model = UnivariateDiscretizer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateDiscretizer(n_classes=...).\n\nDiscretization converts a Continuous vector into an OrderedFactor vector. In particular, the output is a CategoricalVector (whose reference type is optimized).\n\nThe transformation is chosen so that the vector on which the transformer is fit has, in transformed form, an approximately uniform distribution of values. 
Specifically, if n_classes is the level of discretization, then 2*n_classes - 1 ordered quantiles are computed, the odd quantiles being used for transforming (discretization) and the even quantiles for inverse transforming.\n\nTraining data\n\nIn MLJ or MLJBase, bind an instance model to data with\n\nmach = machine(model, x)\n\nwhere\n\nx: any abstract vector with Continuous element scitype; check scitype with scitype(x).\n\nTrain the machine using fit!(mach, rows=...).\n\nHyper-parameters\n\nn_classes: number of discrete classes in the output\n\nOperations\n\ntransform(mach, xnew): discretize xnew according to the discretization learned when fitting mach\ninverse_transform(mach, z): attempt to reconstruct from z a vector that transforms to give z\n\nFitted parameters\n\nThe fields of fitted_params(mach).fitresult include:\n\nodd_quantiles: quantiles used for transforming (length is n_classes - 1)\neven_quantiles: quantiles used for inverse transforming (length is n_classes)\n\nExample\n\nusing MLJ\nusing Random\nRandom.seed!(123)\n\ndiscretizer = UnivariateDiscretizer(n_classes=100)\nmach = machine(discretizer, randn(1000))\nfit!(mach)\n\njulia> x = rand(5)\n5-element Vector{Float64}:\n 0.8585244609846809\n 0.37541692370451396\n 0.6767070590395461\n 0.9208844241267105\n 0.7064611415680901\n\njulia> z = transform(mach, x)\n5-element CategoricalArrays.CategoricalArray{UInt8,1,UInt8}:\n 0x52\n 0x42\n 0x4d\n 0x54\n 0x4e\n\nx_approx = inverse_transform(mach, z)\njulia> x - x_approx\n5-element Vector{Float64}:\n 0.008224506144777322\n 0.012731354778359405\n 0.0056265330571125816\n 0.005738175684445124\n 0.006835652575801987\n\n\n\n\n\n","category":"type"},{"location":"transformers/#MLJModels.UnivariateTimeTypeToContinuous","page":"Transformers and Other Unsupervised models","title":"MLJModels.UnivariateTimeTypeToContinuous","text":"UnivariateTimeTypeToContinuous\n\nA model type for constructing a single variable transformer that creates continuous representations of temporally typed data, based on MLJModels.jl, and implementing the MLJ model interface.\n\nFrom MLJ, the type can be imported using\n\nUnivariateTimeTypeToContinuous = @load UnivariateTimeTypeToContinuous pkg=MLJModels\n\nDo model = UnivariateTimeTypeToContinuous() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateTimeTypeToContinuous(zero_time=...).\n\nUse this model to convert vectors with a TimeType element type to vectors of Float64 type (Continuous element scitype).\n\nTraining data\n\nIn MLJ or MLJBase, bind an instance model to data with\n\nmach = machine(model, x)\n\nwhere\n\nx: any abstract vector whose element type is a subtype of Dates.TimeType\n\nTrain the machine using fit!(mach, rows=...).\n\nHyper-parameters\n\nzero_time: the time that is to correspond to 0.0 under transformations, with the type coinciding with the training data element type. 
If unspecified, the earliest time encountered in training is used.\nstep::Period=Hour(24): time interval to correspond to one unit under transformation\n\nOperations\n\ntransform(mach, xnew): apply the encoding inferred when mach was fit\n\nFitted parameters\n\nfitted_params(mach).fitresult is the tuple (zero_time, step) actually used in transformations, which may differ from the user-specified hyper-parameters.\n\nExample\n\nusing MLJ\nusing Dates\n\nx = [Date(2001, 1, 1) + Day(i) for i in 0:4]\n\nencoder = UnivariateTimeTypeToContinuous(zero_time=Date(2000, 1, 1),\n step=Week(1))\n\nmach = machine(encoder, x)\nfit!(mach)\njulia> transform(mach, x)\n5-element Vector{Float64}:\n 52.285714285714285\n 52.42857142857143\n 52.57142857142857\n 52.714285714285715\n 52.857142\n\n\n\n\n\n","category":"type"},{"location":"transformers/#Static-transformers","page":"Transformers and Other Unsupervised models","title":"Static transformers","text":"","category":"section"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"A static transformer is a model for transforming data that does not generalize to new data (does not \"learn\") but which nevertheless has hyperparameters. For example, the DBSCAN clustering model from Clustering.jl can assign labels to some collection of observations, but cannot directly assign a label to some new observation.","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"The general user may define their own static models. The main use-case is the insertion into Linear Pipelines of some parameter-dependent transformation. (If a static transformer has no hyper-parameters, it is tantamount to an ordinary function. An ordinary function can be inserted directly into a pipeline; the situation for learning networks is only slightly more complicated.)","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"The following example defines a new model type Averager to perform the weighted average of two vectors (target predictions, for example). We suppose the weighting is normalized, and therefore controlled by a single hyper-parameter, mix.","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"using MLJ","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"mutable struct Averager <: Static\n mix::Float64\nend\n\nMLJ.transform(a::Averager, _, y1, y2) = (1 - a.mix)*y1 + a.mix*y2","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"Important. Note the sub-typing <: Static.","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"Such static transformers with (unlearned) parameters can have arbitrarily many inputs, but only one output. In the single input case, an inverse_transform can also be defined. Since they have no real learned parameters, you bind a static transformer to a machine without specifying training arguments; there is no need to fit! 
the machine:","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"mach = machine(Averager(0.5))\ntransform(mach, [1, 2, 3], [3, 2, 1])","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"Let's see how we can include our Averager in a learning network to mix the predictions of two regressors, with one-hot encoding of the inputs. Here's two regressors for mixing, and some dummy data for testing our learning network:","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"ridge = (@load RidgeRegressor pkg=MultivariateStats)()\nknn = (@load KNNRegressor)()\n\nimport Random.seed!\nseed!(112)\nX = (\n x1=coerce(rand(\"ab\", 100), Multiclass),\n x2=rand(100),\n)\ny = X.x2 + 0.05*rand(100)\nschema(X)","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"And the learning network:","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"Xs = source(X)\nys = source(y)\n\naverager = Averager(0.5)\n\nmach0 = machine(OneHotEncoder(), Xs)\nW = transform(mach0, Xs) # one-hot encode the input\n\nmach1 = machine(ridge, W, ys)\ny1 = predict(mach1, W)\n\nmach2 = machine(knn, W, ys)\ny2 = predict(mach2, W)\n\nmach4= machine(averager)\nyhat = transform(mach4, y1, y2)\n\n# test:\nfit!(yhat)\nXnew = selectrows(X, 1:3)\nyhat(Xnew)","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"We next \"export\" the learning network as a standalone composite model type. First we need a struct for the composite model. 
Since we are restricting to Deterministic component regressors, the composite will also make deterministic predictions, and so gets the supertype DeterministicNetworkComposite:","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"mutable struct DoubleRegressor <: DeterministicNetworkComposite\n regressor1\n regressor2\n averager\nend","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"As described in Learning Networks, we next paste the learning network into a prefit declaration, replace the component models with symbolic placeholders, and add a learning network \"interface\":","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"import MLJBase\nfunction MLJBase.prefit(composite::DoubleRegressor, verbosity, X, y)\n Xs = source(X)\n ys = source(y)\n\n mach0 = machine(OneHotEncoder(), Xs)\n W = transform(mach0, Xs) # one-hot encode the input\n\n mach1 = machine(:regressor1, W, ys)\n y1 = predict(mach1, W)\n\n mach2 = machine(:regressor2, W, ys)\n y2 = predict(mach2, W)\n\n mach4= machine(:averager)\n yhat = transform(mach4, y1, y2)\n\n # learning network interface:\n (; predict=yhat)\nend","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"The new model type can be evaluated like any other supervised model:","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"X, y = @load_reduced_ames;\ncomposite = DoubleRegressor(ridge, knn, Averager(0.5))","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"composite.averager.mix = 0.25 # adjust mix from default of 0.5\nevaluate(composite, X, y, measure=l1)","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"A static transformer can also expose byproducts of the transform computation in the report of any associated machine. See Static models (models that do not generalize) for details.","category":"page"},{"location":"transformers/#Transformers-that-also-predict","page":"Transformers and Other Unsupervised models","title":"Transformers that also predict","text":"","category":"section"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"Some clustering algorithms learn to label data by identifying a collection of \"centroids\" in the training data. Any new input observation is labeled with the cluster to which it is closest (this is the output of predict) while the vector of all distances from the centroids defines a lower-dimensional representation of the observation (the output of transform). 
In the following example a K-means clustering algorithm assigns one of three labels 1, 2, 3 to the input features of the iris data set and compares them with the actual species recorded in the target (not seen by the algorithm).","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"using MLJ","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"import Random.seed!\nseed!(123)\n\nX, y = @load_iris\nKMeans = @load KMeans pkg=Clustering\nkmeans = KMeans()\nmach = machine(kmeans, X) |> fit!\nnothing # hide","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"Transforming:","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"Xsmall = transform(mach)\nselectrows(Xsmall, 1:4) |> pretty","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"Predicting:","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"yhat = predict(mach)\ncompare = zip(yhat, y) |> collect","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"compare[1:8]","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"compare[51:58]","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"compare[101:108]","category":"page"},{"location":"models/GaussianProcessRegressor_MLJScikitLearnInterface/#GaussianProcessRegressor_MLJScikitLearnInterface","page":"GaussianProcessRegressor","title":"GaussianProcessRegressor","text":"","category":"section"},{"location":"models/GaussianProcessRegressor_MLJScikitLearnInterface/","page":"GaussianProcessRegressor","title":"GaussianProcessRegressor","text":"GaussianProcessRegressor","category":"page"},{"location":"models/GaussianProcessRegressor_MLJScikitLearnInterface/","page":"GaussianProcessRegressor","title":"GaussianProcessRegressor","text":"A model type for constructing a Gaussian process regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/GaussianProcessRegressor_MLJScikitLearnInterface/","page":"GaussianProcessRegressor","title":"GaussianProcessRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/GaussianProcessRegressor_MLJScikitLearnInterface/","page":"GaussianProcessRegressor","title":"GaussianProcessRegressor","text":"GaussianProcessRegressor = @load GaussianProcessRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/GaussianProcessRegressor_MLJScikitLearnInterface/","page":"GaussianProcessRegressor","title":"GaussianProcessRegressor","text":"Do model = GaussianProcessRegressor() to construct an instance with default hyper-parameters. 
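For concreteness, here is a minimal usage sketch, not part of the official docstring; it assumes MLJScikitLearnInterface (and its Python dependencies) are installed, and uses synthetic data purely for illustration:

using MLJ

GaussianProcessRegressor = @load GaussianProcessRegressor pkg=MLJScikitLearnInterface verbosity=0
model = GaussianProcessRegressor()

X, y = make_regression(100, 3)            # synthetic table X and Continuous vector y
mach = machine(model, X, y) |> fit!
yhat = predict(mach, selectrows(X, 1:3))  # point predictions for the first three rows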
Provide keyword arguments to override hyper-parameter defaults, as in GaussianProcessRegressor(kernel=...).","category":"page"},{"location":"models/GaussianProcessRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"GaussianProcessRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/GaussianProcessRegressor_MLJScikitLearnInterface/","page":"GaussianProcessRegressor","title":"GaussianProcessRegressor","text":"kernel = nothing\nalpha = 1.0e-10\noptimizer = fmin_l_bfgs_b\nn_restarts_optimizer = 0\nnormalize_y = false\ncopy_X_train = true\nrandom_state = nothing","category":"page"},{"location":"models/MeanShift_MLJScikitLearnInterface/#MeanShift_MLJScikitLearnInterface","page":"MeanShift","title":"MeanShift","text":"","category":"section"},{"location":"models/MeanShift_MLJScikitLearnInterface/","page":"MeanShift","title":"MeanShift","text":"MeanShift","category":"page"},{"location":"models/MeanShift_MLJScikitLearnInterface/","page":"MeanShift","title":"MeanShift","text":"A model type for constructing a mean shift, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/MeanShift_MLJScikitLearnInterface/","page":"MeanShift","title":"MeanShift","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/MeanShift_MLJScikitLearnInterface/","page":"MeanShift","title":"MeanShift","text":"MeanShift = @load MeanShift pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/MeanShift_MLJScikitLearnInterface/","page":"MeanShift","title":"MeanShift","text":"Do model = MeanShift() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MeanShift(bandwidth=...).","category":"page"},{"location":"models/MeanShift_MLJScikitLearnInterface/","page":"MeanShift","title":"MeanShift","text":"Mean shift clustering using a flat kernel. Mean shift clustering aims to discover \"blobs\" in a smooth density of samples. It is a centroid-based algorithm, which works by updating candidates for centroids to be the mean of the points within a given region. These candidates are then filtered in a post-processing stage to eliminate near-duplicates to form the final set of centroids.\"","category":"page"},{"location":"models/StableRulesRegressor_SIRUS/#StableRulesRegressor_SIRUS","page":"StableRulesRegressor","title":"StableRulesRegressor","text":"","category":"section"},{"location":"models/StableRulesRegressor_SIRUS/","page":"StableRulesRegressor","title":"StableRulesRegressor","text":"StableRulesRegressor","category":"page"},{"location":"models/StableRulesRegressor_SIRUS/","page":"StableRulesRegressor","title":"StableRulesRegressor","text":"A model type for constructing a stable rules regressor, based on SIRUS.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/StableRulesRegressor_SIRUS/","page":"StableRulesRegressor","title":"StableRulesRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/StableRulesRegressor_SIRUS/","page":"StableRulesRegressor","title":"StableRulesRegressor","text":"StableRulesRegressor = @load StableRulesRegressor pkg=SIRUS","category":"page"},{"location":"models/StableRulesRegressor_SIRUS/","page":"StableRulesRegressor","title":"StableRulesRegressor","text":"Do model = StableRulesRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in StableRulesRegressor(rng=...).","category":"page"},{"location":"models/StableRulesRegressor_SIRUS/","page":"StableRulesRegressor","title":"StableRulesRegressor","text":"StableRulesRegressor implements the explainable rule-based regression model based on a random forest.","category":"page"},{"location":"models/StableRulesRegressor_SIRUS/#Training-data","page":"StableRulesRegressor","title":"Training data","text":"","category":"section"},{"location":"models/StableRulesRegressor_SIRUS/","page":"StableRulesRegressor","title":"StableRulesRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/StableRulesRegressor_SIRUS/","page":"StableRulesRegressor","title":"StableRulesRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/StableRulesRegressor_SIRUS/","page":"StableRulesRegressor","title":"StableRulesRegressor","text":"where","category":"page"},{"location":"models/StableRulesRegressor_SIRUS/","page":"StableRulesRegressor","title":"StableRulesRegressor","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)\ny: the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/StableRulesRegressor_SIRUS/","page":"StableRulesRegressor","title":"StableRulesRegressor","text":"Train the machine with fit!(mach, rows=...).","category":"page"},{"location":"models/StableRulesRegressor_SIRUS/#Hyperparameters","page":"StableRulesRegressor","title":"Hyperparameters","text":"","category":"section"},{"location":"models/StableRulesRegressor_SIRUS/","page":"StableRulesRegressor","title":"StableRulesRegressor","text":"rng::AbstractRNG=default_rng(): Random number generator. Using a StableRNG from StableRNGs.jl is advised.\npartial_sampling::Float64=0.7: Ratio of samples to use in each subset of the data. The default should be fine for most cases.\nn_trees::Int=1000: The number of trees to use. It is advisable to use at least a thousand trees for better rule selection and, in turn, better predictive performance.\nmax_depth::Int=2: The depth of the tree. A lower depth decreases model complexity and can therefore improve accuracy when the sample size is small (reducing overfitting).\nq::Int=10: Number of cutpoints to use per feature. The default value should be fine for most situations.\nmin_data_in_leaf::Int=5: Minimum number of data points per leaf.\nmax_rules::Int=10: This is the most important hyperparameter after lambda. The more rules, the more accurate the model should be. If this is not the case, tune lambda first. However, more rules will also decrease model interpretability. So, it is important to find a good balance here. In most cases, 10 to 40 rules should provide reasonable accuracy while remaining interpretable.\nlambda::Float64=1.0: The weights of the final rules are determined via a regularized regression over each rule as a binary feature. This hyperparameter specifies the strength of the ridge (L2) regularizer. SIRUS is very sensitive to the choice of this hyperparameter. Ensure that you try the full range from 10^-4 to 10^4 (e.g., 0.001, 0.01, ..., 100). When trying the range, one good check is to verify that an increase in max_rules increases performance. 
If this is not the case, then try a different value for lambda.","category":"page"},{"location":"models/StableRulesRegressor_SIRUS/#Fitted-parameters","page":"StableRulesRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/StableRulesRegressor_SIRUS/","page":"StableRulesRegressor","title":"StableRulesRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/StableRulesRegressor_SIRUS/","page":"StableRulesRegressor","title":"StableRulesRegressor","text":"fitresult: A StableRules object.","category":"page"},{"location":"models/StableRulesRegressor_SIRUS/#Operations","page":"StableRulesRegressor","title":"Operations","text":"","category":"section"},{"location":"models/StableRulesRegressor_SIRUS/","page":"StableRulesRegressor","title":"StableRulesRegressor","text":"predict(mach, Xnew): Return a vector of predictions for each row of Xnew.","category":"page"},{"location":"correcting_class_imbalance/#Correcting-Class-Imbalance","page":"Correcting Class Imbalance","title":"Correcting Class Imbalance","text":"","category":"section"},{"location":"correcting_class_imbalance/#Oversampling-and-undersampling-methods","page":"Correcting Class Imbalance","title":"Oversampling and undersampling methods","text":"","category":"section"},{"location":"correcting_class_imbalance/","page":"Correcting Class Imbalance","title":"Correcting Class Imbalance","text":"Models providing oversampling or undersampling methods, to correct for class imbalance, are listed under Class Imbalance. In particular, several popular algorithms are provided by the Imbalance.jl package, which includes detailed documentation and tutorials.","category":"page"},{"location":"correcting_class_imbalance/#Incorporating-class-imbalance-in-supervised-learning-pipelines","page":"Correcting Class Imbalance","title":"Incorporating class imbalance in supervised learning pipelines","text":"","category":"section"},{"location":"correcting_class_imbalance/","page":"Correcting Class Imbalance","title":"Correcting Class Imbalance","text":"One or more oversampling/undersampling algorithms can be fused with an MLJ classifier using the BalancedModel wrapper. This creates a new classifier which can be treated like any other; resampling to correct for class imbalance, relevant only for training of the atomic classifier, is then carried out internally. 
If, for example, one applies cross-validation to the wrapped classifier (using evaluate!, say) then over/undersampling is automatically repeated for each training fold.","category":"page"},{"location":"correcting_class_imbalance/","page":"Correcting Class Imbalance","title":"Correcting Class Imbalance","text":"Refer to the MLJBalancing.jl documentation for further details.","category":"page"},{"location":"correcting_class_imbalance/","page":"Correcting Class Imbalance","title":"Correcting Class Imbalance","text":"MLJBalancing.BalancedModel","category":"page"},{"location":"correcting_class_imbalance/#MLJBalancing.BalancedModel","page":"Correcting Class Imbalance","title":"MLJBalancing.BalancedModel","text":"BalancedModel(; model=nothing, balancer1=balancer_model1, balancer2=balancer_model2, ...)\nBalancedModel(model; balancer1=balancer_model1, balancer2=balancer_model2, ...)\n\nGiven a classification model, and one or more balancer models that all implement the MLJModelInterface, BalancedModel allows constructing a sequential pipeline that wraps an arbitrary number of balancing models and a classifier together.\n\nOperation\n\nDuring training, data is first passed to balancer1 and the result is passed to balancer2 and so on; the result from the final balancer is then passed to the classifier for training.\nDuring prediction, the balancers have no effect.\n\nArguments\n\nmodel::Supervised: A classification model that implements the MLJModelInterface. \nbalancer1::Static=...: The first balancer model to pass the data to. This keyword argument can have any name.\nbalancer2::Static=...: The second balancer model to pass the data to. This keyword argument can have any name.\nand so on for an arbitrary number of balancers.\n\nReturns\n\nAn instance of type ProbabilisticBalancedModel or DeterministicBalancedModel, depending on the prediction type of model.\n\nExample\n\nusing MLJ\nusing Imbalance\n\n# generate data\nX, y = Imbalance.generate_imbalanced_data(1000, 5; class_probs=[0.2, 0.3, 0.5])\n\n# prepare classification and balancing models\nSMOTENC = @load SMOTENC pkg=Imbalance verbosity=0\nTomekUndersampler = @load TomekUndersampler pkg=Imbalance verbosity=0\nLogisticClassifier = @load LogisticClassifier pkg=MLJLinearModels verbosity=0\n\noversampler = SMOTENC(k=5, ratios=1.0, rng=42)\nundersampler = TomekUndersampler(min_ratios=0.5, rng=42)\nlogistic_model = LogisticClassifier()\n\n# wrap them in a BalancedModel\nbalanced_model = BalancedModel(model=logistic_model, balancer1=oversampler, balancer2=undersampler)\n\n# now this behaves as a unified model that can be trained, validated, fine-tuned, etc.\nmach = machine(balanced_model, X, y)\nfit!(mach)\n\n\n\n\n\n","category":"function"},{"location":"working_with_categorical_data/#Working-with-Categorical-Data","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"","category":"section"},{"location":"working_with_categorical_data/#Scientific-types-for-discrete-data","page":"Working with Categorical Data","title":"Scientific types for discrete data","text":"","category":"section"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"Recall that models articulate their data requirements using scientific types (see Getting Started or the ScientificTypes.jl documentation). 
There are three scientific types discrete data can have: Count, OrderedFactor and Multiclass.","category":"page"},{"location":"working_with_categorical_data/#Count-data","page":"Working with Categorical Data","title":"Count data","text":"","category":"section"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"In MLJ you cannot use integers to represent (finite) categorical data. Integers are reserved for discrete data you want interpreted as Count <: Infinite:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"using MLJ # hide\nscitype([1, 4, 5, 6])","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"The Count scientific type includes things like the number of phone calls, or city populations, and other \"frequency\" data of a generally unbounded nature.","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"That said, you may have data that is theoretically Count, but which you coerce to OrderedFactor to enable the use of more models, trusting to your knowledge of how those models work to inform an appropriate interpretation.","category":"page"},{"location":"working_with_categorical_data/#OrderedFactor-and-Multiclass-data","page":"Working with Categorical Data","title":"OrderedFactor and Multiclass data","text":"","category":"section"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"Other integer data, such as the number of an animal's legs, or number of rooms in homes, are, generally, coerced to OrderedFactor <: Finite. The other categorical scientific type is Multiclass <: Finite, which is for unordered categorical data. Coercing data to one of these two forms is discussed under Detecting and coercing improperly represented categorical data below.","category":"page"},{"location":"working_with_categorical_data/#Binary-data","page":"Working with Categorical Data","title":"Binary data","text":"","category":"section"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"There is no separate scientific type for binary data. Binary data is either OrderedFactor{2}, if ordered, or Multiclass{2} otherwise. Data with type OrderedFactor{2} is considered to have an intrinsic \"positive\" class, e.g., the outcome of a medical test, or the \"pass/fail\" outcome of an exam. MLJ measures, such as true_positive, assume the second class in the ordering is the \"positive\" class. 
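For instance, here is a minimal sketch (the labels are invented for illustration) of coercing binary data and controlling which class is treated as "positive":

using MLJ

y = coerce(["abnormal", "normal", "normal", "abnormal"], OrderedFactor)
levels(y)                           # ["abnormal", "normal"], so "normal" would be treated as positive
levels!(y, ["normal", "abnormal"])  # reorder so that "abnormal" is the second (positive) class
levels(y)                           # ["normal", "abnormal"]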
Inspecting and changing order are discussed in the next section.","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"If data has type Bool, it is considered Count data (as Bool <: Integer) and, generally, users will want to coerce such data to Multiclass or OrderedFactor.","category":"page"},{"location":"working_with_categorical_data/#Detecting-and-coercing-improperly-represented-categorical-data","page":"Working with Categorical Data","title":"Detecting and coercing improperly represented categorical data","text":"","category":"section"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"One inspects the scientific type of data using scitype as shown above. To inspect all column scientific types in a table simultaneously, use schema. (The scitype(X) of a table X contains a condensed form of this information used in type dispatch; see here.)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"import DataFrames: DataFrame\nX = DataFrame(\n name = [\"Siri\", \"Robo\", \"Alexa\", \"Cortana\"],\n gender = [\"male\", \"male\", \"Female\", \"female\"],\n likes_soup = [true, false, false, true],\n height = [152, missing, 148, 163],\n rating = [2, 5, 2, 1],\n outcome = [\"rejected\", \"accepted\", \"accepted\", \"rejected\"],\n)\nschema(X)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"Coercing a single column:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"X.outcome = coerce(X.outcome, OrderedFactor)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"The machine type of the result is a CategoricalArray. For more on this type see Under the hood: CategoricalValue and CategoricalArray below.","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"Inspecting the order of the levels:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"levels(X.outcome)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"Since we wish to regard \"accepted\" as the positive class, it should appear second, which we correct with the levels! function:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"levels!(X.outcome, [\"rejected\", \"accepted\"])\nlevels(X.outcome)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"warning: Changing levels of categorical data\nThe order of levels should generally be changed early in your data science workflow and then not again. Similar remarks apply to adding levels (which is possible; see the CategoricalArrays.jl documentation). 
MLJ supervised and unsupervised models assume levels and their order do not change.","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"Coercing all remaining types simultaneously:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"Xnew = coerce(X, :gender => Multiclass,\n :likes_soup => OrderedFactor,\n :height => Continuous,\n :rating => OrderedFactor)\nschema(Xnew)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"For DataFrames there is also in-place coercion, using coerce!.","category":"page"},{"location":"working_with_categorical_data/#Tracking-all-levels","page":"Working with Categorical Data","title":"Tracking all levels","text":"","category":"section"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"The key property of vectors of scientific type OrderedFactor and Multiclass is that the pool of all levels is not lost when separating out one or more elements:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"v = Xnew.rating","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"levels(v)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"levels(v[1:2])","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"levels(v[2])","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"By tracking all classes in this way, MLJ avoids common pain points around categorical data, such as evaluating models on an evaluation set, only to crash your code because classes appear there which were not seen during training.","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"By drawing test, validation and training data from a common data structure (as described in Getting Started, for example) one ensures that all possible classes of categorical variables are tracked at all times. 
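As a minimal sketch of this (with toy data), partitioning a categorical vector leaves every subset carrying the full pool of levels:

```julia
using MLJ
y = coerce(["a", "a", "b", "b", "c", "c"], Multiclass)
train, test = partition(y, 0.5)   # no shuffling: train = ["a", "a", "b"], test = ["b", "c", "c"]

# each subvector retains the common pool, even for classes it does not contain:
levels(train) == levels(test) == levels(y)   # true
```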
However, this does not mitigate problems with new production data, if categorical features there are missing classes or contain previously unseen classes.","category":"page"},{"location":"working_with_categorical_data/#New-or-missing-levels-in-production-data","page":"Working with Categorical Data","title":"New or missing levels in production data","text":"","category":"section"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"warning: Warning\nUnpredictable behavior may result whenever Finite categorical data presents in a production set with different classes (levels) from those presented during training","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"Consider, for example, the following naive workflow:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"# train a one-hot encoder on some data:\nx = coerce([\"black\", \"white\", \"white\", \"black\"], Multiclass)\nX = DataFrame(x=x)\n\nmodel = OneHotEncoder()\nmach = machine(model, X) |> fit!\n\n# one-hot encode new data with missing classes:\nxproduction = coerce([\"white\", \"white\"], Multiclass)\nXproduction = DataFrame(x=xproduction)\nXproduction == X[2:3,:]","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"So far, so good. But the following operation throws an error:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"julia> transform(mach, Xproduction) == transform(mach, X[2:3,:])\nERROR: Found category level mismatch in feature `x`. 
Consider using `levels!` to ensure fitted and transforming features have the same category levels.","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"The problem here is that levels(X.x) and levels(Xproduction.x) are different:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"levels(X.x)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"levels(Xproduction.x)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"This could be anticipated by the fact that the training and production data have different schema:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"schema(X)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"schema(Xproduction)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"One fix is to manually correct the levels of the production data:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"levels!(Xproduction.x, levels(x))\ntransform(mach, Xproduction) == transform(mach, X[2:3,:])","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"Another solution is to pack all production data with dummy rows based on the training data (subsequently dropped) to ensure there are no missing classes. Currently, MLJ contains no general tooling to check and fix categorical levels in production data (although one can check that training data and production data have the same schema, to ensure the number of classes in categorical data is consistent).","category":"page"},{"location":"working_with_categorical_data/#Extracting-an-integer-representation-of-Finite-data","page":"Working with Categorical Data","title":"Extracting an integer representation of Finite data","text":"","category":"section"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"Occasionally, you may really want an integer representation of data that currently has scitype Finite. For example, you are a developer wrapping an algorithm from an external package for use in MLJ, and that algorithm uses integer representations. 
Use the int method for this purpose, and use decoder to construct decoders for reversing the transformation:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"v = coerce([\"one\", \"two\", \"three\", \"one\"], OrderedFactor);\nlevels!(v, [\"one\", \"two\", \"three\"]);\nv_int = int(v)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"d = decoder(v); # or decoder(v[1])\nd.(v_int)","category":"page"},{"location":"working_with_categorical_data/#Under-the-hood:-CategoricalValue-and-CategoricalArray","page":"Working with Categorical Data","title":"Under the hood: CategoricalValue and CategoricalArray","text":"","category":"section"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"In MLJ the objects with OrderedFactor or Multiclass scientific type have machine type CategoricalValue, from the CategoricalArrays.jl package. In some sense CategoricalValues are an implementation detail users can ignore for the most part, as shown above. However, you may want some basic understanding of these types, and those implementing MLJ's model interface for new algorithms will have to understand them. For the complete API, see the CategoricalArrays.jl documentation. Here are the basics:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"To construct an OrderedFactor or Multiclass vector directly from raw labels, one uses categorical:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"using CategoricalArrays # hide\nv = categorical(['A', 'B', 'A', 'A', 'C'])\ntypeof(v)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"(Equivalent to the idiomatic MLJ v = coerce(['A', 'B', 'A', 'A', 'C'], Multiclass).)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"scitype(v)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"v = categorical(['A', 'B', 'A', 'A', 'C'], ordered=true, compress=true)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"scitype(v)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"When you index a CategoricalVector you don't get a raw label, but instead an instance of CategoricalValue. As explained above, this value knows the complete pool of levels from the vector from which it came. 
Use get(val) to extract the raw label from a value val.","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"Despite the distinction that exists between a value (element) and a label, the two are the same from the point of view of == and in:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"v[1] == 'A' # true\n'A' in v # true","category":"page"},{"location":"working_with_categorical_data/#Probabilistic-predictions-of-categorical-data","page":"Working with Categorical Data","title":"Probabilistic predictions of categorical data","text":"","category":"section"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"Recall from Getting Started that probabilistic classifiers ordinarily predict UnivariateFinite distributions, not raw probabilities (which are instead accessed using the pdf method). Here's how to construct such a distribution yourself:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"v = coerce([\"yes\", \"no\", \"yes\", \"yes\", \"maybe\"], Multiclass)\nd = UnivariateFinite([v[2], v[1]], [0.9, 0.1])","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"Or, equivalently,","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"d = UnivariateFinite([\"no\", \"yes\"], [0.9, 0.1], pool=v)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"This distribution tracks all levels, not just the ones to which you have assigned probabilities:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"pdf(d, \"maybe\")","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"However, pdf(d, \"dunno\") will throw an error.","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"You can declare pool=missing, but then \"maybe\" will not be tracked:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"d = UnivariateFinite([\"no\", \"yes\"], [0.9, 0.1], pool=missing)\nlevels(d)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"To construct a whole vector of UnivariateFinite distributions, simply give the constructor a matrix of probabilities:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"yes_probs = rand(5)\nprobs = hcat(1 .- yes_probs, yes_probs)\nd_vec = UnivariateFinite([\"no\", \"yes\"], probs, pool=v)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"Or, 
equivalently:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"d_vec = UnivariateFinite([\"no\", \"yes\"], yes_probs, augment=true, pool=v)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"For more options, see UnivariateFinite.","category":"page"},{"location":"models/COPODDetector_OutlierDetectionPython/#COPODDetector_OutlierDetectionPython","page":"COPODDetector","title":"COPODDetector","text":"","category":"section"},{"location":"models/COPODDetector_OutlierDetectionPython/","page":"COPODDetector","title":"COPODDetector","text":"COPODDetector(n_jobs = 1)","category":"page"},{"location":"models/COPODDetector_OutlierDetectionPython/","page":"COPODDetector","title":"COPODDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.copod","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_BetaML/#MultitargetNeuralNetworkRegressor_BetaML","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"","category":"section"},{"location":"models/MultitargetNeuralNetworkRegressor_BetaML/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"mutable struct MultitargetNeuralNetworkRegressor <: MLJModelInterface.Deterministic","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_BetaML/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"A simple but flexible Feedforward Neural Network, from the Beta Machine Learning Toolkit (BetaML) for regression of multiple dimensional targets.","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_BetaML/#Parameters:","page":"MultitargetNeuralNetworkRegressor","title":"Parameters:","text":"","category":"section"},{"location":"models/MultitargetNeuralNetworkRegressor_BetaML/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"layers: Array of layer objects [def: nothing, i.e. basic network]. See subtypes(BetaML.AbstractLayer) for supported layers\nloss: Loss (cost) function [def: BetaML.squared_cost]. Should always assume y and ŷ as matrices.\nwarning: Warning\nIf you change the parameter loss, you need to either provide its derivative on the parameter dloss or use autodiff with dloss=nothing.\ndloss: Derivative of the loss function [def: BetaML.dsquared_cost, i.e. use the derivative of the squared cost]. Use nothing for autodiff.\nepochs: Number of epochs, i.e. passages trough the whole training sample [def: 300]\nbatch_size: Size of each individual batch [def: 16]\nopt_alg: The optimisation algorithm to update the gradient at each batch [def: BetaML.ADAM()]. 
See subtypes(BetaML.OptimisationAlgorithm) for supported optimizers\nshuffle: Whether to randomly shuffle the data at each iteration (epoch) [def: true]\ndescr: An optional title and/or description for this model\ncb: A call back function to provide information during training [def: BetaML.fitting_info]\nrng: Random Number Generator (see FIXEDSEED) [deafult: Random.GLOBAL_RNG]","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_BetaML/#Notes:","page":"MultitargetNeuralNetworkRegressor","title":"Notes:","text":"","category":"section"},{"location":"models/MultitargetNeuralNetworkRegressor_BetaML/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"data must be numerical\nthe label should be a n-records by n-dimensions matrix","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_BetaML/#Example:","page":"MultitargetNeuralNetworkRegressor","title":"Example:","text":"","category":"section"},{"location":"models/MultitargetNeuralNetworkRegressor_BetaML/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"julia> using MLJ\n\njulia> X, y = @load_boston;\n\njulia> ydouble = hcat(y, y .*2 .+5);\n\njulia> modelType = @load MultitargetNeuralNetworkRegressor pkg = \"BetaML\" verbosity=0\nBetaML.Nn.MultitargetNeuralNetworkRegressor\n\njulia> layers = [BetaML.DenseLayer(12,50,f=BetaML.relu),BetaML.DenseLayer(50,50,f=BetaML.relu),BetaML.DenseLayer(50,50,f=BetaML.relu),BetaML.DenseLayer(50,2,f=BetaML.relu)];\n\njulia> model = modelType(layers=layers,opt_alg=BetaML.ADAM(),epochs=500)\nMultitargetNeuralNetworkRegressor(\n layers = BetaML.Nn.AbstractLayer[BetaML.Nn.DenseLayer([-0.2591582523441157 -0.027962845131416225 … 0.16044535560124418 -0.12838827994676857; -0.30381834909561184 0.2405495243851402 … -0.2588144861880588 0.09538577909777807; … ; -0.017320292924711156 -0.14042266424603767 … 0.06366999105841187 -0.13419651752478906; 0.07393079961409338 0.24521350531110264 … 0.04256867886217541 -0.0895506802948175], [0.14249427336553644, 0.24719379413682485, -0.25595911822556566, 0.10034088778965933, -0.017086404878505712, 0.21932184025609347, -0.031413516834861266, -0.12569076082247596, -0.18080140982481183, 0.14551901873323253 … -0.13321995621967364, 0.2436582233332092, 0.0552222336976439, 0.07000814133633904, 0.2280064379660025, -0.28885681475734193, -0.07414214246290696, -0.06783184733650621, -0.055318068046308455, -0.2573488383282579], BetaML.Utils.relu, BetaML.Utils.drelu), BetaML.Nn.DenseLayer([-0.0395424111703751 -0.22531232360829911 … -0.04341228943744482 0.024336206858365517; -0.16481887432946268 0.17798073384748508 … -0.18594039305095766 0.051159225856547474; … ; -0.011639475293705043 -0.02347011206244673 … 0.20508869536159186 -0.1158382446274592; -0.19078069527757857 -0.007487540070740484 … -0.21341165344291158 -0.24158671316310726], [-0.04283623889330032, 0.14924461547060602, -0.17039563392959683, 0.00907774027816255, 0.21738885963113852, -0.06308040225941691, -0.14683286822101105, 0.21726892197970937, 0.19784321784707126, -0.0344988665714947 … -0.23643089430602846, -0.013560425201427584, 0.05323948910726356, -0.04644175812567475, -0.2350400292671211, 0.09628312383424742, 0.07016420995205697, -0.23266392927140334, -0.18823664451487, 0.2304486691429084], BetaML.Utils.relu, BetaML.Utils.drelu), BetaML.Nn.DenseLayer([-0.11504184627266828 0.08601794194664503 … 0.03843129724045469 -0.18417305624127284; 0.10181551438831654 0.13459759904443674 … 0.11094951365942118 
-0.1549466590355218; … ; 0.15279817525427697 0.0846661196058916 … -0.07993619892911122 0.07145402617285884; -0.1614160186346092 -0.13032002335149 … -0.12310552194729624 -0.15915773071049827], [-0.03435885900946367, -0.1198543931290306, 0.008454985905194445, -0.17980887188986966, -0.03557204910359624, 0.19125847393334877, -0.10949700778538696, -0.09343206702591, -0.12229583511781811, -0.09123969069220564 … 0.22119233518322862, 0.2053873143308657, 0.12756489387198222, 0.11567243705173319, -0.20982445664020496, 0.1595157838386987, -0.02087331046544119, -0.20556423263489765, -0.1622837764237961, -0.019220998739847395], BetaML.Utils.relu, BetaML.Utils.drelu), BetaML.Nn.DenseLayer([-0.25796717031347993 0.17579536633402948 … -0.09992960168785256 -0.09426177454620635; -0.026436330246675632 0.18070899284865127 … -0.19310119102392206 -0.06904005900252091], [0.16133004882307822, -0.3061228721091248], BetaML.Utils.relu, BetaML.Utils.drelu)], \n loss = BetaML.Utils.squared_cost, \n dloss = BetaML.Utils.dsquared_cost, \n epochs = 500, \n batch_size = 32, \n opt_alg = BetaML.Nn.ADAM(BetaML.Nn.var\"#90#93\"(), 1.0, 0.9, 0.999, 1.0e-8, BetaML.Nn.Learnable[], BetaML.Nn.Learnable[]), \n shuffle = true, \n descr = \"\", \n cb = BetaML.Nn.fitting_info, \n rng = Random._GLOBAL_RNG())\n\njulia> mach = machine(model, X, ydouble);\n\njulia> fit!(mach);\n\njulia> ŷdouble = predict(mach, X);\n\njulia> hcat(ydouble,ŷdouble)\n506×4 Matrix{Float64}:\n 24.0 53.0 28.4624 62.8607\n 21.6 48.2 22.665 49.7401\n 34.7 74.4 31.5602 67.9433\n 33.4 71.8 33.0869 72.4337\n ⋮ \n 23.9 52.8 23.3573 50.654\n 22.0 49.0 22.1141 48.5926\n 11.9 28.8 19.9639 45.5823","category":"page"},{"location":"models/MultinomialNBClassifier_MLJScikitLearnInterface/#MultinomialNBClassifier_MLJScikitLearnInterface","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"","category":"section"},{"location":"models/MultinomialNBClassifier_MLJScikitLearnInterface/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"MultinomialNBClassifier","category":"page"},{"location":"models/MultinomialNBClassifier_MLJScikitLearnInterface/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"A model type for constructing a multinomial naive Bayes classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/MultinomialNBClassifier_MLJScikitLearnInterface/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/MultinomialNBClassifier_MLJScikitLearnInterface/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"MultinomialNBClassifier = @load MultinomialNBClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/MultinomialNBClassifier_MLJScikitLearnInterface/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"Do model = MultinomialNBClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultinomialNBClassifier(alpha=...).","category":"page"},{"location":"models/MultinomialNBClassifier_MLJScikitLearnInterface/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"Multinomial naive bayes classifier. It is suitable for classification with discrete features (e.g. 
word counts for text classification).","category":"page"},{"location":"models/LarsRegressor_MLJScikitLearnInterface/#LarsRegressor_MLJScikitLearnInterface","page":"LarsRegressor","title":"LarsRegressor","text":"","category":"section"},{"location":"models/LarsRegressor_MLJScikitLearnInterface/","page":"LarsRegressor","title":"LarsRegressor","text":"LarsRegressor","category":"page"},{"location":"models/LarsRegressor_MLJScikitLearnInterface/","page":"LarsRegressor","title":"LarsRegressor","text":"A model type for constructing a least angle regressor (LARS), based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LarsRegressor_MLJScikitLearnInterface/","page":"LarsRegressor","title":"LarsRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LarsRegressor_MLJScikitLearnInterface/","page":"LarsRegressor","title":"LarsRegressor","text":"LarsRegressor = @load LarsRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/LarsRegressor_MLJScikitLearnInterface/","page":"LarsRegressor","title":"LarsRegressor","text":"Do model = LarsRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LarsRegressor(fit_intercept=...).","category":"page"},{"location":"models/LarsRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"LarsRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/LarsRegressor_MLJScikitLearnInterface/","page":"LarsRegressor","title":"LarsRegressor","text":"fit_intercept = true\nverbose = false\nnormalize = false\nprecompute = auto\nn_nonzero_coefs = 500\neps = 2.220446049250313e-16\ncopy_X = true\nfit_path = true","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/#LOFDetector_OutlierDetectionNeighbors","page":"LOFDetector","title":"LOFDetector","text":"","category":"section"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"LOFDetector(k = 5,\n metric = Euclidean(),\n algorithm = :kdtree,\n leafsize = 10,\n reorder = true,\n parallel = false)","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"Calculate an anomaly score based on the density of an instance in comparison to its neighbors. This algorithm introduced the notion of local outliers and was developed by Breunig et al., see [1].","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/#Parameters","page":"LOFDetector","title":"Parameters","text":"","category":"section"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"k::Integer","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"Number of neighbors (must be greater than 0).","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"metric::Metric","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"This is one of the Metric types defined in the Distances.jl package. 
It is possible to define your own metrics by creating new types that are subtypes of Metric.","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"algorithm::Symbol","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"One of (:kdtree, :balltree). In a kdtree, points are recursively split into groups using hyper-planes. Therefore a KDTree only works with axis aligned metrics which are: Euclidean, Chebyshev, Minkowski and Cityblock. A brutetree linearly searches all points in a brute force fashion and works with any Metric. A balltree recursively splits points into groups bounded by hyper-spheres and works with any Metric.","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"static::Union{Bool, Symbol}","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"One of (true, false, :auto). Whether the input data for fitting and transform should be statically or dynamically allocated. If true, the data is statically allocated. If false, the data is dynamically allocated. If :auto, the data is dynamically allocated if the product of all dimensions except the last is greater than 100.","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"leafsize::Int","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"Determines at what number of points to stop splitting the tree further. There is a trade-off between traversing the tree and having to evaluate the metric function for increasing number of points.","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"reorder::Bool","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"While building the tree this will put points close in distance close in memory since this helps with cache locality. In this case, a copy of the original data will be made so that the original data is left unmodified. This can have a significant impact on performance and is by default set to true.","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"parallel::Bool","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"Parallelize score and predict using all threads available. The number of threads can be set with the JULIA_NUM_THREADS environment variable. 
Note: fit is not parallel.","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/#Examples","page":"LOFDetector","title":"Examples","text":"","category":"section"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"using OutlierDetection: LOFDetector, fit, transform\ndetector = LOFDetector()\nX = rand(10, 100)\nmodel, result = fit(detector, X; verbosity=0)\ntest_scores = transform(detector, model, X)","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/#References","page":"LOFDetector","title":"References","text":"","category":"section"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"[1] Breunig, Markus M.; Kriegel, Hans-Peter; Ng, Raymond T.; Sander, Jörg (2000): LOF: Identifying Density-Based Local Outliers.","category":"page"},{"location":"models/AdaBoostClassifier_MLJScikitLearnInterface/#AdaBoostClassifier_MLJScikitLearnInterface","page":"AdaBoostClassifier","title":"AdaBoostClassifier","text":"","category":"section"},{"location":"models/AdaBoostClassifier_MLJScikitLearnInterface/","page":"AdaBoostClassifier","title":"AdaBoostClassifier","text":"AdaBoostClassifier","category":"page"},{"location":"models/AdaBoostClassifier_MLJScikitLearnInterface/","page":"AdaBoostClassifier","title":"AdaBoostClassifier","text":"A model type for constructing an AdaBoost classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/AdaBoostClassifier_MLJScikitLearnInterface/","page":"AdaBoostClassifier","title":"AdaBoostClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/AdaBoostClassifier_MLJScikitLearnInterface/","page":"AdaBoostClassifier","title":"AdaBoostClassifier","text":"AdaBoostClassifier = @load AdaBoostClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/AdaBoostClassifier_MLJScikitLearnInterface/","page":"AdaBoostClassifier","title":"AdaBoostClassifier","text":"Do model = AdaBoostClassifier() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in AdaBoostClassifier(estimator=...).","category":"page"},{"location":"models/AdaBoostClassifier_MLJScikitLearnInterface/","page":"AdaBoostClassifier","title":"AdaBoostClassifier","text":"An AdaBoost classifier is a meta-estimator that begins by fitting a classifier on the original dataset and then fits additional copies of the classifier on the same dataset but where the weights of incorrectly classified instances are adjusted such that subsequent classifiers focus more on difficult cases.","category":"page"},{"location":"models/AdaBoostClassifier_MLJScikitLearnInterface/","page":"AdaBoostClassifier","title":"AdaBoostClassifier","text":"This class implements the algorithm known as AdaBoost-SAMME.","category":"page"},{"location":"models/SVMLinearClassifier_MLJScikitLearnInterface/#SVMLinearClassifier_MLJScikitLearnInterface","page":"SVMLinearClassifier","title":"SVMLinearClassifier","text":"","category":"section"},{"location":"models/SVMLinearClassifier_MLJScikitLearnInterface/","page":"SVMLinearClassifier","title":"SVMLinearClassifier","text":"SVMLinearClassifier","category":"page"},{"location":"models/SVMLinearClassifier_MLJScikitLearnInterface/","page":"SVMLinearClassifier","title":"SVMLinearClassifier","text":"A model type for constructing a linear support vector classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/SVMLinearClassifier_MLJScikitLearnInterface/","page":"SVMLinearClassifier","title":"SVMLinearClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/SVMLinearClassifier_MLJScikitLearnInterface/","page":"SVMLinearClassifier","title":"SVMLinearClassifier","text":"SVMLinearClassifier = @load SVMLinearClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/SVMLinearClassifier_MLJScikitLearnInterface/","page":"SVMLinearClassifier","title":"SVMLinearClassifier","text":"Do model = SVMLinearClassifier() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in SVMLinearClassifier(penalty=...).","category":"page"},{"location":"models/SVMLinearClassifier_MLJScikitLearnInterface/#Hyper-parameters","page":"SVMLinearClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/SVMLinearClassifier_MLJScikitLearnInterface/","page":"SVMLinearClassifier","title":"SVMLinearClassifier","text":"penalty = l2\nloss = squared_hinge\ndual = true\ntol = 0.0001\nC = 1.0\nmulti_class = ovr\nfit_intercept = true\nintercept_scaling = 1.0\nrandom_state = nothing\nmax_iter = 1000","category":"page"},{"location":"models/StableForestClassifier_SIRUS/#StableForestClassifier_SIRUS","page":"StableForestClassifier","title":"StableForestClassifier","text":"","category":"section"},{"location":"models/StableForestClassifier_SIRUS/","page":"StableForestClassifier","title":"StableForestClassifier","text":"StableForestClassifier","category":"page"},{"location":"models/StableForestClassifier_SIRUS/","page":"StableForestClassifier","title":"StableForestClassifier","text":"A model type for constructing a stable forest classifier, based on SIRUS.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/StableForestClassifier_SIRUS/","page":"StableForestClassifier","title":"StableForestClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/StableForestClassifier_SIRUS/","page":"StableForestClassifier","title":"StableForestClassifier","text":"StableForestClassifier = @load StableForestClassifier pkg=SIRUS","category":"page"},{"location":"models/StableForestClassifier_SIRUS/","page":"StableForestClassifier","title":"StableForestClassifier","text":"Do model = StableForestClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in StableForestClassifier(rng=...).","category":"page"},{"location":"models/StableForestClassifier_SIRUS/","page":"StableForestClassifier","title":"StableForestClassifier","text":"StableForestClassifier implements the random forest classifier with a stabilized forest structure (Bénard et al., 2021). This stabilization increases stability when extracting rules. The impact on the predictive accuracy compared to standard random forests should be relatively small.","category":"page"},{"location":"models/StableForestClassifier_SIRUS/","page":"StableForestClassifier","title":"StableForestClassifier","text":"note: Note\nJust like normal random forests, this model is not easily explainable. 
If you are interested in an explainable model, use the StableRulesClassifier or StableRulesRegressor.","category":"page"},{"location":"models/StableForestClassifier_SIRUS/#Training-data","page":"StableForestClassifier","title":"Training data","text":"","category":"section"},{"location":"models/StableForestClassifier_SIRUS/","page":"StableForestClassifier","title":"StableForestClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/StableForestClassifier_SIRUS/","page":"StableForestClassifier","title":"StableForestClassifier","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/StableForestClassifier_SIRUS/","page":"StableForestClassifier","title":"StableForestClassifier","text":"where","category":"page"},{"location":"models/StableForestClassifier_SIRUS/","page":"StableForestClassifier","title":"StableForestClassifier","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)\ny: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)","category":"page"},{"location":"models/StableForestClassifier_SIRUS/","page":"StableForestClassifier","title":"StableForestClassifier","text":"Train the machine with fit!(mach, rows=...).","category":"page"},{"location":"models/StableForestClassifier_SIRUS/#Hyperparameters","page":"StableForestClassifier","title":"Hyperparameters","text":"","category":"section"},{"location":"models/StableForestClassifier_SIRUS/","page":"StableForestClassifier","title":"StableForestClassifier","text":"rng::AbstractRNG=default_rng(): Random number generator. Using a StableRNG from StableRNGs.jl is advised.\npartial_sampling::Float64=0.7: Ratio of samples to use in each subset of the data. The default should be fine for most cases.\nn_trees::Int=1000: The number of trees to use. It is advisable to use at least a thousand trees for better rule selection and, in turn, better predictive performance.\nmax_depth::Int=2: The depth of the tree. A lower depth decreases model complexity and can therefore improve accuracy when the sample size is small (reducing overfitting).\nq::Int=10: Number of cutpoints to use per feature. 
The default value should be fine for most situations.\nmin_data_in_leaf::Int=5: Minimum number of data points per leaf.","category":"page"},{"location":"models/StableForestClassifier_SIRUS/#Fitted-parameters","page":"StableForestClassifier","title":"Fitted parameters","text":"","category":"section"},{"location":"models/StableForestClassifier_SIRUS/","page":"StableForestClassifier","title":"StableForestClassifier","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/StableForestClassifier_SIRUS/","page":"StableForestClassifier","title":"StableForestClassifier","text":"fitresult: A StableForest object.","category":"page"},{"location":"models/StableForestClassifier_SIRUS/#Operations","page":"StableForestClassifier","title":"Operations","text":"","category":"section"},{"location":"models/StableForestClassifier_SIRUS/","page":"StableForestClassifier","title":"StableForestClassifier","text":"predict(mach, Xnew): Return a vector of predictions for each row of Xnew.","category":"page"},{"location":"models/RidgeCVRegressor_MLJScikitLearnInterface/#RidgeCVRegressor_MLJScikitLearnInterface","page":"RidgeCVRegressor","title":"RidgeCVRegressor","text":"","category":"section"},{"location":"models/RidgeCVRegressor_MLJScikitLearnInterface/","page":"RidgeCVRegressor","title":"RidgeCVRegressor","text":"RidgeCVRegressor","category":"page"},{"location":"models/RidgeCVRegressor_MLJScikitLearnInterface/","page":"RidgeCVRegressor","title":"RidgeCVRegressor","text":"A model type for constructing a ridge regressor with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/RidgeCVRegressor_MLJScikitLearnInterface/","page":"RidgeCVRegressor","title":"RidgeCVRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/RidgeCVRegressor_MLJScikitLearnInterface/","page":"RidgeCVRegressor","title":"RidgeCVRegressor","text":"RidgeCVRegressor = @load RidgeCVRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/RidgeCVRegressor_MLJScikitLearnInterface/","page":"RidgeCVRegressor","title":"RidgeCVRegressor","text":"Do model = RidgeCVRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in RidgeCVRegressor(alphas=...).","category":"page"},{"location":"models/RidgeCVRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"RidgeCVRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/RidgeCVRegressor_MLJScikitLearnInterface/","page":"RidgeCVRegressor","title":"RidgeCVRegressor","text":"alphas = (0.1, 1.0, 10.0)\nfit_intercept = true\nscoring = nothing\ncv = 5\ngcv_mode = nothing\nstore_cv_values = false","category":"page"},{"location":"models/CountTransformer_MLJText/#CountTransformer_MLJText","page":"CountTransformer","title":"CountTransformer","text":"","category":"section"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"CountTransformer","category":"page"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"A model type for constructing a count transformer, based on MLJText.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"CountTransformer = @load CountTransformer pkg=MLJText","category":"page"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"Do model = CountTransformer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in CountTransformer(max_doc_freq=...).","category":"page"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"The transformer converts a collection of documents, tokenized or pre-parsed as bags of words/ngrams, to a matrix of term counts.","category":"page"},{"location":"models/CountTransformer_MLJText/#Training-data","page":"CountTransformer","title":"Training data","text":"","category":"section"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"mach = machine(model, X)","category":"page"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"Here:","category":"page"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"X is any vector whose elements are either tokenized documents or bags of words/ngrams. 
Specifically, each element is one of the following:\nA vector of abstract strings (tokens), e.g., [\"I\", \"like\", \"Sam\", \".\", \"Sam\", \"is\", \"nice\", \".\"] (scitype AbstractVector{Textual})\nA dictionary of counts, indexed on abstract strings, e.g., Dict(\"I\"=>1, \"Sam\"=>2, \"Sam is\"=>1) (scitype Multiset{Textual})\nA dictionary of counts, indexed on plain ngrams, e.g., Dict((\"I\",)=>1, (\"Sam\",)=>2, (\"I\", \"Sam\")=>1) (scitype Multiset{<:NTuple{N,Textual} where N}); here a plain ngram is a tuple of abstract strings.","category":"page"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/CountTransformer_MLJText/#Hyper-parameters","page":"CountTransformer","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"max_doc_freq=1.0: Restricts the vocabulary that the transformer will consider. Terms that occur in > max_doc_freq documents will not be considered by the transformer. For example, if max_doc_freq is set to 0.9, terms that are in more than 90% of the documents will be removed.\nmin_doc_freq=0.0: Restricts the vocabulary that the transformer will consider. Terms that occur in < min_doc_freq documents will not be considered by the transformer. A value of 0.01 means that only terms appearing in at least 1% of the documents will be included.","category":"page"},{"location":"models/CountTransformer_MLJText/#Operations","page":"CountTransformer","title":"Operations","text":"","category":"section"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"transform(mach, Xnew): Based on the vocabulary learned in training, return the matrix of counts for Xnew, a vector of the same form as X above. The matrix has size (n, p), where n = length(Xnew) and p is the size of the vocabulary. Tokens/ngrams not appearing in the learned vocabulary are scored zero.","category":"page"},{"location":"models/CountTransformer_MLJText/#Fitted-parameters","page":"CountTransformer","title":"Fitted parameters","text":"","category":"section"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"vocab: A vector containing the strings used in the transformer's vocabulary.","category":"page"},{"location":"models/CountTransformer_MLJText/#Examples","page":"CountTransformer","title":"Examples","text":"","category":"section"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"CountTransformer accepts a variety of inputs. 
The example below transforms tokenized documents:","category":"page"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"using MLJ\nimport TextAnalysis\n\nCountTransformer = @load CountTransformer pkg=MLJText\n\ndocs = [\"Hi my name is Sam.\", \"How are you today?\"]\ncount_transformer = CountTransformer()\n\njulia> tokenized_docs = TextAnalysis.tokenize.(docs)\n2-element Vector{Vector{String}}:\n [\"Hi\", \"my\", \"name\", \"is\", \"Sam\", \".\"]\n [\"How\", \"are\", \"you\", \"today\", \"?\"]\n\nmach = machine(count_transformer, tokenized_docs)\nfit!(mach)\n\nfitted_params(mach)\n\ntfidf_mat = transform(mach, tokenized_docs)","category":"page"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"Alternatively, one can provide documents pre-parsed as ngrams counts:","category":"page"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"using MLJ\nimport TextAnalysis\n\ndocs = [\"Hi my name is Sam.\", \"How are you today?\"]\ncorpus = TextAnalysis.Corpus(TextAnalysis.NGramDocument.(docs, 1, 2))\nngram_docs = TextAnalysis.ngrams.(corpus)\n\njulia> ngram_docs[1]\nDict{AbstractString, Int64} with 11 entries:\n \"is\" => 1\n \"my\" => 1\n \"name\" => 1\n \".\" => 1\n \"Hi\" => 1\n \"Sam\" => 1\n \"my name\" => 1\n \"Hi my\" => 1\n \"name is\" => 1\n \"Sam .\" => 1\n \"is Sam\" => 1\n\ncount_transformer = CountTransformer()\nmach = machine(count_transformer, ngram_docs)\nMLJ.fit!(mach)\nfitted_params(mach)\n\ntfidf_mat = transform(mach, ngram_docs)","category":"page"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"See also TfidfTransformer, BM25Transformer","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/#RandomForestRegressor_DecisionTree","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"","category":"section"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"RandomForestRegressor","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"A model type for constructing a CART random forest regressor, based on DecisionTree.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"RandomForestRegressor = @load RandomForestRegressor pkg=DecisionTree","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"Do model = RandomForestRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomForestRegressor(max_depth=...).","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"DecisionTreeRegressor implements the standard Random Forest algorithm, originally published in Breiman, L. (2001): \"Random Forests.\", Machine Learning, vol. 45, pp. 
5–32","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/#Training-data","page":"RandomForestRegressor","title":"Training data","text":"","category":"section"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"where","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)\ny: the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"Train the machine with fit!(mach, rows=...).","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/#Hyperparameters","page":"RandomForestRegressor","title":"Hyperparameters","text":"","category":"section"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"max_depth=-1: max depth of the decision tree (-1=any)\nmin_samples_leaf=1: min number of samples each leaf needs to have\nmin_samples_split=2: min number of samples needed for a split\nmin_purity_increase=0: min purity needed for a split\nn_subfeatures=-1: number of features to select at random (0 for all, -1 for square root of number of features)\nn_trees=10: number of trees to train\nsampling_fraction=0.7 fraction of samples to train each tree on\nfeature_importance: method to use for computing feature importances. 
One of (:impurity, :split)\nrng=Random.GLOBAL_RNG: random number generator or seed","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/#Operations","page":"RandomForestRegressor","title":"Operations","text":"","category":"section"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"predict(mach, Xnew): return predictions of the target given new features Xnew having the same scitype as X above.","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/#Fitted-parameters","page":"RandomForestRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"forest: the Ensemble object returned by the core DecisionTree.jl algorithm","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/#Report","page":"RandomForestRegressor","title":"Report","text":"","category":"section"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"The fields of report(mach) are:","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"features: the names of the features encountered in training","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/#Accessor-functions","page":"RandomForestRegressor","title":"Accessor functions","text":"","category":"section"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"feature_importances(mach) returns a vector of (feature::Symbol => importance) pairs; the type of importance is determined by the hyperparameter feature_importance (see above)","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/#Examples","page":"RandomForestRegressor","title":"Examples","text":"","category":"section"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"using MLJ\nForest = @load RandomForestRegressor pkg=DecisionTree\nforest = Forest(max_depth=4, min_samples_split=3)\n\nX, y = make_regression(100, 2) ## synthetic data\nmach = machine(forest, X, y) |> fit!\n\nXnew, _ = make_regression(3, 2)\nyhat = predict(mach, Xnew) ## new predictions\n\nfitted_params(mach).forest ## raw `Ensemble` object from DecisionTree.jl\nfeature_importances(mach)","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"See also DecisionTree.jl and the unwrapped model type 
MLJDecisionTreeInterface.DecisionTree.RandomForestRegressor.","category":"page"},{"location":"models/MultiTaskElasticNetRegressor_MLJScikitLearnInterface/#MultiTaskElasticNetRegressor_MLJScikitLearnInterface","page":"MultiTaskElasticNetRegressor","title":"MultiTaskElasticNetRegressor","text":"","category":"section"},{"location":"models/MultiTaskElasticNetRegressor_MLJScikitLearnInterface/","page":"MultiTaskElasticNetRegressor","title":"MultiTaskElasticNetRegressor","text":"MultiTaskElasticNetRegressor","category":"page"},{"location":"models/MultiTaskElasticNetRegressor_MLJScikitLearnInterface/","page":"MultiTaskElasticNetRegressor","title":"MultiTaskElasticNetRegressor","text":"A model type for constructing a multi-target elastic net regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/MultiTaskElasticNetRegressor_MLJScikitLearnInterface/","page":"MultiTaskElasticNetRegressor","title":"MultiTaskElasticNetRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/MultiTaskElasticNetRegressor_MLJScikitLearnInterface/","page":"MultiTaskElasticNetRegressor","title":"MultiTaskElasticNetRegressor","text":"MultiTaskElasticNetRegressor = @load MultiTaskElasticNetRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/MultiTaskElasticNetRegressor_MLJScikitLearnInterface/","page":"MultiTaskElasticNetRegressor","title":"MultiTaskElasticNetRegressor","text":"Do model = MultiTaskElasticNetRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultiTaskElasticNetRegressor(alpha=...).","category":"page"},{"location":"models/MultiTaskElasticNetRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"MultiTaskElasticNetRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/MultiTaskElasticNetRegressor_MLJScikitLearnInterface/","page":"MultiTaskElasticNetRegressor","title":"MultiTaskElasticNetRegressor","text":"alpha = 1.0\nl1_ratio = 0.5\nfit_intercept = true\ncopy_X = true\nmax_iter = 1000\ntol = 0.0001\nwarm_start = false\nrandom_state = nothing\nselection = cyclic","category":"page"},{"location":"models/XGBoostCount_XGBoost/#XGBoostCount_XGBoost","page":"XGBoostCount","title":"XGBoostCount","text":"","category":"section"},{"location":"models/XGBoostCount_XGBoost/","page":"XGBoostCount","title":"XGBoostCount","text":"XGBoostCount","category":"page"},{"location":"models/XGBoostCount_XGBoost/","page":"XGBoostCount","title":"XGBoostCount","text":"A model type for constructing an eXtreme Gradient Boosting Count Regressor, based on XGBoost.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/XGBoostCount_XGBoost/","page":"XGBoostCount","title":"XGBoostCount","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/XGBoostCount_XGBoost/","page":"XGBoostCount","title":"XGBoostCount","text":"XGBoostCount = @load XGBoostCount pkg=XGBoost","category":"page"},{"location":"models/XGBoostCount_XGBoost/","page":"XGBoostCount","title":"XGBoostCount","text":"Do model = XGBoostCount() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in XGBoostCount(test=...).","category":"page"},{"location":"models/XGBoostCount_XGBoost/","page":"XGBoostCount","title":"XGBoostCount","text":"Univariate discrete regression using xgboost.","category":"page"},{"location":"models/XGBoostCount_XGBoost/#Training-data","page":"XGBoostCount","title":"Training data","text":"","category":"section"},{"location":"models/XGBoostCount_XGBoost/","page":"XGBoostCount","title":"XGBoostCount","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/XGBoostCount_XGBoost/","page":"XGBoostCount","title":"XGBoostCount","text":"m = machine(model, X, y)","category":"page"},{"location":"models/XGBoostCount_XGBoost/","page":"XGBoostCount","title":"XGBoostCount","text":"where","category":"page"},{"location":"models/XGBoostCount_XGBoost/","page":"XGBoostCount","title":"XGBoostCount","text":"X: the input features, given as an AbstractMatrix or any Tables.jl-compatible table.\ny: the target, given as an AbstractVector of continuous values.","category":"page"},{"location":"models/XGBoostCount_XGBoost/","page":"XGBoostCount","title":"XGBoostCount","text":"Train using fit!(m, rows=...).","category":"page"},{"location":"models/XGBoostCount_XGBoost/#Hyper-parameters","page":"XGBoostCount","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/XGBoostCount_XGBoost/","page":"XGBoostCount","title":"XGBoostCount","text":"See https://xgboost.readthedocs.io/en/stable/parameter.html.","category":"page"},{"location":"models/HistGradientBoostingRegressor_MLJScikitLearnInterface/#HistGradientBoostingRegressor_MLJScikitLearnInterface","page":"HistGradientBoostingRegressor","title":"HistGradientBoostingRegressor","text":"","category":"section"},{"location":"models/HistGradientBoostingRegressor_MLJScikitLearnInterface/","page":"HistGradientBoostingRegressor","title":"HistGradientBoostingRegressor","text":"HistGradientBoostingRegressor","category":"page"},{"location":"models/HistGradientBoostingRegressor_MLJScikitLearnInterface/","page":"HistGradientBoostingRegressor","title":"HistGradientBoostingRegressor","text":"A model type for constructing a gradient boosting ensemble regression, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/HistGradientBoostingRegressor_MLJScikitLearnInterface/","page":"HistGradientBoostingRegressor","title":"HistGradientBoostingRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/HistGradientBoostingRegressor_MLJScikitLearnInterface/","page":"HistGradientBoostingRegressor","title":"HistGradientBoostingRegressor","text":"HistGradientBoostingRegressor = @load HistGradientBoostingRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/HistGradientBoostingRegressor_MLJScikitLearnInterface/","page":"HistGradientBoostingRegressor","title":"HistGradientBoostingRegressor","text":"Do model = HistGradientBoostingRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in HistGradientBoostingRegressor(loss=...).","category":"page"},{"location":"models/HistGradientBoostingRegressor_MLJScikitLearnInterface/","page":"HistGradientBoostingRegressor","title":"HistGradientBoostingRegressor","text":"This estimator builds an additive model in a forward stage-wise fashion; it allows for the optimization of arbitrary differentiable loss functions. 
In each stage a regression tree is fit on the negative gradient of the given loss function.","category":"page"},{"location":"models/HistGradientBoostingRegressor_MLJScikitLearnInterface/","page":"HistGradientBoostingRegressor","title":"HistGradientBoostingRegressor","text":"HistGradientBoostingRegressor is a much faster variant of this algorithm for intermediate datasets (n_samples >= 10_000).","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/#EvoTreeMLE_EvoTrees","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"","category":"section"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"EvoTreeMLE(;kwargs...)","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"A model type for constructing an EvoTreeMLE, based on EvoTrees.jl, and implementing both an internal API and the MLJ model interface. EvoTreeMLE performs maximum likelihood estimation. The assumed distribution is specified through the loss kwarg. Both Gaussian and Logistic distributions are supported.","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/#Hyper-parameters","page":"EvoTreeMLE","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"loss=:gaussian: Loss to be minimized during training. One of:","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":":gaussian / :gaussian_mle\n:logistic / :logistic_mle\nnrounds=100: Number of rounds. It corresponds to the number of trees that will be sequentially stacked. Must be >= 1.\neta=0.1: Learning rate. Each tree's raw predictions are scaled by eta prior to being added to the stack of predictions. Must be > 0.","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"A lower eta results in slower learning, requiring a higher nrounds but typically improves model performance. ","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"L2::T=0.0: L2 regularization factor on aggregate gain. Must be >= 0. Higher L2 can result in a more robust model.\nlambda::T=0.0: L2 regularization factor on individual gain. Must be >= 0. Higher lambda can result in a more robust model.\ngamma::T=0.0: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model. Must be >= 0.\nmax_depth=6: Maximum depth of a tree. Must be >= 1. A tree of depth 1 is made of a single prediction leaf. A complete tree of depth N contains 2^(N - 1) terminal leaves and 2^(N - 1) - 1 split nodes. Compute cost is proportional to 2^max_depth. Typical optimal values are in the 3 to 9 range.\nmin_weight=8.0: Minimum weight needed in a node to perform a split. Matches the number of observations by default or the sum of weights as provided by the weights vector. Must be > 0.\nrowsample=1.0: Proportion of rows that are sampled at each iteration to build the tree. Should be in ]0, 1].\ncolsample=1.0: Proportion of columns / features that are sampled at each iteration to build the tree. Should be in ]0, 1].\nnbins=64: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins. 
Should be between 2 and 255.\nmonotone_constraints=Dict{Int, Int}(): Specify monotonic constraints using a dict where the key is the feature index and the value the applicable constraint (-1=decreasing, 0=none, 1=increasing). !Experimental feature: note that for MLE regression, constraints may not be enforced systematically.\ntree_type=\"binary\": Tree structure to be used. One of:\nbinary: Each node of a tree is grown independently. Trees are built depthwise until the max depth is reached or until min weight or gain (see gamma) stops further node splits.\noblivious: A common splitting condition is imposed on all nodes of a given depth.\nrng=123: Either an integer used as a seed to the random number generator or an actual random number generator (::Random.AbstractRNG).","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/#Internal-API","page":"EvoTreeMLE","title":"Internal API","text":"","category":"section"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"Do config = EvoTreeMLE() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeMLE(max_depth=...).","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/#Training-model","page":"EvoTreeMLE","title":"Training model","text":"","category":"section"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"A model is built using fit_evotree:","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"model = fit_evotree(config; x_train, y_train, kwargs...)","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/#Inference","page":"EvoTreeMLE","title":"Inference","text":"","category":"section"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"Predictions are obtained using predict which returns a Matrix of size [nobs, nparams] where the second dimension refers to μ & σ for Normal/Gaussian and μ & s for Logistic.","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"EvoTrees.predict(model, X)","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"Alternatively, models act as a functor, returning predictions when called as a function with features as argument:","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"model(X)","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/#MLJ","page":"EvoTreeMLE","title":"MLJ","text":"","category":"section"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"From MLJ, the type can be imported using:","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"EvoTreeMLE = @load EvoTreeMLE pkg=EvoTrees","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"Do model = EvoTreeMLE() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeMLE(loss=...).","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/#Training-data","page":"EvoTreeMLE","title":"Training data","text":"","category":"section"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"where","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)\ny: is the target, which can be any AbstractVector whose element scitype is <:Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/#Operations","page":"EvoTreeMLE","title":"Operations","text":"","category":"section"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"predict(mach, Xnew): returns a vector of Gaussian or Logistic distributions (according to provided loss) given features Xnew having the same scitype as X above.","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"Predictions are probabilistic.","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"Specific metrics can also be predicted using:","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"predict_mean(mach, Xnew)\npredict_mode(mach, Xnew)\npredict_median(mach, Xnew)","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/#Fitted-parameters","page":"EvoTreeMLE","title":"Fitted parameters","text":"","category":"section"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":":fitresult: The GBTree object returned by EvoTrees.jl fitting algorithm.","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/#Report","page":"EvoTreeMLE","title":"Report","text":"","category":"section"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"The fields of report(mach) are:","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":":features: The names of the features encountered in training.","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/#Examples","page":"EvoTreeMLE","title":"Examples","text":"","category":"section"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"## Internal API\nusing EvoTrees\nconfig = EvoTreeMLE(max_depth=5, nbins=32, nrounds=100)\nnobs, nfeats = 1_000, 5\nx_train, y_train = randn(nobs, nfeats), rand(nobs)\nmodel = fit_evotree(config; x_train, y_train)\npreds = EvoTrees.predict(model, 
x_train)","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"## MLJ Interface\nusing MLJ\nEvoTreeMLE = @load EvoTreeMLE pkg=EvoTrees\nmodel = EvoTreeMLE(max_depth=5, nbins=32, nrounds=100)\nX, y = @load_boston\nmach = machine(model, X, y) |> fit!\npreds = predict(mach, X)\npreds = predict_mean(mach, X)\npreds = predict_mode(mach, X)\npreds = predict_median(mach, X)","category":"page"},{"location":"homogeneous_ensembles/#Homogeneous-Ensembles","page":"Homogeneous Ensembles","title":"Homogeneous Ensembles","text":"","category":"section"},{"location":"homogeneous_ensembles/","page":"Homogeneous Ensembles","title":"Homogeneous Ensembles","text":"Although an ensemble of models sharing a common set of hyperparameters can be defined using the learning network API, MLJ's EnsembleModel model wrapper is preferred, for convenience and best performance. Examples of using EnsembleModel are given in this Data Science Tutorial.","category":"page"},{"location":"homogeneous_ensembles/","page":"Homogeneous Ensembles","title":"Homogeneous Ensembles","text":"When bagging decision trees, further randomness is normally introduced by subsampling features, when training each node of each tree (Ho (1995), Brieman and Cutler (2001)). A bagged ensemble of such trees is known as a Random Forest. You can see an example of using EnsembleModel to build a random forest in this Data Science Tutorial. However, you may also want to use a canned random forest model. Run models(\"RandomForest\") to list such models.","category":"page"},{"location":"homogeneous_ensembles/","page":"Homogeneous Ensembles","title":"Homogeneous Ensembles","text":"MLJEnsembles.EnsembleModel","category":"page"},{"location":"homogeneous_ensembles/#MLJEnsembles.EnsembleModel","page":"Homogeneous Ensembles","title":"MLJEnsembles.EnsembleModel","text":"EnsembleModel(model,\n atomic_weights=Float64[],\n bagging_fraction=0.8,\n n=100,\n rng=GLOBAL_RNG,\n acceleration=CPU1(),\n out_of_bag_measure=[])\n\nCreate a model for training an ensemble of n clones of model, with optional bagging. Ensembling is useful if fit!(machine(atom, data...)) does not create identical models on repeated calls (ie, is a stochastic model, such as a decision tree with randomized node selection criteria), or if bagging_fraction is set to a value less than 1.0, or both.\n\nHere the atomic model must support targets with scitype AbstractVector{<:Finite} (single-target classifiers) or AbstractVector{<:Continuous} (single-target regressors).\n\nIf rng is an integer, then MersenneTwister(rng) is the random number generator used for bagging. Otherwise some AbstractRNG object is expected.\n\nThe atomic predictions are optionally weighted according to the vector atomic_weights (to allow for external optimization) except in the case that model is a Deterministic classifier, in which case atomic_weights are ignored.\n\nThe ensemble model is Deterministic or Probabilistic, according to the corresponding supertype of atom. In the case of deterministic classifiers (target_scitype(atom) <: Abstract{<:Finite}), the predictions are majority votes, and for regressors (target_scitype(atom)<: AbstractVector{<:Continuous}) they are ordinary averages. 
Probabilistic predictions are obtained by averaging the atomic probability distribution/mass functions; in particular, for regressors, the ensemble prediction on each input pattern has the type MixtureModel{VF,VS,D} from the Distributions.jl package, where D is the type of predicted distribution for atom.\n\nSpecify acceleration=CPUProcesses() for distributed computing, or CPUThreads() for multithreading.\n\nIf a single measure or non-empty vector of measures is specified by out_of_bag_measure, then out-of-bag estimates of performance are written to the training report (call report on the trained machine wrapping the ensemble model).\n\nImportant: If per-observation or class weights w (not to be confused with atomic weights) are specified when constructing a machine for the ensemble model, as in mach = machine(ensemble_model, X, y, w), then w is used by any measures specified in out_of_bag_measure that support them.\n\n\n\n\n\n","category":"function"},{"location":"models/PLSRegressor_PartialLeastSquaresRegressor/#PLSRegressor_PartialLeastSquaresRegressor","page":"PLSRegressor","title":"PLSRegressor","text":"","category":"section"},{"location":"models/PLSRegressor_PartialLeastSquaresRegressor/","page":"PLSRegressor","title":"PLSRegressor","text":"A Partial Least Squares Regressor. Contains PLS1, PLS2 (multi target) algorithms. Can be used mainly for regression.","category":"page"},{"location":"openml_integration/#OpenML-Integration","page":"OpenML Integration","title":"OpenML Integration","text":"","category":"section"},{"location":"openml_integration/","page":"OpenML Integration","title":"OpenML Integration","text":"The OpenML platform provides an integration platform for carrying out and comparing machine learning solutions across a broad collection of public datasets and software platforms.","category":"page"},{"location":"openml_integration/","page":"OpenML Integration","title":"OpenML Integration","text":"Integration with OpenML API is presently limited to querying and downloading datasets.","category":"page"},{"location":"openml_integration/","page":"OpenML Integration","title":"OpenML Integration","text":"Documentation is here.","category":"page"},{"location":"models/ECODDetector_OutlierDetectionPython/#ECODDetector_OutlierDetectionPython","page":"ECODDetector","title":"ECODDetector","text":"","category":"section"},{"location":"models/ECODDetector_OutlierDetectionPython/","page":"ECODDetector","title":"ECODDetector","text":"ECODDetector(n_jobs = 1)","category":"page"},{"location":"models/ECODDetector_OutlierDetectionPython/","page":"ECODDetector","title":"ECODDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.ecod","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/#UnivariateBoxCoxTransformer_MLJModels","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"","category":"section"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"UnivariateBoxCoxTransformer","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"A model type for constructing a single variable Box-Cox transformer, based on MLJModels.jl, and implementing the MLJ model 
interface.","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"UnivariateBoxCoxTransformer = @load UnivariateBoxCoxTransformer pkg=MLJModels","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"Do model = UnivariateBoxCoxTransformer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateBoxCoxTransformer(n=...).","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"Box-Cox transformations attempt to make data look more normally distributed. This can improve performance and assist in the interpretation of models which suppose that data is generated by a normal distribution.","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"A Box-Cox transformation (with shift) is of the form","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"x -> ((x + c)^λ - 1)/λ","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"for some constant c and real λ, unless λ = 0, in which case the above is replaced with","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"x -> log(x + c)","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"Given user-specified hyper-parameters n::Integer and shift::Bool, the present implementation learns the parameters c and λ from the training data as follows: If shift=true and zeros are encountered in the data, then c is set to 0.2 times the data mean. If there are no zeros, then no shift is applied. 
Finally, n different values of λ between -0.4 and 3 are considered, with λ fixed to the value maximizing normality of the transformed data.","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"Reference: Wikipedia entry for power transform.","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/#Training-data","page":"UnivariateBoxCoxTransformer","title":"Training data","text":"","category":"section"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"mach = machine(model, x)","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"where","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"x: any abstract vector with element scitype Continuous; check the scitype with scitype(x)","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/#Hyper-parameters","page":"UnivariateBoxCoxTransformer","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"n=171: number of values of the exponent λ to try\nshift=false: whether to include a preliminary constant translation in transformations, in the presence of zeros","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/#Operations","page":"UnivariateBoxCoxTransformer","title":"Operations","text":"","category":"section"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"transform(mach, xnew): apply the Box-Cox transformation learned when fitting mach\ninverse_transform(mach, z): reconstruct the vector x whose transformation learned by mach is z","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/#Fitted-parameters","page":"UnivariateBoxCoxTransformer","title":"Fitted parameters","text":"","category":"section"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"λ: the learned Box-Cox exponent\nc: the learned shift","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/#Examples","page":"UnivariateBoxCoxTransformer","title":"Examples","text":"","category":"section"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"using MLJ\nusing UnicodePlots\nusing Random\nRandom.seed!(123)\n\ntransf = 
UnivariateBoxCoxTransformer()\n\nx = randn(1000).^2\n\nmach = machine(transf, x)\nfit!(mach)\n\nz = transform(mach, x)\n\njulia> histogram(x)\n ┌ ┐\n [ 0.0, 2.0) ┤███████████████████████████████████ 848\n [ 2.0, 4.0) ┤████▌ 109\n [ 4.0, 6.0) ┤█▍ 33\n [ 6.0, 8.0) ┤▍ 7\n [ 8.0, 10.0) ┤▏ 2\n [10.0, 12.0) ┤ 0\n [12.0, 14.0) ┤▏ 1\n └ ┘\n Frequency\n\njulia> histogram(z)\n ┌ ┐\n [-5.0, -4.0) ┤█▎ 8\n [-4.0, -3.0) ┤████████▊ 64\n [-3.0, -2.0) ┤█████████████████████▊ 159\n [-2.0, -1.0) ┤█████████████████████████████▊ 216\n [-1.0, 0.0) ┤███████████████████████████████████ 254\n [ 0.0, 1.0) ┤█████████████████████████▊ 188\n [ 1.0, 2.0) ┤████████████▍ 90\n [ 2.0, 3.0) ┤██▊ 20\n [ 3.0, 4.0) ┤▎ 1\n └ ┘\n Frequency\n","category":"page"},{"location":"performance_measures/#Performance-Measures","page":"Performance Measures","title":"Performance Measures","text":"","category":"section"},{"location":"performance_measures/#Quick-links","page":"Performance Measures","title":"Quick links","text":"","category":"section"},{"location":"performance_measures/","page":"Performance Measures","title":"Performance Measures","text":"List of aliases of all measures\nMigration guide for changes to measures in MLJBase 1.0","category":"page"},{"location":"performance_measures/#Introduction","page":"Performance Measures","title":"Introduction","text":"","category":"section"},{"location":"performance_measures/","page":"Performance Measures","title":"Performance Measures","text":"In MLJ loss functions, scoring rules, confusion matrices, sensitivities, etc, are collectively referred to as measures. These measures are provided by the package StatisticalMeasures.jl but are immediately available to the MLJ user. Here's a simple example of direct application of the log_loss measures to compute a training loss:","category":"page"},{"location":"performance_measures/","page":"Performance Measures","title":"Performance Measures","text":"using MLJ\nX, y = @load_iris\nDecisionTreeClassifier = @load DecisionTreeClassifier pkg=DecisionTree\ntree = DecisionTreeClassifier(max_depth=2)\nmach = machine(tree, X, y) |> fit!\nyhat = predict(mach, X)\nlog_loss(yhat, y)","category":"page"},{"location":"performance_measures/","page":"Performance Measures","title":"Performance Measures","text":"For more examples of direct measure usage, see the StatisticalMeasures.jl tutorial.","category":"page"},{"location":"performance_measures/","page":"Performance Measures","title":"Performance Measures","text":"A list of all measures, ready to use after running using MLJ or using StatisticalMeasures, is here. Alternatively, call measures() (experimental) to generate a dictionary keyed on available measure constructors, with measure metadata as values.","category":"page"},{"location":"performance_measures/#Custom-measures","page":"Performance Measures","title":"Custom measures","text":"","category":"section"},{"location":"performance_measures/","page":"Performance Measures","title":"Performance Measures","text":"Any measure-like object with appropriate calling behavior can be used with MLJ. To quickly build custom measures, we recommend using the package StatisticalMeasuresBase.jl, which provides this tutorial. 
Note, in particular, that an \"atomic\" measure can be transformed into a multi-target measure using this package.","category":"page"},{"location":"performance_measures/#Uses-of-measures","page":"Performance Measures","title":"Uses of measures","text":"","category":"section"},{"location":"performance_measures/","page":"Performance Measures","title":"Performance Measures","text":"In MLJ, measures are specified:","category":"page"},{"location":"performance_measures/","page":"Performance Measures","title":"Performance Measures","text":"when evaluating model performance using evaluate!/evaluate; see Evaluating Model Performance\nwhen wrapping models using TunedModel - see Tuning Models\nwhen wrapping iterative models using IteratedModel - see Controlling Iterative Models\nwhen generating learning curves using learning_curve - see Learning Curves","category":"page"},{"location":"performance_measures/","page":"Performance Measures","title":"Performance Measures","text":"and elsewhere.","category":"page"},{"location":"performance_measures/#Using-LossFunctions.jl","page":"Performance Measures","title":"Using LossFunctions.jl","text":"","category":"section"},{"location":"performance_measures/","page":"Performance Measures","title":"Performance Measures","text":"In previous versions of MLJ, measures from LossFunctions.jl were also available. Now measures from that package must be explicitly imported and wrapped, as described here.","category":"page"},{"location":"performance_measures/#Receiver-operator-characteristics","page":"Performance Measures","title":"Receiver operator characteristics","text":"","category":"section"},{"location":"performance_measures/","page":"Performance Measures","title":"Performance Measures","text":"A related performance evaluation tool provided by StatisticalMeasures.jl, and hence by MLJ, is the roc_curve method:","category":"page"},{"location":"performance_measures/","page":"Performance Measures","title":"Performance Measures","text":"StatisticalMeasures.roc_curve","category":"page"},{"location":"performance_measures/#StatisticalMeasures.roc_curve","page":"Performance Measures","title":"StatisticalMeasures.roc_curve","text":"roc_curve(ŷ, y) -> false_positive_rates, true_positive_rates, thresholds\n\nReturn data for plotting the receiver operator characteristic (ROC curve) for a binary classification problem.\n\nHere ŷ is a vector of UnivariateFinite distributions (from CategoricalDistributions.jl) over the two values taken by the ground truth observations y, a CategoricalVector. \n\nIf there are k unique probabilities, then there are correspondingly k thresholds and k+1 \"bins\" over which the false positive and true positive rates are constant.:\n\n[0.0 - thresholds[1]]\n[thresholds[1] - thresholds[2]]\n...\n[thresholds[k] - 1]\n\nConsequently, true_positive_rates and false_positive_rates have length k+1 if thresholds has length k.\n\nTo plot the curve using your favorite plotting backend, do something like plot(false_positive_rates, true_positive_rates).\n\nCore algorithm: Functions.roc_curve\n\nSee also AreaUnderCurve. 
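For illustration, here is a minimal sketch of computing a ROC curve with MLJ (an assumption, not part of the docstring above: it uses the synthetic make_moons data generator and a DecisionTreeClassifier, so MLJDecisionTreeInterface.jl must be installed; plotting is only indicated in a comment):

using MLJ
X, y = make_moons(200)                        ## synthetic binary classification data
Tree = @load DecisionTreeClassifier pkg=DecisionTree
mach = machine(Tree(max_depth=3), X, y) |> fit!
ŷ = predict(mach, X)                          ## probabilistic (UnivariateFinite) predictions
fprs, tprs, thresholds = roc_curve(ŷ, y)      ## data for the ROC curve
## plot(fprs, tprs) with your preferred plotting backend to draw the curve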
\n\n\n\n\n\n","category":"function"},{"location":"performance_measures/#Migration-guide-for-changes-to-measures-in-MLJBase-1.0","page":"Performance Measures","title":"Migration guide for changes to measures in MLJBase 1.0","text":"","category":"section"},{"location":"performance_measures/","page":"Performance Measures","title":"Performance Measures","text":"Prior to MLJBase.jl 1.0 (respectivey, MLJ.jl version 0.19.6) measures were defined in MLJBase.jl (a dependency of MLJ.jl) but now they are provided by MLJ.jl dependency StatisticalMeasures. Effects on users are detailed below:","category":"page"},{"location":"performance_measures/#Breaking-behavior-likely-relevant-to-many-users","page":"Performance Measures","title":"Breaking behavior likely relevant to many users","text":"","category":"section"},{"location":"performance_measures/","page":"Performance Measures","title":"Performance Measures","text":"If using MLJBase without MLJ, then, in Julia 1.9 or higher, StatisticalMeasures must be explicitly imported to use measures that were previously part of MLJBase. If using MLJ, then all previous measures are still available, with the exception of those corresponding to LossFunctions.jl (see below).\nAll measures return a single aggregated measurement. In other words, measures previously reporting a measurement per-observation (previously subtyping Unaggregated) no longer do so. To get per-observation measurements, use the new method StatisticalMeasures.measurements(measure, ŷ, y[, weights, class_weights]).\nThe default measure for regression models (used in evaluate/evaluate! when measures is unspecified) is changed from rms to l2=LPLoss(2) (mean sum of squares).\nMeanAbsoluteError has been removed and instead mae is an alias for LPLoss(p=1).\nMeasures that previously skipped NaN values will now (at least by default) propagate those values. Missing value behavior is unchanged, except some measures that previously did not support missing now do.\nAliases for measure types have been removed. For example RMSE (alias for RootMeanSquaredError) is gone. Aliases for instances, such as rms and cross_entropy persist. The exception is precision, for which ppv can be used in its place. (This is to avoid conflict with Base.precision, which was previously pirated.)\ninfo(measure) has been decommissioned; query docstrings or access the new measure traits individually instead. These traits are now provided by StatisticalMeasures.jl and not are not exported. For example, to access the orientation of the measure rms, do import StatisticalMeasures as SM; SM.orientation(rms).\nBehavior of the measures() method, to list all measures and associated traits, has changed. It now returns a dictionary instead of a vector of named tuples; measures(predicate) is decommissioned, but measures(needle) is preserved. (This method, owned by StatisticalMeasures.jl, has some other search options, but is experimental.)\nMeasures that were wraps of losses from LossFunctions.jl are no longer exposed by MLJBase or MLJ. To use such a loss, you must explicitly import LossFunctions and wrap the loss appropriately. See Using losses from LossFunctions.jl for examples.\nSome user-defined measures working in previous versions of MLJBase.jl may not work without modification, as they must conform to the new StatisticalMeasuresBase.jl API. See this tutorial on how define new measures.\nMeasures with a \"feature argument\" X, as in some_measure(ŷ, y, X), are no longer supported. See What is a measure? 
for allowed signatures in measures.","category":"page"},{"location":"performance_measures/#Packages-implementing-the-MLJ-model-interface","page":"Performance Measures","title":"Packages implementing the MLJ model interface","text":"","category":"section"},{"location":"performance_measures/","page":"Performance Measures","title":"Performance Measures","text":"The migration of measures is not expected to require any changes to the source code in packages providing implementations of the MLJ model interface (MLJModelInterface.jl) such as MLJDecisionTreeInterface.jl and MLJFlux.jl, and this is confirmed by extensive integration tests. However, some current tests will fail if they use MLJBase measures. The following should generally suffice to adapt such tests:","category":"page"},{"location":"performance_measures/","page":"Performance Measures","title":"Performance Measures","text":"Add StatisticalMeasures as a test dependency, and add using StatisticalMeasures to your runtests.jl (and/or included submodules).\nIf measures are qualified, as in MLJBase.rms, then the qualification must be removed or changed to StatisticalMeasures.rms, etc.\nBe aware that the default measure used in methods such as evaluate!, when measure is not specified, is changed from rms to l2 for regression models.\nBe aware that all measures now return a single aggregated measurement, and never a measurement for every observation. See the second point above.","category":"page"},{"location":"performance_measures/#Breaking-behavior-possibly-relevant-to-some-developers","page":"Performance Measures","title":"Breaking behavior possibly relevant to some developers","text":"","category":"section"},{"location":"performance_measures/","page":"Performance Measures","title":"Performance Measures","text":"The abstract measure types Aggregated, Unaggregated, Measure have been decommissioned. (A measure is now defined purely by its calling behavior.)\nWhat were previously exported as measure types are now only constructors.\ntarget_scitype(measure) is decommissioned. Related is StatisticalMeasures.observation_scitype(measure) which declares an upper bound on the allowed scitype of a single observation.\nprediction_type(measure) is decommissioned. Instead use StatisticalMeasures.kind_of_proxy(measure).\nThe trait reports_each_observation is decommissioned. Related is StatisticalMeasures.can_report_unaggregated; if false the new measurements method simply returns n copies of the aggregated measurement, where n is the number of observations provided, instead of individual observation-dependent measurements.\naggregation(measure) has been decommissioned. Instead use StatisticalMeasures.external_mode_of_aggregation(measure).\ninstances(measure) has been decommissioned; query docstrings for measure aliases, or follow this example: aliases = measures()[RootMeanSquaredError].aliases.\nis_feature_dependent(measure) has been decommissioned. Measures consuming feature data are no longer supported; see above.\ndistribution_type(measure) has been decommissioned.\ndocstring(measure) has been decommissioned.\nBehavior of aggregate has changed.\nThe following traits, previously exported by MLJBase and MLJ, cannot be applied to measures: supports_weights, supports_class_weights, orientation, human_name. 
Instead use the traits with these names provided by StatisticalMeasures.jl (they will need to be qualified, as in import StatisticalMeasures; StatisticalMeasures.orientation(measure)).","category":"page"},{"location":"models/GMMDetector_OutlierDetectionPython/#GMMDetector_OutlierDetectionPython","page":"GMMDetector","title":"GMMDetector","text":"","category":"section"},{"location":"models/GMMDetector_OutlierDetectionPython/","page":"GMMDetector","title":"GMMDetector","text":"GMMDetector(n_components=1,\n covariance_type=\"full\",\n tol=0.001,\n reg_covar=1e-06,\n max_iter=100,\n n_init=1,\n init_params=\"kmeans\",\n weights_init=None,\n means_init=None,\n precisions_init=None,\n random_state=None,\n warm_start=False)","category":"page"},{"location":"models/GMMDetector_OutlierDetectionPython/","page":"GMMDetector","title":"GMMDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.gmm","category":"page"},{"location":"models/LGBMRegressor_LightGBM/#LGBMRegressor_LightGBM","page":"LGBMRegressor","title":"LGBMRegressor","text":"","category":"section"},{"location":"models/LGBMRegressor_LightGBM/","page":"LGBMRegressor","title":"LGBMRegressor","text":"Microsoft LightGBM FFI wrapper: Regressor","category":"page"},{"location":"models/LMDDDetector_OutlierDetectionPython/#LMDDDetector_OutlierDetectionPython","page":"LMDDDetector","title":"LMDDDetector","text":"","category":"section"},{"location":"models/LMDDDetector_OutlierDetectionPython/","page":"LMDDDetector","title":"LMDDDetector","text":"LMDDDetector(n_iter = 50,\n dis_measure = \"aad\",\n random_state = nothing)","category":"page"},{"location":"models/LMDDDetector_OutlierDetectionPython/","page":"LMDDDetector","title":"LMDDDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.lmdd","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/#EvoTreeClassifier_EvoTrees","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"","category":"section"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"EvoTreeClassifier(;kwargs...)","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"A model type for constructing an EvoTreeClassifier, based on EvoTrees.jl, and implementing both an internal API and the MLJ model interface. EvoTreeClassifier is used to perform multi-class classification, using cross-entropy loss.","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/#Hyper-parameters","page":"EvoTreeClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"nrounds=100: Number of rounds. It corresponds to the number of trees that will be sequentially stacked. Must be >= 1.\neta=0.1: Learning rate. Each tree's raw predictions are scaled by eta prior to being added to the stack of predictions. Must be > 0. A lower eta results in slower learning, requiring a higher nrounds but typically improves model performance.\nL2::T=0.0: L2 regularization factor on aggregate gain. Must be >= 0. Higher L2 can result in a more robust model.\nlambda::T=0.0: L2 regularization factor on individual gain. Must be >= 0. Higher lambda can result in a more robust model.\ngamma::T=0.0: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model. 
Must be >= 0.\nmax_depth=6: Maximum depth of a tree. Must be >= 1. A tree of depth 1 is made of a single prediction leaf. A complete tree of depth N contains 2^(N - 1) terminal leaves and 2^(N - 1) - 1 split nodes. Compute cost is proportional to 2^max_depth. Typical optimal values are in the 3 to 9 range.\nmin_weight=1.0: Minimum weight needed in a node to perform a split. Matches the number of observations by default or the sum of weights as provided by the weights vector. Must be > 0.\nrowsample=1.0: Proportion of rows that are sampled at each iteration to build the tree. Should be in ]0, 1].\ncolsample=1.0: Proportion of columns / features that are sampled at each iteration to build the tree. Should be in ]0, 1].\nnbins=64: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins. Should be between 2 and 255.\ntree_type=\"binary\": Tree structure to be used. One of:\nbinary: Each node of a tree is grown independently. Trees are built depthwise until the max depth is reached or until min weight or gain (see gamma) stops further node splits.\noblivious: A common splitting condition is imposed on all nodes of a given depth.\nrng=123: Either an integer used as a seed to the random number generator or an actual random number generator (::Random.AbstractRNG).","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/#Internal-API","page":"EvoTreeClassifier","title":"Internal API","text":"","category":"section"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"Do config = EvoTreeClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeClassifier(max_depth=...).","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/#Training-model","page":"EvoTreeClassifier","title":"Training model","text":"","category":"section"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"A model is built using fit_evotree:","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"model = fit_evotree(config; x_train, y_train, kwargs...)","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/#Inference","page":"EvoTreeClassifier","title":"Inference","text":"","category":"section"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"Predictions are obtained using predict which returns a Matrix of size [nobs, K] where K is the number of classes:","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"EvoTrees.predict(model, X)","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"Alternatively, models act as a functor, returning predictions when called as a function with features as argument:","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"model(X)","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/#MLJ","page":"EvoTreeClassifier","title":"MLJ","text":"","category":"section"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"From MLJ, the type can be imported 
using:","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"EvoTreeClassifier = @load EvoTreeClassifier pkg=EvoTrees","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"Do model = EvoTreeClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeClassifier(loss=...).","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/#Training-data","page":"EvoTreeClassifier","title":"Training data","text":"","category":"section"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"where","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)\ny: is the target, which can be any AbstractVector whose element scitype is <:Multiclas or <:OrderedFactor; check the scitype with scitype(y)","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/#Operations","page":"EvoTreeClassifier","title":"Operations","text":"","category":"section"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above. 
Predictions are probabilistic.\npredict_mode(mach, Xnew): returns the mode of each of the predictions above.","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/#Fitted-parameters","page":"EvoTreeClassifier","title":"Fitted parameters","text":"","category":"section"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":":fitresult: The GBTree object returned by EvoTrees.jl fitting algorithm.","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/#Report","page":"EvoTreeClassifier","title":"Report","text":"","category":"section"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"The fields of report(mach) are:","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":":features: The names of the features encountered in training.","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/#Examples","page":"EvoTreeClassifier","title":"Examples","text":"","category":"section"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"## Internal API\nusing EvoTrees\nconfig = EvoTreeClassifier(max_depth=5, nbins=32, nrounds=100)\nnobs, nfeats = 1_000, 5\nx_train, y_train = randn(nobs, nfeats), rand(1:3, nobs)\nmodel = fit_evotree(config; x_train, y_train)\npreds = EvoTrees.predict(model, x_train)","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"## MLJ Interface\nusing MLJ\nEvoTreeClassifier = @load EvoTreeClassifier pkg=EvoTrees\nmodel = EvoTreeClassifier(max_depth=5, nbins=32, nrounds=100)\nX, y = @load_iris\nmach = machine(model, X, y) |> fit!\npreds = predict(mach, X)\npreds = predict_mode(mach, X)","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"See also EvoTrees.jl.","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/#FactorAnalysis_MultivariateStats","page":"FactorAnalysis","title":"FactorAnalysis","text":"","category":"section"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"FactorAnalysis","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"A model type for constructing a factor analysis model, based on MultivariateStats.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"FactorAnalysis = @load FactorAnalysis pkg=MultivariateStats","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"Do model = FactorAnalysis() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in FactorAnalysis(method=...).","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"Factor analysis is a linear-Gaussian latent variable model that is closely related to probabilistic PCA. In contrast to the probabilistic PCA model, the covariance of the conditional distribution of the observed variable given the latent variable is diagonal rather than isotropic.","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/#Training-data","page":"FactorAnalysis","title":"Training data","text":"","category":"section"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"mach = machine(model, X)","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"Here:","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/#Hyper-parameters","page":"FactorAnalysis","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"method::Symbol=:cm: Method to use to solve the problem, one of :ml, :em, :bayes.\nmaxoutdim=0: Controls the dimension (number of columns) of the output, outdim. Specifically, outdim = min(n, indim, maxoutdim), where n is the number of observations and indim the input dimension.\nmaxiter::Int=1000: Maximum number of iterations.\ntol::Real=1e-6: Convergence tolerance.\neta::Real=tol: Variance lower bound.\nmean::Union{Nothing, Real, Vector{Float64}}=nothing: If nothing, centering will be computed and applied; if set to 0 no centering is applied (data is assumed pre-centered); if a vector, the centering is done with that vector.","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/#Operations","page":"FactorAnalysis","title":"Operations","text":"","category":"section"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.\ninverse_transform(mach, Xsmall): For a dimension-reduced table Xsmall, such as returned by transform, reconstruct a table, having the same number of columns as the original training data X, that transforms to Xsmall. Mathematically, inverse_transform is a right-inverse for the PCA projection map, whose image is orthogonal to the kernel of that map. 
In particular, if Xsmall = transform(mach, Xnew), then inverse_transform(Xsmall) is only an approximation to Xnew.","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/#Fitted-parameters","page":"FactorAnalysis","title":"Fitted parameters","text":"","category":"section"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"projection: Returns the projection matrix, which has size (indim, outdim), where indim and outdim are the number of features of the input and output respectively. Each column of the projection matrix corresponds to a factor.","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/#Report","page":"FactorAnalysis","title":"Report","text":"","category":"section"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"The fields of report(mach) are:","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"indim: Dimension (number of columns) of the training data and new data to be transformed.\noutdim: Dimension of transformed data (number of factors).\nvariance: The variance of the factors.\ncovariance_matrix: The estimated covariance matrix.\nmean: The mean of the untransformed training data, of length indim.\nloadings: The factor loadings. A matrix of size (indim, outdim) where indim and outdim are as defined above.","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/#Examples","page":"FactorAnalysis","title":"Examples","text":"","category":"section"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"using MLJ\n\nFactorAnalysis = @load FactorAnalysis pkg=MultivariateStats\n\nX, y = @load_iris ## a table and a vector\n\nmodel = FactorAnalysis(maxoutdim=2)\nmach = machine(model, X) |> fit!\n\nXproj = transform(mach, X)","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"See also KernelPCA, ICA, PPCA, PCA","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/#SRRegressor_SymbolicRegression","page":"SRRegressor","title":"SRRegressor","text":"","category":"section"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"SRRegressor","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"A model type for constructing a Symbolic Regression via Evolutionary Search, based on SymbolicRegression.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"SRRegressor = @load SRRegressor pkg=SymbolicRegression","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"Do model = SRRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in SRRegressor(binary_operators=...).","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"Single-target Symbolic Regression regressor (SRRegressor) searches for symbolic expressions that predict a single target variable from a set of input variables. All data is assumed to be Continuous. The search is performed using an evolutionary algorithm. This algorithm is described in the paper https://arxiv.org/abs/2305.01582.","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/#Training-data","page":"SRRegressor","title":"Training data","text":"","category":"section"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"OR","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"mach = machine(model, X, y, w)","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"Here:","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). Variable names in discovered expressions will be taken from the column names of X, if available. Units in columns of X (use DynamicQuantities for units) will trigger dimensional analysis to be used.\ny is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y). Units in y (use DynamicQuantities for units) will trigger dimensional analysis to be used.\nw is the observation weights, which can either be nothing (default) or an AbstractVector whose element scitype is Count or Continuous.","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"Train the machine using fit!(mach), inspect the discovered expressions with report(mach), and predict on new data with predict(mach, Xnew). Note that unlike other regressors, symbolic regression stores a list of trained models. The model chosen from this list is defined by the function selection_method keyword argument, which by default balances accuracy and complexity. You can override this at prediction time by passing a named tuple with keys data and idx.","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/#Hyper-parameters","page":"SRRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"binary_operators: Vector of binary operators (functions) to use. Each operator should be defined for two input scalars, and one output scalar. All operators need to be defined over the entire real line (excluding infinity - these are stopped before they are input), or return NaN where not defined. For speed, define it so it takes two reals of the same type as input, and outputs the same type. 
For the SymbolicUtils simplification backend, you will need to define a generic method of the operator so it takes arbitrary types.\nunary_operators: Same, but for unary operators (one input scalar, gives an output scalar).\nconstraints: Array of pairs specifying size constraints for each operator. The constraints for a binary operator should be a 2-tuple (e.g., (-1, -1)) and the constraints for a unary operator should be an Int. A size constraint is a limit to the size of the subtree in each argument of an operator. e.g., [(^)=>(-1, 3)] means that the ^ operator can have arbitrary size (-1) in its left argument, but a maximum size of 3 in its right argument. Default is no constraints.\nbatching: Whether to evolve based on small mini-batches of data, rather than the entire dataset.\nbatch_size: What batch size to use if using batching.\nelementwise_loss: What elementwise loss function to use. Can be one of the following losses, or any other loss of type SupervisedLoss. You can also pass a function that takes a scalar target (left argument), and scalar predicted (right argument), and returns a scalar. This will be averaged over the predicted data. If weights are supplied, your function should take a third argument for the weight scalar. Included losses: Regression: - LPDistLoss{P}(), - L1DistLoss(), - L2DistLoss() (mean square), - LogitDistLoss(), - HuberLoss(d), - L1EpsilonInsLoss(ϵ), - L2EpsilonInsLoss(ϵ), - PeriodicLoss(c), - QuantileLoss(τ), Classification: - ZeroOneLoss(), - PerceptronLoss(), - L1HingeLoss(), - SmoothedL1HingeLoss(γ), - ModifiedHuberLoss(), - L2MarginLoss(), - ExpLoss(), - SigmoidLoss(), - DWDMarginLoss(q).\nloss_function: Alternatively, you may redefine the loss used as any function of tree::AbstractExpressionNode{T}, dataset::Dataset{T}, and options::Options, so long as you output a non-negative scalar of type T. This is useful if you want to use a loss that takes into account derivatives, or correlations across the dataset. This also means you could use a custom evaluation for a particular expression. If you are using batching=true, then your function should accept a fourth argument idx, which is either nothing (indicating that the full dataset should be used), or a vector of indices to use for the batch. For example,\n function my_loss(tree, dataset::Dataset{T,L}, options)::L where {T,L}\n prediction, flag = eval_tree_array(tree, dataset.X, options)\n if !flag\n return L(Inf)\n end\n return sum((prediction .- dataset.y) .^ 2) / dataset.n\n end\nnode_type::Type{N}=Node: The type of node to use for the search. For example, Node or GraphNode.\npopulations: How many populations of equations to use.\npopulation_size: How many equations in each population.\nncycles_per_iteration: How many generations to consider per iteration.\ntournament_selection_n: Number of expressions considered in each tournament.\ntournament_selection_p: The fittest expression in a tournament is to be selected with probability p, the next fittest with probability p*(1-p), and so forth.\ntopn: Number of equations to return to the host process, and to consider for the hall of fame.\ncomplexity_of_operators: What complexity should be assigned to each operator, and the occurrence of a constant or variable. By default, this is 1 for all operators. Can be a real number as well, in which case the complexity of an expression will be rounded to the nearest integer. Input this in the form of, e.g., [(^) => 3, sin => 2].\ncomplexity_of_constants: What complexity should be assigned to use of a constant. 
By default, this is 1.\ncomplexity_of_variables: What complexity should be assigned to each variable. By default, this is 1.\nalpha: The probability of accepting an equation mutation during regularized evolution is given by exp(-delta_loss/(alpha * T)), where T goes from 1 to 0. Thus, alpha=infinite is the same as no annealing.\nmaxsize: Maximum size of equations during the search.\nmaxdepth: Maximum depth of equations during the search, by default this is set equal to the maxsize.\nparsimony: A multiplicative factor for how much complexity is punished.\ndimensional_constraint_penalty: An additive factor if the dimensional constraint is violated.\nuse_frequency: Whether to use a parsimony that adapts to the relative proportion of equations at each complexity; this will ensure that there are a balanced number of equations considered for every complexity.\nuse_frequency_in_tournament: Whether to use the adaptive parsimony described above inside the score, rather than just at the mutation accept/reject stage.\nadaptive_parsimony_scaling: How much to scale the adaptive parsimony term in the loss. Increase this if the search is spending too much time optimizing the most complex equations.\nturbo: Whether to use LoopVectorization.@turbo to evaluate expressions. This can be significantly faster, but is only compatible with certain operators. Experimental!\nbumper: Whether to use Bumper.jl for faster evaluation. Experimental!\nmigration: Whether to migrate equations between processes.\nhof_migration: Whether to migrate equations from the hall of fame to processes.\nfraction_replaced: What fraction of each population to replace with migrated equations at the end of each cycle.\nfraction_replaced_hof: What fraction to replace with hall of fame equations at the end of each cycle.\nshould_simplify: Whether to simplify equations. If you pass a custom objective, this will be set to false.\nshould_optimize_constants: Whether to use an optimization algorithm to periodically optimize constants in equations.\noptimizer_algorithm: Select algorithm to use for optimizing constants. Default is Optim.BFGS(linesearch=LineSearches.BackTracking()).\noptimizer_nrestarts: How many different random starting positions to consider for optimization of constants.\noptimizer_probability: Probability of performing optimization of constants at the end of a given iteration.\noptimizer_iterations: How many optimization iterations to perform. This gets passed to Optim.Options as iterations. The default is 8.\noptimizer_f_calls_limit: How many function calls to allow during optimization. This gets passed to Optim.Options as f_calls_limit. The default is 0 which means no limit.\noptimizer_options: General options for the constant optimization. For details we refer to the documentation on Optim.Options from the Optim.jl package. Options can be provided here as NamedTuple, e.g. (iterations=16,), as a Dict, e.g. Dict(:x_tol => 1.0e-32,), or as an Optim.Options instance.\noutput_file: What file to store equations to, as a backup.\nperturbation_factor: When mutating a constant, either multiply or divide by (1+perturbation_factor)^(rand()+1).\nprobability_negate_constant: Probability of negating a constant in the equation when mutating it.\nmutation_weights: Relative probabilities of the mutations. The struct MutationWeights should be passed to these options. 
See its documentation on MutationWeights for the different weights.\ncrossover_probability: Probability of performing crossover.\nannealing: Whether to use simulated annealing.\nwarmup_maxsize_by: Whether to slowly increase the max size from 5 up to maxsize. If nonzero, specifies the fraction through the search at which the maxsize should be reached.\nverbosity: Whether to print debugging statements or not.\nprint_precision: How many digits to print when printing equations. By default, this is 5.\nsave_to_file: Whether to save equations to a file during the search.\nbin_constraints: See constraints. This is the same, but specified for binary operators only (for example, if you have an operator that is both a binary and unary operator).\nuna_constraints: Likewise, for unary operators.\nseed: What random seed to use. nothing uses no seed.\nprogress: Whether to use a progress bar output (verbosity will have no effect).\nearly_stop_condition: Float - whether to stop early if the mean loss gets below this value. Function - a function taking (loss, complexity) as arguments and returning true or false.\ntimeout_in_seconds: Float64 - the time in seconds after which to exit (as an alternative to the number of iterations).\nmax_evals: Int (or Nothing) - the maximum number of evaluations of expressions to perform.\nskip_mutation_failures: Whether to simply skip over mutations that fail or are rejected, rather than to replace the mutated expression with the original expression and proceed normally.\nnested_constraints: Specifies how many times a combination of operators can be nested. For example, [sin => [cos => 0], cos => [cos => 2]] specifies that cos may never appear within a sin, but sin can be nested with itself an unlimited number of times. The second term specifies that cos can be nested up to 2 times within a cos, so that cos(cos(cos(x))) is allowed (as well as any combination of + or - within it), but cos(cos(cos(cos(x)))) is not allowed. When an operator is not specified, it is assumed that it can be nested an unlimited number of times. This requires that there is no operator which is used both in the unary operators and the binary operators (e.g., - could be both subtract, and negation). For binary operators, both arguments are treated the same way, and the max of each argument is constrained.\ndeterministic: Use a global counter for the birth time, rather than calls to time(). This gives perfect resolution, and is therefore deterministic. However, it is not thread safe, and must be used in serial mode.\ndefine_helper_functions: Whether to define helper functions for constructing and evaluating trees.\nniterations::Int=10: The number of iterations to perform the search. More iterations will improve the results.\nparallelism=:multithreading: What parallelism mode to use. The options are :multithreading, :multiprocessing, and :serial. By default, multithreading will be used. Multithreading uses less memory, but multiprocessing can handle multi-node compute. If using :multithreading mode, the number of threads available to julia are used. If using :multiprocessing, numprocs processes will be created dynamically if procs is unset. If you have already allocated processes, pass them to the procs argument and they will be used. You may also pass a string instead of a symbol, like \"multithreading\".\nnumprocs::Union{Int, Nothing}=nothing: The number of processes to use, if you want equation_search to set this up automatically. 
By default this will be 4, but can be any number (you should pick a number <= the number of cores available).\nprocs::Union{Vector{Int}, Nothing}=nothing: If you have set up a distributed run manually with procs = addprocs() and @everywhere, pass the procs to this keyword argument.\naddprocs_function::Union{Function, Nothing}=nothing: If using multiprocessing (parallelism=:multiprocessing), and are not passing procs manually, then they will be allocated dynamically using addprocs. However, you may also pass a custom function to use instead of addprocs. This function should take a single positional argument, which is the number of processes to use, as well as the lazy keyword argument. For example, if set up on a slurm cluster, you could pass addprocs_function = addprocs_slurm, which will set up slurm processes.\nheap_size_hint_in_bytes::Union{Int,Nothing}=nothing: On Julia 1.9+, you may set the --heap-size-hint flag on Julia processes, recommending garbage collection once a process is close to the recommended size. This is important for long-running distributed jobs where each process has an independent memory, and can help avoid out-of-memory errors. By default, this is set to Sys.free_memory() / numprocs.\nruntests::Bool=true: Whether to run (quick) tests before starting the search, to see if there will be any problems during the equation search related to the host environment.\nloss_type::Type=Nothing: If you would like to use a different type for the loss than for the data you passed, specify the type here. Note that if you pass complex data ::Complex{L}, then the loss type will automatically be set to L.\nselection_method::Function: Function to select the expression from the Pareto frontier for use in predict. See SymbolicRegression.MLJInterfaceModule.choose_best for an example. This function should return a single integer specifying the index of the expression to use. By default, this maximizes the score (a pound-for-pound rating) of expressions reaching the threshold of 1.5x the minimum loss. To override this at prediction time, you can pass a named tuple with keys data and idx to predict. See the Operations section for details.\ndimensions_type::AbstractDimensions: The type of dimensions to use when storing the units of the data. By default this is DynamicQuantities.SymbolicDimensions.","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/#Operations","page":"SRRegressor","title":"Operations","text":"","category":"section"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"predict(mach, Xnew): Return predictions of the target given features Xnew, which should have the same scitype as X above. The expression used for prediction is defined by the selection_method function, which can be seen by viewing report(mach).best_idx.\npredict(mach, (data=Xnew, idx=i)): Return predictions of the target given features Xnew, which should have the same scitype as X above. 
By passing a named tuple with keys data and idx, you are able to specify the equation you wish to evaluate in idx.","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/#Fitted-parameters","page":"SRRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"best_idx::Int: The index of the best expression in the Pareto frontier, as determined by the selection_method function. Override in predict by passing a named tuple with keys data and idx.\nequations::Vector{Node{T}}: The expressions discovered by the search, represented in a dominating Pareto frontier (i.e., the best expressions found for each complexity). T is equal to the element type of the passed data.\nequation_strings::Vector{String}: The expressions discovered by the search, represented as strings for easy inspection.","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/#Report","page":"SRRegressor","title":"Report","text":"","category":"section"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"The fields of report(mach) are:","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"best_idx::Int: The index of the best expression in the Pareto frontier, as determined by the selection_method function. Override in predict by passing a named tuple with keys data and idx.\nequations::Vector{Node{T}}: The expressions discovered by the search, represented in a dominating Pareto frontier (i.e., the best expressions found for each complexity).\nequation_strings::Vector{String}: The expressions discovered by the search, represented as strings for easy inspection.\ncomplexities::Vector{Int}: The complexity of each expression in the Pareto frontier.\nlosses::Vector{L}: The loss of each expression in the Pareto frontier, according to the loss function specified in the model. The type L is the loss type, which is usually the same as the element type of data passed (i.e., T), but can differ if complex data types are passed.\nscores::Vector{L}: A metric which considers both the complexity and loss of an expression, equal to the change in the log-loss divided by the change in complexity, relative to the previous expression along the Pareto frontier. 
A larger score aims to indicate an expression is more likely to be the true expression generating the data, but this is very problem-dependent and generally several other factors should be considered.","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/#Examples","page":"SRRegressor","title":"Examples","text":"","category":"section"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"using MLJ\nSRRegressor = @load SRRegressor pkg=SymbolicRegression\nX, y = @load_boston\nmodel = SRRegressor(binary_operators=[+, -, *], unary_operators=[exp], niterations=100)\nmach = machine(model, X, y)\nfit!(mach)\ny_hat = predict(mach, X)\n## View the equation used:\nr = report(mach)\nprintln(\"Equation used:\", r.equation_strings[r.best_idx])","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"With units and variable names:","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"using MLJ\nusing DynamicQuantities\nSRRegressor = @load SRRegressor pkg=SymbolicRegression\n\nX = (; x1=rand(32) .* us\"km/h\", x2=rand(32) .* us\"km\")\ny = @. X.x2 / X.x1 + 0.5us\"h\"\nmodel = SRRegressor(binary_operators=[+, -, *, /])\nmach = machine(model, X, y)\nfit!(mach)\ny_hat = predict(mach, X)\n## View the equation used:\nr = report(mach)\nprintln(\"Equation used:\", r.equation_strings[r.best_idx])","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"See also MultitargetSRRegressor.","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/#EvoTreeGaussian_EvoTrees","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"","category":"section"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"EvoTreeGaussian(;kwargs...)","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"A model type for constructing an EvoTreeGaussian, based on EvoTrees.jl, and implementing both an internal API and the MLJ model interface. EvoTreeGaussian is used to perform Gaussian probabilistic regression, fitting μ and σ parameters to maximize likelihood.","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/#Hyper-parameters","page":"EvoTreeGaussian","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"nrounds=100: Number of rounds. It corresponds to the number of trees that will be sequentially stacked. Must be >= 1.\neta=0.1: Learning rate. Each tree's raw predictions are scaled by eta prior to being added to the stack of predictions. Must be > 0. A lower eta results in slower learning, requiring a higher nrounds but typically improves model performance.\nL2::T=0.0: L2 regularization factor on aggregate gain. Must be >= 0. Higher L2 can result in a more robust model.\nlambda::T=0.0: L2 regularization factor on individual gain. Must be >= 0. Higher lambda can result in a more robust model.\ngamma::T=0.0: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model. Must be >= 0.\nmax_depth=6: Maximum depth of a tree. Must be >= 1. A tree of depth 1 is made of a single prediction leaf. A complete tree of depth N contains 2^(N - 1) terminal leaves and 2^(N - 1) - 1 split nodes. 
Compute cost is proportional to 2^max_depth. Typical optimal values are in the 3 to 9 range.\nmin_weight=8.0: Minimum weight needed in a node to perform a split. Matches the number of observations by default or the sum of weights as provided by the weights vector. Must be > 0.\nrowsample=1.0: Proportion of rows that are sampled at each iteration to build the tree. Should be in ]0, 1].\ncolsample=1.0: Proportion of columns / features that are sampled at each iteration to build the tree. Should be in ]0, 1].\nnbins=64: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins. Should be between 2 and 255.\nmonotone_constraints=Dict{Int, Int}(): Specify monotonic constraints using a dict where the key is the feature index and the value the applicable constraint (-1=decreasing, 0=none, 1=increasing). !Experimental feature: note that for Gaussian regression, constraints may not be enforced systematically.\ntree_type=\"binary\": Tree structure to be used. One of:\nbinary: Each node of a tree is grown independently. Trees are built depthwise until max depth is reached or until min weight or gain (see gamma) stops further node splits.\noblivious: A common splitting condition is imposed on all nodes of a given depth.\nrng=123: Either an integer used as a seed to the random number generator or an actual random number generator (::Random.AbstractRNG).","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/#Internal-API","page":"EvoTreeGaussian","title":"Internal API","text":"","category":"section"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"Do config = EvoTreeGaussian() to construct an instance with default hyper-parameters. 
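As a quick, purely illustrative sketch (these keyword values are not defaults; following the hyper-parameter list above, the constraint below marks feature 1 as monotonically increasing):\nconfig = EvoTreeGaussian(max_depth=4, nbins=32, monotone_constraints=Dict(1 => 1))\n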
Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeGaussian(max_depth=...).","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/#Training-model","page":"EvoTreeGaussian","title":"Training model","text":"","category":"section"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"A model is built using fit_evotree:","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"model = fit_evotree(config; x_train, y_train, kwargs...)","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/#Inference","page":"EvoTreeGaussian","title":"Inference","text":"","category":"section"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"Predictions are obtained using predict, which returns a Matrix of size [nobs, 2] whose columns refer to μ and σ respectively:","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"EvoTrees.predict(model, X)","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"Alternatively, the model acts as a functor, returning predictions when called as a function with the features as argument:","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"model(X)","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/#MLJ","page":"EvoTreeGaussian","title":"MLJ","text":"","category":"section"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"From MLJ, the type can be imported using:","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"EvoTreeGaussian = @load EvoTreeGaussian pkg=EvoTrees","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"Do model = EvoTreeGaussian() to construct an instance with default hyper-parameters. 
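Because MLJ-side predictions for this model are Gaussian distributions (see Operations below), their parameters can be recovered elementwise; a minimal hedged sketch, assuming mach is a machine already fitted as in the Examples further below and that the Distributions package is available:\nusing Distributions\npreds = predict(mach, X)\nmus = mean.(preds)\nsigmas = std.(preds)\n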
Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeGaussian(loss=...).","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/#Training-data","page":"EvoTreeGaussian","title":"Training data","text":"","category":"section"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"where","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)\ny: is the target, which can be any AbstractVector whose element scitype is <:Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/#Operations","page":"EvoTreeGaussian","title":"Operations","text":"","category":"section"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"predict(mach, Xnew): returns a vector of Gaussian distributions given features Xnew having the same scitype as X above.","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"Predictions are probabilistic.","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"Specific metrics can also be predicted using:","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"predict_mean(mach, Xnew)\npredict_mode(mach, Xnew)\npredict_median(mach, Xnew)","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/#Fitted-parameters","page":"EvoTreeGaussian","title":"Fitted parameters","text":"","category":"section"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":":fitresult: The GBTree object returned by EvoTrees.jl fitting algorithm.","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/#Report","page":"EvoTreeGaussian","title":"Report","text":"","category":"section"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"The fields of report(mach) are:","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":":features: The names of the features encountered in training.","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/#Examples","page":"EvoTreeGaussian","title":"Examples","text":"","category":"section"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"## Internal API\nusing EvoTrees\nparams = EvoTreeGaussian(max_depth=5, 
nbins=32, nrounds=100)\nnobs, nfeats = 1_000, 5\nx_train, y_train = randn(nobs, nfeats), rand(nobs)\nmodel = fit_evotree(params; x_train, y_train)\npreds = EvoTrees.predict(model, x_train)","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"## MLJ Interface\nusing MLJ\nEvoTreeGaussian = @load EvoTreeGaussian pkg=EvoTrees\nmodel = EvoTreeGaussian(max_depth=5, nbins=32, nrounds=100)\nX, y = @load_boston\nmach = machine(model, X, y) |> fit!\npreds = predict(mach, X)\npreds = predict_mean(mach, X)\npreds = predict_mode(mach, X)\npreds = predict_median(mach, X)","category":"page"},{"location":"models/GaussianMixtureImputer_BetaML/#GaussianMixtureImputer_BetaML","page":"GaussianMixtureImputer","title":"GaussianMixtureImputer","text":"","category":"section"},{"location":"models/GaussianMixtureImputer_BetaML/","page":"GaussianMixtureImputer","title":"GaussianMixtureImputer","text":"mutable struct GaussianMixtureImputer <: MLJModelInterface.Unsupervised","category":"page"},{"location":"models/GaussianMixtureImputer_BetaML/","page":"GaussianMixtureImputer","title":"GaussianMixtureImputer","text":"Impute missing values using a probabilistic approach (Gaussian Mixture Models) fitted using the Expectation-Maximisation algorithm, from the Beta Machine Learning Toolkit (BetaML).","category":"page"},{"location":"models/GaussianMixtureImputer_BetaML/#Hyperparameters:","page":"GaussianMixtureImputer","title":"Hyperparameters:","text":"","category":"section"},{"location":"models/GaussianMixtureImputer_BetaML/","page":"GaussianMixtureImputer","title":"GaussianMixtureImputer","text":"n_classes::Int64: Number of mixtures (latent classes) to consider [def: 3]\ninitial_probmixtures::Vector{Float64}: Initial probabilities of the categorical distribution (n_classes x 1) [default: []]\nmixtures::Union{Type, Vector{<:BetaML.GMM.AbstractMixture}}: An array (of length n_classes) of the mixtures to employ (see the [?GMM](@ref GMM) module in BetaML). Each mixture object can be provided with or without its parameters (e.g. mean and variance for the gaussian ones). Fully qualified mixtures are useful only if the initialisation_strategy parameter is set to \"given\". This parameter can also be given simply in terms of a type. In this case it is automatically extended to a vector of n_classes mixtures of the specified type. Note that mixing of different mixture types is not currently supported and that currently implemented mixtures are SphericalGaussian, DiagonalGaussian and FullGaussian. [def: DiagonalGaussian]\ntol::Float64: Tolerance to stop the algorithm [default: 10^(-6)]\nminimum_variance::Float64: Minimum variance for the mixtures [default: 0.05]\nminimum_covariance::Float64: Minimum covariance for the mixtures with full covariance matrix [default: 0]. This should be set different from minimum_variance.\ninitialisation_strategy::String: The computation method of the vector of the initial mixtures. 
One of the following:\n\"grid\": using a grid approach\n\"given\": using the mixture provided in the fully qualified mixtures parameter\n\"kmeans\": first use kmeans (itself initialised with a \"grid\" strategy) to set the initial mixture centers [default]\nNote that currently \"random\" and \"shuffle\" initialisations are not supported in gmm-based algorithms.\nrng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]","category":"page"},{"location":"models/GaussianMixtureImputer_BetaML/#Example-:","page":"GaussianMixtureImputer","title":"Example :","text":"","category":"section"},{"location":"models/GaussianMixtureImputer_BetaML/","page":"GaussianMixtureImputer","title":"GaussianMixtureImputer","text":"julia> using MLJ\n\njulia> X = [1 10.5;1.5 missing; 1.8 8; 1.7 15; 3.2 40; missing missing; 3.3 38; missing -2.3; 5.2 -2.4] |> table ;\n\njulia> modelType = @load GaussianMixtureImputer pkg = \"BetaML\" verbosity=0\nBetaML.Imputation.GaussianMixtureImputer\n\njulia> model = modelType(initialisation_strategy=\"grid\")\nGaussianMixtureImputer(\n n_classes = 3, \n initial_probmixtures = Float64[], \n mixtures = BetaML.GMM.DiagonalGaussian{Float64}[BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing), BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing), BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing)], \n tol = 1.0e-6, \n minimum_variance = 0.05, \n minimum_covariance = 0.0, \n initialisation_strategy = \"grid\", \n rng = Random._GLOBAL_RNG())\n\njulia> mach = machine(model, X);\n\njulia> fit!(mach);\n[ Info: Training machine(GaussianMixtureImputer(n_classes = 3, …), …).\nIter. 1: Var. of the post 2.0225921341714286 Log-likelihood -42.96100103213314\n\njulia> X_full = transform(mach) |> MLJ.matrix\n9×2 Matrix{Float64}:\n 1.0 10.5\n 1.5 14.7366\n 1.8 8.0\n 1.7 15.0\n 3.2 40.0\n 2.51842 15.1747\n 3.3 38.0\n 2.47412 -2.3\n 5.2 -2.4","category":"page"},{"location":"models/HuberRegressor_MLJLinearModels/#HuberRegressor_MLJLinearModels","page":"HuberRegressor","title":"HuberRegressor","text":"","category":"section"},{"location":"models/HuberRegressor_MLJLinearModels/","page":"HuberRegressor","title":"HuberRegressor","text":"HuberRegressor","category":"page"},{"location":"models/HuberRegressor_MLJLinearModels/","page":"HuberRegressor","title":"HuberRegressor","text":"A model type for constructing a huber regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/HuberRegressor_MLJLinearModels/","page":"HuberRegressor","title":"HuberRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/HuberRegressor_MLJLinearModels/","page":"HuberRegressor","title":"HuberRegressor","text":"HuberRegressor = @load HuberRegressor pkg=MLJLinearModels","category":"page"},{"location":"models/HuberRegressor_MLJLinearModels/","page":"HuberRegressor","title":"HuberRegressor","text":"Do model = HuberRegressor() to construct an instance with default hyper-parameters.","category":"page"},{"location":"models/HuberRegressor_MLJLinearModels/","page":"HuberRegressor","title":"HuberRegressor","text":"This model coincides with RobustRegressor, with the exception that the robust loss, rho, is fixed to HuberRho(delta), where delta is a new hyperparameter.","category":"page"},{"location":"models/HuberRegressor_MLJLinearModels/","page":"HuberRegressor","title":"HuberRegressor","text":"Different solver options exist, as indicated under \"Hyperparameters\" 
below. ","category":"page"},{"location":"models/HuberRegressor_MLJLinearModels/#Training-data","page":"HuberRegressor","title":"Training data","text":"","category":"section"},{"location":"models/HuberRegressor_MLJLinearModels/","page":"HuberRegressor","title":"HuberRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/HuberRegressor_MLJLinearModels/","page":"HuberRegressor","title":"HuberRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/HuberRegressor_MLJLinearModels/","page":"HuberRegressor","title":"HuberRegressor","text":"where:","category":"page"},{"location":"models/HuberRegressor_MLJLinearModels/","page":"HuberRegressor","title":"HuberRegressor","text":"X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)\ny is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/HuberRegressor_MLJLinearModels/","page":"HuberRegressor","title":"HuberRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/HuberRegressor_MLJLinearModels/#Hyperparameters","page":"HuberRegressor","title":"Hyperparameters","text":"","category":"section"},{"location":"models/HuberRegressor_MLJLinearModels/","page":"HuberRegressor","title":"HuberRegressor","text":"delta::Real: parameterizes the HuberRho function (radius of the ball within which the loss is a quadratic loss) Default: 0.5\nlambda::Real: strength of the regularizer if penalty is :l2 or :l1. Strength of the L2 regularizer if penalty is :en. Default: 1.0\ngamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0\npenalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2\nfit_intercept::Bool: whether to fit the intercept or not. Default: true\npenalize_intercept::Bool: whether to penalize the intercept. Default: false\nscale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true\nsolver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S where S is one of: LBFGS, IWLSCG, Newton, NewtonCG, if penalty = :l2, and ProxGrad otherwise.\nIf solver = nothing (default) then LBFGS() is used, if penalty = :l2, and otherwise ProxGrad(accel=true) (FISTA) is used.\nSolver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) 
Default: nothing","category":"page"},{"location":"models/HuberRegressor_MLJLinearModels/#Example","page":"HuberRegressor","title":"Example","text":"","category":"section"},{"location":"models/HuberRegressor_MLJLinearModels/","page":"HuberRegressor","title":"HuberRegressor","text":"using MLJ\nX, y = make_regression()\nmach = fit!(machine(HuberRegressor(), X, y))\npredict(mach, X)\nfitted_params(mach)","category":"page"},{"location":"models/HuberRegressor_MLJLinearModels/","page":"HuberRegressor","title":"HuberRegressor","text":"See also RobustRegressor, QuantileRegressor.","category":"page"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/#UnivariateTimeTypeToContinuous_MLJModels","page":"UnivariateTimeTypeToContinuous","title":"UnivariateTimeTypeToContinuous","text":"","category":"section"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/","page":"UnivariateTimeTypeToContinuous","title":"UnivariateTimeTypeToContinuous","text":"UnivariateTimeTypeToContinuous","category":"page"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/","page":"UnivariateTimeTypeToContinuous","title":"UnivariateTimeTypeToContinuous","text":"A model type for constructing a single variable transformer that creates continuous representations of temporally typed data, based on MLJModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/","page":"UnivariateTimeTypeToContinuous","title":"UnivariateTimeTypeToContinuous","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/","page":"UnivariateTimeTypeToContinuous","title":"UnivariateTimeTypeToContinuous","text":"UnivariateTimeTypeToContinuous = @load UnivariateTimeTypeToContinuous pkg=MLJModels","category":"page"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/","page":"UnivariateTimeTypeToContinuous","title":"UnivariateTimeTypeToContinuous","text":"Do model = UnivariateTimeTypeToContinuous() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in UnivariateTimeTypeToContinuous(zero_time=...).","category":"page"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/","page":"UnivariateTimeTypeToContinuous","title":"UnivariateTimeTypeToContinuous","text":"Use this model to convert vectors with a TimeType element type to vectors of Float64 type (Continuous element scitype).","category":"page"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/#Training-data","page":"UnivariateTimeTypeToContinuous","title":"Training data","text":"","category":"section"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/","page":"UnivariateTimeTypeToContinuous","title":"UnivariateTimeTypeToContinuous","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/","page":"UnivariateTimeTypeToContinuous","title":"UnivariateTimeTypeToContinuous","text":"mach = machine(model, x)","category":"page"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/","page":"UnivariateTimeTypeToContinuous","title":"UnivariateTimeTypeToContinuous","text":"where","category":"page"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/","page":"UnivariateTimeTypeToContinuous","title":"UnivariateTimeTypeToContinuous","text":"x: any abstract vector whose element type is a subtype of Dates.TimeType","category":"page"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/","page":"UnivariateTimeTypeToContinuous","title":"UnivariateTimeTypeToContinuous","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/#Hyper-parameters","page":"UnivariateTimeTypeToContinuous","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/","page":"UnivariateTimeTypeToContinuous","title":"UnivariateTimeTypeToContinuous","text":"zero_time: the time that is to correspond to 0.0 under transformations, with the type coinciding with the training data element type. 
If unspecified, the earliest time encountered in training is used.\nstep::Period=Hour(24): time interval to correspond to one unit under transformation","category":"page"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/#Operations","page":"UnivariateTimeTypeToContinuous","title":"Operations","text":"","category":"section"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/","page":"UnivariateTimeTypeToContinuous","title":"UnivariateTimeTypeToContinuous","text":"transform(mach, xnew): apply the encoding inferred when mach was fit","category":"page"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/#Fitted-parameters","page":"UnivariateTimeTypeToContinuous","title":"Fitted parameters","text":"","category":"section"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/","page":"UnivariateTimeTypeToContinuous","title":"UnivariateTimeTypeToContinuous","text":"fitted_params(mach).fitresult is the tuple (zero_time, step) actually used in transformations, which may differ from the user-specified hyper-parameters.","category":"page"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/#Example","page":"UnivariateTimeTypeToContinuous","title":"Example","text":"","category":"section"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/","page":"UnivariateTimeTypeToContinuous","title":"UnivariateTimeTypeToContinuous","text":"using MLJ\nusing Dates\n\nx = [Date(2001, 1, 1) + Day(i) for i in 0:4]\n\nencoder = UnivariateTimeTypeToContinuous(zero_time=Date(2000, 1, 1),\n step=Week(1))\n\nmach = machine(encoder, x)\nfit!(mach)\njulia> transform(mach, x)\n5-element Vector{Float64}:\n 52.285714285714285\n 52.42857142857143\n 52.57142857142857\n 52.714285714285715\n 52.857142","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/#AdaBoostStumpClassifier_DecisionTree","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"","category":"section"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"AdaBoostStumpClassifier","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"A model type for constructing a Ada-boosted stump classifier, based on DecisionTree.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"AdaBoostStumpClassifier = @load AdaBoostStumpClassifier pkg=DecisionTree","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"Do model = AdaBoostStumpClassifier() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in AdaBoostStumpClassifier(n_iter=...).","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/#Training-data","page":"AdaBoostStumpClassifier","title":"Training data","text":"","category":"section"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"where:","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)\ny: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"Train the machine with fit!(mach, rows=...).","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/#Hyperparameters","page":"AdaBoostStumpClassifier","title":"Hyperparameters","text":"","category":"section"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"n_iter=10: number of iterations of AdaBoost\nfeature_importance: method to use for computing feature importances. One of (:impurity, :split)\nrng=Random.GLOBAL_RNG: random number generator or seed","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/#Operations","page":"AdaBoostStumpClassifier","title":"Operations","text":"","category":"section"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above. 
Predictions are probabilistic, but uncalibrated.\npredict_mode(mach, Xnew): instead return the mode of each prediction above.","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/#Fitted-Parameters","page":"AdaBoostStumpClassifier","title":"Fitted Parameters","text":"","category":"section"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"stumps: the Ensemble object returned by the core DecisionTree.jl algorithm.\ncoefficients: the stump coefficients (one per stump)","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/#Report","page":"AdaBoostStumpClassifier","title":"Report","text":"","category":"section"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"The fields of report(mach) are:","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"features: the names of the features encountered in training","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/#Accessor-functions","page":"AdaBoostStumpClassifier","title":"Accessor functions","text":"","category":"section"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"feature_importances(mach) returns a vector of (feature::Symbol => importance) pairs; the type of importance is determined by the hyperparameter feature_importance (see above)","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/#Examples","page":"AdaBoostStumpClassifier","title":"Examples","text":"","category":"section"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"using MLJ\nBooster = @load AdaBoostStumpClassifier pkg=DecisionTree\nbooster = Booster(n_iter=15)\n\nX, y = @load_iris\nmach = machine(booster, X, y) |> fit!\n\nXnew = (sepal_length = [6.4, 7.2, 7.4],\n sepal_width = [2.8, 3.0, 2.8],\n petal_length = [5.6, 5.8, 6.1],\n petal_width = [2.1, 1.6, 1.9],)\nyhat = predict(mach, Xnew) ## probabilistic predictions\npredict_mode(mach, Xnew) ## point predictions\npdf.(yhat, \"virginica\") ## probabilities for the \"virginica\" class\n\nfitted_params(mach).stumps ## raw `Ensemble` object from DecisionTree.jl\nfitted_params(mach).coefs ## coefficient associated with each stump\nfeature_importances(mach)","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"See also DecisionTree.jl and the unwrapped model type MLJDecisionTreeInterface.DecisionTree.AdaBoostStumpClassifier.","category":"page"},{"location":"models/CBLOFDetector_OutlierDetectionPython/#CBLOFDetector_OutlierDetectionPython","page":"CBLOFDetector","title":"CBLOFDetector","text":"","category":"section"},{"location":"models/CBLOFDetector_OutlierDetectionPython/","page":"CBLOFDetector","title":"CBLOFDetector","text":"CBLOFDetector(n_clusters = 8,\n alpha = 0.9,\n beta = 5,\n use_weights = false,\n random_state = nothing,\n n_jobs = 
1)","category":"page"},{"location":"models/CBLOFDetector_OutlierDetectionPython/","page":"CBLOFDetector","title":"CBLOFDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.cblof","category":"page"},{"location":"models/LassoLarsRegressor_MLJScikitLearnInterface/#LassoLarsRegressor_MLJScikitLearnInterface","page":"LassoLarsRegressor","title":"LassoLarsRegressor","text":"","category":"section"},{"location":"models/LassoLarsRegressor_MLJScikitLearnInterface/","page":"LassoLarsRegressor","title":"LassoLarsRegressor","text":"LassoLarsRegressor","category":"page"},{"location":"models/LassoLarsRegressor_MLJScikitLearnInterface/","page":"LassoLarsRegressor","title":"LassoLarsRegressor","text":"A model type for constructing a Lasso model fit with least angle regression (LARS), based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LassoLarsRegressor_MLJScikitLearnInterface/","page":"LassoLarsRegressor","title":"LassoLarsRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LassoLarsRegressor_MLJScikitLearnInterface/","page":"LassoLarsRegressor","title":"LassoLarsRegressor","text":"LassoLarsRegressor = @load LassoLarsRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/LassoLarsRegressor_MLJScikitLearnInterface/","page":"LassoLarsRegressor","title":"LassoLarsRegressor","text":"Do model = LassoLarsRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LassoLarsRegressor(alpha=...).","category":"page"},{"location":"models/LassoLarsRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"LassoLarsRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/LassoLarsRegressor_MLJScikitLearnInterface/","page":"LassoLarsRegressor","title":"LassoLarsRegressor","text":"alpha = 1.0\nfit_intercept = true\nverbose = false\nnormalize = false\nprecompute = auto\nmax_iter = 500\neps = 2.220446049250313e-16\ncopy_X = true\nfit_path = true\npositive = false","category":"page"},{"location":"models/TSVDTransformer_TSVD/#TSVDTransformer_TSVD","page":"TSVDTransformer","title":"TSVDTransformer","text":"","category":"section"},{"location":"models/TSVDTransformer_TSVD/","page":"TSVDTransformer","title":"TSVDTransformer","text":"Truncated SVD dimensionality reduction","category":"page"},{"location":"models/COFDetector_OutlierDetectionPython/#COFDetector_OutlierDetectionPython","page":"COFDetector","title":"COFDetector","text":"","category":"section"},{"location":"models/COFDetector_OutlierDetectionPython/","page":"COFDetector","title":"COFDetector","text":"COFDetector(n_neighbors = 5,\n method=\"fast\")","category":"page"},{"location":"models/COFDetector_OutlierDetectionPython/","page":"COFDetector","title":"COFDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.cof","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/#ProbabilisticSVC_LIBSVM","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"","category":"section"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"ProbabilisticSVC","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"A model type for constructing a probabilistic C-support vector classifier, based on LIBSVM.jl, and implementing the 
MLJ model interface.","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"ProbabilisticSVC = @load ProbabilisticSVC pkg=LIBSVM","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"Do model = ProbabilisticSVC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ProbabilisticSVC(kernel=...).","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"This model is identical to SVC with the exception that it predicts probabilities, instead of actual class labels. Probabilities are computed using Platt scaling, which will add to the total computation time.","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): \"LIBSVM: a library for support vector machines.\" ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf. ","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"Platt, John (1999): \"Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods.\"","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/#Training-data","page":"ProbabilisticSVC","title":"Training data","text":"","category":"section"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"In MLJ or MLJBase, bind an instance model to data with one of:","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"mach = machine(model, X, y)\nmach = machine(model, X, y, w)","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"where","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)\ny: is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)\nw: a dictionary of class weights, keyed on levels(y).","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/#Hyper-parameters","page":"ProbabilisticSVC","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. 
Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see \"Examples\" below).\nLIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2\nLIBSVM.Kernel.Polynomial: (x1, x2) -> (gamma*x1'*x2 + coef0)^degree\nLIBSVM.Kernel.RadialBasis: (x1, x2) -> (exp(-gamma*norm(x1 - x2)^2))\nLIBSVM.Kernel.Sigmoid: (x1, x2) -> tanh(gamma*x1'*x2 + coef0)\nHere gamma, coef0, degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions. See LIBSVM.jl issue 91\ngamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. Actual value used appears in the report (see below).\ncoef0 = 0.0: kernel parameter (see above)\ndegree::Int32 = Int32(3): degree in polynomial kernel (see above)\ncost=1.0 (range (0, Inf)): the parameter denoted C in the cited reference; for greater regularization, decrease cost\ncachesize=200.0: cache memory size in MB\ntolerance=0.001: tolerance for the stopping criterion\nshrinking=true: whether to use shrinking heuristics","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/#Operations","page":"ProbabilisticSVC","title":"Operations","text":"","category":"section"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"predict(mach, Xnew): return probabilistic predictions of the target given features Xnew having the same scitype as X above.","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/#Fitted-parameters","page":"ProbabilisticSVC","title":"Fitted parameters","text":"","category":"section"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"libsvm_model: the trained model object created by the LIBSVM.jl package\nencoding: class encoding used internally by libsvm_model - a dictionary of class labels keyed on the internal integer representation","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/#Report","page":"ProbabilisticSVC","title":"Report","text":"","category":"section"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"The fields of report(mach) are:","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"gamma: actual value of the kernel parameter gamma used in training","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/#Examples","page":"ProbabilisticSVC","title":"Examples","text":"","category":"section"},{"location":"models/ProbabilisticSVC_LIBSVM/#Using-a-built-in-kernel","page":"ProbabilisticSVC","title":"Using a built-in kernel","text":"","category":"section"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"using MLJ\nimport LIBSVM\n\nProbabilisticSVC = @load ProbabilisticSVC pkg=LIBSVM ## model type\nmodel = ProbabilisticSVC(kernel=LIBSVM.Kernel.Polynomial) ## instance\n\nX, y = @load_iris ## table, vector\nmach = machine(model, X, y) |> fit!\n\nXnew = (sepal_length = [6.4, 7.2, 7.4],\n sepal_width = [2.8, 3.0, 2.8],\n petal_length = [5.6, 5.8, 6.1],\n petal_width = [2.1, 1.6, 1.9],)\n\njulia> probs = predict(mach, 
Xnew)\n3-element UnivariateFiniteVector{Multiclass{3}, String, UInt32, Float64}:\n UnivariateFinite{Multiclass{3}}(setosa=>0.00186, versicolor=>0.003, virginica=>0.995)\n UnivariateFinite{Multiclass{3}}(setosa=>0.000563, versicolor=>0.0554, virginica=>0.944)\n UnivariateFinite{Multiclass{3}}(setosa=>1.4e-6, versicolor=>1.68e-6, virginica=>1.0)\n\n\njulia> labels = mode.(probs)\n3-element CategoricalArrays.CategoricalArray{String,1,UInt32}:\n \"virginica\"\n \"virginica\"\n \"virginica\"","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/#User-defined-kernels","page":"ProbabilisticSVC","title":"User-defined kernels","text":"","category":"section"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"k(x1, x2) = x1'*x2 ## equivalent to `LIBSVM.Kernel.Linear`\nmodel = ProbabilisticSVC(kernel=k)\nmach = machine(model, X, y) |> fit!\n\nprobs = predict(mach, Xnew)","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/#Incorporating-class-weights","page":"ProbabilisticSVC","title":"Incorporating class weights","text":"","category":"section"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"In either scenario above, we can do:","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"weights = Dict(\"virginica\" => 1, \"versicolor\" => 20, \"setosa\" => 1)\nmach = machine(model, X, y, weights) |> fit!\n\nprobs = predict(mach, Xnew)","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"See also the classifiers SVC, NuSVC and LinearSVC, and LIBSVM.jl and the original C implementation documentation.","category":"page"},{"location":"models/LogisticClassifier_MLJScikitLearnInterface/#LogisticClassifier_MLJScikitLearnInterface","page":"LogisticClassifier","title":"LogisticClassifier","text":"","category":"section"},{"location":"models/LogisticClassifier_MLJScikitLearnInterface/","page":"LogisticClassifier","title":"LogisticClassifier","text":"LogisticClassifier","category":"page"},{"location":"models/LogisticClassifier_MLJScikitLearnInterface/","page":"LogisticClassifier","title":"LogisticClassifier","text":"A model type for constructing a logistic regression classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LogisticClassifier_MLJScikitLearnInterface/","page":"LogisticClassifier","title":"LogisticClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LogisticClassifier_MLJScikitLearnInterface/","page":"LogisticClassifier","title":"LogisticClassifier","text":"LogisticClassifier = @load LogisticClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/LogisticClassifier_MLJScikitLearnInterface/","page":"LogisticClassifier","title":"LogisticClassifier","text":"Do model = LogisticClassifier() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in LogisticClassifier(penalty=...).","category":"page"},{"location":"models/LogisticClassifier_MLJScikitLearnInterface/#Hyper-parameters","page":"LogisticClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/LogisticClassifier_MLJScikitLearnInterface/","page":"LogisticClassifier","title":"LogisticClassifier","text":"penalty = l2\ndual = false\ntol = 0.0001\nC = 1.0\nfit_intercept = true\nintercept_scaling = 1.0\nclass_weight = nothing\nrandom_state = nothing\nsolver = lbfgs\nmax_iter = 100\nmulti_class = auto\nverbose = 0\nwarm_start = false\nn_jobs = nothing\nl1_ratio = nothing","category":"page"},{"location":"models/ImageClassifier_MLJFlux/#ImageClassifier_MLJFlux","page":"ImageClassifier","title":"ImageClassifier","text":"","category":"section"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"ImageClassifier","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"A model type for constructing an image classifier, based on MLJFlux.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"ImageClassifier = @load ImageClassifier pkg=MLJFlux","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"Do model = ImageClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ImageClassifier(builder=...).","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"ImageClassifier classifies images using a neural network adapted to the type of images provided (color or gray scale). Predictions are probabilistic. Users provide a recipe for constructing the network, based on properties of the image encountered, by specifying an appropriate builder. 
See MLJFlux documentation for more on builders.","category":"page"},{"location":"models/ImageClassifier_MLJFlux/#Training-data","page":"ImageClassifier","title":"Training data","text":"","category":"section"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"Here:","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"X is any AbstractVector of images with ColorImage or GrayImage scitype; check the scitype with scitype(X) and refer to ScientificTypes.jl documentation on coercing typical image formats into an appropriate type.\ny is the target, which can be any AbstractVector whose element scitype is Multiclass; check the scitype with scitype(y).","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"Train the machine with fit!(mach, rows=...).","category":"page"},{"location":"models/ImageClassifier_MLJFlux/#Hyper-parameters","page":"ImageClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"builder: An MLJFlux builder that constructs the neural network. The fallback builds a depth-16 VGG architecture adapted to the image size and number of target classes, with no batch normalization; see the Metalhead.jl documentation for details. See the example below for a user-specified builder. A convenience macro @builder is also available. See also finaliser below.\noptimiser::Flux.Adam(): A Flux.Optimise optimiser. The optimiser performs the updating of the weights of the network. For further reference, see the Flux optimiser documentation. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.\nloss=Flux.crossentropy: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. For a classification task, the most natural loss functions are:\nFlux.crossentropy: Standard multiclass classification loss, also known as the log loss.\nFlux.logitcrossentropy: Mathematically equal to crossentropy, but numerically more stable than finalising the outputs with softmax and then calculating crossentropy. You will need to specify finaliser=identity to remove MLJFlux's default softmax finaliser, and understand that the output of predict is then unnormalized (no longer probabilistic).\nFlux.tversky_loss: Used with imbalanced data to give more weight to false negatives.\nFlux.focal_loss: Used with highly imbalanced data. Weights harder examples more than easier examples.\nCurrently MLJ measures are not supported values of loss.\nepochs::Int=10: The duration of training, in epochs. Typically, one epoch represents one pass through the complete training dataset.\nbatch_size::Int=1: the batch size to be used for training, representing the number of samples per update of the network weights. 
Typically, batch size is between 8 and 512. Increasing batch size may accelerate training if acceleration=CUDALibs() and a GPU is available.\nlambda::Float64=0: The strength of the weight regularization penalty. Can be any value in the range [0, ∞).\nalpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.\nrng::Union{AbstractRNG, Int64}: The random number generator or seed used during training.\noptimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.\nacceleration::AbstractResource=CPU1(): Defines on what hardware training is done. For training on GPU, use CUDALibs().\nfinaliser=Flux.softmax: The final activation function of the neural network (applied after the network defined by builder). Defaults to Flux.softmax.","category":"page"},{"location":"models/ImageClassifier_MLJFlux/#Operations","page":"ImageClassifier","title":"Operations","text":"","category":"section"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"predict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above. Predictions are probabilistic but uncalibrated.\npredict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.","category":"page"},{"location":"models/ImageClassifier_MLJFlux/#Fitted-parameters","page":"ImageClassifier","title":"Fitted parameters","text":"","category":"section"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"chain: The trained \"chain\" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network. This includes the final layer specified by finaliser (eg, softmax).","category":"page"},{"location":"models/ImageClassifier_MLJFlux/#Report","page":"ImageClassifier","title":"Report","text":"","category":"section"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"The fields of report(mach) are:","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"training_losses: A vector of training losses (penalised if lambda != 0) in historical order, of length epochs + 1. 
The first element is the pre-training loss.","category":"page"},{"location":"models/ImageClassifier_MLJFlux/#Examples","page":"ImageClassifier","title":"Examples","text":"","category":"section"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"In this example we use MLJFlux and a custom builder to classify the MNIST image dataset.","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"using MLJ\nusing Flux\nimport MLJFlux\nimport MLJIteration ## for `skip` control","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"First we want to download the MNIST dataset, and unpack into images and labels:","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"import MLDatasets: MNIST\ndata = MNIST(split=:train)\nimages, labels = data.features, data.targets","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"In MLJ, integers cannot be used for encoding categorical data, so we must coerce them into the Multiclass scitype:","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"labels = coerce(labels, Multiclass);","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"Above images is a single array but MLJFlux requires the images to be a vector of individual image arrays:","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"images = coerce(images, GrayImage);\nimages[1]","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"We start by defining a suitable builder object. This is a recipe for building the neural network. Our builder will work for images of any (constant) size, whether they be color or black and white (ie, single or multi-channel). The architecture always consists of six alternating convolution and max-pool layers, and a final dense layer; the filter size and the number of channels after each convolution layer is customizable.","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"import MLJFlux\n\nstruct MyConvBuilder\n filter_size::Int\n channels1::Int\n channels2::Int\n channels3::Int\nend\n\nmake2d(x::AbstractArray) = reshape(x, :, size(x)[end])\n\nfunction MLJFlux.build(b::MyConvBuilder, rng, n_in, n_out, n_channels)\n k, c1, c2, c3 = b.filter_size, b.channels1, b.channels2, b.channels3\n mod(k, 2) == 1 || error(\"`filter_size` must be odd. \")\n p = div(k - 1, 2) ## padding to preserve image size\n init = Flux.glorot_uniform(rng)\n front = Chain(\n Conv((k, k), n_channels => c1, pad=(p, p), relu, init=init),\n MaxPool((2, 2)),\n Conv((k, k), c1 => c2, pad=(p, p), relu, init=init),\n MaxPool((2, 2)),\n Conv((k, k), c2 => c3, pad=(p, p), relu, init=init),\n MaxPool((2 ,2)),\n make2d)\n d = Flux.outputsize(front, (n_in..., n_channels, 1)) |> first\n return Chain(front, Dense(d, n_out, init=init))\nend","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"It is important to note that in our build function, there is no final softmax. 
This is applied by default in all MLJFlux classifiers (override this using the finaliser hyperparameter).","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"Now that our builder is defined, we can instantiate the actual MLJFlux model. If you have a GPU, you can substitute in acceleration=CUDALibs() below to speed up training.","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"ImageClassifier = @load ImageClassifier pkg=MLJFlux\nclf = ImageClassifier(builder=MyConvBuilder(3, 16, 32, 32),\n batch_size=50,\n epochs=10,\n rng=123)","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"You can add Flux options such as optimiser and loss in the snippet above. Currently, loss must be a flux-compatible loss, and not an MLJ measure.","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"Next, we can bind the model with the data in a machine, and train using the first 500 images:","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"mach = machine(clf, images, labels);\nfit!(mach, rows=1:500, verbosity=2);\nreport(mach)\nchain = fitted_params(mach)\nFlux.params(chain)[2]","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"We can tack on 20 more epochs by modifying the epochs field, and iteratively fit some more:","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"clf.epochs = clf.epochs + 20\nfit!(mach, rows=1:500, verbosity=2);","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"We can also make predictions and calculate an out-of-sample loss estimate, using any MLJ measure (loss/score):","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"predicted_labels = predict(mach, rows=501:1000);\ncross_entropy(predicted_labels, labels[501:1000]) |> mean","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"The preceding fit!/predict/evaluate workflow can be alternatively executed as follows:","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"evaluate!(mach,\n resampling=Holdout(fraction_train=0.5),\n measure=cross_entropy,\n rows=1:1000,\n verbosity=0)","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"See also NeuralNetworkClassifier.","category":"page"},{"location":"models/DeterministicConstantRegressor_MLJModels/#DeterministicConstantRegressor_MLJModels","page":"DeterministicConstantRegressor","title":"DeterministicConstantRegressor","text":"","category":"section"},{"location":"models/DeterministicConstantRegressor_MLJModels/","page":"DeterministicConstantRegressor","title":"DeterministicConstantRegressor","text":"DeterministicConstantRegressor","category":"page"},{"location":"models/DeterministicConstantRegressor_MLJModels/","page":"DeterministicConstantRegressor","title":"DeterministicConstantRegressor","text":"A model type for constructing a 
deterministic constant regressor, based on MLJModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/DeterministicConstantRegressor_MLJModels/","page":"DeterministicConstantRegressor","title":"DeterministicConstantRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/DeterministicConstantRegressor_MLJModels/","page":"DeterministicConstantRegressor","title":"DeterministicConstantRegressor","text":"DeterministicConstantRegressor = @load DeterministicConstantRegressor pkg=MLJModels","category":"page"},{"location":"models/DeterministicConstantRegressor_MLJModels/","page":"DeterministicConstantRegressor","title":"DeterministicConstantRegressor","text":"Do model = DeterministicConstantRegressor() to construct an instance with default hyper-parameters. ","category":"page"},{"location":"models/SMOTE_Imbalance/#SMOTE_Imbalance","page":"SMOTE","title":"SMOTE","text":"","category":"section"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"Initiate a SMOTE model with the given hyper-parameters.","category":"page"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"SMOTE","category":"page"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"A model type for constructing a smote, based on Imbalance.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"SMOTE = @load SMOTE pkg=Imbalance","category":"page"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"Do model = SMOTE() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SMOTE(k=...).","category":"page"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"SMOTE implements the SMOTE algorithm to correct for class imbalance as in N. V. Chawla, K. W. Bowyer, L. O.Hall, W. P. 
Kegelmeyer, “SMOTE: synthetic minority over-sampling technique,” Journal of artificial intelligence research, 321-357, 2002.","category":"page"},{"location":"models/SMOTE_Imbalance/#Training-data","page":"SMOTE","title":"Training data","text":"","category":"section"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"In MLJ or MLJBase, wrap the model in a machine by","category":"page"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"mach = machine(model)","category":"page"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"There is no need to provide any data here because the model is a static transformer.","category":"page"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"Likewise, there is no need to fit!(mach).","category":"page"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"For default values of the hyper-parameters, model can be constructed by","category":"page"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"model = SMOTE()","category":"page"},{"location":"models/SMOTE_Imbalance/#Hyperparameters","page":"SMOTE","title":"Hyperparameters","text":"","category":"section"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"k=5: Number of nearest neighbors to consider in the SMOTE algorithm. Should be within the range [1, n - 1], where n is the number of observations; otherwise set to the nearest of these two values.\nratios=1.0: A parameter that controls the amount of oversampling to be done for each class\nCan be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class\nCan be a dictionary mapping each class label to the float ratio for that class\nrng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister`.","category":"page"},{"location":"models/SMOTE_Imbalance/#Transform-Inputs","page":"SMOTE","title":"Transform Inputs","text":"","category":"section"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"X: A matrix or table of floats where each row is an observation from the dataset\ny: An abstract vector of labels (e.g., strings) that correspond to the observations in X","category":"page"},{"location":"models/SMOTE_Imbalance/#Transform-Outputs","page":"SMOTE","title":"Transform Outputs","text":"","category":"section"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"Xover: A matrix or table that includes original data and the new observations due to oversampling. 
depending on whether the input X is a matrix or table respectively\nyover: An abstract vector of labels corresponding to Xover","category":"page"},{"location":"models/SMOTE_Imbalance/#Operations","page":"SMOTE","title":"Operations","text":"","category":"section"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"transform(mach, X, y): resample the data X and y using SMOTE, returning both the new and original observations","category":"page"},{"location":"models/SMOTE_Imbalance/#Example","page":"SMOTE","title":"Example","text":"","category":"section"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"using MLJ\nimport Imbalance\n\n## set probability of each class\nclass_probs = [0.5, 0.2, 0.3] \nnum_rows, num_continuous_feats = 100, 5\n## generate a table and categorical vector accordingly\nX, y = Imbalance.generate_imbalanced_data(num_rows, num_continuous_feats; \n class_probs, rng=42) \n\njulia> Imbalance.checkbalance(y)\n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (39.6%) \n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 33 (68.8%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%) \n\n## load SMOTE\nSMOTE = @load SMOTE pkg=Imbalance\n\n## wrap the model in a machine\noversampler = SMOTE(k=5, ratios=Dict(0=>1.0, 1=> 0.9, 2=>0.8), rng=42)\nmach = machine(oversampler)\n\n## provide the data to transform (there is nothing to fit)\nXover, yover = transform(mach, X, y)\n\njulia> Imbalance.checkbalance(yover)\n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 38 (79.2%) \n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 43 (89.6%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%) \n","category":"page"},{"location":"adding_models_for_general_use/#Adding-Models-for-General-Use","page":"Adding Models for General Use","title":"Adding Models for General Use","text":"","category":"section"},{"location":"adding_models_for_general_use/","page":"Adding Models for General Use","title":"Adding Models for General Use","text":"To write a complete MLJ model interface for new or existing machine learning models, suitable for addition to the MLJ Model Registry, consult the MLJModelInterface.jl documentation.","category":"page"},{"location":"adding_models_for_general_use/","page":"Adding Models for General Use","title":"Adding Models for General Use","text":"For quick-and-dirty user-defined models see Simple User Defined Models.","category":"page"},{"location":"models/PartLS_PartitionedLS/#PartLS_PartitionedLS","page":"PartLS","title":"PartLS","text":"","category":"section"},{"location":"models/PartLS_PartitionedLS/","page":"PartLS","title":"PartLS","text":"PartLS","category":"page"},{"location":"models/PartLS_PartitionedLS/","page":"PartLS","title":"PartLS","text":"A model type for fitting a partitioned least squares model to data. Both an MLJ and native interfacew are provided.","category":"page"},{"location":"models/PartLS_PartitionedLS/#MLJ-Interface","page":"PartLS","title":"MLJ Interface","text":"","category":"section"},{"location":"models/PartLS_PartitionedLS/","page":"PartLS","title":"PartLS","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/PartLS_PartitionedLS/","page":"PartLS","title":"PartLS","text":"PartLS = @load PartLS pkg=PartitionedLS","category":"page"},{"location":"models/PartLS_PartitionedLS/","page":"PartLS","title":"PartLS","text":"Construct an instance with default hyper-parameters using the syntax model = PartLS(). 
Provide keyword arguments to override hyper-parameter defaults, as in model = PartLS(P=...).","category":"page"},{"location":"models/PartLS_PartitionedLS/#Training-data","page":"PartLS","title":"Training data","text":"","category":"section"},{"location":"models/PartLS_PartitionedLS/","page":"PartLS","title":"PartLS","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/PartLS_PartitionedLS/","page":"PartLS","title":"PartLS","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/PartLS_PartitionedLS/","page":"PartLS","title":"PartLS","text":"where","category":"page"},{"location":"models/PartLS_PartitionedLS/","page":"PartLS","title":"PartLS","text":"X: any matrix or table with Continuous element scitype. Check column scitypes of a table X with schema(X).","category":"page"},{"location":"models/PartLS_PartitionedLS/","page":"PartLS","title":"PartLS","text":"Train the machine using fit!(mach).","category":"page"},{"location":"models/PartLS_PartitionedLS/#Hyper-parameters","page":"PartLS","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/PartLS_PartitionedLS/","page":"PartLS","title":"PartLS","text":"Optimizer: the optimization algorithm to use. It can be Opt, Alt or BnB (names exported by PartitionedLS.jl).\nP: the partition matrix. It is a binary matrix where each row corresponds to a partition and each column corresponds to a feature. The element P_{k, i} = 1 if feature i belongs to partition k.\nη: the regularization parameter. It controls the strength of the regularization.\nϵ: the tolerance parameter. It is used to determine when the Alt optimization algorithm has converged. Only used by the Alt algorithm.\nT: the maximum number of iterations. It is used to determine when to stop the Alt optimization algorithm. Only used by the Alt algorithm.\nrng: the random number generator to use.\nIf nothing, the global random number generator rand is used.\nIf an integer, the global random number generator rand is used after seeding it with the given integer.\nIf an object of type AbstractRNG, the given random number generator is used.","category":"page"},{"location":"models/PartLS_PartitionedLS/#Operations","page":"PartLS","title":"Operations","text":"","category":"section"},{"location":"models/PartLS_PartitionedLS/","page":"PartLS","title":"PartLS","text":"predict(mach, Xnew): return the predictions of the model on new data Xnew","category":"page"},{"location":"models/PartLS_PartitionedLS/#Fitted-parameters","page":"PartLS","title":"Fitted parameters","text":"","category":"section"},{"location":"models/PartLS_PartitionedLS/","page":"PartLS","title":"PartLS","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/PartLS_PartitionedLS/","page":"PartLS","title":"PartLS","text":"α: the values of the α variables. For each partition k, it holds the values of the α variables, which are such that sum_{i ∈ P_k} α_i = 1.\nβ: the values of the β variables. For each partition k, β_k is the coefficient that multiplies the features in the k-th partition.\nt: the intercept term of the model.\nP: the partition matrix. It is a binary matrix where each row corresponds to a partition and each column corresponds to a feature. 
The element P_{k, i} = 1 if feature i belongs to partition k.","category":"page"},{"location":"models/PartLS_PartitionedLS/#Examples","page":"PartLS","title":"Examples","text":"","category":"section"},{"location":"models/PartLS_PartitionedLS/","page":"PartLS","title":"PartLS","text":"PartLS = @load PartLS pkg=PartitionedLS\n\nX = [[1. 2. 3.];\n [3. 3. 4.];\n [8. 1. 3.];\n [5. 3. 1.]]\n\ny = [1.;\n 1.;\n 2.;\n 3.]\n\nP = [[1 0];\n [1 0];\n [0 1]]\n\n\nmodel = PartLS(P=P)\nmach = machine(model, X, y) |> fit!\n\n## predictions on the training set:\npredict(mach, X)\n","category":"page"},{"location":"models/PartLS_PartitionedLS/#Native-Interface","page":"PartLS","title":"Native Interface","text":"","category":"section"},{"location":"models/PartLS_PartitionedLS/","page":"PartLS","title":"PartLS","text":"using PartitionedLS\n\nX = [[1. 2. 3.];\n [3. 3. 4.];\n [8. 1. 3.];\n [5. 3. 1.]]\n\ny = [1.;\n 1.;\n 2.;\n 3.]\n\nP = [[1 0];\n [1 0];\n [0 1]]\n\n\n## fit using the optimal algorithm\nresult = fit(Opt, X, y, P, η = 0.0)\ny_hat = predict(result.model, X)","category":"page"},{"location":"models/PartLS_PartitionedLS/","page":"PartLS","title":"PartLS","text":"For other fit keyword options, refer to the \"Hyper-parameters\" section for the MLJ interface.","category":"page"},{"location":"models/HDBSCAN_MLJScikitLearnInterface/#HDBSCAN_MLJScikitLearnInterface","page":"HDBSCAN","title":"HDBSCAN","text":"","category":"section"},{"location":"models/HDBSCAN_MLJScikitLearnInterface/","page":"HDBSCAN","title":"HDBSCAN","text":"HDBSCAN","category":"page"},{"location":"models/HDBSCAN_MLJScikitLearnInterface/","page":"HDBSCAN","title":"HDBSCAN","text":"A model type for constructing a hdbscan, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/HDBSCAN_MLJScikitLearnInterface/","page":"HDBSCAN","title":"HDBSCAN","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/HDBSCAN_MLJScikitLearnInterface/","page":"HDBSCAN","title":"HDBSCAN","text":"HDBSCAN = @load HDBSCAN pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/HDBSCAN_MLJScikitLearnInterface/","page":"HDBSCAN","title":"HDBSCAN","text":"Do model = HDBSCAN() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in HDBSCAN(min_cluster_size=...).","category":"page"},{"location":"models/HDBSCAN_MLJScikitLearnInterface/","page":"HDBSCAN","title":"HDBSCAN","text":"Hierarchical Density-Based Spatial Clustering of Applications with Noise. Performs DBSCAN over varying epsilon values and integrates the result to find a clustering that gives the best stability over epsilon. This allows HDBSCAN to find clusters of varying densities (unlike DBSCAN), and be more robust to parameter selection. 
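As a minimal usage sketch (the particular min_cluster_size value shown here is an arbitrary illustration, not a documented default), the model can be loaded and instantiated in the usual MLJ way:\n\nusing MLJ\nHDBSCAN = @load HDBSCAN pkg=MLJScikitLearnInterface\nmodel = HDBSCAN(min_cluster_size=10)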
","category":"page"},{"location":"models/MultiTaskElasticNetCVRegressor_MLJScikitLearnInterface/#MultiTaskElasticNetCVRegressor_MLJScikitLearnInterface","page":"MultiTaskElasticNetCVRegressor","title":"MultiTaskElasticNetCVRegressor","text":"","category":"section"},{"location":"models/MultiTaskElasticNetCVRegressor_MLJScikitLearnInterface/","page":"MultiTaskElasticNetCVRegressor","title":"MultiTaskElasticNetCVRegressor","text":"MultiTaskElasticNetCVRegressor","category":"page"},{"location":"models/MultiTaskElasticNetCVRegressor_MLJScikitLearnInterface/","page":"MultiTaskElasticNetCVRegressor","title":"MultiTaskElasticNetCVRegressor","text":"A model type for constructing a multi-target elastic net regressor with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/MultiTaskElasticNetCVRegressor_MLJScikitLearnInterface/","page":"MultiTaskElasticNetCVRegressor","title":"MultiTaskElasticNetCVRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/MultiTaskElasticNetCVRegressor_MLJScikitLearnInterface/","page":"MultiTaskElasticNetCVRegressor","title":"MultiTaskElasticNetCVRegressor","text":"MultiTaskElasticNetCVRegressor = @load MultiTaskElasticNetCVRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/MultiTaskElasticNetCVRegressor_MLJScikitLearnInterface/","page":"MultiTaskElasticNetCVRegressor","title":"MultiTaskElasticNetCVRegressor","text":"Do model = MultiTaskElasticNetCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultiTaskElasticNetCVRegressor(l1_ratio=...).","category":"page"},{"location":"models/MultiTaskElasticNetCVRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"MultiTaskElasticNetCVRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/MultiTaskElasticNetCVRegressor_MLJScikitLearnInterface/","page":"MultiTaskElasticNetCVRegressor","title":"MultiTaskElasticNetCVRegressor","text":"l1_ratio = 0.5\neps = 0.001\nn_alphas = 100\nalphas = nothing\nfit_intercept = true\nmax_iter = 1000\ntol = 0.0001\ncv = 5\ncopy_X = true\nverbose = 0\nn_jobs = nothing\nrandom_state = nothing\nselection = cyclic","category":"page"},{"location":"models/XGBoostRegressor_XGBoost/#XGBoostRegressor_XGBoost","page":"XGBoostRegressor","title":"XGBoostRegressor","text":"","category":"section"},{"location":"models/XGBoostRegressor_XGBoost/","page":"XGBoostRegressor","title":"XGBoostRegressor","text":"XGBoostRegressor","category":"page"},{"location":"models/XGBoostRegressor_XGBoost/","page":"XGBoostRegressor","title":"XGBoostRegressor","text":"A model type for constructing a eXtreme Gradient Boosting Regressor, based on XGBoost.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/XGBoostRegressor_XGBoost/","page":"XGBoostRegressor","title":"XGBoostRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/XGBoostRegressor_XGBoost/","page":"XGBoostRegressor","title":"XGBoostRegressor","text":"XGBoostRegressor = @load XGBoostRegressor pkg=XGBoost","category":"page"},{"location":"models/XGBoostRegressor_XGBoost/","page":"XGBoostRegressor","title":"XGBoostRegressor","text":"Do model = XGBoostRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in XGBoostRegressor(test=...).","category":"page"},{"location":"models/XGBoostRegressor_XGBoost/","page":"XGBoostRegressor","title":"XGBoostRegressor","text":"Univariate continuous regression using xgboost.","category":"page"},{"location":"models/XGBoostRegressor_XGBoost/#Training-data","page":"XGBoostRegressor","title":"Training data","text":"","category":"section"},{"location":"models/XGBoostRegressor_XGBoost/","page":"XGBoostRegressor","title":"XGBoostRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/XGBoostRegressor_XGBoost/","page":"XGBoostRegressor","title":"XGBoostRegressor","text":"m = machine(model, X, y)","category":"page"},{"location":"models/XGBoostRegressor_XGBoost/","page":"XGBoostRegressor","title":"XGBoostRegressor","text":"where","category":"page"},{"location":"models/XGBoostRegressor_XGBoost/","page":"XGBoostRegressor","title":"XGBoostRegressor","text":"X: any table of input features whose columns have Continuous element scitype; check column scitypes with schema(X).\ny: is an AbstractVector target with Continuous elements; check the scitype with scitype(y).","category":"page"},{"location":"models/XGBoostRegressor_XGBoost/","page":"XGBoostRegressor","title":"XGBoostRegressor","text":"Train using fit!(m, rows=...).","category":"page"},{"location":"models/XGBoostRegressor_XGBoost/#Hyper-parameters","page":"XGBoostRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/XGBoostRegressor_XGBoost/","page":"XGBoostRegressor","title":"XGBoostRegressor","text":"See https://xgboost.readthedocs.io/en/stable/parameter.html.","category":"page"},{"location":"models/LinearCountRegressor_GLM/#LinearCountRegressor_GLM","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"","category":"section"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"LinearCountRegressor","category":"page"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"A model type for constructing a linear count regressor, based on GLM.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"LinearCountRegressor = @load LinearCountRegressor pkg=GLM","category":"page"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"Do model = LinearCountRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearCountRegressor(fit_intercept=...).","category":"page"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"LinearCountRegressor is a generalized linear model, specialised to the case of a Count target variable (non-negative, unbounded integer) with user-specified link function. 
Options exist to specify an intercept or offset feature.","category":"page"},{"location":"models/LinearCountRegressor_GLM/#Training-data","page":"LinearCountRegressor","title":"Training data","text":"","category":"section"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"In MLJ or MLJBase, bind an instance model to data with one of:","category":"page"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"mach = machine(model, X, y)\nmach = machine(model, X, y, w)","category":"page"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"Here","category":"page"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"X: is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check the scitype with schema(X)\ny: is the target, which can be any AbstractVector whose element scitype is Count; check the scitype with schema(y)\nw: is a vector of Real per-observation weights","category":"page"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/LinearCountRegressor_GLM/#Hyper-parameters","page":"LinearCountRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"fit_intercept=true: Whether to calculate the intercept for this model. If set to false, no intercept will be calculated (e.g. the data is expected to be centered)\ndistribution=Distributions.Poisson(): The distribution which the residuals/errors of the model should fit.\nlink=GLM.LogLink(): The function which links the linear prediction function to the probability of a particular outcome or class. This should be one of the following: GLM.IdentityLink(), GLM.InverseLink(), GLM.InverseSquareLink(), GLM.LogLink(), GLM.SqrtLink().\noffsetcol=nothing: Name of the column to be used as an offset, if any. An offset is a variable which is known to have a coefficient of 1.\nmaxiter::Integer=30: The maximum number of iterations allowed to achieve convergence.\natol::Real=1e-6: Absolute threshold for convergence. Convergence is achieved when the relative change in deviance is less than `max(rtol*dev, atol). This term exists to avoid failure when deviance is unchanged except for rounding errors.\nrtol::Real=1e-6: Relative threshold for convergence. Convergence is achieved when the relative change in deviance is less than `max(rtol*dev, atol). This term exists to avoid failure when deviance is unchanged except for rounding errors.\nminstepfac::Real=0.001: Minimum step fraction. Must be between 0 and 1. Lower bound for the factor used to update the linear fit.\nreport_keys: Vector of keys for the report. Possible keys are: :deviance, :dof_residual, :stderror, :vcov, :coef_table and :glm_model. 
By default only :glm_model is excluded.","category":"page"},{"location":"models/LinearCountRegressor_GLM/#Operations","page":"LinearCountRegressor","title":"Operations","text":"","category":"section"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"predict(mach, Xnew): return predictions of the target given new features Xnew having the same Scitype as X above. Predictions are probabilistic.\npredict_mean(mach, Xnew): instead return the mean of each prediction above\npredict_median(mach, Xnew): instead return the median of each prediction above.","category":"page"},{"location":"models/LinearCountRegressor_GLM/#Fitted-parameters","page":"LinearCountRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"features: The names of the features encountered during model fitting.\ncoef: The linear coefficients determined by the model.\nintercept: The intercept determined by the model.","category":"page"},{"location":"models/LinearCountRegressor_GLM/#Report","page":"LinearCountRegressor","title":"Report","text":"","category":"section"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"The fields of report(mach) are:","category":"page"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"deviance: Measure of deviance of fitted model with respect to a perfectly fitted model. For a linear model, this is the weighted residual sum of squares\ndof_residual: The degrees of freedom for residuals, when meaningful.\nstderror: The standard errors of the coefficients.\nvcov: The estimated variance-covariance matrix of the coefficient estimates.\ncoef_table: Table which displays coefficients and summarizes their significance and confidence intervals.\nglm_model: The raw fitted model returned by GLM.lm. Note this points to training data. 
Refer to the GLM.jl documentation for usage.","category":"page"},{"location":"models/LinearCountRegressor_GLM/#Examples","page":"LinearCountRegressor","title":"Examples","text":"","category":"section"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"using MLJ\nimport MLJ.Distributions.Poisson\n\n## Generate some data whose target y looks Poisson when conditioned on\n## X:\nN = 10_000\nw = [1.0, -2.0, 3.0]\nmu(x) = exp(w'x) ## mean for a log link function\nXmat = rand(N, 3)\nX = MLJ.table(Xmat)\ny = map(1:N) do i\n x = Xmat[i, :]\n rand(Poisson(mu(x)))\nend;\n\nCountRegressor = @load LinearCountRegressor pkg=GLM\nmodel = CountRegressor(fit_intercept=false)\nmach = machine(model, X, y)\nfit!(mach)\n\nXnew = MLJ.table(rand(3, 3))\nyhat = predict(mach, Xnew)\nyhat_point = predict_mean(mach, Xnew)\n\n## get coefficients approximating `w`:\njulia> fitted_params(mach).coef\n3-element Vector{Float64}:\n 0.9969008753103842\n -2.0255901752504775\n 3.014407534033522\n\nreport(mach)","category":"page"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"See also LinearRegressor, LinearBinaryClassifier","category":"page"},{"location":"models/ElasticNetCVRegressor_MLJScikitLearnInterface/#ElasticNetCVRegressor_MLJScikitLearnInterface","page":"ElasticNetCVRegressor","title":"ElasticNetCVRegressor","text":"","category":"section"},{"location":"models/ElasticNetCVRegressor_MLJScikitLearnInterface/","page":"ElasticNetCVRegressor","title":"ElasticNetCVRegressor","text":"ElasticNetCVRegressor","category":"page"},{"location":"models/ElasticNetCVRegressor_MLJScikitLearnInterface/","page":"ElasticNetCVRegressor","title":"ElasticNetCVRegressor","text":"A model type for constructing a elastic net regression with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/ElasticNetCVRegressor_MLJScikitLearnInterface/","page":"ElasticNetCVRegressor","title":"ElasticNetCVRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/ElasticNetCVRegressor_MLJScikitLearnInterface/","page":"ElasticNetCVRegressor","title":"ElasticNetCVRegressor","text":"ElasticNetCVRegressor = @load ElasticNetCVRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/ElasticNetCVRegressor_MLJScikitLearnInterface/","page":"ElasticNetCVRegressor","title":"ElasticNetCVRegressor","text":"Do model = ElasticNetCVRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in ElasticNetCVRegressor(l1_ratio=...).","category":"page"},{"location":"models/ElasticNetCVRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"ElasticNetCVRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/ElasticNetCVRegressor_MLJScikitLearnInterface/","page":"ElasticNetCVRegressor","title":"ElasticNetCVRegressor","text":"l1_ratio = 0.5\neps = 0.001\nn_alphas = 100\nalphas = nothing\nfit_intercept = true\nprecompute = auto\nmax_iter = 1000\ntol = 0.0001\ncv = 5\ncopy_X = true\nverbose = 0\nn_jobs = nothing\npositive = false\nrandom_state = nothing\nselection = cyclic","category":"page"},{"location":"models/NeuralNetworkRegressor_BetaML/#NeuralNetworkRegressor_BetaML","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"","category":"section"},{"location":"models/NeuralNetworkRegressor_BetaML/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"mutable struct NeuralNetworkRegressor <: MLJModelInterface.Deterministic","category":"page"},{"location":"models/NeuralNetworkRegressor_BetaML/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"A simple but flexible Feedforward Neural Network, from the Beta Machine Learning Toolkit (BetaML) for regression of a single dimensional target.","category":"page"},{"location":"models/NeuralNetworkRegressor_BetaML/#Parameters:","page":"NeuralNetworkRegressor","title":"Parameters:","text":"","category":"section"},{"location":"models/NeuralNetworkRegressor_BetaML/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"layers: Array of layer objects [def: nothing, i.e. basic network]. See subtypes(BetaML.AbstractLayer) for supported layers\nloss: Loss (cost) function [def: BetaML.squared_cost]. Should always assume y and ŷ as matrices, even if the regression task is 1-D\nwarning: Warning\nIf you change the parameter loss, you need to either provide its derivative on the parameter dloss or use autodiff with dloss=nothing.\ndloss: Derivative of the loss function [def: BetaML.dsquared_cost, i.e. use the derivative of the squared cost]. Use nothing for autodiff.\nepochs: Number of epochs, i.e. passages through the whole training sample [def: 200]\nbatch_size: Size of each individual batch [def: 16]\nopt_alg: The optimisation algorithm to update the gradient at each batch [def: BetaML.ADAM()]. 
See subtypes(BetaML.OptimisationAlgorithm) for supported optimizers\nshuffle: Whether to randomly shuffle the data at each iteration (epoch) [def: true]\ndescr: An optional title and/or description for this model\ncb: A call back function to provide information during training [def: fitting_info]\nrng: Random Number Generator (see FIXEDSEED) [deafult: Random.GLOBAL_RNG]","category":"page"},{"location":"models/NeuralNetworkRegressor_BetaML/#Notes:","page":"NeuralNetworkRegressor","title":"Notes:","text":"","category":"section"},{"location":"models/NeuralNetworkRegressor_BetaML/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"data must be numerical\nthe label should be be a n-records vector.","category":"page"},{"location":"models/NeuralNetworkRegressor_BetaML/#Example:","page":"NeuralNetworkRegressor","title":"Example:","text":"","category":"section"},{"location":"models/NeuralNetworkRegressor_BetaML/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"julia> using MLJ\n\njulia> X, y = @load_boston;\n\njulia> modelType = @load NeuralNetworkRegressor pkg = \"BetaML\" verbosity=0\nBetaML.Nn.NeuralNetworkRegressor\n\njulia> layers = [BetaML.DenseLayer(12,20,f=BetaML.relu),BetaML.DenseLayer(20,20,f=BetaML.relu),BetaML.DenseLayer(20,1,f=BetaML.relu)];\n\njulia> model = modelType(layers=layers,opt_alg=BetaML.ADAM());\nNeuralNetworkRegressor(\n layers = BetaML.Nn.AbstractLayer[BetaML.Nn.DenseLayer([-0.23249759178069676 -0.4125090172711131 … 0.41401934928739 -0.33017881111237535; -0.27912169279319965 0.270551221249931 … 0.19258414323473344 0.1703002982374256; … ; 0.31186742456482447 0.14776438287394805 … 0.3624993442655036 0.1438885872964824; 0.24363744610286758 -0.3221033024934767 … 0.14886090419299408 0.038411663101909355], [-0.42360286004241765, -0.34355377040029594, 0.11510963232946697, 0.29078650404397893, -0.04940236502546075, 0.05142849152316714, -0.177685375947775, 0.3857630523957018, -0.25454667127064756, -0.1726731848206195, 0.29832456225553444, -0.21138505291162835, -0.15763643112604903, -0.08477044513587562, -0.38436681165349196, 0.20538016429104916, -0.25008157754468335, 0.268681800562054, 0.10600581996650865, 0.4262194464325672], BetaML.Utils.relu, BetaML.Utils.drelu), BetaML.Nn.DenseLayer([-0.08534180387478185 0.19659398307677617 … -0.3413633217504578 -0.0484925247381256; 0.0024419192794883915 -0.14614102508129 … -0.21912059923003044 0.2680725396694708; … ; 0.25151545823147886 -0.27532269951606037 … 0.20739970895058063 0.2891938885916349; -0.1699020711688904 -0.1350423717084296 … 0.16947589410758873 0.3629006047373296], [0.2158116357688406, -0.3255582642532289, -0.057314442103850394, 0.29029696770539953, 0.24994080694366455, 0.3624239027782297, -0.30674318230919984, -0.3854738338935017, 0.10809721838554087, 0.16073511121016176, -0.005923262068960489, 0.3157147976348795, -0.10938918304264739, -0.24521229198853187, -0.307167732178712, 0.0808907777008302, -0.014577497150872254, -0.0011287181458157214, 0.07522282588658086, 0.043366500526073104], BetaML.Utils.relu, BetaML.Utils.drelu), BetaML.Nn.DenseLayer([-0.021367697115938555 -0.28326652172347155 … 0.05346175368370165 -0.26037328415871647], [-0.2313659199724562], BetaML.Utils.relu, BetaML.Utils.drelu)], \n loss = BetaML.Utils.squared_cost, \n dloss = BetaML.Utils.dsquared_cost, \n epochs = 100, \n batch_size = 32, \n opt_alg = BetaML.Nn.ADAM(BetaML.Nn.var\"#90#93\"(), 1.0, 0.9, 0.999, 1.0e-8, BetaML.Nn.Learnable[], BetaML.Nn.Learnable[]), \n shuffle = true, \n descr = \"\", \n cb = 
BetaML.Nn.fitting_info, \n rng = Random._GLOBAL_RNG())\n\njulia> mach = machine(model, X, y);\n\njulia> fit!(mach);\n\njulia> ŷ = predict(mach, X);\n\njulia> hcat(y,ŷ)\n506×2 Matrix{Float64}:\n 24.0 30.7726\n 21.6 28.0811\n 34.7 31.3194\n ⋮ \n 23.9 30.9032\n 22.0 29.49\n 11.9 27.2438","category":"page"},{"location":"learning_mlj/#Learning-MLJ","page":"Learning MLJ","title":"Learning MLJ","text":"","category":"section"},{"location":"learning_mlj/","page":"Learning MLJ","title":"Learning MLJ","text":"MLJ Cheatsheet","category":"page"},{"location":"learning_mlj/","page":"Learning MLJ","title":"Learning MLJ","text":"See also Getting help and reporting problems.","category":"page"},{"location":"learning_mlj/","page":"Learning MLJ","title":"Learning MLJ","text":"The present document, although littered with examples, is primarily intended as a complete reference. ","category":"page"},{"location":"learning_mlj/#Where-to-start?","page":"Learning MLJ","title":"Where to start?","text":"","category":"section"},{"location":"learning_mlj/#Completely-new-to-Julia?","page":"Learning MLJ","title":"Completely new to Julia?","text":"","category":"section"},{"location":"learning_mlj/","page":"Learning MLJ","title":"Learning MLJ","text":"Julia's learning resources page | Learn X in Y minutes | HelloJulia","category":"page"},{"location":"learning_mlj/#New-to-data-science?","page":"Learning MLJ","title":"New to data science?","text":"","category":"section"},{"location":"learning_mlj/","page":"Learning MLJ","title":"Learning MLJ","text":"Julia Data Science","category":"page"},{"location":"learning_mlj/#New-to-machine-learning?","page":"Learning MLJ","title":"New to machine learning?","text":"","category":"section"},{"location":"learning_mlj/","page":"Learning MLJ","title":"Learning MLJ","text":"Introduction to Statistical Learning with Julia versions of the R labs here","category":"page"},{"location":"learning_mlj/#Know-some-ML-and-just-want-MLJ-basics?","page":"Learning MLJ","title":"Know some ML and just want MLJ basics?","text":"","category":"section"},{"location":"learning_mlj/","page":"Learning MLJ","title":"Learning MLJ","text":"Getting Started | Common MLJ Workflows","category":"page"},{"location":"learning_mlj/#An-ML-practitioner-transitioning-from-another-platform?","page":"Learning MLJ","title":"An ML practitioner transitioning from another platform?","text":"","category":"section"},{"location":"learning_mlj/","page":"Learning MLJ","title":"Learning MLJ","text":"MLJ for Data Scientists in Two Hours | MLJTutorial","category":"page"},{"location":"learning_mlj/#Other-resources","page":"Learning MLJ","title":"Other resources","text":"","category":"section"},{"location":"learning_mlj/","page":"Learning MLJ","title":"Learning MLJ","text":"Data Science Tutorials: MLJ tutorials including end-to-end examples, and \"Introduction to Statistical Learning\" labs\nMLCourse: Teaching material for an introductory machine learning course at EPFL (for an interactive preview see here).\nJulia Boards the Titanic Blog post on using MLJ for users new to Julia. 
\nAnalyzing the Glass Dataset: A gentle introduction to data science using Julia and MLJ (three-part blog post)\nLightning Tour: A compressed demonstration of key MLJ functionality\nMLJ JuliaCon2020 Workshop: older version of MLJTutorial with video\nLearning Networks: For advanced MLJ users wanting to wrap workflows more complicated than linear pipelines\nMachine Learning Property Loans for Fun and Profit - Blog post demonstrating the use of MLJ to predict prospects for investment in property development loans. \nPredicting a Successful Mt Everest Climb - Blog post using MLJ to discover factors correlating with success in expeditions to climb the world's highest peak.","category":"page"},{"location":"models/LassoLarsICRegressor_MLJScikitLearnInterface/#LassoLarsICRegressor_MLJScikitLearnInterface","page":"LassoLarsICRegressor","title":"LassoLarsICRegressor","text":"","category":"section"},{"location":"models/LassoLarsICRegressor_MLJScikitLearnInterface/","page":"LassoLarsICRegressor","title":"LassoLarsICRegressor","text":"LassoLarsICRegressor","category":"page"},{"location":"models/LassoLarsICRegressor_MLJScikitLearnInterface/","page":"LassoLarsICRegressor","title":"LassoLarsICRegressor","text":"A model type for constructing a Lasso model with LARS using BIC or AIC for model selection, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LassoLarsICRegressor_MLJScikitLearnInterface/","page":"LassoLarsICRegressor","title":"LassoLarsICRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LassoLarsICRegressor_MLJScikitLearnInterface/","page":"LassoLarsICRegressor","title":"LassoLarsICRegressor","text":"LassoLarsICRegressor = @load LassoLarsICRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/LassoLarsICRegressor_MLJScikitLearnInterface/","page":"LassoLarsICRegressor","title":"LassoLarsICRegressor","text":"Do model = LassoLarsICRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LassoLarsICRegressor(criterion=...).","category":"page"},{"location":"models/LassoLarsICRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"LassoLarsICRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/LassoLarsICRegressor_MLJScikitLearnInterface/","page":"LassoLarsICRegressor","title":"LassoLarsICRegressor","text":"criterion = aic\nfit_intercept = true\nverbose = false\nnormalize = false\nprecompute = auto\nmax_iter = 500\neps = 2.220446049250313e-16\ncopy_X = true\npositive = false","category":"page"},{"location":"models/GaussianMixtureRegressor_BetaML/#GaussianMixtureRegressor_BetaML","page":"GaussianMixtureRegressor","title":"GaussianMixtureRegressor","text":"","category":"section"},{"location":"models/GaussianMixtureRegressor_BetaML/","page":"GaussianMixtureRegressor","title":"GaussianMixtureRegressor","text":"mutable struct GaussianMixtureRegressor <: MLJModelInterface.Deterministic","category":"page"},{"location":"models/GaussianMixtureRegressor_BetaML/","page":"GaussianMixtureRegressor","title":"GaussianMixtureRegressor","text":"A non-linear regressor derived from fitting the data on a probabilistic model (Gaussian Mixture Model). 
Relatively fast but generally not very precise, except for data with a structure matching the chosen underlying mixture.","category":"page"},{"location":"models/GaussianMixtureRegressor_BetaML/","page":"GaussianMixtureRegressor","title":"GaussianMixtureRegressor","text":"This is the single-target version of the model. If you want to predict several labels (y) at once, use the MLJ model MultitargetGaussianMixtureRegressor.","category":"page"},{"location":"models/GaussianMixtureRegressor_BetaML/#Hyperparameters:","page":"GaussianMixtureRegressor","title":"Hyperparameters:","text":"","category":"section"},{"location":"models/GaussianMixtureRegressor_BetaML/","page":"GaussianMixtureRegressor","title":"GaussianMixtureRegressor","text":"n_classes::Int64: Number of mixtures (latent classes) to consider [def: 3]\ninitial_probmixtures::Vector{Float64}: Initial probabilities of the categorical distribution (n_classes x 1) [default: []]\nmixtures::Union{Type, Vector{<:BetaML.GMM.AbstractMixture}}: An array (of length n_classes) of the mixtures to employ (see the GMM module). Each mixture object can be provided with or without its parameters (e.g. mean and variance for the gaussian ones). Fully qualified mixtures are useful only if the initialisation_strategy parameter is set to \"given\". This parameter can also be given simply in terms of a type. In this case it is automatically extended to a vector of n_classes mixtures of the specified type. Note that mixing of different mixture types is not currently supported. [def: [DiagonalGaussian() for i in 1:n_classes]]\ntol::Float64: Tolerance to stop the algorithm [default: 10^(-6)]\nminimum_variance::Float64: Minimum variance for the mixtures [default: 0.05]\nminimum_covariance::Float64: Minimum covariance for the mixtures with full covariance matrix [default: 0]. This should be set differently from minimum_variance (see notes).\ninitialisation_strategy::String: The computation method of the vector of the initial mixtures. One of the following:\n\"grid\": using a grid approach\n\"given\": using the mixture provided in the fully qualified mixtures parameter\n\"kmeans\": use first kmeans (itself initialised with a \"grid\" strategy) to set the initial mixture centers [default]\nNote that currently \"random\" and \"shuffle\" initialisations are not supported in gmm-based algorithms.\nmaximum_iterations::Int64: Maximum number of iterations [def: typemax(Int64), i.e. 
∞]\nrng::Random.AbstractRNG: Random Number Generator [deafult: Random.GLOBAL_RNG]","category":"page"},{"location":"models/GaussianMixtureRegressor_BetaML/#Example:","page":"GaussianMixtureRegressor","title":"Example:","text":"","category":"section"},{"location":"models/GaussianMixtureRegressor_BetaML/","page":"GaussianMixtureRegressor","title":"GaussianMixtureRegressor","text":"julia> using MLJ\n\njulia> X, y = @load_boston;\n\njulia> modelType = @load GaussianMixtureRegressor pkg = \"BetaML\" verbosity=0\nBetaML.GMM.GaussianMixtureRegressor\n\njulia> model = modelType()\nGaussianMixtureRegressor(\n n_classes = 3, \n initial_probmixtures = Float64[], \n mixtures = BetaML.GMM.DiagonalGaussian{Float64}[BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing), BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing), BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing)], \n tol = 1.0e-6, \n minimum_variance = 0.05, \n minimum_covariance = 0.0, \n initialisation_strategy = \"kmeans\", \n maximum_iterations = 9223372036854775807, \n rng = Random._GLOBAL_RNG())\n\njulia> mach = machine(model, X, y);\n\njulia> fit!(mach);\n[ Info: Training machine(GaussianMixtureRegressor(n_classes = 3, …), …).\nIter. 1: Var. of the post 21.74887448784976 Log-likelihood -21687.09917379566\n\njulia> ŷ = predict(mach, X)\n506-element Vector{Float64}:\n 24.703442835305577\n 24.70344283512716\n ⋮\n 17.172486989759676\n 17.172486989759644","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/#MultitargetNeuralNetworkRegressor_MLJFlux","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"","category":"section"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"MultitargetNeuralNetworkRegressor","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"A model type for constructing a multitarget neural network regressor, based on MLJFlux.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"MultitargetNeuralNetworkRegressor = @load MultitargetNeuralNetworkRegressor pkg=MLJFlux","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"Do model = MultitargetNeuralNetworkRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetNeuralNetworkRegressor(builder=...).","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"MultitargetNeuralNetworkRegressor is for training a data-dependent Flux.jl neural network to predict a multi-valued Continuous target, represented as a table, given a table of Continuous features. 
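(A quick sanity check that candidate data has the expected form is, for example, scitype(X) <: Table(Continuous) && scitype(y) <: Table(Continuous), where X and y here stand in for your own feature and target tables.) 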
Users provide a recipe for constructing the network, based on properties of the data that is encountered, by specifying an appropriate builder. See MLJFlux documentation for more on builders.","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/#Training-data","page":"MultitargetNeuralNetworkRegressor","title":"Training data","text":"","category":"section"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"Here:","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"X is either a Matrix or any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). If X is a Matrix, it is assumed to have columns corresponding to features and rows corresponding to observations.\ny is the target, which can be any table or matrix of output targets whose element scitype is Continuous; check column scitypes with schema(y). If y is a Matrix, it is assumed to have columns corresponding to variables and rows corresponding to observations.","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/#Hyper-parameters","page":"MultitargetNeuralNetworkRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"builder=MLJFlux.Linear(σ=Flux.relu): An MLJFlux builder that constructs a neural network. Possible builders include: Linear, Short, and MLP. See MLJFlux documentation for more on builders, and the example below for using the @builder convenience macro.\noptimiser::Flux.Adam(): A Flux.Optimise optimiser. The optimiser performs the updating of the weights of the network. For further reference, see the Flux optimiser documentation. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.\nloss=Flux.mse: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. For a regression task, natural loss functions are:\nFlux.mse\nFlux.mae\nFlux.msle\nFlux.huber_loss\nCurrently MLJ measures are not supported as loss functions here.\nepochs::Int=10: The duration of training, in epochs. Typically, one epoch represents one pass through the complete training dataset.\nbatch_size::int=1: the batch size to be used for training, representing the number of samples per update of the network weights. Typically, batch size is between 8 and 512. Increasing batch size may accelerate training if acceleration=CUDALibs() and a GPU is available.\nlambda::Float64=0: The strength of the weight regularization penalty. 
Can be any value in the range [0, ∞).\nalpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.\nrng::Union{AbstractRNG, Int64}: The random number generator or seed used during training.\noptimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.\nacceleration::AbstractResource=CPU1(): Defines on what hardware training is done. For Training on GPU, use CUDALibs().","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/#Operations","page":"MultitargetNeuralNetworkRegressor","title":"Operations","text":"","category":"section"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"predict(mach, Xnew): return predictions of the target given new features Xnew having the same scitype as X above. Predictions are deterministic.","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/#Fitted-parameters","page":"MultitargetNeuralNetworkRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"chain: The trained \"chain\" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network.","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/#Report","page":"MultitargetNeuralNetworkRegressor","title":"Report","text":"","category":"section"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"The fields of report(mach) are:","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"training_losses: A vector of training losses (penalised if lambda != 0) in historical order, of length epochs + 1. 
The first element is the pre-training loss.","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/#Examples","page":"MultitargetNeuralNetworkRegressor","title":"Examples","text":"","category":"section"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"In this example we apply a multi-target regression model to synthetic data:","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"using MLJ\nimport MLJFlux\nusing Flux","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"First, we generate some synthetic data (needs MLJBase 0.20.16 or higher):","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"X, y = make_regression(100, 9; n_targets = 2) ## both tables\nschema(y)\nschema(X)","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"Splitting off a test set:","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"(X, Xtest), (y, ytest) = partition((X, y), 0.7, multi=true);","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"Next, we can define a builder, making use of a convenience macro to do so. In the following @builder call, n_in is a proxy for the number input features and n_out the number of target variables (both known at fit! 
time), while rng is a proxy for a RNG (which will be passed from the rng field of model defined below).","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"builder = MLJFlux.@builder begin\n init=Flux.glorot_uniform(rng)\n Chain(\n Dense(n_in, 64, relu, init=init),\n Dense(64, 32, relu, init=init),\n Dense(32, n_out, init=init),\n )\nend","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"Instantiating the regression model:","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"MultitargetNeuralNetworkRegressor = @load MultitargetNeuralNetworkRegressor\nmodel = MultitargetNeuralNetworkRegressor(builder=builder, rng=123, epochs=20)","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"We will arrange for standardization of the the target by wrapping our model in TransformedTargetModel, and standardization of the features by inserting the wrapped model in a pipeline:","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"pipe = Standardizer |> TransformedTargetModel(model, target=Standardizer)","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"If we fit with a high verbosity (>1), we will see the losses during training. 
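(Conversely, calling fit!(mach, verbosity=0) suppresses these messages.) 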
We can also see the losses in the output of report(mach)","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"mach = machine(pipe, X, y)\nfit!(mach, verbosity=2)\n\n## first element initial loss, 2:end per epoch training losses\nreport(mach).transformed_target_model_deterministic.model.training_losses","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"For experimenting with learning rate, see the NeuralNetworkRegressor example.","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"pipe.transformed_target_model_deterministic.model.optimiser.eta = 0.0001","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"With the learning rate fixed, we can now compute a CV estimate of the performance (using all data bound to mach) and compare this with performance on the test set:","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"## custom MLJ loss:\nmulti_loss(yhat, y) = l2(MLJ.matrix(yhat), MLJ.matrix(y)) |> mean\n\n## CV estimate, based on `(X, y)`:\nevaluate!(mach, resampling=CV(nfolds=5), measure=multi_loss)\n\n## loss for `(Xtest, test)`:\nfit!(mach) ## trains on all data `(X, y)`\nyhat = predict(mach, Xtest)\nmulti_loss(yhat, ytest)","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"See also NeuralNetworkRegressor","category":"page"},{"location":"models/Standardizer_MLJModels/#Standardizer_MLJModels","page":"Standardizer","title":"Standardizer","text":"","category":"section"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"Standardizer","category":"page"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"A model type for constructing a standardizer, based on MLJModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"Standardizer = @load Standardizer pkg=MLJModels","category":"page"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"Do model = Standardizer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in Standardizer(features=...).","category":"page"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"Use this model to standardize (whiten) a Continuous vector, or relevant columns of a table. 
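For example, here is a minimal sketch for a plain vector (hypothetical data; a table example appears further below):\n\nv = [10.0, 20.0, 30.0]\nmach = fit!(machine(Standardizer(), v))\nz = transform(mach, v)  ## ≈ [-1.0, 0.0, 1.0]\n\n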
The rescalings applied by this transformer to new data are always those learned during the training phase, which are generally different from what would actually standardize the new data.","category":"page"},{"location":"models/Standardizer_MLJModels/#Training-data","page":"Standardizer","title":"Training data","text":"","category":"section"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"mach = machine(model, X)","category":"page"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"where","category":"page"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"X: any Tables.jl compatible table or any abstract vector with Continuous element scitype (any abstract float vector). Only features in a table with Continuous scitype can be standardized; check column scitypes with schema(X).","category":"page"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/Standardizer_MLJModels/#Hyper-parameters","page":"Standardizer","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"features: one of the following, with the behavior indicated below:\n[] (empty, the default): standardize all features (columns) having Continuous element scitype\nnon-empty vector of feature names (symbols): standardize only the Continuous features in the vector (if ignore=false) or Continuous features not named in the vector (ignore=true).\nfunction or other callable: standardize a feature if the callable returns true on its name. 
For example, Standardizer(features = name -> name in [:x1, :x3], ignore = true, count=true) has the same effect as Standardizer(features = [:x1, :x3], ignore = true, count=true), namely to standardize all Continuous and Count features, with the exception of :x1 and :x3.\nNote this behavior is further modified if the ordered_factor or count flags are set to true; see below\nignore=false: whether to ignore or standardize specified features, as explained above\nordered_factor=false: if true, standardize any OrderedFactor feature wherever a Continuous feature would be standardized, as described above\ncount=false: if true, standardize any Count feature wherever a Continuous feature would be standardized, as described above","category":"page"},{"location":"models/Standardizer_MLJModels/#Operations","page":"Standardizer","title":"Operations","text":"","category":"section"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"transform(mach, Xnew): return Xnew with relevant features standardized according to the rescalings learned during fitting of mach.\ninverse_transform(mach, Z): apply the inverse transformation to Z, so that inverse_transform(mach, transform(mach, Xnew)) is approximately the same as Xnew; unavailable if ordered_factor or count flags were set to true.","category":"page"},{"location":"models/Standardizer_MLJModels/#Fitted-parameters","page":"Standardizer","title":"Fitted parameters","text":"","category":"section"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"features_fit - the names of features that will be standardized\nmeans - the corresponding untransformed mean values\nstds - the corresponding untransformed standard deviations","category":"page"},{"location":"models/Standardizer_MLJModels/#Report","page":"Standardizer","title":"Report","text":"","category":"section"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"The fields of report(mach) are:","category":"page"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"features_fit: the names of features that will be standardized","category":"page"},{"location":"models/Standardizer_MLJModels/#Examples","page":"Standardizer","title":"Examples","text":"","category":"section"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"using MLJ\n\nX = (ordinal1 = [1, 2, 3],\n ordinal2 = coerce([:x, :y, :x], OrderedFactor),\n ordinal3 = [10.0, 20.0, 30.0],\n ordinal4 = [-20.0, -30.0, -40.0],\n nominal = coerce([\"Your father\", \"he\", \"is\"], Multiclass));\n\njulia> schema(X)\n┌──────────┬──────────────────┐\n│ names │ scitypes │\n├──────────┼──────────────────┤\n│ ordinal1 │ Count │\n│ ordinal2 │ OrderedFactor{2} │\n│ ordinal3 │ Continuous │\n│ ordinal4 │ Continuous │\n│ nominal │ Multiclass{3} │\n└──────────┴──────────────────┘\n\nstand1 = Standardizer();\n\njulia> transform(fit!(machine(stand1, X)), X)\n(ordinal1 = [1, 2, 3],\n ordinal2 = CategoricalValue{Symbol,UInt32}[:x, :y, :x],\n ordinal3 = [-1.0, 0.0, 1.0],\n ordinal4 = [1.0, 0.0, -1.0],\n nominal = CategoricalValue{String,UInt32}[\"Your father\", \"he\", \"is\"],)\n\nstand2 = Standardizer(features=[:ordinal3, ], ignore=true, count=true);\n\njulia> transform(fit!(machine(stand2, X)), 
X)\n(ordinal1 = [-1.0, 0.0, 1.0],\n ordinal2 = CategoricalValue{Symbol,UInt32}[:x, :y, :x],\n ordinal3 = [10.0, 20.0, 30.0],\n ordinal4 = [1.0, 0.0, -1.0],\n nominal = CategoricalValue{String,UInt32}[\"Your father\", \"he\", \"is\"],)","category":"page"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"See also OneHotEncoder, ContinuousEncoder.","category":"page"},{"location":"models/BernoulliNBClassifier_MLJScikitLearnInterface/#BernoulliNBClassifier_MLJScikitLearnInterface","page":"BernoulliNBClassifier","title":"BernoulliNBClassifier","text":"","category":"section"},{"location":"models/BernoulliNBClassifier_MLJScikitLearnInterface/","page":"BernoulliNBClassifier","title":"BernoulliNBClassifier","text":"BernoulliNBClassifier","category":"page"},{"location":"models/BernoulliNBClassifier_MLJScikitLearnInterface/","page":"BernoulliNBClassifier","title":"BernoulliNBClassifier","text":"A model type for constructing a Bernoulli naive Bayes classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/BernoulliNBClassifier_MLJScikitLearnInterface/","page":"BernoulliNBClassifier","title":"BernoulliNBClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/BernoulliNBClassifier_MLJScikitLearnInterface/","page":"BernoulliNBClassifier","title":"BernoulliNBClassifier","text":"BernoulliNBClassifier = @load BernoulliNBClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/BernoulliNBClassifier_MLJScikitLearnInterface/","page":"BernoulliNBClassifier","title":"BernoulliNBClassifier","text":"Do model = BernoulliNBClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BernoulliNBClassifier(alpha=...).","category":"page"},{"location":"models/BernoulliNBClassifier_MLJScikitLearnInterface/","page":"BernoulliNBClassifier","title":"BernoulliNBClassifier","text":"Binomial naive bayes classifier. 
It is suitable for classification with binary features; features will be binarized based on the binarize keyword (unless it's nothing in which case the features are assumed to be binary).","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/#MultitargetKNNRegressor_NearestNeighborModels","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"","category":"section"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"MultitargetKNNRegressor","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"A model type for constructing a multitarget K-nearest neighbor regressor, based on NearestNeighborModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"MultitargetKNNRegressor = @load MultitargetKNNRegressor pkg=NearestNeighborModels","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"Do model = MultitargetKNNRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetKNNRegressor(K=...).","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"Multi-target K-Nearest Neighbors regressor (MultitargetKNNRegressor) is a variation of KNNRegressor that assumes the target variable is vector-valued with Continuous components. 
(Target data must be presented as a table, however.)","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/#Training-data","page":"MultitargetKNNRegressor","title":"Training data","text":"","category":"section"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"OR","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"mach = machine(model, X, y, w)","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"Here:","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).\ny is the target, which can be any table of responses whose element scitype is Continuous; check column scitypes with schema(y).\nw is the observation weights which can either be nothing (default) or an AbstractVector whose element scitype is Count or Continuous. This is different from the weights kernel, which is a hyperparameter of the model; see below.","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/#Hyper-parameters","page":"MultitargetKNNRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"K::Int=5 : number of neighbors\nalgorithm::Symbol = :kdtree : one of (:kdtree, :brutetree, :balltree)\nmetric::Metric = Euclidean() : any Metric from Distances.jl for the distance between points. For algorithm = :kdtree only metrics which are instances of Union{Distances.Chebyshev, Distances.Cityblock, Distances.Euclidean, Distances.Minkowski, Distances.WeightedCityblock, Distances.WeightedEuclidean, Distances.WeightedMinkowski} are supported.\nleafsize::Int = 10 : determines the number of points at which to stop splitting the tree. This option is ignored and always taken as 0 for algorithm = :brutetree, since brutetree isn't actually a tree.\nreorder::Bool = true : if true then points which are close in distance are placed close in memory. In this case, a copy of the original data will be made so that the original data is left unmodified. Setting this to true can significantly improve performance of the specified algorithm (except :brutetree). This option is ignored and always taken as false for algorithm = :brutetree.\nweights::KNNKernel=Uniform() : kernel used in assigning weights to the k-nearest neighbors for each observation. 
An instance of one of the types in list_kernels(). User-defined weighting functions can be passed by wrapping the function in a UserDefinedKernel kernel (do ?NearestNeighborModels.UserDefinedKernel for more info). If observation weights w are passed during machine construction then the weight assigned to each neighbor vote is the product of the kernel generated weight for that neighbor and the corresponding observation weight.","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/#Operations","page":"MultitargetKNNRegressor","title":"Operations","text":"","category":"section"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"predict(mach, Xnew): Return predictions of the target given features Xnew, which should have same scitype as X above.","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/#Fitted-parameters","page":"MultitargetKNNRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"tree: An instance of either KDTree, BruteTree or BallTree depending on the value of the algorithm hyperparameter (See hyper-parameters section above). These are data structures that stores the training data with the view of making quicker nearest neighbor searches on test data points.","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/#Examples","page":"MultitargetKNNRegressor","title":"Examples","text":"","category":"section"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"using MLJ\n\n## Create Data\nX, y = make_regression(10, 5, n_targets=2)\n\n## load MultitargetKNNRegressor\nMultitargetKNNRegressor = @load MultitargetKNNRegressor pkg=NearestNeighborModels\n\n## view possible kernels\nNearestNeighborModels.list_kernels()\n\n## MutlitargetKNNRegressor instantiation\nmodel = MultitargetKNNRegressor(weights = NearestNeighborModels.Inverse())\n\n## Wrap model and required data in an MLJ machine and fit.\nmach = machine(model, X, y) |> fit! \n\n## Predict\ny_hat = predict(mach, X)\n","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"See also KNNRegressor","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/#ABODDetector_OutlierDetectionNeighbors","page":"ABODDetector","title":"ABODDetector","text":"","category":"section"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"ABODDetector(k = 5,\n metric = Euclidean(),\n algorithm = :kdtree,\n static = :auto,\n leafsize = 10,\n reorder = true,\n parallel = false,\n enhanced = false)","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"Determine outliers based on the angles to its nearest neighbors. 
This implements the FastABOD variant described in the paper, that is, it uses the variance of angles to its nearest neighbors, not to the whole dataset, see [1]. ","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"Notice: The scores are inverted, to conform to our notion that higher scores describe higher outlierness.","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/#Parameters","page":"ABODDetector","title":"Parameters","text":"","category":"section"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"k::Integer","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"Number of neighbors (must be greater than 0).","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"metric::Metric","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"This is one of the Metric types defined in the Distances.jl package. It is possible to define your own metrics by creating new types that are subtypes of Metric.","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"algorithm::Symbol","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"One of (:kdtree, :balltree). In a kdtree, points are recursively split into groups using hyper-planes. Therefore a KDTree only works with axis aligned metrics which are: Euclidean, Chebyshev, Minkowski and Cityblock. A brutetree linearly searches all points in a brute force fashion and works with any Metric. A balltree recursively splits points into groups bounded by hyper-spheres and works with any Metric.","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"static::Union{Bool, Symbol}","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"One of (true, false, :auto). Whether the input data for fitting and transform should be statically or dynamically allocated. If true, the data is statically allocated. If false, the data is dynamically allocated. If :auto, the data is dynamically allocated if the product of all dimensions except the last is greater than 100.","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"leafsize::Int","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"Determines at what number of points to stop splitting the tree further. There is a trade-off between traversing the tree and having to evaluate the metric function for increasing number of points.","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"reorder::Bool","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"While building the tree this will put points close in distance close in memory since this helps with cache locality. 
In this case, a copy of the original data will be made so that the original data is left unmodified. This can have a significant impact on performance and is by default set to true.","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"parallel::Bool","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"Parallelize score and predict using all threads available. The number of threads can be set with the JULIA_NUM_THREADS environment variable. Note: fit is not parallel.","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"enhanced::Bool","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"When enhanced=true, it uses the enhanced ABOD (EABOD) adaptation proposed by [2].","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/#Examples","page":"ABODDetector","title":"Examples","text":"","category":"section"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"using OutlierDetection: ABODDetector, fit, transform\ndetector = ABODDetector()\nX = rand(10, 100)\nmodel, result = fit(detector, X; verbosity=0)\ntest_scores = transform(detector, model, X)","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/#References","page":"ABODDetector","title":"References","text":"","category":"section"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"[1] Kriegel, Hans-Peter; Schubert, Matthias; Zimek, Arthur (2008): Angle-based outlier detection in high-dimensional data.","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"[2] Li, Xiaojie; Lv, Jian Cheng; Cheng, Dongdong (2015): Angle-Based Outlier Detection Algorithm with More Stable Relationships.","category":"page"},{"location":"models/TfidfTransformer_MLJText/#TfidfTransformer_MLJText","page":"TfidfTransformer","title":"TfidfTransformer","text":"","category":"section"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"TfidfTransformer","category":"page"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"A model type for constructing a TF-IDF transformer, based on MLJText.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"TfidfTransformer = @load TfidfTransformer pkg=MLJText","category":"page"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"Do model = TfidfTransformer() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in TfidfTransformer(max_doc_freq=...).","category":"page"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"The transformer converts a collection of documents, tokenized or pre-parsed as bags of words/ngrams, to a matrix of TF-IDF scores. Here \"TF\" means term-frequency while \"IDF\" means inverse document frequency (defined below). The TF-IDF score is the product of the two. This is a common term weighting scheme in information retrieval, that has also found good use in document classification. The goal of using TF-IDF instead of the raw frequencies of occurrence of a token in a given document is to scale down the impact of tokens that occur very frequently in a given corpus and that are hence empirically less informative than features that occur in a small fraction of the training corpus.","category":"page"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"In textbooks and implementations there is variation in the definition of IDF. Here two IDF definitions are available. The default, smoothed option provides the IDF for a term t as log((1 + n)/(1 + df(t))) + 1, where n is the total number of documents and df(t) the number of documents in which t appears. Setting smooth_df = false provides an IDF of log(n/df(t)) + 1.","category":"page"},{"location":"models/TfidfTransformer_MLJText/#Training-data","page":"TfidfTransformer","title":"Training data","text":"","category":"section"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"mach = machine(model, X)","category":"page"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"Here:","category":"page"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"X is any vector whose elements are either tokenized documents or bags of words/ngrams. Specifically, each element is one of the following:\nA vector of abstract strings (tokens), e.g., [\"I\", \"like\", \"Sam\", \".\", \"Sam\", \"is\", \"nice\", \".\"] (scitype AbstractVector{Textual})\nA dictionary of counts, indexed on abstract strings, e.g., Dict(\"I\"=>1, \"Sam\"=>2, \"Sam is\"=>1) (scitype Multiset{Textual}})\nA dictionary of counts, indexed on plain ngrams, e.g., Dict((\"I\",)=>1, (\"Sam\",)=>2, (\"I\", \"Sam\")=>1) (scitype Multiset{<:NTuple{N,Textual} where N}); here a plain ngram is a tuple of abstract strings.","category":"page"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/TfidfTransformer_MLJText/#Hyper-parameters","page":"TfidfTransformer","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"max_doc_freq=1.0: Restricts the vocabulary that the transformer will consider. Terms that occur in > max_doc_freq documents will not be considered by the transformer. 
For example, if max_doc_freq is set to 0.9, terms that are in more than 90% of the documents will be removed.\nmin_doc_freq=0.0: Restricts the vocabulary that the transformer will consider. Terms that occur in < min_doc_freq documents will not be considered by the transformer. A value of 0.01 means that only terms that are at least in 1% of the documents will be included.\nsmooth_idf=true: Control which definition of IDF to use (see above).","category":"page"},{"location":"models/TfidfTransformer_MLJText/#Operations","page":"TfidfTransformer","title":"Operations","text":"","category":"section"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"transform(mach, Xnew): Based on the vocabulary and IDF learned in training, return the matrix of TF-IDF scores for Xnew, a vector of the same form as X above. The matrix has size (n, p), where n = length(Xnew) and p the size of the vocabulary. Tokens/ngrams not appearing in the learned vocabulary are scored zero.","category":"page"},{"location":"models/TfidfTransformer_MLJText/#Fitted-parameters","page":"TfidfTransformer","title":"Fitted parameters","text":"","category":"section"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"vocab: A vector containing the strings used in the transformer's vocabulary.\nidf_vector: The transformer's calculated IDF vector.","category":"page"},{"location":"models/TfidfTransformer_MLJText/#Examples","page":"TfidfTransformer","title":"Examples","text":"","category":"section"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"TfidfTransformer accepts a variety of inputs. 
The example below transforms tokenized documents:","category":"page"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"using MLJ\nimport TextAnalysis\n\nTfidfTransformer = @load TfidfTransformer pkg=MLJText\n\ndocs = [\"Hi my name is Sam.\", \"How are you today?\"]\ntfidf_transformer = TfidfTransformer()\n\njulia> tokenized_docs = TextAnalysis.tokenize.(docs)\n2-element Vector{Vector{String}}:\n [\"Hi\", \"my\", \"name\", \"is\", \"Sam\", \".\"]\n [\"How\", \"are\", \"you\", \"today\", \"?\"]\n\nmach = machine(tfidf_transformer, tokenized_docs)\nfit!(mach)\n\nfitted_params(mach)\n\ntfidf_mat = transform(mach, tokenized_docs)","category":"page"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"Alternatively, one can provide documents pre-parsed as ngrams counts:","category":"page"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"using MLJ\nimport TextAnalysis\n\ndocs = [\"Hi my name is Sam.\", \"How are you today?\"]\ncorpus = TextAnalysis.Corpus(TextAnalysis.NGramDocument.(docs, 1, 2))\nngram_docs = TextAnalysis.ngrams.(corpus)\n\njulia> ngram_docs[1]\nDict{AbstractString, Int64} with 11 entries:\n \"is\" => 1\n \"my\" => 1\n \"name\" => 1\n \".\" => 1\n \"Hi\" => 1\n \"Sam\" => 1\n \"my name\" => 1\n \"Hi my\" => 1\n \"name is\" => 1\n \"Sam .\" => 1\n \"is Sam\" => 1\n\ntfidf_transformer = TfidfTransformer()\nmach = machine(tfidf_transformer, ngram_docs)\nMLJ.fit!(mach)\nfitted_params(mach)\n\ntfidf_mat = transform(mach, ngram_docs)","category":"page"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"See also CountTransformer, BM25Transformer","category":"page"},{"location":"machines/#Machines","page":"Machines","title":"Machines","text":"","category":"section"},{"location":"machines/","page":"Machines","title":"Machines","text":"Recall from Getting Started that a machine binds a model (i.e., a choice of algorithm + hyperparameters) to data (see more at Constructing machines below). A machine is also the object storing learned parameters. Under the hood, calling fit! on a machine calls either MLJBase.fit or MLJBase.update, depending on the machine's internal state (as recorded in private fields old_model and old_rows). These lower-level fit and update methods, which are not ordinarily called directly by the user, dispatch on the model and a view of the data defined by the optional rows keyword argument of fit! (all rows by default).","category":"page"},{"location":"machines/#Warm-restarts","page":"Machines","title":"Warm restarts","text":"","category":"section"},{"location":"machines/","page":"Machines","title":"Machines","text":"If a model update method has been implemented for the model, calls to fit! will avoid redundant calculations for certain kinds of model mutations. The main use-case is increasing an iteration parameter, such as the number of epochs in a neural network. 
To test if SomeIterativeModel supports this feature, check that iteration_parameter(SomeIterativeModel) is different from nothing.","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"using MLJ; color_off() # hide\ntree = (@load DecisionTreeClassifier pkg=DecisionTree verbosity=0)()\nforest = EnsembleModel(model=tree, n=10);\nX, y = @load_iris;\nmach = machine(forest, X, y)\nfit!(mach, verbosity=2);","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"Generally, changing a hyperparameter triggers retraining on subsequent calls to fit!:","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"forest.bagging_fraction = 0.5;\nfit!(mach, verbosity=2);","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"However, for this iterative model, increasing the iteration parameter only adds models to the existing ensemble:","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"forest.n = 15;\nfit!(mach, verbosity=2);","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"Call fit! again without making a change and no retraining occurs:","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"fit!(mach);","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"However, retraining can be forced:","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"fit!(mach, force=true);","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"Retraining is also re-triggered if the view of the data changes:","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"fit!(mach, rows=1:100);","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"fit!(mach, rows=1:100);","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"If an iterative model exposes its iteration parameter as a hyperparameter, and it implements the warm restart behavior above, then it can be wrapped in a \"control strategy\", like an early stopping criterion. See Controlling Iterative Models for details.","category":"page"},{"location":"machines/#Inspecting-machines","page":"Machines","title":"Inspecting machines","text":"","category":"section"},{"location":"machines/","page":"Machines","title":"Machines","text":"There are two principal methods for inspecting the outcomes of training in MLJ. To obtain a named-tuple describing the learned parameters (in a user-friendly way where possible) use fitted_params(mach). 
All other training-related outcomes are inspected with report(mach).","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"X, y = @load_iris\npca = (@load PCA verbosity=0)()\nmach = machine(pca, X)\nfit!(mach)","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"fitted_params(mach)\nreport(mach)","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"fitted_params(::Machine)\nreport(::Machine)","category":"page"},{"location":"machines/#MLJModelInterface.fitted_params-Tuple{Machine}","page":"Machines","title":"MLJModelInterface.fitted_params","text":"fitted_params(mach)\n\nReturn the learned parameters for a machine mach that has been fit!, for example the coefficients in a linear model.\n\nThis is a named tuple and human-readable if possible.\n\nIf mach is a machine for a composite model, such as a model constructed using the pipeline syntax model1 |> model2 |> ..., then the returned named tuple has the composite type's field names as keys. The corresponding value is the fitted parameters for the machine in the underlying learning network bound to that model. (If multiple machines share the same model, then the value is a vector.)\n\njulia> using MLJ\njulia> @load LogisticClassifier pkg=MLJLinearModels\njulia> X, y = @load_crabs;\njulia> pipe = Standardizer() |> LogisticClassifier();\njulia> mach = machine(pipe, X, y) |> fit!;\n\njulia> fitted_params(mach).logistic_classifier\n(classes = CategoricalArrays.CategoricalValue{String,UInt32}[\"B\", \"O\"],\n coefs = Pair{Symbol,Float64}[:FL => 3.7095037897680405, :RW => 0.1135739140854546, :CL => -1.6036892745322038, :CW => -4.415667573486482, :BD => 3.238476051092471],\n intercept = 0.0883301599726305,)\n\nSee also report\n\n\n\n\n\n","category":"method"},{"location":"machines/#MLJBase.report-Tuple{Machine}","page":"Machines","title":"MLJBase.report","text":"report(mach)\n\nReturn the report for a machine mach that has been fit!, for example the coefficients in a linear model.\n\nThis is a named tuple and human-readable if possible.\n\nIf mach is a machine for a composite model, such as a model constructed using the pipeline syntax model1 |> model2 |> ..., then the returned named tuple has the composite type's field names as keys. The corresponding value is the report for the machine in the underlying learning network bound to that model. 
(If multiple machines share the same model, then the value is a vector.)\n\njulia> using MLJ\njulia> @load LinearBinaryClassifier pkg=GLM\njulia> X, y = @load_crabs;\njulia> pipe = Standardizer() |> LinearBinaryClassifier();\njulia> mach = machine(pipe, X, y) |> fit!;\n\njulia> report(mach).linear_binary_classifier\n(deviance = 3.8893386087844543e-7,\n dof_residual = 195.0,\n stderror = [18954.83496713119, 6502.845740757159, 48484.240246060406, 34971.131004997274, 20654.82322484894, 2111.1294584763386],\n vcov = [3.592857686311793e8 9.122732393971942e6 … -8.454645589364915e7 5.38856837634321e6; 9.122732393971942e6 4.228700272808351e7 … -4.978433790526467e7 -8.442545425533723e6; … ; -8.454645589364915e7 -4.978433790526467e7 … 4.2662172244975924e8 2.1799125705781363e7; 5.38856837634321e6 -8.442545425533723e6 … 2.1799125705781363e7 4.456867590446599e6],)\n\n\nSee also fitted_params\n\n\n\n\n\n","category":"method"},{"location":"machines/#Training-losses-and-feature-importances","page":"Machines","title":"Training losses and feature importances","text":"","category":"section"},{"location":"machines/","page":"Machines","title":"Machines","text":"Training losses and feature importances, if reported by a model, will be available in the machine's report (see above). However, there are also direct access methods where supported:","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"training_losses(mach::Machine) -> vector_of_losses","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"Here vector_of_losses will be in historical order (most recent loss last). This kind of access is supported for model = mach.model if supports_training_losses(model) == true.","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"feature_importances(mach::Machine) -> vector_of_pairs","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"Here a vector_of_pairs is a vector of elements of the form feature => importance_value, where feature is a symbol. For example, vector_of_pairs = [:gender => 0.23, :height => 0.7, :weight => 0.1]. If a model does not support feature importances for some model hyperparameters, every importance_value will be zero. This kind of access is supported for model = mach.model if reports_feature_importances(model) == true.","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"If a model can report multiple types of feature importances, then there will be a model hyper-parameter controlling the active type.","category":"page"},{"location":"machines/#Constructing-machines","page":"Machines","title":"Constructing machines","text":"","category":"section"},{"location":"machines/","page":"Machines","title":"Machines","text":"A machine is constructed with the syntax machine(model, args...) where the possibilities for args (called training arguments) are summarized in the table below. Here X and y represent inputs and target, respectively, and Xout is the output of a transform call. Machines for supervised models may have additional training arguments, such as a vector of per-observation weights (in which case supports_weights(model) == true).","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"model supertype machine constructor calls operation calls (first compulsory)\nDeterministic <: Supervised machine(model, X, y, extras...) 
predict(mach, Xnew), transform(mach, Xnew), inverse_transform(mach, Xout)\nProbabilistic <: Supervised machine(model, X, y, extras...) predict(mach, Xnew), predict_mean(mach, Xnew), predict_median(mach, Xnew), predict_mode(mach, Xnew), transform(mach, Xnew), inverse_transform(mach, Xout)\nUnsupervised (except Static) machine(model, X) transform(mach, Xnew), inverse_transform(mach, Xout), predict(mach, Xnew)\nStatic machine(model) transform(mach, Xnews...), inverse_transform(mach, Xout)","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"All operations on machines (predict, transform, etc) have exactly one argument (Xnew or Xout above) after mach, the machine instance. An exception is a machine bound to a Static model, which can have any number of arguments after mach. For more on Static transformers (which have no training arguments) see Static transformers.","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"A machine is reconstructed from a file using the syntax machine(\"my_machine.jlso\"), or machine(\"my_machine.jlso\", args...) if retraining using new data. See Saving machines below.","category":"page"},{"location":"machines/#Lowering-memory-demands","page":"Machines","title":"Lowering memory demands","text":"","category":"section"},{"location":"machines/","page":"Machines","title":"Machines","text":"For large data sets, you may be able to save memory by suppressing data caching that some models perform to increase speed. To do this, specify cache=false, as in","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"machine(model, X, y, cache=false)","category":"page"},{"location":"machines/#Constructing-machines-in-learning-networks","page":"Machines","title":"Constructing machines in learning networks","text":"","category":"section"},{"location":"machines/","page":"Machines","title":"Machines","text":"Instead of data X, y, etc, the machine constructor is provided Node or Source objects (\"dynamic data\") when building a learning network. See Learning Networks for more on this advanced feature.","category":"page"},{"location":"machines/#Saving-machines","page":"Machines","title":"Saving machines","text":"","category":"section"},{"location":"machines/","page":"Machines","title":"Machines","text":"Users can save and restore MLJ machines using any external serialization package by suitably preparing their Machine object, and applying a post-processing step to the deserialized object. This is explained under Using an arbitrary serializer below.","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"However, if a user is happy to use Julia's standard library Serialization module, there is a simplified workflow described first.","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"The usual serialization provisos apply. For example, when deserializing, all code on which the serialized object depends must also be loaded at deserialization time. If a hyper-parameter happens to be a user-defined function, then that function must be defined at deserialization. 
And you should only deserialize objects from trusted sources.","category":"page"},{"location":"machines/#Using-Julia's-native-serializer","page":"Machines","title":"Using Julia's native serializer","text":"","category":"section"},{"location":"machines/","page":"Machines","title":"Machines","text":"MLJBase.save","category":"page"},{"location":"machines/#MLJModelInterface.save","page":"Machines","title":"MLJModelInterface.save","text":"MLJ.save(filename, mach::Machine)\nMLJ.save(io, mach::Machine)\n\nMLJBase.save(filename, mach::Machine)\nMLJBase.save(io, mach::Machine)\n\nSerialize the machine mach to a file with path filename, or to an input/output stream io (at least IOBuffer instances are supported) using the Serialization module.\n\nTo serialise using a different format, see serializable.\n\nMachines are deserialized using the machine constructor as shown in the example below.\n\nThe implementation of save for machines changed in MLJ 0.18 (MLJBase 0.20). You can only restore a machine saved using older versions of MLJ using an older version.\n\nExample\n\nusing MLJ\nTree = @load DecisionTreeClassifier\nX, y = @load_iris\nmach = fit!(machine(Tree(), X, y))\n\nMLJ.save(\"tree.jls\", mach)\nmach_predict_only = machine(\"tree.jls\")\npredict(mach_predict_only, X)\n\n# using a buffer:\nio = IOBuffer()\nMLJ.save(io, mach)\nseekstart(io)\npredict_only_mach = machine(io)\npredict(predict_only_mach, X)\n\nwarning: Only load files from trusted sources\nMaliciously constructed JLS files, like pickles, and most other general purpose serialization formats, can allow for arbitrary code execution during loading. This means it is possible for someone to use a JLS file that looks like a serialized MLJ machine as a Trojan horse.\n\nSee also serializable, machine.\n\n\n\n\n\n","category":"function"},{"location":"machines/#Using-an-arbitrary-serializer","page":"Machines","title":"Using an arbitrary serializer","text":"","category":"section"},{"location":"machines/","page":"Machines","title":"Machines","text":"Since machines contain training data, serializing a machine directly is not recommended. Also, the learned parameters of models implemented in a language other than Julia may not have persistent representations, which means serializing them is useless. To address these two issues, users:","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"Call serializable(mach) on a machine mach they wish to save (to remove data and create persistent learned parameters)\nSerialize the returned object using SomeSerializationPkg","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"To restore the original machine (minus training data) they:","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"Deserialize using SomeSerializationPkg to obtain a new object mach\nCall restore!(mach) to ensure mach can be used to predict or transform new data.","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"MLJBase.serializable\nMLJBase.restore!","category":"page"},{"location":"machines/#MLJBase.serializable","page":"Machines","title":"MLJBase.serializable","text":"serializable(mach::Machine)\n\nReturns a shallow copy of the machine to make it serializable. 
In particular, all training data is removed and, if necessary, learned parameters are replaced with persistent representations.\n\nAny general purpose Julia serializer may be applied to the output of serializable (eg, JLSO, BSON, JLD) but you must call restore!(mach) on the deserialised object mach before using it. See the example below.\n\nIf using Julia's standard Serialization library, a shorter workflow is available using the MLJBase.save (or MLJ.save) method.\n\nA machine returned by serializable is characterized by the property mach.state == -1.\n\nExample using JLSO\n\nusing MLJ\nusing JLSO\nTree = @load DecisionTreeClassifier\ntree = Tree()\nX, y = @load_iris\nmach = fit!(machine(tree, X, y))\n\n# This machine can now be serialized\nsmach = serializable(mach)\nJLSO.save(\"machine.jlso\", :machine => smach)\n\n# Deserialize and restore learned parameters to usable form:\nloaded_mach = JLSO.load(\"machine.jlso\")[:machine]\nrestore!(loaded_mach)\n\npredict(loaded_mach, X)\npredict(mach, X)\n\nSee also restore!, MLJBase.save.\n\n\n\n\n\n","category":"function"},{"location":"machines/#MLJBase.restore!","page":"Machines","title":"MLJBase.restore!","text":"restore!(mach::Machine)\n\nRestore the state of a machine that is currently serializable but which may not be otherwise usable. For such a machine, mach, one has mach.state == -1. Intended for restoring deserialized machine objects to a usable form.\n\nFor an example see serializable.\n\n\n\n\n\n","category":"function"},{"location":"machines/#Internals","page":"Machines","title":"Internals","text":"","category":"section"},{"location":"machines/","page":"Machines","title":"Machines","text":"For a supervised machine, the predict method calls a lower-level MLJBase.predict method, dispatched on the underlying model and the fitresult (see below). To see predict in action, as well as its unsupervised cousins transform and inverse_transform, see Getting Started.","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"Except for model, a Machine instance has several fields which the user should not directly access; these include:","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"model - the struct containing the hyperparameters to be used in calls to fit!\nfitresult - the learned parameters in a raw form, initially undefined\nargs - a tuple of the data, each element wrapped in a source node; see Learning Networks (in the supervised learning example above, args = (source(X), source(y)))\nreport - outputs of training not encoded in fitresult (eg, feature rankings), initially undefined\nold_model - a deep copy of the model used in the last call to fit!\nold_rows - a copy of the row indices used in the last call to fit!\ncache","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"The interested reader can learn more about machine internals by examining the simplified code excerpt in Internals.","category":"page"},{"location":"machines/#API-Reference","page":"Machines","title":"API Reference","text":"","category":"section"},{"location":"machines/","page":"Machines","title":"Machines","text":"MLJBase.machine\nfit!\nfit_only!","category":"page"},{"location":"machines/#MLJBase.machine","page":"Machines","title":"MLJBase.machine","text":"machine(model, args...; cache=true, scitype_check_level=1)\n\nConstruct a Machine object binding a model, storing hyper-parameters of some machine learning algorithm, to some data, args. Calling fit! 
on a Machine instance mach stores outcomes of applying the algorithm in mach, which can be inspected using fitted_params(mach) (learned parameters) and report(mach) (other outcomes). This in turn enables generalization to new data using operations such as predict or transform:\n\nusing MLJModels\nX, y = make_regression()\n\nPCA = @load PCA pkg=MultivariateStats\nmodel = PCA()\nmach = machine(model, X)\nfit!(mach, rows=1:50)\ntransform(mach, selectrows(X, 51:100)) # or transform(mach, rows=51:100)\n\nDecisionTreeRegressor = @load DecisionTreeRegressor pkg=DecisionTree\nmodel = DecisionTreeRegressor()\nmach = machine(model, X, y)\nfit!(mach, rows=1:50)\npredict(mach, selectrows(X, 51:100)) # or predict(mach, rows=51:100)\n\nSpecify cache=false to prioritize memory management over speed.\n\nWhen building a learning network, Node objects can be substituted for the concrete data but no type or dimension checks are applied.\n\nChecks on the types of training data\n\nA model articulates its data requirements using scientific types, i.e., using the scitype function instead of the typeof function.\n\nIf scitype_check_level > 0 then the scitype of each arg in args is computed, and this is compared with the scitypes expected by the model, unless args contains Unknown scitypes and scitype_check_level < 4, in which case no further action is taken. Whether warnings are issued or errors thrown depends on the level. For details, see default_scitype_check_level, a method to inspect or change the default level (1 at startup).\n\nMachines with model placeholders\n\nA symbol can be substituted for a model in machine constructors to act as a placeholder for a model specified at training time. The symbol must be the field name for a struct whose corresponding value is a model, as shown in the following example:\n\nmutable struct MyComposite\n transformer\n classifier\nend\n\nmy_composite = MyComposite(Standardizer(), ConstantClassifier())\n\nX, y = make_blobs()\nmach = machine(:classifier, X, y)\nfit!(mach, composite=my_composite)\n\nThe last two lines are equivalent to\n\nmach = machine(ConstantClassifier(), X, y)\nfit!(mach)\n\nDelaying model specification is used when exporting learning networks as new stand-alone model types. See prefit and the MLJ documentation on learning networks.\n\nSee also fit!, default_scitype_check_level, MLJBase.save, serializable.\n\n\n\n\n\n","category":"function"},{"location":"machines/#StatsAPI.fit!","page":"Machines","title":"StatsAPI.fit!","text":"fit!(mach::Machine, rows=nothing, verbosity=1, force=false, composite=nothing)\n\nFit the machine mach. In the case that mach has Node arguments, first train all other machines on which mach depends.\n\nTo attempt to fit a machine without touching any other machine, use fit_only!. For more on options and the internal logic of fitting, see fit_only!\n\n\n\n\n\nfit!(N::Node;\n rows=nothing,\n verbosity=1,\n force=false,\n acceleration=CPU1())\n\nTrain all machines required to call the node N, in an appropriate order, but parallelizing where possible using specified acceleration mode. 
These machines are those returned by machines(N).\n\nSupported modes of acceleration: CPU1(), CPUThreads().\n\n\n\n\n\n","category":"function"},{"location":"machines/#MLJBase.fit_only!","page":"Machines","title":"MLJBase.fit_only!","text":"MLJBase.fit_only!(\n mach::Machine;\n rows=nothing,\n verbosity=1,\n force=false,\n composite=nothing,\n)\n\nWithout mutating any other machine on which it may depend, perform one of the following actions to the machine mach, using the data and model bound to it, and restricting the data to rows if specified:\n\nAb initio training. Ignoring any previous learned parameters and cache, compute and store new learned parameters. Increment mach.state.\nTraining update. Making use of previous learned parameters and/or cache, replace or mutate existing learned parameters. The effect is the same (or nearly the same) as in ab initio training, but may be faster or use less memory, assuming the model supports an update option (implements MLJBase.update). Increment mach.state.\nNo-operation. Leave existing learned parameters untouched. Do not increment mach.state.\n\nIf the model, model, bound to mach is a symbol, then instead perform the action using the true model given by getproperty(composite, model). See also machine.\n\nTraining action logic\n\nFor the action to be a no-operation, either mach.frozen == true or none of the following apply:\n\n(i) mach has never been trained (mach.state == 0).\n(ii) force == true.\n(iii) The state of some other machine on which mach depends has changed since the last time mach was trained (ie, the last time mach.state was incremented).\n(iv) The specified rows have changed since the last retraining and mach.model does not have Static type.\n(v) mach.model is a model and different from the last model used for training, but has the same type.\n(vi) mach.model is a model but has a type different from the last model used for training.\n(vii) mach.model is a symbol and getproperty(composite, mach.model) is different from the last model used for training, but has the same type.\n(viii) mach.model is a symbol and getproperty(composite, mach.model) has a different type from the last model used for training.\n\nIn any of the cases (i) - (iv), (vi), or (viii), mach is trained ab initio. If (v) or (vii) is true, then a training update is applied.\n\nTo freeze or unfreeze mach, use freeze!(mach) or thaw!(mach).\n\nImplementation details\n\nThe data to which a machine is bound is stored in mach.args. Each element of args is either a Node object, or, in the case that concrete data was bound to the machine, it is concrete data wrapped in a Source node. In all cases, to obtain concrete data for actual training, each argument N is called, as in N() or N(rows=rows), and either MLJBase.fit (ab initio training) or MLJBase.update (training update) is dispatched on mach.model and this data. 
See the \"Adding models for general use\" section of the MLJ documentation for more on these lower-level training methods.\n\n\n\n\n\n","category":"function"},{"location":"models/AutoEncoder_BetaML/#AutoEncoder_BetaML","page":"AutoEncoder","title":"AutoEncoder","text":"","category":"section"},{"location":"models/AutoEncoder_BetaML/","page":"AutoEncoder","title":"AutoEncoder","text":"mutable struct AutoEncoder <: MLJModelInterface.Unsupervised","category":"page"},{"location":"models/AutoEncoder_BetaML/","page":"AutoEncoder","title":"AutoEncoder","text":"A ready-to use AutoEncoder, from the Beta Machine Learning Toolkit (BetaML) for ecoding and decoding of data using neural networks","category":"page"},{"location":"models/AutoEncoder_BetaML/#Parameters:","page":"AutoEncoder","title":"Parameters:","text":"","category":"section"},{"location":"models/AutoEncoder_BetaML/","page":"AutoEncoder","title":"AutoEncoder","text":"encoded_size: The number of neurons (i.e. dimensions) of the encoded data. If the value is a float it is consiered a percentual (to be rounded) of the dimensionality of the data [def: 0.33]\nlayers_size: Inner layer dimension (i.e. number of neurons). If the value is a float it is considered a percentual (to be rounded) of the dimensionality of the data [def: nothing that applies a specific heuristic]. Consider that the underlying neural network is trying to predict multiple values at the same times. Normally this requires many more neurons than a scalar prediction. If e_layers or d_layers are specified, this parameter is ignored for the respective part.\ne_layers: The layers (vector of AbstractLayers) responsable of the encoding of the data [def: nothing, i.e. two dense layers with the inner one of layers_size]. See subtypes(BetaML.AbstractLayer) for supported layers\nd_layers: The layers (vector of AbstractLayers) responsable of the decoding of the data [def: nothing, i.e. two dense layers with the inner one of layers_size]. See subtypes(BetaML.AbstractLayer) for supported layers\nloss: Loss (cost) function [def: BetaML.squared_cost]. Should always assume y and ŷ as (n x d) matrices.\nwarning: Warning\nIf you change the parameter loss, you need to either provide its derivative on the parameter dloss or use autodiff with dloss=nothing.\ndloss: Derivative of the loss function [def: BetaML.dsquared_cost if loss==squared_cost, nothing otherwise, i.e. use the derivative of the squared cost or autodiff]\nepochs: Number of epochs, i.e. passages trough the whole training sample [def: 200]\nbatch_size: Size of each individual batch [def: 8]\nopt_alg: The optimisation algorithm to update the gradient at each batch [def: BetaML.ADAM()] See subtypes(BetaML.OptimisationAlgorithm) for supported optimizers\nshuffle: Whether to randomly shuffle the data at each iteration (epoch) [def: true]\ntunemethod: The method - and its parameters - to employ for hyperparameters autotuning. See SuccessiveHalvingSearch for the default method. To implement automatic hyperparameter tuning during the (first) fit! 
call, simply set autotune=true and optionally change the default tunemethod options (including the parameter ranges, the resources to employ and the loss function to adopt).\ndescr: An optional title and/or description for this model\nrng: Random Number Generator (see FIXEDSEED) [default: Random.GLOBAL_RNG]","category":"page"},{"location":"models/AutoEncoder_BetaML/#Notes:","page":"AutoEncoder","title":"Notes:","text":"","category":"section"},{"location":"models/AutoEncoder_BetaML/","page":"AutoEncoder","title":"AutoEncoder","text":"data must be numerical\nuse transform to obtain the encoded data, and inverse_transform to decode to the original data","category":"page"},{"location":"models/AutoEncoder_BetaML/#Example:","page":"AutoEncoder","title":"Example:","text":"","category":"section"},{"location":"models/AutoEncoder_BetaML/","page":"AutoEncoder","title":"AutoEncoder","text":"julia> using MLJ\n\njulia> X, y = @load_iris;\n\njulia> modelType = @load AutoEncoder pkg = \"BetaML\" verbosity=0;\n\njulia> model = modelType(encoded_size=2,layers_size=10);\n\njulia> mach = machine(model, X)\nuntrained Machine; caches model-specific representations of data\n model: AutoEncoder(e_layers = nothing, …)\n args: \n 1:\tSource @334 ⏎ Table{AbstractVector{Continuous}}\n\njulia> fit!(mach,verbosity=2)\n[ Info: Training machine(AutoEncoder(e_layers = nothing, …), …).\n***\n*** Training for 200 epochs with algorithm BetaML.Nn.ADAM.\nTraining.. \t avg loss on epoch 1 (1): \t 35.48243542158747\nTraining.. \t avg loss on epoch 20 (20): \t 0.07528042222678126\nTraining.. \t avg loss on epoch 40 (40): \t 0.06293071729378613\nTraining.. \t avg loss on epoch 60 (60): \t 0.057035588828991145\nTraining.. \t avg loss on epoch 80 (80): \t 0.056313167754822875\nTraining.. \t avg loss on epoch 100 (100): \t 0.055521461091809436\nTraining the Neural Network... 52%|██████████████████████████████████████ | ETA: 0:00:01Training.. \t avg loss on epoch 120 (120): \t 0.06015206472927942\nTraining.. \t avg loss on epoch 140 (140): \t 0.05536835903285201\nTraining.. \t avg loss on epoch 160 (160): \t 0.05877560142428245\nTraining.. \t avg loss on epoch 180 (180): \t 0.05476302769966953\nTraining.. \t avg loss on epoch 200 (200): \t 0.049240864053557445\nTraining the Neural Network... 100%|█████████████████████████████████████████████████████████████████████████| Time: 0:00:01\nTraining of 200 epoch completed. 
Final epoch error: 0.049240864053557445.\ntrained Machine; caches model-specific representations of data\n model: AutoEncoder(e_layers = nothing, …)\n args: \n 1:\tSource @334 ⏎ Table{AbstractVector{Continuous}}\n\n\njulia> X_latent = transform(mach, X)\n150×2 Matrix{Float64}:\n 7.01701 -2.77285\n 6.50615 -2.9279\n 6.5233 -2.60754\n ⋮ \n 6.70196 -10.6059\n 6.46369 -11.1117\n 6.20212 -10.1323\n\njulia> X_recovered = inverse_transform(mach,X_latent)\n150×4 Matrix{Float64}:\n 5.04973 3.55838 1.43251 0.242215\n 4.73689 3.19985 1.44085 0.295257\n 4.65128 3.25308 1.30187 0.244354\n ⋮ \n 6.50077 2.93602 5.3303 1.87647\n 6.38639 2.83864 5.54395 2.04117\n 6.01595 2.67659 5.03669 1.83234\n\njulia> BetaML.relative_mean_error(MLJ.matrix(X),X_recovered)\n0.03387721261716176\n\n","category":"page"},{"location":"models/SVMLinearRegressor_MLJScikitLearnInterface/#SVMLinearRegressor_MLJScikitLearnInterface","page":"SVMLinearRegressor","title":"SVMLinearRegressor","text":"","category":"section"},{"location":"models/SVMLinearRegressor_MLJScikitLearnInterface/","page":"SVMLinearRegressor","title":"SVMLinearRegressor","text":"SVMLinearRegressor","category":"page"},{"location":"models/SVMLinearRegressor_MLJScikitLearnInterface/","page":"SVMLinearRegressor","title":"SVMLinearRegressor","text":"A model type for constructing a linear support vector regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/SVMLinearRegressor_MLJScikitLearnInterface/","page":"SVMLinearRegressor","title":"SVMLinearRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/SVMLinearRegressor_MLJScikitLearnInterface/","page":"SVMLinearRegressor","title":"SVMLinearRegressor","text":"SVMLinearRegressor = @load SVMLinearRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/SVMLinearRegressor_MLJScikitLearnInterface/","page":"SVMLinearRegressor","title":"SVMLinearRegressor","text":"Do model = SVMLinearRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in SVMLinearRegressor(epsilon=...).","category":"page"},{"location":"models/SVMLinearRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"SVMLinearRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/SVMLinearRegressor_MLJScikitLearnInterface/","page":"SVMLinearRegressor","title":"SVMLinearRegressor","text":"epsilon = 0.0\ntol = 0.0001\nC = 1.0\nloss = epsilon_insensitive\nfit_intercept = true\nintercept_scaling = 1.0\ndual = true\nrandom_state = nothing\nmax_iter = 1000","category":"page"},{"location":"models/DecisionTreeRegressor_BetaML/#DecisionTreeRegressor_BetaML","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"","category":"section"},{"location":"models/DecisionTreeRegressor_BetaML/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"mutable struct DecisionTreeRegressor <: MLJModelInterface.Deterministic","category":"page"},{"location":"models/DecisionTreeRegressor_BetaML/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"A simple Decision Tree model for regression with support for Missing data, from the Beta Machine Learning Toolkit (BetaML).","category":"page"},{"location":"models/DecisionTreeRegressor_BetaML/#Hyperparameters:","page":"DecisionTreeRegressor","title":"Hyperparameters:","text":"","category":"section"},{"location":"models/DecisionTreeRegressor_BetaML/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"max_depth::Int64: The maximum depth the tree is allowed to reach. When this is reached the node is forced to become a leaf [def: 0, i.e. no limits]\nmin_gain::Float64: The minimum information gain to allow for a node's partition [def: 0]\nmin_records::Int64: The minimum number of records a node must hold to be considered for partitioning [def: 2]\nmax_features::Int64: The maximum number of (random) features to consider at each partitioning [def: 0, i.e. look at all features]\nsplitting_criterion::Function: This is the name of the function to be used to compute the information gain of a specific partition. This is done by measuring the difference between the \"impurity\" of the labels of the parent node and those of the two child nodes, weighted by the respective number of items. [def: variance]. Either variance or a custom function. 
It can also be an anonymous function.\nrng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]","category":"page"},{"location":"models/DecisionTreeRegressor_BetaML/#Example:","page":"DecisionTreeRegressor","title":"Example:","text":"","category":"section"},{"location":"models/DecisionTreeRegressor_BetaML/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"julia> using MLJ\n\njulia> X, y = @load_boston;\n\njulia> modelType = @load DecisionTreeRegressor pkg = \"BetaML\" verbosity=0\nBetaML.Trees.DecisionTreeRegressor\n\njulia> model = modelType()\nDecisionTreeRegressor(\n max_depth = 0, \n min_gain = 0.0, \n min_records = 2, \n max_features = 0, \n splitting_criterion = BetaML.Utils.variance, \n rng = Random._GLOBAL_RNG())\n\njulia> mach = machine(model, X, y);\n\njulia> fit!(mach);\n[ Info: Training machine(DecisionTreeRegressor(max_depth = 0, …), …).\n\njulia> ŷ = predict(mach, X);\n\njulia> hcat(y,ŷ)\n506×2 Matrix{Float64}:\n 24.0 26.35\n 21.6 21.6\n 34.7 34.8\n ⋮ \n 23.9 23.75\n 22.0 22.2\n 11.9 13.2","category":"page"},{"location":"models/LinearSVC_LIBSVM/#LinearSVC_LIBSVM","page":"LinearSVC","title":"LinearSVC","text":"","category":"section"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"LinearSVC","category":"page"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"A model type for constructing a linear support vector classifier, based on LIBSVM.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"LinearSVC = @load LinearSVC pkg=LIBSVM","category":"page"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"Do model = LinearSVC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearSVC(solver=...).","category":"page"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"Reference for algorithm and core C-library: Rong-En Fan et al (2008): \"LIBLINEAR: A Library for Large Linear Classification.\" Journal of Machine Learning Research 9 1871-1874. Available at https://www.csie.ntu.edu.tw/~cjlin/papers/liblinear.pdf. 
","category":"page"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"This model type is similar to SVC from the same package with the setting kernel=LIBSVM.Kernel.KERNEL.Linear, but is optimized for the linear case.","category":"page"},{"location":"models/LinearSVC_LIBSVM/#Training-data","page":"LinearSVC","title":"Training data","text":"","category":"section"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"In MLJ or MLJBase, bind an instance model to data with one of:","category":"page"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"mach = machine(model, X, y)\nmach = machine(model, X, y, w)","category":"page"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"where","category":"page"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)\ny: is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)\nw: a dictionary of class weights, keyed on levels(y).","category":"page"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/LinearSVC_LIBSVM/#Hyper-parameters","page":"LinearSVC","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"solver=LIBSVM.Linearsolver.L2R_L2LOSS_SVC_DUAL: linear solver, which must be one of the following from the LIBSVM.jl package:\nLIBSVM.Linearsolver.L2R_LR: L2-regularized logistic regression (primal))\nLIBSVM.Linearsolver.L2R_L2LOSS_SVC_DUAL: L2-regularized L2-loss support vector classification (dual)\nLIBSVM.Linearsolver.L2R_L2LOSS_SVC: L2-regularized L2-loss support vector classification (primal)\nLIBSVM.Linearsolver.L2R_L1LOSS_SVC_DUAL: L2-regularized L1-loss support vector classification (dual)\nLIBSVM.Linearsolver.MCSVM_CS: support vector classification by Crammer and Singer) LIBSVM.Linearsolver.L1R_L2LOSS_SVC: L1-regularized L2-loss support vector classification)\nLIBSVM.Linearsolver.L1R_LR: L1-regularized logistic regression\nLIBSVM.Linearsolver.L2R_LR_DUAL: L2-regularized logistic regression (dual)\ntolerance::Float64=Inf: tolerance for the stopping criterion;\ncost=1.0 (range (0, Inf)): the parameter denoted C in the cited reference; for greater regularization, decrease cost\nbias= -1.0: if bias >= 0, instance x becomes [x; bias]; if bias < 0, no bias term added (default -1)","category":"page"},{"location":"models/LinearSVC_LIBSVM/#Operations","page":"LinearSVC","title":"Operations","text":"","category":"section"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above.","category":"page"},{"location":"models/LinearSVC_LIBSVM/#Fitted-parameters","page":"LinearSVC","title":"Fitted parameters","text":"","category":"section"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"libsvm_model: the trained model object created by the LIBSVM.jl 
package\nencoding: class encoding used internally by libsvm_model - a dictionary of class labels keyed on the internal integer representation","category":"page"},{"location":"models/LinearSVC_LIBSVM/#Examples","page":"LinearSVC","title":"Examples","text":"","category":"section"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"using MLJ\nimport LIBSVM\n\nLinearSVC = @load LinearSVC pkg=LIBSVM ## model type\nmodel = LinearSVC(solver=LIBSVM.Linearsolver.L2R_LR) ## instance\n\nX, y = @load_iris ## table, vector\nmach = machine(model, X, y) |> fit!\n\nXnew = (sepal_length = [6.4, 7.2, 7.4],\n sepal_width = [2.8, 3.0, 2.8],\n petal_length = [5.6, 5.8, 6.1],\n petal_width = [2.1, 1.6, 1.9],)\n\njulia> yhat = predict(mach, Xnew)\n3-element CategoricalArrays.CategoricalArray{String,1,UInt32}:\n \"virginica\"\n \"versicolor\"\n \"virginica\"","category":"page"},{"location":"models/LinearSVC_LIBSVM/#Incorporating-class-weights","page":"LinearSVC","title":"Incorporating class weights","text":"","category":"section"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"weights = Dict(\"virginica\" => 1, \"versicolor\" => 20, \"setosa\" => 1)\nmach = machine(model, X, y, weights) |> fit!\n\njulia> yhat = predict(mach, Xnew)\n3-element CategoricalArrays.CategoricalArray{String,1,UInt32}:\n \"versicolor\"\n \"versicolor\"\n \"versicolor\"","category":"page"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"See also the SVC and NuSVC classifiers, and LIBSVM.jl and the original C implementation documentation.","category":"page"},{"location":"model_browser/#Model-Browser","page":"Model Browser","title":"Model Browser","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"Models may appear under multiple categories.","category":"page"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"Below, an encoder is any transformer that does not fall under another category, such as \"Missing Value Imputation\" or \"Dimension Reduction\".","category":"page"},{"location":"model_browser/#Categories","page":"Model Browser","title":"Categories","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"Regression | Classification | Outlier Detection | Iterative Models | Ensemble Models | Clustering | Dimension Reduction | Bayesian Models | Class Imbalance | Encoders | Static Models | Missing Value Imputation | Distribution Fitter | Text Analysis | Image Processing","category":"page"},{"location":"model_browser/#Regression","page":"Model Browser","title":"Regression","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"ARDRegressor (MLJScikitLearnInterface.jl)\nAdaBoostRegressor (MLJScikitLearnInterface.jl)\nBaggingRegressor (MLJScikitLearnInterface.jl)\nBayesianRidgeRegressor (MLJScikitLearnInterface.jl)\nCatBoostRegressor (CatBoost.jl)\nConstantRegressor (MLJModels.jl)\nDecisionTreeRegressor (BetaML.jl)\nDecisionTreeRegressor (DecisionTree.jl/MLJDecisionTreeInterface.jl)\nDeterministicConstantRegressor (MLJModels.jl)\nDummyRegressor (MLJScikitLearnInterface.jl)\nElasticNetCVRegressor (MLJScikitLearnInterface.jl)\nElasticNetRegressor (MLJLinearModels.jl)\nElasticNetRegressor (MLJScikitLearnInterface.jl)\nEpsilonSVR (LIBSVM.jl/MLJLIBSVMInterface.jl)\nEvoLinearRegressor (EvoLinear.jl)\nEvoSplineRegressor 
(EvoLinear.jl)\nEvoTreeCount (EvoTrees.jl)\nEvoTreeGaussian (EvoTrees.jl)\nEvoTreeMLE (EvoTrees.jl)\nEvoTreeRegressor (EvoTrees.jl)\nExtraTreesRegressor (MLJScikitLearnInterface.jl)\nGaussianMixtureRegressor (BetaML.jl)\nGaussianProcessRegressor (MLJScikitLearnInterface.jl)\nGradientBoostingRegressor (MLJScikitLearnInterface.jl)\nHistGradientBoostingRegressor (MLJScikitLearnInterface.jl)\nHuberRegressor (MLJLinearModels.jl)\nHuberRegressor (MLJScikitLearnInterface.jl)\nKNNRegressor (NearestNeighborModels.jl)\nKNeighborsRegressor (MLJScikitLearnInterface.jl)\nKPLSRegressor (PartialLeastSquaresRegressor.jl)\nLADRegressor (MLJLinearModels.jl)\nLGBMRegressor (LightGBM.jl)\nLarsCVRegressor (MLJScikitLearnInterface.jl)\nLarsRegressor (MLJScikitLearnInterface.jl)\nLassoCVRegressor (MLJScikitLearnInterface.jl)\nLassoLarsCVRegressor (MLJScikitLearnInterface.jl)\nLassoLarsICRegressor (MLJScikitLearnInterface.jl)\nLassoLarsRegressor (MLJScikitLearnInterface.jl)\nLassoRegressor (MLJLinearModels.jl)\nLassoRegressor (MLJScikitLearnInterface.jl)\nLinearCountRegressor (GLM.jl/MLJGLMInterface.jl)\nLinearRegressor (GLM.jl/MLJGLMInterface.jl)\nLinearRegressor (MLJLinearModels.jl)\nLinearRegressor (MLJScikitLearnInterface.jl)\nLinearRegressor (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nMultiTaskElasticNetCVRegressor (MLJScikitLearnInterface.jl)\nMultiTaskElasticNetRegressor (MLJScikitLearnInterface.jl)\nMultiTaskLassoCVRegressor (MLJScikitLearnInterface.jl)\nMultiTaskLassoRegressor (MLJScikitLearnInterface.jl)\nMultitargetGaussianMixtureRegressor (BetaML.jl)\nMultitargetKNNRegressor (NearestNeighborModels.jl)\nMultitargetLinearRegressor (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nMultitargetNeuralNetworkRegressor (BetaML.jl)\nMultitargetNeuralNetworkRegressor (MLJFlux.jl)\nMultitargetRidgeRegressor (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nMultitargetSRRegressor (SymbolicRegression.jl)\nNeuralNetworkRegressor (BetaML.jl)\nNeuralNetworkRegressor (MLJFlux.jl)\nNuSVR (LIBSVM.jl/MLJLIBSVMInterface.jl)\nOrthogonalMatchingPursuitCVRegressor (MLJScikitLearnInterface.jl)\nOrthogonalMatchingPursuitRegressor (MLJScikitLearnInterface.jl)\nPLSRegressor (PartialLeastSquaresRegressor.jl)\nPartLS (PartitionedLS.jl)\nPassiveAggressiveRegressor (MLJScikitLearnInterface.jl)\nQuantileRegressor (MLJLinearModels.jl)\nRANSACRegressor (MLJScikitLearnInterface.jl)\nRandomForestRegressor (BetaML.jl)\nRandomForestRegressor (DecisionTree.jl/MLJDecisionTreeInterface.jl)\nRandomForestRegressor (MLJScikitLearnInterface.jl)\nRidgeRegressor (MLJLinearModels.jl)\nRidgeRegressor (MLJScikitLearnInterface.jl)\nRidgeRegressor (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nRobustRegressor (MLJLinearModels.jl)\nSGDRegressor (MLJScikitLearnInterface.jl)\nSRRegressor (SymbolicRegression.jl)\nSVMLinearRegressor (MLJScikitLearnInterface.jl)\nSVMNuRegressor (MLJScikitLearnInterface.jl)\nSVMRegressor (MLJScikitLearnInterface.jl)\nStableForestRegressor (SIRUS.jl)\nStableRulesRegressor (SIRUS.jl)\nTheilSenRegressor (MLJScikitLearnInterface.jl)\nXGBoostCount (XGBoost.jl/MLJXGBoostInterface.jl)\nXGBoostRegressor (XGBoost.jl/MLJXGBoostInterface.jl)","category":"page"},{"location":"model_browser/#Classification","page":"Model Browser","title":"Classification","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"AdaBoostClassifier (MLJScikitLearnInterface.jl)\nAdaBoostStumpClassifier (DecisionTree.jl/MLJDecisionTreeInterface.jl)\nBaggingClassifier 
(MLJScikitLearnInterface.jl)\nBayesianLDA (MLJScikitLearnInterface.jl)\nBayesianLDA (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nBayesianQDA (MLJScikitLearnInterface.jl)\nBayesianSubspaceLDA (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nBernoulliNBClassifier (MLJScikitLearnInterface.jl)\nCatBoostClassifier (CatBoost.jl)\nComplementNBClassifier (MLJScikitLearnInterface.jl)\nConstantClassifier (MLJModels.jl)\nDecisionTreeClassifier (BetaML.jl)\nDecisionTreeClassifier (DecisionTree.jl/MLJDecisionTreeInterface.jl)\nDeterministicConstantClassifier (MLJModels.jl)\nDummyClassifier (MLJScikitLearnInterface.jl)\nEvoTreeClassifier (EvoTrees.jl)\nExtraTreesClassifier (MLJScikitLearnInterface.jl)\nGaussianNBClassifier (MLJScikitLearnInterface.jl)\nGaussianNBClassifier (NaiveBayes.jl/MLJNaiveBayesInterface.jl)\nGaussianProcessClassifier (MLJScikitLearnInterface.jl)\nGradientBoostingClassifier (MLJScikitLearnInterface.jl)\nHistGradientBoostingClassifier (MLJScikitLearnInterface.jl)\nImageClassifier (MLJFlux.jl)\nKNNClassifier (NearestNeighborModels.jl)\nKNeighborsClassifier (MLJScikitLearnInterface.jl)\nKernelPerceptronClassifier (BetaML.jl)\nLDA (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nLGBMClassifier (LightGBM.jl)\nLinearBinaryClassifier (GLM.jl/MLJGLMInterface.jl)\nLinearSVC (LIBSVM.jl/MLJLIBSVMInterface.jl)\nLogisticCVClassifier (MLJScikitLearnInterface.jl)\nLogisticClassifier (MLJLinearModels.jl)\nLogisticClassifier (MLJScikitLearnInterface.jl)\nMultinomialClassifier (MLJLinearModels.jl)\nMultinomialNBClassifier (MLJScikitLearnInterface.jl)\nMultinomialNBClassifier (NaiveBayes.jl/MLJNaiveBayesInterface.jl)\nMultitargetKNNClassifier (NearestNeighborModels.jl)\nNeuralNetworkClassifier (BetaML.jl)\nNeuralNetworkClassifier (MLJFlux.jl)\nNuSVC (LIBSVM.jl/MLJLIBSVMInterface.jl)\nOneRuleClassifier (OneRule.jl)\nPassiveAggressiveClassifier (MLJScikitLearnInterface.jl)\nPegasosClassifier (BetaML.jl)\nPerceptronClassifier (BetaML.jl)\nPerceptronClassifier (MLJScikitLearnInterface.jl)\nProbabilisticNuSVC (LIBSVM.jl/MLJLIBSVMInterface.jl)\nProbabilisticSGDClassifier (MLJScikitLearnInterface.jl)\nProbabilisticSVC (LIBSVM.jl/MLJLIBSVMInterface.jl)\nRandomForestClassifier (BetaML.jl)\nRandomForestClassifier (DecisionTree.jl/MLJDecisionTreeInterface.jl)\nRandomForestClassifier (MLJScikitLearnInterface.jl)\nRidgeCVClassifier (MLJScikitLearnInterface.jl)\nRidgeCVRegressor (MLJScikitLearnInterface.jl)\nRidgeClassifier (MLJScikitLearnInterface.jl)\nSGDClassifier (MLJScikitLearnInterface.jl)\nSVC (LIBSVM.jl/MLJLIBSVMInterface.jl)\nSVMClassifier (MLJScikitLearnInterface.jl)\nSVMLinearClassifier (MLJScikitLearnInterface.jl)\nSVMNuClassifier (MLJScikitLearnInterface.jl)\nStableForestClassifier (SIRUS.jl)\nStableRulesClassifier (SIRUS.jl)\nSubspaceLDA (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nXGBoostClassifier (XGBoost.jl/MLJXGBoostInterface.jl)","category":"page"},{"location":"model_browser/#Outlier-Detection","page":"Model Browser","title":"Outlier Detection","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"ABODDetector (OutlierDetectionNeighbors.jl)\nABODDetector (OutlierDetectionPython.jl)\nCBLOFDetector (OutlierDetectionPython.jl)\nCDDetector (OutlierDetectionPython.jl)\nCOFDetector (OutlierDetectionNeighbors.jl)\nCOFDetector (OutlierDetectionPython.jl)\nCOPODDetector (OutlierDetectionPython.jl)\nDNNDetector (OutlierDetectionNeighbors.jl)\nECODDetector (OutlierDetectionPython.jl)\nGMMDetector 
(OutlierDetectionPython.jl)\nHBOSDetector (OutlierDetectionPython.jl)\nIForestDetector (OutlierDetectionPython.jl)\nINNEDetector (OutlierDetectionPython.jl)\nKDEDetector (OutlierDetectionPython.jl)\nKNNDetector (OutlierDetectionNeighbors.jl)\nKNNDetector (OutlierDetectionPython.jl)\nLMDDDetector (OutlierDetectionPython.jl)\nLOCIDetector (OutlierDetectionPython.jl)\nLODADetector (OutlierDetectionPython.jl)\nLOFDetector (OutlierDetectionNeighbors.jl)\nLOFDetector (OutlierDetectionPython.jl)\nMCDDetector (OutlierDetectionPython.jl)\nOCSVMDetector (OutlierDetectionPython.jl)\nOneClassSVM (LIBSVM.jl/MLJLIBSVMInterface.jl)\nPCADetector (OutlierDetectionPython.jl)\nRODDetector (OutlierDetectionPython.jl)\nSODDetector (OutlierDetectionPython.jl)\nSOSDetector (OutlierDetectionPython.jl)","category":"page"},{"location":"model_browser/#Iterative-Models","page":"Model Browser","title":"Iterative Models","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"CatBoostClassifier (CatBoost.jl)\nCatBoostRegressor (CatBoost.jl)\nEvoSplineRegressor (EvoLinear.jl)\nEvoTreeClassifier (EvoTrees.jl)\nEvoTreeCount (EvoTrees.jl)\nEvoTreeGaussian (EvoTrees.jl)\nEvoTreeMLE (EvoTrees.jl)\nEvoTreeRegressor (EvoTrees.jl)\nExtraTreesClassifier (MLJScikitLearnInterface.jl)\nExtraTreesRegressor (MLJScikitLearnInterface.jl)\nImageClassifier (MLJFlux.jl)\nLGBMClassifier (LightGBM.jl)\nLGBMRegressor (LightGBM.jl)\nMultitargetNeuralNetworkRegressor (MLJFlux.jl)\nNeuralNetworkClassifier (MLJFlux.jl)\nNeuralNetworkRegressor (MLJFlux.jl)\nPerceptronClassifier (BetaML.jl)\nPerceptronClassifier (MLJScikitLearnInterface.jl)\nRandomForestClassifier (BetaML.jl)\nRandomForestClassifier (DecisionTree.jl/MLJDecisionTreeInterface.jl)\nRandomForestClassifier (MLJScikitLearnInterface.jl)\nRandomForestImputer (BetaML.jl)\nRandomForestRegressor (BetaML.jl)\nRandomForestRegressor (DecisionTree.jl/MLJDecisionTreeInterface.jl)\nRandomForestRegressor (MLJScikitLearnInterface.jl)\nXGBoostClassifier (XGBoost.jl/MLJXGBoostInterface.jl)\nXGBoostCount (XGBoost.jl/MLJXGBoostInterface.jl)\nXGBoostRegressor (XGBoost.jl/MLJXGBoostInterface.jl)","category":"page"},{"location":"model_browser/#Ensemble-Models","page":"Model Browser","title":"Ensemble Models","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"BaggingClassifier (MLJScikitLearnInterface.jl)\nBaggingRegressor (MLJScikitLearnInterface.jl)\nCatBoostClassifier (CatBoost.jl)\nCatBoostRegressor (CatBoost.jl)\nEvoSplineRegressor (EvoLinear.jl)\nEvoTreeClassifier (EvoTrees.jl)\nEvoTreeCount (EvoTrees.jl)\nEvoTreeGaussian (EvoTrees.jl)\nEvoTreeMLE (EvoTrees.jl)\nEvoTreeRegressor (EvoTrees.jl)\nLGBMClassifier (LightGBM.jl)\nLGBMRegressor (LightGBM.jl)\nRandomForestClassifier (BetaML.jl)\nRandomForestClassifier (DecisionTree.jl/MLJDecisionTreeInterface.jl)\nRandomForestClassifier (MLJScikitLearnInterface.jl)\nRandomForestImputer (BetaML.jl)\nRandomForestRegressor (BetaML.jl)\nRandomForestRegressor (DecisionTree.jl/MLJDecisionTreeInterface.jl)\nRandomForestRegressor (MLJScikitLearnInterface.jl)\nXGBoostClassifier (XGBoost.jl/MLJXGBoostInterface.jl)\nXGBoostCount (XGBoost.jl/MLJXGBoostInterface.jl)\nXGBoostRegressor (XGBoost.jl/MLJXGBoostInterface.jl)","category":"page"},{"location":"model_browser/#Clustering","page":"Model Browser","title":"Clustering","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model 
Browser","text":"AffinityPropagation (MLJScikitLearnInterface.jl)\nAgglomerativeClustering (MLJScikitLearnInterface.jl)\nBirch (MLJScikitLearnInterface.jl)\nBisectingKMeans (MLJScikitLearnInterface.jl)\nDBSCAN (Clustering.jl/MLJClusteringInterface.jl)\nDBSCAN (MLJScikitLearnInterface.jl)\nFeatureAgglomeration (MLJScikitLearnInterface.jl)\nGaussianMixtureClusterer (BetaML.jl)\nHDBSCAN (MLJScikitLearnInterface.jl)\nHierarchicalClustering (Clustering.jl/MLJClusteringInterface.jl)\nKMeans (Clustering.jl/MLJClusteringInterface.jl)\nKMeans (MLJScikitLearnInterface.jl)\nKMeans (ParallelKMeans.jl)\nKMeansClusterer (BetaML.jl)\nKMedoids (Clustering.jl/MLJClusteringInterface.jl)\nKMedoidsClusterer (BetaML.jl)\nMeanShift (MLJScikitLearnInterface.jl)\nMiniBatchKMeans (MLJScikitLearnInterface.jl)\nOPTICS (MLJScikitLearnInterface.jl)\nSelfOrganizingMap (SelfOrganizingMaps.jl)\nSpectralClustering (MLJScikitLearnInterface.jl)","category":"page"},{"location":"model_browser/#Dimension-Reduction","page":"Model Browser","title":"Dimension Reduction","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"AutoEncoder (BetaML.jl)\nBayesianLDA (MLJScikitLearnInterface.jl)\nBayesianLDA (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nBayesianQDA (MLJScikitLearnInterface.jl)\nBayesianSubspaceLDA (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nBirch (MLJScikitLearnInterface.jl)\nBisectingKMeans (MLJScikitLearnInterface.jl)\nFactorAnalysis (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nFeatureSelector (MLJModels.jl)\nKMeans (Clustering.jl/MLJClusteringInterface.jl)\nKMeans (MLJScikitLearnInterface.jl)\nKMeans (ParallelKMeans.jl)\nKMedoids (Clustering.jl/MLJClusteringInterface.jl)\nKernelPCA (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nLDA (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nMiniBatchKMeans (MLJScikitLearnInterface.jl)\nPCA (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nPPCA (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nSelfOrganizingMap (SelfOrganizingMaps.jl)\nSubspaceLDA (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nTSVDTransformer (TSVD.jl/MLJTSVDInterface.jl)","category":"page"},{"location":"model_browser/#Bayesian-Models","page":"Model Browser","title":"Bayesian Models","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"ARDRegressor (MLJScikitLearnInterface.jl)\nBayesianLDA (MLJScikitLearnInterface.jl)\nBayesianLDA (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nBayesianQDA (MLJScikitLearnInterface.jl)\nBayesianRidgeRegressor (MLJScikitLearnInterface.jl)\nBayesianSubspaceLDA (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nBernoulliNBClassifier (MLJScikitLearnInterface.jl)\nComplementNBClassifier (MLJScikitLearnInterface.jl)\nGaussianNBClassifier (MLJScikitLearnInterface.jl)\nGaussianNBClassifier (NaiveBayes.jl/MLJNaiveBayesInterface.jl)\nGaussianProcessClassifier (MLJScikitLearnInterface.jl)\nGaussianProcessRegressor (MLJScikitLearnInterface.jl)\nMultinomialNBClassifier (MLJScikitLearnInterface.jl)\nMultinomialNBClassifier (NaiveBayes.jl/MLJNaiveBayesInterface.jl)","category":"page"},{"location":"model_browser/#Class-Imbalance","page":"Model Browser","title":"Class Imbalance","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"BorderlineSMOTE1 (Imbalance.jl)\nClusterUndersampler (Imbalance.jl)\nENNUndersampler 
(Imbalance.jl)\nROSE (Imbalance.jl)\nRandomOversampler (Imbalance.jl)\nRandomUndersampler (Imbalance.jl)\nRandomWalkOversampler (Imbalance.jl)\nSMOTE (Imbalance.jl)\nSMOTEN (Imbalance.jl)\nSMOTENC (Imbalance.jl)\nTomekUndersampler (Imbalance.jl)","category":"page"},{"location":"model_browser/#Encoders","page":"Model Browser","title":"Encoders","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"BM25Transformer (MLJText.jl)\nContinuousEncoder (MLJModels.jl)\nCountTransformer (MLJText.jl)\nICA (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nOneHotEncoder (MLJModels.jl)\nStandardizer (MLJModels.jl)\nTfidfTransformer (MLJText.jl)\nUnivariateBoxCoxTransformer (MLJModels.jl)\nUnivariateDiscretizer (MLJModels.jl)\nUnivariateStandardizer (MLJModels.jl)\nUnivariateTimeTypeToContinuous (MLJModels.jl)","category":"page"},{"location":"model_browser/#Static-Models","page":"Model Browser","title":"Static Models","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"AgglomerativeClustering (MLJScikitLearnInterface.jl)\nDBSCAN (Clustering.jl/MLJClusteringInterface.jl)\nDBSCAN (MLJScikitLearnInterface.jl)\nFeatureAgglomeration (MLJScikitLearnInterface.jl)\nHDBSCAN (MLJScikitLearnInterface.jl)\nInteractionTransformer (MLJModels.jl)\nOPTICS (MLJScikitLearnInterface.jl)\nSpectralClustering (MLJScikitLearnInterface.jl)","category":"page"},{"location":"model_browser/#Missing-Value-Imputation","page":"Model Browser","title":"Missing Value Imputation","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"FillImputer (MLJModels.jl)\nGaussianMixtureImputer (BetaML.jl)\nGeneralImputer (BetaML.jl)\nRandomForestImputer (BetaML.jl)\nSimpleImputer (BetaML.jl)\nUnivariateFillImputer (MLJModels.jl)","category":"page"},{"location":"model_browser/#Distribution-Fitter","page":"Model Browser","title":"Distribution Fitter","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"GaussianMixtureClusterer (BetaML.jl)\nGaussianMixtureImputer (BetaML.jl)\nGaussianMixtureRegressor (BetaML.jl)\nMultitargetGaussianMixtureRegressor (BetaML.jl)","category":"page"},{"location":"model_browser/#Text-Analysis","page":"Model Browser","title":"Text Analysis","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"BM25Transformer (MLJText.jl)\nCountTransformer (MLJText.jl)\nTfidfTransformer (MLJText.jl)","category":"page"},{"location":"model_browser/#Image-Processing","page":"Model Browser","title":"Image Processing","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"ImageClassifier (MLJFlux.jl)","category":"page"},{"location":"linear_pipelines/#Linear-Pipelines","page":"Linear Pipelines","title":"Linear Pipelines","text":"","category":"section"},{"location":"linear_pipelines/","page":"Linear Pipelines","title":"Linear Pipelines","text":"In MLJ a pipeline is a composite model in which models are chained together in a linear (non-branching) chain. 
For other arrangements, including custom architectures via learning networks, see Composing Models.","category":"page"},{"location":"linear_pipelines/","page":"Linear Pipelines","title":"Linear Pipelines","text":"For purposes of illustration, consider a supervised learning problem with the following toy data:","category":"page"},{"location":"linear_pipelines/","page":"Linear Pipelines","title":"Linear Pipelines","text":"using MLJ\nMLJ.color_off()","category":"page"},{"location":"linear_pipelines/","page":"Linear Pipelines","title":"Linear Pipelines","text":"using MLJ\nX = (age = [23, 45, 34, 25, 67],\n gender = categorical(['m', 'm', 'f', 'm', 'f']));\ny = [67.0, 81.5, 55.6, 90.0, 61.1]\n nothing # hide","category":"page"},{"location":"linear_pipelines/","page":"Linear Pipelines","title":"Linear Pipelines","text":"We would like to train using a K-nearest neighbor model, but the model type KNNRegressor assumes the features are all Continuous. This can be fixed by first:","category":"page"},{"location":"linear_pipelines/","page":"Linear Pipelines","title":"Linear Pipelines","text":"coercing the :age feature to have Continuous type by replacing X with coerce(X, :age=>Continuous)\nstandardizing continuous features and one-hot encoding the Multiclass features using the ContinuousEncoder model","category":"page"},{"location":"linear_pipelines/","page":"Linear Pipelines","title":"Linear Pipelines","text":"However, we can avoid separately applying these preprocessing steps (two of which require fit! steps) by combining them with the supervised KNNRegressor model in a new pipeline model, using Julia's |> syntax:","category":"page"},{"location":"linear_pipelines/","page":"Linear Pipelines","title":"Linear Pipelines","text":"KNNRegressor = @load KNNRegressor pkg=NearestNeighborModels\npipe = (X -> coerce(X, :age=>Continuous)) |> ContinuousEncoder() |> KNNRegressor(K=2)","category":"page"},{"location":"linear_pipelines/","page":"Linear Pipelines","title":"Linear Pipelines","text":"We see above that pipe is a model whose hyperparameters are themselves other models or a function. (The names of these hyper-parameters are automatically generated. To specify your own names, use the explicit Pipeline constructor instead.)","category":"page"},{"location":"linear_pipelines/","page":"Linear Pipelines","title":"Linear Pipelines","text":"The |> syntax can also be used to extend an existing pipeline or concatenate two existing pipelines. So, we could instead have defined:","category":"page"},{"location":"linear_pipelines/","page":"Linear Pipelines","title":"Linear Pipelines","text":"pipe_transformer = (X -> coerce(X, :age=>Continuous)) |> ContinuousEncoder()\npipe = pipe_transformer |> KNNRegressor(K=2)","category":"page"},{"location":"linear_pipelines/","page":"Linear Pipelines","title":"Linear Pipelines","text":"A pipeline is just a model like any other. 
For example, we can evaluate its performance on the data above:","category":"page"},{"location":"linear_pipelines/","page":"Linear Pipelines","title":"Linear Pipelines","text":"evaluate(pipe, X, y, resampling=CV(nfolds=3), measure=mae)","category":"page"},{"location":"linear_pipelines/","page":"Linear Pipelines","title":"Linear Pipelines","text":"To include target transformations in a pipeline, wrap the supervised component using TransformedTargetModel.","category":"page"},{"location":"linear_pipelines/","page":"Linear Pipelines","title":"Linear Pipelines","text":"Pipeline","category":"page"},{"location":"linear_pipelines/#MLJBase.Pipeline","page":"Linear Pipelines","title":"MLJBase.Pipeline","text":"Pipeline(component1, component2, ... , componentk; options...)\nPipeline(name1=component1, name2=component2, ..., namek=componentk; options...)\ncomponent1 |> component2 |> ... |> componentk\n\nCreate an instance of a composite model type which sequentially composes the specified components in order. This means component1 receives inputs, whose output is passed to component2, and so forth. A \"component\" is either a Model instance, a model type (converted immediately to its default instance) or any callable object. Here the \"output\" of a model is what predict returns if it is Supervised, or what transform returns if it is Unsupervised.\n\nNames for the component fields are automatically generated unless explicitly specified, as in\n\nPipeline(encoder=ContinuousEncoder(drop_last=false),\n stand=Standardizer())\n\nThe Pipeline constructor accepts keyword options discussed further below.\n\nOrdinary functions (and other callables) may be inserted in the pipeline as shown in the following example:\n\nPipeline(X->coerce(X, :age=>Continuous), OneHotEncoder, ConstantClassifier)\n\nSyntactic sugar\n\nThe |> operator is overloaded to construct pipelines out of models, callables, and existing pipelines:\n\nLinearRegressor = @load LinearRegressor pkg=MLJLinearModels add=true\nPCA = @load PCA pkg=MultivariateStats add=true\n\npipe1 = MLJBase.table |> ContinuousEncoder |> Standardizer\npipe2 = PCA |> LinearRegressor\npipe1 |> pipe2\n\nAt most one of the components may be a supervised model, but this model can appear in any position. A pipeline with a Supervised component is itself Supervised and implements the predict operation. It is otherwise Unsupervised (possibly Static) and implements transform.\n\nSpecial operations\n\nIf all the components are invertible unsupervised models (ie, implement inverse_transform) then inverse_transform is implemented for the pipeline. If there are no supervised models, then predict is nevertheless implemented, assuming the last component is a model that implements it (some clustering models). 
Similarly, calling transform on a supervised pipeline calls transform on the supervised component.\n\nOptional key-word arguments\n\nprediction_type - prediction type of the pipeline; possible values: :deterministic, :probabilistic, :interval (default=:deterministic if not inferable)\noperation - operation applied to the supervised component model, when present; possible values: predict, predict_mean, predict_median, predict_mode (default=predict)\ncache - whether the internal machines created for component models should cache model-specific representations of data (see machine) (default=true)\n\nwarning: Warning\nSet cache=false to guarantee data anonymization.\n\nTo build more complicated non-branching pipelines, refer to the MLJ manual sections on composing models.\n\n\n\n\n\n","category":"function"},{"location":"models/InteractionTransformer_MLJModels/#InteractionTransformer_MLJModels","page":"InteractionTransformer","title":"InteractionTransformer","text":"","category":"section"},{"location":"models/InteractionTransformer_MLJModels/","page":"InteractionTransformer","title":"InteractionTransformer","text":"InteractionTransformer","category":"page"},{"location":"models/InteractionTransformer_MLJModels/","page":"InteractionTransformer","title":"InteractionTransformer","text":"A model type for constructing a interaction transformer, based on MLJModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/InteractionTransformer_MLJModels/","page":"InteractionTransformer","title":"InteractionTransformer","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/InteractionTransformer_MLJModels/","page":"InteractionTransformer","title":"InteractionTransformer","text":"InteractionTransformer = @load InteractionTransformer pkg=MLJModels","category":"page"},{"location":"models/InteractionTransformer_MLJModels/","page":"InteractionTransformer","title":"InteractionTransformer","text":"Do model = InteractionTransformer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in InteractionTransformer(order=...).","category":"page"},{"location":"models/InteractionTransformer_MLJModels/","page":"InteractionTransformer","title":"InteractionTransformer","text":"Generates all polynomial interaction terms up to the given order for the subset of chosen columns. Any column that contains elements with scitype <:Infinite is a valid basis to generate interactions. 
If features is not specified, all such columns with scitype <:Infinite in the table are used as a basis.","category":"page"},{"location":"models/InteractionTransformer_MLJModels/","page":"InteractionTransformer","title":"InteractionTransformer","text":"In MLJ or MLJBase, you can transform features X with the single call","category":"page"},{"location":"models/InteractionTransformer_MLJModels/","page":"InteractionTransformer","title":"InteractionTransformer","text":"transform(machine(model), X)","category":"page"},{"location":"models/InteractionTransformer_MLJModels/","page":"InteractionTransformer","title":"InteractionTransformer","text":"See also the example below.","category":"page"},{"location":"models/InteractionTransformer_MLJModels/#Hyper-parameters","page":"InteractionTransformer","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/InteractionTransformer_MLJModels/","page":"InteractionTransformer","title":"InteractionTransformer","text":"order: Maximum order of interactions to be generated.\nfeatures: Restricts interaction generation to those columns","category":"page"},{"location":"models/InteractionTransformer_MLJModels/#Operations","page":"InteractionTransformer","title":"Operations","text":"","category":"section"},{"location":"models/InteractionTransformer_MLJModels/","page":"InteractionTransformer","title":"InteractionTransformer","text":"transform(machine(model), X): Generates polynomial interaction terms out of table X using the hyper-parameters specified in model.","category":"page"},{"location":"models/InteractionTransformer_MLJModels/#Example","page":"InteractionTransformer","title":"Example","text":"","category":"section"},{"location":"models/InteractionTransformer_MLJModels/","page":"InteractionTransformer","title":"InteractionTransformer","text":"using MLJ\n\nX = (\n A = [1, 2, 3],\n B = [4, 5, 6],\n C = [7, 8, 9],\n D = [\"x₁\", \"x₂\", \"x₃\"]\n)\nit = InteractionTransformer(order=3)\nmach = machine(it)\n\njulia> transform(mach, X)\n(A = [1, 2, 3],\n B = [4, 5, 6],\n C = [7, 8, 9],\n D = [\"x₁\", \"x₂\", \"x₃\"],\n A_B = [4, 10, 18],\n A_C = [7, 16, 27],\n B_C = [28, 40, 54],\n A_B_C = [28, 80, 162],)\n\nit = InteractionTransformer(order=2, features=[:A, :B])\nmach = machine(it)\n\njulia> transform(mach, X)\n(A = [1, 2, 3],\n B = [4, 5, 6],\n C = [7, 8, 9],\n D = [\"x₁\", \"x₂\", \"x₃\"],\n A_B = [4, 10, 18],)\n","category":"page"},{"location":"models/HierarchicalClustering_Clustering/#HierarchicalClustering_Clustering","page":"HierarchicalClustering","title":"HierarchicalClustering","text":"","category":"section"},{"location":"models/HierarchicalClustering_Clustering/","page":"HierarchicalClustering","title":"HierarchicalClustering","text":"HierarchicalClustering","category":"page"},{"location":"models/HierarchicalClustering_Clustering/","page":"HierarchicalClustering","title":"HierarchicalClustering","text":"A model type for constructing a hierarchical clusterer, based on Clustering.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/HierarchicalClustering_Clustering/","page":"HierarchicalClustering","title":"HierarchicalClustering","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/HierarchicalClustering_Clustering/","page":"HierarchicalClustering","title":"HierarchicalClustering","text":"HierarchicalClustering = @load HierarchicalClustering 
pkg=Clustering","category":"page"},{"location":"models/HierarchicalClustering_Clustering/","page":"HierarchicalClustering","title":"HierarchicalClustering","text":"Do model = HierarchicalClustering() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in HierarchicalClustering(linkage=...).","category":"page"},{"location":"models/HierarchicalClustering_Clustering/","page":"HierarchicalClustering","title":"HierarchicalClustering","text":"Hierarchical Clustering is a clustering algorithm that organizes the data in a dendrogram based on distances between groups of points and computes cluster assignments by cutting the dendrogram at a given height. More information is available at the Clustering.jl documentation. Use predict to get cluster assignments. The dendrogram and the dendrogram cutter are accessed from the machine report (see below).","category":"page"},{"location":"models/HierarchicalClustering_Clustering/","page":"HierarchicalClustering","title":"HierarchicalClustering","text":"This is a static implementation, i.e., it does not generalize to new data instances, and there is no training data. For clusterers that do generalize, see KMeans or KMedoids.","category":"page"},{"location":"models/HierarchicalClustering_Clustering/","page":"HierarchicalClustering","title":"HierarchicalClustering","text":"In MLJ or MLJBase, create a machine with","category":"page"},{"location":"models/HierarchicalClustering_Clustering/","page":"HierarchicalClustering","title":"HierarchicalClustering","text":"mach = machine(model)","category":"page"},{"location":"models/HierarchicalClustering_Clustering/#Hyper-parameters","page":"HierarchicalClustering","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/HierarchicalClustering_Clustering/","page":"HierarchicalClustering","title":"HierarchicalClustering","text":"linkage = :single: linkage method (:single, :average, :complete, :ward, :ward_presquared)\nmetric = SqEuclidean: metric (see Distances.jl for available metrics)\nbranchorder = :r: branchorder (:r, :barjoseph, :optimal)\nh = nothing: height at which the dendrogram is cut\nk = 3: number of clusters.","category":"page"},{"location":"models/HierarchicalClustering_Clustering/","page":"HierarchicalClustering","title":"HierarchicalClustering","text":"If both k and h are specified, it is guaranteed that the number of clusters is not less than k and their height is not above h.","category":"page"},{"location":"models/HierarchicalClustering_Clustering/#Operations","page":"HierarchicalClustering","title":"Operations","text":"","category":"section"},{"location":"models/HierarchicalClustering_Clustering/","page":"HierarchicalClustering","title":"HierarchicalClustering","text":"predict(mach, X): return cluster label assignments, as an unordered CategoricalVector. 
Here X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).","category":"page"},{"location":"models/HierarchicalClustering_Clustering/#Report","page":"HierarchicalClustering","title":"Report","text":"","category":"section"},{"location":"models/HierarchicalClustering_Clustering/","page":"HierarchicalClustering","title":"HierarchicalClustering","text":"After calling predict(mach), the fields of report(mach) are:","category":"page"},{"location":"models/HierarchicalClustering_Clustering/","page":"HierarchicalClustering","title":"HierarchicalClustering","text":"dendrogram: the dendrogram that was computed when calling predict.\ncutter: a dendrogram cutter that can be called with a height h or a number of clusters k, to obtain a new assignment of the data points to clusters (see example below).","category":"page"},{"location":"models/HierarchicalClustering_Clustering/#Examples","page":"HierarchicalClustering","title":"Examples","text":"","category":"section"},{"location":"models/HierarchicalClustering_Clustering/","page":"HierarchicalClustering","title":"HierarchicalClustering","text":"using MLJ\n\nX, labels = make_moons(400, noise=0.09, rng=1) ## synthetic data with 2 clusters; X\n\nHierarchicalClustering = @load HierarchicalClustering pkg=Clustering\nmodel = HierarchicalClustering(linkage = :complete)\nmach = machine(model)\n\n## compute and output cluster assignments for observations in `X`:\nyhat = predict(mach, X)\n\n## plot dendrogram:\nusing StatsPlots\nplot(report(mach).dendrogram)\n\n## make new predictions by cutting the dendrogram at another height\nreport(mach).cutter(h = 2.5)","category":"page"},{"location":"models/SMOTENC_Imbalance/#SMOTENC_Imbalance","page":"SMOTENC","title":"SMOTENC","text":"","category":"section"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"Initiate a SMOTENC model with the given hyper-parameters.","category":"page"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"SMOTENC","category":"page"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"A model type for constructing a smotenc, based on Imbalance.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"SMOTENC = @load SMOTENC pkg=Imbalance","category":"page"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"Do model = SMOTENC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SMOTENC(k=...).","category":"page"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"SMOTENC implements the SMOTENC algorithm to correct for class imbalance as in N. V. Chawla, K. W. Bowyer, L. O.Hall, W. P. 
Kegelmeyer, “SMOTE: synthetic minority over-sampling technique,” Journal of artificial intelligence research, 321-357, 2002.","category":"page"},{"location":"models/SMOTENC_Imbalance/#Training-data","page":"SMOTENC","title":"Training data","text":"","category":"section"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"In MLJ or MLJBase, wrap the model in a machine by","category":"page"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"mach = machine(model)","category":"page"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"There is no need to provide any data here because the model is a static transformer.","category":"page"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"Likewise, there is no need to fit!(mach).","category":"page"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"For default values of the hyper-parameters, model can be constructed by","category":"page"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"model = SMOTENC()","category":"page"},{"location":"models/SMOTENC_Imbalance/#Hyperparameters","page":"SMOTENC","title":"Hyperparameters","text":"","category":"section"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"k=5: Number of nearest neighbors to consider in the SMOTENC algorithm. Should be within the range [1, n - 1], where n is the number of observations; otherwise set to the nearest of these two values.\nratios=1.0: A parameter that controls the amount of oversampling to be done for each class\nCan be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class\nCan be a dictionary mapping each class label to the float ratio for that class\nknn_tree: Decides the tree used in KNN computations. Either \"Brute\" or \"Ball\". BallTree can be much faster but may lead to inaccurate results.\nrng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister`.","category":"page"},{"location":"models/SMOTENC_Imbalance/#Transform-Inputs","page":"SMOTENC","title":"Transform Inputs","text":"","category":"section"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"X: A table with element scitypes that subtype Union{Finite, Infinite}. Elements in nominal columns should subtype Finite (i.e., have scitype OrderedFactor or Multiclass) and elements in continuous columns should subtype Infinite (i.e., have scitype Count or Continuous).\ny: An abstract vector of labels (e.g., strings) that correspond to the observations in X","category":"page"},{"location":"models/SMOTENC_Imbalance/#Transform-Outputs","page":"SMOTENC","title":"Transform Outputs","text":"","category":"section"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"Xover: A matrix or table that includes original data and the new observations due to oversampling. 
depending on whether the input X is a matrix or table respectively\nyover: An abstract vector of labels corresponding to Xover","category":"page"},{"location":"models/SMOTENC_Imbalance/#Operations","page":"SMOTENC","title":"Operations","text":"","category":"section"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"transform(mach, X, y): resample the data X and y using SMOTENC, returning both the new and original observations","category":"page"},{"location":"models/SMOTENC_Imbalance/#Example","page":"SMOTENC","title":"Example","text":"","category":"section"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"using MLJ\nusing ScientificTypes\nimport Imbalance\n\n## set probability of each class\nclass_probs = [0.5, 0.2, 0.3] \nnum_rows = 100\nnum_continuous_feats = 3\n## want two categorical features with three and two possible values respectively\nnum_vals_per_category = [3, 2]\n\n## generate a table and categorical vector accordingly\nX, y = Imbalance.generate_imbalanced_data(num_rows, num_continuous_feats; \n class_probs, num_vals_per_category, rng=42) \njulia> Imbalance.checkbalance(y)\n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (39.6%) \n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 33 (68.8%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%) \n\njulia> ScientificTypes.schema(X).scitypes\n(Continuous, Continuous, Continuous, Continuous, Continuous)\n## coerce nominal columns to a finite scitype (multiclass or ordered factor)\nX = coerce(X, :Column4=>Multiclass, :Column5=>Multiclass)\n\n## load SMOTE-NC\nSMOTENC = @load SMOTENC pkg=Imbalance\n\n## wrap the model in a machine\noversampler = SMOTENC(k=5, ratios=Dict(0=>1.0, 1=> 0.9, 2=>0.8), rng=42)\nmach = machine(oversampler)\n\n## provide the data to transform (there is nothing to fit)\nXover, yover = transform(mach, X, y)\n\njulia> Imbalance.checkbalance(yover)\n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 38 (79.2%) \n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 43 (89.6%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%) ","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/#EvoTreeCount_EvoTrees","page":"EvoTreeCount","title":"EvoTreeCount","text":"","category":"section"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"EvoTreeCount(;kwargs...)","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"A model type for constructing a EvoTreeCount, based on EvoTrees.jl, and implementing both an internal API the MLJ model interface. EvoTreeCount is used to perform Poisson probabilistic regression on count target.","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/#Hyper-parameters","page":"EvoTreeCount","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"nrounds=100: Number of rounds. It corresponds to the number of trees that will be sequentially stacked. Must be >= 1.\neta=0.1: Learning rate. Each tree raw predictions are scaled by eta prior to be added to the stack of predictions. Must be > 0. A lower eta results in slower learning, requiring a higher nrounds but typically improves model performance.\nL2::T=0.0: L2 regularization factor on aggregate gain. Must be >= 0. Higher L2 can result in a more robust model.\nlambda::T=0.0: L2 regularization factor on individual gain. Must be >= 0. 
Higher lambda can result in a more robust model.\ngamma::T=0.0: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model.\nmax_depth=6: Maximum depth of a tree. Must be >= 1. A tree of depth 1 is made of a single prediction leaf. A complete tree of depth N contains 2^(N - 1) terminal leaves and 2^(N - 1) - 1 split nodes. Compute cost is proportional to 2^max_depth. Typical optimal values are in the 3 to 9 range.\nmin_weight=1.0: Minimum weight needed in a node to perform a split. Matches the number of observations by default or the sum of weights as provided by the weights vector. Must be > 0.\nrowsample=1.0: Proportion of rows that are sampled at each iteration to build the tree. Should be ]0, 1].\ncolsample=1.0: Proportion of columns / features that are sampled at each iteration to build the tree. Should be ]0, 1].\nnbins=64: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins. Should be between 2 and 255.\nmonotone_constraints=Dict{Int, Int}(): Specify monotonic constraints using a dict where the key is the feature index and the value is the applicable constraint (-1=decreasing, 0=none, 1=increasing).\ntree_type=\"binary\": Tree structure to be used. One of:\nbinary: Each node of a tree is grown independently. Trees are built depthwise until max depth is reached or if min weight or gain (see gamma) stops further node splits.\noblivious: A common splitting condition is imposed on all nodes of a given depth.\nrng=123: Either an integer used as a seed to the random number generator or an actual random number generator (::Random.AbstractRNG).","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/#Internal-API","page":"EvoTreeCount","title":"Internal API","text":"","category":"section"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"Do config = EvoTreeCount() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeCount(max_depth=...).","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/#Training-model","page":"EvoTreeCount","title":"Training model","text":"","category":"section"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"A model is built using fit_evotree:","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"model = fit_evotree(config; x_train, y_train, kwargs...)","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/#Inference","page":"EvoTreeCount","title":"Inference","text":"","category":"section"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"Predictions are obtained using predict which returns a Vector of length nobs:","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"EvoTrees.predict(model, X)","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"Alternatively, models act as a functor, returning predictions when called as a function with features as argument:","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"model(X)","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/#MLJ","page":"EvoTreeCount","title":"MLJ","text":"","category":"section"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"From MLJ, the type can be imported using:","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"EvoTreeCount = @load EvoTreeCount pkg=EvoTrees","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"Do model = EvoTreeCount() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeCount(loss=...).","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/#Training-data","page":"EvoTreeCount","title":"Training data","text":"","category":"section"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"In MLJ or MLJBase, bind an instance model to data with mach = machine(model, X, y) where","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)\ny: is the target, which can be any AbstractVector whose element scitype is <:Count; check the scitype with scitype(y)","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/#Operations","page":"EvoTreeCount","title":"Operations","text":"","category":"section"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"predict(mach, Xnew): returns a vector of Poisson distributions given features Xnew having the same scitype as X above. 
Predictions are probabilistic.","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"Specific metrics can also be predicted using:","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"predict_mean(mach, Xnew)\npredict_mode(mach, Xnew)\npredict_median(mach, Xnew)","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/#Fitted-parameters","page":"EvoTreeCount","title":"Fitted parameters","text":"","category":"section"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":":fitresult: The GBTree object returned by EvoTrees.jl fitting algorithm.","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/#Report","page":"EvoTreeCount","title":"Report","text":"","category":"section"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"The fields of report(mach) are:","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":":features: The names of the features encountered in training.","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/#Examples","page":"EvoTreeCount","title":"Examples","text":"","category":"section"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"## Internal API\nusing EvoTrees\nconfig = EvoTreeCount(max_depth=5, nbins=32, nrounds=100)\nnobs, nfeats = 1_000, 5\nx_train, y_train = randn(nobs, nfeats), rand(0:2, nobs)\nmodel = fit_evotree(config; x_train, y_train)\npreds = EvoTrees.predict(model, x_train)","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"using MLJ\nEvoTreeCount = @load EvoTreeCount pkg=EvoTrees\nmodel = EvoTreeCount(max_depth=5, nbins=32, nrounds=100)\nnobs, nfeats = 1_000, 5\nX, y = randn(nobs, nfeats), rand(0:2, nobs)\nmach = machine(model, X, y) |> fit!\npreds = predict(mach, X)\npreds = predict_mean(mach, X)\npreds = predict_mode(mach, X)\npreds = predict_median(mach, X)\n","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"See also EvoTrees.jl.","category":"page"},{"location":"list_of_supported_models/#model_list","page":"List of Supported Models","title":"List of Supported Models","text":"","category":"section"},{"location":"list_of_supported_models/","page":"List of Supported Models","title":"List of Supported Models","text":"For a list of models organized around function (\"classification\", \"regression\", etc.), see the Model Browser.","category":"page"},{"location":"list_of_supported_models/","page":"List of Supported Models","title":"List of Supported Models","text":"MLJ provides access to a wide variety of machine learning models. We are always looking for help adding new models or testing existing ones. Currently available models are listed below; for the most up-to-date list, run using MLJ; models(). ","category":"page"},{"location":"list_of_supported_models/","page":"List of Supported Models","title":"List of Supported Models","text":"Indications of \"maturity\" in the table below are approximate, surjective, and possibly out-of-date. 
A decision to use or not use a model in a critical application should be based on a user's independent assessment.","category":"page"},{"location":"list_of_supported_models/","page":"List of Supported Models","title":"List of Supported Models","text":"experimental: indicates the package is fairly new and/or is under active development; you can help by testing these packages and making them more robust,\nlow: indicate a package that has reached a roughly stable form in terms of interface and which is unlikely to contain serious bugs. It may be missing some functionality found in similar packages. It has not benefited from a high level of use\nmedium: indicates the package is fairly mature but may benefit from optimizations and/or extra features; you can help by suggesting either,\nhigh: indicates the package is very mature and functionalities are expected to have been fairly optimiser and tested.","category":"page"},{"location":"list_of_supported_models/","page":"List of Supported Models","title":"List of Supported Models","text":"Package Interface Pkg Models Maturity Note\nBetaML.jl - DecisionTreeClassifier, RandomForestClassifier, NeuralNetworkClassifier, PerceptronClassifier, KernelPerceptronClassifier, PegasosClassifier, DecisionTreeRegressor, RandomForestRegressor, NeuralNetworkRegressor, MultitargetNeuralNetworkRegressor, GaussianMixtureRegressor, MultitargetGaussianMixtureRegressor, KMeansClusterer, KMedoidsClusterer, GaussianMixtureClusterer, SimpleImputer, GaussianMixtureImputer, RandomForestImputer, GeneralImputer, AutoEncoder medium \nCatBoost.jl - CatBoostRegressor, CatBoostClassifier high \nClustering.jl MLJClusteringInterface.jl KMeans, KMedoids, DBSCAN, HierarchicalClustering high² \nDecisionTree.jl MLJDecisionTreeInterface.jl DecisionTreeClassifier, DecisionTreeRegressor, AdaBoostStumpClassifier, RandomForestClassifier, RandomForestRegressor high \nEvoTrees.jl - EvoTreeRegressor, EvoTreeClassifier, EvoTreeCount, EvoTreeGaussian, EvoTreeMLE medium tree-based gradient boosting models\nEvoLinear.jl - EvoLinearRegressor medium linear boosting models\nGLM.jl MLJGLMInterface.jl LinearRegressor, LinearBinaryClassifier, LinearCountRegressor medium² \nImbalance.jl - RandomOversampler, RandomWalkOversampler, ROSE, SMOTE, BorderlineSMOTE1, SMOTEN, SMOTENC, RandomUndersampler, ClusterUndersampler, ENNUndersampler, TomekUndersampler, low \nLIBSVM.jl MLJLIBSVMInterface.jl LinearSVC, SVC, NuSVC, NuSVR, EpsilonSVR, OneClassSVM high also via ScikitLearn.jl\nLightGBM.jl - LGBMClassifier, LGBMRegressor high \nFlux.jl MLJFlux.jl NeuralNetworkRegressor, NeuralNetworkClassifier, MultitargetNeuralNetworkRegressor, ImageClassifier low \nMLJBalancing.jl - BalancedBaggingClassifier low \nMLJLinearModels.jl - LinearRegressor, RidgeRegressor, LassoRegressor, ElasticNetRegressor, QuantileRegressor, HuberRegressor, RobustRegressor, LADRegressor, LogisticClassifier, MultinomialClassifier medium \nMLJModels.jl (built-in) - ConstantClassifier, ConstantRegressor, ContinuousEncoder, DeterministicConstantClassifier, DeterministicConstantRegressor, FeatureSelector, FillImputer, InteractionTransformer, OneHotEncoder, Standardizer, UnivariateBoxCoxTransformer, UnivariateDiscretizer, UnivariateFillImputer, UnivariateTimeTypeToContinuous, Standardizer, BinaryThreshholdPredictor medium \nMLJText.jl - TfidfTransformer, BM25Transformer, CountTransformer low \nMultivariateStats.jl MLJMultivariateStatsInterface.jl LinearRegressor, MultitargetLinearRegressor, RidgeRegressor, MultitargetRidgeRegressor, PCA, KernelPCA, 
ICA, LDA, BayesianLDA, SubspaceLDA, BayesianSubspaceLDA, FactorAnalysis, PPCA high \nNaiveBayes.jl MLJNaiveBayesInterface.jl GaussianNBClassifier, MultinomialNBClassifier, HybridNBClassifier low \nNearestNeighborModels.jl - KNNClassifier, KNNRegressor, MultitargetKNNClassifier, MultitargetKNNRegressor high \nOneRule.jl - OneRuleClassifier experimental \nOutlierDetectionNeighbors.jl - ABODDetector, COFDetector, DNNDetector, KNNDetector, LOFDetector medium \nOutlierDetectionNetworks.jl - AEDetector, DSADDetector, ESADDetector medium \nOutlierDetectionPython.jl - ABODDetector, CBLOFDetector, CDDetector, COFDetector, COPODDetector, ECODDetector, GMMDetector, HBOSDetector, IForestDetector, INNEDetector, KDEDetector, KNNDetector, LMDDDetector, LOCIDetector, LODADetector, LOFDetector, MCDDetector, OCSVMDetector, PCADetector, RODDetector, SODDetector, SOSDetector high \nParallelKMeans.jl - KMeans experimental \nPartialLeastSquaresRegressor.jl - PLSRegressor, KPLSRegressor experimental \nPartitionedLS.jl - PartLS low \nScikitLearn.jl MLJScikitLearnInterface.jl ARDRegressor, AdaBoostClassifier, AdaBoostRegressor, AffinityPropagation, AgglomerativeClustering, BaggingClassifier, BaggingRegressor, BayesianLDA, BayesianQDA, BayesianRidgeRegressor, BernoulliNBClassifier, Birch, ComplementNBClassifier, DBSCAN, DummyClassifier, DummyRegressor, ElasticNetCVRegressor, ElasticNetRegressor, ExtraTreesClassifier, ExtraTreesRegressor, FeatureAgglomeration, GaussianNBClassifier, GaussianProcessClassifier, GaussianProcessRegressor, GradientBoostingClassifier, GradientBoostingRegressor, HuberRegressor, KMeans, KNeighborsClassifier, KNeighborsRegressor, LarsCVRegressor, LarsRegressor, LassoCVRegressor, LassoLarsCVRegressor, LassoLarsICRegressor, LassoLarsRegressor, LassoRegressor, LinearRegressor, LogisticCVClassifier, LogisticClassifier, MeanShift, MiniBatchKMeans, MultiTaskElasticNetCVRegressor, MultiTaskElasticNetRegressor, MultiTaskLassoCVRegressor, MultiTaskLassoRegressor, MultinomialNBClassifier, OPTICS, OrthogonalMatchingPursuitCVRegressor, OrthogonalMatchingPursuitRegressor, PassiveAggressiveClassifier, PassiveAggressiveRegressor, PerceptronClassifier, ProbabilisticSGDClassifier, RANSACRegressor, RandomForestClassifier, RandomForestRegressor, RidgeCVClassifier, RidgeCVRegressor, RidgeClassifier, RidgeRegressor, SGDClassifier, SGDRegressor, SVMClassifier, SVMLClassifier, SVMLRegressor, SVMNuClassifier, SVMNuRegressor, SVMRegressor, SpectralClustering, TheilSenRegressor high² \nSIRUS.jl - StableForestClassifier, StableForestRegressor, StableRulesClassifier, StableRulesRegressor low \nSymbolicRegression.jl - MultitargetSRRegressor, SRRegressor experimental \nTSVD.jl MLJTSVDInterface.jl TSVDTransformer high \nXGBoost.jl MLJXGBoostInterface.jl XGBoostRegressor, XGBoostClassifier, XGBoostCount high ","category":"page"},{"location":"list_of_supported_models/","page":"List of Supported Models","title":"List of Supported Models","text":"Notes ","category":"page"},{"location":"list_of_supported_models/","page":"List of Supported Models","title":"List of Supported Models","text":"¹Models not in the MLJ registry are not included in integration tests. Consult package documentation to see how to load them. There may be issues loading these models simultaneously with other registered models.","category":"page"},{"location":"list_of_supported_models/","page":"List of Supported Models","title":"List of Supported Models","text":"²Some models are missing and assistance is welcome to complete the interface. 
Post a message on the Julia #mlj Slack channel if you would like to help, thanks!","category":"page"},{"location":"models/GaussianProcessClassifier_MLJScikitLearnInterface/#GaussianProcessClassifier_MLJScikitLearnInterface","page":"GaussianProcessClassifier","title":"GaussianProcessClassifier","text":"","category":"section"},{"location":"models/GaussianProcessClassifier_MLJScikitLearnInterface/","page":"GaussianProcessClassifier","title":"GaussianProcessClassifier","text":"GaussianProcessClassifier","category":"page"},{"location":"models/GaussianProcessClassifier_MLJScikitLearnInterface/","page":"GaussianProcessClassifier","title":"GaussianProcessClassifier","text":"A model type for constructing a Gaussian process classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/GaussianProcessClassifier_MLJScikitLearnInterface/","page":"GaussianProcessClassifier","title":"GaussianProcessClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/GaussianProcessClassifier_MLJScikitLearnInterface/","page":"GaussianProcessClassifier","title":"GaussianProcessClassifier","text":"GaussianProcessClassifier = @load GaussianProcessClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/GaussianProcessClassifier_MLJScikitLearnInterface/","page":"GaussianProcessClassifier","title":"GaussianProcessClassifier","text":"Do model = GaussianProcessClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in GaussianProcessClassifier(kernel=...).","category":"page"},{"location":"models/GaussianProcessClassifier_MLJScikitLearnInterface/#Hyper-parameters","page":"GaussianProcessClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/GaussianProcessClassifier_MLJScikitLearnInterface/","page":"GaussianProcessClassifier","title":"GaussianProcessClassifier","text":"kernel = nothing\noptimizer = fmin_l_bfgs_b\nn_restarts_optimizer = 0\ncopy_X_train = true\nrandom_state = nothing\nmax_iter_predict = 100\nwarm_start = false\nmulti_class = one_vs_rest","category":"page"},{"location":"models/SpectralClustering_MLJScikitLearnInterface/#SpectralClustering_MLJScikitLearnInterface","page":"SpectralClustering","title":"SpectralClustering","text":"","category":"section"},{"location":"models/SpectralClustering_MLJScikitLearnInterface/","page":"SpectralClustering","title":"SpectralClustering","text":"SpectralClustering","category":"page"},{"location":"models/SpectralClustering_MLJScikitLearnInterface/","page":"SpectralClustering","title":"SpectralClustering","text":"A model type for constructing a spectral clustering, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/SpectralClustering_MLJScikitLearnInterface/","page":"SpectralClustering","title":"SpectralClustering","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/SpectralClustering_MLJScikitLearnInterface/","page":"SpectralClustering","title":"SpectralClustering","text":"SpectralClustering = @load SpectralClustering pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/SpectralClustering_MLJScikitLearnInterface/","page":"SpectralClustering","title":"SpectralClustering","text":"Do model = SpectralClustering() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in SpectralClustering(n_clusters=...).","category":"page"},{"location":"models/SpectralClustering_MLJScikitLearnInterface/","page":"SpectralClustering","title":"SpectralClustering","text":"Apply clustering to a projection of the normalized Laplacian. In practice spectral clustering is very useful when the structure of the individual clusters is highly non-convex or more generally when a measure of the center and spread of the cluster is not a suitable description of the complete cluster. For instance when clusters are nested circles on the 2D plane.","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/#ElasticNetRegressor_MLJLinearModels","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"","category":"section"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"ElasticNetRegressor","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"A model type for constructing a elastic net regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"ElasticNetRegressor = @load ElasticNetRegressor pkg=MLJLinearModels","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"Do model = ElasticNetRegressor() to construct an instance with default hyper-parameters.","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"Elastic net is a linear model with objective function","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"$","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"|Xθ - y|₂²/2 + n⋅λ|θ|₂²/2 + n⋅γ|θ|₁ $","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"where n is the number of observations.","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"If scale_penalty_with_samples = false the objective function is instead","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"$","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"|Xθ - y|₂²/2 + λ|θ|₂²/2 + γ|θ|₁ $","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":".","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"Different solver options exist, as indicated under \"Hyperparameters\" below. 
","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/#Training-data","page":"ElasticNetRegressor","title":"Training data","text":"","category":"section"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"where:","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)\ny is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/#Hyperparameters","page":"ElasticNetRegressor","title":"Hyperparameters","text":"","category":"section"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"lambda::Real: strength of the L2 regularization. Default: 1.0\ngamma::Real: strength of the L1 regularization. Default: 0.0\nfit_intercept::Bool: whether to fit the intercept or not. Default: true\npenalize_intercept::Bool: whether to penalize the intercept. Default: false\nscale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true\nsolver::Union{Nothing, MLJLinearModels.Solver}: any instance of MLJLinearModels.ProxGrad.\nIf solver=nothing (default) then ProxGrad(accel=true) (FISTA) is used.\nSolver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...). 
Default: nothing","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/#Example","page":"ElasticNetRegressor","title":"Example","text":"","category":"section"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"using MLJ\nX, y = make_regression()\nmach = fit!(machine(ElasticNetRegressor(), X, y))\npredict(mach, X)\nfitted_params(mach)","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"See also LassoRegressor.","category":"page"},{"location":"models/KMeans_Clustering/#KMeans_Clustering","page":"KMeans","title":"KMeans","text":"","category":"section"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"KMeans","category":"page"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"A model type for constructing a K-means clusterer, based on Clustering.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"KMeans = @load KMeans pkg=Clustering","category":"page"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"Do model = KMeans() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KMeans(k=...).","category":"page"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"K-means is a classical method for clustering or vector quantization. It produces a fixed number of clusters, each associated with a center (also known as a prototype), and each data point is assigned to a cluster with the nearest center.","category":"page"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"From a mathematical standpoint, K-means is a coordinate descent algorithm that solves the following optimization problem:","category":"page"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":":$","category":"page"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"\\text{minimize} \\ \\sum_{i=1}^n \\| \\mathbf{x}_i - \\boldsymbol{\\mu}_{z_i} \\|^2 \\ \\text{w.r.t.} \\ (\\boldsymbol{\\mu}, z) :$","category":"page"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"Here, \\boldsymbol{\\mu}_k is the center of the k-th cluster, and z_i is the index of the cluster for the i-th point \\mathbf{x}_i.","category":"page"},{"location":"models/KMeans_Clustering/#Training-data","page":"KMeans","title":"Training data","text":"","category":"section"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"mach = machine(model, X)","category":"page"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"Here:","category":"page"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with 
schema(X).","category":"page"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/KMeans_Clustering/#Hyper-parameters","page":"KMeans","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"k=3: The number of centroids to use in clustering.\nmetric::SemiMetric=Distances.SqEuclidean: The metric used to calculate the clustering. Must have type PreMetric from Distances.jl.\ninit = :kmpp: One of the following options to indicate how cluster seeds should be initialized:\n:kmpp: KMeans++\n:kmenc: K-medoids initialization based on centrality\n:rand: random\nan instance of Clustering.SeedingAlgorithm from Clustering.jl\nan integer vector of length k that provides the indices of points to use as initial cluster centers.\nSee documentation of Clustering.jl.","category":"page"},{"location":"models/KMeans_Clustering/#Operations","page":"KMeans","title":"Operations","text":"","category":"section"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"predict(mach, Xnew): return cluster label assignments, given new features Xnew having the same Scitype as X above.\ntransform(mach, Xnew): instead return the mean pairwise distances from new samples to the cluster centers.","category":"page"},{"location":"models/KMeans_Clustering/#Fitted-parameters","page":"KMeans","title":"Fitted parameters","text":"","category":"section"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"centers: The coordinates of the cluster centers.","category":"page"},{"location":"models/KMeans_Clustering/#Report","page":"KMeans","title":"Report","text":"","category":"section"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"The fields of report(mach) are:","category":"page"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"assignments: The cluster assignments of each point in the training data.\ncluster_labels: The labels assigned to each cluster.","category":"page"},{"location":"models/KMeans_Clustering/#Examples","page":"KMeans","title":"Examples","text":"","category":"section"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"using MLJ\nKMeans = @load KMeans pkg=Clustering\n\ntable = load_iris()\ny, X = unpack(table, ==(:target), rng=123)\nmodel = KMeans(k=3)\nmach = machine(model, X) |> fit!\n\nyhat = predict(mach, X)\n@assert yhat == report(mach).assignments\n\ncompare = zip(yhat, y) |> collect;\ncompare[1:8] ## clusters align with classes\n\ncenter_dists = transform(mach, fitted_params(mach).centers')\n\n@assert center_dists[1][1] == 0.0\n@assert center_dists[2][2] == 0.0\n@assert center_dists[3][3] == 0.0","category":"page"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"See also 
KMedoids","category":"page"},{"location":"models/PassiveAggressiveClassifier_MLJScikitLearnInterface/#PassiveAggressiveClassifier_MLJScikitLearnInterface","page":"PassiveAggressiveClassifier","title":"PassiveAggressiveClassifier","text":"","category":"section"},{"location":"models/PassiveAggressiveClassifier_MLJScikitLearnInterface/","page":"PassiveAggressiveClassifier","title":"PassiveAggressiveClassifier","text":"PassiveAggressiveClassifier","category":"page"},{"location":"models/PassiveAggressiveClassifier_MLJScikitLearnInterface/","page":"PassiveAggressiveClassifier","title":"PassiveAggressiveClassifier","text":"A model type for constructing a passive aggressive classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/PassiveAggressiveClassifier_MLJScikitLearnInterface/","page":"PassiveAggressiveClassifier","title":"PassiveAggressiveClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/PassiveAggressiveClassifier_MLJScikitLearnInterface/","page":"PassiveAggressiveClassifier","title":"PassiveAggressiveClassifier","text":"PassiveAggressiveClassifier = @load PassiveAggressiveClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/PassiveAggressiveClassifier_MLJScikitLearnInterface/","page":"PassiveAggressiveClassifier","title":"PassiveAggressiveClassifier","text":"Do model = PassiveAggressiveClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in PassiveAggressiveClassifier(C=...).","category":"page"},{"location":"models/PassiveAggressiveClassifier_MLJScikitLearnInterface/#Hyper-parameters","page":"PassiveAggressiveClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/PassiveAggressiveClassifier_MLJScikitLearnInterface/","page":"PassiveAggressiveClassifier","title":"PassiveAggressiveClassifier","text":"C = 1.0\nfit_intercept = true\nmax_iter = 100\ntol = 0.001\nearly_stopping = false\nvalidation_fraction = 0.1\nn_iter_no_change = 5\nshuffle = true\nverbose = 0\nloss = hinge\nn_jobs = nothing\nrandom_state = 0\nwarm_start = false\nclass_weight = nothing\naverage = false","category":"page"},{"location":"tuning_models/#Tuning-Models","page":"Tuning Models","title":"Tuning Models","text":"","category":"section"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"MLJ provides several built-in and third-party options for optimizing a model's hyper-parameters. The quick-reference table below omits some advanced keyword options.","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"tuning strategy notes package to import package providing the core algorithm\nGrid(goal=nothing, resolution=10) shuffled by default; goal is upper bound for number of grid points MLJ.jl or MLJTuning.jl MLJTuning.jl\nRandomSearch(rng=GLOBAL_RNG) with customizable priors MLJ.jl or MLJTuning.jl MLJTuning.jl\nLatinHypercube(rng=GLOBAL_RNG) with discrete parameter support MLJ.jl or MLJTuning.jl LatinHypercubeSampling\nMLJTreeParzenTuning() See this example for usage TreeParzen.jl TreeParzen.jl (port to Julia of hyperopt)\nParticleSwarm(n_particles=3, rng=GLOBAL_RNG) Standard Kennedy-Eberhart algorithm, plus discrete parameter support MLJParticleSwarmOptimization.jl MLJParticleSwarmOptimization.jl\nAdaptiveParticleSwarm(n_particles=3, rng=GLOBAL_RNG) Zhan et al. 
variant with automated swarm coefficient updates, plus discrete parameter support MLJParticleSwarmOptimization.jl MLJParticleSwarmOptimization.jl\nExplicit() For an explicit list of models of varying type MLJ.jl or MLJTuning.jl MLJTuning.jl","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Below we illustrate hyperparameter optimization using the Grid, RandomSearch, LatinHypercube and Explicit tuning strategies.","category":"page"},{"location":"tuning_models/#Overview","page":"Tuning Models","title":"Overview","text":"","category":"section"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"In MLJ model tuning is implemented as a model wrapper. After wrapping a model in a tuning strategy and binding the wrapped model to data in a machine called mach, calling fit!(mach) instigates a search for optimal model hyperparameters, within a specified range, and then uses all supplied data to train the best model. To predict using that model, one then calls predict(mach, Xnew). In this way, the wrapped model may be viewed as a \"self-tuning\" version of the unwrapped model. That is, wrapping the model simply transforms certain hyper-parameters into learned parameters.","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"A corollary of the tuning-as-wrapper approach is that the evaluation of the performance of a TunedModel instance using evaluate! implies nested resampling. This approach is inspired by MLR. See also below.","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"In MLJ, tuning is an iterative procedure, with an iteration parameter n, the total number of model instances to be evaluated. Accordingly, tuning can be controlled using MLJ's IteratedModel wrapper. After familiarizing oneself with the TunedModel wrapper described below, see Controlling model tuning for more on this advanced feature.","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"For a more in-depth overview of tuning in MLJ, or for implementation details, see the MLJTuning documentation. For a complete list of options see the TunedModel doc-string below.","category":"page"},{"location":"tuning_models/#Tuning-a-single-hyperparameter-using-a-grid-search-(regression-example)","page":"Tuning Models","title":"Tuning a single hyperparameter using a grid search (regression example)","text":"","category":"section"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"using MLJ\nX = MLJ.table(rand(100, 10));\ny = 2X.x1 - X.x2 + 0.05*rand(100);\nTree = @load DecisionTreeRegressor pkg=DecisionTree verbosity=0;\ntree = Tree()","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Let's tune min_purity_increase in the model above, using a grid-search. 
To do so we will use the simplest range object, a one-dimensional range object constructed using the range method:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"r = range(tree, :min_purity_increase, lower=0.001, upper=1.0, scale=:log);\nself_tuning_tree = TunedModel(\n model=tree,\n resampling=CV(nfolds=3),\n tuning=Grid(resolution=10),\n range=r,\n measure=rms\n);","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Incidentally, a grid is generated internally \"over the range\" by calling the iterator method with an appropriate resolution:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"iterator(r, 5)","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Non-numeric hyperparameters are handled a little differently:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"selector = FeatureSelector();\nr2 = range(selector, :features, values = [[:x1,], [:x1, :x2]]);\niterator(r2)","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Unbounded ranges are also permitted. See the range and iterator docstrings below for details, and the sampler docstring for generating random samples from one-dimensional ranges (used internally by the RandomSearch strategy).","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Returning to the wrapped tree model:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"mach = machine(self_tuning_tree, X, y);\nfit!(mach, verbosity=0)","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"We can inspect the detailed results of the grid search with report(mach) or just retrieve the optimal model, as here:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"fitted_params(mach).best_model","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"For more detailed information, we can look at report(mach), for example:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"entry = report(mach).best_history_entry","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Predicting on new input observations using the optimal model, trained on all the data bound to mach:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Xnew = MLJ.table(rand(3, 10));\npredict(mach, Xnew)","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Or predicting on some subset of the observations bound to mach:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"test = 1:3\npredict(mach, rows=test)","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"For tuning using only a subset train of all observation indices, specify rows=train in the above fit! call. 
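","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"For example, here is a minimal sketch; the 70/30 split and the rng value are hypothetical choices, not part of the example above:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"train, test = partition(eachindex(y), 0.7, shuffle=true, rng=123); ## hypothetical holdout split of the row indices\nmach = machine(self_tuning_tree, X, y);\nfit!(mach, rows=train, verbosity=0)  ## the search and final retraining of the best model use only the train rows\npredict(mach, rows=test)             ## predictions from the best model, as trained on the train rows only","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"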
In that case, the above predict calls would be based on training the optimal model on all train rows.","category":"page"},{"location":"tuning_models/#A-probabilistic-classifier-example","page":"Tuning Models","title":"A probabilistic classifier example","text":"","category":"section"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Tuning a classifier is not essentially different from tuning a regressor. A common gotcha however is to overlook the distinction between supervised models that make point predictions (subtypes of Deterministic) and those that make probabilistic predictions (subtypes of Probabilistic). The DecisionTreeRegressor model in the preceding illustration was deterministic, so this example will consider a probabilistic classifier:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"info(\"KNNClassifier\").prediction_type","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"X, y = @load_iris\nKNN = @load KNNClassifier verbosity=0\nknn = KNN()","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"We'll tune the hyperparameter K in the model above, using a grid-search once more:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"K_range = range(knn, :K, lower=5, upper=20);","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Since the model is probabilistic, we can choose either: (i) a probabilistic measure, such as brier_loss; or (ii) use a deterministic measure, such as misclassification_rate (which means predict_mean is called instead of predict under the hood).","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Case (i) - probabilistic measure:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"self_tuning_knn = TunedModel(\n model=knn,\n resampling = CV(nfolds=4, rng=1234),\n tuning = Grid(resolution=5),\n range = K_range,\n measure = BrierLoss()\n);\n\nmach = machine(self_tuning_knn, X, y);\nfit!(mach, verbosity=0);","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Case (ii) - deterministic measure:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"self_tuning_knn = TunedModel(\n model=knn,\n resampling = CV(nfolds=4, rng=1234),\n tuning = Grid(resolution=5),\n range = K_range,\n measure = MisclassificationRate()\n)\n\nmach = machine(self_tuning_knn, X, y);\nfit!(mach, verbosity=0);","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Let's inspect the best model and corresponding evaluation of the metric in case (ii):","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"entry = report(mach).best_history_entry","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"entry.model.K","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Recall that fitting mach also retrains the optimal model on all available data. 
The following is therefore an optimal model prediction based on all available data:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"predict(mach, rows=148:150)","category":"page"},{"location":"tuning_models/#Specifying-a-custom-measure","page":"Tuning Models","title":"Specifying a custom measure","text":"","category":"section"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Users may specify a custom loss or scoring function, so long as it complies with the StatisticalMeasuresBase.jl API and implements the appropriate orientation trait (Score() or Loss()) from that package. For example, we suppose define a \"new\" scoring function custom_accuracy by","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"custom_accuracy(yhat, y) = mean(y .== yhat); # yhat - prediction, y - ground truth","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"In tuning, scores are maximised, while losses are minimised. So here we declare","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"import StatisticalMeasuresBase as SMB\nSMB.orientation(::typeof(custom_accuracy)) = SMB.Score()","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"For full details on constructing custom measures, see StatisticalMeasuresBase.jl.","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"self_tuning_knn = TunedModel(\n model=knn,\n resampling = CV(nfolds=4),\n tuning = Grid(resolution=5),\n range = K_range,\n measure = [custom_accuracy, MulticlassFScore()],\n operation = predict_mode\n);\n\nmach = machine(self_tuning_knn, X, y)\nfit!(mach, verbosity=0)\nentry = report(mach).best_history_entry","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"entry.model.K","category":"page"},{"location":"tuning_models/#Tuning-multiple-nested-hyperparameters","page":"Tuning Models","title":"Tuning multiple nested hyperparameters","text":"","category":"section"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"The forest model below has another model, namely a DecisionTreeRegressor, as a hyperparameter:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"tree = Tree() # defined above\nforest = EnsembleModel(model=tree)","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Ranges for nested hyperparameters are specified using dot syntax. 
In this case, we will specify a goal for the total number of grid points:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"r1 = range(forest, :(model.n_subfeatures), lower=1, upper=9);\nr2 = range(forest, :bagging_fraction, lower=0.4, upper=1.0);\nself_tuning_forest = TunedModel(\n model=forest,\n tuning=Grid(goal=30),\n resampling=CV(nfolds=6),\n range=[r1, r2],\n measure=rms);\n\nX = MLJ.table(rand(100, 10));\ny = 2X.x1 - X.x2 + 0.05*rand(100);\n\nmach = machine(self_tuning_forest, X, y);\nfit!(mach, verbosity=0);","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"We can plot the grid search results:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"using Plots\nplot(mach)","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"(Image: )","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Instead of specifying a goal, we can declare a global resolution, which is overridden for a particular parameter by pairing its range with the resolution desired. In the next example, the default resolution=100 is applied to the r2 field, but a resolution of 3 is applied to the r1 field. Additionally, we ask that the grid points be randomly traversed and the total number of evaluations be limited to 25.","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"tuning = Grid(resolution=100, shuffle=true, rng=1234)\nself_tuning_forest = TunedModel(\n model=forest,\n tuning=tuning,\n resampling=CV(nfolds=6),\n range=[(r1, 3), r2],\n measure=rms,\n n=25\n);\nfit!(machine(self_tuning_forest, X, y), verbosity=0);","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"For more options for a grid search, see Grid below.","category":"page"},{"location":"tuning_models/#Tuning-using-a-random-search","page":"Tuning Models","title":"Tuning using a random search","text":"","category":"section"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Let's attempt to tune the same hyperparameters using a RandomSearch tuning strategy. By default, bounded numeric ranges like r1 and r2 are sampled uniformly (before rounding, in the case of the integer range r1). Positive unbounded ranges are sampled using a Gamma distribution by default, and all others using a (truncated) normal distribution.","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"self_tuning_forest = TunedModel(\n model=forest,\n tuning=RandomSearch(),\n resampling=CV(nfolds=6),\n range=[r1, r2],\n measure=rms,\n n=25\n);\nX = MLJ.table(rand(100, 10));\ny = 2X.x1 - X.x2 + 0.05*rand(100);\nmach = machine(self_tuning_forest, X, y);\nfit!(mach, verbosity=0)","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"using Plots\nplot(mach)","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"(Image: )","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"The prior distributions used for sampling each hyperparameter can be customized, as can the global fallbacks. 
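","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"For example, here is a minimal sketch of overriding the prior for the r2 range defined above; the particular Uniform prior and the rng value are illustrative assumptions only:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"using Distributions\nself_tuning_forest = TunedModel(\n model=forest,\n tuning=RandomSearch(rng=123),\n resampling=CV(nfolds=6),\n range=[r1, (r2, Uniform(0.4, 1.0))], ## pair the range r2 with an explicit prior\n measure=rms,\n n=25\n);\nfit!(machine(self_tuning_forest, X, y), verbosity=0);","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"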
See the RandomSearch doc-string below for details.","category":"page"},{"location":"tuning_models/#Tuning-using-Latin-hypercube-sampling","page":"Tuning Models","title":"Tuning using Latin hypercube sampling","text":"","category":"section"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"One can also tune the hyperparameters using the LatinHypercube tuning strategy. This method uses a genetic-based optimization algorithm based on the inverse of the Audze-Eglais function, using the library LatinHypercubeSampling.jl.","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"We'll work with the data X, y and ranges r1 and r2 defined above and instantiate a Latin hypercube resampling strategy:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"latin = LatinHypercube(gens=2, popsize=120)","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Here gens is the number of generations to run the optimisation for and popsize is the population size in the genetic algorithm. For more on these and other LatinHypercube parameters refer to the LatinHypercubeSampling.jl documentation. Pay attention that gens and popsize are not to be confused with the iteration parameter n in the construction of a corresponding TunedModel instance, which specifies the total number of models to be evaluated, independent of the tuning strategy.","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"For this illustration we'll add a third, nominal, hyper-parameter:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"r3 = range(forest, :(model.post_prune), values=[true, false]);\nself_tuning_forest = TunedModel(\n model=forest,\n tuning=latin,\n resampling=CV(nfolds=6),\n range=[r1, r2, r3],\n measure=rms,\n n=25\n);\nmach = machine(self_tuning_forest, X, y);\nfit!(mach, verbosity=0)","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"using Plots\nplot(mach)","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"(Image: )","category":"page"},{"location":"tuning_models/#explicit","page":"Tuning Models","title":"Comparing models of different type and nested cross-validation","text":"","category":"section"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Instead of mutating hyperparameters of a fixed model, one can instead optimise over an explicit list of models, whose types are allowed to vary. 
As with other tuning strategies, evaluating the resulting TunedModel itself implies nested resampling (e.g., nested cross-validation) which we now examine in a bit more detail.","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"tree = (@load DecisionTreeClassifier pkg=DecisionTree verbosity=0)()\nknn = (@load KNNClassifier pkg=NearestNeighborModels verbosity=0)()\nmodels = [tree, knn]\nnothing # hide","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"The following model is equivalent to the best in models by using 3-fold cross-validation:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"multi_model = TunedModel(\n models=models,\n resampling=CV(nfolds=3),\n measure=log_loss,\n check_measure=false\n)\nnothing # hide","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Note that there is no need to specify a tuning strategy or range but we do specify models (plural) instead of model. Evaluating multi_model implies nested cross-validation (each model gets evaluated 2 x 3 times):","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"X, y = make_blobs()\n\ne = evaluate(multi_model, X, y, resampling=CV(nfolds=2), measure=log_loss, verbosity=6)","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Now, for example, we can get the best model for the first fold out of the two folds:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"e.report_per_fold[1].best_model","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"And the losses in the outer loop (these still have to be matched to the best performing model):","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"e.per_fold","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"It is also possible to get the results for the nested evaluations. For example, for the first fold of the outer loop and the second model:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"e.report_per_fold[2].history[1]","category":"page"},{"location":"tuning_models/#Reference","page":"Tuning Models","title":"Reference","text":"","category":"section"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"MLJBase.range\nMLJBase.iterator\nMLJBase.sampler\nDistributions.fit(::Type{D}, ::MLJBase.NumericRange) where D<:Distributions.Distribution\nMLJTuning.TunedModel\nMLJTuning.Grid\nMLJTuning.RandomSearch\nMLJTuning.LatinHypercube","category":"page"},{"location":"tuning_models/#Base.range","page":"Tuning Models","title":"Base.range","text":"r = range(model, :hyper; values=nothing)\n\nDefine a one-dimensional NominalRange object for a field hyper of model. Note that r is not directly iterable but iterator(r) is.\n\nA nested hyperparameter is specified using dot notation. For example, :(atom.max_depth) specifies the max_depth hyperparameter of the submodel model.atom.\n\nr = range(model, :hyper; upper=nothing, lower=nothing,\n scale=nothing, values=nothing)\n\nAssuming values is not specified, define a one-dimensional NumericRange object for a Real field hyper of model. 
Note that r is not directly iteratable but iterator(r, n)is an iterator of length n. To generate random elements from r, instead apply rand methods to sampler(r). The supported scales are :linear,:log, :logminus, :log10, :log10minus, :log2, or a callable object.\n\nNote that r is not directly iterable, but iterator(r, n) is, for given resolution (length) n.\n\nBy default, the behaviour of the constructed object depends on the type of the value of the hyperparameter :hyper at model at the time of construction. To override this behaviour (for instance if model is not available) specify a type in place of model so the behaviour is determined by the value of the specified type.\n\nA nested hyperparameter is specified using dot notation (see above).\n\nIf scale is unspecified, it is set to :linear, :log, :log10minus, or :linear, according to whether the interval (lower, upper) is bounded, right-unbounded, left-unbounded, or doubly unbounded, respectively. Note upper=Inf and lower=-Inf are allowed.\n\nIf values is specified, the other keyword arguments are ignored and a NominalRange object is returned (see above).\n\nSee also: iterator, sampler\n\n\n\n\n\n","category":"function"},{"location":"tuning_models/#MLJBase.iterator","page":"Tuning Models","title":"MLJBase.iterator","text":"iterator([rng, ], r::NominalRange, [,n])\niterator([rng, ], r::NumericRange, n)\n\nReturn an iterator (currently a vector) for a ParamRange object r. In the first case iteration is over all values stored in the range (or just the first n, if n is specified). In the second case, the iteration is over approximately n ordered values, generated as follows:\n\n(i) First, exactly n values are generated between U and L, with a spacing determined by r.scale (uniform if scale=:linear) where U and L are given by the following table:\n\nr.lower r.upper L U\nfinite finite r.lower r.upper\n-Inf finite r.upper - 2r.unit r.upper\nfinite Inf r.lower r.lower + 2r.unit\n-Inf Inf r.origin - r.unit r.origin + r.unit\n\n(ii) If a callable f is provided as scale, then a uniform spacing is always applied in (i) but f is broadcast over the results. (Unlike ordinary scales, this alters the effective range of values generated, instead of just altering the spacing.)\n\n(iii) If r is a discrete numeric range (r isa NumericRange{<:Integer}) then the values are additionally rounded, with any duplicate values removed. Otherwise all the values are used (and there are exacltly n of them).\n\n(iv) Finally, if a random number generator rng is specified, then the values are returned in random order (sampling without replacement), and otherwise they are returned in numeric order, or in the order provided to the range constructor, in the case of a NominalRange.\n\n\n\n\n\n","category":"function"},{"location":"tuning_models/#Distributions.sampler","page":"Tuning Models","title":"Distributions.sampler","text":"sampler(r::NominalRange, probs::AbstractVector{<:Real})\nsampler(r::NominalRange)\nsampler(r::NumericRange{T}, d)\n\nConstruct an object s which can be used to generate random samples from a ParamRange object r (a one-dimensional range) using one of the following calls:\n\nrand(s) # for one sample\nrand(s, n) # for n samples\nrand(rng, s [, n]) # to specify an RNG\n\nThe argument probs can be any probability vector with the same length as r.values. 
The second sampler method above calls the first with a uniform probs vector.\n\nThe argument d can be either an arbitrary instance of UnivariateDistribution from the Distributions.jl package, or one of a Distributions.jl types for which fit(d, ::NumericRange) is defined. These include: Arcsine, Uniform, Biweight, Cosine, Epanechnikov, SymTriangularDist, Triweight, Normal, Gamma, InverseGaussian, Logistic, LogNormal, Cauchy, Gumbel, Laplace, and Poisson; but see the doc-string for Distributions.fit for an up-to-date list.\n\nIf d is an instance, then sampling is from a truncated form of the supplied distribution d, the truncation bounds being r.lower and r.upper (the attributes r.origin and r.unit attributes are ignored). For discrete numeric ranges (T <: Integer) the samples are rounded.\n\nIf d is a type then a suitably truncated distribution is automatically generated using Distributions.fit(d, r).\n\nImportant. Values are generated with no regard to r.scale, except in the special case r.scale is a callable object f. In that case, f is applied to all values generated by rand as described above (prior to rounding, in the case of discrete numeric ranges).\n\nExamples\n\njulia> r = range(Char, :letter, values=collect(\"abc\"))\njulia> s = sampler(r, [0.1, 0.2, 0.7])\njulia> samples = rand(s, 1000);\njulia> StatsBase.countmap(samples)\nDict{Char,Int64} with 3 entries:\n 'a' => 107\n 'b' => 205\n 'c' => 688\n\njulia> r = range(Int, :k, lower=2, upper=6) # numeric but discrete\njulia> s = sampler(r, Normal)\njulia> samples = rand(s, 1000);\njulia> UnicodePlots.histogram(samples)\n ┌ ┐\n[2.0, 2.5) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 119\n[2.5, 3.0) ┤ 0\n[3.0, 3.5) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 296\n[3.5, 4.0) ┤ 0\n[4.0, 4.5) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 275\n[4.5, 5.0) ┤ 0\n[5.0, 5.5) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 221\n[5.5, 6.0) ┤ 0\n[6.0, 6.5) ┤▇▇▇▇▇▇▇▇▇▇▇ 89\n └ ┘\n\n\n\n\n\n","category":"function"},{"location":"tuning_models/#StatsAPI.fit-Union{Tuple{D}, Tuple{Type{D}, NumericRange}} where D<:Distributions.Distribution","page":"Tuning Models","title":"StatsAPI.fit","text":"Distributions.fit(D, r::MLJBase.NumericRange)\n\nFit and return a distribution d of type D to the one-dimensional range r.\n\nOnly types D in the table below are supported.\n\nThe distribution d is constructed in two stages. First, a distributon d0, characterized by the conditions in the second column of the table, is fit to r. 
Then d0 is truncated between r.lower and r.upper to obtain d.\n\nDistribution type D Characterization of d0\nArcsine, Uniform, Biweight, Cosine, Epanechnikov, SymTriangularDist, Triweight minimum(d) = r.lower, maximum(d) = r.upper\nNormal, Gamma, InverseGaussian, Logistic, LogNormal mean(d) = r.origin, std(d) = r.unit\nCauchy, Gumbel, Laplace, (Normal) Dist.location(d) = r.origin, Dist.scale(d) = r.unit\nPoisson Dist.mean(d) = r.unit\n\nHere Dist = Distributions.\n\n\n\n\n\n","category":"method"},{"location":"tuning_models/#MLJTuning.TunedModel","page":"Tuning Models","title":"MLJTuning.TunedModel","text":"tuned_model = TunedModel(; model=,\n tuning=RandomSearch(),\n resampling=Holdout(),\n range=nothing,\n measure=nothing,\n n=default_n(tuning, range),\n operation=nothing,\n other_options...)\n\nConstruct a model wrapper for hyper-parameter optimization of a supervised learner, specifying the tuning strategy and model whose hyper-parameters are to be mutated.\n\ntuned_model = TunedModel(; models=,\n resampling=Holdout(),\n measure=nothing,\n n=length(models),\n operation=nothing,\n other_options...)\n\nConstruct a wrapper for multiple models, for selection of an optimal one (equivalent to specifying tuning=Explicit() and range=models above). Elements of the iterator models need not have a common type, but they must all be Deterministic or all be Probabilistic and this is not checked but inferred from the first element generated.\n\nSee below for a complete list of options.\n\nTraining\n\nCalling fit!(mach) on a machine mach=machine(tuned_model, X, y) or mach=machine(tuned_model, X, y, w) will:\n\nInstigate a search, over clones of model, with the hyperparameter mutations specified by range, for a model optimizing the specified measure, using performance evaluations carried out using the specified tuning strategy and resampling strategy. In the case models is explictly listed, the search is instead over the models generated by the iterator models.\nFit an internal machine, based on the optimal model fitted_params(mach).best_model, wrapping the optimal model object in all the provided data X, y(, w). Calling predict(mach, Xnew) then returns predictions on Xnew of this internal machine. The final train can be supressed by setting train_best=false.\n\nSearch space\n\nThe range objects supported depend on the tuning strategy specified. Query the strategy docstring for details. To optimize over an explicit list v of models of the same type, use strategy=Explicit() and specify model=v[1] and range=v.\n\nThe number of models searched is specified by n. If unspecified, then MLJTuning.default_n(tuning, range) is used. When n is increased and fit!(mach) called again, the old search history is re-instated and the search continues where it left off.\n\nMeasures (metrics)\n\nIf more than one measure is specified, then only the first is optimized (unless strategy is multi-objective) but the performance against every measure specified will be computed and reported in report(mach).best_performance and other relevant attributes of the generated report. Options exist to pass per-observation weights or class weights to measures; see below.\n\nImportant. If a custom measure, my_measure is used, and the measure is a score, rather than a loss, be sure to check that MLJ.orientation(my_measure) == :score to ensure maximization of the measure, rather than minimization. 
Override an incorrect value with MLJ.orientation(::typeof(my_measure)) = :score.\n\nAccessing the fitted parameters and other training (tuning) outcomes\n\nA Plots.jl plot of performance estimates is returned by plot(mach) or heatmap(mach).\n\nOnce a tuning machine mach has bee trained as above, then fitted_params(mach) has these keys/values:\n\nkey value\nbest_model optimal model instance\nbest_fitted_params learned parameters of the optimal model\n\nThe named tuple report(mach) includes these keys/values:\n\nkey value\nbest_model optimal model instance\nbest_history_entry corresponding entry in the history, including performance estimate\nbest_report report generated by fitting the optimal model to all data\nhistory tuning strategy-specific history of all evaluations\n\nplus other key/value pairs specific to the tuning strategy.\n\nEach element of history is a property-accessible object with these properties:\n\nkey value\nmeasure vector of measures (metrics)\nmeasurement vector of measurements, one per measure\nper_fold vector of vectors of unaggregated per-fold measurements\nevaluation full PerformanceEvaluation/CompactPerformaceEvaluation object\n\nComplete list of key-word options\n\nmodel: Supervised model prototype that is cloned and mutated to generate models for evaluation\nmodels: Alternatively, an iterator of MLJ models to be explicitly evaluated. These may have varying types.\ntuning=RandomSearch(): tuning strategy to be applied (eg, Grid()). See the Tuning Models section of the MLJ manual for a complete list of options.\nresampling=Holdout(): resampling strategy (eg, Holdout(), CV()), StratifiedCV()) to be applied in performance evaluations\nmeasure: measure or measures to be applied in performance evaluations; only the first used in optimization (unless the strategy is multi-objective) but all reported to the history\nweights: per-observation weights to be passed the measure(s) in performance evaluations, where supported. Check support with supports_weights(measure).\nclass_weights: class weights to be passed the measure(s) in performance evaluations, where supported. Check support with supports_class_weights(measure).\nrepeats=1: for generating train/test sets multiple times in resampling (\"Monte Carlo\" resampling); see evaluate! for details\noperation/operations - One of predict, predict_mean, predict_mode, predict_median, or predict_joint, or a vector of these of the same length as measure/measures. Automatically inferred if left unspecified.\nrange: range object; tuning strategy documentation describes supported types\nselection_heuristic: the rule determining how the best model is decided. According to the default heuristic, NaiveSelection(), measure (or the first element of measure) is evaluated for each resample and these per-fold measurements are aggregrated. The model with the lowest (resp. highest) aggregate is chosen if the measure is a :loss (resp. a :score).\nn: number of iterations (ie, models to be evaluated); set by tuning strategy if left unspecified\ntrain_best=true: whether to train the optimal model\nacceleration=default_resource(): mode of parallelization for tuning strategies that support this\nacceleration_resampling=CPU1(): mode of parallelization for resampling\ncheck_measure=true: whether to check measure is compatible with the specified model and operation)\ncache=true: whether to cache model-specific representations of user-suplied data; set to false to conserve memory. 
Speed gains likely limited to the case resampling isa Holdout.\ncompact_history=true: whether to write CompactPerformanceEvaluation](@ref) or regular PerformanceEvaluation objects to the history (accessed via the :evaluation key); the compact form excludes some fields to conserve memory.\n\n\n\n\n\n","category":"function"},{"location":"tuning_models/#MLJTuning.Grid","page":"Tuning Models","title":"MLJTuning.Grid","text":"Grid(goal=nothing, resolution=10, rng=Random.GLOBAL_RNG, shuffle=true)\n\nInstantiate a Cartesian grid-based hyperparameter tuning strategy with a specified number of grid points as goal, or using a specified default resolution in each numeric dimension.\n\nSupported ranges:\n\nA single one-dimensional range or vector of one-dimensioinal ranges can be specified. Specifically, in Grid search, the range field of a TunedModel instance can be:\n\nA single one-dimensional range - ie, ParamRange object - r, or pair of the form (r, res) where res specifies a resolution to override the default resolution.\nAny vector of objects of the above form\n\nTwo elements of a range vector may share the same field attribute, with the effect that their grids are combined, as in Example 3 below.\n\nParamRange objects are constructed using the range method.\n\nExample 1:\n\nrange(model, :hyper1, lower=1, origin=2, unit=1)\n\nExample 2:\n\n[(range(model, :hyper1, lower=1, upper=10), 15),\n range(model, :hyper2, lower=2, upper=4),\n range(model, :hyper3, values=[:ball, :tree])]\n\nExample 3:\n\n# a range generating the grid `[1, 2, 10, 20, 30]` for `:hyper1`:\n[range(model, :hyper1, values=[1, 2]),\n (range(model, :hyper1, lower= 10, upper=30), 3)]\n\nNote: All the field values of the ParamRange objects (:hyper1, :hyper2, :hyper3 in the preceding example) must refer to field names a of single model (the model specified during TunedModel construction).\n\nAlgorithm\n\nThis is a standard grid search with the following specifics: In all cases all values of each specified NominalRange are exhausted. If goal is specified, then all resolutions are ignored, and a global resolution is applied to the NumericRange objects that maximizes the number of grid points, subject to the restriction that this not exceed goal. (This assumes no field appears twice in the range vector.) Otherwise the default resolution and any parameter-specific resolutions apply.\n\nIn all cases the models generated are shuffled using rng, unless shuffle=false.\n\nSee also TunedModel, range.\n\n\n\n\n\n","category":"type"},{"location":"tuning_models/#MLJTuning.RandomSearch","page":"Tuning Models","title":"MLJTuning.RandomSearch","text":"RandomSearch(bounded=Distributions.Uniform,\n positive_unbounded=Distributions.Gamma,\n other=Distributions.Normal,\n rng=Random.GLOBAL_RNG)\n\nInstantiate a random search tuning strategy, for searching over Cartesian hyperparameter domains, with customizable priors in each dimension.\n\nSupported ranges\n\nA single one-dimensional range or vector of one-dimensioinal ranges can be specified. If not paired with a prior, then one is fitted, according to fallback distribution types specified by the tuning strategy hyperparameters. 
Specifically, in RandomSearch, the range field of a TunedModel instance can be:\n\na single one-dimensional range (ParamRange object) r\na pair of the form (r, d), with r as above and where d is:\na probability vector of the same length as r.values (r a NominalRange)\nany Distributions.UnivariateDistribution instance (r a NumericRange)\none of the subtypes of Distributions.UnivariateDistribution listed in the table below, for automatic fitting using Distributions.fit(d, r), a distribution whose support always lies between r.lower and r.upper (r a NumericRange)\nany pair of the form (field, s), where field is the (possibly nested) name of a field of the model to be tuned, and s an arbitrary sampler object for that field. This means only that rand(rng, s) is defined and returns valid values for the field.\nany vector of objects of the above form\n\nA range vector may contain multiple entries for the same model field, as in range = [(:lambda, s1), (:alpha, s), (:lambda, s2)]. In that case the entry used in each iteration is random.\n\ndistribution types for fitting to ranges of this type\nArcsine, Uniform, Biweight, Cosine, Epanechnikov, SymTriangularDist, Triweight bounded\nGamma, InverseGaussian, Poisson positive (bounded or unbounded)\nNormal, Logistic, LogNormal, Cauchy, Gumbel, Laplace any\n\nParamRange objects are constructed using the range method.\n\nExamples\n\nusing Distributions\n\nrange1 = range(model, :hyper1, lower=0, upper=1)\n\nrange2 = [(range(model, :hyper1, lower=1, upper=10), Arcsine),\n range(model, :hyper2, lower=2, upper=Inf, unit=1, origin=3),\n (range(model, :hyper2, lower=2, upper=4), Normal(0, 3)),\n (range(model, :hyper3, values=[:ball, :tree]), [0.3, 0.7])]\n\n# uniform sampling of :(atom.λ) from [0, 1] without defining a NumericRange:\nstruct MySampler end\nBase.rand(rng::Random.AbstractRNG, ::MySampler) = rand(rng)\nrange3 = (:(atom.λ), MySampler())\n\nAlgorithm\n\nIn each iteration, a model is generated for evaluation by mutating the fields of a deep copy of model. The range vector is shuffled and the fields sampled according to the new order (repeated fields being mutated more than once). For a range entry of the form (field, s) the algorithm calls rand(rng, s) and mutates the field field of the model clone to have this value. For an entry of the form (r, d), s is substituted with sampler(r, d). If no d is specified, then sampling is uniform (with replacement) if r is a NominalRange, and is otherwise given by the defaults specified by the tuning strategy parameters bounded, positive_unbounded, and other, depending on the field values of the NumericRange object r.\n\nSee also TunedModel, range, sampler.\n\n\n\n\n\n","category":"type"},{"location":"tuning_models/#MLJTuning.LatinHypercube","page":"Tuning Models","title":"MLJTuning.LatinHypercube","text":"LatinHypercube(gens = 1,\n popsize = 100,\n ntour = 2,\n ptour = 0.8.,\n interSampleWeight = 1.0,\n ae_power = 2,\n periodic_ae = false,\n rng=Random.GLOBAL_RNG)\n\nInstantiate grid-based hyperparameter tuning strategy using the library LatinHypercubeSampling.jl.\n\nAn optimised Latin Hypercube sampling plan is created using a genetic based optimization algorithm based on the inverse of the Audze-Eglais function. 
The optimization is run for nGenerations and creates n models for evaluation, where n is specified by a corresponding TunedModel instance, as in\n\ntuned_model = TunedModel(model=...,\n tuning=LatinHypercube(...),\n range=...,\n measures=...,\n n=...)\n\n(See TunedModel for complete options.)\n\nTo use a periodic version of the Audze-Eglais function (to reduce clustering along the boundaries) specify periodic_ae = true.\n\nSupported ranges:\n\nA single one-dimensional range or vector of one-dimensioinal ranges can be specified. Specifically, in LatinHypercubeSampling search, the range field of a TunedModel instance can be:\n\nA single one-dimensional range - ie, ParamRange object - r, constructed\n\nusing the range method.\n\nAny vector of objects of the above form\n\nBoth NumericRanges and NominalRanges are supported, and hyper-parameter values are sampled on a scale specified by the range (eg, r.scale = :log).\n\n\n\n\n\n","category":"type"},{"location":"models/DummyClassifier_MLJScikitLearnInterface/#DummyClassifier_MLJScikitLearnInterface","page":"DummyClassifier","title":"DummyClassifier","text":"","category":"section"},{"location":"models/DummyClassifier_MLJScikitLearnInterface/","page":"DummyClassifier","title":"DummyClassifier","text":"DummyClassifier","category":"page"},{"location":"models/DummyClassifier_MLJScikitLearnInterface/","page":"DummyClassifier","title":"DummyClassifier","text":"A model type for constructing a dummy classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/DummyClassifier_MLJScikitLearnInterface/","page":"DummyClassifier","title":"DummyClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/DummyClassifier_MLJScikitLearnInterface/","page":"DummyClassifier","title":"DummyClassifier","text":"DummyClassifier = @load DummyClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/DummyClassifier_MLJScikitLearnInterface/","page":"DummyClassifier","title":"DummyClassifier","text":"Do model = DummyClassifier() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in DummyClassifier(strategy=...).","category":"page"},{"location":"models/DummyClassifier_MLJScikitLearnInterface/","page":"DummyClassifier","title":"DummyClassifier","text":"DummyClassifier is a classifier that makes predictions using simple rules.","category":"page"},{"location":"models/StableForestRegressor_SIRUS/#StableForestRegressor_SIRUS","page":"StableForestRegressor","title":"StableForestRegressor","text":"","category":"section"},{"location":"models/StableForestRegressor_SIRUS/","page":"StableForestRegressor","title":"StableForestRegressor","text":"StableForestRegressor","category":"page"},{"location":"models/StableForestRegressor_SIRUS/","page":"StableForestRegressor","title":"StableForestRegressor","text":"A model type for constructing a stable forest regressor, based on SIRUS.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/StableForestRegressor_SIRUS/","page":"StableForestRegressor","title":"StableForestRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/StableForestRegressor_SIRUS/","page":"StableForestRegressor","title":"StableForestRegressor","text":"StableForestRegressor = @load StableForestRegressor pkg=SIRUS","category":"page"},{"location":"models/StableForestRegressor_SIRUS/","page":"StableForestRegressor","title":"StableForestRegressor","text":"Do model = StableForestRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in StableForestRegressor(rng=...).","category":"page"},{"location":"models/StableForestRegressor_SIRUS/","page":"StableForestRegressor","title":"StableForestRegressor","text":"StableForestRegressor implements the random forest regressor with a stabilized forest structure (Bénard et al., 2021).","category":"page"},{"location":"models/StableForestRegressor_SIRUS/#Training-data","page":"StableForestRegressor","title":"Training data","text":"","category":"section"},{"location":"models/StableForestRegressor_SIRUS/","page":"StableForestRegressor","title":"StableForestRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/StableForestRegressor_SIRUS/","page":"StableForestRegressor","title":"StableForestRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/StableForestRegressor_SIRUS/","page":"StableForestRegressor","title":"StableForestRegressor","text":"where","category":"page"},{"location":"models/StableForestRegressor_SIRUS/","page":"StableForestRegressor","title":"StableForestRegressor","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)\ny: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)","category":"page"},{"location":"models/StableForestRegressor_SIRUS/","page":"StableForestRegressor","title":"StableForestRegressor","text":"Train the machine with fit!(mach, rows=...).","category":"page"},{"location":"models/StableForestRegressor_SIRUS/#Hyperparameters","page":"StableForestRegressor","title":"Hyperparameters","text":"","category":"section"},{"location":"models/StableForestRegressor_SIRUS/","page":"StableForestRegressor","title":"StableForestRegressor","text":"rng::AbstractRNG=default_rng(): Random number 
generator. Using a StableRNG from StableRNGs.jl is advised.\npartial_sampling::Float64=0.7: Ratio of samples to use in each subset of the data. The default should be fine for most cases.\nn_trees::Int=1000: The number of trees to use. It is advisable to use at least thousand trees to for a better rule selection, and in turn better predictive performance.\nmax_depth::Int=2: The depth of the tree. A lower depth decreases model complexity and can therefore improve accuracy when the sample size is small (reduce overfitting).\nq::Int=10: Number of cutpoints to use per feature. The default value should be fine for most situations.\nmin_data_in_leaf::Int=5: Minimum number of data points per leaf.","category":"page"},{"location":"models/StableForestRegressor_SIRUS/#Fitted-parameters","page":"StableForestRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/StableForestRegressor_SIRUS/","page":"StableForestRegressor","title":"StableForestRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/StableForestRegressor_SIRUS/","page":"StableForestRegressor","title":"StableForestRegressor","text":"fitresult: A StableForest object.","category":"page"},{"location":"models/StableForestRegressor_SIRUS/#Operations","page":"StableForestRegressor","title":"Operations","text":"","category":"section"},{"location":"models/StableForestRegressor_SIRUS/","page":"StableForestRegressor","title":"StableForestRegressor","text":"predict(mach, Xnew): Return a vector of predictions for each row of Xnew.","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/#ContinuousEncoder_MLJModels","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"","category":"section"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"ContinuousEncoder","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"A model type for constructing a continuous encoder, based on MLJModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"ContinuousEncoder = @load ContinuousEncoder pkg=MLJModels","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"Do model = ContinuousEncoder() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in ContinuousEncoder(drop_last=...).","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"Use this model to arrange all features (columns) of a table to have Continuous element scitype, by applying the following protocol to each feature ftr:","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"If ftr is already Continuous retain it.\nIf ftr is Multiclass, one-hot encode it.\nIf ftr is OrderedFactor, replace it with coerce(ftr, Continuous) (vector of floating point integers), unless ordered_factors=false is specified, in which case one-hot encode it.\nIf ftr is Count, replace it with coerce(ftr, Continuous).\nIf ftr has some other element scitype, or was not observed in fitting the encoder, drop it from the table.","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"Warning: This transformer assumes that levels(col) for any Multiclass or OrderedFactor column, col, is the same for training data and new data to be transformed.","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"To selectively one-hot-encode categorical features (without dropping columns) use OneHotEncoder instead.","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/#Training-data","page":"ContinuousEncoder","title":"Training data","text":"","category":"section"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"mach = machine(model, X)","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"where","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"X: any Tables.jl compatible table. Columns can be of mixed type but only those with element scitype Multiclass or OrderedFactor can be encoded. Check column scitypes with schema(X).","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/#Hyper-parameters","page":"ContinuousEncoder","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"drop_last=true: whether to drop the column corresponding to the final class of one-hot encoded features. 
For example, a three-class feature is spawned into three new features if drop_last=false, but two just features otherwise.\none_hot_ordered_factors=false: whether to one-hot any feature with OrderedFactor element scitype, or to instead coerce it directly to a (single) Continuous feature using the order","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/#Fitted-parameters","page":"ContinuousEncoder","title":"Fitted parameters","text":"","category":"section"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"features_to_keep: names of features that will not be dropped from the table\none_hot_encoder: the OneHotEncoder model instance for handling the one-hot encoding\none_hot_encoder_fitresult: the fitted parameters of the OneHotEncoder model","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/#Report","page":"ContinuousEncoder","title":"Report","text":"","category":"section"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"features_to_keep: names of input features that will not be dropped from the table\nnew_features: names of all output features","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/#Example","page":"ContinuousEncoder","title":"Example","text":"","category":"section"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"X = (name=categorical([\"Danesh\", \"Lee\", \"Mary\", \"John\"]),\n grade=categorical([\"A\", \"B\", \"A\", \"C\"], ordered=true),\n height=[1.85, 1.67, 1.5, 1.67],\n n_devices=[3, 2, 4, 3],\n comments=[\"the force\", \"be\", \"with you\", \"too\"])\n\njulia> schema(X)\n┌───────────┬──────────────────┐\n│ names │ scitypes │\n├───────────┼──────────────────┤\n│ name │ Multiclass{4} │\n│ grade │ OrderedFactor{3} │\n│ height │ Continuous │\n│ n_devices │ Count │\n│ comments │ Textual │\n└───────────┴──────────────────┘\n\nencoder = ContinuousEncoder(drop_last=true)\nmach = fit!(machine(encoder, X))\nW = transform(mach, X)\n\njulia> schema(W)\n┌──────────────┬────────────┐\n│ names │ scitypes │\n├──────────────┼────────────┤\n│ name__Danesh │ Continuous │\n│ name__John │ Continuous │\n│ name__Lee │ Continuous │\n│ grade │ Continuous │\n│ height │ Continuous │\n│ n_devices │ Continuous │\n└──────────────┴────────────┘\n\njulia> setdiff(schema(X).names, report(mach).features_to_keep) ## dropped features\n1-element Vector{Symbol}:\n :comments\n","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"See also OneHotEncoder","category":"page"},{"location":"models/SVC_LIBSVM/#SVC_LIBSVM","page":"SVC","title":"SVC","text":"","category":"section"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"SVC","category":"page"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"A model type for constructing a C-support vector classifier, based on LIBSVM.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"SVC = @load SVC 
pkg=LIBSVM","category":"page"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"Do model = SVC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SVC(kernel=...).","category":"page"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"This model predicts actual class labels. To predict probabilities, use instead ProbabilisticSVC.","category":"page"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): \"LIBSVM: a library for support vector machines.\" ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf. ","category":"page"},{"location":"models/SVC_LIBSVM/#Training-data","page":"SVC","title":"Training data","text":"","category":"section"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"In MLJ or MLJBase, bind an instance model to data with one of:","category":"page"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"mach = machine(model, X, y)\nmach = machine(model, X, y, w)","category":"page"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"where","category":"page"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)\ny: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)\nw: a dictionary of class weights, keyed on levels(y).","category":"page"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/SVC_LIBSVM/#Hyper-parameters","page":"SVC","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see \"Examples\" below).\nLIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2\nLIBSVM.Kernel.Polynomial: (x1, x2) -> (gamma*x1'*x2 + coef0)^degree\nLIBSVM.Kernel.RadialBasis: (x1, x2) -> exp(-gamma*norm(x1 - x2)^2)\nLIBSVM.Kernel.Sigmoid: (x1, x2) -> tanh(gamma*x1'*x2 + coef0)\nHere gamma, coef0, degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions. See LIBSVM.jl issue #91.\ngamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. 
Actual value used appears in the report (see below).\ncoef0 = 0.0: kernel parameter (see above)\ndegree::Int32 = Int32(3): degree in polynomial kernel (see above)\ncost=1.0 (range (0, Inf)): the parameter denoted C in the cited reference; for greater regularization, decrease cost\ncachesize=200.0 cache memory size in MB\ntolerance=0.001: tolerance for the stopping criterion\nshrinking=true: whether to use shrinking heuristics","category":"page"},{"location":"models/SVC_LIBSVM/#Operations","page":"SVC","title":"Operations","text":"","category":"section"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above.","category":"page"},{"location":"models/SVC_LIBSVM/#Fitted-parameters","page":"SVC","title":"Fitted parameters","text":"","category":"section"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"libsvm_model: the trained model object created by the LIBSVM.jl package\nencoding: class encoding used internally by libsvm_model - a dictionary of class labels keyed on the internal integer representation","category":"page"},{"location":"models/SVC_LIBSVM/#Report","page":"SVC","title":"Report","text":"","category":"section"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"The fields of report(mach) are:","category":"page"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"gamma: actual value of the kernel parameter gamma used in training","category":"page"},{"location":"models/SVC_LIBSVM/#Examples","page":"SVC","title":"Examples","text":"","category":"section"},{"location":"models/SVC_LIBSVM/#Using-a-built-in-kernel","page":"SVC","title":"Using a built-in kernel","text":"","category":"section"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"using MLJ\nimport LIBSVM\n\nSVC = @load SVC pkg=LIBSVM ## model type\nmodel = SVC(kernel=LIBSVM.Kernel.Polynomial) ## instance\n\nX, y = @load_iris ## table, vector\nmach = machine(model, X, y) |> fit!\n\nXnew = (sepal_length = [6.4, 7.2, 7.4],\n sepal_width = [2.8, 3.0, 2.8],\n petal_length = [5.6, 5.8, 6.1],\n petal_width = [2.1, 1.6, 1.9],)\n\njulia> yhat = predict(mach, Xnew)\n3-element CategoricalArrays.CategoricalArray{String,1,UInt32}:\n \"virginica\"\n \"virginica\"\n \"virginica\"","category":"page"},{"location":"models/SVC_LIBSVM/#User-defined-kernels","page":"SVC","title":"User-defined kernels","text":"","category":"section"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"k(x1, x2) = x1'*x2 ## equivalent to `LIBSVM.Kernel.Linear`\nmodel = SVC(kernel=k)\nmach = machine(model, X, y) |> fit!\n\njulia> yhat = predict(mach, Xnew)\n3-element CategoricalArrays.CategoricalArray{String,1,UInt32}:\n \"virginica\"\n \"virginica\"\n \"virginica\"","category":"page"},{"location":"models/SVC_LIBSVM/#Incorporating-class-weights","page":"SVC","title":"Incorporating class weights","text":"","category":"section"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"In either scenario above, we can do:","category":"page"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"weights = Dict(\"virginica\" => 1, \"versicolor\" => 20, \"setosa\" => 1)\nmach = machine(model, X, y, weights) |> fit!\n\njulia> yhat = predict(mach, Xnew)\n3-element CategoricalArrays.CategoricalArray{String,1,UInt32}:\n 
\"versicolor\"\n \"versicolor\"\n \"versicolor\"","category":"page"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"See also the classifiers ProbabilisticSVC, NuSVC and LinearSVC. And see LIVSVM.jl and the original C implementation documentation.","category":"page"},{"location":"modifying_behavior/#Modifying-Behavior","page":"Modifying Behavior","title":"Modifying Behavior","text":"","category":"section"},{"location":"modifying_behavior/","page":"Modifying Behavior","title":"Modifying Behavior","text":"To modify behavior of MLJ you will need to clone the relevant component package (e.g., MLJBase.jl) - or a fork thereof - and modify your local julia environment to use your local clone in place of the official release. For example, you might proceed something like this:","category":"page"},{"location":"modifying_behavior/","page":"Modifying Behavior","title":"Modifying Behavior","text":"using Pkg\nPkg.activate(\"my_MLJ_enf\", shared=true)\nPkg.develop(\"path/to/my/local/MLJBase\")","category":"page"},{"location":"modifying_behavior/","page":"Modifying Behavior","title":"Modifying Behavior","text":"To test your local clone, do","category":"page"},{"location":"modifying_behavior/","page":"Modifying Behavior","title":"Modifying Behavior","text":"Pkg.test(\"MLJBase\")","category":"page"},{"location":"modifying_behavior/","page":"Modifying Behavior","title":"Modifying Behavior","text":"For more on package management, see here.","category":"page"},{"location":"models/INNEDetector_OutlierDetectionPython/#INNEDetector_OutlierDetectionPython","page":"INNEDetector","title":"INNEDetector","text":"","category":"section"},{"location":"models/INNEDetector_OutlierDetectionPython/","page":"INNEDetector","title":"INNEDetector","text":"INNEDetector(n_estimators=200,\n max_samples=\"auto\",\n random_state=None)","category":"page"},{"location":"models/INNEDetector_OutlierDetectionPython/","page":"INNEDetector","title":"INNEDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.inne","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/#COFDetector_OutlierDetectionNeighbors","page":"COFDetector","title":"COFDetector","text":"","category":"section"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"COFDetector(k = 5,\n metric = Euclidean(),\n algorithm = :kdtree,\n leafsize = 10,\n reorder = true,\n parallel = false)","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"Local outlier density based on chaining distance between graphs of neighbors, as described in [1].","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/#Parameters","page":"COFDetector","title":"Parameters","text":"","category":"section"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"k::Integer","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"Number of neighbors (must be greater than 0).","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"metric::Metric","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"This is one of the Metric types defined in the Distances.jl package. 
It is possible to define your own metrics by creating new types that are subtypes of Metric.","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"algorithm::Symbol","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"One of (:kdtree, :balltree). In a kdtree, points are recursively split into groups using hyper-planes. Therefore a KDTree only works with axis aligned metrics which are: Euclidean, Chebyshev, Minkowski and Cityblock. A brutetree linearly searches all points in a brute force fashion and works with any Metric. A balltree recursively splits points into groups bounded by hyper-spheres and works with any Metric.","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"static::Union{Bool, Symbol}","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"One of (true, false, :auto). Whether the input data for fitting and transform should be statically or dynamically allocated. If true, the data is statically allocated. If false, the data is dynamically allocated. If :auto, the data is dynamically allocated if the product of all dimensions except the last is greater than 100.","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"leafsize::Int","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"Determines at what number of points to stop splitting the tree further. There is a trade-off between traversing the tree and having to evaluate the metric function for increasing number of points.","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"reorder::Bool","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"While building the tree this will put points close in distance close in memory since this helps with cache locality. In this case, a copy of the original data will be made so that the original data is left unmodified. This can have a significant impact on performance and is by default set to true.","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"parallel::Bool","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"Parallelize score and predict using all threads available. The number of threads can be set with the JULIA_NUM_THREADS environment variable. 
Note: fit is not parallel.","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/#Examples","page":"COFDetector","title":"Examples","text":"","category":"section"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"using OutlierDetection: COFDetector, fit, transform\ndetector = COFDetector()\nX = rand(10, 100)\nmodel, result = fit(detector, X; verbosity=0)\ntest_scores = transform(detector, model, X)","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/#References","page":"COFDetector","title":"References","text":"","category":"section"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"[1] Tang, Jian; Chen, Zhixiang; Fu, Ada Wai-Chee; Cheung, David Wai-Lok (2002): Enhancing Effectiveness of Outlier Detections for Low Density Patterns.","category":"page"},{"location":"models/SMOTEN_Imbalance/#SMOTEN_Imbalance","page":"SMOTEN","title":"SMOTEN","text":"","category":"section"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"Initiate a SMOTEN model with the given hyper-parameters.","category":"page"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"SMOTEN","category":"page"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"A model type for constructing a smoten, based on Imbalance.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"SMOTEN = @load SMOTEN pkg=Imbalance","category":"page"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"Do model = SMOTEN() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SMOTEN(k=...).","category":"page"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"SMOTEN implements the SMOTEN algorithm to correct for class imbalance as in N. V. Chawla, K. W. Bowyer, L. O.Hall, W. P. 
Kegelmeyer, “SMOTEN: synthetic minority over-sampling technique,” Journal of artificial intelligence research, 321-357, 2002.","category":"page"},{"location":"models/SMOTEN_Imbalance/#Training-data","page":"SMOTEN","title":"Training data","text":"","category":"section"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"In MLJ or MLJBase, wrap the model in a machine by","category":"page"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"mach = machine(model)","category":"page"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"There is no need to provide any data here because the model is a static transformer.","category":"page"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"Likewise, there is no need to fit!(mach).","category":"page"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"For default values of the hyper-parameters, model can be constructed by","category":"page"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"model = SMOTEN()","category":"page"},{"location":"models/SMOTEN_Imbalance/#Hyperparameters","page":"SMOTEN","title":"Hyperparameters","text":"","category":"section"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"k=5: Number of nearest neighbors to consider in the SMOTEN algorithm. Should be within the range [1, n - 1], where n is the number of observations; otherwise set to the nearest of these two values.\nratios=1.0: A parameter that controls the amount of oversampling to be done for each class\nCan be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class\nCan be a dictionary mapping each class label to the float ratio for that class\nrng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister`.","category":"page"},{"location":"models/SMOTEN_Imbalance/#Transform-Inputs","page":"SMOTEN","title":"Transform Inputs","text":"","category":"section"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"X: A matrix of integers or a table with element scitypes that subtype Finite. That is, for table inputs each column should have either OrderedFactor or Multiclass as the element scitype.\ny: An abstract vector of labels (e.g., strings) that correspond to the observations in X","category":"page"},{"location":"models/SMOTEN_Imbalance/#Transform-Outputs","page":"SMOTEN","title":"Transform Outputs","text":"","category":"section"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"Xover: A matrix or table that includes original data and the new observations due to oversampling. 
depending on whether the input X is a matrix or table respectively\nyover: An abstract vector of labels corresponding to Xover","category":"page"},{"location":"models/SMOTEN_Imbalance/#Operations","page":"SMOTEN","title":"Operations","text":"","category":"section"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"transform(mach, X, y): resample the data X and y using SMOTEN, returning both the new and original observations","category":"page"},{"location":"models/SMOTEN_Imbalance/#Example","page":"SMOTEN","title":"Example","text":"","category":"section"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"using MLJ\nusing ScientificTypes\nimport Imbalance\n\n## set probability of each class\nclass_probs = [0.5, 0.2, 0.3] \nnum_rows = 100\nnum_continuous_feats = 0\n## want two categorical features with three and two possible values respectively\nnum_vals_per_category = [3, 2]\n\n## generate a table and categorical vector accordingly\nX, y = Imbalance.generate_imbalanced_data(num_rows, num_continuous_feats; \n class_probs, num_vals_per_category, rng=42) \njulia> Imbalance.checkbalance(y)\n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (39.6%) \n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 33 (68.8%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%) \n\njulia> ScientificTypes.schema(X).scitypes\n(Count, Count)\n\n## coerce to a finite scitype (multiclass or ordered factor)\nX = coerce(X, autotype(X, :few_to_finite))\n\n## load SMOTEN\nSMOTEN = @load SMOTEN pkg=Imbalance\n\n## wrap the model in a machine\noversampler = SMOTEN(k=5, ratios=Dict(0=>1.0, 1=> 0.9, 2=>0.8), rng=42)\nmach = machine(oversampler)\n\n## provide the data to transform (there is nothing to fit)\nXover, yover = transform(mach, X, y)\n\njulia> Imbalance.checkbalance(yover)\n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 38 (79.2%) \n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 43 (89.6%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%) ","category":"page"},{"location":"models/NeuralNetworkClassifier_BetaML/#NeuralNetworkClassifier_BetaML","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"","category":"section"},{"location":"models/NeuralNetworkClassifier_BetaML/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"mutable struct NeuralNetworkClassifier <: MLJModelInterface.Probabilistic","category":"page"},{"location":"models/NeuralNetworkClassifier_BetaML/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"A simple but flexible Feedforward Neural Network, from the Beta Machine Learning Toolkit (BetaML) for classification problems.","category":"page"},{"location":"models/NeuralNetworkClassifier_BetaML/#Parameters:","page":"NeuralNetworkClassifier","title":"Parameters:","text":"","category":"section"},{"location":"models/NeuralNetworkClassifier_BetaML/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"layers: Array of layer objects [def: nothing, i.e. basic network]. See subtypes(BetaML.AbstractLayer) for supported layers. The last \"softmax\" layer is automatically added.\nloss: Loss (cost) function [def: BetaML.crossentropy]. Should always assume y and ŷ as matrices.\nwarning: Warning\nIf you change the parameter loss, you need to either provide its derivative on the parameter dloss or use autodiff with dloss=nothing.\ndloss: Derivative of the loss function [def: BetaML.dcrossentropy, i.e. the derivative of the cross-entropy]. 
Use nothing for autodiff.\nepochs: Number of epochs, i.e. passages trough the whole training sample [def: 200]\nbatch_size: Size of each individual batch [def: 16]\nopt_alg: The optimisation algorithm to update the gradient at each batch [def: BetaML.ADAM()]. See subtypes(BetaML.OptimisationAlgorithm) for supported optimizers\nshuffle: Whether to randomly shuffle the data at each iteration (epoch) [def: true]\ndescr: An optional title and/or description for this model\ncb: A call back function to provide information during training [def: BetaML.fitting_info]\ncategories: The categories to represent as columns. [def: nothing, i.e. unique training values].\nhandle_unknown: How to handle categories not seens in training or not present in the provided categories array? \"error\" (default) rises an error, \"infrequent\" adds a specific column for these categories.\nother_categories_name: Which value during prediction to assign to this \"other\" category (i.e. categories not seen on training or not present in the provided categories array? [def: nothing, i.e. typemax(Int64) for integer vectors and \"other\" for other types]. This setting is active only if handle_unknown=\"infrequent\" and in that case it MUST be specified if Y is neither integer or strings\nrng: Random Number Generator [deafult: Random.GLOBAL_RNG]","category":"page"},{"location":"models/NeuralNetworkClassifier_BetaML/#Notes:","page":"NeuralNetworkClassifier","title":"Notes:","text":"","category":"section"},{"location":"models/NeuralNetworkClassifier_BetaML/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"data must be numerical\nthe label should be a n-records by n-dimensions matrix (e.g. a one-hot-encoded data for classification), where the output columns should be interpreted as the probabilities for each categories.","category":"page"},{"location":"models/NeuralNetworkClassifier_BetaML/#Example:","page":"NeuralNetworkClassifier","title":"Example:","text":"","category":"section"},{"location":"models/NeuralNetworkClassifier_BetaML/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"julia> using MLJ\n\njulia> X, y = @load_iris;\n\njulia> modelType = @load NeuralNetworkClassifier pkg = \"BetaML\" verbosity=0\nBetaML.Nn.NeuralNetworkClassifier\n\njulia> layers = [BetaML.DenseLayer(4,8,f=BetaML.relu),BetaML.DenseLayer(8,8,f=BetaML.relu),BetaML.DenseLayer(8,3,f=BetaML.relu),BetaML.VectorFunctionLayer(3,f=BetaML.softmax)];\n\njulia> model = modelType(layers=layers,opt_alg=BetaML.ADAM())\nNeuralNetworkClassifier(\n layers = BetaML.Nn.AbstractLayer[BetaML.Nn.DenseLayer([-0.376173352338049 0.7029289511758696 -0.5589563304592478 -0.21043274001651874; 0.044758889527899415 0.6687689636685921 0.4584331114653877 0.6820506583840453; … ; -0.26546358457167507 -0.28469736227283804 -0.164225549922154 -0.516785639164486; -0.5146043550684141 -0.0699113265130964 0.14959906603941908 -0.053706860039406834], [0.7003943613125758, -0.23990840466587576, -0.23823126271387746, 0.4018101580410387, 0.2274483050356888, -0.564975060667734, 0.1732063297031089, 0.11880299829896945], BetaML.Utils.relu, BetaML.Utils.drelu), BetaML.Nn.DenseLayer([-0.029467850439546583 0.4074661266592745 … 0.36775675246760053 -0.595524555448422; 0.42455597698371306 -0.2458082732997091 … -0.3324220683462514 0.44439454998610595; … ; -0.2890883863364267 -0.10109249362508033 … -0.0602680568207582 0.18177278845097555; -0.03432587226449335 -0.4301192922760063 … 0.5646018168286626 0.47269177680892693], [0.13777442835428688, 
0.5473306726675433, 0.3781939472904011, 0.24021813428130567, -0.0714779477402877, -0.020386373530818958, 0.5465466618404464, -0.40339790713616525], BetaML.Utils.relu, BetaML.Utils.drelu), BetaML.Nn.DenseLayer([0.6565120540082393 0.7139211611842745 … 0.07809812467915389 -0.49346311403373844; -0.4544472987041656 0.6502667641568863 … 0.43634608676548214 0.7213049952968921; 0.41212264783075303 -0.21993289366360613 … 0.25365007887755064 -0.5664469566269569], [-0.6911986792747682, -0.2149343209329364, -0.6347727539063817], BetaML.Utils.relu, BetaML.Utils.drelu), BetaML.Nn.VectorFunctionLayer{0}(fill(NaN), 3, 3, BetaML.Utils.softmax, BetaML.Utils.dsoftmax, nothing)], \n loss = BetaML.Utils.crossentropy, \n dloss = BetaML.Utils.dcrossentropy, \n epochs = 100, \n batch_size = 32, \n opt_alg = BetaML.Nn.ADAM(BetaML.Nn.var\"#90#93\"(), 1.0, 0.9, 0.999, 1.0e-8, BetaML.Nn.Learnable[], BetaML.Nn.Learnable[]), \n shuffle = true, \n descr = \"\", \n cb = BetaML.Nn.fitting_info, \n categories = nothing, \n handle_unknown = \"error\", \n other_categories_name = nothing, \n rng = Random._GLOBAL_RNG())\n\njulia> mach = machine(model, X, y);\n\njulia> fit!(mach);\n\njulia> classes_est = predict(mach, X)\n150-element CategoricalDistributions.UnivariateFiniteVector{Multiclass{3}, String, UInt8, Float64}:\n UnivariateFinite{Multiclass{3}}(setosa=>0.575, versicolor=>0.213, virginica=>0.213)\n UnivariateFinite{Multiclass{3}}(setosa=>0.573, versicolor=>0.213, virginica=>0.213)\n ⋮\n UnivariateFinite{Multiclass{3}}(setosa=>0.236, versicolor=>0.236, virginica=>0.529)\n UnivariateFinite{Multiclass{3}}(setosa=>0.254, versicolor=>0.254, virginica=>0.492)","category":"page"},{"location":"models/GradientBoostingRegressor_MLJScikitLearnInterface/#GradientBoostingRegressor_MLJScikitLearnInterface","page":"GradientBoostingRegressor","title":"GradientBoostingRegressor","text":"","category":"section"},{"location":"models/GradientBoostingRegressor_MLJScikitLearnInterface/","page":"GradientBoostingRegressor","title":"GradientBoostingRegressor","text":"GradientBoostingRegressor","category":"page"},{"location":"models/GradientBoostingRegressor_MLJScikitLearnInterface/","page":"GradientBoostingRegressor","title":"GradientBoostingRegressor","text":"A model type for constructing a gradient boosting ensemble regression, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/GradientBoostingRegressor_MLJScikitLearnInterface/","page":"GradientBoostingRegressor","title":"GradientBoostingRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/GradientBoostingRegressor_MLJScikitLearnInterface/","page":"GradientBoostingRegressor","title":"GradientBoostingRegressor","text":"GradientBoostingRegressor = @load GradientBoostingRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/GradientBoostingRegressor_MLJScikitLearnInterface/","page":"GradientBoostingRegressor","title":"GradientBoostingRegressor","text":"Do model = GradientBoostingRegressor() to construct an instance with default hyper-parameters. 
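For example, here is a minimal usage sketch (an illustrative sketch only: it assumes MLJScikitLearnInterface and its scikit-learn backend are installed in the active environment, and it uses MLJ's built-in make_regression generator to fabricate a small synthetic dataset):\n\nusing MLJ\n\n## load the model type (requires MLJScikitLearnInterface in the environment)\nGradientBoostingRegressor = @load GradientBoostingRegressor pkg=MLJScikitLearnInterface\n\nX, y = make_regression(100, 4)      ## a table of features and a Continuous target\nmodel = GradientBoostingRegressor() ## default hyper-parameters\nmach = machine(model, X, y) |> fit!\n\nyhat = predict(mach, X)             ## deterministic predictions\n\n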
Provide keyword arguments to override hyper-parameter defaults, as in GradientBoostingRegressor(loss=...).","category":"page"},{"location":"models/GradientBoostingRegressor_MLJScikitLearnInterface/","page":"GradientBoostingRegressor","title":"GradientBoostingRegressor","text":"This estimator builds an additive model in a forward stage-wise fashion; it allows for the optimization of arbitrary differentiable loss functions. In each stage a regression tree is fit on the negative gradient of the given loss function.","category":"page"},{"location":"models/GradientBoostingRegressor_MLJScikitLearnInterface/","page":"GradientBoostingRegressor","title":"GradientBoostingRegressor","text":"HistGradientBoostingRegressor is a much faster variant of this algorithm for intermediate datasets (n_samples >= 10_000).","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/#BayesianLDA_MultivariateStats","page":"BayesianLDA","title":"BayesianLDA","text":"","category":"section"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"BayesianLDA","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"A model type for constructing a Bayesian LDA model, based on MultivariateStats.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"BayesianLDA = @load BayesianLDA pkg=MultivariateStats","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"Do model = BayesianLDA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BayesianLDA(method=...).","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"The Bayesian multiclass LDA algorithm learns a projection matrix as described in ordinary LDA. Predicted class posterior probability distributions are derived by applying Bayes' rule with a multivariate Gaussian class-conditional distribution. A prior class distribution can be specified by the user or inferred from training data class frequency.","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"See also the package documentation. 
For more information about the algorithm, see Li, Zhu and Ogihara (2006): Using Discriminant Analysis for Multi-class Classification: An Experimental Investigation.","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/#Training-data","page":"BayesianLDA","title":"Training data","text":"","category":"section"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"Here:","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).\ny is the target, which can be any AbstractVector whose element scitype is OrderedFactor or Multiclass; check the scitype with scitype(y)","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/#Hyper-parameters","page":"BayesianLDA","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"method::Symbol=:gevd: choice of solver, one of :gevd or :whiten methods.\ncov_w::StatsBase.SimpleCovariance(): An estimator for the within-class covariance (used in computing the within-class scatter matrix, Sw). Any robust estimator from CovarianceEstimation.jl can be used.\ncov_b::StatsBase.SimpleCovariance(): The same as cov_w but for the between-class covariance (used in computing the between-class scatter matrix, Sb).\noutdim::Int=0: The output dimension, i.e., dimension of the transformed space, automatically set to min(indim, nclasses-1) if equal to 0.\nregcoef::Float64=1e-6: The regularization coefficient. A positive value regcoef*eigmax(Sw) where Sw is the within-class scatter matrix, is added to the diagonal of Sw to improve numerical stability. This can be useful if using the standard covariance estimator.\npriors::Union{Nothing, UnivariateFinite{<:Any, <:Any, <:Any, <:Real}, Dict{<:Any, <:Real}} = nothing: For use in prediction with Bayes rule. If priors = nothing then priors are estimated from the class proportions in the training data. Otherwise it requires a Dict or UnivariateFinite object specifying the classes with non-zero probabilities in the training target.","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/#Operations","page":"BayesianLDA","title":"Operations","text":"","category":"section"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.\npredict(mach, Xnew): Return predictions of the target given features Xnew, which should have the same scitype as X above. 
Predictions are probabilistic but uncalibrated.\npredict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/#Fitted-parameters","page":"BayesianLDA","title":"Fitted parameters","text":"","category":"section"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"classes: The classes seen during model fitting.\nprojection_matrix: The learned projection matrix, of size (indim, outdim), where indim and outdim are the input and output dimensions respectively (See Report section below).\npriors: The class priors for classification. As inferred from training target y, if not user-specified. A UnivariateFinite object with levels consistent with levels(y).","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/#Report","page":"BayesianLDA","title":"Report","text":"","category":"section"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"The fields of report(mach) are:","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"indim: The dimension of the input space i.e the number of training features.\noutdim: The dimension of the transformed space the model is projected to.\nmean: The mean of the untransformed training data. A vector of length indim.\nnclasses: The number of classes directly observed in the training data (which can be less than the total number of classes in the class pool).\nclass_means: The class-specific means of the training data. A matrix of size (indim, nclasses) with the ith column being the class-mean of the ith class in classes (See fitted params section above).\nclass_weights: The weights (class counts) of each class. A vector of length nclasses with the ith element being the class weight of the ith class in classes. 
(See fitted params section above.)\nSb: The between class scatter matrix.\nSw: The within class scatter matrix.","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/#Examples","page":"BayesianLDA","title":"Examples","text":"","category":"section"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"using MLJ\n\nBayesianLDA = @load BayesianLDA pkg=MultivariateStats\n\nX, y = @load_iris ## a table and a vector\n\nmodel = BayesianLDA()\nmach = machine(model, X, y) |> fit!\n\nXproj = transform(mach, X)\ny_hat = predict(mach, X)\nlabels = predict_mode(mach, X)","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"See also LDA, SubspaceLDA, BayesianSubspaceLDA","category":"page"},{"location":"models/GaussianMixtureClusterer_BetaML/#GaussianMixtureClusterer_BetaML","page":"GaussianMixtureClusterer","title":"GaussianMixtureClusterer","text":"","category":"section"},{"location":"models/GaussianMixtureClusterer_BetaML/","page":"GaussianMixtureClusterer","title":"GaussianMixtureClusterer","text":"mutable struct GaussianMixtureClusterer <: MLJModelInterface.Unsupervised","category":"page"},{"location":"models/GaussianMixtureClusterer_BetaML/","page":"GaussianMixtureClusterer","title":"GaussianMixtureClusterer","text":"An Expectation-Maximisation clustering algorithm with customisable mixtures, from the Beta Machine Learning Toolkit (BetaML).","category":"page"},{"location":"models/GaussianMixtureClusterer_BetaML/#Hyperparameters:","page":"GaussianMixtureClusterer","title":"Hyperparameters:","text":"","category":"section"},{"location":"models/GaussianMixtureClusterer_BetaML/","page":"GaussianMixtureClusterer","title":"GaussianMixtureClusterer","text":"n_classes::Int64: Number of mixtures (latent classes) to consider [def: 3]\ninitial_probmixtures::AbstractVector{Float64}: Initial probabilities of the categorical distribution (n_classes x 1) [default: []]\nmixtures::Union{Type, Vector{<:BetaML.GMM.AbstractMixture}}: An array (of length n_classes) of the mixtures to employ (see the ?GMM module). Each mixture object can be provided with or without its parameters (e.g. mean and variance for the Gaussian ones). Fully qualified mixtures are useful only if the initialisation_strategy parameter is set to \"given\". This parameter can also be given simply in terms of a type; in this case it is automatically extended to a vector of n_classes mixtures of the specified type. Note that mixing of different mixture types is not currently supported. [def: [DiagonalGaussian() for i in 1:n_classes]]\ntol::Float64: Tolerance to stop the algorithm [default: 10^(-6)]\nminimum_variance::Float64: Minimum variance for the mixtures [default: 0.05]\nminimum_covariance::Float64: Minimum covariance for the mixtures with full covariance matrix [default: 0]. This should be set differently from minimum_variance (see notes).\ninitialisation_strategy::String: The computation method of the vector of the initial mixtures. One of the following:\n\"grid\": using a grid approach\n\"given\": using the mixtures provided in the fully qualified mixtures parameter\n\"kmeans\": first use kmeans (itself initialised with a \"grid\" strategy) to set the initial mixture centers [default]\nNote that currently \"random\" and \"shuffle\" initialisations are not supported in gmm-based algorithms.\nmaximum_iterations::Int64: Maximum number of iterations [def: typemax(Int64), i.e. 
∞]\nrng::Random.AbstractRNG: Random Number Generator [deafult: Random.GLOBAL_RNG]","category":"page"},{"location":"models/GaussianMixtureClusterer_BetaML/#Example:","page":"GaussianMixtureClusterer","title":"Example:","text":"","category":"section"},{"location":"models/GaussianMixtureClusterer_BetaML/","page":"GaussianMixtureClusterer","title":"GaussianMixtureClusterer","text":"\njulia> using MLJ\n\njulia> X, y = @load_iris;\n\njulia> modelType = @load GaussianMixtureClusterer pkg = \"BetaML\" verbosity=0\nBetaML.GMM.GaussianMixtureClusterer\n\njulia> model = modelType()\nGaussianMixtureClusterer(\n n_classes = 3, \n initial_probmixtures = Float64[], \n mixtures = BetaML.GMM.DiagonalGaussian{Float64}[BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing), BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing), BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing)], \n tol = 1.0e-6, \n minimum_variance = 0.05, \n minimum_covariance = 0.0, \n initialisation_strategy = \"kmeans\", \n maximum_iterations = 9223372036854775807, \n rng = Random._GLOBAL_RNG())\n\njulia> mach = machine(model, X);\n\njulia> fit!(mach);\n[ Info: Training machine(GaussianMixtureClusterer(n_classes = 3, …), …).\nIter. 1: Var. of the post 10.800150114964184 Log-likelihood -650.0186451891216\n\njulia> classes_est = predict(mach, X)\n150-element CategoricalDistributions.UnivariateFiniteVector{Multiclass{3}, Int64, UInt32, Float64}:\n UnivariateFinite{Multiclass{3}}(1=>1.0, 2=>4.17e-15, 3=>2.1900000000000003e-31)\n UnivariateFinite{Multiclass{3}}(1=>1.0, 2=>1.25e-13, 3=>5.87e-31)\n UnivariateFinite{Multiclass{3}}(1=>1.0, 2=>4.5e-15, 3=>1.55e-32)\n UnivariateFinite{Multiclass{3}}(1=>1.0, 2=>6.93e-14, 3=>3.37e-31)\n ⋮\n UnivariateFinite{Multiclass{3}}(1=>5.39e-25, 2=>0.0167, 3=>0.983)\n UnivariateFinite{Multiclass{3}}(1=>7.5e-29, 2=>0.000106, 3=>1.0)\n UnivariateFinite{Multiclass{3}}(1=>1.6e-20, 2=>0.594, 3=>0.406)","category":"page"},{"location":"models/RandomForestClassifier_BetaML/#RandomForestClassifier_BetaML","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"","category":"section"},{"location":"models/RandomForestClassifier_BetaML/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"mutable struct RandomForestClassifier <: MLJModelInterface.Probabilistic","category":"page"},{"location":"models/RandomForestClassifier_BetaML/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"A simple Random Forest model for classification with support for Missing data, from the Beta Machine Learning Toolkit (BetaML).","category":"page"},{"location":"models/RandomForestClassifier_BetaML/#Hyperparameters:","page":"RandomForestClassifier","title":"Hyperparameters:","text":"","category":"section"},{"location":"models/RandomForestClassifier_BetaML/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"n_trees::Int64\nmax_depth::Int64: The maximum depth the tree is allowed to reach. When this is reached the node is forced to become a leaf [def: 0, i.e. no limits]\nmin_gain::Float64: The minimum information gain to allow for a node's partition [def: 0]\nmin_records::Int64: The minimum number of records a node must holds to consider for a partition of it [def: 2]\nmax_features::Int64: The maximum number of (random) features to consider at each partitioning [def: 0, i.e. square root of the data dimensions]\nsplitting_criterion::Function: This is the name of the function to be used to compute the information gain of a specific partition. 
This is done by measuring the difference between the \"impurity\" of the labels of the parent node and those of the two child nodes, weighted by the respective number of items. [def: gini]. Either gini, entropy or a custom function. It can also be an anonymous function.\nβ::Float64: Parameter that regulates the weights of the scoring of each tree, to be (optionally) used in prediction based on the error of the individual trees computed on the records on which the trees have not been trained. Higher values favour \"better\" trees, but too high values will cause overfitting [def: 0, i.e. uniform weights]\nrng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]","category":"page"},{"location":"models/RandomForestClassifier_BetaML/#Example-:","page":"RandomForestClassifier","title":"Example :","text":"","category":"section"},{"location":"models/RandomForestClassifier_BetaML/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"julia> using MLJ\n\njulia> X, y = @load_iris;\n\njulia> modelType = @load RandomForestClassifier pkg = \"BetaML\" verbosity=0\nBetaML.Trees.RandomForestClassifier\n\njulia> model = modelType()\nRandomForestClassifier(\n n_trees = 30, \n max_depth = 0, \n min_gain = 0.0, \n min_records = 2, \n max_features = 0, \n splitting_criterion = BetaML.Utils.gini, \n β = 0.0, \n rng = Random._GLOBAL_RNG())\n\njulia> mach = machine(model, X, y);\n\njulia> fit!(mach);\n[ Info: Training machine(RandomForestClassifier(n_trees = 30, …), …).\n\njulia> cat_est = predict(mach, X)\n150-element CategoricalDistributions.UnivariateFiniteVector{Multiclass{3}, String, UInt32, Float64}:\n UnivariateFinite{Multiclass{3}}(setosa=>1.0, versicolor=>0.0, virginica=>0.0)\n UnivariateFinite{Multiclass{3}}(setosa=>1.0, versicolor=>0.0, virginica=>0.0)\n ⋮\n UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.0, virginica=>1.0)\n UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.0667, virginica=>0.933)","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/#DecisionTreeClassifier_DecisionTree","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"","category":"section"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"DecisionTreeClassifier","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"A model type for constructing a CART decision tree classifier, based on DecisionTree.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"DecisionTreeClassifier = @load DecisionTreeClassifier pkg=DecisionTree","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"Do model = DecisionTreeClassifier() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in DecisionTreeClassifier(max_depth=...).","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"DecisionTreeClassifier implements the CART algorithm, originally published in Breiman, Leo; Friedman, J. H.; Olshen, R. A.; Stone, C. J. (1984): \"Classification and regression trees\". Monterey, CA: Wadsworth & Brooks/Cole Advanced Books & Software.","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/#Training-data","page":"DecisionTreeClassifier","title":"Training data","text":"","category":"section"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"where","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)\ny: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/#Hyperparameters","page":"DecisionTreeClassifier","title":"Hyperparameters","text":"","category":"section"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"max_depth=-1: max depth of the decision tree (-1=any)\nmin_samples_leaf=1: min number of samples each leaf needs to have\nmin_samples_split=2: min number of samples needed for a split\nmin_purity_increase=0: min purity needed for a split\nn_subfeatures=0: number of features to select at random (0 for all)\npost_prune=false: set to true for post-fit pruning\nmerge_purity_threshold=1.0: (post-pruning) merge leaves having combined purity >= merge_purity_threshold\ndisplay_depth=5: max depth to show when displaying the tree\nfeature_importance: method to use for computing feature importances. One of (:impurity, :split)\nrng=Random.GLOBAL_RNG: random number generator or seed","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/#Operations","page":"DecisionTreeClassifier","title":"Operations","text":"","category":"section"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above. 
Predictions are probabilistic, but uncalibrated.\npredict_mode(mach, Xnew): instead return the mode of each prediction above.","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/#Fitted-parameters","page":"DecisionTreeClassifier","title":"Fitted parameters","text":"","category":"section"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"raw_tree: the raw Node, Leaf or Root object returned by the core DecisionTree.jl algorithm\ntree: a visualizable, wrapped version of raw_tree implementing the AbstractTrees.jl interface; see \"Examples\" below\nencoding: dictionary of target classes keyed on integers used internally by DecisionTree.jl\nfeatures: the names of the features encountered in training, in an order consistent with the output of print_tree (see below)","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/#Report","page":"DecisionTreeClassifier","title":"Report","text":"","category":"section"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"The fields of report(mach) are:","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"classes_seen: list of target classes actually observed in training\nprint_tree: alternative method to print the fitted tree, with single argument the tree depth; interpretation requires internal integer-class encoding (see \"Fitted parameters\" above).\nfeatures: the names of the features encountered in training, in an order consistent with the output of print_tree (see below)","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/#Accessor-functions","page":"DecisionTreeClassifier","title":"Accessor functions","text":"","category":"section"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"feature_importances(mach) returns a vector of (feature::Symbol => importance) pairs; the type of importance is determined by the hyperparameter feature_importance (see above)","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/#Examples","page":"DecisionTreeClassifier","title":"Examples","text":"","category":"section"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"using MLJ\nDecisionTreeClassifier = @load DecisionTreeClassifier pkg=DecisionTree\nmodel = DecisionTreeClassifier(max_depth=3, min_samples_split=3)\n\nX, y = @load_iris\nmach = machine(model, X, y) |> fit!\n\nXnew = (sepal_length = [6.4, 7.2, 7.4],\n sepal_width = [2.8, 3.0, 2.8],\n petal_length = [5.6, 5.8, 6.1],\n petal_width = [2.1, 1.6, 1.9],)\nyhat = predict(mach, Xnew) ## probabilistic predictions\npredict_mode(mach, Xnew) ## point predictions\npdf.(yhat, \"virginica\") ## probabilities for the \"verginica\" class\n\njulia> tree = fitted_params(mach).tree\npetal_length < 2.45\n├─ setosa (50/50)\n└─ petal_width < 1.75\n ├─ petal_length < 4.95\n │ ├─ versicolor (47/48)\n │ └─ virginica (4/6)\n └─ petal_length < 4.85\n ├─ virginica (2/3)\n └─ virginica (43/43)\n\nusing Plots, TreeRecipe\nplot(tree) ## for a graphical 
representation of the tree\n\nfeature_importances(mach)","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"See also DecisionTree.jl and the unwrapped model type MLJDecisionTreeInterface.DecisionTree.DecisionTreeClassifier.","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/#DNNDetector_OutlierDetectionNeighbors","page":"DNNDetector","title":"DNNDetector","text":"","category":"section"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"DNNDetector(d = 0,\n metric = Euclidean(),\n algorithm = :kdtree,\n leafsize = 10,\n reorder = true,\n parallel = false)","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"Anomaly score based on the number of neighbors in a hypersphere of radius d. Knorr et al. [1] directly converted the resulting outlier scores to labels, thus this implementation does not fully reflect the approach from the paper.","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/#Parameters","page":"DNNDetector","title":"Parameters","text":"","category":"section"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"d::Real","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"The hypersphere radius used to calculate the global density of an instance.","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"metric::Metric","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"This is one of the Metric types defined in the Distances.jl package. It is possible to define your own metrics by creating new types that are subtypes of Metric.","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"algorithm::Symbol","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"One of (:kdtree, :balltree). In a kdtree, points are recursively split into groups using hyper-planes. Therefore a KDTree only works with axis aligned metrics which are: Euclidean, Chebyshev, Minkowski and Cityblock. A brutetree linearly searches all points in a brute force fashion and works with any Metric. A balltree recursively splits points into groups bounded by hyper-spheres and works with any Metric.","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"static::Union{Bool, Symbol}","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"One of (true, false, :auto). Whether the input data for fitting and transform should be statically or dynamically allocated. If true, the data is statically allocated. If false, the data is dynamically allocated. 
If :auto, the data is dynamically allocated if the product of all dimensions except the last is greater than 100.","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"leafsize::Int","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"Determines at what number of points to stop splitting the tree further. There is a trade-off between traversing the tree and having to evaluate the metric function for increasing number of points.","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"reorder::Bool","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"While building the tree this will put points close in distance close in memory since this helps with cache locality. In this case, a copy of the original data will be made so that the original data is left unmodified. This can have a significant impact on performance and is by default set to true.","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"parallel::Bool","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"Parallelize score and predict using all threads available. The number of threads can be set with the JULIA_NUM_THREADS environment variable. Note: fit is not parallel.","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/#Examples","page":"DNNDetector","title":"Examples","text":"","category":"section"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"using OutlierDetection: DNNDetector, fit, transform\ndetector = DNNDetector()\nX = rand(10, 100)\nmodel, result = fit(detector, X; verbosity=0)\ntest_scores = transform(detector, model, X)","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/#References","page":"DNNDetector","title":"References","text":"","category":"section"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"[1] Knorr, Edwin M.; Ng, Raymond T. 
(1998): Algorithms for Mining Distance-Based Outliers in Large Datasets.","category":"page"},{"location":"models/RidgeRegressor_MultivariateStats/#RidgeRegressor_MultivariateStats","page":"RidgeRegressor","title":"RidgeRegressor","text":"","category":"section"},{"location":"models/RidgeRegressor_MultivariateStats/","page":"RidgeRegressor","title":"RidgeRegressor","text":"RidgeRegressor","category":"page"},{"location":"models/RidgeRegressor_MultivariateStats/","page":"RidgeRegressor","title":"RidgeRegressor","text":"A model type for constructing a ridge regressor, based on MultivariateStats.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/RidgeRegressor_MultivariateStats/","page":"RidgeRegressor","title":"RidgeRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/RidgeRegressor_MultivariateStats/","page":"RidgeRegressor","title":"RidgeRegressor","text":"RidgeRegressor = @load RidgeRegressor pkg=MultivariateStats","category":"page"},{"location":"models/RidgeRegressor_MultivariateStats/","page":"RidgeRegressor","title":"RidgeRegressor","text":"Do model = RidgeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RidgeRegressor(lambda=...).","category":"page"},{"location":"models/RidgeRegressor_MultivariateStats/","page":"RidgeRegressor","title":"RidgeRegressor","text":"RidgeRegressor adds a quadratic penalty term to least squares regression, for regularization. Ridge regression is particularly useful in the case of multicollinearity. Options exist to specify a bias term, and to adjust the strength of the penalty term.","category":"page"},{"location":"models/RidgeRegressor_MultivariateStats/#Training-data","page":"RidgeRegressor","title":"Training data","text":"","category":"section"},{"location":"models/RidgeRegressor_MultivariateStats/","page":"RidgeRegressor","title":"RidgeRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/RidgeRegressor_MultivariateStats/","page":"RidgeRegressor","title":"RidgeRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/RidgeRegressor_MultivariateStats/","page":"RidgeRegressor","title":"RidgeRegressor","text":"Here:","category":"page"},{"location":"models/RidgeRegressor_MultivariateStats/","page":"RidgeRegressor","title":"RidgeRegressor","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).\ny is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/RidgeRegressor_MultivariateStats/","page":"RidgeRegressor","title":"RidgeRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/RidgeRegressor_MultivariateStats/#Hyper-parameters","page":"RidgeRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/RidgeRegressor_MultivariateStats/","page":"RidgeRegressor","title":"RidgeRegressor","text":"lambda=1.0: Is the non-negative parameter for the regularization strength. 
If lambda is 0, ridge regression is equivalent to linear least squares regression, and as lambda approaches infinity, all the linear coefficients approach 0.\nbias=true: Include the bias term if true, otherwise fit without bias term.","category":"page"},{"location":"models/RidgeRegressor_MultivariateStats/#Operations","page":"RidgeRegressor","title":"Operations","text":"","category":"section"},{"location":"models/RidgeRegressor_MultivariateStats/","page":"RidgeRegressor","title":"RidgeRegressor","text":"predict(mach, Xnew): Return predictions of the target given new features Xnew, which should have the same scitype as X above.","category":"page"},{"location":"models/RidgeRegressor_MultivariateStats/#Fitted-parameters","page":"RidgeRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/RidgeRegressor_MultivariateStats/","page":"RidgeRegressor","title":"RidgeRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/RidgeRegressor_MultivariateStats/","page":"RidgeRegressor","title":"RidgeRegressor","text":"coefficients: The linear coefficients determined by the model.\nintercept: The intercept determined by the model.","category":"page"},{"location":"models/RidgeRegressor_MultivariateStats/#Examples","page":"RidgeRegressor","title":"Examples","text":"","category":"section"},{"location":"models/RidgeRegressor_MultivariateStats/","page":"RidgeRegressor","title":"RidgeRegressor","text":"using MLJ\n\nRidgeRegressor = @load RidgeRegressor pkg=MultivariateStats\npipe = Standardizer() |> RidgeRegressor(lambda=10)\n\nX, y = @load_boston\n\nmach = machine(pipe, X, y) |> fit!\nyhat = predict(mach, X)\ntraining_error = l1(yhat, y) |> mean","category":"page"},{"location":"models/RidgeRegressor_MultivariateStats/","page":"RidgeRegressor","title":"RidgeRegressor","text":"See also LinearRegressor, MultitargetLinearRegressor, MultitargetRidgeRegressor","category":"page"},{"location":"models/GradientBoostingClassifier_MLJScikitLearnInterface/#GradientBoostingClassifier_MLJScikitLearnInterface","page":"GradientBoostingClassifier","title":"GradientBoostingClassifier","text":"","category":"section"},{"location":"models/GradientBoostingClassifier_MLJScikitLearnInterface/","page":"GradientBoostingClassifier","title":"GradientBoostingClassifier","text":"GradientBoostingClassifier","category":"page"},{"location":"models/GradientBoostingClassifier_MLJScikitLearnInterface/","page":"GradientBoostingClassifier","title":"GradientBoostingClassifier","text":"A model type for constructing a gradient boosting classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/GradientBoostingClassifier_MLJScikitLearnInterface/","page":"GradientBoostingClassifier","title":"GradientBoostingClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/GradientBoostingClassifier_MLJScikitLearnInterface/","page":"GradientBoostingClassifier","title":"GradientBoostingClassifier","text":"GradientBoostingClassifier = @load GradientBoostingClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/GradientBoostingClassifier_MLJScikitLearnInterface/","page":"GradientBoostingClassifier","title":"GradientBoostingClassifier","text":"Do model = GradientBoostingClassifier() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in GradientBoostingClassifier(loss=...).","category":"page"},{"location":"models/GradientBoostingClassifier_MLJScikitLearnInterface/","page":"GradientBoostingClassifier","title":"GradientBoostingClassifier","text":"This algorithm builds an additive model in a forward stage-wise fashion; it allows for the optimization of arbitrary differentiable loss functions. In each stage n_classes_ regression trees are fit on the negative gradient of the loss function, e.g. binary or multiclass log loss. Binary classification is a special case where only a single regression tree is induced.","category":"page"},{"location":"models/GradientBoostingClassifier_MLJScikitLearnInterface/","page":"GradientBoostingClassifier","title":"GradientBoostingClassifier","text":"HistGradientBoostingClassifier is a much faster variant of this algorithm for intermediate datasets (n_samples >= 10_000).","category":"page"},{"location":"models/SGDClassifier_MLJScikitLearnInterface/#SGDClassifier_MLJScikitLearnInterface","page":"SGDClassifier","title":"SGDClassifier","text":"","category":"section"},{"location":"models/SGDClassifier_MLJScikitLearnInterface/","page":"SGDClassifier","title":"SGDClassifier","text":"SGDClassifier","category":"page"},{"location":"models/SGDClassifier_MLJScikitLearnInterface/","page":"SGDClassifier","title":"SGDClassifier","text":"A model type for constructing a sgd classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/SGDClassifier_MLJScikitLearnInterface/","page":"SGDClassifier","title":"SGDClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/SGDClassifier_MLJScikitLearnInterface/","page":"SGDClassifier","title":"SGDClassifier","text":"SGDClassifier = @load SGDClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/SGDClassifier_MLJScikitLearnInterface/","page":"SGDClassifier","title":"SGDClassifier","text":"Do model = SGDClassifier() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in SGDClassifier(loss=...).","category":"page"},{"location":"models/SGDClassifier_MLJScikitLearnInterface/#Hyper-parameters","page":"SGDClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/SGDClassifier_MLJScikitLearnInterface/","page":"SGDClassifier","title":"SGDClassifier","text":"loss = hinge\npenalty = l2\nalpha = 0.0001\nl1_ratio = 0.15\nfit_intercept = true\nmax_iter = 1000\ntol = 0.001\nshuffle = true\nverbose = 0\nepsilon = 0.1\nn_jobs = nothing\nrandom_state = nothing\nlearning_rate = optimal\neta0 = 0.0\npower_t = 0.5\nearly_stopping = false\nvalidation_fraction = 0.1\nn_iter_no_change = 5\nclass_weight = nothing\nwarm_start = false\naverage = false","category":"page"},{"location":"models/PCA_MultivariateStats/#PCA_MultivariateStats","page":"PCA","title":"PCA","text":"","category":"section"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"PCA","category":"page"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"A model type for constructing a pca, based on MultivariateStats.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"PCA = @load PCA pkg=MultivariateStats","category":"page"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"Do model = PCA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in PCA(maxoutdim=...).","category":"page"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"Principal component analysis learns a linear projection onto a lower dimensional space while preserving most of the initial variance seen in the training data.","category":"page"},{"location":"models/PCA_MultivariateStats/#Training-data","page":"PCA","title":"Training data","text":"","category":"section"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"mach = machine(model, X)","category":"page"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"Here:","category":"page"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).","category":"page"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/PCA_MultivariateStats/#Hyper-parameters","page":"PCA","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"maxoutdim=0: Together with variance_ratio, controls the output dimension outdim chosen by the model. Specifically, suppose that k is the smallest integer such that retaining the k most significant principal components accounts for variance_ratio of the total variance in the training data. Then outdim = min(k, maxoutdim). 
If maxoutdim=0 (default) then the effective maxoutdim is min(n, indim - 1) where n is the number of observations and indim the number of features in the training data.\nvariance_ratio::Float64=0.99: The ratio of variance preserved after the transformation\nmethod=:auto: The method to use to solve the problem. Choices are\n:svd: Singular Value Decomposition of the matrix.\n:cov: Covariance matrix decomposition.\n:auto: Use :cov if the matrix's first dimension is smaller than its second dimension and otherwise use :svd\nmean=nothing: if nothing, centering will be computed and applied; if set to 0, no centering is applied (data is assumed pre-centered); if a vector is passed, the centering is done with that vector.","category":"page"},{"location":"models/PCA_MultivariateStats/#Operations","page":"PCA","title":"Operations","text":"","category":"section"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.\ninverse_transform(mach, Xsmall): For a dimension-reduced table Xsmall, such as returned by transform, reconstruct a table, having the same number of columns as the original training data X, that transforms to Xsmall. Mathematically, inverse_transform is a right-inverse for the PCA projection map, whose image is orthogonal to the kernel of that map. In particular, if Xsmall = transform(mach, Xnew), then inverse_transform(Xsmall) is only an approximation to Xnew.","category":"page"},{"location":"models/PCA_MultivariateStats/#Fitted-parameters","page":"PCA","title":"Fitted parameters","text":"","category":"section"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"projection: Returns the projection matrix, which has size (indim, outdim), where indim and outdim are the number of features of the input and output respectively.","category":"page"},{"location":"models/PCA_MultivariateStats/#Report","page":"PCA","title":"Report","text":"","category":"section"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"The fields of report(mach) are:","category":"page"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"indim: Dimension (number of columns) of the training data and new data to be transformed.\noutdim = min(n, indim, maxoutdim) is the output dimension; here n is the number of observations.\ntprincipalvar: Total variance of the principal components.\ntresidualvar: Total residual variance.\ntvar: Total observation variance (principal + residual variance).\nmean: The mean of the untransformed training data, of length indim.\nprincipalvars: The variance of the principal components. An AbstractVector of length outdim\nloadings: The model's loadings, weights for each variable used when calculating principal components. 
A matrix of size (indim, outdim) where indim and outdim are as defined above.","category":"page"},{"location":"models/PCA_MultivariateStats/#Examples","page":"PCA","title":"Examples","text":"","category":"section"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"using MLJ\n\nPCA = @load PCA pkg=MultivariateStats\n\nX, y = @load_iris ## a table and a vector\n\nmodel = PCA(maxoutdim=2)\nmach = machine(model, X) |> fit!\n\nXproj = transform(mach, X)","category":"page"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"See also KernelPCA, ICA, FactorAnalysis, PPCA","category":"page"},{"location":"models/ENNUndersampler_Imbalance/#ENNUndersampler_Imbalance","page":"ENNUndersampler","title":"ENNUndersampler","text":"","category":"section"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"Initiate a ENN undersampling model with the given hyper-parameters.","category":"page"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"ENNUndersampler","category":"page"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"A model type for constructing a enn undersampler, based on Imbalance.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"ENNUndersampler = @load ENNUndersampler pkg=Imbalance","category":"page"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"Do model = ENNUndersampler() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ENNUndersampler(k=...).","category":"page"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"ENNUndersampler undersamples a dataset by removing (\"cleaning\") points that violate a certain condition such as having a different class compared to the majority of the neighbors as proposed in Dennis L Wilson. Asymptotic properties of nearest neighbor rules using edited data. IEEE Transactions on Systems, Man, and Cybernetics, pages 408–421, 1972.","category":"page"},{"location":"models/ENNUndersampler_Imbalance/#Training-data","page":"ENNUndersampler","title":"Training data","text":"","category":"section"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"In MLJ or MLJBase, wrap the model in a machine by \tmach = machine(model)","category":"page"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"There is no need to provide any data here because the model is a static transformer.","category":"page"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"Likewise, there is no need to fit!(mach). 
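As a minimal sketch of this pattern (assuming a feature table X and a label vector y are already defined; the model type is loaded exactly as shown above):\n\nENNUndersampler = @load ENNUndersampler pkg=Imbalance\nmach = machine(ENNUndersampler()) ## static transformer: no data is bound to the machine\nX_under, y_under = transform(mach, X, y) ## data is passed to transform instead\n\nA complete, runnable example is given in the \"Example\" section below.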
","category":"page"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"For default values of the hyper-parameters, model can be constructed by \tmodel = ENNUndersampler()","category":"page"},{"location":"models/ENNUndersampler_Imbalance/#Hyperparameters","page":"ENNUndersampler","title":"Hyperparameters","text":"","category":"section"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"k::Integer=5: Number of nearest neighbors to consider in the algorithm. Should be within the range 0 < k < n where n is the number of observations in the smallest class.\nkeep_condition::AbstractString=\"mode\": The condition that leads to cleaning a point upon violation. Takes one of \"exists\", \"mode\", \"only mode\" and \"all\"","category":"page"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"- `\"exists\"`: the point has at least one neighbor from the same class\n- `\"mode\"`: the class of the point is one of the most frequent classes of the neighbors (there may be many)\n- `\"only mode\"`: the class of the point is the single most frequent class of the neighbors\n- `\"all\"`: the class of the point is the same as all the neighbors","category":"page"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"min_ratios=1.0: A parameter that controls the maximum amount of undersampling to be done for each class. If this algorithm cleans the data to an extent that this is violated, some of the cleaned points will be revived randomly so that it is satisfied.\nCan be a float and in this case each class will be at most undersampled to the size of the minority class times the float. By default, all classes are undersampled to the size of the minority class\nCan be a dictionary mapping each class label to the float minimum ratio for that class\nforce_min_ratios=false: If true, and this algorithm cleans the data such that the ratios for each class exceed those specified in min_ratios then further undersampling will be perform so that the final ratios are equal to min_ratios.\nrng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister`.\ntry_preserve_type::Bool=true: When true, the function will try to not change the type of the input table (e.g., DataFrame). However, for some tables, this may not succeed, and in this case, the table returned will be a column table (named-tuple of vectors). 
This parameter is ignored if the input is a matrix.","category":"page"},{"location":"models/ENNUndersampler_Imbalance/#Transform-Inputs","page":"ENNUndersampler","title":"Transform Inputs","text":"","category":"section"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"X: A matrix or table of floats where each row is an observation from the dataset\ny: An abstract vector of labels (e.g., strings) that correspond to the observations in X","category":"page"},{"location":"models/ENNUndersampler_Imbalance/#Transform-Outputs","page":"ENNUndersampler","title":"Transform Outputs","text":"","category":"section"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"X_under: A matrix or table that includes the data after undersampling depending on whether the input X is a matrix or table respectively\ny_under: An abstract vector of labels corresponding to X_under","category":"page"},{"location":"models/ENNUndersampler_Imbalance/#Operations","page":"ENNUndersampler","title":"Operations","text":"","category":"section"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"transform(mach, X, y): resample the data X and y using ENNUndersampler, returning the undersampled versions","category":"page"},{"location":"models/ENNUndersampler_Imbalance/#Example","page":"ENNUndersampler","title":"Example","text":"","category":"section"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"using MLJ\nimport Imbalance\n\n## set probability of each class\nclass_probs = [0.5, 0.2, 0.3] \nnum_rows, num_continuous_feats = 100, 5\n## generate a table and categorical vector accordingly\nX, y = Imbalance.generate_imbalanced_data(num_rows, num_continuous_feats; \n min_sep=0.01, stds=[3.0 3.0 3.0], class_probs, rng=42) \n\njulia> Imbalance.checkbalance(y; ref=\"minority\")\n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%) \n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 33 (173.7%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (252.6%) \n\n## load ENN model type:\nENNUndersampler = @load ENNUndersampler pkg=Imbalance\n\n## underample the majority classes to sizes relative to the minority class:\nundersampler = ENNUndersampler(min_ratios=0.5, rng=42)\nmach = machine(undersampler)\nX_under, y_under = transform(mach, X, y)\n\njulia> Imbalance.checkbalance(y_under; ref=\"minority\")\n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 10 (100.0%) \n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 10 (100.0%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 24 (240.0%) ","category":"page"},{"location":"acceleration_and_parallelism/#Acceleration-and-Parallelism","page":"Acceleration and Parallelism","title":"Acceleration and Parallelism","text":"","category":"section"},{"location":"acceleration_and_parallelism/#User-facing-interface","page":"Acceleration and Parallelism","title":"User-facing interface","text":"","category":"section"},{"location":"acceleration_and_parallelism/","page":"Acceleration and Parallelism","title":"Acceleration and Parallelism","text":"To enable composable, extensible acceleration of core MLJ methods, ComputationalResources.jl is utilized to provide some basic types and functions to make implementing acceleration easy. 
However, ambitious users or package authors have the option to define their own types to be passed as resources to acceleration, which must be <:ComputationalResources.AbstractResource.","category":"page"},{"location":"acceleration_and_parallelism/","page":"Acceleration and Parallelism","title":"Acceleration and Parallelism","text":"Methods which support some form of acceleration support the acceleration keyword argument, which can be passed a \"resource\" from ComputationalResources. For example, passing acceleration=CPUProcesses() will utilize Distributed's multiprocessing functionality to accelerate the computation, while acceleration=CPUThreads() will use Julia's PARTR threading model to perform acceleration.","category":"page"},{"location":"acceleration_and_parallelism/","page":"Acceleration and Parallelism","title":"Acceleration and Parallelism","text":"The default computational resource is CPU1(), which is simply serial processing via CPU. The default resource can be changed as in this example: MLJ.default_resource(CPUProcesses()). The argument must always have type <:ComputationalResource.AbstractResource. To inspect the current default, use MLJ.default_resource().","category":"page"},{"location":"acceleration_and_parallelism/","page":"Acceleration and Parallelism","title":"Acceleration and Parallelism","text":"note: Note\nYou cannot use CPUThreads() with models wrapping python code.","category":"page"},{"location":"models/MiniBatchKMeans_MLJScikitLearnInterface/#MiniBatchKMeans_MLJScikitLearnInterface","page":"MiniBatchKMeans","title":"MiniBatchKMeans","text":"","category":"section"},{"location":"models/MiniBatchKMeans_MLJScikitLearnInterface/","page":"MiniBatchKMeans","title":"MiniBatchKMeans","text":"MiniBatchKMeans","category":"page"},{"location":"models/MiniBatchKMeans_MLJScikitLearnInterface/","page":"MiniBatchKMeans","title":"MiniBatchKMeans","text":"A model type for constructing a Mini-Batch K-Means clustering., based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/MiniBatchKMeans_MLJScikitLearnInterface/","page":"MiniBatchKMeans","title":"MiniBatchKMeans","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/MiniBatchKMeans_MLJScikitLearnInterface/","page":"MiniBatchKMeans","title":"MiniBatchKMeans","text":"MiniBatchKMeans = @load MiniBatchKMeans pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/MiniBatchKMeans_MLJScikitLearnInterface/","page":"MiniBatchKMeans","title":"MiniBatchKMeans","text":"Do model = MiniBatchKMeans() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in MiniBatchKMeans(n_clusters=...).","category":"page"},{"location":"models/MiniBatchKMeans_MLJScikitLearnInterface/#Hyper-parameters","page":"MiniBatchKMeans","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/MiniBatchKMeans_MLJScikitLearnInterface/","page":"MiniBatchKMeans","title":"MiniBatchKMeans","text":"n_clusters = 8\nmax_iter = 100\nbatch_size = 100\nverbose = 0\ncompute_labels = true\nrandom_state = nothing\ntol = 0.0\nmax_no_improvement = 10\ninit_size = nothing\nn_init = 3\ninit = k-means++\nreassignment_ratio = 0.01","category":"page"},{"location":"models/TomekUndersampler_Imbalance/#TomekUndersampler_Imbalance","page":"TomekUndersampler","title":"TomekUndersampler","text":"","category":"section"},{"location":"models/TomekUndersampler_Imbalance/","page":"TomekUndersampler","title":"TomekUndersampler","text":"Initiate a tomek undersampling model with the given hyper-parameters.","category":"page"},{"location":"models/TomekUndersampler_Imbalance/","page":"TomekUndersampler","title":"TomekUndersampler","text":"TomekUndersampler","category":"page"},{"location":"models/TomekUndersampler_Imbalance/","page":"TomekUndersampler","title":"TomekUndersampler","text":"A model type for constructing a tomek undersampler, based on Imbalance.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/TomekUndersampler_Imbalance/","page":"TomekUndersampler","title":"TomekUndersampler","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/TomekUndersampler_Imbalance/","page":"TomekUndersampler","title":"TomekUndersampler","text":"TomekUndersampler = @load TomekUndersampler pkg=Imbalance","category":"page"},{"location":"models/TomekUndersampler_Imbalance/","page":"TomekUndersampler","title":"TomekUndersampler","text":"Do model = TomekUndersampler() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in TomekUndersampler(min_ratios=...).","category":"page"},{"location":"models/TomekUndersampler_Imbalance/","page":"TomekUndersampler","title":"TomekUndersampler","text":"TomekUndersampler undersamples by removing any point that is part of a tomek link in the data. As defined in, Ivan Tomek. Two modifications of cnn. IEEE Trans. Systems, Man and Cybernetics, 6:769–772, 1976.","category":"page"},{"location":"models/TomekUndersampler_Imbalance/#Training-data","page":"TomekUndersampler","title":"Training data","text":"","category":"section"},{"location":"models/TomekUndersampler_Imbalance/","page":"TomekUndersampler","title":"TomekUndersampler","text":"In MLJ or MLJBase, wrap the model in a machine by mach = machine(model)","category":"page"},{"location":"models/TomekUndersampler_Imbalance/","page":"TomekUndersampler","title":"TomekUndersampler","text":"There is no need to provide any data here because the model is a static transformer.","category":"page"},{"location":"models/TomekUndersampler_Imbalance/","page":"TomekUndersampler","title":"TomekUndersampler","text":"Likewise, there is no need to fit!(mach). 
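For instance, a minimal sketch (assuming X and y already exist; TomekUndersampler is loaded as shown above):\n\nmach = machine(TomekUndersampler()) ## static transformer: no training data bound to the machine\nX_under, y_under = transform(mach, X, y)\n\nSee the \"Example\" section below for a complete, runnable version.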
","category":"page"},{"location":"models/TomekUndersampler_Imbalance/","page":"TomekUndersampler","title":"TomekUndersampler","text":"For default values of the hyper-parameters, model can be constructed by model = TomekUndersampler()","category":"page"},{"location":"models/TomekUndersampler_Imbalance/#Hyperparameters","page":"TomekUndersampler","title":"Hyperparameters","text":"","category":"section"},{"location":"models/TomekUndersampler_Imbalance/","page":"TomekUndersampler","title":"TomekUndersampler","text":"min_ratios=1.0: A parameter that controls the maximum amount of undersampling to be done for each class. If this algorithm cleans the data to an extent that this is violated, some of the cleaned points will be revived randomly so that it is satisfied.\nCan be a float and in this case each class will be at most undersampled to the size of the minority class times the float. By default, all classes are undersampled to the size of the minority class\nCan be a dictionary mapping each class label to the float minimum ratio for that class\nforce_min_ratios=false: If true, and this algorithm cleans the data such that the ratios for each class exceed those specified in min_ratios then further undersampling will be perform so that the final ratios are equal to min_ratios.\nrng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister`.\ntry_preserve_type::Bool=true: When true, the function will try to not change the type of the input table (e.g., DataFrame). However, for some tables, this may not succeed, and in this case, the table returned will be a column table (named-tuple of vectors). This parameter is ignored if the input is a matrix.","category":"page"},{"location":"models/TomekUndersampler_Imbalance/#Transform-Inputs","page":"TomekUndersampler","title":"Transform Inputs","text":"","category":"section"},{"location":"models/TomekUndersampler_Imbalance/","page":"TomekUndersampler","title":"TomekUndersampler","text":"X: A matrix or table of floats where each row is an observation from the dataset\ny: An abstract vector of labels (e.g., strings) that correspond to the observations in X","category":"page"},{"location":"models/TomekUndersampler_Imbalance/#Transform-Outputs","page":"TomekUndersampler","title":"Transform Outputs","text":"","category":"section"},{"location":"models/TomekUndersampler_Imbalance/","page":"TomekUndersampler","title":"TomekUndersampler","text":"X_under: A matrix or table that includes the data after undersampling depending on whether the input X is a matrix or table respectively\ny_under: An abstract vector of labels corresponding to X_under","category":"page"},{"location":"models/TomekUndersampler_Imbalance/#Operations","page":"TomekUndersampler","title":"Operations","text":"","category":"section"},{"location":"models/TomekUndersampler_Imbalance/","page":"TomekUndersampler","title":"TomekUndersampler","text":"transform(mach, X, y): resample the data X and y using TomekUndersampler, returning both the new and original observations","category":"page"},{"location":"models/TomekUndersampler_Imbalance/#Example","page":"TomekUndersampler","title":"Example","text":"","category":"section"},{"location":"models/TomekUndersampler_Imbalance/","page":"TomekUndersampler","title":"TomekUndersampler","text":"using MLJ\nimport Imbalance\n\n## set probability of each class\nclass_probs = [0.5, 0.2, 0.3] \nnum_rows, num_continuous_feats = 100, 5\n## generate a 
table and categorical vector accordingly\nX, y = Imbalance.generate_imbalanced_data(num_rows, num_continuous_feats; \n min_sep=0.01, stds=[3.0 3.0 3.0], class_probs, rng=42) \n\njulia> Imbalance.checkbalance(y; ref=\"minority\")\n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%) \n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 33 (173.7%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (252.6%) \n\n## load TomekUndersampler model type:\nTomekUndersampler = @load TomekUndersampler pkg=Imbalance\n\n## Underample the majority classes to sizes relative to the minority class:\ntomek_undersampler = TomekUndersampler(min_ratios=1.0, rng=42)\nmach = machine(tomek_undersampler)\nX_under, y_under = transform(mach, X, y)\n\njulia> Imbalance.checkbalance(y_under; ref=\"minority\")\n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%) \n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 22 (115.8%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 36 (189.5%)","category":"page"},{"location":"models/OneRuleClassifier_OneRule/#OneRuleClassifier_OneRule","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"","category":"section"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"OneRuleClassifier","category":"page"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"A model type for constructing a one rule classifier, based on OneRule.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"OneRuleClassifier = @load OneRuleClassifier pkg=OneRule","category":"page"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"Do model = OneRuleClassifier() to construct an instance with default hyper-parameters. ","category":"page"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"OneRuleClassifier implements the OneRule method for classification by Robert Holte (\"Very simple classification rules perform well on most commonly used datasets\" in: Machine Learning 11.1 (1993), pp. 63-90). ","category":"page"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"For more information see:\n\n- Witten, Ian H., Eibe Frank, and Mark A. Hall. \n Data Mining Practical Machine Learning Tools and Techniques Third Edition. \n Morgan Kaufmann, 2017, pp. 
93-96.\n- [Machine Learning - (One|Simple) Rule](https://datacadamia.com/data_mining/one_rule)\n- [OneRClassifier - One Rule for Classification](http://rasbt.github.io/mlxtend/user_guide/classifier/OneRClassifier/)","category":"page"},{"location":"models/OneRuleClassifier_OneRule/#Training-data","page":"OneRuleClassifier","title":"Training data","text":"","category":"section"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"In MLJ or MLJBase, bind an instance model to data with mach = machine(model, X, y) where","category":"page"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Multiclass, OrderedFactor, or <:Finite; check column scitypes with schema(X)\ny: is the target, which can be any AbstractVector whose element scitype is OrderedFactor or Multiclass; check the scitype with scitype(y)","category":"page"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/OneRuleClassifier_OneRule/#Hyper-parameters","page":"OneRuleClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"This classifier has no hyper-parameters.","category":"page"},{"location":"models/OneRuleClassifier_OneRule/#Operations","page":"OneRuleClassifier","title":"Operations","text":"","category":"section"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"predict(mach, Xnew): return (deterministic) predictions of the target given features Xnew having the same scitype as X above.","category":"page"},{"location":"models/OneRuleClassifier_OneRule/#Fitted-parameters","page":"OneRuleClassifier","title":"Fitted parameters","text":"","category":"section"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"tree: the tree (a OneTree) returned by the core OneTree.jl algorithm\nall_classes: all classes (i.e. 
levels) of the target (used also internally to transfer levels-information to predict)","category":"page"},{"location":"models/OneRuleClassifier_OneRule/#Report","page":"OneRuleClassifier","title":"Report","text":"","category":"section"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"The fields of report(mach) are:","category":"page"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"tree: The OneTree created based on the training data\nnrules: The number of rules tree contains\nerror_rate: fraction of wrongly classified instances\nerror_count: number of wrongly classified instances\nclasses_seen: list of target classes actually observed in training\nfeatures: the names of the features encountered in training","category":"page"},{"location":"models/OneRuleClassifier_OneRule/#Examples","page":"OneRuleClassifier","title":"Examples","text":"","category":"section"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"using MLJ\n\nORClassifier = @load OneRuleClassifier pkg=OneRule\n\norc = ORClassifier()\n\noutlook = [\"sunny\", \"sunny\", \"overcast\", \"rainy\", \"rainy\", \"rainy\", \"overcast\", \"sunny\", \"sunny\", \"rainy\", \"sunny\", \"overcast\", \"overcast\", \"rainy\"]\ntemperature = [\"hot\", \"hot\", \"hot\", \"mild\", \"cool\", \"cool\", \"cool\", \"mild\", \"cool\", \"mild\", \"mild\", \"mild\", \"hot\", \"mild\"]\nhumidity = [\"high\", \"high\", \"high\", \"high\", \"normal\", \"normal\", \"normal\", \"high\", \"normal\", \"normal\", \"normal\", \"high\", \"normal\", \"high\"]\nwindy = [\"false\", \"true\", \"false\", \"false\", \"false\", \"true\", \"true\", \"false\", \"false\", \"false\", \"true\", \"true\", \"false\", \"true\"]\n\nweather_data = (outlook = outlook, temperature = temperature, humidity = humidity, windy = windy)\nplay_data = [\"no\", \"no\", \"yes\", \"yes\", \"yes\", \"no\", \"yes\", \"no\", \"yes\", \"yes\", \"yes\", \"yes\", \"yes\", \"no\"]\n\nweather = coerce(weather_data, Textual => Multiclass)\nplay = coerce(play_data, Multiclass)\n\nmach = machine(orc, weather, play)\nfit!(mach)\n\nyhat = MLJ.predict(mach, weather) ## in a real context 'new' `weather` data would be used\none_tree = fitted_params(mach).tree\nreport(mach).error_rate","category":"page"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"See also OneRule.jl.","category":"page"},{"location":"models/MultinomialNBClassifier_NaiveBayes/#MultinomialNBClassifier_NaiveBayes","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"","category":"section"},{"location":"models/MultinomialNBClassifier_NaiveBayes/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"MultinomialNBClassifier","category":"page"},{"location":"models/MultinomialNBClassifier_NaiveBayes/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"A model type for constructing a multinomial naive Bayes classifier, based on NaiveBayes.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/MultinomialNBClassifier_NaiveBayes/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"From MLJ, the type can be imported 
using","category":"page"},{"location":"models/MultinomialNBClassifier_NaiveBayes/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"MultinomialNBClassifier = @load MultinomialNBClassifier pkg=NaiveBayes","category":"page"},{"location":"models/MultinomialNBClassifier_NaiveBayes/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"Do model = MultinomialNBClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultinomialNBClassifier(alpha=...).","category":"page"},{"location":"models/MultinomialNBClassifier_NaiveBayes/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"The multinomial naive Bayes classifier is often applied when input features consist of counts (scitype Count) and when observations for a fixed target class are generated from a multinomial distribution with fixed probability vector, but whose sample length varies from observation to observation. For example, features might represent word counts in text documents being classified by sentiment.","category":"page"},{"location":"models/MultinomialNBClassifier_NaiveBayes/#Training-data","page":"MultinomialNBClassifier","title":"Training data","text":"","category":"section"},{"location":"models/MultinomialNBClassifier_NaiveBayes/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/MultinomialNBClassifier_NaiveBayes/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/MultinomialNBClassifier_NaiveBayes/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"Here:","category":"page"},{"location":"models/MultinomialNBClassifier_NaiveBayes/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Count; check the column scitypes with schema(X).\ny is the target, which can be any AbstractVector whose element scitype is Finite; check the scitype with scitype(y).","category":"page"},{"location":"models/MultinomialNBClassifier_NaiveBayes/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/MultinomialNBClassifier_NaiveBayes/#Hyper-parameters","page":"MultinomialNBClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/MultinomialNBClassifier_NaiveBayes/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"alpha=1: Lidstone smoothing in estimation of multinomial probability vectors from training histograms (default corresponds to Laplacian smoothing).","category":"page"},{"location":"models/MultinomialNBClassifier_NaiveBayes/#Operations","page":"MultinomialNBClassifier","title":"Operations","text":"","category":"section"},{"location":"models/MultinomialNBClassifier_NaiveBayes/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"predict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above.\npredict_mode(mach, Xnew): Return the mode of above predictions.","category":"page"},{"location":"models/MultinomialNBClassifier_NaiveBayes/#Fitted-parameters","page":"MultinomialNBClassifier","title":"Fitted 
parameters","text":"","category":"section"},{"location":"models/MultinomialNBClassifier_NaiveBayes/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/MultinomialNBClassifier_NaiveBayes/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"c_counts: A dictionary containing the observed count of each input class.\nx_counts: A dictionary containing the categorical counts of each input class.\nx_totals: The sum of each count (input feature), ungrouped.\nn_obs: The total number of observations in the training data.","category":"page"},{"location":"models/MultinomialNBClassifier_NaiveBayes/#Examples","page":"MultinomialNBClassifier","title":"Examples","text":"","category":"section"},{"location":"models/MultinomialNBClassifier_NaiveBayes/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"using MLJ\nimport TextAnalysis\n\nCountTransformer = @load CountTransformer pkg=MLJText\nMultinomialNBClassifier = @load MultinomialNBClassifier pkg=NaiveBayes\n\ntokenized_docs = TextAnalysis.tokenize.([\n \"I am very mad. You never listen.\",\n \"You seem to be having trouble? Can I help you?\",\n \"Our boss is mad at me. I hope he dies.\",\n \"His boss wants to help me. She is nice.\",\n \"Thank you for your help. It is nice working with you.\",\n \"Never do that again! I am so mad. \",\n])\n\nsentiment = [\n \"negative\",\n \"positive\",\n \"negative\",\n \"positive\",\n \"positive\",\n \"negative\",\n]\n\nmach1 = machine(CountTransformer(), tokenized_docs) |> fit!\n\n## matrix of counts:\nX = transform(mach1, tokenized_docs)\n\n## to ensure scitype(y) <: AbstractVector{<:OrderedFactor}:\ny = coerce(sentiment, OrderedFactor)\n\nclassifier = MultinomialNBClassifier()\nmach2 = machine(classifier, X, y)\nfit!(mach2, rows=1:4)\n\n## probabilistic predictions:\ny_prob = predict(mach2, rows=5:6) ## distributions\npdf.(y_prob, \"positive\") ## probabilities for \"positive\"\nlog_loss(y_prob, y[5:6])\n\n## point predictions:\nyhat = mode.(y_prob) ## or `predict_mode(mach2, rows=5:6)`","category":"page"},{"location":"models/MultinomialNBClassifier_NaiveBayes/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"See also GaussianNBClassifier","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/#ProbabilisticNuSVC_LIBSVM","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"","category":"section"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"ProbabilisticNuSVC","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"A model type for constructing a probabilistic ν-support vector classifier, based on LIBSVM.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"ProbabilisticNuSVC = @load ProbabilisticNuSVC pkg=LIBSVM","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"Do model = ProbabilisticNuSVC() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in ProbabilisticNuSVC(kernel=...).","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"This model is identical to NuSVC with the exception that it predicts probabilities, instead of actual class labels. Probabilities are computed using Platt scaling, which will add to total computation time.","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): \"LIBSVM: a library for support vector machines.\" ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf. ","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"Platt, John (1999): \"Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods.\"","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/#Training-data","page":"ProbabilisticNuSVC","title":"Training data","text":"","category":"section"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"In MLJ or MLJBase, bind an instance model to data with:","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"where","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)\ny: is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/#Hyper-parameters","page":"ProbabilisticNuSVC","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see \"Examples\" below).\nLIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2\nLIBSVM.Kernel.Polynomial: (x1, x2) -> (gamma*x1'*x2 + coef0)^degree\nLIBSVM.Kernel.RadialBasis: (x1, x2) -> (exp(-gamma*norm(x1 - x2)^2))\nLIBSVM.Kernel.Sigmoid: (x1, x2) -> tanh(gamma*x1'*x2 + coef0)\nHere gamma, coef0, degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions. See LIBSVM.jl issue 91\ngamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. 
Actual value used appears in the report (see below).\ncoef0 = 0.0: kernel parameter (see above)\ndegree::Int32 = Int32(3): degree in polynomial kernel (see above)\nnu=0.5 (range (0, 1]): An upper bound on the fraction of margin errors and a lower bound of the fraction of support vectors. Denoted ν in the cited paper. Changing nu changes the thickness of the margin (a neighborhood of the decision surface) and a margin error is said to have occurred if a training observation lies on the wrong side of the surface or within the margin.\ncachesize=200.0 cache memory size in MB\ntolerance=0.001: tolerance for the stopping criterion\nshrinking=true: whether to use shrinking heuristics","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/#Operations","page":"ProbabilisticNuSVC","title":"Operations","text":"","category":"section"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above.","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/#Fitted-parameters","page":"ProbabilisticNuSVC","title":"Fitted parameters","text":"","category":"section"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"libsvm_model: the trained model object created by the LIBSVM.jl package\nencoding: class encoding used internally by libsvm_model - a dictionary of class labels keyed on the internal integer representation","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/#Report","page":"ProbabilisticNuSVC","title":"Report","text":"","category":"section"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"The fields of report(mach) are:","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"gamma: actual value of the kernel parameter gamma used in training","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/#Examples","page":"ProbabilisticNuSVC","title":"Examples","text":"","category":"section"},{"location":"models/ProbabilisticNuSVC_LIBSVM/#Using-a-built-in-kernel","page":"ProbabilisticNuSVC","title":"Using a built-in kernel","text":"","category":"section"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"using MLJ\nimport LIBSVM\n\nProbabilisticNuSVC = @load ProbabilisticNuSVC pkg=LIBSVM ## model type\nmodel = ProbabilisticNuSVC(kernel=LIBSVM.Kernel.Polynomial) ## instance\n\nX, y = @load_iris ## table, vector\nmach = machine(model, X, y) |> fit!\n\nXnew = (sepal_length = [6.4, 7.2, 7.4],\n sepal_width = [2.8, 3.0, 2.8],\n petal_length = [5.6, 5.8, 6.1],\n petal_width = [2.1, 1.6, 1.9],)\n\njulia> probs = predict(mach, Xnew)\n3-element UnivariateFiniteVector{Multiclass{3}, String, UInt32, Float64}:\n UnivariateFinite{Multiclass{3}}(setosa=>0.00313, versicolor=>0.0247, virginica=>0.972)\n UnivariateFinite{Multiclass{3}}(setosa=>0.000598, versicolor=>0.0155, virginica=>0.984)\n UnivariateFinite{Multiclass{3}}(setosa=>2.27e-6, versicolor=>2.73e-6, virginica=>1.0)\n\njulia> yhat = mode.(probs)\n3-element CategoricalArrays.CategoricalArray{String,1,UInt32}:\n \"virginica\"\n 
\"virginica\"\n \"virginica\"","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/#User-defined-kernels","page":"ProbabilisticNuSVC","title":"User-defined kernels","text":"","category":"section"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"k(x1, x2) = x1'*x2 ## equivalent to `LIBSVM.Kernel.Linear`\nmodel = ProbabilisticNuSVC(kernel=k)\nmach = machine(model, X, y) |> fit!\n\nprobs = predict(mach, Xnew)","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"See also the classifiers NuSVC, SVC, ProbabilisticSVC and LinearSVC. And see LIBSVM.jl and the original C implementation documentation.","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/#UnivariateFillImputer_MLJModels","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"","category":"section"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"UnivariateFillImputer","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"A model type for constructing a single variable fill imputer, based on MLJModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"UnivariateFillImputer = @load UnivariateFillImputer pkg=MLJModels","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"Do model = UnivariateFillImputer() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in UnivariateFillImputer(continuous_fill=...).","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"Use this model to impute missing values in a vector with a fixed value learned from the non-missing values of the training vector.","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"For imputing missing values in tabular data, use FillImputer instead.","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/#Training-data","page":"UnivariateFillImputer","title":"Training data","text":"","category":"section"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"mach = machine(model, x)","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"where","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"x: any abstract vector with element scitype Union{Missing, T} where T is a subtype of Continuous, Multiclass, OrderedFactor or Count; check scitype using scitype(x)","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/#Hyper-parameters","page":"UnivariateFillImputer","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"continuous_fill: function or other callable to determine value to be imputed in the case of Continuous (abstract float) data; default is to apply median after skipping missing values\ncount_fill: function or other callable to determine value to be imputed in the case of Count (integer) data; default is to apply rounded median after skipping missing values\nfinite_fill: function or other callable to determine value to be imputed in the case of Multiclass or OrderedFactor data (categorical vectors); default is to apply mode after skipping missing values","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/#Operations","page":"UnivariateFillImputer","title":"Operations","text":"","category":"section"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"transform(mach, xnew): return xnew with missing values imputed with the fill values learned when fitting mach","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/#Fitted-parameters","page":"UnivariateFillImputer","title":"Fitted parameters","text":"","category":"section"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"filler: the 
fill value to be imputed in all new data","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/#Examples","page":"UnivariateFillImputer","title":"Examples","text":"","category":"section"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"using MLJ\nimputer = UnivariateFillImputer()\n\nx_continuous = [1.0, 2.0, missing, 3.0]\nx_multiclass = coerce([\"y\", \"n\", \"y\", missing, \"y\"], Multiclass)\nx_count = [1, 1, 1, 2, missing, 3, 3]\n\nmach = machine(imputer, x_continuous)\nfit!(mach)\n\njulia> fitted_params(mach)\n(filler = 2.0,)\n\njulia> transform(mach, [missing, missing, 101.0])\n3-element Vector{Float64}:\n 2.0\n 2.0\n 101.0\n\nmach2 = machine(imputer, x_multiclass) |> fit!\n\njulia> transform(mach2, x_multiclass)\n5-element CategoricalArray{String,1,UInt32}:\n \"y\"\n \"n\"\n \"y\"\n \"y\"\n \"y\"\n\nmach3 = machine(imputer, x_count) |> fit!\n\njulia> transform(mach3, [missing, missing, 5])\n3-element Vector{Int64}:\n 2\n 2\n 5","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"For imputing tabular data, use FillImputer.","category":"page"},{"location":"models/RandomForestImputer_BetaML/#RandomForestImputer_BetaML","page":"RandomForestImputer","title":"RandomForestImputer","text":"","category":"section"},{"location":"models/RandomForestImputer_BetaML/","page":"RandomForestImputer","title":"RandomForestImputer","text":"mutable struct RandomForestImputer <: MLJModelInterface.Unsupervised","category":"page"},{"location":"models/RandomForestImputer_BetaML/","page":"RandomForestImputer","title":"RandomForestImputer","text":"Impute missing values using Random Forests, from the Beta Machine Learning Toolkit (BetaML).","category":"page"},{"location":"models/RandomForestImputer_BetaML/#Hyperparameters:","page":"RandomForestImputer","title":"Hyperparameters:","text":"","category":"section"},{"location":"models/RandomForestImputer_BetaML/","page":"RandomForestImputer","title":"RandomForestImputer","text":"n_trees::Int64: Number of (decision) trees in the forest [def: 30]\nmax_depth::Union{Nothing, Int64}: The maximum depth the tree is allowed to reach. When this is reached the node is forced to become a leaf [def: nothing, i.e. no limits]\nmin_gain::Float64: The minimum information gain to allow for a node's partition [def: 0]\nmin_records::Int64: The minimum number of records a node must holds to consider for a partition of it [def: 2]\nmax_features::Union{Nothing, Int64}: The maximum number of (random) features to consider at each partitioning [def: nothing, i.e. square root of the data dimension]\nforced_categorical_cols::Vector{Int64}: Specify the positions of the integer columns to treat as categorical instead of cardinal. [Default: empty vector (all numerical cols are treated as cardinal by default and the others as categorical)]\nsplitting_criterion::Union{Nothing, Function}: Either gini, entropy or variance. This is the name of the function to be used to compute the information gain of a specific partition. This is done by measuring the difference betwwen the \"impurity\" of the labels of the parent node with those of the two child nodes, weighted by the respective number of items. [def: nothing, i.e. gini for categorical labels (classification task) and variance for numerical labels(regression task)]. 
It can be an anonymous function.\nrecursive_passages::Int64: Define the number of times to go through the various columns to impute their data. Useful when there are data to impute on multiple columns. The order of the first passage is given by the decreasing number of missing values per column; the other passages are random [default: 1].\nrng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]","category":"page"},{"location":"models/RandomForestImputer_BetaML/#Example:","page":"RandomForestImputer","title":"Example:","text":"","category":"section"},{"location":"models/RandomForestImputer_BetaML/","page":"RandomForestImputer","title":"RandomForestImputer","text":"julia> using MLJ\n\njulia> X = [1 10.5;1.5 missing; 1.8 8; 1.7 15; 3.2 40; missing missing; 3.3 38; missing -2.3; 5.2 -2.4] |> table ;\n\njulia> modelType = @load RandomForestImputer pkg = \"BetaML\" verbosity=0\nBetaML.Imputation.RandomForestImputer\n\njulia> model = modelType(n_trees=40)\nRandomForestImputer(\n n_trees = 40, \n max_depth = nothing, \n min_gain = 0.0, \n min_records = 2, \n max_features = nothing, \n forced_categorical_cols = Int64[], \n splitting_criterion = nothing, \n recursive_passages = 1, \n rng = Random._GLOBAL_RNG())\n\njulia> mach = machine(model, X);\n\njulia> fit!(mach);\n[ Info: Training machine(RandomForestImputer(n_trees = 40, …), …).\n\njulia> X_full = transform(mach) |> MLJ.matrix\n9×2 Matrix{Float64}:\n 1.0 10.5\n 1.5 10.3909\n 1.8 8.0\n 1.7 15.0\n 3.2 40.0\n 2.88375 8.66125\n 3.3 38.0\n 3.98125 -2.3\n 5.2 -2.4","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/#BayesianSubspaceLDA_MultivariateStats","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"","category":"section"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"BayesianSubspaceLDA","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"A model type for constructing a Bayesian subspace LDA model, based on MultivariateStats.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"BayesianSubspaceLDA = @load BayesianSubspaceLDA pkg=MultivariateStats","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"Do model = BayesianSubspaceLDA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BayesianSubspaceLDA(normalize=...).","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"The Bayesian multiclass subspace linear discriminant analysis algorithm learns a projection matrix as described in SubspaceLDA. 
The posterior class probability distribution is derived as in BayesianLDA.","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/#Training-data","page":"BayesianSubspaceLDA","title":"Training data","text":"","category":"section"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"Here:","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).\ny is the target, which can be any AbstractVector whose element scitype is OrderedFactor or Multiclass; check the scitype with scitype(y).","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/#Hyper-parameters","page":"BayesianSubspaceLDA","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"normalize=true: Option to normalize the between class variance for the number of observations in each class, one of true or false.","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"outdim: the ouput dimension, automatically set to min(indim, nclasses-1) if equal to 0. If a non-zero outdim is passed, then the actual output dimension used is min(rank, outdim) where rank is the rank of the within-class covariance matrix.","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"priors::Union{Nothing, UnivariateFinite{<:Any, <:Any, <:Any, <:Real}, Dict{<:Any, <:Real}} = nothing: For use in prediction with Bayes rule. If priors = nothing then priors are estimated from the class proportions in the training data. Otherwise it requires a Dict or UnivariateFinite object specifying the classes with non-zero probabilities in the training target.","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/#Operations","page":"BayesianSubspaceLDA","title":"Operations","text":"","category":"section"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.\npredict(mach, Xnew): Return predictions of the target given features Xnew, which should have same scitype as X above. 
Predictions are probabilistic but uncalibrated.\npredict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/#Fitted-parameters","page":"BayesianSubspaceLDA","title":"Fitted parameters","text":"","category":"section"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"classes: The classes seen during model fitting.\nprojection_matrix: The learned projection matrix, of size (indim, outdim), where indim and outdim are the input and output dimensions respectively (See Report section below).\npriors: The class priors for classification. As inferred from training target y, if not user-specified. A UnivariateFinite object with levels consistent with levels(y).","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/#Report","page":"BayesianSubspaceLDA","title":"Report","text":"","category":"section"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"The fields of report(mach) are:","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"indim: The dimension of the input space i.e the number of training features.\noutdim: The dimension of the transformed space the model is projected to.\nmean: The overall mean of the training data.\nnclasses: The number of classes directly observed in the training data (which can be less than the total number of classes in the class pool).","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"class_means: The class-specific means of the training data. A matrix of size (indim, nclasses) with the ith column being the class-mean of the ith class in classes (See fitted params section above).","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"class_weights: The weights (class counts) of each class. A vector of length nclasses with the ith element being the class weight of the ith class in classes. (See fitted params section above.)\nexplained_variance_ratio: The ratio of explained variance to total variance. 
Each dimension corresponds to an eigenvalue.","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/#Examples","page":"BayesianSubspaceLDA","title":"Examples","text":"","category":"section"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"using MLJ\n\nBayesianSubspaceLDA = @load BayesianSubspaceLDA pkg=MultivariateStats\n\nX, y = @load_iris ## a table and a vector\n\nmodel = BayesianSubspaceLDA()\nmach = machine(model, X, y) |> fit!\n\nXproj = transform(mach, X)\ny_hat = predict(mach, X)\nlabels = predict_mode(mach, X)","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"See also LDA, BayesianLDA, SubspaceLDA","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/#DecisionTreeRegressor_DecisionTree","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"","category":"section"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"DecisionTreeRegressor","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"A model type for constructing a CART decision tree regressor, based on DecisionTree.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"DecisionTreeRegressor = @load DecisionTreeRegressor pkg=DecisionTree","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"Do model = DecisionTreeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in DecisionTreeRegressor(max_depth=...).","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"DecisionTreeRegressor implements the CART algorithm, originally published in Breiman, Leo; Friedman, J. H.; Olshen, R. A.; Stone, C. J. (1984): \"Classification and regression trees\". 
Monterey, CA: Wadsworth & Brooks/Cole Advanced Books & Software.","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/#Training-data","page":"DecisionTreeRegressor","title":"Training data","text":"","category":"section"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"where","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)\ny: the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"Train the machine with fit!(mach, rows=...).","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/#Hyperparameters","page":"DecisionTreeRegressor","title":"Hyperparameters","text":"","category":"section"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"max_depth=-1: max depth of the decision tree (-1=any)\nmin_samples_leaf=1: min number of samples each leaf needs to have\nmin_samples_split=2: min number of samples needed for a split\nmin_purity_increase=0: min purity needed for a split\nn_subfeatures=0: number of features to select at random (0 for all)\npost_prune=false: set to true for post-fit pruning\nmerge_purity_threshold=1.0: (post-pruning) merge leaves having combined purity >= merge_purity_threshold\nfeature_importance: method to use for computing feature importances. 
One of (:impurity, :split)\nrng=Random.GLOBAL_RNG: random number generator or seed","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/#Operations","page":"DecisionTreeRegressor","title":"Operations","text":"","category":"section"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"predict(mach, Xnew): return predictions of the target given new features Xnew having the same scitype as X above.","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/#Fitted-parameters","page":"DecisionTreeRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"tree: the tree or stump object returned by the core DecisionTree.jl algorithm\nfeatures: the names of the features encountered in training","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/#Report","page":"DecisionTreeRegressor","title":"Report","text":"","category":"section"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"The fields of report(mach) are:","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"features: the names of the features encountered in training","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/#Accessor-functions","page":"DecisionTreeRegressor","title":"Accessor functions","text":"","category":"section"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"feature_importances(mach) returns a vector of (feature::Symbol => importance) pairs; the type of importance is determined by the hyperparameter feature_importance (see above)","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/#Examples","page":"DecisionTreeRegressor","title":"Examples","text":"","category":"section"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"using MLJ\nDecisionTreeRegressor = @load DecisionTreeRegressor pkg=DecisionTree\nmodel = DecisionTreeRegressor(max_depth=3, min_samples_split=3)\n\nX, y = make_regression(100, 4; rng=123) ## synthetic data\nmach = machine(model, X, y) |> fit!\n\nXnew, _ = make_regression(3, 2; rng=123)\nyhat = predict(mach, Xnew) ## new predictions\n\njulia> fitted_params(mach).tree\nx1 < 0.2758\n├─ x2 < 0.9137\n│ ├─ x1 < -0.9582\n│ │ ├─ 0.9189256882087312 (0/12)\n│ │ └─ -0.23180616021065256 (0/38)\n│ └─ -1.6461153800037722 (0/9)\n└─ x1 < 1.062\n ├─ x2 < -0.4969\n │ ├─ -0.9330755147107384 (0/5)\n │ └─ -2.3287967825015548 (0/17)\n └─ x2 < 0.4598\n ├─ -2.931299926506291 (0/11)\n └─ -4.726518740473489 (0/8)\n\nfeature_importances(mach) ## get feature importances","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"See also DecisionTree.jl and the unwrapped model type 
MLJDecisionTreeInterface.DecisionTree.DecisionTreeRegressor.","category":"page"},{"location":"models/IForestDetector_OutlierDetectionPython/#IForestDetector_OutlierDetectionPython","page":"IForestDetector","title":"IForestDetector","text":"","category":"section"},{"location":"models/IForestDetector_OutlierDetectionPython/","page":"IForestDetector","title":"IForestDetector","text":"IForestDetector(n_estimators = 100,\n max_samples = \"auto\",\n max_features = 1.0\n bootstrap = false,\n random_state = nothing,\n verbose = 0,\n n_jobs = 1)","category":"page"},{"location":"models/IForestDetector_OutlierDetectionPython/","page":"IForestDetector","title":"IForestDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.iforest","category":"page"},{"location":"models/RODDetector_OutlierDetectionPython/#RODDetector_OutlierDetectionPython","page":"RODDetector","title":"RODDetector","text":"","category":"section"},{"location":"models/RODDetector_OutlierDetectionPython/","page":"RODDetector","title":"RODDetector","text":"RODDetector(parallel_execution = false)","category":"page"},{"location":"models/RODDetector_OutlierDetectionPython/","page":"RODDetector","title":"RODDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.rod","category":"page"},{"location":"models/RandomForestClassifier_MLJScikitLearnInterface/#RandomForestClassifier_MLJScikitLearnInterface","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"","category":"section"},{"location":"models/RandomForestClassifier_MLJScikitLearnInterface/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"RandomForestClassifier","category":"page"},{"location":"models/RandomForestClassifier_MLJScikitLearnInterface/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"A model type for constructing a random forest classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/RandomForestClassifier_MLJScikitLearnInterface/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/RandomForestClassifier_MLJScikitLearnInterface/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"RandomForestClassifier = @load RandomForestClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/RandomForestClassifier_MLJScikitLearnInterface/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"Do model = RandomForestClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomForestClassifier(n_estimators=...).","category":"page"},{"location":"models/RandomForestClassifier_MLJScikitLearnInterface/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"A random forest is a meta estimator that fits a number of classifying decision trees on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting. 
The sub-sample size is controlled with the max_samples parameter if bootstrap=True (default), otherwise the whole dataset is used to build each tree.","category":"page"},{"location":"about_mlj/#About-MLJ","page":"About MLJ","title":"About MLJ","text":"","category":"section"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"MLJ (Machine Learning in Julia) is a toolbox written in Julia providing a common interface and meta-algorithms for selecting, tuning, evaluating, composing and comparing over 180 machine learning models written in Julia and other languages. In particular MLJ wraps a large number of scikit-learn models.","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"MLJ is released under the MIT license.","category":"page"},{"location":"about_mlj/#Lightning-tour","page":"About MLJ","title":"Lightning tour","text":"","category":"section"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"For help learning to use MLJ, see Learning MLJ.","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"A self-contained notebook and julia script of this demonstration is also available here.","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"The first code snippet below creates a new Julia environment MLJ_tour and installs just those packages needed for the tour. See Installation for more on creating a Julia environment for use with MLJ.","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"Julia installation instructions are here.","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"using Pkg\nPkg.activate(\"MLJ_tour\", shared=true)\nPkg.add(\"MLJ\")\nPkg.add(\"MLJIteration\")\nPkg.add(\"EvoTrees\")","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"In MLJ a model is just a container for hyper-parameters, and that's all. Here we will apply several kinds of model composition before binding the resulting \"meta-model\" to data in a machine for evaluation using cross-validation.","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"Loading and instantiating a gradient tree-boosting model:","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"using MLJ\nBooster = @load EvoTreeRegressor # loads code defining a model type\nbooster = Booster(max_depth=2) # specify hyper-parameter at construction\nbooster.nrounds = 50 # or mutate afterwards","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"This model is an example of an iterative model. 
As it stands, the number of iterations nrounds is fixed.","category":"page"},{"location":"about_mlj/#Composition-1:-Wrapping-the-model-to-make-it-\"self-iterating\"","page":"About MLJ","title":"Composition 1: Wrapping the model to make it \"self-iterating\"","text":"","category":"section"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"Let's create a new model that automatically learns the number of iterations, using the NumberSinceBest(3) criterion, as applied to an out-of-sample l1 loss:","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"using MLJIteration\niterated_booster = IteratedModel(model=booster,\n resampling=Holdout(fraction_train=0.8),\n controls=[Step(2), NumberSinceBest(3), NumberLimit(300)],\n measure=l1,\n retrain=true)","category":"page"},{"location":"about_mlj/#Composition-2:-Preprocess-the-input-features","page":"About MLJ","title":"Composition 2: Preprocess the input features","text":"","category":"section"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"Combining the model with categorical feature encoding:","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"pipe = ContinuousEncoder() |> iterated_booster","category":"page"},{"location":"about_mlj/#Composition-3:-Wrapping-the-model-to-make-it-\"self-tuning\"","page":"About MLJ","title":"Composition 3: Wrapping the model to make it \"self-tuning\"","text":"","category":"section"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"First, we define a hyper-parameter range for optimization of a (nested) hyper-parameter:","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"max_depth_range = range(pipe,\n :(deterministic_iterated_model.model.max_depth),\n lower = 1,\n upper = 10)","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"Now we can wrap the pipeline model in an optimization strategy to make it \"self-tuning\":","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"self_tuning_pipe = TunedModel(model=pipe,\n tuning=RandomSearch(),\n ranges=max_depth_range,\n resampling=CV(nfolds=3, rng=456),\n measure=l1,\n acceleration=CPUThreads(),\n n=50)","category":"page"},{"location":"about_mlj/#Binding-to-data-and-evaluating-performance","page":"About MLJ","title":"Binding to data and evaluating performance","text":"","category":"section"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"Loading a selection of features and labels from the Ames House Price dataset:","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"X, y = @load_reduced_ames","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"Evaluating the \"self-tuning\" pipeline model's performance using 5-fold cross-validation (implies multiple layers of nested resampling):","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"julia> evaluate(self_tuning_pipe, X, y,\n measures=[l1, l2],\n resampling=CV(nfolds=5, rng=123),\n acceleration=CPUThreads(),\n verbosity=2)\nPerformanceEvaluation object with these fields:\n measure, measurement, operation, per_fold,\n per_observation, fitted_params_per_fold,\n report_per_fold, train_test_pairs\nExtract:\n┌───────────────┬─────────────┬───────────┬───────────────────────────────────────────────┐\n│ measure │ measurement │ operation 
│ per_fold │\n├───────────────┼─────────────┼───────────┼───────────────────────────────────────────────┤\n│ LPLoss(p = 1) │ 17200.0 │ predict │ [16500.0, 17100.0, 16300.0, 17500.0, 18900.0] │\n│ LPLoss(p = 2) │ 6.83e8 │ predict │ [6.14e8, 6.64e8, 5.98e8, 6.37e8, 9.03e8] │\n└───────────────┴─────────────┴───────────┴───────────────────────────────────────────────┘","category":"page"},{"location":"about_mlj/#Key-goals","page":"About MLJ","title":"Key goals","text":"","category":"section"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"Offer a consistent way to use, compose and tune machine learning models in Julia,\nPromote the improvement of the Julia ML/Stats ecosystem by making it easier to use models from a wide range of packages,\nUnlock performance gains by exploiting Julia's support for parallelism, automatic differentiation, GPU, optimization etc.","category":"page"},{"location":"about_mlj/#Key-features","page":"About MLJ","title":"Key features","text":"","category":"section"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"Data agnostic, train most models on any data X supported by the Tables.jl interface (needs Tables.istable(X) == true).\nExtensive, state-of-the-art, support for model composition (pipelines, stacks and, more generally, learning networks). See more below.\nConvenient syntax to tune and evaluate (composite) models.\nConsistent interface to handle probabilistic predictions.\nExtensible tuning interface, to support a growing number of optimization strategies, and designed to play well with model composition.\nOptions to accelerate model evaluation and tuning with multithreading and/or distributed processing.","category":"page"},{"location":"about_mlj/#Model-composability","page":"About MLJ","title":"Model composability","text":"","category":"section"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"The generic model composition API's provided by other toolboxes we have surveyed share one or more of the following shortcomings, which do not exist in MLJ:","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"Composite models do not inherit all the behavior of ordinary models.\nComposition is limited to linear (non-branching) pipelines.\nSupervised components in a linear pipeline can only occur at the end of the pipeline.\nOnly static (unlearned) target transformations/inverse transformations are supported.\nHyper-parameters in homogeneous model ensembles cannot be coupled.\nModel stacking, with out-of-sample predictions for base learners, cannot be implemented (using the generic API alone).\nHyper-parameters and/or learned parameters of component models are not easily inspected or manipulated (by tuning algorithms, for example)\nComposite models cannot implement multiple operations, for example, both a predict and transform method (as in clustering models) or both a transform and inverse_transform method.","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"Some of these features are demonstrated in this notebook","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"For more information see the MLJ design paper or our detailed paper on the composition interface.","category":"page"},{"location":"about_mlj/#Getting-help-and-reporting-problems","page":"About MLJ","title":"Getting help and reporting problems","text":"","category":"section"},{"location":"about_mlj/","page":"About 
MLJ","title":"About MLJ","text":"Users are encouraged to provide feedback on their experience using MLJ and to report issues.","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"For a query to have maximum exposure to maintainers and users, start a discussion thread at Julia Discourse Machine Learning and tag your issue \"mlj\". Queries can also be posted as issues, or on the #mlj channel in the Julia Slack workspace.","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"Bugs, suggestions, and feature requests can be posted here.","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"Users are also welcome to join the #mlj Julia Slack channel to ask questions and make suggestions.","category":"page"},{"location":"about_mlj/#Installation","page":"About MLJ","title":"Installation","text":"","category":"section"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"Initially, it is recommended that MLJ and associated packages be installed in a new environment to avoid package conflicts. You can do this with","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"julia> using Pkg; Pkg.activate(\"my_MLJ_env\", shared=true)","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"Installing MLJ is also done with the package manager:","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"julia> Pkg.add(\"MLJ\")","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"Optional: To test your installation, run","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"julia> Pkg.test(\"MLJ\")","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"It is important to note that MLJ is essentially a big wrapper providing unified access to model-providing packages. For this reason, you generally need to add further packages to your environment to make model-specific code available. This happens automatically when you use MLJ's interactive load command @iload, as in","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"julia> Tree = @iload DecisionTreeClassifier # load type\njulia> tree = Tree() # instance","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"where you will also be asked to choose a providing package, in the case that more than one package provides a DecisionTreeClassifier model. For more on identifying the name of an applicable model, see Model Search. For non-interactive loading of code (e.g., from a module or function) see Loading Model Code.","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"It is recommended that you start with models from more mature packages such as DecisionTree.jl, ScikitLearn.jl or XGBoost.jl.","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"MLJ is supported by several satellite packages (MLJTuning, MLJModelInterface, etc) which the general user is not required to install directly. 
Developers can learn more about these here.","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"See also the alternative installation instructions for Modifying Behavior.","category":"page"},{"location":"about_mlj/#Funding","page":"About MLJ","title":"Funding","text":"","category":"section"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"MLJ was initially created as a Tools, Practices and Systems project at the Alan Turing Institute in 2019. Current funding is provided by a New Zealand Strategic Science Investment Fund awarded to the University of Auckland.","category":"page"},{"location":"about_mlj/#Citing-MLJ","page":"About MLJ","title":"Citing MLJ","text":"","category":"section"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"An overview of MLJ design:","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"(Image: DOI)","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"@article{Blaom2020,\n doi = {10.21105/joss.02704},\n url = {https://doi.org/10.21105/joss.02704},\n year = {2020},\n publisher = {The Open Journal},\n volume = {5},\n number = {55},\n pages = {2704},\n author = {Anthony D. Blaom and Franz Kiraly and Thibaut Lienart and Yiannis Simillides and Diego Arenas and Sebastian J. Vollmer},\n title = {{MLJ}: A Julia package for composable machine learning},\n journal = {Journal of Open Source Software}\n}","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"An in-depth view of MLJ's model composition design:","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"(Image: arXiv)","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"@misc{blaom2020flexible,\n title={Flexible model composition in machine learning and its implementation in {MLJ}},\n author={Anthony D. Blaom and Sebastian J. Vollmer},\n year={2020},\n eprint={2012.15505},\n archivePrefix={arXiv},\n primaryClass={cs.LG}\n}","category":"page"},{"location":"models/PPCA_MultivariateStats/#PPCA_MultivariateStats","page":"PPCA","title":"PPCA","text":"","category":"section"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"PPCA","category":"page"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"A model type for constructing a probabilistic PCA model, based on MultivariateStats.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"PPCA = @load PPCA pkg=MultivariateStats","category":"page"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"Do model = PPCA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in PPCA(maxoutdim=...).","category":"page"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"Probabilistic principal component analysis is a dimension-reduction algorithm which represents a constrained form of the Gaussian distribution in which the number of free parameters can be restricted while still allowing the model to capture the dominant correlations in a data set. 
It is expressed as the maximum likelihood solution of a probabilistic latent variable model. For details, see Bishop, C. M. (2006): Pattern Recognition and Machine Learning.","category":"page"},{"location":"models/PPCA_MultivariateStats/#Training-data","page":"PPCA","title":"Training data","text":"","category":"section"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"mach = machine(model, X)","category":"page"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"Here:","category":"page"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).","category":"page"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/PPCA_MultivariateStats/#Hyper-parameters","page":"PPCA","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"maxoutdim=0: Controls the dimension (number of columns) of the output, outdim. Specifically, outdim = min(n, indim, maxoutdim), where n is the number of observations and indim the input dimension.\nmethod::Symbol=:ml: The method to use to solve the problem, one of :ml, :em, :bayes.\nmaxiter::Int=1000: The maximum number of iterations.\ntol::Real=1e-6: The convergence tolerance.\nmean::Union{Nothing, Real, Vector{Float64}}=nothing: If nothing, centering will be computed and applied; if set to 0 no centering is applied (data is assumed pre-centered); if a vector, the centering is done with that vector.","category":"page"},{"location":"models/PPCA_MultivariateStats/#Operations","page":"PPCA","title":"Operations","text":"","category":"section"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.\ninverse_transform(mach, Xsmall): For a dimension-reduced table Xsmall, such as returned by transform, reconstruct a table, having the same number of columns as the original training data X, that transforms to Xsmall. Mathematically, inverse_transform is a right-inverse for the PCA projection map, whose image is orthogonal to the kernel of that map. In particular, if Xsmall = transform(mach, Xnew), then inverse_transform(Xsmall) is only an approximation to Xnew.","category":"page"},{"location":"models/PPCA_MultivariateStats/#Fitted-parameters","page":"PPCA","title":"Fitted parameters","text":"","category":"section"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"projection: Returns the projection matrix, which has size (indim, outdim), where indim and outdim are the number of features of the input and output respectively. 
Each column of the projection matrix corresponds to a principal component.","category":"page"},{"location":"models/PPCA_MultivariateStats/#Report","page":"PPCA","title":"Report","text":"","category":"section"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"The fields of report(mach) are:","category":"page"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"indim: Dimension (number of columns) of the training data and new data to be transformed.\noutdim: Dimension of transformed data.\ntvar: The variance of the components.\nloadings: The model's loadings matrix. A matrix of size (indim, outdim) where indim and outdim are as defined above.","category":"page"},{"location":"models/PPCA_MultivariateStats/#Examples","page":"PPCA","title":"Examples","text":"","category":"section"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"using MLJ\n\nPPCA = @load PPCA pkg=MultivariateStats\n\nX, y = @load_iris ## a table and a vector\n\nmodel = PPCA(maxoutdim=2)\nmach = machine(model, X) |> fit!\n\nXproj = transform(mach, X)","category":"page"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"See also KernelPCA, ICA, FactorAnalysis, PCA","category":"page"},{"location":"models/BM25Transformer_MLJText/#BM25Transformer_MLJText","page":"BM25Transformer","title":"BM25Transformer","text":"","category":"section"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"BM25Transformer","category":"page"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"A model type for constructing a BM25 transformer, based on MLJText.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"BM25Transformer = @load BM25Transformer pkg=MLJText","category":"page"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"Do model = BM25Transformer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BM25Transformer(max_doc_freq=...).","category":"page"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"The transformer converts a collection of documents, tokenized or pre-parsed as bags of words/ngrams, to a matrix of Okapi BM25 document-word statistics. The BM25 scoring function uses both term frequency (TF) and inverse document frequency (IDF, defined below), as in TfidfTransformer, but additionally adjusts for the probability that a user will consider a search result relevant, based on the terms in the search query and those in each document.","category":"page"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"In textbooks and implementations there is variation in the definition of IDF. Here two IDF definitions are available. The default, smoothed option provides the IDF for a term t as log((1 + n)/(1 + df(t))) + 1, where n is the total number of documents and df(t) the number of documents in which t appears. 
Setting smooth_idf = false provides an IDF of log(n/df(t)) + 1.","category":"page"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"References:","category":"page"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"http://ethen8181.github.io/machine-learning/search/bm25_intro.html\nhttps://en.wikipedia.org/wiki/Okapi_BM25\nhttps://nlp.stanford.edu/IR-book/html/htmledition/okapi-bm25-a-non-binary-model-1.html","category":"page"},{"location":"models/BM25Transformer_MLJText/#Training-data","page":"BM25Transformer","title":"Training data","text":"","category":"section"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"mach = machine(model, X)","category":"page"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"Here:","category":"page"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"X is any vector whose elements are either tokenized documents or bags of words/ngrams. Specifically, each element is one of the following:\nA vector of abstract strings (tokens), e.g., [\"I\", \"like\", \"Sam\", \".\", \"Sam\", \"is\", \"nice\", \".\"] (scitype AbstractVector{Textual})\nA dictionary of counts, indexed on abstract strings, e.g., Dict(\"I\"=>1, \"Sam\"=>2, \"Sam is\"=>1) (scitype Multiset{Textual})\nA dictionary of counts, indexed on plain ngrams, e.g., Dict((\"I\",)=>1, (\"Sam\",)=>2, (\"I\", \"Sam\")=>1) (scitype Multiset{<:NTuple{N,Textual} where N}); here a plain ngram is a tuple of abstract strings.","category":"page"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/BM25Transformer_MLJText/#Hyper-parameters","page":"BM25Transformer","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"max_doc_freq=1.0: Restricts the vocabulary that the transformer will consider. Terms that occur in > max_doc_freq documents will not be considered by the transformer. For example, if max_doc_freq is set to 0.9, terms that are in more than 90% of the documents will be removed.\nmin_doc_freq=0.0: Restricts the vocabulary that the transformer will consider. Terms that occur in < min_doc_freq documents will not be considered by the transformer. A value of 0.01 means that only terms that are at least in 1% of the documents will be included.\nκ=2: The term frequency saturation characteristic. Higher values represent slower saturation. What we mean by saturation is the degree to which a term occurring extra times adds to the overall score.\nβ=0.75: Amplifies the particular document length compared to the average length. The bigger β is, the more document length is amplified in terms of the overall score. 
The default value is 0.75, and the bounds are restricted between 0 and 1.\nsmooth_idf=true: Control which definition of IDF to use (see above).","category":"page"},{"location":"models/BM25Transformer_MLJText/#Operations","page":"BM25Transformer","title":"Operations","text":"","category":"section"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"transform(mach, Xnew): Based on the vocabulary, IDF, and mean word counts learned in training, return the matrix of BM25 scores for Xnew, a vector of the same form as X above. The matrix has size (n, p), where n = length(Xnew) and p the size of the vocabulary. Tokens/ngrams not appearing in the learned vocabulary are scored zero.","category":"page"},{"location":"models/BM25Transformer_MLJText/#Fitted-parameters","page":"BM25Transformer","title":"Fitted parameters","text":"","category":"section"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"vocab: A vector containing the string used in the transformer's vocabulary.\nidf_vector: The transformer's calculated IDF vector.\nmean_words_in_docs: The mean number of words in each document.","category":"page"},{"location":"models/BM25Transformer_MLJText/#Examples","page":"BM25Transformer","title":"Examples","text":"","category":"section"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"BM25Transformer accepts a variety of inputs. The example below transforms tokenized documents:","category":"page"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"using MLJ\nimport TextAnalysis\n\nBM25Transformer = @load BM25Transformer pkg=MLJText\n\ndocs = [\"Hi my name is Sam.\", \"How are you today?\"]\nbm25_transformer = BM25Transformer()\n\njulia> tokenized_docs = TextAnalysis.tokenize.(docs)\n2-element Vector{Vector{String}}:\n [\"Hi\", \"my\", \"name\", \"is\", \"Sam\", \".\"]\n [\"How\", \"are\", \"you\", \"today\", \"?\"]\n\nmach = machine(bm25_transformer, tokenized_docs)\nfit!(mach)\n\nfitted_params(mach)\n\ntfidf_mat = transform(mach, tokenized_docs)","category":"page"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"Alternatively, one can provide documents pre-parsed as ngrams counts:","category":"page"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"using MLJ\nimport TextAnalysis\n\ndocs = [\"Hi my name is Sam.\", \"How are you today?\"]\ncorpus = TextAnalysis.Corpus(TextAnalysis.NGramDocument.(docs, 1, 2))\nngram_docs = TextAnalysis.ngrams.(corpus)\n\njulia> ngram_docs[1]\nDict{AbstractString, Int64} with 11 entries:\n \"is\" => 1\n \"my\" => 1\n \"name\" => 1\n \".\" => 1\n \"Hi\" => 1\n \"Sam\" => 1\n \"my name\" => 1\n \"Hi my\" => 1\n \"name is\" => 1\n \"Sam .\" => 1\n \"is Sam\" => 1\n\nbm25_transformer = BM25Transformer()\nmach = machine(bm25_transformer, ngram_docs)\nMLJ.fit!(mach)\nfitted_params(mach)\n\ntfidf_mat = transform(mach, ngram_docs)","category":"page"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"See also TfidfTransformer, 
CountTransformer","category":"page"},{"location":"models/DeterministicConstantClassifier_MLJModels/#DeterministicConstantClassifier_MLJModels","page":"DeterministicConstantClassifier","title":"DeterministicConstantClassifier","text":"","category":"section"},{"location":"models/DeterministicConstantClassifier_MLJModels/","page":"DeterministicConstantClassifier","title":"DeterministicConstantClassifier","text":"DeterministicConstantClassifier","category":"page"},{"location":"models/DeterministicConstantClassifier_MLJModels/","page":"DeterministicConstantClassifier","title":"DeterministicConstantClassifier","text":"A model type for constructing a deterministic constant classifier, based on MLJModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/DeterministicConstantClassifier_MLJModels/","page":"DeterministicConstantClassifier","title":"DeterministicConstantClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/DeterministicConstantClassifier_MLJModels/","page":"DeterministicConstantClassifier","title":"DeterministicConstantClassifier","text":"DeterministicConstantClassifier = @load DeterministicConstantClassifier pkg=MLJModels","category":"page"},{"location":"models/DeterministicConstantClassifier_MLJModels/","page":"DeterministicConstantClassifier","title":"DeterministicConstantClassifier","text":"Do model = DeterministicConstantClassifier() to construct an instance with default hyper-parameters. ","category":"page"},{"location":"models/RidgeRegressor_MLJScikitLearnInterface/#RidgeRegressor_MLJScikitLearnInterface","page":"RidgeRegressor","title":"RidgeRegressor","text":"","category":"section"},{"location":"models/RidgeRegressor_MLJScikitLearnInterface/","page":"RidgeRegressor","title":"RidgeRegressor","text":"RidgeRegressor","category":"page"},{"location":"models/RidgeRegressor_MLJScikitLearnInterface/","page":"RidgeRegressor","title":"RidgeRegressor","text":"A model type for constructing a ridge regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/RidgeRegressor_MLJScikitLearnInterface/","page":"RidgeRegressor","title":"RidgeRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/RidgeRegressor_MLJScikitLearnInterface/","page":"RidgeRegressor","title":"RidgeRegressor","text":"RidgeRegressor = @load RidgeRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/RidgeRegressor_MLJScikitLearnInterface/","page":"RidgeRegressor","title":"RidgeRegressor","text":"Do model = RidgeRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in RidgeRegressor(alpha=...).","category":"page"},{"location":"models/RidgeRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"RidgeRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/RidgeRegressor_MLJScikitLearnInterface/","page":"RidgeRegressor","title":"RidgeRegressor","text":"alpha = 1.0\nfit_intercept = true\ncopy_X = true\nmax_iter = 1000\ntol = 0.0001\nsolver = auto\nrandom_state = nothing","category":"page"},{"location":"models/MultiTaskLassoRegressor_MLJScikitLearnInterface/#MultiTaskLassoRegressor_MLJScikitLearnInterface","page":"MultiTaskLassoRegressor","title":"MultiTaskLassoRegressor","text":"","category":"section"},{"location":"models/MultiTaskLassoRegressor_MLJScikitLearnInterface/","page":"MultiTaskLassoRegressor","title":"MultiTaskLassoRegressor","text":"MultiTaskLassoRegressor","category":"page"},{"location":"models/MultiTaskLassoRegressor_MLJScikitLearnInterface/","page":"MultiTaskLassoRegressor","title":"MultiTaskLassoRegressor","text":"A model type for constructing a multi-target lasso regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/MultiTaskLassoRegressor_MLJScikitLearnInterface/","page":"MultiTaskLassoRegressor","title":"MultiTaskLassoRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/MultiTaskLassoRegressor_MLJScikitLearnInterface/","page":"MultiTaskLassoRegressor","title":"MultiTaskLassoRegressor","text":"MultiTaskLassoRegressor = @load MultiTaskLassoRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/MultiTaskLassoRegressor_MLJScikitLearnInterface/","page":"MultiTaskLassoRegressor","title":"MultiTaskLassoRegressor","text":"Do model = MultiTaskLassoRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in MultiTaskLassoRegressor(alpha=...).","category":"page"},{"location":"models/MultiTaskLassoRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"MultiTaskLassoRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/MultiTaskLassoRegressor_MLJScikitLearnInterface/","page":"MultiTaskLassoRegressor","title":"MultiTaskLassoRegressor","text":"alpha = 1.0\nfit_intercept = true\nmax_iter = 1000\ntol = 0.0001\ncopy_X = true\nrandom_state = nothing\nselection = cyclic","category":"page"},{"location":"models/BaggingClassifier_MLJScikitLearnInterface/#BaggingClassifier_MLJScikitLearnInterface","page":"BaggingClassifier","title":"BaggingClassifier","text":"","category":"section"},{"location":"models/BaggingClassifier_MLJScikitLearnInterface/","page":"BaggingClassifier","title":"BaggingClassifier","text":"BaggingClassifier","category":"page"},{"location":"models/BaggingClassifier_MLJScikitLearnInterface/","page":"BaggingClassifier","title":"BaggingClassifier","text":"A model type for constructing a bagging ensemble classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/BaggingClassifier_MLJScikitLearnInterface/","page":"BaggingClassifier","title":"BaggingClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/BaggingClassifier_MLJScikitLearnInterface/","page":"BaggingClassifier","title":"BaggingClassifier","text":"BaggingClassifier = @load BaggingClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/BaggingClassifier_MLJScikitLearnInterface/","page":"BaggingClassifier","title":"BaggingClassifier","text":"Do model = BaggingClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BaggingClassifier(estimator=...).","category":"page"},{"location":"models/BaggingClassifier_MLJScikitLearnInterface/","page":"BaggingClassifier","title":"BaggingClassifier","text":"A Bagging classifier is an ensemble meta-estimator that fits base classifiers each on random subsets of the original dataset and then aggregate their individual predictions (either by voting or by averaging) to form a final prediction. 
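For example, a bagged ensemble might be constructed and fitted as in the following sketch (the n_estimators keyword, controlling ensemble size, is assumed here to mirror the scikit-learn parameter of the same name and is not documented in this entry):

using MLJ
BaggingClassifier = @load BaggingClassifier pkg=MLJScikitLearnInterface
X, y = @load_iris  ## a table and a vector
model = BaggingClassifier(n_estimators=50)  ## ensemble of 50 base classifiers
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)  ## predictions on the training data
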
Such a meta-estimator can typically be used as a way to reduce the variance of a black-box estimator (e.g., a decision tree), by introducing randomization into its construction procedure and then making an ensemble out of it.","category":"page"},{"location":"models/FeatureSelector_MLJModels/#FeatureSelector_MLJModels","page":"FeatureSelector","title":"FeatureSelector","text":"","category":"section"},{"location":"models/FeatureSelector_MLJModels/","page":"FeatureSelector","title":"FeatureSelector","text":"FeatureSelector","category":"page"},{"location":"models/FeatureSelector_MLJModels/","page":"FeatureSelector","title":"FeatureSelector","text":"A model type for constructing a feature selector, based on MLJModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/FeatureSelector_MLJModels/","page":"FeatureSelector","title":"FeatureSelector","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/FeatureSelector_MLJModels/","page":"FeatureSelector","title":"FeatureSelector","text":"FeatureSelector = @load FeatureSelector pkg=MLJModels","category":"page"},{"location":"models/FeatureSelector_MLJModels/","page":"FeatureSelector","title":"FeatureSelector","text":"Do model = FeatureSelector() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in FeatureSelector(features=...).","category":"page"},{"location":"models/FeatureSelector_MLJModels/","page":"FeatureSelector","title":"FeatureSelector","text":"Use this model to select features (columns) of a table, usually as part of a model Pipeline.","category":"page"},{"location":"models/FeatureSelector_MLJModels/#Training-data","page":"FeatureSelector","title":"Training data","text":"","category":"section"},{"location":"models/FeatureSelector_MLJModels/","page":"FeatureSelector","title":"FeatureSelector","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/FeatureSelector_MLJModels/","page":"FeatureSelector","title":"FeatureSelector","text":"mach = machine(model, X)","category":"page"},{"location":"models/FeatureSelector_MLJModels/","page":"FeatureSelector","title":"FeatureSelector","text":"where","category":"page"},{"location":"models/FeatureSelector_MLJModels/","page":"FeatureSelector","title":"FeatureSelector","text":"X: any table of input features, where \"table\" is in the sense of Tables.jl","category":"page"},{"location":"models/FeatureSelector_MLJModels/","page":"FeatureSelector","title":"FeatureSelector","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/FeatureSelector_MLJModels/#Hyper-parameters","page":"FeatureSelector","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/FeatureSelector_MLJModels/","page":"FeatureSelector","title":"FeatureSelector","text":"features: one of the following, with the behavior indicated:\n[] (empty, the default): filter out all features (columns) which were not encountered in training\nnon-empty vector of feature names (symbols): keep only the specified features (ignore=false) or keep only unspecified features (ignore=true)\nfunction or other callable: keep a feature if the callable returns true on its name. 
For example, specifying FeatureSelector(features = name -> name in [:x1, :x3], ignore = true) has the same effect as FeatureSelector(features = [:x1, :x3], ignore = true), namely to select all features, with the exception of :x1 and :x3.\nignore: whether to ignore or keep specified features, as explained above","category":"page"},{"location":"models/FeatureSelector_MLJModels/#Operations","page":"FeatureSelector","title":"Operations","text":"","category":"section"},{"location":"models/FeatureSelector_MLJModels/","page":"FeatureSelector","title":"FeatureSelector","text":"transform(mach, Xnew): select features from the table Xnew as specified by the model, taking features seen during training into account, if relevant","category":"page"},{"location":"models/FeatureSelector_MLJModels/#Fitted-parameters","page":"FeatureSelector","title":"Fitted parameters","text":"","category":"section"},{"location":"models/FeatureSelector_MLJModels/","page":"FeatureSelector","title":"FeatureSelector","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/FeatureSelector_MLJModels/","page":"FeatureSelector","title":"FeatureSelector","text":"features_to_keep: the features that will be selected","category":"page"},{"location":"models/FeatureSelector_MLJModels/#Example","page":"FeatureSelector","title":"Example","text":"","category":"section"},{"location":"models/FeatureSelector_MLJModels/","page":"FeatureSelector","title":"FeatureSelector","text":"using MLJ\n\nX = (ordinal1 = [1, 2, 3],\n ordinal2 = coerce([\"x\", \"y\", \"x\"], OrderedFactor),\n ordinal3 = [10.0, 20.0, 30.0],\n ordinal4 = [-20.0, -30.0, -40.0],\n nominal = coerce([\"Your father\", \"he\", \"is\"], Multiclass));\n\nselector = FeatureSelector(features=[:ordinal3, ], ignore=true);\n\njulia> transform(fit!(machine(selector, X)), X)\n(ordinal1 = [1, 2, 3],\n ordinal2 = CategoricalValue{Symbol,UInt32}[\"x\", \"y\", \"x\"],\n ordinal4 = [-20.0, -30.0, -40.0],\n nominal = CategoricalValue{String,UInt32}[\"Your father\", \"he\", \"is\"],)\n","category":"page"},{"location":"models/GeneralImputer_BetaML/#GeneralImputer_BetaML","page":"GeneralImputer","title":"GeneralImputer","text":"","category":"section"},{"location":"models/GeneralImputer_BetaML/","page":"GeneralImputer","title":"GeneralImputer","text":"mutable struct GeneralImputer <: MLJModelInterface.Unsupervised","category":"page"},{"location":"models/GeneralImputer_BetaML/","page":"GeneralImputer","title":"GeneralImputer","text":"Impute missing values using arbitrary learning models, from the Beta Machine Learning Toolkit (BetaML).","category":"page"},{"location":"models/GeneralImputer_BetaML/","page":"GeneralImputer","title":"GeneralImputer","text":"Impute missing values using a vector (one per column) of arbitrary learning models (classifiers/regressors, not necessarily from BetaML) that implement the interface m = Model([options]), train!(m,X,Y) and predict(m,X).","category":"page"},{"location":"models/GeneralImputer_BetaML/#Hyperparameters:","page":"GeneralImputer","title":"Hyperparameters:","text":"","category":"section"},{"location":"models/GeneralImputer_BetaML/","page":"GeneralImputer","title":"GeneralImputer","text":"cols_to_impute::Union{String, Vector{Int64}}: Columns in the matrix for which to create an imputation model, i.e. to impute. It can be a vector of columns IDs (positions), or the keywords \"auto\" (default) or \"all\". With \"auto\" the model automatically detects the columns with missing data and impute only them. 
You may manually specify the columns, or use \"all\" to create an imputation model for every column during training, even if the training data contain no missing values, so that the trained model can later be applied to further data with possibly missing values.\nestimator::Any: An estimator model (regressor or classifier), possibly with its options (hyper-parameters), to be used to impute the various columns of the matrix. It can also be a cols_to_impute-length vector of different estimators, to use a different estimator for each column (dimension) to impute, for example when some columns are categorical (and will hence require a classifier) and some others are numerical (hence requiring a regressor). [default: nothing, i.e. use BetaML random forests, handling classification and regression jobs automatically].\nmissing_supported::Union{Bool, Vector{Bool}}: Whether the estimator(s) used to predict the missing data themselves support missing data in the training features (X). If not, when the model for a certain dimension is fitted, dimensions with missing data in the same rows as those where imputation is needed are dropped, and then only non-missing rows in the other remaining dimensions are considered. It can be a vector of boolean values to specify this property for each individual estimator, or a single boolean value to apply to all the estimators [default: false]\nfit_function::Union{Function, Vector{Function}}: The function used by the estimator(s) to fit the model. It should take as first argument the model itself, as second argument a matrix representing the features, and as third argument a vector representing the labels. This parameter is mandatory for non-BetaML estimators and can be a single value or a vector (one per estimator) in case different estimator packages are used. [default: BetaML.fit!]\npredict_function::Union{Function, Vector{Function}}: The function used by the estimator(s) to predict the labels. It should take as first argument the model itself and as second argument a matrix representing the features. This parameter is mandatory for non-BetaML estimators and can be a single value or a vector (one per estimator) in case different estimator packages are used. [default: BetaML.predict]\nrecursive_passages::Int64: Defines the number of times to go through the various columns to impute their data. Useful when there are data to impute on multiple columns. The order of the first passage is given by the decreasing number of missing values per column; the other passages are random [default: 1].\nrng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]. 
Note that this influence only the specific GeneralImputer code, the individual estimators may have their own rng (or similar) parameter.","category":"page"},{"location":"models/GeneralImputer_BetaML/#Examples-:","page":"GeneralImputer","title":"Examples :","text":"","category":"section"},{"location":"models/GeneralImputer_BetaML/","page":"GeneralImputer","title":"GeneralImputer","text":"Using BetaML models:","category":"page"},{"location":"models/GeneralImputer_BetaML/","page":"GeneralImputer","title":"GeneralImputer","text":"julia> using MLJ;\njulia> import BetaML ## The library from which to get the individual estimators to be used for each column imputation\njulia> X = [\"a\" 8.2;\n \"a\" missing;\n \"a\" 7.8;\n \"b\" 21;\n \"b\" 18;\n \"c\" -0.9;\n missing 20;\n \"c\" -1.8;\n missing -2.3;\n \"c\" -2.4] |> table ;\njulia> modelType = @load GeneralImputer pkg = \"BetaML\" verbosity=0\nBetaML.Imputation.GeneralImputer\njulia> model = modelType(estimator=BetaML.DecisionTreeEstimator(),recursive_passages=2);\njulia> mach = machine(model, X);\njulia> fit!(mach);\n[ Info: Training machine(GeneralImputer(cols_to_impute = auto, …), …).\njulia> X_full = transform(mach) |> MLJ.matrix\n10×2 Matrix{Any}:\n \"a\" 8.2\n \"a\" 8.0\n \"a\" 7.8\n \"b\" 21\n \"b\" 18\n \"c\" -0.9\n \"b\" 20\n \"c\" -1.8\n \"c\" -2.3\n \"c\" -2.4","category":"page"},{"location":"models/GeneralImputer_BetaML/","page":"GeneralImputer","title":"GeneralImputer","text":"Using third party packages (in this example DecisionTree):","category":"page"},{"location":"models/GeneralImputer_BetaML/","page":"GeneralImputer","title":"GeneralImputer","text":"julia> using MLJ;\njulia> import DecisionTree ## An example of external estimators to be used for each column imputation\njulia> X = [\"a\" 8.2;\n \"a\" missing;\n \"a\" 7.8;\n \"b\" 21;\n \"b\" 18;\n \"c\" -0.9;\n missing 20;\n \"c\" -1.8;\n missing -2.3;\n \"c\" -2.4] |> table ;\njulia> modelType = @load GeneralImputer pkg = \"BetaML\" verbosity=0\nBetaML.Imputation.GeneralImputer\njulia> model = modelType(estimator=[DecisionTree.DecisionTreeClassifier(),DecisionTree.DecisionTreeRegressor()], fit_function=DecisionTree.fit!,predict_function=DecisionTree.predict,recursive_passages=2);\njulia> mach = machine(model, X);\njulia> fit!(mach);\n[ Info: Training machine(GeneralImputer(cols_to_impute = auto, …), …).\njulia> X_full = transform(mach) |> MLJ.matrix\n10×2 Matrix{Any}:\n \"a\" 8.2\n \"a\" 7.51111\n \"a\" 7.8\n \"b\" 21\n \"b\" 18\n \"c\" -0.9\n \"b\" 20\n \"c\" -1.8\n \"c\" -2.3\n \"c\" -2.4","category":"page"},{"location":"third_party_packages/#Third-Party-Packages","page":"Third Party Packages","title":"Third Party Packages","text":"","category":"section"},{"location":"third_party_packages/","page":"Third Party Packages","title":"Third Party Packages","text":"A list of third-party packages with integration with MLJ.","category":"page"},{"location":"third_party_packages/","page":"Third Party Packages","title":"Third Party Packages","text":"Last updated December 2020.","category":"page"},{"location":"third_party_packages/","page":"Third Party Packages","title":"Third Party Packages","text":"Pull requests to update this list are very welcome. 
Otherwise, you may post an issue requesting this here.","category":"page"},{"location":"third_party_packages/#Packages-providing-models-in-the-MLJ-model-registry","page":"Third Party Packages","title":"Packages providing models in the MLJ model registry","text":"","category":"section"},{"location":"third_party_packages/","page":"Third Party Packages","title":"Third Party Packages","text":"See List of Supported Models","category":"page"},{"location":"third_party_packages/#Providing-unregistered-models:","page":"Third Party Packages","title":"Providing unregistered models:","text":"","category":"section"},{"location":"third_party_packages/","page":"Third Party Packages","title":"Third Party Packages","text":"SossMLJ.jl\nTimeSeriesClassification","category":"page"},{"location":"third_party_packages/#Packages-providing-other-kinds-of-functionality:","page":"Third Party Packages","title":"Packages providing other kinds of functionality:","text":"","category":"section"},{"location":"third_party_packages/","page":"Third Party Packages","title":"Third Party Packages","text":"MLJParticleSwarmOptimization.jl (hyper-parameter optimization strategy)\nTreeParzen.jl (hyper-parameter optimization strategy)\nShapley.jl (feature ranking / interpretation)\nShapML.jl (feature ranking / interpretation)\nFairness.jl (FAIRness metrics)\nOutlierDetection.jl (provides the ProbabilisticDetector wrapper and other outlier detection meta-functionality)\nConformalPrediction.jl (predictive uncertainty quantification through conformal prediction)","category":"page"},{"location":"learning_networks/#Learning-Networks","page":"Learning Networks","title":"Learning Networks","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Below is a practical guide to the MLJ implementation of learning networks, which have been described more abstractly in the article:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Anthony D. Blaom and Sebastian J. Voller (2020): Flexible model composition in machine learning and its implementation in MLJ. Preprint, arXiv:2012.15505.","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Learning networks, an advanced but powerful MLJ feature, are \"blueprints\" for combining models in flexible ways, beyond ordinary linear pipelines and simple model ensembles. They are simple transformations of your existing workflows which can be \"exported\" to define new, re-usable composite model types (models which typically have other models as hyperparameters).","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Pipeline models (see Pipeline), and model stacks (see Stack) are both implemented internally as exported learning networks.","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"note: Note\nWhile learning networks can be used for complex machine learning workflows, their main purpose is for defining new stand-alone model types, which behave just like any other model type: Instances can be evaluated, tuned, inserted into pipelines, etc. 
In serious applications, users are encouraged to export their learning networks, as explained under Exporting a learning network as a new model type below, after testing the network, using a small training dataset.","category":"page"},{"location":"learning_networks/#Learning-networks-by-example","page":"Learning Networks","title":"Learning networks by example","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Learning networks are best explained by way of example.","category":"page"},{"location":"learning_networks/#Lazy-computation","page":"Learning Networks","title":"Lazy computation","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"The core idea of a learning network is delayed or lazy computation. Instead of","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"using MLJ\nMLJ.color_off()","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"X = 4\nY = 3\nZ = 2*X\nW = Y + Z\nW","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"we can do","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"using MLJ\n\nX = source(4)\nY = source(3)\nZ = 2*X\nW = Y + Z\nW()","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"In the first computation X, Y, Z and W are all bound to ordinary data. In the second, they are bound to objects called nodes. The special nodes X and Y constitute \"entry points\" for data, and are called source nodes. As the terminology suggests, we can imagine these objects as part of a \"network\" (a directed acyclic graph) which can aid conceptualization (but is less useful in more complicated examples):","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"(Image: )","category":"page"},{"location":"learning_networks/#The-origin-of-a-node","page":"Learning Networks","title":"The origin of a node","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"The source nodes on which a given node depends are called the origins of the node:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"os = origins(W)","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"X in os","category":"page"},{"location":"learning_networks/#Re-using-a-network","page":"Learning Networks","title":"Re-using a network","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"The advantage of lazy evaluation is that we can change data at a source node to repeat the calculation with new data. 
One way to do this (discouraged in practice) is to use rebind!:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Z()","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"rebind!(X, 6) # demonstration only!\nZ()","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"However, if a node has a unique origin, then one instead calls the node on the new data one would like to rebind to that origin:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"origins(Z)","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Z(6)","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Z(4)","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"This has the advantage that you don't need to locate the origin and rebind data directly, and the unique-origin restriction turns out to be sufficient for the applications to learning we have in mind.","category":"page"},{"location":"learning_networks/#node_overloading","page":"Learning Networks","title":"Overloading functions for use on nodes","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Several built-in function like * and + above are overloaded in MLJBase to work on nodes, as illustrated above. Others that work out-of-the-box include: MLJBase.matrix, MLJBase.table, vcat, hcat, mean, median, mode, first, last, as well as broadcasted versions of log, exp, mean, mode and median. A function like sqrt is not overloaded, so that Q = sqrt(Z) will throw an error. Instead, we do","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Q = node(sqrt, Z)\nZ()","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Q()","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"You can learn more about the node function under More on defining new nodes","category":"page"},{"location":"learning_networks/#A-network-that-learns","page":"Learning Networks","title":"A network that learns","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"To incorporate learning in a network of nodes MLJ:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Allows binding of machines to nodes instead of data\nGenerates \"operation\" nodes when calling an operation like predict or transform on a machine and node input data. 
Such nodes point to both a machine (storing learned parameters) and the node from which to fetch data for applying the operation (which, unlike the nodes seen so far, depend on learned parameters to generate output).","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"For an example of a learning network that actually learns, we first synthesize some training data X, y, and production data Xnew:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"using MLJ\nX, y = make_blobs(cluster_std=10.0, rng=123) # `X` is a table, `y` a vector\nXnew, _ = make_blobs(3) # `Xnew` is a table with the same number of columns\nnothing # hide","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"We choose one model to do some dimension reduction, and another to perform classification:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"pca = (@load PCA pkg=MultivariateStats verbosity=0)()\ntree = (@load DecisionTreeClassifier pkg=DecisionTree verbosity=0)()\nnothing # hide","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"To make our learning lazy, we wrap the training data as source nodes:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Xs = source(X)\nys = source(y)\nnothing # hide","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"And, finally, proceed as we would in an ordinary MLJ workflow, with the exception that there is no need to fit! our machines, as training will be carried out lazily later:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"mach1 = machine(pca, Xs)\nx = transform(mach1, Xs) # defines a new node because `Xs` is a node\n\nmach2 = machine(tree, x, ys)\nyhat = predict(mach2, x) # defines a new node because `x` is a node","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Note that mach1 and mach2 are not themselves nodes. They point to the nodes they need to call to get training data, and they are in turn pointed to by other nodes. In fact, an interesting implementation detail is that an \"ordinary\" machine is not actually bound directly to data, but bound to data wrapped in source nodes.","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"machine(pca, Xnew).args[1] # `Xnew` is ordinary data","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Before calling a node, we need to fit! the node, to trigger training of all the machines on which it depends:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"fit!(yhat) # can include same keyword options for `fit!(::Machine, ...)`\nyhat()[1:2] # or `yhat(rows=2)`","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"This last represents the prediction on the training data, because that's what resides at our source nodes. 
However, yhat has the unique origin X (because \"training edges\" in the complete associated directed graph are excluded for this purpose). We can therefore call yhat on our production data to get the corresponding predictions:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"yhat(Xnew)","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Training is smart, in the sense that mutating a hyper-parameter of some component model does not force retraining of upstream machines:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"tree.max_depth = 1\nfit!(yhat)\nyhat(Xnew)","category":"page"},{"location":"learning_networks/#Multithreaded-training","page":"Learning Networks","title":"Multithreaded training","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"A more complicated learning network may contain machines that can be trained in parallel. In that case, a call like the following may speed up training:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"tree.max_depth = 2\nfit!(yhat, acceleration=CPUThreads())\nnothing # hide","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Currently, only CPU1() (default) and CPUThreads() are supported here.","category":"page"},{"location":"learning_networks/#Exporting-a-learning-network-as-a-new-model-type","page":"Learning Networks","title":"Exporting a learning network as a new model type","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Once a learning network has been tested, typically on some small dummy data set, it is ready to be exported as a new, stand-alone, re-usable model type (unattached to any data). We demonstrate the process by way of examples of increasing complexity:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Example A - Mini-pipeline\nMore on replacing models with symbols\nExample B - Multiple operations: transform and inverse transform\nExample C - Blending predictions and exposing internal network state in reports\nExample D - Multiple nodes pointing to the same machine\nExample E - Coupling component model hyper-parameters\nMore on defining new nodes\nExample F - Wrapping a model in a data-dependent tuning strategy","category":"page"},{"location":"learning_networks/#Example-A-Mini-pipeline","page":"Learning Networks","title":"Example A - Mini-pipeline","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"First we export the simple learning network defined above. 
(This is for illustration purposes; in practice using the Pipeline syntax model1 |> model2 is more convenient.)","category":"page"},{"location":"learning_networks/#Step-1-Define-a-new-model-struct","page":"Learning Networks","title":"Step 1 - Define a new model struct","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"We need a type with two fields, one for the preprocessor (pca in the network above) and one for the classifier (tree in the network above).","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"The DecisionTreeClassifier type of tree has supertype Probabilistic, because it makes probabilistic predictions, and we assume any other classifier we want to swap out will be the same.","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"supertype(typeof(tree))","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"In particular, our composite model will also need Probabilistic as supertype. In fact, we must give it the intermediate supertype ProbabilisticNetworkComposite <: Probabilistic, so that we additionally flag it as an exported learning network model type:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"mutable struct CompositeA <: ProbabilisticNetworkComposite\n preprocessor\n classifier\nend","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"The common alternatives are DeterministicNetworkComposite and UnsupervisedNetworkComposite. But all options can be viewed as follows:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"using MLJBase\nNetworkComposite","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"We next make our learning network model-generic by substituting each model instance with the corresponding symbol representing a property (field) of the new model struct:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"mach1 = machine(:preprocessor, Xs) # <---- `pca` swapped out for `:preprocessor`\nx = transform(mach1, Xs)\nmach2 = machine(:classifier, x, ys) # <---- `tree` swapped out for `:classifier`\nyhat = predict(mach2, x)","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Incidentally, this network can be used as before, except that we must provide an instance of CompositeA in our fit! 
calls, to indicate what actual models the symbols are being substituted with:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"composite_a = CompositeA(pca, ConstantClassifier())\nfit!(yhat, composite=composite_a)\nyhat(Xnew)","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"In this case :preprocessor is being substituted by pca, and :classifier by ConstantClassifier() for training.","category":"page"},{"location":"learning_networks/#Step-2-Wrap-the-learning-network-in-prefit","page":"Learning Networks","title":"Step 2 - Wrap the learning network in prefit","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Literally copy and paste the learning network above into the definition of a method called prefit, as shown below (if you have implemented your own MLJ model, you will notice this has the same signature as MLJModelInterface.fit):","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"import MLJBase\nfunction MLJBase.prefit(composite::CompositeA, verbosity, X, y)\n\n # the learning network from above:\n Xs = source(X)\n ys = source(y)\n mach1 = machine(:preprocessor, Xs)\n x = transform(mach1, Xs)\n mach2 = machine(:classifier, x, ys)\n yhat = predict(mach2, x)\n\n verbosity > 0 && @info \"I'm a noisy fellow!\"\n\n # return \"learning network interface\":\n return (; predict=yhat)\nend","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"That's it.","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Generally, prefit always returns a learning network interface; see MLJBase.prefit for what this means in general. In this example, the interface dictates that calling predict(mach, Xnew) on a machine mach bound to some instance of CompositeA should internally call yhat(Xnew).","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Here's our new composite model type CompositeA in action, combining standardization with KNN classification:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"using MLJ\nX, y = @load_iris\n\nknn = (@load KNNClassifier pkg=NearestNeighborModels verbosity=0)()\ncomposite_a = CompositeA(Standardizer(), knn)","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"mach = machine(composite_a, X, y) |> fit!\npredict(mach, X)[1:2]","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"report(mach).preprocessor","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"fitted_params(mach).classifier","category":"page"},{"location":"learning_networks/#More-on-replacing-models-with-symbols","page":"Learning Networks","title":"More on replacing models with symbols","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Only the first argument model in some expression machine(model, ...) can be replaced with a symbol. 
These replacements function as hooks for exposing reports and fitted parameters of component models in the report and fitted parameters of the composite model, but they are not strictly necessary. For example, instead of the line mach1 = machine(:preprocessor, Xs) in the prefit definition, we can do mach1 = machine(composite.preprocessor, Xs). However, report and fitted_params will not include items for the :preprocessor component model in that case.","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"If a component model is not explicitly bound to data in a machine (for example, because it is first wrapped in TunedModel), then there are ways to explicitly expose associated fitted parameters or report items. See Example F below.","category":"page"},{"location":"learning_networks/#Example-B-Multiple-operations:-transform-and-inverse-transform","page":"Learning Networks","title":"Example B - Multiple operations: transform and inverse transform","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Here's a second mini-pipeline example composing two transformers which both implement inverse transform. We show how to implement an inverse_transform for the composite model too.","category":"page"},{"location":"learning_networks/#Step-1-Define-a-new-model-struct-2","page":"Learning Networks","title":"Step 1 - Define a new model struct","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"using MLJ\nimport MLJBase\n\nmutable struct CompositeB <: DeterministicNetworkComposite\n transformer1\n transformer2\nend","category":"page"},{"location":"learning_networks/#Step-2-Wrap-the-learning-network-in-prefit-2","page":"Learning Networks","title":"Step 2 - Wrap the learning network in prefit","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"function MLJBase.prefit(composite::CompositeB, verbosity, X)\n Xs = source(X)\n\n mach1 = machine(:transformer1, Xs)\n X1 = transform(mach1, Xs)\n mach2 = machine(:transformer2, X1)\n X2 = transform(mach2, X1)\n\n W1 = inverse_transform(mach2, Xs)\n W2 = inverse_transform(mach1, W1)\n\n # the learning network interface:\n return (; transform=X2, inverse_transform=W2)\nend","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Here's a demonstration:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"X = rand(100)\n\ncomposite_b = CompositeB(UnivariateBoxCoxTransformer(), Standardizer())\nmach = machine(composite_b, X) |> fit!\nW = transform(mach, X)\n@assert inverse_transform(mach, W) ≈ X","category":"page"},{"location":"learning_networks/#Example-C-Blending-predictions-and-exposing-internal-network-state-in-reports","page":"Learning Networks","title":"Example C - Blending predictions and exposing internal network state in reports","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"The code below defines a new composite model type CompositeC that predicts by taking the weighted average of two regressors, and additionally exposes, in the model's report, a measure of disagreement between the two models at time of training. 
In addition to the two regressors, the new model has two other fields:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"mix, controlling the weighting\nacceleration, for the mode of acceleration for training the model (e.g., CPUThreads()).","category":"page"},{"location":"learning_networks/#Step-1-Define-a-new-model-struct-3","page":"Learning Networks","title":"Step 1 - Define a new model struct","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"using MLJ\nimport MLJBase\n\nmutable struct CompositeC <: DeterministicNetworkComposite\n regressor1\n regressor2\n mix::Float64\n acceleration\nend","category":"page"},{"location":"learning_networks/#Step-2-Wrap-the-learning-network-in-prefit-3","page":"Learning Networks","title":"Step 2 - Wrap the learning network in prefit","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"function MLJBase.prefit(composite::CompositeC, verbosity, X, y)\n\n Xs = source(X)\n ys = source(y)\n\n mach1 = machine(:regressor1, Xs, ys)\n mach2 = machine(:regressor2, Xs, ys)\n\n yhat1 = predict(mach1, Xs)\n yhat2 = predict(mach2, Xs)\n\n # node to return disagreement between the regressor predictions:\n disagreement = node((y1, y2) -> l2(y1, y2) |> mean, yhat1, yhat2)\n\n # get the weighted average the predictions of the regressors:\n λ = composite.mix\n yhat = (1 - λ)*yhat1 + λ*yhat2\n\n # the learning network interface:\n return (\n predict = yhat,\n report= (; training_disagreement=disagreement),\n acceleration = composite.acceleration,\n )\n\nend","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Here's a demonstration:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"X, y = make_regression() # a table and a vector\n\nknn = (@load KNNRegressor pkg=NearestNeighborModels verbosity=0)()\ntree = (@load DecisionTreeRegressor pkg=DecisionTree verbosity=0)()\ncomposite_c = CompositeC(knn, tree, 0.2, CPUThreads())\nmach = machine(composite_c, X, y) |> fit!\nXnew, _ = make_regression(3)\npredict(mach, Xnew)","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"report(mach)","category":"page"},{"location":"learning_networks/#Example-D-Multiple-nodes-pointing-to-the-same-machine","page":"Learning Networks","title":"Example D - Multiple nodes pointing to the same machine","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"When incorporating learned target transformations (such as a standardization) in supervised learning, it is desirable to apply the inverse transformation to predictions, to return them to the original scale. This means re-using learned parameters from an earlier part of your workflow. 
This poses no problem here, as the next example demonstrates.","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"The model type CompositeD defined below applies a preprocessing transformation to input data X (e.g., standardization), learns a transformation for the target y (e.g., an optimal Box-Cox transformation), predicts new target values using a regressor (e.g., Ridge regression), and then inverse-transforms those predictions to restore them to the original scale. (This represents a model we could alternatively build using the TransformedTargetModel wrapper and a Pipeline.)","category":"page"},{"location":"learning_networks/#Step-1-Define-a-new-model-struct-4","page":"Learning Networks","title":"Step 1 - Define a new model struct","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"using MLJ\nimport MLJBase\n\nmutable struct CompositeD <: DeterministicNetworkComposite\n preprocessor\n target_transformer\n regressor\n acceleration\nend","category":"page"},{"location":"learning_networks/#Step-2-Wrap-the-learning-network-in-prefit-4","page":"Learning Networks","title":"Step 2 - Wrap the learning network in prefit","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Notice that both of the nodes z and yhat in the wrapped learning network point to the same machine (learned parameters) mach2.","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"function MLJBase.prefit(composite::CompositeD, verbosity, X, y)\n\n Xs = source(X)\n ys = source(y)\n\n mach1 = machine(:preprocessor, Xs)\n W = transform(mach1, Xs)\n\n mach2 = machine(:target_transformer, ys)\n z = transform(mach2, ys)\n\n mach3 =machine(:regressor, W, z)\n zhat = predict(mach3, W)\n\n yhat = inverse_transform(mach2, zhat)\n\n # the learning network interface:\n return (\n predict = yhat,\n acceleration = composite.acceleration,\n )\n\nend","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"The flow of information in the wrapped learning network is visualized below.","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"(Image: )","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Here's an application of our new composite to the Boston dataset:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"X, y = @load_boston\n\nstand = Standardizer()\nbox = UnivariateBoxCoxTransformer()\nridge = (@load RidgeRegressor pkg=MultivariateStats verbosity=0)(lambda=92)\ncomposite_d = CompositeD(stand, box, ridge, CPU1())\nevaluate(composite_d, X, y, resampling=CV(nfolds=5), measure=l2, verbosity=0)","category":"page"},{"location":"learning_networks/#Example-E-Coupling-component-model-hyper-parameters","page":"Learning Networks","title":"Example E - Coupling component model hyper-parameters","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"The composite model in this example combines a clustering model used to reduce the dimension of the feature space (KMeans or KMedoids from Clustering.jl) with ridge regression, but has the following \"coupling\" 
of the hyperparameters: The amount of ridge regularization depends on the number of specified clusters k, with less regularization for a greater number of clusters. It includes a user-specified coupling coefficient c, and exposes the solver hyper-parameter of the ridge regressor. (Neither the clusterer nor ridge regressor are themselves hyperparameters of the composite.)","category":"page"},{"location":"learning_networks/#Step-1-Define-a-new-model-struct-5","page":"Learning Networks","title":"Step 1 - Define a new model struct","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"using MLJ\nimport MLJBase\n\nmutable struct CompositeE <: DeterministicNetworkComposite\n clusterer # `:kmeans` or `:kmedoids`\n k::Int # number of clusters\n solver # a ridge regression parameter we want to expose\n c::Float64 # a \"coupling\" coefficient\nend","category":"page"},{"location":"learning_networks/#Step-2-Wrap-the-learning-network-in-prefit-5","page":"Learning Networks","title":"Step 2 - Wrap the learning network in prefit","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"RidgeRegressor = @load RidgeRegressor pkg=MLJLinearModels verbosity=0\nKMeans = @load KMeans pkg=Clustering verbosity=0\nKMedoids = @load KMedoids pkg=Clustering verbosity=0\n\nfunction MLJBase.prefit(composite::CompositeE, verbosity, X, y)\n\n Xs = source(X)\n ys = source(y)\n\n k = composite.k\n solver = composite.solver\n c = composite.c\n\n clusterer = composite.clusterer == :kmeans ? KMeans(; k) : KMedoids(; k)\n mach1 = machine(clusterer, Xs)\n Xsmall = transform(mach1, Xs)\n\n # the coupling - ridge regularization depends on the number of\n # clusters `k` and the coupling coefficient `c`:\n lambda = exp(-c/k)\n\n ridge = RidgeRegressor(; lambda, solver)\n mach2 = machine(ridge, Xsmall, ys)\n yhat = predict(mach2, Xsmall)\n\n return (predict=yhat,)\nend","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Here's an application to the Boston dataset in which we optimize the coupling coefficient (see Tuning Models for more on hyper-parameter optimization):","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"X, y = @load_boston # a table and a vector\n\ncomposite_e = CompositeE(:kmeans, 3, nothing, 0.5)\nr = range(composite_e, :c, lower = -2, upper=2, scale=x->10^x)\ntuned_composite_e = TunedModel(\n composite_e,\n range=r,\n tuning=RandomSearch(rng=123),\n measure=l2,\n resampling=CV(nfolds=6),\n n=100,\n)\nmach = machine(tuned_composite_e, X, y) |> fit!\nreport(mach).best_model","category":"page"},{"location":"learning_networks/#More-on-defining-new-nodes","page":"Learning Networks","title":"More on defining new nodes","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Overloading ordinary functions for nodes has already been discussed above. 
Here's another example:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"divide(x, y) = x/y\n\nX = source(2)\nY = source(3)\n\nZ = node(divide, X, Y)\nnothing # hide","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"This means Z() returns divide(X(), Y()), which is divide(2, 3) in this case:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Z()","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"We cannot call Z with arguments (e.g., Z(2)) because it does not have a unique origin.","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"In all the node examples so far, the first argument of node is a function, and all other arguments are nodes - one node for each argument of the function. A node constructed in this way is called a static node. A dynamic node, which directly depends on the outcome of a training event, is constructed by giving a machine as the second argument, to be passed as the first argument of the function in a node call. For example, we can do","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Xs = source(rand(4))\nmach = machine(Standardizer(), Xs)\nN = node(transform, mach, Xs) |> fit!\nnothing # hide","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Then N has the following calling properties:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"N() returns transform(mach, Xs())\nN(Xnew) returns transform(mach, Xs(Xnew)) (here Xs(Xnew) is just Xnew, because Xs is a source node)","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"N()","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"N(rand(2))","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"In fact, this is precisely how the transform method is internally overloaded to work, when called with a node argument (to return a node instead of data). That is, internally there exists code that amounts to the definition","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"transform(mach, X::AbstractNode) = node(transform, mach, X)","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Here AbstractNode is the common super-type of Node and Source.","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"It is sometimes useful to create dynamic nodes with no node arguments, as in","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Xs = source(rand(10))\nmach = machine(Standardizer(), Xs)\nN = node(fitted_params, mach) |> fit!\nN()","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Static nodes can also have zero node arguments. 
These may be viewed as \"constant\" nodes:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"N = Node(()-> 42)\nN()","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Example F below demonstrates the use of static and dynamic nodes. For more details, see the node docstring.","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"There is also an experimental macro @node. If Z is an AbstractNode (Z = source(16), say) then instead of","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Q = node(sqrt, Z)","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"one can do","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Q = @node sqrt(Z)","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"(so that Q() == 4). Here's a more complicated application of @node to row-shuffle a table:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"using MLJ, Random\nX = (x1 = [1, 2, 3, 4, 5],\n x2 = [:one, :two, :three, :four, :five])\nrows(X) = 1:nrows(X)\n\nXs = source(X)\nrs = @node rows(Xs)\nW = @node selectrows(Xs, @node shuffle(rs))\n\nW()","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Important. An argument not in global scope is assumed by @node to be a node or source.","category":"page"},{"location":"learning_networks/#Example-F-Wrapping-a-model-in-a-data-dependent-tuning-strategy","page":"Learning Networks","title":"Example F - Wrapping a model in a data-dependent tuning strategy","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"When the regularization parameter of a Lasso model is optimized, one commonly searches over a parameter range depending on properties of the training data. Indeed, Lasso (and, more generally, elastic net) implementations commonly provide a method to carry out this data-dependent optimization automatically, using cross-validation. The following example shows how to transform the LassoRegressor model type from MLJLinearModels.jl into a self-tuning model type LassoCVRegressor using the commonly implemented data-dependent tuning strategy. 
A new dimensionless hyperparameter epsilon controls the lower bound on the parameter range.","category":"page"},{"location":"learning_networks/#Step-1-Define-a-new-model-struct-6","page":"Learning Networks","title":"Step 1 - Define a new model struct","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"using MLJ\nimport MLJBase\n\nmutable struct LassoCVRegressor <: DeterministicNetworkComposite\n lasso # the atomic lasso model (`lasso.lambda` is ignored)\n epsilon::Float64 # controls lower bound of `lasso.lambda` in tuning\n resampling # resampling strategy for optimization of `lambda`\nend\n\n# keyword constructor for convenience:\nLassoRegressor = @load LassoRegressor pkg=MLJLinearModels verbosity=0\nLassoCVRegressor(;\n lasso=LassoRegressor(),\n epsilon=0.001,\n resampling=CV(nfolds=6),\n) = LassoCVRegressor(\n lasso,\n epsilon,\n resampling,\n)\nnothing # hide","category":"page"},{"location":"learning_networks/#Step-2-Wrap-the-learning-network-in-prefit-6","page":"Learning Networks","title":"Step 2 - Wrap the learning network in prefit","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"In this case, there is no model -> :symbol replacement that makes sense here, because the model is getting wrapped by TunedModel before being bound to nodes in a machine. However, we can expose the the learned lasso coefs and intercept using fitted parameter nodes; and expose the optimal lambda, and range searched, using report nodes (as previously demonstrated in Example C).","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"function MLJBase.prefit(composite::LassoCVRegressor, verbosity, X, y)\n\n λ_max = maximum(abs.(MLJ.matrix(X)'y))\n\n Xs = source(X)\n ys = source(y)\n\n r = range(\n composite.lasso,\n :lambda,\n lower=composite.epsilon*λ_max,\n upper=λ_max,\n scale=:log10,\n )\n\n lambda_range = node(()->r) # a \"constant\" report node\n\n tuned_lasso = TunedModel(\n composite.lasso,\n tuning=Grid(shuffle=false),\n range = r,\n measure = l2,\n resampling=composite.resampling,\n )\n mach = machine(tuned_lasso, Xs, ys)\n\n R = node(report, mach) # `R()` returns `report(mach)`\n lambda = node(r -> r.best_model.lambda, R) # a report node\n\n F = node(fitted_params, mach) # `F()` returns `fitted_params(mach)`\n coefs = node(f->f.best_fitted_params.coefs, F) # a fitted params node\n intercept = node(f->f.best_fitted_params.intercept, F) # a fitted params node\n\n yhat = predict(mach, Xs)\n\n return (\n predict=yhat,\n fitted_params=(; coefs, intercept),\n report=(; lambda, lambda_range),\n )\n\nend","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Here's a demonstration:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"X, _ = make_regression(1000, 3, rng=123)\ny = X.x2 - X.x2 + 0.005*X.x3 + 0.05*rand(1000)\nlasso_cv = LassoCVRegressor(epsilon=1e-5)\nmach = machine(lasso_cv, X, y) |> fit!\nreport(mach)","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"fitted_params(mach)","category":"page"},{"location":"learning_networks/#The-learning-network-API","page":"Learning Networks","title":"The learning network API","text":"","category":"section"},{"location":"learning_networks/","page":"Learning 
Networks","title":"Learning Networks","text":"Two new julia types are part of learning networks: Source and Node, which share a common abstract supertype AbstractNode.","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Formally, a learning network defines two labeled directed acyclic graphs (DAG's) whose nodes are Node or Source objects, and whose labels are Machine objects. We obtain the first DAG from directed edges of the form N1 - N2 whenever N1 is an argument of N2 (see below). Only this DAG is relevant when calling a node, as discussed in the examples above and below. To form the second DAG (relevant when calling or calling fit! on a node) one adds edges for which N1 is training argument of the machine which labels N1. We call the second, larger DAG, the completed learning network (but note only edges of the smaller network are explicitly drawn in diagrams, for simplicity).","category":"page"},{"location":"learning_networks/#Source-nodes","page":"Learning Networks","title":"Source nodes","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Only source nodes can reference concrete data. A Source object has a single field, data.","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"MLJBase.Source\nsource(X)\nrebind!\nsources\norigins","category":"page"},{"location":"learning_networks/#MLJBase.Source","page":"Learning Networks","title":"MLJBase.Source","text":"Source\n\nType for a learning network source node. Constructed using source, as in source() or source(rand(2,3)).\n\nSee also source, Node.\n\n\n\n\n\n","category":"type"},{"location":"learning_networks/#MLJBase.source-Tuple{Any}","page":"Learning Networks","title":"MLJBase.source","text":"Xs = source(X=nothing)\n\nDefine, a learning network Source object, wrapping some input data X, which can be nothing for purposes of exporting the network as stand-alone model. For training and testing the unexported network, appropriate vectors, tables, or other data containers are expected.\n\nThe calling behaviour of a Source object is this:\n\nXs() = X\nXs(rows=r) = selectrows(X, r) # eg, X[r,:] for a DataFrame\nXs(Xnew) = Xnew\n\nSee also: MLJBase.prefit, sources, origins, node.\n\n\n\n\n\n","category":"method"},{"location":"learning_networks/#MLJBase.rebind!","page":"Learning Networks","title":"MLJBase.rebind!","text":"rebind!(s, X)\n\nAttach new data X to an existing source node s. Not a public method.\n\n\n\n\n\n","category":"function"},{"location":"learning_networks/#MLJBase.sources","page":"Learning Networks","title":"MLJBase.sources","text":"sources(N::AbstractNode)\n\nA vector of all sources referenced by calls N() and fit!(N). These are the sources of the ancestor graph of N when including training edges.\n\nNot to be confused with origins(N), in which training edges are excluded.\n\nSee also: origins, source.\n\n\n\n\n\n","category":"function"},{"location":"learning_networks/#MLJBase.origins","page":"Learning Networks","title":"MLJBase.origins","text":"origins(N)\n\nReturn a list of all origins of a node N accessed by a call N(). These are the source nodes of ancestor graph of N if edges corresponding to training arguments are excluded. 
A Node object cannot be called on new data unless it has a unique origin.\n\nNot to be confused with sources(N) which refers to the same graph but without the training edge deletions.\n\nSee also: node, source.\n\n\n\n\n\n","category":"function"},{"location":"learning_networks/#Nodes","page":"Learning Networks","title":"Nodes","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Node\nnode","category":"page"},{"location":"learning_networks/#MLJBase.Node","page":"Learning Networks","title":"MLJBase.Node","text":"Node{T<:Union{Machine,Nothing}}\n\nType for nodes in a learning network that are not Source nodes.\n\nThe key components of a Node are:\n\nAn operation, which will either be static (a fixed function) or dynamic (such as predict or transform).\nA Machine object, on which to dispatch the operation (nothing if the operation is static). The training arguments of the machine are generally other nodes, including Source nodes.\nUpstream connections to other nodes, called its arguments, possibly including Source nodes, one for each data argument of the operation (typically there's just one).\n\nWhen a node N is called, as in N(), it applies the operation on the machine (if there is one) together with the outcome of calls to its node arguments, to compute the return value. For details on a node's calling behavior, see node.\n\nSee also node, Source, origins, sources, fit!.\n\n\n\n\n\n","category":"type"},{"location":"learning_networks/#MLJBase.node","page":"Learning Networks","title":"MLJBase.node","text":"J = node(f, mach::Machine, args...)\n\nDefines a dynamic Node object J wrapping a dynamic operation f (predict, predict_mean, transform, etc), a nodal machine mach and arguments args. Its calling behaviour, which depends on the outcome of training mach (and, implicitly, on training outcomes affecting its arguments) is this:\n\nJ() = f(mach, args[1](), args[2](), ..., args[n]())\nJ(rows=r) = f(mach, args[1](rows=r), args[2](rows=r), ..., args[n](rows=r))\nJ(X) = f(mach, args[1](X), args[2](X), ..., args[n](X))\n\nGenerally n=1 or n=2 in this latter case.\n\npredict(mach, X::AbsractNode, y::AbstractNode)\npredict_mean(mach, X::AbstractNode, y::AbstractNode)\npredict_median(mach, X::AbstractNode, y::AbstractNode)\npredict_mode(mach, X::AbstractNode, y::AbstractNode)\ntransform(mach, X::AbstractNode)\ninverse_transform(mach, X::AbstractNode)\n\nShortcuts for J = node(predict, mach, X, y), etc.\n\nCalling a node is a recursive operation which terminates in the call to a source node (or nodes). Calling nodes on new data X fails unless the number of such nodes is one.\n\nSee also: Node, @node, source, origins.\n\n\n\n\n\n","category":"function"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"@node","category":"page"},{"location":"learning_networks/#MLJBase.@node","page":"Learning Networks","title":"MLJBase.@node","text":"@node f(...)\n\nConstruct a new node that applies the function f to some combination of nodes, sources and other arguments.\n\nImportant. 
An argument not in global scope is assumed to be a node or source.\n\nExamples\n\njulia> X = source(π)\njulia> W = @node sin(X)\njulia> W()\n1.2246467991473532e-16\n\njulia> X = source(1:10)\njulia> Y = @node selectrows(X, 3:4)\njulia> Y()\n3:4\n\njulia> Y([\"one\", \"two\", \"three\", \"four\"])\n2-element Array{String,1}:\n \"three\"\n \"four\"\n\njulia> X1 = source(4)\njulia> X2 = source(5)\njulia> add(a, b, c) = a + b + c\njulia> N = @node add(X1, 1, X2)\njulia> N()\n10\n\n\nSee also node\n\n\n\n\n\n","category":"macro"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"MLJBase.prefit","category":"page"},{"location":"learning_networks/#MLJBase.prefit","page":"Learning Networks","title":"MLJBase.prefit","text":"MLJBase.prefit(model, verbosity, data...)\n\nReturns a learning network interface (see below) for a learning network with source nodes that wrap data.\n\nA user overloads MLJBase.prefit when exporting a learning network as a new stand-alone model type, of which model above will be an instance. See the MLJ reference manual for details.\n\nA learning network interface is a named tuple declaring certain interface points in a learning network, to be used when \"exporting\" the network as a new stand-alone model type. Examples are\n\n (predict=yhat,)\n (transform=Xsmall, acceleration=CPUThreads())\n (predict=yhat, transform=W, report=(loss=loss_node,))\n\nHere yhat, Xsmall, W and loss_node are nodes in the network.\n\nThe keys of the learning network interface are always one of the following:\n\nThe name of an operation, such as :predict, :predict_mode, :transform, :inverse_transform. See \"Operation keys\" below.\n:report, for exposing results of calling a node with no arguments in the composite model report. See \"Including report nodes\" below.\n:fitted_params, for exposing results of calling a node with no arguments as fitted parameters of the composite model. See \"Including fitted parameter nodes\" below.\n:acceleration, for articulating acceleration mode for training the network, e.g., CPUThreads(). Corresponding value must be an AbstractResource. If not included, CPU1() is used.\n\nOperation keys\n\nIf the key is an operation, then the value must be a node n in the network with a unique origin (length(origins(n)) === 1). The intention of a declaration such as predict=yhat is that the exported model type implements predict, which, when applied to new data Xnew, should return yhat(Xnew).\n\nIncluding report nodes\n\nIf the key is :report, then the corresponding value must be a named tuple\n\n (k1=n1, k2=n2, ...)\n\nwhose values are all nodes. For each k=n pair, the key k will appear as a key in the composite model report, with a corresponding value of deepcopy(n()), called immediately after training or updating the network. For examples, refer to the \"Learning Networks\" section of the MLJ manual.\n\nIncluding fitted parameter nodes\n\nIf the key is :fitted_params, then the behaviour is as for report nodes but results are exposed as fitted parameters of the composite model instead of the report.\n\n\n\n\n\n","category":"function"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"See more on fitting nodes at fit! 
and fit_only!.","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/#MultitargetKNNClassifier_NearestNeighborModels","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"","category":"section"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"MultitargetKNNClassifier","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"A model type for constructing a multitarget K-nearest neighbor classifier, based on NearestNeighborModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"MultitargetKNNClassifier = @load MultitargetKNNClassifier pkg=NearestNeighborModels","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"Do model = MultitargetKNNClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetKNNClassifier(K=...).","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"Multi-target K-Nearest Neighbors Classifier (MultitargetKNNClassifier) is a variation of KNNClassifier that assumes the target variable is vector-valued with Multiclass or OrderedFactor components. 
(Target data must be presented as a table, however.)","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/#Training-data","page":"MultitargetKNNClassifier","title":"Training data","text":"","category":"section"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"OR","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"mach = machine(model, X, y, w)","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"Here:","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).\ny is the target, which can be any table of responses whose element scitype is either <:Finite (<:Multiclass or <:OrderedFactor will do); check the column scitypes with schema(y). Each column of y is assumed to belong to a common categorical pool.\nw is the observation weights, which can be either nothing (default) or an AbstractVector whose element scitype is Count or Continuous. This is different from the weights kernel, which is a model hyperparameter; see below.","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/#Hyper-parameters","page":"MultitargetKNNClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"K::Int=5 : number of neighbors\nalgorithm::Symbol = :kdtree : one of (:kdtree, :brutetree, :balltree)\nmetric::Metric = Euclidean() : any Metric from Distances.jl for the distance between points. For algorithm = :kdtree only metrics which are instances of Union{Distances.Chebyshev, Distances.Cityblock, Distances.Euclidean, Distances.Minkowski, Distances.WeightedCityblock, Distances.WeightedEuclidean, Distances.WeightedMinkowski} are supported.\nleafsize::Int = 10 : determines the number of points at which to stop splitting the tree. This option is ignored and always taken as 0 for algorithm = :brutetree, since brutetree isn't actually a tree.\nreorder::Bool = true : if true then points which are close in distance are placed close in memory. In this case, a copy of the original data will be made so that the original data is left unmodified. Setting this to true can significantly improve performance of the specified algorithm (except :brutetree). 
This option is ignored and always taken as false for algorithm = :brutetree.\nweights::KNNKernel=Uniform() : kernel used in assigning weights to the k-nearest neighbors for each observation. An instance of one of the types in list_kernels(). User-defined weighting functions can be passed by wrapping the function in a UserDefinedKernel kernel (do ?NearestNeighborModels.UserDefinedKernel for more info). If observation weights w are passed during machine construction then the weight assigned to each neighbor vote is the product of the kernel generated weight for that neighbor and the corresponding observation weight.\noutput_type::Type{<:MultiUnivariateFinite}=DictTable : One of (ColumnTable, DictTable). The type of table type to use for predictions. Setting to ColumnTable might improve performance for narrow tables while setting to DictTable improves performance for wide tables.","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/#Operations","page":"MultitargetKNNClassifier","title":"Operations","text":"","category":"section"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"predict(mach, Xnew): Return predictions of the target given features Xnew, which should have same scitype as X above. Predictions are either a ColumnTable or DictTable of UnivariateFiniteVector columns depending on the value set for the output_type parameter discussed above. The probabilistic predictions are uncalibrated.\npredict_mode(mach, Xnew): Return the modes of each column of the table of probabilistic predictions returned above.","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/#Fitted-parameters","page":"MultitargetKNNClassifier","title":"Fitted parameters","text":"","category":"section"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"tree: An instance of either KDTree, BruteTree or BallTree depending on the value of the algorithm hyperparameter (See hyper-parameters section above). 
These are data structures that stores the training data with the view of making quicker nearest neighbor searches on test data points.","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/#Examples","page":"MultitargetKNNClassifier","title":"Examples","text":"","category":"section"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"using MLJ, StableRNGs\n\n## set rng for reproducibility\nrng = StableRNG(10)\n\n## Dataset generation\nn, p = 10, 3\nX = table(randn(rng, n, p)) ## feature table\nfruit, color = categorical([\"apple\", \"orange\"]), categorical([\"blue\", \"green\"])\ny = [(fruit = rand(rng, fruit), color = rand(rng, color)) for _ in 1:n] ## target_table\n## Each column in y has a common categorical pool as expected\nselectcols(y, :fruit) ## categorical array\nselectcols(y, :color) ## categorical array\n\n## Load MultitargetKNNClassifier\nMultitargetKNNClassifier = @load MultitargetKNNClassifier pkg=NearestNeighborModels\n\n## view possible kernels\nNearestNeighborModels.list_kernels()\n\n## MultitargetKNNClassifier instantiation\nmodel = MultitargetKNNClassifier(K=3, weights = NearestNeighborModels.Inverse())\n\n## wrap model and required data in an MLJ machine and fit\nmach = machine(model, X, y) |> fit!\n\n## predict\ny_hat = predict(mach, X)\nlabels = predict_mode(mach, X)\n","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"See also KNNClassifier","category":"page"},{"location":"models/AdaBoostRegressor_MLJScikitLearnInterface/#AdaBoostRegressor_MLJScikitLearnInterface","page":"AdaBoostRegressor","title":"AdaBoostRegressor","text":"","category":"section"},{"location":"models/AdaBoostRegressor_MLJScikitLearnInterface/","page":"AdaBoostRegressor","title":"AdaBoostRegressor","text":"AdaBoostRegressor","category":"page"},{"location":"models/AdaBoostRegressor_MLJScikitLearnInterface/","page":"AdaBoostRegressor","title":"AdaBoostRegressor","text":"A model type for constructing a AdaBoost ensemble regression, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/AdaBoostRegressor_MLJScikitLearnInterface/","page":"AdaBoostRegressor","title":"AdaBoostRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/AdaBoostRegressor_MLJScikitLearnInterface/","page":"AdaBoostRegressor","title":"AdaBoostRegressor","text":"AdaBoostRegressor = @load AdaBoostRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/AdaBoostRegressor_MLJScikitLearnInterface/","page":"AdaBoostRegressor","title":"AdaBoostRegressor","text":"Do model = AdaBoostRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in AdaBoostRegressor(estimator=...).","category":"page"},{"location":"models/AdaBoostRegressor_MLJScikitLearnInterface/","page":"AdaBoostRegressor","title":"AdaBoostRegressor","text":"An AdaBoost regressor is a meta-estimator that begins by fitting a regressor on the original dataset and then fits additional copies of the regressor on the same dataset but where the weights of instances are adjusted according to the error of the current prediction. 
As such, subsequent regressors focus more on difficult cases.","category":"page"},{"location":"models/AdaBoostRegressor_MLJScikitLearnInterface/","page":"AdaBoostRegressor","title":"AdaBoostRegressor","text":"This class implements the algorithm known as AdaBoost.R2.","category":"page"},{"location":"models/KMeans_MLJScikitLearnInterface/#KMeans_MLJScikitLearnInterface","page":"KMeans","title":"KMeans","text":"","category":"section"},{"location":"models/KMeans_MLJScikitLearnInterface/","page":"KMeans","title":"KMeans","text":"KMeans","category":"page"},{"location":"models/KMeans_MLJScikitLearnInterface/","page":"KMeans","title":"KMeans","text":"A model type for constructing a k means, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/KMeans_MLJScikitLearnInterface/","page":"KMeans","title":"KMeans","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/KMeans_MLJScikitLearnInterface/","page":"KMeans","title":"KMeans","text":"KMeans = @load KMeans pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/KMeans_MLJScikitLearnInterface/","page":"KMeans","title":"KMeans","text":"Do model = KMeans() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KMeans(n_clusters=...).","category":"page"},{"location":"models/KMeans_MLJScikitLearnInterface/","page":"KMeans","title":"KMeans","text":"K-Means algorithm: find K centroids corresponding to K clusters in the data.","category":"page"},{"location":"models/UnivariateStandardizer_MLJModels/#UnivariateStandardizer_MLJModels","page":"UnivariateStandardizer","title":"UnivariateStandardizer","text":"","category":"section"},{"location":"models/UnivariateStandardizer_MLJModels/","page":"UnivariateStandardizer","title":"UnivariateStandardizer","text":"UnivariateStandardizer()","category":"page"},{"location":"models/UnivariateStandardizer_MLJModels/","page":"UnivariateStandardizer","title":"UnivariateStandardizer","text":"Transformer type for standardizing (whitening) single variable data.","category":"page"},{"location":"models/UnivariateStandardizer_MLJModels/","page":"UnivariateStandardizer","title":"UnivariateStandardizer","text":"This model may be deprecated in the future. 
Consider using Standardizer, which handles both tabular and univariate data.","category":"page"},{"location":"models/OrthogonalMatchingPursuitRegressor_MLJScikitLearnInterface/#OrthogonalMatchingPursuitRegressor_MLJScikitLearnInterface","page":"OrthogonalMatchingPursuitRegressor","title":"OrthogonalMatchingPursuitRegressor","text":"","category":"section"},{"location":"models/OrthogonalMatchingPursuitRegressor_MLJScikitLearnInterface/","page":"OrthogonalMatchingPursuitRegressor","title":"OrthogonalMatchingPursuitRegressor","text":"OrthogonalMatchingPursuitRegressor","category":"page"},{"location":"models/OrthogonalMatchingPursuitRegressor_MLJScikitLearnInterface/","page":"OrthogonalMatchingPursuitRegressor","title":"OrthogonalMatchingPursuitRegressor","text":"A model type for constructing a orthogonal matching pursuit regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/OrthogonalMatchingPursuitRegressor_MLJScikitLearnInterface/","page":"OrthogonalMatchingPursuitRegressor","title":"OrthogonalMatchingPursuitRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/OrthogonalMatchingPursuitRegressor_MLJScikitLearnInterface/","page":"OrthogonalMatchingPursuitRegressor","title":"OrthogonalMatchingPursuitRegressor","text":"OrthogonalMatchingPursuitRegressor = @load OrthogonalMatchingPursuitRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/OrthogonalMatchingPursuitRegressor_MLJScikitLearnInterface/","page":"OrthogonalMatchingPursuitRegressor","title":"OrthogonalMatchingPursuitRegressor","text":"Do model = OrthogonalMatchingPursuitRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in OrthogonalMatchingPursuitRegressor(n_nonzero_coefs=...).","category":"page"},{"location":"models/OrthogonalMatchingPursuitRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"OrthogonalMatchingPursuitRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/OrthogonalMatchingPursuitRegressor_MLJScikitLearnInterface/","page":"OrthogonalMatchingPursuitRegressor","title":"OrthogonalMatchingPursuitRegressor","text":"n_nonzero_coefs = nothing\ntol = nothing\nfit_intercept = true\nnormalize = false\nprecompute = auto","category":"page"},{"location":"learning_curves/#Learning-Curves","page":"Learning Curves","title":"Learning Curves","text":"","category":"section"},{"location":"learning_curves/","page":"Learning Curves","title":"Learning Curves","text":"A learning curve in MLJ is a plot of some performance estimate, as a function of some model hyperparameter. This can be useful when tuning a single model hyperparameter, or when deciding how many iterations are required for some iterative model. The learning_curve method does not actually generate a plot but generates the data needed to do so.","category":"page"},{"location":"learning_curves/","page":"Learning Curves","title":"Learning Curves","text":"To generate learning curves you can bind data to a model by instantiating a machine. 
You can choose to supply all available data, as performance estimates are computed using a resampling strategy, defaulting to Holdout(fraction_train=0.7).","category":"page"},{"location":"learning_curves/","page":"Learning Curves","title":"Learning Curves","text":"using MLJ\nX, y = @load_boston;\n\natom = (@load RidgeRegressor pkg=MLJLinearModels)()\nensemble = EnsembleModel(model=atom, n=1000)\nmach = machine(ensemble, X, y)\n\nr_lambda = range(ensemble, :(model.lambda), lower=1e-1, upper=100, scale=:log10)\ncurve = MLJ.learning_curve(mach;\n range=r_lambda,\n resampling=CV(nfolds=3),\n measure=l1)","category":"page"},{"location":"learning_curves/","page":"Learning Curves","title":"Learning Curves","text":"using Plots\nplot(curve.parameter_values,\n curve.measurements,\n xlab=curve.parameter_name,\n xscale=curve.parameter_scale,\n ylab = \"CV estimate of RMS error\")","category":"page"},{"location":"learning_curves/","page":"Learning Curves","title":"Learning Curves","text":"(Image: )","category":"page"},{"location":"learning_curves/","page":"Learning Curves","title":"Learning Curves","text":"If the range hyperparameter is the number of iterations in some iterative model, learning_curve will not restart the training from scratch for each new value, unless a non-holdout resampling strategy is specified (and provided the model implements an appropriate update method). To obtain multiple curves (that are distinct) you will need to pass the name of the model random number generator, rng_name, and specify the random number generators to be used using rngs=... (an integer automatically generates the number specified):","category":"page"},{"location":"learning_curves/","page":"Learning Curves","title":"Learning Curves","text":"atom.lambda = 7.3\nr_n = range(ensemble, :n, lower=1, upper=50)\ncurves = MLJ.learning_curve(mach;\n range=r_n,\n measure=l1,\n verbosity=0,\n rng_name=:rng,\n rngs=4)","category":"page"},{"location":"learning_curves/","page":"Learning Curves","title":"Learning Curves","text":"plot(curves.parameter_values,\n curves.measurements,\n xlab=curves.parameter_name,\n ylab=\"Holdout estimate of RMS error\")","category":"page"},{"location":"learning_curves/","page":"Learning Curves","title":"Learning Curves","text":"(Image: )","category":"page"},{"location":"learning_curves/#API-reference","page":"Learning Curves","title":"API reference","text":"","category":"section"},{"location":"learning_curves/","page":"Learning Curves","title":"Learning Curves","text":"MLJTuning.learning_curve","category":"page"},{"location":"learning_curves/#MLJTuning.learning_curve","page":"Learning Curves","title":"MLJTuning.learning_curve","text":"curve = learning_curve(mach; resolution=30,\n resampling=Holdout(),\n repeats=1,\n measure=default_measure(machine.model),\n rows=nothing,\n weights=nothing,\n operation=nothing,\n range=nothing,\n acceleration=default_resource(),\n acceleration_grid=CPU1(),\n rngs=nothing,\n rng_name=nothing)\n\nGiven a supervised machine mach, returns a named tuple of objects suitable for generating a plot of performance estimates, as a function of the single hyperparameter specified in range. The tuple curve has the following keys: :parameter_name, :parameter_scale, :parameter_values, :measurements.\n\nTo generate multiple curves for a model with a random number generator (RNG) as a hyperparameter, specify the name, rng_name, of the (possibly nested) RNG field, and a vector rngs of RNG's, one for each curve. 
Alternatively, set rngs to the number of curves desired, in which case RNG's are automatically generated. The individual curve computations can be distributed across multiple processes using acceleration=CPUProcesses() or acceleration=CPUThreads(). See the second example below for a demonstration.\n\nX, y = @load_boston;\natom = @load RidgeRegressor pkg=MultivariateStats\nensemble = EnsembleModel(atom=atom, n=1000)\nmach = machine(ensemble, X, y)\nr_lambda = range(ensemble, :(atom.lambda), lower=10, upper=500, scale=:log10)\ncurve = learning_curve(mach; range=r_lambda, resampling=CV(), measure=mav)\nusing Plots\nplot(curve.parameter_values,\n curve.measurements,\n xlab=curve.parameter_name,\n xscale=curve.parameter_scale,\n ylab = \"CV estimate of RMS error\")\n\nIf using a Holdout() resampling strategy (with no shuffling) and if the specified hyperparameter is the number of iterations in some iterative model (and that model has an appropriately overloaded MLJModelInterface.update method) then training is not restarted from scratch for each increment of the parameter, ie the model is trained progressively.\n\natom.lambda=200\nr_n = range(ensemble, :n, lower=1, upper=250)\ncurves = learning_curve(mach; range=r_n, verbosity=0, rng_name=:rng, rngs=3)\nplot!(curves.parameter_values,\n curves.measurements,\n xlab=curves.parameter_name,\n ylab=\"Holdout estimate of RMS error\")\n\n\n\nlearning_curve(model::Supervised, X, y; kwargs...)\nlearning_curve(model::Supervised, X, y, w; kwargs...)\n\nPlot a learning curve (or curves) directly, without first constructing a machine.\n\nSummary of key-word options\n\nresolution - number of points generated from range (number model evaluations); default is 30\nacceleration - parallelization option for passing to evaluate!; an instance of CPU1, CPUProcesses or CPUThreads from the ComputationalResources.jl; default is default_resource()\nacceleration_grid - parallelization option for distributing each performancde evaluation\nrngs - for specifying random number generator(s) to be passed to the model (see above)\nrng_name - name of the model hyper-parameter representing a random number generator (see above); possibly nested\n\nOther key-word options are documented at TunedModel.\n\n\n\n\n\n","category":"function"},{"location":"models/EvoLinearRegressor_EvoLinear/#EvoLinearRegressor_EvoLinear","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"","category":"section"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"EvoLinearRegressor(; kwargs...)","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"A model type for constructing a EvoLinearRegressor, based on EvoLinear.jl, and implementing both an internal API and the MLJ model interface.","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/#Keyword-arguments","page":"EvoLinearRegressor","title":"Keyword arguments","text":"","category":"section"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"loss=:mse: loss function to be minimised. Can be one of:\n:mse\n:logistic\n:poisson\n:gamma\n:tweedie\nnrounds=10: maximum number of training rounds.\neta=1: Learning rate. Typically in the range [1e-2, 1].\nL1=0: Regularization penalty applied by shrinking to 0 weight update if update is < L1. No penalty if update > L1. Results in sparse feature selection. 
Typically in the [0, 1] range on normalized features.\nL2=0: Regularization penalty applied to the square of the weight update value. Restricts large parameter values. Typically in the [0, 1] range on normalized features.\nrng=123: random seed. Not used at the moment.\nupdater=:all: training method. Only :all is supported at the moment. Gradients for each feature are computed simultaneously, then the bias is updated based on all feature updates.\ndevice=:cpu: Only :cpu is supported at the moment.","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/#Internal-API","page":"EvoLinearRegressor","title":"Internal API","text":"","category":"section"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"Do config = EvoLinearRegressor() to construct a hyper-parameter struct with default hyper-parameters. Provide keyword arguments as listed above to override defaults, for example:","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"EvoLinearRegressor(loss=:logistic, L1=1e-3, L2=1e-2, nrounds=100)","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/#Training-model","page":"EvoLinearRegressor","title":"Training model","text":"","category":"section"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"A model is built using fit:","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"config = EvoLinearRegressor()\nm = fit(config; x, y, w)","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/#Inference","page":"EvoLinearRegressor","title":"Inference","text":"","category":"section"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"The fitted result is an EvoLinearModel, which acts as a prediction function when passed a features matrix as argument. ","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"preds = m(x)","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/#MLJ-Interface","page":"EvoLinearRegressor","title":"MLJ Interface","text":"","category":"section"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"From MLJ, the type can be imported using:","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"EvoLinearRegressor = @load EvoLinearRegressor pkg=EvoLinear","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"Do model = EvoLinearRegressor() to construct an instance with default hyper-parameters. 
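For example, a minimal end-to-end sketch (illustrative only, assuming the EvoLinear package is installed; the toy data is generated with make_regression):\n\nusing MLJ\nEvoLinearRegressor = @load EvoLinearRegressor pkg=EvoLinear verbosity=0\nX, y = make_regression(100, 4, rng=123)  # table of Continuous features, Continuous target\nmodel = EvoLinearRegressor()             # default hyper-parameters\nmach = machine(model, X, y) |> fit!\nyhat = predict(mach, X)                  # deterministic predictions\nreport(mach).coef                        # fitted coefficients (see Report below)\n\n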
Provide keyword arguments to override hyper-parameter defaults, as in EvoLinearRegressor(loss=...).","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/#Training-model-2","page":"EvoLinearRegressor","title":"Training model","text":"","category":"section"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"In MLJ or MLJBase, bind an instance model to data with mach = machine(model, X, y) where: ","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)\ny: is the target, which can be any AbstractVector whose element scitype is <:Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/#Operations","page":"EvoLinearRegressor","title":"Operations","text":"","category":"section"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"predict(mach, Xnew): return predictions of the target given","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"features Xnew having the same scitype as X above. Predictions are deterministic.","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/#Fitted-parameters","page":"EvoLinearRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":":fitresult: the EvoLinearModel object returned by EvoLnear.jl fitting algorithm.","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/#Report","page":"EvoLinearRegressor","title":"Report","text":"","category":"section"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"The fields of report(mach) are:","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":":coef: Vector of coefficients (βs) associated to each of the features.\n:bias: Value of the bias.\n:names: Names of each of the features.","category":"page"},{"location":"models/KernelPerceptronClassifier_BetaML/#KernelPerceptronClassifier_BetaML","page":"KernelPerceptronClassifier","title":"KernelPerceptronClassifier","text":"","category":"section"},{"location":"models/KernelPerceptronClassifier_BetaML/","page":"KernelPerceptronClassifier","title":"KernelPerceptronClassifier","text":"mutable struct KernelPerceptronClassifier <: MLJModelInterface.Probabilistic","category":"page"},{"location":"models/KernelPerceptronClassifier_BetaML/","page":"KernelPerceptronClassifier","title":"KernelPerceptronClassifier","text":"The kernel perceptron algorithm using one-vs-one for multiclass, from the Beta Machine Learning Toolkit 
(BetaML).","category":"page"},{"location":"models/KernelPerceptronClassifier_BetaML/#Hyperparameters:","page":"KernelPerceptronClassifier","title":"Hyperparameters:","text":"","category":"section"},{"location":"models/KernelPerceptronClassifier_BetaML/","page":"KernelPerceptronClassifier","title":"KernelPerceptronClassifier","text":"kernel::Function: Kernel function to employ. See ?radial_kernel or ?polynomial_kernel (once loaded the BetaML package) for details or check ?BetaML.Utils to verify if other kernels are defined (you can alsways define your own kernel) [def: radial_kernel]\nepochs::Int64: Maximum number of epochs, i.e. passages trough the whole training sample [def: 100]\ninitial_errors::Union{Nothing, Vector{Vector{Int64}}}: Initial distribution of the number of errors errors [def: nothing, i.e. zeros]. If provided, this should be a nModels-lenght vector of nRecords integer values vectors , where nModels is computed as (n_classes * (n_classes - 1)) / 2\nshuffle::Bool: Whether to randomly shuffle the data at each iteration (epoch) [def: true]\nrng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [deafult: Random.GLOBAL_RNG]","category":"page"},{"location":"models/KernelPerceptronClassifier_BetaML/#Example:","page":"KernelPerceptronClassifier","title":"Example:","text":"","category":"section"},{"location":"models/KernelPerceptronClassifier_BetaML/","page":"KernelPerceptronClassifier","title":"KernelPerceptronClassifier","text":"julia> using MLJ\n\njulia> X, y = @load_iris;\n\njulia> modelType = @load KernelPerceptronClassifier pkg = \"BetaML\"\n[ Info: For silent loading, specify `verbosity=0`. \nimport BetaML ✔\nBetaML.Perceptron.KernelPerceptronClassifier\n\njulia> model = modelType()\nKernelPerceptronClassifier(\n kernel = BetaML.Utils.radial_kernel, \n epochs = 100, \n initial_errors = nothing, \n shuffle = true, \n rng = Random._GLOBAL_RNG())\n\njulia> mach = machine(model, X, y);\n\njulia> fit!(mach);\n\njulia> est_classes = predict(mach, X)\n150-element CategoricalDistributions.UnivariateFiniteVector{Multiclass{3}, String, UInt8, Float64}:\n UnivariateFinite{Multiclass{3}}(setosa=>0.665, versicolor=>0.245, virginica=>0.09)\n UnivariateFinite{Multiclass{3}}(setosa=>0.665, versicolor=>0.245, virginica=>0.09)\n ⋮\n UnivariateFinite{Multiclass{3}}(setosa=>0.09, versicolor=>0.245, virginica=>0.665)\n UnivariateFinite{Multiclass{3}}(setosa=>0.09, versicolor=>0.665, virginica=>0.245)","category":"page"},{"location":"model_search/#model_search","page":"Model Search","title":"Model Search","text":"","category":"section"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"MLJ has a model registry, allowing the user to search models and their properties, without loading all the packages containing model code. In turn, this allows one to efficiently find all models solving a given machine learning task. 
The task itself is specified with the help of the matching method, and the search executed with the models methods, as detailed below.","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"For commonly encountered problems with model search, see also Preparing Data.","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"A table of all models is also given at List of Supported Models.","category":"page"},{"location":"model_search/#Model-metadata","page":"Model Search","title":"Model metadata","text":"","category":"section"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"Terminology. In this section the word \"model\" refers to a metadata entry in the model registry, as opposed to an actual model struct that such an entry represents. One can obtain such an entry with the info command:","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"using MLJ\nMLJ.color_off()","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"info(\"PCA\")","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"So a \"model\" in the present context is just a named tuple containing metadata, and not an actual model type or instance. If two models with the same name occur in different packages, the package name must be specified, as in info(\"LinearRegressor\", pkg=\"GLM\").","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"Model document strings can be retreived, without importing the defining code, using the doc function:","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"doc(\"DecisionTreeClassifier\", pkg=\"DecisionTree\")","category":"page"},{"location":"model_search/#General-model-queries","page":"Model Search","title":"General model queries","text":"","category":"section"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"We list all models (named tuples) using models(), and list the models for which code is already loaded with localmodels():","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"localmodels()\nlocalmodels()[2]","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"One can search for models containing specified strings or regular expressions in their docstring attributes, as in","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"models(\"forest\")","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"or by specifying a filter (Bool-valued function):","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"filter(model) = model.is_supervised &&\n model.input_scitype >: MLJ.Table(Continuous) &&\n model.target_scitype >: AbstractVector{<:Multiclass{3}} &&\n model.prediction_type == :deterministic\nmodels(filter)","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"Multiple test arguments may be passed to models, which are applied conjunctively.","category":"page"},{"location":"model_search/#Matching-models-to-data","page":"Model Search","title":"Matching models to 
data","text":"","category":"section"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"Common searches are streamlined with the help of the matching command, defined as follows:","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"matching(model, X, y) == true exactly when model is supervised and admits inputs and targets with the scientific types of X and y, respectively\nmatching(model, X) == true exactly when model is unsupervised and admits inputs with the scientific types of X.","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"So, to search for all supervised probabilistic models handling input X and target y, one can define the testing function task by","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"task(model) = matching(model, X, y) && model.prediction_type == :probabilistic","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"And execute the search with","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"models(task)","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"Also defined are Bool-valued callable objects matching(model), matching(X, y) and matching(X), with obvious behavior. For example, matching(X, y)(model) = matching(model, X, y).","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"So, to search for all models compatible with input X and target y, for example, one executes","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"models(matching(X, y))","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"while the preceding search can also be written","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"models() do model\n matching(model, X, y) &&\n model.prediction_type == :probabilistic\nend","category":"page"},{"location":"model_search/#API","page":"Model Search","title":"API","text":"","category":"section"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"models\nlocalmodels","category":"page"},{"location":"model_search/#MLJModels.models","page":"Model Search","title":"MLJModels.models","text":"models()\n\nList all models in the MLJ registry. 
Here and below model means the registry metadata entry for a genuine model type (a proxy for types whose defining code may not be loaded).\n\nmodels(filters...)\n\nList all models m for which filter(m) is true, for each filter in filters.\n\nmodels(matching(X, y))\n\nList all supervised models compatible with training data X, y.\n\nmodels(matching(X))\n\nList all unsupervised models compatible with training data X.\n\nExcluded in the listings are the built-in model-wraps, like EnsembleModel, TunedModel, and IteratedModel.\n\nExample\n\nIf\n\ntask(model) = model.is_supervised && model.is_probabilistic\n\nthen models(task) lists all supervised models making probabilistic predictions.\n\nSee also: localmodels.\n\n\n\n\n\nmodels(needle::Union{AbstractString,Regex})\n\nList all models whose name or docstring matches a given needle.\n\n\n\n\n\n","category":"function"},{"location":"model_search/#MLJModels.localmodels","page":"Model Search","title":"MLJModels.localmodels","text":"localmodels(; modl=Main)\nlocalmodels(filters...; modl=Main)\nlocalmodels(needle::Union{AbstractString,Regex}; modl=Main)\n\nList all models currently available to the user from the module modl without importing a package, and which additionally pass through the specified filters. Here a filter is a Bool-valued function on models.\n\nUse load_path to get the path to some model returned, as in these examples:\n\nms = localmodels()\nmodel = ms[1]\nload_path(model)\n\nSee also models, load_path.\n\n\n\n\n\n","category":"function"},{"location":"models/HistGradientBoostingClassifier_MLJScikitLearnInterface/#HistGradientBoostingClassifier_MLJScikitLearnInterface","page":"HistGradientBoostingClassifier","title":"HistGradientBoostingClassifier","text":"","category":"section"},{"location":"models/HistGradientBoostingClassifier_MLJScikitLearnInterface/","page":"HistGradientBoostingClassifier","title":"HistGradientBoostingClassifier","text":"HistGradientBoostingClassifier","category":"page"},{"location":"models/HistGradientBoostingClassifier_MLJScikitLearnInterface/","page":"HistGradientBoostingClassifier","title":"HistGradientBoostingClassifier","text":"A model type for constructing a hist gradient boosting classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/HistGradientBoostingClassifier_MLJScikitLearnInterface/","page":"HistGradientBoostingClassifier","title":"HistGradientBoostingClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/HistGradientBoostingClassifier_MLJScikitLearnInterface/","page":"HistGradientBoostingClassifier","title":"HistGradientBoostingClassifier","text":"HistGradientBoostingClassifier = @load HistGradientBoostingClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/HistGradientBoostingClassifier_MLJScikitLearnInterface/","page":"HistGradientBoostingClassifier","title":"HistGradientBoostingClassifier","text":"Do model = HistGradientBoostingClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in HistGradientBoostingClassifier(loss=...).","category":"page"},{"location":"models/HistGradientBoostingClassifier_MLJScikitLearnInterface/","page":"HistGradientBoostingClassifier","title":"HistGradientBoostingClassifier","text":"This algorithm builds an additive model in a forward stage-wise fashion; it allows for the optimization of arbitrary differentiable loss functions. 
In each stage n_classes_ regression trees are fit on the negative gradient of the loss function, e.g. binary or multiclass log loss. Binary classification is a special case where only a single regression tree is induced.","category":"page"},{"location":"models/HistGradientBoostingClassifier_MLJScikitLearnInterface/","page":"HistGradientBoostingClassifier","title":"HistGradientBoostingClassifier","text":"HistGradientBoostingClassifier is a much faster variant of this algorithm for intermediate datasets (n_samples >= 10_000).","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/#LinearBinaryClassifier_GLM","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"","category":"section"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"LinearBinaryClassifier","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"A model type for constructing a linear binary classifier, based on GLM.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"LinearBinaryClassifier = @load LinearBinaryClassifier pkg=GLM","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"Do model = LinearBinaryClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearBinaryClassifier(fit_intercept=...).","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"LinearBinaryClassifier is a generalized linear model, specialised to the case of a binary target variable, with a user-specified link function. 
Options exist to specify an intercept or offset feature.","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/#Training-data","page":"LinearBinaryClassifier","title":"Training data","text":"","category":"section"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"In MLJ or MLJBase, bind an instance model to data with one of:","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"mach = machine(model, X, y)\nmach = machine(model, X, y, w)","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"Here","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"X: is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check the scitype with schema(X)\ny: is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor(2) or <:Multiclass(2); check the scitype with schema(y)\nw: is a vector of Real per-observation weights","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/#Hyper-parameters","page":"LinearBinaryClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"fit_intercept=true: Whether to calculate the intercept for this model. If set to false, no intercept will be calculated (e.g. the data is expected to be centered)\nlink=GLM.LogitLink: The function which links the linear prediction function to the probability of a particular outcome or class. This must have type GLM.Link01. Options include GLM.LogitLink(), GLM.ProbitLink(), CloglogLink(),CauchitLink()`.\noffsetcol=nothing: Name of the column to be used as an offset, if any. An offset is a variable which is known to have a coefficient of 1.\nmaxiter::Integer=30: The maximum number of iterations allowed to achieve convergence.\natol::Real=1e-6: Absolute threshold for convergence. Convergence is achieved when the relative change in deviance is less than `max(rtol*dev, atol). This term exists to avoid failure when deviance is unchanged except for rounding errors.\nrtol::Real=1e-6: Relative threshold for convergence. Convergence is achieved when the relative change in deviance is less than `max(rtol*dev, atol). This term exists to avoid failure when deviance is unchanged except for rounding errors.\nminstepfac::Real=0.001: Minimum step fraction. Must be between 0 and 1. Lower bound for the factor used to update the linear fit.\nreport_keys: Vector of keys for the report. Possible keys are: :deviance, :dof_residual, :stderror, :vcov, :coef_table and :glm_model. By default only :glm_model is excluded.","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/#Operations","page":"LinearBinaryClassifier","title":"Operations","text":"","category":"section"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"predict(mach, Xnew): Return predictions of the target given features Xnew having the same scitype as X above. 
Predictions are probabilistic.\npredict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/#Fitted-parameters","page":"LinearBinaryClassifier","title":"Fitted parameters","text":"","category":"section"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"features: The names of the features used during model fitting.\ncoef: The linear coefficients determined by the model.\nintercept: The intercept determined by the model.","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/#Report","page":"LinearBinaryClassifier","title":"Report","text":"","category":"section"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"The fields of report(mach) are:","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"deviance: Measure of deviance of fitted model with respect to a perfectly fitted model. For a linear model, this is the weighted residual sum of squares\ndof_residual: The degrees of freedom for residuals, when meaningful.\nstderror: The standard errors of the coefficients.\nvcov: The estimated variance-covariance matrix of the coefficient estimates.\ncoef_table: Table which displays coefficients and summarizes their significance and confidence intervals.\nglm_model: The raw fitted model returned by GLM.lm. Note this points to training data. Refer to the GLM.jl documentation for usage.","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/#Examples","page":"LinearBinaryClassifier","title":"Examples","text":"","category":"section"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"using MLJ\nimport GLM ## namespace must be available\n\nLinearBinaryClassifier = @load LinearBinaryClassifier pkg=GLM\nclf = LinearBinaryClassifier(fit_intercept=false, link=GLM.ProbitLink())\n\nX, y = @load_crabs\n\nmach = machine(clf, X, y) |> fit!\n\nXnew = (;FL = [8.1, 24.8, 7.2],\n RW = [5.1, 25.7, 6.4],\n CL = [15.9, 46.7, 14.3],\n CW = [18.7, 59.7, 12.2],\n BD = [6.2, 23.6, 8.4],)\n\nyhat = predict(mach, Xnew) ## probabilistic predictions\npdf(yhat, levels(y)) ## probability matrix\np_B = pdf.(yhat, \"B\")\nclass_labels = predict_mode(mach, Xnew)\n\nfitted_params(mach).features\nfitted_params(mach).coef\nfitted_params(mach).intercept\n\nreport(mach)","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"See also LinearRegressor, LinearCountRegressor","category":"page"},{"location":"models/SOSDetector_OutlierDetectionPython/#SOSDetector_OutlierDetectionPython","page":"SOSDetector","title":"SOSDetector","text":"","category":"section"},{"location":"models/SOSDetector_OutlierDetectionPython/","page":"SOSDetector","title":"SOSDetector","text":"SOSDetector(perplexity = 4.5,\n metric = \"minkowski\",\n eps = 
1e-5)","category":"page"},{"location":"models/SOSDetector_OutlierDetectionPython/","page":"SOSDetector","title":"SOSDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.sos","category":"page"},{"location":"models/BayesianQDA_MLJScikitLearnInterface/#BayesianQDA_MLJScikitLearnInterface","page":"BayesianQDA","title":"BayesianQDA","text":"","category":"section"},{"location":"models/BayesianQDA_MLJScikitLearnInterface/","page":"BayesianQDA","title":"BayesianQDA","text":"BayesianQDA","category":"page"},{"location":"models/BayesianQDA_MLJScikitLearnInterface/","page":"BayesianQDA","title":"BayesianQDA","text":"A model type for constructing a Bayesian quadratic discriminant analysis, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/BayesianQDA_MLJScikitLearnInterface/","page":"BayesianQDA","title":"BayesianQDA","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/BayesianQDA_MLJScikitLearnInterface/","page":"BayesianQDA","title":"BayesianQDA","text":"BayesianQDA = @load BayesianQDA pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/BayesianQDA_MLJScikitLearnInterface/","page":"BayesianQDA","title":"BayesianQDA","text":"Do model = BayesianQDA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BayesianQDA(priors=...).","category":"page"},{"location":"models/BayesianQDA_MLJScikitLearnInterface/#Hyper-parameters","page":"BayesianQDA","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/BayesianQDA_MLJScikitLearnInterface/","page":"BayesianQDA","title":"BayesianQDA","text":"priors = nothing\nreg_param = 0.0\nstore_covariance = false\ntol = 0.0001","category":"page"},{"location":"models/XGBoostClassifier_XGBoost/#XGBoostClassifier_XGBoost","page":"XGBoostClassifier","title":"XGBoostClassifier","text":"","category":"section"},{"location":"models/XGBoostClassifier_XGBoost/","page":"XGBoostClassifier","title":"XGBoostClassifier","text":"XGBoostClassifier","category":"page"},{"location":"models/XGBoostClassifier_XGBoost/","page":"XGBoostClassifier","title":"XGBoostClassifier","text":"A model type for constructing a eXtreme Gradient Boosting Classifier, based on XGBoost.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/XGBoostClassifier_XGBoost/","page":"XGBoostClassifier","title":"XGBoostClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/XGBoostClassifier_XGBoost/","page":"XGBoostClassifier","title":"XGBoostClassifier","text":"XGBoostClassifier = @load XGBoostClassifier pkg=XGBoost","category":"page"},{"location":"models/XGBoostClassifier_XGBoost/","page":"XGBoostClassifier","title":"XGBoostClassifier","text":"Do model = XGBoostClassifier() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in XGBoostClassifier(test=...).","category":"page"},{"location":"models/XGBoostClassifier_XGBoost/","page":"XGBoostClassifier","title":"XGBoostClassifier","text":"Univariate classification using xgboost.","category":"page"},{"location":"models/XGBoostClassifier_XGBoost/#Training-data","page":"XGBoostClassifier","title":"Training data","text":"","category":"section"},{"location":"models/XGBoostClassifier_XGBoost/","page":"XGBoostClassifier","title":"XGBoostClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/XGBoostClassifier_XGBoost/","page":"XGBoostClassifier","title":"XGBoostClassifier","text":"m = machine(model, X, y)","category":"page"},{"location":"models/XGBoostClassifier_XGBoost/","page":"XGBoostClassifier","title":"XGBoostClassifier","text":"where","category":"page"},{"location":"models/XGBoostClassifier_XGBoost/","page":"XGBoostClassifier","title":"XGBoostClassifier","text":"X: any table of input features, either an AbstractMatrix or Tables.jl-compatible table.\ny: is an AbstractVector Finite target.","category":"page"},{"location":"models/XGBoostClassifier_XGBoost/","page":"XGBoostClassifier","title":"XGBoostClassifier","text":"Train using fit!(m, rows=...).","category":"page"},{"location":"models/XGBoostClassifier_XGBoost/#Hyper-parameters","page":"XGBoostClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/XGBoostClassifier_XGBoost/","page":"XGBoostClassifier","title":"XGBoostClassifier","text":"See https://xgboost.readthedocs.io/en/stable/parameter.html.","category":"page"},{"location":"models/LODADetector_OutlierDetectionPython/#LODADetector_OutlierDetectionPython","page":"LODADetector","title":"LODADetector","text":"","category":"section"},{"location":"models/LODADetector_OutlierDetectionPython/","page":"LODADetector","title":"LODADetector","text":"LODADetector(n_bins = 10,\n n_random_cuts = 100)","category":"page"},{"location":"models/LODADetector_OutlierDetectionPython/","page":"LODADetector","title":"LODADetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.loda","category":"page"},{"location":"models/RandomOversampler_Imbalance/#RandomOversampler_Imbalance","page":"RandomOversampler","title":"RandomOversampler","text":"","category":"section"},{"location":"models/RandomOversampler_Imbalance/","page":"RandomOversampler","title":"RandomOversampler","text":"Initiate a random oversampling model with the given hyper-parameters.","category":"page"},{"location":"models/RandomOversampler_Imbalance/","page":"RandomOversampler","title":"RandomOversampler","text":"RandomOversampler","category":"page"},{"location":"models/RandomOversampler_Imbalance/","page":"RandomOversampler","title":"RandomOversampler","text":"A model type for constructing a random oversampler, based on Imbalance.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/RandomOversampler_Imbalance/","page":"RandomOversampler","title":"RandomOversampler","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/RandomOversampler_Imbalance/","page":"RandomOversampler","title":"RandomOversampler","text":"RandomOversampler = @load RandomOversampler pkg=Imbalance","category":"page"},{"location":"models/RandomOversampler_Imbalance/","page":"RandomOversampler","title":"RandomOversampler","text":"Do model = RandomOversampler() to construct an instance with 
default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomOversampler(ratios=...).","category":"page"},{"location":"models/RandomOversampler_Imbalance/","page":"RandomOversampler","title":"RandomOversampler","text":"RandomOversampler implements naive oversampling by repeating existing observations with replacement.","category":"page"},{"location":"models/RandomOversampler_Imbalance/#Training-data","page":"RandomOversampler","title":"Training data","text":"","category":"section"},{"location":"models/RandomOversampler_Imbalance/","page":"RandomOversampler","title":"RandomOversampler","text":"In MLJ or MLJBase, wrap the model in a machine by mach = machine(model)","category":"page"},{"location":"models/RandomOversampler_Imbalance/","page":"RandomOversampler","title":"RandomOversampler","text":"There is no need to provide any data here because the model is a static transformer.","category":"page"},{"location":"models/RandomOversampler_Imbalance/","page":"RandomOversampler","title":"RandomOversampler","text":"Likewise, there is no need to fit!(mach). ","category":"page"},{"location":"models/RandomOversampler_Imbalance/","page":"RandomOversampler","title":"RandomOversampler","text":"For default values of the hyper-parameters, model can be constructed by model = RandomOversampler()","category":"page"},{"location":"models/RandomOversampler_Imbalance/#Hyperparameters","page":"RandomOversampler","title":"Hyperparameters","text":"","category":"section"},{"location":"models/RandomOversampler_Imbalance/","page":"RandomOversampler","title":"RandomOversampler","text":"ratios=1.0: A parameter that controls the amount of oversampling to be done for each class\nCan be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class\nCan be a dictionary mapping each class label to the float ratio for that class\nrng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister.","category":"page"},{"location":"models/RandomOversampler_Imbalance/#Transform-Inputs","page":"RandomOversampler","title":"Transform Inputs","text":"","category":"section"},{"location":"models/RandomOversampler_Imbalance/","page":"RandomOversampler","title":"RandomOversampler","text":"X: A matrix of real numbers or a table with element scitypes that subtype Union{Finite, Infinite}. Elements in nominal columns should subtype Finite (i.e., have scitype OrderedFactor or Multiclass) and elements in continuous columns should subtype Infinite (i.e., have scitype Count or Continuous).\ny: An abstract vector of labels (e.g., strings) that correspond to the observations in X","category":"page"},{"location":"models/RandomOversampler_Imbalance/#Transform-Outputs","page":"RandomOversampler","title":"Transform Outputs","text":"","category":"section"},{"location":"models/RandomOversampler_Imbalance/","page":"RandomOversampler","title":"RandomOversampler","text":"Xover: A matrix or table that includes original data and the new observations due to oversampling, 
depending on whether the input X is a matrix or table respectively\nyover: An abstract vector of labels corresponding to Xover","category":"page"},{"location":"models/RandomOversampler_Imbalance/#Operations","page":"RandomOversampler","title":"Operations","text":"","category":"section"},{"location":"models/RandomOversampler_Imbalance/","page":"RandomOversampler","title":"RandomOversampler","text":"transform(mach, X, y): resample the data X and y using RandomOversampler, returning both the new and original observations","category":"page"},{"location":"models/RandomOversampler_Imbalance/#Example","page":"RandomOversampler","title":"Example","text":"","category":"section"},{"location":"models/RandomOversampler_Imbalance/","page":"RandomOversampler","title":"RandomOversampler","text":"using MLJ\nimport Imbalance\n\n## set probability of each class\nclass_probs = [0.5, 0.2, 0.3] \nnum_rows, num_continuous_feats = 100, 5\n## generate a table and categorical vector accordingly\nX, y = Imbalance.generate_imbalanced_data(num_rows, num_continuous_feats; \n class_probs, rng=42) \n\njulia> Imbalance.checkbalance(y)\n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (39.6%) \n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 33 (68.8%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%) \n\n## load RandomOversampler\nRandomOversampler = @load RandomOversampler pkg=Imbalance\n\n## wrap the model in a machine\noversampler = RandomOversampler(ratios=Dict(0=>1.0, 1=> 0.9, 2=>0.8), rng=42)\nmach = machine(oversampler)\n\n## provide the data to transform (there is nothing to fit)\nXover, yover = transform(mach, X, y)\n\njulia> Imbalance.checkbalance(yover)\n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 38 (79.2%) \n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 43 (89.6%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%) ","category":"page"},{"location":"models/DummyRegressor_MLJScikitLearnInterface/#DummyRegressor_MLJScikitLearnInterface","page":"DummyRegressor","title":"DummyRegressor","text":"","category":"section"},{"location":"models/DummyRegressor_MLJScikitLearnInterface/","page":"DummyRegressor","title":"DummyRegressor","text":"DummyRegressor","category":"page"},{"location":"models/DummyRegressor_MLJScikitLearnInterface/","page":"DummyRegressor","title":"DummyRegressor","text":"A model type for constructing a dummy regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/DummyRegressor_MLJScikitLearnInterface/","page":"DummyRegressor","title":"DummyRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/DummyRegressor_MLJScikitLearnInterface/","page":"DummyRegressor","title":"DummyRegressor","text":"DummyRegressor = @load DummyRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/DummyRegressor_MLJScikitLearnInterface/","page":"DummyRegressor","title":"DummyRegressor","text":"Do model = DummyRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in DummyRegressor(strategy=...).","category":"page"},{"location":"models/DummyRegressor_MLJScikitLearnInterface/","page":"DummyRegressor","title":"DummyRegressor","text":"DummyRegressor is a regressor that makes predictions using simple rules.","category":"page"},{"location":"models/PegasosClassifier_BetaML/#PegasosClassifier_BetaML","page":"PegasosClassifier","title":"PegasosClassifier","text":"","category":"section"},{"location":"models/PegasosClassifier_BetaML/","page":"PegasosClassifier","title":"PegasosClassifier","text":"mutable struct PegasosClassifier <: MLJModelInterface.Probabilistic","category":"page"},{"location":"models/PegasosClassifier_BetaML/","page":"PegasosClassifier","title":"PegasosClassifier","text":"The gradient-based linear \"pegasos\" classifier using one-vs-all for multiclass, from the Beta Machine Learning Toolkit (BetaML).","category":"page"},{"location":"models/PegasosClassifier_BetaML/#Hyperparameters:","page":"PegasosClassifier","title":"Hyperparameters:","text":"","category":"section"},{"location":"models/PegasosClassifier_BetaML/","page":"PegasosClassifier","title":"PegasosClassifier","text":"initial_coefficients::Union{Nothing, Matrix{Float64}}: N-classes by D-dimensions matrix of initial linear coefficients [def: nothing, i.e. zeros]\ninitial_constant::Union{Nothing, Vector{Float64}}: N-classes vector of initial constant terms [def: nothing, i.e. zeros]\nlearning_rate::Function: Learning rate [def: (epoch -> 1/sqrt(epoch))]\nlearning_rate_multiplicative::Float64: Multiplicative term of the learning rate [def: 0.5]\nepochs::Int64: Maximum number of epochs, i.e. passes through the whole training sample [def: 1000]\nshuffle::Bool: Whether to randomly shuffle the data at each iteration (epoch) [def: true]\nforce_origin::Bool: Whether to force the parameter associated with the constant term to remain zero [def: false]\nreturn_mean_hyperplane::Bool: Whether to return the average hyperplane coefficients instead of the final ones [def: false]\nrng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]","category":"page"},{"location":"models/PegasosClassifier_BetaML/#Example:","page":"PegasosClassifier","title":"Example:","text":"","category":"section"},{"location":"models/PegasosClassifier_BetaML/","page":"PegasosClassifier","title":"PegasosClassifier","text":"julia> using MLJ\n\njulia> X, y = @load_iris;\n\njulia> modelType = @load PegasosClassifier pkg = \"BetaML\" verbosity=0\nBetaML.Perceptron.PegasosClassifier\n\njulia> model = modelType()\nPegasosClassifier(\n initial_coefficients = nothing, \n initial_constant = nothing, \n learning_rate = BetaML.Perceptron.var\"#71#73\"(), \n learning_rate_multiplicative = 0.5, \n epochs = 1000, \n shuffle = true, \n force_origin = false, \n return_mean_hyperplane = false, \n rng = Random._GLOBAL_RNG())\n\njulia> mach = machine(model, X, y);\n\njulia> fit!(mach);\n\njulia> est_classes = predict(mach, X)\n150-element CategoricalDistributions.UnivariateFiniteVector{Multiclass{3}, String, UInt8, Float64}:\n UnivariateFinite{Multiclass{3}}(setosa=>0.817, versicolor=>0.153, virginica=>0.0301)\n UnivariateFinite{Multiclass{3}}(setosa=>0.791, versicolor=>0.177, virginica=>0.0318)\n ⋮\n UnivariateFinite{Multiclass{3}}(setosa=>0.254, versicolor=>0.5, virginica=>0.246)\n UnivariateFinite{Multiclass{3}}(setosa=>0.283, versicolor=>0.51, 
virginica=>0.207)","category":"page"},{"location":"models/TheilSenRegressor_MLJScikitLearnInterface/#TheilSenRegressor_MLJScikitLearnInterface","page":"TheilSenRegressor","title":"TheilSenRegressor","text":"","category":"section"},{"location":"models/TheilSenRegressor_MLJScikitLearnInterface/","page":"TheilSenRegressor","title":"TheilSenRegressor","text":"TheilSenRegressor","category":"page"},{"location":"models/TheilSenRegressor_MLJScikitLearnInterface/","page":"TheilSenRegressor","title":"TheilSenRegressor","text":"A model type for constructing a Theil-Sen regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/TheilSenRegressor_MLJScikitLearnInterface/","page":"TheilSenRegressor","title":"TheilSenRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/TheilSenRegressor_MLJScikitLearnInterface/","page":"TheilSenRegressor","title":"TheilSenRegressor","text":"TheilSenRegressor = @load TheilSenRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/TheilSenRegressor_MLJScikitLearnInterface/","page":"TheilSenRegressor","title":"TheilSenRegressor","text":"Do model = TheilSenRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in TheilSenRegressor(fit_intercept=...).","category":"page"},{"location":"models/TheilSenRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"TheilSenRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/TheilSenRegressor_MLJScikitLearnInterface/","page":"TheilSenRegressor","title":"TheilSenRegressor","text":"fit_intercept = true\ncopy_X = true\nmax_subpopulation = 10000\nn_subsamples = nothing\nmax_iter = 300\ntol = 0.001\nrandom_state = nothing\nn_jobs = nothing\nverbose = false","category":"page"},{"location":"models/MultiTaskLassoCVRegressor_MLJScikitLearnInterface/#MultiTaskLassoCVRegressor_MLJScikitLearnInterface","page":"MultiTaskLassoCVRegressor","title":"MultiTaskLassoCVRegressor","text":"","category":"section"},{"location":"models/MultiTaskLassoCVRegressor_MLJScikitLearnInterface/","page":"MultiTaskLassoCVRegressor","title":"MultiTaskLassoCVRegressor","text":"MultiTaskLassoCVRegressor","category":"page"},{"location":"models/MultiTaskLassoCVRegressor_MLJScikitLearnInterface/","page":"MultiTaskLassoCVRegressor","title":"MultiTaskLassoCVRegressor","text":"A model type for constructing a multi-target lasso regressor with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/MultiTaskLassoCVRegressor_MLJScikitLearnInterface/","page":"MultiTaskLassoCVRegressor","title":"MultiTaskLassoCVRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/MultiTaskLassoCVRegressor_MLJScikitLearnInterface/","page":"MultiTaskLassoCVRegressor","title":"MultiTaskLassoCVRegressor","text":"MultiTaskLassoCVRegressor = @load MultiTaskLassoCVRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/MultiTaskLassoCVRegressor_MLJScikitLearnInterface/","page":"MultiTaskLassoCVRegressor","title":"MultiTaskLassoCVRegressor","text":"Do model = MultiTaskLassoCVRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in MultiTaskLassoCVRegressor(eps=...).","category":"page"},{"location":"models/MultiTaskLassoCVRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"MultiTaskLassoCVRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/MultiTaskLassoCVRegressor_MLJScikitLearnInterface/","page":"MultiTaskLassoCVRegressor","title":"MultiTaskLassoCVRegressor","text":"eps = 0.001\nn_alphas = 100\nalphas = nothing\nfit_intercept = true\nmax_iter = 300\ntol = 0.0001\ncopy_X = true\ncv = 5\nverbose = false\nn_jobs = 1\nrandom_state = nothing\nselection = cyclic","category":"page"},{"location":"evaluating_model_performance/#Evaluating-Model-Performance","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"","category":"section"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"MLJ allows quick evaluation of a supervised model's performance against a battery of selected losses or scores. For more on available performance measures, see Performance Measures.","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"In addition to hold-out and cross-validation, the user can specify an explicit list of train/test pairs of row indices for resampling, or define new resampling strategies.","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"For simultaneously evaluating multiple models, see Comparing models of different type and nested cross-validation.","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"For externally logging the outcomes of performance evaluation experiments, see Logging Workflows","category":"page"},{"location":"evaluating_model_performance/#Evaluating-against-a-single-measure","page":"Evaluating Model Performance","title":"Evaluating against a single measure","text":"","category":"section"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"using MLJ\nMLJ.color_off()","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"using MLJ\nX = (a=rand(12), b=rand(12), c=rand(12));\ny = X.a + 2X.b + 0.05*rand(12);\nmodel = (@load RidgeRegressor pkg=MultivariateStats verbosity=0)()\ncv = CV(nfolds=3)\nevaluate(model, X, y, resampling=cv, measure=l2, verbosity=0)","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"Alternatively, instead of applying evaluate to a model + data, one may call evaluate! on an existing machine wrapping the model in data:","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"mach = machine(model, X, y)\nevaluate!(mach, resampling=cv, measure=l2, verbosity=0)","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"(The latter call is a mutating call as the learned parameters stored in the machine potentially change. 
)","category":"page"},{"location":"evaluating_model_performance/#Multiple-measures","page":"Evaluating Model Performance","title":"Multiple measures","text":"","category":"section"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"Multiple measures are specified as a vector:","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"evaluate!(\n mach,\n resampling=cv,\n measures=[l1, rms, rmslp1],\n verbosity=0,\n)","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"Custom measures can also be provided.","category":"page"},{"location":"evaluating_model_performance/#Specifying-weights","page":"Evaluating Model Performance","title":"Specifying weights","text":"","category":"section"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"Per-observation weights can be passed to measures. If a measure does not support weights, the weights are ignored:","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"holdout = Holdout(fraction_train=0.8)\nweights = [1, 1, 2, 1, 1, 2, 3, 1, 1, 2, 3, 1];\nevaluate!(\n mach,\n resampling=CV(nfolds=3),\n measure=[l2, rsquared],\n weights=weights,\n)","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"In classification problems, use class_weights=... to specify a class weight dictionary.","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"MLJBase.evaluate!\nMLJBase.evaluate\nMLJBase.PerformanceEvaluation","category":"page"},{"location":"evaluating_model_performance/#MLJBase.evaluate!","page":"Evaluating Model Performance","title":"MLJBase.evaluate!","text":"evaluate!(mach; resampling=CV(), measure=nothing, options...)\n\nEstimate the performance of a machine mach wrapping a supervised model in data, using the specified resampling strategy (defaulting to 6-fold cross-validation) and measure, which can be a single measure or vector. Returns a PerformanceEvaluation object.\n\nAvailable resampling strategies are CV, Holdout, InSample, StratifiedCV and TimeSeriesCV. If resampling is not an instance of one of these, then a vector of tuples of the form (train_rows, test_rows) is expected. For example, setting\n\nresampling = [((1:100), (101:200)),\n ((101:200), (1:100))]\n\ngives two-fold cross-validation using the first 200 rows of data.\n\nAny measure conforming to the StatisticalMeasuresBase.jl API can be provided, assuming it can consume multiple observations.\n\nAlthough evaluate! is mutating, mach.model and mach.args are not mutated.\n\nAdditional keyword options\n\nrows - vector of observation indices from which both train and test folds are constructed (default is all observations)\noperation/operations=nothing - One of predict, predict_mean, predict_mode, predict_median, or predict_joint, or a vector of these of the same length as measure/measures. Automatically inferred if left unspecified. 
For example, predict_mode will be used for a Multiclass target, if model is a probabilistic predictor, but measure expects literal (point) target predictions. Operations actually applied can be inspected from the operation field of the object returned.\nweights - per-sample Real weights for measures that support them (not to be confused with weights used in training, such as the w in mach = machine(model, X, y, w)).\nclass_weights - dictionary of Real per-class weights for use with measures that support these, in classification problems (not to be confused with weights used in training, such as the w in mach = machine(model, X, y, w)).\nrepeats::Int=1: set to a higher value for repeated (Monte Carlo) resampling. For example, if repeats = 10, then resampling = CV(nfolds=5, shuffle=true) generates a total of 50 (train, test) pairs for evaluation and subsequent aggregation.\nacceleration=CPU1(): acceleration/parallelization option; can be any instance of CPU1 (single-threaded computation), CPUThreads (multi-threaded computation) or CPUProcesses (multi-process computation); default is default_resource(). These types are owned by ComputationalResources.jl.\nforce=false: set to true to force cold-restart of each training event\nverbosity::Int=1: logging level; can be negative\ncheck_measure=true: whether to screen measures for possible incompatibility with the model. Will not catch all incompatibilities.\nper_observation=true: whether to calculate estimates for individual observations; if false the per_observation field of the returned object is populated with missings. Setting to false may reduce compute time and allocations.\nlogger - a logger object (see MLJBase.log_evaluation)\ncompact=false - if true, the returned evaluation object excludes these fields: fitted_params_per_fold, report_per_fold, train_test_rows.\n\nSee also evaluate, PerformanceEvaluation, CompactPerformanceEvaluation.\n\n\n\n\n\n","category":"function"},{"location":"evaluating_model_performance/#MLJModelInterface.evaluate","page":"Evaluating Model Performance","title":"MLJModelInterface.evaluate","text":"some meta-models may choose to implement the evaluate operations\n\n\n\n\n\n","category":"function"},{"location":"evaluating_model_performance/#MLJBase.PerformanceEvaluation","page":"Evaluating Model Performance","title":"MLJBase.PerformanceEvaluation","text":"PerformanceEvaluation <: AbstractPerformanceEvaluation\n\nType of object returned by evaluate (for models plus data) or evaluate! (for machines). Such objects encode estimates of the performance (generalization error) of a supervised model or outlier detection model, and store other information ancillary to the computation.\n\nIf evaluate or evaluate! is called with the compact=true option, then a CompactPerformanceEvaluation object is returned instead.\n\nWhen evaluate/evaluate! is called, a number of train/test pairs (\"folds\") of row indices are generated, according to the options provided, which are discussed in the evaluate! doc-string. Rows correspond to observations. The generated train/test pairs are recorded in the train_test_rows field of the PerformanceEvaluation struct, and the corresponding estimates, aggregated over all train/test pairs, are recorded in measurement, a vector with one entry for each measure (metric) recorded in measure.\n\nWhen displayed, a PerformanceEvaluation object includes a value under the heading 1.96*SE, derived from the standard error of the per_fold entries. 
This value is suitable for constructing a formal 95% confidence interval for the given measurement. Such intervals should be interpreted with caution. See, for example, Bates et al. (2021).\n\nFields\n\nThese fields are part of the public API of the PerformanceEvaluation struct.\n\nmodel: model used to create the performance evaluation. In the case of a tuning model, this is the best model found.\nmeasure: vector of measures (metrics) used to evaluate performance\nmeasurement: vector of measurements - one for each element of measure - aggregating the performance measurements over all train/test pairs (folds). The aggregation method applied for a given measure m is StatisticalMeasuresBase.external_aggregation_mode(m) (commonly Mean() or Sum())\noperation (e.g., predict_mode): the operations applied for each measure to generate predictions to be evaluated. Possibilities are: predict, predict_mean, predict_mode, predict_median, or predict_joint.\nper_fold: a vector of vectors of individual test fold evaluations (one vector per measure). Useful for obtaining a rough estimate of the variance of the performance estimate.\nper_observation: a vector of vectors of vectors containing individual per-observation measurements: for an evaluation e, e.per_observation[m][f][i] is the measurement for the ith observation in the fth test fold, evaluated using the mth measure. Useful for some forms of hyper-parameter optimization. Note that an aggregated measurement for some measure measure is repeated across all observations in a fold if StatisticalMeasures.can_report_unaggregated(measure) == true. If e has been computed with the per_observation=false option, then e.per_observation is a vector of missings.\nfitted_params_per_fold: a vector containing fitted_params(mach) for each machine mach trained during resampling - one machine per train/test pair. 
Use this to extract the learned parameters for each individual training event.\nreport_per_fold: a vector containing report(mach) for each machine mach trained during resampling - one machine per train/test pair.\ntrain_test_rows: a vector of tuples, each of the form (train, test), where train and test are vectors of row (observation) indices for training and evaluation respectively.\nresampling: the user-specified resampling strategy to generate the train/test pairs (or literal train/test pairs if that was directly specified).\nrepeats: the number of times the resampling strategy was repeated.\n\nSee also CompactPerformanceEvaluation.\n\n\n\n\n\n","category":"type"},{"location":"evaluating_model_performance/#User-specified-train/test-sets","page":"Evaluating Model Performance","title":"User-specified train/test sets","text":"","category":"section"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"Users can either provide an explicit list of train/test pairs of row indices for resampling, as in this example:","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"fold1 = 1:6; fold2 = 7:12;\nevaluate!(\n mach,\n resampling = [(fold1, fold2), (fold2, fold1)],\n measures=[l1, l2],\n verbosity=0,\n)","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"Or the user can define their own re-usable ResamplingStrategy objects; see Custom resampling strategies below.","category":"page"},{"location":"evaluating_model_performance/#Built-in-resampling-strategies","page":"Evaluating Model Performance","title":"Built-in resampling strategies","text":"","category":"section"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"MLJBase.Holdout","category":"page"},{"location":"evaluating_model_performance/#MLJBase.Holdout","page":"Evaluating Model Performance","title":"MLJBase.Holdout","text":"holdout = Holdout(; fraction_train=0.7, shuffle=nothing, rng=nothing)\n\nInstantiate a Holdout resampling strategy, for use in evaluate!, evaluate and in tuning.\n\ntrain_test_pairs(holdout, rows)\n\nReturns the pair [(train, test)], where train and test are vectors such that rows=vcat(train, test) and length(train)/length(rows) is approximately equal to fraction_train.\n\nPre-shuffling of rows is controlled by rng and shuffle. If rng is an integer, then the Holdout keyword constructor resets it to MersenneTwister(rng). Otherwise some AbstractRNG object is expected.\n\nIf rng is left unspecified, rng is reset to Random.GLOBAL_RNG, in which case rows are only pre-shuffled if shuffle=true is specified.\n\n\n\n\n\n","category":"type"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"MLJBase.CV","category":"page"},{"location":"evaluating_model_performance/#MLJBase.CV","page":"Evaluating Model Performance","title":"MLJBase.CV","text":"cv = CV(; nfolds=6, shuffle=nothing, rng=nothing)\n\nCross-validation resampling strategy, for use in evaluate!, evaluate and tuning.\n\ntrain_test_pairs(cv, rows)\n\nReturns an nfolds-length iterator of (train, test) pairs of vectors (row indices), where each train and test is a sub-vector of rows. The test vectors are mutually exclusive and exhaust rows. 
Each train vector is the complement of the corresponding test vector. With no row pre-shuffling, the order of rows is preserved, in the sense that rows coincides precisely with the concatenation of the test vectors, in the order they are generated. The first r test vectors have length n + 1, where n, r = divrem(length(rows), nfolds), and the remaining test vectors have length n.\n\nPre-shuffling of rows is controlled by rng and shuffle. If rng is an integer, then the CV keyword constructor resets it to MersenneTwister(rng). Otherwise some AbstractRNG object is expected.\n\nIf rng is left unspecified, rng is reset to Random.GLOBAL_RNG, in which case rows are only pre-shuffled if shuffle=true is explicitly specified.\n\n\n\n\n\n","category":"type"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"MLJBase.StratifiedCV","category":"page"},{"location":"evaluating_model_performance/#MLJBase.StratifiedCV","page":"Evaluating Model Performance","title":"MLJBase.StratifiedCV","text":"stratified_cv = StratifiedCV(; nfolds=6,\n shuffle=false,\n rng=Random.GLOBAL_RNG)\n\nStratified cross-validation resampling strategy, for use in evaluate!, evaluate and in tuning. Applies only to classification problems (OrderedFactor or Multiclass targets).\n\ntrain_test_pairs(stratified_cv, rows, y)\n\nReturns an nfolds-length iterator of (train, test) pairs of vectors (row indices) where each train and test is a sub-vector of rows. The test vectors are mutually exclusive and exhaust rows. Each train vector is the complement of the corresponding test vector.\n\nUnlike regular cross-validation, the distribution of the levels of the target y corresponding to each train and test is constrained, as far as possible, to replicate that of y[rows] as a whole.\n\nThe stratified train_test_pairs algorithm is invariant to label renaming. For example, if you run replace!(y, 'a' => 'b', 'b' => 'a') and then re-run train_test_pairs, the returned (train, test) pairs will be the same.\n\nPre-shuffling of rows is controlled by rng and shuffle. If rng is an integer, then the StratifiedCV keyword constructor resets it to MersenneTwister(rng). Otherwise some AbstractRNG object is expected.\n\nIf rng is left unspecified, rng is reset to Random.GLOBAL_RNG, in which case rows are only pre-shuffled if shuffle=true is explicitly specified.\n\n\n\n\n\n","category":"type"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"MLJBase.TimeSeriesCV","category":"page"},{"location":"evaluating_model_performance/#MLJBase.TimeSeriesCV","page":"Evaluating Model Performance","title":"MLJBase.TimeSeriesCV","text":"tscv = TimeSeriesCV(; nfolds=4)\n\nCross-validation resampling strategy, for use in evaluate!, evaluate and tuning, when observations are chronological and not expected to be independent.\n\ntrain_test_pairs(tscv, rows)\n\nReturns an nfolds-length iterator of (train, test) pairs of vectors (row indices), where each train and test is a sub-vector of rows. The rows are partitioned sequentially into nfolds + 1 approximately equal length partitions, where the first partition is the first train set, and the second partition is the first test set. 
The second train set consists of the first two partitions, and the second test set consists of the third partition, and so on for each fold.\n\nThe first partition (which is the first train set) has length n + r, where n, r = divrem(length(rows), nfolds + 1), and the remaining partitions (all of the test folds) have length n.\n\nExamples\n\njulia> MLJBase.train_test_pairs(TimeSeriesCV(nfolds=3), 1:10)\n3-element Vector{Tuple{UnitRange{Int64}, UnitRange{Int64}}}:\n (1:4, 5:6)\n (1:6, 7:8)\n (1:8, 9:10)\n\njulia> model = (@load RidgeRegressor pkg=MultivariateStats verbosity=0)();\n\njulia> data = @load_sunspots;\n\njulia> X = (lag1 = data.sunspot_number[2:end-1],\n lag2 = data.sunspot_number[1:end-2]);\n\njulia> y = data.sunspot_number[3:end];\n\njulia> tscv = TimeSeriesCV(nfolds=3);\n\njulia> evaluate(model, X, y, resampling=tscv, measure=rmse, verbosity=0)\n┌───────────────────────────┬───────────────┬────────────────────┐\n│ _.measure │ _.measurement │ _.per_fold │\n├───────────────────────────┼───────────────┼────────────────────┤\n│ RootMeanSquaredError @753 │ 21.7 │ [25.4, 16.3, 22.4] │\n└───────────────────────────┴───────────────┴────────────────────┘\n_.per_observation = [missing]\n_.fitted_params_per_fold = [ … ]\n_.report_per_fold = [ … ]\n_.train_test_rows = [ … ]\n\n\n\n\n\n","category":"type"},{"location":"evaluating_model_performance/#Custom-resampling-strategies","page":"Evaluating Model Performance","title":"Custom resampling strategies","text":"","category":"section"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"To define a new resampling strategy, make relevant parameters of your strategy the fields of a new type MyResamplingStrategy <: MLJ.ResamplingStrategy, and implement one of the following methods:","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"MLJ.train_test_pairs(my_strategy::MyResamplingStrategy, rows)\nMLJ.train_test_pairs(my_strategy::MyResamplingStrategy, rows, y)\nMLJ.train_test_pairs(my_strategy::MyResamplingStrategy, rows, X, y)","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"Each method takes a vector of indices rows and returns a vector [(t1, e1), (t2, e2), ... (tk, ek)] of train/test pairs of row indices selected from rows. 
Here X, y are the input and target data (ignored in simple strategies, such as Holdout and CV).","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"Here is the code for the Holdout strategy as an example:","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"struct Holdout <: ResamplingStrategy\n fraction_train::Float64\n shuffle::Bool\n rng::Union{Int,AbstractRNG}\n\n function Holdout(fraction_train, shuffle, rng)\n 0 < fraction_train < 1 ||\n error(\"`fraction_train` must be between 0 and 1.\")\n return new(fraction_train, shuffle, rng)\n end\nend\n\n# Keyword Constructor\nfunction Holdout(; fraction_train::Float64=0.7, shuffle=nothing, rng=nothing)\n if rng isa Integer\n rng = MersenneTwister(rng)\n end\n if shuffle === nothing\n shuffle = ifelse(rng===nothing, false, true)\n end\n if rng === nothing\n rng = Random.GLOBAL_RNG\n end\n return Holdout(fraction_train, shuffle, rng)\nend\n\nfunction train_test_pairs(holdout::Holdout, rows)\n train, test = partition(rows, holdout.fraction_train,\n shuffle=holdout.shuffle, rng=holdout.rng)\n return [(train, test),]\nend","category":"page"},{"location":"common_mlj_workflows/#Common-MLJ-Workflows","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"","category":"section"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"This demo assumes you have certain packages in your active package environment. To activate a new environment, \"MyNewEnv\", with just these packages, do this in a new REPL session:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"using Pkg\nPkg.activate(\"MyNewEnv\")\nPkg.add([\"MLJ\", \"RDatasets\", \"DataFrames\", \"MLJDecisionTreeInterface\",\n \"MLJMultivariateStatsInterface\", \"NearestNeighborModels\", \"MLJGLMInterface\",\n \"Plots\"])","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"The following starts MLJ and shows the current version of MLJ (you can also use Pkg.status()):","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"using MLJ\nMLJ_VERSION","category":"page"},{"location":"common_mlj_workflows/#Data-ingestion","page":"Common MLJ Workflows","title":"Data ingestion","text":"","category":"section"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"# to avoid RDatasets as a doc dependency, generate synthetic data with\n# similar parameters, with the first four rows mimicking the original dataset\n# for display purposes\ncolor_off()\nimport DataFrames\nchanning = (Sex = [repeat([\"Male\"], 4)..., rand([\"Male\",\"Female\"], 458)...],\n Entry = Int32[782, 1020, 856, 915, rand(733:1140, 458)...],\n Exit = Int32[909, 1128, 969, 957, rand(777:1207, 458)...],\n Time = Int32[127, 108, 113, 42, rand(0:137, 458)...],\n Cens = Int32[1, 1, 1, 1, rand(0:1, 458)...]) |> DataFrames.DataFrame\ncoerce!(channing, :Sex => Multiclass)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"import RDatasets\nchanning = RDatasets.dataset(\"boot\", \"channing\")","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ 
Workflows","title":"Common MLJ Workflows","text":"first(channing, 4) |> pretty","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Inspecting metadata, including column scientific types:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"schema(channing)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Horizontally splitting data and shuffling rows.","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Here y is the :Exit column and X a table with everything else:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"y, X = unpack(channing, ==(:Exit), rng=123)\nnothing # hide","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Here y is the :Exit column and X everything else except :Time:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"y, X = unpack(channing,\n ==(:Exit),\n !=(:Time);\n rng=123);\nscitype(y)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"schema(X)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Fixing wrong scientific types in X:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"X = coerce(X, :Exit=>Continuous, :Entry=>Continuous, :Cens=>Multiclass);\nschema(X)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Loading a built-in supervised dataset:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"table = load_iris();\nschema(table)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Loading a built-in data set already split into X and y:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"X, y = @load_iris;\nselectrows(X, 1:4) # selectrows works whenever `Tables.istable(X)==true`.","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"y[1:4]","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Splitting data vertically after row shuffling:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"channing_train, channing_test = partition(channing, 0.6, rng=123);\nnothing # hide","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Or, if already horizontally split:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"(Xtrain, Xtest), (ytrain, ytest) = partition((X, y), 0.6, multi=true, rng=123)","category":"page"},{"location":"common_mlj_workflows/#Model-Search","page":"Common MLJ Workflows","title":"Model 
Search","text":"","category":"section"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Reference: Model Search","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Searching for a supervised model:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"X, y = @load_boston\nms = models(matching(X, y))","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"ms[6]","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"models(\"Tree\")","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"A more refined search:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"models() do model\n matching(model, X, y) &&\n model.prediction_type == :deterministic &&\n model.is_pure_julia\nend;\nnothing # hide","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Searching for an unsupervised model:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"models(matching(X))","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Getting the metadata entry for a given model type:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"info(\"PCA\")\ninfo(\"RidgeRegressor\", pkg=\"MultivariateStats\") # a model type in multiple packages","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Extracting the model document string (output omitted):","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"doc(\"DecisionTreeClassifier\", pkg=\"DecisionTree\")\nnothing # hide","category":"page"},{"location":"common_mlj_workflows/#Instantiating-a-model","page":"Common MLJ Workflows","title":"Instantiating a model","text":"","category":"section"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Reference: Getting Started, Loading Model Code","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Assumes MLJDecisionTreeClassifier is in your environment. 
Otherwise, try interactive loading with @iload:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Tree = @load DecisionTreeClassifier pkg=DecisionTree\ntree = Tree(min_samples_split=5, max_depth=4)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"or","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"tree = (@load DecisionTreeClassifier)()\ntree.min_samples_split = 5\ntree.max_depth = 4","category":"page"},{"location":"common_mlj_workflows/#Evaluating-a-model","page":"Common MLJ Workflows","title":"Evaluating a model","text":"","category":"section"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Reference: Evaluating Model Performance","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"X, y = @load_boston # a table and a vector\nKNN = @load KNNRegressor\nknn = KNN()\nevaluate(knn, X, y,\n resampling=CV(nfolds=5),\n measure=[RootMeanSquaredError(), LPLoss(1)])","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Note RootMeanSquaredError() has alias rms and LPLoss(1) has aliases l1, mae.","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Do measures() to list all losses and scores and their aliases, or refer to the StatisticalMeasures.jl docs.","category":"page"},{"location":"common_mlj_workflows/#Basic-fit/evaluate/predict-by-hand","page":"Common MLJ Workflows","title":"Basic fit/evaluate/predict by hand","text":"","category":"section"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Reference: Getting Started, Machines, Evaluating Model Performance, Performance Measures","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"crabs = load_crabs() |> DataFrames.DataFrame\nschema(crabs)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"y, X = unpack(crabs, ==(:sp), !in([:index, :sex]); rng=123)\n\nTree = @load DecisionTreeClassifier pkg=DecisionTree\ntree = Tree(max_depth=2) # hide","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Bind the model and data together in a machine, which will additionally, store the learned parameters (fitresults) when fit:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"mach = machine(tree, X, y)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Split row indices into training and evaluation rows:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"train, test = partition(eachindex(y), 0.7); # 70:30 split","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Fit on the train data set and evaluate on the test data set:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ 
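The object returned by evaluate (and evaluate!) can also be inspected programmatically rather than just displayed. A short sketch, using the field names documented for PerformanceEvaluation:

```julia
using MLJ
X, y = @load_boston
KNN = @load KNNRegressor

e = evaluate(KNN(), X, y,
             resampling=CV(nfolds=5),
             measure=[RootMeanSquaredError(), LPLoss(1)])

e.measurement   # one aggregated score per measure
e.per_fold      # per-fold scores, one vector per measure
```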
Workflows","title":"Common MLJ Workflows","text":"fit!(mach, rows=train)\nyhat = predict(mach, X[test,:])\nLogLoss(tol=1e-4)(yhat, y[test])","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Note LogLoss() has aliases log_loss and cross_entropy.","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Predict on the new data set:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Xnew = (FL = rand(3), RW = rand(3), CL = rand(3), CW = rand(3), BD = rand(3))\npredict(mach, Xnew) # a vector of distributions","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"predict_mode(mach, Xnew) # a vector of point-predictions","category":"page"},{"location":"common_mlj_workflows/#More-performance-evaluation-examples","page":"Common MLJ Workflows","title":"More performance evaluation examples","text":"","category":"section"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Evaluating model + data directly:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"evaluate(tree, X, y,\n resampling=Holdout(fraction_train=0.7, shuffle=true, rng=1234),\n measure=[LogLoss(), Accuracy()])","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"If a machine is already defined, as above:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"evaluate!(mach,\n resampling=Holdout(fraction_train=0.7, shuffle=true, rng=1234),\n measure=[LogLoss(), Accuracy()])","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Using cross-validation:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"evaluate!(mach, resampling=CV(nfolds=5, shuffle=true, rng=1234),\n measure=[LogLoss(), Accuracy()])","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"With user-specified train/test pairs of row indices:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"f1, f2, f3 = 1:13, 14:26, 27:36\npairs = [(f1, vcat(f2, f3)), (f2, vcat(f3, f1)), (f3, vcat(f1, f2))];\nevaluate!(mach,\n resampling=pairs,\n measure=[LogLoss(), Accuracy()])","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Changing a hyperparameter and re-evaluating:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"tree.max_depth = 3\nevaluate!(mach,\n resampling=CV(nfolds=5, shuffle=true, rng=1234),\n measure=[LogLoss(), Accuracy()])","category":"page"},{"location":"common_mlj_workflows/#Inspecting-training-results","page":"Common MLJ Workflows","title":"Inspecting training results","text":"","category":"section"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Fit an ordinary least square model to some synthetic 
data:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"x1 = rand(100)\nx2 = rand(100)\n\nX = (x1=x1, x2=x2)\ny = x1 - 2x2 + 0.1*rand(100);\n\nOLS = @load LinearRegressor pkg=GLM\nols = OLS()\nmach = machine(ols, X, y) |> fit!","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Get a named tuple representing the learned parameters, human-readable if appropriate:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"fitted_params(mach)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Get other training-related information:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"report(mach)","category":"page"},{"location":"common_mlj_workflows/#Basic-fit/transform-for-unsupervised-models","page":"Common MLJ Workflows","title":"Basic fit/transform for unsupervised models","text":"","category":"section"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Load data:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"X, y = @load_iris # a table and a vector\ntrain, test = partition(eachindex(y), 0.97, shuffle=true, rng=123)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Instantiate and fit the model/machine:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"PCA = @load PCA\npca = PCA(maxoutdim=2)\nmach = machine(pca, X)\nfit!(mach, rows=train)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Transform selected data bound to the machine:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"transform(mach, rows=test);","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Transform new data:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Xnew = (sepal_length=rand(3), sepal_width=rand(3),\n petal_length=rand(3), petal_width=rand(3));\ntransform(mach, Xnew)","category":"page"},{"location":"common_mlj_workflows/#Inverting-learned-transformations","page":"Common MLJ Workflows","title":"Inverting learned transformations","text":"","category":"section"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"y = rand(100);\nstand = Standardizer()\nmach = machine(stand, y)\nfit!(mach)\nz = transform(mach, y);\n@assert inverse_transform(mach, z) ≈ y # true","category":"page"},{"location":"common_mlj_workflows/#Nested-hyperparameter-tuning","page":"Common MLJ Workflows","title":"Nested hyperparameter tuning","text":"","category":"section"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Reference: Tuning Models","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"X, y = 
@load_iris","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Define a model with nested hyperparameters:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Tree = @load DecisionTreeClassifier pkg=DecisionTree\ntree = Tree()\nforest = EnsembleModel(model=tree, n=300)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Define ranges for hyperparameters to be tuned:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"r1 = range(forest, :bagging_fraction, lower=0.5, upper=1.0, scale=:log10)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"r2 = range(forest, :(model.n_subfeatures), lower=1, upper=4) # nested","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Wrap the model in a tuning strategy:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"tuned_forest = TunedModel(model=forest,\n tuning=Grid(resolution=12),\n resampling=CV(nfolds=6),\n ranges=[r1, r2],\n measure=BrierLoss())","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Bound the wrapped model to data:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"mach = machine(tuned_forest, X, y)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Fitting the resultant machine optimizes the hyperparameters specified in range, using the specified tuning and resampling strategies and performance measure (possibly a vector of measures), and retrains on all data bound to the machine:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"fit!(mach)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Inspecting the optimal model:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"F = fitted_params(mach)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"F.best_model","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Inspecting details of tuning procedure:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"r = report(mach);\nkeys(r)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"r.history[[1,end]]","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Visualizing these results:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"using Plots\nplot(mach)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ 
Workflows","text":"(Image: )","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Predicting on new data using the optimized model trained on all data:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"predict(mach, Xnew)","category":"page"},{"location":"common_mlj_workflows/#Constructing-linear-pipelines","page":"Common MLJ Workflows","title":"Constructing linear pipelines","text":"","category":"section"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Reference: Linear Pipelines","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Constructing a linear (unbranching) pipeline with a learned target transformation/inverse transformation:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"X, y = @load_reduced_ames\nKNN = @load KNNRegressor\nknn_with_target = TransformedTargetModel(model=KNN(K=3), transformer=Standardizer())","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"pipe = (X -> coerce(X, :age=>Continuous)) |> OneHotEncoder() |> knn_with_target","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Evaluating the pipeline (just as you would any other model):","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"pipe.one_hot_encoder.drop_last = true # mutate a nested hyper-parameter\nevaluate(pipe, X, y, resampling=Holdout(), measure=RootMeanSquaredError(), verbosity=2)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Inspecting the learned parameters in a pipeline:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"mach = machine(pipe, X, y) |> fit!\nF = fitted_params(mach)\nF.transformed_target_model_deterministic.model","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Constructing a linear (unbranching) pipeline with a static (unlearned) target transformation/inverse transformation:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Tree = @load DecisionTreeRegressor pkg=DecisionTree verbosity=0\ntree_with_target = TransformedTargetModel(model=Tree(),\n transformer=y -> log.(y),\n inverse = z -> exp.(z))\npipe2 = (X -> coerce(X, :age=>Continuous)) |> OneHotEncoder() |> tree_with_target\nnothing # hide","category":"page"},{"location":"common_mlj_workflows/#Creating-a-homogeneous-ensemble-of-models","page":"Common MLJ Workflows","title":"Creating a homogeneous ensemble of models","text":"","category":"section"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Reference: Homogeneous Ensembles","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"X, y = @load_iris\nTree = @load DecisionTreeClassifier pkg=DecisionTree\ntree = Tree()\nforest = EnsembleModel(model=tree, bagging_fraction=0.8, n=300)\nmach = 
machine(forest, X, y)\nevaluate!(mach, measure=LogLoss())","category":"page"},{"location":"common_mlj_workflows/#Performance-curves","page":"Common MLJ Workflows","title":"Performance curves","text":"","category":"section"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Generate a plot of performance, as a function of some hyperparameter (building on the preceding example)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Single performance curve:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"r = range(forest, :n, lower=1, upper=1000, scale=:log10)\ncurve = learning_curve(mach,\n range=r,\n resampling=Holdout(),\n resolution=50,\n measure=LogLoss(),\n verbosity=0)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"using Plots\nplot(curve.parameter_values, curve.measurements,\n xlab=curve.parameter_name, xscale=curve.parameter_scale)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"(Image: )","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Multiple curves:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"curve = learning_curve(mach,\n range=r,\n resampling=Holdout(),\n measure=LogLoss(),\n resolution=50,\n rng_name=:rng,\n rngs=4,\n verbosity=0)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"plot(curve.parameter_values, curve.measurements,\n xlab=curve.parameter_name, xscale=curve.parameter_scale)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"(Image: )","category":"page"},{"location":"models/LinearRegressor_GLM/#LinearRegressor_GLM","page":"LinearRegressor","title":"LinearRegressor","text":"","category":"section"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"LinearRegressor","category":"page"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"A model type for constructing a linear regressor, based on GLM.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"LinearRegressor = @load LinearRegressor pkg=GLM","category":"page"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"Do model = LinearRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearRegressor(fit_intercept=...).","category":"page"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"LinearRegressor assumes the target is a continuous variable whose conditional distribution is normal with constant variance, and whose expected value is a linear combination of the features (identity link function). 
Options exist to specify an intercept or offset feature.","category":"page"},{"location":"models/LinearRegressor_GLM/#Training-data","page":"LinearRegressor","title":"Training data","text":"","category":"section"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"In MLJ or MLJBase, bind an instance model to data with one of:","category":"page"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"mach = machine(model, X, y)\nmach = machine(model, X, y, w)","category":"page"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"Here","category":"page"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"X: is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check the scitype with schema(X)\ny: is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)\nw: is a vector of Real per-observation weights","category":"page"},{"location":"models/LinearRegressor_GLM/#Hyper-parameters","page":"LinearRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"fit_intercept=true: Whether to calculate the intercept for this model. If set to false, no intercept will be calculated (e.g. the data is expected to be centered)\ndropcollinear=false: Whether to drop features in the training data to ensure linear independence. If true , only the first of each set of linearly-dependent features is used. The coefficient for redundant linearly dependent features is 0.0 and all associated statistics are set to NaN.\noffsetcol=nothing: Name of the column to be used as an offset, if any. An offset is a variable which is known to have a coefficient of 1.\nreport_keys: Vector of keys for the report. Possible keys are: :deviance, :dof_residual, :stderror, :vcov, :coef_table and :glm_model. By default only :glm_model is excluded.","category":"page"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/LinearRegressor_GLM/#Operations","page":"LinearRegressor","title":"Operations","text":"","category":"section"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"predict(mach, Xnew): return predictions of the target given new features Xnew having the same Scitype as X above. 
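Per-observation weights enter through the third machine argument, as noted above. A short sketch on synthetic data (the weights here are arbitrary):

```julia
using MLJ
LinearRegressor = @load LinearRegressor pkg=GLM

X, y = make_regression(100, 2)    # synthetic data
w = rand(100)                     # per-observation Real weights

mach = machine(LinearRegressor(), X, y, w) |> fit!
predict_mean(mach, X)[1:3]
```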
Predictions are probabilistic.\npredict_mean(mach, Xnew): instead return the mean of each prediction above\npredict_median(mach, Xnew): instead return the median of each prediction above.","category":"page"},{"location":"models/LinearRegressor_GLM/#Fitted-parameters","page":"LinearRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"features: The names of the features encountered during model fitting.\ncoef: The linear coefficients determined by the model.\nintercept: The intercept determined by the model.","category":"page"},{"location":"models/LinearRegressor_GLM/#Report","page":"LinearRegressor","title":"Report","text":"","category":"section"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"When all keys are enabled in report_keys, the following fields are available in report(mach):","category":"page"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"deviance: Measure of deviance of fitted model with respect to a perfectly fitted model. For a linear model, this is the weighted residual sum of squares\ndof_residual: The degrees of freedom for residuals, when meaningful.\nstderror: The standard errors of the coefficients.\nvcov: The estimated variance-covariance matrix of the coefficient estimates.\ncoef_table: Table which displays coefficients and summarizes their significance and confidence intervals.\nglm_model: The raw fitted model returned by GLM.lm. Note this points to training data. 
Refer to the GLM.jl documentation for usage.","category":"page"},{"location":"models/LinearRegressor_GLM/#Examples","page":"LinearRegressor","title":"Examples","text":"","category":"section"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"using MLJ\nLinearRegressor = @load LinearRegressor pkg=GLM\nglm = LinearRegressor()\n\nX, y = make_regression(100, 2) ## synthetic data\nmach = machine(glm, X, y) |> fit!\n\nXnew, _ = make_regression(3, 2)\nyhat = predict(mach, Xnew) ## new predictions\nyhat_point = predict_mean(mach, Xnew) ## new predictions\n\nfitted_params(mach).features\nfitted_params(mach).coef ## x1, x2, intercept\nfitted_params(mach).intercept\n\nreport(mach)","category":"page"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"See also LinearCountRegressor, LinearBinaryClassifier","category":"page"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/#SelfOrganizingMap_SelfOrganizingMaps","page":"SelfOrganizingMap","title":"SelfOrganizingMap","text":"","category":"section"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/","page":"SelfOrganizingMap","title":"SelfOrganizingMap","text":"SelfOrganizingMap","category":"page"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/","page":"SelfOrganizingMap","title":"SelfOrganizingMap","text":"A model type for constructing a self organizing map, based on SelfOrganizingMaps.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/","page":"SelfOrganizingMap","title":"SelfOrganizingMap","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/","page":"SelfOrganizingMap","title":"SelfOrganizingMap","text":"SelfOrganizingMap = @load SelfOrganizingMap pkg=SelfOrganizingMaps","category":"page"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/","page":"SelfOrganizingMap","title":"SelfOrganizingMap","text":"Do model = SelfOrganizingMap() to construct an instance with default hyper-parameters. 
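Continuing the LinearRegressor example above, individual report fields can be pulled out directly; the field names are those listed in the Report section, with :glm_model excluded by default:

```julia
rpt = report(mach)   # mach as fitted in the example above

rpt.stderror         # standard errors of the coefficients
rpt.coef_table       # coefficient table with significance and confidence intervals
keys(rpt)            # all report keys enabled by report_keys
```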
Provide keyword arguments to override hyper-parameter defaults, as in SelfOrganizingMap(k=...).","category":"page"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/","page":"SelfOrganizingMap","title":"SelfOrganizingMap","text":"SelfOrganizingMaps implements Kohonen's Self Organizing Map, Proceedings of the IEEE; Kohonen, T.; (1990):\"The self-organizing map\"","category":"page"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/#Training-data","page":"SelfOrganizingMap","title":"Training data","text":"","category":"section"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/","page":"SelfOrganizingMap","title":"SelfOrganizingMap","text":"In MLJ or MLJBase, bind an instance model to data with mach = machine(model, X) where","category":"page"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/","page":"SelfOrganizingMap","title":"SelfOrganizingMap","text":"X: an AbstractMatrix or Table of input features whose columns are of scitype Continuous.","category":"page"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/","page":"SelfOrganizingMap","title":"SelfOrganizingMap","text":"Train the machine with fit!(mach, rows=...).","category":"page"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/#Hyper-parameters","page":"SelfOrganizingMap","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/","page":"SelfOrganizingMap","title":"SelfOrganizingMap","text":"k=10: Number of nodes along once side of SOM grid. There are k² total nodes.\nη=0.5: Learning rate. Scales adjust made to winning node and its neighbors during each round of training.\nσ²=0.05: The (squared) neighbor radius. Used to determine scale for neighbor node adjustments.\ngrid_type=:rectangular Node grid geometry. One of (:rectangular, :hexagonal, :spherical).\nη_decay=:exponential Learning rate schedule function. One of (:exponential, :asymptotic)\nσ_decay=:exponential Neighbor radius schedule function. One of (:exponential, :asymptotic, :none)\nneighbor_function=:gaussian Kernel function used to make adjustment to neighbor weights. Scale is set by σ². One of (:gaussian, :mexican_hat).\nmatching_distance=euclidean Distance function from Distances.jl used to determine winning node.\nNepochs=1 Number of times to repeat training on the shuffled dataset.","category":"page"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/#Operations","page":"SelfOrganizingMap","title":"Operations","text":"","category":"section"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/","page":"SelfOrganizingMap","title":"SelfOrganizingMap","text":"transform(mach, Xnew): returns the coordinates of the winning SOM node for each instance of Xnew. For SOM of gridtype :rectangular and :hexagonal, these are cartesian coordinates. 
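For instance, a larger hexagonal-grid map using a Manhattan matching distance could be configured as below. This sketch assumes Distances.jl (a dependency of SelfOrganizingMaps.jl) is available for the cityblock function:

```julia
using MLJ
import Distances

SelfOrganizingMap = @load SelfOrganizingMap pkg=SelfOrganizingMaps

som = SelfOrganizingMap(k=15,                               # 15×15 node grid
                        grid_type=:hexagonal,
                        matching_distance=Distances.cityblock,
                        Nepochs=5)
```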
For gridtype :spherical, these are the latitude and longitude in radians.","category":"page"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/#Fitted-parameters","page":"SelfOrganizingMap","title":"Fitted parameters","text":"","category":"section"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/","page":"SelfOrganizingMap","title":"SelfOrganizingMap","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/","page":"SelfOrganizingMap","title":"SelfOrganizingMap","text":"coords: The coordinates of each of the SOM nodes (points in the domain of the map) with shape (k², 2)\nweights: Array of weight vectors for the SOM nodes (corresponding points in the map's range) of shape (k², input dimension)","category":"page"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/#Report","page":"SelfOrganizingMap","title":"Report","text":"","category":"section"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/","page":"SelfOrganizingMap","title":"SelfOrganizingMap","text":"The fields of report(mach) are:","category":"page"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/","page":"SelfOrganizingMap","title":"SelfOrganizingMap","text":"classes: the index of the winning node for each instance of the training data X interpreted as a class label","category":"page"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/#Examples","page":"SelfOrganizingMap","title":"Examples","text":"","category":"section"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/","page":"SelfOrganizingMap","title":"SelfOrganizingMap","text":"using MLJ\nsom = @load SelfOrganizingMap pkg=SelfOrganizingMaps\nmodel = som()\nX, y = make_regression(50, 3) ## synthetic data\nmach = machine(model, X) |> fit!\nX̃ = transform(mach, X)\n\nrpt = report(mach)\nclasses = rpt.classes","category":"page"},{"location":"models/MultinomialClassifier_MLJLinearModels/#MultinomialClassifier_MLJLinearModels","page":"MultinomialClassifier","title":"MultinomialClassifier","text":"","category":"section"},{"location":"models/MultinomialClassifier_MLJLinearModels/","page":"MultinomialClassifier","title":"MultinomialClassifier","text":"MultinomialClassifier","category":"page"},{"location":"models/MultinomialClassifier_MLJLinearModels/","page":"MultinomialClassifier","title":"MultinomialClassifier","text":"A model type for constructing a multinomial classifier, based on MLJLinearModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/MultinomialClassifier_MLJLinearModels/","page":"MultinomialClassifier","title":"MultinomialClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/MultinomialClassifier_MLJLinearModels/","page":"MultinomialClassifier","title":"MultinomialClassifier","text":"MultinomialClassifier = @load MultinomialClassifier pkg=MLJLinearModels","category":"page"},{"location":"models/MultinomialClassifier_MLJLinearModels/","page":"MultinomialClassifier","title":"MultinomialClassifier","text":"Do model = MultinomialClassifier() to construct an instance with default hyper-parameters.","category":"page"},{"location":"models/MultinomialClassifier_MLJLinearModels/","page":"MultinomialClassifier","title":"MultinomialClassifier","text":"This model coincides with LogisticClassifier, except certain optimizations possible in the special binary case will not be applied. 
Its hyperparameters are identical.","category":"page"},{"location":"models/MultinomialClassifier_MLJLinearModels/#Training-data","page":"MultinomialClassifier","title":"Training data","text":"","category":"section"},{"location":"models/MultinomialClassifier_MLJLinearModels/","page":"MultinomialClassifier","title":"MultinomialClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/MultinomialClassifier_MLJLinearModels/","page":"MultinomialClassifier","title":"MultinomialClassifier","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/MultinomialClassifier_MLJLinearModels/","page":"MultinomialClassifier","title":"MultinomialClassifier","text":"where:","category":"page"},{"location":"models/MultinomialClassifier_MLJLinearModels/","page":"MultinomialClassifier","title":"MultinomialClassifier","text":"X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)\ny is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)","category":"page"},{"location":"models/MultinomialClassifier_MLJLinearModels/","page":"MultinomialClassifier","title":"MultinomialClassifier","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/MultinomialClassifier_MLJLinearModels/#Hyperparameters","page":"MultinomialClassifier","title":"Hyperparameters","text":"","category":"section"},{"location":"models/MultinomialClassifier_MLJLinearModels/","page":"MultinomialClassifier","title":"MultinomialClassifier","text":"lambda::Real: strength of the regularizer if penalty is :l2 or :l1. Strength of the L2 regularizer if penalty is :en. Default: eps()\ngamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0\npenalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2\nfit_intercept::Bool: whether to fit the intercept or not. Default: true\npenalize_intercept::Bool: whether to penalize the intercept. Default: false\nscale_penalty_with_samples::Bool: whether to scale the penalty with the number of samples. Default: true\nsolver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S where S is one of: LBFGS, NewtonCG, ProxGrad; but subject to the following restrictions:\nIf penalty = :l2, ProxGrad is disallowed. Otherwise, ProxGrad is the only option.\nUnless scitype(y) <: Finite{2} (binary target) Newton is disallowed.\nIf solver = nothing (default) then ProxGrad(accel=true) (FISTA) is used, unless gamma = 0, in which case LBFGS() is used.\nSolver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) 
Default: nothing","category":"page"},{"location":"models/MultinomialClassifier_MLJLinearModels/#Example","page":"MultinomialClassifier","title":"Example","text":"","category":"section"},{"location":"models/MultinomialClassifier_MLJLinearModels/","page":"MultinomialClassifier","title":"MultinomialClassifier","text":"using MLJ\nX, y = make_blobs(centers = 3)\nmach = fit!(machine(MultinomialClassifier(), X, y))\npredict(mach, X)\nfitted_params(mach)","category":"page"},{"location":"models/MultinomialClassifier_MLJLinearModels/","page":"MultinomialClassifier","title":"MultinomialClassifier","text":"See also LogisticClassifier.","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/#MultitargetSRRegressor_SymbolicRegression","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"","category":"section"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"MultitargetSRRegressor","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"A model type for constructing a Multi-Target Symbolic Regression via Evolutionary Search, based on SymbolicRegression.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"MultitargetSRRegressor = @load MultitargetSRRegressor pkg=SymbolicRegression","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"Do model = MultitargetSRRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetSRRegressor(binary_operators=...).","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"Multi-target Symbolic Regression regressor (MultitargetSRRegressor) conducts several searches for expressions that predict each target variable from a set of input variables. All data is assumed to be Continuous. The search is performed using an evolutionary algorithm. 
This algorithm is described in the paper https://arxiv.org/abs/2305.01582.","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/#Training-data","page":"MultitargetSRRegressor","title":"Training data","text":"","category":"section"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"OR","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"mach = machine(model, X, y, w)","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"Here:","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"Continuous; check column scitypes with schema(X). Variable names in discovered expressions will be taken from the column names of X, if available. Units in columns of X (use DynamicQuantities for units) will trigger dimensional analysis to be used.","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"y is the target, which can be any table of target variables whose element scitype is Continuous; check the scitype with schema(y). Units in columns of y (use DynamicQuantities for units) will trigger dimensional analysis to be used.\nw is the observation weights which can either be nothing (default) or an AbstractVector whoose element scitype is Count or Continuous. The same weights are used for all targets.","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"Train the machine using fit!(mach), inspect the discovered expressions with report(mach), and predict on new data with predict(mach, Xnew). Note that unlike other regressors, symbolic regression stores a list of lists of trained models. The models chosen from each of these lists is defined by the function selection_method keyword argument, which by default balances accuracy and complexity. You can override this at prediction time by passing a named tuple with keys data and idx.","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/#Hyper-parameters","page":"MultitargetSRRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"binary_operators: Vector of binary operators (functions) to use. Each operator should be defined for two input scalars, and one output scalar. 
All operators need to be defined over the entire real line (excluding infinity - these are stopped before they are input), or return NaN where not defined. For speed, define it so it takes two reals of the same type as input, and outputs the same type. For the SymbolicUtils simplification backend, you will need to define a generic method of the operator so it takes arbitrary types.\nunary_operators: Same, but for unary operators (one input scalar, gives an output scalar).\nconstraints: Array of pairs specifying size constraints for each operator. The constraints for a binary operator should be a 2-tuple (e.g., (-1, -1)) and the constraints for a unary operator should be an Int. A size constraint is a limit to the size of the subtree in each argument of an operator. e.g., [(^)=>(-1, 3)] means that the ^ operator can have arbitrary size (-1) in its left argument, but a maximum size of 3 in its right argument. Default is no constraints.\nbatching: Whether to evolve based on small mini-batches of data, rather than the entire dataset.\nbatch_size: What batch size to use if using batching.\nelementwise_loss: What elementwise loss function to use. Can be one of the following losses, or any other loss of type SupervisedLoss. You can also pass a function that takes a scalar target (left argument), and scalar predicted (right argument), and returns a scalar. This will be averaged over the predicted data. If weights are supplied, your function should take a third argument for the weight scalar. Included losses: Regression: - LPDistLoss{P}(), - L1DistLoss(), - L2DistLoss() (mean square), - LogitDistLoss(), - HuberLoss(d), - L1EpsilonInsLoss(ϵ), - L2EpsilonInsLoss(ϵ), - PeriodicLoss(c), - QuantileLoss(τ), Classification: - ZeroOneLoss(), - PerceptronLoss(), - L1HingeLoss(), - SmoothedL1HingeLoss(γ), - ModifiedHuberLoss(), - L2MarginLoss(), - ExpLoss(), - SigmoidLoss(), - DWDMarginLoss(q).\nloss_function: Alternatively, you may redefine the loss used as any function of tree::AbstractExpressionNode{T}, dataset::Dataset{T}, and options::Options, so long as you output a non-negative scalar of type T. This is useful if you want to use a loss that takes into account derivatives, or correlations across the dataset. This also means you could use a custom evaluation for a particular expression. If you are using batching=true, then your function should accept a fourth argument idx, which is either nothing (indicating that the full dataset should be used), or a vector of indices to use for the batch. For example,\n function my_loss(tree, dataset::Dataset{T,L}, options)::L where {T,L}\n prediction, flag = eval_tree_array(tree, dataset.X, options)\n if !flag\n return L(Inf)\n end\n return sum((prediction .- dataset.y) .^ 2) / dataset.n\n end\nnode_type::Type{N}=Node: The type of node to use for the search. For example, Node or GraphNode.\npopulations: How many populations of equations to use.\npopulation_size: How many equations in each population.\nncycles_per_iteration: How many generations to consider per iteration.\ntournament_selection_n: Number of expressions considered in each tournament.\ntournament_selection_p: The fittest expression in a tournament is to be selected with probability p, the next fittest with probability p*(1-p), and so forth.\ntopn: Number of equations to return to the host process, and to consider for the hall of fame.\ncomplexity_of_operators: What complexity should be assigned to each operator, and the occurrence of a constant or variable. By default, this is 1 for all operators. 
Can be a real number as well, in which case the complexity of an expression will be rounded to the nearest integer. Input this in the form of, e.g., [(^) => 3, sin => 2].\ncomplexity_of_constants: What complexity should be assigned to use of a constant. By default, this is 1.\ncomplexity_of_variables: What complexity should be assigned to each variable. By default, this is 1.\nalpha: The probability of accepting an equation mutation during regularized evolution is given by exp(-delta_loss/(alpha * T)), where T goes from 1 to 0. Thus, alpha=infinite is the same as no annealing.\nmaxsize: Maximum size of equations during the search.\nmaxdepth: Maximum depth of equations during the search, by default this is set equal to the maxsize.\nparsimony: A multiplicative factor for how much complexity is punished.\ndimensional_constraint_penalty: An additive factor if the dimensional constraint is violated.\nuse_frequency: Whether to use a parsimony that adapts to the relative proportion of equations at each complexity; this will ensure that there are a balanced number of equations considered for every complexity.\nuse_frequency_in_tournament: Whether to use the adaptive parsimony described above inside the score, rather than just at the mutation accept/reject stage.\nadaptive_parsimony_scaling: How much to scale the adaptive parsimony term in the loss. Increase this if the search is spending too much time optimizing the most complex equations.\nturbo: Whether to use LoopVectorization.@turbo to evaluate expressions. This can be significantly faster, but is only compatible with certain operators. Experimental!\nbumper: Whether to use Bumper.jl for faster evaluation. Experimental!\nmigration: Whether to migrate equations between processes.\nhof_migration: Whether to migrate equations from the hall of fame to processes.\nfraction_replaced: What fraction of each population to replace with migrated equations at the end of each cycle.\nfraction_replaced_hof: What fraction to replace with hall of fame equations at the end of each cycle.\nshould_simplify: Whether to simplify equations. If you pass a custom objective, this will be set to false.\nshould_optimize_constants: Whether to use an optimization algorithm to periodically optimize constants in equations.\noptimizer_algorithm: Select algorithm to use for optimizing constants. Default is Optim.BFGS(linesearch=LineSearches.BackTracking()).\noptimizer_nrestarts: How many different random starting positions to consider for optimization of constants.\noptimizer_probability: Probability of performing optimization of constants at the end of a given iteration.\noptimizer_iterations: How many optimization iterations to perform. This gets passed to Optim.Options as iterations. The default is 8.\noptimizer_f_calls_limit: How many function calls to allow during optimization. This gets passed to Optim.Options as f_calls_limit. The default is 0 which means no limit.\noptimizer_options: General options for the constant optimization. For details we refer to the documentation on Optim.Options from the Optim.jl package. Options can be provided here as NamedTuple, e.g. (iterations=16,), as a Dict, e.g. 
Dict(:x_tol => 1.0e-32,), or as an Optim.Options instance.\noutput_file: What file to store equations to, as a backup.\nperturbation_factor: When mutating a constant, either multiply or divide by (1+perturbation_factor)^(rand()+1).\nprobability_negate_constant: Probability of negating a constant in the equation when mutating it.\nmutation_weights: Relative probabilities of the mutations. The struct MutationWeights should be passed to these options. See its documentation on MutationWeights for the different weights.\ncrossover_probability: Probability of performing crossover.\nannealing: Whether to use simulated annealing.\nwarmup_maxsize_by: Whether to slowly increase the max size from 5 up to maxsize. If nonzero, specifies the fraction through the search at which the maxsize should be reached.\nverbosity: Whether to print debugging statements or not.\nprint_precision: How many digits to print when printing equations. By default, this is 5.\nsave_to_file: Whether to save equations to a file during the search.\nbin_constraints: See constraints. This is the same, but specified for binary operators only (for example, if you have an operator that is both a binary and unary operator).\nuna_constraints: Likewise, for unary operators.\nseed: What random seed to use. nothing uses no seed.\nprogress: Whether to use a progress bar output (verbosity will have no effect).\nearly_stop_condition: Float - whether to stop early if the mean loss gets below this value. Function - a function taking (loss, complexity) as arguments and returning true or false.\ntimeout_in_seconds: Float64 - the time in seconds after which to exit (as an alternative to the number of iterations).\nmax_evals: Int (or Nothing) - the maximum number of evaluations of expressions to perform.\nskip_mutation_failures: Whether to simply skip over mutations that fail or are rejected, rather than to replace the mutated expression with the original expression and proceed normally.\nnested_constraints: Specifies how many times a combination of operators can be nested. For example, [sin => [cos => 0], cos => [cos => 2]] specifies that cos may never appear within a sin, but sin can be nested with itself an unlimited number of times. The second term specifies that cos can be nested up to 2 times within a cos, so that cos(cos(cos(x))) is allowed (as well as any combination of + or - within it), but cos(cos(cos(cos(x)))) is not allowed. When an operator is not specified, it is assumed that it can be nested an unlimited number of times. This requires that there is no operator which is used both in the unary operators and the binary operators (e.g., - could be both subtract, and negation). For binary operators, both arguments are treated the same way, and the max of each argument is constrained.\ndeterministic: Use a global counter for the birth time, rather than calls to time(). This gives perfect resolution, and is therefore deterministic. However, it is not thread safe, and must be used in serial mode.\ndefine_helper_functions: Whether to define helper functions for constructing and evaluating trees.\nniterations::Int=10: The number of iterations to perform the search. More iterations will improve the results.\nparallelism=:multithreading: What parallelism mode to use. The options are :multithreading, :multiprocessing, and :serial. By default, multithreading will be used. Multithreading uses less memory, but multiprocessing can handle multi-node compute. If using :multithreading mode, the number of threads available to julia are used. 
If using :multiprocessing, numprocs processes will be created dynamically if procs is unset. If you have already allocated processes, pass them to the procs argument and they will be used. You may also pass a string instead of a symbol, like \"multithreading\".\nnumprocs::Union{Int, Nothing}=nothing: The number of processes to use, if you want equation_search to set this up automatically. By default this will be 4, but can be any number (you should pick a number <= the number of cores available).\nprocs::Union{Vector{Int}, Nothing}=nothing: If you have set up a distributed run manually with procs = addprocs() and @everywhere, pass the procs to this keyword argument.\naddprocs_function::Union{Function, Nothing}=nothing: If using multiprocessing (parallelism=:multiprocessing), and you are not passing procs manually, then they will be allocated dynamically using addprocs. However, you may also pass a custom function to use instead of addprocs. This function should take a single positional argument, which is the number of processes to use, as well as the lazy keyword argument. For example, if set up on a slurm cluster, you could pass addprocs_function = addprocs_slurm, which will set up slurm processes.\nheap_size_hint_in_bytes::Union{Int,Nothing}=nothing: On Julia 1.9+, you may set the --heap-size-hint flag on Julia processes, recommending garbage collection once a process is close to the recommended size. This is important for long-running distributed jobs where each process has an independent memory, and can help avoid out-of-memory errors. By default, this is set to Sys.free_memory() / numprocs.\nruntests::Bool=true: Whether to run (quick) tests before starting the search, to see if there will be any problems during the equation search related to the host environment.\nloss_type::Type=Nothing: If you would like to use a different type for the loss than for the data you passed, specify the type here. Note that if you pass complex data ::Complex{L}, then the loss type will automatically be set to L.\nselection_method::Function: Function used to select an expression from the Pareto frontier for use in predict. See SymbolicRegression.MLJInterfaceModule.choose_best for an example. This function should return a single integer specifying the index of the expression to use. By default, this maximizes the score (a pound-for-pound rating) of expressions reaching the threshold of 1.5x the minimum loss. To override this at prediction time, you can pass a named tuple with keys data and idx to predict. See the Operations section for details.\ndimensions_type::AbstractDimensions: The type of dimensions to use when storing the units of the data. By default this is DynamicQuantities.SymbolicDimensions.","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/#Operations","page":"MultitargetSRRegressor","title":"Operations","text":"","category":"section"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"predict(mach, Xnew): Return predictions of the target given features Xnew, which should have the same scitype as X above. The expression used for prediction is defined by the selection_method function, which can be seen by viewing report(mach).best_idx.\npredict(mach, (data=Xnew, idx=i)): Return predictions of the target given features Xnew, which should have the same scitype as X above. 
By passing a named tuple with keys data and idx, you are able to specify the equation you wish to evaluate in idx.","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/#Fitted-parameters","page":"MultitargetSRRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"best_idx::Vector{Int}: The index of the best expression in each Pareto frontier, as determined by the selection_method function. Override in predict by passing a named tuple with keys data and idx.\nequations::Vector{Vector{Node{T}}}: The expressions discovered by the search, represented in a dominating Pareto frontier (i.e., the best expressions found for each complexity). The outer vector is indexed by target variable, and the inner vector is ordered by increasing complexity. T is equal to the element type of the passed data.\nequation_strings::Vector{Vector{String}}: The expressions discovered by the search, represented as strings for easy inspection.","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/#Report","page":"MultitargetSRRegressor","title":"Report","text":"","category":"section"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"The fields of report(mach) are:","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"best_idx::Vector{Int}: The index of the best expression in each Pareto frontier, as determined by the selection_method function. Override in predict by passing a named tuple with keys data and idx.\nequations::Vector{Vector{Node{T}}}: The expressions discovered by the search, represented in a dominating Pareto frontier (i.e., the best expressions found for each complexity). The outer vector is indexed by target variable, and the inner vector is ordered by increasing complexity.\nequation_strings::Vector{Vector{String}}: The expressions discovered by the search, represented as strings for easy inspection.\ncomplexities::Vector{Vector{Int}}: The complexity of each expression in each Pareto frontier.\nlosses::Vector{Vector{L}}: The loss of each expression in each Pareto frontier, according to the loss function specified in the model. The type L is the loss type, which is usually the same as the element type of data passed (i.e., T), but can differ if complex data types are passed.\nscores::Vector{Vector{L}}: A metric which considers both the complexity and loss of an expression, equal to the change in the log-loss divided by the change in complexity, relative to the previous expression along the Pareto frontier. 
A larger score aims to indicate an expression is more likely to be the true expression generating the data, but this is very problem-dependent and generally several other factors should be considered.","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/#Examples","page":"MultitargetSRRegressor","title":"Examples","text":"","category":"section"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"using MLJ\nMultitargetSRRegressor = @load MultitargetSRRegressor pkg=SymbolicRegression\nX = (a=rand(100), b=rand(100), c=rand(100))\nY = (y1=(@. cos(X.c) * 2.1 - 0.9), y2=(@. X.a * X.b + X.c))\nmodel = MultitargetSRRegressor(binary_operators=[+, -, *], unary_operators=[exp], niterations=100)\nmach = machine(model, X, Y)\nfit!(mach)\ny_hat = predict(mach, X)\n## View the equations used:\nr = report(mach)\nfor (output_index, (eq, i)) in enumerate(zip(r.equation_strings, r.best_idx))\n println(\"Equation used for \", output_index, \": \", eq[i])\nend","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"See also SRRegressor.","category":"page"},{"location":"models/PerceptronClassifier_MLJScikitLearnInterface/#PerceptronClassifier_MLJScikitLearnInterface","page":"PerceptronClassifier","title":"PerceptronClassifier","text":"","category":"section"},{"location":"models/PerceptronClassifier_MLJScikitLearnInterface/","page":"PerceptronClassifier","title":"PerceptronClassifier","text":"PerceptronClassifier","category":"page"},{"location":"models/PerceptronClassifier_MLJScikitLearnInterface/","page":"PerceptronClassifier","title":"PerceptronClassifier","text":"A model type for constructing a perceptron classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/PerceptronClassifier_MLJScikitLearnInterface/","page":"PerceptronClassifier","title":"PerceptronClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/PerceptronClassifier_MLJScikitLearnInterface/","page":"PerceptronClassifier","title":"PerceptronClassifier","text":"PerceptronClassifier = @load PerceptronClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/PerceptronClassifier_MLJScikitLearnInterface/","page":"PerceptronClassifier","title":"PerceptronClassifier","text":"Do model = PerceptronClassifier() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in PerceptronClassifier(penalty=...).","category":"page"},{"location":"models/PerceptronClassifier_MLJScikitLearnInterface/#Hyper-parameters","page":"PerceptronClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/PerceptronClassifier_MLJScikitLearnInterface/","page":"PerceptronClassifier","title":"PerceptronClassifier","text":"penalty = nothing\nalpha = 0.0001\nfit_intercept = true\nmax_iter = 1000\ntol = 0.001\nshuffle = true\nverbose = 0\neta0 = 1.0\nn_jobs = nothing\nrandom_state = 0\nearly_stopping = false\nvalidation_fraction = 0.1\nn_iter_no_change = 5\nclass_weight = nothing\nwarm_start = false","category":"page"},{"location":"models/KNeighborsRegressor_MLJScikitLearnInterface/#KNeighborsRegressor_MLJScikitLearnInterface","page":"KNeighborsRegressor","title":"KNeighborsRegressor","text":"","category":"section"},{"location":"models/KNeighborsRegressor_MLJScikitLearnInterface/","page":"KNeighborsRegressor","title":"KNeighborsRegressor","text":"KNeighborsRegressor","category":"page"},{"location":"models/KNeighborsRegressor_MLJScikitLearnInterface/","page":"KNeighborsRegressor","title":"KNeighborsRegressor","text":"A model type for constructing a K-nearest neighbors regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/KNeighborsRegressor_MLJScikitLearnInterface/","page":"KNeighborsRegressor","title":"KNeighborsRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/KNeighborsRegressor_MLJScikitLearnInterface/","page":"KNeighborsRegressor","title":"KNeighborsRegressor","text":"KNeighborsRegressor = @load KNeighborsRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/KNeighborsRegressor_MLJScikitLearnInterface/","page":"KNeighborsRegressor","title":"KNeighborsRegressor","text":"Do model = KNeighborsRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in KNeighborsRegressor(n_neighbors=...).","category":"page"},{"location":"models/KNeighborsRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"KNeighborsRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/KNeighborsRegressor_MLJScikitLearnInterface/","page":"KNeighborsRegressor","title":"KNeighborsRegressor","text":"n_neighbors = 5\nweights = uniform\nalgorithm = auto\nleaf_size = 30\np = 2\nmetric = minkowski\nmetric_params = nothing\nn_jobs = nothing","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/#NeuralNetworkRegressor_MLJFlux","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"","category":"section"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"NeuralNetworkRegressor","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"A model type for constructing a neural network regressor, based on MLJFlux.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"NeuralNetworkRegressor = @load NeuralNetworkRegressor pkg=MLJFlux","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"Do model = NeuralNetworkRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in NeuralNetworkRegressor(builder=...).","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"NeuralNetworkRegressor is for training a data-dependent Flux.jl neural network to predict a Continuous target, given a table of Continuous features. Users provide a recipe for constructing the network, based on properties of the data that is encountered, by specifying an appropriate builder. See MLJFlux documentation for more on builders.","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/#Training-data","page":"NeuralNetworkRegressor","title":"Training data","text":"","category":"section"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"Here:","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"X is either a Matrix or any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). 
If X is a Matrix, it is assumed to have columns corresponding to features and rows corresponding to observations.\ny is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"Train the machine with fit!(mach, rows=...).","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/#Hyper-parameters","page":"NeuralNetworkRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"builder=MLJFlux.Linear(σ=Flux.relu): An MLJFlux builder that constructs a neural network. Possible builders include: MLJFlux.Linear, MLJFlux.Short, and MLJFlux.MLP. See MLJFlux documentation for more on builders, and the example below for using the @builder convenience macro.\noptimiser=Flux.Adam(): A Flux.Optimise optimiser. The optimiser performs the updating of the weights of the network. For further reference, see the Flux optimiser documentation. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.\nloss=Flux.mse: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. For a regression task, natural loss functions are:\nFlux.mse\nFlux.mae\nFlux.msle\nFlux.huber_loss\nCurrently MLJ measures are not supported as loss functions here.\nepochs::Int=10: The duration of training, in epochs. Typically, one epoch represents one pass through the entire training dataset.\nbatch_size::Int=1: The batch size to be used for training, representing the number of samples per update of the network weights. Typically, batch size is between 8 and 512. Increasing batch size may accelerate training if acceleration=CUDALibs() and a GPU is available.\nlambda::Float64=0: The strength of the weight regularization penalty. Can be any value in the range [0, ∞).\nalpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.\nrng::Union{AbstractRNG, Int64}: The random number generator or seed used during training.\noptimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.\nacceleration::AbstractResource=CPU1(): Defines on what hardware training is done. For training on a GPU, use CUDALibs().
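As a quick illustration of the hyper-parameters above, here is a minimal construction sketch (the builder and the hyper-parameter values are illustrative only, not recommendations):

using MLJ
import MLJFlux
import Flux
NeuralNetworkRegressor = @load NeuralNetworkRegressor pkg=MLJFlux
model = NeuralNetworkRegressor(
    builder=MLJFlux.MLP(hidden=(32, 32)),  ## two hidden layers of width 32
    optimiser=Flux.Adam(0.001),            ## learning rate of 1e-3
    batch_size=32,
    lambda=0.01,                           ## strength of weight regularization
    alpha=0.0,                             ## pure L2 regularization
    epochs=50,
)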
","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/#Operations","page":"NeuralNetworkRegressor","title":"Operations","text":"","category":"section"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"predict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above.","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/#Fitted-parameters","page":"NeuralNetworkRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"chain: The trained \"chain\" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network.","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/#Report","page":"NeuralNetworkRegressor","title":"Report","text":"","category":"section"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"The fields of report(mach) are:","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"training_losses: A vector of training losses (penalized if lambda != 0) in historical order, of length epochs + 1. The first element is the pre-training loss.","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/#Examples","page":"NeuralNetworkRegressor","title":"Examples","text":"","category":"section"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"In this example we build a regression model for the Boston house price dataset.","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"using MLJ\nimport MLJFlux\nusing Flux","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"First, we load in the data: The :MEDV column becomes the target vector y, and all remaining columns go into a table X, with the exception of :CHAS:","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"data = OpenML.load(531); ## Loads from https://www.openml.org/d/531\ny, X = unpack(data, ==(:MEDV), !=(:CHAS); rng=123);\n\nscitype(y)\nschema(X)","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"Since MLJFlux models do not handle ordered factors, we'll treat :RAD as Continuous:","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"X = coerce(X, :RAD=>Continuous)","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"Splitting off a test 
set:","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"(X, Xtest), (y, ytest) = partition((X, y), 0.7, multi=true);","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"Next, we can define a builder, making use of a convenience macro to do so. In the following @builder call, n_in is a proxy for the number of input features (which will be known at fit! time) and rng is a proxy for an RNG (which will be passed from the rng field of model defined below). We also have the parameter n_out which is the number of output features. As we are doing single target regression, the value passed will always be 1, but the builder we define will also work for MultitargetNeuralNetworkRegressor.","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"builder = MLJFlux.@builder begin\n init=Flux.glorot_uniform(rng)\n Chain(\n Dense(n_in, 64, relu, init=init),\n Dense(64, 32, relu, init=init),\n Dense(32, n_out, init=init),\n )\nend","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"Instantiating a model:","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"NeuralNetworkRegressor = @load NeuralNetworkRegressor pkg=MLJFlux\nmodel = NeuralNetworkRegressor(\n builder=builder,\n rng=123,\n epochs=20\n)","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"We arrange for standardization of the target by wrapping our model in TransformedTargetModel, and standardization of the features by inserting the wrapped model in a pipeline:","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"pipe = Standardizer |> TransformedTargetModel(model, target=Standardizer)","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"If we fit with a high verbosity (>1), we will see the losses during training. 
We can also see the losses in the output of report(mach).","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"mach = machine(pipe, X, y)\nfit!(mach, verbosity=2)\n\n## first element initial loss, 2:end per epoch training losses\nreport(mach).transformed_target_model_deterministic.model.training_losses","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/#Experimenting-with-learning-rate","page":"NeuralNetworkRegressor","title":"Experimenting with learning rate","text":"","category":"section"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"We can visually compare how the learning rate affects the predictions:","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"using Plots\n\nrates = [5e-5, 1e-4, 0.005, 0.001, 0.05]\nplt=plot()\n\nforeach(rates) do η\n pipe.transformed_target_model_deterministic.model.optimiser.eta = η\n fit!(mach, force=true, verbosity=0)\n losses =\n report(mach).transformed_target_model_deterministic.model.training_losses[3:end]\n plot!(1:length(losses), losses, label=η)\nend\n\nplt\n\npipe.transformed_target_model_deterministic.model.optimiser.eta = 0.0001","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"With the learning rate fixed, we compute a CV estimate of the performance (using all data bound to mach) and compare this with performance on the test set:","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"## CV estimate, based on `(X, y)`:\nevaluate!(mach, resampling=CV(nfolds=5), measure=l2)\n\n## loss for `(Xtest, ytest)`:\nfit!(mach) ## train on `(X, y)`\nyhat = predict(mach, Xtest)\nl2(yhat, ytest) |> mean","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"These losses, for the pipeline model, refer to the target on the original, unstandardized, scale.","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"For implementing stopping criteria and other iteration controls, refer to examples linked from the MLJFlux documentation.","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"See also MultitargetNeuralNetworkRegressor","category":"page"},{"location":"models/PassiveAggressiveRegressor_MLJScikitLearnInterface/#PassiveAggressiveRegressor_MLJScikitLearnInterface","page":"PassiveAggressiveRegressor","title":"PassiveAggressiveRegressor","text":"","category":"section"},{"location":"models/PassiveAggressiveRegressor_MLJScikitLearnInterface/","page":"PassiveAggressiveRegressor","title":"PassiveAggressiveRegressor","text":"PassiveAggressiveRegressor","category":"page"},{"location":"models/PassiveAggressiveRegressor_MLJScikitLearnInterface/","page":"PassiveAggressiveRegressor","title":"PassiveAggressiveRegressor","text":"A model type for constructing a passive aggressive regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model 
interface.","category":"page"},{"location":"models/PassiveAggressiveRegressor_MLJScikitLearnInterface/","page":"PassiveAggressiveRegressor","title":"PassiveAggressiveRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/PassiveAggressiveRegressor_MLJScikitLearnInterface/","page":"PassiveAggressiveRegressor","title":"PassiveAggressiveRegressor","text":"PassiveAggressiveRegressor = @load PassiveAggressiveRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/PassiveAggressiveRegressor_MLJScikitLearnInterface/","page":"PassiveAggressiveRegressor","title":"PassiveAggressiveRegressor","text":"Do model = PassiveAggressiveRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in PassiveAggressiveRegressor(C=...).","category":"page"},{"location":"models/PassiveAggressiveRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"PassiveAggressiveRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/PassiveAggressiveRegressor_MLJScikitLearnInterface/","page":"PassiveAggressiveRegressor","title":"PassiveAggressiveRegressor","text":"C = 1.0\nfit_intercept = true\nmax_iter = 1000\ntol = 0.0001\nearly_stopping = false\nvalidation_fraction = 0.1\nn_iter_no_change = 5\nshuffle = true\nverbose = 0\nloss = epsilon_insensitive\nepsilon = 0.1\nrandom_state = nothing\nwarm_start = false\naverage = false","category":"page"},{"location":"models/LOCIDetector_OutlierDetectionPython/#LOCIDetector_OutlierDetectionPython","page":"LOCIDetector","title":"LOCIDetector","text":"","category":"section"},{"location":"models/LOCIDetector_OutlierDetectionPython/","page":"LOCIDetector","title":"LOCIDetector","text":"LOCIDetector(alpha = 0.5,\n k = 3)","category":"page"},{"location":"models/LOCIDetector_OutlierDetectionPython/","page":"LOCIDetector","title":"LOCIDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.loci","category":"page"},{"location":"api/#Index-of-Methods","page":"Index of Methods","title":"Index of Methods","text":"","category":"section"},{"location":"api/","page":"Index of Methods","title":"Index of Methods","text":"","category":"page"},{"location":"models/OCSVMDetector_OutlierDetectionPython/#OCSVMDetector_OutlierDetectionPython","page":"OCSVMDetector","title":"OCSVMDetector","text":"","category":"section"},{"location":"models/OCSVMDetector_OutlierDetectionPython/","page":"OCSVMDetector","title":"OCSVMDetector","text":"OCSVMDetector(kernel = \"rbf\",\n degree = 3,\n gamma = \"auto\",\n coef0 = 0.0,\n tol = 0.001,\n nu = 0.5,\n shrinking = true,\n cache_size = 200,\n verbose = false,\n max_iter = -1)","category":"page"},{"location":"models/OCSVMDetector_OutlierDetectionPython/","page":"OCSVMDetector","title":"OCSVMDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.ocsvm","category":"page"},{"location":"models/ExtraTreesRegressor_MLJScikitLearnInterface/#ExtraTreesRegressor_MLJScikitLearnInterface","page":"ExtraTreesRegressor","title":"ExtraTreesRegressor","text":"","category":"section"},{"location":"models/ExtraTreesRegressor_MLJScikitLearnInterface/","page":"ExtraTreesRegressor","title":"ExtraTreesRegressor","text":"ExtraTreesRegressor","category":"page"},{"location":"models/ExtraTreesRegressor_MLJScikitLearnInterface/","page":"ExtraTreesRegressor","title":"ExtraTreesRegressor","text":"A model type for constructing a extra trees regressor, based 
on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/ExtraTreesRegressor_MLJScikitLearnInterface/","page":"ExtraTreesRegressor","title":"ExtraTreesRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/ExtraTreesRegressor_MLJScikitLearnInterface/","page":"ExtraTreesRegressor","title":"ExtraTreesRegressor","text":"ExtraTreesRegressor = @load ExtraTreesRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/ExtraTreesRegressor_MLJScikitLearnInterface/","page":"ExtraTreesRegressor","title":"ExtraTreesRegressor","text":"Do model = ExtraTreesRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ExtraTreesRegressor(n_estimators=...).","category":"page"},{"location":"models/ExtraTreesRegressor_MLJScikitLearnInterface/","page":"ExtraTreesRegressor","title":"ExtraTreesRegressor","text":"Extra trees regressor fits a number of randomized decision trees on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting.","category":"page"},{"location":"models/LOFDetector_OutlierDetectionPython/#LOFDetector_OutlierDetectionPython","page":"LOFDetector","title":"LOFDetector","text":"","category":"section"},{"location":"models/LOFDetector_OutlierDetectionPython/","page":"LOFDetector","title":"LOFDetector","text":"LOFDetector(n_neighbors = 5,\n algorithm = \"auto\",\n leaf_size = 30,\n metric = \"minkowski\",\n p = 2,\n metric_params = nothing,\n n_jobs = 1,\n novelty = true)","category":"page"},{"location":"models/LOFDetector_OutlierDetectionPython/","page":"LOFDetector","title":"LOFDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.lof","category":"page"},{"location":"models/PerceptronClassifier_BetaML/#PerceptronClassifier_BetaML","page":"PerceptronClassifier","title":"PerceptronClassifier","text":"","category":"section"},{"location":"models/PerceptronClassifier_BetaML/","page":"PerceptronClassifier","title":"PerceptronClassifier","text":"mutable struct PerceptronClassifier <: MLJModelInterface.Probabilistic","category":"page"},{"location":"models/PerceptronClassifier_BetaML/","page":"PerceptronClassifier","title":"PerceptronClassifier","text":"The classical perceptron algorithm using one-vs-all for multiclass, from the Beta Machine Learning Toolkit (BetaML).","category":"page"},{"location":"models/PerceptronClassifier_BetaML/#Hyperparameters:","page":"PerceptronClassifier","title":"Hyperparameters:","text":"","category":"section"},{"location":"models/PerceptronClassifier_BetaML/","page":"PerceptronClassifier","title":"PerceptronClassifier","text":"initial_coefficients::Union{Nothing, Matrix{Float64}}: N-classes by D-dimensions matrix of initial linear coefficients [def: nothing, i.e. zeros]\ninitial_constant::Union{Nothing, Vector{Float64}}: N-classes vector of initial constant terms [def: nothing, i.e. zeros]\nepochs::Int64: Maximum number of epochs, i.e. 
passages through the whole training sample [def: 1000]\nshuffle::Bool: Whether to randomly shuffle the data at each iteration (epoch) [def: true]\nforce_origin::Bool: Whether to force the parameter associated with the constant term to remain zero [def: false]\nreturn_mean_hyperplane::Bool: Whether to return the average hyperplane coefficients instead of the final ones [def: false]\nrng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]","category":"page"},{"location":"models/PerceptronClassifier_BetaML/#Example:","page":"PerceptronClassifier","title":"Example:","text":"","category":"section"},{"location":"models/PerceptronClassifier_BetaML/","page":"PerceptronClassifier","title":"PerceptronClassifier","text":"julia> using MLJ\n\njulia> X, y = @load_iris;\n\njulia> modelType = @load PerceptronClassifier pkg = \"BetaML\"\n[ Info: For silent loading, specify `verbosity=0`. \nimport BetaML ✔\nBetaML.Perceptron.PerceptronClassifier\n\njulia> model = modelType()\nPerceptronClassifier(\n initial_coefficients = nothing, \n initial_constant = nothing, \n epochs = 1000, \n shuffle = true, \n force_origin = false, \n return_mean_hyperplane = false, \n rng = Random._GLOBAL_RNG())\n\njulia> mach = machine(model, X, y);\n\njulia> fit!(mach);\n[ Info: Training machine(PerceptronClassifier(initial_coefficients = nothing, …), …).\n*** Avg. error after epoch 2 : 0.0 (all elements of the set has been correctly classified)\njulia> est_classes = predict(mach, X)\n150-element CategoricalDistributions.UnivariateFiniteVector{Multiclass{3}, String, UInt8, Float64}:\n UnivariateFinite{Multiclass{3}}(setosa=>1.0, versicolor=>2.53e-34, virginica=>0.0)\n UnivariateFinite{Multiclass{3}}(setosa=>1.0, versicolor=>1.27e-18, virginica=>1.86e-310)\n ⋮\n UnivariateFinite{Multiclass{3}}(setosa=>2.77e-57, versicolor=>1.1099999999999999e-82, virginica=>1.0)\n UnivariateFinite{Multiclass{3}}(setosa=>3.09e-22, versicolor=>4.03e-25, virginica=>1.0)","category":"page"},{"location":"models/ABODDetector_OutlierDetectionPython/#ABODDetector_OutlierDetectionPython","page":"ABODDetector","title":"ABODDetector","text":"","category":"section"},{"location":"models/ABODDetector_OutlierDetectionPython/","page":"ABODDetector","title":"ABODDetector","text":"ABODDetector(n_neighbors = 5,\n method = \"fast\")","category":"page"},{"location":"models/ABODDetector_OutlierDetectionPython/","page":"ABODDetector","title":"ABODDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.abod","category":"page"},{"location":"preparing_data/#Preparing-Data","page":"Preparing Data","title":"Preparing Data","text":"","category":"section"},{"location":"preparing_data/#Splitting-data","page":"Preparing Data","title":"Splitting data","text":"","category":"section"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"MLJ has two tools for splitting data. To split data vertically (that is, to split by observations) use partition. 
This is commonly applied to a vector of observation indices, but can also be applied to datasets themselves, provided they are vectors, matrices or tables.","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"To split tabular data horizontally (i.e., break up a table based on feature names) use unpack.","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"MLJBase.partition\nMLJBase.unpack","category":"page"},{"location":"preparing_data/#MLJBase.partition","page":"Preparing Data","title":"MLJBase.partition","text":"partition(X, fractions...;\n shuffle=nothing,\n rng=Random.GLOBAL_RNG,\n stratify=nothing,\n multi=false)\n\nSplits the vector, matrix or table X into a tuple of objects of the same type, whose vertical concatenation is X. The number of rows in each component of the return value is determined by the corresponding fractions of length(nrows(X)), where valid fractions are floats between 0 and 1 whose sum is less than one. The last fraction is not provided, as it is inferred from the preceding ones.\n\nFor synchronized partitioning of multiple objects, use the multi=true option.\n\njulia> partition(1:1000, 0.8)\n([1,...,800], [801,...,1000])\n\njulia> partition(1:1000, 0.2, 0.7)\n([1,...,200], [201,...,900], [901,...,1000])\n\njulia> partition(reshape(1:10, 5, 2), 0.2, 0.4)\n([1 6], [2 7; 3 8], [4 9; 5 10])\n\njulia> X, y = make_blobs() # a table and vector\njulia> Xtrain, Xtest = partition(X, 0.8, stratify=y)\n\nHere's an example of synchronized partitioning of multiple objects:\n\njulia> (Xtrain, Xtest), (ytrain, ytest) = partition((X, y), 0.8, rng=123, multi=true)\n\nKeywords\n\nshuffle=nothing: if set to true, shuffles the rows before taking fractions.\nrng=Random.GLOBAL_RNG: specifies the random number generator to be used, can be an integer seed. If specified, and shuffle === nothing is interpreted as true.\nstratify=nothing: if a vector is specified, the partition will match the stratification of the given vector. In that case, shuffle cannot be false.\nmulti=false: if true then X is expected to be a tuple of objects sharing a common length, which are each partitioned separately using the same specified fractions and the same row shuffling. Returns a tuple of partitions (a tuple of tuples).\n\n\n\n\n\n","category":"function"},{"location":"preparing_data/#MLJBase.unpack","page":"Preparing Data","title":"MLJBase.unpack","text":"unpack(table, f1, f2, ... fk;\n wrap_singles=false,\n shuffle=false,\n rng::Union{AbstractRNG,Int,Nothing}=nothing,\n coerce_options...)\n\nHorizontally split any Tables.jl compatible table into smaller tables or vectors by making column selections determined by the predicates f1, f2, ..., fk. Selection from the column names is without replacement. 
A predicate is any object f such that f(name) is true or false for each column name::Symbol of table.\n\nReturns a tuple of tables/vectors with length one greater than the number of supplied predicates, with the last component including all previously unselected columns.\n\njulia> table = DataFrame(x=[1,2], y=['a', 'b'], z=[10.0, 20.0], w=[\"A\", \"B\"])\n2×4 DataFrame\n Row │ x y z w\n │ Int64 Char Float64 String\n─────┼──────────────────────────────\n 1 │ 1 a 10.0 A\n 2 │ 2 b 20.0 B\n\njulia> Z, XY, W = unpack(table, ==(:z), !=(:w));\njulia> Z\n2-element Vector{Float64}:\n 10.0\n 20.0\n\njulia> XY\n2×2 DataFrame\n Row │ x y\n │ Int64 Char\n─────┼─────────────\n 1 │ 1 a\n 2 │ 2 b\n\njulia> W # the column(s) left over\n2-element Vector{String}:\n \"A\"\n \"B\"\n\nWhenever a returned table contains a single column, it is converted to a vector unless wrap_singles=true.\n\nIf coerce_options are specified then table is first replaced with coerce(table, coerce_options). See ScientificTypes.coerce for details.\n\nIf shuffle=true then the rows of table are first shuffled, using the global RNG, unless rng is specified; if rng is an integer, it specifies the seed of an automatically generated Mersenne twister. If rng is specified then shuffle=true is implicit.\n\n\n\n\n\n","category":"function"},{"location":"preparing_data/#Bridging-the-gap-between-data-type-and-model-requirements","page":"Preparing Data","title":"Bridging the gap between data type and model requirements","text":"","category":"section"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"As outlined in Getting Started, it is important that the scientific type of data matches the requirements of the model of interest. For example, while the majority of supervised learning models require input features to be Continuous, newcomers to MLJ are sometimes surprised at the disappointing results of model queries such as this one:","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"using MLJ","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"X = (height = [185, 153, 163, 114, 180],\n time = [2.3, 4.5, 4.2, 1.8, 7.1],\n mark = [\"D\", \"A\", \"C\", \"B\", \"A\"],\n admitted = [\"yes\", \"no\", missing, \"yes\"]);\ny = [12.4, 12.5, 12.0, 31.9, 43.0]\nmodels(matching(X, y))","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"Or are unsure about the source of the following warning:","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"julia> Tree = @load DecisionTreeRegressor pkg=DecisionTree verbosity=0;\njulia> tree = Tree();\n\njulia> machine(tree, X, y)\n┌ Warning: The scitype of `X`, in `machine(model, X, ...)` is incompatible with `model=DecisionTreeRegressor @378`:\n│ scitype(X) = Table{Union{AbstractVector{Continuous}, AbstractVector{Count}, AbstractVector{Textual}, AbstractVector{Union{Missing, Textual}}}}\n│ input_scitype(model) = Table{var\"#s46\"} where var\"#s46\"<:Union{AbstractVector{var\"#s9\"} where var\"#s9\"<:Continuous, AbstractVector{var\"#s9\"} where var\"#s9\"<:Count, AbstractVector{var\"#s9\"} where var\"#s9\"<:OrderedFactor}.\n└ @ MLJBase ~/Dropbox/Julia7/MLJ/MLJBase/src/machines.jl:103\nMachine{DecisionTreeRegressor,…} @198 trained 0 times; caches data\n args:\n 1: Source @628 ⏎ `Table{Union{AbstractVector{Continuous}, AbstractVector{Count}, AbstractVector{Textual}, 
AbstractVector{Union{Missing, Textual}}}}`\n 2: Source @544 ⏎ `AbstractVector{Continuous}`","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"The meaning of the warning is:","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"The input X is a table with column scitypes Continuous, Count, Textual, and Union{Missing, Textual}, which you can also see by inspecting the schema:\nschema(X)\nThe model requires a table whose column element scitypes subtype Continuous, an incompatibility.","category":"page"},{"location":"preparing_data/#Common-data-preprocessing-workflows","page":"Preparing Data","title":"Common data preprocessing workflows","text":"","category":"section"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"There are two tools for addressing data-model type mismatches like the above, with links to further documentation given below:","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"Scientific type coercion: We coerce machine types to obtain the intended scientific interpretation. If height in the above example is intended to be Continuous, mark is supposed to be OrderedFactor, and admitted a (binary) Multiclass, then we can do","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"X_coerced = coerce(X, :height=>Continuous, :mark=>OrderedFactor, :admitted=>Multiclass);\nschema(X_coerced)","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"Data transformations: We carry out conventional data transformations, such as missing value imputation and feature encoding:","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"imputer = FillImputer()\nmach = machine(imputer, X_coerced) |> fit!\nX_imputed = transform(mach, X_coerced);\nschema(X_imputed)","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"encoder = ContinuousEncoder()\nmach = machine(encoder, X_imputed) |> fit!\nX_encoded = transform(mach, X_imputed)","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"schema(X_encoded)","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"Such transformations can also be combined in a pipeline; see Linear Pipelines.","category":"page"},{"location":"preparing_data/#Scientific-type-coercion","page":"Preparing Data","title":"Scientific type coercion","text":"","category":"section"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"Scientific type coercion is documented in detail at ScientificTypesBase.jl. See also the tutorial at this MLJ Workshop (specifically, here) and this Data Science in Julia tutorial.","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"Also relevant is the section, Working with Categorical Data.","category":"page"},{"location":"preparing_data/#Data-transformation","page":"Preparing Data","title":"Data transformation","text":"","category":"section"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"MLJ's Built-in transformers are documented at Transformers and Other Unsupervised Models. The most relevant in the present context are: ContinuousEncoder, OneHotEncoder, FeatureSelector and FillImputer.
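For example, continuing with X_coerced from above, the imputation and encoding steps can be chained into a single linear pipeline (a minimal sketch; see Linear Pipelines for details):

pipe = FillImputer() |> ContinuousEncoder()
mach = machine(pipe, X_coerced) |> fit!
X_ready = transform(mach, X_coerced)  ## imputed and encoded in one step
schema(X_ready)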
A Gaussian mixture model imputer is provided by BetaML, which can be loaded with","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"MissingImputator = @load MissingImputator pkg=BetaML","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"This MLJ Workshop, and the \"End-to-end examples\" in Data Science in Julia tutorials give further illustrations of data preprocessing in MLJ.","category":"page"},{"location":"models/AgglomerativeClustering_MLJScikitLearnInterface/#AgglomerativeClustering_MLJScikitLearnInterface","page":"AgglomerativeClustering","title":"AgglomerativeClustering","text":"","category":"section"},{"location":"models/AgglomerativeClustering_MLJScikitLearnInterface/","page":"AgglomerativeClustering","title":"AgglomerativeClustering","text":"AgglomerativeClustering","category":"page"},{"location":"models/AgglomerativeClustering_MLJScikitLearnInterface/","page":"AgglomerativeClustering","title":"AgglomerativeClustering","text":"A model type for constructing an agglomerative clustering model, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/AgglomerativeClustering_MLJScikitLearnInterface/","page":"AgglomerativeClustering","title":"AgglomerativeClustering","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/AgglomerativeClustering_MLJScikitLearnInterface/","page":"AgglomerativeClustering","title":"AgglomerativeClustering","text":"AgglomerativeClustering = @load AgglomerativeClustering pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/AgglomerativeClustering_MLJScikitLearnInterface/","page":"AgglomerativeClustering","title":"AgglomerativeClustering","text":"Do model = AgglomerativeClustering() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in AgglomerativeClustering(n_clusters=...).","category":"page"},{"location":"models/AgglomerativeClustering_MLJScikitLearnInterface/","page":"AgglomerativeClustering","title":"AgglomerativeClustering","text":"Recursively merges the pair of clusters that minimally increases a given linkage distance. Note: there is no predict or transform. Instead, inspect the fitted_params.","category":"page"},{"location":"","page":"Home","title":"Home","text":"\n\n
MLJ
\n\nA Machine Learning Framework for Julia","category":"page"},{"location":"","page":"Home","title":"Home","text":"To support MLJ development, please cite these works or star the repo:","category":"page"},{"location":"","page":"Home","title":"Home","text":"(Image: DOI) (Image: arXiv)","category":"page"},{"location":"","page":"Home","title":"Home","text":"\n Star","category":"page"},{"location":"#[Model-Browser](@ref)","page":"Home","title":"Model Browser","text":"","category":"section"},{"location":"#Reference-Manual","page":"Home","title":"Reference Manual","text":"","category":"section"},{"location":"#Basics","page":"Home","title":"Basics","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Getting Started | Working with Categorical Data | Common MLJ Workflows | Machines | MLJ Cheatsheet ","category":"page"},{"location":"#Data","page":"Home","title":"Data","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Working with Categorical Data | Preparing Data | Generating Synthetic Data | OpenML Integration | Correcting Class Imbalance","category":"page"},{"location":"#Models","page":"Home","title":"Models","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Model Search | Loading Model Code | Transformers and Other Unsupervised Models | More on Probabilistic Predictors | Composing Models | Simple User Defined Models | List of Supported Models | Third Party Packages ","category":"page"},{"location":"#Meta-algorithms","page":"Home","title":"Meta-algorithms","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Evaluating Model Performance | Tuning Models | Controlling Iterative Models | Learning Curves| Correcting Class Imbalance","category":"page"},{"location":"#Composition","page":"Home","title":"Composition","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Composing Models | Linear Pipelines | Target Transformations | Homogeneous Ensembles | Model Stacking | Learning Networks| Correcting Class Imbalance","category":"page"},{"location":"#Integration","page":"Home","title":"Integration","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Logging Workflows | OpenML Integration","category":"page"},{"location":"#Customization-and-Extension","page":"Home","title":"Customization and Extension","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Simple User Defined Models | Quick-Start Guide to Adding Models | Adding Models for General Use | Composing Models | Internals | Modifying Behavior","category":"page"},{"location":"#Miscellaneous","page":"Home","title":"Miscellaneous","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Weights | Acceleration and Parallelism | Performance Measures ","category":"page"},{"location":"models/SVMNuClassifier_MLJScikitLearnInterface/#SVMNuClassifier_MLJScikitLearnInterface","page":"SVMNuClassifier","title":"SVMNuClassifier","text":"","category":"section"},{"location":"models/SVMNuClassifier_MLJScikitLearnInterface/","page":"SVMNuClassifier","title":"SVMNuClassifier","text":"SVMNuClassifier","category":"page"},{"location":"models/SVMNuClassifier_MLJScikitLearnInterface/","page":"SVMNuClassifier","title":"SVMNuClassifier","text":"A model type for constructing a nu-support vector classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model 
interface.","category":"page"},{"location":"models/SVMNuClassifier_MLJScikitLearnInterface/","page":"SVMNuClassifier","title":"SVMNuClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/SVMNuClassifier_MLJScikitLearnInterface/","page":"SVMNuClassifier","title":"SVMNuClassifier","text":"SVMNuClassifier = @load SVMNuClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/SVMNuClassifier_MLJScikitLearnInterface/","page":"SVMNuClassifier","title":"SVMNuClassifier","text":"Do model = SVMNuClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SVMNuClassifier(nu=...).","category":"page"},{"location":"models/SVMNuClassifier_MLJScikitLearnInterface/#Hyper-parameters","page":"SVMNuClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/SVMNuClassifier_MLJScikitLearnInterface/","page":"SVMNuClassifier","title":"SVMNuClassifier","text":"nu = 0.5\nkernel = rbf\ndegree = 3\ngamma = scale\ncoef0 = 0.0\nshrinking = true\ntol = 0.001\ncache_size = 200\nmax_iter = -1\ndecision_function_shape = ovr\nrandom_state = nothing","category":"page"},{"location":"models/KernelPCA_MultivariateStats/#KernelPCA_MultivariateStats","page":"KernelPCA","title":"KernelPCA","text":"","category":"section"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"KernelPCA","category":"page"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"A model type for constructing a kernel principal component analysis model, based on MultivariateStats.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"KernelPCA = @load KernelPCA pkg=MultivariateStats","category":"page"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"Do model = KernelPCA() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in KernelPCA(maxoutdim=...).","category":"page"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"In kernel PCA the linear operations of ordinary principal component analysis are performed in a reproducing kernel Hilbert space.","category":"page"},{"location":"models/KernelPCA_MultivariateStats/#Training-data","page":"KernelPCA","title":"Training data","text":"","category":"section"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"mach = machine(model, X)","category":"page"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"Here:","category":"page"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).","category":"page"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/KernelPCA_MultivariateStats/#Hyper-parameters","page":"KernelPCA","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"maxoutdim=0: Controls the dimension (number of columns) of the output, outdim. Specifically, outdim = min(n, indim, maxoutdim), where n is the number of observations and indim the input dimension.\nkernel::Function=(x,y)->x'y: The kernel function, which takes in 2 vector arguments x and y and returns a scalar value. Defaults to the dot product of x and y.\nsolver::Symbol=:eig: solver to use for the eigenvalues, one of :eig (default, uses LinearAlgebra.eigen) or :eigs (uses Arpack.eigs).\ninverse::Bool=true: perform calculations needed for inverse transform\nbeta::Real=1.0: strength of the ridge regression that learns the inverse transform when inverse is true.\ntol::Real=0.0: Convergence tolerance for eigenvalue solver.\nmaxiter::Int=300: maximum number of iterations for eigenvalue solver.","category":"page"},{"location":"models/KernelPCA_MultivariateStats/#Operations","page":"KernelPCA","title":"Operations","text":"","category":"section"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.\ninverse_transform(mach, Xsmall): For a dimension-reduced table Xsmall, such as returned by transform, reconstruct a table, having the same number of columns as the original training data X, that transforms to Xsmall. Mathematically, inverse_transform is a right-inverse for the PCA projection map, whose image is orthogonal to the kernel of that map. 
In particular, if Xsmall = transform(mach, Xnew), then inverse_transform(Xsmall) is only an approximation to Xnew.","category":"page"},{"location":"models/KernelPCA_MultivariateStats/#Fitted-parameters","page":"KernelPCA","title":"Fitted parameters","text":"","category":"section"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"projection: Returns the projection matrix, which has size (indim, outdim), where indim and outdim are the number of features of the input and output respectively.","category":"page"},{"location":"models/KernelPCA_MultivariateStats/#Report","page":"KernelPCA","title":"Report","text":"","category":"section"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"The fields of report(mach) are:","category":"page"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"indim: Dimension (number of columns) of the training data and new data to be transformed.\noutdim: Dimension of transformed data.\nprincipalvars: The variance of the principal components.","category":"page"},{"location":"models/KernelPCA_MultivariateStats/#Examples","page":"KernelPCA","title":"Examples","text":"","category":"section"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"using MLJ\nusing LinearAlgebra\n\nKernelPCA = @load KernelPCA pkg=MultivariateStats\n\nX, y = @load_iris ## a table and a vector\n\n## a Gaussian (RBF) kernel with the given length scale\nfunction rbf_kernel(length_scale)\n return (x, y) -> exp(-norm(x - y)^2 / (2 * length_scale^2))\nend\n\nmodel = KernelPCA(maxoutdim=2, kernel=rbf_kernel(1))\nmach = machine(model, X) |> fit!\n\nXproj = transform(mach, X)","category":"page"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"See also PCA, ICA, FactorAnalysis, PPCA","category":"page"},{"location":"models/StableRulesClassifier_SIRUS/#StableRulesClassifier_SIRUS","page":"StableRulesClassifier","title":"StableRulesClassifier","text":"","category":"section"},{"location":"models/StableRulesClassifier_SIRUS/","page":"StableRulesClassifier","title":"StableRulesClassifier","text":"StableRulesClassifier","category":"page"},{"location":"models/StableRulesClassifier_SIRUS/","page":"StableRulesClassifier","title":"StableRulesClassifier","text":"A model type for constructing a stable rules classifier, based on SIRUS.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/StableRulesClassifier_SIRUS/","page":"StableRulesClassifier","title":"StableRulesClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/StableRulesClassifier_SIRUS/","page":"StableRulesClassifier","title":"StableRulesClassifier","text":"StableRulesClassifier = @load StableRulesClassifier pkg=SIRUS","category":"page"},{"location":"models/StableRulesClassifier_SIRUS/","page":"StableRulesClassifier","title":"StableRulesClassifier","text":"Do model = StableRulesClassifier() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in StableRulesClassifier(rng=...).","category":"page"},{"location":"models/StableRulesClassifier_SIRUS/","page":"StableRulesClassifier","title":"StableRulesClassifier","text":"StableRulesClassifier implements an explainable rule-based model based on a random forest.","category":"page"},{"location":"models/StableRulesClassifier_SIRUS/#Training-data","page":"StableRulesClassifier","title":"Training data","text":"","category":"section"},{"location":"models/StableRulesClassifier_SIRUS/","page":"StableRulesClassifier","title":"StableRulesClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/StableRulesClassifier_SIRUS/","page":"StableRulesClassifier","title":"StableRulesClassifier","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/StableRulesClassifier_SIRUS/","page":"StableRulesClassifier","title":"StableRulesClassifier","text":"where","category":"page"},{"location":"models/StableRulesClassifier_SIRUS/","page":"StableRulesClassifier","title":"StableRulesClassifier","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)\ny: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)","category":"page"},{"location":"models/StableRulesClassifier_SIRUS/","page":"StableRulesClassifier","title":"StableRulesClassifier","text":"Train the machine with fit!(mach, rows=...).","category":"page"},{"location":"models/StableRulesClassifier_SIRUS/#Hyperparameters","page":"StableRulesClassifier","title":"Hyperparameters","text":"","category":"section"},{"location":"models/StableRulesClassifier_SIRUS/","page":"StableRulesClassifier","title":"StableRulesClassifier","text":"rng::AbstractRNG=default_rng(): Random number generator. Using a StableRNG from StableRNGs.jl is advised.\npartial_sampling::Float64=0.7: Ratio of samples to use in each subset of the data. The default should be fine for most cases.\nn_trees::Int=1000: The number of trees to use. It is advisable to use at least a thousand trees for better rule selection and, in turn, better predictive performance.\nmax_depth::Int=2: The depth of the tree. A lower depth decreases model complexity and can therefore improve accuracy when the sample size is small (reducing overfitting).\nq::Int=10: Number of cutpoints to use per feature. The default value should be fine for most situations.\nmin_data_in_leaf::Int=5: Minimum number of data points per leaf.\nmax_rules::Int=10: This is the most important hyperparameter after lambda. The more rules, the more accurate the model should be. If this is not the case, tune lambda first. However, more rules will also decrease model interpretability. So, it is important to find a good balance here. In most cases, 10 to 40 rules should provide reasonable accuracy while remaining interpretable.\nlambda::Float64=1.0: The weights of the final rules are determined via a regularized regression over each rule as a binary feature. This hyperparameter specifies the strength of the ridge (L2) regularizer. SIRUS is very sensitive to the choice of this hyperparameter. Ensure that you try the full range from 10^-4 to 10^4 (e.g., 0.001, 0.01, ..., 100). When trying the range, one good check is to verify that an increase in max_rules increases performance. 
If this is not the case, then try a different value for lambda.","category":"page"},{"location":"models/StableRulesClassifier_SIRUS/#Fitted-parameters","page":"StableRulesClassifier","title":"Fitted parameters","text":"","category":"section"},{"location":"models/StableRulesClassifier_SIRUS/","page":"StableRulesClassifier","title":"StableRulesClassifier","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/StableRulesClassifier_SIRUS/","page":"StableRulesClassifier","title":"StableRulesClassifier","text":"fitresult: A StableRules object.","category":"page"},{"location":"models/StableRulesClassifier_SIRUS/#Operations","page":"StableRulesClassifier","title":"Operations","text":"","category":"section"},{"location":"models/StableRulesClassifier_SIRUS/","page":"StableRulesClassifier","title":"StableRulesClassifier","text":"predict(mach, Xnew): Return a vector of predictions for each row of Xnew.","category":"page"},{"location":"quick_start_guide_to_adding_models/#Quick-Start-Guide-to-Adding-Models","page":"Quick-Start Guide to Adding Models","title":"Quick-Start Guide to Adding Models","text":"","category":"section"},{"location":"quick_start_guide_to_adding_models/","page":"Quick-Start Guide to Adding Models","title":"Quick-Start Guide to Adding Models","text":"This guide has moved to this section of the MLJModelInterface.jl documentation.","category":"page"},{"location":"quick_start_guide_to_adding_models/","page":"Quick-Start Guide to Adding Models","title":"Quick-Start Guide to Adding Models","text":"For quick-and-dirty user-defined models, not intended for registering with the MLJ Model Registry, see Simple User Defined Models. ","category":"page"},{"location":"target_transformations/#Target-Transformations","page":"Target Transformations","title":"Target Transformations","text":"","category":"section"},{"location":"target_transformations/","page":"Target Transformations","title":"Target Transformations","text":"Some supervised models work best if the target variable has been standardized, i.e., rescaled to have zero mean and unit variance. Such a target transformation is learned from the values of the training target variable. In particular, one generally learns a different transformation when training on a proper subset of the training data. 
Good data hygiene prescribes that a new transformation should be computed each time the supervised model is trained on new data - for example in cross-validation.","category":"page"},{"location":"target_transformations/","page":"Target Transformations","title":"Target Transformations","text":"Additionally, one generally wants to inverse transform the predictions of the supervised model for the final target predictions to be on the original scale.","category":"page"},{"location":"target_transformations/","page":"Target Transformations","title":"Target Transformations","text":"All these concerns are addressed by wrapping the supervised model using TransformedTargetModel:","category":"page"},{"location":"target_transformations/","page":"Target Transformations","title":"Target Transformations","text":"using MLJ\nMLJ.color_off()","category":"page"},{"location":"target_transformations/","page":"Target Transformations","title":"Target Transformations","text":"Ridge = @load RidgeRegressor pkg=MLJLinearModels verbosity=0\nridge = Ridge(fit_intercept=false)\nridge2 = TransformedTargetModel(ridge, transformer=Standardizer())","category":"page"},{"location":"target_transformations/","page":"Target Transformations","title":"Target Transformations","text":"Note that all the original hyperparameters, as well as those of the Standardizer, are accessible as nested hyper-parameters of the wrapped model, which can be trained or evaluated like any other:","category":"page"},{"location":"target_transformations/","page":"Target Transformations","title":"Target Transformations","text":"X, y = make_regression(rng=1234, intercept=false)\ny = y*10^5\nmach = machine(ridge2, X, y)\nfit!(mach, rows=1:60, verbosity=0)\npredict(mach, rows=61:62)","category":"page"},{"location":"target_transformations/","page":"Target Transformations","title":"Target Transformations","text":"Training and predicting using ridge2 as above means:","category":"page"},{"location":"target_transformations/","page":"Target Transformations","title":"Target Transformations","text":"Standardizing the target y using the first 60 rows to get a new target z\nTraining the original ridge model using the first 60 rows of X and z\nCalling predict on the machine trained in Step 2 on rows 61:62 of X\nApplying the inverse scaling learned in Step 1 to those predictions (to get the final output shown above)","category":"page"},{"location":"target_transformations/","page":"Target Transformations","title":"Target Transformations","text":"Since both ridge and ridge2 return predictions on the original scale, we can meaningfully compare the corresponding mean absolute errors, which are indeed different in this case.","category":"page"},{"location":"target_transformations/","page":"Target Transformations","title":"Target Transformations","text":"evaluate(ridge, X, y, measure=l1)","category":"page"},{"location":"target_transformations/","page":"Target Transformations","title":"Target Transformations","text":"evaluate(ridge2, X, y, measure=l1)","category":"page"},{"location":"target_transformations/","page":"Target Transformations","title":"Target Transformations","text":"Ordinary functions can also be used in target transformations but an inverse must be explicitly specified:","category":"page"},{"location":"target_transformations/","page":"Target Transformations","title":"Target Transformations","text":"ridge3 = TransformedTargetModel(ridge, transformer=y->log.(y), inverse=z->exp.(z))\nX, y = @load_boston\nevaluate(ridge3, X, y, 
measure=l1)","category":"page"},{"location":"target_transformations/","page":"Target Transformations","title":"Target Transformations","text":"Without the log transform (ie, using ridge) we get the poorer mean absolute error, l1, of 3.9.","category":"page"},{"location":"target_transformations/","page":"Target Transformations","title":"Target Transformations","text":"TransformedTargetModel","category":"page"},{"location":"target_transformations/#MLJBase.TransformedTargetModel","page":"Target Transformations","title":"MLJBase.TransformedTargetModel","text":"TransformedTargetModel(model; transformer=nothing, inverse=nothing, cache=true)\n\nWrap the supervised or semi-supervised model in a transformation of the target variable.\n\nHere transformer is one of the following:\n\nThe Unsupervised model that is to transform the training target. By default (inverse=nothing) the parameters learned by this transformer are also used to inverse-transform the predictions of model, which means transformer must implement the inverse_transform method. If this is not the case, specify inverse=identity to suppress inversion.\nA callable object for transforming the target, such as y -> log.(y). In this case a callable inverse, such as z -> exp.(z), should be specified.\n\nSpecify cache=false to prioritize memory over speed, or to guarantee data anonymity.\n\nSpecify inverse=identity if model is a probabilistic predictor, as inverse-transforming sample spaces is not supported. Alternatively, replace model with a deterministic model, such as Pipeline(model, y -> mode.(y)).\n\nExamples\n\nA model that normalizes the target before applying ridge regression, with predictions returned on the original scale:\n\n@load RidgeRegressor pkg=MLJLinearModels\nmodel = RidgeRegressor()\ntmodel = TransformedTargetModel(model, transformer=Standardizer())\n\nA model that applies a static log transformation to the data, again returning predictions to the original scale:\n\ntmodel2 = TransformedTargetModel(model, transformer=y->log.(y), inverse=z->exp.(z))\n\n\n\n\n\n","category":"function"},{"location":"models/SVMClassifier_MLJScikitLearnInterface/#SVMClassifier_MLJScikitLearnInterface","page":"SVMClassifier","title":"SVMClassifier","text":"","category":"section"},{"location":"models/SVMClassifier_MLJScikitLearnInterface/","page":"SVMClassifier","title":"SVMClassifier","text":"SVMClassifier","category":"page"},{"location":"models/SVMClassifier_MLJScikitLearnInterface/","page":"SVMClassifier","title":"SVMClassifier","text":"A model type for constructing a C-support vector classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/SVMClassifier_MLJScikitLearnInterface/","page":"SVMClassifier","title":"SVMClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/SVMClassifier_MLJScikitLearnInterface/","page":"SVMClassifier","title":"SVMClassifier","text":"SVMClassifier = @load SVMClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/SVMClassifier_MLJScikitLearnInterface/","page":"SVMClassifier","title":"SVMClassifier","text":"Do model = SVMClassifier() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in SVMClassifier(C=...).","category":"page"},{"location":"models/SVMClassifier_MLJScikitLearnInterface/#Hyper-parameters","page":"SVMClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/SVMClassifier_MLJScikitLearnInterface/","page":"SVMClassifier","title":"SVMClassifier","text":"C = 1.0\nkernel = rbf\ndegree = 3\ngamma = scale\ncoef0 = 0.0\nshrinking = true\ntol = 0.001\ncache_size = 200\nmax_iter = -1\ndecision_function_shape = ovr\nrandom_state = nothing","category":"page"},{"location":"models/PCADetector_OutlierDetectionPython/#PCADetector_OutlierDetectionPython","page":"PCADetector","title":"PCADetector","text":"","category":"section"},{"location":"models/PCADetector_OutlierDetectionPython/","page":"PCADetector","title":"PCADetector","text":"PCADetector(n_components = nothing,\n n_selected_components = nothing,\n copy = true,\n whiten = false,\n svd_solver = \"auto\",\n tol = 0.0,\n iterated_power = \"auto\",\n standardization = true,\n weighted = true,\n random_state = nothing)","category":"page"},{"location":"models/PCADetector_OutlierDetectionPython/","page":"PCADetector","title":"PCADetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.pca","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/#RandomForestClassifier_DecisionTree","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"","category":"section"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"RandomForestClassifier","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"A model type for constructing a CART random forest classifier, based on DecisionTree.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"RandomForestClassifier = @load RandomForestClassifier pkg=DecisionTree","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"Do model = RandomForestClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomForestClassifier(max_depth=...).","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"RandomForestClassifier implements the standard Random Forest algorithm, originally published in Breiman, L. (2001): \"Random Forests.\", Machine Learning, vol. 45, pp. 
5–32.","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/#Training-data","page":"RandomForestClassifier","title":"Training data","text":"","category":"section"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"where","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)\ny: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"Train the machine with fit!(mach, rows=...).","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/#Hyperparameters","page":"RandomForestClassifier","title":"Hyperparameters","text":"","category":"section"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"max_depth=-1: max depth of the decision tree (-1=any)\nmin_samples_leaf=1: min number of samples each leaf needs to have\nmin_samples_split=2: min number of samples needed for a split\nmin_purity_increase=0: min purity needed for a split\nn_subfeatures=-1: number of features to select at random (0 for all, -1 for square root of number of features)\nn_trees=10: number of trees to train\nsampling_fraction=0.7: fraction of samples to train each tree on\nfeature_importance: method to use for computing feature importances. One of (:impurity, :split)\nrng=Random.GLOBAL_RNG: random number generator or seed","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/#Operations","page":"RandomForestClassifier","title":"Operations","text":"","category":"section"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above. 
Predictions are probabilistic, but uncalibrated.\npredict_mode(mach, Xnew): instead return the mode of each prediction above.","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/#Fitted-parameters","page":"RandomForestClassifier","title":"Fitted parameters","text":"","category":"section"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"forest: the Ensemble object returned by the core DecisionTree.jl algorithm","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/#Report","page":"RandomForestClassifier","title":"Report","text":"","category":"section"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"The fields of report(mach) are:","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"features: the names of the features encountered in training","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/#Accessor-functions","page":"RandomForestClassifier","title":"Accessor functions","text":"","category":"section"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"feature_importances(mach) returns a vector of (feature::Symbol => importance) pairs; the type of importance is determined by the hyperparameter feature_importance (see above)","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/#Examples","page":"RandomForestClassifier","title":"Examples","text":"","category":"section"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"using MLJ\nForest = @load RandomForestClassifier pkg=DecisionTree\nforest = Forest(min_samples_split=6, n_subfeatures=3)\n\nX, y = @load_iris\nmach = machine(forest, X, y) |> fit!\n\nXnew = (sepal_length = [6.4, 7.2, 7.4],\n sepal_width = [2.8, 3.0, 2.8],\n petal_length = [5.6, 5.8, 6.1],\n petal_width = [2.1, 1.6, 1.9],)\nyhat = predict(mach, Xnew) ## probabilistic predictions\npredict_mode(mach, Xnew) ## point predictions\npdf.(yhat, \"virginica\") ## probabilities for the \"virginica\" class\n\nfitted_params(mach).forest ## raw `Ensemble` object from DecisionTree.jl\n\nfeature_importances(mach) ## `:impurity` feature importances\nforest.feature_importance = :split\nfeature_importances(mach) ## `:split` feature importances\n","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"See also DecisionTree.jl and the unwrapped model type MLJDecisionTreeInterface.DecisionTree.RandomForestClassifier.","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/#LADRegressor_MLJLinearModels","page":"LADRegressor","title":"LADRegressor","text":"","category":"section"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"LADRegressor","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"A model type for constructing a lad regressor, based on MLJLinearModels.jl, and 
implementing the MLJ model interface.","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"LADRegressor = @load LADRegressor pkg=MLJLinearModels","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"Do model = LADRegressor() to construct an instance with default hyper-parameters.","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"Least absolute deviation regression is a linear model with objective function","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"$","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"∑ρ(Xθ - y) + n⋅λ|θ|₂² + n⋅γ|θ|₁ $","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"where ρ is the absolute loss and n is the number of observations.","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"If scale_penalty_with_samples = false the objective function is instead","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"$","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"∑ρ(Xθ - y) + λ|θ|₂² + γ|θ|₁ $","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":".","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"Different solver options exist, as indicated under \"Hyperparameters\" below. 
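For instance, a minimal sketch of picking a non-default penalty and solver explicitly (the lambda, gamma and solver values below are illustrative assumptions, not recommendations):","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"using MLJ\nimport MLJLinearModels\n\nLAD = @load LADRegressor pkg=MLJLinearModels verbosity=0\n\n## illustrative choice: elastic-net penalty solved with accelerated proximal gradient (FISTA),\n## per the solver options listed under \"Hyperparameters\" below\nmodel = LAD(lambda=0.1, gamma=0.05, penalty=:en, solver=MLJLinearModels.ProxGrad(accel=true))\n\nX, y = make_regression(200, 5) ## synthetic data\nmach = machine(model, X, y) |> fit!\nfitted_params(mach)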
","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/#Training-data","page":"LADRegressor","title":"Training data","text":"","category":"section"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"where:","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)\ny is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/#Hyperparameters","page":"LADRegressor","title":"Hyperparameters","text":"","category":"section"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"See also RobustRegressor.","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/#Parameters","page":"LADRegressor","title":"Parameters","text":"","category":"section"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"lambda::Real: strength of the regularizer if penalty is :l2 or :l1. Strength of the L2 regularizer if penalty is :en. Default: 1.0\ngamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0\npenalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2\nfit_intercept::Bool: whether to fit the intercept or not. Default: true\npenalize_intercept::Bool: whether to penalize the intercept. Default: false\nscale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true\nsolver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S where S is one of: LBFGS, IWLSCG, if penalty = :l2, and ProxGrad otherwise.\nIf solver = nothing (default) then LBFGS() is used, if penalty = :l2, and otherwise ProxGrad(accel=true) (FISTA) is used.\nSolver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) 
Default: nothing","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/#Example","page":"LADRegressor","title":"Example","text":"","category":"section"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"using MLJ\nX, y = make_regression()\nmach = fit!(machine(LADRegressor(), X, y))\npredict(mach, X)\nfitted_params(mach)","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/#RidgeRegressor_MLJLinearModels","page":"RidgeRegressor","title":"RidgeRegressor","text":"","category":"section"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"RidgeRegressor","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"A model type for constructing a ridge regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"RidgeRegressor = @load RidgeRegressor pkg=MLJLinearModels","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"Do model = RidgeRegressor() to construct an instance with default hyper-parameters.","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"Ridge regression is a linear model with objective function","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"$","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"|Xθ - y|₂²/2 + n⋅λ|θ|₂²/2 $","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"where n is the number of observations.","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"If scale_penalty_with_samples = false then the objective function is instead","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"$","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"|Xθ - y|₂²/2 + λ|θ|₂²/2 $","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":".","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"Different solver options exist, as indicated under \"Hyperparameters\" below. 
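For instance, a minimal sketch of requesting the iterative (conjugate-gradient) solver rather than the default Cholesky-based one (the lambda value is an illustrative assumption):","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"using MLJ\nimport MLJLinearModels\n\nRidge = @load RidgeRegressor pkg=MLJLinearModels verbosity=0\n\n## per the \"Hyperparameters\" section below, CG() = Analytical(iterative=true)\nmodel = Ridge(lambda=0.5, solver=MLJLinearModels.Analytical(iterative=true))\n\nX, y = make_regression(200, 5) ## synthetic data\nmach = machine(model, X, y) |> fit!\nfitted_params(mach)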
","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/#Training-data","page":"RidgeRegressor","title":"Training data","text":"","category":"section"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"where:","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)\ny is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/#Hyperparameters","page":"RidgeRegressor","title":"Hyperparameters","text":"","category":"section"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"lambda::Real: strength of the L2 regularization. Default: 1.0\nfit_intercept::Bool: whether to fit the intercept or not. Default: true\npenalize_intercept::Bool: whether to penalize the intercept. Default: false\nscale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true\nsolver::Union{Nothing, MLJLinearModels.Solver}: any instance of MLJLinearModels.Analytical. Use Analytical() for Cholesky and CG()=Analytical(iterative=true) for conjugate-gradient. If solver = nothing (default) then Analytical() is used. 
Default: nothing","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/#Example","page":"RidgeRegressor","title":"Example","text":"","category":"section"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"using MLJ\nX, y = make_regression()\nmach = fit!(machine(RidgeRegressor(), X, y))\npredict(mach, X)\nfitted_params(mach)","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"See also ElasticNetRegressor.","category":"page"},{"location":"models/KNNDetector_OutlierDetectionPython/#KNNDetector_OutlierDetectionPython","page":"KNNDetector","title":"KNNDetector","text":"","category":"section"},{"location":"models/KNNDetector_OutlierDetectionPython/","page":"KNNDetector","title":"KNNDetector","text":"KNNDetector(n_neighbors = 5,\n method = \"largest\",\n radius = 1.0,\n algorithm = \"auto\",\n leaf_size = 30,\n metric = \"minkowski\",\n p = 2,\n metric_params = nothing,\n n_jobs = 1)","category":"page"},{"location":"models/KNNDetector_OutlierDetectionPython/","page":"KNNDetector","title":"KNNDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.knn","category":"page"},{"location":"models/LinearRegressor_MultivariateStats/#LinearRegressor_MultivariateStats","page":"LinearRegressor","title":"LinearRegressor","text":"","category":"section"},{"location":"models/LinearRegressor_MultivariateStats/","page":"LinearRegressor","title":"LinearRegressor","text":"LinearRegressor","category":"page"},{"location":"models/LinearRegressor_MultivariateStats/","page":"LinearRegressor","title":"LinearRegressor","text":"A model type for constructing a linear regressor, based on MultivariateStats.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LinearRegressor_MultivariateStats/","page":"LinearRegressor","title":"LinearRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LinearRegressor_MultivariateStats/","page":"LinearRegressor","title":"LinearRegressor","text":"LinearRegressor = @load LinearRegressor pkg=MultivariateStats","category":"page"},{"location":"models/LinearRegressor_MultivariateStats/","page":"LinearRegressor","title":"LinearRegressor","text":"Do model = LinearRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearRegressor(bias=...).","category":"page"},{"location":"models/LinearRegressor_MultivariateStats/","page":"LinearRegressor","title":"LinearRegressor","text":"LinearRegressor assumes the target is a Continuous variable and trains a linear prediction function using the least squares algorithm. 
Options exist to specify a bias term.","category":"page"},{"location":"models/LinearRegressor_MultivariateStats/#Training-data","page":"LinearRegressor","title":"Training data","text":"","category":"section"},{"location":"models/LinearRegressor_MultivariateStats/","page":"LinearRegressor","title":"LinearRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/LinearRegressor_MultivariateStats/","page":"LinearRegressor","title":"LinearRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/LinearRegressor_MultivariateStats/","page":"LinearRegressor","title":"LinearRegressor","text":"Here:","category":"page"},{"location":"models/LinearRegressor_MultivariateStats/","page":"LinearRegressor","title":"LinearRegressor","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check the column scitypes with schema(X).\ny is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y).","category":"page"},{"location":"models/LinearRegressor_MultivariateStats/","page":"LinearRegressor","title":"LinearRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/LinearRegressor_MultivariateStats/#Hyper-parameters","page":"LinearRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/LinearRegressor_MultivariateStats/","page":"LinearRegressor","title":"LinearRegressor","text":"bias=true: Include the bias term if true, otherwise fit without bias term.","category":"page"},{"location":"models/LinearRegressor_MultivariateStats/#Operations","page":"LinearRegressor","title":"Operations","text":"","category":"section"},{"location":"models/LinearRegressor_MultivariateStats/","page":"LinearRegressor","title":"LinearRegressor","text":"predict(mach, Xnew): Return predictions of the target given new features Xnew, which should have the same scitype as X above.","category":"page"},{"location":"models/LinearRegressor_MultivariateStats/#Fitted-parameters","page":"LinearRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/LinearRegressor_MultivariateStats/","page":"LinearRegressor","title":"LinearRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/LinearRegressor_MultivariateStats/","page":"LinearRegressor","title":"LinearRegressor","text":"coefficients: The linear coefficients determined by the model.\nintercept: The intercept determined by the model.","category":"page"},{"location":"models/LinearRegressor_MultivariateStats/#Examples","page":"LinearRegressor","title":"Examples","text":"","category":"section"},{"location":"models/LinearRegressor_MultivariateStats/","page":"LinearRegressor","title":"LinearRegressor","text":"using MLJ\n\nLinearRegressor = @load LinearRegressor pkg=MultivariateStats\nlinear_regressor = LinearRegressor()\n\nX, y = make_regression(100, 2) ## a table and a vector (synthetic data)\nmach = machine(linear_regressor, X, y) |> fit!\n\nXnew, _ = make_regression(3, 2)\nyhat = predict(mach, Xnew) ## new predictions","category":"page"},{"location":"models/LinearRegressor_MultivariateStats/","page":"LinearRegressor","title":"LinearRegressor","text":"See also MultitargetLinearRegressor, RidgeRegressor, 
MultitargetRidgeRegressor","category":"page"},{"location":"models/QuantileRegressor_MLJLinearModels/#QuantileRegressor_MLJLinearModels","page":"QuantileRegressor","title":"QuantileRegressor","text":"","category":"section"},{"location":"models/QuantileRegressor_MLJLinearModels/","page":"QuantileRegressor","title":"QuantileRegressor","text":"QuantileRegressor","category":"page"},{"location":"models/QuantileRegressor_MLJLinearModels/","page":"QuantileRegressor","title":"QuantileRegressor","text":"A model type for constructing a quantile regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/QuantileRegressor_MLJLinearModels/","page":"QuantileRegressor","title":"QuantileRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/QuantileRegressor_MLJLinearModels/","page":"QuantileRegressor","title":"QuantileRegressor","text":"QuantileRegressor = @load QuantileRegressor pkg=MLJLinearModels","category":"page"},{"location":"models/QuantileRegressor_MLJLinearModels/","page":"QuantileRegressor","title":"QuantileRegressor","text":"Do model = QuantileRegressor() to construct an instance with default hyper-parameters.","category":"page"},{"location":"models/QuantileRegressor_MLJLinearModels/","page":"QuantileRegressor","title":"QuantileRegressor","text":"This model coincides with RobustRegressor, with the exception that the robust loss, rho, is fixed to QuantileRho(delta), where delta is a new hyperparameter.","category":"page"},{"location":"models/QuantileRegressor_MLJLinearModels/","page":"QuantileRegressor","title":"QuantileRegressor","text":"Different solver options exist, as indicated under \"Hyperparameters\" below. ","category":"page"},{"location":"models/QuantileRegressor_MLJLinearModels/#Training-data","page":"QuantileRegressor","title":"Training data","text":"","category":"section"},{"location":"models/QuantileRegressor_MLJLinearModels/","page":"QuantileRegressor","title":"QuantileRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/QuantileRegressor_MLJLinearModels/","page":"QuantileRegressor","title":"QuantileRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/QuantileRegressor_MLJLinearModels/","page":"QuantileRegressor","title":"QuantileRegressor","text":"where:","category":"page"},{"location":"models/QuantileRegressor_MLJLinearModels/","page":"QuantileRegressor","title":"QuantileRegressor","text":"X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)\ny is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/QuantileRegressor_MLJLinearModels/","page":"QuantileRegressor","title":"QuantileRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/QuantileRegressor_MLJLinearModels/#Hyperparameters","page":"QuantileRegressor","title":"Hyperparameters","text":"","category":"section"},{"location":"models/QuantileRegressor_MLJLinearModels/","page":"QuantileRegressor","title":"QuantileRegressor","text":"delta::Real: parameterizes the QuantileRho function (indicating the quantile to use with default 0.5 for the median regression) Default: 0.5\nlambda::Real: strength of the regularizer if penalty is :l2 or :l1. Strength of the L2 regularizer if penalty is :en. 
Default: 1.0\ngamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0\npenalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2\nfit_intercept::Bool: whether to fit the intercept or not. Default: true\npenalize_intercept::Bool: whether to penalize the intercept. Default: false\nscale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true\nsolver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S where S is one of: LBFGS, IWLSCG, if penalty = :l2, and ProxGrad otherwise.\nIf solver = nothing (default) then LBFGS() is used, if penalty = :l2, and otherwise ProxGrad(accel=true) (FISTA) is used.\nSolver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) Default: nothing","category":"page"},{"location":"models/QuantileRegressor_MLJLinearModels/#Example","page":"QuantileRegressor","title":"Example","text":"","category":"section"},{"location":"models/QuantileRegressor_MLJLinearModels/","page":"QuantileRegressor","title":"QuantileRegressor","text":"using MLJ\nX, y = make_regression()\nmach = fit!(machine(QuantileRegressor(), X, y))\npredict(mach, X)\nfitted_params(mach)","category":"page"},{"location":"models/QuantileRegressor_MLJLinearModels/","page":"QuantileRegressor","title":"QuantileRegressor","text":"See also RobustRegressor, HuberRegressor.","category":"page"},{"location":"mlj_cheatsheet/#MLJ-Cheatsheet","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"","category":"section"},{"location":"mlj_cheatsheet/#Starting-an-interactive-MLJ-session","page":"MLJ Cheatsheet","title":"Starting an interactive MLJ session","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"using MLJ\nMLJ_VERSION # version of MLJ for this cheatsheet","category":"page"},{"location":"mlj_cheatsheet/#Model-search-and-code-loading","page":"MLJ Cheatsheet","title":"Model search and code loading","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"info(\"PCA\") retrieves registry metadata for the model called \"PCA\"","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"info(\"RidgeRegressor\", pkg=\"MultivariateStats\") retrieves metadata for \"RidgeRegressor\", which is provided by multiple packages","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"doc(\"DecisionTreeClassifier\", pkg=\"DecisionTree\") retrieves the model document string for the classifier, without loading model code","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"models() lists metadata of every registered model.","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"models(\"Tree\") lists models with \"Tree\" in the model or package name.","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"models(x -> x.is_supervised && x.is_pure_julia) lists all supervised models written in pure julia.","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"models(matching(X)) lists all unsupervised models compatible with input X.","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ 
Cheatsheet","title":"MLJ Cheatsheet","text":"models(matching(X, y)) lists all supervised models compatible with input/target X/y.","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"With additional conditions:","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"models() do model\n matching(model, X, y) &&\n model.prediction_type == :probabilistic &&\n model.is_pure_julia\nend","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Tree = @load DecisionTreeClassifier pkg=DecisionTree","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"imports \"DecisionTreeClassifier\" type and binds it to Tree.","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"tree = Tree() to instantiate a Tree.","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"tree2 = Tree(max_depth=2) instantiates a tree with different hyperparameter","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Ridge = @load RidgeRegressor pkg=MultivariateStats imports a type for a model provided by multiple packages","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"For interactive loading instead, use @iload","category":"page"},{"location":"mlj_cheatsheet/#Scitypes-and-coercion","page":"MLJ Cheatsheet","title":"Scitypes and coercion","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"scitype(x) is the scientific type of x. 
For example scitype(2.4) == Continuous","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"(Image: scitypes_small.png)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"type scitype\nAbstractFloat Continuous\nInteger Count\nCategoricalValue and CategoricalString Multiclass or OrderedFactor\nAbstractString Textual","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Figure and Table for common scalar scitypes","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Use schema(X) to get the column scitypes of a table X","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"To coerce the data into different scitypes, use the coerce function:","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"coerce(y, Multiclass) attempts coercion of all elements of y into scitype Multiclass\ncoerce(X, :x1 => Continuous, :x2 => OrderedFactor) to coerce columns :x1 and :x2 of table X.\ncoerce(X, Count => Continuous) to coerce all columns with Count scitype to Continuous.","category":"page"},{"location":"mlj_cheatsheet/#Ingesting-data","page":"MLJ Cheatsheet","title":"Ingesting data","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Split the table channing into target y (the :Exit column) and features X (everything else), after a seeded row shuffling:","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"using RDatasets\nchanning = dataset(\"boot\", \"channing\")\ny, X = unpack(channing, ==(:Exit); rng=123)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Same as above but exclude :Time column from X:","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"using RDatasets\nchanning = dataset(\"boot\", \"channing\")\ny, X = unpack(channing,\n ==(:Exit),\n !=(:Time);\n rng=123)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Here, y is assigned the :Exit column, and X is assigned the rest, except :Time.","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Splitting row indices into train/validation/test, with seeded shuffling:","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"train, valid, test = partition(eachindex(y), 0.7, 0.2, rng=1234) # for 70:20:10 ratio","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"For a stratified split:","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"train, test = partition(eachindex(y), 0.8, stratify=y)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Split a table or matrix X, instead of indices:","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Xtrain, Xvalid, Xtest = partition(X, 0.5, 0.3, rng=123)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Simultaneous splitting (needs 
multi=true):","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"(Xtrain, Xtest), (ytrain, ytest) = partition((X, y), 0.8, rng=123, multi=true)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Getting data from OpenML:","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"table = OpenML.load(91)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Creating synthetic classification data:","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"X, y = make_blobs(100, 2)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"(also: make_moons, make_circles, make_regression)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Creating synthetic regression data:","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"X, y = make_regression(100, 2)","category":"page"},{"location":"mlj_cheatsheet/#Machine-construction","page":"MLJ Cheatsheet","title":"Machine construction","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Supervised case:","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"model = KNNRegressor(K=1)\nmach = machine(model, X, y)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Unsupervised case:","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"model = OneHotEncoder()\nmach = machine(model, X)","category":"page"},{"location":"mlj_cheatsheet/#Fitting","page":"MLJ Cheatsheet","title":"Fitting","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"The fit! 
function can be used to fit a machine (defaults shown):","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"fit!(mach, rows=1:100, verbosity=1, force=false)","category":"page"},{"location":"mlj_cheatsheet/#Prediction","page":"MLJ Cheatsheet","title":"Prediction","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Supervised case: predict(mach, Xnew) or predict(mach, rows=1:100)\nFor probabilistic models: predict_mode, predict_mean and predict_median.\nUnsupervised case: W = transform(mach, Xnew) or inverse_transform(mach, W), etc.","category":"page"},{"location":"mlj_cheatsheet/#Inspecting-objects","page":"MLJ Cheatsheet","title":"Inspecting objects","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"info(ConstantRegressor()), info(\"PCA\"), info(\"RidgeRegressor\", pkg=\"MultivariateStats\") gets all properties (aka traits) of registered models","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"schema(X) get column names, types and scitypes, and nrows, of a table X","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"scitype(X) gets the scientific type of X","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"fitted_params(mach) gets learned parameters of the fitted machine","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"report(mach) gets other training results (e.g. feature rankings)","category":"page"},{"location":"mlj_cheatsheet/#Saving-and-retrieving-machines-using-Julia-serializer","page":"MLJ Cheatsheet","title":"Saving and retrieving machines using Julia serializer","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"MLJ.save(\"my_machine.jls\", mach) to save machine mach (without data)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"predict_only_mach = machine(\"my_machine.jls\") to deserialize.","category":"page"},{"location":"mlj_cheatsheet/#Performance-estimation","page":"MLJ Cheatsheet","title":"Performance estimation","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"evaluate(model, X, y, resampling=CV(), measure=rms)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"evaluate!(mach, resampling=Holdout(), measure=[rms, mav])","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"evaluate!(mach, resampling=[(fold1, fold2), (fold2, fold1)], measure=rms)","category":"page"},{"location":"mlj_cheatsheet/#Resampling-strategies-(resampling...)","page":"MLJ Cheatsheet","title":"Resampling strategies (resampling=...)","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Holdout(fraction_train=0.7, rng=1234) for simple holdout","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"CV(nfolds=6, rng=1234) for cross-validation","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"StratifiedCV(nfolds=6, rng=1234) for 
stratified cross-validation","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"TimeSeriesCV(nfolds=4) for time-series cross-validation","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"InSample(): test set = train set","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"or a list of pairs of row indices:","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"[(train1, eval1), (train2, eval2), ... (traink, evalk)]","category":"page"},{"location":"mlj_cheatsheet/#Tuning-model-wrapper","page":"MLJ Cheatsheet","title":"Tuning model wrapper","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"tuned_model = TunedModel(model; tuning=RandomSearch(), resampling=Holdout(), measure=…, range=…)","category":"page"},{"location":"mlj_cheatsheet/#Ranges-for-tuning-(range...)","page":"MLJ Cheatsheet","title":"Ranges for tuning (range=...)","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"If r = range(KNNRegressor(), :K, lower=1, upper = 20, scale=:log)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"then Grid() search uses iterator(r, 6) == [1, 2, 3, 6, 11, 20].","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"lower=-Inf and upper=Inf are allowed.","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Non-numeric ranges: r = range(model, :parameter, values=…)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Instead of model, declare type: r = range(Char, :c; values=['a', 'b'])","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Nested ranges: Use dot syntax, as in r = range(EnsembleModel(atom=tree), :(atom.max_depth), ...)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Specify multiple ranges, as in range=[r1, r2, r3]. 
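For instance, a sketch of tuning over two ranges at once (here model stands for any supervised model, and :lambda and :gamma are placeholders for two of its numeric hyper-parameters):\n\nr1 = range(model, :lambda, lower=0.01, upper=10, scale=:log)\nr2 = range(model, :gamma, lower=1, upper=5)\nself_tuning_model = TunedModel(model; tuning=Grid(goal=25), resampling=CV(nfolds=6), measure=rms, range=[r1, r2])\nmach = machine(self_tuning_model, X, y) |> fit!\n\n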
For more range options do ?Grid or ?RandomSearch","category":"page"},{"location":"mlj_cheatsheet/#Tuning-strategies","page":"MLJ Cheatsheet","title":"Tuning strategies","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"RandomSearch(rng=1234) for basic random search","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Grid(resolution=10) or Grid(goal=50) for basic grid search","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Also available: LatinHyperCube, Explicit (built-in), MLJTreeParzenTuning, ParticleSwarm, AdaptiveParticleSwarm (3rd-party packages)","category":"page"},{"location":"mlj_cheatsheet/#Learning-curves","page":"MLJ Cheatsheet","title":"Learning curves","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"For generating a plot of performance against parameter specified by range:","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"curve = learning_curve(mach, resolution=30, resampling=Holdout(), measure=…, range=…, n=1)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"curve = learning_curve(model, X, y, resolution=30, resampling=Holdout(), measure=…, range=…, n=1)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"If using Plots.jl:","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"plot(curve.parameter_values, curve.measurements, xlab=curve.parameter_name, xscale=curve.parameter_scale)","category":"page"},{"location":"mlj_cheatsheet/#Controlling-iterative-models","page":"MLJ Cheatsheet","title":"Controlling iterative models","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Requires: using MLJIteration","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"iterated_model = IteratedModel(model=…, resampling=Holdout(), measure=…, controls=…, retrain=false)","category":"page"},{"location":"mlj_cheatsheet/#Controls","page":"MLJ Cheatsheet","title":"Controls","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Increment training: Step(n=1)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Stopping: TimeLimit(t=0.5) (in hours), NumberLimit(n=100), NumberSinceBest(n=6), NotANumber(), Threshold(value=0.0), GL(alpha=2.0), PQ(alpha=0.75, k=5), Patience(n=5)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Logging: Info(f=identity), Warn(f=\"\"), Error(predicate, f=\"\")","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Callbacks: Callback(f=mach->nothing), WithNumberDo(f=n->@info(n)), WithIterationsDo(f=i->@info(\"num iterations: $i\")), WithLossDo(f=x->@info(\"loss: $x\")), WithTrainingLossesDo(f=v->@info(v))","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Snapshots: Save(filename=\"machine.jlso\")","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Wraps: 
MLJIteration.skip(control, predicate=1), IterationControl.with_state_do(control)","category":"page"},{"location":"mlj_cheatsheet/#Performance-measures-(metrics)","page":"MLJ Cheatsheet","title":"Performance measures (metrics)","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Do measures() to get full list.","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Do measures(\"log\") to list measures with \"log\" in doc-string.","category":"page"},{"location":"mlj_cheatsheet/#Transformers","page":"MLJ Cheatsheet","title":"Transformers","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Built-ins include: Standardizer, OneHotEncoder, UnivariateBoxCoxTransformer, FeatureSelector, FillImputer, UnivariateDiscretizer, ContinuousEncoder, UnivariateTimeTypeToContinuous","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Externals include: PCA (in MultivariateStats), KMeans, KMedoids (in Clustering).","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"models(m -> !m.is_supervised) to get full list","category":"page"},{"location":"mlj_cheatsheet/#Ensemble-model-wrapper","page":"MLJ Cheatsheet","title":"Ensemble model wrapper","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"EnsembleModel(model; weights=Float64[], bagging_fraction=0.8, rng=GLOBAL_RNG, n=100, parallel=true, out_of_bag_measure=[])","category":"page"},{"location":"mlj_cheatsheet/#Target-transformation-wrapper","page":"MLJ Cheatsheet","title":"Target transformation wrapper","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"TransformedTargetModel(model; target=Standardizer())","category":"page"},{"location":"mlj_cheatsheet/#Pipelines","page":"MLJ Cheatsheet","title":"Pipelines","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"pipe = (X -> coerce(X, :height=>Continuous)) |> OneHotEncoder |> KNNRegressor(K=3)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Unsupervised:\npipe = Standardizer |> OneHotEncoder\nConcatenation:\npipe1 |> pipe2 or model |> pipe or pipe |> model, etc.","category":"page"},{"location":"mlj_cheatsheet/#Advanced-model-composition-techniques","page":"MLJ Cheatsheet","title":"Advanced model composition techniques","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"See the Composing Models section of the MLJ manual.","category":"page"},{"location":"models/ExtraTreesClassifier_MLJScikitLearnInterface/#ExtraTreesClassifier_MLJScikitLearnInterface","page":"ExtraTreesClassifier","title":"ExtraTreesClassifier","text":"","category":"section"},{"location":"models/ExtraTreesClassifier_MLJScikitLearnInterface/","page":"ExtraTreesClassifier","title":"ExtraTreesClassifier","text":"ExtraTreesClassifier","category":"page"},{"location":"models/ExtraTreesClassifier_MLJScikitLearnInterface/","page":"ExtraTreesClassifier","title":"ExtraTreesClassifier","text":"A model type for constructing an extra trees classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model 
interface.","category":"page"},{"location":"models/ExtraTreesClassifier_MLJScikitLearnInterface/","page":"ExtraTreesClassifier","title":"ExtraTreesClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/ExtraTreesClassifier_MLJScikitLearnInterface/","page":"ExtraTreesClassifier","title":"ExtraTreesClassifier","text":"ExtraTreesClassifier = @load ExtraTreesClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/ExtraTreesClassifier_MLJScikitLearnInterface/","page":"ExtraTreesClassifier","title":"ExtraTreesClassifier","text":"Do model = ExtraTreesClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ExtraTreesClassifier(n_estimators=...).","category":"page"},{"location":"models/ExtraTreesClassifier_MLJScikitLearnInterface/","page":"ExtraTreesClassifier","title":"ExtraTreesClassifier","text":"Extra trees classifier, fits a number of randomized decision trees on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting.","category":"page"},{"location":"models/SGDRegressor_MLJScikitLearnInterface/#SGDRegressor_MLJScikitLearnInterface","page":"SGDRegressor","title":"SGDRegressor","text":"","category":"section"},{"location":"models/SGDRegressor_MLJScikitLearnInterface/","page":"SGDRegressor","title":"SGDRegressor","text":"SGDRegressor","category":"page"},{"location":"models/SGDRegressor_MLJScikitLearnInterface/","page":"SGDRegressor","title":"SGDRegressor","text":"A model type for constructing a stochastic gradient descent-based regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/SGDRegressor_MLJScikitLearnInterface/","page":"SGDRegressor","title":"SGDRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/SGDRegressor_MLJScikitLearnInterface/","page":"SGDRegressor","title":"SGDRegressor","text":"SGDRegressor = @load SGDRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/SGDRegressor_MLJScikitLearnInterface/","page":"SGDRegressor","title":"SGDRegressor","text":"Do model = SGDRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in SGDRegressor(loss=...).","category":"page"},{"location":"models/SGDRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"SGDRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/SGDRegressor_MLJScikitLearnInterface/","page":"SGDRegressor","title":"SGDRegressor","text":"loss = squared_error\npenalty = l2\nalpha = 0.0001\nl1_ratio = 0.15\nfit_intercept = true\nmax_iter = 1000\ntol = 0.001\nshuffle = true\nverbose = 0\nepsilon = 0.1\nrandom_state = nothing\nlearning_rate = invscaling\neta0 = 0.01\npower_t = 0.25\nearly_stopping = false\nvalidation_fraction = 0.1\nn_iter_no_change = 5\nwarm_start = false\naverage = false","category":"page"},{"location":"models/LassoCVRegressor_MLJScikitLearnInterface/#LassoCVRegressor_MLJScikitLearnInterface","page":"LassoCVRegressor","title":"LassoCVRegressor","text":"","category":"section"},{"location":"models/LassoCVRegressor_MLJScikitLearnInterface/","page":"LassoCVRegressor","title":"LassoCVRegressor","text":"LassoCVRegressor","category":"page"},{"location":"models/LassoCVRegressor_MLJScikitLearnInterface/","page":"LassoCVRegressor","title":"LassoCVRegressor","text":"A model type for constructing a lasso regressor with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LassoCVRegressor_MLJScikitLearnInterface/","page":"LassoCVRegressor","title":"LassoCVRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LassoCVRegressor_MLJScikitLearnInterface/","page":"LassoCVRegressor","title":"LassoCVRegressor","text":"LassoCVRegressor = @load LassoCVRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/LassoCVRegressor_MLJScikitLearnInterface/","page":"LassoCVRegressor","title":"LassoCVRegressor","text":"Do model = LassoCVRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in LassoCVRegressor(eps=...).","category":"page"},{"location":"models/LassoCVRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"LassoCVRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/LassoCVRegressor_MLJScikitLearnInterface/","page":"LassoCVRegressor","title":"LassoCVRegressor","text":"eps = 0.001\nn_alphas = 100\nalphas = nothing\nfit_intercept = true\nprecompute = auto\nmax_iter = 1000\ntol = 0.0001\ncopy_X = true\ncv = 5\nverbose = false\nn_jobs = nothing\npositive = false\nrandom_state = nothing\nselection = cyclic","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/#BorderlineSMOTE1_Imbalance","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"","category":"section"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"Initiate a BorderlineSMOTE1 model with the given hyper-parameters.","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"BorderlineSMOTE1","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"A model type for constructing a borderline SMOTE1 model, based on Imbalance.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"BorderlineSMOTE1 = @load BorderlineSMOTE1 pkg=Imbalance","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"Do model = BorderlineSMOTE1() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BorderlineSMOTE1(m=...).","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"BorderlineSMOTE1 implements the BorderlineSMOTE1 algorithm to correct for class imbalance as in Han, H., Wang, W.-Y., & Mao, B.-H. (2005). Borderline-SMOTE: A new over-sampling method in imbalanced data sets learning. In D.S. Huang, X.-P. Zhang, & G.-B. Huang (Eds.), Advances in Intelligent Computing (pp. 878-887). Springer. 
","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/#Training-data","page":"BorderlineSMOTE1","title":"Training data","text":"","category":"section"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"In MLJ or MLJBase, wrap the model in a machine by","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"mach = machine(model)","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"There is no need to provide any data here because the model is a static transformer.","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"Likewise, there is no need to fit!(mach).","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"For default values of the hyper-parameters, model can be constructed by","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"model = BorderlineSMOTE1()","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/#Hyperparameters","page":"BorderlineSMOTE1","title":"Hyperparameters","text":"","category":"section"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"m::Integer=5: The number of neighbors to consider while checking the BorderlineSMOTE1 condition. Should be within the range 0 < m < N where N is the number of observations in the data. It will be automatically set to N-1 if N ≤ m.\nk::Integer=5: Number of nearest neighbors to consider in the SMOTE part of the algorithm. Should be within the range 0 < k < n where n is the number of observations in the smallest class. It will be automatically set to l-1 for any class with l points where l ≤ k.\nratios=1.0: A parameter that controls the amount of oversampling to be done for each class\nCan be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class\nCan be a dictionary mapping each class label to the float ratio for that class\nrng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister`.\nverbosity::Integer=1: Whenever higher than 0 info regarding the points that will participate in oversampling is logged.","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/#Transform-Inputs","page":"BorderlineSMOTE1","title":"Transform Inputs","text":"","category":"section"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"X: A matrix or table of floats where each row is an observation from the dataset\ny: An abstract vector of labels (e.g., strings) that correspond to the observations in X","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/#Transform-Outputs","page":"BorderlineSMOTE1","title":"Transform Outputs","text":"","category":"section"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"Xover: A matrix or table that includes original data and the new observations due to oversampling. 
depending on whether the input X is a matrix or table respectively\nyover: An abstract vector of labels corresponding to Xover","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/#Operations","page":"BorderlineSMOTE1","title":"Operations","text":"","category":"section"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"transform(mach, X, y): resample the data X and y using BorderlineSMOTE1, returning both the new and original observations","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/#Example","page":"BorderlineSMOTE1","title":"Example","text":"","category":"section"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"using MLJ\nimport Imbalance\n\n## set probability of each class\nclass_probs = [0.5, 0.2, 0.3] \nnum_rows, num_continuous_feats = 1000, 5\n## generate a table and categorical vector accordingly\nX, y = Imbalance.generate_imbalanced_data(num_rows, num_continuous_feats; \n stds=[0.1 0.1 0.1], min_sep=0.01, class_probs, rng=42) \n\njulia> Imbalance.checkbalance(y)\n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 200 (40.8%) \n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 310 (63.3%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 490 (100.0%) \n\n## load BorderlineSMOTE1\nBorderlineSMOTE1 = @load BorderlineSMOTE1 pkg=Imbalance\n\n## wrap the model in a machine\noversampler = BorderlineSMOTE1(m=3, k=5, ratios=Dict(0=>1.0, 1=> 0.9, 2=>0.8), rng=42)\nmach = machine(oversampler)\n\n## provide the data to transform (there is nothing to fit)\nXover, yover = transform(mach, X, y)\n\n\njulia> Imbalance.checkbalance(yover)\n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 392 (80.0%) \n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 441 (90.0%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 490 (100.0%) ","category":"page"},{"location":"models/DecisionTreeClassifier_BetaML/#DecisionTreeClassifier_BetaML","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"","category":"section"},{"location":"models/DecisionTreeClassifier_BetaML/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"mutable struct DecisionTreeClassifier <: MLJModelInterface.Probabilistic","category":"page"},{"location":"models/DecisionTreeClassifier_BetaML/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"A simple Decision Tree model for classification with support for Missing data, from the Beta Machine Learning Toolkit (BetaML).","category":"page"},{"location":"models/DecisionTreeClassifier_BetaML/#Hyperparameters:","page":"DecisionTreeClassifier","title":"Hyperparameters:","text":"","category":"section"},{"location":"models/DecisionTreeClassifier_BetaML/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"max_depth::Int64: The maximum depth the tree is allowed to reach. When this is reached the node is forced to become a leaf [def: 0, i.e. no limits]\nmin_gain::Float64: The minimum information gain to allow for a node's partition [def: 0]\nmin_records::Int64: The minimum number of records a node must holds to consider for a partition of it [def: 2]\nmax_features::Int64: The maximum number of (random) features to consider at each partitioning [def: 0, i.e. look at all features]\nsplitting_criterion::Function: This is the name of the function to be used to compute the information gain of a specific partition. 
This is done by measuring the difference betwwen the \"impurity\" of the labels of the parent node with those of the two child nodes, weighted by the respective number of items. [def: gini]. Either gini, entropy or a custom function. It can also be an anonymous function.\nrng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [deafult: Random.GLOBAL_RNG]","category":"page"},{"location":"models/DecisionTreeClassifier_BetaML/#Example:","page":"DecisionTreeClassifier","title":"Example:","text":"","category":"section"},{"location":"models/DecisionTreeClassifier_BetaML/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"julia> using MLJ\n\njulia> X, y = @load_iris;\n\njulia> modelType = @load DecisionTreeClassifier pkg = \"BetaML\" verbosity=0\nBetaML.Trees.DecisionTreeClassifier\n\njulia> model = modelType()\nDecisionTreeClassifier(\n max_depth = 0, \n min_gain = 0.0, \n min_records = 2, \n max_features = 0, \n splitting_criterion = BetaML.Utils.gini, \n rng = Random._GLOBAL_RNG())\n\njulia> mach = machine(model, X, y);\n\njulia> fit!(mach);\n[ Info: Training machine(DecisionTreeClassifier(max_depth = 0, …), …).\n\njulia> cat_est = predict(mach, X)\n150-element CategoricalDistributions.UnivariateFiniteVector{Multiclass{3}, String, UInt32, Float64}:\n UnivariateFinite{Multiclass{3}}(setosa=>1.0, versicolor=>0.0, virginica=>0.0)\n UnivariateFinite{Multiclass{3}}(setosa=>1.0, versicolor=>0.0, virginica=>0.0)\n ⋮\n UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.0, virginica=>1.0)\n UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.0, virginica=>1.0)\n UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.0, virginica=>1.0)","category":"page"},{"location":"loading_model_code/#Loading-Model-Code","page":"Loading Model Code","title":"Loading Model Code","text":"","category":"section"},{"location":"loading_model_code/","page":"Loading Model Code","title":"Loading Model Code","text":"Once the name of a model, and the package providing that model, have been identified (see Model Search) one can either import the model type interactively with @iload, as shown under Installation, or use @load as shown below. The @load macro works from within a module, a package or a function, provided the relevant package providing the MLJ interface has been added to your package environment. It will attempt to load the model type into the global namespace of the module in which @load is invoked (Main if invoked at the REPL).","category":"page"},{"location":"loading_model_code/","page":"Loading Model Code","title":"Loading Model Code","text":"In general, the code providing core functionality for the model (living in a package you should consult for documentation) may be different from the package providing the MLJ interface. 
Since the core package is a dependency of the interface package, only the interface package needs to be added to your environment.","category":"page"},{"location":"loading_model_code/","page":"Loading Model Code","title":"Loading Model Code","text":"For instance, suppose you have activated a Julia package environment my_env that you wish to use for your MLJ project; for example, you have run:","category":"page"},{"location":"loading_model_code/","page":"Loading Model Code","title":"Loading Model Code","text":"using Pkg\nPkg.activate(\"my_env\", shared=true)","category":"page"},{"location":"loading_model_code/","page":"Loading Model Code","title":"Loading Model Code","text":"Furthermore, suppose you want to use DecisionTreeClassifier, provided by the DecisionTree.jl package. Then, to determine which package provides the MLJ interface you call load_path:","category":"page"},{"location":"loading_model_code/","page":"Loading Model Code","title":"Loading Model Code","text":"julia> load_path(\"DecisionTreeClassifier\", pkg=\"DecisionTree\")\n\"MLJDecisionTreeInterface.DecisionTreeClassifier\"","category":"page"},{"location":"loading_model_code/","page":"Loading Model Code","title":"Loading Model Code","text":"In this case, we see that the package required is MLJDecisionTreeInterface.jl. If this package is not in my_env (do Pkg.status() to check) you add it by running","category":"page"},{"location":"loading_model_code/","page":"Loading Model Code","title":"Loading Model Code","text":"julia> Pkg.add(\"MLJDecisionTreeInterface\")","category":"page"},{"location":"loading_model_code/","page":"Loading Model Code","title":"Loading Model Code","text":"So long as my_env is the active environment, this action need never be repeated (unless you run Pkg.rm(\"MLJDecisionTreeInterface\")). You are now ready to instantiate a decision tree classifier:","category":"page"},{"location":"loading_model_code/","page":"Loading Model Code","title":"Loading Model Code","text":"julia> Tree = @load DecisionTreeClassifier pkg=DecisionTree\njulia> tree = Tree()","category":"page"},{"location":"loading_model_code/","page":"Loading Model Code","title":"Loading Model Code","text":"which is equivalent to","category":"page"},{"location":"loading_model_code/","page":"Loading Model Code","title":"Loading Model Code","text":"julia> import MLJDecisionTreeInterface.DecisionTreeClassifier\njulia> Tree = MLJDecisionTreeInterface.DecisionTreeClassifier\njulia> tree = Tree()","category":"page"},{"location":"loading_model_code/","page":"Loading Model Code","title":"Loading Model Code","text":"Tip. The specification pkg=... above can be dropped for the many models that are provided by only a single package.","category":"page"},{"location":"loading_model_code/#API","page":"Loading Model Code","title":"API","text":"","category":"section"},{"location":"loading_model_code/","page":"Loading Model Code","title":"Loading Model Code","text":"load_path\n@load\n@iload","category":"page"},{"location":"loading_model_code/#StatisticalTraits.load_path","page":"Loading Model Code","title":"StatisticalTraits.load_path","text":"load_path(model_name::String, pkg=nothing)\n\nReturn the load path for model type with name model_name, specifying the algorithm-providing package name pkg to resolve name conflicts, if necessary.\n\nload_path(proxy::NamedTuple)\n\nReturn the load path for the model whose name is proxy.name and whose algorithm-providing package has name proxy.package_name. 
For example, proxy could be any element of the vector returned by models().\n\nload_path(model)\n\nReturn the load path of a model instance or type. Usually requires necessary model code to have been separately loaded. Supply strings as above if code is not loaded.\n\n\n\n\n\n","category":"function"},{"location":"loading_model_code/#MLJModels.@load","page":"Loading Model Code","title":"MLJModels.@load","text":"@load ModelName pkg=nothing verbosity=0 add=false\n\nImport the model type for the model named in the first argument into the calling module, specifying pkg in the case of an ambiguous name (two or more packages providing a model type with the same name). Returns the model type.\n\nWarning In older versions of MLJ/MLJModels, @load returned an instance instead.\n\nTo automatically add required interface packages to the current environment, specify add=true. For interactive loading, use @iload instead.\n\nExamples\n\nTree = @load DecisionTreeRegressor\ntree = Tree()\ntree2 = Tree(min_samples_split=6)\n\nSVM = @load SVC pkg=LIBSVM\nsvm = SVM()\n\nSee also @iload\n\n\n\n\n\n","category":"macro"},{"location":"loading_model_code/#MLJModels.@iload","page":"Loading Model Code","title":"MLJModels.@iload","text":"@iload ModelName\n\nInteractive alternative to @load. Provides the user with an option to install (add) the required interface package to the current environment, and to choose the relevant model-providing package in ambiguous cases. See @load\n\n\n\n\n\n","category":"macro"},{"location":"models/MCDDetector_OutlierDetectionPython/#MCDDetector_OutlierDetectionPython","page":"MCDDetector","title":"MCDDetector","text":"","category":"section"},{"location":"models/MCDDetector_OutlierDetectionPython/","page":"MCDDetector","title":"MCDDetector","text":"MCDDetector(store_precision = true,\n assume_centered = false,\n support_fraction = nothing,\n random_state = nothing)","category":"page"},{"location":"models/MCDDetector_OutlierDetectionPython/","page":"MCDDetector","title":"MCDDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.mcd","category":"page"},{"location":"models/OneClassSVM_LIBSVM/#OneClassSVM_LIBSVM","page":"OneClassSVM","title":"OneClassSVM","text":"","category":"section"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"OneClassSVM","category":"page"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"A model type for constructing a one-class support vector machine, based on LIBSVM.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"OneClassSVM = @load OneClassSVM pkg=LIBSVM","category":"page"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"Do model = OneClassSVM() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in OneClassSVM(kernel=...).","category":"page"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): \"LIBSVM: a library for support vector machines.\" ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. 
Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf. ","category":"page"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"This model is an outlier detection model delivering raw scores based on the decision function of a support vector machine. Like the NuSVC classifier, it uses the nu re-parameterization of the cost parameter appearing in standard support vector classification SVC.","category":"page"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"To extract normalized scores (\"probabilities\") wrap the model using ProbabilisticDetector from OutlierDetection.jl. For threshold-based classification, wrap the probabilistic model using MLJ's BinaryThresholdPredictor. Examples of wrapping appear below.","category":"page"},{"location":"models/OneClassSVM_LIBSVM/#Training-data","page":"OneClassSVM","title":"Training data","text":"","category":"section"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"In MLJ or MLJBase, bind an instance model to data with:","category":"page"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"where","category":"page"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)","category":"page"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/OneClassSVM_LIBSVM/#Hyper-parameters","page":"OneClassSVM","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see \"Examples\" below).\nLIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2\nLIBSVM.Kernel.Polynomial: (x1, x2) -> (gamma*x1'*x2 + coef0)^degree\nLIBSVM.Kernel.RadialBasis: (x1, x2) -> (exp(-gamma*norm(x1 - x2)^2))\nLIBSVM.Kernel.Sigmoid: (x1, x2) -> tanh(gamma*x1'*x2 + coef0)\nHere gamma, coef0, degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions. See LIBSVM.jl issue 91\ngamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. Actual value used appears in the report (see below).\ncoef0 = 0.0: kernel parameter (see above)\ndegree::Int32 = Int32(3): degree in polynomial kernel (see above)\nnu=0.5 (range (0, 1]): An upper bound on the fraction of margin errors and a lower bound of the fraction of support vectors. Denoted ν in the cited paper. 
Changing nu changes the thickness of the margin (a neighborhood of the decision surface) and a margin error is said to have occurred if a training observation lies on the wrong side of the surface or within the margin.\ncachesize=200.0 cache memory size in MB\ntolerance=0.001: tolerance for the stopping criterion\nshrinking=true: whether to use shrinking heuristics","category":"page"},{"location":"models/OneClassSVM_LIBSVM/#Operations","page":"OneClassSVM","title":"Operations","text":"","category":"section"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"transform(mach, Xnew): return scores for outlierness, given features Xnew having the same scitype as X above. The greater the score, the more likely it is an outlier. This score is based on the SVM decision function. For normalized scores, wrap model using ProbabilisticDetector from OutlierDetection.jl and call predict instead, and for threshold-based classification, wrap again using BinaryThresholdPredictor. See the examples below.","category":"page"},{"location":"models/OneClassSVM_LIBSVM/#Fitted-parameters","page":"OneClassSVM","title":"Fitted parameters","text":"","category":"section"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"libsvm_model: the trained model object created by the LIBSVM.jl package\norientation: this equals 1 if the decision function for libsvm_model is increasing with increasing outlierness, and -1 if it is decreasing instead. Correspondingly, the libsvm_model attaches true to outliers in the first case, and false in the second. (The scores given in the MLJ report and generated by MLJ.transform already correct for this ambiguity, which is therefore only an issue for users directly accessing libsvm_model.)","category":"page"},{"location":"models/OneClassSVM_LIBSVM/#Report","page":"OneClassSVM","title":"Report","text":"","category":"section"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"The fields of report(mach) are:","category":"page"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"gamma: actual value of the kernel parameter gamma used in training","category":"page"},{"location":"models/OneClassSVM_LIBSVM/#Examples","page":"OneClassSVM","title":"Examples","text":"","category":"section"},{"location":"models/OneClassSVM_LIBSVM/#Generating-raw-scores-for-outlierness","page":"OneClassSVM","title":"Generating raw scores for outlierness","text":"","category":"section"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"using MLJ\nimport LIBSVM\nimport StableRNGs.StableRNG\n\nOneClassSVM = @load OneClassSVM pkg=LIBSVM ## model type\nmodel = OneClassSVM(kernel=LIBSVM.Kernel.Polynomial) ## instance\n\nrng = StableRNG(123)\nXmatrix = randn(rng, 5, 3)\nXmatrix[1, 1] = 100.0\nX = MLJ.table(Xmatrix)\n\nmach = machine(model, X) |> fit!\n\n## training scores (outliers have larger scores):\njulia> report(mach).scores\n5-element Vector{Float64}:\n 6.711689156091755e-7\n -6.740101976655081e-7\n -6.711632439648446e-7\n -6.743015858874887e-7\n -6.745393717880104e-7\n\n## scores for new data:\nXnew = MLJ.table(rand(rng, 2, 3))\n\njulia> transform(mach, rand(rng, 2, 3))\n2-element Vector{Float64}:\n -6.746293022511047e-7\n 
-6.744289265348623e-7","category":"page"},{"location":"models/OneClassSVM_LIBSVM/#Generating-probabilistic-predictions-of-outlierness","page":"OneClassSVM","title":"Generating probabilistic predictions of outlierness","text":"","category":"section"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"Continuing the previous example:","category":"page"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"using OutlierDetection\npmodel = ProbabilisticDetector(model)\npmach = machine(pmodel, X) |> fit!\n\n## probabilistic predictions on new data:\n\njulia> y_prob = predict(pmach, Xnew)\n2-element UnivariateFiniteVector{OrderedFactor{2}, String, UInt8, Float64}:\n UnivariateFinite{OrderedFactor{2}}(normal=>1.0, outlier=>9.57e-5)\n UnivariateFinite{OrderedFactor{2}}(normal=>1.0, outlier=>0.0)\n\n## probabilities for outlierness:\n\njulia> pdf.(y_prob, \"outlier\")\n2-element Vector{Float64}:\n 9.572583265925801e-5\n 0.0\n\n## raw scores are still available using `transform`:\n\njulia> transform(pmach, Xnew)\n2-element Vector{Float64}:\n 9.572583265925801e-5\n 0.0","category":"page"},{"location":"models/OneClassSVM_LIBSVM/#Outlier-classification-using-a-probability-threshold:","page":"OneClassSVM","title":"Outlier classification using a probability threshold:","text":"","category":"section"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"Continuing the previous example:","category":"page"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"dmodel = BinaryThresholdPredictor(pmodel, threshold=0.9)\ndmach = machine(dmodel, X) |> fit!\n\njulia> yhat = predict(dmach, Xnew)\n2-element CategoricalArrays.CategoricalArray{String,1,UInt8}:\n \"normal\"\n \"normal\"","category":"page"},{"location":"models/OneClassSVM_LIBSVM/#User-defined-kernels","page":"OneClassSVM","title":"User-defined kernels","text":"","category":"section"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"Continuing the first example:","category":"page"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"k(x1, x2) = x1'*x2 ## equivalent to `LIBSVM.Kernel.Linear`\nmodel = OneClassSVM(kernel=k)\nmach = machine(model, X) |> fit!\n\njulia> yhat = transform(mach, Xnew)\n2-element Vector{Float64}:\n -0.4825363352732942\n -0.4848772169720227","category":"page"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"See also LIVSVM.jl and the original C implementation documentation. 
For an alternative source of outlier detection models with an MLJ interface, see OutlierDetection.jl.","category":"page"},{"location":"models/KNeighborsClassifier_MLJScikitLearnInterface/#KNeighborsClassifier_MLJScikitLearnInterface","page":"KNeighborsClassifier","title":"KNeighborsClassifier","text":"","category":"section"},{"location":"models/KNeighborsClassifier_MLJScikitLearnInterface/","page":"KNeighborsClassifier","title":"KNeighborsClassifier","text":"KNeighborsClassifier","category":"page"},{"location":"models/KNeighborsClassifier_MLJScikitLearnInterface/","page":"KNeighborsClassifier","title":"KNeighborsClassifier","text":"A model type for constructing a K-nearest neighbors classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/KNeighborsClassifier_MLJScikitLearnInterface/","page":"KNeighborsClassifier","title":"KNeighborsClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/KNeighborsClassifier_MLJScikitLearnInterface/","page":"KNeighborsClassifier","title":"KNeighborsClassifier","text":"KNeighborsClassifier = @load KNeighborsClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/KNeighborsClassifier_MLJScikitLearnInterface/","page":"KNeighborsClassifier","title":"KNeighborsClassifier","text":"Do model = KNeighborsClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KNeighborsClassifier(n_neighbors=...).","category":"page"},{"location":"models/KNeighborsClassifier_MLJScikitLearnInterface/#Hyper-parameters","page":"KNeighborsClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/KNeighborsClassifier_MLJScikitLearnInterface/","page":"KNeighborsClassifier","title":"KNeighborsClassifier","text":"n_neighbors = 5\nweights = uniform\nalgorithm = auto\nleaf_size = 30\np = 2\nmetric = minkowski\nmetric_params = nothing\nn_jobs = nothing","category":"page"},{"location":"models/BayesianLDA_MLJScikitLearnInterface/#BayesianLDA_MLJScikitLearnInterface","page":"BayesianLDA","title":"BayesianLDA","text":"","category":"section"},{"location":"models/BayesianLDA_MLJScikitLearnInterface/","page":"BayesianLDA","title":"BayesianLDA","text":"BayesianLDA","category":"page"},{"location":"models/BayesianLDA_MLJScikitLearnInterface/","page":"BayesianLDA","title":"BayesianLDA","text":"A model type for constructing a Bayesian linear discriminant analysis, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/BayesianLDA_MLJScikitLearnInterface/","page":"BayesianLDA","title":"BayesianLDA","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/BayesianLDA_MLJScikitLearnInterface/","page":"BayesianLDA","title":"BayesianLDA","text":"BayesianLDA = @load BayesianLDA pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/BayesianLDA_MLJScikitLearnInterface/","page":"BayesianLDA","title":"BayesianLDA","text":"Do model = BayesianLDA() to construct an instance with default hyper-parameters. 
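A minimal usage sketch (this assumes BayesianLDA behaves as a probabilistic classifier, so predict_mode is used for point predictions; make_blobs is the synthetic-data helper from the cheatsheet):\n\nBayesianLDA = @load BayesianLDA pkg=MLJScikitLearnInterface\nX, y = make_blobs(100, 3)            ## synthetic classification data\nmach = machine(BayesianLDA(), X, y) |> fit!\nyhat = predict_mode(mach, X)\n\n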
Provide keyword arguments to override hyper-parameter defaults, as in BayesianLDA(solver=...).","category":"page"},{"location":"models/BayesianLDA_MLJScikitLearnInterface/#Hyper-parameters","page":"BayesianLDA","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/BayesianLDA_MLJScikitLearnInterface/","page":"BayesianLDA","title":"BayesianLDA","text":"solver = svd\nshrinkage = nothing\npriors = nothing\nn_components = nothing\nstore_covariance = false\ntol = 0.0001\ncovariance_estimator = nothing","category":"page"},{"location":"models/LassoLarsCVRegressor_MLJScikitLearnInterface/#LassoLarsCVRegressor_MLJScikitLearnInterface","page":"LassoLarsCVRegressor","title":"LassoLarsCVRegressor","text":"","category":"section"},{"location":"models/LassoLarsCVRegressor_MLJScikitLearnInterface/","page":"LassoLarsCVRegressor","title":"LassoLarsCVRegressor","text":"LassoLarsCVRegressor","category":"page"},{"location":"models/LassoLarsCVRegressor_MLJScikitLearnInterface/","page":"LassoLarsCVRegressor","title":"LassoLarsCVRegressor","text":"A model type for constructing a Lasso model fit with least angle regression (LARS) with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LassoLarsCVRegressor_MLJScikitLearnInterface/","page":"LassoLarsCVRegressor","title":"LassoLarsCVRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LassoLarsCVRegressor_MLJScikitLearnInterface/","page":"LassoLarsCVRegressor","title":"LassoLarsCVRegressor","text":"LassoLarsCVRegressor = @load LassoLarsCVRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/LassoLarsCVRegressor_MLJScikitLearnInterface/","page":"LassoLarsCVRegressor","title":"LassoLarsCVRegressor","text":"Do model = LassoLarsCVRegressor() to construct an instance with default hyper-parameters. 
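A quick way to gauge its performance is MLJ's generic evaluate function, as in this sketch (make_regression supplies synthetic data and rms is a built-in measure):\n\nLassoLarsCVRegressor = @load LassoLarsCVRegressor pkg=MLJScikitLearnInterface\nX, y = make_regression(200, 5)\nevaluate(LassoLarsCVRegressor(), X, y, resampling=CV(nfolds=5), measure=rms)\n\n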
Provide keyword arguments to override hyper-parameter defaults, as in LassoLarsCVRegressor(fit_intercept=...).","category":"page"},{"location":"models/LassoLarsCVRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"LassoLarsCVRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/LassoLarsCVRegressor_MLJScikitLearnInterface/","page":"LassoLarsCVRegressor","title":"LassoLarsCVRegressor","text":"fit_intercept = true\nverbose = false\nmax_iter = 500\nnormalize = false\nprecompute = auto\ncv = 5\nmax_n_alphas = 1000\nn_jobs = nothing\neps = 2.220446049250313e-16\ncopy_X = true\npositive = false","category":"page"},{"location":"models/OrthogonalMatchingPursuitCVRegressor_MLJScikitLearnInterface/#OrthogonalMatchingPursuitCVRegressor_MLJScikitLearnInterface","page":"OrthogonalMatchingPursuitCVRegressor","title":"OrthogonalMatchingPursuitCVRegressor","text":"","category":"section"},{"location":"models/OrthogonalMatchingPursuitCVRegressor_MLJScikitLearnInterface/","page":"OrthogonalMatchingPursuitCVRegressor","title":"OrthogonalMatchingPursuitCVRegressor","text":"OrthogonalMatchingPursuitCVRegressor","category":"page"},{"location":"models/OrthogonalMatchingPursuitCVRegressor_MLJScikitLearnInterface/","page":"OrthogonalMatchingPursuitCVRegressor","title":"OrthogonalMatchingPursuitCVRegressor","text":"A model type for constructing an orthogonal matching pursuit (OMP) model with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/OrthogonalMatchingPursuitCVRegressor_MLJScikitLearnInterface/","page":"OrthogonalMatchingPursuitCVRegressor","title":"OrthogonalMatchingPursuitCVRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/OrthogonalMatchingPursuitCVRegressor_MLJScikitLearnInterface/","page":"OrthogonalMatchingPursuitCVRegressor","title":"OrthogonalMatchingPursuitCVRegressor","text":"OrthogonalMatchingPursuitCVRegressor = @load OrthogonalMatchingPursuitCVRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/OrthogonalMatchingPursuitCVRegressor_MLJScikitLearnInterface/","page":"OrthogonalMatchingPursuitCVRegressor","title":"OrthogonalMatchingPursuitCVRegressor","text":"Do model = OrthogonalMatchingPursuitCVRegressor() to construct an instance with default hyper-parameters. 
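A minimal usage sketch, which also shows the generic accessors fitted_params and report (their exact contents are model-specific):\n\nOMPCV = @load OrthogonalMatchingPursuitCVRegressor pkg=MLJScikitLearnInterface\nX, y = make_regression(150, 4)       ## synthetic regression data\nmach = machine(OMPCV(), X, y) |> fit!\nfitted_params(mach)                  ## learned parameters\nreport(mach)                         ## other training by-products\n\n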
Provide keyword arguments to override hyper-parameter defaults, as in OrthogonalMatchingPursuitCVRegressor(copy=...).","category":"page"},{"location":"models/OrthogonalMatchingPursuitCVRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"OrthogonalMatchingPursuitCVRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/OrthogonalMatchingPursuitCVRegressor_MLJScikitLearnInterface/","page":"OrthogonalMatchingPursuitCVRegressor","title":"OrthogonalMatchingPursuitCVRegressor","text":"copy = true\nfit_intercept = true\nnormalize = false\nmax_iter = nothing\ncv = 5\nn_jobs = 1\nverbose = false","category":"page"},{"location":"models/KMeansClusterer_BetaML/#KMeansClusterer_BetaML","page":"KMeansClusterer","title":"KMeansClusterer","text":"","category":"section"},{"location":"models/KMeansClusterer_BetaML/","page":"KMeansClusterer","title":"KMeansClusterer","text":"mutable struct KMeansClusterer <: MLJModelInterface.Unsupervised","category":"page"},{"location":"models/KMeansClusterer_BetaML/","page":"KMeansClusterer","title":"KMeansClusterer","text":"The classical KMeansClusterer clustering algorithm, from the Beta Machine Learning Toolkit (BetaML).","category":"page"},{"location":"models/KMeansClusterer_BetaML/#Parameters:","page":"KMeansClusterer","title":"Parameters:","text":"","category":"section"},{"location":"models/KMeansClusterer_BetaML/","page":"KMeansClusterer","title":"KMeansClusterer","text":"n_classes::Int64: Number of classes to discriminate the data [def: 3]\ndist::Function: Function to employ as distance. Default to the Euclidean distance. Can be one of the predefined distances (l1_distance, l2_distance, l2squared_distance), cosine_distance), any user defined function accepting two vectors and returning a scalar or an anonymous function with the same characteristics. Attention that, contrary to KMedoidsClusterer, the KMeansClusterer algorithm is not guaranteed to converge with other distances than the Euclidean one.\ninitialisation_strategy::String: The computation method of the vector of the initial representatives. 
One of the following:\n\"random\": randomly in the X space\n\"grid\": using a grid approach\n\"shuffle\": selecting randomly within the available points [default]\n\"given\": using a provided set of initial representatives provided in the initial_representatives parameter\ninitial_representatives::Union{Nothing, Matrix{Float64}}: Provided (K x D) matrix of initial representatives (useful only with initialisation_strategy=\"given\") [default: nothing]\nrng::Random.AbstractRNG: Random Number Generator [deafult: Random.GLOBAL_RNG]","category":"page"},{"location":"models/KMeansClusterer_BetaML/#Notes:","page":"KMeansClusterer","title":"Notes:","text":"","category":"section"},{"location":"models/KMeansClusterer_BetaML/","page":"KMeansClusterer","title":"KMeansClusterer","text":"data must be numerical\nonline fitting (re-fitting with new data) is supported","category":"page"},{"location":"models/KMeansClusterer_BetaML/#Example:","page":"KMeansClusterer","title":"Example:","text":"","category":"section"},{"location":"models/KMeansClusterer_BetaML/","page":"KMeansClusterer","title":"KMeansClusterer","text":"julia> using MLJ\n\njulia> X, y = @load_iris;\n\njulia> modelType = @load KMeansClusterer pkg = \"BetaML\" verbosity=0\nBetaML.Clustering.KMeansClusterer\n\njulia> model = modelType()\nKMeansClusterer(\n n_classes = 3, \n dist = BetaML.Clustering.var\"#34#36\"(), \n initialisation_strategy = \"shuffle\", \n initial_representatives = nothing, \n rng = Random._GLOBAL_RNG())\n\njulia> mach = machine(model, X);\n\njulia> fit!(mach);\n[ Info: Training machine(KMeansClusterer(n_classes = 3, …), …).\n\njulia> classes_est = predict(mach, X);\n\njulia> hcat(y,classes_est)\n150×2 CategoricalArrays.CategoricalArray{Union{Int64, String},2,UInt32}:\n \"setosa\" 2\n \"setosa\" 2\n \"setosa\" 2\n ⋮ \n \"virginica\" 3\n \"virginica\" 3\n \"virginica\" 1","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/#LassoRegressor_MLJLinearModels","page":"LassoRegressor","title":"LassoRegressor","text":"","category":"section"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"LassoRegressor","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"A model type for constructing a lasso regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"LassoRegressor = @load LassoRegressor pkg=MLJLinearModels","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"Do model = LassoRegressor() to construct an instance with default hyper-parameters.","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"Lasso regression is a linear model with objective function","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"$","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"|Xθ - y|₂²/2 + n⋅λ|θ|₁ 
$","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"where n is the number of observations.","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"If scale_penalty_with_samples = false the objective function is","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"$","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"|Xθ - y|₂²/2 + λ|θ|₁ $","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":".","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"Different solver options exist, as indicated under \"Hyperparameters\" below. ","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/#Training-data","page":"LassoRegressor","title":"Training data","text":"","category":"section"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"where:","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)\ny is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/#Hyperparameters","page":"LassoRegressor","title":"Hyperparameters","text":"","category":"section"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"lambda::Real: strength of the L1 regularization. Default: 1.0\nfit_intercept::Bool: whether to fit the intercept or not. Default: true\npenalize_intercept::Bool: whether to penalize the intercept. Default: false\nscale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true\nsolver::Union{Nothing, MLJLinearModels.Solver}: any instance of MLJLinearModels.ProxGrad. If solver=nothing (default) then ProxGrad(accel=true) (FISTA) is used. Solver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...). 
Default: nothing","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/#Example","page":"LassoRegressor","title":"Example","text":"","category":"section"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"using MLJ\nX, y = make_regression()\nmach = fit!(machine(LassoRegressor(), X, y))\npredict(mach, X)\nfitted_params(mach)","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"See also ElasticNetRegressor.","category":"page"},{"location":"model_stacking/#Model-Stacking","page":"Model Stacking","title":"Model Stacking","text":"","category":"section"},{"location":"model_stacking/","page":"Model Stacking","title":"Model Stacking","text":"In a model stack, as introduced by Wolpert (1992), an adjudicating model learns the best way to combine the predictions of multiple base models. In MLJ, such models are constructed using the Stack constructor. To learn more about stacking and to see how to construct a stack \"by hand\" using Learning Networks, see this Data Science in Julia tutorial","category":"page"},{"location":"model_stacking/","page":"Model Stacking","title":"Model Stacking","text":"MLJBase.Stack","category":"page"},{"location":"model_stacking/#MLJBase.Stack","page":"Model Stacking","title":"MLJBase.Stack","text":"Stack(; metalearner=nothing, name1=model1, name2=model2, ..., keyword_options...)\n\nImplements the two-layer generalized stack algorithm introduced by Wolpert (1992) and generalized by Van der Laan et al (2007). Returns an instance of type ProbabilisticStack or DeterministicStack, depending on the prediction type of metalearner.\n\nWhen training a machine bound to such an instance:\n\nThe data is split into training/validation sets according to the specified resampling strategy.\nEach base model model1, model2, ... is trained on each training subset and outputs predictions on the corresponding validation sets. The multi-fold predictions are spliced together into a so-called out-of-sample prediction for each model.\nThe adjudicating model, metalearner, is subsequently trained on the out-of-sample predictions to learn the best combination of base model predictions.\nEach base model is retrained on all supplied data for purposes of passing on new production data onto the adjudicator for making new predictions\n\nArguments\n\nmetalearner::Supervised: The model that will optimize the desired criterion based on its internals. For instance, a LinearRegression model will optimize the squared error.\nresampling: The resampling strategy used to prepare out-of-sample predictions of the base learners.\nmeasures: A measure or iterable over measures, to perform an internal evaluation of the learners in the Stack while training. This is not for the evaluation of the Stack itself.\ncache: Whether machines created in the learning network will cache data or not.\nacceleration: A supported AbstractResource to define the training parallelization mode of the stack.\nname1=model1, name2=model2, ...: the Supervised model instances to be used as base learners. 
The provided names become properties of the instance created to allow hyper-parameter access\n\nExample\n\nThe following code defines a DeterministicStack instance for learning a Continuous target, and demonstrates that:\n\nBase models can be Probabilistic models even if the stack itself is Deterministic (predict_mean is applied in such cases).\nAs an alternative to hyperparameter optimization, one can stack multiple copies of given model, mutating the hyper-parameter used in each copy.\n\nusing MLJ\n\nDecisionTreeRegressor = @load DecisionTreeRegressor pkg=DecisionTree\nEvoTreeRegressor = @load EvoTreeRegressor\nXGBoostRegressor = @load XGBoostRegressor\nKNNRegressor = @load KNNRegressor pkg=NearestNeighborModels\nLinearRegressor = @load LinearRegressor pkg=MLJLinearModels\n\nX, y = make_regression(500, 5)\n\nstack = Stack(;metalearner=LinearRegressor(),\n resampling=CV(),\n measures=rmse,\n constant=ConstantRegressor(),\n tree_2=DecisionTreeRegressor(max_depth=2),\n tree_3=DecisionTreeRegressor(max_depth=3),\n evo=EvoTreeRegressor(),\n knn=KNNRegressor(),\n xgb=XGBoostRegressor())\n\nmach = machine(stack, X, y)\nevaluate!(mach; resampling=Holdout(), measure=rmse)\n\n\nThe internal evaluation report can be accessed like this and provides a PerformanceEvaluation object for each model:\n\nreport(mach).cv_report\n\n\n\n\n\n","category":"type"},{"location":"models/LGBMClassifier_LightGBM/#LGBMClassifier_LightGBM","page":"LGBMClassifier","title":"LGBMClassifier","text":"","category":"section"},{"location":"models/LGBMClassifier_LightGBM/","page":"LGBMClassifier","title":"LGBMClassifier","text":"Microsoft LightGBM FFI wrapper: Classifier","category":"page"},{"location":"models/Birch_MLJScikitLearnInterface/#Birch_MLJScikitLearnInterface","page":"Birch","title":"Birch","text":"","category":"section"},{"location":"models/Birch_MLJScikitLearnInterface/","page":"Birch","title":"Birch","text":"Birch","category":"page"},{"location":"models/Birch_MLJScikitLearnInterface/","page":"Birch","title":"Birch","text":"A model type for constructing a birch, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/Birch_MLJScikitLearnInterface/","page":"Birch","title":"Birch","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/Birch_MLJScikitLearnInterface/","page":"Birch","title":"Birch","text":"Birch = @load Birch pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/Birch_MLJScikitLearnInterface/","page":"Birch","title":"Birch","text":"Do model = Birch() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in Birch(threshold=...).","category":"page"},{"location":"models/Birch_MLJScikitLearnInterface/","page":"Birch","title":"Birch","text":"Memory-efficient, online-learning algorithm provided as an alternative to MiniBatchKMeans. 
Note: noisy samples are given the label -1.","category":"page"},{"location":"models/MultitargetGaussianMixtureRegressor_BetaML/#MultitargetGaussianMixtureRegressor_BetaML","page":"MultitargetGaussianMixtureRegressor","title":"MultitargetGaussianMixtureRegressor","text":"","category":"section"},{"location":"models/MultitargetGaussianMixtureRegressor_BetaML/","page":"MultitargetGaussianMixtureRegressor","title":"MultitargetGaussianMixtureRegressor","text":"mutable struct MultitargetGaussianMixtureRegressor <: MLJModelInterface.Deterministic","category":"page"},{"location":"models/MultitargetGaussianMixtureRegressor_BetaML/","page":"MultitargetGaussianMixtureRegressor","title":"MultitargetGaussianMixtureRegressor","text":"A non-linear regressor derived from fitting the data on a probabilistic model (Gaussian Mixture Model). Relatively fast but generally not very precise, except for data with a structure matching the chosen underlying mixture.","category":"page"},{"location":"models/MultitargetGaussianMixtureRegressor_BetaML/","page":"MultitargetGaussianMixtureRegressor","title":"MultitargetGaussianMixtureRegressor","text":"This is the multi-target version of the model. If you want to predict a single label (y), use the MLJ model GaussianMixtureRegressor.","category":"page"},{"location":"models/MultitargetGaussianMixtureRegressor_BetaML/#Hyperparameters:","page":"MultitargetGaussianMixtureRegressor","title":"Hyperparameters:","text":"","category":"section"},{"location":"models/MultitargetGaussianMixtureRegressor_BetaML/","page":"MultitargetGaussianMixtureRegressor","title":"MultitargetGaussianMixtureRegressor","text":"n_classes::Int64: Number of mixtures (latent classes) to consider [def: 3]\ninitial_probmixtures::Vector{Float64}: Initial probabilities of the categorical distribution (n_classes x 1) [default: []]\nmixtures::Union{Type, Vector{<:BetaML.GMM.AbstractMixture}}: An array (of length n_classes) of the mixtures to employ (see the GMM module). Each mixture object can be provided with or without its parameters (e.g. mean and variance for the Gaussian ones). Fully qualified mixtures are useful only if the initialisation_strategy parameter is set to \"given\". This parameter can also be given simply in terms of a type; in this case it is automatically extended to a vector of n_classes mixtures of the specified type. Note that mixing of different mixture types is not currently supported. [def: [DiagonalGaussian() for i in 1:n_classes]]\ntol::Float64: Tolerance to stop the algorithm [default: 10^(-6)]\nminimum_variance::Float64: Minimum variance for the mixtures [default: 0.05]\nminimum_covariance::Float64: Minimum covariance for the mixtures with full covariance matrix [default: 0]. This should be set differently from minimum_variance (see notes).\ninitialisation_strategy::String: The computation method of the vector of the initial mixtures. One of the following:\n\"grid\": using a grid approach\n\"given\": using the mixture provided in the fully qualified mixtures parameter\n\"kmeans\": use first kmeans (itself initialised with a \"grid\" strategy) to set the initial mixture centers [default]\nNote that currently \"random\" and \"shuffle\" initialisations are not supported in gmm-based algorithms.\nmaximum_iterations::Int64: Maximum number of iterations [def: typemax(Int64), i.e. 
∞]\nrng::Random.AbstractRNG: Random Number Generator [deafult: Random.GLOBAL_RNG]","category":"page"},{"location":"models/MultitargetGaussianMixtureRegressor_BetaML/#Example:","page":"MultitargetGaussianMixtureRegressor","title":"Example:","text":"","category":"section"},{"location":"models/MultitargetGaussianMixtureRegressor_BetaML/","page":"MultitargetGaussianMixtureRegressor","title":"MultitargetGaussianMixtureRegressor","text":"julia> using MLJ\n\njulia> X, y = @load_boston;\n\njulia> ydouble = hcat(y, y .*2 .+5);\n\njulia> modelType = @load MultitargetGaussianMixtureRegressor pkg = \"BetaML\" verbosity=0\nBetaML.GMM.MultitargetGaussianMixtureRegressor\n\njulia> model = modelType()\nMultitargetGaussianMixtureRegressor(\n n_classes = 3, \n initial_probmixtures = Float64[], \n mixtures = BetaML.GMM.DiagonalGaussian{Float64}[BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing), BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing), BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing)], \n tol = 1.0e-6, \n minimum_variance = 0.05, \n minimum_covariance = 0.0, \n initialisation_strategy = \"kmeans\", \n maximum_iterations = 9223372036854775807, \n rng = Random._GLOBAL_RNG())\n\njulia> mach = machine(model, X, ydouble);\n\njulia> fit!(mach);\n[ Info: Training machine(MultitargetGaussianMixtureRegressor(n_classes = 3, …), …).\nIter. 1: Var. of the post 20.46947926187522 Log-likelihood -23662.72770575145\n\njulia> ŷdouble = predict(mach, X)\n506×2 Matrix{Float64}:\n 23.3358 51.6717\n 23.3358 51.6717\n ⋮ \n 16.6843 38.3686\n 16.6843 38.3686","category":"page"},{"location":"models/DBSCAN_MLJScikitLearnInterface/#DBSCAN_MLJScikitLearnInterface","page":"DBSCAN","title":"DBSCAN","text":"","category":"section"},{"location":"models/DBSCAN_MLJScikitLearnInterface/","page":"DBSCAN","title":"DBSCAN","text":"DBSCAN","category":"page"},{"location":"models/DBSCAN_MLJScikitLearnInterface/","page":"DBSCAN","title":"DBSCAN","text":"A model type for constructing a dbscan, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/DBSCAN_MLJScikitLearnInterface/","page":"DBSCAN","title":"DBSCAN","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/DBSCAN_MLJScikitLearnInterface/","page":"DBSCAN","title":"DBSCAN","text":"DBSCAN = @load DBSCAN pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/DBSCAN_MLJScikitLearnInterface/","page":"DBSCAN","title":"DBSCAN","text":"Do model = DBSCAN() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in DBSCAN(eps=...).","category":"page"},{"location":"models/DBSCAN_MLJScikitLearnInterface/","page":"DBSCAN","title":"DBSCAN","text":"Density-Based Spatial Clustering of Applications with Noise. Finds core samples of high density and expands clusters from them. 
Good for data which contains clusters of similar density.","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/#KNNRegressor_NearestNeighborModels","page":"KNNRegressor","title":"KNNRegressor","text":"","category":"section"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"KNNRegressor","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"A model type for constructing a K-nearest neighbor regressor, based on NearestNeighborModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"KNNRegressor = @load KNNRegressor pkg=NearestNeighborModels","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"Do model = KNNRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KNNRegressor(K=...).","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"KNNRegressor implements K-Nearest Neighbors regressor which is non-parametric algorithm that predicts the response associated with a new point by taking an weighted average of the response of the K-nearest points.","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/#Training-data","page":"KNNRegressor","title":"Training data","text":"","category":"section"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"OR","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"mach = machine(model, X, y, w)","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"Here:","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).\ny is the target, which can be any table of responses whose element scitype is Continuous; check the scitype with scitype(y).\nw is the observation weights which can either be nothing(default) or an AbstractVector whoose element scitype is Count or Continuous. 
This is different from weights kernel which is an hyperparameter to the model, see below.","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/#Hyper-parameters","page":"KNNRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"K::Int=5 : number of neighbors\nalgorithm::Symbol = :kdtree : one of (:kdtree, :brutetree, :balltree)\nmetric::Metric = Euclidean() : any Metric from Distances.jl for the distance between points. For algorithm = :kdtree only metrics which are instances of Union{Distances.Chebyshev, Distances.Cityblock, Distances.Euclidean, Distances.Minkowski, Distances.WeightedCityblock, Distances.WeightedEuclidean, Distances.WeightedMinkowski} are supported.\nleafsize::Int = algorithm == 10 : determines the number of points at which to stop splitting the tree. This option is ignored and always taken as 0 for algorithm = :brutetree, since brutetree isn't actually a tree.\nreorder::Bool = true : if true then points which are close in distance are placed close in memory. In this case, a copy of the original data will be made so that the original data is left unmodified. Setting this to true can significantly improve performance of the specified algorithm (except :brutetree). This option is ignored and always taken as false for algorithm = :brutetree.\nweights::KNNKernel=Uniform() : kernel used in assigning weights to the k-nearest neighbors for each observation. An instance of one of the types in list_kernels(). User-defined weighting functions can be passed by wrapping the function in a UserDefinedKernel kernel (do ?NearestNeighborModels.UserDefinedKernel for more info). If observation weights w are passed during machine construction then the weight assigned to each neighbor vote is the product of the kernel generated weight for that neighbor and the corresponding observation weight.","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/#Operations","page":"KNNRegressor","title":"Operations","text":"","category":"section"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"predict(mach, Xnew): Return predictions of the target given features Xnew, which should have same scitype as X above.","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/#Fitted-parameters","page":"KNNRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"tree: An instance of either KDTree, BruteTree or BallTree depending on the value of the algorithm hyperparameter (See hyper-parameters section above). 
These are data structures that store the training data and are used to speed up nearest neighbor searches on test data points.","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/#Examples","page":"KNNRegressor","title":"Examples","text":"","category":"section"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"using MLJ\nKNNRegressor = @load KNNRegressor pkg=NearestNeighborModels\nX, y = @load_boston; ## loads the Boston housing dataset from MLJBase\n## view possible kernels\nNearestNeighborModels.list_kernels()\nmodel = KNNRegressor(weights = NearestNeighborModels.Inverse()) ## KNNRegressor instantiation\nmach = machine(model, X, y) |> fit! ## wrap model and required data in an MLJ machine and fit\ny_hat = predict(mach, X)\n","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"See also MultitargetKNNRegressor","category":"page"},{"location":"models/ARDRegressor_MLJScikitLearnInterface/#ARDRegressor_MLJScikitLearnInterface","page":"ARDRegressor","title":"ARDRegressor","text":"","category":"section"},{"location":"models/ARDRegressor_MLJScikitLearnInterface/","page":"ARDRegressor","title":"ARDRegressor","text":"ARDRegressor","category":"page"},{"location":"models/ARDRegressor_MLJScikitLearnInterface/","page":"ARDRegressor","title":"ARDRegressor","text":"A model type for constructing a Bayesian ARD regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/ARDRegressor_MLJScikitLearnInterface/","page":"ARDRegressor","title":"ARDRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/ARDRegressor_MLJScikitLearnInterface/","page":"ARDRegressor","title":"ARDRegressor","text":"ARDRegressor = @load ARDRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/ARDRegressor_MLJScikitLearnInterface/","page":"ARDRegressor","title":"ARDRegressor","text":"Do model = ARDRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in ARDRegressor(n_iter=...).","category":"page"},{"location":"models/ARDRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"ARDRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/ARDRegressor_MLJScikitLearnInterface/","page":"ARDRegressor","title":"ARDRegressor","text":"n_iter = 300\ntol = 0.001\nalpha_1 = 1.0e-6\nalpha_2 = 1.0e-6\nlambda_1 = 1.0e-6\nlambda_2 = 1.0e-6\ncompute_score = false\nthreshold_lambda = 10000.0\nfit_intercept = true\ncopy_X = true\nverbose = false","category":"page"},{"location":"models/LinearRegressor_MLJScikitLearnInterface/#LinearRegressor_MLJScikitLearnInterface","page":"LinearRegressor","title":"LinearRegressor","text":"","category":"section"},{"location":"models/LinearRegressor_MLJScikitLearnInterface/","page":"LinearRegressor","title":"LinearRegressor","text":"LinearRegressor","category":"page"},{"location":"models/LinearRegressor_MLJScikitLearnInterface/","page":"LinearRegressor","title":"LinearRegressor","text":"A model type for constructing a ordinary least-squares regressor (OLS), based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LinearRegressor_MLJScikitLearnInterface/","page":"LinearRegressor","title":"LinearRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LinearRegressor_MLJScikitLearnInterface/","page":"LinearRegressor","title":"LinearRegressor","text":"LinearRegressor = @load LinearRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/LinearRegressor_MLJScikitLearnInterface/","page":"LinearRegressor","title":"LinearRegressor","text":"Do model = LinearRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearRegressor(fit_intercept=...).","category":"page"},{"location":"models/LinearRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"LinearRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/LinearRegressor_MLJScikitLearnInterface/","page":"LinearRegressor","title":"LinearRegressor","text":"fit_intercept = true\ncopy_X = true\nn_jobs = nothing","category":"page"},{"location":"models/SVMNuRegressor_MLJScikitLearnInterface/#SVMNuRegressor_MLJScikitLearnInterface","page":"SVMNuRegressor","title":"SVMNuRegressor","text":"","category":"section"},{"location":"models/SVMNuRegressor_MLJScikitLearnInterface/","page":"SVMNuRegressor","title":"SVMNuRegressor","text":"SVMNuRegressor","category":"page"},{"location":"models/SVMNuRegressor_MLJScikitLearnInterface/","page":"SVMNuRegressor","title":"SVMNuRegressor","text":"A model type for constructing a nu-support vector regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/SVMNuRegressor_MLJScikitLearnInterface/","page":"SVMNuRegressor","title":"SVMNuRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/SVMNuRegressor_MLJScikitLearnInterface/","page":"SVMNuRegressor","title":"SVMNuRegressor","text":"SVMNuRegressor = @load SVMNuRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/SVMNuRegressor_MLJScikitLearnInterface/","page":"SVMNuRegressor","title":"SVMNuRegressor","text":"Do model = SVMNuRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in SVMNuRegressor(nu=...).","category":"page"},{"location":"models/SVMNuRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"SVMNuRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/SVMNuRegressor_MLJScikitLearnInterface/","page":"SVMNuRegressor","title":"SVMNuRegressor","text":"nu = 0.5\nC = 1.0\nkernel = rbf\ndegree = 3\ngamma = scale\ncoef0 = 0.0\nshrinking = true\ntol = 0.001\ncache_size = 200\nmax_iter = -1","category":"page"},{"location":"models/LinearRegressor_MLJLinearModels/#LinearRegressor_MLJLinearModels","page":"LinearRegressor","title":"LinearRegressor","text":"","category":"section"},{"location":"models/LinearRegressor_MLJLinearModels/","page":"LinearRegressor","title":"LinearRegressor","text":"LinearRegressor","category":"page"},{"location":"models/LinearRegressor_MLJLinearModels/","page":"LinearRegressor","title":"LinearRegressor","text":"A model type for constructing a linear regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LinearRegressor_MLJLinearModels/","page":"LinearRegressor","title":"LinearRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LinearRegressor_MLJLinearModels/","page":"LinearRegressor","title":"LinearRegressor","text":"LinearRegressor = @load LinearRegressor pkg=MLJLinearModels","category":"page"},{"location":"models/LinearRegressor_MLJLinearModels/","page":"LinearRegressor","title":"LinearRegressor","text":"Do model = LinearRegressor() to construct an instance with default hyper-parameters.","category":"page"},{"location":"models/LinearRegressor_MLJLinearModels/","page":"LinearRegressor","title":"LinearRegressor","text":"This model provides standard linear regression with objective function","category":"page"},{"location":"models/LinearRegressor_MLJLinearModels/","page":"LinearRegressor","title":"LinearRegressor","text":"$","category":"page"},{"location":"models/LinearRegressor_MLJLinearModels/","page":"LinearRegressor","title":"LinearRegressor","text":"|Xθ - y|₂²/2 $","category":"page"},{"location":"models/LinearRegressor_MLJLinearModels/","page":"LinearRegressor","title":"LinearRegressor","text":"Different solver options exist, as indicated under \"Hyperparameters\" below. 
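As a minimal sketch of selecting a solver (assuming only the Analytical and CG options described under \"Hyperparameters\" below), a conjugate-gradient solve might be requested as follows:\n\nusing MLJ\nimport MLJLinearModels\nLinearRegressor = @load LinearRegressor pkg=MLJLinearModels\nmodel = LinearRegressor(solver=MLJLinearModels.CG()) ## CG() = Analytical(iterative=true)\nX, y = make_regression()\nmach = fit!(machine(model, X, y))\n\n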
","category":"page"},{"location":"models/LinearRegressor_MLJLinearModels/#Training-data","page":"LinearRegressor","title":"Training data","text":"","category":"section"},{"location":"models/LinearRegressor_MLJLinearModels/","page":"LinearRegressor","title":"LinearRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/LinearRegressor_MLJLinearModels/","page":"LinearRegressor","title":"LinearRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/LinearRegressor_MLJLinearModels/","page":"LinearRegressor","title":"LinearRegressor","text":"where:","category":"page"},{"location":"models/LinearRegressor_MLJLinearModels/","page":"LinearRegressor","title":"LinearRegressor","text":"X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)\ny is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/LinearRegressor_MLJLinearModels/","page":"LinearRegressor","title":"LinearRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/LinearRegressor_MLJLinearModels/#Hyperparameters","page":"LinearRegressor","title":"Hyperparameters","text":"","category":"section"},{"location":"models/LinearRegressor_MLJLinearModels/","page":"LinearRegressor","title":"LinearRegressor","text":"fit_intercept::Bool: whether to fit the intercept or not. Default: true\nsolver::Union{Nothing, MLJLinearModels.Solver}: \"any instance of MLJLinearModels.Analytical. Use Analytical() for Cholesky and CG()=Analytical(iterative=true) for conjugate-gradient.\nIf solver = nothing (default) then Analytical() is used. Default: nothing","category":"page"},{"location":"models/LinearRegressor_MLJLinearModels/#Example","page":"LinearRegressor","title":"Example","text":"","category":"section"},{"location":"models/LinearRegressor_MLJLinearModels/","page":"LinearRegressor","title":"LinearRegressor","text":"using MLJ\nX, y = make_regression()\nmach = fit!(machine(LinearRegressor(), X, y))\npredict(mach, X)\nfitted_params(mach)","category":"page"}] +[{"location":"models/LDA_MultivariateStats/#LDA_MultivariateStats","page":"LDA","title":"LDA","text":"","category":"section"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"LDA","category":"page"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"A model type for constructing a linear discriminant analysis model, based on MultivariateStats.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"LDA = @load LDA pkg=MultivariateStats","category":"page"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"Do model = LDA() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in LDA(method=...).","category":"page"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"Multiclass linear discriminant analysis learns a projection in a space of features to a lower dimensional space, in a way that attempts to preserve as much as possible the degree to which the classes of a discrete target variable can be discriminated. This can be used either for dimension reduction of the features (see transform below) or for probabilistic classification of the target (see predict below).","category":"page"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"In the case of prediction, the class probability for a new observation reflects the proximity of that observation to training observations associated with that class, and how far away the observation is from observations associated with other classes. Specifically, the distances, in the transformed (projected) space, of a new observation, from the centroid of each target class, is computed; the resulting vector of distances, multiplied by minus one, is passed to a softmax function to obtain a class probability prediction. Here \"distance\" is computed using a user-specified distance function.","category":"page"},{"location":"models/LDA_MultivariateStats/#Training-data","page":"LDA","title":"Training data","text":"","category":"section"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"Here:","category":"page"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).\ny is the target, which can be any AbstractVector whose element scitype is OrderedFactor or Multiclass; check the scitype with scitype(y)","category":"page"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/LDA_MultivariateStats/#Hyper-parameters","page":"LDA","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"method::Symbol=:gevd: The solver, one of :gevd or :whiten methods.\ncov_w::StatsBase.SimpleCovariance(): An estimator for the within-class covariance (used in computing the within-class scatter matrix, Sw). Any robust estimator from CovarianceEstimation.jl can be used.\ncov_b::StatsBase.SimpleCovariance(): The same as cov_w but for the between-class covariance (used in computing the between-class scatter matrix, Sb).\noutdim::Int=0: The output dimension, i.e dimension of the transformed space, automatically set to min(indim, nclasses-1) if equal to 0.\nregcoef::Float64=1e-6: The regularization coefficient. A positive value regcoef*eigmax(Sw) where Sw is the within-class scatter matrix, is added to the diagonal of Sw to improve numerical stability. 
This can be useful if using the standard covariance estimator.\ndist=Distances.SqEuclidean(): The distance metric to use when performing classification (to compare the distance between a new point and centroids in the transformed space); must be a subtype of Distances.SemiMetric from Distances.jl, e.g., Distances.CosineDist.","category":"page"},{"location":"models/LDA_MultivariateStats/#Operations","page":"LDA","title":"Operations","text":"","category":"section"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.\npredict(mach, Xnew): Return predictions of the target given features Xnew having the same scitype as X above. Predictions are probabilistic but uncalibrated.\npredict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.","category":"page"},{"location":"models/LDA_MultivariateStats/#Fitted-parameters","page":"LDA","title":"Fitted parameters","text":"","category":"section"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"classes: The classes seen during model fitting.\nprojection_matrix: The learned projection matrix, of size (indim, outdim), where indim and outdim are the input and output dimensions respectively (See Report section below).","category":"page"},{"location":"models/LDA_MultivariateStats/#Report","page":"LDA","title":"Report","text":"","category":"section"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"The fields of report(mach) are:","category":"page"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"indim: The dimension of the input space i.e the number of training features.\noutdim: The dimension of the transformed space the model is projected to.\nmean: The mean of the untransformed training data. A vector of length indim.\nnclasses: The number of classes directly observed in the training data (which can be less than the total number of classes in the class pool).\nclass_means: The class-specific means of the training data. A matrix of size (indim, nclasses) with the ith column being the class-mean of the ith class in classes (See fitted params section above).\nclass_weights: The weights (class counts) of each class. A vector of length nclasses with the ith element being the class weight of the ith class in classes. 
(See fitted params section above.)\nSb: The between class scatter matrix.\nSw: The within class scatter matrix.","category":"page"},{"location":"models/LDA_MultivariateStats/#Examples","page":"LDA","title":"Examples","text":"","category":"section"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"using MLJ\n\nLDA = @load LDA pkg=MultivariateStats\n\nX, y = @load_iris ## a table and a vector\n\nmodel = LDA()\nmach = machine(model, X, y) |> fit!\n\nXproj = transform(mach, X)\ny_hat = predict(mach, X)\nlabels = predict_mode(mach, X)\n","category":"page"},{"location":"models/LDA_MultivariateStats/","page":"LDA","title":"LDA","text":"See also BayesianLDA, SubspaceLDA, BayesianSubspaceLDA","category":"page"},{"location":"models/NuSVC_LIBSVM/#NuSVC_LIBSVM","page":"NuSVC","title":"NuSVC","text":"","category":"section"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"NuSVC","category":"page"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"A model type for constructing a ν-support vector classifier, based on LIBSVM.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"NuSVC = @load NuSVC pkg=LIBSVM","category":"page"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"Do model = NuSVC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in NuSVC(kernel=...).","category":"page"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"This model is a re-parameterization of the SVC classifier, where nu replaces cost, and is mathematically equivalent to it. The parameter nu allows more direct control over the number of support vectors (see under \"Hyper-parameters\").","category":"page"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"This model always predicts actual class labels. For probabilistic predictions, use instead ProbabilisticNuSVC.","category":"page"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): \"LIBSVM: a library for support vector machines.\" ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf. 
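As a minimal sketch (mirroring the construction pattern in the \"Examples\" section below), an instance with a non-default nu and a built-in kernel might be constructed and trained like this:\n\nusing MLJ\nimport LIBSVM\nNuSVC = @load NuSVC pkg=LIBSVM\nmodel = NuSVC(nu=0.25, kernel=LIBSVM.Kernel.RadialBasis) ## nu must lie in (0, 1]\nX, y = @load_iris\nmach = fit!(machine(model, X, y))\n\n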
","category":"page"},{"location":"models/NuSVC_LIBSVM/#Training-data","page":"NuSVC","title":"Training data","text":"","category":"section"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"In MLJ or MLJBase, bind an instance model to data with:","category":"page"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"where","category":"page"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)\ny: is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)","category":"page"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/NuSVC_LIBSVM/#Hyper-parameters","page":"NuSVC","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see \"Examples\" below).\nLIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2\nLIBSVM.Kernel.Polynomial: (x1, x2) -> gamma*x1'*x2 + coef0)^degree\nLIBSVM.Kernel.RadialBasis: (x1, x2) -> (exp(-gamma*norm(x1 - x2)^2))\nLIBSVM.Kernel.Sigmoid: (x1, x2) - > tanh(gamma*x1'*x2 + coef0)\nHere gamma, coef0, degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions. See LIVSVM.jl issue91\ngamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. Actual value used appears in the report (see below).\ncoef0 = 0.0: kernel parameter (see above)\ndegree::Int32 = Int32(3): degree in polynomial kernel (see above)\nnu=0.5 (range (0, 1]): An upper bound on the fraction of margin errors and a lower bound of the fraction of support vectors. Denoted ν in the cited paper. 
Changing nu changes the thickness of the margin (a neighborhood of the decision surface) and a margin error is said to have occurred if a training observation lies on the wrong side of the surface or within the margin.\ncachesize=200.0 cache memory size in MB\ntolerance=0.001: tolerance for the stopping criterion\nshrinking=true: whether to use shrinking heuristics","category":"page"},{"location":"models/NuSVC_LIBSVM/#Operations","page":"NuSVC","title":"Operations","text":"","category":"section"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above.","category":"page"},{"location":"models/NuSVC_LIBSVM/#Fitted-parameters","page":"NuSVC","title":"Fitted parameters","text":"","category":"section"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"libsvm_model: the trained model object created by the LIBSVM.jl package\nencoding: class encoding used internally by libsvm_model - a dictionary of class labels keyed on the internal integer representation","category":"page"},{"location":"models/NuSVC_LIBSVM/#Report","page":"NuSVC","title":"Report","text":"","category":"section"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"The fields of report(mach) are:","category":"page"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"gamma: actual value of the kernel parameter gamma used in training","category":"page"},{"location":"models/NuSVC_LIBSVM/#Examples","page":"NuSVC","title":"Examples","text":"","category":"section"},{"location":"models/NuSVC_LIBSVM/#Using-a-built-in-kernel","page":"NuSVC","title":"Using a built-in kernel","text":"","category":"section"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"using MLJ\nimport LIBSVM\n\nNuSVC = @load NuSVC pkg=LIBSVM ## model type\nmodel = NuSVC(kernel=LIBSVM.Kernel.Polynomial) ## instance\n\nX, y = @load_iris ## table, vector\nmach = machine(model, X, y) |> fit!\n\nXnew = (sepal_length = [6.4, 7.2, 7.4],\n sepal_width = [2.8, 3.0, 2.8],\n petal_length = [5.6, 5.8, 6.1],\n petal_width = [2.1, 1.6, 1.9],)\n\njulia> yhat = predict(mach, Xnew)\n3-element CategoricalArrays.CategoricalArray{String,1,UInt32}:\n \"virginica\"\n \"virginica\"\n \"virginica\"","category":"page"},{"location":"models/NuSVC_LIBSVM/#User-defined-kernels","page":"NuSVC","title":"User-defined kernels","text":"","category":"section"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"k(x1, x2) = x1'*x2 ## equivalent to `LIBSVM.Kernel.Linear`\nmodel = NuSVC(kernel=k)\nmach = machine(model, X, y) |> fit!\n\njulia> yhat = predict(mach, Xnew)\n3-element CategoricalArrays.CategoricalArray{String,1,UInt32}:\n \"virginica\"\n \"virginica\"\n \"virginica\"","category":"page"},{"location":"models/NuSVC_LIBSVM/","page":"NuSVC","title":"NuSVC","text":"See also the classifiers SVC and LinearSVC, LIVSVM.jl and the original C implementation. 
","category":"page"},{"location":"models/KMedoidsClusterer_BetaML/#KMedoidsClusterer_BetaML","page":"KMedoidsClusterer","title":"KMedoidsClusterer","text":"","category":"section"},{"location":"models/KMedoidsClusterer_BetaML/","page":"KMedoidsClusterer","title":"KMedoidsClusterer","text":"mutable struct KMedoidsClusterer <: MLJModelInterface.Unsupervised","category":"page"},{"location":"models/KMedoidsClusterer_BetaML/#Parameters:","page":"KMedoidsClusterer","title":"Parameters:","text":"","category":"section"},{"location":"models/KMedoidsClusterer_BetaML/","page":"KMedoidsClusterer","title":"KMedoidsClusterer","text":"n_classes::Int64: Number of classes to discriminate the data [def: 3]\ndist::Function: Function to employ as distance. Defaults to the Euclidean distance. Can be one of the predefined distances (l1_distance, l2_distance, l2squared_distance, cosine_distance), any user-defined function accepting two vectors and returning a scalar, or an anonymous function with the same characteristics.\ninitialisation_strategy::String: The computation method of the vector of the initial representatives. One of the following:\n\"random\": randomly in the X space\n\"grid\": using a grid approach\n\"shuffle\": selecting randomly within the available points [default]\n\"given\": using a provided set of initial representatives provided in the initial_representatives parameter\ninitial_representatives::Union{Nothing, Matrix{Float64}}: Provided (K x D) matrix of initial representatives (useful only with initialisation_strategy=\"given\") [default: nothing]\nrng::Random.AbstractRNG: Random Number Generator [default: Random.GLOBAL_RNG]","category":"page"},{"location":"models/KMedoidsClusterer_BetaML/","page":"KMedoidsClusterer","title":"KMedoidsClusterer","text":"The K-medoids clustering algorithm with customisable distance function, from the Beta Machine Learning Toolkit (BetaML).","category":"page"},{"location":"models/KMedoidsClusterer_BetaML/","page":"KMedoidsClusterer","title":"KMedoidsClusterer","text":"Similar to K-Means, but the \"representatives\" (the centroids) are guaranteed to be one of the training points. 
The algorithm work with any arbitrary distance measure.","category":"page"},{"location":"models/KMedoidsClusterer_BetaML/#Notes:","page":"KMedoidsClusterer","title":"Notes:","text":"","category":"section"},{"location":"models/KMedoidsClusterer_BetaML/","page":"KMedoidsClusterer","title":"KMedoidsClusterer","text":"data must be numerical\nonline fitting (re-fitting with new data) is supported","category":"page"},{"location":"models/KMedoidsClusterer_BetaML/#Example:","page":"KMedoidsClusterer","title":"Example:","text":"","category":"section"},{"location":"models/KMedoidsClusterer_BetaML/","page":"KMedoidsClusterer","title":"KMedoidsClusterer","text":"julia> using MLJ\n\njulia> X, y = @load_iris;\n\njulia> modelType = @load KMedoidsClusterer pkg = \"BetaML\" verbosity=0\nBetaML.Clustering.KMedoidsClusterer\n\njulia> model = modelType()\nKMedoidsClusterer(\n n_classes = 3, \n dist = BetaML.Clustering.var\"#39#41\"(), \n initialisation_strategy = \"shuffle\", \n initial_representatives = nothing, \n rng = Random._GLOBAL_RNG())\n\njulia> mach = machine(model, X);\n\njulia> fit!(mach);\n[ Info: Training machine(KMedoidsClusterer(n_classes = 3, …), …).\n\njulia> classes_est = predict(mach, X);\n\njulia> hcat(y,classes_est)\n150×2 CategoricalArrays.CategoricalArray{Union{Int64, String},2,UInt32}:\n \"setosa\" 3\n \"setosa\" 3\n \"setosa\" 3\n ⋮ \n \"virginica\" 1\n \"virginica\" 1\n \"virginica\" 2","category":"page"},{"location":"benchmarking/#Benchmarking","page":"Benchmarking","title":"Benchmarking","text":"","category":"section"},{"location":"benchmarking/","page":"Benchmarking","title":"Benchmarking","text":"This feature not yet available.","category":"page"},{"location":"benchmarking/","page":"Benchmarking","title":"Benchmarking","text":"CONTRIBUTE.md","category":"page"},{"location":"weights/#Weights","page":"Weights","title":"Weights","text":"","category":"section"},{"location":"weights/","page":"Weights","title":"Weights","text":"In machine learning it is possible to assign each observation an independent significance, or weight, either in training or in performance evaluation, or both.","category":"page"},{"location":"weights/","page":"Weights","title":"Weights","text":"There are two kinds of weights in use in MLJ:","category":"page"},{"location":"weights/","page":"Weights","title":"Weights","text":"per observation weights (also just called weights) refer to weight vectors of the same length as the number of observations\nclass weights refer to dictionaries keyed on the target classes (levels) for use in classification problems","category":"page"},{"location":"weights/#Specifying-weights-in-training","page":"Weights","title":"Specifying weights in training","text":"","category":"section"},{"location":"weights/","page":"Weights","title":"Weights","text":"To specify weights in training you bind the weights to the model along with the data when constructing a machine. For supervised models the weights are specified last:","category":"page"},{"location":"weights/","page":"Weights","title":"Weights","text":"KNNRegressor = @load KNNRegressor\nmodel = KNNRegressor()\nX, y = make_regression(10, 3)\nw = rand(length(y))\n\nmach = machine(model, X, y, w) |> fit!","category":"page"},{"location":"weights/","page":"Weights","title":"Weights","text":"Note that model supports per observation weights if supports_weights(model) is true. 
To list all such models, do","category":"page"},{"location":"weights/","page":"Weights","title":"Weights","text":"models() do m\n m.supports_weights\nend","category":"page"},{"location":"weights/","page":"Weights","title":"Weights","text":"The model model supports class weights if supports_class_weights(model) is true.","category":"page"},{"location":"weights/#Specifying-weights-in-performance-evaluation","page":"Weights","title":"Specifying weights in performance evaluation","text":"","category":"section"},{"location":"weights/","page":"Weights","title":"Weights","text":"When calling a measure (metric) that supports weights, provide the weights as the last argument, as in","category":"page"},{"location":"weights/","page":"Weights","title":"Weights","text":"_, y = @load_iris\nŷ = shuffle(y)\nw = Dict(\"versicolor\" => 1, \"setosa\" => 2, \"virginica\"=> 3)\nmacro_f1score(ŷ, y, w)","category":"page"},{"location":"weights/","page":"Weights","title":"Weights","text":"Some measures also support specification of a class weight dictionary. For details see the StatisticalMeasures.jl tutorial.","category":"page"},{"location":"weights/","page":"Weights","title":"Weights","text":"To pass weights to all the measures listed in an evaluate!/evaluate call, use the keyword specifiers weights=... or class_weights=.... For details, see Evaluating Model Performance.","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/#NeuralNetworkClassifier_MLJFlux","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"","category":"section"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"NeuralNetworkClassifier","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"A model type for constructing a neural network classifier, based on MLJFlux.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"Do model = NeuralNetworkClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in NeuralNetworkClassifier(builder=...).","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"NeuralNetworkClassifier is for training a data-dependent Flux.jl neural network for making probabilistic predictions of a Multiclass or OrderedFactor target, given a table of Continuous features. Users provide a recipe for constructing the network, based on properties of the data that is encountered, by specifying an appropriate builder. 
See MLJFlux documentation for more on builders.","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/#Training-data","page":"NeuralNetworkClassifier","title":"Training data","text":"","category":"section"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"Here:","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"X is either a Matrix or any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). If X is a Matrix, it is assumed to have columns corresponding to features and rows corresponding to observations.\ny is the target, which can be any AbstractVector whose element scitype is Multiclass or OrderedFactor; check the scitype with scitype(y)","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"Train the machine with fit!(mach, rows=...).","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/#Hyper-parameters","page":"NeuralNetworkClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"builder=MLJFlux.Short(): An MLJFlux builder that constructs a neural network. Possible builders include: MLJFlux.Linear, MLJFlux.Short, and MLJFlux.MLP. See MLJFlux.jl documentation for examples of user-defined builders. See also finaliser below.\noptimiser::Flux.Adam(): A Flux.Optimise optimiser. The optimiser performs the updating of the weights of the network. For further reference, see the Flux optimiser documentation. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.\nloss=Flux.crossentropy: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. For a classification task, the most natural loss functions are:\nFlux.crossentropy: Standard multiclass classification loss, also known as the log loss.\nFlux.logitcrossentopy: Mathematically equal to crossentropy, but numerically more stable than finalising the outputs with softmax and then calculating crossentropy. You will need to specify finaliser=identity to remove MLJFlux's default softmax finaliser, and understand that the output of predict is then unnormalized (no longer probabilistic).\nFlux.tversky_loss: Used with imbalanced data to give more weight to false negatives.\nFlux.focal_loss: Used with highly imbalanced data. Weights harder examples more than easier examples.\nCurrently MLJ measures are not supported values of loss.\nepochs::Int=10: The duration of training, in epochs. 
Typically, one epoch represents one pass through the complete training dataset.\nbatch_size::Int=1: the batch size to be used for training, representing the number of samples per update of the network weights. Typically, batch size is between 8 and 512. Increasing batch size may accelerate training if acceleration=CUDALibs() and a GPU is available.\nlambda::Float64=0: The strength of the weight regularization penalty. Can be any value in the range [0, ∞).\nalpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.\nrng::Union{AbstractRNG, Int64}: The random number generator or seed used during training.\noptimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.\nacceleration::AbstractResource=CPU1(): Defines on what hardware training is done. For training on GPU, use CUDALibs().\nfinaliser=Flux.softmax: The final activation function of the neural network (applied after the network defined by builder). Defaults to Flux.softmax.","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/#Operations","page":"NeuralNetworkClassifier","title":"Operations","text":"","category":"section"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"predict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above. Predictions are probabilistic but uncalibrated.\npredict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/#Fitted-parameters","page":"NeuralNetworkClassifier","title":"Fitted parameters","text":"","category":"section"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"chain: The trained \"chain\" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network. This includes the final layer specified by finaliser (eg, softmax).","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/#Report","page":"NeuralNetworkClassifier","title":"Report","text":"","category":"section"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"The fields of report(mach) are:","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"training_losses: A vector of training losses (penalised if lambda != 0) in historical order, of length epochs + 1. The first element is the pre-training loss.","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/#Examples","page":"NeuralNetworkClassifier","title":"Examples","text":"","category":"section"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"In this example we build a classification model using the Iris dataset. 
This is a very basic example, using a default builder and no standardization. For a more advanced illustration, see NeuralNetworkRegressor or ImageClassifier, and examples in the MLJFlux.jl documentation.","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"using MLJ\nusing Flux\nimport RDatasets","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"First, we can load the data:","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"iris = RDatasets.dataset(\"datasets\", \"iris\");\ny, X = unpack(iris, ==(:Species), rng=123); ## a vector and a table\nNeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux\nclf = NeuralNetworkClassifier()","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"Next, we can train the model:","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"mach = machine(clf, X, y)\nfit!(mach)","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"We can train the model in an incremental fashion, altering the learning rate as we go, provided optimizer_changes_trigger_retraining is false (the default). Here, we also change the number of (total) iterations:","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"clf.optimiser.eta = clf.optimiser.eta * 2\nclf.epochs = clf.epochs + 5\n\nfit!(mach, verbosity=2) ## trains 5 more epochs","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"We can inspect the mean training loss using the cross_entropy function:","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"training_loss = cross_entropy(predict(mach, X), y) |> mean","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"And we can access the Flux chain (model) using fitted_params:","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"chain = fitted_params(mach).chain","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"Finally, we can see how the out-of-sample performance changes over time, using MLJ's learning_curve function:","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"r = range(clf, :epochs, lower=1, upper=200, scale=:log10)\ncurve = learning_curve(clf, X, y,\n range=r,\n resampling=Holdout(fraction_train=0.7),\n measure=cross_entropy)\nusing Plots\nplot(curve.parameter_values,\n curve.measurements,\n xlab=curve.parameter_name,\n xscale=curve.parameter_scale,\n ylab = \"Cross 
Entropy\")\n","category":"page"},{"location":"models/NeuralNetworkClassifier_MLJFlux/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"See also ImageClassifier.","category":"page"},{"location":"models/HBOSDetector_OutlierDetectionPython/#HBOSDetector_OutlierDetectionPython","page":"HBOSDetector","title":"HBOSDetector","text":"","category":"section"},{"location":"models/HBOSDetector_OutlierDetectionPython/","page":"HBOSDetector","title":"HBOSDetector","text":"HBOSDetector(n_bins = 10,\n alpha = 0.1,\n tol = 0.5)","category":"page"},{"location":"models/HBOSDetector_OutlierDetectionPython/","page":"HBOSDetector","title":"HBOSDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.hbos","category":"page"},{"location":"models/RecursiveFeatureElimination_FeatureSelection/#RecursiveFeatureElimination_FeatureSelection","page":"RecursiveFeatureElimination","title":"RecursiveFeatureElimination","text":"","category":"section"},{"location":"models/RecursiveFeatureElimination_FeatureSelection/","page":"RecursiveFeatureElimination","title":"RecursiveFeatureElimination","text":"RecursiveFeatureElimination(model, n_features, step)","category":"page"},{"location":"models/RecursiveFeatureElimination_FeatureSelection/","page":"RecursiveFeatureElimination","title":"RecursiveFeatureElimination","text":"This model implements a recursive feature elimination algorithm for feature selection. It recursively removes features, training a base model on the remaining features and evaluating their importance until the desired number of features is selected.","category":"page"},{"location":"models/RecursiveFeatureElimination_FeatureSelection/","page":"RecursiveFeatureElimination","title":"RecursiveFeatureElimination","text":"Construct an instance with default hyper-parameters using the syntax rfe_model = RecursiveFeatureElimination(model=...). 
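For example (a hypothetical sketch, assuming rf is some already-instantiated base model that reports feature importances, such as a random forest), rfe_model = RecursiveFeatureElimination(model=rf, n_features=5, step=1) asks for five surviving features, eliminating one feature per iteration. 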
Provide keyword arguments to override hyper-parameter defaults.","category":"page"},{"location":"models/RecursiveFeatureElimination_FeatureSelection/#Training-data","page":"RecursiveFeatureElimination","title":"Training data","text":"","category":"section"},{"location":"models/RecursiveFeatureElimination_FeatureSelection/","page":"RecursiveFeatureElimination","title":"RecursiveFeatureElimination","text":"In MLJ or MLJBase, bind an instance rfe_model to data with","category":"page"},{"location":"models/RecursiveFeatureElimination_FeatureSelection/","page":"RecursiveFeatureElimination","title":"RecursiveFeatureElimination","text":"mach = machine(rfe_model, X, y)","category":"page"},{"location":"models/RecursiveFeatureElimination_FeatureSelection/","page":"RecursiveFeatureElimination","title":"RecursiveFeatureElimination","text":"OR, if the base model supports weights, as","category":"page"},{"location":"models/RecursiveFeatureElimination_FeatureSelection/","page":"RecursiveFeatureElimination","title":"RecursiveFeatureElimination","text":"mach = machine(rfe_model, X, y, w)","category":"page"},{"location":"models/RecursiveFeatureElimination_FeatureSelection/","page":"RecursiveFeatureElimination","title":"RecursiveFeatureElimination","text":"Here:","category":"page"},{"location":"models/RecursiveFeatureElimination_FeatureSelection/","page":"RecursiveFeatureElimination","title":"RecursiveFeatureElimination","text":"X is any table of input features (eg, a DataFrame) whose columns are of the scitype required by the base model; check column scitypes with schema(X) and column scitypes required by base model with input_scitype(basemodel).\ny is the target, which can be any table of responses whose element scitype is Continuous or Finite depending on the target_scitype required by the base model; check the scitype with scitype(y).\nw is the observation weights, which can either be nothing (default) or an AbstractVector whose element scitype is Count or Continuous. This is different from any weights hyper-parameter of the base model; see below.","category":"page"},{"location":"models/RecursiveFeatureElimination_FeatureSelection/","page":"RecursiveFeatureElimination","title":"RecursiveFeatureElimination","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/RecursiveFeatureElimination_FeatureSelection/#Hyper-parameters","page":"RecursiveFeatureElimination","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/RecursiveFeatureElimination_FeatureSelection/","page":"RecursiveFeatureElimination","title":"RecursiveFeatureElimination","text":"model: A base model with a fit method that provides information on feature importance (i.e., reports_feature_importances(model) == true)\nn_features::Real = 0: The number of features to select. If 0, half of the features are selected. If a positive integer, the parameter is the absolute number of features to select. If a real number between 0 and 1, it is the fraction of features to select.\nstep::Real=1: If the value of step is at least 1, it signifies the quantity of features to eliminate in each iteration. 
Conversely, if step falls strictly within the range of 0.0 to 1.0, it denotes the proportion (rounded down) of features to remove during each iteration.","category":"page"},{"location":"models/RecursiveFeatureElimination_FeatureSelection/#Operations","page":"RecursiveFeatureElimination","title":"Operations","text":"","category":"section"},{"location":"models/RecursiveFeatureElimination_FeatureSelection/","page":"RecursiveFeatureElimination","title":"RecursiveFeatureElimination","text":"transform(mach, X): transform the input table X into a new table containing only","category":"page"},{"location":"models/RecursiveFeatureElimination_FeatureSelection/","page":"RecursiveFeatureElimination","title":"RecursiveFeatureElimination","text":"columns corresponding to features selected by the RFE algorithm.","category":"page"},{"location":"models/RecursiveFeatureElimination_FeatureSelection/","page":"RecursiveFeatureElimination","title":"RecursiveFeatureElimination","text":"predict(mach, X): transform the input table X into a new table, the same as in transform(mach, X) above, and predict using the fitted base model on the transformed table.","category":"page"},{"location":"models/RecursiveFeatureElimination_FeatureSelection/#Fitted-parameters","page":"RecursiveFeatureElimination","title":"Fitted parameters","text":"","category":"section"},{"location":"models/RecursiveFeatureElimination_FeatureSelection/","page":"RecursiveFeatureElimination","title":"RecursiveFeatureElimination","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/RecursiveFeatureElimination_FeatureSelection/","page":"RecursiveFeatureElimination","title":"RecursiveFeatureElimination","text":"features_left: names of features remaining after recursive feature elimination.\nmodel_fitresult: fitted parameters of the base model.","category":"page"},{"location":"models/RecursiveFeatureElimination_FeatureSelection/#Report","page":"RecursiveFeatureElimination","title":"Report","text":"","category":"section"},{"location":"models/RecursiveFeatureElimination_FeatureSelection/","page":"RecursiveFeatureElimination","title":"RecursiveFeatureElimination","text":"The fields of report(mach) are:","category":"page"},{"location":"models/RecursiveFeatureElimination_FeatureSelection/","page":"RecursiveFeatureElimination","title":"RecursiveFeatureElimination","text":"ranking: The feature ranking of each feature in the training dataset.\nmodel_report: report for the fitted base model.\nfeatures: names of features seen during the training process.","category":"page"},{"location":"models/RecursiveFeatureElimination_FeatureSelection/#Examples","page":"RecursiveFeatureElimination","title":"Examples","text":"","category":"section"},{"location":"models/RecursiveFeatureElimination_FeatureSelection/","page":"RecursiveFeatureElimination","title":"RecursiveFeatureElimination","text":"using FeatureSelection, MLJ, StableRNGs\n\nRandomForestRegressor = @load RandomForestRegressor pkg=DecisionTree\n\nrng = StableRNG(123) ## reproducible random number generator, used below\n\n## Creates a dataset where the target only depends on the first 5 columns of the input table.\nA = rand(rng, 50, 10);\ny = 10 .* sin.(\n pi .* A[:, 1] .* A[:, 2]\n ) + 20 .* (A[:, 3] .- 0.5).^ 2 .+ 10 .* A[:, 4] .+ 5 * A[:, 5];\nX = MLJ.table(A);\n\n## fit an rfe model\nrf = RandomForestRegressor()\nselector = RecursiveFeatureElimination(model = rf)\nmach = machine(selector, X, y)\nfit!(mach)\n\n## view the feature importances\nfeature_importances(mach)\n\n## predict using the base model\nXnew = MLJ.table(rand(rng, 50, 10));\npredict(mach, 
Xnew)\n","category":"page"},{"location":"models/DBSCAN_Clustering/#DBSCAN_Clustering","page":"DBSCAN","title":"DBSCAN","text":"","category":"section"},{"location":"models/DBSCAN_Clustering/","page":"DBSCAN","title":"DBSCAN","text":"DBSCAN","category":"page"},{"location":"models/DBSCAN_Clustering/","page":"DBSCAN","title":"DBSCAN","text":"A model type for constructing a DBSCAN clusterer (density-based spatial clustering of applications with noise), based on Clustering.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/DBSCAN_Clustering/","page":"DBSCAN","title":"DBSCAN","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/DBSCAN_Clustering/","page":"DBSCAN","title":"DBSCAN","text":"DBSCAN = @load DBSCAN pkg=Clustering","category":"page"},{"location":"models/DBSCAN_Clustering/","page":"DBSCAN","title":"DBSCAN","text":"Do model = DBSCAN() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in DBSCAN(radius=...).","category":"page"},{"location":"models/DBSCAN_Clustering/","page":"DBSCAN","title":"DBSCAN","text":"DBSCAN is a clustering algorithm that groups together points that are closely packed together (points with many nearby neighbors), marking as outliers points that lie alone in low-density regions (whose nearest neighbors are too far away). More information is available at the Clustering.jl documentation. Use predict to get cluster assignments. Point types - core, boundary or noise - are accessed from the machine report (see below).","category":"page"},{"location":"models/DBSCAN_Clustering/","page":"DBSCAN","title":"DBSCAN","text":"This is a static implementation, i.e., it does not generalize to new data instances, and there is no training data. For clusterers that do generalize, see KMeans or KMedoids.","category":"page"},{"location":"models/DBSCAN_Clustering/","page":"DBSCAN","title":"DBSCAN","text":"In MLJ or MLJBase, create a machine with","category":"page"},{"location":"models/DBSCAN_Clustering/","page":"DBSCAN","title":"DBSCAN","text":"mach = machine(model)","category":"page"},{"location":"models/DBSCAN_Clustering/#Hyper-parameters","page":"DBSCAN","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/DBSCAN_Clustering/","page":"DBSCAN","title":"DBSCAN","text":"radius=1.0: query radius.\nleafsize=20: number of points binned in each leaf node of the nearest neighbor k-d tree.\nmin_neighbors=1: minimum number of a core point neighbors.\nmin_cluster_size=1: minimum number of points in a valid cluster.","category":"page"},{"location":"models/DBSCAN_Clustering/#Operations","page":"DBSCAN","title":"Operations","text":"","category":"section"},{"location":"models/DBSCAN_Clustering/","page":"DBSCAN","title":"DBSCAN","text":"predict(mach, X): return cluster label assignments, as an unordered CategoricalVector. Here X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). 
Note that points of type noise will always get a label of 0.","category":"page"},{"location":"models/DBSCAN_Clustering/#Report","page":"DBSCAN","title":"Report","text":"","category":"section"},{"location":"models/DBSCAN_Clustering/","page":"DBSCAN","title":"DBSCAN","text":"After calling predict(mach), the fields of report(mach) are:","category":"page"},{"location":"models/DBSCAN_Clustering/","page":"DBSCAN","title":"DBSCAN","text":"point_types: A CategoricalVector with the DBSCAN point type classification, one element per row of X. Elements are either 'C' (core), 'B' (boundary), or 'N' (noise).\nnclusters: The number of clusters (excluding the noise \"cluster\")\ncluster_labels: The unique list of cluster labels\nclusters: A vector of Clustering.DbscanCluster objects from Clustering.jl, which have these fields:\nsize: number of points in a cluster (core + boundary)\ncore_indices: indices of points in the cluster core\nboundary_indices: indices of points on the cluster boundary","category":"page"},{"location":"models/DBSCAN_Clustering/#Examples","page":"DBSCAN","title":"Examples","text":"","category":"section"},{"location":"models/DBSCAN_Clustering/","page":"DBSCAN","title":"DBSCAN","text":"using MLJ\n\nX, labels = make_moons(400, noise=0.09, rng=1) ## synthetic data with 2 clusters; X\ny = map(labels) do label\n label == 0 ? \"cookie\" : \"monster\"\nend;\ny = coerce(y, Multiclass);\n\nDBSCAN = @load DBSCAN pkg=Clustering\nmodel = DBSCAN(radius=0.13, min_cluster_size=5)\nmach = machine(model)\n\n## compute and output cluster assignments for observations in `X`:\nyhat = predict(mach, X)\n\n## get DBSCAN point types:\nreport(mach).point_types\nreport(mach).nclusters\n\n## compare cluster labels with actual labels:\ncompare = zip(yhat, y) |> collect;\ncompare[1:10] ## clusters align with classes\n\n## visualize clusters, noise in red:\npoints = zip(X.x1, X.x2) |> collect\ncolors = map(yhat) do i\n i == 0 ? :red :\n i == 1 ? :blue :\n i == 2 ? :green :\n i == 3 ? :yellow :\n :black\nend\nusing Plots\nscatter(points, color=colors)","category":"page"},{"location":"glossary/#Glossary","page":"Glossary","title":"Glossary","text":"","category":"section"},{"location":"glossary/","page":"Glossary","title":"Glossary","text":"Note: This glossary includes some detail intended mainly for MLJ developers.","category":"page"},{"location":"glossary/#Basics","page":"Glossary","title":"Basics","text":"","category":"section"},{"location":"glossary/#hyperparameters","page":"Glossary","title":"hyperparameters","text":"","category":"section"},{"location":"glossary/","page":"Glossary","title":"Glossary","text":"Parameters on which some learning algorithm depends, specified before the algorithm is applied, and where learning is interpreted in the broadest sense. For example, PCA feature reduction is a \"preprocessing\" transformation \"learning\" a projection from training data, governed by a dimension hyperparameter. Hyperparameters in our sense may specify configuration (eg, number of parallel processes) even when this does not affect the end-product of learning. (But we exclude verbosity level.)","category":"page"},{"location":"glossary/#model-(object-of-abstract-type-Model)","page":"Glossary","title":"model (object of abstract type Model)","text":"","category":"section"},{"location":"glossary/","page":"Glossary","title":"Glossary","text":"Object collecting together hyperpameters of a single algorithm. 
Models are classified either as supervised or unsupervised models (eg, \"transformers\"), with corresponding subtypes Supervised <: Model and Unsupervised <: Model.","category":"page"},{"location":"glossary/#fitresult-(type-generally-defined-outside-of-MLJ)","page":"Glossary","title":"fitresult (type generally defined outside of MLJ)","text":"","category":"section"},{"location":"glossary/","page":"Glossary","title":"Glossary","text":"Also known as \"learned\" or \"fitted\" parameters, these are \"weights\", \"coefficients\", or similar parameters learned by an algorithm, after adopting the prescribed hyper-parameters. For example, decision trees of a random forest, the coefficients and intercept of a linear model, or the projection matrices of a PCA dimension-reduction algorithm.","category":"page"},{"location":"glossary/#operation","page":"Glossary","title":"operation","text":"","category":"section"},{"location":"glossary/","page":"Glossary","title":"Glossary","text":"Data-manipulating operations (methods) using some fitresult. For supervised learners, the predict, predict_mean, predict_median, or predict_mode methods; for transformers, the transform or inverse_transform method. An operation may also refer to an ordinary data-manipulating method that does not depend on a fit-result (e.g., a broadcasted logarithm) which is then called static operation for clarity. An operation that is not static is dynamic.","category":"page"},{"location":"glossary/#machine-(object-of-type-Machine)","page":"Glossary","title":"machine (object of type Machine)","text":"","category":"section"},{"location":"glossary/","page":"Glossary","title":"Glossary","text":"An object consisting of:","category":"page"},{"location":"glossary/","page":"Glossary","title":"Glossary","text":"A model\nA fit-result (undefined until training)\nTraining arguments (one for each data argument of the model's associated fit method). A training argument is data used for training (subsampled by specifying rows=... in fit!) but also in evaluation (subsampled by specifying rows=... in predict, predict_mean, etc). Generally, there are two training arguments for supervised models, and just one for unsupervised models. Each argument is either a Source node, wrapping concrete data supplied to the machine constructor, or a Node, in the case of a learning network (see below). Both kinds of nodes can be called with an optional rows=... keyword argument to (lazily) return concrete data.","category":"page"},{"location":"glossary/","page":"Glossary","title":"Glossary","text":"In addition, machines store \"report\" metadata, for recording algorithm-specific statistics of training (eg, an internal estimate of generalization error, feature importances); and they cache information allowing the fit-result to be updated without repeating unnecessary information.","category":"page"},{"location":"glossary/","page":"Glossary","title":"Glossary","text":"Machines are trained by calls to a fit! 
method which may be passed an optional argument specifying the rows of data to be used in training.","category":"page"},{"location":"glossary/","page":"Glossary","title":"Glossary","text":"For more, see the Machines section.","category":"page"},{"location":"glossary/#Learning-Networks-and-Composite-Models","page":"Glossary","title":"Learning Networks and Composite Models","text":"","category":"section"},{"location":"glossary/","page":"Glossary","title":"Glossary","text":"Note: Multiple machines in a learning network may share the same model, and multiple learning nodes may share the same machine.","category":"page"},{"location":"glossary/#source-node-(object-of-type-Source)","page":"Glossary","title":"source node (object of type Source)","text":"","category":"section"},{"location":"glossary/","page":"Glossary","title":"Glossary","text":"A container for training data and point of entry for new data in a learning network (see below).","category":"page"},{"location":"glossary/#node-(object-of-type-Node)","page":"Glossary","title":"node (object of type Node)","text":"","category":"section"},{"location":"glossary/","page":"Glossary","title":"Glossary","text":"Essentially a machine (whose arguments are possibly other nodes) wrapped in an associated operation (e.g., predict or inverse_transform). It consists primarily of:","category":"page"},{"location":"glossary/","page":"Glossary","title":"Glossary","text":"An operation, static or dynamic.\nA machine, or nothing if the operation is static.\nUpstream connections to other nodes, specified by a list of arguments (one for each argument of the operation). These are the arguments on which the operation \"acts\" when the node N is called, as in N().","category":"page"},{"location":"glossary/#learning-network","page":"Glossary","title":"learning network","text":"","category":"section"},{"location":"glossary/","page":"Glossary","title":"Glossary","text":"A directed acyclic graph implicit in the connections of a collection of source(s) and nodes. 
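As a minimal sketch (assuming a feature table X, a target vector y, and some supervised model instance model are already in scope), the following defines a two-machine learning network that standardizes the input before prediction:\n\nXs = source(X)\nys = source(y)\nstand = machine(Standardizer(), Xs)\nW = transform(stand, Xs)\nmach = machine(model, W, ys)\nyhat = predict(mach, W)\n\nCalling fit!(yhat) trains every machine in the network, and yhat() then returns concrete predictions. 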
","category":"page"},{"location":"glossary/#wrapper","page":"Glossary","title":"wrapper","text":"","category":"section"},{"location":"glossary/","page":"Glossary","title":"Glossary","text":"Any model with one or more other models as hyper-parameters.","category":"page"},{"location":"glossary/#composite-model","page":"Glossary","title":"composite model","text":"","category":"section"},{"location":"glossary/","page":"Glossary","title":"Glossary","text":"Any wrapper, or any learning network, \"exported\" as a model (see Composing Models).","category":"page"},{"location":"models/ProbabilisticSGDClassifier_MLJScikitLearnInterface/#ProbabilisticSGDClassifier_MLJScikitLearnInterface","page":"ProbabilisticSGDClassifier","title":"ProbabilisticSGDClassifier","text":"","category":"section"},{"location":"models/ProbabilisticSGDClassifier_MLJScikitLearnInterface/","page":"ProbabilisticSGDClassifier","title":"ProbabilisticSGDClassifier","text":"ProbabilisticSGDClassifier","category":"page"},{"location":"models/ProbabilisticSGDClassifier_MLJScikitLearnInterface/","page":"ProbabilisticSGDClassifier","title":"ProbabilisticSGDClassifier","text":"A model type for constructing a probabilistic sgd classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/ProbabilisticSGDClassifier_MLJScikitLearnInterface/","page":"ProbabilisticSGDClassifier","title":"ProbabilisticSGDClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/ProbabilisticSGDClassifier_MLJScikitLearnInterface/","page":"ProbabilisticSGDClassifier","title":"ProbabilisticSGDClassifier","text":"ProbabilisticSGDClassifier = @load ProbabilisticSGDClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/ProbabilisticSGDClassifier_MLJScikitLearnInterface/","page":"ProbabilisticSGDClassifier","title":"ProbabilisticSGDClassifier","text":"Do model = ProbabilisticSGDClassifier() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in ProbabilisticSGDClassifier(loss=...).","category":"page"},{"location":"models/ProbabilisticSGDClassifier_MLJScikitLearnInterface/#Hyper-parameters","page":"ProbabilisticSGDClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/ProbabilisticSGDClassifier_MLJScikitLearnInterface/","page":"ProbabilisticSGDClassifier","title":"ProbabilisticSGDClassifier","text":"loss = log_loss\npenalty = l2\nalpha = 0.0001\nl1_ratio = 0.15\nfit_intercept = true\nmax_iter = 1000\ntol = 0.001\nshuffle = true\nverbose = 0\nepsilon = 0.1\nn_jobs = nothing\nrandom_state = nothing\nlearning_rate = optimal\neta0 = 0.0\npower_t = 0.5\nearly_stopping = false\nvalidation_fraction = 0.1\nn_iter_no_change = 5\nclass_weight = nothing\nwarm_start = false\naverage = false","category":"page"},{"location":"models/HuberRegressor_MLJScikitLearnInterface/#HuberRegressor_MLJScikitLearnInterface","page":"HuberRegressor","title":"HuberRegressor","text":"","category":"section"},{"location":"models/HuberRegressor_MLJScikitLearnInterface/","page":"HuberRegressor","title":"HuberRegressor","text":"HuberRegressor","category":"page"},{"location":"models/HuberRegressor_MLJScikitLearnInterface/","page":"HuberRegressor","title":"HuberRegressor","text":"A model type for constructing a Huber regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/HuberRegressor_MLJScikitLearnInterface/","page":"HuberRegressor","title":"HuberRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/HuberRegressor_MLJScikitLearnInterface/","page":"HuberRegressor","title":"HuberRegressor","text":"HuberRegressor = @load HuberRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/HuberRegressor_MLJScikitLearnInterface/","page":"HuberRegressor","title":"HuberRegressor","text":"Do model = HuberRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in HuberRegressor(epsilon=...).","category":"page"},{"location":"models/HuberRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"HuberRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/HuberRegressor_MLJScikitLearnInterface/","page":"HuberRegressor","title":"HuberRegressor","text":"epsilon = 1.35\nmax_iter = 100\nalpha = 0.0001\nwarm_start = false\nfit_intercept = true\ntol = 1.0e-5","category":"page"},{"location":"models/KPLSRegressor_PartialLeastSquaresRegressor/#KPLSRegressor_PartialLeastSquaresRegressor","page":"KPLSRegressor","title":"KPLSRegressor","text":"","category":"section"},{"location":"models/KPLSRegressor_PartialLeastSquaresRegressor/","page":"KPLSRegressor","title":"KPLSRegressor","text":"A Kernel Partial Least Squares Regressor. A Kernel PLS2 NIPALS algorithms. 
Can be used mainly for regression.","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/#EpsilonSVR_LIBSVM","page":"EpsilonSVR","title":"EpsilonSVR","text":"","category":"section"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"EpsilonSVR","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"A model type for constructing a ϵ-support vector regressor, based on LIBSVM.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"EpsilonSVR = @load EpsilonSVR pkg=LIBSVM","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"Do model = EpsilonSVR() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EpsilonSVR(kernel=...).","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): \"LIBSVM: a library for support vector machines.\" ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf. ","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"This model is an adaptation of the classifier SVC to regression, but has an additional parameter epsilon (denoted ϵ in the cited reference).","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/#Training-data","page":"EpsilonSVR","title":"Training data","text":"","category":"section"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"In MLJ or MLJBase, bind an instance model to data with:","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"where","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)\ny: is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/#Hyper-parameters","page":"EpsilonSVR","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. 
Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see \"Examples\" below).\nLIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2\nLIBSVM.Kernel.Polynomial: (x1, x2) -> (gamma*x1'*x2 + coef0)^degree\nLIBSVM.Kernel.RadialBasis: (x1, x2) -> (exp(-gamma*norm(x1 - x2)^2))\nLIBSVM.Kernel.Sigmoid: (x1, x2) -> tanh(gamma*x1'*x2 + coef0)\nHere gamma, coef0, degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions. See LIBSVM.jl issue 91\ngamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. Actual value used appears in the report (see below).\ncoef0 = 0.0: kernel parameter (see above)\ndegree::Int32 = Int32(3): degree in polynomial kernel (see above)\ncost=1.0 (range (0, Inf)): the parameter denoted C in the cited reference; for greater regularization, decrease cost\nepsilon=0.1 (range (0, Inf)): the parameter denoted ϵ in the cited reference; epsilon is the thickness of the penalty-free neighborhood of the graph of the prediction function (\"slab\" or \"tube\"). Specifically, a data point (x, y) incurs no training loss unless it is outside this neighborhood; the further away it is from this neighborhood, the greater the loss penalty.\ncachesize=200.0: cache memory size in MB\ntolerance=0.001: tolerance for the stopping criterion\nshrinking=true: whether to use shrinking heuristics","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/#Operations","page":"EpsilonSVR","title":"Operations","text":"","category":"section"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above.","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/#Fitted-parameters","page":"EpsilonSVR","title":"Fitted parameters","text":"","category":"section"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"libsvm_model: the trained model object created by the LIBSVM.jl package","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/#Report","page":"EpsilonSVR","title":"Report","text":"","category":"section"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"The fields of report(mach) are:","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"gamma: actual value of the kernel parameter gamma used in training","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/#Examples","page":"EpsilonSVR","title":"Examples","text":"","category":"section"},{"location":"models/EpsilonSVR_LIBSVM/#Using-a-built-in-kernel","page":"EpsilonSVR","title":"Using a built-in kernel","text":"","category":"section"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"using MLJ\nimport LIBSVM\n\nEpsilonSVR = @load EpsilonSVR pkg=LIBSVM ## model type\nmodel = EpsilonSVR(kernel=LIBSVM.Kernel.Polynomial) ## instance\n\nX, y = make_regression(rng=123) ## table, vector\nmach = machine(model, X, y) |> fit!\n\nXnew, _ = make_regression(3, rng=123)\n\njulia> yhat = predict(mach, Xnew)\n3-element Vector{Float64}:\n 0.2512132502584155\n 
0.007340201523624579\n -0.2482949812264707","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/#User-defined-kernels","page":"EpsilonSVR","title":"User-defined kernels","text":"","category":"section"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"k(x1, x2) = x1'*x2 ## equivalent to `LIBSVM.Kernel.Linear`\nmodel = EpsilonSVR(kernel=k)\nmach = machine(model, X, y) |> fit!\n\njulia> yhat = predict(mach, Xnew)\n3-element Vector{Float64}:\n 1.1121225361666656\n 0.04667702229741916\n -0.6958148424680672","category":"page"},{"location":"models/EpsilonSVR_LIBSVM/","page":"EpsilonSVR","title":"EpsilonSVR","text":"See also NuSVR, LIBSVM.jl and the original C implementation documentation.","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/#EvoSplineRegressor_EvoLinear","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"","category":"section"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"EvoSplineRegressor(; kwargs...)","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"A model type for constructing an EvoSplineRegressor, based on EvoLinear.jl, and implementing both an internal API and the MLJ model interface.","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/#Keyword-arguments","page":"EvoSplineRegressor","title":"Keyword arguments","text":"","category":"section"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"loss=:mse: loss function to be minimised. Can be one of:\n:mse\n:logistic\n:poisson\n:gamma\n:tweedie\nnrounds=10: maximum number of training rounds.\neta=1: Learning rate. Typically in the range [1e-2, 1].\nL1=0: Regularization penalty applied by shrinking to 0 weight update if update is < L1. No penalty if update > L1. Results in sparse feature selection. Typically in the [0, 1] range on normalized features.\nL2=0: Regularization penalty applied to the square of the weight update value. Restricts large parameter values. Typically in the [0, 1] range on normalized features.\nrng=123: random seed. Not used at the moment.\nupdater=:all: training method. Only :all is supported at the moment. Gradients for each feature are computed simultaneously, then the bias is updated based on all feature updates.\ndevice=:cpu: Only :cpu is supported at the moment.","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/#Internal-API","page":"EvoSplineRegressor","title":"Internal API","text":"","category":"section"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"Do config = EvoSplineRegressor() to construct a hyper-parameter struct with default hyper-parameters. 
Provide keyword arguments as listed above to override defaults, for example:","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"EvoSplineRegressor(loss=:logistic, L1=1e-3, L2=1e-2, nrounds=100)","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/#Training-model","page":"EvoSplineRegressor","title":"Training model","text":"","category":"section"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"A model is built using fit:","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"config = EvoSplineRegressor()\nm = fit(config; x, y, w)","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/#Inference","page":"EvoSplineRegressor","title":"Inference","text":"","category":"section"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"The fitted result is an EvoLinearModel which acts as a prediction function when passed a features matrix as argument. ","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"preds = m(x)","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/#MLJ-Interface","page":"EvoSplineRegressor","title":"MLJ Interface","text":"","category":"section"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"From MLJ, the type can be imported using:","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"EvoSplineRegressor = @load EvoSplineRegressor pkg=EvoLinear","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"Do model = EvoSplineRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in EvoSplineRegressor(loss=...).","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/#Training-model-2","page":"EvoSplineRegressor","title":"Training model","text":"","category":"section"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"In MLJ or MLJBase, bind an instance model to data with mach = machine(model, X, y) where: ","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)\ny: is the target, which can be any AbstractVector whose element scitype is <:Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/#Operations","page":"EvoSplineRegressor","title":"Operations","text":"","category":"section"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"predict(mach, Xnew): return predictions of the target given","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"features Xnew having the same scitype as X above. Predictions are deterministic.","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/#Fitted-parameters","page":"EvoSplineRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":":fitresult: the SplineModel object returned by EvoSplineRegressor fitting algorithm.","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/#Report","page":"EvoSplineRegressor","title":"Report","text":"","category":"section"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":"The fields of report(mach) are:","category":"page"},{"location":"models/EvoSplineRegressor_EvoLinear/","page":"EvoSplineRegressor","title":"EvoSplineRegressor","text":":coef: Vector of coefficients (βs) associated to each of the features.\n:bias: Value of the bias.\n:names: Names of each of the features.","category":"page"},{"location":"models/RandomForestRegressor_BetaML/#RandomForestRegressor_BetaML","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"","category":"section"},{"location":"models/RandomForestRegressor_BetaML/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"mutable struct RandomForestRegressor <: MLJModelInterface.Deterministic","category":"page"},{"location":"models/RandomForestRegressor_BetaML/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"A simple Random Forest model for regression with support for Missing data, from the Beta Machine Learning Toolkit 
(BetaML).","category":"page"},{"location":"models/RandomForestRegressor_BetaML/#Hyperparameters:","page":"RandomForestRegressor","title":"Hyperparameters:","text":"","category":"section"},{"location":"models/RandomForestRegressor_BetaML/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"n_trees::Int64: Number of (decision) trees in the forest [def: 30]\nmax_depth::Int64: The maximum depth the tree is allowed to reach. When this is reached the node is forced to become a leaf [def: 0, i.e. no limits]\nmin_gain::Float64: The minimum information gain to allow for a node's partition [def: 0]\nmin_records::Int64: The minimum number of records a node must hold to consider a partition of it [def: 2]\nmax_features::Int64: The maximum number of (random) features to consider at each partitioning [def: 0, i.e. square root of the data dimension]\nsplitting_criterion::Function: This is the name of the function to be used to compute the information gain of a specific partition. This is done by measuring the difference between the \"impurity\" of the labels of the parent node and those of the two child nodes, weighted by the respective number of items. [def: variance]. Either variance or a custom function. It can also be an anonymous function.\nβ::Float64: Parameter that regulates the weights of the scoring of each tree, to be (optionally) used in prediction based on the error of the individual trees computed on the records on which trees have not been trained. Higher values favour \"better\" trees, but too high values will cause overfitting [def: 0, i.e. uniform weights]\nrng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]","category":"page"},{"location":"models/RandomForestRegressor_BetaML/#Example:","page":"RandomForestRegressor","title":"Example:","text":"","category":"section"},{"location":"models/RandomForestRegressor_BetaML/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"julia> using MLJ\n\njulia> X, y = @load_boston;\n\njulia> modelType = @load RandomForestRegressor pkg = \"BetaML\" verbosity=0\nBetaML.Trees.RandomForestRegressor\n\njulia> model = modelType()\nRandomForestRegressor(\n n_trees = 30, \n max_depth = 0, \n min_gain = 0.0, \n min_records = 2, \n max_features = 0, \n splitting_criterion = BetaML.Utils.variance, \n β = 0.0, \n rng = Random._GLOBAL_RNG())\n\njulia> mach = machine(model, X, y);\n\njulia> fit!(mach);\n[ Info: Training machine(RandomForestRegressor(n_trees = 30, …), …).\n\njulia> ŷ = predict(mach, X);\n\njulia> hcat(y,ŷ)\n506×2 Matrix{Float64}:\n 24.0 25.8433\n 21.6 22.4317\n 34.7 35.5742\n 33.4 33.9233\n ⋮ \n 23.9 24.42\n 22.0 22.4433\n 11.9 15.5833","category":"page"},{"location":"models/KMeans_ParallelKMeans/#KMeans_ParallelKMeans","page":"KMeans","title":"KMeans","text":"","category":"section"},{"location":"models/KMeans_ParallelKMeans/","page":"KMeans","title":"KMeans","text":"Parallel & lightning fast implementation of all available variants of the KMeans clustering algorithm in native Julia. 
Compatible with Julia 1.3+","category":"page"},{"location":"models/BisectingKMeans_MLJScikitLearnInterface/#BisectingKMeans_MLJScikitLearnInterface","page":"BisectingKMeans","title":"BisectingKMeans","text":"","category":"section"},{"location":"models/BisectingKMeans_MLJScikitLearnInterface/","page":"BisectingKMeans","title":"BisectingKMeans","text":"BisectingKMeans","category":"page"},{"location":"models/BisectingKMeans_MLJScikitLearnInterface/","page":"BisectingKMeans","title":"BisectingKMeans","text":"A model type for constructing a bisecting k means, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/BisectingKMeans_MLJScikitLearnInterface/","page":"BisectingKMeans","title":"BisectingKMeans","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/BisectingKMeans_MLJScikitLearnInterface/","page":"BisectingKMeans","title":"BisectingKMeans","text":"BisectingKMeans = @load BisectingKMeans pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/BisectingKMeans_MLJScikitLearnInterface/","page":"BisectingKMeans","title":"BisectingKMeans","text":"Do model = BisectingKMeans() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BisectingKMeans(n_clusters=...).","category":"page"},{"location":"models/BisectingKMeans_MLJScikitLearnInterface/","page":"BisectingKMeans","title":"BisectingKMeans","text":"Bisecting K-Means clustering.","category":"page"},{"location":"logging_workflows/#Logging-Workflows","page":"Logging Workflows using MLflow","title":"Logging Workflows","text":"","category":"section"},{"location":"logging_workflows/#MLflow-integration","page":"Logging Workflows using MLflow","title":"MLflow integration","text":"","category":"section"},{"location":"logging_workflows/","page":"Logging Workflows using MLflow","title":"Logging Workflows using MLflow","text":"MLflow is a popular, language-agnostic, tool for externally logging the outcomes of machine learning experiments, including those carried out using MLJ.","category":"page"},{"location":"logging_workflows/","page":"Logging Workflows using MLflow","title":"Logging Workflows using MLflow","text":"MLJ logging examples are given in the MLJFlow.jl documentation. MLJ includes and re-exports all the methods of MLJFlow.jl, so there is no need to import MLJFlow.jl if using MLJ.","category":"page"},{"location":"logging_workflows/","page":"Logging Workflows using MLflow","title":"Logging Workflows using MLflow","text":"warning: Warning\nMLJFlow.jl is a new package still under active development and should be regarded as experimental. 
At this time, breaking changes to MLJFlow.jl will not necessarily trigger new breaking releases of MLJ.jl.","category":"page"},{"location":"models/ComplementNBClassifier_MLJScikitLearnInterface/#ComplementNBClassifier_MLJScikitLearnInterface","page":"ComplementNBClassifier","title":"ComplementNBClassifier","text":"","category":"section"},{"location":"models/ComplementNBClassifier_MLJScikitLearnInterface/","page":"ComplementNBClassifier","title":"ComplementNBClassifier","text":"ComplementNBClassifier","category":"page"},{"location":"models/ComplementNBClassifier_MLJScikitLearnInterface/","page":"ComplementNBClassifier","title":"ComplementNBClassifier","text":"A model type for constructing a Complement naive Bayes classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/ComplementNBClassifier_MLJScikitLearnInterface/","page":"ComplementNBClassifier","title":"ComplementNBClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/ComplementNBClassifier_MLJScikitLearnInterface/","page":"ComplementNBClassifier","title":"ComplementNBClassifier","text":"ComplementNBClassifier = @load ComplementNBClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/ComplementNBClassifier_MLJScikitLearnInterface/","page":"ComplementNBClassifier","title":"ComplementNBClassifier","text":"Do model = ComplementNBClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ComplementNBClassifier(alpha=...).","category":"page"},{"location":"models/ComplementNBClassifier_MLJScikitLearnInterface/","page":"ComplementNBClassifier","title":"ComplementNBClassifier","text":"Similar to MultinomialNBClassifier but with more robust assumptions. 
Suited for imbalanced datasets.","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/#RobustRegressor_MLJLinearModels","page":"RobustRegressor","title":"RobustRegressor","text":"","category":"section"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"RobustRegressor","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"A model type for constructing a robust regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"RobustRegressor = @load RobustRegressor pkg=MLJLinearModels","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"Do model = RobustRegressor() to construct an instance with default hyper-parameters.","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"Robust regression is a linear model with objective function","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"$","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"∑ρ(Xθ - y) + n⋅λ|θ|₂² + n⋅γ|θ|₁ $","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"where ρ is a robust loss function (e.g. the Huber function) and n is the number of observations.","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"If scale_penalty_with_samples = false the objective function is instead","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"$","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"∑ρ(Xθ - y) + λ|θ|₂² + γ|θ|₁ $","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":".","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"Different solver options exist, as indicated under \"Hyperparameters\" below. 
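For instance (an illustrative sketch using the hyper-parameter names listed below, and assuming the corresponding names from MLJLinearModels are in scope), model = RobustRegressor(rho=HuberRho(0.1), penalty=:en, lambda=0.5, gamma=0.1, solver=FISTA()) combines a Huber loss with an elastic-net penalty solved by the FISTA proximal-gradient method. 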
","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/#Training-data","page":"RobustRegressor","title":"Training data","text":"","category":"section"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"where:","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)\ny is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/#Hyperparameters","page":"RobustRegressor","title":"Hyperparameters","text":"","category":"section"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"rho::MLJLinearModels.RobustRho: the type of robust loss, which can be any instance of MLJLinearModels.L where L is one of: AndrewsRho, BisquareRho, FairRho, HuberRho, LogisticRho, QuantileRho, TalwarRho, HuberRho, TalwarRho. Default: HuberRho(0.1)\nlambda::Real: strength of the regularizer if penalty is :l2 or :l1. Strength of the L2 regularizer if penalty is :en. Default: 1.0\ngamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0\npenalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2\nfit_intercept::Bool: whether to fit the intercept or not. Default: true\npenalize_intercept::Bool: whether to penalize the intercept. Default: false\nscale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true\nsolver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S where S is one of: LBFGS, IWLSCG, Newton, NewtonCG, if penalty = :l2, and ProxGrad otherwise.\nIf solver = nothing (default) then LBFGS() is used, if penalty = :l2, and otherwise ProxGrad(accel=true) (FISTA) is used.\nSolver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) 
Default: nothing","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/#Example","page":"RobustRegressor","title":"Example","text":"","category":"section"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"using MLJ\nX, y = make_regression()\nmach = fit!(machine(RobustRegressor(), X, y))\npredict(mach, X)\nfitted_params(mach)","category":"page"},{"location":"models/RobustRegressor_MLJLinearModels/","page":"RobustRegressor","title":"RobustRegressor","text":"See also HuberRegressor, QuantileRegressor.","category":"page"},{"location":"controlling_iterative_models/#Controlling-Iterative-Models","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"","category":"section"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"Iterative supervised machine learning models are usually trained until an out-of-sample estimate of the performance satisfies some stopping criterion, such as k consecutive deteriorations of the performance (see Patience below). A more sophisticated kind of control might dynamically mutate parameters, such as a learning rate, in response to the behavior of these estimates.","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"Some iterative model implementations enable some form of automated control, with the method and options for doing so varying from model to model. But sometimes it is up to the user to arrange control, which in the crudest case reduces to manually experimenting with the iteration parameter.","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"In response to this ad hoc state of affairs, MLJ provides a uniform and feature-rich interface for controlling any iterative model that exposes its iteration parameter as a hyper-parameter, and which implements the \"warm restart\" behavior described in Machines.","category":"page"},{"location":"controlling_iterative_models/#Basic-use","page":"Controlling Iterative Models","title":"Basic use","text":"","category":"section"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"As in Tuning Models, iteration control in MLJ is implemented as a model wrapper, which allows composition with other meta-algorithms. 
Ordinarily, the wrapped model behaves just like the original model, but with the training occurring on a subset of the provided data (to allow computation of an out-of-sample loss) and with the iteration parameter automatically determined by the controls specified in the wrapper.","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"By setting retrain=true one can ask that the wrapped model retrain on all supplied data, after learning the appropriate number of iterations from the controlled training phase:","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"using MLJ\n\nX, y = make_moons(100, rng=123, noise=0.5)\nEvoTreeClassifier = @load EvoTreeClassifier verbosity=0\n\niterated_model = IteratedModel(model=EvoTreeClassifier(rng=123, eta=0.005),\n resampling=Holdout(),\n measures=log_loss,\n controls=[Step(5),\n Patience(2),\n NumberLimit(100)],\n retrain=true)\n\nmach = machine(iterated_model, X, y)\nnothing # hide","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"fit!(mach)","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"As detailed under IteratedModel below, the specified controls are repeatedly applied in sequence to a training machine, constructed under the hood, until one of the controls triggers a stop. Here Step(5) means \"Compute 5 more iterations\" (in this case starting from none); Patience(2) means \"Stop at the end of the control cycle if there have been 2 consecutive drops in the log loss\"; and NumberLimit(100) is a safeguard ensuring a stop after 100 control cycles (500 iterations). See Controls provided below for a complete list.","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"Because iteration is implemented as a wrapper, the \"self-iterating\" model can be evaluated using cross-validation, say, and the number of iterations on each fold will generally be different:","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"e = evaluate!(mach, resampling=CV(nfolds=3), measure=log_loss, verbosity=0);\nmap(e.report_per_fold) do r\n r.n_iterations\nend","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"Alternatively, one might wrap the self-iterating model in a tuning strategy, using TunedModel; see Tuning Models. 
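A rough sketch of the latter (the nested range over the atomic model's max_depth is purely illustrative and not part of the example above):\n\nr = range(iterated_model, :(model.max_depth), lower=2, upper=6)\ntuned_iterated_model = TunedModel(model=iterated_model,\n tuning=Grid(),\n resampling=CV(nfolds=3),\n range=r,\n measure=log_loss)\nmach2 = machine(tuned_iterated_model, X, y)\nfit!(mach2)\n\n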
In this way, the optimization of some other hyper-parameter is realized simultaneously with that of the iteration parameter, which will frequently be more efficient than a direct two-parameter search.","category":"page"},{"location":"controlling_iterative_models/#Controls-provided","page":"Controlling Iterative Models","title":"Controls provided","text":"","category":"section"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"In the table below, mach is the training machine being iterated, constructed by binding the supplied data to the model specified in the IteratedModel wrapper, but trained in each iteration on a subset of the data, according to the value of the resampling hyper-parameter of the wrapper (using all data if resampling=nothing).","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"control description can trigger a stop\nStep(n=1) Train model for n more iterations no\nTimeLimit(t=0.5) Stop after t hours yes\nNumberLimit(n=100) Stop after n applications of the control yes\nNumberSinceBest(n=6) Stop when best loss occurred n control applications ago yes\nInvalidValue() Stop when NaN, Inf or -Inf loss/training loss encountered yes\nThreshold(value=0.0) Stop when loss < value yes\nGL(alpha=2.0) † Stop after the \"generalization loss (GL)\" exceeds alpha yes\nPQ(alpha=0.75, k=5) † Stop after \"progress-modified GL\" exceeds alpha yes\nPatience(n=5) † Stop after n consecutive loss increases yes\nWarmup(c; n=1) Wait for n loss updates before checking criteria c no\nInfo(f=identity) Log to Info the value of f(mach), where mach is the current machine no\nWarn(predicate; f=\"\") Log to Warn the value of f or f(mach), if predicate(mach) holds no\nError(predicate; f=\"\") Log to Error the value of f or f(mach), if predicate(mach) holds and then stop yes\nCallback(f=mach->nothing) Call f(mach) yes\nWithNumberDo(f=n->@info(n)) Call f(n + 1) where n is the number of complete control cycles so far yes\nWithIterationsDo(f=i->@info(\"iterations: $i\")) Call f(i), where i is the total number of iterations yes\nWithLossDo(f=x->@info(\"loss: $x\")) Call f(loss) where loss is the current loss yes\nWithTrainingLossesDo(f=v->@info(v)) Call f(v) where v is the current batch of training losses yes\nWithEvaluationDo(f=e->@info(\"evaluation: $e\")) Call f(e) where e is the current performance evaluation object yes\nWithFittedParamsDo(f=fp->@info(\"fitted_params: $fp\")) Call f(fp) where fp is the fitted parameters of the training machine yes\nWithReportDo(f=r->@info(\"report: $r\")) Call f(r) where r is the training machine report yes\nWithModelDo(f=m->@info(\"model: $m\")) Call f(m) where m is the model, which may be mutated by f yes\nWithMachineDo(f=mach->@info(\"machine: $mach\")) Call f(mach) where mach is the training machine in its current state yes\nSave(filename=\"machine.jls\") Save current training machine to machine1.jls, machine2.jls, etc yes","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"Table 1. Atomic controls. Some advanced options are omitted.","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"† For more on these controls see Prechelt, Lutz (1998): \"Early Stopping - But When?\", in Neural Networks: Tricks of the Trade, ed. G. 
Orr, Springer.","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"Stopping option. All the following controls trigger a stop if the provided function f returns true and stop_if_true=true is specified in the constructor: Callback, WithNumberDo, WithLossDo, WithTrainingLossesDo.","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"There are also three control wrappers to modify a control's behavior:","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"wrapper description\nIterationControl.skip(control, predicate=1) Apply control every predicate applications of the control wrapper (can also be a function; see doc-string)\nIterationControl.louder(control, by=1) Increase the verbosity level of control by the specified value (negative values lower verbosity)\nIterationControl.with_state_do(control; f=...) Apply control and call f(x) where x is the internal state of control; useful for debugging. Default f logs state to Info. Warning: internal control state is not yet part of the public API.\nIterationControl.composite(controls...) Apply each control in controls in sequence; used internally by IterationControl.jl","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"Table 2. Wrapped controls","category":"page"},{"location":"controlling_iterative_models/#Using-training-losses,-and-controlling-model-tuning","page":"Controlling Iterative Models","title":"Using training losses, and controlling model tuning","text":"","category":"section"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"Some iterative models report a training loss, as a byproduct of a fit! call and these can be used in two ways:","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"To supplement an out-of-sample estimate of the loss in deciding when to stop, as in the PQ stopping criterion (see Prechelt, Lutz (1998))); or\nAs a (generally less reliable) substitute for an out-of-sample loss, when wishing to train exclusively on all supplied data.","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"To have IteratedModel bind all data to the training machine and use training losses in place of an out-of-sample loss, specify resampling=nothing. To check if MyFavoriteIterativeModel reports training losses, load the model code and inspect supports_training_losses(MyFavoriteIterativeModel) (or do info(\"MyFavoriteIterativeModel\"))","category":"page"},{"location":"controlling_iterative_models/#Controlling-model-tuning","page":"Controlling Iterative Models","title":"Controlling model tuning","text":"","category":"section"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"An example of scenario 2 occurs when controlling hyperparameter optimization (model tuning). Recall that MLJ's TunedModel wrapper is implemented as an iterative model. 
Moreover, this wrapper reports, as a training loss, the lowest value of the optimization objective function so far (typically the lowest value of an out-of-sample loss, or -1 times an out-of-sample score). One may want to simply end the hyperparameter search when this value meets the NumberSinceBest stopping criterion discussed below, say, rather than introducing an extra layer of resampling to first \"learn\" the optimal value of the iteration parameter.","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"In the following example, we conduct a RandomSearch for the optimal value of the regularization parameter lambda in a RidgeRegressor using 6-fold cross-validation. By wrapping our \"self-tuning\" version of the regressor as an IteratedModel, with resampling=nothing and NumberSinceBest(20) in the controls, we terminate the search when the number of lambda values tested since the previous best cross-validation loss reaches 20.","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"using MLJ\n\nX, y = @load_boston;\nRidgeRegressor = @load RidgeRegressor pkg=MLJLinearModels verbosity=0\nmodel = RidgeRegressor()\nr = range(model, :lambda, lower=-1, upper=2, scale=x->10^x)\nself_tuning_model = TunedModel(model=model,\n tuning=RandomSearch(rng=123),\n resampling=CV(nfolds=6),\n range=r,\n measure=mae);\niterated_model = IteratedModel(model=self_tuning_model,\n resampling=nothing,\n controls=[Step(1), NumberSinceBest(20), NumberLimit(1000)])\nmach = machine(iterated_model, X, y)\nnothing # hide","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"fit!(mach)","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"report(mach).model_report.best_model","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"We can use mach here to directly obtain predictions using the optimal model (trained on all data), as in","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"predict(mach, selectrows(X, 1:4))","category":"page"},{"location":"controlling_iterative_models/#Custom-controls","page":"Controlling Iterative Models","title":"Custom controls","text":"","category":"section"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"Under the hood, control in MLJIteration is implemented using IterationControl.jl. Rather than iterating a training machine directly, we iterate a wrapped version of this object, which includes other information that a control may want to access, such as the MLJ evaluation object. This information is summarized under The training machine wrapper below.","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"Controls must implement two update! 
methods, one for initializing the control's state on the first application of the control (this state being external to the control struct) and one for all subsequent control applications, which generally updates the state as well. There are two optional methods: done, for specifying conditions triggering a stop, and takedown for specifying actions to perform at the end of controlled training.","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"We summarize the training algorithm, as it relates to controls, after giving a simple example.","category":"page"},{"location":"controlling_iterative_models/#Example-1-Non-uniform-iteration-steps","page":"Controlling Iterative Models","title":"Example 1 - Non-uniform iteration steps","text":"","category":"section"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"Below we define a control, IterateFromList(list), to train, on each application of the control, until the iteration count reaches the next value in a user-specified list, triggering a stop when the list is exhausted. For example, to train on iteration counts on a log scale, one might use IterateFromList([round(Int, 10^x) for x in range(1, 2, length=10)]).","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"In the code, wrapper is an object that wraps the training machine (see above). The variable n is a counter for control cycles (unused in this example).","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"import IterationControl # or MLJ.IterationControl\n\nstruct IterateFromList\n list::Vector{<:Int} # list of iteration parameter values\n IterateFromList(v) = new(unique(sort(v)))\nend\n\nfunction IterationControl.update!(control::IterateFromList, wrapper, verbosity, n)\n Δi = control.list[1]\n verbosity > 1 && @info \"Training $Δi more iterations. \"\n MLJIteration.train!(wrapper, Δi) # trains the training machine\n return (index = 2, )\nend\n\nfunction IterationControl.update!(control::IterateFromList, wrapper, verbosity, n, state)\n index = state.index\n Δi = control.list[index] - wrapper.n_iterations\n verbosity > 1 && @info \"Training $Δi more iterations. 
\"\n MLJIteration.train!(wrapper, Δi)\n return (index = index + 1, )\nend","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"The first update method will be called the first time the control is applied, returning an initialized state = (index = 2,), which is passed to the second update method, which is called on subsequent control applications (and which returns the updated state).","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"A done method articulates the criterion for stopping:","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"IterationControl.done(control::IterateFromList, state) =\n state.index > length(control.list)","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"For the sake of illustration, we'll implement a takedown method; its return value is included in the IteratedModel report:","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"IterationControl.takedown(control::IterateFromList, verbosity, state)\n verbosity > 1 && = @info \"Stepped through these values of the \"*\n \"iteration parameter: $(control.list)\"\n return (iteration_values=control.list, )\nend","category":"page"},{"location":"controlling_iterative_models/#The-training-machine-wrapper","page":"Controlling Iterative Models","title":"The training machine wrapper","text":"","category":"section"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"A training machine wrapper has these properties:","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"wrapper.machine - the training machine, type Machine\nwrapper.model - the mutable atomic model, coinciding with wrapper.machine.model\nwrapper.n_cycles - the number IterationControl.train!(wrapper, _) calls so far; generally the current control cycle count\nwrapper.n_iterations - the total number of iterations applied to the model so far\nwrapper.Δiterations - the number of iterations applied in the last IterationControl.train!(wrapper, _) call\nwrapper.loss - the out-of-sample loss (based on the first measure in measures)\nwrapper.training_losses - the last batch of training losses (if reported by model), an abstract vector of length wrapper.Δiteration.\nwrapper.evaluation - the complete MLJ performance evaluation object, which has the following properties: measure, measurement, per_fold, per_observation, fitted_params_per_fold, report_per_fold (here there is only one fold). For further details, see Evaluating Model Performance.","category":"page"},{"location":"controlling_iterative_models/#The-training-algorithm","page":"Controlling Iterative Models","title":"The training algorithm","text":"","category":"section"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"Here now is a simplified description of the training of an IteratedModel. 
First, the atomic model is bound in a machine - the training machine above - to a subset of the supplied data, and then wrapped in an object called wrapper below. To train the training machine machine for i more iterations, and update the other data in the wrapper, requires the call MLJIteration.train!(wrapper, i). Only controls can make this call (e.g., Step(...), or IterateFromList(...) above). If we assume for simplicity there is only a single control, called control, then training proceeds as follows:","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"n = 1 # initialize control cycle counter\nstate = update!(control, wrapper, verbosity, n)\nfinished = done(control, state)\n\n# subsequent training events:\nwhile !finished\n n += 1\n state = update!(control, wrapper, verbosity, n, state)\n finished = done(control, state)\nend\n\n# finalization:\nreturn takedown(control, verbosity, state)","category":"page"},{"location":"controlling_iterative_models/#Example-2-Cyclic-learning-rates","page":"Controlling Iterative Models","title":"Example 2 - Cyclic learning rates","text":"","category":"section"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"The control below implements a triangular cyclic learning rate policy, close to that described in L. N. Smith (2019): \"Cyclical Learning Rates for Training Neural Networks,\" 2017 IEEE Winter Conference on Applications of Computer Vision (WACV), Santa Rosa, CA, USA, pp. 464-472. [In that paper learning rates are mutated (slowly) during each training iteration (epoch) while here mutations can occur once per iteration of the model, at most.]","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"For the sake of illustration, we suppose the iterative model, model, specified in the IteratedModel constructor, has a field called :learning_parameter, and that mutating this parameter does not trigger cold-restarts.","category":"page"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"struct CycleLearningRate{F<:AbstractFloat}\n stepsize::Int\n lower::F\n upper::F\nend\n\n# return one cycle of learning rate values:\nfunction one_cycle(c::CycleLearningRate)\n rise = range(c.lower, c.upper, length=c.stepsize + 1)\n fall = reverse(rise)\n return vcat(rise[1:end - 1], fall[1:end - 1])\nend\n\nfunction IterationControl.update!(control::CycleLearningRate,\n wrapper,\n verbosity,\n n,\n state = (learning_rates=nothing, ))\n rates = n == 0 ? 
one_cycle(control) : state.learning_rates\n index = mod(n, length(rates)) + 1\n r = rates[index]\n verbosity > 1 && @info \"learning rate: $r\"\n wrapper.model.learning_parameter = r\n return (learning_rates = rates,)\nend","category":"page"},{"location":"controlling_iterative_models/#API-Reference","page":"Controlling Iterative Models","title":"API Reference","text":"","category":"section"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"MLJIteration.IteratedModel","category":"page"},{"location":"controlling_iterative_models/#MLJIteration.IteratedModel","page":"Controlling Iterative Models","title":"MLJIteration.IteratedModel","text":"IteratedModel(model;\n controls=MLJIteration.DEFAULT_CONTROLS,\n resampling=Holdout(),\n measure=nothing,\n retrain=false,\n advanced_options...,\n)\n\nWrap the specified supervised model in the specified iteration controls. Here model should support iteration, which is true if iteration_parameter(model) is different from nothing.\n\nAvailable controls: Step(), Info(), Warn(), Error(), Callback(), WithLossDo(), WithTrainingLossesDo(), WithNumberDo(), Data(), Disjunction(), GL(), InvalidValue(), Never(), NotANumber(), NumberLimit(), NumberSinceBest(), PQ(), Patience(), Threshold(), TimeLimit(), Warmup(), WithIterationsDo(), WithEvaluationDo(), WithFittedParamsDo(), WithReportDo(), WithMachineDo(), WithModelDo(), CycleLearningRate() and Save().\n\nimportant: Important\nTo make out-of-sample losses available to the controls, the wrapped model is only trained on part of the data, as iteration proceeds. The user may want to force retraining on all data after controlled iteration has finished by specifying retrain=true. See also \"Training\", and the retrain option, under \"Extended help\" below.\n\nExtended help\n\nOptions\n\ncontrols=Any[Step(1), Patience(5), GL(2.0), TimeLimit(Dates.Millisecond(108000)), InvalidValue()]: Controls are summarized at https://JuliaAI.github.io/MLJ.jl/dev/getting_started/ but query individual doc-strings for details and advanced options. For creating your own controls, refer to the documentation just cited.\nresampling=Holdout(fraction_train=0.7): The default resampling holds back 30% of data for computing an out-of-sample estimate of performance (the \"loss\") for loss-based controls such as WithLossDo. Specify resampling=nothing if all data is to be used for controlled iteration, with each out-of-sample loss replaced by the most recent training loss, assuming this is made available by the model (supports_training_losses(model) == true). If the model does not report a training loss, you can use resampling=InSample() instead. Otherwise, resampling must have type Holdout or be a vector with one element of the form (train_indices, test_indices).\nmeasure=nothing: StatisticalMeasures.jl compatible measure for estimating model performance (the \"loss\", but the orientation is immaterial - i.e., this could be a score). Inferred by default. Ignored if resampling=nothing.\nretrain=false: If retrain=true or resampling=nothing, iterated_model behaves exactly like the original model but with the iteration parameter automatically selected (\"learned\"). That is, the model is retrained on all available data, using the same number of iterations, once controlled iteration has stopped. This is typically desired if wrapping the iterated model further, or when inserting in a pipeline or other composite model. 
If retrain=false (default) and resampling isa Holdout, then iterated_model behaves like the original model trained on a subset of the provided data.\nweights=nothing: per-observation weights to be passed to measure where supported; if unspecified, these are understood to be uniform.\nclass_weights=nothing: class-weights to be passed to measure where supported; if unspecified, these are understood to be uniform.\noperation=nothing: Operation, such as predict or predict_mode, for computing target values, or proxy target values, for consumption by measure; automatically inferred by default.\ncheck_measure=true: Specify false to override checks on measure for compatibility with the training data.\niteration_parameter=nothing: A symbol, such as :epochs, naming the iteration parameter of model; inferred by default. Note that the actual value of the iteration parameter in the supplied model is ignored; only the value of an internal clone is mutated during training the wrapped model.\ncache=true: Whether or not model-specific representations of data are cached in between iteration parameter increments; specify cache=false to prioritize memory over speed.\n\nTraining\n\nTraining an instance iterated_model of IteratedModel on some data (by binding to a machine and calling fit!, for example) performs the following actions:\n\nAssuming resampling !== nothing, the data is split into train and test sets, according to the specified resampling strategy.\nA clone of the wrapped model, model, is bound to the train data in an internal machine, train_mach. If resampling === nothing, all data is used instead. This machine is the object to which controls are applied. For example, Callback(fitted_params |> print) will print the value of fitted_params(train_mach).\nThe iteration parameter of the clone is set to 0.\nThe specified controls are repeatedly applied to train_mach in sequence, until one of the controls triggers a stop. Loss-based controls (eg, Patience(), GL(), Threshold(0.001)) use an out-of-sample loss, obtained by applying measure to predictions and the test target values. (Specifically, these predictions are those returned by operation(train_mach).) If resampling === nothing then the most recent training loss is used instead. Some controls require both out-of-sample and training losses (eg, PQ()).\nOnce a stop has been triggered, a clone of model is bound to all data in a machine called mach_production below, unless retrain == false (the default) or resampling === nothing, in which case mach_production coincides with train_mach.\n\nPrediction\n\nCalling predict(mach, Xnew) in the example above returns predict(mach_production, Xnew). Similar statements hold for predict_mean, predict_mode, predict_median.\n\nControls that mutate parameters\n\nA control is permitted to mutate the fields (hyper-parameters) of train_mach.model (the clone of model). For example, to mutate a learning rate one might use the control\n\nCallback(mach -> mach.model.eta = 1.05*mach.model.eta)\n\nHowever, unless model supports warm restarts with respect to changes in that parameter, this will trigger retraining of train_mach from scratch, with a different training outcome, which is not recommended.\n\nWarm restarts\n\nIn the following example, the second fit! 
call will not restart training of the internal train_mach, assuming model supports warm restarts:\n\niterated_model = IteratedModel(\n model,\n controls = [Step(1), NumberLimit(100)],\n)\nmach = machine(iterated_model, X, y)\nfit!(mach) # train for 100 iterations\niterated_model.controls = [Step(1), NumberLimit(50)]\nfit!(mach) # train for an *extra* 50 iterations\n\nMore generally, if iterated_model is mutated and fit!(mach) is called again, then a warm restart is attempted if the only parameters to change are model or controls or both.\n\nSpecifically, train_mach.model is mutated to match the current value of iterated_model.model and the iteration parameter of the latter is updated to the last value used in the preceding fit!(mach) call. Then repeated application of the (updated) controls begins anew.\n\n\n\n\n\n","category":"function"},{"location":"controlling_iterative_models/#Controls","page":"Controlling Iterative Models","title":"Controls","text":"","category":"section"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"IterationControl.Step\nEarlyStopping.TimeLimit\nEarlyStopping.NumberLimit\nEarlyStopping.NumberSinceBest\nEarlyStopping.InvalidValue\nEarlyStopping.Threshold\nEarlyStopping.GL\nEarlyStopping.PQ\nEarlyStopping.Patience\nIterationControl.Info\nIterationControl.Warn\nIterationControl.Error\nIterationControl.Callback\nIterationControl.WithNumberDo\nMLJIteration.WithIterationsDo\nIterationControl.WithLossDo\nIterationControl.WithTrainingLossesDo\nMLJIteration.WithEvaluationDo\nMLJIteration.WithFittedParamsDo\nMLJIteration.WithReportDo\nMLJIteration.WithModelDo\nMLJIteration.WithMachineDo\nMLJIteration.Save","category":"page"},{"location":"controlling_iterative_models/#IterationControl.Step","page":"Controlling Iterative Models","title":"IterationControl.Step","text":"Step(; n=1)\n\nAn iteration control, as in, Step(2). \n\nTrain for n more iterations. Will never trigger a stop. \n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#EarlyStopping.TimeLimit","page":"Controlling Iterative Models","title":"EarlyStopping.TimeLimit","text":"TimeLimit(; t=0.5)\n\nAn early stopping criterion for loss-reporting iterative algorithms. \n\nStopping is triggered after t hours have elapsed since the stopping criterion was initiated.\n\nAny Julia built-in Real type can be used for t. Subtypes of Period may also be used, as in TimeLimit(t=Minute(30)).\n\nInternally, t is rounded to the nearest millisecond.\n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#EarlyStopping.NumberLimit","page":"Controlling Iterative Models","title":"EarlyStopping.NumberLimit","text":"NumberLimit(; n=100)\n\nAn early stopping criterion for loss-reporting iterative algorithms. \n\nA stop is triggered by n consecutive loss updates, excluding \"training\" loss updates.\n\nIf wrapped in a stopper::EarlyStopper, this is the number of calls to done!(stopper).\n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#EarlyStopping.NumberSinceBest","page":"Controlling Iterative Models","title":"EarlyStopping.NumberSinceBest","text":"NumberSinceBest(; n=6)\n\nAn early stopping criterion for loss-reporting iterative algorithms. \n\nA stop is triggered when the number of calls to the control, since the lowest value of the loss so far, is n.\n\nFor a customizable loss-based stopping criterion, use WithLossDo or WithTrainingLossesDo with the stop_if_true=true option. 
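For example, as a sketch only (the 0.9 threshold is arbitrary), WithLossDo(f=loss -> loss > 0.9, stop_if_true=true) triggers a stop as soon as the reported loss exceeds 0.9. 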
\n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#EarlyStopping.InvalidValue","page":"Controlling Iterative Models","title":"EarlyStopping.InvalidValue","text":"InvalidValue()\n\nAn early stopping criterion for loss-reporting iterative algorithms. \n\nStop if a loss (or training loss) is NaN, Inf or -Inf (or, more precisely, if isnan(loss) or isinf(loss) is true).\n\nFor a customizable loss-based stopping criterion, use WithLossDo or WithTrainingLossesDo with the stop_if_true=true option. \n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#EarlyStopping.Threshold","page":"Controlling Iterative Models","title":"EarlyStopping.Threshold","text":"Threshold(; value=0.0)\n\nAn early stopping criterion for loss-reporting iterative algorithms. \n\nA stop is triggered as soon as the loss drops below value.\n\nFor a customizable loss-based stopping criterion, use WithLossDo or WithTrainingLossesDo with the stop_if_true=true option. \n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#EarlyStopping.GL","page":"Controlling Iterative Models","title":"EarlyStopping.GL","text":"GL(; alpha=2.0)\n\nAn early stopping criterion for loss-reporting iterative algorithms. \n\nA stop is triggered when the (rescaled) generalization loss exceeds the threshold alpha.\n\nTerminology. Suppose E_1 E_2 E_t are a sequence of losses, for example, out-of-sample estimates of the loss associated with some iterative machine learning algorithm. Then the generalization loss at time t, is given by\n\nGL_t = 100 (E_t - E_opt) over E_opt\n\nwhere E_opt is the minimum value of the sequence.\n\nReference: Prechelt, Lutz (1998): \"Early Stopping- But When?\", in Neural Networks: Tricks of the Trade, ed. G. Orr, Springer..\n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#EarlyStopping.PQ","page":"Controlling Iterative Models","title":"EarlyStopping.PQ","text":"PQ(; alpha=0.75, k=5, tol=eps(Float64))\n\nA stopping criterion for training iterative supervised learners.\n\nA stop is triggered when Prechelt's progress-modified generalization loss exceeds the threshold PQ_T alpha, or if the training progress drops below P_j tol. Here k is the number of training (in-sample) losses used to estimate the training progress.\n\nContext and explanation of terminology\n\nThe training progress at time j is defined by\n\nP_j = 1000 M - mm\n\nwhere M is the mean of the last k training losses F_1 F_2 F_k and m is the minimum value of those losses.\n\nThe progress-modified generalization loss at time t is then given by\n\nPQ_t = GL_t P_t\n\nwhere GL_t is the generalization loss at time t; see GL.\n\nPQ will stop when the following are true:\n\nAt least k training samples have been collected via done!(c::PQ, loss; training = true) or update_training(c::PQ, loss, state)\nThe last update was an out-of-sample update. (done!(::PQ, loss; training=true) is always false)\nThe progress-modified generalization loss exceeds the threshold PQ_t alpha OR the training progress stalls P_j tol.\n\nReference: Prechelt, Lutz (1998): \"Early Stopping- But When?\", in Neural Networks: Tricks of the Trade, ed. G. Orr, Springer..\n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#EarlyStopping.Patience","page":"Controlling Iterative Models","title":"EarlyStopping.Patience","text":"Patience(; n=5)\n\nAn early stopping criterion for loss-reporting iterative algorithms. 
\n\nA stop is triggered by n consecutive increases in the loss.\n\nDenoted \"UPs\" in Prechelt, Lutz (1998): \"Early Stopping- But When?\", in Neural Networks: Tricks of the Trade, ed. G. Orr, Springer.\n\nFor a customizable loss-based stopping criterion, use WithLossDo or WithTrainingLossesDo with the stop_if_true=true option. \n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#IterationControl.Info","page":"Controlling Iterative Models","title":"IterationControl.Info","text":"Info(f=identity)\n\nAn iteration control, as in, Info(my_loss_function). \n\nLog to Info the value of f(m), where m is the object being iterated. If IterationControl.expose(m) has been overloaded, then log f(expose(m)) instead.\n\nCan be suppressed by setting the global verbosity level sufficiently low. \n\nSee also Warn, Error. \n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#IterationControl.Warn","page":"Controlling Iterative Models","title":"IterationControl.Warn","text":"Warn(predicate; f=\"\")\n\nAn iteration control, as in, Warn(m -> length(m.cache) > 100, f=\"Memory low\"). \n\nIf predicate(m) is true, then log to Warn the value of f (or f(IterationControl.expose(m)) if f is a function). Here m is the object being iterated.\n\nCan be suppressed by setting the global verbosity level sufficiently low.\n\nSee also Info, Error. \n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#IterationControl.Error","page":"Controlling Iterative Models","title":"IterationControl.Error","text":"Error(predicate; f=\"\", exception=nothing)\n\nAn iteration control, as in, Error(m -> isnan(m.bias), f=\"Bias overflow!\"). \n\nIf predicate(m) is true, then log at the Error level the value of f (or f(IterationControl.expose(m)) if f is a function) and stop iteration at the end of the current control cycle. Here m is the object being iterated.\n\nSpecify exception=... to throw an immediate exception, without waiting until the end of the control cycle.\n\nSee also Info, Warn. \n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#IterationControl.Callback","page":"Controlling Iterative Models","title":"IterationControl.Callback","text":"Callback(f=_->nothing, stop_if_true=false, stop_message=nothing, raw=false)\n\nAn iteration control, as in, Callback(m->put!(v, my_loss_function(m))). \n\nCall f(IterationControl.expose(m)), where m is the object being iterated, unless raw=true, in which case call f(m) (guaranteed if expose has not been overloaded). If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified. \n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#IterationControl.WithNumberDo","page":"Controlling Iterative Models","title":"IterationControl.WithNumberDo","text":"WithNumberDo(f=n->@info(\"number: $n\"), stop_if_true=false, stop_message=nothing)\n\nAn iteration control, as in, WithNumberDo(n->put!(my_channel, n)). \n\nCall f(n + 1), where n is the number of complete control cycles of the control (so, n = 1, 2, 3, ..., unless the control is wrapped in IterationControl.skip).\n\nIf stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified. 
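For example, as a sketch only (the limit of 100 is arbitrary), WithNumberDo(f=n -> n > 100, stop_if_true=true) triggers a stop once more than 100 control cycles have completed. 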
\n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#MLJIteration.WithIterationsDo","page":"Controlling Iterative Models","title":"MLJIteration.WithIterationsDo","text":"WithIterationsDo(f=x->@info(\"iterations: $x\"), stop_if_true=false, stop_message=nothing)\n\nAn iteration control, as in, WithIterationsDo(x->put!(my_channel, x)). \n\nCall f(x), where x is the current number of model iterations (generally more than the number of control cycles). If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified. \n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#IterationControl.WithLossDo","page":"Controlling Iterative Models","title":"IterationControl.WithLossDo","text":"WithLossDo(f=x->@info(\"loss: $x\"), stop_if_true=false, stop_message=nothing)\n\nAn iteration control, as in, WithLossDo(x->put!(my_losses, x)). \n\nCall f(loss), where loss is current loss.\n\nIf stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified. \n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#IterationControl.WithTrainingLossesDo","page":"Controlling Iterative Models","title":"IterationControl.WithTrainingLossesDo","text":"WithTrainingLossesDo(f=v->@info(\"training: $v\"), stop_if_true=false, stop_message=nothing)\n\nAn iteration control, as in, WithTrainingLossesDo(v->put!(my_losses, last(v)). \n\nCall f(training_losses), where training_losses is the vector of most recent batch of training losses.\n\nIf stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified. \n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#MLJIteration.WithEvaluationDo","page":"Controlling Iterative Models","title":"MLJIteration.WithEvaluationDo","text":"WithEvaluationDo(f=x->@info(\"evaluation: $x\"), stop_if_true=false, stop_message=nothing)\n\nAn iteration control, as in, WithEvaluationDo(x->put!(my_channel, x)). \n\nCall f(x), where x is the latest performance evaluation, as returned by evaluate!(train_mach, resampling=..., ...). Not valid if resampling=nothing. If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified. \n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#MLJIteration.WithFittedParamsDo","page":"Controlling Iterative Models","title":"MLJIteration.WithFittedParamsDo","text":"WithFittedParamsDo(f=x->@info(\"fitted_params: $x\"), stop_if_true=false, stop_message=nothing)\n\nAn iteration control, as in, WithFittedParamsDo(x->put!(my_channel, x)). \n\nCall f(x), where x = fitted_params(mach) is the fitted parameters of the training machine, mach, in its current state. If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified. \n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#MLJIteration.WithReportDo","page":"Controlling Iterative Models","title":"MLJIteration.WithReportDo","text":"WithReportDo(f=x->@info(\"report: $x\"), stop_if_true=false, stop_message=nothing)\n\nAn iteration control, as in, WithReportDo(x->put!(my_channel, x)). \n\nCall f(x), where x = report(mach) is the report associated with the training machine, mach, in its current state. 
If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified. \n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#MLJIteration.WithModelDo","page":"Controlling Iterative Models","title":"MLJIteration.WithModelDo","text":"WithModelDo(f=x->@info(\"model: $x\"), stop_if_true=false, stop_message=nothing)\n\nAn iteration control, as in, WithModelDo(x->put!(my_channel, x)). \n\nCall f(x), where x is the model associated with the training machine; f may mutate x, as in f(x) = (x.learning_rate *= 0.9). If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified. \n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#MLJIteration.WithMachineDo","page":"Controlling Iterative Models","title":"MLJIteration.WithMachineDo","text":"WithMachineDo(f=x->@info(\"machine: $x\"), stop_if_true=false, stop_message=nothing)\n\nAn iteration control, as in, WithMachineDo(x->put!(my_channel, x)). \n\nCall f(x), where x is the training machine in its current state. If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified. \n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#MLJIteration.Save","page":"Controlling Iterative Models","title":"MLJIteration.Save","text":"Save(filename=\"machine.jls\")\n\nAn iteration control, as in, Save(\"run3/machine.jls\"). \n\nSave the current state of the machine being iterated to disk, using the provided filename, decorated with a number, as in \"run3/machine42.jls\". The default behaviour uses the Serialization module but this can be changed by setting the method=save_fn(::String, ::Any) argument where save_fn is any serialization method. For more on what is meant by \"the machine being iterated\", see IteratedModel.\n\n\n\n\n\n","category":"type"},{"location":"controlling_iterative_models/#Control-wrappers","page":"Controlling Iterative Models","title":"Control wrappers","text":"","category":"section"},{"location":"controlling_iterative_models/","page":"Controlling Iterative Models","title":"Controlling Iterative Models","text":"IterationControl.skip\nIterationControl.louder\nIterationControl.with_state_do\nIterationControl.composite","category":"page"},{"location":"controlling_iterative_models/#IterationControl.skip","page":"Controlling Iterative Models","title":"IterationControl.skip","text":"IterationControl.skip(control, predicate=1)\n\nAn iteration control wrapper.\n\nIf predicate is an integer, k: Apply control on every k calls to apply the wrapped control, starting with the kth call.\n\nIf predicate is a function: Apply control as usual when predicate(n + 1) is true but otherwise skip. Here n is the number of control cycles applied so far.\n\n\n\n\n\n","category":"function"},{"location":"controlling_iterative_models/#IterationControl.louder","page":"Controlling Iterative Models","title":"IterationControl.louder","text":"IterationControl.louder(control, by=1)\n\nWrap control to make in more (or less) verbose. 
The same as control, but as if the global verbosity were increased by the value by.\n\n\n\n\n\n","category":"function"},{"location":"controlling_iterative_models/#IterationControl.with_state_do","page":"Controlling Iterative Models","title":"IterationControl.with_state_do","text":"IterationControl.with_state_do(control,\n f=x->@info \"$(typeof(control)) state: $x\")\n\nWrap control to give access to it's internal state. Acts exactly like control except that f is called on the internal state of control. If f is not specified, the control type and state are logged to Info at every update (useful for debugging new controls).\n\nWarning. The internal state of a control is not yet considered part of the public interface and could change between in any pre 1.0 release of IterationControl.jl.\n\n\n\n\n\n","category":"function"},{"location":"controlling_iterative_models/#IterationControl.composite","page":"Controlling Iterative Models","title":"IterationControl.composite","text":"composite(controls...)\n\nConstruct an iteration control that applies the specified controls in sequence.\n\n\n\n\n\n","category":"function"},{"location":"models/SODDetector_OutlierDetectionPython/#SODDetector_OutlierDetectionPython","page":"SODDetector","title":"SODDetector","text":"","category":"section"},{"location":"models/SODDetector_OutlierDetectionPython/","page":"SODDetector","title":"SODDetector","text":"SODDetector(n_neighbors = 5,\n ref_set = 10,\n alpha = 0.8)","category":"page"},{"location":"models/SODDetector_OutlierDetectionPython/","page":"SODDetector","title":"SODDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.sod","category":"page"},{"location":"models/RandomUndersampler_Imbalance/#RandomUndersampler_Imbalance","page":"RandomUndersampler","title":"RandomUndersampler","text":"","category":"section"},{"location":"models/RandomUndersampler_Imbalance/","page":"RandomUndersampler","title":"RandomUndersampler","text":"Initiate a random undersampling model with the given hyper-parameters.","category":"page"},{"location":"models/RandomUndersampler_Imbalance/","page":"RandomUndersampler","title":"RandomUndersampler","text":"RandomUndersampler","category":"page"},{"location":"models/RandomUndersampler_Imbalance/","page":"RandomUndersampler","title":"RandomUndersampler","text":"A model type for constructing a random undersampler, based on Imbalance.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/RandomUndersampler_Imbalance/","page":"RandomUndersampler","title":"RandomUndersampler","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/RandomUndersampler_Imbalance/","page":"RandomUndersampler","title":"RandomUndersampler","text":"RandomUndersampler = @load RandomUndersampler pkg=Imbalance","category":"page"},{"location":"models/RandomUndersampler_Imbalance/","page":"RandomUndersampler","title":"RandomUndersampler","text":"Do model = RandomUndersampler() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomUndersampler(ratios=...).","category":"page"},{"location":"models/RandomUndersampler_Imbalance/","page":"RandomUndersampler","title":"RandomUndersampler","text":"RandomUndersampler implements naive undersampling by randomly removing existing observations. 
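As a hedged sketch of the ratios hyper-parameter documented below (the value 1.2 and the seed are arbitrary):\n\nRandomUndersampler(ratios=1.2, rng=42) # undersample each class to (at most) 1.2 times the minority class size\nRandomUndersampler(ratios=Dict(0=>1.0, 1=>1.0, 2=>1.0), rng=42) # equalize classes 0, 1 and 2, as in the example further below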
","category":"page"},{"location":"models/RandomUndersampler_Imbalance/#Training-data","page":"RandomUndersampler","title":"Training data","text":"","category":"section"},{"location":"models/RandomUndersampler_Imbalance/","page":"RandomUndersampler","title":"RandomUndersampler","text":"In MLJ or MLJBase, wrap the model in a machine by mach = machine(model)","category":"page"},{"location":"models/RandomUndersampler_Imbalance/","page":"RandomUndersampler","title":"RandomUndersampler","text":"There is no need to provide any data here because the model is a static transformer.","category":"page"},{"location":"models/RandomUndersampler_Imbalance/","page":"RandomUndersampler","title":"RandomUndersampler","text":"Likewise, there is no need to fit!(mach). ","category":"page"},{"location":"models/RandomUndersampler_Imbalance/","page":"RandomUndersampler","title":"RandomUndersampler","text":"For default values of the hyper-parameters, model can be constructed by model = RandomUndersampler()","category":"page"},{"location":"models/RandomUndersampler_Imbalance/#Hyperparameters","page":"RandomUndersampler","title":"Hyperparameters","text":"","category":"section"},{"location":"models/RandomUndersampler_Imbalance/","page":"RandomUndersampler","title":"RandomUndersampler","text":"ratios=1.0: A parameter that controls the amount of undersampling to be done for each class\nCan be a float and in this case each class will be undersampled to the size of the minority class times the float. By default, all classes are undersampled to the size of the minority class\nCan be a dictionary mapping each class label to the float ratio for that class\nrng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister`.","category":"page"},{"location":"models/RandomUndersampler_Imbalance/#Transform-Inputs","page":"RandomUndersampler","title":"Transform Inputs","text":"","category":"section"},{"location":"models/RandomUndersampler_Imbalance/","page":"RandomUndersampler","title":"RandomUndersampler","text":"X: A matrix of real numbers or a table with element scitypes that subtype Union{Finite, Infinite}. 
Elements in nominal columns should subtype Finite (i.e., have scitype OrderedFactor or Multiclass) and elements in continuous columns should subtype Infinite (i.e., have scitype Count or Continuous).\ny: An abstract vector of labels (e.g., strings) that correspond to the observations in X","category":"page"},{"location":"models/RandomUndersampler_Imbalance/#Transform-Outputs","page":"RandomUndersampler","title":"Transform Outputs","text":"","category":"section"},{"location":"models/RandomUndersampler_Imbalance/","page":"RandomUndersampler","title":"RandomUndersampler","text":"X_under: A matrix or table that includes the data after undersampling depending on whether the input X is a matrix or table respectively\ny_under: An abstract vector of labels corresponding to X_under","category":"page"},{"location":"models/RandomUndersampler_Imbalance/#Operations","page":"RandomUndersampler","title":"Operations","text":"","category":"section"},{"location":"models/RandomUndersampler_Imbalance/","page":"RandomUndersampler","title":"RandomUndersampler","text":"transform(mach, X, y): resample the data X and y using RandomUndersampler, returning both the new and original observations","category":"page"},{"location":"models/RandomUndersampler_Imbalance/#Example","page":"RandomUndersampler","title":"Example","text":"","category":"section"},{"location":"models/RandomUndersampler_Imbalance/","page":"RandomUndersampler","title":"RandomUndersampler","text":"using MLJ\nimport Imbalance\n\n## set probability of each class\nclass_probs = [0.5, 0.2, 0.3] \nnum_rows, num_continuous_feats = 100, 5\n## generate a table and categorical vector accordingly\nX, y = Imbalance.generate_imbalanced_data(num_rows, num_continuous_feats; \n class_probs, rng=42) \n\njulia> Imbalance.checkbalance(y; ref=\"minority\")\n 1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%) \n 2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 33 (173.7%) \n 0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (252.6%) \n\n## load RandomUndersampler\nRandomUndersampler = @load RandomUndersampler pkg=Imbalance\n\n## wrap the model in a machine\nundersampler = RandomUndersampler(ratios=Dict(0=>1.0, 1=> 1.0, 2=>1.0), \n rng=42)\nmach = machine(undersampler)\n\n## provide the data to transform (there is nothing to fit)\nX_under, y_under = transform(mach, X, y)\n \njulia> Imbalance.checkbalance(y_under; ref=\"minority\")\n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%) \n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%) \n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%) ","category":"page"},{"location":"models/RandomForestRegressor_MLJScikitLearnInterface/#RandomForestRegressor_MLJScikitLearnInterface","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"","category":"section"},{"location":"models/RandomForestRegressor_MLJScikitLearnInterface/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"RandomForestRegressor","category":"page"},{"location":"models/RandomForestRegressor_MLJScikitLearnInterface/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"A model type for constructing a random forest regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/RandomForestRegressor_MLJScikitLearnInterface/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"From MLJ, the type can be imported 
using","category":"page"},{"location":"models/RandomForestRegressor_MLJScikitLearnInterface/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"RandomForestRegressor = @load RandomForestRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/RandomForestRegressor_MLJScikitLearnInterface/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"Do model = RandomForestRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomForestRegressor(n_estimators=...).","category":"page"},{"location":"models/RandomForestRegressor_MLJScikitLearnInterface/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"A random forest is a meta estimator that fits a number of classifying decision trees on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting. The sub-sample size is controlled with the max_samples parameter if bootstrap=True (default), otherwise the whole dataset is used to build each tree.","category":"page"},{"location":"models/FillImputer_MLJModels/#FillImputer_MLJModels","page":"FillImputer","title":"FillImputer","text":"","category":"section"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"FillImputer","category":"page"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"A model type for constructing a fill imputer, based on MLJModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"FillImputer = @load FillImputer pkg=MLJModels","category":"page"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"Do model = FillImputer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in FillImputer(features=...).","category":"page"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"Use this model to impute missing values in tabular data. A fixed \"filler\" value is learned from the training data, one for each column of the table.","category":"page"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"For imputing missing values in a vector, use UnivariateFillImputer instead.","category":"page"},{"location":"models/FillImputer_MLJModels/#Training-data","page":"FillImputer","title":"Training data","text":"","category":"section"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"mach = machine(model, X)","category":"page"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"where","category":"page"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"X: any table of input features (eg, a DataFrame) whose columns each have element scitypes Union{Missing, T}, where T is a subtype of Continuous, Multiclass, OrderedFactor or Count. 
Check scitypes with schema(X).","category":"page"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/FillImputer_MLJModels/#Hyper-parameters","page":"FillImputer","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"features: a vector of names of features (symbols) for which imputation is to be attempted; default is empty, which is interpreted as \"impute all\".\ncontinuous_fill: function or other callable to determine value to be imputed in the case of Continuous (abstract float) data; default is to apply median after skipping missing values\ncount_fill: function or other callable to determine value to be imputed in the case of Count (integer) data; default is to apply rounded median after skipping missing values\nfinite_fill: function or other callable to determine value to be imputed in the case of Multiclass or OrderedFactor data (categorical vectors); default is to apply mode after skipping missing values","category":"page"},{"location":"models/FillImputer_MLJModels/#Operations","page":"FillImputer","title":"Operations","text":"","category":"section"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"transform(mach, Xnew): return Xnew with missing values imputed with the fill values learned when fitting mach","category":"page"},{"location":"models/FillImputer_MLJModels/#Fitted-parameters","page":"FillImputer","title":"Fitted parameters","text":"","category":"section"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"features_seen_in_fit: the names of features (columns) encountered during training\nunivariate_transformer: the univariate model applied to determine the fillers (it's fields contain the functions defining the filler computations)\nfiller_given_feature: dictionary of filler values, keyed on feature (column) names","category":"page"},{"location":"models/FillImputer_MLJModels/#Examples","page":"FillImputer","title":"Examples","text":"","category":"section"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"using MLJ\nimputer = FillImputer()\n\nX = (a = [1.0, 2.0, missing, 3.0, missing],\n b = coerce([\"y\", \"n\", \"y\", missing, \"y\"], Multiclass),\n c = [1, 1, 2, missing, 3])\n\nschema(X)\njulia> schema(X)\n┌───────┬───────────────────────────────┐\n│ names │ scitypes │\n├───────┼───────────────────────────────┤\n│ a │ Union{Missing, Continuous} │\n│ b │ Union{Missing, Multiclass{2}} │\n│ c │ Union{Missing, Count} │\n└───────┴───────────────────────────────┘\n\nmach = machine(imputer, X)\nfit!(mach)\n\njulia> fitted_params(mach).filler_given_feature\n(filler = 2.0,)\n\njulia> fitted_params(mach).filler_given_feature\nDict{Symbol, Any} with 3 entries:\n :a => 2.0\n :b => \"y\"\n :c => 2\n\njulia> transform(mach, X)\n(a = [1.0, 2.0, 2.0, 3.0, 2.0],\n b = CategoricalValue{String, UInt32}[\"y\", \"n\", \"y\", \"y\", \"y\"],\n c = [1, 1, 2, 2, 3],)","category":"page"},{"location":"models/FillImputer_MLJModels/","page":"FillImputer","title":"FillImputer","text":"See also 
UnivariateFillImputer.","category":"page"},{"location":"composing_models/#Composing-Models","page":"Composing Models","title":"Composing Models","text":"","category":"section"},{"location":"composing_models/","page":"Composing Models","title":"Composing Models","text":"Three common ways of combining multiple models together have out-of-the-box implementations in MLJ:","category":"page"},{"location":"composing_models/","page":"Composing Models","title":"Composing Models","text":"Linear Pipelines (Pipeline) - for unbranching chains that take the output of one model (e.g., dimension reduction, such as PCA) and make it the input of the next model in the chain (e.g., a classification model, such as EvoTreeClassifier). To include transformations of the target variable in a supervised pipeline model, see Target Transformations.\nHomogeneous Ensembles (EnsembleModel) - for blending the predictions of multiple supervised models all of the same type, but which receive different views of the training data to reduce overall variance. The technique implemented here is known as observation bagging.\nModel Stacking (Stack) - for combining the predictions of a smaller number of models of possibly different types, with the help of an adjudicating model.","category":"page"},{"location":"composing_models/","page":"Composing Models","title":"Composing Models","text":"Additionally, more complicated model compositions are possible using:","category":"page"},{"location":"composing_models/","page":"Composing Models","title":"Composing Models","text":"Learning Networks - \"blueprints\" for combining models in flexible ways; these are simple transformations of your existing workflows which can be \"exported\" to define new, stand-alone model types.","category":"page"},{"location":"models/OPTICS_MLJScikitLearnInterface/#OPTICS_MLJScikitLearnInterface","page":"OPTICS","title":"OPTICS","text":"","category":"section"},{"location":"models/OPTICS_MLJScikitLearnInterface/","page":"OPTICS","title":"OPTICS","text":"OPTICS","category":"page"},{"location":"models/OPTICS_MLJScikitLearnInterface/","page":"OPTICS","title":"OPTICS","text":"A model type for constructing an OPTICS clustering model, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/OPTICS_MLJScikitLearnInterface/","page":"OPTICS","title":"OPTICS","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/OPTICS_MLJScikitLearnInterface/","page":"OPTICS","title":"OPTICS","text":"OPTICS = @load OPTICS pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/OPTICS_MLJScikitLearnInterface/","page":"OPTICS","title":"OPTICS","text":"Do model = OPTICS() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in OPTICS(min_samples=...).","category":"page"},{"location":"models/OPTICS_MLJScikitLearnInterface/","page":"OPTICS","title":"OPTICS","text":"OPTICS (Ordering Points To Identify the Clustering Structure), closely related to DBSCAN, finds core samples of high density and expands clusters from them. Unlike DBSCAN, it keeps the cluster hierarchy for a variable neighborhood radius. 
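The OPTICS docstring gives no usage example, so here is a hedged sketch (not from the docstring): it assumes MLJScikitLearnInterface and its scikit-learn dependency are installed, and fabricates data with make_blobs; which operations and report fields this clusterer actually exposes should be checked with doc("OPTICS", pkg="MLJScikitLearnInterface").

```julia
## Hedged usage sketch (not from the docstring); assumes MLJScikitLearnInterface
## and its scikit-learn dependency are installed.
using MLJ

OPTICS = @load OPTICS pkg=MLJScikitLearnInterface

X, _ = make_blobs(200, 2; centers=3, rng=123)  ## synthetic clustering data
model = OPTICS(min_samples=10)                 ## hyper-parameter named in the docstring
mach = machine(model, X) |> fit!               ## unsupervised: no target is supplied

fitted_params(mach)   ## generic MLJ accessor for the learned fit result
report(mach)          ## generic MLJ accessor for any reported quantities
```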
It is better suited to large datasets than the current sklearn implementation of DBSCAN.","category":"page"},{"location":"models/BalancedBaggingClassifier_MLJBalancing/#BalancedBaggingClassifier_MLJBalancing","page":"BalancedBaggingClassifier","title":"BalancedBaggingClassifier","text":"","category":"section"},{"location":"models/BalancedBaggingClassifier_MLJBalancing/","page":"BalancedBaggingClassifier","title":"BalancedBaggingClassifier","text":"BalancedBaggingClassifier","category":"page"},{"location":"models/BalancedBaggingClassifier_MLJBalancing/","page":"BalancedBaggingClassifier","title":"BalancedBaggingClassifier","text":"A model type for constructing a balanced bagging classifier, based on MLJBalancing.jl.","category":"page"},{"location":"models/BalancedBaggingClassifier_MLJBalancing/","page":"BalancedBaggingClassifier","title":"BalancedBaggingClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/BalancedBaggingClassifier_MLJBalancing/","page":"BalancedBaggingClassifier","title":"BalancedBaggingClassifier","text":"BalancedBaggingClassifier = @load BalancedBaggingClassifier pkg=MLJBalancing","category":"page"},{"location":"models/BalancedBaggingClassifier_MLJBalancing/","page":"BalancedBaggingClassifier","title":"BalancedBaggingClassifier","text":"Construct an instance with default hyper-parameters using the syntax bagging_model = BalancedBaggingClassifier(model=...)","category":"page"},{"location":"models/BalancedBaggingClassifier_MLJBalancing/","page":"BalancedBaggingClassifier","title":"BalancedBaggingClassifier","text":"Given a probabilistic classifier, BalancedBaggingClassifier performs bagging by undersampling only the majority data in each bag, so that each bag contains as many majority-class samples as there are minority-class samples. This approach, with an AdaBoost classifier whose output scores are averaged, was proposed in Xu-Ying Liu, Jianxin Wu, & Zhi-Hua Zhou (2009). Exploratory Undersampling for Class-Imbalance Learning. 
IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 39(2), 539–550.","category":"page"},{"location":"models/BalancedBaggingClassifier_MLJBalancing/#Training-data","page":"BalancedBaggingClassifier","title":"Training data","text":"","category":"section"},{"location":"models/BalancedBaggingClassifier_MLJBalancing/","page":"BalancedBaggingClassifier","title":"BalancedBaggingClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/BalancedBaggingClassifier_MLJBalancing/","page":"BalancedBaggingClassifier","title":"BalancedBaggingClassifier","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/BalancedBaggingClassifier_MLJBalancing/","page":"BalancedBaggingClassifier","title":"BalancedBaggingClassifier","text":"where","category":"page"},{"location":"models/BalancedBaggingClassifier_MLJBalancing/","page":"BalancedBaggingClassifier","title":"BalancedBaggingClassifier","text":"X: input features of a form supported by the model being wrapped (typically a table, e.g., a DataFrame; at a minimum, tables with Continuous columns are supported)\ny: the binary target, which can be any AbstractVector where length(unique(y)) == 2","category":"page"},{"location":"models/BalancedBaggingClassifier_MLJBalancing/","page":"BalancedBaggingClassifier","title":"BalancedBaggingClassifier","text":"Train the machine with fit!(mach, rows=...).","category":"page"},{"location":"models/BalancedBaggingClassifier_MLJBalancing/#Hyperparameters","page":"BalancedBaggingClassifier","title":"Hyperparameters","text":"","category":"section"},{"location":"models/BalancedBaggingClassifier_MLJBalancing/","page":"BalancedBaggingClassifier","title":"BalancedBaggingClassifier","text":"model::Probabilistic: The classifier used to train on each bag.\nT::Integer=0: The number of bags to be used in the ensemble. If not given, it is set to the ratio between the frequencies of the majority and minority classes. It can later be found in report(mach).\nrng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if Julia VERSION>=1.7. Otherwise, uses MersenneTwister.","category":"page"},{"location":"models/BalancedBaggingClassifier_MLJBalancing/#Operations","page":"BalancedBaggingClassifier","title":"Operations","text":"","category":"section"},{"location":"models/BalancedBaggingClassifier_MLJBalancing/","page":"BalancedBaggingClassifier","title":"BalancedBaggingClassifier","text":"predict(mach, Xnew): return predictions of the target given
Predictions are probabilistic, but uncalibrated.","category":"page"},{"location":"models/BalancedBaggingClassifier_MLJBalancing/","page":"BalancedBaggingClassifier","title":"BalancedBaggingClassifier","text":"predict_mode(mach, Xnew): return the mode of each prediction above","category":"page"},{"location":"models/BalancedBaggingClassifier_MLJBalancing/#Example","page":"BalancedBaggingClassifier","title":"Example","text":"","category":"section"},{"location":"models/BalancedBaggingClassifier_MLJBalancing/","page":"BalancedBaggingClassifier","title":"BalancedBaggingClassifier","text":"using MLJ\nusing Imbalance\n\n## Load base classifier and BalancedBaggingClassifier\nBalancedBaggingClassifier = @load BalancedBaggingClassifier pkg=MLJBalancing\nLogisticClassifier = @load LogisticClassifier pkg=MLJLinearModels verbosity=0\n\n## Construct the base classifier and use it to construct a BalancedBaggingClassifier\nlogistic_model = LogisticClassifier()\nmodel = BalancedBaggingClassifier(model=logistic_model, T=5)\n\n## Load the data and train the BalancedBaggingClassifier\nX, y = Imbalance.generate_imbalanced_data(100, 5; num_vals_per_category = [3, 2],\n class_probs = [0.9, 0.1],\n type = \"ColTable\",\n rng=42)\njulia> Imbalance.checkbalance(y)\n1: ▇▇▇▇▇▇▇▇▇▇ 16 (19.0%)\n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 84 (100.0%)\n\nmach = machine(model, X, y) |> fit!\n\n## Predict using the trained model\n\nyhat = predict(mach, X) ## probabilistic predictions\npredict_mode(mach, X) ## point predictions","category":"page"},{"location":"models/OneHotEncoder_MLJModels/#OneHotEncoder_MLJModels","page":"OneHotEncoder","title":"OneHotEncoder","text":"","category":"section"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"OneHotEncoder","category":"page"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"A model type for constructing a one-hot encoder, based on MLJModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"OneHotEncoder = @load OneHotEncoder pkg=MLJModels","category":"page"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"Do model = OneHotEncoder() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in OneHotEncoder(features=...).","category":"page"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"Use this model to one-hot encode the Multiclass and OrderedFactor features (columns) of some table, leaving other columns unchanged.","category":"page"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"New data to be transformed may lack features present in the fit data, but no new features can be present.","category":"page"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"Warning: This transformer assumes that levels(col) for any Multiclass or OrderedFactor column, col, is the same for training data and new data to be transformed.","category":"page"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"To ensure all features are transformed into Continuous features, or dropped, use ContinuousEncoder instead.","category":"page"},{"location":"models/OneHotEncoder_MLJModels/#Training-data","page":"OneHotEncoder","title":"Training data","text":"","category":"section"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"mach = machine(model, X)","category":"page"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"where","category":"page"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"X: any Tables.jl compatible table. Columns can be of mixed type but only those with element scitype Multiclass or OrderedFactor can be encoded. Check column scitypes with schema(X).","category":"page"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/OneHotEncoder_MLJModels/#Hyper-parameters","page":"OneHotEncoder","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"features: a vector of symbols (column names). If empty (default) then all Multiclass and OrderedFactor features are encoded. Otherwise, encoding is further restricted to the specified features (ignore=false) or the unspecified features (ignore=true). This default behavior can be modified by the ordered_factor flag.\nordered_factor=false: when true, OrderedFactor features are universally excluded\ndrop_last=true: whether to drop the column corresponding to the final class of encoded features. 
For example, a three-class feature is spawned into three new features if drop_last=false, but just two features otherwise.","category":"page"},{"location":"models/OneHotEncoder_MLJModels/#Fitted-parameters","page":"OneHotEncoder","title":"Fitted parameters","text":"","category":"section"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"all_features: names of all features encountered in training\nfitted_levels_given_feature: dictionary of the levels associated with each feature encoded, keyed on the feature name\nref_name_pairs_given_feature: dictionary of pairs r => ftr (such as 0x00000001 => :grad__A) where r is a CategoricalArrays.jl reference integer representing a level, and ftr the corresponding new feature name; the dictionary is keyed on the names of features that are encoded","category":"page"},{"location":"models/OneHotEncoder_MLJModels/#Report","page":"OneHotEncoder","title":"Report","text":"","category":"section"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"The fields of report(mach) are:","category":"page"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"features_to_be_encoded: names of input features to be encoded\nnew_features: names of all output features","category":"page"},{"location":"models/OneHotEncoder_MLJModels/#Example","page":"OneHotEncoder","title":"Example","text":"","category":"section"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"using MLJ\n\nX = (name=categorical([\"Danesh\", \"Lee\", \"Mary\", \"John\"]),\n grade=categorical([\"A\", \"B\", \"A\", \"C\"], ordered=true),\n height=[1.85, 1.67, 1.5, 1.67],\n n_devices=[3, 2, 4, 3])\n\njulia> schema(X)\n┌───────────┬──────────────────┐\n│ names │ scitypes │\n├───────────┼──────────────────┤\n│ name │ Multiclass{4} │\n│ grade │ OrderedFactor{3} │\n│ height │ Continuous │\n│ n_devices │ Count │\n└───────────┴──────────────────┘\n\nhot = OneHotEncoder(drop_last=true)\nmach = fit!(machine(hot, X))\nW = transform(mach, X)\n\njulia> schema(W)\n┌──────────────┬────────────┐\n│ names │ scitypes │\n├──────────────┼────────────┤\n│ name__Danesh │ Continuous │\n│ name__John │ Continuous │\n│ name__Lee │ Continuous │\n│ grade__A │ Continuous │\n│ grade__B │ Continuous │\n│ height │ Continuous │\n│ n_devices │ Count │\n└──────────────┴────────────┘","category":"page"},{"location":"models/OneHotEncoder_MLJModels/","page":"OneHotEncoder","title":"OneHotEncoder","text":"See also ContinuousEncoder.","category":"page"},{"location":"internals/#internals_section","page":"Internals","title":"Internals","text":"","category":"section"},{"location":"internals/#The-machine-interface,-simplified","page":"Internals","title":"The machine interface, simplified","text":"","category":"section"},{"location":"internals/","page":"Internals","title":"Internals","text":"The following is a simplified description of the Machine interface. It predates the introduction of an optional data front-end for models (see Implementing a data front-end). 
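Before the internal definitions, a hedged orientation sketch (not part of the original internals text) of the public machine workflow that this simplified interface implements; the particular model and dataset are illustrative only:

```julia
## Orientation sketch only: the public workflow the Machine type supports.
## Assumes MLJ and DecisionTree.jl are installed; any supervised model would do.
using MLJ

Tree = @load DecisionTreeClassifier pkg=DecisionTree verbosity=0
X, y = @load_iris                 ## a table and a categorical vector

mach = machine(Tree(), X, y)      ## bind model and data in a machine
fit!(mach, rows=1:120)            ## train, optionally restricting rows
yhat = predict(mach, X)           ## operate on the trained machine
```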
See also the Glossary","category":"page"},{"location":"internals/#The-Machine-type","page":"Internals","title":"The Machine type","text":"","category":"section"},{"location":"internals/","page":"Internals","title":"Internals","text":"mutable struct Machine{M fit!\n\nXnew, _ = make_regression(3, 9)\nyhat = predict(mach, Xnew) ## new predictions","category":"page"},{"location":"models/MultitargetLinearRegressor_MultivariateStats/","page":"MultitargetLinearRegressor","title":"MultitargetLinearRegressor","text":"See also LinearRegressor, RidgeRegressor, MultitargetRidgeRegressor","category":"page"},{"location":"models/CDDetector_OutlierDetectionPython/#CDDetector_OutlierDetectionPython","page":"CDDetector","title":"CDDetector","text":"","category":"section"},{"location":"models/CDDetector_OutlierDetectionPython/","page":"CDDetector","title":"CDDetector","text":"CDDetector(whitening = true,\n rule_of_thumb = false)","category":"page"},{"location":"models/CDDetector_OutlierDetectionPython/","page":"CDDetector","title":"CDDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.cd","category":"page"},{"location":"models/ConstantRegressor_MLJModels/#ConstantRegressor_MLJModels","page":"ConstantRegressor","title":"ConstantRegressor","text":"","category":"section"},{"location":"models/ConstantRegressor_MLJModels/","page":"ConstantRegressor","title":"ConstantRegressor","text":"ConstantRegressor","category":"page"},{"location":"models/ConstantRegressor_MLJModels/","page":"ConstantRegressor","title":"ConstantRegressor","text":"This \"dummy\" probabilistic predictor always returns the same distribution, irrespective of the provided input pattern. The distribution returned is the one of the type specified that best fits the training target data. Use predict_mean or predict_median to predict the mean or median values instead. If not specified, a normal distribution is fit.","category":"page"},{"location":"models/ConstantRegressor_MLJModels/","page":"ConstantRegressor","title":"ConstantRegressor","text":"Almost any reasonable model is expected to outperform ConstantRegressor which is used almost exclusively for testing and establishing performance baselines.","category":"page"},{"location":"models/ConstantRegressor_MLJModels/","page":"ConstantRegressor","title":"ConstantRegressor","text":"In MLJ (or MLJModels) do model = ConstantRegressor() or model = ConstantRegressor(distribution=...) 
to construct a model instance.","category":"page"},{"location":"models/ConstantRegressor_MLJModels/#Training-data","page":"ConstantRegressor","title":"Training data","text":"","category":"section"},{"location":"models/ConstantRegressor_MLJModels/","page":"ConstantRegressor","title":"ConstantRegressor","text":"In MLJ (or MLJBase) bind an instance model to data with","category":"page"},{"location":"models/ConstantRegressor_MLJModels/","page":"ConstantRegressor","title":"ConstantRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/ConstantRegressor_MLJModels/","page":"ConstantRegressor","title":"ConstantRegressor","text":"Here:","category":"page"},{"location":"models/ConstantRegressor_MLJModels/","page":"ConstantRegressor","title":"ConstantRegressor","text":"X is any table of input features (eg, a DataFrame)\ny is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with schema(y)","category":"page"},{"location":"models/ConstantRegressor_MLJModels/","page":"ConstantRegressor","title":"ConstantRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/ConstantRegressor_MLJModels/#Hyper-parameters","page":"ConstantRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/ConstantRegressor_MLJModels/","page":"ConstantRegressor","title":"ConstantRegressor","text":"distribution_type=Distributions.Normal: The distribution to be fit to the target data. Must be a subtype of Distributions.ContinuousUnivariateDistribution.","category":"page"},{"location":"models/ConstantRegressor_MLJModels/#Operations","page":"ConstantRegressor","title":"Operations","text":"","category":"section"},{"location":"models/ConstantRegressor_MLJModels/","page":"ConstantRegressor","title":"ConstantRegressor","text":"predict(mach, Xnew): Return predictions of the target given features Xnew (which for this model are ignored). 
Predictions are probabilistic.\npredict_mean(mach, Xnew): Return instead the means of the probabilistic predictions returned above.\npredict_median(mach, Xnew): Return instead the medians of the probabilistic predictions returned above.","category":"page"},{"location":"models/ConstantRegressor_MLJModels/#Fitted-parameters","page":"ConstantRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/ConstantRegressor_MLJModels/","page":"ConstantRegressor","title":"ConstantRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/ConstantRegressor_MLJModels/","page":"ConstantRegressor","title":"ConstantRegressor","text":"target_distribution: The distribution fit to the supplied target data.","category":"page"},{"location":"models/ConstantRegressor_MLJModels/#Examples","page":"ConstantRegressor","title":"Examples","text":"","category":"section"},{"location":"models/ConstantRegressor_MLJModels/","page":"ConstantRegressor","title":"ConstantRegressor","text":"using MLJ\n\nX, y = make_regression(10, 2) ## synthetic data: a table and vector\nregressor = ConstantRegressor()\nmach = machine(regressor, X, y) |> fit!\n\nfitted_params(mach)\n\nXnew, _ = make_regression(3, 2)\npredict(mach, Xnew)\npredict_mean(mach, Xnew)\n","category":"page"},{"location":"models/ConstantRegressor_MLJModels/","page":"ConstantRegressor","title":"ConstantRegressor","text":"See also ConstantClassifier","category":"page"},{"location":"models/ElasticNetRegressor_MLJScikitLearnInterface/#ElasticNetRegressor_MLJScikitLearnInterface","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"","category":"section"},{"location":"models/ElasticNetRegressor_MLJScikitLearnInterface/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"ElasticNetRegressor","category":"page"},{"location":"models/ElasticNetRegressor_MLJScikitLearnInterface/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"A model type for constructing a elastic net regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/ElasticNetRegressor_MLJScikitLearnInterface/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/ElasticNetRegressor_MLJScikitLearnInterface/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"ElasticNetRegressor = @load ElasticNetRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/ElasticNetRegressor_MLJScikitLearnInterface/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"Do model = ElasticNetRegressor() to construct an instance with default hyper-parameters. 
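No example accompanies this docstring, so here is a hedged usage sketch (not from the docstring); it assumes MLJScikitLearnInterface and its scikit-learn dependency are installed, fabricates data with make_regression, and uses illustrative hyper-parameter values only:

```julia
## Hedged usage sketch for the elastic net wrapper described above.
using MLJ

ElasticNetRegressor = @load ElasticNetRegressor pkg=MLJScikitLearnInterface

X, y = make_regression(100, 3; noise=0.2, rng=123)   ## synthetic table and target
model = ElasticNetRegressor(alpha=0.5, l1_ratio=0.7)
mach = machine(model, X, y) |> fit!

Xnew, _ = make_regression(5, 3; rng=456)
yhat = predict(mach, Xnew)                           ## deterministic predictions
```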
Provide keyword arguments to override hyper-parameter defaults, as in ElasticNetRegressor(alpha=...).","category":"page"},{"location":"models/ElasticNetRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"ElasticNetRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/ElasticNetRegressor_MLJScikitLearnInterface/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"alpha = 1.0\nl1_ratio = 0.5\nfit_intercept = true\nprecompute = false\nmax_iter = 1000\ncopy_X = true\ntol = 0.0001\nwarm_start = false\npositive = false\nrandom_state = nothing\nselection = cyclic","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/#SubspaceLDA_MultivariateStats","page":"SubspaceLDA","title":"SubspaceLDA","text":"","category":"section"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"SubspaceLDA","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"A model type for constructing a subspace LDA model, based on MultivariateStats.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"SubspaceLDA = @load SubspaceLDA pkg=MultivariateStats","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"Do model = SubspaceLDA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SubspaceLDA(normalize=...).","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"Multiclass subspace linear discriminant analysis (LDA) is a variation on ordinary LDA suitable for high dimensional data, as it avoids storing scatter matrices. For details, refer to the MultivariateStats.jl documentation.","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"In addition to dimension reduction (using transform), probabilistic classification is provided (using predict). In the case of classification, the class probability for a new observation reflects the proximity of that observation to training observations associated with that class, and how far away the observation is from observations associated with other classes. Specifically, the distances, in the transformed (projected) space, of a new observation from the centroid of each target class are computed; the resulting vector of distances, multiplied by minus one, is passed to a softmax function to obtain a class probability prediction. 
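To make the rule just described concrete, a tiny worked illustration (the distances are hypothetical, not produced by the package):

```julia
## Illustration of "softmax of minus the distances" with made-up numbers.
distances = [0.5, 2.0, 3.5]        ## distances to the three class centroids
scores = exp.(-distances)          ## multiply by minus one, exponentiate
probs = scores ./ sum(scores)      ## ≈ [0.786, 0.175, 0.039]
```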
Here \"distance\" is computed using a user-specified distance function.","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/#Training-data","page":"SubspaceLDA","title":"Training data","text":"","category":"section"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"Here:","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).\ny is the target, which can be any AbstractVector whose element scitype is OrderedFactor or Multiclass; check the scitype with scitype(y).","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/#Hyper-parameters","page":"SubspaceLDA","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"normalize=true: Option to normalize the between class variance for the number of observations in each class, one of true or false.\noutdim: the ouput dimension, automatically set to min(indim, nclasses-1) if equal to 0. If a non-zero outdim is passed, then the actual output dimension used is min(rank, outdim) where rank is the rank of the within-class covariance matrix.\ndist=Distances.SqEuclidean(): The distance metric to use when performing classification (to compare the distance between a new point and centroids in the transformed space); must be a subtype of Distances.SemiMetric from Distances.jl, e.g., Distances.CosineDist.","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/#Operations","page":"SubspaceLDA","title":"Operations","text":"","category":"section"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.\npredict(mach, Xnew): Return predictions of the target given features Xnew, which should have same scitype as X above. 
Predictions are probabilistic but uncalibrated.\npredict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/#Fitted-parameters","page":"SubspaceLDA","title":"Fitted parameters","text":"","category":"section"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"classes: The classes seen during model fitting.\nprojection_matrix: The learned projection matrix, of size (indim, outdim), where indim and outdim are the input and output dimensions respectively (See Report section below).","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/#Report","page":"SubspaceLDA","title":"Report","text":"","category":"section"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"The fields of report(mach) are:","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"indim: The dimension of the input space i.e the number of training features.\noutdim: The dimension of the transformed space the model is projected to.\nmean: The mean of the untransformed training data. A vector of length indim.\nnclasses: The number of classes directly observed in the training data (which can be less than the total number of classes in the class pool)","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"class_means: The class-specific means of the training data. A matrix of size (indim, nclasses) with the ith column being the class-mean of the ith class in classes (See fitted params section above).","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"class_weights: The weights (class counts) of each class. A vector of length nclasses with the ith element being the class weight of the ith class in classes. (See fitted params section above.)\nexplained_variance_ratio: The ratio of explained variance to total variance. Each dimension corresponds to an eigenvalue.","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/#Examples","page":"SubspaceLDA","title":"Examples","text":"","category":"section"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"using MLJ\n\nSubspaceLDA = @load SubspaceLDA pkg=MultivariateStats\n\nX, y = @load_iris ## a table and a vector\n\nmodel = SubspaceLDA()\nmach = machine(model, X, y) |> fit!\n\nXproj = transform(mach, X)\ny_hat = predict(mach, X)\nlabels = predict_mode(mach, X)","category":"page"},{"location":"models/SubspaceLDA_MultivariateStats/","page":"SubspaceLDA","title":"SubspaceLDA","text":"See also LDA, BayesianLDA, BayesianSubspaceLDA","category":"page"},{"location":"generating_synthetic_data/#Generating-Synthetic-Data","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"","category":"section"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"Here synthetic data means artificially generated data, with no reference to a \"real world\" data set. 
Not to be confused \"fake data\" obtained by resampling from a distribution fit to some actual real data.","category":"page"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"MLJ has a set of functions - make_blobs, make_circles, make_moons and make_regression (closely resembling functions in scikit-learn of the same name) - for generating synthetic data sets. These are useful for testing machine learning models (e.g., testing user-defined composite models; see Composing Models)","category":"page"},{"location":"generating_synthetic_data/#Generating-Gaussian-blobs","page":"Generating Synthetic Data","title":"Generating Gaussian blobs","text":"","category":"section"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"make_blobs","category":"page"},{"location":"generating_synthetic_data/#MLJBase.make_blobs","page":"Generating Synthetic Data","title":"MLJBase.make_blobs","text":"X, y = make_blobs(n=100, p=2; kwargs...)\n\nGenerate Gaussian blobs for clustering and classification problems.\n\nReturn value\n\nBy default, a table X with p columns (features) and n rows (observations), together with a corresponding vector of n Multiclass target observations y, indicating blob membership.\n\nKeyword arguments\n\nshuffle=true: whether to shuffle the resulting points,\ncenters=3: either a number of centers or a c x p matrix with c pre-determined centers,\ncluster_std=1.0: the standard deviation(s) of each blob,\ncenter_box=(-10. => 10.): the limits of the p-dimensional cube within which the cluster centers are drawn if they are not provided,\neltype=Float64: machine type of points (any subtype of AbstractFloat).\nrng=Random.GLOBAL_RNG: any AbstractRNG object, or integer to seed a MersenneTwister (for reproducibility).\nas_table=true: whether to return the points as a table (true) or a matrix (false). If false the target y has integer element type. 
\n\nExample\n\nX, y = make_blobs(100, 3; centers=2, cluster_std=[1.0, 3.0])\n\n\n\n\n\n","category":"function"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"using MLJ, DataFrames\nX, y = make_blobs(100, 3; centers=2, cluster_std=[1.0, 3.0])\ndfBlobs = DataFrame(X)\ndfBlobs.y = y\nfirst(dfBlobs, 3)","category":"page"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"using VegaLite\ndfBlobs |> @vlplot(:point, x=:x1, y=:x2, color = :\"y:n\") ","category":"page"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"(Image: svg)","category":"page"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"dfBlobs |> @vlplot(:point, x=:x1, y=:x3, color = :\"y:n\") ","category":"page"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"(Image: svg)","category":"page"},{"location":"generating_synthetic_data/#Generating-concentric-circles","page":"Generating Synthetic Data","title":"Generating concentric circles","text":"","category":"section"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"make_circles","category":"page"},{"location":"generating_synthetic_data/#MLJBase.make_circles","page":"Generating Synthetic Data","title":"MLJBase.make_circles","text":"X, y = make_circles(n=100; kwargs...)\n\nGenerate n labeled points close to two concentric circles for classification and clustering models.\n\nReturn value\n\nBy default, a table X with 2 columns and n rows (observations), together with a corresponding vector of n Multiclass target observations y. The target is either 0 or 1, corresponding to membership to the smaller or larger circle, respectively.\n\nKeyword arguments\n\nshuffle=true: whether to shuffle the resulting points,\nnoise=0: standard deviation of the Gaussian noise added to the data,\nfactor=0.8: ratio of the smaller radius over the larger one,\n\neltype=Float64: machine type of points (any subtype of AbstractFloat).\nrng=Random.GLOBAL_RNG: any AbstractRNG object, or integer to seed a MersenneTwister (for reproducibility).\nas_table=true: whether to return the points as a table (true) or a matrix (false). If false the target y has integer element type. 
\n\nExample\n\nX, y = make_circles(100; noise=0.5, factor=0.3)\n\n\n\n\n\n","category":"function"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"using MLJ, DataFrames\nX, y = make_circles(100; noise=0.05, factor=0.3)\ndfCircles = DataFrame(X)\ndfCircles.y = y\nfirst(dfCircles, 3)","category":"page"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"using VegaLite\ndfCircles |> @vlplot(:circle, x=:x1, y=:x2, color = :\"y:n\") ","category":"page"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"(Image: svg)","category":"page"},{"location":"generating_synthetic_data/#Sampling-from-two-interleaved-half-circles","page":"Generating Synthetic Data","title":"Sampling from two interleaved half-circles","text":"","category":"section"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"make_moons","category":"page"},{"location":"generating_synthetic_data/#MLJBase.make_moons","page":"Generating Synthetic Data","title":"MLJBase.make_moons","text":"make_moons(n::Int=100; kwargs...)\n\nGenerates labeled two-dimensional points lying close to two interleaved semi-circles, for use with classification and clustering models.\n\nReturn value\n\nBy default, a table X with 2 columns and n rows (observations), together with a corresponding vector of n Multiclass target observations y. The target is either 0 or 1, corresponding to membership to the left or right semi-circle.\n\nKeyword arguments\n\nshuffle=true: whether to shuffle the resulting points,\nnoise=0.1: standard deviation of the Gaussian noise added to the data,\nxshift=1.0: horizontal translation of the second center with respect to the first one.\nyshift=0.3: vertical translation of the second center with respect to the first one. \neltype=Float64: machine type of points (any subtype of AbstractFloat).\nrng=Random.GLOBAL_RNG: any AbstractRNG object, or integer to seed a MersenneTwister (for reproducibility).\nas_table=true: whether to return the points as a table (true) or a matrix (false). If false the target y has integer element type. 
\n\nExample\n\nX, y = make_moons(100; noise=0.5)\n\n\n\n\n\n","category":"function"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"using MLJ, DataFrames\nX, y = make_moons(100; noise=0.05)\ndfHalfCircles = DataFrame(X)\ndfHalfCircles.y = y\nfirst(dfHalfCircles, 3)","category":"page"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"using VegaLite\ndfHalfCircles |> @vlplot(:circle, x=:x1, y=:x2, color = :\"y:n\") ","category":"page"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"(Image: svg)","category":"page"},{"location":"generating_synthetic_data/#Regression-data-generated-from-noisy-linear-models","page":"Generating Synthetic Data","title":"Regression data generated from noisy linear models","text":"","category":"section"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"make_regression","category":"page"},{"location":"generating_synthetic_data/#MLJBase.make_regression","page":"Generating Synthetic Data","title":"MLJBase.make_regression","text":"make_regression(n, p; kwargs...)\n\nGenerate Gaussian input features and a linear response with Gaussian noise, for use with regression models.\n\nReturn value\n\nBy default, a tuple (X, y) where table X has p columns and n rows (observations), together with a corresponding vector of n Continuous target observations y.\n\nKeywords\n\nintercept=true: Whether to generate data from a model with intercept.\nn_targets=1: Number of columns in the target.\nsparse=0: Proportion of the generating weight vector that is sparse.\nnoise=0.1: Standard deviation of the Gaussian noise added to the response (target).\noutliers=0: Proportion of the response vector to make as outliers by adding a random quantity with high variance. (Only applied if binary is false.)\nas_table=true: Whether X (and y, if n_targets > 1) should be a table or a matrix.\neltype=Float64: Element type for X and y. Must subtype AbstractFloat.\nbinary=false: Whether the target should be binarized (via a sigmoid).\neltype=Float64: machine type of points (any subtype of AbstractFloat).\nrng=Random.GLOBAL_RNG: any AbstractRNG object, or integer to seed a MersenneTwister (for reproducibility).\nas_table=true: whether to return the points as a table (true) or a matrix (false). 
\n\nExample\n\nX, y = make_regression(100, 5; noise=0.5, sparse=0.2, outliers=0.1)\n\n\n\n\n\n","category":"function"},{"location":"generating_synthetic_data/","page":"Generating Synthetic Data","title":"Generating Synthetic Data","text":"using MLJ, DataFrames\nX, y = make_regression(100, 5; noise=0.5, sparse=0.2, outliers=0.1)\ndfRegression = DataFrame(X)\ndfRegression.y = y\nfirst(dfRegression, 3)","category":"page"},{"location":"models/FeatureAgglomeration_MLJScikitLearnInterface/#FeatureAgglomeration_MLJScikitLearnInterface","page":"FeatureAgglomeration","title":"FeatureAgglomeration","text":"","category":"section"},{"location":"models/FeatureAgglomeration_MLJScikitLearnInterface/","page":"FeatureAgglomeration","title":"FeatureAgglomeration","text":"FeatureAgglomeration","category":"page"},{"location":"models/FeatureAgglomeration_MLJScikitLearnInterface/","page":"FeatureAgglomeration","title":"FeatureAgglomeration","text":"A model type for constructing a feature agglomeration, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/FeatureAgglomeration_MLJScikitLearnInterface/","page":"FeatureAgglomeration","title":"FeatureAgglomeration","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/FeatureAgglomeration_MLJScikitLearnInterface/","page":"FeatureAgglomeration","title":"FeatureAgglomeration","text":"FeatureAgglomeration = @load FeatureAgglomeration pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/FeatureAgglomeration_MLJScikitLearnInterface/","page":"FeatureAgglomeration","title":"FeatureAgglomeration","text":"Do model = FeatureAgglomeration() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in FeatureAgglomeration(n_clusters=...).","category":"page"},{"location":"models/FeatureAgglomeration_MLJScikitLearnInterface/","page":"FeatureAgglomeration","title":"FeatureAgglomeration","text":"Similar to AgglomerativeClustering, but recursively merges features instead of samples.","category":"page"},{"location":"models/SVMRegressor_MLJScikitLearnInterface/#SVMRegressor_MLJScikitLearnInterface","page":"SVMRegressor","title":"SVMRegressor","text":"","category":"section"},{"location":"models/SVMRegressor_MLJScikitLearnInterface/","page":"SVMRegressor","title":"SVMRegressor","text":"SVMRegressor","category":"page"},{"location":"models/SVMRegressor_MLJScikitLearnInterface/","page":"SVMRegressor","title":"SVMRegressor","text":"A model type for constructing an epsilon-support vector regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/SVMRegressor_MLJScikitLearnInterface/","page":"SVMRegressor","title":"SVMRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/SVMRegressor_MLJScikitLearnInterface/","page":"SVMRegressor","title":"SVMRegressor","text":"SVMRegressor = @load SVMRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/SVMRegressor_MLJScikitLearnInterface/","page":"SVMRegressor","title":"SVMRegressor","text":"Do model = SVMRegressor() to construct an instance with default hyper-parameters. 
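No example accompanies this docstring, so here is a hedged usage sketch (not from the docstring); it assumes MLJScikitLearnInterface and its scikit-learn dependency are installed, fabricates data with make_regression, and uses illustrative hyper-parameter values only:

```julia
## Hedged usage sketch for the epsilon-SVR wrapper described above.
using MLJ

SVMRegressor = @load SVMRegressor pkg=MLJScikitLearnInterface

X, y = make_regression(100, 4; noise=0.2, rng=123)   ## synthetic data
model = SVMRegressor(C=10.0, epsilon=0.05)           ## see hyper-parameter list below
mach = machine(model, X, y) |> fit!

yhat = predict(mach, X)                              ## deterministic predictions
```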
Provide keyword arguments to override hyper-parameter defaults, as in SVMRegressor(kernel=...).","category":"page"},{"location":"models/SVMRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"SVMRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/SVMRegressor_MLJScikitLearnInterface/","page":"SVMRegressor","title":"SVMRegressor","text":"kernel = rbf\ndegree = 3\ngamma = scale\ncoef0 = 0.0\ntol = 0.001\nC = 1.0\nepsilon = 0.1\nshrinking = true\ncache_size = 200\nmax_iter = -1","category":"page"},{"location":"models/SimpleImputer_BetaML/#SimpleImputer_BetaML","page":"SimpleImputer","title":"SimpleImputer","text":"","category":"section"},{"location":"models/SimpleImputer_BetaML/","page":"SimpleImputer","title":"SimpleImputer","text":"mutable struct SimpleImputer <: MLJModelInterface.Unsupervised","category":"page"},{"location":"models/SimpleImputer_BetaML/","page":"SimpleImputer","title":"SimpleImputer","text":"Impute missing values using feature (column) mean, with optional record normalisation (using l-norm norms), from the Beta Machine Learning Toolkit (BetaML).","category":"page"},{"location":"models/SimpleImputer_BetaML/#Hyperparameters:","page":"SimpleImputer","title":"Hyperparameters:","text":"","category":"section"},{"location":"models/SimpleImputer_BetaML/","page":"SimpleImputer","title":"SimpleImputer","text":"statistic::Function: The descriptive statistic of the column (feature) to use as imputed value [def: mean]\nnorm::Union{Nothing, Int64}: Normalise the feature mean by l-norm norm of the records [default: nothing]. Use it (e.g. norm=1 to use the l-1 norm) if the records are highly heterogeneus (e.g. quantity exports of different countries).","category":"page"},{"location":"models/SimpleImputer_BetaML/#Example:","page":"SimpleImputer","title":"Example:","text":"","category":"section"},{"location":"models/SimpleImputer_BetaML/","page":"SimpleImputer","title":"SimpleImputer","text":"julia> using MLJ\n\njulia> X = [1 10.5;1.5 missing; 1.8 8; 1.7 15; 3.2 40; missing missing; 3.3 38; missing -2.3; 5.2 -2.4] |> table ;\n\njulia> modelType = @load SimpleImputer pkg = \"BetaML\" verbosity=0\nBetaML.Imputation.SimpleImputer\n\njulia> model = modelType(norm=1)\nSimpleImputer(\n statistic = Statistics.mean, \n norm = 1)\n\njulia> mach = machine(model, X);\n\njulia> fit!(mach);\n[ Info: Training machine(SimpleImputer(statistic = mean, …), …).\n\njulia> X_full = transform(mach) |> MLJ.matrix\n9×2 Matrix{Float64}:\n 1.0 10.5\n 1.5 0.295466\n 1.8 8.0\n 1.7 15.0\n 3.2 40.0\n 0.280952 1.69524\n 3.3 38.0\n 0.0750839 -2.3\n 5.2 -2.4","category":"page"},{"location":"models/UnivariateDiscretizer_MLJModels/#UnivariateDiscretizer_MLJModels","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"","category":"section"},{"location":"models/UnivariateDiscretizer_MLJModels/","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"UnivariateDiscretizer","category":"page"},{"location":"models/UnivariateDiscretizer_MLJModels/","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"A model type for constructing a single variable discretizer, based on MLJModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/UnivariateDiscretizer_MLJModels/","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"From MLJ, the type can be imported 
using","category":"page"},{"location":"models/UnivariateDiscretizer_MLJModels/","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"UnivariateDiscretizer = @load UnivariateDiscretizer pkg=MLJModels","category":"page"},{"location":"models/UnivariateDiscretizer_MLJModels/","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"Do model = UnivariateDiscretizer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateDiscretizer(n_classes=...).","category":"page"},{"location":"models/UnivariateDiscretizer_MLJModels/","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"Discretization converts a Continuous vector into an OrderedFactor vector. In particular, the output is a CategoricalVector (whose reference type is optimized).","category":"page"},{"location":"models/UnivariateDiscretizer_MLJModels/","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"The transformation is chosen so that the vector on which the transformer is fit has, in transformed form, an approximately uniform distribution of values. Specifically, if n_classes is the level of discretization, then 2*n_classes - 1 ordered quantiles are computed, the odd quantiles being used for transforming (discretization) and the even quantiles for inverse transforming.","category":"page"},{"location":"models/UnivariateDiscretizer_MLJModels/#Training-data","page":"UnivariateDiscretizer","title":"Training data","text":"","category":"section"},{"location":"models/UnivariateDiscretizer_MLJModels/","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/UnivariateDiscretizer_MLJModels/","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"mach = machine(model, x)","category":"page"},{"location":"models/UnivariateDiscretizer_MLJModels/","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"where","category":"page"},{"location":"models/UnivariateDiscretizer_MLJModels/","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"x: any abstract vector with Continuous element scitype; check scitype with scitype(x).","category":"page"},{"location":"models/UnivariateDiscretizer_MLJModels/","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/UnivariateDiscretizer_MLJModels/#Hyper-parameters","page":"UnivariateDiscretizer","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/UnivariateDiscretizer_MLJModels/","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"n_classes: number of discrete classes in the output","category":"page"},{"location":"models/UnivariateDiscretizer_MLJModels/#Operations","page":"UnivariateDiscretizer","title":"Operations","text":"","category":"section"},{"location":"models/UnivariateDiscretizer_MLJModels/","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"transform(mach, xnew): discretize xnew according to the discretization learned when fitting mach\ninverse_transform(mach, z): attempt to reconstruct from z a vector that transforms to give z","category":"page"},{"location":"models/UnivariateDiscretizer_MLJModels/#Fitted-parameters","page":"UnivariateDiscretizer","title":"Fitted 
parameters","text":"","category":"section"},{"location":"models/UnivariateDiscretizer_MLJModels/","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"The fields of fitted_params(mach).fitesult include:","category":"page"},{"location":"models/UnivariateDiscretizer_MLJModels/","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"odd_quantiles: quantiles used for transforming (length is n_classes - 1)\neven_quantiles: quantiles used for inverse transforming (length is n_classes)","category":"page"},{"location":"models/UnivariateDiscretizer_MLJModels/#Example","page":"UnivariateDiscretizer","title":"Example","text":"","category":"section"},{"location":"models/UnivariateDiscretizer_MLJModels/","page":"UnivariateDiscretizer","title":"UnivariateDiscretizer","text":"using MLJ\nusing Random\nRandom.seed!(123)\n\ndiscretizer = UnivariateDiscretizer(n_classes=100)\nmach = machine(discretizer, randn(1000))\nfit!(mach)\n\njulia> x = rand(5)\n5-element Vector{Float64}:\n 0.8585244609846809\n 0.37541692370451396\n 0.6767070590395461\n 0.9208844241267105\n 0.7064611415680901\n\njulia> z = transform(mach, x)\n5-element CategoricalArrays.CategoricalArray{UInt8,1,UInt8}:\n 0x52\n 0x42\n 0x4d\n 0x54\n 0x4e\n\nx_approx = inverse_transform(mach, z)\njulia> x - x_approx\n5-element Vector{Float64}:\n 0.008224506144777322\n 0.012731354778359405\n 0.0056265330571125816\n 0.005738175684445124\n 0.006835652575801987","category":"page"},{"location":"models/GaussianNBClassifier_MLJScikitLearnInterface/#GaussianNBClassifier_MLJScikitLearnInterface","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"","category":"section"},{"location":"models/GaussianNBClassifier_MLJScikitLearnInterface/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"GaussianNBClassifier","category":"page"},{"location":"models/GaussianNBClassifier_MLJScikitLearnInterface/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"A model type for constructing a Gaussian naive Bayes classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/GaussianNBClassifier_MLJScikitLearnInterface/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/GaussianNBClassifier_MLJScikitLearnInterface/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"GaussianNBClassifier = @load GaussianNBClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/GaussianNBClassifier_MLJScikitLearnInterface/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"Do model = GaussianNBClassifier() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in GaussianNBClassifier(priors=...).","category":"page"},{"location":"models/GaussianNBClassifier_MLJScikitLearnInterface/#Hyper-parameters","page":"GaussianNBClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/GaussianNBClassifier_MLJScikitLearnInterface/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"priors = nothing\nvar_smoothing = 1.0e-9","category":"page"},{"location":"models/GaussianNBClassifier_NaiveBayes/#GaussianNBClassifier_NaiveBayes","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"","category":"section"},{"location":"models/GaussianNBClassifier_NaiveBayes/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"GaussianNBClassifier","category":"page"},{"location":"models/GaussianNBClassifier_NaiveBayes/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"A model type for constructing a Gaussian naive Bayes classifier, based on NaiveBayes.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/GaussianNBClassifier_NaiveBayes/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/GaussianNBClassifier_NaiveBayes/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"GaussianNBClassifier = @load GaussianNBClassifier pkg=NaiveBayes","category":"page"},{"location":"models/GaussianNBClassifier_NaiveBayes/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"Do model = GaussianNBClassifier() to construct an instance with default hyper-parameters. ","category":"page"},{"location":"models/GaussianNBClassifier_NaiveBayes/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"Given each class taken on by the target variable y, it is supposed that the conditional probability distribution for the input variables X is a multivariate Gaussian. The mean and covariance of these Gaussian distributions are estimated using maximum likelihood, and a probability distribution for y given X is deduced by applying Bayes' rule. The required marginal for y is estimated using class frequency in the training data.","category":"page"},{"location":"models/GaussianNBClassifier_NaiveBayes/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"Important. The name \"naive Bayes classifier\" is perhaps misleading. 
Since we are learning the full multivariate Gaussian distributions for X given y, we are not applying the usual naive Bayes independence condition, which would amount to forcing the covariance matrix to be diagonal.","category":"page"},{"location":"models/GaussianNBClassifier_NaiveBayes/#Training-data","page":"GaussianNBClassifier","title":"Training data","text":"","category":"section"},{"location":"models/GaussianNBClassifier_NaiveBayes/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/GaussianNBClassifier_NaiveBayes/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/GaussianNBClassifier_NaiveBayes/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"Here:","category":"page"},{"location":"models/GaussianNBClassifier_NaiveBayes/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check the column scitypes with schema(X)\ny is the target, which can be any AbstractVector whose element scitype is Finite; check the scitype with schema(y)","category":"page"},{"location":"models/GaussianNBClassifier_NaiveBayes/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/GaussianNBClassifier_NaiveBayes/#Operations","page":"GaussianNBClassifier","title":"Operations","text":"","category":"section"},{"location":"models/GaussianNBClassifier_NaiveBayes/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"predict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above. Predictions are probabilistic.\npredict_mode(mach, Xnew): Return the mode of above predictions.","category":"page"},{"location":"models/GaussianNBClassifier_NaiveBayes/#Fitted-parameters","page":"GaussianNBClassifier","title":"Fitted parameters","text":"","category":"section"},{"location":"models/GaussianNBClassifier_NaiveBayes/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/GaussianNBClassifier_NaiveBayes/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"c_counts: A dictionary containing the observed count of each input class.\nc_stats: A dictionary containing observed statistics on each input class. Each class is represented by a DataStats object, with the following fields:\nn_vars: The number of variables used to describe the class's behavior.\nn_obs: The number of times the class is observed.\nobs_axis: The axis along which the observations were computed.\ngaussians: A per class dictionary of Gaussians, each representing the distribution of the class. 
Represented with type Distributions.MvNormal from the Distributions.jl package.\nn_obs: The total number of observations in the training data.","category":"page"},{"location":"models/GaussianNBClassifier_NaiveBayes/#Examples","page":"GaussianNBClassifier","title":"Examples","text":"","category":"section"},{"location":"models/GaussianNBClassifier_NaiveBayes/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"using MLJ\nGaussianNB = @load GaussianNBClassifier pkg=NaiveBayes\n\nX, y = @load_iris\nclf = GaussianNB()\nmach = machine(clf, X, y) |> fit!\n\nfitted_params(mach)\n\npreds = predict(mach, X) ## probabilistic predictions\npreds[1]\npredict_mode(mach, X) ## point predictions","category":"page"},{"location":"models/GaussianNBClassifier_NaiveBayes/","page":"GaussianNBClassifier","title":"GaussianNBClassifier","text":"See also MultinomialNBClassifier","category":"page"},{"location":"models/Resampler_MLJBase/#Resampler_MLJBase","page":"Resampler","title":"Resampler","text":"","category":"section"},{"location":"models/Resampler_MLJBase/","page":"Resampler","title":"Resampler","text":"resampler = Resampler(\n model=ConstantRegressor(),\n resampling=CV(),\n measure=nothing,\n weights=nothing,\n class_weights=nothing\n operation=predict,\n repeats = 1,\n acceleration=default_resource(),\n check_measure=true,\n per_observation=true,\n logger=nothing,\n compact=false,\n)","category":"page"},{"location":"models/Resampler_MLJBase/","page":"Resampler","title":"Resampler","text":"Private method. Use at own risk.","category":"page"},{"location":"models/Resampler_MLJBase/","page":"Resampler","title":"Resampler","text":"Resampling model wrapper, used internally by the fit method of TunedModel instances and IteratedModel instances. See evaluate! for meaning of the options. Not intended for use by general user, who will ordinarily use evaluate! directly.","category":"page"},{"location":"models/Resampler_MLJBase/","page":"Resampler","title":"Resampler","text":"Given a machine mach = machine(resampler, args...) one obtains a performance evaluation of the specified model, performed according to the prescribed resampling strategy and other parameters, using data args..., by calling fit!(mach) followed by evaluate(mach).","category":"page"},{"location":"models/Resampler_MLJBase/","page":"Resampler","title":"Resampler","text":"On subsequent calls to fit!(mach) new train/test pairs of row indices are only regenerated if resampling, repeats or cache fields of resampler have changed. The evolution of an RNG field of resampler does not constitute a change (== for MLJType objects is not sensitive to such changes; see is_same_except).","category":"page"},{"location":"models/Resampler_MLJBase/","page":"Resampler","title":"Resampler","text":"If there is single train/test pair, then warm-restart behavior of the wrapped model resampler.model will extend to warm-restart behaviour of the wrapper resampler, with respect to mutations of the wrapped model.","category":"page"},{"location":"models/Resampler_MLJBase/","page":"Resampler","title":"Resampler","text":"The sample weights are passed to the specified performance measures that support weights for evaluation. 
These weights are not to be confused with any weights bound to a Resampler instance in a machine, used for training the wrapped model when supported.","category":"page"},{"location":"models/Resampler_MLJBase/","page":"Resampler","title":"Resampler","text":"The sample class_weights are passed to the specified performance measures that support per-class weights for evaluation. These weights are not to be confused with any weights bound to a Resampler instance in a machine, used for training the wrapped model when supported.","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/#CatBoostRegressor_CatBoost","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"","category":"section"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"CatBoostRegressor","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"A model type for constructing a CatBoost regressor, based on CatBoost.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"CatBoostRegressor = @load CatBoostRegressor pkg=CatBoost","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"Do model = CatBoostRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in CatBoostRegressor(iterations=...).","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/#Training-data","page":"CatBoostRegressor","title":"Training data","text":"","category":"section"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"where","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, Finite, Textual; check column scitypes with schema(X). 
Textual columns will be passed to catboost as text_features, Multiclass columns will be passed to catboost as cat_features, and OrderedFactor columns will be converted to integers.\ny: the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"Train the machine with fit!(mach, rows=...).","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/#Hyper-parameters","page":"CatBoostRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"For more details on the catboost hyperparameters, see the Python docs: https://catboost.ai/en/docs/concepts/python-reference_catboostclassifier#parameters","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/#Operations","page":"CatBoostRegressor","title":"Operations","text":"","category":"section"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"predict(mach, Xnew): return predictions of the target given new features Xnew having the same scitype as X above.","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/#Accessor-functions","page":"CatBoostRegressor","title":"Accessor functions","text":"","category":"section"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"feature_importances(mach): return vector of feature importances, in the form of feature::Symbol => importance::Real pairs","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/#Fitted-parameters","page":"CatBoostRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"model: The Python CatBoostRegressor model","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/#Report","page":"CatBoostRegressor","title":"Report","text":"","category":"section"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"The fields of report(mach) are:","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"feature_importances: Vector{Pair{Symbol, Float64}} of feature importances","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/#Examples","page":"CatBoostRegressor","title":"Examples","text":"","category":"section"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"using CatBoost.MLJCatBoostInterface\nusing MLJ\n\nX = (\n duration = [1.5, 4.1, 5.0, 6.7], \n n_phone_calls = [4, 5, 6, 7], \n department = coerce([\"acc\", \"ops\", \"acc\", \"ops\"], Multiclass), \n)\ny = [2.0, 4.0, 6.0, 7.0]\n\nmodel = CatBoostRegressor(iterations=5)\nmach = machine(model, X, y)\nfit!(mach)\npreds = predict(mach, X)","category":"page"},{"location":"models/CatBoostRegressor_CatBoost/","page":"CatBoostRegressor","title":"CatBoostRegressor","text":"See also catboost and the unwrapped model type 
CatBoost.CatBoostRegressor.","category":"page"},{"location":"models/RidgeClassifier_MLJScikitLearnInterface/#RidgeClassifier_MLJScikitLearnInterface","page":"RidgeClassifier","title":"RidgeClassifier","text":"","category":"section"},{"location":"models/RidgeClassifier_MLJScikitLearnInterface/","page":"RidgeClassifier","title":"RidgeClassifier","text":"RidgeClassifier","category":"page"},{"location":"models/RidgeClassifier_MLJScikitLearnInterface/","page":"RidgeClassifier","title":"RidgeClassifier","text":"A model type for constructing a ridge regression classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/RidgeClassifier_MLJScikitLearnInterface/","page":"RidgeClassifier","title":"RidgeClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/RidgeClassifier_MLJScikitLearnInterface/","page":"RidgeClassifier","title":"RidgeClassifier","text":"RidgeClassifier = @load RidgeClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/RidgeClassifier_MLJScikitLearnInterface/","page":"RidgeClassifier","title":"RidgeClassifier","text":"Do model = RidgeClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RidgeClassifier(alpha=...).","category":"page"},{"location":"models/RidgeClassifier_MLJScikitLearnInterface/#Hyper-parameters","page":"RidgeClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/RidgeClassifier_MLJScikitLearnInterface/","page":"RidgeClassifier","title":"RidgeClassifier","text":"alpha = 1.0\nfit_intercept = true\ncopy_X = true\nmax_iter = nothing\ntol = 0.001\nclass_weight = nothing\nsolver = auto\nrandom_state = nothing","category":"page"},{"location":"models/LassoRegressor_MLJScikitLearnInterface/#LassoRegressor_MLJScikitLearnInterface","page":"LassoRegressor","title":"LassoRegressor","text":"","category":"section"},{"location":"models/LassoRegressor_MLJScikitLearnInterface/","page":"LassoRegressor","title":"LassoRegressor","text":"LassoRegressor","category":"page"},{"location":"models/LassoRegressor_MLJScikitLearnInterface/","page":"LassoRegressor","title":"LassoRegressor","text":"A model type for constructing a lasso regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LassoRegressor_MLJScikitLearnInterface/","page":"LassoRegressor","title":"LassoRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LassoRegressor_MLJScikitLearnInterface/","page":"LassoRegressor","title":"LassoRegressor","text":"LassoRegressor = @load LassoRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/LassoRegressor_MLJScikitLearnInterface/","page":"LassoRegressor","title":"LassoRegressor","text":"Do model = LassoRegressor() to construct an instance with default hyper-parameters. 
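As an illustrative sketch only (the synthetic data from make_regression and the chosen alpha value are assumptions, not documented defaults), typical usage follows the standard MLJ supervised workflow:

using MLJ
LassoRegressor = @load LassoRegressor pkg=MLJScikitLearnInterface verbosity=0
X, y = make_regression(100, 3)     ## synthetic table and Continuous target
model = LassoRegressor(alpha=0.5)
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)            ## point predictions
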
Provide keyword arguments to override hyper-parameter defaults, as in LassoRegressor(alpha=...).","category":"page"},{"location":"models/LassoRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"LassoRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/LassoRegressor_MLJScikitLearnInterface/","page":"LassoRegressor","title":"LassoRegressor","text":"alpha = 1.0\nfit_intercept = true\nprecompute = false\ncopy_X = true\nmax_iter = 1000\ntol = 0.0001\nwarm_start = false\npositive = false\nrandom_state = nothing\nselection = cyclic","category":"page"},{"location":"models/KDEDetector_OutlierDetectionPython/#KDEDetector_OutlierDetectionPython","page":"KDEDetector","title":"KDEDetector","text":"","category":"section"},{"location":"models/KDEDetector_OutlierDetectionPython/","page":"KDEDetector","title":"KDEDetector","text":"KDEDetector(bandwidth=1.0,\n algorithm=\"auto\",\n leaf_size=30,\n metric=\"minkowski\",\n metric_params=None)","category":"page"},{"location":"models/KDEDetector_OutlierDetectionPython/","page":"KDEDetector","title":"KDEDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.kde","category":"page"},{"location":"models/ConstantClassifier_MLJModels/#ConstantClassifier_MLJModels","page":"ConstantClassifier","title":"ConstantClassifier","text":"","category":"section"},{"location":"models/ConstantClassifier_MLJModels/","page":"ConstantClassifier","title":"ConstantClassifier","text":"ConstantClassifier","category":"page"},{"location":"models/ConstantClassifier_MLJModels/","page":"ConstantClassifier","title":"ConstantClassifier","text":"This \"dummy\" probabilistic predictor always returns the same distribution, irrespective of the provided input pattern. The distribution d returned is the UnivariateFinite distribution based on frequency of classes observed in the training target data. So, pdf(d, level) is the relative frequency with which the training target takes on the value level. Use predict_mode, rather than predict, to obtain the training target mode. 
For more on the UnivariateFinite type, see the CategoricalDistributions.jl package.","category":"page"},{"location":"models/ConstantClassifier_MLJModels/","page":"ConstantClassifier","title":"ConstantClassifier","text":"Almost any reasonable model is expected to outperform ConstantClassifier, which is used almost exclusively for testing and establishing performance baselines.","category":"page"},{"location":"models/ConstantClassifier_MLJModels/","page":"ConstantClassifier","title":"ConstantClassifier","text":"In MLJ (or MLJModels) do model = ConstantClassifier() to construct an instance.","category":"page"},{"location":"models/ConstantClassifier_MLJModels/#Training-data","page":"ConstantClassifier","title":"Training data","text":"","category":"section"},{"location":"models/ConstantClassifier_MLJModels/","page":"ConstantClassifier","title":"ConstantClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/ConstantClassifier_MLJModels/","page":"ConstantClassifier","title":"ConstantClassifier","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/ConstantClassifier_MLJModels/","page":"ConstantClassifier","title":"ConstantClassifier","text":"Here:","category":"page"},{"location":"models/ConstantClassifier_MLJModels/","page":"ConstantClassifier","title":"ConstantClassifier","text":"X is any table of input features (eg, a DataFrame)\ny is the target, which can be any AbstractVector whose element scitype is Finite; check the scitype with schema(y)","category":"page"},{"location":"models/ConstantClassifier_MLJModels/","page":"ConstantClassifier","title":"ConstantClassifier","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/ConstantClassifier_MLJModels/#Hyper-parameters","page":"ConstantClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/ConstantClassifier_MLJModels/","page":"ConstantClassifier","title":"ConstantClassifier","text":"None.","category":"page"},{"location":"models/ConstantClassifier_MLJModels/#Operations","page":"ConstantClassifier","title":"Operations","text":"","category":"section"},{"location":"models/ConstantClassifier_MLJModels/","page":"ConstantClassifier","title":"ConstantClassifier","text":"predict(mach, Xnew): Return predictions of the target given features Xnew (which for this model are ignored). 
Predictions are probabilistic.\npredict_mode(mach, Xnew): Return the mode of the probabilistic predictions returned above.","category":"page"},{"location":"models/ConstantClassifier_MLJModels/#Fitted-parameters","page":"ConstantClassifier","title":"Fitted parameters","text":"","category":"section"},{"location":"models/ConstantClassifier_MLJModels/","page":"ConstantClassifier","title":"ConstantClassifier","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/ConstantClassifier_MLJModels/","page":"ConstantClassifier","title":"ConstantClassifier","text":"target_distribution: The distribution fit to the supplied target data.","category":"page"},{"location":"models/ConstantClassifier_MLJModels/#Examples","page":"ConstantClassifier","title":"Examples","text":"","category":"section"},{"location":"models/ConstantClassifier_MLJModels/","page":"ConstantClassifier","title":"ConstantClassifier","text":"using MLJ\n\nclf = ConstantClassifier()\n\nX, y = @load_crabs ## a table and a categorical vector\nmach = machine(clf, X, y) |> fit!\n\nfitted_params(mach)\n\nXnew = (;FL = [8.1, 24.8, 7.2],\n RW = [5.1, 25.7, 6.4],\n CL = [15.9, 46.7, 14.3],\n CW = [18.7, 59.7, 12.2],\n BD = [6.2, 23.6, 8.4],)\n\n## probabilistic predictions:\nyhat = predict(mach, Xnew)\nyhat[1]\n\n## raw probabilities:\npdf.(yhat, \"B\")\n\n## probability matrix:\nL = levels(y)\npdf(yhat, L)\n\n## point predictions:\npredict_mode(mach, Xnew)","category":"page"},{"location":"models/ConstantClassifier_MLJModels/","page":"ConstantClassifier","title":"ConstantClassifier","text":"See also ConstantRegressor","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/#ClusterUndersampler_Imbalance","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"","category":"section"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"Initiate a cluster undersampling model with the given hyper-parameters.","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"ClusterUndersampler","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"A model type for constructing a cluster undersampler, based on Imbalance.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"ClusterUndersampler = @load ClusterUndersampler pkg=Imbalance","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"Do model = ClusterUndersampler() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ClusterUndersampler(mode=...).","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"ClusterUndersampler implements clustering undersampling as presented in Wei-Chao, L., Chih-Fong, T., Ya-Han, H., & Jing-Shang, J. (2017). Clustering-based undersampling in class-imbalanced data. Information Sciences, 409–410, 17–26. 
with K-means as the clustering algorithm.","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/#Training-data","page":"ClusterUndersampler","title":"Training data","text":"","category":"section"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"In MLJ or MLJBase, wrap the model in a machine by \tmach = machine(model)","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"There is no need to provide any data here because the model is a static transformer.","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"Likewise, there is no need to fit!(mach). ","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"For default values of the hyper-parameters, model can be constructed with model = ClusterUndersampler().","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/#Hyperparameters","page":"ClusterUndersampler","title":"Hyperparameters","text":"","category":"section"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"mode::AbstractString=\"nearest\": If \"center\" then the undersampled data will consist of the centroids of","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"each cluster found; if \"nearest\" then it will consist of the nearest neighbor of each centroid.","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"ratios=1.0: A parameter that controls the amount of undersampling to be done for each class\nCan be a float and in this case each class will be undersampled to the size of the minority class times the float. By default, all classes are undersampled to the size of the minority class\nCan be a dictionary mapping each class label to the float ratio for that class\nmaxiter::Integer=100: Maximum number of iterations to run K-means\nrng::Integer=42: Random number generator seed. 
Must be an integer.","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/#Transform-Inputs","page":"ClusterUndersampler","title":"Transform Inputs","text":"","category":"section"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"X: A matrix or table of floats where each row is an observation from the dataset\ny: An abstract vector of labels (e.g., strings) that correspond to the observations in X","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/#Transform-Outputs","page":"ClusterUndersampler","title":"Transform Outputs","text":"","category":"section"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"X_under: A matrix or table that includes the data after undersampling depending on whether the input X is a matrix or table respectively\ny_under: An abstract vector of labels corresponding to X_under","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/#Operations","page":"ClusterUndersampler","title":"Operations","text":"","category":"section"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"transform(mach, X, y): resample the data X and y using ClusterUndersampler, returning the undersampled versions","category":"page"},{"location":"models/ClusterUndersampler_Imbalance/#Example","page":"ClusterUndersampler","title":"Example","text":"","category":"section"},{"location":"models/ClusterUndersampler_Imbalance/","page":"ClusterUndersampler","title":"ClusterUndersampler","text":"using MLJ\nimport Imbalance\n\n## set probability of each class\nclass_probs = [0.5, 0.2, 0.3] \nnum_rows, num_continuous_feats = 100, 5\n## generate a table and categorical vector accordingly\nX, y = Imbalance.generate_imbalanced_data(num_rows, num_continuous_feats; \n class_probs, rng=42) \n \njulia> Imbalance.checkbalance(y; ref=\"minority\")\n 1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%) \n 2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 33 (173.7%) \n 0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (252.6%) \n\n## load cluster_undersampling\nClusterUndersampler = @load ClusterUndersampler pkg=Imbalance\n\n## wrap the model in a machine\nundersampler = ClusterUndersampler(mode=\"nearest\", \n ratios=Dict(0=>1.0, 1=> 1.0, 2=>1.0), rng=42)\nmach = machine(undersampler)\n\n## provide the data to transform (there is nothing to fit)\nX_under, y_under = transform(mach, X, y)\n\n \njulia> Imbalance.checkbalance(y_under; ref=\"minority\")\n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%) \n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%) \n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%)","category":"page"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/#MultitargetRidgeRegressor_MultivariateStats","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"","category":"section"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"MultitargetRidgeRegressor","category":"page"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"A model type for constructing a multitarget ridge regressor, based on MultivariateStats.jl, and implementing the MLJ model 
interface.","category":"page"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"MultitargetRidgeRegressor = @load MultitargetRidgeRegressor pkg=MultivariateStats","category":"page"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"Do model = MultitargetRidgeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetRidgeRegressor(lambda=...).","category":"page"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"Multi-target ridge regression adds a quadratic penalty term to multi-target least squares regression, for regularization. Ridge regression is particularly useful in the case of multicollinearity. In this case, the output represents a response vector. Options exist to specify a bias term, and to adjust the strength of the penalty term.","category":"page"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/#Training-data","page":"MultitargetRidgeRegressor","title":"Training data","text":"","category":"section"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"Here:","category":"page"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).\ny is the target, which can be any table of responses whose element scitype is Continuous; check the scitype with scitype(y).","category":"page"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/#Hyper-parameters","page":"MultitargetRidgeRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"lambda=1.0: Is the non-negative parameter for the regularization strength. 
If lambda is 0, ridge regression is equivalent to linear least squares regression, and as lambda approaches infinity, all the linear coefficients approach 0.\nbias=true: Include the bias term if true, otherwise fit without bias term.","category":"page"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/#Operations","page":"MultitargetRidgeRegressor","title":"Operations","text":"","category":"section"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"predict(mach, Xnew): Return predictions of the target given new features Xnew, which should have the same scitype as X above.","category":"page"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/#Fitted-parameters","page":"MultitargetRidgeRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"coefficients: The linear coefficients determined by the model.\nintercept: The intercept determined by the model.","category":"page"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/#Examples","page":"MultitargetRidgeRegressor","title":"Examples","text":"","category":"section"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"using MLJ\nusing DataFrames\n\nRidgeRegressor = @load MultitargetRidgeRegressor pkg=MultivariateStats\n\nX, y = make_regression(100, 6; n_targets = 2) ## a table and a table (synthetic data)\n\nridge_regressor = RidgeRegressor(lambda=1.5)\nmach = machine(ridge_regressor, X, y) |> fit!\n\nXnew, _ = make_regression(3, 6)\nyhat = predict(mach, Xnew) ## new predictions","category":"page"},{"location":"models/MultitargetRidgeRegressor_MultivariateStats/","page":"MultitargetRidgeRegressor","title":"MultitargetRidgeRegressor","text":"See also LinearRegressor, MultitargetLinearRegressor, RidgeRegressor","category":"page"},{"location":"frequently_asked_questions/#Frequently-Asked-Questions","page":"FAQ","title":"Frequently Asked Questions","text":"","category":"section"},{"location":"frequently_asked_questions/#Julia-already-has-a-great-machine-learning-toolbox,-ScitkitLearn.jl.-Why-MLJ?","page":"FAQ","title":"Julia already has a great machine learning toolbox, ScikitLearn.jl. Why MLJ?","text":"","category":"section"},{"location":"frequently_asked_questions/","page":"FAQ","title":"FAQ","text":"An alternative machine learning toolbox for Julia users is ScikitLearn.jl. Initially intended as a Julia wrapper for the popular Python library scikit-learn, ScikitLearn.jl now also allows machine learning algorithms written in Julia to implement its API. Meta-algorithms (systematic tuning, pipelining, etc.) remain Python-wrapped code, however.","category":"page"},{"location":"frequently_asked_questions/","page":"FAQ","title":"FAQ","text":"While ScikitLearn.jl provides the Julia user with access to a mature and large library of machine learning models, the scikit-learn API on which it is modeled, dating back to 2007, is not likely to evolve significantly in the future. 
MLJ enjoys (or will enjoy) several features that should make it an attractive alternative in the longer term:","category":"page"},{"location":"frequently_asked_questions/","page":"FAQ","title":"FAQ","text":"One language. ScikitLearn.jl wraps Python code, which in turn wraps C code for performance-critical routines. A Julia machine learning algorithm that implements the MLJ model interface is 100% Julia. Writing code in Julia is almost as fast as Python and well-written Julia code runs almost as fast as C. Additionally, a single language design provides superior interoperability. For example, one can implement: (i) gradient-descent tuning of hyperparameters, using automatic differentiation libraries such as Flux.jl; and (ii) GPU performance boosts without major code refactoring, using CuArrays.jl.\nRegistry for model metadata. In ScikitLearn.jl the list of available models, as well as model metadata (whether a model handles categorical inputs, whether it can make probabilistic predictions, etc) must be gleaned from the documentation. In MLJ, this information is more structured and is accessible to MLJ via a searchable model registry (without the models needing to be loaded).\nFlexible API for model composition. Pipelines in scikit-learn are more of an afterthought than an integral part of the original design. By contrast, MLJ's user-interaction API was predicated on the requirements of a flexible \"learning network\" API, one that allows models to be connected in essentially arbitrary ways (such as Wolpert model stacks). Networks can be built and tested in stages before being exported as first-class stand-alone models. Networks feature \"smart\" training (only necessary components are retrained after parameter changes) and will eventually be trainable using a DAG scheduler.\nClean probabilistic API. The scikit-learn API does not specify a universal standard for the form of probabilistic predictions. By fixing a probabilistic API along the lines of the skpro project, MLJ aims to improve support for Bayesian statistics and probabilistic graphical models.\nUniversal adoption of categorical data types. Python's scientific array library NumPy has no dedicated data type for representing categorical data (i.e., no type that tracks the pool of all possible classes). Generally, scikit-learn models deal with this by requiring data to be relabeled as integers. However, the naive user trains a model on relabeled categorical data only to discover that evaluation on a test set crashes their code because a categorical feature takes on a value not observed in training. MLJ mitigates such issues by insisting on the use of categorical data types, and by insisting that MLJ model implementations preserve the class pools. 
If, for example, a training target contains classes in the pool that do not appear in the training set, a probabilistic prediction will nevertheless predict a distribution whose support includes the missing class, but which is appropriately weighted with probability zero.","category":"page"},{"location":"frequently_asked_questions/","page":"FAQ","title":"FAQ","text":"Finally, we note that a large number of ScikitLearn.jl models are now wrapped for use in MLJ.","category":"page"},{"location":"models/AffinityPropagation_MLJScikitLearnInterface/#AffinityPropagation_MLJScikitLearnInterface","page":"AffinityPropagation","title":"AffinityPropagation","text":"","category":"section"},{"location":"models/AffinityPropagation_MLJScikitLearnInterface/","page":"AffinityPropagation","title":"AffinityPropagation","text":"AffinityPropagation","category":"page"},{"location":"models/AffinityPropagation_MLJScikitLearnInterface/","page":"AffinityPropagation","title":"AffinityPropagation","text":"A model type for constructing an Affinity Propagation Clustering of data, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/AffinityPropagation_MLJScikitLearnInterface/","page":"AffinityPropagation","title":"AffinityPropagation","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/AffinityPropagation_MLJScikitLearnInterface/","page":"AffinityPropagation","title":"AffinityPropagation","text":"AffinityPropagation = @load AffinityPropagation pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/AffinityPropagation_MLJScikitLearnInterface/","page":"AffinityPropagation","title":"AffinityPropagation","text":"Do model = AffinityPropagation() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in AffinityPropagation(damping=...).","category":"page"},{"location":"models/AffinityPropagation_MLJScikitLearnInterface/#Hyper-parameters","page":"AffinityPropagation","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/AffinityPropagation_MLJScikitLearnInterface/","page":"AffinityPropagation","title":"AffinityPropagation","text":"damping = 0.5\nmax_iter = 200\nconvergence_iter = 15\ncopy = true\npreference = nothing\naffinity = euclidean\nverbose = false","category":"page"},{"location":"models/LogisticCVClassifier_MLJScikitLearnInterface/#LogisticCVClassifier_MLJScikitLearnInterface","page":"LogisticCVClassifier","title":"LogisticCVClassifier","text":"","category":"section"},{"location":"models/LogisticCVClassifier_MLJScikitLearnInterface/","page":"LogisticCVClassifier","title":"LogisticCVClassifier","text":"LogisticCVClassifier","category":"page"},{"location":"models/LogisticCVClassifier_MLJScikitLearnInterface/","page":"LogisticCVClassifier","title":"LogisticCVClassifier","text":"A model type for constructing a logistic regression classifier with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LogisticCVClassifier_MLJScikitLearnInterface/","page":"LogisticCVClassifier","title":"LogisticCVClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LogisticCVClassifier_MLJScikitLearnInterface/","page":"LogisticCVClassifier","title":"LogisticCVClassifier","text":"LogisticCVClassifier = @load LogisticCVClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/LogisticCVClassifier_MLJScikitLearnInterface/","page":"LogisticCVClassifier","title":"LogisticCVClassifier","text":"Do model = LogisticCVClassifier() to construct an instance with default hyper-parameters. 
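A minimal usage sketch, assuming the standard MLJ supervised workflow and treating the iris dataset and the particular Cs and cv values as illustrative assumptions (not part of this docstring):

using MLJ
LogisticCVClassifier = @load LogisticCVClassifier pkg=MLJScikitLearnInterface verbosity=0
X, y = @load_iris
model = LogisticCVClassifier(Cs=5, cv=3)
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)            ## probabilistic predictions
predict_mode(mach, X)              ## point predictions
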
Provide keyword arguments to override hyper-parameter defaults, as in LogisticCVClassifier(Cs=...).","category":"page"},{"location":"models/LogisticCVClassifier_MLJScikitLearnInterface/#Hyper-parameters","page":"LogisticCVClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/LogisticCVClassifier_MLJScikitLearnInterface/","page":"LogisticCVClassifier","title":"LogisticCVClassifier","text":"Cs = 10\nfit_intercept = true\ncv = 5\ndual = false\npenalty = l2\nscoring = nothing\nsolver = lbfgs\ntol = 0.0001\nmax_iter = 100\nclass_weight = nothing\nn_jobs = nothing\nverbose = 0\nrefit = true\nintercept_scaling = 1.0\nmulti_class = auto\nrandom_state = nothing\nl1_ratios = nothing","category":"page"},{"location":"models/ROSE_Imbalance/#ROSE_Imbalance","page":"ROSE","title":"ROSE","text":"","category":"section"},{"location":"models/ROSE_Imbalance/","page":"ROSE","title":"ROSE","text":"Initiate a ROSE model with the given hyper-parameters.","category":"page"},{"location":"models/ROSE_Imbalance/","page":"ROSE","title":"ROSE","text":"ROSE","category":"page"},{"location":"models/ROSE_Imbalance/","page":"ROSE","title":"ROSE","text":"A model type for constructing a rose, based on Imbalance.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/ROSE_Imbalance/","page":"ROSE","title":"ROSE","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/ROSE_Imbalance/","page":"ROSE","title":"ROSE","text":"ROSE = @load ROSE pkg=Imbalance","category":"page"},{"location":"models/ROSE_Imbalance/","page":"ROSE","title":"ROSE","text":"Do model = ROSE() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ROSE(s=...).","category":"page"},{"location":"models/ROSE_Imbalance/","page":"ROSE","title":"ROSE","text":"ROSE implements the ROSE (Random Oversampling Examples) algorithm to correct for class imbalance as in G Menardi, N. Torelli, “Training and assessing classification rules with imbalanced data,” Data Mining and Knowledge Discovery, 28(1), pp.92-122, 2014.","category":"page"},{"location":"models/ROSE_Imbalance/#Training-data","page":"ROSE","title":"Training data","text":"","category":"section"},{"location":"models/ROSE_Imbalance/","page":"ROSE","title":"ROSE","text":"In MLJ or MLJBase, wrap the model in a machine by mach = machine(model)","category":"page"},{"location":"models/ROSE_Imbalance/","page":"ROSE","title":"ROSE","text":"There is no need to provide any data here because the model is a static transformer.","category":"page"},{"location":"models/ROSE_Imbalance/","page":"ROSE","title":"ROSE","text":"Likewise, there is no need to fit!(mach). ","category":"page"},{"location":"models/ROSE_Imbalance/","page":"ROSE","title":"ROSE","text":"For default values of the hyper-parameters, model can be constructed by model = ROSE()","category":"page"},{"location":"models/ROSE_Imbalance/#Hyperparameters","page":"ROSE","title":"Hyperparameters","text":"","category":"section"},{"location":"models/ROSE_Imbalance/","page":"ROSE","title":"ROSE","text":"s::float: A parameter that proportionally controls the bandwidth of the Gaussian kernel\nratios=1.0: A parameter that controls the amount of oversampling to be done for each class\nCan be a float and in this case each class will be oversampled to the size of the majority class times the float. 
By default, all classes are oversampled to the size of the majority class\nCan be a dictionary mapping each class label to the float ratio for that class\nrng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister`.","category":"page"},{"location":"models/ROSE_Imbalance/#Transform-Inputs","page":"ROSE","title":"Transform Inputs","text":"","category":"section"},{"location":"models/ROSE_Imbalance/","page":"ROSE","title":"ROSE","text":"X: A matrix or table of floats where each row is an observation from the dataset\ny: An abstract vector of labels (e.g., strings) that correspond to the observations in X","category":"page"},{"location":"models/ROSE_Imbalance/#Transform-Outputs","page":"ROSE","title":"Transform Outputs","text":"","category":"section"},{"location":"models/ROSE_Imbalance/","page":"ROSE","title":"ROSE","text":"Xover: A matrix or table that includes original data and the new observations due to oversampling. depending on whether the input X is a matrix or table respectively\nyover: An abstract vector of labels corresponding to Xover","category":"page"},{"location":"models/ROSE_Imbalance/#Operations","page":"ROSE","title":"Operations","text":"","category":"section"},{"location":"models/ROSE_Imbalance/","page":"ROSE","title":"ROSE","text":"transform(mach, X, y): resample the data X and y using ROSE, returning both the new and original observations","category":"page"},{"location":"models/ROSE_Imbalance/#Example","page":"ROSE","title":"Example","text":"","category":"section"},{"location":"models/ROSE_Imbalance/","page":"ROSE","title":"ROSE","text":"using MLJ\nimport Imbalance\n\n## set probability of each class\nclass_probs = [0.5, 0.2, 0.3] \nnum_rows, num_continuous_feats = 100, 5\n## generate a table and categorical vector accordingly\nX, y = Imbalance.generate_imbalanced_data(num_rows, num_continuous_feats; \n class_probs, rng=42) \n\njulia> Imbalance.checkbalance(y)\n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (39.6%) \n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 33 (68.8%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%) \n\n## load ROSE\nROSE = @load ROSE pkg=Imbalance\n\n## wrap the model in a machine\noversampler = ROSE(s=0.3, ratios=Dict(0=>1.0, 1=> 0.9, 2=>0.8), rng=42)\nmach = machine(oversampler)\n\n## provide the data to transform (there is nothing to fit)\nXover, yover = transform(mach, X, y)\n\njulia> Imbalance.checkbalance(yover)\n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 38 (79.2%) \n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 43 (89.6%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%) ","category":"page"},{"location":"thresholding_probabilistic_predictors/#Thresholding-Probabilistic-Predictors","page":"Thresholding Probabilistic Predictors","title":"Thresholding Probabilistic Predictors","text":"","category":"section"},{"location":"thresholding_probabilistic_predictors/","page":"Thresholding Probabilistic Predictors","title":"Thresholding Probabilistic Predictors","text":"Although one can call predict_mode on a probabilistic binary classifier to get deterministic predictions, a more flexible strategy is to wrap the model using BinaryThresholdPredictor, as this allows the user to specify the threshold probability for predicting a positive class. 
This wrapping converts a probabilistic classifier into a deterministic one.","category":"page"},{"location":"thresholding_probabilistic_predictors/","page":"Thresholding Probabilistic Predictors","title":"Thresholding Probabilistic Predictors","text":"The positive class is always the second class returned when calling levels on the training target y.","category":"page"},{"location":"thresholding_probabilistic_predictors/","page":"Thresholding Probabilistic Predictors","title":"Thresholding Probabilistic Predictors","text":"MLJModels.BinaryThresholdPredictor","category":"page"},{"location":"thresholding_probabilistic_predictors/#MLJModels.BinaryThresholdPredictor","page":"Thresholding Probabilistic Predictors","title":"MLJModels.BinaryThresholdPredictor","text":"BinaryThresholdPredictor(model; threshold=0.5)\n\nWrap the Probabilistic model, model, assumed to support binary classification, as a Deterministic model, by applying the specified threshold to the positive class probability. In addition to conventional supervised classifiers, it can also be applied to outlier detection models that predict normalized scores - in the form of appropriate UnivariateFinite distributions - that is, models that subtype AbstractProbabilisticUnsupervisedDetector or AbstractProbabilisticSupervisedDetector.\n\nBy convention the positive class is the second class returned by levels(y), where y is the target.\n\nIf threshold=0.5 then calling predict on the wrapped model is equivalent to calling predict_mode on the atomic model.\n\nExample\n\nBelow is an application to the well-known Pima Indian diabetes dataset, including optimization of the threshold parameter, with a high balanced accuracy the objective. The target class distribution is 500 positives to 268 negatives.\n\nLoading the data:\n\nusing MLJ, Random\nrng = Xoshiro(123)\n\ndiabetes = OpenML.load(43582)\noutcome, X = unpack(diabetes, ==(:Outcome), rng=rng);\ny = coerce(Int.(outcome), OrderedFactor);\n\nChoosing a probabilistic classifier:\n\nEvoTreesClassifier = @load EvoTreesClassifier\nprob_predictor = EvoTreesClassifier()\n\nWrapping in TunedModel to get a deterministic classifier with threshold as a new hyperparameter:\n\npoint_predictor = BinaryThresholdPredictor(prob_predictor, threshold=0.6)\nXnew, _ = make_moons(3, rng=rng)\nmach = machine(point_predictor, X, y) |> fit!\npredict(mach, X)[1:3] # [0, 0, 0]\n\nEstimating performance:\n\nbalanced = BalancedAccuracy(adjusted=true)\ne = evaluate!(mach, resampling=CV(nfolds=6), measures=[balanced, accuracy])\ne.measurement[1] # 0.405 ± 0.089\n\nWrapping in tuning strategy to learn threshold that maximizes balanced accuracy:\n\nr = range(point_predictor, :threshold, lower=0.1, upper=0.9)\ntuned_point_predictor = TunedModel(\n point_predictor,\n tuning=RandomSearch(rng=rng),\n resampling=CV(nfolds=6),\n range = r,\n measure=balanced,\n n=30,\n)\nmach2 = machine(tuned_point_predictor, X, y) |> fit!\noptimized_point_predictor = report(mach2).best_model\noptimized_point_predictor.threshold # 0.260\npredict(mach2, X)[1:3] # [1, 1, 0]\n\nEstimating the performance of the auto-thresholding model (nested resampling here):\n\ne = evaluate!(mach2, resampling=CV(nfolds=6), measure=[balanced, accuracy])\ne.measurement[1] # 0.477 ± 0.110\n\n\n\n\n\n","category":"type"},{"location":"simple_user_defined_models/#Simple-User-Defined-Models","page":"Simple User Defined Models","title":"Simple User Defined Models","text":"","category":"section"},{"location":"simple_user_defined_models/","page":"Simple User Defined 
Models","title":"Simple User Defined Models","text":"To quickly implement a new supervised model in MLJ, it suffices to:","category":"page"},{"location":"simple_user_defined_models/","page":"Simple User Defined Models","title":"Simple User Defined Models","text":"Define a mutable struct to store hyperparameters. This is either a subtype of Probabilistic or Deterministic, depending on whether probabilistic or ordinary point predictions are intended. This struct is the model.\nDefine a fit method, dispatched on the model, returning learned parameters, also known as the fitresult.\nDefine a predict method, dispatched on the model, and the fitresult, to return predictions on new patterns.","category":"page"},{"location":"simple_user_defined_models/","page":"Simple User Defined Models","title":"Simple User Defined Models","text":"In the examples below, the training input X of fit, and the new input Xnew passed to predict, are tables. Each training target y is an AbstractVector.","category":"page"},{"location":"simple_user_defined_models/","page":"Simple User Defined Models","title":"Simple User Defined Models","text":"The predictions returned by predict have the same form as y for deterministic models, but are Vectors of distributions for probabilistic models.","category":"page"},{"location":"simple_user_defined_models/","page":"Simple User Defined Models","title":"Simple User Defined Models","text":"Advanced model functionality not addressed here includes: (i) optional update method to avoid redundant calculations when calling fit! on machines a second time; (ii) reporting extra training-related statistics; (iii) exposing model-specific functionality; (iv) checking the scientific type of data passed to your model in machine construction; and (iv) checking the validity of hyperparameter values. 
All this is described in Adding Models for General Use.","category":"page"},{"location":"simple_user_defined_models/","page":"Simple User Defined Models","title":"Simple User Defined Models","text":"For an unsupervised model, implement transform and, optionally, inverse_transform using the same signature at predict below.","category":"page"},{"location":"simple_user_defined_models/#A-simple-deterministic-regressor","page":"Simple User Defined Models","title":"A simple deterministic regressor","text":"","category":"section"},{"location":"simple_user_defined_models/","page":"Simple User Defined Models","title":"Simple User Defined Models","text":"Here's a quick-and-dirty implementation of a ridge regressor with no intercept:","category":"page"},{"location":"simple_user_defined_models/","page":"Simple User Defined Models","title":"Simple User Defined Models","text":"using MLJ; color_off() # hide\nimport MLJBase\nusing LinearAlgebra\n\nmutable struct MyRegressor <: MLJBase.Deterministic\n lambda::Float64\nend\nMyRegressor(; lambda=0.1) = MyRegressor(lambda)\n\n# fit returns coefficients minimizing a penalized rms loss function:\nfunction MLJBase.fit(model::MyRegressor, verbosity, X, y)\n x = MLJBase.matrix(X) # convert table to matrix\n fitresult = (x'x + model.lambda*I)\\(x'y) # the coefficients\n cache = nothing\n report = nothing\n return fitresult, cache, report\nend\n\n# predict uses coefficients to make a new prediction:\nMLJBase.predict(::MyRegressor, fitresult, Xnew) = MLJBase.matrix(Xnew) * fitresult\nnothing # hide","category":"page"},{"location":"simple_user_defined_models/","page":"Simple User Defined Models","title":"Simple User Defined Models","text":"After loading this code, all MLJ's basic meta-algorithms can be applied to MyRegressor:","category":"page"},{"location":"simple_user_defined_models/","page":"Simple User Defined Models","title":"Simple User Defined Models","text":"using MLJ # hide\nX, y = @load_boston;\nmodel = MyRegressor(lambda=1.0)\nregressor = machine(model, X, y)\nevaluate!(regressor, resampling=CV(), measure=rms, verbosity=0)","category":"page"},{"location":"simple_user_defined_models/#A-simple-probabilistic-classifier","page":"Simple User Defined Models","title":"A simple probabilistic classifier","text":"","category":"section"},{"location":"simple_user_defined_models/","page":"Simple User Defined Models","title":"Simple User Defined Models","text":"The following probabilistic model simply fits a probability distribution to the MultiClass training target (i.e., ignores X) and returns this pdf for any new pattern:","category":"page"},{"location":"simple_user_defined_models/","page":"Simple User Defined Models","title":"Simple User Defined Models","text":"using MLJ # hide\nimport MLJBase\nimport Distributions\n\nstruct MyClassifier <: MLJBase.Probabilistic\nend\n\n# `fit` ignores the inputs X and returns the training target y\n# probability distribution:\nfunction MLJBase.fit(model::MyClassifier, verbosity, X, y)\n fitresult = Distributions.fit(MLJBase.UnivariateFinite, y)\n cache = nothing\n report = nothing\n return fitresult, cache, report\nend\n\n# `predict` returns the passed fitresult (pdf) for all new patterns:\nMLJBase.predict(model::MyClassifier, fitresult, Xnew) =\n [fitresult for r in 1:nrows(Xnew)]","category":"page"},{"location":"simple_user_defined_models/","page":"Simple User Defined Models","title":"Simple User Defined Models","text":"X, y = @load_iris;\nmach = machine(MyClassifier(), X, y) |> fit!;\npredict(mach, selectrows(X, 
1:2))","category":"page"},{"location":"models/BayesianRidgeRegressor_MLJScikitLearnInterface/#BayesianRidgeRegressor_MLJScikitLearnInterface","page":"BayesianRidgeRegressor","title":"BayesianRidgeRegressor","text":"","category":"section"},{"location":"models/BayesianRidgeRegressor_MLJScikitLearnInterface/","page":"BayesianRidgeRegressor","title":"BayesianRidgeRegressor","text":"BayesianRidgeRegressor","category":"page"},{"location":"models/BayesianRidgeRegressor_MLJScikitLearnInterface/","page":"BayesianRidgeRegressor","title":"BayesianRidgeRegressor","text":"A model type for constructing a Bayesian ridge regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/BayesianRidgeRegressor_MLJScikitLearnInterface/","page":"BayesianRidgeRegressor","title":"BayesianRidgeRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/BayesianRidgeRegressor_MLJScikitLearnInterface/","page":"BayesianRidgeRegressor","title":"BayesianRidgeRegressor","text":"BayesianRidgeRegressor = @load BayesianRidgeRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/BayesianRidgeRegressor_MLJScikitLearnInterface/","page":"BayesianRidgeRegressor","title":"BayesianRidgeRegressor","text":"Do model = BayesianRidgeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BayesianRidgeRegressor(max_iter=...).","category":"page"},{"location":"models/BayesianRidgeRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"BayesianRidgeRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/BayesianRidgeRegressor_MLJScikitLearnInterface/","page":"BayesianRidgeRegressor","title":"BayesianRidgeRegressor","text":"max_iter = 300\ntol = 0.001\nalpha_1 = 1.0e-6\nalpha_2 = 1.0e-6\nlambda_1 = 1.0e-6\nlambda_2 = 1.0e-6\ncompute_score = false\nfit_intercept = true\ncopy_X = true\nverbose = false","category":"page"},{"location":"models/RidgeCVClassifier_MLJScikitLearnInterface/#RidgeCVClassifier_MLJScikitLearnInterface","page":"RidgeCVClassifier","title":"RidgeCVClassifier","text":"","category":"section"},{"location":"models/RidgeCVClassifier_MLJScikitLearnInterface/","page":"RidgeCVClassifier","title":"RidgeCVClassifier","text":"RidgeCVClassifier","category":"page"},{"location":"models/RidgeCVClassifier_MLJScikitLearnInterface/","page":"RidgeCVClassifier","title":"RidgeCVClassifier","text":"A model type for constructing a ridge regression classifier with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/RidgeCVClassifier_MLJScikitLearnInterface/","page":"RidgeCVClassifier","title":"RidgeCVClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/RidgeCVClassifier_MLJScikitLearnInterface/","page":"RidgeCVClassifier","title":"RidgeCVClassifier","text":"RidgeCVClassifier = @load RidgeCVClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/RidgeCVClassifier_MLJScikitLearnInterface/","page":"RidgeCVClassifier","title":"RidgeCVClassifier","text":"Do model = RidgeCVClassifier() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in RidgeCVClassifier(alphas=...).","category":"page"},{"location":"models/RidgeCVClassifier_MLJScikitLearnInterface/#Hyper-parameters","page":"RidgeCVClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/RidgeCVClassifier_MLJScikitLearnInterface/","page":"RidgeCVClassifier","title":"RidgeCVClassifier","text":"alphas = [0.1, 1.0, 10.0]\nfit_intercept = true\nscoring = nothing\ncv = 5\nclass_weight = nothing\nstore_cv_values = false","category":"page"},{"location":"models/ICA_MultivariateStats/#ICA_MultivariateStats","page":"ICA","title":"ICA","text":"","category":"section"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"ICA","category":"page"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"A model type for constructing a independent component analysis model, based on MultivariateStats.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"ICA = @load ICA pkg=MultivariateStats","category":"page"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"Do model = ICA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ICA(outdim=...).","category":"page"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"Independent component analysis is a computational technique for separating a multivariate signal into additive subcomponents, with the assumption that the subcomponents are non-Gaussian and independent from each other.","category":"page"},{"location":"models/ICA_MultivariateStats/#Training-data","page":"ICA","title":"Training data","text":"","category":"section"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"mach = machine(model, X)","category":"page"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"Here:","category":"page"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).","category":"page"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/ICA_MultivariateStats/#Hyper-parameters","page":"ICA","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"outdim::Int=0: The number of independent components to recover, set automatically if 0.\nalg::Symbol=:fastica: The algorithm to use (only :fastica is supported at the moment).\nfun::Symbol=:tanh: The approximate neg-entropy function, one of :tanh, :gaus.\ndo_whiten::Bool=true: Whether or not to perform pre-whitening.\nmaxiter::Int=100: The maximum number of iterations.\ntol::Real=1e-6: The convergence tolerance for change in the unmixing matrix W.\nmean::Union{Nothing, Real, Vector{Float64}}=nothing: mean to use, if nothing (default) centering is computed and applied, 
if zero, no centering; otherwise a vector of means can be passed.\nwinit::Union{Nothing,Matrix{<:Real}}=nothing: Initial guess for the unmixing matrix W: either an empty matrix (for random initialization of W), a matrix of size k × k (if do_whiten is true), or a matrix of size m × k (if do_whiten is false). Here m is the number of components (columns) of the input and k is the value of outdim.","category":"page"},{"location":"models/ICA_MultivariateStats/#Operations","page":"ICA","title":"Operations","text":"","category":"section"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"transform(mach, Xnew): Return the component-separated version of input Xnew, which should have the same scitype as X above.","category":"page"},{"location":"models/ICA_MultivariateStats/#Fitted-parameters","page":"ICA","title":"Fitted parameters","text":"","category":"section"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"projection: The estimated component matrix.\nmean: The estimated mean vector.","category":"page"},{"location":"models/ICA_MultivariateStats/#Report","page":"ICA","title":"Report","text":"","category":"section"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"The fields of report(mach) are:","category":"page"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"indim: Dimension (number of columns) of the training data and new data to be transformed.\noutdim: Dimension of transformed data.\nmean: The mean of the untransformed training data, of length indim.","category":"page"},{"location":"models/ICA_MultivariateStats/#Examples","page":"ICA","title":"Examples","text":"","category":"section"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"using MLJ\n\nICA = @load ICA pkg=MultivariateStats\n\ntimes = range(0, 8, length=2000)\n\nsine_wave = sin.(2*times)\nsquare_wave = sign.(sin.(3*times))\nsawtooth_wave = map(t -> mod(2t, 2) - 1, times)\nsignals = hcat(sine_wave, square_wave, sawtooth_wave)\nnoisy_signals = signals + 0.2*randn(size(signals))\n\nmixing_matrix = [ 1 1 1; 0.5 2 1; 1.5 1 2]\nX = MLJ.table(noisy_signals*mixing_matrix)\n\nmodel = ICA(outdim = 3, tol=0.1)\nmach = machine(model, X) |> fit!\n\nX_unmixed = transform(mach, X)\n\nusing Plots\n\nplot(X.x1)\nplot(X.x2)\nplot(X.x3)\n\nplot(X_unmixed.x1)\nplot(X_unmixed.x2)\nplot(X_unmixed.x3)\n","category":"page"},{"location":"models/ICA_MultivariateStats/","page":"ICA","title":"ICA","text":"See also PCA, KernelPCA, FactorAnalysis, PPCA","category":"page"},{"location":"models/LarsCVRegressor_MLJScikitLearnInterface/#LarsCVRegressor_MLJScikitLearnInterface","page":"LarsCVRegressor","title":"LarsCVRegressor","text":"","category":"section"},{"location":"models/LarsCVRegressor_MLJScikitLearnInterface/","page":"LarsCVRegressor","title":"LarsCVRegressor","text":"LarsCVRegressor","category":"page"},{"location":"models/LarsCVRegressor_MLJScikitLearnInterface/","page":"LarsCVRegressor","title":"LarsCVRegressor","text":"A model type for constructing a least angle regressor with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LarsCVRegressor_MLJScikitLearnInterface/","page":"LarsCVRegressor","title":"LarsCVRegressor","text":"From MLJ, the type can be imported 
using","category":"page"},{"location":"models/LarsCVRegressor_MLJScikitLearnInterface/","page":"LarsCVRegressor","title":"LarsCVRegressor","text":"LarsCVRegressor = @load LarsCVRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/LarsCVRegressor_MLJScikitLearnInterface/","page":"LarsCVRegressor","title":"LarsCVRegressor","text":"Do model = LarsCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LarsCVRegressor(fit_intercept=...).","category":"page"},{"location":"models/LarsCVRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"LarsCVRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/LarsCVRegressor_MLJScikitLearnInterface/","page":"LarsCVRegressor","title":"LarsCVRegressor","text":"fit_intercept = true\nverbose = false\nmax_iter = 500\nprecompute = auto\ncv = 5\nmax_n_alphas = 1000\nn_jobs = nothing\neps = 2.220446049250313e-16\ncopy_X = true","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/#LogisticClassifier_MLJLinearModels","page":"LogisticClassifier","title":"LogisticClassifier","text":"","category":"section"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"LogisticClassifier","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"A model type for constructing a logistic classifier, based on MLJLinearModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"LogisticClassifier = @load LogisticClassifier pkg=MLJLinearModels","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"Do model = LogisticClassifier() to construct an instance with default hyper-parameters.","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"This model is more commonly known as \"logistic regression\". It is a standard classifier for both binary and multiclass classification. The objective function applies either a logistic loss (binary target) or multinomial (softmax) loss, and has a mixed L1/L2 penalty:","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"$","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"L(y, Xθ) + n⋅λ|θ|₂²/2 + n⋅γ|θ|₁ $","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":".","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"Here L is either MLJLinearModels.LogisticLoss or MLJLinearModels.MultiClassLoss, λ and γ indicate the strength of the L2 (resp. 
L1) regularization components and n is the number of training observations.","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"With scale_penalty_with_samples = false the objective function is instead","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"$","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"L(y, Xθ) + λ|θ|₂²/2 + γ|θ|₁ $","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":".","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/#Training-data","page":"LogisticClassifier","title":"Training data","text":"","category":"section"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"where:","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)\ny is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/#Hyperparameters","page":"LogisticClassifier","title":"Hyperparameters","text":"","category":"section"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"lambda::Real: strength of the regularizer if penalty is :l2 or :l1 and strength of the L2 regularizer if penalty is :en. Default: eps()\ngamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0\npenalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2\nfit_intercept::Bool: whether to fit the intercept or not. Default: true\npenalize_intercept::Bool: whether to penalize the intercept. Default: false\nscale_penalty_with_samples::Bool: whether to scale the penalty with the number of samples. Default: true\nsolver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S where S is one of: LBFGS, Newton, NewtonCG, ProxGrad; but subject to the following restrictions:\nIf penalty = :l2, ProxGrad is disallowed. Otherwise, ProxGrad is the only option.\nUnless scitype(y) <: Finite{2} (binary target) Newton is disallowed.\nIf solver = nothing (default) then ProxGrad(accel=true) (FISTA) is used, unless gamma = 0, in which case LBFGS() is used.\nSolver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) 
Default: nothing","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/#Example","page":"LogisticClassifier","title":"Example","text":"","category":"section"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"using MLJ\nX, y = make_blobs(centers = 2)\nmach = fit!(machine(LogisticClassifier(), X, y))\npredict(mach, X)\nfitted_params(mach)","category":"page"},{"location":"models/LogisticClassifier_MLJLinearModels/","page":"LogisticClassifier","title":"LogisticClassifier","text":"See also MultinomialClassifier.","category":"page"},{"location":"models/BaggingRegressor_MLJScikitLearnInterface/#BaggingRegressor_MLJScikitLearnInterface","page":"BaggingRegressor","title":"BaggingRegressor","text":"","category":"section"},{"location":"models/BaggingRegressor_MLJScikitLearnInterface/","page":"BaggingRegressor","title":"BaggingRegressor","text":"BaggingRegressor","category":"page"},{"location":"models/BaggingRegressor_MLJScikitLearnInterface/","page":"BaggingRegressor","title":"BaggingRegressor","text":"A model type for constructing a bagging ensemble regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/BaggingRegressor_MLJScikitLearnInterface/","page":"BaggingRegressor","title":"BaggingRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/BaggingRegressor_MLJScikitLearnInterface/","page":"BaggingRegressor","title":"BaggingRegressor","text":"BaggingRegressor = @load BaggingRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/BaggingRegressor_MLJScikitLearnInterface/","page":"BaggingRegressor","title":"BaggingRegressor","text":"Do model = BaggingRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BaggingRegressor(estimator=...).","category":"page"},{"location":"models/BaggingRegressor_MLJScikitLearnInterface/","page":"BaggingRegressor","title":"BaggingRegressor","text":"A Bagging regressor is an ensemble meta-estimator that fits base regressors each on random subsets of the original dataset and then aggregate their individual predictions (either by voting or by averaging) to form a final prediction. 
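A minimal usage sketch with default hyper-parameters follows (make_regression is used here only as a convenient source of synthetic data, and a working MLJScikitLearnInterface.jl installation, with its scikit-learn dependency, is assumed):\n\nusing MLJ\nBaggingRegressor = @load BaggingRegressor pkg=MLJScikitLearnInterface\nX, y = make_regression(100, 5)  ## synthetic table and continuous target\nmach = machine(BaggingRegressor(), X, y) |> fit!\npredict(mach, X)[1:3]\n\n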
Such a meta-estimator can typically be used as a way to reduce the variance of a black-box estimator (e.g., a decision tree), by introducing randomization into its construction procedure and then making an ensemble out of it.","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/#KNNClassifier_NearestNeighborModels","page":"KNNClassifier","title":"KNNClassifier","text":"","category":"section"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"KNNClassifier","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"A model type for constructing a K-nearest neighbor classifier, based on NearestNeighborModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"KNNClassifier = @load KNNClassifier pkg=NearestNeighborModels","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"Do model = KNNClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KNNClassifier(K=...).","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"KNNClassifier implements K-Nearest Neighbors classifier which is non-parametric algorithm that predicts a discrete class distribution associated with a new point by taking a vote over the classes of the k-nearest points. Each neighbor vote is assigned a weight based on proximity of the neighbor point to the test point according to a specified distance metric.","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"For more information about the weighting kernels, see the paper by Geler et.al Comparison of different weighting schemes for the kNN classifier on time-series data. 
","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/#Training-data","page":"KNNClassifier","title":"Training data","text":"","category":"section"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"OR","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"mach = machine(model, X, y, w)","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"Here:","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).\ny is the target, which can be any AbstractVector whose element scitype is <:Finite (<:Multiclass or <:OrderedFactor will do); check the scitype with scitype(y)\nw is the observation weights which can either be nothing (default) or an AbstractVector whose element scitype is Count or Continuous. This is different from weights kernel which is a model hyperparameter, see below.","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/#Hyper-parameters","page":"KNNClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"K::Int=5 : number of neighbors\nalgorithm::Symbol = :kdtree : one of (:kdtree, :brutetree, :balltree)\nmetric::Metric = Euclidean() : any Metric from Distances.jl for the distance between points. For algorithm = :kdtree only metrics which are instances of Union{Distances.Chebyshev, Distances.Cityblock, Distances.Euclidean, Distances.Minkowski, Distances.WeightedCityblock, Distances.WeightedEuclidean, Distances.WeightedMinkowski} are supported.\nleafsize::Int = algorithm == 10 : determines the number of points at which to stop splitting the tree. This option is ignored and always taken as 0 for algorithm = :brutetree, since brutetree isn't actually a tree.\nreorder::Bool = true : if true then points which are close in distance are placed close in memory. In this case, a copy of the original data will be made so that the original data is left unmodified. Setting this to true can significantly improve performance of the specified algorithm (except :brutetree). This option is ignored and always taken as false for algorithm = :brutetree.\nweights::KNNKernel=Uniform() : kernel used in assigning weights to the k-nearest neighbors for each observation. An instance of one of the types in list_kernels(). User-defined weighting functions can be passed by wrapping the function in a UserDefinedKernel kernel (do ?NearestNeighborModels.UserDefinedKernel for more info). 
If observation weights w are passed during machine construction then the weight assigned to each neighbor vote is the product of the kernel generated weight for that neighbor and the corresponding observation weight.","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/#Operations","page":"KNNClassifier","title":"Operations","text":"","category":"section"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"predict(mach, Xnew): Return predictions of the target given features Xnew, which should have same scitype as X above. Predictions are probabilistic but uncalibrated.\npredict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/#Fitted-parameters","page":"KNNClassifier","title":"Fitted parameters","text":"","category":"section"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"tree: An instance of either KDTree, BruteTree or BallTree depending on the value of the algorithm hyperparameter (See hyper-parameters section above). These are data structures that stores the training data with the view of making quicker nearest neighbor searches on test data points.","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/#Examples","page":"KNNClassifier","title":"Examples","text":"","category":"section"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"using MLJ\nKNNClassifier = @load KNNClassifier pkg=NearestNeighborModels\nX, y = @load_crabs; ## a table and a vector from the crabs dataset\n## view possible kernels\nNearestNeighborModels.list_kernels()\n## KNNClassifier instantiation\nmodel = KNNClassifier(weights = NearestNeighborModels.Inverse())\nmach = machine(model, X, y) |> fit! ## wrap model and required data in an MLJ machine and fit\ny_hat = predict(mach, X)\nlabels = predict_mode(mach, X)\n","category":"page"},{"location":"models/KNNClassifier_NearestNeighborModels/","page":"KNNClassifier","title":"KNNClassifier","text":"See also MultitargetKNNClassifier","category":"page"},{"location":"models/KMedoids_Clustering/#KMedoids_Clustering","page":"KMedoids","title":"KMedoids","text":"","category":"section"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"KMedoids","category":"page"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"A model type for constructing a K-medoids clusterer, based on Clustering.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"KMedoids = @load KMedoids pkg=Clustering","category":"page"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"Do model = KMedoids() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in KMedoids(k=...).","category":"page"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"K-medoids is a clustering algorithm that works by finding k data points (called medoids) such that the total distance between each data point and the closest medoid is minimal.","category":"page"},{"location":"models/KMedoids_Clustering/#Training-data","page":"KMedoids","title":"Training data","text":"","category":"section"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"mach = machine(model, X)","category":"page"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"Here:","category":"page"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X)","category":"page"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/KMedoids_Clustering/#Hyper-parameters","page":"KMedoids","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"k=3: The number of centroids to use in clustering.\nmetric::SemiMetric=Distances.SqEuclidean: The metric used to calculate the clustering. Must have type PreMetric from Distances.jl.\ninit (defaults to :kmpp): how medoids should be initialized, could be one of the following:\n:kmpp: KMeans++\n:kmenc: K-medoids initialization based on centrality\n:rand: random\nan instance of Clustering.SeedingAlgorithm from Clustering.jl\nan integer vector of length k that provides the indices of points to use as initial medoids.\nSee documentation of Clustering.jl.","category":"page"},{"location":"models/KMedoids_Clustering/#Operations","page":"KMedoids","title":"Operations","text":"","category":"section"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"predict(mach, Xnew): return cluster label assignments, given new features Xnew having the same Scitype as X above.\ntransform(mach, Xnew): instead return the mean pairwise distances from new samples to the cluster centers.","category":"page"},{"location":"models/KMedoids_Clustering/#Fitted-parameters","page":"KMedoids","title":"Fitted parameters","text":"","category":"section"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"medoids: The coordinates of the cluster medoids.","category":"page"},{"location":"models/KMedoids_Clustering/#Report","page":"KMedoids","title":"Report","text":"","category":"section"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"The fields of report(mach) are:","category":"page"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"assignments: The cluster assignments of each point in the training data.\ncluster_labels: The labels assigned to each 
cluster.","category":"page"},{"location":"models/KMedoids_Clustering/#Examples","page":"KMedoids","title":"Examples","text":"","category":"section"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"using MLJ\nKMedoids = @load KMedoids pkg=Clustering\n\ntable = load_iris()\ny, X = unpack(table, ==(:target), rng=123)\nmodel = KMedoids(k=3)\nmach = machine(model, X) |> fit!\n\nyhat = predict(mach, X)\n@assert yhat == report(mach).assignments\n\ncompare = zip(yhat, y) |> collect;\ncompare[1:8] ## clusters align with classes\n\ncenter_dists = transform(mach, fitted_params(mach).medoids')\n\n@assert center_dists[1][1] == 0.0\n@assert center_dists[2][2] == 0.0\n@assert center_dists[3][3] == 0.0","category":"page"},{"location":"models/KMedoids_Clustering/","page":"KMedoids","title":"KMedoids","text":"See also KMeans","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/#RandomWalkOversampler_Imbalance","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"","category":"section"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"Initiate a RandomWalkOversampler model with the given hyper-parameters.","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"RandomWalkOversampler","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"A model type for constructing a random walk oversampler, based on Imbalance.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"RandomWalkOversampler = @load RandomWalkOversampler pkg=Imbalance","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"Do model = RandomWalkOversampler() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomWalkOversampler(ratios=...).","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"RandomWalkOversampler implements the random walk oversampling algorithm to correct for class imbalance as in Zhang, H., & Li, M. (2014). RWO-Sampling: A random walk over-sampling approach to imbalanced data classification. 
Information Fusion, 25, 4-20.","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/#Training-data","page":"RandomWalkOversampler","title":"Training data","text":"","category":"section"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"In MLJ or MLJBase, wrap the model in a machine by","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"mach = machine(model)","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"There is no need to provide any data here because the model is a static transformer.","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"Likewise, there is no need to fit!(mach).","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"For default values of the hyper-parameters, model can be constructed by","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"model = RandomWalkOversampler()","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/#Hyperparameters","page":"RandomWalkOversampler","title":"Hyperparameters","text":"","category":"section"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"ratios=1.0: A parameter that controls the amount of oversampling to be done for each class\nCan be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class\nCan be a dictionary mapping each class label to the float ratio for that class\nrng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister`.","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/#Transform-Inputs","page":"RandomWalkOversampler","title":"Transform Inputs","text":"","category":"section"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"X: A table with element scitypes that subtype Union{Finite, Infinite}. 
Elements in nominal columns should subtype Finite (i.e., have scitype OrderedFactor or Multiclass) and","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":" elements in continuous columns should subtype `Infinite` (i.e., have \n [scitype](https://juliaai.github.io/ScientificTypes.jl/) `Count` or `Continuous`).","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"y: An abstract vector of labels (e.g., strings) that correspond to the observations in X","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/#Transform-Outputs","page":"RandomWalkOversampler","title":"Transform Outputs","text":"","category":"section"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"Xover: A matrix or table that includes original data and the new observations due to oversampling. depending on whether the input X is a matrix or table respectively\nyover: An abstract vector of labels corresponding to Xover","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/#Operations","page":"RandomWalkOversampler","title":"Operations","text":"","category":"section"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"transform(mach, X, y): resample the data X and y using RandomWalkOversampler, returning both the new and original observations","category":"page"},{"location":"models/RandomWalkOversampler_Imbalance/#Example","page":"RandomWalkOversampler","title":"Example","text":"","category":"section"},{"location":"models/RandomWalkOversampler_Imbalance/","page":"RandomWalkOversampler","title":"RandomWalkOversampler","text":"using MLJ\nusing ScientificTypes\nimport Imbalance\n\n## set probability of each class\nclass_probs = [0.5, 0.2, 0.3] \nnum_rows = 100\nnum_continuous_feats = 3\n## want two categorical features with three and two possible values respectively\nnum_vals_per_category = [3, 2]\n\n## generate a table and categorical vector accordingly\nX, y = Imbalance.generate_imbalanced_data(num_rows, num_continuous_feats; \n class_probs, num_vals_per_category, rng=42) \njulia> Imbalance.checkbalance(y)\n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (39.6%) \n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 33 (68.8%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%) \n\n\njulia> ScientificTypes.schema(X).scitypes\n(Continuous, Continuous, Continuous, Continuous, Continuous)\n## coerce nominal columns to a finite scitype (multiclass or ordered factor)\nX = coerce(X, :Column4=>Multiclass, :Column5=>Multiclass)\n\n## load RandomWalkOversampler model type:\nRandomWalkOversampler = @load RandomWalkOversampler pkg=Imbalance\n\n## oversample the minority classes to sizes relative to the majority class:\noversampler = RandomWalkOversampler(ratios = Dict(0=>1.0, 1=> 0.9, 2=>0.8), rng = 42)\nmach = machine(oversampler)\nXover, yover = transform(mach, X, y)\n\njulia> Imbalance.checkbalance(yover)\n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 38 (79.2%) \n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 43 (89.6%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 
(100.0%)","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/#EvoTreeRegressor_EvoTrees","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"","category":"section"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"EvoTreeRegressor(;kwargs...)","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"A model type for constructing a EvoTreeRegressor, based on EvoTrees.jl, and implementing both an internal API and the MLJ model interface.","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/#Hyper-parameters","page":"EvoTreeRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"loss=:mse: Loss to be be minimized during training. One of:\n:mse\n:logloss\n:gamma\n:tweedie\n:quantile\n:l1\nnrounds=100: Number of rounds. It corresponds to the number of trees that will be sequentially stacked. Must be >= 1.\neta=0.1: Learning rate. Each tree raw predictions are scaled by eta prior to be added to the stack of predictions. Must be > 0. A lower eta results in slower learning, requiring a higher nrounds but typically improves model performance.\nL2::T=0.0: L2 regularization factor on aggregate gain. Must be >= 0. Higher L2 can result in a more robust model.\nlambda::T=0.0: L2 regularization factor on individual gain. Must be >= 0. Higher lambda can result in a more robust model.\ngamma::T=0.0: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model. Must be >= 0.\nalpha::T=0.5: Loss specific parameter in the [0, 1] range: - :quantile: target quantile for the regression. - :l1: weighting parameters to positive vs negative residuals. - Positive residual weights = alpha - Negative residual weights = (1 - alpha)\nmax_depth=6: Maximum depth of a tree. Must be >= 1. A tree of depth 1 is made of a single prediction leaf. A complete tree of depth N contains 2^(N - 1) terminal leaves and 2^(N - 1) - 1 split nodes. Compute cost is proportional to 2^max_depth. Typical optimal values are in the 3 to 9 range.\nmin_weight=1.0: Minimum weight needed in a node to perform a split. Matches the number of observations by default or the sum of weights as provided by the weights vector. Must be > 0.\nrowsample=1.0: Proportion of rows that are sampled at each iteration to build the tree. Should be in ]0, 1].\ncolsample=1.0: Proportion of columns / features that are sampled at each iteration to build the tree. Should be in ]0, 1].\nnbins=64: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins. Should be between 2 and 255.\nmonotone_constraints=Dict{Int, Int}(): Specify monotonic constraints using a dict where the key is the feature index and the value the applicable constraint (-1=decreasing, 0=none, 1=increasing). Only :linear, :logistic, :gamma and tweedie losses are supported at the moment.\ntree_type=\"binary\" Tree structure to be used. One of:\nbinary: Each node of a tree is grown independently. 
Tree are built depthwise until max depth is reach or if min weight or gain (see gamma) stops further node splits.\noblivious: A common splitting condition is imposed to all nodes of a given depth.\nrng=123: Either an integer used as a seed to the random number generator or an actual random number generator (::Random.AbstractRNG).","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/#Internal-API","page":"EvoTreeRegressor","title":"Internal API","text":"","category":"section"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"Do config = EvoTreeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeRegressor(loss=...).","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/#Training-model","page":"EvoTreeRegressor","title":"Training model","text":"","category":"section"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"A model is built using fit_evotree:","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"model = fit_evotree(config; x_train, y_train, kwargs...)","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/#Inference","page":"EvoTreeRegressor","title":"Inference","text":"","category":"section"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"Predictions are obtained using predict which returns a Vector of length nobs:","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"EvoTrees.predict(model, X)","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"Alternatively, models act as a functor, returning predictions when called as a function with features as argument:","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"model(X)","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/#MLJ-Interface","page":"EvoTreeRegressor","title":"MLJ Interface","text":"","category":"section"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"From MLJ, the type can be imported using:","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"EvoTreeRegressor = @load EvoTreeRegressor pkg=EvoTrees","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"Do model = EvoTreeRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeRegressor(loss=...).","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/#Training-model-2","page":"EvoTreeRegressor","title":"Training model","text":"","category":"section"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"In MLJ or MLJBase, bind an instance model to data with mach = machine(model, X, y) where","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)\ny: is the target, which can be any AbstractVector whose element scitype is <:Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/#Operations","page":"EvoTreeRegressor","title":"Operations","text":"","category":"section"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above. Predictions are deterministic.","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/#Fitted-parameters","page":"EvoTreeRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":":fitresult: The GBTree object returned by EvoTrees.jl fitting algorithm.","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/#Report","page":"EvoTreeRegressor","title":"Report","text":"","category":"section"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"The fields of report(mach) are:","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":":features: The names of the features encountered in training.","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/#Examples","page":"EvoTreeRegressor","title":"Examples","text":"","category":"section"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"## Internal API\nusing EvoTrees\nconfig = EvoTreeRegressor(max_depth=5, nbins=32, nrounds=100)\nnobs, nfeats = 1_000, 5\nx_train, y_train = randn(nobs, nfeats), rand(nobs)\nmodel = fit_evotree(config; x_train, y_train)\npreds = EvoTrees.predict(model, x_train)","category":"page"},{"location":"models/EvoTreeRegressor_EvoTrees/","page":"EvoTreeRegressor","title":"EvoTreeRegressor","text":"## MLJ Interface\nusing MLJ\nEvoTreeRegressor = @load EvoTreeRegressor pkg=EvoTrees\nmodel = EvoTreeRegressor(max_depth=5, nbins=32, nrounds=100)\nX, y = @load_boston\nmach = machine(model, X, y) |> fit!\npreds = predict(mach, 
X)","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/#KNNDetector_OutlierDetectionNeighbors","page":"KNNDetector","title":"KNNDetector","text":"","category":"section"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"KNNDetector(k=5,\n metric=Euclidean,\n algorithm=:kdtree,\n leafsize=10,\n reorder=true,\n reduction=:maximum)","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"Calculate the anomaly score of an instance based on the distance to its k-nearest neighbors.","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/#Parameters","page":"KNNDetector","title":"Parameters","text":"","category":"section"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"k::Integer","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"Number of neighbors (must be greater than 0).","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"metric::Metric","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"This is one of the Metric types defined in the Distances.jl package. It is possible to define your own metrics by creating new types that are subtypes of Metric.","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"algorithm::Symbol","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"One of (:kdtree, :balltree). In a kdtree, points are recursively split into groups using hyper-planes. Therefore a KDTree only works with axis aligned metrics which are: Euclidean, Chebyshev, Minkowski and Cityblock. A brutetree linearly searches all points in a brute force fashion and works with any Metric. A balltree recursively splits points into groups bounded by hyper-spheres and works with any Metric.","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"static::Union{Bool, Symbol}","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"One of (true, false, :auto). Whether the input data for fitting and transform should be statically or dynamically allocated. If true, the data is statically allocated. If false, the data is dynamically allocated. If :auto, the data is dynamically allocated if the product of all dimensions except the last is greater than 100.","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"leafsize::Int","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"Determines at what number of points to stop splitting the tree further. 
There is a trade-off between traversing the tree and having to evaluate the metric function for increasing number of points.","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"reorder::Bool","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"While building the tree this will put points close in distance close in memory since this helps with cache locality. In this case, a copy of the original data will be made so that the original data is left unmodified. This can have a significant impact on performance and is by default set to true.","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"parallel::Bool","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"Parallelize score and predict using all threads available. The number of threads can be set with the JULIA_NUM_THREADS environment variable. Note: fit is not parallel.","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"reduction::Symbol","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"One of (:maximum, :median, :mean). (reduction=:maximum) was proposed by [1]. Angiulli et al. [2] proposed sum to reduce the distances, but mean has been implemented for numerical stability.","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/#Examples","page":"KNNDetector","title":"Examples","text":"","category":"section"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"using OutlierDetection: KNNDetector, fit, transform\ndetector = KNNDetector()\nX = rand(10, 100)\nmodel, result = fit(detector, X; verbosity=0)\ntest_scores = transform(detector, model, X)","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/#References","page":"KNNDetector","title":"References","text":"","category":"section"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"[1] Ramaswamy, Sridhar; Rastogi, Rajeev; Shim, Kyuseok (2000): Efficient Algorithms for Mining Outliers from Large Data Sets.","category":"page"},{"location":"models/KNNDetector_OutlierDetectionNeighbors/","page":"KNNDetector","title":"KNNDetector","text":"[2] Angiulli, Fabrizio; Pizzuti, Clara (2002): Fast Outlier Detection in High Dimensional Spaces.","category":"page"},{"location":"models/RANSACRegressor_MLJScikitLearnInterface/#RANSACRegressor_MLJScikitLearnInterface","page":"RANSACRegressor","title":"RANSACRegressor","text":"","category":"section"},{"location":"models/RANSACRegressor_MLJScikitLearnInterface/","page":"RANSACRegressor","title":"RANSACRegressor","text":"RANSACRegressor","category":"page"},{"location":"models/RANSACRegressor_MLJScikitLearnInterface/","page":"RANSACRegressor","title":"RANSACRegressor","text":"A model type for constructing a ransac regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/RANSACRegressor_MLJScikitLearnInterface/","page":"RANSACRegressor","title":"RANSACRegressor","text":"From MLJ, the type can be imported 
using","category":"page"},{"location":"models/RANSACRegressor_MLJScikitLearnInterface/","page":"RANSACRegressor","title":"RANSACRegressor","text":"RANSACRegressor = @load RANSACRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/RANSACRegressor_MLJScikitLearnInterface/","page":"RANSACRegressor","title":"RANSACRegressor","text":"Do model = RANSACRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RANSACRegressor(estimator=...).","category":"page"},{"location":"models/RANSACRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"RANSACRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/RANSACRegressor_MLJScikitLearnInterface/","page":"RANSACRegressor","title":"RANSACRegressor","text":"estimator = nothing\nmin_samples = 5\nresidual_threshold = nothing\nis_data_valid = nothing\nis_model_valid = nothing\nmax_trials = 100\nmax_skips = 9223372036854775807\nstop_n_inliers = 9223372036854775807\nstop_score = Inf\nstop_probability = 0.99\nloss = absolute_error\nrandom_state = nothing","category":"page"},{"location":"models/NuSVR_LIBSVM/#NuSVR_LIBSVM","page":"NuSVR","title":"NuSVR","text":"","category":"section"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"NuSVR","category":"page"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"A model type for constructing a ν-support vector regressor, based on LIBSVM.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"NuSVR = @load NuSVR pkg=LIBSVM","category":"page"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"Do model = NuSVR() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in NuSVR(kernel=...).","category":"page"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): \"LIBSVM: a library for support vector machines.\" ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf. 
","category":"page"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"This model is a re-parameterization of EpsilonSVR in which the epsilon hyper-parameter is replaced with a new parameter nu (denoted ν in the cited reference) which attempts to control the number of support vectors directly.","category":"page"},{"location":"models/NuSVR_LIBSVM/#Training-data","page":"NuSVR","title":"Training data","text":"","category":"section"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"In MLJ or MLJBase, bind an instance model to data with:","category":"page"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"where","category":"page"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)\ny: is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/NuSVR_LIBSVM/#Hyper-parameters","page":"NuSVR","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"kernel=LIBSVM.Kernel.RadialBasis: either an object that can be\ncalled, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see \"Examples\" below).\nLIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2\nLIBSVM.Kernel.Polynomial: (x1, x2) -> gamma*x1'*x2 + coef0)^degree\nLIBSVM.Kernel.RadialBasis: (x1, x2) -> (exp(-gamma*norm(x1 - x2)^2))\nLIBSVM.Kernel.Sigmoid: (x1, x2) - > tanh(gamma*x1'*x2 + coef0)\nHere gamma, coef0, degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions. See LIVSVM.jl issue91\ngamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. Actual value used appears in the report (see below).\ncoef0 = 0.0: kernel parameter (see above)\ndegree::Int32 = Int32(3): degree in polynomial kernel (see above)\ncost=1.0 (range (0, Inf)): the parameter denoted C in the cited reference; for greater regularization, decrease cost\nnu=0.5 (range (0, 1]): An upper bound on the fraction of training errors and a lower bound of the fraction of support vectors. Denoted ν in the cited paper. 
Changing nu changes the thickness of some neighborhood of the graph of the prediction function (\"tube\" or \"slab\") and a training error is said to occur when a data point (x, y) lies outside of that neighborhood.\ncachesize=200.0: cache memory size in MB\ntolerance=0.001: tolerance for the stopping criterion\nshrinking=true: whether to use shrinking heuristics","category":"page"},{"location":"models/NuSVR_LIBSVM/#Operations","page":"NuSVR","title":"Operations","text":"","category":"section"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above.","category":"page"},{"location":"models/NuSVR_LIBSVM/#Fitted-parameters","page":"NuSVR","title":"Fitted parameters","text":"","category":"section"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"libsvm_model: the trained model object created by the LIBSVM.jl package","category":"page"},{"location":"models/NuSVR_LIBSVM/#Report","page":"NuSVR","title":"Report","text":"","category":"section"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"The fields of report(mach) are:","category":"page"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"gamma: actual value of the kernel parameter gamma used in training","category":"page"},{"location":"models/NuSVR_LIBSVM/#Examples","page":"NuSVR","title":"Examples","text":"","category":"section"},{"location":"models/NuSVR_LIBSVM/#Using-a-built-in-kernel","page":"NuSVR","title":"Using a built-in kernel","text":"","category":"section"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"using MLJ\nimport LIBSVM\n\nNuSVR = @load NuSVR pkg=LIBSVM ## model type\nmodel = NuSVR(kernel=LIBSVM.Kernel.Polynomial) ## instance\n\nX, y = make_regression(rng=123) ## table, vector\nmach = machine(model, X, y) |> fit!\n\nXnew, _ = make_regression(3, rng=123)\n\njulia> yhat = predict(mach, Xnew)\n3-element Vector{Float64}:\n 0.2008156459920009\n 0.1131520519131709\n -0.2076156254934889","category":"page"},{"location":"models/NuSVR_LIBSVM/#User-defined-kernels","page":"NuSVR","title":"User-defined kernels","text":"","category":"section"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"k(x1, x2) = x1'*x2 ## equivalent to `LIBSVM.Kernel.Linear`\nmodel = NuSVR(kernel=k)\nmach = machine(model, X, y) |> fit!\n\njulia> yhat = predict(mach, Xnew)\n3-element Vector{Float64}:\n 1.1211558175964662\n 0.06677125944808422\n -0.6817578942749346","category":"page"},{"location":"models/NuSVR_LIBSVM/","page":"NuSVR","title":"NuSVR","text":"See also EpsilonSVR, LIBSVM.jl and the original C implementation documentation.","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/#CatBoostClassifier_CatBoost","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"","category":"section"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"CatBoostClassifier","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"A model type for constructing a CatBoost classifier, based on CatBoost.jl, and implementing the MLJ model 
interface.","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"CatBoostClassifier = @load CatBoostClassifier pkg=CatBoost","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"Do model = CatBoostClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in CatBoostClassifier(iterations=...).","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/#Training-data","page":"CatBoostClassifier","title":"Training data","text":"","category":"section"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"where","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, Finite, Textual; check column scitypes with schema(X). Textual columns will be passed to catboost as text_features, Multiclass columns will be passed to catboost as cat_features, and OrderedFactor columns will be converted to integers.\ny: the target, which can be any AbstractVector whose element scitype is Finite; check the scitype with scitype(y)","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"Train the machine with fit!(mach, rows=...).","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/#Hyper-parameters","page":"CatBoostClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"More details on the catboost hyperparameters, here are the Python docs: https://catboost.ai/en/docs/concepts/python-reference_catboostclassifier#parameters","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/#Operations","page":"CatBoostClassifier","title":"Operations","text":"","category":"section"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"predict(mach, Xnew): probabilistic predictions of the target given new features Xnew having the same scitype as X above.\npredict_mode(mach, Xnew): returns the mode of each of the prediction above.","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/#Accessor-functions","page":"CatBoostClassifier","title":"Accessor functions","text":"","category":"section"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"feature_importances(mach): return vector of feature importances, in the form of feature::Symbol => importance::Real 
pairs","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/#Fitted-parameters","page":"CatBoostClassifier","title":"Fitted parameters","text":"","category":"section"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"model: The Python CatBoostClassifier model","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/#Report","page":"CatBoostClassifier","title":"Report","text":"","category":"section"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"The fields of report(mach) are:","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"feature_importances: Vector{Pair{Symbol, Float64}} of feature importances","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/#Examples","page":"CatBoostClassifier","title":"Examples","text":"","category":"section"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"using CatBoost.MLJCatBoostInterface\nusing MLJ\n\nX = (\n duration = [1.5, 4.1, 5.0, 6.7], \n n_phone_calls = [4, 5, 6, 7], \n department = coerce([\"acc\", \"ops\", \"acc\", \"ops\"], Multiclass), \n)\ny = coerce([0, 0, 1, 1], Multiclass)\n\nmodel = CatBoostClassifier(iterations=5)\nmach = machine(model, X, y)\nfit!(mach)\nprobs = predict(mach, X)\npreds = predict_mode(mach, X)","category":"page"},{"location":"models/CatBoostClassifier_CatBoost/","page":"CatBoostClassifier","title":"CatBoostClassifier","text":"See also catboost and the unwrapped model type CatBoost.CatBoostClassifier.","category":"page"},{"location":"getting_started/#Getting-Started","page":"Getting Started","title":"Getting Started","text":"","category":"section"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"For an outline of MLJ's goals and features, see About MLJ.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"This page introduces some MLJ basics, assuming some familiarity with machine learning. For a complete list of other MLJ learning resources, see Learning MLJ.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"MLJ collects together the functionality provided by mutliple packages. To learn how to install components separately, run using MLJ; @doc MLJ.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"This section introduces only the most basic MLJ operations and concepts. It assumes MLJ has been successfully installed. 
See Installation if this is not the case.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"import Random.seed!\nusing MLJ\nusing InteractiveUtils\nMLJ.color_off()\nseed!(1234)","category":"page"},{"location":"getting_started/#Choosing-and-evaluating-a-model","page":"Getting Started","title":"Choosing and evaluating a model","text":"","category":"section"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"The following code loads Fisher's famous iris data set as a named tuple of column vectors:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"using MLJ\niris = load_iris();\nselectrows(iris, 1:3) |> pretty\nschema(iris)","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Because this data format is compatible with Tables.jl (and satisfies Tables.istable(iris) == true) many MLJ methods (such as selectrows, pretty and schema used above) as well as many MLJ models can work with it. However, as most new users are already familiar with the access methods particular to DataFrames (also compatible with Tables.jl) we'll put our data into that format here:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"import DataFrames\niris = DataFrames.DataFrame(iris);\nnothing # hide","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Next, let's split the data \"horizontally\" into input and target parts, and specify an RNG seed, to force observations to be shuffled:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"y, X = unpack(iris, ==(:target); rng=123);\nfirst(X, 3) |> pretty","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"This call to unpack splits off any column with name equal to :target into something called y, and all the remaining columns into X.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"To list all models available in MLJ's model registry do models(). Listing the models compatible with the present data:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"models(matching(X,y))","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"In MLJ a model is a struct storing the hyperparameters of the learning algorithm indicated by the struct name (and nothing else). 
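As a quick illustration (a minimal sketch, using the built-in Standardizer transformer documented under Transformers and Other Unsupervised Models, so no model-loading step is needed), the fields of such a struct are just the hyper-parameters you set:\n\nstand = Standardizer(count=true)\nstand.count # true\n\n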
For common problems matching data to models, see Model Search and Preparing Data.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"To see the documentation for DecisionTreeClassifier (without loading its defining code) do","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"doc(\"DecisionTreeClassifier\", pkg=\"DecisionTree\")","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Assuming the MLJDecisionTreeInterface.jl package is in your load path (see Installation) we can use @load to import the DecisionTreeClassifier model type, which we will bind to Tree:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Tree = @load DecisionTreeClassifier pkg=DecisionTree","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"(In this case, we need to specify pkg=... because multiple packages provide a model type with the name DecisionTreeClassifier.) Now we can instantiate a model with default hyperparameters:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"tree = Tree()","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Important: DecisionTree.jl and most other packages implementing machine learning algorithms for use in MLJ are not MLJ dependencies. If such a package is not in your load path you will receive an error explaining how to add the package to your current environment. Alternatively, you can use the interactive macro @iload. For more on importing model types, see Loading Model Code.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Once instantiated, a model's performance can be evaluated with the evaluate method. Our classifier is a probabilistic predictor (check prediction_type(tree) == :probabilistic) which means we can specify a probabilistic measure (metric) like log_loss, as well as deterministic measures like accuracy (which are applied after computing the mode of each prediction):","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"evaluate(tree, X, y,\n resampling=CV(shuffle=true),\n measures=[log_loss, accuracy],\n verbosity=0)","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Under the hood, evaluate calls lower level functions predict or predict_mode according to the type of measure, as shown in the output. 
We shall call these operations directly below.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"For more on performance evaluation, see Evaluating Model Performance for details.","category":"page"},{"location":"getting_started/#A-preview-of-data-type-specification-in-MLJ","page":"Getting Started","title":"A preview of data type specification in MLJ","text":"","category":"section"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"The target y above is a categorical vector, which is appropriate because our model is a decision tree classifier:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"typeof(y)","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"However, MLJ models do not prescribe the machine types for the data they operate on. Rather, they specify a scientific type, which refers to the way data is to be interpreted, as opposed to how it is encoded:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"target_scitype(tree)","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Here Finite is an example of a \"scalar\" scientific type with two subtypes:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"subtypes(Finite)","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"We use the scitype function to check how MLJ is going to interpret given data. Our choice of encoding for y works for DecisionTreeClassifier, because we have:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"scitype(y)","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"and Multiclass{3} <: Finite. If we would encode with integers instead, we obtain:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"yint = int.(y);\nscitype(yint)","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"and using yint in place of y in classification problems will fail. 
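One way to repair such an integer encoding (a small sketch; Multiclass is the appropriate choice here because the iris target has no natural order) is to coerce it back to a categorical scitype:\n\nyfixed = coerce(yint, Multiclass);\nscitype(yfixed) # AbstractVector{Multiclass{3}}\n\n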
See also Working with Categorical Data.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"For more on scientific types, see Data containers and scientific types below.","category":"page"},{"location":"getting_started/#Fit-and-predict","page":"Getting Started","title":"Fit and predict","text":"","category":"section"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"To illustrate MLJ's fit and predict interface, let's perform our performance evaluations by hand, but using a simple holdout set, instead of cross-validation.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Wrapping the model in data creates a machine which will store training outcomes:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"mach = machine(tree, X, y)","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Training and testing on a hold-out set:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"train, test = partition(eachindex(y), 0.7); # 70:30 split\nfit!(mach, rows=train);\nyhat = predict(mach, X[test,:]);\nyhat[3:5]\nlog_loss(yhat, y[test])","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Note that log_loss and cross_entropy are aliases for LogLoss() (which can be passed an optional keyword parameter, as in LogLoss(tol=0.001)). For a list of all losses and scores, and their aliases, run measures().","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Notice that yhat is a vector of Distribution objects, because DecisionTreeClassifier makes probabilistic predictions. 
The methods of the Distributions.jl package can be applied to such distributions:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"broadcast(pdf, yhat[3:5], \"virginica\") # predicted probabilities of virginica\nbroadcast(pdf, yhat, y[test])[3:5] # predicted probability of observed class\nmode.(yhat[3:5])","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Or, one can explicitly get modes by using predict_mode instead of predict:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"predict_mode(mach, X[test[3:5],:])","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Finally, we note that pdf() is overloaded to allow the retrieval of probabilities for all levels at once:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"L = levels(y)\npdf(yhat[3:5], L)","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Unsupervised models have a transform method instead of predict, and may optionally implement an inverse_transform method:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"v = Float64[1, 2, 3, 4]\nstand = Standardizer() # this type is built-in\nmach2 = machine(stand, v)\nfit!(mach2)\nw = transform(mach2, v)\ninverse_transform(mach2, w)","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Machines have an internal state which allows them to avoid redundant calculations when retrained, in certain conditions - for example when increasing the number of trees in a random forest, or the number of epochs in a neural network. The machine-building syntax also anticipates a more general syntax for composing multiple models, an advanced feature explained in Learning Networks.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"There is a version of evaluate for machines as well as models. This time we'll use a simple holdout strategy as above. (An exclamation point is added to the method name because machines are generally mutated when trained.)","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"evaluate!(mach, resampling=Holdout(fraction_train=0.7),\n measures=[log_loss, accuracy],\n verbosity=0)","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Changing a hyperparameter and re-evaluating:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"tree.max_depth = 3;\nevaluate!(mach, resampling=Holdout(fraction_train=0.7),\n measures=[log_loss, accuracy],\n verbosity=0)","category":"page"},{"location":"getting_started/#Next-steps","page":"Getting Started","title":"Next steps","text":"","category":"section"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"For next steps, consult the Learning MLJ section. 
At the least, we recommend you read the remainder of this page before considering serious use of MLJ.","category":"page"},{"location":"getting_started/#Data-containers-and-scientific-types","page":"Getting Started","title":"Data containers and scientific types","text":"","category":"section"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"The MLJ user should acquaint themselves with some basic assumptions about the form of data expected by MLJ, as outlined below. The basic machine constructors look like this (see also Constructing machines):","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"machine(model::Unsupervised, X)\nmachine(model::Supervised, X, y)","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Each supervised model in MLJ declares the permitted scientific type of the inputs X and targets y that can be bound to it in the second constructor above, rather than specifying specific machine types (such as Array{Float32, 2}). Similar remarks apply to the input X of an unsupervised model.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Scientific types are julia types defined in the package ScientificTypesBase.jl; the package ScientificTypes.jl implements the particular convention used in the MLJ universe for assigning a specific scientific type (interpretation) to each julia object (see the scitype examples below).","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"The basic \"scalar\" scientific types are Continuous, Multiclass{N}, OrderedFactor{N}, Count and Textual. Missing and Nothing are also considered scientific types. Be sure you read Scalar scientific types below to guarantee your scalar data is interpreted correctly. Tools exist to coerce the data to have the appropriate scientific type; see ScientificTypes.jl or run ?coerce for details.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Additionally, most data containers - such as tuples, vectors, matrices and tables - have a scientific type parameterized by the scitype of the elements they contain.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"(Image: )","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Figure 1. Part of the scientific type hierarchy in ScientificTypesBase.jl.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"scitype(4.6)\nscitype(42)\nx1 = coerce([\"yes\", \"no\", \"yes\", \"maybe\"], Multiclass);\nscitype(x1)\nX = (x1=x1, x2=rand(4), x3=rand(4)) # a \"column table\"\nscitype(X)","category":"page"},{"location":"getting_started/#Two-dimensional-data","page":"Getting Started","title":"Two-dimensional data","text":"","category":"section"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Generally, two-dimensional data in MLJ is expected to be tabular. 
All data containers X compatible with the Tables.jl interface and satisfying Tables.istable(X) == true (most of the formats in this list) have the scientific type Table{K}, where K depends on the scientific types of the columns, which can be individually inspected using schema:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"schema(X)","category":"page"},{"location":"getting_started/#Matrix-data","page":"Getting Started","title":"Matrix data","text":"","category":"section"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"MLJ models expecting a table do not generally accept a matrix instead. However, a matrix can be wrapped as a table, using MLJ.table:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"matrix_table = MLJ.table(rand(2,3));\nschema(matrix_table)","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"The matrix is not copied, only wrapped. To manifest a table as a matrix, use MLJ.matrix.","category":"page"},{"location":"getting_started/#Observations-correspond-to-rows,-not-columns","page":"Getting Started","title":"Observations correspond to rows, not columns","text":"","category":"section"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"When supplying models with matrices, or wrapping them in tables, each row should correspond to a different observation. That is, the matrix should be n x p, where n is the number of observations and p the number of features. However, some models may perform better if supplied the adjoint of a p x n matrix instead, and observation resampling is always more efficient in this case.","category":"page"},{"location":"getting_started/#Inputs","page":"Getting Started","title":"Inputs","text":"","category":"section"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Since an MLJ model only specifies the scientific type of data, if that type is Table - which is the case for the majority of MLJ models - then any Tables.jl container X is permitted, so long as Tables.istable(X) == true.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Specifically, the requirement for an arbitrary model's input is scitype(X) <: input_scitype(model).","category":"page"},{"location":"getting_started/#Targets","page":"Getting Started","title":"Targets","text":"","category":"section"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"The target y expected by MLJ models is generally an AbstractVector. 
A multivariate target y will generally be a table.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Specifically, the type requirement for a model target is scitype(y) <: target_scitype(model).","category":"page"},{"location":"getting_started/#Querying-a-model-for-acceptable-data-types","page":"Getting Started","title":"Querying a model for acceptable data types","text":"","category":"section"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Given a model instance, one can inspect the admissible scientific types of its input and target, and without loading the code defining the model;","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"tree = @load DecisionTreeClassifier pkg=DecisionTree","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"i = info(\"DecisionTreeClassifier\", pkg=\"DecisionTree\")\ni.input_scitype\ni.target_scitype","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"This output indicates that any table with Continuous, Count or OrderedFactor columns is acceptable as the input X, and that any vector with element scitype <: Finite is acceptable as the target y.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"For more on matching models to data, see Model Search.","category":"page"},{"location":"getting_started/#Scalar-scientific-types","page":"Getting Started","title":"Scalar scientific types","text":"","category":"section"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Models in MLJ will always apply the MLJ convention described in ScientificTypes.jl to decide how to interpret the elements of your container types. Here are the key features of that convention:","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Any AbstractFloat is interpreted as Continuous.\nAny Integer is interpreted as Count.\nAny CategoricalValue x, is interpreted as Multiclass or OrderedFactor, depending on the value of isordered(x).\nStrings and Chars are not interpreted as Multiclass or OrderedFactor (they have scitypes Textual and Unknown respectively).\nIn particular, integers (including Bools) cannot be used to represent categorical data. Use the preceding coerce operations to coerce to a Finite scitype.\nThe scientific types of nothing and missing are Nothing and Missing, native types we also regard as scientific.","category":"page"},{"location":"getting_started/","page":"Getting Started","title":"Getting Started","text":"Use coerce(v, OrderedFactor) or coerce(v, Multiclass) to coerce a vector v of integers, strings or characters to a vector with an appropriate Finite (categorical) scitype. See also Working with Categorical Data, and the ScientificTypes.jl documentation.","category":"page"},{"location":"transformers/#Transformers-and-Other-Unsupervised-Models","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised Models","text":"","category":"section"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"Several unsupervised models used to perform common transformations, such as one-hot encoding, are available in MLJ out-of-the-box. 
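They can be used with the same machine workflow described in Getting Started (a minimal usage sketch; the column names below are only illustrative):\n\nusing MLJ\nX = (gender = coerce([\"m\", \"f\", \"f\", \"m\"], Multiclass),\n height = [1.85, 1.67, 1.5, 1.7])\nhot = OneHotEncoder()\nmach = machine(hot, X) |> fit!\nW = transform(mach, X)\n\n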
These are detailed in Built-in transformers below.","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"A transformer is static if it has no learned parameters. While such a transformer is tantamount to an ordinary function, realizing it as an MLJ static transformer (a subtype of Static <: Unsupervised) can be useful, especially if the function depends on parameters the user would like to manipulate (which become hyper-parameters of the model). The necessary syntax for defining your own static transformers is described in Static transformers below.","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"Some unsupervised models, such as clustering algorithms, have a predict method in addition to a transform method. We give an example of this in Transformers that also predict","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"Finally, we note that models that fit a distribution, or more generally a sampler object, to some data, which are sometimes viewed as unsupervised, are treated in MLJ as supervised models. See Models that learn a probability distribution for an example.","category":"page"},{"location":"transformers/#Built-in-transformers","page":"Transformers and Other Unsupervised models","title":"Built-in transformers","text":"","category":"section"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"MLJModels.Standardizer\nMLJModels.OneHotEncoder\nMLJModels.ContinuousEncoder\nMLJModels.FillImputer\nMLJModels.UnivariateFillImputer\nFeatureSelection.FeatureSelector\nMLJModels.UnivariateBoxCoxTransformer\nMLJModels.UnivariateDiscretizer\nMLJModels.UnivariateTimeTypeToContinuous","category":"page"},{"location":"transformers/#MLJModels.Standardizer","page":"Transformers and Other Unsupervised models","title":"MLJModels.Standardizer","text":"Standardizer\n\nA model type for constructing a standardizer, based on MLJModels.jl, and implementing the MLJ model interface.\n\nFrom MLJ, the type can be imported using\n\nStandardizer = @load Standardizer pkg=MLJModels\n\nDo model = Standardizer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in Standardizer(features=...).\n\nUse this model to standardize (whiten) a Continuous vector, or relevant columns of a table. The rescalings applied by this transformer to new data are always those learned during the training phase, which are generally different from what would actually standardize the new data.\n\nTraining data\n\nIn MLJ or MLJBase, bind an instance model to data with\n\nmach = machine(model, X)\n\nwhere\n\nX: any Tables.jl compatible table or any abstract vector with Continuous element scitype (any abstract float vector). 
Only features in a table with Continuous scitype can be standardized; check column scitypes with schema(X).\n\nTrain the machine using fit!(mach, rows=...).\n\nHyper-parameters\n\nfeatures: one of the following, with the behavior indicated below:\n[] (empty, the default): standardize all features (columns) having Continuous element scitype\nnon-empty vector of feature names (symbols): standardize only the Continuous features in the vector (if ignore=false) or Continuous features not named in the vector (ignore=true).\nfunction or other callable: standardize a feature if the callable returns true on its name. For example, Standardizer(features = name -> name in [:x1, :x3], ignore = true, count=true) has the same effect as Standardizer(features = [:x1, :x3], ignore = true, count=true), namely to standardize all Continuous and Count features, with the exception of :x1 and :x3.\nNote this behavior is further modified if the ordered_factor or count flags are set to true; see below\nignore=false: whether to ignore or standardize specified features, as explained above\nordered_factor=false: if true, standardize any OrderedFactor feature wherever a Continuous feature would be standardized, as described above\ncount=false: if true, standardize any Count feature wherever a Continuous feature would be standardized, as described above\n\nOperations\n\ntransform(mach, Xnew): return Xnew with relevant features standardized according to the rescalings learned during fitting of mach.\ninverse_transform(mach, Z): apply the inverse transformation to Z, so that inverse_transform(mach, transform(mach, Xnew)) is approximately the same as Xnew; unavailable if ordered_factor or count flags were set to true.\n\nFitted parameters\n\nThe fields of fitted_params(mach) are:\n\nfeatures_fit - the names of features that will be standardized\nmeans - the corresponding untransformed mean values\nstds - the corresponding untransformed standard deviations\n\nReport\n\nThe fields of report(mach) are:\n\nfeatures_fit: the names of features that will be standardized\n\nExamples\n\nusing MLJ\n\nX = (ordinal1 = [1, 2, 3],\n ordinal2 = coerce([:x, :y, :x], OrderedFactor),\n ordinal3 = [10.0, 20.0, 30.0],\n ordinal4 = [-20.0, -30.0, -40.0],\n nominal = coerce([\"Your father\", \"he\", \"is\"], Multiclass));\n\njulia> schema(X)\n┌──────────┬──────────────────┐\n│ names │ scitypes │\n├──────────┼──────────────────┤\n│ ordinal1 │ Count │\n│ ordinal2 │ OrderedFactor{2} │\n│ ordinal3 │ Continuous │\n│ ordinal4 │ Continuous │\n│ nominal │ Multiclass{3} │\n└──────────┴──────────────────┘\n\nstand1 = Standardizer();\n\njulia> transform(fit!(machine(stand1, X)), X)\n(ordinal1 = [1, 2, 3],\n ordinal2 = CategoricalValue{Symbol,UInt32}[:x, :y, :x],\n ordinal3 = [-1.0, 0.0, 1.0],\n ordinal4 = [1.0, 0.0, -1.0],\n nominal = CategoricalValue{String,UInt32}[\"Your father\", \"he\", \"is\"],)\n\nstand2 = Standardizer(features=[:ordinal3, ], ignore=true, count=true);\n\njulia> transform(fit!(machine(stand2, X)), X)\n(ordinal1 = [-1.0, 0.0, 1.0],\n ordinal2 = CategoricalValue{Symbol,UInt32}[:x, :y, :x],\n ordinal3 = [10.0, 20.0, 30.0],\n ordinal4 = [1.0, 0.0, -1.0],\n nominal = CategoricalValue{String,UInt32}[\"Your father\", \"he\", \"is\"],)\n\nSee also OneHotEncoder, ContinuousEncoder.\n\n\n\n\n\n","category":"type"},{"location":"transformers/#MLJModels.OneHotEncoder","page":"Transformers and Other Unsupervised models","title":"MLJModels.OneHotEncoder","text":"OneHotEncoder\n\nA model type for constructing a one-hot encoder, based on 
MLJModels.jl, and implementing the MLJ model interface.\n\nFrom MLJ, the type can be imported using\n\nOneHotEncoder = @load OneHotEncoder pkg=MLJModels\n\nDo model = OneHotEncoder() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in OneHotEncoder(features=...).\n\nUse this model to one-hot encode the Multiclass and OrderedFactor features (columns) of some table, leaving other columns unchanged.\n\nNew data to be transformed may lack features present in the fit data, but no new features can be present.\n\nWarning: This transformer assumes that levels(col) for any Multiclass or OrderedFactor column, col, is the same for training data and new data to be transformed.\n\nTo ensure all features are transformed into Continuous features, or dropped, use ContinuousEncoder instead.\n\nTraining data\n\nIn MLJ or MLJBase, bind an instance model to data with\n\nmach = machine(model, X)\n\nwhere\n\nX: any Tables.jl compatible table. Columns can be of mixed type but only those with element scitype Multiclass or OrderedFactor can be encoded. Check column scitypes with schema(X).\n\nTrain the machine using fit!(mach, rows=...).\n\nHyper-parameters\n\nfeatures: a vector of symbols (column names). If empty (default) then all Multiclass and OrderedFactor features are encoded. Otherwise, encoding is further restricted to the specified features (ignore=false) or the unspecified features (ignore=true). This default behavior can be modified by the ordered_factor flag.\nordered_factor=false: when true, OrderedFactor features are universally excluded\ndrop_last=true: whether to drop the column corresponding to the final class of encoded features. For example, a three-class feature is spawned into three new features if drop_last=false, but just two features otherwise.\n\nFitted parameters\n\nThe fields of fitted_params(mach) are:\n\nall_features: names of all features encountered in training\nfitted_levels_given_feature: dictionary of the levels associated with each feature encoded, keyed on the feature name\nref_name_pairs_given_feature: dictionary of pairs r => ftr (such as 0x00000001 => :grad__A) where r is a CategoricalArrays.jl reference integer representing a level, and ftr the corresponding new feature name; the dictionary is keyed on the names of features that are encoded\n\nReport\n\nThe fields of report(mach) are:\n\nfeatures_to_be_encoded: names of input features to be encoded\nnew_features: names of all output features\n\nExample\n\nusing MLJ\n\nX = (name=categorical([\"Danesh\", \"Lee\", \"Mary\", \"John\"]),\n grade=categorical([\"A\", \"B\", \"A\", \"C\"], ordered=true),\n height=[1.85, 1.67, 1.5, 1.67],\n n_devices=[3, 2, 4, 3])\n\njulia> schema(X)\n┌───────────┬──────────────────┐\n│ names │ scitypes │\n├───────────┼──────────────────┤\n│ name │ Multiclass{4} │\n│ grade │ OrderedFactor{3} │\n│ height │ Continuous │\n│ n_devices │ Count │\n└───────────┴──────────────────┘\n\nhot = OneHotEncoder(drop_last=true)\nmach = fit!(machine(hot, X))\nW = transform(mach, X)\n\njulia> schema(W)\n┌──────────────┬────────────┐\n│ names │ scitypes │\n├──────────────┼────────────┤\n│ name__Danesh │ Continuous │\n│ name__John │ Continuous │\n│ name__Lee │ Continuous │\n│ grade__A │ Continuous │\n│ grade__B │ Continuous │\n│ height │ Continuous │\n│ n_devices │ Count │\n└──────────────┴────────────┘\n\nSee also ContinuousEncoder.\n\n\n\n\n\n","category":"type"},{"location":"transformers/#MLJModels.ContinuousEncoder","page":"Transformers and 
Other Unsupervised models","title":"MLJModels.ContinuousEncoder","text":"ContinuousEncoder\n\nA model type for constructing a continuous encoder, based on MLJModels.jl, and implementing the MLJ model interface.\n\nFrom MLJ, the type can be imported using\n\nContinuousEncoder = @load ContinuousEncoder pkg=MLJModels\n\nDo model = ContinuousEncoder() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ContinuousEncoder(drop_last=...).\n\nUse this model to arrange all features (columns) of a table to have Continuous element scitype, by applying the following protocol to each feature ftr:\n\nIf ftr is already Continuous retain it.\nIf ftr is Multiclass, one-hot encode it.\nIf ftr is OrderedFactor, replace it with coerce(ftr, Continuous) (vector of floating point integers), unless ordered_factors=false is specified, in which case one-hot encode it.\nIf ftr is Count, replace it with coerce(ftr, Continuous).\nIf ftr has some other element scitype, or was not observed in fitting the encoder, drop it from the table.\n\nWarning: This transformer assumes that levels(col) for any Multiclass or OrderedFactor column, col, is the same for training data and new data to be transformed.\n\nTo selectively one-hot-encode categorical features (without dropping columns) use OneHotEncoder instead.\n\nTraining data\n\nIn MLJ or MLJBase, bind an instance model to data with\n\nmach = machine(model, X)\n\nwhere\n\nX: any Tables.jl compatible table. Columns can be of mixed type but only those with element scitype Multiclass or OrderedFactor can be encoded. Check column scitypes with schema(X).\n\nTrain the machine using fit!(mach, rows=...).\n\nHyper-parameters\n\ndrop_last=true: whether to drop the column corresponding to the final class of one-hot encoded features. 
For example, a three-class feature is spawned into three new features if drop_last=false, but two just features otherwise.\none_hot_ordered_factors=false: whether to one-hot any feature with OrderedFactor element scitype, or to instead coerce it directly to a (single) Continuous feature using the order\n\nFitted parameters\n\nThe fields of fitted_params(mach) are:\n\nfeatures_to_keep: names of features that will not be dropped from the table\none_hot_encoder: the OneHotEncoder model instance for handling the one-hot encoding\none_hot_encoder_fitresult: the fitted parameters of the OneHotEncoder model\n\nReport\n\nfeatures_to_keep: names of input features that will not be dropped from the table\nnew_features: names of all output features\n\nExample\n\nX = (name=categorical([\"Danesh\", \"Lee\", \"Mary\", \"John\"]),\n grade=categorical([\"A\", \"B\", \"A\", \"C\"], ordered=true),\n height=[1.85, 1.67, 1.5, 1.67],\n n_devices=[3, 2, 4, 3],\n comments=[\"the force\", \"be\", \"with you\", \"too\"])\n\njulia> schema(X)\n┌───────────┬──────────────────┐\n│ names │ scitypes │\n├───────────┼──────────────────┤\n│ name │ Multiclass{4} │\n│ grade │ OrderedFactor{3} │\n│ height │ Continuous │\n│ n_devices │ Count │\n│ comments │ Textual │\n└───────────┴──────────────────┘\n\nencoder = ContinuousEncoder(drop_last=true)\nmach = fit!(machine(encoder, X))\nW = transform(mach, X)\n\njulia> schema(W)\n┌──────────────┬────────────┐\n│ names │ scitypes │\n├──────────────┼────────────┤\n│ name__Danesh │ Continuous │\n│ name__John │ Continuous │\n│ name__Lee │ Continuous │\n│ grade │ Continuous │\n│ height │ Continuous │\n│ n_devices │ Continuous │\n└──────────────┴────────────┘\n\njulia> setdiff(schema(X).names, report(mach).features_to_keep) # dropped features\n1-element Vector{Symbol}:\n :comments\n\n\nSee also OneHotEncoder\n\n\n\n\n\n","category":"type"},{"location":"transformers/#MLJModels.FillImputer","page":"Transformers and Other Unsupervised models","title":"MLJModels.FillImputer","text":"FillImputer\n\nA model type for constructing a fill imputer, based on MLJModels.jl, and implementing the MLJ model interface.\n\nFrom MLJ, the type can be imported using\n\nFillImputer = @load FillImputer pkg=MLJModels\n\nDo model = FillImputer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in FillImputer(features=...).\n\nUse this model to impute missing values in tabular data. A fixed \"filler\" value is learned from the training data, one for each column of the table.\n\nFor imputing missing values in a vector, use UnivariateFillImputer instead.\n\nTraining data\n\nIn MLJ or MLJBase, bind an instance model to data with\n\nmach = machine(model, X)\n\nwhere\n\nX: any table of input features (eg, a DataFrame) whose columns each have element scitypes Union{Missing, T}, where T is a subtype of Continuous, Multiclass, OrderedFactor or Count. 
Check scitypes with schema(X).\n\nTrain the machine using fit!(mach, rows=...).\n\nHyper-parameters\n\nfeatures: a vector of names of features (symbols) for which imputation is to be attempted; default is empty, which is interpreted as \"impute all\".\ncontinuous_fill: function or other callable to determine value to be imputed in the case of Continuous (abstract float) data; default is to apply median after skipping missing values\ncount_fill: function or other callable to determine value to be imputed in the case of Count (integer) data; default is to apply rounded median after skipping missing values\nfinite_fill: function or other callable to determine value to be imputed in the case of Multiclass or OrderedFactor data (categorical vectors); default is to apply mode after skipping missing values\n\nOperations\n\ntransform(mach, Xnew): return Xnew with missing values imputed with the fill values learned when fitting mach\n\nFitted parameters\n\nThe fields of fitted_params(mach) are:\n\nfeatures_seen_in_fit: the names of features (columns) encountered during training\nunivariate_transformer: the univariate model applied to determine the fillers (it's fields contain the functions defining the filler computations)\nfiller_given_feature: dictionary of filler values, keyed on feature (column) names\n\nExamples\n\nusing MLJ\nimputer = FillImputer()\n\nX = (a = [1.0, 2.0, missing, 3.0, missing],\n b = coerce([\"y\", \"n\", \"y\", missing, \"y\"], Multiclass),\n c = [1, 1, 2, missing, 3])\n\nschema(X)\njulia> schema(X)\n┌───────┬───────────────────────────────┐\n│ names │ scitypes │\n├───────┼───────────────────────────────┤\n│ a │ Union{Missing, Continuous} │\n│ b │ Union{Missing, Multiclass{2}} │\n│ c │ Union{Missing, Count} │\n└───────┴───────────────────────────────┘\n\nmach = machine(imputer, X)\nfit!(mach)\n\njulia> fitted_params(mach).filler_given_feature\n(filler = 2.0,)\n\njulia> fitted_params(mach).filler_given_feature\nDict{Symbol, Any} with 3 entries:\n :a => 2.0\n :b => \"y\"\n :c => 2\n\njulia> transform(mach, X)\n(a = [1.0, 2.0, 2.0, 3.0, 2.0],\n b = CategoricalValue{String, UInt32}[\"y\", \"n\", \"y\", \"y\", \"y\"],\n c = [1, 1, 2, 2, 3],)\n\nSee also UnivariateFillImputer.\n\n\n\n\n\n","category":"type"},{"location":"transformers/#MLJModels.UnivariateFillImputer","page":"Transformers and Other Unsupervised models","title":"MLJModels.UnivariateFillImputer","text":"UnivariateFillImputer\n\nA model type for constructing a single variable fill imputer, based on MLJModels.jl, and implementing the MLJ model interface.\n\nFrom MLJ, the type can be imported using\n\nUnivariateFillImputer = @load UnivariateFillImputer pkg=MLJModels\n\nDo model = UnivariateFillImputer() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in UnivariateFillImputer(continuous_fill=...).\n\nUse this model to impute missing values in a vector, using a fixed value learned from the non-missing values of the training vector.\n\nFor imputing missing values in tabular data, use FillImputer instead.\n\nTraining data\n\nIn MLJ or MLJBase, bind an instance model to data with\n\nmach = machine(model, x)\n\nwhere\n\nx: any abstract vector with element scitype Union{Missing, T} where T is a subtype of Continuous, Multiclass, OrderedFactor or Count; check scitype using scitype(x)\n\nTrain the machine using fit!(mach, rows=...).\n\nHyper-parameters\n\ncontinuous_fill: function or other callable to determine value to be imputed in the case of Continuous (abstract float) data; default is to apply median after skipping missing values\ncount_fill: function or other callable to determine value to be imputed in the case of Count (integer) data; default is to apply rounded median after skipping missing values\nfinite_fill: function or other callable to determine value to be imputed in the case of Multiclass or OrderedFactor data (categorical vectors); default is to apply mode after skipping missing values\n\nOperations\n\ntransform(mach, xnew): return xnew with missing values imputed with the fill values learned when fitting mach\n\nFitted parameters\n\nThe fields of fitted_params(mach) are:\n\nfiller: the fill value to be imputed in all new data\n\nExamples\n\nusing MLJ\nimputer = UnivariateFillImputer()\n\nx_continuous = [1.0, 2.0, missing, 3.0]\nx_multiclass = coerce([\"y\", \"n\", \"y\", missing, \"y\"], Multiclass)\nx_count = [1, 1, 1, 2, missing, 3, 3]\n\nmach = machine(imputer, x_continuous)\nfit!(mach)\n\njulia> fitted_params(mach)\n(filler = 2.0,)\n\njulia> transform(mach, [missing, missing, 101.0])\n3-element Vector{Float64}:\n 2.0\n 2.0\n 101.0\n\nmach2 = machine(imputer, x_multiclass) |> fit!\n\njulia> transform(mach2, x_multiclass)\n5-element CategoricalArray{String,1,UInt32}:\n \"y\"\n \"n\"\n \"y\"\n \"y\"\n \"y\"\n\nmach3 = machine(imputer, x_count) |> fit!\n\njulia> transform(mach3, [missing, missing, 5])\n3-element Vector{Int64}:\n 2\n 2\n 5\n\nFor imputing tabular data, use FillImputer.\n\n\n\n\n\n","category":"type"},{"location":"transformers/#FeatureSelection.FeatureSelector","page":"Transformers and Other Unsupervised models","title":"FeatureSelection.FeatureSelector","text":"FeatureSelector\n\nA model type for constructing a feature selector, based on FeatureSelection.jl, and implementing the MLJ model interface.\n\nFrom MLJ, the type can be imported using\n\nFeatureSelector = @load FeatureSelector pkg=FeatureSelection\n\nDo model = FeatureSelector() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in FeatureSelector(features=...).\n\nUse this model to select features (columns) of a table, usually as part of a model Pipeline.\n\nTraining data\n\nIn MLJ or MLJBase, bind an instance model to data with\n\nmach = machine(model, X)\n\nwhere\n\nX: any table of input features, where \"table\" is in the sense of Tables.jl\n\nTrain the machine using fit!(mach, rows=...).\n\nHyper-parameters\n\nfeatures: one of the following, with the behavior indicated:\n[] (empty, the default): filter out all features (columns) which were not encountered in training\nnon-empty vector of feature names (symbols): keep only the specified features (ignore=false) or keep only unspecified features (ignore=true)\nfunction or other callable: keep a feature if the callable returns true on its name. For example, specifying FeatureSelector(features = name -> name in [:x1, :x3], ignore = true) has the same effect as FeatureSelector(features = [:x1, :x3], ignore = true), namely to select all features, with the exception of :x1 and :x3.\nignore: whether to ignore or keep specified features, as explained above\n\nOperations\n\ntransform(mach, Xnew): select features from the table Xnew as specified by the model, taking features seen during training into account, if relevant\n\nFitted parameters\n\nThe fields of fitted_params(mach) are:\n\nfeatures_to_keep: the features that will be selected\n\nExample\n\nusing MLJ\n\nX = (ordinal1 = [1, 2, 3],\n ordinal2 = coerce([\"x\", \"y\", \"x\"], OrderedFactor),\n ordinal3 = [10.0, 20.0, 30.0],\n ordinal4 = [-20.0, -30.0, -40.0],\n nominal = coerce([\"Your father\", \"he\", \"is\"], Multiclass));\n\nselector = FeatureSelector(features=[:ordinal3, ], ignore=true);\n\njulia> transform(fit!(machine(selector, X)), X)\n(ordinal1 = [1, 2, 3],\n ordinal2 = CategoricalValue{Symbol,UInt32}[\"x\", \"y\", \"x\"],\n ordinal4 = [-20.0, -30.0, -40.0],\n nominal = CategoricalValue{String,UInt32}[\"Your father\", \"he\", \"is\"],)\n\n\n\n\n\n\n","category":"type"},{"location":"transformers/#MLJModels.UnivariateBoxCoxTransformer","page":"Transformers and Other Unsupervised models","title":"MLJModels.UnivariateBoxCoxTransformer","text":"UnivariateBoxCoxTransformer\n\nA model type for constructing a single variable Box-Cox transformer, based on MLJModels.jl, and implementing the MLJ model interface.\n\nFrom MLJ, the type can be imported using\n\nUnivariateBoxCoxTransformer = @load UnivariateBoxCoxTransformer pkg=MLJModels\n\nDo model = UnivariateBoxCoxTransformer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateBoxCoxTransformer(n=...).\n\nBox-Cox transformations attempt to make data look more normally distributed. This can improve performance and assist in the interpretation of models which suppose that data is generated by a normal distribution.\n\nA Box-Cox transformation (with shift) is of the form\n\nx -> ((x + c)^λ - 1)/λ\n\nfor some constant c and real λ, unless λ = 0, in which case the above is replaced with\n\nx -> log(x + c)\n\nGiven user-specified hyper-parameters n::Integer and shift::Bool, the present implementation learns the parameters c and λ from the training data as follows: If shift=true and zeros are encountered in the data, then c is set to 0.2 times the data mean. If there are no zeros, then no shift is applied. 
Finally, n different values of λ between -0.4 and 3 are considered, with λ fixed to the value maximizing normality of the transformed data.\n\nReference: Wikipedia entry for power transform.\n\nTraining data\n\nIn MLJ or MLJBase, bind an instance model to data with\n\nmach = machine(model, x)\n\nwhere\n\nx: any abstract vector with element scitype Continuous; check the scitype with scitype(x)\n\nTrain the machine using fit!(mach, rows=...).\n\nHyper-parameters\n\nn=171: number of values of the exponent λ to try\nshift=false: whether to include a preliminary constant translation in transformations, in the presence of zeros\n\nOperations\n\ntransform(mach, xnew): apply the Box-Cox transformation learned when fitting mach\ninverse_transform(mach, z): reconstruct the vector z whose transformation learned by mach is z\n\nFitted parameters\n\nThe fields of fitted_params(mach) are:\n\nλ: the learned Box-Cox exponent\nc: the learned shift\n\nExamples\n\nusing MLJ\nusing UnicodePlots\nusing Random\nRandom.seed!(123)\n\ntransf = UnivariateBoxCoxTransformer()\n\nx = randn(1000).^2\n\nmach = machine(transf, x)\nfit!(mach)\n\nz = transform(mach, x)\n\njulia> histogram(x)\n ┌ ┐\n [ 0.0, 2.0) ┤███████████████████████████████████ 848\n [ 2.0, 4.0) ┤████▌ 109\n [ 4.0, 6.0) ┤█▍ 33\n [ 6.0, 8.0) ┤▍ 7\n [ 8.0, 10.0) ┤▏ 2\n [10.0, 12.0) ┤ 0\n [12.0, 14.0) ┤▏ 1\n └ ┘\n Frequency\n\njulia> histogram(z)\n ┌ ┐\n [-5.0, -4.0) ┤█▎ 8\n [-4.0, -3.0) ┤████████▊ 64\n [-3.0, -2.0) ┤█████████████████████▊ 159\n [-2.0, -1.0) ┤█████████████████████████████▊ 216\n [-1.0, 0.0) ┤███████████████████████████████████ 254\n [ 0.0, 1.0) ┤█████████████████████████▊ 188\n [ 1.0, 2.0) ┤████████████▍ 90\n [ 2.0, 3.0) ┤██▊ 20\n [ 3.0, 4.0) ┤▎ 1\n └ ┘\n Frequency\n\n\n\n\n\n\n","category":"type"},{"location":"transformers/#MLJModels.UnivariateDiscretizer","page":"Transformers and Other Unsupervised models","title":"MLJModels.UnivariateDiscretizer","text":"UnivariateDiscretizer\n\nA model type for constructing a single variable discretizer, based on MLJModels.jl, and implementing the MLJ model interface.\n\nFrom MLJ, the type can be imported using\n\nUnivariateDiscretizer = @load UnivariateDiscretizer pkg=MLJModels\n\nDo model = UnivariateDiscretizer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateDiscretizer(n_classes=...).\n\nDiscretization converts a Continuous vector into an OrderedFactor vector. In particular, the output is a CategoricalVector (whose reference type is optimized).\n\nThe transformation is chosen so that the vector on which the transformer is fit has, in transformed form, an approximately uniform distribution of values. 
Specifically, if n_classes is the level of discretization, then 2*n_classes - 1 ordered quantiles are computed, the odd quantiles being used for transforming (discretization) and the even quantiles for inverse transforming.\n\nTraining data\n\nIn MLJ or MLJBase, bind an instance model to data with\n\nmach = machine(model, x)\n\nwhere\n\nx: any abstract vector with Continuous element scitype; check scitype with scitype(x).\n\nTrain the machine using fit!(mach, rows=...).\n\nHyper-parameters\n\nn_classes: number of discrete classes in the output\n\nOperations\n\ntransform(mach, xnew): discretize xnew according to the discretization learned when fitting mach\ninverse_transform(mach, z): attempt to reconstruct from z a vector that transforms to give z\n\nFitted parameters\n\nThe fields of fitted_params(mach).fitresult include:\n\nodd_quantiles: quantiles used for transforming (length is n_classes - 1)\neven_quantiles: quantiles used for inverse transforming (length is n_classes)\n\nExample\n\nusing MLJ\nusing Random\nRandom.seed!(123)\n\ndiscretizer = UnivariateDiscretizer(n_classes=100)\nmach = machine(discretizer, randn(1000))\nfit!(mach)\n\njulia> x = rand(5)\n5-element Vector{Float64}:\n 0.8585244609846809\n 0.37541692370451396\n 0.6767070590395461\n 0.9208844241267105\n 0.7064611415680901\n\njulia> z = transform(mach, x)\n5-element CategoricalArrays.CategoricalArray{UInt8,1,UInt8}:\n 0x52\n 0x42\n 0x4d\n 0x54\n 0x4e\n\nx_approx = inverse_transform(mach, z)\njulia> x - x_approx\n5-element Vector{Float64}:\n 0.008224506144777322\n 0.012731354778359405\n 0.0056265330571125816\n 0.005738175684445124\n 0.006835652575801987\n\n\n\n\n\n","category":"type"},{"location":"transformers/#MLJModels.UnivariateTimeTypeToContinuous","page":"Transformers and Other Unsupervised models","title":"MLJModels.UnivariateTimeTypeToContinuous","text":"UnivariateTimeTypeToContinuous\n\nA model type for constructing a single variable transformer that creates continuous representations of temporally typed data, based on MLJModels.jl, and implementing the MLJ model interface.\n\nFrom MLJ, the type can be imported using\n\nUnivariateTimeTypeToContinuous = @load UnivariateTimeTypeToContinuous pkg=MLJModels\n\nDo model = UnivariateTimeTypeToContinuous() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateTimeTypeToContinuous(zero_time=...).\n\nUse this model to convert vectors with a TimeType element type to vectors of Float64 type (Continuous element scitype).\n\nTraining data\n\nIn MLJ or MLJBase, bind an instance model to data with\n\nmach = machine(model, x)\n\nwhere\n\nx: any abstract vector whose element type is a subtype of Dates.TimeType\n\nTrain the machine using fit!(mach, rows=...).\n\nHyper-parameters\n\nzero_time: the time that is to correspond to 0.0 under transformations, with the type coinciding with the training data element type. 
If unspecified, the earliest time encountered in training is used.\nstep::Period=Hour(24): time interval to correspond to one unit under transformation\n\nOperations\n\ntransform(mach, xnew): apply the encoding inferred when mach was fit\n\nFitted parameters\n\nfitted_params(mach).fitresult is the tuple (zero_time, step) actually used in transformations, which may differ from the user-specified hyper-parameters.\n\nExample\n\nusing MLJ\nusing Dates\n\nx = [Date(2001, 1, 1) + Day(i) for i in 0:4]\n\nencoder = UnivariateTimeTypeToContinuous(zero_time=Date(2000, 1, 1),\n step=Week(1))\n\nmach = machine(encoder, x)\nfit!(mach)\njulia> transform(mach, x)\n5-element Vector{Float64}:\n 52.285714285714285\n 52.42857142857143\n 52.57142857142857\n 52.714285714285715\n 52.857142857142854\n\n\n\n\n\n","category":"type"},{"location":"transformers/#Static-transformers","page":"Transformers and Other Unsupervised models","title":"Static transformers","text":"","category":"section"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"A static transformer is a model for transforming data that does not generalize to new data (does not \"learn\") but which nevertheless has hyperparameters. For example, the DBSCAN clustering model from Clustering.jl can assign labels to some collection of observations, but cannot directly assign a label to some new observation.","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"The general user may define their own static models. The main use-case is the insertion into Linear Pipelines of some parameter-dependent transformation. (If a static transformer has no hyper-parameters, it is tantamount to an ordinary function. An ordinary function can be inserted directly into a pipeline; the situation for learning networks is only slightly more complicated.)","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"The following example defines a new model type Averager to perform the weighted average of two vectors (target predictions, for example). We suppose the weighting is normalized, and therefore controlled by a single hyper-parameter, mix.","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"using MLJ","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"mutable struct Averager <: Static\n mix::Float64\nend\n\nMLJ.transform(a::Averager, _, y1, y2) = (1 - a.mix)*y1 + a.mix*y2","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"Important. Note the sub-typing <: Static.","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"Such static transformers with (unlearned) parameters can have arbitrarily many inputs, but only one output. In the single input case, an inverse_transform can also be defined. Since they have no real learned parameters, you bind a static transformer to a machine without specifying training arguments; there is no need to fit! 
the machine:","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"mach = machine(Averager(0.5))\ntransform(mach, [1, 2, 3], [3, 2, 1])","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"Let's see how we can include our Averager in a learning network to mix the predictions of two regressors, with one-hot encoding of the inputs. Here are two regressors for mixing, and some dummy data for testing our learning network:","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"ridge = (@load RidgeRegressor pkg=MultivariateStats)()\nknn = (@load KNNRegressor)()\n\nimport Random.seed!\nseed!(112)\nX = (\n x1=coerce(rand(\"ab\", 100), Multiclass),\n x2=rand(100),\n)\ny = X.x2 + 0.05*rand(100)\nschema(X)","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"And the learning network:","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"Xs = source(X)\nys = source(y)\n\naverager = Averager(0.5)\n\nmach0 = machine(OneHotEncoder(), Xs)\nW = transform(mach0, Xs) # one-hot encode the input\n\nmach1 = machine(ridge, W, ys)\ny1 = predict(mach1, W)\n\nmach2 = machine(knn, W, ys)\ny2 = predict(mach2, W)\n\nmach4 = machine(averager)\nyhat = transform(mach4, y1, y2)\n\n# test:\nfit!(yhat)\nXnew = selectrows(X, 1:3)\nyhat(Xnew)","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"We next \"export\" the learning network as a standalone composite model type. First we need a struct for the composite model. 
Since we are restricting to Deterministic component regressors, the composite will also make deterministic predictions, and so gets the supertype DeterministicNetworkComposite:","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"mutable struct DoubleRegressor <: DeterministicNetworkComposite\n regressor1\n regressor2\n averager\nend","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"As described in Learning Networks, we next paste the learning network into a prefit declaration, replace the component models with symbolic placeholders, and add a learning network \"interface\":","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"import MLJBase\nfunction MLJBase.prefit(composite::DoubleRegressor, verbosity, X, y)\n Xs = source(X)\n ys = source(y)\n\n mach0 = machine(OneHotEncoder(), Xs)\n W = transform(mach0, Xs) # one-hot encode the input\n\n mach1 = machine(:regressor1, W, ys)\n y1 = predict(mach1, W)\n\n mach2 = machine(:regressor2, W, ys)\n y2 = predict(mach2, W)\n\n mach4= machine(:averager)\n yhat = transform(mach4, y1, y2)\n\n # learning network interface:\n (; predict=yhat)\nend","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"The new model type can be evaluated like any other supervised model:","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"X, y = @load_reduced_ames;\ncomposite = DoubleRegressor(ridge, knn, Averager(0.5))","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"composite.averager.mix = 0.25 # adjust mix from default of 0.5\nevaluate(composite, X, y, measure=l1)","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"A static transformer can also expose byproducts of the transform computation in the report of any associated machine. See Static models (models that do not generalize) for details.","category":"page"},{"location":"transformers/#Transformers-that-also-predict","page":"Transformers and Other Unsupervised models","title":"Transformers that also predict","text":"","category":"section"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"Some clustering algorithms learn to label data by identifying a collection of \"centroids\" in the training data. Any new input observation is labeled with the cluster to which it is closest (this is the output of predict) while the vector of all distances from the centroids defines a lower-dimensional representation of the observation (the output of transform). 
In the following example a K-means clustering algorithm assigns one of three labels 1, 2, 3 to the input features of the iris data set and compares them with the actual species recorded in the target (not seen by the algorithm).","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"using MLJ","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"import Random.seed!\nseed!(123)\n\nX, y = @load_iris\nKMeans = @load KMeans pkg=Clustering\nkmeans = KMeans()\nmach = machine(kmeans, X) |> fit!\nnothing # hide","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"Transforming:","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"Xsmall = transform(mach)\nselectrows(Xsmall, 1:4) |> pretty","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"Predicting:","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"yhat = predict(mach)\ncompare = zip(yhat, y) |> collect","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"compare[1:8]","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"compare[51:58]","category":"page"},{"location":"transformers/","page":"Transformers and Other Unsupervised models","title":"Transformers and Other Unsupervised models","text":"compare[101:108]","category":"page"},{"location":"models/GaussianProcessRegressor_MLJScikitLearnInterface/#GaussianProcessRegressor_MLJScikitLearnInterface","page":"GaussianProcessRegressor","title":"GaussianProcessRegressor","text":"","category":"section"},{"location":"models/GaussianProcessRegressor_MLJScikitLearnInterface/","page":"GaussianProcessRegressor","title":"GaussianProcessRegressor","text":"GaussianProcessRegressor","category":"page"},{"location":"models/GaussianProcessRegressor_MLJScikitLearnInterface/","page":"GaussianProcessRegressor","title":"GaussianProcessRegressor","text":"A model type for constructing a Gaussian process regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/GaussianProcessRegressor_MLJScikitLearnInterface/","page":"GaussianProcessRegressor","title":"GaussianProcessRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/GaussianProcessRegressor_MLJScikitLearnInterface/","page":"GaussianProcessRegressor","title":"GaussianProcessRegressor","text":"GaussianProcessRegressor = @load GaussianProcessRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/GaussianProcessRegressor_MLJScikitLearnInterface/","page":"GaussianProcessRegressor","title":"GaussianProcessRegressor","text":"Do model = GaussianProcessRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in GaussianProcessRegressor(kernel=...).","category":"page"},{"location":"models/GaussianProcessRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"GaussianProcessRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/GaussianProcessRegressor_MLJScikitLearnInterface/","page":"GaussianProcessRegressor","title":"GaussianProcessRegressor","text":"kernel = nothing\nalpha = 1.0e-10\noptimizer = fmin_l_bfgs_b\nn_restarts_optimizer = 0\nnormalize_y = false\ncopy_X_train = true\nrandom_state = nothing","category":"page"},{"location":"models/MeanShift_MLJScikitLearnInterface/#MeanShift_MLJScikitLearnInterface","page":"MeanShift","title":"MeanShift","text":"","category":"section"},{"location":"models/MeanShift_MLJScikitLearnInterface/","page":"MeanShift","title":"MeanShift","text":"MeanShift","category":"page"},{"location":"models/MeanShift_MLJScikitLearnInterface/","page":"MeanShift","title":"MeanShift","text":"A model type for constructing a mean shift, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/MeanShift_MLJScikitLearnInterface/","page":"MeanShift","title":"MeanShift","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/MeanShift_MLJScikitLearnInterface/","page":"MeanShift","title":"MeanShift","text":"MeanShift = @load MeanShift pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/MeanShift_MLJScikitLearnInterface/","page":"MeanShift","title":"MeanShift","text":"Do model = MeanShift() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MeanShift(bandwidth=...).","category":"page"},{"location":"models/MeanShift_MLJScikitLearnInterface/","page":"MeanShift","title":"MeanShift","text":"Mean shift clustering using a flat kernel. Mean shift clustering aims to discover \"blobs\" in a smooth density of samples. It is a centroid-based algorithm, which works by updating candidates for centroids to be the mean of the points within a given region. These candidates are then filtered in a post-processing stage to eliminate near-duplicates to form the final set of centroids.\"","category":"page"},{"location":"models/StableRulesRegressor_SIRUS/#StableRulesRegressor_SIRUS","page":"StableRulesRegressor","title":"StableRulesRegressor","text":"","category":"section"},{"location":"models/StableRulesRegressor_SIRUS/","page":"StableRulesRegressor","title":"StableRulesRegressor","text":"StableRulesRegressor","category":"page"},{"location":"models/StableRulesRegressor_SIRUS/","page":"StableRulesRegressor","title":"StableRulesRegressor","text":"A model type for constructing a stable rules regressor, based on SIRUS.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/StableRulesRegressor_SIRUS/","page":"StableRulesRegressor","title":"StableRulesRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/StableRulesRegressor_SIRUS/","page":"StableRulesRegressor","title":"StableRulesRegressor","text":"StableRulesRegressor = @load StableRulesRegressor pkg=SIRUS","category":"page"},{"location":"models/StableRulesRegressor_SIRUS/","page":"StableRulesRegressor","title":"StableRulesRegressor","text":"Do model = StableRulesRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in StableRulesRegressor(rng=...).","category":"page"},{"location":"models/StableRulesRegressor_SIRUS/","page":"StableRulesRegressor","title":"StableRulesRegressor","text":"StableRulesRegressor implements the explainable rule-based regression model based on a random forest.","category":"page"},{"location":"models/StableRulesRegressor_SIRUS/#Training-data","page":"StableRulesRegressor","title":"Training data","text":"","category":"section"},{"location":"models/StableRulesRegressor_SIRUS/","page":"StableRulesRegressor","title":"StableRulesRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/StableRulesRegressor_SIRUS/","page":"StableRulesRegressor","title":"StableRulesRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/StableRulesRegressor_SIRUS/","page":"StableRulesRegressor","title":"StableRulesRegressor","text":"where","category":"page"},{"location":"models/StableRulesRegressor_SIRUS/","page":"StableRulesRegressor","title":"StableRulesRegressor","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)\ny: the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/StableRulesRegressor_SIRUS/","page":"StableRulesRegressor","title":"StableRulesRegressor","text":"Train the machine with fit!(mach, rows=...).","category":"page"},{"location":"models/StableRulesRegressor_SIRUS/#Hyperparameters","page":"StableRulesRegressor","title":"Hyperparameters","text":"","category":"section"},{"location":"models/StableRulesRegressor_SIRUS/","page":"StableRulesRegressor","title":"StableRulesRegressor","text":"rng::AbstractRNG=default_rng(): Random number generator. Using a StableRNG from StableRNGs.jl is advised.\npartial_sampling::Float64=0.7: Ratio of samples to use in each subset of the data. The default should be fine for most cases.\nn_trees::Int=1000: The number of trees to use. It is advisable to use at least a thousand trees for better rule selection and, in turn, better predictive performance.\nmax_depth::Int=2: The depth of the tree. A lower depth decreases model complexity and can therefore improve accuracy when the sample size is small (reduce overfitting).\nq::Int=10: Number of cutpoints to use per feature. The default value should be fine for most situations.\nmin_data_in_leaf::Int=5: Minimum number of data points per leaf.\nmax_rules::Int=10: This is the most important hyperparameter after lambda. The more rules, the more accurate the model should be. If this is not the case, tune lambda first. However, more rules will also decrease model interpretability. So, it is important to find a good balance here. In most cases, 10 to 40 rules should provide reasonable accuracy while remaining interpretable.\nlambda::Float64=1.0: The weights of the final rules are determined via a regularized regression over each rule as a binary feature. This hyperparameter specifies the strength of the ridge (L2) regularizer. SIRUS is very sensitive to the choice of this hyperparameter. Ensure that you try the full range from 10^-4 to 10^4 (e.g., 0.001, 0.01, ..., 100). When trying the range, one good check is to verify that an increase in max_rules increases performance. 
If this is not the case, then try a different value for lambda.","category":"page"},{"location":"models/StableRulesRegressor_SIRUS/#Fitted-parameters","page":"StableRulesRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/StableRulesRegressor_SIRUS/","page":"StableRulesRegressor","title":"StableRulesRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/StableRulesRegressor_SIRUS/","page":"StableRulesRegressor","title":"StableRulesRegressor","text":"fitresult: A StableRules object.","category":"page"},{"location":"models/StableRulesRegressor_SIRUS/#Operations","page":"StableRulesRegressor","title":"Operations","text":"","category":"section"},{"location":"models/StableRulesRegressor_SIRUS/","page":"StableRulesRegressor","title":"StableRulesRegressor","text":"predict(mach, Xnew): Return a vector of predictions for each row of Xnew.","category":"page"},{"location":"correcting_class_imbalance/#Correcting-Class-Imbalance","page":"Correcting Class Imbalance","title":"Correcting Class Imbalance","text":"","category":"section"},{"location":"correcting_class_imbalance/#Oversampling-and-undersampling-methods","page":"Correcting Class Imbalance","title":"Oversampling and undersampling methods","text":"","category":"section"},{"location":"correcting_class_imbalance/","page":"Correcting Class Imbalance","title":"Correcting Class Imbalance","text":"Models providing oversampling or undersampling methods, to correct for class imbalance, are listed under Class Imbalance. In particular, several popular algorithms are provided by the Imbalance.jl package, which includes detailed documentation and tutorials.","category":"page"},{"location":"correcting_class_imbalance/#Incorporating-class-imbalance-in-supervised-learning-pipelines","page":"Correcting Class Imbalance","title":"Incorporating class imbalance in supervised learning pipelines","text":"","category":"section"},{"location":"correcting_class_imbalance/","page":"Correcting Class Imbalance","title":"Correcting Class Imbalance","text":"One or more oversampling/undersampling algorithms can be fused with an MLJ classifier using the BalancedModel wrapper. This creates a new classifier which can be treated like any other; resampling to correct for class imbalance, relevant only for training of the atomic classifier, is then carried out internally. 
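As a minimal construction sketch (assuming MLJBalancing.jl, Imbalance.jl and MLJLinearModels.jl are installed; the particular balancer and classifier are arbitrary illustrative choices):\n\nusing MLJ, MLJBalancing\nRandomOversampler = @load RandomOversampler pkg=Imbalance verbosity=0\nLogisticClassifier = @load LogisticClassifier pkg=MLJLinearModels verbosity=0\n# wrap a balancer and a classifier into a single composite classifier:\nbalanced = BalancedModel(model=LogisticClassifier(), balancer1=RandomOversampler())\n\n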
If, for example, one applies cross-validation to the wrapped classifier (using evaluate!, say) then over/undersampling is repeated for each training fold automatically.","category":"page"},{"location":"correcting_class_imbalance/","page":"Correcting Class Imbalance","title":"Correcting Class Imbalance","text":"Refer to the MLJBalancing.jl documentation for further details.","category":"page"},{"location":"correcting_class_imbalance/","page":"Correcting Class Imbalance","title":"Correcting Class Imbalance","text":"MLJBalancing.BalancedModel","category":"page"},{"location":"correcting_class_imbalance/#MLJBalancing.BalancedModel","page":"Correcting Class Imbalance","title":"MLJBalancing.BalancedModel","text":"BalancedModel(; model=nothing, balancer1=balancer_model1, balancer2=balancer_model2, ...)\nBalancedModel(model; balancer1=balancer_model1, balancer2=balancer_model2, ...)\n\nGiven a classification model, and one or more balancer models that all implement the MLJModelInterface, BalancedModel allows wrapping an arbitrary number of balancing models and a classifier together in a sequential pipeline.\n\nOperation\n\nDuring training, data is first passed to balancer1 and the result is passed to balancer2 and so on; the result from the final balancer is then passed to the classifier for training.\nDuring prediction, the balancers have no effect.\n\nArguments\n\nmodel::Supervised: A classification model that implements the MLJModelInterface.\nbalancer1::Static=...: The first balancer model to pass the data to. This keyword argument can have any name.\nbalancer2::Static=...: The second balancer model to pass the data to. This keyword argument can have any name.\nand so on for an arbitrary number of balancers.\n\nReturns\n\nAn instance of type ProbabilisticBalancedModel or DeterministicBalancedModel, depending on the prediction type of model.\n\nExample\n\nusing MLJ\nusing Imbalance\n\n# generate data\nX, y = Imbalance.generate_imbalanced_data(1000, 5; class_probs=[0.2, 0.3, 0.5])\n\n# prepare classification and balancing models\nSMOTENC = @load SMOTENC pkg=Imbalance verbosity=0\nTomekUndersampler = @load TomekUndersampler pkg=Imbalance verbosity=0\nLogisticClassifier = @load LogisticClassifier pkg=MLJLinearModels verbosity=0\n\noversampler = SMOTENC(k=5, ratios=1.0, rng=42)\nundersampler = TomekUndersampler(min_ratios=0.5, rng=42)\nlogistic_model = LogisticClassifier()\n\n# wrap them in a BalancedModel\nbalanced_model = BalancedModel(model=logistic_model, balancer1=oversampler, balancer2=undersampler)\n\n# now this behaves as a unified model that can be trained, validated, fine-tuned, etc.\nmach = machine(balanced_model, X, y)\nfit!(mach)\n\n\n\n\n\n","category":"function"},{"location":"working_with_categorical_data/#Working-with-Categorical-Data","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"","category":"section"},{"location":"working_with_categorical_data/#Scientific-types-for-discrete-data","page":"Working with Categorical Data","title":"Scientific types for discrete data","text":"","category":"section"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"Recall that models articulate their data requirements using scientific types (see Getting Started or the ScientificTypes.jl documentation). 
There are three scientific types discrete data can have: Count, OrderedFactor and Multiclass.","category":"page"},{"location":"working_with_categorical_data/#Count-data","page":"Working with Categorical Data","title":"Count data","text":"","category":"section"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"In MLJ you cannot use integers to represent (finite) categorical data. Integers are reserved for discrete data you want interpreted as Count <: Infinite:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"using MLJ # hide\nscitype([1, 4, 5, 6])","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"The Count scientific type includes things like the number of phone calls, or city populations, and other \"frequency\" data of a generally unbounded nature.","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"That said, you may have data that is theoretically Count, but which you coerce to OrderedFactor to enable the use of more models, trusting to your knowledge of how those models work to inform an appropriate interpretation.","category":"page"},{"location":"working_with_categorical_data/#OrderedFactor-and-Multiclass-data","page":"Working with Categorical Data","title":"OrderedFactor and Multiclass data","text":"","category":"section"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"Other integer data, such as the number of an animal's legs, or the number of rooms in homes, are, generally, coerced to OrderedFactor <: Finite. The other categorical scientific type is Multiclass <: Finite, which is for unordered categorical data. Coercing data to one of these two forms is discussed under Detecting and coercing improperly represented categorical data below.","category":"page"},{"location":"working_with_categorical_data/#Binary-data","page":"Working with Categorical Data","title":"Binary data","text":"","category":"section"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"There is no separate scientific type for binary data. Binary data is OrderedFactor{2} if ordered, and Multiclass{2} otherwise. Data with type OrderedFactor{2} is considered to have an intrinsic \"positive\" class, e.g., the outcome of a medical test, and the \"pass/fail\" outcome of an exam. MLJ measures, such as true_positive, assume the second class in the ordering is the \"positive\" class. 
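For instance, here is a minimal sketch of this convention (the labels and values are invented purely for illustration):\n\nusing MLJ\ny = coerce([\"not sick\", \"sick\", \"not sick\", \"sick\"], OrderedFactor)      # levels sort to [\"not sick\", \"sick\"], so \"sick\" is the \"positive\" class\nyhat = coerce([\"sick\", \"sick\", \"not sick\", \"not sick\"], OrderedFactor)\ntrue_positive(yhat, y)   # 1: only the second observation is a correctly predicted \"sick\"\n\n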
Inspecting and changing order are discussed in the next section.","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"If data has type Bool it is considered Count data (as Bool <: Integer) and, generally, users will want to coerce such data to Multiclass or OrderedFactor.","category":"page"},{"location":"working_with_categorical_data/#Detecting-and-coercing-improperly-represented-categorical-data","page":"Working with Categorical Data","title":"Detecting and coercing improperly represented categorical data","text":"","category":"section"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"One inspects the scientific type of data using scitype as shown above. To inspect all column scientific types in a table simultaneously, use schema. (The scitype(X) of a table X contains a condensed form of this information used in type dispatch; see here.)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"import DataFrames: DataFrame\nX = DataFrame(\n name = [\"Siri\", \"Robo\", \"Alexa\", \"Cortana\"],\n gender = [\"male\", \"male\", \"Female\", \"female\"],\n likes_soup = [true, false, false, true],\n height = [152, missing, 148, 163],\n rating = [2, 5, 2, 1],\n outcome = [\"rejected\", \"accepted\", \"accepted\", \"rejected\"],\n)\nschema(X)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"Coercing a single column:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"X.outcome = coerce(X.outcome, OrderedFactor)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"The machine type of the result is a CategoricalArray. For more on this type see Under the hood: CategoricalValue and CategoricalArray below.","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"Inspecting the order of the levels:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"levels(X.outcome)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"Since we wish to regard \"accepted\" as the positive class, it should appear second, which we correct with the levels! function:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"levels!(X.outcome, [\"rejected\", \"accepted\"])\nlevels(X.outcome)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"warning: Changing levels of categorical data\nThe order of levels should generally be changed early in your data science workflow and then not again. Similar remarks apply to adding levels (which is possible; see the CategoricalArrays.jl documentation). 
MLJ supervised and unsupervised models assume levels and their order do not change.","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"Coercing all remaining types simultaneously:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"Xnew = coerce(X, :gender => Multiclass,\n :likes_soup => OrderedFactor,\n :height => Continuous,\n :rating => OrderedFactor)\nschema(Xnew)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"For DataFrames there is also in-place coercion, using coerce!.","category":"page"},{"location":"working_with_categorical_data/#Tracking-all-levels","page":"Working with Categorical Data","title":"Tracking all levels","text":"","category":"section"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"The key property of vectors of scientific type OrderedFactor and Multiclass is that the pool of all levels is not lost when separating out one or more elements:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"v = Xnew.rating","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"levels(v)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"levels(v[1:2])","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"levels(v[2])","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"By tracking all classes in this way, MLJ avoids common pain points around categorical data, such as evaluating models on an evaluation set, only to crash your code because classes appear there which were not seen during training.","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"By drawing test, validation and training data from a common data structure (as described in Getting Started, for example) one ensures that all possible classes of categorical variables are tracked at all times. 
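Here is a minimal sketch of this behavior (the data is invented purely for illustration):\n\nusing MLJ\nv = coerce([\"a\", \"b\", \"a\", \"c\"], Multiclass)\ntrain, test = partition(eachindex(v), 0.75)\nlevels(v[test])   # [\"a\", \"b\", \"c\"]: the full pool survives, even for classes absent from the test rows\n\n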
However, this does not mitigate problems with new production data, if categorical features there are missing some classes or contain previously unseen classes.","category":"page"},{"location":"working_with_categorical_data/#New-or-missing-levels-in-production-data","page":"Working with Categorical Data","title":"New or missing levels in production data","text":"","category":"section"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"warning: Warning\nUnpredictable behavior may result whenever Finite categorical data appears in a production set with different classes (levels) from those presented during training","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"Consider, for example, the following naive workflow:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"# train a one-hot encoder on some data:\nx = coerce([\"black\", \"white\", \"white\", \"black\"], Multiclass)\nX = DataFrame(x=x)\n\nmodel = OneHotEncoder()\nmach = machine(model, X) |> fit!\n\n# one-hot encode new data with missing classes:\nxproduction = coerce([\"white\", \"white\"], Multiclass)\nXproduction = DataFrame(x=xproduction)\nXproduction == X[2:3,:]","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"So far, so good. But the following operation throws an error:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"julia> transform(mach, Xproduction) == transform(mach, X[2:3,:])\nERROR: Found category level mismatch in feature `x`. 
Consider using `levels!` to ensure fitted and transforming features have the same category levels.","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"The problem here is that levels(X.x) and levels(Xproduction.x) are different:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"levels(X.x)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"levels(Xproduction.x)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"This could be anticipated by the fact that the training and production data have different schema:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"schema(X)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"schema(Xproduction)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"One fix is to manually correct the levels of the production data:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"levels!(Xproduction.x, levels(x))\ntransform(mach, Xproduction) == transform(mach, X[2:3,:])","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"Another solution is to pack all production data with dummy rows based on the training data (subsequently dropped) to ensure there are no missing classes. Currently, MLJ contains no general tooling to check and fix categorical levels in production data (although one can check that training data and production data have the same schema, to ensure the number of classes in categorical data is consistent).","category":"page"},{"location":"working_with_categorical_data/#Extracting-an-integer-representation-of-Finite-data","page":"Working with Categorical Data","title":"Extracting an integer representation of Finite data","text":"","category":"section"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"Occasionally, you may really want an integer representation of data that currently has scitype Finite. For example, you are a developer wrapping an algorithm from an external package for use in MLJ, and that algorithm uses integer representations. 
Use the int method for this purpose, and use decoder to construct decoders for reversing the transformation:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"v = coerce([\"one\", \"two\", \"three\", \"one\"], OrderedFactor);\nlevels!(v, [\"one\", \"two\", \"three\"]);\nv_int = int(v)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"d = decoder(v); # or decoder(v[1])\nd.(v_int)","category":"page"},{"location":"working_with_categorical_data/#Under-the-hood:-CategoricalValue-and-CategoricalArray","page":"Working with Categorical Data","title":"Under the hood: CategoricalValue and CategoricalArray","text":"","category":"section"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"In MLJ the objects with OrderedFactor or Multiclass scientific type have machine type CategoricalValue, from the CategoricalArrays.jl package. In some sense CategoricalValues are an implementation detail users can ignore for the most part, as shown above. However, you may want some basic understanding of these types, and those implementing MLJ's model interface for new algorithms will have to understand them. For the complete API, see the CategoricalArrays.jl documentation. Here are the basics:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"To construct an OrderedFactor or Multiclass vector directly from raw labels, one uses categorical:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"using CategoricalArrays # hide\nv = categorical(['A', 'B', 'A', 'A', 'C'])\ntypeof(v)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"(Equivalent to the more idiomatic MLJ v = coerce(['A', 'B', 'A', 'A', 'C'], Multiclass).)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"scitype(v)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"v = categorical(['A', 'B', 'A', 'A', 'C'], ordered=true, compress=true)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"scitype(v)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"When you index a CategoricalVector you don't get a raw label, but instead an instance of CategoricalValue. As explained above, this value knows the complete pool of levels from the vector from which it came. 
Use get(val) to extract the raw label from a value val.","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"Despite the distinction that exists between a value (element) and a label, the two are the same, from the point of view of == and in:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"v[1] == 'A' # true\n'A' in v # true","category":"page"},{"location":"working_with_categorical_data/#Probabilistic-predictions-of-categorical-data","page":"Working with Categorical Data","title":"Probabilistic predictions of categorical data","text":"","category":"section"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"Recall from Getting Started that probabilistic classifiers ordinarily predict UnivariateFinite distributions, not raw probabilities (which are instead accessed using the pdf method). Here's how to construct such a distribution yourself:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"v = coerce([\"yes\", \"no\", \"yes\", \"yes\", \"maybe\"], Multiclass)\nd = UnivariateFinite([v[2], v[1]], [0.9, 0.1])","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"Or, equivalently,","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"d = UnivariateFinite([\"no\", \"yes\"], [0.9, 0.1], pool=v)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"This distribution tracks all levels, not just the ones to which you have assigned probabilities:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"pdf(d, \"maybe\")","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"However, pdf(d, \"dunno\") will throw an error.","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"You can declare pool=missing, but then \"maybe\" will not be tracked:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"d = UnivariateFinite([\"no\", \"yes\"], [0.9, 0.1], pool=missing)\nlevels(d)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"To construct a whole vector of UnivariateFinite distributions, simply give the constructor a matrix of probabilities:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"yes_probs = rand(5)\nprobs = hcat(1 .- yes_probs, yes_probs)\nd_vec = UnivariateFinite([\"no\", \"yes\"], probs, pool=v)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"Or, 
equivalently:","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"d_vec = UnivariateFinite([\"no\", \"yes\"], yes_probs, augment=true, pool=v)","category":"page"},{"location":"working_with_categorical_data/","page":"Working with Categorical Data","title":"Working with Categorical Data","text":"For more options, see UnivariateFinite.","category":"page"},{"location":"models/COPODDetector_OutlierDetectionPython/#COPODDetector_OutlierDetectionPython","page":"COPODDetector","title":"COPODDetector","text":"","category":"section"},{"location":"models/COPODDetector_OutlierDetectionPython/","page":"COPODDetector","title":"COPODDetector","text":"COPODDetector(n_jobs = 1)","category":"page"},{"location":"models/COPODDetector_OutlierDetectionPython/","page":"COPODDetector","title":"COPODDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.copod","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_BetaML/#MultitargetNeuralNetworkRegressor_BetaML","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"","category":"section"},{"location":"models/MultitargetNeuralNetworkRegressor_BetaML/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"mutable struct MultitargetNeuralNetworkRegressor <: MLJModelInterface.Deterministic","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_BetaML/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"A simple but flexible Feedforward Neural Network, from the Beta Machine Learning Toolkit (BetaML), for regression of multi-dimensional targets.","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_BetaML/#Parameters:","page":"MultitargetNeuralNetworkRegressor","title":"Parameters:","text":"","category":"section"},{"location":"models/MultitargetNeuralNetworkRegressor_BetaML/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"layers: Array of layer objects [def: nothing, i.e. basic network]. See subtypes(BetaML.AbstractLayer) for supported layers\nloss: Loss (cost) function [def: BetaML.squared_cost]. Should always assume y and ŷ as matrices.\nwarning: Warning\nIf you change the parameter loss, you need to either provide its derivative via the parameter dloss or use autodiff with dloss=nothing.\ndloss: Derivative of the loss function [def: BetaML.dsquared_cost, i.e. use the derivative of the squared cost]. Use nothing for autodiff.\nepochs: Number of epochs, i.e. passages through the whole training sample [def: 300]\nbatch_size: Size of each individual batch [def: 16]\nopt_alg: The optimisation algorithm to update the gradient at each batch [def: BetaML.ADAM()]. 
See subtypes(BetaML.OptimisationAlgorithm) for supported optimizers\nshuffle: Whether to randomly shuffle the data at each iteration (epoch) [def: true]\ndescr: An optional title and/or description for this model\ncb: A call back function to provide information during training [def: BetaML.fitting_info]\nrng: Random Number Generator (see FIXEDSEED) [deafult: Random.GLOBAL_RNG]","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_BetaML/#Notes:","page":"MultitargetNeuralNetworkRegressor","title":"Notes:","text":"","category":"section"},{"location":"models/MultitargetNeuralNetworkRegressor_BetaML/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"data must be numerical\nthe label should be a n-records by n-dimensions matrix","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_BetaML/#Example:","page":"MultitargetNeuralNetworkRegressor","title":"Example:","text":"","category":"section"},{"location":"models/MultitargetNeuralNetworkRegressor_BetaML/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"julia> using MLJ\n\njulia> X, y = @load_boston;\n\njulia> ydouble = hcat(y, y .*2 .+5);\n\njulia> modelType = @load MultitargetNeuralNetworkRegressor pkg = \"BetaML\" verbosity=0\nBetaML.Nn.MultitargetNeuralNetworkRegressor\n\njulia> layers = [BetaML.DenseLayer(12,50,f=BetaML.relu),BetaML.DenseLayer(50,50,f=BetaML.relu),BetaML.DenseLayer(50,50,f=BetaML.relu),BetaML.DenseLayer(50,2,f=BetaML.relu)];\n\njulia> model = modelType(layers=layers,opt_alg=BetaML.ADAM(),epochs=500)\nMultitargetNeuralNetworkRegressor(\n layers = BetaML.Nn.AbstractLayer[BetaML.Nn.DenseLayer([-0.2591582523441157 -0.027962845131416225 … 0.16044535560124418 -0.12838827994676857; -0.30381834909561184 0.2405495243851402 … -0.2588144861880588 0.09538577909777807; … ; -0.017320292924711156 -0.14042266424603767 … 0.06366999105841187 -0.13419651752478906; 0.07393079961409338 0.24521350531110264 … 0.04256867886217541 -0.0895506802948175], [0.14249427336553644, 0.24719379413682485, -0.25595911822556566, 0.10034088778965933, -0.017086404878505712, 0.21932184025609347, -0.031413516834861266, -0.12569076082247596, -0.18080140982481183, 0.14551901873323253 … -0.13321995621967364, 0.2436582233332092, 0.0552222336976439, 0.07000814133633904, 0.2280064379660025, -0.28885681475734193, -0.07414214246290696, -0.06783184733650621, -0.055318068046308455, -0.2573488383282579], BetaML.Utils.relu, BetaML.Utils.drelu), BetaML.Nn.DenseLayer([-0.0395424111703751 -0.22531232360829911 … -0.04341228943744482 0.024336206858365517; -0.16481887432946268 0.17798073384748508 … -0.18594039305095766 0.051159225856547474; … ; -0.011639475293705043 -0.02347011206244673 … 0.20508869536159186 -0.1158382446274592; -0.19078069527757857 -0.007487540070740484 … -0.21341165344291158 -0.24158671316310726], [-0.04283623889330032, 0.14924461547060602, -0.17039563392959683, 0.00907774027816255, 0.21738885963113852, -0.06308040225941691, -0.14683286822101105, 0.21726892197970937, 0.19784321784707126, -0.0344988665714947 … -0.23643089430602846, -0.013560425201427584, 0.05323948910726356, -0.04644175812567475, -0.2350400292671211, 0.09628312383424742, 0.07016420995205697, -0.23266392927140334, -0.18823664451487, 0.2304486691429084], BetaML.Utils.relu, BetaML.Utils.drelu), BetaML.Nn.DenseLayer([-0.11504184627266828 0.08601794194664503 … 0.03843129724045469 -0.18417305624127284; 0.10181551438831654 0.13459759904443674 … 0.11094951365942118 
-0.1549466590355218; … ; 0.15279817525427697 0.0846661196058916 … -0.07993619892911122 0.07145402617285884; -0.1614160186346092 -0.13032002335149 … -0.12310552194729624 -0.15915773071049827], [-0.03435885900946367, -0.1198543931290306, 0.008454985905194445, -0.17980887188986966, -0.03557204910359624, 0.19125847393334877, -0.10949700778538696, -0.09343206702591, -0.12229583511781811, -0.09123969069220564 … 0.22119233518322862, 0.2053873143308657, 0.12756489387198222, 0.11567243705173319, -0.20982445664020496, 0.1595157838386987, -0.02087331046544119, -0.20556423263489765, -0.1622837764237961, -0.019220998739847395], BetaML.Utils.relu, BetaML.Utils.drelu), BetaML.Nn.DenseLayer([-0.25796717031347993 0.17579536633402948 … -0.09992960168785256 -0.09426177454620635; -0.026436330246675632 0.18070899284865127 … -0.19310119102392206 -0.06904005900252091], [0.16133004882307822, -0.3061228721091248], BetaML.Utils.relu, BetaML.Utils.drelu)], \n loss = BetaML.Utils.squared_cost, \n dloss = BetaML.Utils.dsquared_cost, \n epochs = 500, \n batch_size = 32, \n opt_alg = BetaML.Nn.ADAM(BetaML.Nn.var\"#90#93\"(), 1.0, 0.9, 0.999, 1.0e-8, BetaML.Nn.Learnable[], BetaML.Nn.Learnable[]), \n shuffle = true, \n descr = \"\", \n cb = BetaML.Nn.fitting_info, \n rng = Random._GLOBAL_RNG())\n\njulia> mach = machine(model, X, ydouble);\n\njulia> fit!(mach);\n\njulia> ŷdouble = predict(mach, X);\n\njulia> hcat(ydouble,ŷdouble)\n506×4 Matrix{Float64}:\n 24.0 53.0 28.4624 62.8607\n 21.6 48.2 22.665 49.7401\n 34.7 74.4 31.5602 67.9433\n 33.4 71.8 33.0869 72.4337\n ⋮ \n 23.9 52.8 23.3573 50.654\n 22.0 49.0 22.1141 48.5926\n 11.9 28.8 19.9639 45.5823","category":"page"},{"location":"models/MultinomialNBClassifier_MLJScikitLearnInterface/#MultinomialNBClassifier_MLJScikitLearnInterface","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"","category":"section"},{"location":"models/MultinomialNBClassifier_MLJScikitLearnInterface/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"MultinomialNBClassifier","category":"page"},{"location":"models/MultinomialNBClassifier_MLJScikitLearnInterface/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"A model type for constructing a multinomial naive Bayes classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/MultinomialNBClassifier_MLJScikitLearnInterface/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/MultinomialNBClassifier_MLJScikitLearnInterface/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"MultinomialNBClassifier = @load MultinomialNBClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/MultinomialNBClassifier_MLJScikitLearnInterface/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"Do model = MultinomialNBClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultinomialNBClassifier(alpha=...).","category":"page"},{"location":"models/MultinomialNBClassifier_MLJScikitLearnInterface/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"Multinomial naive bayes classifier. It is suitable for classification with discrete features (e.g. 
word counts for text classification).","category":"page"},{"location":"models/LarsRegressor_MLJScikitLearnInterface/#LarsRegressor_MLJScikitLearnInterface","page":"LarsRegressor","title":"LarsRegressor","text":"","category":"section"},{"location":"models/LarsRegressor_MLJScikitLearnInterface/","page":"LarsRegressor","title":"LarsRegressor","text":"LarsRegressor","category":"page"},{"location":"models/LarsRegressor_MLJScikitLearnInterface/","page":"LarsRegressor","title":"LarsRegressor","text":"A model type for constructing a least angle regressor (LARS), based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LarsRegressor_MLJScikitLearnInterface/","page":"LarsRegressor","title":"LarsRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LarsRegressor_MLJScikitLearnInterface/","page":"LarsRegressor","title":"LarsRegressor","text":"LarsRegressor = @load LarsRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/LarsRegressor_MLJScikitLearnInterface/","page":"LarsRegressor","title":"LarsRegressor","text":"Do model = LarsRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LarsRegressor(fit_intercept=...).","category":"page"},{"location":"models/LarsRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"LarsRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/LarsRegressor_MLJScikitLearnInterface/","page":"LarsRegressor","title":"LarsRegressor","text":"fit_intercept = true\nverbose = false\nprecompute = auto\nn_nonzero_coefs = 500\neps = 2.220446049250313e-16\ncopy_X = true\nfit_path = true","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/#LOFDetector_OutlierDetectionNeighbors","page":"LOFDetector","title":"LOFDetector","text":"","category":"section"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"LOFDetector(k = 5,\n metric = Euclidean(),\n algorithm = :kdtree,\n leafsize = 10,\n reorder = true,\n parallel = false)","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"Calculate an anomaly score based on the density of an instance in comparison to its neighbors. This algorithm introduced the notion of local outliers and was developed by Breunig et al., see [1].","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/#Parameters","page":"LOFDetector","title":"Parameters","text":"","category":"section"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"k::Integer","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"Number of neighbors (must be greater than 0).","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"metric::Metric","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"This is one of the Metric types defined in the Distances.jl package. 
It is possible to define your own metrics by creating new types that are subtypes of Metric.","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"algorithm::Symbol","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"One of (:kdtree, :balltree). In a kdtree, points are recursively split into groups using hyper-planes. Therefore a KDTree only works with axis aligned metrics which are: Euclidean, Chebyshev, Minkowski and Cityblock. A brutetree linearly searches all points in a brute force fashion and works with any Metric. A balltree recursively splits points into groups bounded by hyper-spheres and works with any Metric.","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"static::Union{Bool, Symbol}","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"One of (true, false, :auto). Whether the input data for fitting and transform should be statically or dynamically allocated. If true, the data is statically allocated. If false, the data is dynamically allocated. If :auto, the data is dynamically allocated if the product of all dimensions except the last is greater than 100.","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"leafsize::Int","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"Determines at what number of points to stop splitting the tree further. There is a trade-off between traversing the tree and having to evaluate the metric function for increasing number of points.","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"reorder::Bool","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"While building the tree this will put points close in distance close in memory since this helps with cache locality. In this case, a copy of the original data will be made so that the original data is left unmodified. This can have a significant impact on performance and is by default set to true.","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"parallel::Bool","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"Parallelize score and predict using all threads available. The number of threads can be set with the JULIA_NUM_THREADS environment variable. 
Note: fit is not parallel.","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/#Examples","page":"LOFDetector","title":"Examples","text":"","category":"section"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"using OutlierDetection: LOFDetector, fit, transform\ndetector = LOFDetector()\nX = rand(10, 100)\nmodel, result = fit(detector, X; verbosity=0)\ntest_scores = transform(detector, model, X)","category":"page"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/#References","page":"LOFDetector","title":"References","text":"","category":"section"},{"location":"models/LOFDetector_OutlierDetectionNeighbors/","page":"LOFDetector","title":"LOFDetector","text":"[1] Breunig, Markus M.; Kriegel, Hans-Peter; Ng, Raymond T.; Sander, Jörg (2000): LOF: Identifying Density-Based Local Outliers.","category":"page"},{"location":"models/AdaBoostClassifier_MLJScikitLearnInterface/#AdaBoostClassifier_MLJScikitLearnInterface","page":"AdaBoostClassifier","title":"AdaBoostClassifier","text":"","category":"section"},{"location":"models/AdaBoostClassifier_MLJScikitLearnInterface/","page":"AdaBoostClassifier","title":"AdaBoostClassifier","text":"AdaBoostClassifier","category":"page"},{"location":"models/AdaBoostClassifier_MLJScikitLearnInterface/","page":"AdaBoostClassifier","title":"AdaBoostClassifier","text":"A model type for constructing an AdaBoost classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/AdaBoostClassifier_MLJScikitLearnInterface/","page":"AdaBoostClassifier","title":"AdaBoostClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/AdaBoostClassifier_MLJScikitLearnInterface/","page":"AdaBoostClassifier","title":"AdaBoostClassifier","text":"AdaBoostClassifier = @load AdaBoostClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/AdaBoostClassifier_MLJScikitLearnInterface/","page":"AdaBoostClassifier","title":"AdaBoostClassifier","text":"Do model = AdaBoostClassifier() to construct an instance with default hyper-parameters. 
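For example, a minimal usage sketch (for illustration only; the @load_iris dataset and the use of predict_mode below are assumptions, not part of this docstring) might read: using MLJ\nAdaBoostClassifier = @load AdaBoostClassifier pkg=MLJScikitLearnInterface\nmodel = AdaBoostClassifier()\nX, y = @load_iris\nmach = machine(model, X, y) |> fit!\nyhat = predict_mode(mach, X), where predict_mode returns, for each row, the class with the highest predicted probability. 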
Provide keyword arguments to override hyper-parameter defaults, as in AdaBoostClassifier(estimator=...).","category":"page"},{"location":"models/AdaBoostClassifier_MLJScikitLearnInterface/","page":"AdaBoostClassifier","title":"AdaBoostClassifier","text":"An AdaBoost classifier is a meta-estimator that begins by fitting a classifier on the original dataset and then fits additional copies of the classifier on the same dataset but where the weights of incorrectly classified instances are adjusted such that subsequent classifiers focus more on difficult cases.","category":"page"},{"location":"models/AdaBoostClassifier_MLJScikitLearnInterface/","page":"AdaBoostClassifier","title":"AdaBoostClassifier","text":"This class implements the algorithm known as AdaBoost-SAMME.","category":"page"},{"location":"models/SVMLinearClassifier_MLJScikitLearnInterface/#SVMLinearClassifier_MLJScikitLearnInterface","page":"SVMLinearClassifier","title":"SVMLinearClassifier","text":"","category":"section"},{"location":"models/SVMLinearClassifier_MLJScikitLearnInterface/","page":"SVMLinearClassifier","title":"SVMLinearClassifier","text":"SVMLinearClassifier","category":"page"},{"location":"models/SVMLinearClassifier_MLJScikitLearnInterface/","page":"SVMLinearClassifier","title":"SVMLinearClassifier","text":"A model type for constructing a linear support vector classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/SVMLinearClassifier_MLJScikitLearnInterface/","page":"SVMLinearClassifier","title":"SVMLinearClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/SVMLinearClassifier_MLJScikitLearnInterface/","page":"SVMLinearClassifier","title":"SVMLinearClassifier","text":"SVMLinearClassifier = @load SVMLinearClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/SVMLinearClassifier_MLJScikitLearnInterface/","page":"SVMLinearClassifier","title":"SVMLinearClassifier","text":"Do model = SVMLinearClassifier() to construct an instance with default hyper-parameters. 
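As a rough, illustrative sketch (the make_blobs synthetic data and resampling settings below are assumptions, not part of this docstring), the model can be assessed directly with evaluate: using MLJ\nSVMLinearClassifier = @load SVMLinearClassifier pkg=MLJScikitLearnInterface\nmodel = SVMLinearClassifier()\nX, y = make_blobs(200, 3)\nevaluate(model, X, y, resampling=CV(nfolds=5), measure=accuracy), which estimates out-of-sample accuracy by 5-fold cross-validation. 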
Provide keyword arguments to override hyper-parameter defaults, as in SVMLinearClassifier(penalty=...).","category":"page"},{"location":"models/SVMLinearClassifier_MLJScikitLearnInterface/#Hyper-parameters","page":"SVMLinearClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/SVMLinearClassifier_MLJScikitLearnInterface/","page":"SVMLinearClassifier","title":"SVMLinearClassifier","text":"penalty = l2\nloss = squared_hinge\ndual = true\ntol = 0.0001\nC = 1.0\nmulti_class = ovr\nfit_intercept = true\nintercept_scaling = 1.0\nrandom_state = nothing\nmax_iter = 1000","category":"page"},{"location":"models/StableForestClassifier_SIRUS/#StableForestClassifier_SIRUS","page":"StableForestClassifier","title":"StableForestClassifier","text":"","category":"section"},{"location":"models/StableForestClassifier_SIRUS/","page":"StableForestClassifier","title":"StableForestClassifier","text":"StableForestClassifier","category":"page"},{"location":"models/StableForestClassifier_SIRUS/","page":"StableForestClassifier","title":"StableForestClassifier","text":"A model type for constructing a stable forest classifier, based on SIRUS.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/StableForestClassifier_SIRUS/","page":"StableForestClassifier","title":"StableForestClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/StableForestClassifier_SIRUS/","page":"StableForestClassifier","title":"StableForestClassifier","text":"StableForestClassifier = @load StableForestClassifier pkg=SIRUS","category":"page"},{"location":"models/StableForestClassifier_SIRUS/","page":"StableForestClassifier","title":"StableForestClassifier","text":"Do model = StableForestClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in StableForestClassifier(rng=...).","category":"page"},{"location":"models/StableForestClassifier_SIRUS/","page":"StableForestClassifier","title":"StableForestClassifier","text":"StableForestClassifier implements the random forest classifier with a stabilized forest structure (Bénard et al., 2021). This stabilization increases stability when extracting rules. The impact on the predictive accuracy compared to standard random forests should be relatively small.","category":"page"},{"location":"models/StableForestClassifier_SIRUS/","page":"StableForestClassifier","title":"StableForestClassifier","text":"note: Note\nJust like normal random forests, this model is not easily explainable. 
If you are interested in an explainable model, use the StableRulesClassifier or StableRulesRegressor.","category":"page"},{"location":"models/StableForestClassifier_SIRUS/#Training-data","page":"StableForestClassifier","title":"Training data","text":"","category":"section"},{"location":"models/StableForestClassifier_SIRUS/","page":"StableForestClassifier","title":"StableForestClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/StableForestClassifier_SIRUS/","page":"StableForestClassifier","title":"StableForestClassifier","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/StableForestClassifier_SIRUS/","page":"StableForestClassifier","title":"StableForestClassifier","text":"where","category":"page"},{"location":"models/StableForestClassifier_SIRUS/","page":"StableForestClassifier","title":"StableForestClassifier","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)\ny: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)","category":"page"},{"location":"models/StableForestClassifier_SIRUS/","page":"StableForestClassifier","title":"StableForestClassifier","text":"Train the machine with fit!(mach, rows=...).","category":"page"},{"location":"models/StableForestClassifier_SIRUS/#Hyperparameters","page":"StableForestClassifier","title":"Hyperparameters","text":"","category":"section"},{"location":"models/StableForestClassifier_SIRUS/","page":"StableForestClassifier","title":"StableForestClassifier","text":"rng::AbstractRNG=default_rng(): Random number generator. Using a StableRNG from StableRNGs.jl is advised.\npartial_sampling::Float64=0.7: Ratio of samples to use in each subset of the data. The default should be fine for most cases.\nn_trees::Int=1000: The number of trees to use. It is advisable to use at least a thousand trees for better rule selection and, in turn, better predictive performance.\nmax_depth::Int=2: The depth of the tree. A lower depth decreases model complexity and can therefore improve accuracy when the sample size is small (reducing overfitting).\nq::Int=10: Number of cutpoints to use per feature. 
The default value should be fine for most situations.\nmin_data_in_leaf::Int=5: Minimum number of data points per leaf.","category":"page"},{"location":"models/StableForestClassifier_SIRUS/#Fitted-parameters","page":"StableForestClassifier","title":"Fitted parameters","text":"","category":"section"},{"location":"models/StableForestClassifier_SIRUS/","page":"StableForestClassifier","title":"StableForestClassifier","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/StableForestClassifier_SIRUS/","page":"StableForestClassifier","title":"StableForestClassifier","text":"fitresult: A StableForest object.","category":"page"},{"location":"models/StableForestClassifier_SIRUS/#Operations","page":"StableForestClassifier","title":"Operations","text":"","category":"section"},{"location":"models/StableForestClassifier_SIRUS/","page":"StableForestClassifier","title":"StableForestClassifier","text":"predict(mach, Xnew): Return a vector of predictions for each row of Xnew.","category":"page"},{"location":"models/TunedModel_MLJTuning/#TunedModel_MLJTuning","page":"TunedModel","title":"TunedModel","text":"","category":"section"},{"location":"models/TunedModel_MLJTuning/","page":"TunedModel","title":"TunedModel","text":"tuned_model = TunedModel(; model=,\n tuning=RandomSearch(),\n resampling=Holdout(),\n range=nothing,\n measure=nothing,\n n=default_n(tuning, range),\n operation=nothing,\n other_options...)","category":"page"},{"location":"models/TunedModel_MLJTuning/","page":"TunedModel","title":"TunedModel","text":"Construct a model wrapper for hyper-parameter optimization of a supervised learner, specifying the tuning strategy and model whose hyper-parameters are to be mutated.","category":"page"},{"location":"models/TunedModel_MLJTuning/","page":"TunedModel","title":"TunedModel","text":"tuned_model = TunedModel(; models=,\n resampling=Holdout(),\n measure=nothing,\n n=length(models),\n operation=nothing,\n other_options...)","category":"page"},{"location":"models/TunedModel_MLJTuning/","page":"TunedModel","title":"TunedModel","text":"Construct a wrapper for multiple models, for selection of an optimal one (equivalent to specifying tuning=Explicit() and range=models above). Elements of the iterator models need not have a common type, but they must all be Deterministic or all be Probabilistic; this is not checked but is inferred from the first element generated.","category":"page"},{"location":"models/TunedModel_MLJTuning/","page":"TunedModel","title":"TunedModel","text":"See below for a complete list of options.","category":"page"},{"location":"models/TunedModel_MLJTuning/#Training","page":"TunedModel","title":"Training","text":"","category":"section"},{"location":"models/TunedModel_MLJTuning/","page":"TunedModel","title":"TunedModel","text":"Calling fit!(mach) on a machine mach=machine(tuned_model, X, y) or mach=machine(tuned_model, X, y, w) will:","category":"page"},{"location":"models/TunedModel_MLJTuning/","page":"TunedModel","title":"TunedModel","text":"Instigate a search, over clones of model, with the hyperparameter mutations specified by range, for a model optimizing the specified measure, using performance evaluations carried out with the specified tuning strategy and resampling strategy. In the case that models is explicitly listed, the search is instead over the models generated by the iterator models.\nFit an internal machine, based on the optimal model fitted_params(mach).best_model, wrapping the optimal model object in all the provided data X, y(, w). 
Calling predict(mach, Xnew) then returns predictions on Xnew of this internal machine. The final train can be suppressed by setting train_best=false.","category":"page"},{"location":"models/TunedModel_MLJTuning/#Search-space","page":"TunedModel","title":"Search space","text":"","category":"section"},{"location":"models/TunedModel_MLJTuning/","page":"TunedModel","title":"TunedModel","text":"The range objects supported depend on the tuning strategy specified. Query the strategy docstring for details. To optimize over an explicit list v of models of the same type, use strategy=Explicit() and specify model=v[1] and range=v.","category":"page"},{"location":"models/TunedModel_MLJTuning/","page":"TunedModel","title":"TunedModel","text":"The number of models searched is specified by n. If unspecified, then MLJTuning.default_n(tuning, range) is used. When n is increased and fit!(mach) called again, the old search history is reinstated and the search continues where it left off.","category":"page"},{"location":"models/TunedModel_MLJTuning/#Measures-(metrics)","page":"TunedModel","title":"Measures (metrics)","text":"","category":"section"},{"location":"models/TunedModel_MLJTuning/","page":"TunedModel","title":"TunedModel","text":"If more than one measure is specified, then only the first is optimized (unless strategy is multi-objective) but the performance against every measure specified will be computed and reported in report(mach).best_performance and other relevant attributes of the generated report. Options exist to pass per-observation weights or class weights to measures; see below.","category":"page"},{"location":"models/TunedModel_MLJTuning/","page":"TunedModel","title":"TunedModel","text":"Important. If a custom measure my_measure is used, and the measure is a score rather than a loss, be sure to check that MLJ.orientation(my_measure) == :score to ensure maximization of the measure, rather than minimization. 
Override an incorrect value with MLJ.orientation(::typeof(my_measure)) = :score.","category":"page"},{"location":"models/TunedModel_MLJTuning/#Accessing-the-fitted-parameters-and-other-training-(tuning)-outcomes","page":"TunedModel","title":"Accessing the fitted parameters and other training (tuning) outcomes","text":"","category":"section"},{"location":"models/TunedModel_MLJTuning/","page":"TunedModel","title":"TunedModel","text":"A Plots.jl plot of performance estimates is returned by plot(mach) or heatmap(mach).","category":"page"},{"location":"models/TunedModel_MLJTuning/","page":"TunedModel","title":"TunedModel","text":"Once a tuning machine mach has been trained as above, then fitted_params(mach) has these keys/values:","category":"page"},{"location":"models/TunedModel_MLJTuning/","page":"TunedModel","title":"TunedModel","text":"key value\nbest_model optimal model instance\nbest_fitted_params learned parameters of the optimal model","category":"page"},{"location":"models/TunedModel_MLJTuning/","page":"TunedModel","title":"TunedModel","text":"The named tuple report(mach) includes these keys/values:","category":"page"},{"location":"models/TunedModel_MLJTuning/","page":"TunedModel","title":"TunedModel","text":"key value\nbest_model optimal model instance\nbest_history_entry corresponding entry in the history, including performance estimate\nbest_report report generated by fitting the optimal model to all data\nhistory tuning strategy-specific history of all evaluations","category":"page"},{"location":"models/TunedModel_MLJTuning/","page":"TunedModel","title":"TunedModel","text":"plus other key/value pairs specific to the tuning strategy.","category":"page"},{"location":"models/TunedModel_MLJTuning/","page":"TunedModel","title":"TunedModel","text":"Each element of history is a property-accessible object with these properties:","category":"page"},{"location":"models/TunedModel_MLJTuning/","page":"TunedModel","title":"TunedModel","text":"key value\nmeasure vector of measures (metrics)\nmeasurement vector of measurements, one per measure\nper_fold vector of vectors of unaggregated per-fold measurements\nevaluation full PerformanceEvaluation/CompactPerformanceEvaluation object","category":"page"},{"location":"models/TunedModel_MLJTuning/#Complete-list-of-key-word-options","page":"TunedModel","title":"Complete list of key-word options","text":"","category":"section"},{"location":"models/TunedModel_MLJTuning/","page":"TunedModel","title":"TunedModel","text":"model: Supervised model prototype that is cloned and mutated to generate models for evaluation\nmodels: Alternatively, an iterator of MLJ models to be explicitly evaluated. These may have varying types.\ntuning=RandomSearch(): tuning strategy to be applied (eg, Grid()). See the Tuning Models section of the MLJ manual for a complete list of options.\nresampling=Holdout(): resampling strategy (eg, Holdout(), CV(), StratifiedCV()) to be applied in performance evaluations\nmeasure: measure or measures to be applied in performance evaluations; only the first used in optimization (unless the strategy is multi-objective) but all reported to the history\nweights: per-observation weights to be passed to the measure(s) in performance evaluations, where supported. Check support with supports_weights(measure).\nclass_weights: class weights to be passed to the measure(s) in performance evaluations, where supported. 
Check support with supports_class_weights(measure).\nrepeats=1: for generating train/test sets multiple times in resampling (\"Monte Carlo\" resampling); see evaluate! for details\noperation/operations: One of predict, predict_mean, predict_mode, predict_median, or predict_joint, or a vector of these of the same length as measure/measures. Automatically inferred if left unspecified.\nrange: range object; tuning strategy documentation describes supported types\nselection_heuristic: the rule determining how the best model is decided. According to the default heuristic, NaiveSelection(), measure (or the first element of measure) is evaluated for each resample and these per-fold measurements are aggregated. The model with the lowest (resp. highest) aggregate is chosen if the measure is a :loss (resp. a :score).\nn: number of iterations (ie, models to be evaluated); set by tuning strategy if left unspecified\ntrain_best=true: whether to train the optimal model\nacceleration=default_resource(): mode of parallelization for tuning strategies that support this\nacceleration_resampling=CPU1(): mode of parallelization for resampling\ncheck_measure=true: whether to check that measure is compatible with the specified model and operation\ncache=true: whether to cache model-specific representations of user-supplied data; set to false to conserve memory. Speed gains likely limited to the case resampling isa Holdout.\ncompact_history=true: whether to write CompactPerformanceEvaluation or regular PerformanceEvaluation objects to the history (accessed via the :evaluation key); the compact form excludes some fields to conserve memory.","category":"page"},{"location":"models/RidgeCVRegressor_MLJScikitLearnInterface/#RidgeCVRegressor_MLJScikitLearnInterface","page":"RidgeCVRegressor","title":"RidgeCVRegressor","text":"","category":"section"},{"location":"models/RidgeCVRegressor_MLJScikitLearnInterface/","page":"RidgeCVRegressor","title":"RidgeCVRegressor","text":"RidgeCVRegressor","category":"page"},{"location":"models/RidgeCVRegressor_MLJScikitLearnInterface/","page":"RidgeCVRegressor","title":"RidgeCVRegressor","text":"A model type for constructing a ridge regressor with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/RidgeCVRegressor_MLJScikitLearnInterface/","page":"RidgeCVRegressor","title":"RidgeCVRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/RidgeCVRegressor_MLJScikitLearnInterface/","page":"RidgeCVRegressor","title":"RidgeCVRegressor","text":"RidgeCVRegressor = @load RidgeCVRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/RidgeCVRegressor_MLJScikitLearnInterface/","page":"RidgeCVRegressor","title":"RidgeCVRegressor","text":"Do model = RidgeCVRegressor() to construct an instance with default hyper-parameters. 
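For instance, a minimal sketch using synthetic data (make_regression and the alphas grid below are illustrative assumptions, not part of this docstring): using MLJ\nRidgeCVRegressor = @load RidgeCVRegressor pkg=MLJScikitLearnInterface\nmodel = RidgeCVRegressor(alphas=(0.01, 0.1, 1.0, 10.0))\nX, y = make_regression(100, 5)\nmach = machine(model, X, y) |> fit!\nyhat = predict(mach, X). 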
Provide keyword arguments to override hyper-parameter defaults, as in RidgeCVRegressor(alphas=...).","category":"page"},{"location":"models/RidgeCVRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"RidgeCVRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/RidgeCVRegressor_MLJScikitLearnInterface/","page":"RidgeCVRegressor","title":"RidgeCVRegressor","text":"alphas = (0.1, 1.0, 10.0)\nfit_intercept = true\nscoring = nothing\ncv = 5\ngcv_mode = nothing\nstore_cv_values = false","category":"page"},{"location":"models/CountTransformer_MLJText/#CountTransformer_MLJText","page":"CountTransformer","title":"CountTransformer","text":"","category":"section"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"CountTransformer","category":"page"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"A model type for constructing a count transformer, based on MLJText.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"CountTransformer = @load CountTransformer pkg=MLJText","category":"page"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"Do model = CountTransformer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in CountTransformer(max_doc_freq=...).","category":"page"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"The transformer converts a collection of documents, tokenized or pre-parsed as bags of words/ngrams, to a matrix of term counts.","category":"page"},{"location":"models/CountTransformer_MLJText/#Training-data","page":"CountTransformer","title":"Training data","text":"","category":"section"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"mach = machine(model, X)","category":"page"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"Here:","category":"page"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"X is any vector whose elements are either tokenized documents or bags of words/ngrams. 
Specifically, each element is one of the following:\nA vector of abstract strings (tokens), e.g., [\"I\", \"like\", \"Sam\", \".\", \"Sam\", \"is\", \"nice\", \".\"] (scitype AbstractVector{Textual})\nA dictionary of counts, indexed on abstract strings, e.g., Dict(\"I\"=>1, \"Sam\"=>2, \"Sam is\"=>1) (scitype Multiset{Textual})\nA dictionary of counts, indexed on plain ngrams, e.g., Dict((\"I\",)=>1, (\"Sam\",)=>2, (\"I\", \"Sam\")=>1) (scitype Multiset{<:NTuple{N,Textual} where N}); here a plain ngram is a tuple of abstract strings.","category":"page"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/CountTransformer_MLJText/#Hyper-parameters","page":"CountTransformer","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"max_doc_freq=1.0: Restricts the vocabulary that the transformer will consider. Terms that occur in > max_doc_freq documents will not be considered by the transformer. For example, if max_doc_freq is set to 0.9, terms that are in more than 90% of the documents will be removed.\nmin_doc_freq=0.0: Restricts the vocabulary that the transformer will consider. Terms that occur in < min_doc_freq documents will not be considered by the transformer. A value of 0.01 means that only terms that are at least in 1% of the documents will be included.","category":"page"},{"location":"models/CountTransformer_MLJText/#Operations","page":"CountTransformer","title":"Operations","text":"","category":"section"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"transform(mach, Xnew): Based on the vocabulary learned in training, return the matrix of counts for Xnew, a vector of the same form as X above. The matrix has size (n, p), where n = length(Xnew) and p is the size of the vocabulary. Tokens/ngrams not appearing in the learned vocabulary are scored zero.","category":"page"},{"location":"models/CountTransformer_MLJText/#Fitted-parameters","page":"CountTransformer","title":"Fitted parameters","text":"","category":"section"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"vocab: A vector containing the strings used in the transformer's vocabulary.","category":"page"},{"location":"models/CountTransformer_MLJText/#Examples","page":"CountTransformer","title":"Examples","text":"","category":"section"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"CountTransformer accepts a variety of inputs. 
The example below transforms tokenized documents:","category":"page"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"using MLJ\nimport TextAnalysis\n\nCountTransformer = @load CountTransformer pkg=MLJText\n\ndocs = [\"Hi my name is Sam.\", \"How are you today?\"]\ncount_transformer = CountTransformer()\n\njulia> tokenized_docs = TextAnalysis.tokenize.(docs)\n2-element Vector{Vector{String}}:\n [\"Hi\", \"my\", \"name\", \"is\", \"Sam\", \".\"]\n [\"How\", \"are\", \"you\", \"today\", \"?\"]\n\nmach = machine(count_transformer, tokenized_docs)\nfit!(mach)\n\nfitted_params(mach)\n\ncount_mat = transform(mach, tokenized_docs)","category":"page"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"Alternatively, one can provide documents pre-parsed as ngram counts:","category":"page"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"using MLJ\nimport TextAnalysis\n\ndocs = [\"Hi my name is Sam.\", \"How are you today?\"]\ncorpus = TextAnalysis.Corpus(TextAnalysis.NGramDocument.(docs, 1, 2))\nngram_docs = TextAnalysis.ngrams.(corpus)\n\njulia> ngram_docs[1]\nDict{AbstractString, Int64} with 11 entries:\n \"is\" => 1\n \"my\" => 1\n \"name\" => 1\n \".\" => 1\n \"Hi\" => 1\n \"Sam\" => 1\n \"my name\" => 1\n \"Hi my\" => 1\n \"name is\" => 1\n \"Sam .\" => 1\n \"is Sam\" => 1\n\ncount_transformer = CountTransformer()\nmach = machine(count_transformer, ngram_docs)\nMLJ.fit!(mach)\nfitted_params(mach)\n\ncount_mat = transform(mach, ngram_docs)","category":"page"},{"location":"models/CountTransformer_MLJText/","page":"CountTransformer","title":"CountTransformer","text":"See also TfidfTransformer, BM25Transformer","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/#RandomForestRegressor_DecisionTree","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"","category":"section"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"RandomForestRegressor","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"A model type for constructing a CART random forest regressor, based on DecisionTree.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"RandomForestRegressor = @load RandomForestRegressor pkg=DecisionTree","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"Do model = RandomForestRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomForestRegressor(max_depth=...).","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"RandomForestRegressor implements the standard Random Forest algorithm, originally published in Breiman, L. (2001): \"Random Forests.\", Machine Learning, vol. 45, pp. 
5–32","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/#Training-data","page":"RandomForestRegressor","title":"Training data","text":"","category":"section"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"where","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)\ny: the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"Train the machine with fit!(mach, rows=...).","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/#Hyperparameters","page":"RandomForestRegressor","title":"Hyperparameters","text":"","category":"section"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"max_depth=-1: max depth of the decision tree (-1=any)\nmin_samples_leaf=1: min number of samples each leaf needs to have\nmin_samples_split=2: min number of samples needed for a split\nmin_purity_increase=0: min purity needed for a split\nn_subfeatures=-1: number of features to select at random (0 for all, -1 for square root of number of features)\nn_trees=10: number of trees to train\nsampling_fraction=0.7 fraction of samples to train each tree on\nfeature_importance: method to use for computing feature importances. 
One of (:impurity, :split)\nrng=Random.GLOBAL_RNG: random number generator or seed","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/#Operations","page":"RandomForestRegressor","title":"Operations","text":"","category":"section"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"predict(mach, Xnew): return predictions of the target given new features Xnew having the same scitype as X above.","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/#Fitted-parameters","page":"RandomForestRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"forest: the Ensemble object returned by the core DecisionTree.jl algorithm","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/#Report","page":"RandomForestRegressor","title":"Report","text":"","category":"section"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"The fields of report(mach) are:","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"features: the names of the features encountered in training","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/#Accessor-functions","page":"RandomForestRegressor","title":"Accessor functions","text":"","category":"section"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"feature_importances(mach) returns a vector of (feature::Symbol => importance) pairs; the type of importance is determined by the hyperparameter feature_importance (see above)","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/#Examples","page":"RandomForestRegressor","title":"Examples","text":"","category":"section"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"using MLJ\nForest = @load RandomForestRegressor pkg=DecisionTree\nforest = Forest(max_depth=4, min_samples_split=3)\n\nX, y = make_regression(100, 2) ## synthetic data\nmach = machine(forest, X, y) |> fit!\n\nXnew, _ = make_regression(3, 2)\nyhat = predict(mach, Xnew) ## new predictions\n\nfitted_params(mach).forest ## raw `Ensemble` object from DecisionTree.jl\nfeature_importances(mach)","category":"page"},{"location":"models/RandomForestRegressor_DecisionTree/","page":"RandomForestRegressor","title":"RandomForestRegressor","text":"See also DecisionTree.jl and the unwrapped model type 
MLJDecisionTreeInterface.DecisionTree.RandomForestRegressor.","category":"page"},{"location":"models/MultiTaskElasticNetRegressor_MLJScikitLearnInterface/#MultiTaskElasticNetRegressor_MLJScikitLearnInterface","page":"MultiTaskElasticNetRegressor","title":"MultiTaskElasticNetRegressor","text":"","category":"section"},{"location":"models/MultiTaskElasticNetRegressor_MLJScikitLearnInterface/","page":"MultiTaskElasticNetRegressor","title":"MultiTaskElasticNetRegressor","text":"MultiTaskElasticNetRegressor","category":"page"},{"location":"models/MultiTaskElasticNetRegressor_MLJScikitLearnInterface/","page":"MultiTaskElasticNetRegressor","title":"MultiTaskElasticNetRegressor","text":"A model type for constructing a multi-target elastic net regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/MultiTaskElasticNetRegressor_MLJScikitLearnInterface/","page":"MultiTaskElasticNetRegressor","title":"MultiTaskElasticNetRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/MultiTaskElasticNetRegressor_MLJScikitLearnInterface/","page":"MultiTaskElasticNetRegressor","title":"MultiTaskElasticNetRegressor","text":"MultiTaskElasticNetRegressor = @load MultiTaskElasticNetRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/MultiTaskElasticNetRegressor_MLJScikitLearnInterface/","page":"MultiTaskElasticNetRegressor","title":"MultiTaskElasticNetRegressor","text":"Do model = MultiTaskElasticNetRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultiTaskElasticNetRegressor(alpha=...).","category":"page"},{"location":"models/MultiTaskElasticNetRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"MultiTaskElasticNetRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/MultiTaskElasticNetRegressor_MLJScikitLearnInterface/","page":"MultiTaskElasticNetRegressor","title":"MultiTaskElasticNetRegressor","text":"alpha = 1.0\nl1_ratio = 0.5\nfit_intercept = true\ncopy_X = true\nmax_iter = 1000\ntol = 0.0001\nwarm_start = false\nrandom_state = nothing\nselection = cyclic","category":"page"},{"location":"models/XGBoostCount_XGBoost/#XGBoostCount_XGBoost","page":"XGBoostCount","title":"XGBoostCount","text":"","category":"section"},{"location":"models/XGBoostCount_XGBoost/","page":"XGBoostCount","title":"XGBoostCount","text":"XGBoostCount","category":"page"},{"location":"models/XGBoostCount_XGBoost/","page":"XGBoostCount","title":"XGBoostCount","text":"A model type for constructing a eXtreme Gradient Boosting Count Regressor, based on XGBoost.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/XGBoostCount_XGBoost/","page":"XGBoostCount","title":"XGBoostCount","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/XGBoostCount_XGBoost/","page":"XGBoostCount","title":"XGBoostCount","text":"XGBoostCount = @load XGBoostCount pkg=XGBoost","category":"page"},{"location":"models/XGBoostCount_XGBoost/","page":"XGBoostCount","title":"XGBoostCount","text":"Do model = XGBoostCount() to construct an instance with default hyper-parameters. 
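For example, a minimal sketch with a synthetic count-valued target (the toy table and target below are illustrative assumptions, not part of this docstring): using MLJ\nXGBoostCount = @load XGBoostCount pkg=XGBoost\nmodel = XGBoostCount()\nX = (x1 = rand(100), x2 = rand(100))\ny = rand(0:5, 100)\nmach = machine(model, X, y) |> fit!\nyhat = predict(mach, X). 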
Provide keyword arguments to override hyper-parameter defaults, as in XGBoostCount(test=...).","category":"page"},{"location":"models/XGBoostCount_XGBoost/","page":"XGBoostCount","title":"XGBoostCount","text":"Univariate discrete regression using xgboost.","category":"page"},{"location":"models/XGBoostCount_XGBoost/#Training-data","page":"XGBoostCount","title":"Training data","text":"","category":"section"},{"location":"models/XGBoostCount_XGBoost/","page":"XGBoostCount","title":"XGBoostCount","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/XGBoostCount_XGBoost/","page":"XGBoostCount","title":"XGBoostCount","text":"m = machine(model, X, y)","category":"page"},{"location":"models/XGBoostCount_XGBoost/","page":"XGBoostCount","title":"XGBoostCount","text":"where","category":"page"},{"location":"models/XGBoostCount_XGBoost/","page":"XGBoostCount","title":"XGBoostCount","text":"X: the input features, either an AbstractMatrix or a Tables.jl-compatible table.\ny: the target, an AbstractVector with Count element scitype.","category":"page"},{"location":"models/XGBoostCount_XGBoost/","page":"XGBoostCount","title":"XGBoostCount","text":"Train using fit!(m, rows=...).","category":"page"},{"location":"models/XGBoostCount_XGBoost/#Hyper-parameters","page":"XGBoostCount","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/XGBoostCount_XGBoost/","page":"XGBoostCount","title":"XGBoostCount","text":"See https://xgboost.readthedocs.io/en/stable/parameter.html.","category":"page"},{"location":"models/HistGradientBoostingRegressor_MLJScikitLearnInterface/#HistGradientBoostingRegressor_MLJScikitLearnInterface","page":"HistGradientBoostingRegressor","title":"HistGradientBoostingRegressor","text":"","category":"section"},{"location":"models/HistGradientBoostingRegressor_MLJScikitLearnInterface/","page":"HistGradientBoostingRegressor","title":"HistGradientBoostingRegressor","text":"HistGradientBoostingRegressor","category":"page"},{"location":"models/HistGradientBoostingRegressor_MLJScikitLearnInterface/","page":"HistGradientBoostingRegressor","title":"HistGradientBoostingRegressor","text":"A model type for constructing a gradient boosting ensemble regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/HistGradientBoostingRegressor_MLJScikitLearnInterface/","page":"HistGradientBoostingRegressor","title":"HistGradientBoostingRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/HistGradientBoostingRegressor_MLJScikitLearnInterface/","page":"HistGradientBoostingRegressor","title":"HistGradientBoostingRegressor","text":"HistGradientBoostingRegressor = @load HistGradientBoostingRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/HistGradientBoostingRegressor_MLJScikitLearnInterface/","page":"HistGradientBoostingRegressor","title":"HistGradientBoostingRegressor","text":"Do model = HistGradientBoostingRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in HistGradientBoostingRegressor(loss=...).","category":"page"},{"location":"models/HistGradientBoostingRegressor_MLJScikitLearnInterface/","page":"HistGradientBoostingRegressor","title":"HistGradientBoostingRegressor","text":"This estimator builds an additive model in a forward stage-wise fashion; it allows for the optimization of arbitrary differentiable loss functions. 
In each stage a regression tree is fit on the negative gradient of the given loss function.","category":"page"},{"location":"models/HistGradientBoostingRegressor_MLJScikitLearnInterface/","page":"HistGradientBoostingRegressor","title":"HistGradientBoostingRegressor","text":"HistGradientBoostingRegressor is a much faster variant of this algorithm for intermediate datasets (n_samples >= 10_000).","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/#EvoTreeMLE_EvoTrees","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"","category":"section"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"EvoTreeMLE(;kwargs...)","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"A model type for constructing an EvoTreeMLE, based on EvoTrees.jl, and implementing both an internal API and the MLJ model interface. EvoTreeMLE performs maximum likelihood estimation. The assumed distribution is specified through the loss kwarg. Both Gaussian and Logistic distributions are supported.","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/#Hyper-parameters","page":"EvoTreeMLE","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"loss=:gaussian: Loss to be minimized during training. One of:","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":":gaussian / :gaussian_mle\n:logistic / :logistic_mle\nnrounds=100: Number of rounds. It corresponds to the number of trees that will be sequentially stacked. Must be >= 1.\neta=0.1: Learning rate. Each tree's raw predictions are scaled by eta prior to being added to the stack of predictions. Must be > 0.","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"A lower eta results in slower learning, requiring a higher nrounds but typically improves model performance. ","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"L2::T=0.0: L2 regularization factor on aggregate gain. Must be >= 0. Higher L2 can result in a more robust model.\nlambda::T=0.0: L2 regularization factor on individual gain. Must be >= 0. Higher lambda can result in a more robust model.\ngamma::T=0.0: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model. Must be >= 0.\nmax_depth=6: Maximum depth of a tree. Must be >= 1. A tree of depth 1 is made of a single prediction leaf. A complete tree of depth N contains 2^(N - 1) terminal leaves and 2^(N - 1) - 1 split nodes. Compute cost is proportional to 2^max_depth. Typical optimal values are in the 3 to 9 range.\nmin_weight=8.0: Minimum weight needed in a node to perform a split. Matches the number of observations by default or the sum of weights as provided by the weights vector. Must be > 0.\nrowsample=1.0: Proportion of rows that are sampled at each iteration to build the tree. Should be in ]0, 1].\ncolsample=1.0: Proportion of columns / features that are sampled at each iteration to build the tree. Should be in ]0, 1].\nnbins=64: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins. 
Should be between 2 and 255.\nmonotone_constraints=Dict{Int, Int}(): Specify monotonic constraints using a dict where the key is the feature index and the value the applicable constraint (-1=decreasing, 0=none, 1=increasing). !Experimental feature: note that for MLE regression, constraints may not be enforced systematically.\ntree_type=\"binary\": Tree structure to be used. One of:\nbinary: Each node of a tree is grown independently. Trees are built depthwise until max depth is reached, or until min weight or gain (see gamma) stops further node splits.\noblivious: A common splitting condition is imposed to all nodes of a given depth.\nrng=123: Either an integer used as a seed to the random number generator or an actual random number generator (::Random.AbstractRNG).","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/#Internal-API","page":"EvoTreeMLE","title":"Internal API","text":"","category":"section"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"Do config = EvoTreeMLE() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeMLE(max_depth=...).","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/#Training-model","page":"EvoTreeMLE","title":"Training model","text":"","category":"section"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"A model is built using fit_evotree:","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"model = fit_evotree(config; x_train, y_train, kwargs...)","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/#Inference","page":"EvoTreeMLE","title":"Inference","text":"","category":"section"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"Predictions are obtained using predict which returns a Matrix of size [nobs, nparams] where the second dimension refers to μ & σ for Normal/Gaussian and μ & s for Logistic.","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"EvoTrees.predict(model, X)","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"Alternatively, models act as a functor, returning predictions when called as a function with features as argument:","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"model(X)","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/#MLJ","page":"EvoTreeMLE","title":"MLJ","text":"","category":"section"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"From MLJ, the type can be imported using:","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"EvoTreeMLE = @load EvoTreeMLE pkg=EvoTrees","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"Do model = EvoTreeMLE() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeMLE(loss=...).","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/#Training-data","page":"EvoTreeMLE","title":"Training data","text":"","category":"section"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"where","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)\ny: is the target, which can be any AbstractVector whose element scitype is <:Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/#Operations","page":"EvoTreeMLE","title":"Operations","text":"","category":"section"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"predict(mach, Xnew): returns a vector of Gaussian or Logistic distributions (according to provided loss) given features Xnew having the same scitype as X above.","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"Predictions are probabilistic.","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"Specific metrics can also be predicted using:","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"predict_mean(mach, Xnew)\npredict_mode(mach, Xnew)\npredict_median(mach, Xnew)","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/#Fitted-parameters","page":"EvoTreeMLE","title":"Fitted parameters","text":"","category":"section"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":":fitresult: The GBTree object returned by EvoTrees.jl fitting algorithm.","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/#Report","page":"EvoTreeMLE","title":"Report","text":"","category":"section"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"The fields of report(mach) are:","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":":features: The names of the features encountered in training.","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/#Examples","page":"EvoTreeMLE","title":"Examples","text":"","category":"section"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"## Internal API\nusing EvoTrees\nconfig = EvoTreeMLE(max_depth=5, nbins=32, nrounds=100)\nnobs, nfeats = 1_000, 5\nx_train, y_train = randn(nobs, nfeats), rand(nobs)\nmodel = fit_evotree(config; x_train, y_train)\npreds = EvoTrees.predict(model, 
x_train)","category":"page"},{"location":"models/EvoTreeMLE_EvoTrees/","page":"EvoTreeMLE","title":"EvoTreeMLE","text":"## MLJ Interface\nusing MLJ\nEvoTreeMLE = @load EvoTreeMLE pkg=EvoTrees\nmodel = EvoTreeMLE(max_depth=5, nbins=32, nrounds=100)\nX, y = @load_boston\nmach = machine(model, X, y) |> fit!\npreds = predict(mach, X)\npreds = predict_mean(mach, X)\npreds = predict_mode(mach, X)\npreds = predict_median(mach, X)","category":"page"},{"location":"models/Pipeline_MLJBase/#Pipeline_MLJBase","page":"Pipeline","title":"Pipeline","text":"","category":"section"},{"location":"models/Pipeline_MLJBase/","page":"Pipeline","title":"Pipeline","text":"Pipeline(component1, component2, ... , componentk; options...)\nPipeline(name1=component1, name2=component2, ..., namek=componentk; options...)\ncomponent1 |> component2 |> ... |> componentk","category":"page"},{"location":"models/Pipeline_MLJBase/","page":"Pipeline","title":"Pipeline","text":"Create an instance of a composite model type which sequentially composes the specified components in order. This means component1 receives inputs, whose output is passed to component2, and so forth. A \"component\" is either a Model instance, a model type (converted immediately to its default instance) or any callable object. Here the \"output\" of a model is what predict returns if it is Supervised, or what transform returns if it is Unsupervised.","category":"page"},{"location":"models/Pipeline_MLJBase/","page":"Pipeline","title":"Pipeline","text":"Names for the component fields are automatically generated unless explicitly specified, as in","category":"page"},{"location":"models/Pipeline_MLJBase/","page":"Pipeline","title":"Pipeline","text":"Pipeline(encoder=ContinuousEncoder(drop_last=false),\n stand=Standardizer())","category":"page"},{"location":"models/Pipeline_MLJBase/","page":"Pipeline","title":"Pipeline","text":"The Pipeline constructor accepts keyword options discussed further below.","category":"page"},{"location":"models/Pipeline_MLJBase/","page":"Pipeline","title":"Pipeline","text":"Ordinary functions (and other callables) may be inserted in the pipeline as shown in the following example:","category":"page"},{"location":"models/Pipeline_MLJBase/","page":"Pipeline","title":"Pipeline","text":"Pipeline(X->coerce(X, :age=>Continuous), OneHotEncoder, ConstantClassifier)","category":"page"},{"location":"models/Pipeline_MLJBase/#Syntactic-sugar","page":"Pipeline","title":"Syntactic sugar","text":"","category":"section"},{"location":"models/Pipeline_MLJBase/","page":"Pipeline","title":"Pipeline","text":"The |> operator is overloaded to construct pipelines out of models, callables, and existing pipelines:","category":"page"},{"location":"models/Pipeline_MLJBase/","page":"Pipeline","title":"Pipeline","text":"LinearRegressor = @load LinearRegressor pkg=MLJLinearModels add=true\nPCA = @load PCA pkg=MultivariateStats add=true\n\npipe1 = MLJBase.table |> ContinuousEncoder |> Standardizer\npipe2 = PCA |> LinearRegressor\npipe1 |> pipe2","category":"page"},{"location":"models/Pipeline_MLJBase/","page":"Pipeline","title":"Pipeline","text":"At most one of the components may be a supervised model, but this model can appear in any position. A pipeline with a Supervised component is itself Supervised and implements the predict operation. 
It is otherwise Unsupervised (possibly Static) and implements transform.","category":"page"},{"location":"models/Pipeline_MLJBase/#Special-operations","page":"Pipeline","title":"Special operations","text":"","category":"section"},{"location":"models/Pipeline_MLJBase/","page":"Pipeline","title":"Pipeline","text":"If all the components are invertible unsupervised models (ie, implement inverse_transform) then inverse_transform is implemented for the pipeline. If there are no supervised models, then predict is nevertheless implemented, assuming the last component is a model that implements it (some clustering models). Similarly, calling transform on a supervised pipeline calls transform on the supervised component.","category":"page"},{"location":"models/Pipeline_MLJBase/#Optional-key-word-arguments","page":"Pipeline","title":"Optional key-word arguments","text":"","category":"section"},{"location":"models/Pipeline_MLJBase/","page":"Pipeline","title":"Pipeline","text":"prediction_type - prediction type of the pipeline; possible values: :deterministic, :probabilistic, :interval (default=:deterministic if not inferable)\noperation - operation applied to the supervised component model, when present; possible values: predict, predict_mean, predict_median, predict_mode (default=predict)\ncache - whether the internal machines created for component models should cache model-specific representations of data (see machine) (default=true)","category":"page"},{"location":"models/Pipeline_MLJBase/","page":"Pipeline","title":"Pipeline","text":"warning: Warning\nSet cache=false to guarantee data anonymization.","category":"page"},{"location":"models/Pipeline_MLJBase/","page":"Pipeline","title":"Pipeline","text":"To build more complicated non-branching pipelines, refer to the MLJ manual sections on composing models.","category":"page"},{"location":"homogeneous_ensembles/#Homogeneous-Ensembles","page":"Homogeneous Ensembles","title":"Homogeneous Ensembles","text":"","category":"section"},{"location":"homogeneous_ensembles/","page":"Homogeneous Ensembles","title":"Homogeneous Ensembles","text":"Although an ensemble of models sharing a common set of hyperparameters can be defined using the learning network API, MLJ's EnsembleModel model wrapper is preferred, for convenience and best performance. Examples of using EnsembleModel are given in this Data Science Tutorial.","category":"page"},{"location":"homogeneous_ensembles/","page":"Homogeneous Ensembles","title":"Homogeneous Ensembles","text":"When bagging decision trees, further randomness is normally introduced by subsampling features, when training each node of each tree (Ho (1995), Brieman and Cutler (2001)). A bagged ensemble of such trees is known as a Random Forest. You can see an example of using EnsembleModel to build a random forest in this Data Science Tutorial. However, you may also want to use a canned random forest model. Run models(\"RandomForest\") to list such models.","category":"page"},{"location":"homogeneous_ensembles/","page":"Homogeneous Ensembles","title":"Homogeneous Ensembles","text":"MLJEnsembles.EnsembleModel","category":"page"},{"location":"homogeneous_ensembles/#MLJEnsembles.EnsembleModel","page":"Homogeneous Ensembles","title":"MLJEnsembles.EnsembleModel","text":"EnsembleModel(model,\n atomic_weights=Float64[],\n bagging_fraction=0.8,\n n=100,\n rng=GLOBAL_RNG,\n acceleration=CPU1(),\n out_of_bag_measure=[])\n\nCreate a model for training an ensemble of n clones of model, with optional bagging. 
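For orientation, a minimal sketch of wrapping an atomic model follows; the atomic model, the hyper-parameter values and the out-of-bag measure are illustrative, and the options themselves are explained below.

```julia
using MLJ
DecisionTreeClassifier = @load DecisionTreeClassifier pkg=DecisionTree  # illustrative atomic model
forest = EnsembleModel(model=DecisionTreeClassifier(),
                       n=100,
                       bagging_fraction=0.7,
                       out_of_bag_measure=[log_loss])
X, y = @load_iris
mach = machine(forest, X, y) |> fit!
yhat = predict(mach, X)   # probabilistic, like the atomic model
report(mach)              # includes the requested out-of-bag performance estimate
```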
Ensembling is useful if fit!(machine(atom, data...)) does not create identical models on repeated calls (ie, is a stochastic model, such as a decision tree with randomized node selection criteria), or if bagging_fraction is set to a value less than 1.0, or both.\n\nHere the atomic model must support targets with scitype AbstractVector{<:Finite} (single-target classifiers) or AbstractVector{<:Continuous} (single-target regressors).\n\nIf rng is an integer, then MersenneTwister(rng) is the random number generator used for bagging. Otherwise some AbstractRNG object is expected.\n\nThe atomic predictions are optionally weighted according to the vector atomic_weights (to allow for external optimization) except in the case that model is a Deterministic classifier, in which case atomic_weights are ignored.\n\nThe ensemble model is Deterministic or Probabilistic, according to the corresponding supertype of atom. In the case of deterministic classifiers (target_scitype(atom) <: Abstract{<:Finite}), the predictions are majority votes, and for regressors (target_scitype(atom)<: AbstractVector{<:Continuous}) they are ordinary averages. Probabilistic predictions are obtained by averaging the atomic probability distribution/mass functions; in particular, for regressors, the ensemble prediction on each input pattern has the type MixtureModel{VF,VS,D} from the Distributions.jl package, where D is the type of predicted distribution for atom.\n\nSpecify acceleration=CPUProcesses() for distributed computing, or CPUThreads() for multithreading.\n\nIf a single measure or non-empty vector of measures is specified by out_of_bag_measure, then out-of-bag estimates of performance are written to the training report (call report on the trained machine wrapping the ensemble model).\n\nImportant: If per-observation or class weights w (not to be confused with atomic weights) are specified when constructing a machine for the ensemble model, as in mach = machine(ensemble_model, X, y, w), then w is used by any measures specified in out_of_bag_measure that support them.\n\n\n\n\n\n","category":"function"},{"location":"models/PLSRegressor_PartialLeastSquaresRegressor/#PLSRegressor_PartialLeastSquaresRegressor","page":"PLSRegressor","title":"PLSRegressor","text":"","category":"section"},{"location":"models/PLSRegressor_PartialLeastSquaresRegressor/","page":"PLSRegressor","title":"PLSRegressor","text":"A Partial Least Squares Regressor. Contains PLS1, PLS2 (multi target) algorithms. 
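A minimal usage sketch from MLJ follows (hyper-parameters are left at their defaults and the synthetic data is purely illustrative); the description continues below.

```julia
using MLJ
PLSRegressor = @load PLSRegressor pkg=PartialLeastSquaresRegressor
model = PLSRegressor()            # default hyper-parameters
X, y = make_regression(100, 6)    # synthetic single-target regression data
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)
```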
Can be used mainly for regression.","category":"page"},{"location":"openml_integration/#OpenML-Integration","page":"OpenML Integration","title":"OpenML Integration","text":"","category":"section"},{"location":"openml_integration/","page":"OpenML Integration","title":"OpenML Integration","text":"The OpenML platform provides an integration platform for carrying out and comparing machine learning solutions across a broad collection of public datasets and software platforms.","category":"page"},{"location":"openml_integration/","page":"OpenML Integration","title":"OpenML Integration","text":"Integration with OpenML API is presently limited to querying and downloading datasets.","category":"page"},{"location":"openml_integration/","page":"OpenML Integration","title":"OpenML Integration","text":"Documentation is here.","category":"page"},{"location":"models/ECODDetector_OutlierDetectionPython/#ECODDetector_OutlierDetectionPython","page":"ECODDetector","title":"ECODDetector","text":"","category":"section"},{"location":"models/ECODDetector_OutlierDetectionPython/","page":"ECODDetector","title":"ECODDetector","text":"ECODDetector(n_jobs = 1)","category":"page"},{"location":"models/ECODDetector_OutlierDetectionPython/","page":"ECODDetector","title":"ECODDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.ecod","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/#UnivariateBoxCoxTransformer_MLJModels","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"","category":"section"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"UnivariateBoxCoxTransformer","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"A model type for constructing a single variable Box-Cox transformer, based on MLJModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"UnivariateBoxCoxTransformer = @load UnivariateBoxCoxTransformer pkg=MLJModels","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"Do model = UnivariateBoxCoxTransformer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateBoxCoxTransformer(n=...).","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"Box-Cox transformations attempt to make data look more normally distributed. 
This can improve performance and assist in the interpretation of models which suppose that data is generated by a normal distribution.","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"A Box-Cox transformation (with shift) is of the form","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"x -> ((x + c)^λ - 1)/λ","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"for some constant c and real λ, unless λ = 0, in which case the above is replaced with","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"x -> log(x + c)","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"Given user-specified hyper-parameters n::Integer and shift::Bool, the present implementation learns the parameters c and λ from the training data as follows: If shift=true and zeros are encountered in the data, then c is set to 0.2 times the data mean. If there are no zeros, then no shift is applied. Finally, n different values of λ between -0.4 and 3 are considered, with λ fixed to the value maximizing normality of the transformed data.","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"Reference: Wikipedia entry for power transform.","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/#Training-data","page":"UnivariateBoxCoxTransformer","title":"Training data","text":"","category":"section"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"mach = machine(model, x)","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"where","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"x: any abstract vector with element scitype Continuous; check the scitype with scitype(x)","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/#Hyper-parameters","page":"UnivariateBoxCoxTransformer","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"n=171: number of values of the exponent λ to try\nshift=false: whether to include a preliminary constant translation in transformations, in the presence of 
zeros","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/#Operations","page":"UnivariateBoxCoxTransformer","title":"Operations","text":"","category":"section"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"transform(mach, xnew): apply the Box-Cox transformation learned when fitting mach\ninverse_transform(mach, z): reconstruct the vector z whose transformation learned by mach is z","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/#Fitted-parameters","page":"UnivariateBoxCoxTransformer","title":"Fitted parameters","text":"","category":"section"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"λ: the learned Box-Cox exponent\nc: the learned shift","category":"page"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/#Examples","page":"UnivariateBoxCoxTransformer","title":"Examples","text":"","category":"section"},{"location":"models/UnivariateBoxCoxTransformer_MLJModels/","page":"UnivariateBoxCoxTransformer","title":"UnivariateBoxCoxTransformer","text":"using MLJ\nusing UnicodePlots\nusing Random\nRandom.seed!(123)\n\ntransf = UnivariateBoxCoxTransformer()\n\nx = randn(1000).^2\n\nmach = machine(transf, x)\nfit!(mach)\n\nz = transform(mach, x)\n\njulia> histogram(x)\n ┌ ┐\n [ 0.0, 2.0) ┤███████████████████████████████████ 848\n [ 2.0, 4.0) ┤████▌ 109\n [ 4.0, 6.0) ┤█▍ 33\n [ 6.0, 8.0) ┤▍ 7\n [ 8.0, 10.0) ┤▏ 2\n [10.0, 12.0) ┤ 0\n [12.0, 14.0) ┤▏ 1\n └ ┘\n Frequency\n\njulia> histogram(z)\n ┌ ┐\n [-5.0, -4.0) ┤█▎ 8\n [-4.0, -3.0) ┤████████▊ 64\n [-3.0, -2.0) ┤█████████████████████▊ 159\n [-2.0, -1.0) ┤█████████████████████████████▊ 216\n [-1.0, 0.0) ┤███████████████████████████████████ 254\n [ 0.0, 1.0) ┤█████████████████████████▊ 188\n [ 1.0, 2.0) ┤████████████▍ 90\n [ 2.0, 3.0) ┤██▊ 20\n [ 3.0, 4.0) ┤▎ 1\n └ ┘\n Frequency\n","category":"page"},{"location":"performance_measures/#Performance-Measures","page":"Performance Measures","title":"Performance Measures","text":"","category":"section"},{"location":"performance_measures/#Quick-links","page":"Performance Measures","title":"Quick links","text":"","category":"section"},{"location":"performance_measures/","page":"Performance Measures","title":"Performance Measures","text":"List of aliases of all measures\nMigration guide for changes to measures in MLJBase 1.0","category":"page"},{"location":"performance_measures/#Introduction","page":"Performance Measures","title":"Introduction","text":"","category":"section"},{"location":"performance_measures/","page":"Performance Measures","title":"Performance Measures","text":"In MLJ loss functions, scoring rules, confusion matrices, sensitivities, etc, are collectively referred to as measures. These measures are provided by the package StatisticalMeasures.jl but are immediately available to the MLJ user. 
Here's a simple example of direct application of the log_loss measures to compute a training loss:","category":"page"},{"location":"performance_measures/","page":"Performance Measures","title":"Performance Measures","text":"using MLJ\nX, y = @load_iris\nDecisionTreeClassifier = @load DecisionTreeClassifier pkg=DecisionTree\ntree = DecisionTreeClassifier(max_depth=2)\nmach = machine(tree, X, y) |> fit!\nyhat = predict(mach, X)\nlog_loss(yhat, y)","category":"page"},{"location":"performance_measures/","page":"Performance Measures","title":"Performance Measures","text":"For more examples of direct measure usage, see the StatisticalMeasures.jl tutorial.","category":"page"},{"location":"performance_measures/","page":"Performance Measures","title":"Performance Measures","text":"A list of all measures, ready to use after running using MLJ or using StatisticalMeasures, is here. Alternatively, call measures() (experimental) to generate a dictionary keyed on available measure constructors, with measure metadata as values.","category":"page"},{"location":"performance_measures/#Custom-measures","page":"Performance Measures","title":"Custom measures","text":"","category":"section"},{"location":"performance_measures/","page":"Performance Measures","title":"Performance Measures","text":"Any measure-like object with appropriate calling behavior can be used with MLJ. To quickly build custom measures, we recommend using the package StatisticalMeasuresBase.jl, which provides this tutorial. Note, in particular, that an \"atomic\" measure can be transformed into a multi-target measure using this package.","category":"page"},{"location":"performance_measures/#Uses-of-measures","page":"Performance Measures","title":"Uses of measures","text":"","category":"section"},{"location":"performance_measures/","page":"Performance Measures","title":"Performance Measures","text":"In MLJ, measures are specified:","category":"page"},{"location":"performance_measures/","page":"Performance Measures","title":"Performance Measures","text":"when evaluating model performance using evaluate!/evaluate; see Evaluating Model Performance\nwhen wrapping models using TunedModel - see Tuning Models\nwhen wrapping iterative models using IteratedModel - see Controlling Iterative Models\nwhen generating learning curves using learning_curve - see Learning Curves","category":"page"},{"location":"performance_measures/","page":"Performance Measures","title":"Performance Measures","text":"and elsewhere.","category":"page"},{"location":"performance_measures/#Using-LossFunctions.jl","page":"Performance Measures","title":"Using LossFunctions.jl","text":"","category":"section"},{"location":"performance_measures/","page":"Performance Measures","title":"Performance Measures","text":"In previous versions of MLJ, measures from LossFunctions.jl were also available. 
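As a concrete instance of the first use listed above, measures can be passed to evaluate, as in the following sketch (the model choice and resampling settings are illustrative):

```julia
using MLJ
X, y = @load_iris
DecisionTreeClassifier = @load DecisionTreeClassifier pkg=DecisionTree
tree = DecisionTreeClassifier()
evaluate(tree, X, y,
         resampling=CV(nfolds=6, shuffle=true),
         measures=[log_loss, accuracy])   # one evaluation, several measures
```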
Now measures from that package must be explicitly imported and wrapped, as described here.","category":"page"},{"location":"performance_measures/#Receiver-operator-characteristics","page":"Performance Measures","title":"Receiver operator characteristics","text":"","category":"section"},{"location":"performance_measures/","page":"Performance Measures","title":"Performance Measures","text":"A related performance evaluation tool provided by StatisticalMeasures.jl, and hence by MLJ, is the roc_curve method:","category":"page"},{"location":"performance_measures/","page":"Performance Measures","title":"Performance Measures","text":"StatisticalMeasures.roc_curve","category":"page"},{"location":"performance_measures/#StatisticalMeasures.roc_curve","page":"Performance Measures","title":"StatisticalMeasures.roc_curve","text":"roc_curve(ŷ, y) -> false_positive_rates, true_positive_rates, thresholds\n\nReturn data for plotting the receiver operator characteristic (ROC curve) for a binary classification problem.\n\nHere ŷ is a vector of UnivariateFinite distributions (from CategoricalDistributions.jl) over the two values taken by the ground truth observations y, a CategoricalVector. \n\nIf there are k unique probabilities, then there are correspondingly k thresholds and k+1 \"bins\" over which the false positive and true positive rates are constant.:\n\n[0.0 - thresholds[1]]\n[thresholds[1] - thresholds[2]]\n...\n[thresholds[k] - 1]\n\nConsequently, true_positive_rates and false_positive_rates have length k+1 if thresholds has length k.\n\nTo plot the curve using your favorite plotting backend, do something like plot(false_positive_rates, true_positive_rates).\n\nCore algorithm: Functions.roc_curve\n\nSee also AreaUnderCurve. \n\n\n\n\n\n","category":"function"},{"location":"performance_measures/#Migration-guide-for-changes-to-measures-in-MLJBase-1.0","page":"Performance Measures","title":"Migration guide for changes to measures in MLJBase 1.0","text":"","category":"section"},{"location":"performance_measures/","page":"Performance Measures","title":"Performance Measures","text":"Prior to MLJBase.jl 1.0 (respectivey, MLJ.jl version 0.19.6) measures were defined in MLJBase.jl (a dependency of MLJ.jl) but now they are provided by MLJ.jl dependency StatisticalMeasures. Effects on users are detailed below:","category":"page"},{"location":"performance_measures/#Breaking-behavior-likely-relevant-to-many-users","page":"Performance Measures","title":"Breaking behavior likely relevant to many users","text":"","category":"section"},{"location":"performance_measures/","page":"Performance Measures","title":"Performance Measures","text":"If using MLJBase without MLJ, then, in Julia 1.9 or higher, StatisticalMeasures must be explicitly imported to use measures that were previously part of MLJBase. If using MLJ, then all previous measures are still available, with the exception of those corresponding to LossFunctions.jl (see below).\nAll measures return a single aggregated measurement. In other words, measures previously reporting a measurement per-observation (previously subtyping Unaggregated) no longer do so. To get per-observation measurements, use the new method StatisticalMeasures.measurements(measure, ŷ, y[, weights, class_weights]).\nThe default measure for regression models (used in evaluate/evaluate! 
when measures is unspecified) is changed from rms to l2=LPLoss(2) (mean sum of squares).\nMeanAbsoluteError has been removed and instead mae is an alias for LPLoss(p=1).\nMeasures that previously skipped NaN values will now (at least by default) propagate those values. Missing value behavior is unchanged, except some measures that previously did not support missing now do.\nAliases for measure types have been removed. For example RMSE (alias for RootMeanSquaredError) is gone. Aliases for instances, such as rms and cross_entropy, persist. The exception is precision, for which ppv can be used in its place. (This is to avoid conflict with Base.precision, which was previously pirated.)\ninfo(measure) has been decommissioned; query docstrings or access the new measure traits individually instead. These traits are now provided by StatisticalMeasures.jl and are not exported. For example, to access the orientation of the measure rms, do import StatisticalMeasures as SM; SM.orientation(rms).\nBehavior of the measures() method, to list all measures and associated traits, has changed. It now returns a dictionary instead of a vector of named tuples; measures(predicate) is decommissioned, but measures(needle) is preserved. (This method, owned by StatisticalMeasures.jl, has some other search options, but is experimental.)\nMeasures that were wrappers of losses from LossFunctions.jl are no longer exposed by MLJBase or MLJ. To use such a loss, you must explicitly import LossFunctions and wrap the loss appropriately. See Using losses from LossFunctions.jl for examples.\nSome user-defined measures working in previous versions of MLJBase.jl may not work without modification, as they must conform to the new StatisticalMeasuresBase.jl API. See this tutorial on how to define new measures.\nMeasures with a \"feature argument\" X, as in some_measure(ŷ, y, X), are no longer supported. See What is a measure? for allowed signatures in measures.","category":"page"},{"location":"performance_measures/#Packages-implementing-the-MLJ-model-interface","page":"Performance Measures","title":"Packages implementing the MLJ model interface","text":"","category":"section"},{"location":"performance_measures/","page":"Performance Measures","title":"Performance Measures","text":"The migration of measures is not expected to require any changes to the source code in packages providing implementations of the MLJ model interface (MLJModelInterface.jl) such as MLJDecisionTreeInterface.jl and MLJFlux.jl, and this is confirmed by extensive integration tests. However, some current tests will fail if they use MLJBase measures. The following should generally suffice to adapt such tests:","category":"page"},{"location":"performance_measures/","page":"Performance Measures","title":"Performance Measures","text":"Add StatisticalMeasures as a test dependency, and add using StatisticalMeasures to your runtests.jl (and/or included submodules).\nIf measures are qualified, as in MLJBase.rms, then the qualification must be removed or changed to StatisticalMeasures.rms, etc.\nBe aware that the default measure used in methods such as evaluate!, when measure is not specified, is changed from rms to l2 for regression models.\nBe aware that all measures now report a single aggregated measurement, and never a measurement for every observation. 
See second point above.","category":"page"},{"location":"performance_measures/#Breaking-behavior-possibly-relevant-to-some-developers","page":"Performance Measures","title":"Breaking behavior possibly relevant to some developers","text":"","category":"section"},{"location":"performance_measures/","page":"Performance Measures","title":"Performance Measures","text":"The abstract measure types Aggregated, Unaggregated, Measure have been decommissioned. (A measure is now defined purely by its calling behavior.)\nWhat were previously exported as measure types are now only constructors.\ntarget_scitype(measure) is decommissioned. Related is StatisticalMeasures.observation_scitype(measure) which declares an upper bound on the allowed scitype of a single observation.\nprediction_type(measure) is decommissioned. Instead use StatisticalMeasures.kind_of_proxy(measure).\nThe trait reports_each_observation is decommissioned. Related is StatisticalMeasures.can_report_unaggregated; if false the new measurements method simply returns n copies of the aggregated measurement, where n is the number of observations provided, instead of individual observation-dependent measurements.\naggregation(measure) has been decommissioned. Instead use StatisticalMeasures.external_mode_of_aggregation(measure).\ninstances(measure) has been decommissioned; query docstrings for measure aliases, or follow this example: aliases = measures()[RootMeanSquaredError].aliases.\nis_feature_dependent(measure) has been decommissioned. Measures consuming feature data are not longer supported; see above.\ndistribution_type(measure) has been decommissioned.\ndocstring(measure) has been decommissioned.\nBehavior of aggregate has changed.\nThe following traits, previously exported by MLJBase and MLJ, cannot be applied to measures: supports_weights, supports_class_weights, orientation, human_name. 
Instead use the traits with these names provided by StatisticalMeausures.jl (they will need to be qualified, as in import StatisticalMeasures; StatisticalMeasures.orientation(measure)).","category":"page"},{"location":"models/GMMDetector_OutlierDetectionPython/#GMMDetector_OutlierDetectionPython","page":"GMMDetector","title":"GMMDetector","text":"","category":"section"},{"location":"models/GMMDetector_OutlierDetectionPython/","page":"GMMDetector","title":"GMMDetector","text":"GMMDetector(n_components=1,\n covariance_type=\"full\",\n tol=0.001,\n reg_covar=1e-06,\n max_iter=100,\n n_init=1,\n init_params=\"kmeans\",\n weights_init=None,\n means_init=None,\n precisions_init=None,\n random_state=None,\n warm_start=False)","category":"page"},{"location":"models/GMMDetector_OutlierDetectionPython/","page":"GMMDetector","title":"GMMDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.gmm","category":"page"},{"location":"models/LGBMRegressor_LightGBM/#LGBMRegressor_LightGBM","page":"LGBMRegressor","title":"LGBMRegressor","text":"","category":"section"},{"location":"models/LGBMRegressor_LightGBM/","page":"LGBMRegressor","title":"LGBMRegressor","text":"Microsoft LightGBM FFI wrapper: Regressor","category":"page"},{"location":"models/LMDDDetector_OutlierDetectionPython/#LMDDDetector_OutlierDetectionPython","page":"LMDDDetector","title":"LMDDDetector","text":"","category":"section"},{"location":"models/LMDDDetector_OutlierDetectionPython/","page":"LMDDDetector","title":"LMDDDetector","text":"LMDDDetector(n_iter = 50,\n dis_measure = \"aad\",\n random_state = nothing)","category":"page"},{"location":"models/LMDDDetector_OutlierDetectionPython/","page":"LMDDDetector","title":"LMDDDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.lmdd","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/#EvoTreeClassifier_EvoTrees","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"","category":"section"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"EvoTreeClassifier(;kwargs...)","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"A model type for constructing a EvoTreeClassifier, based on EvoTrees.jl, and implementing both an internal API and the MLJ model interface. EvoTreeClassifier is used to perform multi-class classification, using cross-entropy loss.","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/#Hyper-parameters","page":"EvoTreeClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"nrounds=100: Number of rounds. It corresponds to the number of trees that will be sequentially stacked. Must be >= 1.\neta=0.1: Learning rate. Each tree raw predictions are scaled by eta prior to be added to the stack of predictions. Must be > 0. A lower eta results in slower learning, requiring a higher nrounds but typically improves model performance.\nL2::T=0.0: L2 regularization factor on aggregate gain. Must be >= 0. Higher L2 can result in a more robust model.\nlambda::T=0.0: L2 regularization factor on individual gain. Must be >= 0. Higher lambda can result in a more robust model.\ngamma::T=0.0: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model. 
Must be >= 0.\nmax_depth=6: Maximum depth of a tree. Must be >= 1. A tree of depth 1 is made of a single prediction leaf. A complete tree of depth N contains 2^(N - 1) terminal leaves and 2^(N - 1) - 1 split nodes. Compute cost is proportional to 2^max_depth. Typical optimal values are in the 3 to 9 range.\nmin_weight=1.0: Minimum weight needed in a node to perform a split. Matches the number of observations by default or the sum of weights as provided by the weights vector. Must be > 0.\nrowsample=1.0: Proportion of rows that are sampled at each iteration to build the tree. Should be in ]0, 1].\ncolsample=1.0: Proportion of columns / features that are sampled at each iteration to build the tree. Should be in ]0, 1].\nnbins=64: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins. Should be between 2 and 255.\ntree_type=\"binary\": Tree structure to be used. One of:\nbinary: Each node of a tree is grown independently. Trees are built depthwise until max depth is reached, or until min weight or gain (see gamma) stops further node splits.\noblivious: A common splitting condition is imposed to all nodes of a given depth.\nrng=123: Either an integer used as a seed to the random number generator or an actual random number generator (::Random.AbstractRNG).","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/#Internal-API","page":"EvoTreeClassifier","title":"Internal API","text":"","category":"section"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"Do config = EvoTreeClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeClassifier(max_depth=...).","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/#Training-model","page":"EvoTreeClassifier","title":"Training model","text":"","category":"section"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"A model is built using fit_evotree:","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"model = fit_evotree(config; x_train, y_train, kwargs...)","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/#Inference","page":"EvoTreeClassifier","title":"Inference","text":"","category":"section"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"Predictions are obtained using predict which returns a Matrix of size [nobs, K] where K is the number of classes:","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"EvoTrees.predict(model, X)","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"Alternatively, models act as a functor, returning predictions when called as a function with features as argument:","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"model(X)","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/#MLJ","page":"EvoTreeClassifier","title":"MLJ","text":"","category":"section"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"From MLJ, the type can be imported 
using:","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"EvoTreeClassifier = @load EvoTreeClassifier pkg=EvoTrees","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"Do model = EvoTreeClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeClassifier(loss=...).","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/#Training-data","page":"EvoTreeClassifier","title":"Training data","text":"","category":"section"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"where","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)\ny: is the target, which can be any AbstractVector whose element scitype is <:Multiclas or <:OrderedFactor; check the scitype with scitype(y)","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/#Operations","page":"EvoTreeClassifier","title":"Operations","text":"","category":"section"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above. 
Predictions are probabilistic.\npredict_mode(mach, Xnew): returns the mode of each of the prediction above.","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/#Fitted-parameters","page":"EvoTreeClassifier","title":"Fitted parameters","text":"","category":"section"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":":fitresult: The GBTree object returned by EvoTrees.jl fitting algorithm.","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/#Report","page":"EvoTreeClassifier","title":"Report","text":"","category":"section"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"The fields of report(mach) are:","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":":features: The names of the features encountered in training.","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/#Examples","page":"EvoTreeClassifier","title":"Examples","text":"","category":"section"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"## Internal API\nusing EvoTrees\nconfig = EvoTreeClassifier(max_depth=5, nbins=32, nrounds=100)\nnobs, nfeats = 1_000, 5\nx_train, y_train = randn(nobs, nfeats), rand(1:3, nobs)\nmodel = fit_evotree(config; x_train, y_train)\npreds = EvoTrees.predict(model, x_train)","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"## MLJ Interface\nusing MLJ\nEvoTreeClassifier = @load EvoTreeClassifier pkg=EvoTrees\nmodel = EvoTreeClassifier(max_depth=5, nbins=32, nrounds=100)\nX, y = @load_iris\nmach = machine(model, X, y) |> fit!\npreds = predict(mach, X)\npreds = predict_mode(mach, X)","category":"page"},{"location":"models/EvoTreeClassifier_EvoTrees/","page":"EvoTreeClassifier","title":"EvoTreeClassifier","text":"See also EvoTrees.jl.","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/#FactorAnalysis_MultivariateStats","page":"FactorAnalysis","title":"FactorAnalysis","text":"","category":"section"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"FactorAnalysis","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"A model type for constructing a factor analysis model, based on MultivariateStats.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"FactorAnalysis = @load FactorAnalysis pkg=MultivariateStats","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"Do model = FactorAnalysis() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in FactorAnalysis(method=...).","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"Factor analysis is a linear-Gaussian latent variable model that is closely related to probabilistic PCA. In contrast to the probabilistic PCA model, the covariance of conditional distribution of the observed variable given the latent variable is diagonal rather than isotropic.","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/#Training-data","page":"FactorAnalysis","title":"Training data","text":"","category":"section"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"mach = machine(model, X)","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"Here:","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/#Hyper-parameters","page":"FactorAnalysis","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"method::Symbol=:cm: Method to use to solve the problem, one of :ml, :em, :bayes.\nmaxoutdim=0: Controls the the dimension (number of columns) of the output, outdim. Specifically, outdim = min(n, indim, maxoutdim), where n is the number of observations and indim the input dimension.\nmaxiter::Int=1000: Maximum number of iterations.\ntol::Real=1e-6: Convergence tolerance.\neta::Real=tol: Variance lower bound.\nmean::Union{Nothing, Real, Vector{Float64}}=nothing: If nothing, centering will be computed and applied; if set to 0 no centering is applied (data is assumed pre-centered); if a vector, the centering is done with that vector.","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/#Operations","page":"FactorAnalysis","title":"Operations","text":"","category":"section"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.\ninverse_transform(mach, Xsmall): For a dimension-reduced table Xsmall, such as returned by transform, reconstruct a table, having same the number of columns as the original training data X, that transforms to Xsmall. Mathematically, inverse_transform is a right-inverse for the PCA projection map, whose image is orthogonal to the kernel of that map. 
In particular, if Xsmall = transform(mach, Xnew), then inverse_transform(Xsmall) is only an approximation to Xnew.","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/#Fitted-parameters","page":"FactorAnalysis","title":"Fitted parameters","text":"","category":"section"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"projection: Returns the projection matrix, which has size (indim, outdim), where indim and outdim are the number of features of the input and ouput respectively. Each column of the projection matrix corresponds to a factor.","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/#Report","page":"FactorAnalysis","title":"Report","text":"","category":"section"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"The fields of report(mach) are:","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"indim: Dimension (number of columns) of the training data and new data to be transformed.\noutdim: Dimension of transformed data (number of factors).\nvariance: The variance of the factors.\ncovariance_matrix: The estimated covariance matrix.\nmean: The mean of the untransformed training data, of length indim.\nloadings: The factor loadings. A matrix of size (indim, outdim) where indim and outdim are as defined above.","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/#Examples","page":"FactorAnalysis","title":"Examples","text":"","category":"section"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"using MLJ\n\nFactorAnalysis = @load FactorAnalysis pkg=MultivariateStats\n\nX, y = @load_iris ## a table and a vector\n\nmodel = FactorAnalysis(maxoutdim=2)\nmach = machine(model, X) |> fit!\n\nXproj = transform(mach, X)","category":"page"},{"location":"models/FactorAnalysis_MultivariateStats/","page":"FactorAnalysis","title":"FactorAnalysis","text":"See also KernelPCA, ICA, PPCA, PCA","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/#SRRegressor_SymbolicRegression","page":"SRRegressor","title":"SRRegressor","text":"","category":"section"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"SRRegressor","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"A model type for constructing a Symbolic Regression via Evolutionary Search, based on SymbolicRegression.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"SRRegressor = @load SRRegressor pkg=SymbolicRegression","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"Do model = SRRegressor() to construct an instance with default hyper-parameters. 
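For orientation, here is a minimal end-to-end sketch (the operators, the niterations value and the synthetic data are illustrative):

```julia
using MLJ
SRRegressor = @load SRRegressor pkg=SymbolicRegression
model = SRRegressor(binary_operators=[+, -, *],
                    unary_operators=[cos],
                    niterations=30)              # illustrative settings
X = (x1 = rand(200), x2 = rand(200))
y = @. 2 * cos(3 * X.x2) + X.x1 * X.x1 - 1       # target expressible with the chosen operators
mach = machine(model, X, y) |> fit!
report(mach)              # ranked list of discovered expressions
yhat = predict(mach, X)   # predictions from the automatically selected expression
```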
Provide keyword arguments to override hyper-parameter defaults, as in SRRegressor(binary_operators=...).","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"Single-target Symbolic Regression regressor (SRRegressor) searches for symbolic expressions that predict a single target variable from a set of input variables. All data is assumed to be Continuous. The search is performed using an evolutionary algorithm. This algorithm is described in the paper https://arxiv.org/abs/2305.01582.","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/#Training-data","page":"SRRegressor","title":"Training data","text":"","category":"section"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"OR","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"mach = machine(model, X, y, w)","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"Here:","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). Variable names in discovered expressions will be taken from the column names of X, if available. Units in columns of X (use DynamicQuantities for units) will trigger dimensional analysis to be used.\ny is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y). Units in y (use DynamicQuantities for units) will trigger dimensional analysis to be used.\nw is the observation weights, which can either be nothing (default) or an AbstractVector whose element scitype is Count or Continuous.","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"Train the machine using fit!(mach), inspect the discovered expressions with report(mach), and predict on new data with predict(mach, Xnew). Note that unlike other regressors, symbolic regression stores a list of trained models. The model chosen from this list is defined by the selection_method keyword argument (a function), which by default balances accuracy and complexity. You can override this at prediction time by passing a named tuple with keys data and idx.","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/#Hyper-parameters","page":"SRRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"binary_operators: Vector of binary operators (functions) to use. Each operator should be defined for two input scalars, and one output scalar. All operators need to be defined over the entire real line (excluding infinity - these are stopped before they are input), or return NaN where not defined. For speed, define it so it takes two reals of the same type as input, and outputs the same type. 
For the SymbolicUtils simplification backend, you will need to define a generic method of the operator so it takes arbitrary types.\nunary_operators: Same, but for unary operators (one input scalar, gives an output scalar).\nconstraints: Array of pairs specifying size constraints for each operator. The constraints for a binary operator should be a 2-tuple (e.g., (-1, -1)) and the constraints for a unary operator should be an Int. A size constraint is a limit to the size of the subtree in each argument of an operator. e.g., [(^)=>(-1, 3)] means that the ^ operator can have arbitrary size (-1) in its left argument, but a maximum size of 3 in its right argument. Default is no constraints.\nbatching: Whether to evolve based on small mini-batches of data, rather than the entire dataset.\nbatch_size: What batch size to use if using batching.\nelementwise_loss: What elementwise loss function to use. Can be one of the following losses, or any other loss of type SupervisedLoss. You can also pass a function that takes a scalar target (left argument), and scalar predicted (right argument), and returns a scalar. This will be averaged over the predicted data. If weights are supplied, your function should take a third argument for the weight scalar. Included losses: Regression: - LPDistLoss{P}(), - L1DistLoss(), - L2DistLoss() (mean square), - LogitDistLoss(), - HuberLoss(d), - L1EpsilonInsLoss(ϵ), - L2EpsilonInsLoss(ϵ), - PeriodicLoss(c), - QuantileLoss(τ), Classification: - ZeroOneLoss(), - PerceptronLoss(), - L1HingeLoss(), - SmoothedL1HingeLoss(γ), - ModifiedHuberLoss(), - L2MarginLoss(), - ExpLoss(), - SigmoidLoss(), - DWDMarginLoss(q).\nloss_function: Alternatively, you may redefine the loss used as any function of tree::Node{T}, dataset::Dataset{T}, and options::Options, so long as you output a non-negative scalar of type T. This is useful if you want to use a loss that takes into account derivatives, or correlations across the dataset. This also means you could use a custom evaluation for a particular expression. If you are using batching=true, then your function should accept a fourth argument idx, which is either nothing (indicating that the full dataset should be used), or a vector of indices to use for the batch. For example,\n function my_loss(tree, dataset::Dataset{T,L}, options)::L where {T,L}\n prediction, flag = eval_tree_array(tree, dataset.X, options)\n if !flag\n return L(Inf)\n end\n return sum((prediction .- dataset.y) .^ 2) / dataset.n\n end\npopulations: How many populations of equations to use.\npopulation_size: How many equations in each population.\nncycles_per_iteration: How many generations to consider per iteration.\ntournament_selection_n: Number of expressions considered in each tournament.\ntournament_selection_p: The fittest expression in a tournament is to be selected with probability p, the next fittest with probability p*(1-p), and so forth.\ntopn: Number of equations to return to the host process, and to consider for the hall of fame.\ncomplexity_of_operators: What complexity should be assigned to each operator, and the occurrence of a constant or variable. By default, this is 1 for all operators. Can be a real number as well, in which case the complexity of an expression will be rounded to the nearest integer. Input this in the form of, e.g., [(^) => 3, sin => 2].\ncomplexity_of_constants: What complexity should be assigned to use of a constant. By default, this is 1.\ncomplexity_of_variables: What complexity should be assigned to each variable. 
By default, this is 1.\nalpha: The probability of accepting an equation mutation during regularized evolution is given by exp(-delta_loss/(alpha * T)), where T goes from 1 to 0. Thus, alpha=infinite is the same as no annealing.\nmaxsize: Maximum size of equations during the search.\nmaxdepth: Maximum depth of equations during the search, by default this is set equal to the maxsize.\nparsimony: A multiplicative factor for how much complexity is punished.\ndimensional_constraint_penalty: An additive factor if the dimensional constraint is violated.\nuse_frequency: Whether to use a parsimony that adapts to the relative proportion of equations at each complexity; this will ensure that there are a balanced number of equations considered for every complexity.\nuse_frequency_in_tournament: Whether to use the adaptive parsimony described above inside the score, rather than just at the mutation accept/reject stage.\nadaptive_parsimony_scaling: How much to scale the adaptive parsimony term in the loss. Increase this if the search is spending too much time optimizing the most complex equations.\nturbo: Whether to use LoopVectorization.@turbo to evaluate expressions. This can be significantly faster, but is only compatible with certain operators. Experimental!\nmigration: Whether to migrate equations between processes.\nhof_migration: Whether to migrate equations from the hall of fame to processes.\nfraction_replaced: What fraction of each population to replace with migrated equations at the end of each cycle.\nfraction_replaced_hof: What fraction to replace with hall of fame equations at the end of each cycle.\nshould_simplify: Whether to simplify equations. If you pass a custom objective, this will be set to false.\nshould_optimize_constants: Whether to use an optimization algorithm to periodically optimize constants in equations.\noptimizer_nrestarts: How many different random starting positions to consider for optimization of constants.\noptimizer_algorithm: Select algorithm to use for optimizing constants. Default is \"BFGS\", but \"NelderMead\" is also supported.\noptimizer_options: General options for the constant optimization. For details we refer to the documentation on Optim.Options from the Optim.jl package. Options can be provided here as NamedTuple, e.g. (iterations=16,), as a Dict, e.g. Dict(:x_tol => 1.0e-32,), or as an Optim.Options instance.\noutput_file: What file to store equations to, as a backup.\nperturbation_factor: When mutating a constant, either multiply or divide by (1+perturbation_factor)^(rand()+1).\nprobability_negate_constant: Probability of negating a constant in the equation when mutating it.\nmutation_weights: Relative probabilities of the mutations. The struct MutationWeights should be passed to these options. See its documentation on MutationWeights for the different weights.\ncrossover_probability: Probability of performing crossover.\nannealing: Whether to use simulated annealing.\nwarmup_maxsize_by: Whether to slowly increase the max size from 5 up to maxsize. If nonzero, specifies the fraction through the search at which the maxsize should be reached.\nverbosity: Whether to print debugging statements or not.\nprint_precision: How many digits to print when printing equations. By default, this is 5.\nsave_to_file: Whether to save equations to a file during the search.\nbin_constraints: See constraints. 
This is the same, but specified for binary operators only (for example, if you have an operator that is both a binary and unary operator).\nuna_constraints: Likewise, for unary operators.\nseed: What random seed to use. nothing uses no seed.\nprogress: Whether to use a progress bar output (verbosity will have no effect).\nearly_stop_condition: Float - whether to stop early if the mean loss gets below this value. Function - a function taking (loss, complexity) as arguments and returning true or false.\ntimeout_in_seconds: Float64 - the time in seconds after which to exit (as an alternative to the number of iterations).\nmax_evals: Int (or Nothing) - the maximum number of evaluations of expressions to perform.\nskip_mutation_failures: Whether to simply skip over mutations that fail or are rejected, rather than to replace the mutated expression with the original expression and proceed normally.\nenable_autodiff: Whether to enable automatic differentiation functionality. This is turned off by default. If turned on, this will be turned off if one of the operators does not have well-defined gradients.\nnested_constraints: Specifies how many times a combination of operators can be nested. For example, [sin => [cos => 0], cos => [cos => 2]] specifies that cos may never appear within a sin, but sin can be nested with itself an unlimited number of times. The second term specifies that cos can be nested up to 2 times within a cos, so that cos(cos(cos(x))) is allowed (as well as any combination of + or - within it), but cos(cos(cos(cos(x)))) is not allowed. When an operator is not specified, it is assumed that it can be nested an unlimited number of times. This requires that there is no operator which is used both in the unary operators and the binary operators (e.g., - could be both subtract, and negation). For binary operators, both arguments are treated the same way, and the max of each argument is constrained.\ndeterministic: Use a global counter for the birth time, rather than calls to time(). This gives perfect resolution, and is therefore deterministic. However, it is not thread safe, and must be used in serial mode.\ndefine_helper_functions: Whether to define helper functions for constructing and evaluating trees.\nniterations::Int=10: The number of iterations to perform the search. More iterations will improve the results.\nparallelism=:multithreading: What parallelism mode to use. The options are :multithreading, :multiprocessing, and :serial. By default, multithreading will be used. Multithreading uses less memory, but multiprocessing can handle multi-node compute. If using :multithreading mode, the number of threads available to julia are used. If using :multiprocessing, numprocs processes will be created dynamically if procs is unset. If you have already allocated processes, pass them to the procs argument and they will be used. You may also pass a string instead of a symbol, like \"multithreading\".\nnumprocs::Union{Int, Nothing}=nothing: The number of processes to use, if you want equation_search to set this up automatically. 
By default this will be 4, but can be any number (you should pick a number <= the number of cores available).\nprocs::Union{Vector{Int}, Nothing}=nothing: If you have set up a distributed run manually with procs = addprocs() and @everywhere, pass the procs to this keyword argument.\naddprocs_function::Union{Function, Nothing}=nothing: If using multiprocessing (parallelism=:multiprocessing), and you are not passing procs manually, then they will be allocated dynamically using addprocs. However, you may also pass a custom function to use instead of addprocs. This function should take a single positional argument, which is the number of processes to use, as well as the lazy keyword argument. For example, if set up on a slurm cluster, you could pass addprocs_function = addprocs_slurm, which will set up slurm processes.\nheap_size_hint_in_bytes::Union{Int,Nothing}=nothing: On Julia 1.9+, you may set the --heap-size-hint flag on Julia processes, recommending garbage collection once a process is close to the recommended size. This is important for long-running distributed jobs where each process has an independent memory, and can help avoid out-of-memory errors. By default, this is set to Sys.free_memory() / numprocs.\nruntests::Bool=true: Whether to run (quick) tests before starting the search, to see if there will be any problems during the equation search related to the host environment.\nloss_type::Type=Nothing: If you would like to use a different type for the loss than for the data you passed, specify the type here. Note that if you pass complex data ::Complex{L}, then the loss type will automatically be set to L.\nselection_method::Function: Function to select an expression from the Pareto frontier for use in predict. See SymbolicRegression.MLJInterfaceModule.choose_best for an example. This function should return a single integer specifying the index of the expression to use. By default, this maximizes the score (a pound-for-pound rating) of expressions reaching the threshold of 1.5x the minimum loss. To override this at prediction time, you can pass a named tuple with keys data and idx to predict. See the Operations section for details.\ndimensions_type::AbstractDimensions: The type of dimensions to use when storing the units of the data. By default this is DynamicQuantities.SymbolicDimensions.","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/#Operations","page":"SRRegressor","title":"Operations","text":"","category":"section"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"predict(mach, Xnew): Return predictions of the target given features Xnew, which should have the same scitype as X above. The expression used for prediction is defined by the selection_method function, which can be seen by viewing report(mach).best_idx.\npredict(mach, (data=Xnew, idx=i)): Return predictions of the target given features Xnew, which should have the same scitype as X above. 
By passing a named tuple with keys data and idx, you are able to specify the equation you wish to evaluate in idx.","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/#Fitted-parameters","page":"SRRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"best_idx::Int: The index of the best expression in the Pareto frontier, as determined by the selection_method function. Override in predict by passing a named tuple with keys data and idx.\nequations::Vector{Node{T}}: The expressions discovered by the search, represented in a dominating Pareto frontier (i.e., the best expressions found for each complexity). T is equal to the element type of the passed data.\nequation_strings::Vector{String}: The expressions discovered by the search, represented as strings for easy inspection.","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/#Report","page":"SRRegressor","title":"Report","text":"","category":"section"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"The fields of report(mach) are:","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"best_idx::Int: The index of the best expression in the Pareto frontier, as determined by the selection_method function. Override in predict by passing a named tuple with keys data and idx.\nequations::Vector{Node{T}}: The expressions discovered by the search, represented in a dominating Pareto frontier (i.e., the best expressions found for each complexity).\nequation_strings::Vector{String}: The expressions discovered by the search, represented as strings for easy inspection.\ncomplexities::Vector{Int}: The complexity of each expression in the Pareto frontier.\nlosses::Vector{L}: The loss of each expression in the Pareto frontier, according to the loss function specified in the model. The type L is the loss type, which is usually the same as the element type of data passed (i.e., T), but can differ if complex data types are passed.\nscores::Vector{L}: A metric which considers both the complexity and loss of an expression, equal to the change in the log-loss divided by the change in complexity, relative to the previous expression along the Pareto frontier. 
A larger score aims to indicate an expression is more likely to be the true expression generating the data, but this is very problem-dependent and generally several other factors should be considered.","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/#Examples","page":"SRRegressor","title":"Examples","text":"","category":"section"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"using MLJ\nSRRegressor = @load SRRegressor pkg=SymbolicRegression\nX, y = @load_boston\nmodel = SRRegressor(binary_operators=[+, -, *], unary_operators=[exp], niterations=100)\nmach = machine(model, X, y)\nfit!(mach)\ny_hat = predict(mach, X)\n## View the equation used:\nr = report(mach)\nprintln(\"Equation used:\", r.equation_strings[r.best_idx])","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"With units and variable names:","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"using MLJ\nusing DynamicQuantities\nSRRegressor = @load SRRegressor pkg=SymbolicRegression\n\nX = (; x1=rand(32) .* us\"km/h\", x2=rand(32) .* us\"km\")\ny = @. X.x2 / X.x1 + 0.5us\"h\"\nmodel = SRRegressor(binary_operators=[+, -, *, /])\nmach = machine(model, X, y)\nfit!(mach)\ny_hat = predict(mach, X)\n## View the equation used:\nr = report(mach)\nprintln(\"Equation used:\", r.equation_strings[r.best_idx])","category":"page"},{"location":"models/SRRegressor_SymbolicRegression/","page":"SRRegressor","title":"SRRegressor","text":"See also MultitargetSRRegressor.","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/#EvoTreeGaussian_EvoTrees","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"","category":"section"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"EvoTreeGaussian(;kwargs...)","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"A model type for constructing an EvoTreeGaussian, based on EvoTrees.jl, and implementing both an internal API and the MLJ model interface. EvoTreeGaussian is used to perform Gaussian probabilistic regression, fitting μ and σ parameters to maximize likelihood.","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/#Hyper-parameters","page":"EvoTreeGaussian","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"nrounds=100: Number of rounds. It corresponds to the number of trees that will be sequentially stacked. Must be >= 1.\neta=0.1: Learning rate. Each tree's raw predictions are scaled by eta prior to being added to the stack of predictions. Must be > 0. A lower eta results in slower learning, requiring a higher nrounds but typically improves model performance.\nL2::T=0.0: L2 regularization factor on aggregate gain. Must be >= 0. Higher L2 can result in a more robust model.\nlambda::T=0.0: L2 regularization factor on individual gain. Must be >= 0. Higher lambda can result in a more robust model.\ngamma::T=0.0: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model. Must be >= 0.\nmax_depth=6: Maximum depth of a tree. Must be >= 1. A tree of depth 1 is made of a single prediction leaf. A complete tree of depth N contains 2^(N - 1) terminal leaves and 2^(N - 1) - 1 split nodes. 
Compute cost is proportional to 2^max_depth. Typical optimal values are in the 3 to 9 range.\nmin_weight=8.0: Minimum weight needed in a node to perform a split. Matches the number of observations by default or the sum of weights as provided by the weights vector. Must be > 0.\nrowsample=1.0: Proportion of rows that are sampled at each iteration to build the tree. Should be in ]0, 1].\ncolsample=1.0: Proportion of columns / features that are sampled at each iteration to build the tree. Should be in ]0, 1].\nnbins=64: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins. Should be between 2 and 255.\nmonotone_constraints=Dict{Int, Int}(): Specify monotonic constraints using a dict where the key is the feature index and the value the applicable constraint (-1=decreasing, 0=none, 1=increasing). !Experimental feature: note that for Gaussian regression, constraints may not be enforced systematically.\ntree_type=\"binary\": Tree structure to be used. One of:\nbinary: Each node of a tree is grown independently. Trees are built depthwise until the maximum depth is reached, or until the minimum weight or gain (see gamma) stops further node splits.\noblivious: A common splitting condition is imposed on all nodes of a given depth.\nrng=123: Either an integer used as a seed to the random number generator or an actual random number generator (::Random.AbstractRNG).","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/#Internal-API","page":"EvoTreeGaussian","title":"Internal API","text":"","category":"section"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"Do config = EvoTreeGaussian() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeGaussian(max_depth=...).","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/#Training-model","page":"EvoTreeGaussian","title":"Training model","text":"","category":"section"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"A model is built using fit_evotree:","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"model = fit_evotree(config; x_train, y_train, kwargs...)","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/#Inference","page":"EvoTreeGaussian","title":"Inference","text":"","category":"section"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"Predictions are obtained using predict, which returns a Matrix of size [nobs, 2] whose columns correspond to μ and σ respectively:","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"EvoTrees.predict(model, X)","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"Alternatively, models act as functors, returning predictions when called as a function with features as argument:","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"model(X)","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/#MLJ","page":"EvoTreeGaussian","title":"MLJ","text":"","category":"section"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"From MLJ, the type can be imported using:","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"EvoTreeGaussian = @load EvoTreeGaussian pkg=EvoTrees","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"Do model = EvoTreeGaussian() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeGaussian(loss=...).","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/#Training-data","page":"EvoTreeGaussian","title":"Training data","text":"","category":"section"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"where","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)\ny: is the target, which can be any AbstractVector whose element scitype is <:Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/#Operations","page":"EvoTreeGaussian","title":"Operations","text":"","category":"section"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"predict(mach, Xnew): returns a vector of Gaussian distributions given features Xnew having the same scitype as X above.","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"Predictions are probabilistic.","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"Specific metrics can also be predicted using:","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"predict_mean(mach, Xnew)\npredict_mode(mach, Xnew)\npredict_median(mach, Xnew)","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/#Fitted-parameters","page":"EvoTreeGaussian","title":"Fitted parameters","text":"","category":"section"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":":fitresult: The GBTree object returned by EvoTrees.jl fitting algorithm.","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/#Report","page":"EvoTreeGaussian","title":"Report","text":"","category":"section"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"The fields of report(mach) are:","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":":features: The names of the features encountered in training.","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/#Examples","page":"EvoTreeGaussian","title":"Examples","text":"","category":"section"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"## Internal API\nusing EvoTrees\nparams = EvoTreeGaussian(max_depth=5, 
nbins=32, nrounds=100)\nnobs, nfeats = 1_000, 5\nx_train, y_train = randn(nobs, nfeats), rand(nobs)\nmodel = fit_evotree(params; x_train, y_train)\npreds = EvoTrees.predict(model, x_train)","category":"page"},{"location":"models/EvoTreeGaussian_EvoTrees/","page":"EvoTreeGaussian","title":"EvoTreeGaussian","text":"## MLJ Interface\nusing MLJ\nEvoTreeGaussian = @load EvoTreeGaussian pkg=EvoTrees\nmodel = EvoTreeGaussian(max_depth=5, nbins=32, nrounds=100)\nX, y = @load_boston\nmach = machine(model, X, y) |> fit!\npreds = predict(mach, X)\npreds = predict_mean(mach, X)\npreds = predict_mode(mach, X)\npreds = predict_median(mach, X)","category":"page"},{"location":"models/GaussianMixtureImputer_BetaML/#GaussianMixtureImputer_BetaML","page":"GaussianMixtureImputer","title":"GaussianMixtureImputer","text":"","category":"section"},{"location":"models/GaussianMixtureImputer_BetaML/","page":"GaussianMixtureImputer","title":"GaussianMixtureImputer","text":"mutable struct GaussianMixtureImputer <: MLJModelInterface.Unsupervised","category":"page"},{"location":"models/GaussianMixtureImputer_BetaML/","page":"GaussianMixtureImputer","title":"GaussianMixtureImputer","text":"Impute missing values using a probabilistic approach (Gaussian Mixture Models) fitted using the Expectation-Maximisation algorithm, from the Beta Machine Learning Toolkit (BetaML).","category":"page"},{"location":"models/GaussianMixtureImputer_BetaML/#Hyperparameters:","page":"GaussianMixtureImputer","title":"Hyperparameters:","text":"","category":"section"},{"location":"models/GaussianMixtureImputer_BetaML/","page":"GaussianMixtureImputer","title":"GaussianMixtureImputer","text":"n_classes::Int64: Number of mixtures (latent classes) to consider [def: 3]\ninitial_probmixtures::Vector{Float64}: Initial probabilities of the categorical distribution (n_classes x 1) [default: []]\nmixtures::Union{Type, Vector{<:BetaML.GMM.AbstractMixture}}: An array (of length n_classes) of the mixtures to employ (see the GMM module in BetaML). Each mixture object can be provided with or without its parameters (e.g. mean and variance for the gaussian ones). Fully qualified mixtures are useful only if the initialisation_strategy parameter is set to \"given\". This parameter can also be given simply in terms of a type; in this case it is automatically extended to a vector of n_classes mixtures of the specified type. Note that mixing of different mixture types is not currently supported, and that currently implemented mixtures are SphericalGaussian, DiagonalGaussian and FullGaussian. [def: DiagonalGaussian]\ntol::Float64: Tolerance to stop the algorithm [default: 10^(-6)]\nminimum_variance::Float64: Minimum variance for the mixtures [default: 0.05]\nminimum_covariance::Float64: Minimum covariance for the mixtures with full covariance matrix [default: 0]. This should be set differently from minimum_variance.\ninitialisation_strategy::String: The computation method of the vector of the initial mixtures. 
One of the following:\n\"grid\": using a grid approach\n\"given\": using the mixture provided in the fully qualified mixtures parameter\n\"kmeans\": first use kmeans (itself initialised with a \"grid\" strategy) to set the initial mixture centers [default]\nNote that currently \"random\" and \"shuffle\" initialisations are not supported in gmm-based algorithms.\nrng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]","category":"page"},{"location":"models/GaussianMixtureImputer_BetaML/#Example-:","page":"GaussianMixtureImputer","title":"Example :","text":"","category":"section"},{"location":"models/GaussianMixtureImputer_BetaML/","page":"GaussianMixtureImputer","title":"GaussianMixtureImputer","text":"julia> using MLJ\n\njulia> X = [1 10.5;1.5 missing; 1.8 8; 1.7 15; 3.2 40; missing missing; 3.3 38; missing -2.3; 5.2 -2.4] |> table ;\n\njulia> modelType = @load GaussianMixtureImputer pkg = \"BetaML\" verbosity=0\nBetaML.Imputation.GaussianMixtureImputer\n\njulia> model = modelType(initialisation_strategy=\"grid\")\nGaussianMixtureImputer(\n n_classes = 3, \n initial_probmixtures = Float64[], \n mixtures = BetaML.GMM.DiagonalGaussian{Float64}[BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing), BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing), BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing)], \n tol = 1.0e-6, \n minimum_variance = 0.05, \n minimum_covariance = 0.0, \n initialisation_strategy = \"grid\", \n rng = Random._GLOBAL_RNG())\n\njulia> mach = machine(model, X);\n\njulia> fit!(mach);\n[ Info: Training machine(GaussianMixtureImputer(n_classes = 3, …), …).\nIter. 1: Var. of the post 2.0225921341714286 Log-likelihood -42.96100103213314\n\njulia> X_full = transform(mach) |> MLJ.matrix\n9×2 Matrix{Float64}:\n 1.0 10.5\n 1.5 14.7366\n 1.8 8.0\n 1.7 15.0\n 3.2 40.0\n 2.51842 15.1747\n 3.3 38.0\n 2.47412 -2.3\n 5.2 -2.4","category":"page"},{"location":"models/HuberRegressor_MLJLinearModels/#HuberRegressor_MLJLinearModels","page":"HuberRegressor","title":"HuberRegressor","text":"","category":"section"},{"location":"models/HuberRegressor_MLJLinearModels/","page":"HuberRegressor","title":"HuberRegressor","text":"HuberRegressor","category":"page"},{"location":"models/HuberRegressor_MLJLinearModels/","page":"HuberRegressor","title":"HuberRegressor","text":"A model type for constructing a huber regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/HuberRegressor_MLJLinearModels/","page":"HuberRegressor","title":"HuberRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/HuberRegressor_MLJLinearModels/","page":"HuberRegressor","title":"HuberRegressor","text":"HuberRegressor = @load HuberRegressor pkg=MLJLinearModels","category":"page"},{"location":"models/HuberRegressor_MLJLinearModels/","page":"HuberRegressor","title":"HuberRegressor","text":"Do model = HuberRegressor() to construct an instance with default hyper-parameters.","category":"page"},{"location":"models/HuberRegressor_MLJLinearModels/","page":"HuberRegressor","title":"HuberRegressor","text":"This model coincides with RobustRegressor, with the exception that the robust loss, rho, is fixed to HuberRho(delta), where delta is a new hyperparameter.","category":"page"},{"location":"models/HuberRegressor_MLJLinearModels/","page":"HuberRegressor","title":"HuberRegressor","text":"Different solver options exist, as indicated under \"Hyperparameters\" 
below. ","category":"page"},{"location":"models/HuberRegressor_MLJLinearModels/#Training-data","page":"HuberRegressor","title":"Training data","text":"","category":"section"},{"location":"models/HuberRegressor_MLJLinearModels/","page":"HuberRegressor","title":"HuberRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/HuberRegressor_MLJLinearModels/","page":"HuberRegressor","title":"HuberRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/HuberRegressor_MLJLinearModels/","page":"HuberRegressor","title":"HuberRegressor","text":"where:","category":"page"},{"location":"models/HuberRegressor_MLJLinearModels/","page":"HuberRegressor","title":"HuberRegressor","text":"X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)\ny is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/HuberRegressor_MLJLinearModels/","page":"HuberRegressor","title":"HuberRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/HuberRegressor_MLJLinearModels/#Hyperparameters","page":"HuberRegressor","title":"Hyperparameters","text":"","category":"section"},{"location":"models/HuberRegressor_MLJLinearModels/","page":"HuberRegressor","title":"HuberRegressor","text":"delta::Real: parameterizes the HuberRho function (radius of the ball within which the loss is a quadratic loss) Default: 0.5\nlambda::Real: strength of the regularizer if penalty is :l2 or :l1. Strength of the L2 regularizer if penalty is :en. Default: 1.0\ngamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0\npenalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2\nfit_intercept::Bool: whether to fit the intercept or not. Default: true\npenalize_intercept::Bool: whether to penalize the intercept. Default: false\nscale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true\nsolver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S where S is one of: LBFGS, IWLSCG, Newton, NewtonCG, if penalty = :l2, and ProxGrad otherwise.\nIf solver = nothing (default) then LBFGS() is used, if penalty = :l2, and otherwise ProxGrad(accel=true) (FISTA) is used.\nSolver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) 
Default: nothing","category":"page"},{"location":"models/HuberRegressor_MLJLinearModels/#Example","page":"HuberRegressor","title":"Example","text":"","category":"section"},{"location":"models/HuberRegressor_MLJLinearModels/","page":"HuberRegressor","title":"HuberRegressor","text":"using MLJ\nX, y = make_regression()\nmach = fit!(machine(HuberRegressor(), X, y))\npredict(mach, X)\nfitted_params(mach)","category":"page"},{"location":"models/HuberRegressor_MLJLinearModels/","page":"HuberRegressor","title":"HuberRegressor","text":"See also RobustRegressor, QuantileRegressor.","category":"page"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/#UnivariateTimeTypeToContinuous_MLJModels","page":"UnivariateTimeTypeToContinuous","title":"UnivariateTimeTypeToContinuous","text":"","category":"section"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/","page":"UnivariateTimeTypeToContinuous","title":"UnivariateTimeTypeToContinuous","text":"UnivariateTimeTypeToContinuous","category":"page"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/","page":"UnivariateTimeTypeToContinuous","title":"UnivariateTimeTypeToContinuous","text":"A model type for constructing a single variable transformer that creates continuous representations of temporally typed data, based on MLJModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/","page":"UnivariateTimeTypeToContinuous","title":"UnivariateTimeTypeToContinuous","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/","page":"UnivariateTimeTypeToContinuous","title":"UnivariateTimeTypeToContinuous","text":"UnivariateTimeTypeToContinuous = @load UnivariateTimeTypeToContinuous pkg=MLJModels","category":"page"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/","page":"UnivariateTimeTypeToContinuous","title":"UnivariateTimeTypeToContinuous","text":"Do model = UnivariateTimeTypeToContinuous() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in UnivariateTimeTypeToContinuous(zero_time=...).","category":"page"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/","page":"UnivariateTimeTypeToContinuous","title":"UnivariateTimeTypeToContinuous","text":"Use this model to convert vectors with a TimeType element type to vectors of Float64 type (Continuous element scitype).","category":"page"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/#Training-data","page":"UnivariateTimeTypeToContinuous","title":"Training data","text":"","category":"section"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/","page":"UnivariateTimeTypeToContinuous","title":"UnivariateTimeTypeToContinuous","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/","page":"UnivariateTimeTypeToContinuous","title":"UnivariateTimeTypeToContinuous","text":"mach = machine(model, x)","category":"page"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/","page":"UnivariateTimeTypeToContinuous","title":"UnivariateTimeTypeToContinuous","text":"where","category":"page"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/","page":"UnivariateTimeTypeToContinuous","title":"UnivariateTimeTypeToContinuous","text":"x: any abstract vector whose element type is a subtype of Dates.TimeType","category":"page"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/","page":"UnivariateTimeTypeToContinuous","title":"UnivariateTimeTypeToContinuous","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/#Hyper-parameters","page":"UnivariateTimeTypeToContinuous","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/","page":"UnivariateTimeTypeToContinuous","title":"UnivariateTimeTypeToContinuous","text":"zero_time: the time that is to correspond to 0.0 under transformations, with the type coinciding with the training data element type. 
If unspecified, the earliest time encountered in training is used.\nstep::Period=Hour(24): time interval to correspond to one unit under transformation","category":"page"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/#Operations","page":"UnivariateTimeTypeToContinuous","title":"Operations","text":"","category":"section"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/","page":"UnivariateTimeTypeToContinuous","title":"UnivariateTimeTypeToContinuous","text":"transform(mach, xnew): apply the encoding inferred when mach was fit","category":"page"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/#Fitted-parameters","page":"UnivariateTimeTypeToContinuous","title":"Fitted parameters","text":"","category":"section"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/","page":"UnivariateTimeTypeToContinuous","title":"UnivariateTimeTypeToContinuous","text":"fitted_params(mach).fitresult is the tuple (zero_time, step) actually used in transformations, which may differ from the user-specified hyper-parameters.","category":"page"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/#Example","page":"UnivariateTimeTypeToContinuous","title":"Example","text":"","category":"section"},{"location":"models/UnivariateTimeTypeToContinuous_MLJModels/","page":"UnivariateTimeTypeToContinuous","title":"UnivariateTimeTypeToContinuous","text":"using MLJ\nusing Dates\n\nx = [Date(2001, 1, 1) + Day(i) for i in 0:4]\n\nencoder = UnivariateTimeTypeToContinuous(zero_time=Date(2000, 1, 1),\n step=Week(1))\n\nmach = machine(encoder, x)\nfit!(mach)\njulia> transform(mach, x)\n5-element Vector{Float64}:\n 52.285714285714285\n 52.42857142857143\n 52.57142857142857\n 52.714285714285715\n 52.857142","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/#AdaBoostStumpClassifier_DecisionTree","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"","category":"section"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"AdaBoostStumpClassifier","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"A model type for constructing an Ada-boosted stump classifier, based on DecisionTree.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"AdaBoostStumpClassifier = @load AdaBoostStumpClassifier pkg=DecisionTree","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"Do model = AdaBoostStumpClassifier() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in AdaBoostStumpClassifier(n_iter=...).","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/#Training-data","page":"AdaBoostStumpClassifier","title":"Training data","text":"","category":"section"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"where:","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)\ny: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"Train the machine with fit!(mach, rows=...).","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/#Hyperparameters","page":"AdaBoostStumpClassifier","title":"Hyperparameters","text":"","category":"section"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"n_iter=10: number of iterations of AdaBoost\nfeature_importance: method to use for computing feature importances. One of (:impurity, :split)\nrng=Random.GLOBAL_RNG: random number generator or seed","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/#Operations","page":"AdaBoostStumpClassifier","title":"Operations","text":"","category":"section"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above. 
Predictions are probabilistic, but uncalibrated.\npredict_mode(mach, Xnew): instead return the mode of each prediction above.","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/#Fitted-Parameters","page":"AdaBoostStumpClassifier","title":"Fitted Parameters","text":"","category":"section"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"stumps: the Ensemble object returned by the core DecisionTree.jl algorithm.\ncoefficients: the stump coefficients (one per stump)","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/#Report","page":"AdaBoostStumpClassifier","title":"Report","text":"","category":"section"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"The fields of report(mach) are:","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"features: the names of the features encountered in training","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/#Accessor-functions","page":"AdaBoostStumpClassifier","title":"Accessor functions","text":"","category":"section"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"feature_importances(mach) returns a vector of (feature::Symbol => importance) pairs; the type of importance is determined by the hyperparameter feature_importance (see above)","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/#Examples","page":"AdaBoostStumpClassifier","title":"Examples","text":"","category":"section"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"using MLJ\nBooster = @load AdaBoostStumpClassifier pkg=DecisionTree\nbooster = Booster(n_iter=15)\n\nX, y = @load_iris\nmach = machine(booster, X, y) |> fit!\n\nXnew = (sepal_length = [6.4, 7.2, 7.4],\n sepal_width = [2.8, 3.0, 2.8],\n petal_length = [5.6, 5.8, 6.1],\n petal_width = [2.1, 1.6, 1.9],)\nyhat = predict(mach, Xnew) ## probabilistic predictions\npredict_mode(mach, Xnew) ## point predictions\npdf.(yhat, \"virginica\") ## probabilities for the \"virginica\" class\n\nfitted_params(mach).stumps ## raw `Ensemble` object from DecisionTree.jl\nfitted_params(mach).coefs ## coefficients associated with each stump\nfeature_importances(mach)","category":"page"},{"location":"models/AdaBoostStumpClassifier_DecisionTree/","page":"AdaBoostStumpClassifier","title":"AdaBoostStumpClassifier","text":"See also DecisionTree.jl and the unwrapped model type MLJDecisionTreeInterface.DecisionTree.AdaBoostStumpClassifier.","category":"page"},{"location":"models/CBLOFDetector_OutlierDetectionPython/#CBLOFDetector_OutlierDetectionPython","page":"CBLOFDetector","title":"CBLOFDetector","text":"","category":"section"},{"location":"models/CBLOFDetector_OutlierDetectionPython/","page":"CBLOFDetector","title":"CBLOFDetector","text":"CBLOFDetector(n_clusters = 8,\n alpha = 0.9,\n beta = 5,\n use_weights = false,\n random_state = nothing,\n n_jobs = 
1)","category":"page"},{"location":"models/CBLOFDetector_OutlierDetectionPython/","page":"CBLOFDetector","title":"CBLOFDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.cblof","category":"page"},{"location":"models/LassoLarsRegressor_MLJScikitLearnInterface/#LassoLarsRegressor_MLJScikitLearnInterface","page":"LassoLarsRegressor","title":"LassoLarsRegressor","text":"","category":"section"},{"location":"models/LassoLarsRegressor_MLJScikitLearnInterface/","page":"LassoLarsRegressor","title":"LassoLarsRegressor","text":"LassoLarsRegressor","category":"page"},{"location":"models/LassoLarsRegressor_MLJScikitLearnInterface/","page":"LassoLarsRegressor","title":"LassoLarsRegressor","text":"A model type for constructing a Lasso model fit with least angle regression (LARS), based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LassoLarsRegressor_MLJScikitLearnInterface/","page":"LassoLarsRegressor","title":"LassoLarsRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LassoLarsRegressor_MLJScikitLearnInterface/","page":"LassoLarsRegressor","title":"LassoLarsRegressor","text":"LassoLarsRegressor = @load LassoLarsRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/LassoLarsRegressor_MLJScikitLearnInterface/","page":"LassoLarsRegressor","title":"LassoLarsRegressor","text":"Do model = LassoLarsRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LassoLarsRegressor(alpha=...).","category":"page"},{"location":"models/LassoLarsRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"LassoLarsRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/LassoLarsRegressor_MLJScikitLearnInterface/","page":"LassoLarsRegressor","title":"LassoLarsRegressor","text":"alpha = 1.0\nfit_intercept = true\nverbose = false\nprecompute = auto\nmax_iter = 500\neps = 2.220446049250313e-16\ncopy_X = true\nfit_path = true\npositive = false","category":"page"},{"location":"models/TSVDTransformer_TSVD/#TSVDTransformer_TSVD","page":"TSVDTransformer","title":"TSVDTransformer","text":"","category":"section"},{"location":"models/TSVDTransformer_TSVD/","page":"TSVDTransformer","title":"TSVDTransformer","text":"Truncated SVD dimensionality reduction","category":"page"},{"location":"models/COFDetector_OutlierDetectionPython/#COFDetector_OutlierDetectionPython","page":"COFDetector","title":"COFDetector","text":"","category":"section"},{"location":"models/COFDetector_OutlierDetectionPython/","page":"COFDetector","title":"COFDetector","text":"COFDetector(n_neighbors = 5,\n method=\"fast\")","category":"page"},{"location":"models/COFDetector_OutlierDetectionPython/","page":"COFDetector","title":"COFDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.cof","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/#ProbabilisticSVC_LIBSVM","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"","category":"section"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"ProbabilisticSVC","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"A model type for constructing a probabilistic C-support vector classifier, based on LIBSVM.jl, and implementing the MLJ model 
interface.","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"ProbabilisticSVC = @load ProbabilisticSVC pkg=LIBSVM","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"Do model = ProbabilisticSVC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ProbabilisticSVC(kernel=...).","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"This model is identical to SVC with the exception that it predicts probabilities, instead of actual class labels. Probabilities are computed using Platt scaling, which will add to the total computation time.","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): \"LIBSVM: a library for support vector machines.\" ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf. ","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"Platt, John (1999): \"Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods.\"","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/#Training-data","page":"ProbabilisticSVC","title":"Training data","text":"","category":"section"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"In MLJ or MLJBase, bind an instance model to data with one of:","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"mach = machine(model, X, y)\nmach = machine(model, X, y, w)","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"where","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)\ny: is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)\nw: a dictionary of class weights, keyed on levels(y).","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/#Hyper-parameters","page":"ProbabilisticSVC","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. 
Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see \"Examples\" below).\nLIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2\nLIBSVM.Kernel.Polynomial: (x1, x2) -> (gamma*x1'*x2 + coef0)^degree\nLIBSVM.Kernel.RadialBasis: (x1, x2) -> (exp(-gamma*norm(x1 - x2)^2))\nLIBSVM.Kernel.Sigmoid: (x1, x2) -> tanh(gamma*x1'*x2 + coef0)\nHere gamma, coef0, degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions. See LIBSVM.jl issue 91\ngamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. Actual value used appears in the report (see below).\ncoef0 = 0.0: kernel parameter (see above)\ndegree::Int32 = Int32(3): degree in polynomial kernel (see above)\ncost=1.0 (range (0, Inf)): the parameter denoted C in the cited reference; for greater regularization, decrease cost\ncachesize=200.0: cache memory size in MB\ntolerance=0.001: tolerance for the stopping criterion\nshrinking=true: whether to use shrinking heuristics","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/#Operations","page":"ProbabilisticSVC","title":"Operations","text":"","category":"section"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"predict(mach, Xnew): return probabilistic predictions of the target given features Xnew having the same scitype as X above.","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/#Fitted-parameters","page":"ProbabilisticSVC","title":"Fitted parameters","text":"","category":"section"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"libsvm_model: the trained model object created by the LIBSVM.jl package\nencoding: class encoding used internally by libsvm_model - a dictionary of class labels keyed on the internal integer representation","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/#Report","page":"ProbabilisticSVC","title":"Report","text":"","category":"section"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"The fields of report(mach) are:","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"gamma: actual value of the kernel parameter gamma used in training","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/#Examples","page":"ProbabilisticSVC","title":"Examples","text":"","category":"section"},{"location":"models/ProbabilisticSVC_LIBSVM/#Using-a-built-in-kernel","page":"ProbabilisticSVC","title":"Using a built-in kernel","text":"","category":"section"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"using MLJ\nimport LIBSVM\n\nProbabilisticSVC = @load ProbabilisticSVC pkg=LIBSVM ## model type\nmodel = ProbabilisticSVC(kernel=LIBSVM.Kernel.Polynomial) ## instance\n\nX, y = @load_iris ## table, vector\nmach = machine(model, X, y) |> fit!\n\nXnew = (sepal_length = [6.4, 7.2, 7.4],\n sepal_width = [2.8, 3.0, 2.8],\n petal_length = [5.6, 5.8, 6.1],\n petal_width = [2.1, 1.6, 1.9],)\n\njulia> probs = predict(mach, 
Xnew)\n3-element UnivariateFiniteVector{Multiclass{3}, String, UInt32, Float64}:\n UnivariateFinite{Multiclass{3}}(setosa=>0.00186, versicolor=>0.003, virginica=>0.995)\n UnivariateFinite{Multiclass{3}}(setosa=>0.000563, versicolor=>0.0554, virginica=>0.944)\n UnivariateFinite{Multiclass{3}}(setosa=>1.4e-6, versicolor=>1.68e-6, virginica=>1.0)\n\n\njulia> labels = mode.(probs)\n3-element CategoricalArrays.CategoricalArray{String,1,UInt32}:\n \"virginica\"\n \"virginica\"\n \"virginica\"","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/#User-defined-kernels","page":"ProbabilisticSVC","title":"User-defined kernels","text":"","category":"section"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"k(x1, x2) = x1'*x2 ## equivalent to `LIBSVM.Kernel.Linear`\nmodel = ProbabilisticSVC(kernel=k)\nmach = machine(model, X, y) |> fit!\n\nprobs = predict(mach, Xnew)","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/#Incorporating-class-weights","page":"ProbabilisticSVC","title":"Incorporating class weights","text":"","category":"section"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"In either scenario above, we can do:","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"weights = Dict(\"virginica\" => 1, \"versicolor\" => 20, \"setosa\" => 1)\nmach = machine(model, X, y, weights) |> fit!\n\nprobs = predict(mach, Xnew)","category":"page"},{"location":"models/ProbabilisticSVC_LIBSVM/","page":"ProbabilisticSVC","title":"ProbabilisticSVC","text":"See also the classifiers SVC, NuSVC and LinearSVC, and LIBSVM.jl and the original C implementation documentation.","category":"page"},{"location":"models/LogisticClassifier_MLJScikitLearnInterface/#LogisticClassifier_MLJScikitLearnInterface","page":"LogisticClassifier","title":"LogisticClassifier","text":"","category":"section"},{"location":"models/LogisticClassifier_MLJScikitLearnInterface/","page":"LogisticClassifier","title":"LogisticClassifier","text":"LogisticClassifier","category":"page"},{"location":"models/LogisticClassifier_MLJScikitLearnInterface/","page":"LogisticClassifier","title":"LogisticClassifier","text":"A model type for constructing a logistic regression classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LogisticClassifier_MLJScikitLearnInterface/","page":"LogisticClassifier","title":"LogisticClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LogisticClassifier_MLJScikitLearnInterface/","page":"LogisticClassifier","title":"LogisticClassifier","text":"LogisticClassifier = @load LogisticClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/LogisticClassifier_MLJScikitLearnInterface/","page":"LogisticClassifier","title":"LogisticClassifier","text":"Do model = LogisticClassifier() to construct an instance with default hyper-parameters. 
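As a minimal usage sketch (illustrative only, not part of the generated docstring; it assumes MLJScikitLearnInterface is installed and uses the iris table purely as example data):\n\nusing MLJ\nLogisticClassifier = @load LogisticClassifier pkg=MLJScikitLearnInterface\nX, y = @load_iris ## any table of Continuous features and a Multiclass target will do\nmach = machine(LogisticClassifier(), X, y) |> fit!\nprobs = predict(mach, X) ## probabilistic predictions\nlabels = predict_mode(mach, X) ## point predictions\n\n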
Provide keyword arguments to override hyper-parameter defaults, as in LogisticClassifier(penalty=...).","category":"page"},{"location":"models/LogisticClassifier_MLJScikitLearnInterface/#Hyper-parameters","page":"LogisticClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/LogisticClassifier_MLJScikitLearnInterface/","page":"LogisticClassifier","title":"LogisticClassifier","text":"penalty = l2\ndual = false\ntol = 0.0001\nC = 1.0\nfit_intercept = true\nintercept_scaling = 1.0\nclass_weight = nothing\nrandom_state = nothing\nsolver = lbfgs\nmax_iter = 100\nmulti_class = auto\nverbose = 0\nwarm_start = false\nn_jobs = nothing\nl1_ratio = nothing","category":"page"},{"location":"models/ImageClassifier_MLJFlux/#ImageClassifier_MLJFlux","page":"ImageClassifier","title":"ImageClassifier","text":"","category":"section"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"ImageClassifier","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"A model type for constructing an image classifier, based on MLJFlux.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"ImageClassifier = @load ImageClassifier pkg=MLJFlux","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"Do model = ImageClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ImageClassifier(builder=...).","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"ImageClassifier classifies images using a neural network adapted to the type of images provided (color or gray scale). Predictions are probabilistic. Users provide a recipe for constructing the network, based on properties of the image encountered, by specifying an appropriate builder. 
See MLJFlux documentation for more on builders.","category":"page"},{"location":"models/ImageClassifier_MLJFlux/#Training-data","page":"ImageClassifier","title":"Training data","text":"","category":"section"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"Here:","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"X is any AbstractVector of images with ColorImage or GrayImage scitype; check the scitype with scitype(X) and refer to ScientificTypes.jl documentation on coercing typical image formats into an appropriate type.\ny is the target, which can be any AbstractVector whose element scitype is Multiclass; check the scitype with scitype(y).","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"Train the machine with fit!(mach, rows=...).","category":"page"},{"location":"models/ImageClassifier_MLJFlux/#Hyper-parameters","page":"ImageClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"builder: An MLJFlux builder that constructs the neural network. The fallback builds a depth-16 VGG architecture adapted to the image size and number of target classes, with no batch normalization; see the Metalhead.jl documentation for details. See the example below for a user-specified builder. A convenience macro @builder is also available. See also finaliser below.\noptimiser::Flux.Adam(): A Flux.Optimise optimiser. The optimiser performs the updating of the weights of the network. For further reference, see the Flux optimiser documentation. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.\nloss=Flux.crossentropy: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. For a classification task, the most natural loss functions are:\nFlux.crossentropy: Standard multiclass classification loss, also known as the log loss.\nFlux.logitcrossentropy: Mathematically equal to crossentropy, but numerically more stable than finalising the outputs with softmax and then calculating crossentropy. You will need to specify finaliser=identity to remove MLJFlux's default softmax finaliser, and understand that the output of predict is then unnormalized (no longer probabilistic).\nFlux.tversky_loss: Used with imbalanced data to give more weight to false negatives.\nFlux.focal_loss: Used with highly imbalanced data. Weights harder examples more than easier examples.\nCurrently, MLJ measures are not supported as values of loss.\nepochs::Int=10: The duration of training, in epochs. Typically, one epoch represents one pass through the complete training dataset.\nbatch_size::Int=1: the batch size to be used for training, representing the number of samples per update of the network weights. 
Typically, batch size is between 8 and 512. Increasing batch size may accelerate training if acceleration=CUDALibs() and a GPU is available.\nlambda::Float64=0: The strength of the weight regularization penalty. Can be any value in the range [0, ∞).\nalpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.\nrng::Union{AbstractRNG, Int64}: The random number generator or seed used during training.\noptimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.\nacceleration::AbstractResource=CPU1(): Defines on what hardware training is done. For training on GPU, use CUDALibs().\nfinaliser=Flux.softmax: The final activation function of the neural network (applied after the network defined by builder). Defaults to Flux.softmax.","category":"page"},{"location":"models/ImageClassifier_MLJFlux/#Operations","page":"ImageClassifier","title":"Operations","text":"","category":"section"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"predict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above. Predictions are probabilistic but uncalibrated.\npredict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.","category":"page"},{"location":"models/ImageClassifier_MLJFlux/#Fitted-parameters","page":"ImageClassifier","title":"Fitted parameters","text":"","category":"section"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"chain: The trained \"chain\" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network. This includes the final layer specified by finaliser (eg, softmax).","category":"page"},{"location":"models/ImageClassifier_MLJFlux/#Report","page":"ImageClassifier","title":"Report","text":"","category":"section"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"The fields of report(mach) are:","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"training_losses: A vector of training losses (penalised if lambda != 0) in historical order, of length epochs + 1. 
The first element is the pre-training loss.","category":"page"},{"location":"models/ImageClassifier_MLJFlux/#Examples","page":"ImageClassifier","title":"Examples","text":"","category":"section"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"In this example we use MLJFlux and a custom builder to classify the MNIST image dataset.","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"using MLJ\nusing Flux\nimport MLJFlux\nimport MLJIteration ## for `skip` control","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"First we want to download the MNIST dataset, and unpack into images and labels:","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"import MLDatasets: MNIST\ndata = MNIST(split=:train)\nimages, labels = data.features, data.targets","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"In MLJ, integers cannot be used for encoding categorical data, so we must coerce them into the Multiclass scitype:","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"labels = coerce(labels, Multiclass);","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"Above images is a single array but MLJFlux requires the images to be a vector of individual image arrays:","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"images = coerce(images, GrayImage);\nimages[1]","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"We start by defining a suitable builder object. This is a recipe for building the neural network. Our builder will work for images of any (constant) size, whether they be color or black and white (ie, single or multi-channel). The architecture always consists of six alternating convolution and max-pool layers, and a final dense layer; the filter size and the number of channels after each convolution layer is customizable.","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"import MLJFlux\n\nstruct MyConvBuilder\n filter_size::Int\n channels1::Int\n channels2::Int\n channels3::Int\nend\n\nmake2d(x::AbstractArray) = reshape(x, :, size(x)[end])\n\nfunction MLJFlux.build(b::MyConvBuilder, rng, n_in, n_out, n_channels)\n k, c1, c2, c3 = b.filter_size, b.channels1, b.channels2, b.channels3\n mod(k, 2) == 1 || error(\"`filter_size` must be odd. \")\n p = div(k - 1, 2) ## padding to preserve image size\n init = Flux.glorot_uniform(rng)\n front = Chain(\n Conv((k, k), n_channels => c1, pad=(p, p), relu, init=init),\n MaxPool((2, 2)),\n Conv((k, k), c1 => c2, pad=(p, p), relu, init=init),\n MaxPool((2, 2)),\n Conv((k, k), c2 => c3, pad=(p, p), relu, init=init),\n MaxPool((2 ,2)),\n make2d)\n d = Flux.outputsize(front, (n_in..., n_channels, 1)) |> first\n return Chain(front, Dense(d, n_out, init=init))\nend","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"It is important to note that in our build function, there is no final softmax. 
This is applied by default in all MLJFlux classifiers (override this using the finaliser hyperparameter).","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"Now that our builder is defined, we can instantiate the actual MLJFlux model. If you have a GPU, you can substitute in acceleration=CUDALibs() below to speed up training.","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"ImageClassifier = @load ImageClassifier pkg=MLJFlux\nclf = ImageClassifier(builder=MyConvBuilder(3, 16, 32, 32),\n batch_size=50,\n epochs=10,\n rng=123)","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"You can add Flux options such as optimiser and loss in the snippet above. Currently, loss must be a flux-compatible loss, and not an MLJ measure.","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"Next, we can bind the model with the data in a machine, and train using the first 500 images:","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"mach = machine(clf, images, labels);\nfit!(mach, rows=1:500, verbosity=2);\nreport(mach)\nchain = fitted_params(mach)\nFlux.params(chain)[2]","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"We can tack on 20 more epochs by modifying the epochs field, and iteratively fit some more:","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"clf.epochs = clf.epochs + 20\nfit!(mach, rows=1:500, verbosity=2);","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"We can also make predictions and calculate an out-of-sample loss estimate, using any MLJ measure (loss/score):","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"predicted_labels = predict(mach, rows=501:1000);\ncross_entropy(predicted_labels, labels[501:1000]) |> mean","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"The preceding fit!/predict/evaluate workflow can be alternatively executed as follows:","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"evaluate!(mach,\n resampling=Holdout(fraction_train=0.5),\n measure=cross_entropy,\n rows=1:1000,\n verbosity=0)","category":"page"},{"location":"models/ImageClassifier_MLJFlux/","page":"ImageClassifier","title":"ImageClassifier","text":"See also NeuralNetworkClassifier.","category":"page"},{"location":"models/DeterministicConstantRegressor_MLJModels/#DeterministicConstantRegressor_MLJModels","page":"DeterministicConstantRegressor","title":"DeterministicConstantRegressor","text":"","category":"section"},{"location":"models/DeterministicConstantRegressor_MLJModels/","page":"DeterministicConstantRegressor","title":"DeterministicConstantRegressor","text":"DeterministicConstantRegressor","category":"page"},{"location":"models/DeterministicConstantRegressor_MLJModels/","page":"DeterministicConstantRegressor","title":"DeterministicConstantRegressor","text":"A model type for constructing a 
deterministic constant regressor, based on MLJModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/DeterministicConstantRegressor_MLJModels/","page":"DeterministicConstantRegressor","title":"DeterministicConstantRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/DeterministicConstantRegressor_MLJModels/","page":"DeterministicConstantRegressor","title":"DeterministicConstantRegressor","text":"DeterministicConstantRegressor = @load DeterministicConstantRegressor pkg=MLJModels","category":"page"},{"location":"models/DeterministicConstantRegressor_MLJModels/","page":"DeterministicConstantRegressor","title":"DeterministicConstantRegressor","text":"Do model = DeterministicConstantRegressor() to construct an instance with default hyper-parameters. ","category":"page"},{"location":"models/SMOTE_Imbalance/#SMOTE_Imbalance","page":"SMOTE","title":"SMOTE","text":"","category":"section"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"Initiate a SMOTE model with the given hyper-parameters.","category":"page"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"SMOTE","category":"page"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"A model type for constructing a smote, based on Imbalance.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"SMOTE = @load SMOTE pkg=Imbalance","category":"page"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"Do model = SMOTE() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SMOTE(k=...).","category":"page"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"SMOTE implements the SMOTE algorithm to correct for class imbalance as in N. V. Chawla, K. W. Bowyer, L. O. Hall, W. P. 
Kegelmeyer, “SMOTE: synthetic minority over-sampling technique,” Journal of Artificial Intelligence Research, 321-357, 2002.","category":"page"},{"location":"models/SMOTE_Imbalance/#Training-data","page":"SMOTE","title":"Training data","text":"","category":"section"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"In MLJ or MLJBase, wrap the model in a machine by","category":"page"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"mach = machine(model)","category":"page"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"There is no need to provide any data here because the model is a static transformer.","category":"page"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"Likewise, there is no need to fit!(mach).","category":"page"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"For default values of the hyper-parameters, model can be constructed by","category":"page"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"model = SMOTE()","category":"page"},{"location":"models/SMOTE_Imbalance/#Hyperparameters","page":"SMOTE","title":"Hyperparameters","text":"","category":"section"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"k=5: Number of nearest neighbors to consider in the SMOTE algorithm. Should be within the range [1, n - 1], where n is the number of observations; otherwise set to the nearest of these two values.\nratios=1.0: A parameter that controls the amount of oversampling to be done for each class\nCan be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class\nCan be a dictionary mapping each class label to the float ratio for that class\nrng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister.","category":"page"},{"location":"models/SMOTE_Imbalance/#Transform-Inputs","page":"SMOTE","title":"Transform Inputs","text":"","category":"section"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"X: A matrix or table of floats where each row is an observation from the dataset\ny: An abstract vector of labels (e.g., strings) that correspond to the observations in X","category":"page"},{"location":"models/SMOTE_Imbalance/#Transform-Outputs","page":"SMOTE","title":"Transform Outputs","text":"","category":"section"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"Xover: A matrix or table that includes original data and the new observations due to oversampling, 
depending on whether the input X is a matrix or table respectively\nyover: An abstract vector of labels corresponding to Xover","category":"page"},{"location":"models/SMOTE_Imbalance/#Operations","page":"SMOTE","title":"Operations","text":"","category":"section"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"transform(mach, X, y): resample the data X and y using SMOTE, returning both the new and original observations","category":"page"},{"location":"models/SMOTE_Imbalance/#Example","page":"SMOTE","title":"Example","text":"","category":"section"},{"location":"models/SMOTE_Imbalance/","page":"SMOTE","title":"SMOTE","text":"using MLJ\nimport Imbalance\n\n## set probability of each class\nclass_probs = [0.5, 0.2, 0.3] \nnum_rows, num_continuous_feats = 100, 5\n## generate a table and categorical vector accordingly\nX, y = Imbalance.generate_imbalanced_data(num_rows, num_continuous_feats; \n class_probs, rng=42) \n\njulia> Imbalance.checkbalance(y)\n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (39.6%) \n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 33 (68.8%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%) \n\n## load SMOTE\nSMOTE = @load SMOTE pkg=Imbalance\n\n## wrap the model in a machine\noversampler = SMOTE(k=5, ratios=Dict(0=>1.0, 1=> 0.9, 2=>0.8), rng=42)\nmach = machine(oversampler)\n\n## provide the data to transform (there is nothing to fit)\nXover, yover = transform(mach, X, y)\n\njulia> Imbalance.checkbalance(yover)\n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 38 (79.2%) \n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 43 (89.6%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%) \n","category":"page"},{"location":"adding_models_for_general_use/#Adding-Models-for-General-Use","page":"Adding Models for General Use","title":"Adding Models for General Use","text":"","category":"section"},{"location":"adding_models_for_general_use/","page":"Adding Models for General Use","title":"Adding Models for General Use","text":"To write a complete MLJ model interface for new or existing machine learning models, suitable for addition to the MLJ Model Registry, consult the MLJModelInterface.jl documentation.","category":"page"},{"location":"adding_models_for_general_use/","page":"Adding Models for General Use","title":"Adding Models for General Use","text":"For quick-and-dirty user-defined models see Simple User Defined Models.","category":"page"},{"location":"models/PartLS_PartitionedLS/#PartLS_PartitionedLS","page":"PartLS","title":"PartLS","text":"","category":"section"},{"location":"models/PartLS_PartitionedLS/","page":"PartLS","title":"PartLS","text":"PartLS","category":"page"},{"location":"models/PartLS_PartitionedLS/","page":"PartLS","title":"PartLS","text":"A model type for fitting a partitioned least squares model to data. Both an MLJ and native interface are provided.","category":"page"},{"location":"models/PartLS_PartitionedLS/#MLJ-Interface","page":"PartLS","title":"MLJ Interface","text":"","category":"section"},{"location":"models/PartLS_PartitionedLS/","page":"PartLS","title":"PartLS","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/PartLS_PartitionedLS/","page":"PartLS","title":"PartLS","text":"PartLS = @load PartLS pkg=PartitionedLS","category":"page"},{"location":"models/PartLS_PartitionedLS/","page":"PartLS","title":"PartLS","text":"Construct an instance with default hyper-parameters using the syntax model = PartLS(). 
Provide keyword arguments to override hyper-parameter defaults, as in model = PartLS(P=...).","category":"page"},{"location":"models/PartLS_PartitionedLS/#Training-data","page":"PartLS","title":"Training data","text":"","category":"section"},{"location":"models/PartLS_PartitionedLS/","page":"PartLS","title":"PartLS","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/PartLS_PartitionedLS/","page":"PartLS","title":"PartLS","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/PartLS_PartitionedLS/","page":"PartLS","title":"PartLS","text":"where","category":"page"},{"location":"models/PartLS_PartitionedLS/","page":"PartLS","title":"PartLS","text":"X: any matrix or table with Continuous element scitype. Check column scitypes of a table X with schema(X).\ny: any vector with Continuous element scitype. Check scitype with scitype(y).","category":"page"},{"location":"models/PartLS_PartitionedLS/","page":"PartLS","title":"PartLS","text":"Train the machine using fit!(mach).","category":"page"},{"location":"models/PartLS_PartitionedLS/#Hyper-parameters","page":"PartLS","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/PartLS_PartitionedLS/","page":"PartLS","title":"PartLS","text":"Optimizer: the optimization algorithm to use. It can be Opt, Alt or BnB (names exported by PartitionedLS.jl).\nP: the partition matrix. It is a binary matrix where each row corresponds to a partition and each column corresponds to a feature. The element P_{k, i} = 1 if feature i belongs to partition k.\nη: the regularization parameter. It controls the strength of the regularization.\nϵ: the tolerance parameter. It is used to determine when the Alt optimization algorithm has converged. Only used by the Alt algorithm.\nT: the maximum number of iterations. It is used to determine when to stop the Alt optimization algorithm. Only used by the Alt algorithm.\nrng: the random number generator to use.\nIf nothing, the global random number generator rand is used.\nIf an integer, the global random number generator rand is used after seeding it with the given integer.\nIf an object of type AbstractRNG, the given random number generator is used.","category":"page"},{"location":"models/PartLS_PartitionedLS/#Operations","page":"PartLS","title":"Operations","text":"","category":"section"},{"location":"models/PartLS_PartitionedLS/","page":"PartLS","title":"PartLS","text":"predict(mach, Xnew): return the predictions of the model on new data Xnew","category":"page"},{"location":"models/PartLS_PartitionedLS/#Fitted-parameters","page":"PartLS","title":"Fitted parameters","text":"","category":"section"},{"location":"models/PartLS_PartitionedLS/","page":"PartLS","title":"PartLS","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/PartLS_PartitionedLS/","page":"PartLS","title":"PartLS","text":"α: the values of the α variables. For each partition k, the α values satisfy sum_{i in P_k} α_i = 1.\nβ: the values of the β variables. For each partition k, β_k is the coefficient that multiplies the features in the k-th partition.\nt: the intercept term of the model.\nP: the partition matrix. It is a binary matrix where each row corresponds to a partition and each column corresponds to a feature. 
The element P_{k, i} = 1 if feature i belongs to partition k.","category":"page"},{"location":"models/PartLS_PartitionedLS/#Examples","page":"PartLS","title":"Examples","text":"","category":"section"},{"location":"models/PartLS_PartitionedLS/","page":"PartLS","title":"PartLS","text":"PartLS = @load PartLS pkg=PartitionedLS\n\nX = [[1. 2. 3.];\n [3. 3. 4.];\n [8. 1. 3.];\n [5. 3. 1.]]\n\ny = [1.;\n 1.;\n 2.;\n 3.]\n\nP = [[1 0];\n [1 0];\n [0 1]]\n\n\nmodel = PartLS(P=P)\nmach = machine(model, X, y) |> fit!\n\n## predictions on the training set:\npredict(mach, X)\n","category":"page"},{"location":"models/PartLS_PartitionedLS/#Native-Interface","page":"PartLS","title":"Native Interface","text":"","category":"section"},{"location":"models/PartLS_PartitionedLS/","page":"PartLS","title":"PartLS","text":"using PartitionedLS\n\nX = [[1. 2. 3.];\n [3. 3. 4.];\n [8. 1. 3.];\n [5. 3. 1.]]\n\ny = [1.;\n 1.;\n 2.;\n 3.]\n\nP = [[1 0];\n [1 0];\n [0 1]]\n\n\n## fit using the optimal algorithm\nresult = fit(Opt, X, y, P, η = 0.0)\ny_hat = predict(result.model, X)","category":"page"},{"location":"models/PartLS_PartitionedLS/","page":"PartLS","title":"PartLS","text":"For other fit keyword options, refer to the \"Hyper-parameters\" section for the MLJ interface.","category":"page"},{"location":"models/HDBSCAN_MLJScikitLearnInterface/#HDBSCAN_MLJScikitLearnInterface","page":"HDBSCAN","title":"HDBSCAN","text":"","category":"section"},{"location":"models/HDBSCAN_MLJScikitLearnInterface/","page":"HDBSCAN","title":"HDBSCAN","text":"HDBSCAN","category":"page"},{"location":"models/HDBSCAN_MLJScikitLearnInterface/","page":"HDBSCAN","title":"HDBSCAN","text":"A model type for constructing a hdbscan, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/HDBSCAN_MLJScikitLearnInterface/","page":"HDBSCAN","title":"HDBSCAN","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/HDBSCAN_MLJScikitLearnInterface/","page":"HDBSCAN","title":"HDBSCAN","text":"HDBSCAN = @load HDBSCAN pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/HDBSCAN_MLJScikitLearnInterface/","page":"HDBSCAN","title":"HDBSCAN","text":"Do model = HDBSCAN() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in HDBSCAN(min_cluster_size=...).","category":"page"},{"location":"models/HDBSCAN_MLJScikitLearnInterface/","page":"HDBSCAN","title":"HDBSCAN","text":"Hierarchical Density-Based Spatial Clustering of Applications with Noise. Performs DBSCAN over varying epsilon values and integrates the result to find a clustering that gives the best stability over epsilon. This allows HDBSCAN to find clusters of varying densities (unlike DBSCAN), and be more robust to parameter selection. 
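As a minimal usage sketch (illustrative only, not part of the generated docstring; it assumes MLJScikitLearnInterface and its scikit-learn dependency are installed, and uses synthetic data):\n\nusing MLJ\nHDBSCAN = @load HDBSCAN pkg=MLJScikitLearnInterface\nmodel = HDBSCAN(min_cluster_size=5)\nX, _ = make_blobs(200, 2) ## synthetic clustered data\nmach = machine(model, X) |> fit!\nfitted_params(mach) ## the wrapped scikit-learn estimator\nreport(mach) ## inspect for cluster-related output\n\nWhere exactly the cluster assignments are exposed depends on the wrapper; inspect fitted_params(mach) and report(mach) after fitting. 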
","category":"page"},{"location":"models/MultiTaskElasticNetCVRegressor_MLJScikitLearnInterface/#MultiTaskElasticNetCVRegressor_MLJScikitLearnInterface","page":"MultiTaskElasticNetCVRegressor","title":"MultiTaskElasticNetCVRegressor","text":"","category":"section"},{"location":"models/MultiTaskElasticNetCVRegressor_MLJScikitLearnInterface/","page":"MultiTaskElasticNetCVRegressor","title":"MultiTaskElasticNetCVRegressor","text":"MultiTaskElasticNetCVRegressor","category":"page"},{"location":"models/MultiTaskElasticNetCVRegressor_MLJScikitLearnInterface/","page":"MultiTaskElasticNetCVRegressor","title":"MultiTaskElasticNetCVRegressor","text":"A model type for constructing a multi-target elastic net regressor with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/MultiTaskElasticNetCVRegressor_MLJScikitLearnInterface/","page":"MultiTaskElasticNetCVRegressor","title":"MultiTaskElasticNetCVRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/MultiTaskElasticNetCVRegressor_MLJScikitLearnInterface/","page":"MultiTaskElasticNetCVRegressor","title":"MultiTaskElasticNetCVRegressor","text":"MultiTaskElasticNetCVRegressor = @load MultiTaskElasticNetCVRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/MultiTaskElasticNetCVRegressor_MLJScikitLearnInterface/","page":"MultiTaskElasticNetCVRegressor","title":"MultiTaskElasticNetCVRegressor","text":"Do model = MultiTaskElasticNetCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultiTaskElasticNetCVRegressor(l1_ratio=...).","category":"page"},{"location":"models/MultiTaskElasticNetCVRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"MultiTaskElasticNetCVRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/MultiTaskElasticNetCVRegressor_MLJScikitLearnInterface/","page":"MultiTaskElasticNetCVRegressor","title":"MultiTaskElasticNetCVRegressor","text":"l1_ratio = 0.5\neps = 0.001\nn_alphas = 100\nalphas = nothing\nfit_intercept = true\nmax_iter = 1000\ntol = 0.0001\ncv = 5\ncopy_X = true\nverbose = 0\nn_jobs = nothing\nrandom_state = nothing\nselection = cyclic","category":"page"},{"location":"models/XGBoostRegressor_XGBoost/#XGBoostRegressor_XGBoost","page":"XGBoostRegressor","title":"XGBoostRegressor","text":"","category":"section"},{"location":"models/XGBoostRegressor_XGBoost/","page":"XGBoostRegressor","title":"XGBoostRegressor","text":"XGBoostRegressor","category":"page"},{"location":"models/XGBoostRegressor_XGBoost/","page":"XGBoostRegressor","title":"XGBoostRegressor","text":"A model type for constructing a eXtreme Gradient Boosting Regressor, based on XGBoost.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/XGBoostRegressor_XGBoost/","page":"XGBoostRegressor","title":"XGBoostRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/XGBoostRegressor_XGBoost/","page":"XGBoostRegressor","title":"XGBoostRegressor","text":"XGBoostRegressor = @load XGBoostRegressor pkg=XGBoost","category":"page"},{"location":"models/XGBoostRegressor_XGBoost/","page":"XGBoostRegressor","title":"XGBoostRegressor","text":"Do model = XGBoostRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in XGBoostRegressor(test=...).","category":"page"},{"location":"models/XGBoostRegressor_XGBoost/","page":"XGBoostRegressor","title":"XGBoostRegressor","text":"Univariate continuous regression using xgboost.","category":"page"},{"location":"models/XGBoostRegressor_XGBoost/#Training-data","page":"XGBoostRegressor","title":"Training data","text":"","category":"section"},{"location":"models/XGBoostRegressor_XGBoost/","page":"XGBoostRegressor","title":"XGBoostRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/XGBoostRegressor_XGBoost/","page":"XGBoostRegressor","title":"XGBoostRegressor","text":"m = machine(model, X, y)","category":"page"},{"location":"models/XGBoostRegressor_XGBoost/","page":"XGBoostRegressor","title":"XGBoostRegressor","text":"where","category":"page"},{"location":"models/XGBoostRegressor_XGBoost/","page":"XGBoostRegressor","title":"XGBoostRegressor","text":"X: any table of input features whose columns have Continuous element scitype; check column scitypes with schema(X).\ny: is an AbstractVector target with Continuous elements; check the scitype with scitype(y).","category":"page"},{"location":"models/XGBoostRegressor_XGBoost/","page":"XGBoostRegressor","title":"XGBoostRegressor","text":"Train using fit!(m, rows=...).","category":"page"},{"location":"models/XGBoostRegressor_XGBoost/#Hyper-parameters","page":"XGBoostRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/XGBoostRegressor_XGBoost/","page":"XGBoostRegressor","title":"XGBoostRegressor","text":"See https://xgboost.readthedocs.io/en/stable/parameter.html.","category":"page"},{"location":"models/LinearCountRegressor_GLM/#LinearCountRegressor_GLM","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"","category":"section"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"LinearCountRegressor","category":"page"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"A model type for constructing a linear count regressor, based on GLM.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"LinearCountRegressor = @load LinearCountRegressor pkg=GLM","category":"page"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"Do model = LinearCountRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearCountRegressor(fit_intercept=...).","category":"page"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"LinearCountRegressor is a generalized linear model, specialised to the case of a Count target variable (non-negative, unbounded integer) with user-specified link function. 
Options exist to specify an intercept or offset feature.","category":"page"},{"location":"models/LinearCountRegressor_GLM/#Training-data","page":"LinearCountRegressor","title":"Training data","text":"","category":"section"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"In MLJ or MLJBase, bind an instance model to data with one of:","category":"page"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"mach = machine(model, X, y)\nmach = machine(model, X, y, w)","category":"page"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"Here","category":"page"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"X: is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check the scitype with schema(X)\ny: is the target, which can be any AbstractVector whose element scitype is Count; check the scitype with scitype(y)\nw: is a vector of Real per-observation weights","category":"page"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/LinearCountRegressor_GLM/#Hyper-parameters","page":"LinearCountRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"fit_intercept=true: Whether to calculate the intercept for this model. If set to false, no intercept will be calculated (e.g. the data is expected to be centered)\ndistribution=Distributions.Poisson(): The distribution which the residuals/errors of the model should fit.\nlink=GLM.LogLink(): The function which links the linear prediction function to the probability of a particular outcome or class. This should be one of the following: GLM.IdentityLink(), GLM.InverseLink(), GLM.InverseSquareLink(), GLM.LogLink(), GLM.SqrtLink().\noffsetcol=nothing: Name of the column to be used as an offset, if any. An offset is a variable which is known to have a coefficient of 1.\nmaxiter::Integer=30: The maximum number of iterations allowed to achieve convergence.\natol::Real=1e-6: Absolute threshold for convergence. Convergence is achieved when the relative change in deviance is less than max(rtol*dev, atol). This term exists to avoid failure when deviance is unchanged except for rounding errors.\nrtol::Real=1e-6: Relative threshold for convergence. Convergence is achieved when the relative change in deviance is less than max(rtol*dev, atol). This term exists to avoid failure when deviance is unchanged except for rounding errors.\nminstepfac::Real=0.001: Minimum step fraction. Must be between 0 and 1. Lower bound for the factor used to update the linear fit.\nreport_keys: Vector of keys for the report. Possible keys are: :deviance, :dof_residual, :stderror, :vcov, :coef_table and :glm_model. 
By default only :glm_model is excluded.","category":"page"},{"location":"models/LinearCountRegressor_GLM/#Operations","page":"LinearCountRegressor","title":"Operations","text":"","category":"section"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"predict(mach, Xnew): return predictions of the target given new features Xnew having the same Scitype as X above. Predictions are probabilistic.\npredict_mean(mach, Xnew): instead return the mean of each prediction above\npredict_median(mach, Xnew): instead return the median of each prediction above.","category":"page"},{"location":"models/LinearCountRegressor_GLM/#Fitted-parameters","page":"LinearCountRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"features: The names of the features encountered during model fitting.\ncoef: The linear coefficients determined by the model.\nintercept: The intercept determined by the model.","category":"page"},{"location":"models/LinearCountRegressor_GLM/#Report","page":"LinearCountRegressor","title":"Report","text":"","category":"section"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"The fields of report(mach) are:","category":"page"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"deviance: Measure of deviance of fitted model with respect to a perfectly fitted model. For a linear model, this is the weighted residual sum of squares\ndof_residual: The degrees of freedom for residuals, when meaningful.\nstderror: The standard errors of the coefficients.\nvcov: The estimated variance-covariance matrix of the coefficient estimates.\ncoef_table: Table which displays coefficients and summarizes their significance and confidence intervals.\nglm_model: The raw fitted model returned by GLM.lm. Note this points to training data. 
Refer to the GLM.jl documentation for usage.","category":"page"},{"location":"models/LinearCountRegressor_GLM/#Examples","page":"LinearCountRegressor","title":"Examples","text":"","category":"section"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"using MLJ\nimport MLJ.Distributions.Poisson\n\n## Generate some data whose target y looks Poisson when conditioned on\n## X:\nN = 10_000\nw = [1.0, -2.0, 3.0]\nmu(x) = exp(w'x) ## mean for a log link function\nXmat = rand(N, 3)\nX = MLJ.table(Xmat)\ny = map(1:N) do i\n x = Xmat[i, :]\n rand(Poisson(mu(x)))\nend;\n\nCountRegressor = @load LinearCountRegressor pkg=GLM\nmodel = CountRegressor(fit_intercept=false)\nmach = machine(model, X, y)\nfit!(mach)\n\nXnew = MLJ.table(rand(3, 3))\nyhat = predict(mach, Xnew)\nyhat_point = predict_mean(mach, Xnew)\n\n## get coefficients approximating `w`:\njulia> fitted_params(mach).coef\n3-element Vector{Float64}:\n 0.9969008753103842\n -2.0255901752504775\n 3.014407534033522\n\nreport(mach)","category":"page"},{"location":"models/LinearCountRegressor_GLM/","page":"LinearCountRegressor","title":"LinearCountRegressor","text":"See also LinearRegressor, LinearBinaryClassifier","category":"page"},{"location":"models/ElasticNetCVRegressor_MLJScikitLearnInterface/#ElasticNetCVRegressor_MLJScikitLearnInterface","page":"ElasticNetCVRegressor","title":"ElasticNetCVRegressor","text":"","category":"section"},{"location":"models/ElasticNetCVRegressor_MLJScikitLearnInterface/","page":"ElasticNetCVRegressor","title":"ElasticNetCVRegressor","text":"ElasticNetCVRegressor","category":"page"},{"location":"models/ElasticNetCVRegressor_MLJScikitLearnInterface/","page":"ElasticNetCVRegressor","title":"ElasticNetCVRegressor","text":"A model type for constructing a elastic net regression with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/ElasticNetCVRegressor_MLJScikitLearnInterface/","page":"ElasticNetCVRegressor","title":"ElasticNetCVRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/ElasticNetCVRegressor_MLJScikitLearnInterface/","page":"ElasticNetCVRegressor","title":"ElasticNetCVRegressor","text":"ElasticNetCVRegressor = @load ElasticNetCVRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/ElasticNetCVRegressor_MLJScikitLearnInterface/","page":"ElasticNetCVRegressor","title":"ElasticNetCVRegressor","text":"Do model = ElasticNetCVRegressor() to construct an instance with default hyper-parameters. 
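As a minimal usage sketch (illustrative only, not part of the generated docstring; the synthetic data comes from MLJ's make_regression helper):\n\nusing MLJ\nElasticNetCVRegressor = @load ElasticNetCVRegressor pkg=MLJScikitLearnInterface\nX, y = make_regression(100, 3) ## table of Continuous features and a Continuous target\nmach = machine(ElasticNetCVRegressor(), X, y) |> fit!\nyhat = predict(mach, X)\n\n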
Provide keyword arguments to override hyper-parameter defaults, as in ElasticNetCVRegressor(l1_ratio=...).","category":"page"},{"location":"models/ElasticNetCVRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"ElasticNetCVRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/ElasticNetCVRegressor_MLJScikitLearnInterface/","page":"ElasticNetCVRegressor","title":"ElasticNetCVRegressor","text":"l1_ratio = 0.5\neps = 0.001\nn_alphas = 100\nalphas = nothing\nfit_intercept = true\nprecompute = auto\nmax_iter = 1000\ntol = 0.0001\ncv = 5\ncopy_X = true\nverbose = 0\nn_jobs = nothing\npositive = false\nrandom_state = nothing\nselection = cyclic","category":"page"},{"location":"models/NeuralNetworkRegressor_BetaML/#NeuralNetworkRegressor_BetaML","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"","category":"section"},{"location":"models/NeuralNetworkRegressor_BetaML/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"mutable struct NeuralNetworkRegressor <: MLJModelInterface.Deterministic","category":"page"},{"location":"models/NeuralNetworkRegressor_BetaML/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"A simple but flexible Feedforward Neural Network, from the Beta Machine Learning Toolkit (BetaML) for regression of a single-dimensional target.","category":"page"},{"location":"models/NeuralNetworkRegressor_BetaML/#Parameters:","page":"NeuralNetworkRegressor","title":"Parameters:","text":"","category":"section"},{"location":"models/NeuralNetworkRegressor_BetaML/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"layers: Array of layer objects [def: nothing, i.e. basic network]. See subtypes(BetaML.AbstractLayer) for supported layers\nloss: Loss (cost) function [def: BetaML.squared_cost]. Should always assume y and ŷ as matrices, even if the regression task is 1-D\nwarning: Warning\nIf you change the parameter loss, you need to either provide its derivative on the parameter dloss or use autodiff with dloss=nothing.\ndloss: Derivative of the loss function [def: BetaML.dsquared_cost, i.e. use the derivative of the squared cost]. Use nothing for autodiff.\nepochs: Number of epochs, i.e. passes through the whole training sample [def: 200]\nbatch_size: Size of each individual batch [def: 16]\nopt_alg: The optimisation algorithm to update the gradient at each batch [def: BetaML.ADAM()]. 
See subtypes(BetaML.OptimisationAlgorithm) for supported optimizers\nshuffle: Whether to randomly shuffle the data at each iteration (epoch) [def: true]\ndescr: An optional title and/or description for this model\ncb: A call back function to provide information during training [def: fitting_info]\nrng: Random Number Generator (see FIXEDSEED) [deafult: Random.GLOBAL_RNG]","category":"page"},{"location":"models/NeuralNetworkRegressor_BetaML/#Notes:","page":"NeuralNetworkRegressor","title":"Notes:","text":"","category":"section"},{"location":"models/NeuralNetworkRegressor_BetaML/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"data must be numerical\nthe label should be be a n-records vector.","category":"page"},{"location":"models/NeuralNetworkRegressor_BetaML/#Example:","page":"NeuralNetworkRegressor","title":"Example:","text":"","category":"section"},{"location":"models/NeuralNetworkRegressor_BetaML/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"julia> using MLJ\n\njulia> X, y = @load_boston;\n\njulia> modelType = @load NeuralNetworkRegressor pkg = \"BetaML\" verbosity=0\nBetaML.Nn.NeuralNetworkRegressor\n\njulia> layers = [BetaML.DenseLayer(12,20,f=BetaML.relu),BetaML.DenseLayer(20,20,f=BetaML.relu),BetaML.DenseLayer(20,1,f=BetaML.relu)];\n\njulia> model = modelType(layers=layers,opt_alg=BetaML.ADAM());\nNeuralNetworkRegressor(\n layers = BetaML.Nn.AbstractLayer[BetaML.Nn.DenseLayer([-0.23249759178069676 -0.4125090172711131 … 0.41401934928739 -0.33017881111237535; -0.27912169279319965 0.270551221249931 … 0.19258414323473344 0.1703002982374256; … ; 0.31186742456482447 0.14776438287394805 … 0.3624993442655036 0.1438885872964824; 0.24363744610286758 -0.3221033024934767 … 0.14886090419299408 0.038411663101909355], [-0.42360286004241765, -0.34355377040029594, 0.11510963232946697, 0.29078650404397893, -0.04940236502546075, 0.05142849152316714, -0.177685375947775, 0.3857630523957018, -0.25454667127064756, -0.1726731848206195, 0.29832456225553444, -0.21138505291162835, -0.15763643112604903, -0.08477044513587562, -0.38436681165349196, 0.20538016429104916, -0.25008157754468335, 0.268681800562054, 0.10600581996650865, 0.4262194464325672], BetaML.Utils.relu, BetaML.Utils.drelu), BetaML.Nn.DenseLayer([-0.08534180387478185 0.19659398307677617 … -0.3413633217504578 -0.0484925247381256; 0.0024419192794883915 -0.14614102508129 … -0.21912059923003044 0.2680725396694708; … ; 0.25151545823147886 -0.27532269951606037 … 0.20739970895058063 0.2891938885916349; -0.1699020711688904 -0.1350423717084296 … 0.16947589410758873 0.3629006047373296], [0.2158116357688406, -0.3255582642532289, -0.057314442103850394, 0.29029696770539953, 0.24994080694366455, 0.3624239027782297, -0.30674318230919984, -0.3854738338935017, 0.10809721838554087, 0.16073511121016176, -0.005923262068960489, 0.3157147976348795, -0.10938918304264739, -0.24521229198853187, -0.307167732178712, 0.0808907777008302, -0.014577497150872254, -0.0011287181458157214, 0.07522282588658086, 0.043366500526073104], BetaML.Utils.relu, BetaML.Utils.drelu), BetaML.Nn.DenseLayer([-0.021367697115938555 -0.28326652172347155 … 0.05346175368370165 -0.26037328415871647], [-0.2313659199724562], BetaML.Utils.relu, BetaML.Utils.drelu)], \n loss = BetaML.Utils.squared_cost, \n dloss = BetaML.Utils.dsquared_cost, \n epochs = 100, \n batch_size = 32, \n opt_alg = BetaML.Nn.ADAM(BetaML.Nn.var\"#90#93\"(), 1.0, 0.9, 0.999, 1.0e-8, BetaML.Nn.Learnable[], BetaML.Nn.Learnable[]), \n shuffle = true, \n descr = \"\", \n cb = 
BetaML.Nn.fitting_info, \n rng = Random._GLOBAL_RNG())\n\njulia> mach = machine(model, X, y);\n\njulia> fit!(mach);\n\njulia> ŷ = predict(mach, X);\n\njulia> hcat(y,ŷ)\n506×2 Matrix{Float64}:\n 24.0 30.7726\n 21.6 28.0811\n 34.7 31.3194\n ⋮ \n 23.9 30.9032\n 22.0 29.49\n 11.9 27.2438","category":"page"},{"location":"learning_mlj/#Learning-MLJ","page":"Learning MLJ","title":"Learning MLJ","text":"","category":"section"},{"location":"learning_mlj/","page":"Learning MLJ","title":"Learning MLJ","text":"MLJ Cheatsheet","category":"page"},{"location":"learning_mlj/","page":"Learning MLJ","title":"Learning MLJ","text":"See also Getting help and reporting problems.","category":"page"},{"location":"learning_mlj/","page":"Learning MLJ","title":"Learning MLJ","text":"The present document, although littered with examples, is primarily intended as a complete reference. ","category":"page"},{"location":"learning_mlj/#Where-to-start?","page":"Learning MLJ","title":"Where to start?","text":"","category":"section"},{"location":"learning_mlj/#Completely-new-to-Julia?","page":"Learning MLJ","title":"Completely new to Julia?","text":"","category":"section"},{"location":"learning_mlj/","page":"Learning MLJ","title":"Learning MLJ","text":"Julia's learning resources page | Learn X in Y minutes | HelloJulia","category":"page"},{"location":"learning_mlj/#New-to-data-science?","page":"Learning MLJ","title":"New to data science?","text":"","category":"section"},{"location":"learning_mlj/","page":"Learning MLJ","title":"Learning MLJ","text":"Julia Data Science","category":"page"},{"location":"learning_mlj/#New-to-machine-learning?","page":"Learning MLJ","title":"New to machine learning?","text":"","category":"section"},{"location":"learning_mlj/","page":"Learning MLJ","title":"Learning MLJ","text":"Introduction to Statistical Learning with Julia versions of the R labs here","category":"page"},{"location":"learning_mlj/#Know-some-ML-and-just-want-MLJ-basics?","page":"Learning MLJ","title":"Know some ML and just want MLJ basics?","text":"","category":"section"},{"location":"learning_mlj/","page":"Learning MLJ","title":"Learning MLJ","text":"Getting Started | Common MLJ Workflows","category":"page"},{"location":"learning_mlj/#An-ML-practitioner-transitioning-from-another-platform?","page":"Learning MLJ","title":"An ML practitioner transitioning from another platform?","text":"","category":"section"},{"location":"learning_mlj/","page":"Learning MLJ","title":"Learning MLJ","text":"MLJ for Data Scientists in Two Hours | MLJTutorial","category":"page"},{"location":"learning_mlj/#Other-resources","page":"Learning MLJ","title":"Other resources","text":"","category":"section"},{"location":"learning_mlj/","page":"Learning MLJ","title":"Learning MLJ","text":"Data Science Tutorials: MLJ tutorials including end-to-end examples, and \"Introduction to Statistical Learning\" labs\nMLCourse: Teaching material for an introductory machine learning course at EPFL (for an interactive preview see here).\nJulia Boards the Titanic Blog post on using MLJ for users new to Julia. 
\nAnalyzing the Glass Dataset: A gentle introduction to data science using Julia and MLJ (three-part blog post)\nLightning Tour: A compressed demonstration of key MLJ functionality\nMLJ JuliaCon2020 Workshop: older version of MLJTutorial with video\nLearning Networks: For advanced MLJ users wanting to wrap workflows more complicated than linear pipelines\nMachine Learning Property Loans for Fun and Profit - Blog post demonstrating the use of MLJ to predict prospects for investment in property development loans. \nPredicting a Successful Mt Everest Climb - Blog post using MLJ to discover factors correlating with success in expeditions to climb the world's highest peak.","category":"page"},{"location":"models/LassoLarsICRegressor_MLJScikitLearnInterface/#LassoLarsICRegressor_MLJScikitLearnInterface","page":"LassoLarsICRegressor","title":"LassoLarsICRegressor","text":"","category":"section"},{"location":"models/LassoLarsICRegressor_MLJScikitLearnInterface/","page":"LassoLarsICRegressor","title":"LassoLarsICRegressor","text":"LassoLarsICRegressor","category":"page"},{"location":"models/LassoLarsICRegressor_MLJScikitLearnInterface/","page":"LassoLarsICRegressor","title":"LassoLarsICRegressor","text":"A model type for constructing a Lasso model with LARS using BIC or AIC for model selection, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LassoLarsICRegressor_MLJScikitLearnInterface/","page":"LassoLarsICRegressor","title":"LassoLarsICRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LassoLarsICRegressor_MLJScikitLearnInterface/","page":"LassoLarsICRegressor","title":"LassoLarsICRegressor","text":"LassoLarsICRegressor = @load LassoLarsICRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/LassoLarsICRegressor_MLJScikitLearnInterface/","page":"LassoLarsICRegressor","title":"LassoLarsICRegressor","text":"Do model = LassoLarsICRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LassoLarsICRegressor(criterion=...).","category":"page"},{"location":"models/LassoLarsICRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"LassoLarsICRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/LassoLarsICRegressor_MLJScikitLearnInterface/","page":"LassoLarsICRegressor","title":"LassoLarsICRegressor","text":"criterion = aic\nfit_intercept = true\nverbose = false\nprecompute = auto\nmax_iter = 500\neps = 2.220446049250313e-16\ncopy_X = true\npositive = false","category":"page"},{"location":"models/GaussianMixtureRegressor_BetaML/#GaussianMixtureRegressor_BetaML","page":"GaussianMixtureRegressor","title":"GaussianMixtureRegressor","text":"","category":"section"},{"location":"models/GaussianMixtureRegressor_BetaML/","page":"GaussianMixtureRegressor","title":"GaussianMixtureRegressor","text":"mutable struct GaussianMixtureRegressor <: MLJModelInterface.Deterministic","category":"page"},{"location":"models/GaussianMixtureRegressor_BetaML/","page":"GaussianMixtureRegressor","title":"GaussianMixtureRegressor","text":"A non-linear regressor derived from fitting the data on a probabilistic model (Gaussian Mixture Model). 
Relatively fast but generally not very precise, except for data with a structure matching the chosen underlying mixture.","category":"page"},{"location":"models/GaussianMixtureRegressor_BetaML/","page":"GaussianMixtureRegressor","title":"GaussianMixtureRegressor","text":"This is the single-target version of the model. If you want to predict several labels (y) at once, use the MLJ model MultitargetGaussianMixtureRegressor.","category":"page"},{"location":"models/GaussianMixtureRegressor_BetaML/#Hyperparameters:","page":"GaussianMixtureRegressor","title":"Hyperparameters:","text":"","category":"section"},{"location":"models/GaussianMixtureRegressor_BetaML/","page":"GaussianMixtureRegressor","title":"GaussianMixtureRegressor","text":"n_classes::Int64: Number of mixtures (latent classes) to consider [def: 3]\ninitial_probmixtures::Vector{Float64}: Initial probabilities of the categorical distribution (n_classes x 1) [default: []]\nmixtures::Union{Type, Vector{<:BetaML.GMM.AbstractMixture}}: An array (of length n_classes) of the mixtures to employ (see the [?GMM](@ref GMM) module). Each mixture object can be provided with or without its parameters (e.g. mean and variance for the Gaussian ones). Fully qualified mixtures are useful only if the initialisation_strategy parameter is set to \"given\". This parameter can also be given simply in terms of a type. In this case it is automatically extended to a vector of n_classes mixtures of the specified type. Note that mixing of different mixture types is not currently supported. [def: [DiagonalGaussian() for i in 1:n_classes]]\ntol::Float64: Tolerance to stop the algorithm [default: 10^(-6)]\nminimum_variance::Float64: Minimum variance for the mixtures [default: 0.05]\nminimum_covariance::Float64: Minimum covariance for the mixtures with full covariance matrix [default: 0]. This should be set differently from minimum_variance (see notes).\ninitialisation_strategy::String: The computation method of the vector of the initial mixtures. One of the following:\n\"grid\": using a grid approach\n\"given\": using the mixture provided in the fully qualified mixtures parameter\n\"kmeans\": first use kmeans (itself initialised with a \"grid\" strategy) to set the initial mixture centers [default]\nNote that currently \"random\" and \"shuffle\" initialisations are not supported in gmm-based algorithms.\nmaximum_iterations::Int64: Maximum number of iterations [def: typemax(Int64), i.e. 
∞]\nrng::Random.AbstractRNG: Random Number Generator [default: Random.GLOBAL_RNG]","category":"page"},{"location":"models/GaussianMixtureRegressor_BetaML/#Example:","page":"GaussianMixtureRegressor","title":"Example:","text":"","category":"section"},{"location":"models/GaussianMixtureRegressor_BetaML/","page":"GaussianMixtureRegressor","title":"GaussianMixtureRegressor","text":"julia> using MLJ\n\njulia> X, y = @load_boston;\n\njulia> modelType = @load GaussianMixtureRegressor pkg = \"BetaML\" verbosity=0\nBetaML.GMM.GaussianMixtureRegressor\n\njulia> model = modelType()\nGaussianMixtureRegressor(\n n_classes = 3, \n initial_probmixtures = Float64[], \n mixtures = BetaML.GMM.DiagonalGaussian{Float64}[BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing), BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing), BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing)], \n tol = 1.0e-6, \n minimum_variance = 0.05, \n minimum_covariance = 0.0, \n initialisation_strategy = \"kmeans\", \n maximum_iterations = 9223372036854775807, \n rng = Random._GLOBAL_RNG())\n\njulia> mach = machine(model, X, y);\n\njulia> fit!(mach);\n[ Info: Training machine(GaussianMixtureRegressor(n_classes = 3, …), …).\nIter. 1: Var. of the post 21.74887448784976 Log-likelihood -21687.09917379566\n\njulia> ŷ = predict(mach, X)\n506-element Vector{Float64}:\n 24.703442835305577\n 24.70344283512716\n ⋮\n 17.172486989759676\n 17.172486989759644","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/#MultitargetNeuralNetworkRegressor_MLJFlux","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"","category":"section"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"MultitargetNeuralNetworkRegressor","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"A model type for constructing a multitarget neural network regressor, based on MLJFlux.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"MultitargetNeuralNetworkRegressor = @load MultitargetNeuralNetworkRegressor pkg=MLJFlux","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"Do model = MultitargetNeuralNetworkRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetNeuralNetworkRegressor(builder=...).","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"MultitargetNeuralNetworkRegressor is for training a data-dependent Flux.jl neural network to predict a multi-valued Continuous target, represented as a table, given a table of Continuous features. 
Users provide a recipe for constructing the network, based on properties of the data that is encountered, by specifying an appropriate builder. See MLJFlux documentation for more on builders.","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/#Training-data","page":"MultitargetNeuralNetworkRegressor","title":"Training data","text":"","category":"section"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"Here:","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"X is either a Matrix or any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). If X is a Matrix, it is assumed to have columns corresponding to features and rows corresponding to observations.\ny is the target, which can be any table or matrix of output targets whose element scitype is Continuous; check column scitypes with schema(y). If y is a Matrix, it is assumed to have columns corresponding to variables and rows corresponding to observations.","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/#Hyper-parameters","page":"MultitargetNeuralNetworkRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"builder=MLJFlux.Linear(σ=Flux.relu): An MLJFlux builder that constructs a neural network. Possible builders include: Linear, Short, and MLP. See MLJFlux documentation for more on builders, and the example below for using the @builder convenience macro.\noptimiser::Flux.Adam(): A Flux.Optimise optimiser. The optimiser performs the updating of the weights of the network. For further reference, see the Flux optimiser documentation. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.\nloss=Flux.mse: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. For a regression task, natural loss functions are:\nFlux.mse\nFlux.mae\nFlux.msle\nFlux.huber_loss\nCurrently MLJ measures are not supported as loss functions here.\nepochs::Int=10: The duration of training, in epochs. Typically, one epoch represents one pass through the complete training dataset.\nbatch_size::Int=1: the batch size to be used for training, representing the number of samples per update of the network weights. Typically, batch size is between 8 and 512. Increasing batch size may accelerate training if acceleration=CUDALibs() and a GPU is available.\nlambda::Float64=0: The strength of the weight regularization penalty. 
Can be any value in the range [0, ∞).\nalpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.\nrng::Union{AbstractRNG, Int64}: The random number generator or seed used during training.\noptimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.\nacceleration::AbstractResource=CPU1(): Defines on what hardware training is done. For Training on GPU, use CUDALibs().","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/#Operations","page":"MultitargetNeuralNetworkRegressor","title":"Operations","text":"","category":"section"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"predict(mach, Xnew): return predictions of the target given new features Xnew having the same scitype as X above. Predictions are deterministic.","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/#Fitted-parameters","page":"MultitargetNeuralNetworkRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"chain: The trained \"chain\" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network.","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/#Report","page":"MultitargetNeuralNetworkRegressor","title":"Report","text":"","category":"section"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"The fields of report(mach) are:","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"training_losses: A vector of training losses (penalised if lambda != 0) in historical order, of length epochs + 1. 
The first element is the pre-training loss.","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/#Examples","page":"MultitargetNeuralNetworkRegressor","title":"Examples","text":"","category":"section"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"In this example we apply a multi-target regression model to synthetic data:","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"using MLJ\nimport MLJFlux\nusing Flux","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"First, we generate some synthetic data (needs MLJBase 0.20.16 or higher):","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"X, y = make_regression(100, 9; n_targets = 2) ## both tables\nschema(y)\nschema(X)","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"Splitting off a test set:","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"(X, Xtest), (y, ytest) = partition((X, y), 0.7, multi=true);","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"Next, we can define a builder, making use of a convenience macro to do so. In the following @builder call, n_in is a proxy for the number of input features and n_out the number of target variables (both known at fit! 
time), while rng is a proxy for an RNG (which will be passed from the rng field of model defined below).","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"builder = MLJFlux.@builder begin\n init=Flux.glorot_uniform(rng)\n Chain(\n Dense(n_in, 64, relu, init=init),\n Dense(64, 32, relu, init=init),\n Dense(32, n_out, init=init),\n )\nend","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"Instantiating the regression model:","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"MultitargetNeuralNetworkRegressor = @load MultitargetNeuralNetworkRegressor\nmodel = MultitargetNeuralNetworkRegressor(builder=builder, rng=123, epochs=20)","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"We will arrange for standardization of the target by wrapping our model in TransformedTargetModel, and standardization of the features by inserting the wrapped model in a pipeline:","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"pipe = Standardizer |> TransformedTargetModel(model, target=Standardizer)","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"If we fit with a high verbosity (>1), we will see the losses during training. 
We can also see the losses in the output of report(mach).","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"mach = machine(pipe, X, y)\nfit!(mach, verbosity=2)\n\n## first element initial loss, 2:end per epoch training losses\nreport(mach).transformed_target_model_deterministic.model.training_losses","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"For experimenting with learning rate, see the NeuralNetworkRegressor example.","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"pipe.transformed_target_model_deterministic.model.optimiser.eta = 0.0001","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"With the learning rate fixed, we can now compute a CV estimate of the performance (using all data bound to mach) and compare this with performance on the test set:","category":"page"},{"location":"models/MultitargetNeuralNetworkRegressor_MLJFlux/","page":"MultitargetNeuralNetworkRegressor","title":"MultitargetNeuralNetworkRegressor","text":"## custom MLJ loss:\nmulti_loss(yhat, y) = l2(MLJ.matrix(yhat), MLJ.matrix(y)) |> mean\n\n## CV estimate, based on `(X, y)`:\nevaluate!(mach, resampling=CV(nfolds=5), measure=multi_loss)\n\n## loss for `(Xtest, ytest)`:\nfit!(mach) ## trains on all data `(X, y)`\nyhat = predict(mach, Xtest)\nmulti_loss(yhat, ytest)","category":"page"},{"location":"models/Standardizer_MLJModels/#Standardizer_MLJModels","page":"Standardizer","title":"Standardizer","text":"","category":"section"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"Standardizer","category":"page"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"A model type for constructing a standardizer, based on MLJModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"Standardizer = @load Standardizer pkg=MLJModels","category":"page"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"Do model = Standardizer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in Standardizer(features=...).","category":"page"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"Use this model to standardize (whiten) a Continuous vector, or relevant columns of a table. 
The rescalings applied by this transformer to new data are always those learned during the training phase, which are generally different from what would actually standardize the new data.","category":"page"},{"location":"models/Standardizer_MLJModels/#Training-data","page":"Standardizer","title":"Training data","text":"","category":"section"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"mach = machine(model, X)","category":"page"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"where","category":"page"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"X: any Tables.jl compatible table or any abstract vector with Continuous element scitype (any abstract float vector). Only features in a table with Continuous scitype can be standardized; check column scitypes with schema(X).","category":"page"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/Standardizer_MLJModels/#Hyper-parameters","page":"Standardizer","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"features: one of the following, with the behavior indicated below:\n[] (empty, the default): standardize all features (columns) having Continuous element scitype\nnon-empty vector of feature names (symbols): standardize only the Continuous features in the vector (if ignore=false) or Continuous features not named in the vector (ignore=true).\nfunction or other callable: standardize a feature if the callable returns true on its name. 
For example, Standardizer(features = name -> name in [:x1, :x3], ignore = true, count=true) has the same effect as Standardizer(features = [:x1, :x3], ignore = true, count=true), namely to standardize all Continuous and Count features, with the exception of :x1 and :x3.\nNote this behavior is further modified if the ordered_factor or count flags are set to true; see below\nignore=false: whether to ignore or standardize specified features, as explained above\nordered_factor=false: if true, standardize any OrderedFactor feature wherever a Continuous feature would be standardized, as described above\ncount=false: if true, standardize any Count feature wherever a Continuous feature would be standardized, as described above","category":"page"},{"location":"models/Standardizer_MLJModels/#Operations","page":"Standardizer","title":"Operations","text":"","category":"section"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"transform(mach, Xnew): return Xnew with relevant features standardized according to the rescalings learned during fitting of mach.\ninverse_transform(mach, Z): apply the inverse transformation to Z, so that inverse_transform(mach, transform(mach, Xnew)) is approximately the same as Xnew; unavailable if ordered_factor or count flags were set to true.","category":"page"},{"location":"models/Standardizer_MLJModels/#Fitted-parameters","page":"Standardizer","title":"Fitted parameters","text":"","category":"section"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"features_fit - the names of features that will be standardized\nmeans - the corresponding untransformed mean values\nstds - the corresponding untransformed standard deviations","category":"page"},{"location":"models/Standardizer_MLJModels/#Report","page":"Standardizer","title":"Report","text":"","category":"section"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"The fields of report(mach) are:","category":"page"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"features_fit: the names of features that will be standardized","category":"page"},{"location":"models/Standardizer_MLJModels/#Examples","page":"Standardizer","title":"Examples","text":"","category":"section"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"using MLJ\n\nX = (ordinal1 = [1, 2, 3],\n ordinal2 = coerce([:x, :y, :x], OrderedFactor),\n ordinal3 = [10.0, 20.0, 30.0],\n ordinal4 = [-20.0, -30.0, -40.0],\n nominal = coerce([\"Your father\", \"he\", \"is\"], Multiclass));\n\njulia> schema(X)\n┌──────────┬──────────────────┐\n│ names │ scitypes │\n├──────────┼──────────────────┤\n│ ordinal1 │ Count │\n│ ordinal2 │ OrderedFactor{2} │\n│ ordinal3 │ Continuous │\n│ ordinal4 │ Continuous │\n│ nominal │ Multiclass{3} │\n└──────────┴──────────────────┘\n\nstand1 = Standardizer();\n\njulia> transform(fit!(machine(stand1, X)), X)\n(ordinal1 = [1, 2, 3],\n ordinal2 = CategoricalValue{Symbol,UInt32}[:x, :y, :x],\n ordinal3 = [-1.0, 0.0, 1.0],\n ordinal4 = [1.0, 0.0, -1.0],\n nominal = CategoricalValue{String,UInt32}[\"Your father\", \"he\", \"is\"],)\n\nstand2 = Standardizer(features=[:ordinal3, ], ignore=true, count=true);\n\njulia> transform(fit!(machine(stand2, X)), 
X)\n(ordinal1 = [-1.0, 0.0, 1.0],\n ordinal2 = CategoricalValue{Symbol,UInt32}[:x, :y, :x],\n ordinal3 = [10.0, 20.0, 30.0],\n ordinal4 = [1.0, 0.0, -1.0],\n nominal = CategoricalValue{String,UInt32}[\"Your father\", \"he\", \"is\"],)","category":"page"},{"location":"models/Standardizer_MLJModels/","page":"Standardizer","title":"Standardizer","text":"See also OneHotEncoder, ContinuousEncoder.","category":"page"},{"location":"models/BernoulliNBClassifier_MLJScikitLearnInterface/#BernoulliNBClassifier_MLJScikitLearnInterface","page":"BernoulliNBClassifier","title":"BernoulliNBClassifier","text":"","category":"section"},{"location":"models/BernoulliNBClassifier_MLJScikitLearnInterface/","page":"BernoulliNBClassifier","title":"BernoulliNBClassifier","text":"BernoulliNBClassifier","category":"page"},{"location":"models/BernoulliNBClassifier_MLJScikitLearnInterface/","page":"BernoulliNBClassifier","title":"BernoulliNBClassifier","text":"A model type for constructing a Bernoulli naive Bayes classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/BernoulliNBClassifier_MLJScikitLearnInterface/","page":"BernoulliNBClassifier","title":"BernoulliNBClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/BernoulliNBClassifier_MLJScikitLearnInterface/","page":"BernoulliNBClassifier","title":"BernoulliNBClassifier","text":"BernoulliNBClassifier = @load BernoulliNBClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/BernoulliNBClassifier_MLJScikitLearnInterface/","page":"BernoulliNBClassifier","title":"BernoulliNBClassifier","text":"Do model = BernoulliNBClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BernoulliNBClassifier(alpha=...).","category":"page"},{"location":"models/BernoulliNBClassifier_MLJScikitLearnInterface/","page":"BernoulliNBClassifier","title":"BernoulliNBClassifier","text":"Bernoulli naive Bayes classifier. 
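As a quick, hypothetical illustration (not part of the upstream documentation), the sketch below constructs the classifier with default hyper-parameters and fits it to made-up binary indicator data; it assumes MLJ and MLJScikitLearnInterface (with its scikit-learn dependency) are installed:

```julia
using MLJ

# Hypothetical toy data: three 0/1-valued features and a two-class target.
X = (f1 = [1.0, 0.0, 1.0, 0.0],
     f2 = [0.0, 0.0, 1.0, 1.0],
     f3 = [1.0, 1.0, 0.0, 0.0])
y = coerce(["a", "a", "b", "b"], Multiclass)

BernoulliNBClassifier = @load BernoulliNBClassifier pkg=MLJScikitLearnInterface
model = BernoulliNBClassifier()        # default hyper-parameters
mach = machine(model, X, y) |> fit!

predict(mach, X)                       # probabilistic predictions
predict_mode(mach, X)                  # most likely class labels
```

Because the model is probabilistic, predict returns a vector of distributions over the classes; predict_mode recovers point predictions.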
It is suitable for classification with binary features; features will be binarized based on the binarize keyword (unless it's nothing in which case the features are assumed to be binary).","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/#MultitargetKNNRegressor_NearestNeighborModels","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"","category":"section"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"MultitargetKNNRegressor","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"A model type for constructing a multitarget K-nearest neighbor regressor, based on NearestNeighborModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"MultitargetKNNRegressor = @load MultitargetKNNRegressor pkg=NearestNeighborModels","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"Do model = MultitargetKNNRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetKNNRegressor(K=...).","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"Multi-target K-Nearest Neighbors regressor (MultitargetKNNRegressor) is a variation of KNNRegressor that assumes the target variable is vector-valued with Continuous components. 
(Target data must be presented as a table, however.)","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/#Training-data","page":"MultitargetKNNRegressor","title":"Training data","text":"","category":"section"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"OR","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"mach = machine(model, X, y, w)","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"Here:","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).\ny is the target, which can be any table of responses whose element scitype is Continuous; check column scitypes with schema(y).\nw is the observation weights, which can either be nothing (default) or an AbstractVector whose element scitype is Count or Continuous. This is different from the weights kernel, which is a hyperparameter of the model; see below.","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/#Hyper-parameters","page":"MultitargetKNNRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"K::Int=5 : number of neighbors\nalgorithm::Symbol = :kdtree : one of (:kdtree, :brutetree, :balltree)\nmetric::Metric = Euclidean() : any Metric from Distances.jl for the distance between points. For algorithm = :kdtree only metrics which are instances of Union{Distances.Chebyshev, Distances.Cityblock, Distances.Euclidean, Distances.Minkowski, Distances.WeightedCityblock, Distances.WeightedEuclidean, Distances.WeightedMinkowski} are supported.\nleafsize::Int = 10 : determines the number of points at which to stop splitting the tree. This option is ignored and always taken as 0 for algorithm = :brutetree, since brutetree isn't actually a tree.\nreorder::Bool = true : if true then points which are close in distance are placed close in memory. In this case, a copy of the original data will be made so that the original data is left unmodified. Setting this to true can significantly improve performance of the specified algorithm (except :brutetree). This option is ignored and always taken as false for algorithm = :brutetree.\nweights::KNNKernel=Uniform() : kernel used in assigning weights to the k-nearest neighbors for each observation. 
An instance of one of the types in list_kernels(). User-defined weighting functions can be passed by wrapping the function in a UserDefinedKernel kernel (do ?NearestNeighborModels.UserDefinedKernel for more info). If observation weights w are passed during machine construction then the weight assigned to each neighbor vote is the product of the kernel-generated weight for that neighbor and the corresponding observation weight.","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/#Operations","page":"MultitargetKNNRegressor","title":"Operations","text":"","category":"section"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"predict(mach, Xnew): Return predictions of the target given features Xnew, which should have the same scitype as X above.","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/#Fitted-parameters","page":"MultitargetKNNRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"tree: An instance of either KDTree, BruteTree or BallTree depending on the value of the algorithm hyperparameter (see the Hyper-parameters section above). These are data structures that store the training data with a view to making quicker nearest neighbor searches on test data points.","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/#Examples","page":"MultitargetKNNRegressor","title":"Examples","text":"","category":"section"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"using MLJ\n\n## Create Data\nX, y = make_regression(10, 5, n_targets=2)\n\n## load MultitargetKNNRegressor\nMultitargetKNNRegressor = @load MultitargetKNNRegressor pkg=NearestNeighborModels\n\n## view possible kernels\nNearestNeighborModels.list_kernels()\n\n## MultitargetKNNRegressor instantiation\nmodel = MultitargetKNNRegressor(weights = NearestNeighborModels.Inverse())\n\n## Wrap model and required data in an MLJ machine and fit.\nmach = machine(model, X, y) |> fit! \n\n## Predict\ny_hat = predict(mach, X)\n","category":"page"},{"location":"models/MultitargetKNNRegressor_NearestNeighborModels/","page":"MultitargetKNNRegressor","title":"MultitargetKNNRegressor","text":"See also KNNRegressor","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/#ABODDetector_OutlierDetectionNeighbors","page":"ABODDetector","title":"ABODDetector","text":"","category":"section"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"ABODDetector(k = 5,\n metric = Euclidean(),\n algorithm = :kdtree,\n static = :auto,\n leafsize = 10,\n reorder = true,\n parallel = false,\n enhanced = false)","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"Determine outliers based on the angles to its nearest neighbors. 
This implements the FastABOD variant described in the paper, that is, it uses the variance of angles to its nearest neighbors, not to the whole dataset, see [1]. ","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"Notice: The scores are inverted, to conform to our notion that higher scores describe higher outlierness.","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/#Parameters","page":"ABODDetector","title":"Parameters","text":"","category":"section"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"k::Integer","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"Number of neighbors (must be greater than 0).","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"metric::Metric","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"This is one of the Metric types defined in the Distances.jl package. It is possible to define your own metrics by creating new types that are subtypes of Metric.","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"algorithm::Symbol","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"One of (:kdtree, :balltree). In a kdtree, points are recursively split into groups using hyper-planes. Therefore a KDTree only works with axis aligned metrics which are: Euclidean, Chebyshev, Minkowski and Cityblock. A brutetree linearly searches all points in a brute force fashion and works with any Metric. A balltree recursively splits points into groups bounded by hyper-spheres and works with any Metric.","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"static::Union{Bool, Symbol}","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"One of (true, false, :auto). Whether the input data for fitting and transform should be statically or dynamically allocated. If true, the data is statically allocated. If false, the data is dynamically allocated. If :auto, the data is dynamically allocated if the product of all dimensions except the last is greater than 100.","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"leafsize::Int","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"Determines at what number of points to stop splitting the tree further. There is a trade-off between traversing the tree and having to evaluate the metric function for increasing number of points.","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"reorder::Bool","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"While building the tree this will put points close in distance close in memory since this helps with cache locality. 
In this case, a copy of the original data will be made so that the original data is left unmodified. This can have a significant impact on performance and is by default set to true.","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"parallel::Bool","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"Parallelize score and predict using all threads available. The number of threads can be set with the JULIA_NUM_THREADS environment variable. Note: fit is not parallel.","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"enhanced::Bool","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"When enhanced=true, it uses the enhanced ABOD (EABOD) adaptation proposed by [2].","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/#Examples","page":"ABODDetector","title":"Examples","text":"","category":"section"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"using OutlierDetection: ABODDetector, fit, transform\ndetector = ABODDetector()\nX = rand(10, 100)\nmodel, result = fit(detector, X; verbosity=0)\ntest_scores = transform(detector, model, X)","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/#References","page":"ABODDetector","title":"References","text":"","category":"section"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"[1] Kriegel, Hans-Peter; Schubert, Matthias; Zimek, Arthur (2008): Angle-based outlier detection in high-dimensional data.","category":"page"},{"location":"models/ABODDetector_OutlierDetectionNeighbors/","page":"ABODDetector","title":"ABODDetector","text":"[2] Li, Xiaojie; Lv, Jian Cheng; Cheng, Dongdong (2015): Angle-Based Outlier Detection Algorithm with More Stable Relationships.","category":"page"},{"location":"models/Stack_MLJBase/#Stack_MLJBase","page":"Stack","title":"Stack","text":"","category":"section"},{"location":"models/Stack_MLJBase/","page":"Stack","title":"Stack","text":"Union{Types...}","category":"page"},{"location":"models/Stack_MLJBase/","page":"Stack","title":"Stack","text":"A type union is an abstract type which includes all instances of any of its argument types. The empty union Union{} is the bottom type of Julia.","category":"page"},{"location":"models/Stack_MLJBase/#Examples","page":"Stack","title":"Examples","text":"","category":"section"},{"location":"models/Stack_MLJBase/","page":"Stack","title":"Stack","text":"julia> IntOrString = Union{Int,AbstractString}\nUnion{Int64, AbstractString}\n\njulia> 1 isa IntOrString\ntrue\n\njulia> \"Hello!\" isa IntOrString\ntrue\n\njulia> 1.0 isa IntOrString\nfalse","category":"page"},{"location":"machines/#Machines","page":"Machines","title":"Machines","text":"","category":"section"},{"location":"machines/","page":"Machines","title":"Machines","text":"Recall from Getting Started that a machine binds a model (i.e., a choice of algorithm + hyperparameters) to data (see more at Constructing machines below). A machine is also the object storing learned parameters. Under the hood, calling fit! 
on a machine calls either MLJBase.fit or MLJBase.update, depending on the machine's internal state (as recorded in private fields old_model and old_rows). These lower-level fit and update methods, which are not ordinarily called directly by the user, dispatch on the model and a view of the data defined by the optional rows keyword argument of fit! (all rows by default).","category":"page"},{"location":"machines/#Warm-restarts","page":"Machines","title":"Warm restarts","text":"","category":"section"},{"location":"machines/","page":"Machines","title":"Machines","text":"If a model update method has been implemented for the model, calls to fit! will avoid redundant calculations for certain kinds of model mutations. The main use-case is increasing an iteration parameter, such as the number of epochs in a neural network. To test if SomeIterativeModel supports this feature, check iteration_parameter(SomeIterativeModel) is different from nothing.","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"using MLJ; color_off() # hide\ntree = (@load DecisionTreeClassifier pkg=DecisionTree verbosity=0)()\nforest = EnsembleModel(model=tree, n=10);\nX, y = @load_iris;\nmach = machine(forest, X, y)\nfit!(mach, verbosity=2);","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"Generally, changing a hyperparameter triggers retraining on calls to subsequent fit!:","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"forest.bagging_fraction = 0.5;\nfit!(mach, verbosity=2);","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"However, for this iterative model, increasing the iteration parameter only adds models to the existing ensemble:","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"forest.n = 15;\nfit!(mach, verbosity=2);","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"Call fit! again without making a change and no retraining occurs:","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"fit!(mach);","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"However, retraining can be forced:","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"fit!(mach, force=true);","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"And is re-triggered if the view of the data changes:","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"fit!(mach, rows=1:100);","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"fit!(mach, rows=1:100);","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"If an iterative model exposes its iteration parameter as a hyperparameter, and it implements the warm restart behavior above, then it can be wrapped in a \"control strategy\", like an early stopping criterion. See Controlling Iterative Models for details.","category":"page"},{"location":"machines/#Inspecting-machines","page":"Machines","title":"Inspecting machines","text":"","category":"section"},{"location":"machines/","page":"Machines","title":"Machines","text":"There are two principal methods for inspecting the outcomes of training in MLJ. To obtain a named-tuple describing the learned parameters (in a user-friendly way where possible) use fitted_params(mach). 
All other training-related outcomes are inspected with report(mach).","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"X, y = @load_iris\npca = (@load PCA verbosity=0)()\nmach = machine(pca, X)\nfit!(mach)","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"fitted_params(mach)\nreport(mach)","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"fitted_params(::Machine)\nreport(::Machine)","category":"page"},{"location":"machines/#MLJModelInterface.fitted_params-Tuple{Machine}","page":"Machines","title":"MLJModelInterface.fitted_params","text":"fitted_params(mach)\n\nReturn the learned parameters for a machine mach that has been fit!, for example the coefficients in a linear model.\n\nThis is a named tuple and human-readable if possible.\n\nIf mach is a machine for a composite model, such as a model constructed using the pipeline syntax model1 |> model2 |> ..., then the returned named tuple has the composite type's field names as keys. The corresponding value is the fitted parameters for the machine in the underlying learning network bound to that model. (If multiple machines share the same model, then the value is a vector.)\n\njulia> using MLJ\njulia> @load LogisticClassifier pkg=MLJLinearModels\njulia> X, y = @load_crabs;\njulia> pipe = Standardizer() |> LogisticClassifier();\njulia> mach = machine(pipe, X, y) |> fit!;\n\njulia> fitted_params(mach).logistic_classifier\n(classes = CategoricalArrays.CategoricalValue{String,UInt32}[\"B\", \"O\"],\n coefs = Pair{Symbol,Float64}[:FL => 3.7095037897680405, :RW => 0.1135739140854546, :CL => -1.6036892745322038, :CW => -4.415667573486482, :BD => 3.238476051092471],\n intercept = 0.0883301599726305,)\n\nSee also report\n\n\n\n\n\n","category":"method"},{"location":"machines/#MLJBase.report-Tuple{Machine}","page":"Machines","title":"MLJBase.report","text":"report(mach)\n\nReturn the report for a machine mach that has been fit!, for example the coefficients in a linear model.\n\nThis is a named tuple and human-readable if possible.\n\nIf mach is a machine for a composite model, such as a model constructed using the pipeline syntax model1 |> model2 |> ..., then the returned named tuple has the composite type's field names as keys. The corresponding value is the report for the machine in the underlying learning network bound to that model. 
(If multiple machines share the same model, then the value is a vector.)\n\njulia> using MLJ\njulia> @load LinearBinaryClassifier pkg=GLM\njulia> X, y = @load_crabs;\njulia> pipe = Standardizer() |> LinearBinaryClassifier();\njulia> mach = machine(pipe, X, y) |> fit!;\n\njulia> report(mach).linear_binary_classifier\n(deviance = 3.8893386087844543e-7,\n dof_residual = 195.0,\n stderror = [18954.83496713119, 6502.845740757159, 48484.240246060406, 34971.131004997274, 20654.82322484894, 2111.1294584763386],\n vcov = [3.592857686311793e8 9.122732393971942e6 … -8.454645589364915e7 5.38856837634321e6; 9.122732393971942e6 4.228700272808351e7 … -4.978433790526467e7 -8.442545425533723e6; … ; -8.454645589364915e7 -4.978433790526467e7 … 4.2662172244975924e8 2.1799125705781363e7; 5.38856837634321e6 -8.442545425533723e6 … 2.1799125705781363e7 4.456867590446599e6],)\n\n\nSee also fitted_params\n\n\n\n\n\n","category":"method"},{"location":"machines/#Training-losses-and-feature-importances","page":"Machines","title":"Training losses and feature importances","text":"","category":"section"},{"location":"machines/","page":"Machines","title":"Machines","text":"Training losses and feature importances, if reported by a model, will be available in the machine's report (see above). However, there are also direct access methods where supported:","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"training_losses(mach::Machine) -> vector_of_losses","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"Here vector_of_losses will be in historical order (most recent loss last). This kind of access is supported for model = mach.model if supports_training_losses(model) == true.","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"feature_importances(mach::Machine) -> vector_of_pairs","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"Here a vector_of_pairs is a vector of elements of the form feature => importance_value, where feature is a symbol. For example, vector_of_pairs = [:gender => 0.23, :height => 0.7, :weight => 0.1]. If a model does not support feature importances for some model hyperparameters, every importance_value will be zero. This kind of access is supported for model = mach.model if reports_feature_importances(model) == true.","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"If a model can report multiple types of feature importances, then there will be a model hyper-parameter controlling the active type.","category":"page"},{"location":"machines/#Constructing-machines","page":"Machines","title":"Constructing machines","text":"","category":"section"},{"location":"machines/","page":"Machines","title":"Machines","text":"A machine is constructed with the syntax machine(model, args...) where the possibilities for args (called training arguments) are summarized in the table below. Here X and y represent inputs and target, respectively, and Xout is the output of a transform call. Machines for supervised models may have additional training arguments, such as a vector of per-observation weights (in which case supports_weights(model) == true).","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"model supertype machine constructor calls operation calls (first compulsory)\nDeterministic <: Supervised machine(model, X, y, extras...) 
predict(mach, Xnew), transform(mach, Xnew), inverse_transform(mach, Xout)\nProbabilistic <: Supervised machine(model, X, y, extras...) predict(mach, Xnew), predict_mean(mach, Xnew), predict_median(mach, Xnew), predict_mode(mach, Xnew), transform(mach, Xnew), inverse_transform(mach, Xout)\nUnsupervised (except Static) machine(model, X) transform(mach, Xnew), inverse_transform(mach, Xout), predict(mach, Xnew)\nStatic machine(model) transform(mach, Xnews...), inverse_transform(mach, Xout)","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"All operations on machines (predict, transform, etc) have exactly one argument (Xnew or Xout above) after mach, the machine instance. An exception is a machine bound to a Static model, which can have any number of arguments after mach. For more on Static transformers (which have no training arguments) see Static transformers.","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"A machine is reconstructed from a file using the syntax machine(\"my_machine.jlso\"), or machine(\"my_machine.jlso\", args...) if retraining using new data. See Saving machines below.","category":"page"},{"location":"machines/#Lowering-memory-demands","page":"Machines","title":"Lowering memory demands","text":"","category":"section"},{"location":"machines/","page":"Machines","title":"Machines","text":"For large data sets, you may be able to save memory by suppressing data caching that some models perform to increase speed. To do this, specify cache=false, as in","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"machine(model, X, y, cache=false)","category":"page"},{"location":"machines/#Constructing-machines-in-learning-networks","page":"Machines","title":"Constructing machines in learning networks","text":"","category":"section"},{"location":"machines/","page":"Machines","title":"Machines","text":"Instead of data X, y, etc, the machine constructor is provided Node or Source objects (\"dynamic data\") when building a learning network. See Learning Networks for more on this advanced feature.","category":"page"},{"location":"machines/#Saving-machines","page":"Machines","title":"Saving machines","text":"","category":"section"},{"location":"machines/","page":"Machines","title":"Machines","text":"Users can save and restore MLJ machines using any external serialization package by suitably preparing their Machine object, and applying a post-processing step to the deserialized object. This is explained under Using an arbitrary serializer below.","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"However, if a user is happy to use Julia's standard library Serialization module, there is a simplified workflow described first.","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"The usual serialization provisos apply. For example, when deserializing you need to have all code on which the serialization object depended loaded at the time of deserialization also. If a hyper-parameter happens to be a user-defined function, then that function must be defined at deserialization. 
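For instance, here is a minimal sketch of the first proviso (it assumes the DecisionTree interface package is installed and the file name is illustrative): the package providing the model must be loaded in the session that deserializes, not only in the session that saved the machine.

# Session A: train and save
using MLJ
Tree = @load DecisionTreeClassifier pkg=DecisionTree verbosity=0
X, y = @load_iris
mach = fit!(machine(Tree(), X, y))
MLJ.save("tree.jls", mach)

# Session B (fresh Julia process): load MLJ *and* the model-providing package
# before deserializing, or the restored machine cannot be used:
using MLJ, MLJDecisionTreeInterface
Xnew, _ = @load_iris
mach2 = machine("tree.jls")
predict(mach2, Xnew)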
And you should only deserialize objects from trusted sources.","category":"page"},{"location":"machines/#Using-Julia's-native-serializer","page":"Machines","title":"Using Julia's native serializer","text":"","category":"section"},{"location":"machines/","page":"Machines","title":"Machines","text":"MLJBase.save","category":"page"},{"location":"machines/#MLJModelInterface.save","page":"Machines","title":"MLJModelInterface.save","text":"MLJ.save(filename, mach::Machine)\nMLJ.save(io, mach::Machine)\n\nMLJBase.save(filename, mach::Machine)\nMLJBase.save(io, mach::Machine)\n\nSerialize the machine mach to a file with path filename, or to an input/output stream io (at least IOBuffer instances are supported) using the Serialization module.\n\nTo serialise using a different format, see serializable.\n\nMachines are deserialized using the machine constructor as shown in the example below.\n\nThe implementation of save for machines changed in MLJ 0.18 (MLJBase 0.20). You can only restore a machine saved using older versions of MLJ using an older version.\n\nExample\n\nusing MLJ\nTree = @load DecisionTreeClassifier\nX, y = @load_iris\nmach = fit!(machine(Tree(), X, y))\n\nMLJ.save(\"tree.jls\", mach)\nmach_predict_only = machine(\"tree.jls\")\npredict(mach_predict_only, X)\n\n# using a buffer:\nio = IOBuffer()\nMLJ.save(io, mach)\nseekstart(io)\npredict_only_mach = machine(io)\npredict(predict_only_mach, X)\n\nwarning: Only load files from trusted sources\nMaliciously constructed JLS files, like pickles, and most other general purpose serialization formats, can allow for arbitrary code execution during loading. This means it is possible for someone to use a JLS file that looks like a serialized MLJ machine as a Trojan horse.\n\nSee also serializable, machine.\n\n\n\n\n\n","category":"function"},{"location":"machines/#Using-an-arbitrary-serializer","page":"Machines","title":"Using an arbitrary serializer","text":"","category":"section"},{"location":"machines/","page":"Machines","title":"Machines","text":"Since machines contain training data, serializing a machine directly is not recommended. Also, the learned parameters of models implemented in a language other than Julia may not have persistent representations, which means serializing them is useless. To address these two issues, users:","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"Call serializable(mach) on a machine mach they wish to save (to remove data and create persistent learned parameters)\nSerialize the returned object using SomeSerializationPkg","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"To restore the original machine (minus training data) they:","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"Deserialize using SomeSerializationPkg to obtain a new object mach\nCall restore!(mach) to ensure mach can be used to predict or transform new data.","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"MLJBase.serializable\nMLJBase.restore!","category":"page"},{"location":"machines/#MLJBase.serializable","page":"Machines","title":"MLJBase.serializable","text":"serializable(mach::Machine)\n\nReturns a shallow copy of the machine to make it serializable. 
In particular, all training data is removed and, if necessary, learned parameters are replaced with persistent representations.\n\nAny general purpose Julia serializer may be applied to the output of serializable (eg, JLSO, BSON, JLD) but you must call restore!(mach) on the deserialised object mach before using it. See the example below.\n\nIf using Julia's standard Serialization library, a shorter workflow is available using the MLJBase.save (or MLJ.save) method.\n\nA machine returned by serializable is characterized by the property mach.state == -1.\n\nExample using JLSO\n\nusing MLJ\nusing JLSO\nTree = @load DecisionTreeClassifier\ntree = Tree()\nX, y = @load_iris\nmach = fit!(machine(tree, X, y))\n\n# This machine can now be serialized\nsmach = serializable(mach)\nJLSO.save(\"machine.jlso\", :machine => smach)\n\n# Deserialize and restore learned parameters to usable form:\nloaded_mach = JLSO.load(\"machine.jlso\")[:machine]\nrestore!(loaded_mach)\n\npredict(loaded_mach, X)\npredict(mach, X)\n\nSee also restore!, MLJBase.save.\n\n\n\n\n\n","category":"function"},{"location":"machines/#MLJBase.restore!","page":"Machines","title":"MLJBase.restore!","text":"restore!(mach::Machine)\n\nRestore the state of a machine that is currently serializable but which may not be otherwise usable. For such a machine, mach, one has mach.state == -1. Intended for restoring deserialized machine objects to a usable form.\n\nFor an example see serializable.\n\n\n\n\n\n","category":"function"},{"location":"machines/#Internals","page":"Machines","title":"Internals","text":"","category":"section"},{"location":"machines/","page":"Machines","title":"Machines","text":"For a supervised machine, the predict method calls a lower-level MLJBase.predict method, dispatched on the underlying model and the fitresult (see below). To see predict in action, as well as its unsupervised cousins transform and inverse_transform, see Getting Started.","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"Except for model, a Machine instance has several fields which the user should not directly access; these include:","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"model - the struct containing the hyperparameters to be used in calls to fit!\nfitresult - the learned parameters in a raw form, initially undefined\nargs - a tuple of the data, each element wrapped in a source node; see Learning Networks (in the supervised learning example above, args = (source(X), source(y)))\nreport - outputs of training not encoded in fitresult (eg, feature rankings), initially undefined\nold_model - a deep copy of the model used in the last call to fit!\nold_rows - a copy of the row indices used in the last call to fit!\ncache","category":"page"},{"location":"machines/","page":"Machines","title":"Machines","text":"The interested reader can learn more about machine internals by examining the simplified code excerpt in Internals.","category":"page"},{"location":"machines/#API-Reference","page":"Machines","title":"API Reference","text":"","category":"section"},{"location":"machines/","page":"Machines","title":"Machines","text":"MLJBase.machine\nfit!\nfit_only!","category":"page"},{"location":"machines/#MLJBase.machine","page":"Machines","title":"MLJBase.machine","text":"machine(model, args...; cache=true, scitype_check_level=1)\n\nConstruct a Machine object binding a model, storing hyper-parameters of some machine learning algorithm, to some data, args. Calling fit! 
on a Machine instance mach stores outcomes of applying the algorithm in mach, which can be inspected using fitted_params(mach) (learned parameters) and report(mach) (other outcomes). This in turn enables generalization to new data using operations such as predict or transform:\n\nusing MLJModels\nX, y = make_regression()\n\nPCA = @load PCA pkg=MultivariateStats\nmodel = PCA()\nmach = machine(model, X)\nfit!(mach, rows=1:50)\ntransform(mach, selectrows(X, 51:100)) # or transform(mach, rows=51:100)\n\nDecisionTreeRegressor = @load DecisionTreeRegressor pkg=DecisionTree\nmodel = DecisionTreeRegressor()\nmach = machine(model, X, y)\nfit!(mach, rows=1:50)\npredict(mach, selectrows(X, 51:100)) # or predict(mach, rows=51:100)\n\nSpecify cache=false to prioritize memory management over speed.\n\nWhen building a learning network, Node objects can be substituted for the concrete data but no type or dimension checks are applied.\n\nChecks on the types of training data\n\nA model articulates its data requirements using scientific types, i.e., using the scitype function instead of the typeof function.\n\nIf scitype_check_level > 0 then the scitype of each arg in args is computed, and this is compared with the scitypes expected by the model, unless args contains Unknown scitypes and scitype_check_level < 4, in which case no further action is taken. Whether warnings are issued or errors thrown depends on the level. For details, see default_scitype_check_level, a method to inspect or change the default level (1 at startup).\n\nMachines with model placeholders\n\nA symbol can be substituted for a model in machine constructors to act as a placeholder for a model specified at training time. The symbol must be the field name for a struct whose corresponding value is a model, as shown in the following example:\n\nmutable struct MyComposite\n transformer\n classifier\nend\n\nmy_composite = MyComposite(Standardizer(), ConstantClassifier())\n\nX, y = make_blobs()\nmach = machine(:classifier, X, y)\nfit!(mach, composite=my_composite)\n\nThe last two lines are equivalent to\n\nmach = machine(ConstantClassifier(), X, y)\nfit!(mach)\n\nDelaying model specification is used when exporting learning networks as new stand-alone model types. See prefit and the MLJ documentation on learning networks.\n\nSee also fit!, default_scitype_check_level, MLJBase.save, serializable.\n\n\n\n\n\n","category":"function"},{"location":"machines/#StatsAPI.fit!","page":"Machines","title":"StatsAPI.fit!","text":"fit!(mach::Machine, rows=nothing, verbosity=1, force=false, composite=nothing)\n\nFit the machine mach. In the case that mach has Node arguments, first train all other machines on which mach depends.\n\nTo attempt to fit a machine without touching any other machine, use fit_only!. For more on options and the internal logic of fitting see fit_only!\n\n\n\n\n\nfit!(N::Node;\n rows=nothing,\n verbosity=1,\n force=false,\n acceleration=CPU1())\n\nTrain all machines required to call the node N, in an appropriate order, but parallelizing where possible using specified acceleration mode. 
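For example, a minimal learning-network sketch (assuming MLJMultivariateStatsInterface is installed; the model choice is only illustrative):

using MLJ

Xs = source((x1 = rand(100), x2 = rand(100)))
ys = source(rand(100))

stand = machine(Standardizer(), Xs)
W = transform(stand, Xs)

Ridge = @load RidgeRegressor pkg=MultivariateStats verbosity=0
ridge = machine(Ridge(), W, ys)
yhat = predict(ridge, W)

# trains `stand`, then `ridge`, multithreading where possible:
fit!(yhat, acceleration=CPUThreads())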
These machines are those returned by machines(N).\n\nSupported modes of acceleration: CPU1(), CPUThreads().\n\n\n\n\n\n","category":"function"},{"location":"machines/#MLJBase.fit_only!","page":"Machines","title":"MLJBase.fit_only!","text":"MLJBase.fit_only!(\n mach::Machine;\n rows=nothing,\n verbosity=1,\n force=false,\n composite=nothing,\n)\n\nWithout mutating any other machine on which it may depend, perform one of the following actions to the machine mach, using the data and model bound to it, and restricting the data to rows if specified:\n\nAb initio training. Ignoring any previous learned parameters and cache, compute and store new learned parameters. Increment mach.state.\nTraining update. Making use of previous learned parameters and/or cache, replace or mutate existing learned parameters. The effect is the same (or nearly the same) as in ab initio training, but may be faster or use less memory, assuming the model supports an update option (implements MLJBase.update). Increment mach.state.\nNo-operation. Leave existing learned parameters untouched. Do not increment mach.state.\n\nIf the model, model, bound to mach is a symbol, then instead perform the action using the true model given by getproperty(composite, model). See also machine.\n\nTraining action logic\n\nFor the action to be a no-operation, either mach.frozen == true or none of the following apply:\n\n(i) mach has never been trained (mach.state == 0).\n(ii) force == true.\n(iii) The state of some other machine on which mach depends has changed since the last time mach was trained (ie, the last time mach.state was incremented).\n(iv) The specified rows have changed since the last retraining and mach.model does not have Static type.\n(v) mach.model is a model and different from the last model used for training, but has the same type.\n(vi) mach.model is a model but has a type different from the last model used for training.\n(vii) mach.model is a symbol and getproperty(composite, mach.model) is different from the last model used for training, but has the same type.\n(viii) mach.model is a symbol and getproperty(composite, mach.model) has a different type from the last model used for training.\n\nIn any of the cases (i) - (iv), (vi), or (viii), mach is trained ab initio. If (v) or (vii) is true, then a training update is applied.\n\nTo freeze or unfreeze mach, use freeze!(mach) or thaw!(mach).\n\nImplementation details\n\nThe data to which a machine is bound is stored in mach.args. Each element of args is either a Node object, or, in the case that concrete data was bound to the machine, it is concrete data wrapped in a Source node. In all cases, to obtain concrete data for actual training, each argument N is called, as in N() or N(rows=rows), and either MLJBase.fit (ab initio training) or MLJBase.update (training update) is dispatched on mach.model and this data. 
See the \"Adding models for general use\" section of the MLJ documentation for more on these lower-level training methods.\n\n\n\n\n\n","category":"function"},{"location":"models/TfidfTransformer_MLJText/#TfidfTransformer_MLJText","page":"TfidfTransformer","title":"TfidfTransformer","text":"","category":"section"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"TfidfTransformer","category":"page"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"A model type for constructing a TF-IDF transformer, based on MLJText.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"TfidfTransformer = @load TfidfTransformer pkg=MLJText","category":"page"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"Do model = TfidfTransformer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in TfidfTransformer(max_doc_freq=...).","category":"page"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"The transformer converts a collection of documents, tokenized or pre-parsed as bags of words/ngrams, to a matrix of TF-IDF scores. Here \"TF\" means term-frequency while \"IDF\" means inverse document frequency (defined below). The TF-IDF score is the product of the two. This is a common term weighting scheme in information retrieval that has also found good use in document classification. The goal of using TF-IDF instead of the raw frequencies of occurrence of a token in a given document is to scale down the impact of tokens that occur very frequently in a given corpus and that are hence empirically less informative than features that occur in a small fraction of the training corpus.","category":"page"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"In textbooks and implementations there is variation in the definition of IDF. Here two IDF definitions are available. The default, smoothed option provides the IDF for a term t as log((1 + n)/(1 + df(t))) + 1, where n is the total number of documents and df(t) the number of documents in which t appears. Setting smooth_idf = false provides an IDF of log(n/df(t)) + 1.","category":"page"},{"location":"models/TfidfTransformer_MLJText/#Training-data","page":"TfidfTransformer","title":"Training data","text":"","category":"section"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"mach = machine(model, X)","category":"page"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"Here:","category":"page"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"X is any vector whose elements are either tokenized documents or bags of words/ngrams. 
Specifically, each element is one of the following:\nA vector of abstract strings (tokens), e.g., [\"I\", \"like\", \"Sam\", \".\", \"Sam\", \"is\", \"nice\", \".\"] (scitype AbstractVector{Textual})\nA dictionary of counts, indexed on abstract strings, e.g., Dict(\"I\"=>1, \"Sam\"=>2, \"Sam is\"=>1) (scitype Multiset{Textual})\nA dictionary of counts, indexed on plain ngrams, e.g., Dict((\"I\",)=>1, (\"Sam\",)=>2, (\"I\", \"Sam\")=>1) (scitype Multiset{<:NTuple{N,Textual} where N}); here a plain ngram is a tuple of abstract strings.","category":"page"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/TfidfTransformer_MLJText/#Hyper-parameters","page":"TfidfTransformer","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"max_doc_freq=1.0: Restricts the vocabulary that the transformer will consider. Terms that occur in > max_doc_freq documents will not be considered by the transformer. For example, if max_doc_freq is set to 0.9, terms that are in more than 90% of the documents will be removed.\nmin_doc_freq=0.0: Restricts the vocabulary that the transformer will consider. Terms that occur in < min_doc_freq documents will not be considered by the transformer. A value of 0.01 means that only terms that are in at least 1% of the documents will be included.\nsmooth_idf=true: Controls which definition of IDF to use (see above).","category":"page"},{"location":"models/TfidfTransformer_MLJText/#Operations","page":"TfidfTransformer","title":"Operations","text":"","category":"section"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"transform(mach, Xnew): Based on the vocabulary and IDF learned in training, return the matrix of TF-IDF scores for Xnew, a vector of the same form as X above. The matrix has size (n, p), where n = length(Xnew) and p the size of the vocabulary. Tokens/ngrams not appearing in the learned vocabulary are scored zero.","category":"page"},{"location":"models/TfidfTransformer_MLJText/#Fitted-parameters","page":"TfidfTransformer","title":"Fitted parameters","text":"","category":"section"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"vocab: A vector containing the strings used in the transformer's vocabulary.\nidf_vector: The transformer's calculated IDF vector.","category":"page"},{"location":"models/TfidfTransformer_MLJText/#Examples","page":"TfidfTransformer","title":"Examples","text":"","category":"section"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"TfidfTransformer accepts a variety of inputs. 
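For instance, the document-frequency filters described under Hyper-parameters might be set as follows (a minimal sketch; the threshold values are purely illustrative):

using MLJ
TfidfTransformer = @load TfidfTransformer pkg=MLJText verbosity=0

# drop terms occurring in more than 90%, or in fewer than 1%, of documents:
tfidf_transformer = TfidfTransformer(max_doc_freq=0.9, min_doc_freq=0.01)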
The example below transforms tokenized documents:","category":"page"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"using MLJ\nimport TextAnalysis\n\nTfidfTransformer = @load TfidfTransformer pkg=MLJText\n\ndocs = [\"Hi my name is Sam.\", \"How are you today?\"]\ntfidf_transformer = TfidfTransformer()\n\njulia> tokenized_docs = TextAnalysis.tokenize.(docs)\n2-element Vector{Vector{String}}:\n [\"Hi\", \"my\", \"name\", \"is\", \"Sam\", \".\"]\n [\"How\", \"are\", \"you\", \"today\", \"?\"]\n\nmach = machine(tfidf_transformer, tokenized_docs)\nfit!(mach)\n\nfitted_params(mach)\n\ntfidf_mat = transform(mach, tokenized_docs)","category":"page"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"Alternatively, one can provide documents pre-parsed as ngrams counts:","category":"page"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"using MLJ\nimport TextAnalysis\n\ndocs = [\"Hi my name is Sam.\", \"How are you today?\"]\ncorpus = TextAnalysis.Corpus(TextAnalysis.NGramDocument.(docs, 1, 2))\nngram_docs = TextAnalysis.ngrams.(corpus)\n\njulia> ngram_docs[1]\nDict{AbstractString, Int64} with 11 entries:\n \"is\" => 1\n \"my\" => 1\n \"name\" => 1\n \".\" => 1\n \"Hi\" => 1\n \"Sam\" => 1\n \"my name\" => 1\n \"Hi my\" => 1\n \"name is\" => 1\n \"Sam .\" => 1\n \"is Sam\" => 1\n\ntfidf_transformer = TfidfTransformer()\nmach = machine(tfidf_transformer, ngram_docs)\nMLJ.fit!(mach)\nfitted_params(mach)\n\ntfidf_mat = transform(mach, ngram_docs)","category":"page"},{"location":"models/TfidfTransformer_MLJText/","page":"TfidfTransformer","title":"TfidfTransformer","text":"See also CountTransformer, BM25Transformer","category":"page"},{"location":"models/AutoEncoder_BetaML/#AutoEncoder_BetaML","page":"AutoEncoder","title":"AutoEncoder","text":"","category":"section"},{"location":"models/AutoEncoder_BetaML/","page":"AutoEncoder","title":"AutoEncoder","text":"mutable struct AutoEncoder <: MLJModelInterface.Unsupervised","category":"page"},{"location":"models/AutoEncoder_BetaML/","page":"AutoEncoder","title":"AutoEncoder","text":"A ready-to-use AutoEncoder, from the Beta Machine Learning Toolkit (BetaML), for encoding and decoding of data using neural networks.","category":"page"},{"location":"models/AutoEncoder_BetaML/#Parameters:","page":"AutoEncoder","title":"Parameters:","text":"","category":"section"},{"location":"models/AutoEncoder_BetaML/","page":"AutoEncoder","title":"AutoEncoder","text":"encoded_size: The number of neurons (i.e. dimensions) of the encoded data. If the value is a float it is considered a percentage (to be rounded) of the dimensionality of the data [def: 0.33]\nlayers_size: Inner layer dimension (i.e. number of neurons). If the value is a float it is considered a percentage (to be rounded) of the dimensionality of the data [def: nothing, which applies a specific heuristic]. Consider that the underlying neural network is trying to predict multiple values at the same time. Normally this requires many more neurons than for a scalar prediction. If e_layers or d_layers are specified, this parameter is ignored for the respective part.\ne_layers: The layers (vector of AbstractLayers) responsible for the encoding of the data [def: nothing, i.e. two dense layers with the inner one of layers_size]. 
See subtypes(BetaML.AbstractLayer) for supported layers\nd_layers: The layers (vector of AbstractLayers) responsible for the decoding of the data [def: nothing, i.e. two dense layers with the inner one of layers_size]. See subtypes(BetaML.AbstractLayer) for supported layers\nloss: Loss (cost) function [def: BetaML.squared_cost]. Should always assume y and ŷ as (n x d) matrices.\nwarning: Warning\nIf you change the parameter loss, you need to either provide its derivative via the parameter dloss or use autodiff with dloss=nothing.\ndloss: Derivative of the loss function [def: BetaML.dsquared_cost if loss==squared_cost, nothing otherwise, i.e. use the derivative of the squared cost or autodiff]\nepochs: Number of epochs, i.e. passes through the whole training sample [def: 200]\nbatch_size: Size of each individual batch [def: 8]\nopt_alg: The optimisation algorithm to update the gradient at each batch [def: BetaML.ADAM()] See subtypes(BetaML.OptimisationAlgorithm) for supported optimizers\nshuffle: Whether to randomly shuffle the data at each iteration (epoch) [def: true]\ntunemethod: The method - and its parameters - to employ for hyperparameters autotuning. See SuccessiveHalvingSearch for the default method. To implement automatic hyperparameter tuning during the (first) fit! call simply set autotune=true and optionally change the default tunemethod options (including the parameter ranges, the resources to employ and the loss function to adopt).\ndescr: An optional title and/or description for this model\nrng: Random Number Generator (see FIXEDSEED) [default: Random.GLOBAL_RNG]","category":"page"},{"location":"models/AutoEncoder_BetaML/#Notes:","page":"AutoEncoder","title":"Notes:","text":"","category":"section"},{"location":"models/AutoEncoder_BetaML/","page":"AutoEncoder","title":"AutoEncoder","text":"data must be numerical\nuse transform to obtain the encoded data, and inverse_transform to decode to the original data","category":"page"},{"location":"models/AutoEncoder_BetaML/#Example:","page":"AutoEncoder","title":"Example:","text":"","category":"section"},{"location":"models/AutoEncoder_BetaML/","page":"AutoEncoder","title":"AutoEncoder","text":"julia> using MLJ\n\njulia> X, y = @load_iris;\n\njulia> modelType = @load AutoEncoder pkg = \"BetaML\" verbosity=0;\n\njulia> model = modelType(encoded_size=2,layers_size=10);\n\njulia> mach = machine(model, X)\nuntrained Machine; caches model-specific representations of data\n model: AutoEncoder(e_layers = nothing, …)\n args: \n 1:\tSource @334 ⏎ Table{AbstractVector{Continuous}}\n\njulia> fit!(mach,verbosity=2)\n[ Info: Training machine(AutoEncoder(e_layers = nothing, …), …).\n***\n*** Training for 200 epochs with algorithm BetaML.Nn.ADAM.\nTraining.. \t avg loss on epoch 1 (1): \t 35.48243542158747\nTraining.. \t avg loss on epoch 20 (20): \t 0.07528042222678126\nTraining.. \t avg loss on epoch 40 (40): \t 0.06293071729378613\nTraining.. \t avg loss on epoch 60 (60): \t 0.057035588828991145\nTraining.. \t avg loss on epoch 80 (80): \t 0.056313167754822875\nTraining.. \t avg loss on epoch 100 (100): \t 0.055521461091809436\nTraining the Neural Network... 52%|██████████████████████████████████████ | ETA: 0:00:01Training.. \t avg loss on epoch 120 (120): \t 0.06015206472927942\nTraining.. \t avg loss on epoch 140 (140): \t 0.05536835903285201\nTraining.. \t avg loss on epoch 160 (160): \t 0.05877560142428245\nTraining.. \t avg loss on epoch 180 (180): \t 0.05476302769966953\nTraining.. 
\t avg loss on epoch 200 (200): \t 0.049240864053557445\nTraining the Neural Network... 100%|█████████████████████████████████████████████████████████████████████████| Time: 0:00:01\nTraining of 200 epoch completed. Final epoch error: 0.049240864053557445.\ntrained Machine; caches model-specific representations of data\n model: AutoEncoder(e_layers = nothing, …)\n args: \n 1:\tSource @334 ⏎ Table{AbstractVector{Continuous}}\n\n\njulia> X_latent = transform(mach, X)\n150×2 Matrix{Float64}:\n 7.01701 -2.77285\n 6.50615 -2.9279\n 6.5233 -2.60754\n ⋮ \n 6.70196 -10.6059\n 6.46369 -11.1117\n 6.20212 -10.1323\n\njulia> X_recovered = inverse_transform(mach,X_latent)\n150×4 Matrix{Float64}:\n 5.04973 3.55838 1.43251 0.242215\n 4.73689 3.19985 1.44085 0.295257\n 4.65128 3.25308 1.30187 0.244354\n ⋮ \n 6.50077 2.93602 5.3303 1.87647\n 6.38639 2.83864 5.54395 2.04117\n 6.01595 2.67659 5.03669 1.83234\n\njulia> BetaML.relative_mean_error(MLJ.matrix(X),X_recovered)\n0.03387721261716176\n\n","category":"page"},{"location":"models/SVMLinearRegressor_MLJScikitLearnInterface/#SVMLinearRegressor_MLJScikitLearnInterface","page":"SVMLinearRegressor","title":"SVMLinearRegressor","text":"","category":"section"},{"location":"models/SVMLinearRegressor_MLJScikitLearnInterface/","page":"SVMLinearRegressor","title":"SVMLinearRegressor","text":"SVMLinearRegressor","category":"page"},{"location":"models/SVMLinearRegressor_MLJScikitLearnInterface/","page":"SVMLinearRegressor","title":"SVMLinearRegressor","text":"A model type for constructing a linear support vector regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/SVMLinearRegressor_MLJScikitLearnInterface/","page":"SVMLinearRegressor","title":"SVMLinearRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/SVMLinearRegressor_MLJScikitLearnInterface/","page":"SVMLinearRegressor","title":"SVMLinearRegressor","text":"SVMLinearRegressor = @load SVMLinearRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/SVMLinearRegressor_MLJScikitLearnInterface/","page":"SVMLinearRegressor","title":"SVMLinearRegressor","text":"Do model = SVMLinearRegressor() to construct an instance with default hyper-parameters. 
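A minimal usage sketch (assuming MLJScikitLearnInterface and its scikit-learn dependency are installed; make_regression just generates synthetic data):

using MLJ
SVMLinearRegressor = @load SVMLinearRegressor pkg=MLJScikitLearnInterface verbosity=0
model = SVMLinearRegressor()

X, y = make_regression(100, 3)       # synthetic table and target
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)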
Provide keyword arguments to override hyper-parameter defaults, as in SVMLinearRegressor(epsilon=...).","category":"page"},{"location":"models/SVMLinearRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"SVMLinearRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/SVMLinearRegressor_MLJScikitLearnInterface/","page":"SVMLinearRegressor","title":"SVMLinearRegressor","text":"epsilon = 0.0\ntol = 0.0001\nC = 1.0\nloss = epsilon_insensitive\nfit_intercept = true\nintercept_scaling = 1.0\ndual = true\nrandom_state = nothing\nmax_iter = 1000","category":"page"},{"location":"models/DecisionTreeRegressor_BetaML/#DecisionTreeRegressor_BetaML","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"","category":"section"},{"location":"models/DecisionTreeRegressor_BetaML/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"mutable struct DecisionTreeRegressor <: MLJModelInterface.Deterministic","category":"page"},{"location":"models/DecisionTreeRegressor_BetaML/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"A simple Decision Tree model for regression with support for Missing data, from the Beta Machine Learning Toolkit (BetaML).","category":"page"},{"location":"models/DecisionTreeRegressor_BetaML/#Hyperparameters:","page":"DecisionTreeRegressor","title":"Hyperparameters:","text":"","category":"section"},{"location":"models/DecisionTreeRegressor_BetaML/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"max_depth::Int64: The maximum depth the tree is allowed to reach. When this is reached the node is forced to become a leaf [def: 0, i.e. no limits]\nmin_gain::Float64: The minimum information gain to allow for a node's partition [def: 0]\nmin_records::Int64: The minimum number of records a node must hold to consider a partition of it [def: 2]\nmax_features::Int64: The maximum number of (random) features to consider at each partitioning [def: 0, i.e. look at all features]\nsplitting_criterion::Function: This is the name of the function to be used to compute the information gain of a specific partition. This is done by measuring the difference between the \"impurity\" of the labels of the parent node and those of the two child nodes, weighted by the respective number of items. [def: variance]. Either variance or a custom function. 
It can also be an anonymous function.\nrng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]","category":"page"},{"location":"models/DecisionTreeRegressor_BetaML/#Example:","page":"DecisionTreeRegressor","title":"Example:","text":"","category":"section"},{"location":"models/DecisionTreeRegressor_BetaML/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"julia> using MLJ\n\njulia> X, y = @load_boston;\n\njulia> modelType = @load DecisionTreeRegressor pkg = \"BetaML\" verbosity=0\nBetaML.Trees.DecisionTreeRegressor\n\njulia> model = modelType()\nDecisionTreeRegressor(\n max_depth = 0, \n min_gain = 0.0, \n min_records = 2, \n max_features = 0, \n splitting_criterion = BetaML.Utils.variance, \n rng = Random._GLOBAL_RNG())\n\njulia> mach = machine(model, X, y);\n\njulia> fit!(mach);\n[ Info: Training machine(DecisionTreeRegressor(max_depth = 0, …), …).\n\njulia> ŷ = predict(mach, X);\n\njulia> hcat(y,ŷ)\n506×2 Matrix{Float64}:\n 24.0 26.35\n 21.6 21.6\n 34.7 34.8\n ⋮ \n 23.9 23.75\n 22.0 22.2\n 11.9 13.2","category":"page"},{"location":"models/LinearSVC_LIBSVM/#LinearSVC_LIBSVM","page":"LinearSVC","title":"LinearSVC","text":"","category":"section"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"LinearSVC","category":"page"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"A model type for constructing a linear support vector classifier, based on LIBSVM.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"LinearSVC = @load LinearSVC pkg=LIBSVM","category":"page"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"Do model = LinearSVC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearSVC(solver=...).","category":"page"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"Reference for algorithm and core C-library: Rong-En Fan et al (2008): \"LIBLINEAR: A Library for Large Linear Classification.\" Journal of Machine Learning Research 9: 1871-1874. Available at https://www.csie.ntu.edu.tw/~cjlin/papers/liblinear.pdf. 
","category":"page"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"This model type is similar to SVC from the same package with the setting kernel=LIBSVM.Kernel.Linear, but is optimized for the linear case.","category":"page"},{"location":"models/LinearSVC_LIBSVM/#Training-data","page":"LinearSVC","title":"Training data","text":"","category":"section"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"In MLJ or MLJBase, bind an instance model to data with one of:","category":"page"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"mach = machine(model, X, y)\nmach = machine(model, X, y, w)","category":"page"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"where","category":"page"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)\ny: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)\nw: a dictionary of class weights, keyed on levels(y).","category":"page"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/LinearSVC_LIBSVM/#Hyper-parameters","page":"LinearSVC","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"solver=LIBSVM.Linearsolver.L2R_L2LOSS_SVC_DUAL: linear solver, which must be one of the following from the LIBSVM.jl package:\nLIBSVM.Linearsolver.L2R_LR: L2-regularized logistic regression (primal)\nLIBSVM.Linearsolver.L2R_L2LOSS_SVC_DUAL: L2-regularized L2-loss support vector classification (dual)\nLIBSVM.Linearsolver.L2R_L2LOSS_SVC: L2-regularized L2-loss support vector classification (primal)\nLIBSVM.Linearsolver.L2R_L1LOSS_SVC_DUAL: L2-regularized L1-loss support vector classification (dual)\nLIBSVM.Linearsolver.MCSVM_CS: support vector classification by Crammer and Singer\nLIBSVM.Linearsolver.L1R_L2LOSS_SVC: L1-regularized L2-loss support vector classification\nLIBSVM.Linearsolver.L1R_LR: L1-regularized logistic regression\nLIBSVM.Linearsolver.L2R_LR_DUAL: L2-regularized logistic regression (dual)\ntolerance::Float64=Inf: tolerance for the stopping criterion;\ncost=1.0 (range (0, Inf)): the parameter denoted C in the cited reference; for greater regularization, decrease cost\nbias= -1.0: if bias >= 0, instance x becomes [x; bias]; if bias < 0, no bias term added (default -1)","category":"page"},{"location":"models/LinearSVC_LIBSVM/#Operations","page":"LinearSVC","title":"Operations","text":"","category":"section"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above.","category":"page"},{"location":"models/LinearSVC_LIBSVM/#Fitted-parameters","page":"LinearSVC","title":"Fitted parameters","text":"","category":"section"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"libsvm_model: the trained model object created by the LIBSVM.jl 
package\nencoding: class encoding used internally by libsvm_model - a dictionary of class labels keyed on the internal integer representation","category":"page"},{"location":"models/LinearSVC_LIBSVM/#Examples","page":"LinearSVC","title":"Examples","text":"","category":"section"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"using MLJ\nimport LIBSVM\n\nLinearSVC = @load LinearSVC pkg=LIBSVM ## model type\nmodel = LinearSVC(solver=LIBSVM.Linearsolver.L2R_LR) ## instance\n\nX, y = @load_iris ## table, vector\nmach = machine(model, X, y) |> fit!\n\nXnew = (sepal_length = [6.4, 7.2, 7.4],\n sepal_width = [2.8, 3.0, 2.8],\n petal_length = [5.6, 5.8, 6.1],\n petal_width = [2.1, 1.6, 1.9],)\n\njulia> yhat = predict(mach, Xnew)\n3-element CategoricalArrays.CategoricalArray{String,1,UInt32}:\n \"virginica\"\n \"versicolor\"\n \"virginica\"","category":"page"},{"location":"models/LinearSVC_LIBSVM/#Incorporating-class-weights","page":"LinearSVC","title":"Incorporating class weights","text":"","category":"section"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"weights = Dict(\"virginica\" => 1, \"versicolor\" => 20, \"setosa\" => 1)\nmach = machine(model, X, y, weights) |> fit!\n\njulia> yhat = predict(mach, Xnew)\n3-element CategoricalArrays.CategoricalArray{String,1,UInt32}:\n \"versicolor\"\n \"versicolor\"\n \"versicolor\"","category":"page"},{"location":"models/LinearSVC_LIBSVM/","page":"LinearSVC","title":"LinearSVC","text":"See also the SVC and NuSVC classifiers, and LIVSVM.jl and the original C implementation documentation.","category":"page"},{"location":"model_browser/#Model-Browser","page":"Model Browser","title":"Model Browser","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"Models may appear under multiple categories.","category":"page"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"Below an encoder is any transformer that does not fall under another category, such as \"Missing Value Imputation\" or \"Dimension Reduction\".","category":"page"},{"location":"model_browser/#Categories","page":"Model Browser","title":"Categories","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"Regression | Classification | Outlier Detection | Iterative Models | Ensemble Models | Dimension Reduction | Clustering | Bayesian Models | Class Imbalance | Encoders | Meta Algorithms | Neural networks | Static Models | Missing Value Imputation | Distribution Fitter | Feature Engineering | Text Analysis | Image Processing","category":"page"},{"location":"model_browser/#Regression","page":"Model Browser","title":"Regression","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"ARDRegressor (MLJScikitLearnInterface.jl)\nAdaBoostRegressor (MLJScikitLearnInterface.jl)\nBaggingRegressor (MLJScikitLearnInterface.jl)\nBayesianRidgeRegressor (MLJScikitLearnInterface.jl)\nCatBoostRegressor (CatBoost.jl)\nConstantRegressor (MLJModels.jl)\nDecisionTreeRegressor (BetaML.jl)\nDecisionTreeRegressor (DecisionTree.jl/MLJDecisionTreeInterface.jl)\nDeterministicConstantRegressor (MLJModels.jl)\nDummyRegressor (MLJScikitLearnInterface.jl)\nElasticNetCVRegressor (MLJScikitLearnInterface.jl)\nElasticNetRegressor (MLJLinearModels.jl)\nElasticNetRegressor (MLJScikitLearnInterface.jl)\nEpsilonSVR 
(LIBSVM.jl/MLJLIBSVMInterface.jl)\nEvoLinearRegressor (EvoLinear.jl)\nEvoSplineRegressor (EvoLinear.jl)\nEvoTreeCount (EvoTrees.jl)\nEvoTreeGaussian (EvoTrees.jl)\nEvoTreeMLE (EvoTrees.jl)\nEvoTreeRegressor (EvoTrees.jl)\nExtraTreesRegressor (MLJScikitLearnInterface.jl)\nGaussianMixtureRegressor (BetaML.jl)\nGaussianProcessRegressor (MLJScikitLearnInterface.jl)\nGradientBoostingRegressor (MLJScikitLearnInterface.jl)\nHistGradientBoostingRegressor (MLJScikitLearnInterface.jl)\nHuberRegressor (MLJLinearModels.jl)\nHuberRegressor (MLJScikitLearnInterface.jl)\nKNNRegressor (NearestNeighborModels.jl)\nKNeighborsRegressor (MLJScikitLearnInterface.jl)\nKPLSRegressor (PartialLeastSquaresRegressor.jl)\nLADRegressor (MLJLinearModels.jl)\nLGBMRegressor (LightGBM.jl)\nLarsCVRegressor (MLJScikitLearnInterface.jl)\nLarsRegressor (MLJScikitLearnInterface.jl)\nLassoCVRegressor (MLJScikitLearnInterface.jl)\nLassoLarsCVRegressor (MLJScikitLearnInterface.jl)\nLassoLarsICRegressor (MLJScikitLearnInterface.jl)\nLassoLarsRegressor (MLJScikitLearnInterface.jl)\nLassoRegressor (MLJLinearModels.jl)\nLassoRegressor (MLJScikitLearnInterface.jl)\nLinearCountRegressor (GLM.jl/MLJGLMInterface.jl)\nLinearRegressor (GLM.jl/MLJGLMInterface.jl)\nLinearRegressor (MLJLinearModels.jl)\nLinearRegressor (MLJScikitLearnInterface.jl)\nLinearRegressor (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nMultiTaskElasticNetCVRegressor (MLJScikitLearnInterface.jl)\nMultiTaskElasticNetRegressor (MLJScikitLearnInterface.jl)\nMultiTaskLassoCVRegressor (MLJScikitLearnInterface.jl)\nMultiTaskLassoRegressor (MLJScikitLearnInterface.jl)\nMultitargetGaussianMixtureRegressor (BetaML.jl)\nMultitargetKNNRegressor (NearestNeighborModels.jl)\nMultitargetLinearRegressor (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nMultitargetNeuralNetworkRegressor (BetaML.jl)\nMultitargetNeuralNetworkRegressor (MLJFlux.jl)\nMultitargetRidgeRegressor (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nMultitargetSRRegressor (SymbolicRegression.jl)\nNeuralNetworkRegressor (BetaML.jl)\nNeuralNetworkRegressor (MLJFlux.jl)\nNuSVR (LIBSVM.jl/MLJLIBSVMInterface.jl)\nOrthogonalMatchingPursuitCVRegressor (MLJScikitLearnInterface.jl)\nOrthogonalMatchingPursuitRegressor (MLJScikitLearnInterface.jl)\nPLSRegressor (PartialLeastSquaresRegressor.jl)\nPartLS (PartitionedLS.jl)\nPassiveAggressiveRegressor (MLJScikitLearnInterface.jl)\nQuantileRegressor (MLJLinearModels.jl)\nRANSACRegressor (MLJScikitLearnInterface.jl)\nRandomForestRegressor (BetaML.jl)\nRandomForestRegressor (DecisionTree.jl/MLJDecisionTreeInterface.jl)\nRandomForestRegressor (MLJScikitLearnInterface.jl)\nRidgeRegressor (MLJLinearModels.jl)\nRidgeRegressor (MLJScikitLearnInterface.jl)\nRidgeRegressor (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nRobustRegressor (MLJLinearModels.jl)\nSGDRegressor (MLJScikitLearnInterface.jl)\nSRRegressor (SymbolicRegression.jl)\nSVMLinearRegressor (MLJScikitLearnInterface.jl)\nSVMNuRegressor (MLJScikitLearnInterface.jl)\nSVMRegressor (MLJScikitLearnInterface.jl)\nStableForestRegressor (SIRUS.jl)\nStableRulesRegressor (SIRUS.jl)\nTheilSenRegressor (MLJScikitLearnInterface.jl)\nXGBoostCount (XGBoost.jl/MLJXGBoostInterface.jl)\nXGBoostRegressor (XGBoost.jl/MLJXGBoostInterface.jl)","category":"page"},{"location":"model_browser/#Classification","page":"Model Browser","title":"Classification","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"AdaBoostClassifier 
(MLJScikitLearnInterface.jl)\nAdaBoostStumpClassifier (DecisionTree.jl/MLJDecisionTreeInterface.jl)\nBaggingClassifier (MLJScikitLearnInterface.jl)\nBalancedBaggingClassifier (MLJBalancing.jl)\nBayesianLDA (MLJScikitLearnInterface.jl)\nBayesianLDA (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nBayesianQDA (MLJScikitLearnInterface.jl)\nBayesianSubspaceLDA (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nBernoulliNBClassifier (MLJScikitLearnInterface.jl)\nBinaryThresholdPredictor (MLJModels.jl)\nCatBoostClassifier (CatBoost.jl)\nComplementNBClassifier (MLJScikitLearnInterface.jl)\nConstantClassifier (MLJModels.jl)\nDecisionTreeClassifier (BetaML.jl)\nDecisionTreeClassifier (DecisionTree.jl/MLJDecisionTreeInterface.jl)\nDeterministicConstantClassifier (MLJModels.jl)\nDummyClassifier (MLJScikitLearnInterface.jl)\nEvoTreeClassifier (EvoTrees.jl)\nExtraTreesClassifier (MLJScikitLearnInterface.jl)\nGaussianNBClassifier (MLJScikitLearnInterface.jl)\nGaussianNBClassifier (NaiveBayes.jl/MLJNaiveBayesInterface.jl)\nGaussianProcessClassifier (MLJScikitLearnInterface.jl)\nGradientBoostingClassifier (MLJScikitLearnInterface.jl)\nHistGradientBoostingClassifier (MLJScikitLearnInterface.jl)\nImageClassifier (MLJFlux.jl)\nKNNClassifier (NearestNeighborModels.jl)\nKNeighborsClassifier (MLJScikitLearnInterface.jl)\nKernelPerceptronClassifier (BetaML.jl)\nLDA (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nLGBMClassifier (LightGBM.jl)\nLinearBinaryClassifier (GLM.jl/MLJGLMInterface.jl)\nLinearSVC (LIBSVM.jl/MLJLIBSVMInterface.jl)\nLogisticCVClassifier (MLJScikitLearnInterface.jl)\nLogisticClassifier (MLJLinearModels.jl)\nLogisticClassifier (MLJScikitLearnInterface.jl)\nMultinomialClassifier (MLJLinearModels.jl)\nMultinomialNBClassifier (MLJScikitLearnInterface.jl)\nMultinomialNBClassifier (NaiveBayes.jl/MLJNaiveBayesInterface.jl)\nMultitargetKNNClassifier (NearestNeighborModels.jl)\nNeuralNetworkClassifier (BetaML.jl)\nNeuralNetworkClassifier (MLJFlux.jl)\nNuSVC (LIBSVM.jl/MLJLIBSVMInterface.jl)\nOneRuleClassifier (OneRule.jl)\nPassiveAggressiveClassifier (MLJScikitLearnInterface.jl)\nPegasosClassifier (BetaML.jl)\nPerceptronClassifier (BetaML.jl)\nPerceptronClassifier (MLJScikitLearnInterface.jl)\nProbabilisticNuSVC (LIBSVM.jl/MLJLIBSVMInterface.jl)\nProbabilisticSGDClassifier (MLJScikitLearnInterface.jl)\nProbabilisticSVC (LIBSVM.jl/MLJLIBSVMInterface.jl)\nRandomForestClassifier (BetaML.jl)\nRandomForestClassifier (DecisionTree.jl/MLJDecisionTreeInterface.jl)\nRandomForestClassifier (MLJScikitLearnInterface.jl)\nRidgeCVClassifier (MLJScikitLearnInterface.jl)\nRidgeCVRegressor (MLJScikitLearnInterface.jl)\nRidgeClassifier (MLJScikitLearnInterface.jl)\nSGDClassifier (MLJScikitLearnInterface.jl)\nSVC (LIBSVM.jl/MLJLIBSVMInterface.jl)\nSVMClassifier (MLJScikitLearnInterface.jl)\nSVMLinearClassifier (MLJScikitLearnInterface.jl)\nSVMNuClassifier (MLJScikitLearnInterface.jl)\nStableForestClassifier (SIRUS.jl)\nStableRulesClassifier (SIRUS.jl)\nSubspaceLDA (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nXGBoostClassifier (XGBoost.jl/MLJXGBoostInterface.jl)","category":"page"},{"location":"model_browser/#Outlier-Detection","page":"Model Browser","title":"Outlier Detection","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"ABODDetector (OutlierDetectionNeighbors.jl)\nABODDetector (OutlierDetectionPython.jl)\nCBLOFDetector (OutlierDetectionPython.jl)\nCDDetector (OutlierDetectionPython.jl)\nCOFDetector 
(OutlierDetectionNeighbors.jl)\nCOFDetector (OutlierDetectionPython.jl)\nCOPODDetector (OutlierDetectionPython.jl)\nDNNDetector (OutlierDetectionNeighbors.jl)\nECODDetector (OutlierDetectionPython.jl)\nGMMDetector (OutlierDetectionPython.jl)\nHBOSDetector (OutlierDetectionPython.jl)\nIForestDetector (OutlierDetectionPython.jl)\nINNEDetector (OutlierDetectionPython.jl)\nKDEDetector (OutlierDetectionPython.jl)\nKNNDetector (OutlierDetectionNeighbors.jl)\nKNNDetector (OutlierDetectionPython.jl)\nLMDDDetector (OutlierDetectionPython.jl)\nLOCIDetector (OutlierDetectionPython.jl)\nLODADetector (OutlierDetectionPython.jl)\nLOFDetector (OutlierDetectionNeighbors.jl)\nLOFDetector (OutlierDetectionPython.jl)\nMCDDetector (OutlierDetectionPython.jl)\nOCSVMDetector (OutlierDetectionPython.jl)\nOneClassSVM (LIBSVM.jl/MLJLIBSVMInterface.jl)\nPCADetector (OutlierDetectionPython.jl)\nRODDetector (OutlierDetectionPython.jl)\nSODDetector (OutlierDetectionPython.jl)\nSOSDetector (OutlierDetectionPython.jl)\nTransformedTargetModel (MLJBase.jl)","category":"page"},{"location":"model_browser/#Iterative-Models","page":"Model Browser","title":"Iterative Models","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"CatBoostClassifier (CatBoost.jl)\nCatBoostRegressor (CatBoost.jl)\nEvoSplineRegressor (EvoLinear.jl)\nEvoTreeClassifier (EvoTrees.jl)\nEvoTreeCount (EvoTrees.jl)\nEvoTreeGaussian (EvoTrees.jl)\nEvoTreeMLE (EvoTrees.jl)\nEvoTreeRegressor (EvoTrees.jl)\nExtraTreesClassifier (MLJScikitLearnInterface.jl)\nExtraTreesRegressor (MLJScikitLearnInterface.jl)\nImageClassifier (MLJFlux.jl)\nIteratedModel (MLJIteration.jl)\nLGBMClassifier (LightGBM.jl)\nLGBMRegressor (LightGBM.jl)\nMultitargetNeuralNetworkRegressor (MLJFlux.jl)\nNeuralNetworkClassifier (MLJFlux.jl)\nNeuralNetworkRegressor (MLJFlux.jl)\nPerceptronClassifier (BetaML.jl)\nPerceptronClassifier (MLJScikitLearnInterface.jl)\nRandomForestClassifier (BetaML.jl)\nRandomForestClassifier (DecisionTree.jl/MLJDecisionTreeInterface.jl)\nRandomForestClassifier (MLJScikitLearnInterface.jl)\nRandomForestImputer (BetaML.jl)\nRandomForestRegressor (BetaML.jl)\nRandomForestRegressor (DecisionTree.jl/MLJDecisionTreeInterface.jl)\nRandomForestRegressor (MLJScikitLearnInterface.jl)\nXGBoostClassifier (XGBoost.jl/MLJXGBoostInterface.jl)\nXGBoostCount (XGBoost.jl/MLJXGBoostInterface.jl)\nXGBoostRegressor (XGBoost.jl/MLJXGBoostInterface.jl)","category":"page"},{"location":"model_browser/#Ensemble-Models","page":"Model Browser","title":"Ensemble Models","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"BaggingClassifier (MLJScikitLearnInterface.jl)\nBaggingRegressor (MLJScikitLearnInterface.jl)\nCatBoostClassifier (CatBoost.jl)\nCatBoostRegressor (CatBoost.jl)\nEnsembleModel (MLJEnsembles.jl)\nEvoSplineRegressor (EvoLinear.jl)\nEvoTreeClassifier (EvoTrees.jl)\nEvoTreeCount (EvoTrees.jl)\nEvoTreeGaussian (EvoTrees.jl)\nEvoTreeMLE (EvoTrees.jl)\nEvoTreeRegressor (EvoTrees.jl)\nLGBMClassifier (LightGBM.jl)\nLGBMRegressor (LightGBM.jl)\nRandomForestClassifier (BetaML.jl)\nRandomForestClassifier (DecisionTree.jl/MLJDecisionTreeInterface.jl)\nRandomForestClassifier (MLJScikitLearnInterface.jl)\nRandomForestImputer (BetaML.jl)\nRandomForestRegressor (BetaML.jl)\nRandomForestRegressor (DecisionTree.jl/MLJDecisionTreeInterface.jl)\nRandomForestRegressor (MLJScikitLearnInterface.jl)\nStack (MLJBase.jl)\nXGBoostClassifier 
(XGBoost.jl/MLJXGBoostInterface.jl)\nXGBoostCount (XGBoost.jl/MLJXGBoostInterface.jl)\nXGBoostRegressor (XGBoost.jl/MLJXGBoostInterface.jl)","category":"page"},{"location":"model_browser/#Dimension-Reduction","page":"Model Browser","title":"Dimension Reduction","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"AutoEncoder (BetaML.jl)\nBayesianLDA (MLJScikitLearnInterface.jl)\nBayesianLDA (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nBayesianQDA (MLJScikitLearnInterface.jl)\nBayesianSubspaceLDA (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nBirch (MLJScikitLearnInterface.jl)\nBisectingKMeans (MLJScikitLearnInterface.jl)\nFactorAnalysis (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nFeatureSelector (FeatureSelection.jl)\nKMeans (Clustering.jl/MLJClusteringInterface.jl)\nKMeans (MLJScikitLearnInterface.jl)\nKMeans (ParallelKMeans.jl)\nKMedoids (Clustering.jl/MLJClusteringInterface.jl)\nKernelPCA (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nLDA (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nMiniBatchKMeans (MLJScikitLearnInterface.jl)\nPCA (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nPPCA (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nRecursiveFeatureElimination (FeatureSelection.jl)\nSelfOrganizingMap (SelfOrganizingMaps.jl)\nSubspaceLDA (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nTSVDTransformer (TSVD.jl/MLJTSVDInterface.jl)","category":"page"},{"location":"model_browser/#Clustering","page":"Model Browser","title":"Clustering","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"AffinityPropagation (MLJScikitLearnInterface.jl)\nAgglomerativeClustering (MLJScikitLearnInterface.jl)\nBirch (MLJScikitLearnInterface.jl)\nBisectingKMeans (MLJScikitLearnInterface.jl)\nDBSCAN (Clustering.jl/MLJClusteringInterface.jl)\nDBSCAN (MLJScikitLearnInterface.jl)\nFeatureAgglomeration (MLJScikitLearnInterface.jl)\nGaussianMixtureClusterer (BetaML.jl)\nHDBSCAN (MLJScikitLearnInterface.jl)\nHierarchicalClustering (Clustering.jl/MLJClusteringInterface.jl)\nKMeans (Clustering.jl/MLJClusteringInterface.jl)\nKMeans (MLJScikitLearnInterface.jl)\nKMeans (ParallelKMeans.jl)\nKMeansClusterer (BetaML.jl)\nKMedoids (Clustering.jl/MLJClusteringInterface.jl)\nKMedoidsClusterer (BetaML.jl)\nMeanShift (MLJScikitLearnInterface.jl)\nMiniBatchKMeans (MLJScikitLearnInterface.jl)\nOPTICS (MLJScikitLearnInterface.jl)\nSelfOrganizingMap (SelfOrganizingMaps.jl)\nSpectralClustering (MLJScikitLearnInterface.jl)","category":"page"},{"location":"model_browser/#Bayesian-Models","page":"Model Browser","title":"Bayesian Models","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"ARDRegressor (MLJScikitLearnInterface.jl)\nBayesianLDA (MLJScikitLearnInterface.jl)\nBayesianLDA (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nBayesianQDA (MLJScikitLearnInterface.jl)\nBayesianRidgeRegressor (MLJScikitLearnInterface.jl)\nBayesianSubspaceLDA (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nBernoulliNBClassifier (MLJScikitLearnInterface.jl)\nComplementNBClassifier (MLJScikitLearnInterface.jl)\nGaussianNBClassifier (MLJScikitLearnInterface.jl)\nGaussianNBClassifier (NaiveBayes.jl/MLJNaiveBayesInterface.jl)\nGaussianProcessClassifier (MLJScikitLearnInterface.jl)\nGaussianProcessRegressor (MLJScikitLearnInterface.jl)\nMultinomialNBClassifier 
(MLJScikitLearnInterface.jl)\nMultinomialNBClassifier (NaiveBayes.jl/MLJNaiveBayesInterface.jl)","category":"page"},{"location":"model_browser/#Class-Imbalance","page":"Model Browser","title":"Class Imbalance","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"BalancedBaggingClassifier (MLJBalancing.jl)\nBalancedModel (MLJBalancing.jl)\nBorderlineSMOTE1 (Imbalance.jl)\nClusterUndersampler (Imbalance.jl)\nENNUndersampler (Imbalance.jl)\nROSE (Imbalance.jl)\nRandomOversampler (Imbalance.jl)\nRandomUndersampler (Imbalance.jl)\nRandomWalkOversampler (Imbalance.jl)\nSMOTE (Imbalance.jl)\nSMOTEN (Imbalance.jl)\nSMOTENC (Imbalance.jl)\nTomekUndersampler (Imbalance.jl)","category":"page"},{"location":"model_browser/#Encoders","page":"Model Browser","title":"Encoders","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"BM25Transformer (MLJText.jl)\nContinuousEncoder (MLJModels.jl)\nCountTransformer (MLJText.jl)\nICA (MultivariateStats.jl/MLJMultivariateStatsInterface.jl)\nOneHotEncoder (MLJModels.jl)\nStandardizer (MLJModels.jl)\nTfidfTransformer (MLJText.jl)\nUnivariateBoxCoxTransformer (MLJModels.jl)\nUnivariateDiscretizer (MLJModels.jl)\nUnivariateStandardizer (MLJModels.jl)\nUnivariateTimeTypeToContinuous (MLJModels.jl)","category":"page"},{"location":"model_browser/#Meta-Algorithms","page":"Model Browser","title":"Meta Algorithms","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"BalancedBaggingClassifier (MLJBalancing.jl)\nBalancedModel (MLJBalancing.jl)\nBinaryThresholdPredictor (MLJModels.jl)\nEnsembleModel (MLJEnsembles.jl)\nIteratedModel (MLJIteration.jl)\nPipeline (MLJBase.jl)\nRecursiveFeatureElimination (FeatureSelection.jl)\nResampler (MLJBase.jl)\nStack (MLJBase.jl)\nTransformedTargetModel (MLJBase.jl)\nTunedModel (MLJTuning.jl)","category":"page"},{"location":"model_browser/#Neural-networks","page":"Model Browser","title":"Neural networks","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"KernelPerceptronClassifier (BetaML.jl)\nMultitargetNeuralNetworkRegressor (BetaML.jl)\nMultitargetNeuralNetworkRegressor (MLJFlux.jl)\nNeuralNetworkClassifier (BetaML.jl)\nNeuralNetworkClassifier (MLJFlux.jl)\nNeuralNetworkRegressor (BetaML.jl)\nNeuralNetworkRegressor (MLJFlux.jl)\nPerceptronClassifier (BetaML.jl)\nPerceptronClassifier (MLJScikitLearnInterface.jl)","category":"page"},{"location":"model_browser/#Static-Models","page":"Model Browser","title":"Static Models","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"AgglomerativeClustering (MLJScikitLearnInterface.jl)\nDBSCAN (Clustering.jl/MLJClusteringInterface.jl)\nDBSCAN (MLJScikitLearnInterface.jl)\nFeatureAgglomeration (MLJScikitLearnInterface.jl)\nHDBSCAN (MLJScikitLearnInterface.jl)\nInteractionTransformer (MLJModels.jl)\nOPTICS (MLJScikitLearnInterface.jl)\nSpectralClustering (MLJScikitLearnInterface.jl)","category":"page"},{"location":"model_browser/#Missing-Value-Imputation","page":"Model Browser","title":"Missing Value Imputation","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"FillImputer (MLJModels.jl)\nGaussianMixtureImputer (BetaML.jl)\nGeneralImputer (BetaML.jl)\nRandomForestImputer (BetaML.jl)\nSimpleImputer 
(BetaML.jl)\nUnivariateFillImputer (MLJModels.jl)","category":"page"},{"location":"model_browser/#Distribution-Fitter","page":"Model Browser","title":"Distribution Fitter","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"GaussianMixtureClusterer (BetaML.jl)\nGaussianMixtureImputer (BetaML.jl)\nGaussianMixtureRegressor (BetaML.jl)\nMultitargetGaussianMixtureRegressor (BetaML.jl)","category":"page"},{"location":"model_browser/#Feature-Engineering","page":"Model Browser","title":"Feature Engineering","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"FeatureAgglomeration (MLJScikitLearnInterface.jl)\nFeatureSelector (FeatureSelection.jl)\nInteractionTransformer (MLJModels.jl)\nRecursiveFeatureElimination (FeatureSelection.jl)","category":"page"},{"location":"model_browser/#Text-Analysis","page":"Model Browser","title":"Text Analysis","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"BM25Transformer (MLJText.jl)\nCountTransformer (MLJText.jl)\nTfidfTransformer (MLJText.jl)","category":"page"},{"location":"model_browser/#Image-Processing","page":"Model Browser","title":"Image Processing","text":"","category":"section"},{"location":"model_browser/","page":"Model Browser","title":"Model Browser","text":"ImageClassifier (MLJFlux.jl)","category":"page"},{"location":"linear_pipelines/#Linear-Pipelines","page":"Linear Pipelines","title":"Linear Pipelines","text":"","category":"section"},{"location":"linear_pipelines/","page":"Linear Pipelines","title":"Linear Pipelines","text":"In MLJ a pipeline is a composite model in which models are chained together in a linear (non-branching) chain. For other arrangements, including custom architectures via learning networks, see Composing Models.","category":"page"},{"location":"linear_pipelines/","page":"Linear Pipelines","title":"Linear Pipelines","text":"For purposes of illustration, consider a supervised learning problem with the following toy data:","category":"page"},{"location":"linear_pipelines/","page":"Linear Pipelines","title":"Linear Pipelines","text":"using MLJ\nMLJ.color_off()","category":"page"},{"location":"linear_pipelines/","page":"Linear Pipelines","title":"Linear Pipelines","text":"using MLJ\nX = (age = [23, 45, 34, 25, 67],\n gender = categorical(['m', 'm', 'f', 'm', 'f']));\ny = [67.0, 81.5, 55.6, 90.0, 61.1]\n nothing # hide","category":"page"},{"location":"linear_pipelines/","page":"Linear Pipelines","title":"Linear Pipelines","text":"We would like to train using a K-nearest neighbor model, but the model type KNNRegressor assumes the features are all Continuous. This can be fixed by first:","category":"page"},{"location":"linear_pipelines/","page":"Linear Pipelines","title":"Linear Pipelines","text":"coercing the :age feature to have Continuous type by replacing X with coerce(X, :age=>Continuous)\nstandardizing continuous features and one-hot encoding the Multiclass features using the ContinuousEncoder model","category":"page"},{"location":"linear_pipelines/","page":"Linear Pipelines","title":"Linear Pipelines","text":"However, we can avoid separately applying these preprocessing steps (two of which require fit! 
steps) by combining them with the supervised KNNRegressor model in a new pipeline model, using Julia's |> syntax:","category":"page"},{"location":"linear_pipelines/","page":"Linear Pipelines","title":"Linear Pipelines","text":"KNNRegressor = @load KNNRegressor pkg=NearestNeighborModels\npipe = (X -> coerce(X, :age=>Continuous)) |> ContinuousEncoder() |> KNNRegressor(K=2)","category":"page"},{"location":"linear_pipelines/","page":"Linear Pipelines","title":"Linear Pipelines","text":"We see above that pipe is a model whose hyperparameters are themselves other models or a function. (The names of these hyper-parameters are automatically generated. To specify your own names, use the explicit Pipeline constructor instead.)","category":"page"},{"location":"linear_pipelines/","page":"Linear Pipelines","title":"Linear Pipelines","text":"The |> syntax can also be used to extend an existing pipeline or concatenate two existing pipelines. So, we could instead have defined:","category":"page"},{"location":"linear_pipelines/","page":"Linear Pipelines","title":"Linear Pipelines","text":"pipe_transformer = (X -> coerce(X, :age=>Continuous)) |> ContinuousEncoder()\npipe = pipe_transformer |> KNNRegressor(K=2)","category":"page"},{"location":"linear_pipelines/","page":"Linear Pipelines","title":"Linear Pipelines","text":"A pipeline is just a model like any other. For example, we can evaluate its performance on the data above:","category":"page"},{"location":"linear_pipelines/","page":"Linear Pipelines","title":"Linear Pipelines","text":"evaluate(pipe, X, y, resampling=CV(nfolds=3), measure=mae)","category":"page"},{"location":"linear_pipelines/","page":"Linear Pipelines","title":"Linear Pipelines","text":"To include target transformations in a pipeline, wrap the supervised component using TransformedTargetModel.","category":"page"},{"location":"linear_pipelines/","page":"Linear Pipelines","title":"Linear Pipelines","text":"Pipeline","category":"page"},{"location":"linear_pipelines/#MLJBase.Pipeline","page":"Linear Pipelines","title":"MLJBase.Pipeline","text":"Pipeline(component1, component2, ... , componentk; options...)\nPipeline(name1=component1, name2=component2, ..., namek=componentk; options...)\ncomponent1 |> component2 |> ... |> componentk\n\nCreate an instance of a composite model type which sequentially composes the specified components in order. This means component1 receives inputs, whose output is passed to component2, and so forth. A \"component\" is either a Model instance, a model type (converted immediately to its default instance) or any callable object. 
Here the \"output\" of a model is what predict returns if it is Supervised, or what transform returns if it is Unsupervised.\n\nNames for the component fields are automatically generated unless explicitly specified, as in\n\nPipeline(encoder=ContinuousEncoder(drop_last=false),\n stand=Standardizer())\n\nThe Pipeline constructor accepts keyword options discussed further below.\n\nOrdinary functions (and other callables) may be inserted in the pipeline as shown in the following example:\n\nPipeline(X->coerce(X, :age=>Continuous), OneHotEncoder, ConstantClassifier)\n\nSyntactic sugar\n\nThe |> operator is overloaded to construct pipelines out of models, callables, and existing pipelines:\n\nLinearRegressor = @load LinearRegressor pkg=MLJLinearModels add=true\nPCA = @load PCA pkg=MultivariateStats add=true\n\npipe1 = MLJBase.table |> ContinuousEncoder |> Standardizer\npipe2 = PCA |> LinearRegressor\npipe1 |> pipe2\n\nAt most one of the components may be a supervised model, but this model can appear in any position. A pipeline with a Supervised component is itself Supervised and implements the predict operation. It is otherwise Unsupervised (possibly Static) and implements transform.\n\nSpecial operations\n\nIf all the components are invertible unsupervised models (ie, implement inverse_transform) then inverse_transform is implemented for the pipeline. If there are no supervised models, then predict is nevertheless implemented, assuming the last component is a model that implements it (some clustering models). Similarly, calling transform on a supervised pipeline calls transform on the supervised component.\n\nOptional key-word arguments\n\nprediction_type - prediction type of the pipeline; possible values: :deterministic, :probabilistic, :interval (default=:deterministic if not inferable)\noperation - operation applied to the supervised component model, when present; possible values: predict, predict_mean, predict_median, predict_mode (default=predict)\ncache - whether the internal machines created for component models should cache model-specific representations of data (see machine) (default=true)\n\nwarning: Warning\nSet cache=false to guarantee data anonymization.\n\nTo build more complicated non-branching pipelines, refer to the MLJ manual sections on composing models.\n\n\n\n\n\n","category":"function"},{"location":"models/InteractionTransformer_MLJModels/#InteractionTransformer_MLJModels","page":"InteractionTransformer","title":"InteractionTransformer","text":"","category":"section"},{"location":"models/InteractionTransformer_MLJModels/","page":"InteractionTransformer","title":"InteractionTransformer","text":"InteractionTransformer","category":"page"},{"location":"models/InteractionTransformer_MLJModels/","page":"InteractionTransformer","title":"InteractionTransformer","text":"A model type for constructing a interaction transformer, based on MLJModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/InteractionTransformer_MLJModels/","page":"InteractionTransformer","title":"InteractionTransformer","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/InteractionTransformer_MLJModels/","page":"InteractionTransformer","title":"InteractionTransformer","text":"InteractionTransformer = @load InteractionTransformer pkg=MLJModels","category":"page"},{"location":"models/InteractionTransformer_MLJModels/","page":"InteractionTransformer","title":"InteractionTransformer","text":"Do model = InteractionTransformer() to construct 
an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in InteractionTransformer(order=...).","category":"page"},{"location":"models/InteractionTransformer_MLJModels/","page":"InteractionTransformer","title":"InteractionTransformer","text":"Generates all polynomial interaction terms up to the given order for the subset of chosen columns. Any column that contains elements with scitype <:Infinite is a valid basis to generate interactions. If features is not specified, all such columns with scitype <:Infinite in the table are used as a basis.","category":"page"},{"location":"models/InteractionTransformer_MLJModels/","page":"InteractionTransformer","title":"InteractionTransformer","text":"In MLJ or MLJBase, you can transform features X with the single call","category":"page"},{"location":"models/InteractionTransformer_MLJModels/","page":"InteractionTransformer","title":"InteractionTransformer","text":"transform(machine(model), X)","category":"page"},{"location":"models/InteractionTransformer_MLJModels/","page":"InteractionTransformer","title":"InteractionTransformer","text":"See also the example below.","category":"page"},{"location":"models/InteractionTransformer_MLJModels/#Hyper-parameters","page":"InteractionTransformer","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/InteractionTransformer_MLJModels/","page":"InteractionTransformer","title":"InteractionTransformer","text":"order: Maximum order of interactions to be generated.\nfeatures: Restricts interaction generation to those columns","category":"page"},{"location":"models/InteractionTransformer_MLJModels/#Operations","page":"InteractionTransformer","title":"Operations","text":"","category":"section"},{"location":"models/InteractionTransformer_MLJModels/","page":"InteractionTransformer","title":"InteractionTransformer","text":"transform(machine(model), X): Generates polynomial interaction terms out of table X using the hyper-parameters specified in model.","category":"page"},{"location":"models/InteractionTransformer_MLJModels/#Example","page":"InteractionTransformer","title":"Example","text":"","category":"section"},{"location":"models/InteractionTransformer_MLJModels/","page":"InteractionTransformer","title":"InteractionTransformer","text":"using MLJ\n\nX = (\n A = [1, 2, 3],\n B = [4, 5, 6],\n C = [7, 8, 9],\n D = [\"x₁\", \"x₂\", \"x₃\"]\n)\nit = InteractionTransformer(order=3)\nmach = machine(it)\n\njulia> transform(mach, X)\n(A = [1, 2, 3],\n B = [4, 5, 6],\n C = [7, 8, 9],\n D = [\"x₁\", \"x₂\", \"x₃\"],\n A_B = [4, 10, 18],\n A_C = [7, 16, 27],\n B_C = [28, 40, 54],\n A_B_C = [28, 80, 162],)\n\nit = InteractionTransformer(order=2, features=[:A, :B])\nmach = machine(it)\n\njulia> transform(mach, X)\n(A = [1, 2, 3],\n B = [4, 5, 6],\n C = [7, 8, 9],\n D = [\"x₁\", \"x₂\", \"x₃\"],\n A_B = [4, 10, 18],)\n","category":"page"},{"location":"models/HierarchicalClustering_Clustering/#HierarchicalClustering_Clustering","page":"HierarchicalClustering","title":"HierarchicalClustering","text":"","category":"section"},{"location":"models/HierarchicalClustering_Clustering/","page":"HierarchicalClustering","title":"HierarchicalClustering","text":"HierarchicalClustering","category":"page"},{"location":"models/HierarchicalClustering_Clustering/","page":"HierarchicalClustering","title":"HierarchicalClustering","text":"A model type for constructing a hierarchical clusterer, based on Clustering.jl, and implementing the MLJ model 
interface.","category":"page"},{"location":"models/HierarchicalClustering_Clustering/","page":"HierarchicalClustering","title":"HierarchicalClustering","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/HierarchicalClustering_Clustering/","page":"HierarchicalClustering","title":"HierarchicalClustering","text":"HierarchicalClustering = @load HierarchicalClustering pkg=Clustering","category":"page"},{"location":"models/HierarchicalClustering_Clustering/","page":"HierarchicalClustering","title":"HierarchicalClustering","text":"Do model = HierarchicalClustering() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in HierarchicalClustering(linkage=...).","category":"page"},{"location":"models/HierarchicalClustering_Clustering/","page":"HierarchicalClustering","title":"HierarchicalClustering","text":"Hierarchical Clustering is a clustering algorithm that organizes the data in a dendrogram based on distances between groups of points and computes cluster assignments by cutting the dendrogram at a given height. More information is available at the Clustering.jl documentation. Use predict to get cluster assignments. The dendrogram and the dendrogram cutter are accessed from the machine report (see below).","category":"page"},{"location":"models/HierarchicalClustering_Clustering/","page":"HierarchicalClustering","title":"HierarchicalClustering","text":"This is a static implementation, i.e., it does not generalize to new data instances, and there is no training data. For clusterers that do generalize, see KMeans or KMedoids.","category":"page"},{"location":"models/HierarchicalClustering_Clustering/","page":"HierarchicalClustering","title":"HierarchicalClustering","text":"In MLJ or MLJBase, create a machine with","category":"page"},{"location":"models/HierarchicalClustering_Clustering/","page":"HierarchicalClustering","title":"HierarchicalClustering","text":"mach = machine(model)","category":"page"},{"location":"models/HierarchicalClustering_Clustering/#Hyper-parameters","page":"HierarchicalClustering","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/HierarchicalClustering_Clustering/","page":"HierarchicalClustering","title":"HierarchicalClustering","text":"linkage = :single: linkage method (:single, :average, :complete, :ward, :ward_presquared)\nmetric = SqEuclidean: metric (see Distances.jl for available metrics)\nbranchorder = :r: branchorder (:r, :barjoseph, :optimal)\nh = nothing: height at which the dendrogram is cut\nk = 3: number of clusters.","category":"page"},{"location":"models/HierarchicalClustering_Clustering/","page":"HierarchicalClustering","title":"HierarchicalClustering","text":"If both k and h are specified, it is guaranteed that the number of clusters is not less than k and their height is not above h.","category":"page"},{"location":"models/HierarchicalClustering_Clustering/#Operations","page":"HierarchicalClustering","title":"Operations","text":"","category":"section"},{"location":"models/HierarchicalClustering_Clustering/","page":"HierarchicalClustering","title":"HierarchicalClustering","text":"predict(mach, X): return cluster label assignments, as an unordered CategoricalVector. 
Here X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).","category":"page"},{"location":"models/HierarchicalClustering_Clustering/#Report","page":"HierarchicalClustering","title":"Report","text":"","category":"section"},{"location":"models/HierarchicalClustering_Clustering/","page":"HierarchicalClustering","title":"HierarchicalClustering","text":"After calling predict(mach), the fields of report(mach) are:","category":"page"},{"location":"models/HierarchicalClustering_Clustering/","page":"HierarchicalClustering","title":"HierarchicalClustering","text":"dendrogram: the dendrogram that was computed when calling predict.\ncutter: a dendrogram cutter that can be called with a height h or a number of clusters k, to obtain a new assignment of the data points to clusters (see example below).","category":"page"},{"location":"models/HierarchicalClustering_Clustering/#Examples","page":"HierarchicalClustering","title":"Examples","text":"","category":"section"},{"location":"models/HierarchicalClustering_Clustering/","page":"HierarchicalClustering","title":"HierarchicalClustering","text":"using MLJ\n\nX, labels = make_moons(400, noise=0.09, rng=1) ## synthetic data with 2 clusters; X\n\nHierarchicalClustering = @load HierarchicalClustering pkg=Clustering\nmodel = HierarchicalClustering(linkage = :complete)\nmach = machine(model)\n\n## compute and output cluster assignments for observations in `X`:\nyhat = predict(mach, X)\n\n## plot dendrogram:\nusing StatsPlots\nplot(report(mach).dendrogram)\n\n## make new predictions by cutting the dendrogram at another height\nreport(mach).cutter(h = 2.5)","category":"page"},{"location":"models/SMOTENC_Imbalance/#SMOTENC_Imbalance","page":"SMOTENC","title":"SMOTENC","text":"","category":"section"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"Initiate a SMOTENC model with the given hyper-parameters.","category":"page"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"SMOTENC","category":"page"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"A model type for constructing a smotenc, based on Imbalance.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"SMOTENC = @load SMOTENC pkg=Imbalance","category":"page"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"Do model = SMOTENC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SMOTENC(k=...).","category":"page"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"SMOTENC implements the SMOTENC algorithm to correct for class imbalance as in N. V. Chawla, K. W. Bowyer, L. O.Hall, W. P. 
Kegelmeyer, “SMOTE: synthetic minority over-sampling technique,” Journal of artificial intelligence research, 321-357, 2002.","category":"page"},{"location":"models/SMOTENC_Imbalance/#Training-data","page":"SMOTENC","title":"Training data","text":"","category":"section"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"In MLJ or MLJBase, wrap the model in a machine by","category":"page"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"mach = machine(model)","category":"page"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"There is no need to provide any data here because the model is a static transformer.","category":"page"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"Likewise, there is no need to fit!(mach).","category":"page"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"For default values of the hyper-parameters, model can be constructed by","category":"page"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"model = SMOTENC()","category":"page"},{"location":"models/SMOTENC_Imbalance/#Hyperparameters","page":"SMOTENC","title":"Hyperparameters","text":"","category":"section"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"k=5: Number of nearest neighbors to consider in the SMOTENC algorithm. Should be within the range [1, n - 1], where n is the number of observations; otherwise set to the nearest of these two values.\nratios=1.0: A parameter that controls the amount of oversampling to be done for each class\nCan be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class\nCan be a dictionary mapping each class label to the float ratio for that class\nknn_tree: Decides the tree used in KNN computations. Either \"Brute\" or \"Ball\". BallTree can be much faster but may lead to inaccurate results.\nrng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister`.","category":"page"},{"location":"models/SMOTENC_Imbalance/#Transform-Inputs","page":"SMOTENC","title":"Transform Inputs","text":"","category":"section"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"X: A table with element scitypes that subtype Union{Finite, Infinite}. Elements in nominal columns should subtype Finite (i.e., have scitype OrderedFactor or Multiclass) and elements in continuous columns should subtype Infinite (i.e., have scitype Count or Continuous).\ny: An abstract vector of labels (e.g., strings) that correspond to the observations in X","category":"page"},{"location":"models/SMOTENC_Imbalance/#Transform-Outputs","page":"SMOTENC","title":"Transform Outputs","text":"","category":"section"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"Xover: A matrix or table that includes original data and the new observations due to oversampling. 
depending on whether the input X is a matrix or table respectively\nyover: An abstract vector of labels corresponding to Xover","category":"page"},{"location":"models/SMOTENC_Imbalance/#Operations","page":"SMOTENC","title":"Operations","text":"","category":"section"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"transform(mach, X, y): resample the data X and y using SMOTENC, returning both the new and original observations","category":"page"},{"location":"models/SMOTENC_Imbalance/#Example","page":"SMOTENC","title":"Example","text":"","category":"section"},{"location":"models/SMOTENC_Imbalance/","page":"SMOTENC","title":"SMOTENC","text":"using MLJ\nusing ScientificTypes\nimport Imbalance\n\n## set probability of each class\nclass_probs = [0.5, 0.2, 0.3] \nnum_rows = 100\nnum_continuous_feats = 3\n## want two categorical features with three and two possible values respectively\nnum_vals_per_category = [3, 2]\n\n## generate a table and categorical vector accordingly\nX, y = Imbalance.generate_imbalanced_data(num_rows, num_continuous_feats; \n class_probs, num_vals_per_category, rng=42) \njulia> Imbalance.checkbalance(y)\n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (39.6%) \n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 33 (68.8%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%) \n\njulia> ScientificTypes.schema(X).scitypes\n(Continuous, Continuous, Continuous, Continuous, Continuous)\n## coerce nominal columns to a finite scitype (multiclass or ordered factor)\nX = coerce(X, :Column4=>Multiclass, :Column5=>Multiclass)\n\n## load SMOTE-NC\nSMOTENC = @load SMOTENC pkg=Imbalance\n\n## wrap the model in a machine\noversampler = SMOTENC(k=5, ratios=Dict(0=>1.0, 1=> 0.9, 2=>0.8), rng=42)\nmach = machine(oversampler)\n\n## provide the data to transform (there is nothing to fit)\nXover, yover = transform(mach, X, y)\n\njulia> Imbalance.checkbalance(yover)\n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 38 (79.2%) \n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 43 (89.6%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%) ","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/#EvoTreeCount_EvoTrees","page":"EvoTreeCount","title":"EvoTreeCount","text":"","category":"section"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"EvoTreeCount(;kwargs...)","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"A model type for constructing a EvoTreeCount, based on EvoTrees.jl, and implementing both an internal API the MLJ model interface. EvoTreeCount is used to perform Poisson probabilistic regression on count target.","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/#Hyper-parameters","page":"EvoTreeCount","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"nrounds=100: Number of rounds. It corresponds to the number of trees that will be sequentially stacked. Must be >= 1.\neta=0.1: Learning rate. Each tree raw predictions are scaled by eta prior to be added to the stack of predictions. Must be > 0. A lower eta results in slower learning, requiring a higher nrounds but typically improves model performance.\nL2::T=0.0: L2 regularization factor on aggregate gain. Must be >= 0. Higher L2 can result in a more robust model.\nlambda::T=0.0: L2 regularization factor on individual gain. Must be >= 0. 
Higher lambda can result in a more robust model.\ngamma::T=0.0: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model.\nmax_depth=6: Maximum depth of a tree. Must be >= 1. A tree of depth 1 is made of a single prediction leaf. A complete tree of depth N contains 2^(N - 1) terminal leaves and 2^(N - 1) - 1 split nodes. Compute cost is proportional to 2^max_depth. Typical optimal values are in the 3 to 9 range.\nmin_weight=1.0: Minimum weight needed in a node to perform a split. Matches the number of observations by default or the sum of weights as provided by the weights vector. Must be > 0.\nrowsample=1.0: Proportion of rows that are sampled at each iteration to build the tree. Should be ]0, 1].\ncolsample=1.0: Proportion of columns / features that are sampled at each iteration to build the tree. Should be ]0, 1].\nnbins=64: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins. Should be between 2 and 255.\nmonotone_constraints=Dict{Int, Int}(): Specify monotonic constraints using a dict where the key is the feature index and the value the applicable constraint (-1=decreasing, 0=none, 1=increasing).\ntree_type=\"binary\": Tree structure to be used. One of:\nbinary: Each node of a tree is grown independently. Trees are built depthwise until max depth is reached or until min weight or gain (see gamma) stops further node splits.\noblivious: A common splitting condition is imposed to all nodes of a given depth.\nrng=123: Either an integer used as a seed to the random number generator or an actual random number generator (::Random.AbstractRNG).","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/#Internal-API","page":"EvoTreeCount","title":"Internal API","text":"","category":"section"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"Do config = EvoTreeCount() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeCount(max_depth=...).","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/#Training-model","page":"EvoTreeCount","title":"Training model","text":"","category":"section"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"A model is built using fit_evotree:","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"model = fit_evotree(config; x_train, y_train, kwargs...)","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/#Inference","page":"EvoTreeCount","title":"Inference","text":"","category":"section"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"Predictions are obtained using predict which returns a Vector of length nobs:","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"EvoTrees.predict(model, X)","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"Alternatively, models act as a functor, returning predictions when called as a function with features as argument:","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"model(X)","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/#MLJ","page":"EvoTreeCount","title":"MLJ","text":"","category":"section"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"From MLJ, the type can be imported using:","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"EvoTreeCount = @load EvoTreeCount pkg=EvoTrees","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"Do model = EvoTreeCount() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeCount(loss=...).","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/#Training-data","page":"EvoTreeCount","title":"Training data","text":"","category":"section"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"In MLJ or MLJBase, bind an instance model to data with mach = machine(model, X, y) where","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)\ny: is the target, which can be any AbstractVector whose element scitype is <:Count; check the scitype with scitype(y)","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/#Operations","page":"EvoTreeCount","title":"Operations","text":"","category":"section"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"predict(mach, Xnew): returns a vector of Poisson distributions given features Xnew having the same scitype as X above. 
Predictions are probabilistic.","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"Specific metrics can also be predicted using:","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"predict_mean(mach, Xnew)\npredict_mode(mach, Xnew)\npredict_median(mach, Xnew)","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/#Fitted-parameters","page":"EvoTreeCount","title":"Fitted parameters","text":"","category":"section"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":":fitresult: The GBTree object returned by the EvoTrees.jl fitting algorithm.","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/#Report","page":"EvoTreeCount","title":"Report","text":"","category":"section"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"The fields of report(mach) are:","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":":features: The names of the features encountered in training.","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/#Examples","page":"EvoTreeCount","title":"Examples","text":"","category":"section"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"## Internal API\nusing EvoTrees\nconfig = EvoTreeCount(max_depth=5, nbins=32, nrounds=100)\nnobs, nfeats = 1_000, 5\nx_train, y_train = randn(nobs, nfeats), rand(0:2, nobs)\nmodel = fit_evotree(config; x_train, y_train)\npreds = EvoTrees.predict(model, x_train)","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"using MLJ\nEvoTreeCount = @load EvoTreeCount pkg=EvoTrees\nmodel = EvoTreeCount(max_depth=5, nbins=32, nrounds=100)\nnobs, nfeats = 1_000, 5\nX, y = randn(nobs, nfeats), rand(0:2, nobs)\nmach = machine(model, X, y) |> fit!\npreds = predict(mach, X)\npreds = predict_mean(mach, X)\npreds = predict_mode(mach, X)\npreds = predict_median(mach, X)\n","category":"page"},{"location":"models/EvoTreeCount_EvoTrees/","page":"EvoTreeCount","title":"EvoTreeCount","text":"See also EvoTrees.jl.","category":"page"},{"location":"list_of_supported_models/#model_list","page":"List of Supported Models","title":"List of Supported Models","text":"","category":"section"},{"location":"list_of_supported_models/","page":"List of Supported Models","title":"List of Supported Models","text":"For a list of models organized around function (\"classification\", \"regression\", etc.), see the Model Browser.","category":"page"},{"location":"list_of_supported_models/","page":"List of Supported Models","title":"List of Supported Models","text":"MLJ provides access to a wide variety of machine learning models. We are always looking for help adding new models or testing existing ones. Currently available models are listed below; for the most up-to-date list, run using MLJ; models(). ","category":"page"},{"location":"list_of_supported_models/","page":"List of Supported Models","title":"List of Supported Models","text":"Indications of \"maturity\" in the table below are approximate, subjective, and possibly out-of-date. 
A decision to use or not use a model in a critical application should be based on a user's independent assessment.","category":"page"},{"location":"list_of_supported_models/","page":"List of Supported Models","title":"List of Supported Models","text":"experimental: indicates the package is fairly new and/or is under active development; you can help by testing these packages and making them more robust,\nlow: indicates a package that has reached a roughly stable form in terms of interface and which is unlikely to contain serious bugs. It may be missing some functionality found in similar packages. It has not benefited from a high level of use.\nmedium: indicates the package is fairly mature but may benefit from optimizations and/or extra features; you can help by suggesting either,\nhigh: indicates the package is very mature and functionalities are expected to have been fairly optimized and tested.","category":"page"},{"location":"list_of_supported_models/","page":"List of Supported Models","title":"List of Supported Models","text":"Package Interface Pkg Models Maturity Note\nBetaML.jl - DecisionTreeClassifier, RandomForestClassifier, NeuralNetworkClassifier, PerceptronClassifier, KernelPerceptronClassifier, PegasosClassifier, DecisionTreeRegressor, RandomForestRegressor, NeuralNetworkRegressor, MultitargetNeuralNetworkRegressor, GaussianMixtureRegressor, MultitargetGaussianMixtureRegressor, KMeansClusterer, KMedoidsClusterer, GaussianMixtureClusterer, SimpleImputer, GaussianMixtureImputer, RandomForestImputer, GeneralImputer, AutoEncoder medium \nCatBoost.jl - CatBoostRegressor, CatBoostClassifier high \nClustering.jl MLJClusteringInterface.jl KMeans, KMedoids, DBSCAN, HierarchicalClustering high² \nDecisionTree.jl MLJDecisionTreeInterface.jl DecisionTreeClassifier, DecisionTreeRegressor, AdaBoostStumpClassifier, RandomForestClassifier, RandomForestRegressor high \nEvoTrees.jl - EvoTreeRegressor, EvoTreeClassifier, EvoTreeCount, EvoTreeGaussian, EvoTreeMLE medium tree-based gradient boosting models\nEvoLinear.jl - EvoLinearRegressor medium linear boosting models\nGLM.jl MLJGLMInterface.jl LinearRegressor, LinearBinaryClassifier, LinearCountRegressor medium² \nImbalance.jl - RandomOversampler, RandomWalkOversampler, ROSE, SMOTE, BorderlineSMOTE1, SMOTEN, SMOTENC, RandomUndersampler, ClusterUndersampler, ENNUndersampler, TomekUndersampler low \nLIBSVM.jl MLJLIBSVMInterface.jl LinearSVC, SVC, NuSVC, NuSVR, EpsilonSVR, OneClassSVM high also via ScikitLearn.jl\nLightGBM.jl - LGBMClassifier, LGBMRegressor high \nFeatureSelector.jl - FeatureSelector, RecursiveFeatureElimination low \nFlux.jl MLJFlux.jl NeuralNetworkRegressor, NeuralNetworkClassifier, MultitargetNeuralNetworkRegressor, ImageClassifier low \nMLJBalancing.jl - BalancedBaggingClassifier low \nMLJLinearModels.jl - LinearRegressor, RidgeRegressor, LassoRegressor, ElasticNetRegressor, QuantileRegressor, HuberRegressor, RobustRegressor, LADRegressor, LogisticClassifier, MultinomialClassifier medium \nMLJModels.jl (built-in) - ConstantClassifier, ConstantRegressor, ContinuousEncoder, DeterministicConstantClassifier, DeterministicConstantRegressor, FillImputer, InteractionTransformer, OneHotEncoder, Standardizer, UnivariateBoxCoxTransformer, UnivariateDiscretizer, UnivariateFillImputer, UnivariateTimeTypeToContinuous, BinaryThresholdPredictor medium \nMLJText.jl - TfidfTransformer, BM25Transformer, CountTransformer low \nMultivariateStats.jl MLJMultivariateStatsInterface.jl LinearRegressor, MultitargetLinearRegressor, 
RidgeRegressor, MultitargetRidgeRegressor, PCA, KernelPCA, ICA, LDA, BayesianLDA, SubspaceLDA, BayesianSubspaceLDA, FactorAnalysis, PPCA high \nNaiveBayes.jl MLJNaiveBayesInterface.jl GaussianNBClassifier, MultinomialNBClassifier, HybridNBClassifier low \nNearestNeighborModels.jl - KNNClassifier, KNNRegressor, MultitargetKNNClassifier, MultitargetKNNRegressor high \nOneRule.jl - OneRuleClassifier experimental \nOutlierDetectionNeighbors.jl - ABODDetector, COFDetector, DNNDetector, KNNDetector, LOFDetector medium \nOutlierDetectionNetworks.jl - AEDetector, DSADDetector, ESADDetector medium \nOutlierDetectionPython.jl - ABODDetector, CBLOFDetector, CDDetector, COFDetector, COPODDetector, ECODDetector, GMMDetector, HBOSDetector, IForestDetector, INNEDetector, KDEDetector, KNNDetector, LMDDDetector, LOCIDetector, LODADetector, LOFDetector, MCDDetector, OCSVMDetector, PCADetector, RODDetector, SODDetector, SOSDetector high \nParallelKMeans.jl - KMeans experimental \nPartialLeastSquaresRegressor.jl - PLSRegressor, KPLSRegressor experimental \nPartitionedLS.jl - PartLS low \nScikitLearn.jl MLJScikitLearnInterface.jl ARDRegressor, AdaBoostClassifier, AdaBoostRegressor, AffinityPropagation, AgglomerativeClustering, BaggingClassifier, BaggingRegressor, BayesianLDA, BayesianQDA, BayesianRidgeRegressor, BernoulliNBClassifier, Birch, ComplementNBClassifier, DBSCAN, DummyClassifier, DummyRegressor, ElasticNetCVRegressor, ElasticNetRegressor, ExtraTreesClassifier, ExtraTreesRegressor, FeatureAgglomeration, GaussianNBClassifier, GaussianProcessClassifier, GaussianProcessRegressor, GradientBoostingClassifier, GradientBoostingRegressor, HuberRegressor, KMeans, KNeighborsClassifier, KNeighborsRegressor, LarsCVRegressor, LarsRegressor, LassoCVRegressor, LassoLarsCVRegressor, LassoLarsICRegressor, LassoLarsRegressor, LassoRegressor, LinearRegressor, LogisticCVClassifier, LogisticClassifier, MeanShift, MiniBatchKMeans, MultiTaskElasticNetCVRegressor, MultiTaskElasticNetRegressor, MultiTaskLassoCVRegressor, MultiTaskLassoRegressor, MultinomialNBClassifier, OPTICS, OrthogonalMatchingPursuitCVRegressor, OrthogonalMatchingPursuitRegressor, PassiveAggressiveClassifier, PassiveAggressiveRegressor, PerceptronClassifier, ProbabilisticSGDClassifier, RANSACRegressor, RandomForestClassifier, RandomForestRegressor, RidgeCVClassifier, RidgeCVRegressor, RidgeClassifier, RidgeRegressor, SGDClassifier, SGDRegressor, SVMClassifier, SVMLClassifier, SVMLRegressor, SVMNuClassifier, SVMNuRegressor, SVMRegressor, SpectralClustering, TheilSenRegressor high² \nSIRUS.jl - StableForestClassifier, StableForestRegressor, StableRulesClassifier, StableRulesRegressor low \nSymbolicRegression.jl - MultitargetSRRegressor, SRRegressor experimental \nTSVD.jl MLJTSVDInterface.jl TSVDTransformer high \nXGBoost.jl MLJXGBoostInterface.jl XGBoostRegressor, XGBoostClassifier, XGBoostCount high ","category":"page"},{"location":"list_of_supported_models/","page":"List of Supported Models","title":"List of Supported Models","text":"Notes ","category":"page"},{"location":"list_of_supported_models/","page":"List of Supported Models","title":"List of Supported Models","text":"¹Models not in the MLJ registry are not included in integration tests. Consult package documentation to see how to load them. 
There may be issues loading these models simultaneously with other registered models.","category":"page"},{"location":"list_of_supported_models/","page":"List of Supported Models","title":"List of Supported Models","text":"²Some models are missing and assistance is welcome to complete the interface. Post a message on the Julia #mlj Slack channel if you would like to help, thanks!","category":"page"},{"location":"models/GaussianProcessClassifier_MLJScikitLearnInterface/#GaussianProcessClassifier_MLJScikitLearnInterface","page":"GaussianProcessClassifier","title":"GaussianProcessClassifier","text":"","category":"section"},{"location":"models/GaussianProcessClassifier_MLJScikitLearnInterface/","page":"GaussianProcessClassifier","title":"GaussianProcessClassifier","text":"GaussianProcessClassifier","category":"page"},{"location":"models/GaussianProcessClassifier_MLJScikitLearnInterface/","page":"GaussianProcessClassifier","title":"GaussianProcessClassifier","text":"A model type for constructing a Gaussian process classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/GaussianProcessClassifier_MLJScikitLearnInterface/","page":"GaussianProcessClassifier","title":"GaussianProcessClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/GaussianProcessClassifier_MLJScikitLearnInterface/","page":"GaussianProcessClassifier","title":"GaussianProcessClassifier","text":"GaussianProcessClassifier = @load GaussianProcessClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/GaussianProcessClassifier_MLJScikitLearnInterface/","page":"GaussianProcessClassifier","title":"GaussianProcessClassifier","text":"Do model = GaussianProcessClassifier() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in GaussianProcessClassifier(kernel=...).","category":"page"},{"location":"models/GaussianProcessClassifier_MLJScikitLearnInterface/#Hyper-parameters","page":"GaussianProcessClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/GaussianProcessClassifier_MLJScikitLearnInterface/","page":"GaussianProcessClassifier","title":"GaussianProcessClassifier","text":"kernel = nothing\noptimizer = fmin_l_bfgs_b\nn_restarts_optimizer = 0\ncopy_X_train = true\nrandom_state = nothing\nmax_iter_predict = 100\nwarm_start = false\nmulti_class = one_vs_rest","category":"page"},{"location":"models/SpectralClustering_MLJScikitLearnInterface/#SpectralClustering_MLJScikitLearnInterface","page":"SpectralClustering","title":"SpectralClustering","text":"","category":"section"},{"location":"models/SpectralClustering_MLJScikitLearnInterface/","page":"SpectralClustering","title":"SpectralClustering","text":"SpectralClustering","category":"page"},{"location":"models/SpectralClustering_MLJScikitLearnInterface/","page":"SpectralClustering","title":"SpectralClustering","text":"A model type for constructing a spectral clustering, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/SpectralClustering_MLJScikitLearnInterface/","page":"SpectralClustering","title":"SpectralClustering","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/SpectralClustering_MLJScikitLearnInterface/","page":"SpectralClustering","title":"SpectralClustering","text":"SpectralClustering = @load SpectralClustering pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/SpectralClustering_MLJScikitLearnInterface/","page":"SpectralClustering","title":"SpectralClustering","text":"Do model = SpectralClustering() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SpectralClustering(n_clusters=...).","category":"page"},{"location":"models/SpectralClustering_MLJScikitLearnInterface/","page":"SpectralClustering","title":"SpectralClustering","text":"Apply clustering to a projection of the normalized Laplacian. In practice spectral clustering is very useful when the structure of the individual clusters is highly non-convex or more generally when a measure of the center and spread of the cluster is not a suitable description of the complete cluster. For instance, when clusters are nested circles on the 2D plane. 
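A minimal usage sketch follows (added for illustration and not part of the upstream docstring; it assumes the static-clusterer workflow used by other clustering examples in this manual, so the predict call shown is an assumption, and it uses MLJ's make_moons to generate such non-convex clusters): SpectralClustering = @load SpectralClustering pkg=MLJScikitLearnInterface\nmodel = SpectralClustering(n_clusters=2)\nX, _ = make_moons(400, noise=0.09, rng=1) ## two interleaved half-circles\nmach = machine(model) ## static model: no training data is bound\nyhat = predict(mach, X) ## cluster label assignments (assumed operation)\n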
","category":"page"},{"location":"models/BalancedModel_MLJBalancing/#BalancedModel_MLJBalancing","page":"BalancedModel","title":"BalancedModel","text":"","category":"section"},{"location":"models/BalancedModel_MLJBalancing/","page":"BalancedModel","title":"BalancedModel","text":"BalancedModel(; model=nothing, balancer1=balancer_model1, balancer2=balancer_model2, ...)\nBalancedModel(model; balancer1=balancer_model1, balancer2=balancer_model2, ...)","category":"page"},{"location":"models/BalancedModel_MLJBalancing/","page":"BalancedModel","title":"BalancedModel","text":"Given a classification model, and one or more balancer models that all implement the MLJModelInterface, BalancedModel wraps an arbitrary number of balancing models and a classifier together in a single sequential pipeline.","category":"page"},{"location":"models/BalancedModel_MLJBalancing/#Operation","page":"BalancedModel","title":"Operation","text":"","category":"section"},{"location":"models/BalancedModel_MLJBalancing/","page":"BalancedModel","title":"BalancedModel","text":"During training, data is first passed to balancer1 and the result is passed to balancer2 and so on; the result from the final balancer is then passed to the classifier for training.\nDuring prediction, the balancers have no effect.","category":"page"},{"location":"models/BalancedModel_MLJBalancing/#Arguments","page":"BalancedModel","title":"Arguments","text":"","category":"section"},{"location":"models/BalancedModel_MLJBalancing/","page":"BalancedModel","title":"BalancedModel","text":"model::Supervised: A classification model that implements the MLJModelInterface.\nbalancer1::Static=...: The first balancer model to pass the data to. This keyword argument can have any name.\nbalancer2::Static=...: The second balancer model to pass the data to. 
This keyword argument can have any name.\nand so on for an arbitrary number of balancers.","category":"page"},{"location":"models/BalancedModel_MLJBalancing/#Returns","page":"BalancedModel","title":"Returns","text":"","category":"section"},{"location":"models/BalancedModel_MLJBalancing/","page":"BalancedModel","title":"BalancedModel","text":"An instance of type ProbabilisticBalancedModel or DeterministicBalancedModel, depending on the prediction type of model.","category":"page"},{"location":"models/BalancedModel_MLJBalancing/#Example","page":"BalancedModel","title":"Example","text":"","category":"section"},{"location":"models/BalancedModel_MLJBalancing/","page":"BalancedModel","title":"BalancedModel","text":"using MLJ\nusing Imbalance\n\n## generate data\nX, y = Imbalance.generate_imbalanced_data(1000, 5; class_probs=[0.2, 0.3, 0.5])\n\n## prepare classification and balancing models\nSMOTENC = @load SMOTENC pkg=Imbalance verbosity=0\nTomekUndersampler = @load TomekUndersampler pkg=Imbalance verbosity=0\nLogisticClassifier = @load LogisticClassifier pkg=MLJLinearModels verbosity=0\n\noversampler = SMOTENC(k=5, ratios=1.0, rng=42)\nundersampler = TomekUndersampler(min_ratios=0.5, rng=42)\nlogistic_model = LogisticClassifier()\n\n## wrap them in a BalancedModel\nbalanced_model = BalancedModel(model=logistic_model, balancer1=oversampler, balancer2=undersampler)\n\n## now this behaves as a unified model that can be trained, validated, fine-tuned, etc.\nmach = machine(balanced_model, X, y)\nfit!(mach)","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/#ElasticNetRegressor_MLJLinearModels","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"","category":"section"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"ElasticNetRegressor","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"A model type for constructing an elastic net regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"ElasticNetRegressor = @load ElasticNetRegressor pkg=MLJLinearModels","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"Do model = ElasticNetRegressor() to construct an instance with default hyper-parameters.","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"Elastic net is a linear model with objective function","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"$","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"|Xθ - y|₂²/2 + n⋅λ|θ|₂²/2 + n⋅γ|θ|₁ $","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"where n is the number of 
observations.","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"If scale_penalty_with_samples = false the objective function is instead","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"$","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"|Xθ - y|₂²/2 + λ|θ|₂²/2 + γ|θ|₁ $","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":".","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"Different solver options exist, as indicated under \"Hyperparameters\" below. ","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/#Training-data","page":"ElasticNetRegressor","title":"Training data","text":"","category":"section"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"where:","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)\ny is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/#Hyperparameters","page":"ElasticNetRegressor","title":"Hyperparameters","text":"","category":"section"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"lambda::Real: strength of the L2 regularization. Default: 1.0\ngamma::Real: strength of the L1 regularization. Default: 0.0\nfit_intercept::Bool: whether to fit the intercept or not. Default: true\npenalize_intercept::Bool: whether to penalize the intercept. Default: false\nscale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true\nsolver::Union{Nothing, MLJLinearModels.Solver}: any instance of MLJLinearModels.ProxGrad.\nIf solver=nothing (default) then ProxGrad(accel=true) (FISTA) is used.\nSolver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...). 
Default: nothing","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/#Example","page":"ElasticNetRegressor","title":"Example","text":"","category":"section"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"using MLJ\nX, y = make_regression()\nmach = fit!(machine(ElasticNetRegressor(), X, y))\npredict(mach, X)\nfitted_params(mach)","category":"page"},{"location":"models/ElasticNetRegressor_MLJLinearModels/","page":"ElasticNetRegressor","title":"ElasticNetRegressor","text":"See also LassoRegressor.","category":"page"},{"location":"models/KMeans_Clustering/#KMeans_Clustering","page":"KMeans","title":"KMeans","text":"","category":"section"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"KMeans","category":"page"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"A model type for constructing a K-means clusterer, based on Clustering.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"KMeans = @load KMeans pkg=Clustering","category":"page"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"Do model = KMeans() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KMeans(k=...).","category":"page"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"K-means is a classical method for clustering or vector quantization. It produces a fixed number of clusters, each associated with a center (also known as a prototype), and each data point is assigned to a cluster with the nearest center.","category":"page"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"From a mathematical standpoint, K-means is a coordinate descent algorithm that solves the following optimization problem:","category":"page"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"$","category":"page"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"\\text{minimize} \\ \\sum_{i=1}^n \\| \\mathbf{x}_i - \\boldsymbol{\\mu}_{z_i} \\|^2 \\ \\text{w.r.t.} \\ (\\boldsymbol{\\mu}, z) $","category":"page"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"Here, μ_k is the center of the k-th cluster, and z_i is the index of the cluster to which the i-th point x_i is assigned.","category":"page"},{"location":"models/KMeans_Clustering/#Training-data","page":"KMeans","title":"Training data","text":"","category":"section"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"mach = machine(model, X)","category":"page"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"Here:","category":"page"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with 
schema(X).","category":"page"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/KMeans_Clustering/#Hyper-parameters","page":"KMeans","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"k=3: The number of centroids to use in clustering.\nmetric::SemiMetric=Distances.SqEuclidean: The metric used to calculate the clustering. Must have type PreMetric from Distances.jl.\ninit = :kmpp: One of the following options to indicate how cluster seeds should be initialized:\n:kmpp: KMeans++\n:kmenc: K-medoids initialization based on centrality\n:rand: random\nan instance of Clustering.SeedingAlgorithm from Clustering.jl\nan integer vector of length k that provides the indices of points to use as initial cluster centers.\nSee documentation of Clustering.jl.","category":"page"},{"location":"models/KMeans_Clustering/#Operations","page":"KMeans","title":"Operations","text":"","category":"section"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"predict(mach, Xnew): return cluster label assignments, given new features Xnew having the same Scitype as X above.\ntransform(mach, Xnew): instead return the mean pairwise distances from new samples to the cluster centers.","category":"page"},{"location":"models/KMeans_Clustering/#Fitted-parameters","page":"KMeans","title":"Fitted parameters","text":"","category":"section"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"centers: The coordinates of the cluster centers.","category":"page"},{"location":"models/KMeans_Clustering/#Report","page":"KMeans","title":"Report","text":"","category":"section"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"The fields of report(mach) are:","category":"page"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"assignments: The cluster assignments of each point in the training data.\ncluster_labels: The labels assigned to each cluster.","category":"page"},{"location":"models/KMeans_Clustering/#Examples","page":"KMeans","title":"Examples","text":"","category":"section"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"using MLJ\nKMeans = @load KMeans pkg=Clustering\n\ntable = load_iris()\ny, X = unpack(table, ==(:target), rng=123)\nmodel = KMeans(k=3)\nmach = machine(model, X) |> fit!\n\nyhat = predict(mach, X)\n@assert yhat == report(mach).assignments\n\ncompare = zip(yhat, y) |> collect;\ncompare[1:8] ## clusters align with classes\n\ncenter_dists = transform(mach, fitted_params(mach).centers')\n\n@assert center_dists[1][1] == 0.0\n@assert center_dists[2][2] == 0.0\n@assert center_dists[3][3] == 0.0","category":"page"},{"location":"models/KMeans_Clustering/","page":"KMeans","title":"KMeans","text":"See also 
KMedoids","category":"page"},{"location":"models/PassiveAggressiveClassifier_MLJScikitLearnInterface/#PassiveAggressiveClassifier_MLJScikitLearnInterface","page":"PassiveAggressiveClassifier","title":"PassiveAggressiveClassifier","text":"","category":"section"},{"location":"models/PassiveAggressiveClassifier_MLJScikitLearnInterface/","page":"PassiveAggressiveClassifier","title":"PassiveAggressiveClassifier","text":"PassiveAggressiveClassifier","category":"page"},{"location":"models/PassiveAggressiveClassifier_MLJScikitLearnInterface/","page":"PassiveAggressiveClassifier","title":"PassiveAggressiveClassifier","text":"A model type for constructing a passive aggressive classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/PassiveAggressiveClassifier_MLJScikitLearnInterface/","page":"PassiveAggressiveClassifier","title":"PassiveAggressiveClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/PassiveAggressiveClassifier_MLJScikitLearnInterface/","page":"PassiveAggressiveClassifier","title":"PassiveAggressiveClassifier","text":"PassiveAggressiveClassifier = @load PassiveAggressiveClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/PassiveAggressiveClassifier_MLJScikitLearnInterface/","page":"PassiveAggressiveClassifier","title":"PassiveAggressiveClassifier","text":"Do model = PassiveAggressiveClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in PassiveAggressiveClassifier(C=...).","category":"page"},{"location":"models/PassiveAggressiveClassifier_MLJScikitLearnInterface/#Hyper-parameters","page":"PassiveAggressiveClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/PassiveAggressiveClassifier_MLJScikitLearnInterface/","page":"PassiveAggressiveClassifier","title":"PassiveAggressiveClassifier","text":"C = 1.0\nfit_intercept = true\nmax_iter = 100\ntol = 0.001\nearly_stopping = false\nvalidation_fraction = 0.1\nn_iter_no_change = 5\nshuffle = true\nverbose = 0\nloss = hinge\nn_jobs = nothing\nrandom_state = 0\nwarm_start = false\nclass_weight = nothing\naverage = false","category":"page"},{"location":"tuning_models/#Tuning-Models","page":"Tuning Models","title":"Tuning Models","text":"","category":"section"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"MLJ provides several built-in and third-party options for optimizing a model's hyper-parameters. The quick-reference table below omits some advanced keyword options.","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"tuning strategy notes package to import package providing the core algorithm\nGrid(goal=nothing, resolution=10) shuffled by default; goal is upper bound for number of grid points MLJ.jl or MLJTuning.jl MLJTuning.jl\nRandomSearch(rng=GLOBAL_RNG) with customizable priors MLJ.jl or MLJTuning.jl MLJTuning.jl\nLatinHypercube(rng=GLOBAL_RNG) with discrete parameter support MLJ.jl or MLJTuning.jl LatinHypercubeSampling\nMLJTreeParzenTuning() See this example for usage TreeParzen.jl TreeParzen.jl (port to Julia of hyperopt)\nParticleSwarm(n_particles=3, rng=GLOBAL_RNG) Standard Kennedy-Eberhart algorithm, plus discrete parameter support MLJParticleSwarmOptimization.jl MLJParticleSwarmOptimization.jl\nAdaptiveParticleSwarm(n_particles=3, rng=GLOBAL_RNG) Zhan et al. 
variant with automated swarm coefficient updates, plus discrete parameter support MLJParticleSwarmOptimization.jl MLJParticleSwarmOptimization.jl\nExplicit() For an explicit list of models of varying type MLJ.jl or MLJTuning.jl MLJTuning.jl","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Below we illustrate hyperparameter optimization using the Grid, RandomSearch, LatinHypercube and Explicit tuning strategies.","category":"page"},{"location":"tuning_models/#Overview","page":"Tuning Models","title":"Overview","text":"","category":"section"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"In MLJ model tuning is implemented as a model wrapper. After wrapping a model in a tuning strategy and binding the wrapped model to data in a machine called mach, calling fit!(mach) instigates a search for optimal model hyperparameters, within a specified range, and then uses all supplied data to train the best model. To predict using that model, one then calls predict(mach, Xnew). In this way, the wrapped model may be viewed as a \"self-tuning\" version of the unwrapped model. That is, wrapping the model simply transforms certain hyper-parameters into learned parameters.","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"A corollary of the tuning-as-wrapper approach is that the evaluation of the performance of a TunedModel instance using evaluate! implies nested resampling. This approach is inspired by MLR. See also below.","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"In MLJ, tuning is an iterative procedure, with an iteration parameter n, the total number of model instances to be evaluated. Accordingly, tuning can be controlled using MLJ's IteratedModel wrapper. After familiarizing oneself with the TunedModel wrapper described below, see Controlling model tuning for more on this advanced feature.","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"For a more in-depth overview of tuning in MLJ, or for implementation details, see the MLJTuning documentation. For a complete list of options see the TunedModel doc-string below.","category":"page"},{"location":"tuning_models/#Tuning-a-single-hyperparameter-using-a-grid-search-(regression-example)","page":"Tuning Models","title":"Tuning a single hyperparameter using a grid search (regression example)","text":"","category":"section"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"using MLJ\nX = MLJ.table(rand(100, 10));\ny = 2X.x1 - X.x2 + 0.05*rand(100);\nTree = @load DecisionTreeRegressor pkg=DecisionTree verbosity=0;\ntree = Tree()","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Let's tune min_purity_increase in the model above, using a grid-search. 
To do so we will use the simplest range object, a one-dimensional range object constructed using the range method:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"r = range(tree, :min_purity_increase, lower=0.001, upper=1.0, scale=:log);\nself_tuning_tree = TunedModel(\n model=tree,\n resampling=CV(nfolds=3),\n tuning=Grid(resolution=10),\n range=r,\n measure=rms\n);","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Incidentally, a grid is generated internally \"over the range\" by calling the iterator method with an appropriate resolution:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"iterator(r, 5)","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Non-numeric hyperparameters are handled a little differently:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"selector = FeatureSelector();\nr2 = range(selector, :features, values = [[:x1,], [:x1, :x2]]);\niterator(r2)","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Unbounded ranges are also permitted. See the range and iterator docstrings below for details, and the sampler docstring for generating random samples from one-dimensional ranges (used internally by the RandomSearch strategy).","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Returning to the wrapped tree model:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"mach = machine(self_tuning_tree, X, y);\nfit!(mach, verbosity=0)","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"We can inspect the detailed results of the grid search with report(mach) or just retrieve the optimal model, as here:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"fitted_params(mach).best_model","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"For more detailed information, we can look at report(mach), for example:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"entry = report(mach).best_history_entry","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Predicting on new input observations using the optimal model, trained on all the data bound to mach:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Xnew = MLJ.table(rand(3, 10));\npredict(mach, Xnew)","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Or predicting on some subset of the observations bound to mach:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"test = 1:3\npredict(mach, rows=test)","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"For tuning using only a subset train of all observation indices, specify rows=train in the above fit! call. 
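A minimal sketch of that train/test workflow, assuming the X, y and mach = machine(self_tuning_tree, X, y) defined above (the 70/30 split fraction and rng seed are arbitrary choices for illustration):

train, test = partition(eachindex(y), 0.7, shuffle=true, rng=1234)  # train/test row indices
fit!(mach, rows=train, verbosity=0)  # tune and retrain the best model on the train rows only
predict(mach, rows=test)             # predictions for the held-out rows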
In that case, the above predict calls would be based on training the optimal model on all train rows.","category":"page"},{"location":"tuning_models/#A-probabilistic-classifier-example","page":"Tuning Models","title":"A probabilistic classifier example","text":"","category":"section"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Tuning a classifier is not essentially different from tuning a regressor. A common gotcha however is to overlook the distinction between supervised models that make point predictions (subtypes of Deterministic) and those that make probabilistic predictions (subtypes of Probabilistic). The DecisionTreeRegressor model in the preceding illustration was deterministic, so this example will consider a probabilistic classifier:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"info(\"KNNClassifier\").prediction_type","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"X, y = @load_iris\nKNN = @load KNNClassifier verbosity=0\nknn = KNN()","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"We'll tune the hyperparameter K in the model above, using a grid-search once more:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"K_range = range(knn, :K, lower=5, upper=20);","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Since the model is probabilistic, we can choose either: (i) a probabilistic measure, such as brier_loss; or (ii) use a deterministic measure, such as misclassification_rate (which means predict_mean is called instead of predict under the hood).","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Case (i) - probabilistic measure:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"self_tuning_knn = TunedModel(\n model=knn,\n resampling = CV(nfolds=4, rng=1234),\n tuning = Grid(resolution=5),\n range = K_range,\n measure = BrierLoss()\n);\n\nmach = machine(self_tuning_knn, X, y);\nfit!(mach, verbosity=0);","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Case (ii) - deterministic measure:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"self_tuning_knn = TunedModel(\n model=knn,\n resampling = CV(nfolds=4, rng=1234),\n tuning = Grid(resolution=5),\n range = K_range,\n measure = MisclassificationRate()\n)\n\nmach = machine(self_tuning_knn, X, y);\nfit!(mach, verbosity=0);","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Let's inspect the best model and corresponding evaluation of the metric in case (ii):","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"entry = report(mach).best_history_entry","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"entry.model.K","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Recall that fitting mach also retrains the optimal model on all available data. 
The following is therefore an optimal model prediction based on all available data:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"predict(mach, rows=148:150)","category":"page"},{"location":"tuning_models/#Specifying-a-custom-measure","page":"Tuning Models","title":"Specifying a custom measure","text":"","category":"section"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Users may specify a custom loss or scoring function, so long as it complies with the StatisticalMeasuresBase.jl API and implements the appropriate orientation trait (Score() or Loss()) from that package. For example, suppose we define a \"new\" scoring function custom_accuracy by","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"custom_accuracy(yhat, y) = mean(y .== yhat); # yhat - prediction, y - ground truth","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"In tuning, scores are maximised, while losses are minimised. So here we declare","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"import StatisticalMeasuresBase as SMB\nSMB.orientation(::typeof(custom_accuracy)) = SMB.Score()","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"For full details on constructing custom measures, see StatisticalMeasuresBase.jl.","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"self_tuning_knn = TunedModel(\n model=knn,\n resampling = CV(nfolds=4),\n tuning = Grid(resolution=5),\n range = K_range,\n measure = [custom_accuracy, MulticlassFScore()],\n operation = predict_mode\n);\n\nmach = machine(self_tuning_knn, X, y)\nfit!(mach, verbosity=0)\nentry = report(mach).best_history_entry","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"entry.model.K","category":"page"},{"location":"tuning_models/#Tuning-multiple-nested-hyperparameters","page":"Tuning Models","title":"Tuning multiple nested hyperparameters","text":"","category":"section"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"The forest model below has another model, namely a DecisionTreeRegressor, as a hyperparameter:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"tree = Tree() # defined above\nforest = EnsembleModel(model=tree)","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Ranges for nested hyperparameters are specified using dot syntax. 
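For example (a hedged illustration only; max_depth is a DecisionTreeRegressor hyper-parameter that is not tuned in the grid below), a range over the atomic tree's maximum depth would be written

r_depth = range(forest, :(model.max_depth), lower=1, upper=5)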
In this case, we will specify a goal for the total number of grid points:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"r1 = range(forest, :(model.n_subfeatures), lower=1, upper=9);\nr2 = range(forest, :bagging_fraction, lower=0.4, upper=1.0);\nself_tuning_forest = TunedModel(\n model=forest,\n tuning=Grid(goal=30),\n resampling=CV(nfolds=6),\n range=[r1, r2],\n measure=rms);\n\nX = MLJ.table(rand(100, 10));\ny = 2X.x1 - X.x2 + 0.05*rand(100);\n\nmach = machine(self_tuning_forest, X, y);\nfit!(mach, verbosity=0);","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"We can plot the grid search results:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"using Plots\nplot(mach)","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"(Image: )","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Instead of specifying a goal, we can declare a global resolution, which is overridden for a particular parameter by pairing its range with the resolution desired. In the next example, the default resolution=100 is applied to the r2 field, but a resolution of 3 is applied to the r1 field. Additionally, we ask that the grid points be randomly traversed and the total number of evaluations be limited to 25.","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"tuning = Grid(resolution=100, shuffle=true, rng=1234)\nself_tuning_forest = TunedModel(\n model=forest,\n tuning=tuning,\n resampling=CV(nfolds=6),\n range=[(r1, 3), r2],\n measure=rms,\n n=25\n);\nfit!(machine(self_tuning_forest, X, y), verbosity=0);","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"For more options for a grid search, see Grid below.","category":"page"},{"location":"tuning_models/#Tuning-using-a-random-search","page":"Tuning Models","title":"Tuning using a random search","text":"","category":"section"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Let's attempt to tune the same hyperparameters using a RandomSearch tuning strategy. By default, bounded numeric ranges like r1 and r2 are sampled uniformly (before rounding, in the case of the integer range r1). Positive unbounded ranges are sampled using a Gamma distribution by default, and all others using a (truncated) normal distribution.","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"self_tuning_forest = TunedModel(\n model=forest,\n tuning=RandomSearch(),\n resampling=CV(nfolds=6),\n range=[r1, r2],\n measure=rms,\n n=25\n);\nX = MLJ.table(rand(100, 10));\ny = 2X.x1 - X.x2 + 0.05*rand(100);\nmach = machine(self_tuning_forest, X, y);\nfit!(mach, verbosity=0)","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"using Plots\nplot(mach)","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"(Image: )","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"The prior distributions used for sampling each hyperparameter can be customized, as can the global fallbacks. 
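For instance, a bounded numeric range can be paired with a Distributions.jl instance, which is then truncated to the range's bounds; here is a sketch re-using forest, r1 and r2 from above (the Normal parameters are arbitrary):

using Distributions
self_tuning_forest = TunedModel(
    model=forest,
    tuning=RandomSearch(),
    resampling=CV(nfolds=6),
    range=[r1, (r2, Normal(0.7, 0.1))],  # sample bagging_fraction from a truncated Normal
    measure=rms,
    n=25
);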
See the RandomSearch doc-string below for details.","category":"page"},{"location":"tuning_models/#Tuning-using-Latin-hypercube-sampling","page":"Tuning Models","title":"Tuning using Latin hypercube sampling","text":"","category":"section"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"One can also tune the hyperparameters using the LatinHypercube tuning strategy. This method uses a genetic-based optimization algorithm based on the inverse of the Audze-Eglais function, using the library LatinHypercubeSampling.jl.","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"We'll work with the data X, y and ranges r1 and r2 defined above and instantiate a Latin hypercube resampling strategy:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"latin = LatinHypercube(gens=2, popsize=120)","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Here gens is the number of generations to run the optimisation for and popsize is the population size in the genetic algorithm. For more on these and other LatinHypercube parameters refer to the LatinHypercubeSampling.jl documentation. Pay attention that gens and popsize are not to be confused with the iteration parameter n in the construction of a corresponding TunedModel instance, which specifies the total number of models to be evaluated, independent of the tuning strategy.","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"For this illustration we'll add a third, nominal, hyper-parameter:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"r3 = range(forest, :(model.post_prune), values=[true, false]);\nself_tuning_forest = TunedModel(\n model=forest,\n tuning=latin,\n resampling=CV(nfolds=6),\n range=[r1, r2, r3],\n measure=rms,\n n=25\n);\nmach = machine(self_tuning_forest, X, y);\nfit!(mach, verbosity=0)","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"using Plots\nplot(mach)","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"(Image: )","category":"page"},{"location":"tuning_models/#explicit","page":"Tuning Models","title":"Comparing models of different type and nested cross-validation","text":"","category":"section"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Instead of mutating hyperparameters of a fixed model, one can instead optimise over an explicit list of models, whose types are allowed to vary. 
As with other tuning strategies, evaluating the resulting TunedModel itself implies nested resampling (e.g., nested cross-validation) which we now examine in a bit more detail.","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"tree = (@load DecisionTreeClassifier pkg=DecisionTree verbosity=0)()\nknn = (@load KNNClassifier pkg=NearestNeighborModels verbosity=0)()\nmodels = [tree, knn]\nnothing # hide","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"The following model is equivalent to the best in models by using 3-fold cross-validation:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"multi_model = TunedModel(\n models=models,\n resampling=CV(nfolds=3),\n measure=log_loss,\n check_measure=false\n)\nnothing # hide","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Note that there is no need to specify a tuning strategy or range but we do specify models (plural) instead of model. Evaluating multi_model implies nested cross-validation (each model gets evaluated 2 x 3 times):","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"X, y = make_blobs()\n\ne = evaluate(multi_model, X, y, resampling=CV(nfolds=2), measure=log_loss, verbosity=6)","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"Now, for example, we can get the best model for the first fold out of the two folds:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"e.report_per_fold[1].best_model","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"And the losses in the outer loop (these still have to be matched to the best performing model):","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"e.per_fold","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"It is also possible to get the results for the nested evaluations. For example, for the first fold of the outer loop and the second model:","category":"page"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"e.report_per_fold[2].history[1]","category":"page"},{"location":"tuning_models/#Reference","page":"Tuning Models","title":"Reference","text":"","category":"section"},{"location":"tuning_models/","page":"Tuning Models","title":"Tuning Models","text":"MLJBase.range\nMLJBase.iterator\nMLJBase.sampler\nDistributions.fit(::Type{D}, ::MLJBase.NumericRange) where D<:Distributions.Distribution\nMLJTuning.TunedModel\nMLJTuning.Grid\nMLJTuning.RandomSearch\nMLJTuning.LatinHypercube","category":"page"},{"location":"tuning_models/#Base.range","page":"Tuning Models","title":"Base.range","text":"r = range(model, :hyper; values=nothing)\n\nDefine a one-dimensional NominalRange object for a field hyper of model. Note that r is not directly iterable but iterator(r) is.\n\nA nested hyperparameter is specified using dot notation. For example, :(atom.max_depth) specifies the max_depth hyperparameter of the submodel model.atom.\n\nr = range(model, :hyper; upper=nothing, lower=nothing,\n scale=nothing, values=nothing)\n\nAssuming values is not specified, define a one-dimensional NumericRange object for a Real field hyper of model. 
Note that r is not directly iterable but iterator(r, n) is an iterator of length n. To generate random elements from r, instead apply rand methods to sampler(r). The supported scales are :linear, :log, :logminus, :log10, :log10minus, :log2, or a callable object.\n\nBy default, the behaviour of the constructed object depends on the type of the value of the hyperparameter :hyper at model at the time of construction. To override this behaviour (for instance if model is not available) specify a type in place of model so the behaviour is determined by the value of the specified type.\n\nA nested hyperparameter is specified using dot notation (see above).\n\nIf scale is unspecified, it is set to :linear, :log, :log10minus, or :linear, according to whether the interval (lower, upper) is bounded, right-unbounded, left-unbounded, or doubly unbounded, respectively. Note upper=Inf and lower=-Inf are allowed.\n\nIf values is specified, the other keyword arguments are ignored and a NominalRange object is returned (see above).\n\nSee also: iterator, sampler\n\n\n\n\n\n","category":"function"},{"location":"tuning_models/#MLJBase.iterator","page":"Tuning Models","title":"MLJBase.iterator","text":"iterator([rng, ], r::NominalRange, [,n])\niterator([rng, ], r::NumericRange, n)\n\nReturn an iterator (currently a vector) for a ParamRange object r. In the first case iteration is over all values stored in the range (or just the first n, if n is specified). In the second case, the iteration is over approximately n ordered values, generated as follows:\n\n(i) First, exactly n values are generated between U and L, with a spacing determined by r.scale (uniform if scale=:linear) where U and L are given by the following table:\n\nr.lower r.upper L U\nfinite finite r.lower r.upper\n-Inf finite r.upper - 2r.unit r.upper\nfinite Inf r.lower r.lower + 2r.unit\n-Inf Inf r.origin - r.unit r.origin + r.unit\n\n(ii) If a callable f is provided as scale, then a uniform spacing is always applied in (i) but f is broadcast over the results. (Unlike ordinary scales, this alters the effective range of values generated, instead of just altering the spacing.)\n\n(iii) If r is a discrete numeric range (r isa NumericRange{<:Integer}) then the values are additionally rounded, with any duplicate values removed. Otherwise all the values are used (and there are exactly n of them).\n\n(iv) Finally, if a random number generator rng is specified, then the values are returned in random order (sampling without replacement), and otherwise they are returned in numeric order, or in the order provided to the range constructor, in the case of a NominalRange.\n\n\n\n\n\n","category":"function"},{"location":"tuning_models/#Distributions.sampler","page":"Tuning Models","title":"Distributions.sampler","text":"sampler(r::NominalRange, probs::AbstractVector{<:Real})\nsampler(r::NominalRange)\nsampler(r::NumericRange{T}, d)\n\nConstruct an object s which can be used to generate random samples from a ParamRange object r (a one-dimensional range) using one of the following calls:\n\nrand(s) # for one sample\nrand(s, n) # for n samples\nrand(rng, s [, n]) # to specify an RNG\n\nThe argument probs can be any probability vector with the same length as r.values. 
The second sampler method above calls the first with a uniform probs vector.\n\nThe argument d can be either an arbitrary instance of UnivariateDistribution from the Distributions.jl package, or one of the Distributions.jl types for which fit(d, ::NumericRange) is defined. These include: Arcsine, Uniform, Biweight, Cosine, Epanechnikov, SymTriangularDist, Triweight, Normal, Gamma, InverseGaussian, Logistic, LogNormal, Cauchy, Gumbel, Laplace, and Poisson; but see the doc-string for Distributions.fit for an up-to-date list.\n\nIf d is an instance, then sampling is from a truncated form of the supplied distribution d, the truncation bounds being r.lower and r.upper (the attributes r.origin and r.unit are ignored). For discrete numeric ranges (T <: Integer) the samples are rounded.\n\nIf d is a type then a suitably truncated distribution is automatically generated using Distributions.fit(d, r).\n\nImportant. Values are generated with no regard to r.scale, except in the special case r.scale is a callable object f. In that case, f is applied to all values generated by rand as described above (prior to rounding, in the case of discrete numeric ranges).\n\nExamples\n\njulia> r = range(Char, :letter, values=collect(\"abc\"))\njulia> s = sampler(r, [0.1, 0.2, 0.7])\njulia> samples = rand(s, 1000);\njulia> StatsBase.countmap(samples)\nDict{Char,Int64} with 3 entries:\n 'a' => 107\n 'b' => 205\n 'c' => 688\n\njulia> r = range(Int, :k, lower=2, upper=6) # numeric but discrete\njulia> s = sampler(r, Normal)\njulia> samples = rand(s, 1000);\njulia> UnicodePlots.histogram(samples)\n ┌ ┐\n[2.0, 2.5) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 119\n[2.5, 3.0) ┤ 0\n[3.0, 3.5) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 296\n[3.5, 4.0) ┤ 0\n[4.0, 4.5) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 275\n[4.5, 5.0) ┤ 0\n[5.0, 5.5) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 221\n[5.5, 6.0) ┤ 0\n[6.0, 6.5) ┤▇▇▇▇▇▇▇▇▇▇▇ 89\n └ ┘\n\n\n\n\n\n","category":"function"},{"location":"tuning_models/#StatsAPI.fit-Union{Tuple{D}, Tuple{Type{D}, NumericRange}} where D<:Distributions.Distribution","page":"Tuning Models","title":"StatsAPI.fit","text":"Distributions.fit(D, r::MLJBase.NumericRange)\n\nFit and return a distribution d of type D to the one-dimensional range r.\n\nOnly types D in the table below are supported.\n\nThe distribution d is constructed in two stages. First, a distribution d0, characterized by the conditions in the second column of the table, is fit to r. 
Then d0 is truncated between r.lower and r.upper to obtain d.\n\nDistribution type D Characterization of d0\nArcsine, Uniform, Biweight, Cosine, Epanechnikov, SymTriangularDist, Triweight minimum(d) = r.lower, maximum(d) = r.upper\nNormal, Gamma, InverseGaussian, Logistic, LogNormal mean(d) = r.origin, std(d) = r.unit\nCauchy, Gumbel, Laplace, (Normal) Dist.location(d) = r.origin, Dist.scale(d) = r.unit\nPoisson Dist.mean(d) = r.unit\n\nHere Dist = Distributions.\n\n\n\n\n\n","category":"method"},{"location":"tuning_models/#MLJTuning.TunedModel","page":"Tuning Models","title":"MLJTuning.TunedModel","text":"tuned_model = TunedModel(; model=,\n tuning=RandomSearch(),\n resampling=Holdout(),\n range=nothing,\n measure=nothing,\n n=default_n(tuning, range),\n operation=nothing,\n other_options...)\n\nConstruct a model wrapper for hyper-parameter optimization of a supervised learner, specifying the tuning strategy and model whose hyper-parameters are to be mutated.\n\ntuned_model = TunedModel(; models=,\n resampling=Holdout(),\n measure=nothing,\n n=length(models),\n operation=nothing,\n other_options...)\n\nConstruct a wrapper for multiple models, for selection of an optimal one (equivalent to specifying tuning=Explicit() and range=models above). Elements of the iterator models need not have a common type, but they must all be Deterministic or all be Probabilistic and this is not checked but inferred from the first element generated.\n\nSee below for a complete list of options.\n\nTraining\n\nCalling fit!(mach) on a machine mach=machine(tuned_model, X, y) or mach=machine(tuned_model, X, y, w) will:\n\nInstigate a search, over clones of model, with the hyperparameter mutations specified by range, for a model optimizing the specified measure, using performance evaluations carried out using the specified tuning strategy and resampling strategy. In the case models is explicitly listed, the search is instead over the models generated by the iterator models.\nFit an internal machine, based on the optimal model fitted_params(mach).best_model, wrapping the optimal model object in all the provided data X, y(, w). Calling predict(mach, Xnew) then returns predictions on Xnew of this internal machine. The final train can be suppressed by setting train_best=false.\n\nSearch space\n\nThe range objects supported depend on the tuning strategy specified. Query the strategy docstring for details. To optimize over an explicit list v of models of the same type, use strategy=Explicit() and specify model=v[1] and range=v.\n\nThe number of models searched is specified by n. If unspecified, then MLJTuning.default_n(tuning, range) is used. When n is increased and fit!(mach) called again, the old search history is re-instated and the search continues where it left off.\n\nMeasures (metrics)\n\nIf more than one measure is specified, then only the first is optimized (unless strategy is multi-objective) but the performance against every measure specified will be computed and reported in report(mach).best_performance and other relevant attributes of the generated report. Options exist to pass per-observation weights or class weights to measures; see below.\n\nImportant. If a custom measure, my_measure is used, and the measure is a score, rather than a loss, be sure to check that MLJ.orientation(my_measure) == :score to ensure maximization of the measure, rather than minimization. 
Override an incorrect value with MLJ.orientation(::typeof(my_measure)) = :score.\n\nAccessing the fitted parameters and other training (tuning) outcomes\n\nA Plots.jl plot of performance estimates is returned by plot(mach) or heatmap(mach).\n\nOnce a tuning machine mach has been trained as above, then fitted_params(mach) has these keys/values:\n\nkey value\nbest_model optimal model instance\nbest_fitted_params learned parameters of the optimal model\n\nThe named tuple report(mach) includes these keys/values:\n\nkey value\nbest_model optimal model instance\nbest_history_entry corresponding entry in the history, including performance estimate\nbest_report report generated by fitting the optimal model to all data\nhistory tuning strategy-specific history of all evaluations\n\nplus other key/value pairs specific to the tuning strategy.\n\nEach element of history is a property-accessible object with these properties:\n\nkey value\nmeasure vector of measures (metrics)\nmeasurement vector of measurements, one per measure\nper_fold vector of vectors of unaggregated per-fold measurements\nevaluation full PerformanceEvaluation/CompactPerformanceEvaluation object\n\nComplete list of key-word options\n\nmodel: Supervised model prototype that is cloned and mutated to generate models for evaluation\nmodels: Alternatively, an iterator of MLJ models to be explicitly evaluated. These may have varying types.\ntuning=RandomSearch(): tuning strategy to be applied (eg, Grid()). See the Tuning Models section of the MLJ manual for a complete list of options.\nresampling=Holdout(): resampling strategy (eg, Holdout(), CV(), StratifiedCV()) to be applied in performance evaluations\nmeasure: measure or measures to be applied in performance evaluations; only the first used in optimization (unless the strategy is multi-objective) but all reported to the history\nweights: per-observation weights to be passed to the measure(s) in performance evaluations, where supported. Check support with supports_weights(measure).\nclass_weights: class weights to be passed to the measure(s) in performance evaluations, where supported. Check support with supports_class_weights(measure).\nrepeats=1: for generating train/test sets multiple times in resampling (\"Monte Carlo\" resampling); see evaluate! for details\noperation/operations - One of predict, predict_mean, predict_mode, predict_median, or predict_joint, or a vector of these of the same length as measure/measures. Automatically inferred if left unspecified.\nrange: range object; tuning strategy documentation describes supported types\nselection_heuristic: the rule determining how the best model is decided. According to the default heuristic, NaiveSelection(), measure (or the first element of measure) is evaluated for each resample and these per-fold measurements are aggregated. The model with the lowest (resp. highest) aggregate is chosen if the measure is a :loss (resp. a :score).\nn: number of iterations (ie, models to be evaluated); set by tuning strategy if left unspecified\ntrain_best=true: whether to train the optimal model\nacceleration=default_resource(): mode of parallelization for tuning strategies that support this\nacceleration_resampling=CPU1(): mode of parallelization for resampling\ncheck_measure=true: whether to check measure is compatible with the specified model and operation\ncache=true: whether to cache model-specific representations of user-supplied data; set to false to conserve memory. 
Speed gains likely limited to the case resampling isa Holdout.\ncompact_history=true: whether to write CompactPerformanceEvaluation or regular PerformanceEvaluation objects to the history (accessed via the :evaluation key); the compact form excludes some fields to conserve memory.\n\n\n\n\n\n","category":"function"},{"location":"tuning_models/#MLJTuning.Grid","page":"Tuning Models","title":"MLJTuning.Grid","text":"Grid(goal=nothing, resolution=10, rng=Random.GLOBAL_RNG, shuffle=true)\n\nInstantiate a Cartesian grid-based hyperparameter tuning strategy with a specified number of grid points as goal, or using a specified default resolution in each numeric dimension.\n\nSupported ranges:\n\nA single one-dimensional range or vector of one-dimensional ranges can be specified. Specifically, in Grid search, the range field of a TunedModel instance can be:\n\nA single one-dimensional range - ie, ParamRange object - r, or a pair of the form (r, res) where res specifies a resolution to override the default resolution.\nAny vector of objects of the above form\n\nTwo elements of a range vector may share the same field attribute, with the effect that their grids are combined, as in Example 3 below.\n\nParamRange objects are constructed using the range method.\n\nExample 1:\n\nrange(model, :hyper1, lower=1, origin=2, unit=1)\n\nExample 2:\n\n[(range(model, :hyper1, lower=1, upper=10), 15),\n range(model, :hyper2, lower=2, upper=4),\n range(model, :hyper3, values=[:ball, :tree])]\n\nExample 3:\n\n# a range generating the grid `[1, 2, 10, 20, 30]` for `:hyper1`:\n[range(model, :hyper1, values=[1, 2]),\n (range(model, :hyper1, lower=10, upper=30), 3)]\n\nNote: All the field values of the ParamRange objects (:hyper1, :hyper2, :hyper3 in the preceding example) must refer to field names of a single model (the model specified during TunedModel construction).\n\nAlgorithm\n\nThis is a standard grid search with the following specifics: In all cases all values of each specified NominalRange are exhausted. If goal is specified, then all resolutions are ignored, and a global resolution is applied to the NumericRange objects that maximizes the number of grid points, subject to the restriction that this not exceed goal. (This assumes no field appears twice in the range vector.) Otherwise the default resolution and any parameter-specific resolutions apply.\n\nIn all cases the models generated are shuffled using rng, unless shuffle=false.\n\nSee also TunedModel, range.\n\n\n\n\n\n","category":"type"},{"location":"tuning_models/#MLJTuning.RandomSearch","page":"Tuning Models","title":"MLJTuning.RandomSearch","text":"RandomSearch(bounded=Distributions.Uniform,\n positive_unbounded=Distributions.Gamma,\n other=Distributions.Normal,\n rng=Random.GLOBAL_RNG)\n\nInstantiate a random search tuning strategy, for searching over Cartesian hyperparameter domains, with customizable priors in each dimension.\n\nSupported ranges\n\nA single one-dimensional range or vector of one-dimensional ranges can be specified. If not paired with a prior, then one is fitted, according to fallback distribution types specified by the tuning strategy hyperparameters. 
Specifically, in RandomSearch, the range field of a TunedModel instance can be:\n\na single one-dimensional range (ParamRange object) r\na pair of the form (r, d), with r as above and where d is:\na probability vector of the same length as r.values (r a NominalRange)\nany Distributions.UnivariateDistribution instance (r a NumericRange)\none of the subtypes of Distributions.UnivariateDistribution listed in the table below, for automatic fitting using Distributions.fit(d, r), a distribution whose support always lies between r.lower and r.upper (r a NumericRange)\nany pair of the form (field, s), where field is the (possibly nested) name of a field of the model to be tuned, and s an arbitrary sampler object for that field. This means only that rand(rng, s) is defined and returns valid values for the field.\nany vector of objects of the above form\n\nA range vector may contain multiple entries for the same model field, as in range = [(:lambda, s1), (:alpha, s), (:lambda, s2)]. In that case the entry used in each iteration is random.\n\ndistribution types for fitting to ranges of this type\nArcsine, Uniform, Biweight, Cosine, Epanechnikov, SymTriangularDist, Triweight bounded\nGamma, InverseGaussian, Poisson positive (bounded or unbounded)\nNormal, Logistic, LogNormal, Cauchy, Gumbel, Laplace any\n\nParamRange objects are constructed using the range method.\n\nExamples\n\nusing Distributions\n\nrange1 = range(model, :hyper1, lower=0, upper=1)\n\nrange2 = [(range(model, :hyper1, lower=1, upper=10), Arcsine),\n range(model, :hyper2, lower=2, upper=Inf, unit=1, origin=3),\n (range(model, :hyper2, lower=2, upper=4), Normal(0, 3)),\n (range(model, :hyper3, values=[:ball, :tree]), [0.3, 0.7])]\n\n# uniform sampling of :(atom.λ) from [0, 1] without defining a NumericRange:\nstruct MySampler end\nBase.rand(rng::Random.AbstractRNG, ::MySampler) = rand(rng)\nrange3 = (:(atom.λ), MySampler())\n\nAlgorithm\n\nIn each iteration, a model is generated for evaluation by mutating the fields of a deep copy of model. The range vector is shuffled and the fields sampled according to the new order (repeated fields being mutated more than once). For a range entry of the form (field, s) the algorithm calls rand(rng, s) and mutates the field field of the model clone to have this value. For an entry of the form (r, d), s is substituted with sampler(r, d). If no d is specified, then sampling is uniform (with replacement) if r is a NominalRange, and is otherwise given by the defaults specified by the tuning strategy parameters bounded, positive_unbounded, and other, depending on the field values of the NumericRange object r.\n\nSee also TunedModel, range, sampler.\n\n\n\n\n\n","category":"type"},{"location":"tuning_models/#MLJTuning.LatinHypercube","page":"Tuning Models","title":"MLJTuning.LatinHypercube","text":"LatinHypercube(gens = 1,\n popsize = 100,\n ntour = 2,\n ptour = 0.8.,\n interSampleWeight = 1.0,\n ae_power = 2,\n periodic_ae = false,\n rng=Random.GLOBAL_RNG)\n\nInstantiate grid-based hyperparameter tuning strategy using the library LatinHypercubeSampling.jl.\n\nAn optimised Latin Hypercube sampling plan is created using a genetic based optimization algorithm based on the inverse of the Audze-Eglais function. 
The optimization is run for gens generations and creates n models for evaluation, where n is specified by a corresponding TunedModel instance, as in\n\ntuned_model = TunedModel(model=...,\n tuning=LatinHypercube(...),\n range=...,\n measures=...,\n n=...)\n\n(See TunedModel for complete options.)\n\nTo use a periodic version of the Audze-Eglais function (to reduce clustering along the boundaries) specify periodic_ae = true.\n\nSupported ranges:\n\nA single one-dimensional range or vector of one-dimensional ranges can be specified. Specifically, in LatinHypercubeSampling search, the range field of a TunedModel instance can be:\n\nA single one-dimensional range - ie, ParamRange object - r, constructed using the range method.\nAny vector of objects of the above form\n\nBoth NumericRanges and NominalRanges are supported, and hyper-parameter values are sampled on a scale specified by the range (eg, r.scale = :log).\n\n\n\n\n\n","category":"type"},{"location":"models/DummyClassifier_MLJScikitLearnInterface/#DummyClassifier_MLJScikitLearnInterface","page":"DummyClassifier","title":"DummyClassifier","text":"","category":"section"},{"location":"models/DummyClassifier_MLJScikitLearnInterface/","page":"DummyClassifier","title":"DummyClassifier","text":"DummyClassifier","category":"page"},{"location":"models/DummyClassifier_MLJScikitLearnInterface/","page":"DummyClassifier","title":"DummyClassifier","text":"A model type for constructing a dummy classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/DummyClassifier_MLJScikitLearnInterface/","page":"DummyClassifier","title":"DummyClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/DummyClassifier_MLJScikitLearnInterface/","page":"DummyClassifier","title":"DummyClassifier","text":"DummyClassifier = @load DummyClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/DummyClassifier_MLJScikitLearnInterface/","page":"DummyClassifier","title":"DummyClassifier","text":"Do model = DummyClassifier() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in DummyClassifier(strategy=...).","category":"page"},{"location":"models/DummyClassifier_MLJScikitLearnInterface/","page":"DummyClassifier","title":"DummyClassifier","text":"DummyClassifier is a classifier that makes predictions using simple rules.","category":"page"},{"location":"models/StableForestRegressor_SIRUS/#StableForestRegressor_SIRUS","page":"StableForestRegressor","title":"StableForestRegressor","text":"","category":"section"},{"location":"models/StableForestRegressor_SIRUS/","page":"StableForestRegressor","title":"StableForestRegressor","text":"StableForestRegressor","category":"page"},{"location":"models/StableForestRegressor_SIRUS/","page":"StableForestRegressor","title":"StableForestRegressor","text":"A model type for constructing a stable forest regressor, based on SIRUS.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/StableForestRegressor_SIRUS/","page":"StableForestRegressor","title":"StableForestRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/StableForestRegressor_SIRUS/","page":"StableForestRegressor","title":"StableForestRegressor","text":"StableForestRegressor = @load StableForestRegressor pkg=SIRUS","category":"page"},{"location":"models/StableForestRegressor_SIRUS/","page":"StableForestRegressor","title":"StableForestRegressor","text":"Do model = StableForestRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in StableForestRegressor(rng=...).","category":"page"},{"location":"models/StableForestRegressor_SIRUS/","page":"StableForestRegressor","title":"StableForestRegressor","text":"StableForestRegressor implements the random forest regressor with a stabilized forest structure (Bénard et al., 2021).","category":"page"},{"location":"models/StableForestRegressor_SIRUS/#Training-data","page":"StableForestRegressor","title":"Training data","text":"","category":"section"},{"location":"models/StableForestRegressor_SIRUS/","page":"StableForestRegressor","title":"StableForestRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/StableForestRegressor_SIRUS/","page":"StableForestRegressor","title":"StableForestRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/StableForestRegressor_SIRUS/","page":"StableForestRegressor","title":"StableForestRegressor","text":"where","category":"page"},{"location":"models/StableForestRegressor_SIRUS/","page":"StableForestRegressor","title":"StableForestRegressor","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)\ny: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)","category":"page"},{"location":"models/StableForestRegressor_SIRUS/","page":"StableForestRegressor","title":"StableForestRegressor","text":"Train the machine with fit!(mach, rows=...).","category":"page"},{"location":"models/StableForestRegressor_SIRUS/#Hyperparameters","page":"StableForestRegressor","title":"Hyperparameters","text":"","category":"section"},{"location":"models/StableForestRegressor_SIRUS/","page":"StableForestRegressor","title":"StableForestRegressor","text":"rng::AbstractRNG=default_rng(): Random number 
generator. Using a StableRNG from StableRNGs.jl is advised.\npartial_sampling::Float64=0.7: Ratio of samples to use in each subset of the data. The default should be fine for most cases.\nn_trees::Int=1000: The number of trees to use. It is advisable to use at least a thousand trees for better rule selection and, in turn, better predictive performance.\nmax_depth::Int=2: The depth of the tree. A lower depth decreases model complexity and can therefore improve accuracy when the sample size is small (reducing overfitting).\nq::Int=10: Number of cutpoints to use per feature. The default value should be fine for most situations.\nmin_data_in_leaf::Int=5: Minimum number of data points per leaf.","category":"page"},{"location":"models/StableForestRegressor_SIRUS/#Fitted-parameters","page":"StableForestRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/StableForestRegressor_SIRUS/","page":"StableForestRegressor","title":"StableForestRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/StableForestRegressor_SIRUS/","page":"StableForestRegressor","title":"StableForestRegressor","text":"fitresult: A StableForest object.","category":"page"},{"location":"models/StableForestRegressor_SIRUS/#Operations","page":"StableForestRegressor","title":"Operations","text":"","category":"section"},{"location":"models/StableForestRegressor_SIRUS/","page":"StableForestRegressor","title":"StableForestRegressor","text":"predict(mach, Xnew): Return a vector of predictions for each row of Xnew.","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/#ContinuousEncoder_MLJModels","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"","category":"section"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"ContinuousEncoder","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"A model type for constructing a continuous encoder, based on MLJModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"ContinuousEncoder = @load ContinuousEncoder pkg=MLJModels","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"Do model = ContinuousEncoder() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in ContinuousEncoder(drop_last=...).","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"Use this model to arrange all features (columns) of a table to have Continuous element scitype, by applying the following protocol to each feature ftr:","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"If ftr is already Continuous retain it.\nIf ftr is Multiclass, one-hot encode it.\nIf ftr is OrderedFactor, replace it with coerce(ftr, Continuous) (vector of floating point integers), unless ordered_factors=false is specified, in which case one-hot encode it.\nIf ftr is Count, replace it with coerce(ftr, Continuous).\nIf ftr has some other element scitype, or was not observed in fitting the encoder, drop it from the table.","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"Warning: This transformer assumes that levels(col) for any Multiclass or OrderedFactor column, col, is the same for training data and new data to be transformed.","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"To selectively one-hot-encode categorical features (without dropping columns) use OneHotEncoder instead.","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/#Training-data","page":"ContinuousEncoder","title":"Training data","text":"","category":"section"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"mach = machine(model, X)","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"where","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"X: any Tables.jl compatible table. Columns can be of mixed type but only those with element scitype Multiclass or OrderedFactor can be encoded. Check column scitypes with schema(X).","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/#Hyper-parameters","page":"ContinuousEncoder","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"drop_last=true: whether to drop the column corresponding to the final class of one-hot encoded features. 
For example, a three-class feature is spawned into three new features if drop_last=false, but just two features otherwise.\none_hot_ordered_factors=false: whether to one-hot any feature with OrderedFactor element scitype, or to instead coerce it directly to a (single) Continuous feature using the order.","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/#Fitted-parameters","page":"ContinuousEncoder","title":"Fitted parameters","text":"","category":"section"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"features_to_keep: names of features that will not be dropped from the table\none_hot_encoder: the OneHotEncoder model instance for handling the one-hot encoding\none_hot_encoder_fitresult: the fitted parameters of the OneHotEncoder model","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/#Report","page":"ContinuousEncoder","title":"Report","text":"","category":"section"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"features_to_keep: names of input features that will not be dropped from the table\nnew_features: names of all output features","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/#Example","page":"ContinuousEncoder","title":"Example","text":"","category":"section"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"X = (name=categorical([\"Danesh\", \"Lee\", \"Mary\", \"John\"]),\n grade=categorical([\"A\", \"B\", \"A\", \"C\"], ordered=true),\n height=[1.85, 1.67, 1.5, 1.67],\n n_devices=[3, 2, 4, 3],\n comments=[\"the force\", \"be\", \"with you\", \"too\"])\n\njulia> schema(X)\n┌───────────┬──────────────────┐\n│ names │ scitypes │\n├───────────┼──────────────────┤\n│ name │ Multiclass{4} │\n│ grade │ OrderedFactor{3} │\n│ height │ Continuous │\n│ n_devices │ Count │\n│ comments │ Textual │\n└───────────┴──────────────────┘\n\nencoder = ContinuousEncoder(drop_last=true)\nmach = fit!(machine(encoder, X))\nW = transform(mach, X)\n\njulia> schema(W)\n┌──────────────┬────────────┐\n│ names │ scitypes │\n├──────────────┼────────────┤\n│ name__Danesh │ Continuous │\n│ name__John │ Continuous │\n│ name__Lee │ Continuous │\n│ grade │ Continuous │\n│ height │ Continuous │\n│ n_devices │ Continuous │\n└──────────────┴────────────┘\n\njulia> setdiff(schema(X).names, report(mach).features_to_keep) ## dropped features\n1-element Vector{Symbol}:\n :comments\n","category":"page"},{"location":"models/ContinuousEncoder_MLJModels/","page":"ContinuousEncoder","title":"ContinuousEncoder","text":"See also OneHotEncoder","category":"page"},{"location":"models/SVC_LIBSVM/#SVC_LIBSVM","page":"SVC","title":"SVC","text":"","category":"section"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"SVC","category":"page"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"A model type for constructing a C-support vector classifier, based on LIBSVM.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"SVC = @load SVC 
pkg=LIBSVM","category":"page"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"Do model = SVC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SVC(kernel=...).","category":"page"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"This model predicts actual class labels. To predict probabilities, use instead ProbabilisticSVC.","category":"page"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): \"LIBSVM: a library for support vector machines.\" ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf. ","category":"page"},{"location":"models/SVC_LIBSVM/#Training-data","page":"SVC","title":"Training data","text":"","category":"section"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"In MLJ or MLJBase, bind an instance model to data with one of:","category":"page"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"mach = machine(model, X, y)\nmach = machine(model, X, y, w)","category":"page"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"where","category":"page"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)\ny: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)\nw: a dictionary of class weights, keyed on levels(y).","category":"page"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/SVC_LIBSVM/#Hyper-parameters","page":"SVC","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see \"Examples\" below).\nLIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2\nLIBSVM.Kernel.Polynomial: (x1, x2) -> (gamma*x1'*x2 + coef0)^degree\nLIBSVM.Kernel.RadialBasis: (x1, x2) -> (exp(-gamma*norm(x1 - x2)^2))\nLIBSVM.Kernel.Sigmoid: (x1, x2) -> tanh(gamma*x1'*x2 + coef0)\nHere gamma, coef0, degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions. See LIBSVM.jl issue 91.\ngamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. 
Actual value used appears in the report (see below).\ncoef0 = 0.0: kernel parameter (see above)\ndegree::Int32 = Int32(3): degree in polynomial kernel (see above)\ncost=1.0 (range (0, Inf)): the parameter denoted C in the cited reference; for greater regularization, decrease cost\ncachesize=200.0 cache memory size in MB\ntolerance=0.001: tolerance for the stopping criterion\nshrinking=true: whether to use shrinking heuristics","category":"page"},{"location":"models/SVC_LIBSVM/#Operations","page":"SVC","title":"Operations","text":"","category":"section"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above.","category":"page"},{"location":"models/SVC_LIBSVM/#Fitted-parameters","page":"SVC","title":"Fitted parameters","text":"","category":"section"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"libsvm_model: the trained model object created by the LIBSVM.jl package\nencoding: class encoding used internally by libsvm_model - a dictionary of class labels keyed on the internal integer representation","category":"page"},{"location":"models/SVC_LIBSVM/#Report","page":"SVC","title":"Report","text":"","category":"section"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"The fields of report(mach) are:","category":"page"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"gamma: actual value of the kernel parameter gamma used in training","category":"page"},{"location":"models/SVC_LIBSVM/#Examples","page":"SVC","title":"Examples","text":"","category":"section"},{"location":"models/SVC_LIBSVM/#Using-a-built-in-kernel","page":"SVC","title":"Using a built-in kernel","text":"","category":"section"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"using MLJ\nimport LIBSVM\n\nSVC = @load SVC pkg=LIBSVM ## model type\nmodel = SVC(kernel=LIBSVM.Kernel.Polynomial) ## instance\n\nX, y = @load_iris ## table, vector\nmach = machine(model, X, y) |> fit!\n\nXnew = (sepal_length = [6.4, 7.2, 7.4],\n sepal_width = [2.8, 3.0, 2.8],\n petal_length = [5.6, 5.8, 6.1],\n petal_width = [2.1, 1.6, 1.9],)\n\njulia> yhat = predict(mach, Xnew)\n3-element CategoricalArrays.CategoricalArray{String,1,UInt32}:\n \"virginica\"\n \"virginica\"\n \"virginica\"","category":"page"},{"location":"models/SVC_LIBSVM/#User-defined-kernels","page":"SVC","title":"User-defined kernels","text":"","category":"section"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"k(x1, x2) = x1'*x2 ## equivalent to `LIBSVM.Kernel.Linear`\nmodel = SVC(kernel=k)\nmach = machine(model, X, y) |> fit!\n\njulia> yhat = predict(mach, Xnew)\n3-element CategoricalArrays.CategoricalArray{String,1,UInt32}:\n \"virginica\"\n \"virginica\"\n \"virginica\"","category":"page"},{"location":"models/SVC_LIBSVM/#Incorporating-class-weights","page":"SVC","title":"Incorporating class weights","text":"","category":"section"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"In either scenario above, we can do:","category":"page"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"weights = Dict(\"virginica\" => 1, \"versicolor\" => 20, \"setosa\" => 1)\nmach = machine(model, X, y, weights) |> fit!\n\njulia> yhat = predict(mach, Xnew)\n3-element CategoricalArrays.CategoricalArray{String,1,UInt32}:\n 
\"versicolor\"\n \"versicolor\"\n \"versicolor\"","category":"page"},{"location":"models/SVC_LIBSVM/","page":"SVC","title":"SVC","text":"See also the classifiers ProbabilisticSVC, NuSVC and LinearSVC. And see LIVSVM.jl and the original C implementation documentation.","category":"page"},{"location":"modifying_behavior/#Modifying-Behavior","page":"Modifying Behavior","title":"Modifying Behavior","text":"","category":"section"},{"location":"modifying_behavior/","page":"Modifying Behavior","title":"Modifying Behavior","text":"To modify behavior of MLJ you will need to clone the relevant component package (e.g., MLJBase.jl) - or a fork thereof - and modify your local julia environment to use your local clone in place of the official release. For example, you might proceed something like this:","category":"page"},{"location":"modifying_behavior/","page":"Modifying Behavior","title":"Modifying Behavior","text":"using Pkg\nPkg.activate(\"my_MLJ_enf\", shared=true)\nPkg.develop(\"path/to/my/local/MLJBase\")","category":"page"},{"location":"modifying_behavior/","page":"Modifying Behavior","title":"Modifying Behavior","text":"To test your local clone, do","category":"page"},{"location":"modifying_behavior/","page":"Modifying Behavior","title":"Modifying Behavior","text":"Pkg.test(\"MLJBase\")","category":"page"},{"location":"modifying_behavior/","page":"Modifying Behavior","title":"Modifying Behavior","text":"For more on package management, see here.","category":"page"},{"location":"models/INNEDetector_OutlierDetectionPython/#INNEDetector_OutlierDetectionPython","page":"INNEDetector","title":"INNEDetector","text":"","category":"section"},{"location":"models/INNEDetector_OutlierDetectionPython/","page":"INNEDetector","title":"INNEDetector","text":"INNEDetector(n_estimators=200,\n max_samples=\"auto\",\n random_state=None)","category":"page"},{"location":"models/INNEDetector_OutlierDetectionPython/","page":"INNEDetector","title":"INNEDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.inne","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/#COFDetector_OutlierDetectionNeighbors","page":"COFDetector","title":"COFDetector","text":"","category":"section"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"COFDetector(k = 5,\n metric = Euclidean(),\n algorithm = :kdtree,\n leafsize = 10,\n reorder = true,\n parallel = false)","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"Local outlier density based on chaining distance between graphs of neighbors, as described in [1].","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/#Parameters","page":"COFDetector","title":"Parameters","text":"","category":"section"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"k::Integer","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"Number of neighbors (must be greater than 0).","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"metric::Metric","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"This is one of the Metric types defined in the Distances.jl package. 
It is possible to define your own metrics by creating new types that are subtypes of Metric.","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"algorithm::Symbol","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"One of (:kdtree, :balltree). In a kdtree, points are recursively split into groups using hyper-planes. Therefore a KDTree only works with axis aligned metrics which are: Euclidean, Chebyshev, Minkowski and Cityblock. A brutetree linearly searches all points in a brute force fashion and works with any Metric. A balltree recursively splits points into groups bounded by hyper-spheres and works with any Metric.","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"static::Union{Bool, Symbol}","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"One of (true, false, :auto). Whether the input data for fitting and transform should be statically or dynamically allocated. If true, the data is statically allocated. If false, the data is dynamically allocated. If :auto, the data is dynamically allocated if the product of all dimensions except the last is greater than 100.","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"leafsize::Int","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"Determines at what number of points to stop splitting the tree further. There is a trade-off between traversing the tree and having to evaluate the metric function for increasing number of points.","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"reorder::Bool","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"While building the tree this will put points close in distance close in memory since this helps with cache locality. In this case, a copy of the original data will be made so that the original data is left unmodified. This can have a significant impact on performance and is by default set to true.","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"parallel::Bool","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"Parallelize score and predict using all threads available. The number of threads can be set with the JULIA_NUM_THREADS environment variable. 
Note: fit is not parallel.","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/#Examples","page":"COFDetector","title":"Examples","text":"","category":"section"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"using OutlierDetection: COFDetector, fit, transform\ndetector = COFDetector()\nX = rand(10, 100)\nmodel, result = fit(detector, X; verbosity=0)\ntest_scores = transform(detector, model, X)","category":"page"},{"location":"models/COFDetector_OutlierDetectionNeighbors/#References","page":"COFDetector","title":"References","text":"","category":"section"},{"location":"models/COFDetector_OutlierDetectionNeighbors/","page":"COFDetector","title":"COFDetector","text":"[1] Tang, Jian; Chen, Zhixiang; Fu, Ada Wai-Chee; Cheung, David Wai-Lok (2002): Enhancing Effectiveness of Outlier Detections for Low Density Patterns.","category":"page"},{"location":"models/SMOTEN_Imbalance/#SMOTEN_Imbalance","page":"SMOTEN","title":"SMOTEN","text":"","category":"section"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"Initiate a SMOTEN model with the given hyper-parameters.","category":"page"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"SMOTEN","category":"page"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"A model type for constructing a smoten, based on Imbalance.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"SMOTEN = @load SMOTEN pkg=Imbalance","category":"page"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"Do model = SMOTEN() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SMOTEN(k=...).","category":"page"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"SMOTEN implements the SMOTEN algorithm to correct for class imbalance as in N. V. Chawla, K. W. Bowyer, L. O.Hall, W. P. 
Kegelmeyer, “SMOTEN: synthetic minority over-sampling technique,” Journal of artificial intelligence research, 321-357, 2002.","category":"page"},{"location":"models/SMOTEN_Imbalance/#Training-data","page":"SMOTEN","title":"Training data","text":"","category":"section"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"In MLJ or MLJBase, wrap the model in a machine by","category":"page"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"mach = machine(model)","category":"page"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"There is no need to provide any data here because the model is a static transformer.","category":"page"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"Likewise, there is no need to fit!(mach).","category":"page"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"For default values of the hyper-parameters, model can be constructed by","category":"page"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"model = SMOTEN()","category":"page"},{"location":"models/SMOTEN_Imbalance/#Hyperparameters","page":"SMOTEN","title":"Hyperparameters","text":"","category":"section"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"k=5: Number of nearest neighbors to consider in the SMOTEN algorithm. Should be within the range [1, n - 1], where n is the number of observations; otherwise set to the nearest of these two values.\nratios=1.0: A parameter that controls the amount of oversampling to be done for each class\nCan be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class\nCan be a dictionary mapping each class label to the float ratio for that class\nrng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister.","category":"page"},{"location":"models/SMOTEN_Imbalance/#Transform-Inputs","page":"SMOTEN","title":"Transform Inputs","text":"","category":"section"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"X: A matrix of integers or a table with element scitypes that subtype Finite. That is, for table inputs each column should have either OrderedFactor or Multiclass as the element scitype.\ny: An abstract vector of labels (e.g., strings) that correspond to the observations in X","category":"page"},{"location":"models/SMOTEN_Imbalance/#Transform-Outputs","page":"SMOTEN","title":"Transform Outputs","text":"","category":"section"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"Xover: A matrix or table that includes original data and the new observations due to oversampling, 
depending on whether the input X is a matrix or table respectively\nyover: An abstract vector of labels corresponding to Xover","category":"page"},{"location":"models/SMOTEN_Imbalance/#Operations","page":"SMOTEN","title":"Operations","text":"","category":"section"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"transform(mach, X, y): resample the data X and y using SMOTEN, returning both the new and original observations","category":"page"},{"location":"models/SMOTEN_Imbalance/#Example","page":"SMOTEN","title":"Example","text":"","category":"section"},{"location":"models/SMOTEN_Imbalance/","page":"SMOTEN","title":"SMOTEN","text":"using MLJ\nusing ScientificTypes\nimport Imbalance\n\n## set probability of each class\nclass_probs = [0.5, 0.2, 0.3] \nnum_rows = 100\nnum_continuous_feats = 0\n## want two categorical features with three and two possible values respectively\nnum_vals_per_category = [3, 2]\n\n## generate a table and categorical vector accordingly\nX, y = Imbalance.generate_imbalanced_data(num_rows, num_continuous_feats; \n class_probs, num_vals_per_category, rng=42) \njulia> Imbalance.checkbalance(y)\n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (39.6%) \n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 33 (68.8%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%) \n\njulia> ScientificTypes.schema(X).scitypes\n(Count, Count)\n\n## coerce to a finite scitype (multiclass or ordered factor)\nX = coerce(X, autotype(X, :few_to_finite))\n\n## load SMOTEN\nSMOTEN = @load SMOTEN pkg=Imbalance\n\n## wrap the model in a machine\noversampler = SMOTEN(k=5, ratios=Dict(0=>1.0, 1=> 0.9, 2=>0.8), rng=42)\nmach = machine(oversampler)\n\n## provide the data to transform (there is nothing to fit)\nXover, yover = transform(mach, X, y)\n\njulia> Imbalance.checkbalance(yover)\n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 38 (79.2%) \n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 43 (89.6%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%) ","category":"page"},{"location":"models/NeuralNetworkClassifier_BetaML/#NeuralNetworkClassifier_BetaML","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"","category":"section"},{"location":"models/NeuralNetworkClassifier_BetaML/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"mutable struct NeuralNetworkClassifier <: MLJModelInterface.Probabilistic","category":"page"},{"location":"models/NeuralNetworkClassifier_BetaML/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"A simple but flexible Feedforward Neural Network, from the Beta Machine Learning Toolkit (BetaML) for classification problems.","category":"page"},{"location":"models/NeuralNetworkClassifier_BetaML/#Parameters:","page":"NeuralNetworkClassifier","title":"Parameters:","text":"","category":"section"},{"location":"models/NeuralNetworkClassifier_BetaML/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"layers: Array of layer objects [def: nothing, i.e. basic network]. See subtypes(BetaML.AbstractLayer) for supported layers. The last \"softmax\" layer is automatically added.\nloss: Loss (cost) function [def: BetaML.crossentropy]. Should always assume y and ŷ as matrices.\nwarning: Warning\nIf you change the parameter loss, you need to either provide its derivative on the parameter dloss or use autodiff with dloss=nothing.\ndloss: Derivative of the loss function [def: BetaML.dcrossentropy, i.e. the derivative of the cross-entropy]. 
Use nothing for autodiff.\nepochs: Number of epochs, i.e. passages through the whole training sample [def: 200]\nbatch_size: Size of each individual batch [def: 16]\nopt_alg: The optimisation algorithm to update the gradient at each batch [def: BetaML.ADAM()]. See subtypes(BetaML.OptimisationAlgorithm) for supported optimizers\nshuffle: Whether to randomly shuffle the data at each iteration (epoch) [def: true]\ndescr: An optional title and/or description for this model\ncb: A callback function to provide information during training [def: BetaML.fitting_info]\ncategories: The categories to represent as columns. [def: nothing, i.e. unique training values].\nhandle_unknown: How to handle categories not seen in training or not present in the provided categories array? \"error\" (default) raises an error, \"infrequent\" adds a specific column for these categories.\nother_categories_name: Which value during prediction to assign to this \"other\" category (i.e. categories not seen in training or not present in the provided categories array)? [def: nothing, i.e. typemax(Int64) for integer vectors and \"other\" for other types]. This setting is active only if handle_unknown=\"infrequent\" and in that case it MUST be specified if Y is neither integer nor string\nrng: Random Number Generator [default: Random.GLOBAL_RNG]","category":"page"},{"location":"models/NeuralNetworkClassifier_BetaML/#Notes:","page":"NeuralNetworkClassifier","title":"Notes:","text":"","category":"section"},{"location":"models/NeuralNetworkClassifier_BetaML/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"data must be numerical\nthe label should be an n-records by n-dimensions matrix (e.g. one-hot-encoded data for classification), where the output columns should be interpreted as the probabilities for each category.","category":"page"},{"location":"models/NeuralNetworkClassifier_BetaML/#Example:","page":"NeuralNetworkClassifier","title":"Example:","text":"","category":"section"},{"location":"models/NeuralNetworkClassifier_BetaML/","page":"NeuralNetworkClassifier","title":"NeuralNetworkClassifier","text":"julia> using MLJ\n\njulia> X, y = @load_iris;\n\njulia> modelType = @load NeuralNetworkClassifier pkg = \"BetaML\" verbosity=0\nBetaML.Nn.NeuralNetworkClassifier\n\njulia> layers = [BetaML.DenseLayer(4,8,f=BetaML.relu),BetaML.DenseLayer(8,8,f=BetaML.relu),BetaML.DenseLayer(8,3,f=BetaML.relu),BetaML.VectorFunctionLayer(3,f=BetaML.softmax)];\n\njulia> model = modelType(layers=layers,opt_alg=BetaML.ADAM())\nNeuralNetworkClassifier(\n layers = BetaML.Nn.AbstractLayer[BetaML.Nn.DenseLayer([-0.376173352338049 0.7029289511758696 -0.5589563304592478 -0.21043274001651874; 0.044758889527899415 0.6687689636685921 0.4584331114653877 0.6820506583840453; … ; -0.26546358457167507 -0.28469736227283804 -0.164225549922154 -0.516785639164486; -0.5146043550684141 -0.0699113265130964 0.14959906603941908 -0.053706860039406834], [0.7003943613125758, -0.23990840466587576, -0.23823126271387746, 0.4018101580410387, 0.2274483050356888, -0.564975060667734, 0.1732063297031089, 0.11880299829896945], BetaML.Utils.relu, BetaML.Utils.drelu), BetaML.Nn.DenseLayer([-0.029467850439546583 0.4074661266592745 … 0.36775675246760053 -0.595524555448422; 0.42455597698371306 -0.2458082732997091 … -0.3324220683462514 0.44439454998610595; … ; -0.2890883863364267 -0.10109249362508033 … -0.0602680568207582 0.18177278845097555; -0.03432587226449335 -0.4301192922760063 … 0.5646018168286626 0.47269177680892693], [0.13777442835428688, 
0.5473306726675433, 0.3781939472904011, 0.24021813428130567, -0.0714779477402877, -0.020386373530818958, 0.5465466618404464, -0.40339790713616525], BetaML.Utils.relu, BetaML.Utils.drelu), BetaML.Nn.DenseLayer([0.6565120540082393 0.7139211611842745 … 0.07809812467915389 -0.49346311403373844; -0.4544472987041656 0.6502667641568863 … 0.43634608676548214 0.7213049952968921; 0.41212264783075303 -0.21993289366360613 … 0.25365007887755064 -0.5664469566269569], [-0.6911986792747682, -0.2149343209329364, -0.6347727539063817], BetaML.Utils.relu, BetaML.Utils.drelu), BetaML.Nn.VectorFunctionLayer{0}(fill(NaN), 3, 3, BetaML.Utils.softmax, BetaML.Utils.dsoftmax, nothing)], \n loss = BetaML.Utils.crossentropy, \n dloss = BetaML.Utils.dcrossentropy, \n epochs = 100, \n batch_size = 32, \n opt_alg = BetaML.Nn.ADAM(BetaML.Nn.var\"#90#93\"(), 1.0, 0.9, 0.999, 1.0e-8, BetaML.Nn.Learnable[], BetaML.Nn.Learnable[]), \n shuffle = true, \n descr = \"\", \n cb = BetaML.Nn.fitting_info, \n categories = nothing, \n handle_unknown = \"error\", \n other_categories_name = nothing, \n rng = Random._GLOBAL_RNG())\n\njulia> mach = machine(model, X, y);\n\njulia> fit!(mach);\n\njulia> classes_est = predict(mach, X)\n150-element CategoricalDistributions.UnivariateFiniteVector{Multiclass{3}, String, UInt8, Float64}:\n UnivariateFinite{Multiclass{3}}(setosa=>0.575, versicolor=>0.213, virginica=>0.213)\n UnivariateFinite{Multiclass{3}}(setosa=>0.573, versicolor=>0.213, virginica=>0.213)\n ⋮\n UnivariateFinite{Multiclass{3}}(setosa=>0.236, versicolor=>0.236, virginica=>0.529)\n UnivariateFinite{Multiclass{3}}(setosa=>0.254, versicolor=>0.254, virginica=>0.492)","category":"page"},{"location":"models/GradientBoostingRegressor_MLJScikitLearnInterface/#GradientBoostingRegressor_MLJScikitLearnInterface","page":"GradientBoostingRegressor","title":"GradientBoostingRegressor","text":"","category":"section"},{"location":"models/GradientBoostingRegressor_MLJScikitLearnInterface/","page":"GradientBoostingRegressor","title":"GradientBoostingRegressor","text":"GradientBoostingRegressor","category":"page"},{"location":"models/GradientBoostingRegressor_MLJScikitLearnInterface/","page":"GradientBoostingRegressor","title":"GradientBoostingRegressor","text":"A model type for constructing a gradient boosting ensemble regression, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/GradientBoostingRegressor_MLJScikitLearnInterface/","page":"GradientBoostingRegressor","title":"GradientBoostingRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/GradientBoostingRegressor_MLJScikitLearnInterface/","page":"GradientBoostingRegressor","title":"GradientBoostingRegressor","text":"GradientBoostingRegressor = @load GradientBoostingRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/GradientBoostingRegressor_MLJScikitLearnInterface/","page":"GradientBoostingRegressor","title":"GradientBoostingRegressor","text":"Do model = GradientBoostingRegressor() to construct an instance with default hyper-parameters. 
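As a minimal usage sketch (not part of the original documentation: the synthetic data produced by make_regression is purely illustrative, and MLJScikitLearnInterface together with its Python dependencies is assumed to be installed):\n\nusing MLJ\nGradientBoostingRegressor = @load GradientBoostingRegressor pkg=MLJScikitLearnInterface\nmodel = GradientBoostingRegressor()\nX, y = make_regression(100, 4)       ## synthetic table of features and Continuous target\nmach = machine(model, X, y) |> fit!\nyhat = predict(mach, X)\n\n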
Provide keyword arguments to override hyper-parameter defaults, as in GradientBoostingRegressor(loss=...).","category":"page"},{"location":"models/GradientBoostingRegressor_MLJScikitLearnInterface/","page":"GradientBoostingRegressor","title":"GradientBoostingRegressor","text":"This estimator builds an additive model in a forward stage-wise fashion; it allows for the optimization of arbitrary differentiable loss functions. In each stage a regression tree is fit on the negative gradient of the given loss function.","category":"page"},{"location":"models/GradientBoostingRegressor_MLJScikitLearnInterface/","page":"GradientBoostingRegressor","title":"GradientBoostingRegressor","text":"HistGradientBoostingRegressor is a much faster variant of this algorithm for intermediate datasets (n_samples >= 10_000).","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/#BayesianLDA_MultivariateStats","page":"BayesianLDA","title":"BayesianLDA","text":"","category":"section"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"BayesianLDA","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"A model type for constructing a Bayesian LDA model, based on MultivariateStats.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"BayesianLDA = @load BayesianLDA pkg=MultivariateStats","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"Do model = BayesianLDA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BayesianLDA(method=...).","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"The Bayesian multiclass LDA algorithm learns a projection matrix as described in ordinary LDA. Predicted class posterior probability distributions are derived by applying Bayes' rule with a multivariate Gaussian class-conditional distribution. A prior class distribution can be specified by the user or inferred from training data class frequency.","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"See also the package documentation. 
For more information about the algorithm, see Li, Zhu and Ogihara (2006): Using Discriminant Analysis for Multi-class Classification: An Experimental Investigation.","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/#Training-data","page":"BayesianLDA","title":"Training data","text":"","category":"section"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"Here:","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).\ny is the target, which can be any AbstractVector whose element scitype is OrderedFactor or Multiclass; check the scitype with scitype(y)","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/#Hyper-parameters","page":"BayesianLDA","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"method::Symbol=:gevd: choice of solver, one of :gevd or :whiten methods.\ncov_w::StatsBase.SimpleCovariance(): An estimator for the within-class covariance (used in computing the within-class scatter matrix, Sw). Any robust estimator from CovarianceEstimation.jl can be used.\ncov_b::StatsBase.SimpleCovariance(): The same as cov_w but for the between-class covariance (used in computing the between-class scatter matrix, Sb).\noutdim::Int=0: The output dimension, i.e., dimension of the transformed space, automatically set to min(indim, nclasses-1) if equal to 0.\nregcoef::Float64=1e-6: The regularization coefficient. A positive value regcoef*eigmax(Sw) where Sw is the within-class scatter matrix, is added to the diagonal of Sw to improve numerical stability. This can be useful if using the standard covariance estimator.\npriors::Union{Nothing, UnivariateFinite{<:Any, <:Any, <:Any, <:Real}, Dict{<:Any, <:Real}} = nothing: For use in prediction with Bayes rule. If priors = nothing then priors are estimated from the class proportions in the training data. Otherwise it requires a Dict or UnivariateFinite object specifying the classes with non-zero probabilities in the training target.","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/#Operations","page":"BayesianLDA","title":"Operations","text":"","category":"section"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.\npredict(mach, Xnew): Return predictions of the target given features Xnew, which should have the same scitype as X above. 
Predictions are probabilistic but uncalibrated.\npredict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/#Fitted-parameters","page":"BayesianLDA","title":"Fitted parameters","text":"","category":"section"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"classes: The classes seen during model fitting.\nprojection_matrix: The learned projection matrix, of size (indim, outdim), where indim and outdim are the input and output dimensions respectively (See Report section below).\npriors: The class priors for classification. As inferred from training target y, if not user-specified. A UnivariateFinite object with levels consistent with levels(y).","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/#Report","page":"BayesianLDA","title":"Report","text":"","category":"section"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"The fields of report(mach) are:","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"indim: The dimension of the input space i.e the number of training features.\noutdim: The dimension of the transformed space the model is projected to.\nmean: The mean of the untransformed training data. A vector of length indim.\nnclasses: The number of classes directly observed in the training data (which can be less than the total number of classes in the class pool).\nclass_means: The class-specific means of the training data. A matrix of size (indim, nclasses) with the ith column being the class-mean of the ith class in classes (See fitted params section above).\nclass_weights: The weights (class counts) of each class. A vector of length nclasses with the ith element being the class weight of the ith class in classes. 
(See fitted params section above.)\nSb: The between class scatter matrix.\nSw: The within class scatter matrix.","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/#Examples","page":"BayesianLDA","title":"Examples","text":"","category":"section"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"using MLJ\n\nBayesianLDA = @load BayesianLDA pkg=MultivariateStats\n\nX, y = @load_iris ## a table and a vector\n\nmodel = BayesianLDA()\nmach = machine(model, X, y) |> fit!\n\nXproj = transform(mach, X)\ny_hat = predict(mach, X)\nlabels = predict_mode(mach, X)","category":"page"},{"location":"models/BayesianLDA_MultivariateStats/","page":"BayesianLDA","title":"BayesianLDA","text":"See also LDA, SubspaceLDA, BayesianSubspaceLDA","category":"page"},{"location":"models/BinaryThresholdPredictor_MLJModels/#BinaryThresholdPredictor_MLJModels","page":"BinaryThresholdPredictor","title":"BinaryThresholdPredictor","text":"","category":"section"},{"location":"models/BinaryThresholdPredictor_MLJModels/","page":"BinaryThresholdPredictor","title":"BinaryThresholdPredictor","text":"BinaryThresholdPredictor(model; threshold=0.5)","category":"page"},{"location":"models/BinaryThresholdPredictor_MLJModels/","page":"BinaryThresholdPredictor","title":"BinaryThresholdPredictor","text":"Wrap the Probabilistic model, model, assumed to support binary classification, as a Deterministic model, by applying the specified threshold to the positive class probability. In addition to conventional supervised classifiers, it can also be applied to outlier detection models that predict normalized scores - in the form of appropriate UnivariateFinite distributions - that is, models that subtype AbstractProbabilisticUnsupervisedDetector or AbstractProbabilisticSupervisedDetector.","category":"page"},{"location":"models/BinaryThresholdPredictor_MLJModels/","page":"BinaryThresholdPredictor","title":"BinaryThresholdPredictor","text":"By convention the positive class is the second class returned by levels(y), where y is the target.","category":"page"},{"location":"models/BinaryThresholdPredictor_MLJModels/","page":"BinaryThresholdPredictor","title":"BinaryThresholdPredictor","text":"If threshold=0.5 then calling predict on the wrapped model is equivalent to calling predict_mode on the atomic model.","category":"page"},{"location":"models/BinaryThresholdPredictor_MLJModels/#Example","page":"BinaryThresholdPredictor","title":"Example","text":"","category":"section"},{"location":"models/BinaryThresholdPredictor_MLJModels/","page":"BinaryThresholdPredictor","title":"BinaryThresholdPredictor","text":"Below is an application to the well-known Pima Indian diabetes dataset, including optimization of the threshold parameter, with a high balanced accuracy the objective. 
The target class distribution is 500 positives to 268 negatives.","category":"page"},{"location":"models/BinaryThresholdPredictor_MLJModels/","page":"BinaryThresholdPredictor","title":"BinaryThresholdPredictor","text":"Loading the data:","category":"page"},{"location":"models/BinaryThresholdPredictor_MLJModels/","page":"BinaryThresholdPredictor","title":"BinaryThresholdPredictor","text":"using MLJ, Random\nrng = Xoshiro(123)\n\ndiabetes = OpenML.load(43582)\noutcome, X = unpack(diabetes, ==(:Outcome), rng=rng);\ny = coerce(Int.(outcome), OrderedFactor);","category":"page"},{"location":"models/BinaryThresholdPredictor_MLJModels/","page":"BinaryThresholdPredictor","title":"BinaryThresholdPredictor","text":"Choosing a probabilistic classifier:","category":"page"},{"location":"models/BinaryThresholdPredictor_MLJModels/","page":"BinaryThresholdPredictor","title":"BinaryThresholdPredictor","text":"EvoTreesClassifier = @load EvoTreesClassifier\nprob_predictor = EvoTreesClassifier()","category":"page"},{"location":"models/BinaryThresholdPredictor_MLJModels/","page":"BinaryThresholdPredictor","title":"BinaryThresholdPredictor","text":"Wrapping in TunedModel to get a deterministic classifier with threshold as a new hyperparameter:","category":"page"},{"location":"models/BinaryThresholdPredictor_MLJModels/","page":"BinaryThresholdPredictor","title":"BinaryThresholdPredictor","text":"point_predictor = BinaryThresholdPredictor(prob_predictor, threshold=0.6)\nXnew, _ = make_moons(3, rng=rng)\nmach = machine(point_predictor, X, y) |> fit!\npredict(mach, X)[1:3] ## [0, 0, 0]","category":"page"},{"location":"models/BinaryThresholdPredictor_MLJModels/","page":"BinaryThresholdPredictor","title":"BinaryThresholdPredictor","text":"Estimating performance:","category":"page"},{"location":"models/BinaryThresholdPredictor_MLJModels/","page":"BinaryThresholdPredictor","title":"BinaryThresholdPredictor","text":"balanced = BalancedAccuracy(adjusted=true)\ne = evaluate!(mach, resampling=CV(nfolds=6), measures=[balanced, accuracy])\ne.measurement[1] ## 0.405 ± 0.089","category":"page"},{"location":"models/BinaryThresholdPredictor_MLJModels/","page":"BinaryThresholdPredictor","title":"BinaryThresholdPredictor","text":"Wrapping in tuning strategy to learn threshold that maximizes balanced accuracy:","category":"page"},{"location":"models/BinaryThresholdPredictor_MLJModels/","page":"BinaryThresholdPredictor","title":"BinaryThresholdPredictor","text":"r = range(point_predictor, :threshold, lower=0.1, upper=0.9)\ntuned_point_predictor = TunedModel(\n point_predictor,\n tuning=RandomSearch(rng=rng),\n resampling=CV(nfolds=6),\n range = r,\n measure=balanced,\n n=30,\n)\nmach2 = machine(tuned_point_predictor, X, y) |> fit!\noptimized_point_predictor = report(mach2).best_model\noptimized_point_predictor.threshold ## 0.260\npredict(mach2, X)[1:3] ## [1, 1, 0]","category":"page"},{"location":"models/BinaryThresholdPredictor_MLJModels/","page":"BinaryThresholdPredictor","title":"BinaryThresholdPredictor","text":"Estimating the performance of the auto-thresholding model (nested resampling here):","category":"page"},{"location":"models/BinaryThresholdPredictor_MLJModels/","page":"BinaryThresholdPredictor","title":"BinaryThresholdPredictor","text":"e = evaluate!(mach2, resampling=CV(nfolds=6), measure=[balanced, accuracy])\ne.measurement[1] ## 0.477 ± 
0.110","category":"page"},{"location":"models/GaussianMixtureClusterer_BetaML/#GaussianMixtureClusterer_BetaML","page":"GaussianMixtureClusterer","title":"GaussianMixtureClusterer","text":"","category":"section"},{"location":"models/GaussianMixtureClusterer_BetaML/","page":"GaussianMixtureClusterer","title":"GaussianMixtureClusterer","text":"mutable struct GaussianMixtureClusterer <: MLJModelInterface.Unsupervised","category":"page"},{"location":"models/GaussianMixtureClusterer_BetaML/","page":"GaussianMixtureClusterer","title":"GaussianMixtureClusterer","text":"An Expectation-Maximisation clustering algorithm with customisable mixtures, from the Beta Machine Learning Toolkit (BetaML).","category":"page"},{"location":"models/GaussianMixtureClusterer_BetaML/#Hyperparameters:","page":"GaussianMixtureClusterer","title":"Hyperparameters:","text":"","category":"section"},{"location":"models/GaussianMixtureClusterer_BetaML/","page":"GaussianMixtureClusterer","title":"GaussianMixtureClusterer","text":"n_classes::Int64: Number of mixtures (latent classes) to consider [def: 3]\ninitial_probmixtures::AbstractVector{Float64}: Initial probabilities of the categorical distribution (n_classes x 1) [default: []]\nmixtures::Union{Type, Vector{<:BetaML.GMM.AbstractMixture}}: An array (of length n_classes) of the mixtures to employ (see the ?GMM module). Each mixture object can be provided with or without its parameters (e.g. mean and variance for the gaussian ones). Fully qualified mixtures are useful only if the initialisation_strategy parameter is set to \"given\". This parameter can also be given simply in terms of a type. In this case it is automatically extended to a vector of n_classes mixtures of the specified type. Note that mixing of different mixture types is not currently supported. [def: [DiagonalGaussian() for i in 1:n_classes]]\ntol::Float64: Tolerance to stop the algorithm [default: 10^(-6)]\nminimum_variance::Float64: Minimum variance for the mixtures [default: 0.05]\nminimum_covariance::Float64: Minimum covariance for the mixtures with full covariance matrix [default: 0]. This should be set to a value different from minimum_variance (see notes).\ninitialisation_strategy::String: The computation method of the vector of the initial mixtures. One of the following:\n\"grid\": using a grid approach\n\"given\": using the mixture provided in the fully qualified mixtures parameter\n\"kmeans\": use first kmeans (itself initialised with a \"grid\" strategy) to set the initial mixture centers [default]\nNote that currently \"random\" and \"shuffle\" initialisations are not supported in gmm-based algorithms.\nmaximum_iterations::Int64: Maximum number of iterations [def: typemax(Int64), i.e. 
∞]\nrng::Random.AbstractRNG: Random Number Generator [deafult: Random.GLOBAL_RNG]","category":"page"},{"location":"models/GaussianMixtureClusterer_BetaML/#Example:","page":"GaussianMixtureClusterer","title":"Example:","text":"","category":"section"},{"location":"models/GaussianMixtureClusterer_BetaML/","page":"GaussianMixtureClusterer","title":"GaussianMixtureClusterer","text":"\njulia> using MLJ\n\njulia> X, y = @load_iris;\n\njulia> modelType = @load GaussianMixtureClusterer pkg = \"BetaML\" verbosity=0\nBetaML.GMM.GaussianMixtureClusterer\n\njulia> model = modelType()\nGaussianMixtureClusterer(\n n_classes = 3, \n initial_probmixtures = Float64[], \n mixtures = BetaML.GMM.DiagonalGaussian{Float64}[BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing), BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing), BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing)], \n tol = 1.0e-6, \n minimum_variance = 0.05, \n minimum_covariance = 0.0, \n initialisation_strategy = \"kmeans\", \n maximum_iterations = 9223372036854775807, \n rng = Random._GLOBAL_RNG())\n\njulia> mach = machine(model, X);\n\njulia> fit!(mach);\n[ Info: Training machine(GaussianMixtureClusterer(n_classes = 3, …), …).\nIter. 1: Var. of the post 10.800150114964184 Log-likelihood -650.0186451891216\n\njulia> classes_est = predict(mach, X)\n150-element CategoricalDistributions.UnivariateFiniteVector{Multiclass{3}, Int64, UInt32, Float64}:\n UnivariateFinite{Multiclass{3}}(1=>1.0, 2=>4.17e-15, 3=>2.1900000000000003e-31)\n UnivariateFinite{Multiclass{3}}(1=>1.0, 2=>1.25e-13, 3=>5.87e-31)\n UnivariateFinite{Multiclass{3}}(1=>1.0, 2=>4.5e-15, 3=>1.55e-32)\n UnivariateFinite{Multiclass{3}}(1=>1.0, 2=>6.93e-14, 3=>3.37e-31)\n ⋮\n UnivariateFinite{Multiclass{3}}(1=>5.39e-25, 2=>0.0167, 3=>0.983)\n UnivariateFinite{Multiclass{3}}(1=>7.5e-29, 2=>0.000106, 3=>1.0)\n UnivariateFinite{Multiclass{3}}(1=>1.6e-20, 2=>0.594, 3=>0.406)","category":"page"},{"location":"models/RandomForestClassifier_BetaML/#RandomForestClassifier_BetaML","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"","category":"section"},{"location":"models/RandomForestClassifier_BetaML/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"mutable struct RandomForestClassifier <: MLJModelInterface.Probabilistic","category":"page"},{"location":"models/RandomForestClassifier_BetaML/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"A simple Random Forest model for classification with support for Missing data, from the Beta Machine Learning Toolkit (BetaML).","category":"page"},{"location":"models/RandomForestClassifier_BetaML/#Hyperparameters:","page":"RandomForestClassifier","title":"Hyperparameters:","text":"","category":"section"},{"location":"models/RandomForestClassifier_BetaML/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"n_trees::Int64\nmax_depth::Int64: The maximum depth the tree is allowed to reach. When this is reached the node is forced to become a leaf [def: 0, i.e. no limits]\nmin_gain::Float64: The minimum information gain to allow for a node's partition [def: 0]\nmin_records::Int64: The minimum number of records a node must holds to consider for a partition of it [def: 2]\nmax_features::Int64: The maximum number of (random) features to consider at each partitioning [def: 0, i.e. square root of the data dimensions]\nsplitting_criterion::Function: This is the name of the function to be used to compute the information gain of a specific partition. 
This is done by measuring the difference between the \"impurity\" of the labels of the parent node and those of the two child nodes, weighted by the respective number of items. [def: gini]. Either gini, entropy or a custom function. It can also be an anonymous function.\nβ::Float64: Parameter that regulates the weights of the scoring of each tree, to be (optionally) used in prediction based on the error of the individual trees computed on the records on which trees have not been trained. Higher values favour \"better\" trees, but too high values will cause overfitting [def: 0, i.e. uniform weights]\nrng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]","category":"page"},{"location":"models/RandomForestClassifier_BetaML/#Example-:","page":"RandomForestClassifier","title":"Example :","text":"","category":"section"},{"location":"models/RandomForestClassifier_BetaML/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"julia> using MLJ\n\njulia> X, y = @load_iris;\n\njulia> modelType = @load RandomForestClassifier pkg = \"BetaML\" verbosity=0\nBetaML.Trees.RandomForestClassifier\n\njulia> model = modelType()\nRandomForestClassifier(\n n_trees = 30, \n max_depth = 0, \n min_gain = 0.0, \n min_records = 2, \n max_features = 0, \n splitting_criterion = BetaML.Utils.gini, \n β = 0.0, \n rng = Random._GLOBAL_RNG())\n\njulia> mach = machine(model, X, y);\n\njulia> fit!(mach);\n[ Info: Training machine(RandomForestClassifier(n_trees = 30, …), …).\n\njulia> cat_est = predict(mach, X)\n150-element CategoricalDistributions.UnivariateFiniteVector{Multiclass{3}, String, UInt32, Float64}:\n UnivariateFinite{Multiclass{3}}(setosa=>1.0, versicolor=>0.0, virginica=>0.0)\n UnivariateFinite{Multiclass{3}}(setosa=>1.0, versicolor=>0.0, virginica=>0.0)\n ⋮\n UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.0, virginica=>1.0)\n UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.0667, virginica=>0.933)","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/#DecisionTreeClassifier_DecisionTree","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"","category":"section"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"DecisionTreeClassifier","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"A model type for constructing a CART decision tree classifier, based on DecisionTree.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"DecisionTreeClassifier = @load DecisionTreeClassifier pkg=DecisionTree","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"Do model = DecisionTreeClassifier() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in DecisionTreeClassifier(max_depth=...).","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"DecisionTreeClassifier implements the CART algorithm, originally published in Breiman, Leo; Friedman, J. H.; Olshen, R. A.; Stone, C. J. (1984): \"Classification and regression trees\". Monterey, CA: Wadsworth & Brooks/Cole Advanced Books & Software.","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/#Training-data","page":"DecisionTreeClassifier","title":"Training data","text":"","category":"section"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"where","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)\ny: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/#Hyperparameters","page":"DecisionTreeClassifier","title":"Hyperparameters","text":"","category":"section"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"max_depth=-1: max depth of the decision tree (-1=any)\nmin_samples_leaf=1: min number of samples each leaf needs to have\nmin_samples_split=2: min number of samples needed for a split\nmin_purity_increase=0: min purity needed for a split\nn_subfeatures=0: number of features to select at random (0 for all)\npost_prune=false: set to true for post-fit pruning\nmerge_purity_threshold=1.0: (post-pruning) merge leaves having combined purity >= merge_purity_threshold\ndisplay_depth=5: max depth to show when displaying the tree\nfeature_importance: method to use for computing feature importances. One of (:impurity, :split)\nrng=Random.GLOBAL_RNG: random number generator or seed","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/#Operations","page":"DecisionTreeClassifier","title":"Operations","text":"","category":"section"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above. 
Predictions are probabilistic, but uncalibrated.\npredict_mode(mach, Xnew): instead return the mode of each prediction above.","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/#Fitted-parameters","page":"DecisionTreeClassifier","title":"Fitted parameters","text":"","category":"section"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"raw_tree: the raw Node, Leaf or Root object returned by the core DecisionTree.jl algorithm\ntree: a visualizable, wrapped version of raw_tree implementing the AbstractTrees.jl interface; see \"Examples\" below\nencoding: dictionary of target classes keyed on integers used internally by DecisionTree.jl\nfeatures: the names of the features encountered in training, in an order consistent with the output of print_tree (see below)","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/#Report","page":"DecisionTreeClassifier","title":"Report","text":"","category":"section"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"The fields of report(mach) are:","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"classes_seen: list of target classes actually observed in training\nprint_tree: alternative method to print the fitted tree, with single argument the tree depth; interpretation requires internal integer-class encoding (see \"Fitted parameters\" above).\nfeatures: the names of the features encountered in training, in an order consistent with the output of print_tree (see below)","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/#Accessor-functions","page":"DecisionTreeClassifier","title":"Accessor functions","text":"","category":"section"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"feature_importances(mach) returns a vector of (feature::Symbol => importance) pairs; the type of importance is determined by the hyperparameter feature_importance (see above)","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/#Examples","page":"DecisionTreeClassifier","title":"Examples","text":"","category":"section"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"using MLJ\nDecisionTreeClassifier = @load DecisionTreeClassifier pkg=DecisionTree\nmodel = DecisionTreeClassifier(max_depth=3, min_samples_split=3)\n\nX, y = @load_iris\nmach = machine(model, X, y) |> fit!\n\nXnew = (sepal_length = [6.4, 7.2, 7.4],\n sepal_width = [2.8, 3.0, 2.8],\n petal_length = [5.6, 5.8, 6.1],\n petal_width = [2.1, 1.6, 1.9],)\nyhat = predict(mach, Xnew) ## probabilistic predictions\npredict_mode(mach, Xnew) ## point predictions\npdf.(yhat, \"virginica\") ## probabilities for the \"verginica\" class\n\njulia> tree = fitted_params(mach).tree\npetal_length < 2.45\n├─ setosa (50/50)\n└─ petal_width < 1.75\n ├─ petal_length < 4.95\n │ ├─ versicolor (47/48)\n │ └─ virginica (4/6)\n └─ petal_length < 4.85\n ├─ virginica (2/3)\n └─ virginica (43/43)\n\nusing Plots, TreeRecipe\nplot(tree) ## for a graphical 
representation of the tree\n\nfeature_importances(mach)","category":"page"},{"location":"models/DecisionTreeClassifier_DecisionTree/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"See also DecisionTree.jl and the unwrapped model type MLJDecisionTreeInterface.DecisionTree.DecisionTreeClassifier.","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/#DNNDetector_OutlierDetectionNeighbors","page":"DNNDetector","title":"DNNDetector","text":"","category":"section"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"DNNDetector(d = 0,\n metric = Euclidean(),\n algorithm = :kdtree,\n leafsize = 10,\n reorder = true,\n parallel = false)","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"Anomaly score based on the number of neighbors in a hypersphere of radius d. Knorr et al. [1] directly converted the resulting outlier scores to labels, thus this implementation does not fully reflect the approach from the paper.","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/#Parameters","page":"DNNDetector","title":"Parameters","text":"","category":"section"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"d::Real","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"The hypersphere radius used to calculate the global density of an instance.","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"metric::Metric","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"This is one of the Metric types defined in the Distances.jl package. It is possible to define your own metrics by creating new types that are subtypes of Metric.","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"algorithm::Symbol","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"One of (:kdtree, :balltree). In a kdtree, points are recursively split into groups using hyper-planes. Therefore a KDTree only works with axis aligned metrics which are: Euclidean, Chebyshev, Minkowski and Cityblock. A brutetree linearly searches all points in a brute force fashion and works with any Metric. A balltree recursively splits points into groups bounded by hyper-spheres and works with any Metric.","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"static::Union{Bool, Symbol}","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"One of (true, false, :auto). Whether the input data for fitting and transform should be statically or dynamically allocated. If true, the data is statically allocated. If false, the data is dynamically allocated. 
If :auto, the data is dynamically allocated if the product of all dimensions except the last is greater than 100.","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"leafsize::Int","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"Determines at what number of points to stop splitting the tree further. There is a trade-off between traversing the tree and having to evaluate the metric function for increasing number of points.","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"reorder::Bool","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"While building the tree this will put points close in distance close in memory since this helps with cache locality. In this case, a copy of the original data will be made so that the original data is left unmodified. This can have a significant impact on performance and is by default set to true.","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"parallel::Bool","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"Parallelize score and predict using all threads available. The number of threads can be set with the JULIA_NUM_THREADS environment variable. Note: fit is not parallel.","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/#Examples","page":"DNNDetector","title":"Examples","text":"","category":"section"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"using OutlierDetection: DNNDetector, fit, transform\ndetector = DNNDetector()\nX = rand(10, 100)\nmodel, result = fit(detector, X; verbosity=0)\ntest_scores = transform(detector, model, X)","category":"page"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/#References","page":"DNNDetector","title":"References","text":"","category":"section"},{"location":"models/DNNDetector_OutlierDetectionNeighbors/","page":"DNNDetector","title":"DNNDetector","text":"[1] Knorr, Edwin M.; Ng, Raymond T. 
(1998): Algorithms for Mining Distance-Based Outliers in Large Datasets.","category":"page"},{"location":"models/RidgeRegressor_MultivariateStats/#RidgeRegressor_MultivariateStats","page":"RidgeRegressor","title":"RidgeRegressor","text":"","category":"section"},{"location":"models/RidgeRegressor_MultivariateStats/","page":"RidgeRegressor","title":"RidgeRegressor","text":"RidgeRegressor","category":"page"},{"location":"models/RidgeRegressor_MultivariateStats/","page":"RidgeRegressor","title":"RidgeRegressor","text":"A model type for constructing a ridge regressor, based on MultivariateStats.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/RidgeRegressor_MultivariateStats/","page":"RidgeRegressor","title":"RidgeRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/RidgeRegressor_MultivariateStats/","page":"RidgeRegressor","title":"RidgeRegressor","text":"RidgeRegressor = @load RidgeRegressor pkg=MultivariateStats","category":"page"},{"location":"models/RidgeRegressor_MultivariateStats/","page":"RidgeRegressor","title":"RidgeRegressor","text":"Do model = RidgeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RidgeRegressor(lambda=...).","category":"page"},{"location":"models/RidgeRegressor_MultivariateStats/","page":"RidgeRegressor","title":"RidgeRegressor","text":"RidgeRegressor adds a quadratic penalty term to least squares regression, for regularization. Ridge regression is particularly useful in the case of multicollinearity. Options exist to specify a bias term, and to adjust the strength of the penalty term.","category":"page"},{"location":"models/RidgeRegressor_MultivariateStats/#Training-data","page":"RidgeRegressor","title":"Training data","text":"","category":"section"},{"location":"models/RidgeRegressor_MultivariateStats/","page":"RidgeRegressor","title":"RidgeRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/RidgeRegressor_MultivariateStats/","page":"RidgeRegressor","title":"RidgeRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/RidgeRegressor_MultivariateStats/","page":"RidgeRegressor","title":"RidgeRegressor","text":"Here:","category":"page"},{"location":"models/RidgeRegressor_MultivariateStats/","page":"RidgeRegressor","title":"RidgeRegressor","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).\ny is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/RidgeRegressor_MultivariateStats/","page":"RidgeRegressor","title":"RidgeRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/RidgeRegressor_MultivariateStats/#Hyper-parameters","page":"RidgeRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/RidgeRegressor_MultivariateStats/","page":"RidgeRegressor","title":"RidgeRegressor","text":"lambda=1.0: Is the non-negative parameter for the regularization strength. 
If lambda is 0, ridge regression is equivalent to linear least squares regression, and as lambda approaches infinity, all the linear coefficients approach 0.\nbias=true: Include the bias term if true, otherwise fit without bias term.","category":"page"},{"location":"models/RidgeRegressor_MultivariateStats/#Operations","page":"RidgeRegressor","title":"Operations","text":"","category":"section"},{"location":"models/RidgeRegressor_MultivariateStats/","page":"RidgeRegressor","title":"RidgeRegressor","text":"predict(mach, Xnew): Return predictions of the target given new features Xnew, which should have the same scitype as X above.","category":"page"},{"location":"models/RidgeRegressor_MultivariateStats/#Fitted-parameters","page":"RidgeRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/RidgeRegressor_MultivariateStats/","page":"RidgeRegressor","title":"RidgeRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/RidgeRegressor_MultivariateStats/","page":"RidgeRegressor","title":"RidgeRegressor","text":"coefficients: The linear coefficients determined by the model.\nintercept: The intercept determined by the model.","category":"page"},{"location":"models/RidgeRegressor_MultivariateStats/#Examples","page":"RidgeRegressor","title":"Examples","text":"","category":"section"},{"location":"models/RidgeRegressor_MultivariateStats/","page":"RidgeRegressor","title":"RidgeRegressor","text":"using MLJ\n\nRidgeRegressor = @load RidgeRegressor pkg=MultivariateStats\npipe = Standardizer() |> RidgeRegressor(lambda=10)\n\nX, y = @load_boston\n\nmach = machine(pipe, X, y) |> fit!\nyhat = predict(mach, X)\ntraining_error = l1(yhat, y) |> mean","category":"page"},{"location":"models/RidgeRegressor_MultivariateStats/","page":"RidgeRegressor","title":"RidgeRegressor","text":"See also LinearRegressor, MultitargetLinearRegressor, MultitargetRidgeRegressor","category":"page"},{"location":"models/GradientBoostingClassifier_MLJScikitLearnInterface/#GradientBoostingClassifier_MLJScikitLearnInterface","page":"GradientBoostingClassifier","title":"GradientBoostingClassifier","text":"","category":"section"},{"location":"models/GradientBoostingClassifier_MLJScikitLearnInterface/","page":"GradientBoostingClassifier","title":"GradientBoostingClassifier","text":"GradientBoostingClassifier","category":"page"},{"location":"models/GradientBoostingClassifier_MLJScikitLearnInterface/","page":"GradientBoostingClassifier","title":"GradientBoostingClassifier","text":"A model type for constructing a gradient boosting classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/GradientBoostingClassifier_MLJScikitLearnInterface/","page":"GradientBoostingClassifier","title":"GradientBoostingClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/GradientBoostingClassifier_MLJScikitLearnInterface/","page":"GradientBoostingClassifier","title":"GradientBoostingClassifier","text":"GradientBoostingClassifier = @load GradientBoostingClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/GradientBoostingClassifier_MLJScikitLearnInterface/","page":"GradientBoostingClassifier","title":"GradientBoostingClassifier","text":"Do model = GradientBoostingClassifier() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in GradientBoostingClassifier(loss=...).","category":"page"},{"location":"models/GradientBoostingClassifier_MLJScikitLearnInterface/","page":"GradientBoostingClassifier","title":"GradientBoostingClassifier","text":"This algorithm builds an additive model in a forward stage-wise fashion; it allows for the optimization of arbitrary differentiable loss functions. In each stage n_classes_ regression trees are fit on the negative gradient of the loss function, e.g. binary or multiclass log loss. Binary classification is a special case where only a single regression tree is induced.","category":"page"},{"location":"models/GradientBoostingClassifier_MLJScikitLearnInterface/","page":"GradientBoostingClassifier","title":"GradientBoostingClassifier","text":"HistGradientBoostingClassifier is a much faster variant of this algorithm for intermediate datasets (n_samples >= 10_000).","category":"page"},{"location":"models/SGDClassifier_MLJScikitLearnInterface/#SGDClassifier_MLJScikitLearnInterface","page":"SGDClassifier","title":"SGDClassifier","text":"","category":"section"},{"location":"models/SGDClassifier_MLJScikitLearnInterface/","page":"SGDClassifier","title":"SGDClassifier","text":"SGDClassifier","category":"page"},{"location":"models/SGDClassifier_MLJScikitLearnInterface/","page":"SGDClassifier","title":"SGDClassifier","text":"A model type for constructing a sgd classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/SGDClassifier_MLJScikitLearnInterface/","page":"SGDClassifier","title":"SGDClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/SGDClassifier_MLJScikitLearnInterface/","page":"SGDClassifier","title":"SGDClassifier","text":"SGDClassifier = @load SGDClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/SGDClassifier_MLJScikitLearnInterface/","page":"SGDClassifier","title":"SGDClassifier","text":"Do model = SGDClassifier() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in SGDClassifier(loss=...).","category":"page"},{"location":"models/SGDClassifier_MLJScikitLearnInterface/#Hyper-parameters","page":"SGDClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/SGDClassifier_MLJScikitLearnInterface/","page":"SGDClassifier","title":"SGDClassifier","text":"loss = hinge\npenalty = l2\nalpha = 0.0001\nl1_ratio = 0.15\nfit_intercept = true\nmax_iter = 1000\ntol = 0.001\nshuffle = true\nverbose = 0\nepsilon = 0.1\nn_jobs = nothing\nrandom_state = nothing\nlearning_rate = optimal\neta0 = 0.0\npower_t = 0.5\nearly_stopping = false\nvalidation_fraction = 0.1\nn_iter_no_change = 5\nclass_weight = nothing\nwarm_start = false\naverage = false","category":"page"},{"location":"models/FeatureSelector_FeatureSelection/#FeatureSelector_FeatureSelection","page":"FeatureSelector","title":"FeatureSelector","text":"","category":"section"},{"location":"models/FeatureSelector_FeatureSelection/","page":"FeatureSelector","title":"FeatureSelector","text":"FeatureSelector","category":"page"},{"location":"models/FeatureSelector_FeatureSelection/","page":"FeatureSelector","title":"FeatureSelector","text":"A model type for constructing a feature selector, based on FeatureSelection.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/FeatureSelector_FeatureSelection/","page":"FeatureSelector","title":"FeatureSelector","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/FeatureSelector_FeatureSelection/","page":"FeatureSelector","title":"FeatureSelector","text":"FeatureSelector = @load FeatureSelector pkg=FeatureSelection","category":"page"},{"location":"models/FeatureSelector_FeatureSelection/","page":"FeatureSelector","title":"FeatureSelector","text":"Do model = FeatureSelector() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in FeatureSelector(features=...).","category":"page"},{"location":"models/FeatureSelector_FeatureSelection/","page":"FeatureSelector","title":"FeatureSelector","text":"Use this model to select features (columns) of a table, usually as part of a model Pipeline.","category":"page"},{"location":"models/FeatureSelector_FeatureSelection/#Training-data","page":"FeatureSelector","title":"Training data","text":"","category":"section"},{"location":"models/FeatureSelector_FeatureSelection/","page":"FeatureSelector","title":"FeatureSelector","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/FeatureSelector_FeatureSelection/","page":"FeatureSelector","title":"FeatureSelector","text":"mach = machine(model, X)","category":"page"},{"location":"models/FeatureSelector_FeatureSelection/","page":"FeatureSelector","title":"FeatureSelector","text":"where","category":"page"},{"location":"models/FeatureSelector_FeatureSelection/","page":"FeatureSelector","title":"FeatureSelector","text":"X: any table of input features, where \"table\" is in the sense of Tables.jl","category":"page"},{"location":"models/FeatureSelector_FeatureSelection/","page":"FeatureSelector","title":"FeatureSelector","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/FeatureSelector_FeatureSelection/#Hyper-parameters","page":"FeatureSelector","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/FeatureSelector_FeatureSelection/","page":"FeatureSelector","title":"FeatureSelector","text":"features: one of the following, with the behavior indicated:\n[] (empty, the default): filter out all features (columns) which were not encountered in training\nnon-empty vector of feature names (symbols): keep only the specified features (ignore=false) or keep only unspecified features (ignore=true)\nfunction or other callable: keep a feature if the callable returns true on its name. 
For example, specifying FeatureSelector(features = name -> name in [:x1, :x3], ignore = true) has the same effect as FeatureSelector(features = [:x1, :x3], ignore = true), namely to select all features, with the exception of :x1 and :x3.\nignore: whether to ignore or keep specified features, as explained above","category":"page"},{"location":"models/FeatureSelector_FeatureSelection/#Operations","page":"FeatureSelector","title":"Operations","text":"","category":"section"},{"location":"models/FeatureSelector_FeatureSelection/","page":"FeatureSelector","title":"FeatureSelector","text":"transform(mach, Xnew): select features from the table Xnew as specified by the model, taking features seen during training into account, if relevant","category":"page"},{"location":"models/FeatureSelector_FeatureSelection/#Fitted-parameters","page":"FeatureSelector","title":"Fitted parameters","text":"","category":"section"},{"location":"models/FeatureSelector_FeatureSelection/","page":"FeatureSelector","title":"FeatureSelector","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/FeatureSelector_FeatureSelection/","page":"FeatureSelector","title":"FeatureSelector","text":"features_to_keep: the features that will be selected","category":"page"},{"location":"models/FeatureSelector_FeatureSelection/#Example","page":"FeatureSelector","title":"Example","text":"","category":"section"},{"location":"models/FeatureSelector_FeatureSelection/","page":"FeatureSelector","title":"FeatureSelector","text":"using MLJ\n\nX = (ordinal1 = [1, 2, 3],\n ordinal2 = coerce([\"x\", \"y\", \"x\"], OrderedFactor),\n ordinal3 = [10.0, 20.0, 30.0],\n ordinal4 = [-20.0, -30.0, -40.0],\n nominal = coerce([\"Your father\", \"he\", \"is\"], Multiclass));\n\nselector = FeatureSelector(features=[:ordinal3, ], ignore=true);\n\njulia> transform(fit!(machine(selector, X)), X)\n(ordinal1 = [1, 2, 3],\n ordinal2 = CategoricalValue{Symbol,UInt32}[\"x\", \"y\", \"x\"],\n ordinal4 = [-20.0, -30.0, -40.0],\n nominal = CategoricalValue{String,UInt32}[\"Your father\", \"he\", \"is\"],)\n","category":"page"},{"location":"models/PCA_MultivariateStats/#PCA_MultivariateStats","page":"PCA","title":"PCA","text":"","category":"section"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"PCA","category":"page"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"A model type for constructing a pca, based on MultivariateStats.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"PCA = @load PCA pkg=MultivariateStats","category":"page"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"Do model = PCA() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in PCA(maxoutdim=...).","category":"page"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"Principal component analysis learns a linear projection onto a lower dimensional space while preserving most of the initial variance seen in the training data.","category":"page"},{"location":"models/PCA_MultivariateStats/#Training-data","page":"PCA","title":"Training data","text":"","category":"section"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"mach = machine(model, X)","category":"page"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"Here:","category":"page"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).","category":"page"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/PCA_MultivariateStats/#Hyper-parameters","page":"PCA","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"maxoutdim=0: Together with variance_ratio, controls the output dimension outdim chosen by the model. Specifically, suppose that k is the smallest integer such that retaining the k most significant principal components accounts for variance_ratio of the total variance in the training data. Then outdim = min(k, maxoutdim). If maxoutdim=0 (default) then the effective maxoutdim is min(n, indim - 1) where n is the number of observations and indim the number of features in the training data.\nvariance_ratio::Float64=0.99: The ratio of variance preserved after the transformation\nmethod=:auto: The method to use to solve the problem. Choices are\n:svd: Singular Value Decomposition of the matrix.\n:cov: Covariance matrix decomposition.\n:auto: Use :cov if the matrix's first dimension is smaller than its second dimension and otherwise use :svd\nmean=nothing: if nothing, centering will be computed and applied, if set to 0 no centering (data is assumed pre-centered); if a vector is passed, the centering is done with that vector.","category":"page"},{"location":"models/PCA_MultivariateStats/#Operations","page":"PCA","title":"Operations","text":"","category":"section"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.\ninverse_transform(mach, Xsmall): For a dimension-reduced table Xsmall, such as returned by transform, reconstruct a table, having the same number of columns as the original training data X, that transforms to Xsmall. Mathematically, inverse_transform is a right-inverse for the PCA projection map, whose image is orthogonal to the kernel of that map. 
In particular, if Xsmall = transform(mach, Xnew), then inverse_transform(Xsmall) is only an approximation to Xnew.","category":"page"},{"location":"models/PCA_MultivariateStats/#Fitted-parameters","page":"PCA","title":"Fitted parameters","text":"","category":"section"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"projection: Returns the projection matrix, which has size (indim, outdim), where indim and outdim are the number of features of the input and output respectively.","category":"page"},{"location":"models/PCA_MultivariateStats/#Report","page":"PCA","title":"Report","text":"","category":"section"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"The fields of report(mach) are:","category":"page"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"indim: Dimension (number of columns) of the training data and new data to be transformed.\noutdim = min(n, indim, maxoutdim) is the output dimension; here n is the number of observations.\ntprincipalvar: Total variance of the principal components.\ntresidualvar: Total residual variance.\ntvar: Total observation variance (principal + residual variance).\nmean: The mean of the untransformed training data, of length indim.\nprincipalvars: The variance of the principal components. An AbstractVector of length outdim\nloadings: The models loadings, weights for each variable used when calculating principal components. A matrix of size (indim, outdim) where indim and outdim are as defined above.","category":"page"},{"location":"models/PCA_MultivariateStats/#Examples","page":"PCA","title":"Examples","text":"","category":"section"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"using MLJ\n\nPCA = @load PCA pkg=MultivariateStats\n\nX, y = @load_iris ## a table and a vector\n\nmodel = PCA(maxoutdim=2)\nmach = machine(model, X) |> fit!\n\nXproj = transform(mach, X)","category":"page"},{"location":"models/PCA_MultivariateStats/","page":"PCA","title":"PCA","text":"See also KernelPCA, ICA, FactorAnalysis, PPCA","category":"page"},{"location":"models/ENNUndersampler_Imbalance/#ENNUndersampler_Imbalance","page":"ENNUndersampler","title":"ENNUndersampler","text":"","category":"section"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"Initiate a ENN undersampling model with the given hyper-parameters.","category":"page"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"ENNUndersampler","category":"page"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"A model type for constructing a enn undersampler, based on Imbalance.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"ENNUndersampler = @load ENNUndersampler pkg=Imbalance","category":"page"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"Do model = ENNUndersampler() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in ENNUndersampler(k=...).","category":"page"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"ENNUndersampler undersamples a dataset by removing (\"cleaning\") points that violate a certain condition such as having a different class compared to the majority of the neighbors as proposed in Dennis L Wilson. Asymptotic properties of nearest neighbor rules using edited data. IEEE Transactions on Systems, Man, and Cybernetics, pages 408–421, 1972.","category":"page"},{"location":"models/ENNUndersampler_Imbalance/#Training-data","page":"ENNUndersampler","title":"Training data","text":"","category":"section"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"In MLJ or MLJBase, wrap the model in a machine by \tmach = machine(model)","category":"page"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"There is no need to provide any data here because the model is a static transformer.","category":"page"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"Likewise, there is no need to fit!(mach). ","category":"page"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"For default values of the hyper-parameters, model can be constructed by \tmodel = ENNUndersampler()","category":"page"},{"location":"models/ENNUndersampler_Imbalance/#Hyperparameters","page":"ENNUndersampler","title":"Hyperparameters","text":"","category":"section"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"k::Integer=5: Number of nearest neighbors to consider in the algorithm. Should be within the range 0 < k < n where n is the number of observations in the smallest class.\nkeep_condition::AbstractString=\"mode\": The condition that leads to cleaning a point upon violation. Takes one of \"exists\", \"mode\", \"only mode\" and \"all\"","category":"page"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"- `\"exists\"`: the point has at least one neighbor from the same class\n- `\"mode\"`: the class of the point is one of the most frequent classes of the neighbors (there may be many)\n- `\"only mode\"`: the class of the point is the single most frequent class of the neighbors\n- `\"all\"`: the class of the point is the same as all the neighbors","category":"page"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"min_ratios=1.0: A parameter that controls the maximum amount of undersampling to be done for each class. If this algorithm cleans the data to an extent that this is violated, some of the cleaned points will be revived randomly so that it is satisfied.\nCan be a float and in this case each class will be at most undersampled to the size of the minority class times the float. 
By default, all classes are undersampled to the size of the minority class\nCan be a dictionary mapping each class label to the float minimum ratio for that class\nforce_min_ratios=false: If true, and this algorithm cleans the data such that the ratios for each class exceed those specified in min_ratios then further undersampling will be perform so that the final ratios are equal to min_ratios.\nrng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister`.\ntry_preserve_type::Bool=true: When true, the function will try to not change the type of the input table (e.g., DataFrame). However, for some tables, this may not succeed, and in this case, the table returned will be a column table (named-tuple of vectors). This parameter is ignored if the input is a matrix.","category":"page"},{"location":"models/ENNUndersampler_Imbalance/#Transform-Inputs","page":"ENNUndersampler","title":"Transform Inputs","text":"","category":"section"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"X: A matrix or table of floats where each row is an observation from the dataset\ny: An abstract vector of labels (e.g., strings) that correspond to the observations in X","category":"page"},{"location":"models/ENNUndersampler_Imbalance/#Transform-Outputs","page":"ENNUndersampler","title":"Transform Outputs","text":"","category":"section"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"X_under: A matrix or table that includes the data after undersampling depending on whether the input X is a matrix or table respectively\ny_under: An abstract vector of labels corresponding to X_under","category":"page"},{"location":"models/ENNUndersampler_Imbalance/#Operations","page":"ENNUndersampler","title":"Operations","text":"","category":"section"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"transform(mach, X, y): resample the data X and y using ENNUndersampler, returning the undersampled versions","category":"page"},{"location":"models/ENNUndersampler_Imbalance/#Example","page":"ENNUndersampler","title":"Example","text":"","category":"section"},{"location":"models/ENNUndersampler_Imbalance/","page":"ENNUndersampler","title":"ENNUndersampler","text":"using MLJ\nimport Imbalance\n\n## set probability of each class\nclass_probs = [0.5, 0.2, 0.3] \nnum_rows, num_continuous_feats = 100, 5\n## generate a table and categorical vector accordingly\nX, y = Imbalance.generate_imbalanced_data(num_rows, num_continuous_feats; \n min_sep=0.01, stds=[3.0 3.0 3.0], class_probs, rng=42) \n\njulia> Imbalance.checkbalance(y; ref=\"minority\")\n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%) \n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 33 (173.7%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (252.6%) \n\n## load ENN model type:\nENNUndersampler = @load ENNUndersampler pkg=Imbalance\n\n## underample the majority classes to sizes relative to the minority class:\nundersampler = ENNUndersampler(min_ratios=0.5, rng=42)\nmach = machine(undersampler)\nX_under, y_under = transform(mach, X, y)\n\njulia> Imbalance.checkbalance(y_under; ref=\"minority\")\n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 10 (100.0%) \n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 10 (100.0%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 24 (240.0%) 
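\n\n## A hedged aside, not part of the original example: as noted under Hyperparameters,\n## min_ratios may also be given as a dictionary keyed on the class labels (here the\n## integer labels 0, 1 and 2 generated above), for instance:\n## undersampler = ENNUndersampler(min_ratios=Dict(0=>1.0, 1=>1.0, 2=>1.0), k=5, rng=42)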
","category":"page"},{"location":"acceleration_and_parallelism/#Acceleration-and-Parallelism","page":"Acceleration and Parallelism","title":"Acceleration and Parallelism","text":"","category":"section"},{"location":"acceleration_and_parallelism/#User-facing-interface","page":"Acceleration and Parallelism","title":"User-facing interface","text":"","category":"section"},{"location":"acceleration_and_parallelism/","page":"Acceleration and Parallelism","title":"Acceleration and Parallelism","text":"To enable composable, extensible acceleration of core MLJ methods, ComputationalResources.jl is utilized to provide some basic types and functions to make implementing acceleration easy. However, ambitious users or package authors have the option to define their own types to be passed as resources to acceleration, which must be <:ComputationalResources.AbstractResource.","category":"page"},{"location":"acceleration_and_parallelism/","page":"Acceleration and Parallelism","title":"Acceleration and Parallelism","text":"Methods which support some form of acceleration support the acceleration keyword argument, which can be passed a \"resource\" from ComputationalResources. For example, passing acceleration=CPUProcesses() will utilize Distributed's multiprocessing functionality to accelerate the computation, while acceleration=CPUThreads() will use Julia's PARTR threading model to perform acceleration.","category":"page"},{"location":"acceleration_and_parallelism/","page":"Acceleration and Parallelism","title":"Acceleration and Parallelism","text":"The default computational resource is CPU1(), which is simply serial processing via CPU. The default resource can be changed as in this example: MLJ.default_resource(CPUProcesses()). The argument must always have type <:ComputationalResource.AbstractResource. To inspect the current default, use MLJ.default_resource().","category":"page"},{"location":"acceleration_and_parallelism/","page":"Acceleration and Parallelism","title":"Acceleration and Parallelism","text":"note: Note\nYou cannot use CPUThreads() with models wrapping python code.","category":"page"},{"location":"models/MiniBatchKMeans_MLJScikitLearnInterface/#MiniBatchKMeans_MLJScikitLearnInterface","page":"MiniBatchKMeans","title":"MiniBatchKMeans","text":"","category":"section"},{"location":"models/MiniBatchKMeans_MLJScikitLearnInterface/","page":"MiniBatchKMeans","title":"MiniBatchKMeans","text":"MiniBatchKMeans","category":"page"},{"location":"models/MiniBatchKMeans_MLJScikitLearnInterface/","page":"MiniBatchKMeans","title":"MiniBatchKMeans","text":"A model type for constructing a Mini-Batch K-Means clustering., based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/MiniBatchKMeans_MLJScikitLearnInterface/","page":"MiniBatchKMeans","title":"MiniBatchKMeans","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/MiniBatchKMeans_MLJScikitLearnInterface/","page":"MiniBatchKMeans","title":"MiniBatchKMeans","text":"MiniBatchKMeans = @load MiniBatchKMeans pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/MiniBatchKMeans_MLJScikitLearnInterface/","page":"MiniBatchKMeans","title":"MiniBatchKMeans","text":"Do model = MiniBatchKMeans() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in MiniBatchKMeans(n_clusters=...).","category":"page"},{"location":"models/MiniBatchKMeans_MLJScikitLearnInterface/#Hyper-parameters","page":"MiniBatchKMeans","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/MiniBatchKMeans_MLJScikitLearnInterface/","page":"MiniBatchKMeans","title":"MiniBatchKMeans","text":"n_clusters = 8\nmax_iter = 100\nbatch_size = 100\nverbose = 0\ncompute_labels = true\nrandom_state = nothing\ntol = 0.0\nmax_no_improvement = 10\ninit_size = nothing\nn_init = 3\ninit = k-means++\nreassignment_ratio = 0.01","category":"page"},{"location":"models/TomekUndersampler_Imbalance/#TomekUndersampler_Imbalance","page":"TomekUndersampler","title":"TomekUndersampler","text":"","category":"section"},{"location":"models/TomekUndersampler_Imbalance/","page":"TomekUndersampler","title":"TomekUndersampler","text":"Initiate a tomek undersampling model with the given hyper-parameters.","category":"page"},{"location":"models/TomekUndersampler_Imbalance/","page":"TomekUndersampler","title":"TomekUndersampler","text":"TomekUndersampler","category":"page"},{"location":"models/TomekUndersampler_Imbalance/","page":"TomekUndersampler","title":"TomekUndersampler","text":"A model type for constructing a tomek undersampler, based on Imbalance.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/TomekUndersampler_Imbalance/","page":"TomekUndersampler","title":"TomekUndersampler","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/TomekUndersampler_Imbalance/","page":"TomekUndersampler","title":"TomekUndersampler","text":"TomekUndersampler = @load TomekUndersampler pkg=Imbalance","category":"page"},{"location":"models/TomekUndersampler_Imbalance/","page":"TomekUndersampler","title":"TomekUndersampler","text":"Do model = TomekUndersampler() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in TomekUndersampler(min_ratios=...).","category":"page"},{"location":"models/TomekUndersampler_Imbalance/","page":"TomekUndersampler","title":"TomekUndersampler","text":"TomekUndersampler undersamples by removing any point that is part of a tomek link in the data. As defined in, Ivan Tomek. Two modifications of cnn. IEEE Trans. Systems, Man and Cybernetics, 6:769–772, 1976.","category":"page"},{"location":"models/TomekUndersampler_Imbalance/#Training-data","page":"TomekUndersampler","title":"Training data","text":"","category":"section"},{"location":"models/TomekUndersampler_Imbalance/","page":"TomekUndersampler","title":"TomekUndersampler","text":"In MLJ or MLJBase, wrap the model in a machine by mach = machine(model)","category":"page"},{"location":"models/TomekUndersampler_Imbalance/","page":"TomekUndersampler","title":"TomekUndersampler","text":"There is no need to provide any data here because the model is a static transformer.","category":"page"},{"location":"models/TomekUndersampler_Imbalance/","page":"TomekUndersampler","title":"TomekUndersampler","text":"Likewise, there is no need to fit!(mach). 
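For instance, a minimal call pattern (a sketch reusing the names from the example below) is mach = machine(TomekUndersampler()) followed by X_under, y_under = transform(mach, X, y). 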
","category":"page"},{"location":"models/TomekUndersampler_Imbalance/","page":"TomekUndersampler","title":"TomekUndersampler","text":"For default values of the hyper-parameters, model can be constructed by model = TomekUndersampler()","category":"page"},{"location":"models/TomekUndersampler_Imbalance/#Hyperparameters","page":"TomekUndersampler","title":"Hyperparameters","text":"","category":"section"},{"location":"models/TomekUndersampler_Imbalance/","page":"TomekUndersampler","title":"TomekUndersampler","text":"min_ratios=1.0: A parameter that controls the maximum amount of undersampling to be done for each class. If this algorithm cleans the data to an extent that this is violated, some of the cleaned points will be revived randomly so that it is satisfied.\nCan be a float and in this case each class will be at most undersampled to the size of the minority class times the float. By default, all classes are undersampled to the size of the minority class\nCan be a dictionary mapping each class label to the float minimum ratio for that class\nforce_min_ratios=false: If true, and this algorithm cleans the data such that the ratios for each class exceed those specified in min_ratios then further undersampling will be perform so that the final ratios are equal to min_ratios.\nrng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister`.\ntry_preserve_type::Bool=true: When true, the function will try to not change the type of the input table (e.g., DataFrame). However, for some tables, this may not succeed, and in this case, the table returned will be a column table (named-tuple of vectors). This parameter is ignored if the input is a matrix.","category":"page"},{"location":"models/TomekUndersampler_Imbalance/#Transform-Inputs","page":"TomekUndersampler","title":"Transform Inputs","text":"","category":"section"},{"location":"models/TomekUndersampler_Imbalance/","page":"TomekUndersampler","title":"TomekUndersampler","text":"X: A matrix or table of floats where each row is an observation from the dataset\ny: An abstract vector of labels (e.g., strings) that correspond to the observations in X","category":"page"},{"location":"models/TomekUndersampler_Imbalance/#Transform-Outputs","page":"TomekUndersampler","title":"Transform Outputs","text":"","category":"section"},{"location":"models/TomekUndersampler_Imbalance/","page":"TomekUndersampler","title":"TomekUndersampler","text":"X_under: A matrix or table that includes the data after undersampling depending on whether the input X is a matrix or table respectively\ny_under: An abstract vector of labels corresponding to X_under","category":"page"},{"location":"models/TomekUndersampler_Imbalance/#Operations","page":"TomekUndersampler","title":"Operations","text":"","category":"section"},{"location":"models/TomekUndersampler_Imbalance/","page":"TomekUndersampler","title":"TomekUndersampler","text":"transform(mach, X, y): resample the data X and y using TomekUndersampler, returning both the new and original observations","category":"page"},{"location":"models/TomekUndersampler_Imbalance/#Example","page":"TomekUndersampler","title":"Example","text":"","category":"section"},{"location":"models/TomekUndersampler_Imbalance/","page":"TomekUndersampler","title":"TomekUndersampler","text":"using MLJ\nimport Imbalance\n\n## set probability of each class\nclass_probs = [0.5, 0.2, 0.3] \nnum_rows, num_continuous_feats = 100, 5\n## generate a 
table and categorical vector accordingly\nX, y = Imbalance.generate_imbalanced_data(num_rows, num_continuous_feats; \n min_sep=0.01, stds=[3.0 3.0 3.0], class_probs, rng=42) \n\njulia> Imbalance.checkbalance(y; ref=\"minority\")\n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%) \n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 33 (173.7%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (252.6%) \n\n## load TomekUndersampler model type:\nTomekUndersampler = @load TomekUndersampler pkg=Imbalance\n\n## Underample the majority classes to sizes relative to the minority class:\ntomek_undersampler = TomekUndersampler(min_ratios=1.0, rng=42)\nmach = machine(tomek_undersampler)\nX_under, y_under = transform(mach, X, y)\n\njulia> Imbalance.checkbalance(y_under; ref=\"minority\")\n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%) \n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 22 (115.8%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 36 (189.5%)","category":"page"},{"location":"models/OneRuleClassifier_OneRule/#OneRuleClassifier_OneRule","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"","category":"section"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"OneRuleClassifier","category":"page"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"A model type for constructing a one rule classifier, based on OneRule.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"OneRuleClassifier = @load OneRuleClassifier pkg=OneRule","category":"page"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"Do model = OneRuleClassifier() to construct an instance with default hyper-parameters. ","category":"page"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"OneRuleClassifier implements the OneRule method for classification by Robert Holte (\"Very simple classification rules perform well on most commonly used datasets\" in: Machine Learning 11.1 (1993), pp. 63-90). ","category":"page"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"For more information see:\n\n- Witten, Ian H., Eibe Frank, and Mark A. Hall. \n Data Mining Practical Machine Learning Tools and Techniques Third Edition. \n Morgan Kaufmann, 2017, pp. 
93-96.\n- [Machine Learning - (One|Simple) Rule](https://datacadamia.com/data_mining/one_rule)\n- [OneRClassifier - One Rule for Classification](http://rasbt.github.io/mlxtend/user_guide/classifier/OneRClassifier/)","category":"page"},{"location":"models/OneRuleClassifier_OneRule/#Training-data","page":"OneRuleClassifier","title":"Training data","text":"","category":"section"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"In MLJ or MLJBase, bind an instance model to data with mach = machine(model, X, y) where","category":"page"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Multiclass, OrderedFactor, or <:Finite; check column scitypes with schema(X)\ny: is the target, which can be any AbstractVector whose element scitype is OrderedFactor or Multiclass; check the scitype with scitype(y)","category":"page"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/OneRuleClassifier_OneRule/#Hyper-parameters","page":"OneRuleClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"This classifier has no hyper-parameters.","category":"page"},{"location":"models/OneRuleClassifier_OneRule/#Operations","page":"OneRuleClassifier","title":"Operations","text":"","category":"section"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"predict(mach, Xnew): return (deterministic) predictions of the target given features Xnew having the same scitype as X above.","category":"page"},{"location":"models/OneRuleClassifier_OneRule/#Fitted-parameters","page":"OneRuleClassifier","title":"Fitted parameters","text":"","category":"section"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"tree: the tree (a OneTree) returned by the core OneTree.jl algorithm\nall_classes: all classes (i.e. 
levels) of the target (used also internally to transfer levels-information to predict)","category":"page"},{"location":"models/OneRuleClassifier_OneRule/#Report","page":"OneRuleClassifier","title":"Report","text":"","category":"section"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"The fields of report(mach) are:","category":"page"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"tree: The OneTree created based on the training data\nnrules: The number of rules tree contains\nerror_rate: fraction of wrongly classified instances\nerror_count: number of wrongly classified instances\nclasses_seen: list of target classes actually observed in training\nfeatures: the names of the features encountered in training","category":"page"},{"location":"models/OneRuleClassifier_OneRule/#Examples","page":"OneRuleClassifier","title":"Examples","text":"","category":"section"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"using MLJ\n\nORClassifier = @load OneRuleClassifier pkg=OneRule\n\norc = ORClassifier()\n\noutlook = [\"sunny\", \"sunny\", \"overcast\", \"rainy\", \"rainy\", \"rainy\", \"overcast\", \"sunny\", \"sunny\", \"rainy\", \"sunny\", \"overcast\", \"overcast\", \"rainy\"]\ntemperature = [\"hot\", \"hot\", \"hot\", \"mild\", \"cool\", \"cool\", \"cool\", \"mild\", \"cool\", \"mild\", \"mild\", \"mild\", \"hot\", \"mild\"]\nhumidity = [\"high\", \"high\", \"high\", \"high\", \"normal\", \"normal\", \"normal\", \"high\", \"normal\", \"normal\", \"normal\", \"high\", \"normal\", \"high\"]\nwindy = [\"false\", \"true\", \"false\", \"false\", \"false\", \"true\", \"true\", \"false\", \"false\", \"false\", \"true\", \"true\", \"false\", \"true\"]\n\nweather_data = (outlook = outlook, temperature = temperature, humidity = humidity, windy = windy)\nplay_data = [\"no\", \"no\", \"yes\", \"yes\", \"yes\", \"no\", \"yes\", \"no\", \"yes\", \"yes\", \"yes\", \"yes\", \"yes\", \"no\"]\n\nweather = coerce(weather_data, Textual => Multiclass)\nplay = coerce(play_data, Multiclass)\n\nmach = machine(orc, weather, play)\nfit!(mach)\n\nyhat = MLJ.predict(mach, weather) ## in a real context 'new' `weather` data would be used\none_tree = fitted_params(mach).tree\nreport(mach).error_rate","category":"page"},{"location":"models/OneRuleClassifier_OneRule/","page":"OneRuleClassifier","title":"OneRuleClassifier","text":"See also OneRule.jl.","category":"page"},{"location":"models/MultinomialNBClassifier_NaiveBayes/#MultinomialNBClassifier_NaiveBayes","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"","category":"section"},{"location":"models/MultinomialNBClassifier_NaiveBayes/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"MultinomialNBClassifier","category":"page"},{"location":"models/MultinomialNBClassifier_NaiveBayes/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"A model type for constructing a multinomial naive Bayes classifier, based on NaiveBayes.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/MultinomialNBClassifier_NaiveBayes/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"From MLJ, the type can be imported 
using","category":"page"},{"location":"models/MultinomialNBClassifier_NaiveBayes/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"MultinomialNBClassifier = @load MultinomialNBClassifier pkg=NaiveBayes","category":"page"},{"location":"models/MultinomialNBClassifier_NaiveBayes/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"Do model = MultinomialNBClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultinomialNBClassifier(alpha=...).","category":"page"},{"location":"models/MultinomialNBClassifier_NaiveBayes/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"The multinomial naive Bayes classifier is often applied when input features consist of counts (scitype Count) and when observations for a fixed target class are generated from a multinomial distribution with fixed probability vector, but whose sample length varies from observation to observation. For example, features might represent word counts in text documents being classified by sentiment.","category":"page"},{"location":"models/MultinomialNBClassifier_NaiveBayes/#Training-data","page":"MultinomialNBClassifier","title":"Training data","text":"","category":"section"},{"location":"models/MultinomialNBClassifier_NaiveBayes/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/MultinomialNBClassifier_NaiveBayes/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/MultinomialNBClassifier_NaiveBayes/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"Here:","category":"page"},{"location":"models/MultinomialNBClassifier_NaiveBayes/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Count; check the column scitypes with schema(X).\ny is the target, which can be any AbstractVector whose element scitype is Finite; check the scitype with scitype(y).","category":"page"},{"location":"models/MultinomialNBClassifier_NaiveBayes/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/MultinomialNBClassifier_NaiveBayes/#Hyper-parameters","page":"MultinomialNBClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/MultinomialNBClassifier_NaiveBayes/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"alpha=1: Lidstone smoothing in estimation of multinomial probability vectors from training histograms (default corresponds to Laplacian smoothing).","category":"page"},{"location":"models/MultinomialNBClassifier_NaiveBayes/#Operations","page":"MultinomialNBClassifier","title":"Operations","text":"","category":"section"},{"location":"models/MultinomialNBClassifier_NaiveBayes/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"predict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above.\npredict_mode(mach, Xnew): Return the mode of above predictions.","category":"page"},{"location":"models/MultinomialNBClassifier_NaiveBayes/#Fitted-parameters","page":"MultinomialNBClassifier","title":"Fitted 
parameters","text":"","category":"section"},{"location":"models/MultinomialNBClassifier_NaiveBayes/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/MultinomialNBClassifier_NaiveBayes/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"c_counts: A dictionary containing the observed count of each input class.\nx_counts: A dictionary containing the categorical counts of each input class.\nx_totals: The sum of each count (input feature), ungrouped.\nn_obs: The total number of observations in the training data.","category":"page"},{"location":"models/MultinomialNBClassifier_NaiveBayes/#Examples","page":"MultinomialNBClassifier","title":"Examples","text":"","category":"section"},{"location":"models/MultinomialNBClassifier_NaiveBayes/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"using MLJ\nimport TextAnalysis\n\nCountTransformer = @load CountTransformer pkg=MLJText\nMultinomialNBClassifier = @load MultinomialNBClassifier pkg=NaiveBayes\n\ntokenized_docs = TextAnalysis.tokenize.([\n \"I am very mad. You never listen.\",\n \"You seem to be having trouble? Can I help you?\",\n \"Our boss is mad at me. I hope he dies.\",\n \"His boss wants to help me. She is nice.\",\n \"Thank you for your help. It is nice working with you.\",\n \"Never do that again! I am so mad. \",\n])\n\nsentiment = [\n \"negative\",\n \"positive\",\n \"negative\",\n \"positive\",\n \"positive\",\n \"negative\",\n]\n\nmach1 = machine(CountTransformer(), tokenized_docs) |> fit!\n\n## matrix of counts:\nX = transform(mach1, tokenized_docs)\n\n## to ensure scitype(y) <: AbstractVector{<:OrderedFactor}:\ny = coerce(sentiment, OrderedFactor)\n\nclassifier = MultinomialNBClassifier()\nmach2 = machine(classifier, X, y)\nfit!(mach2, rows=1:4)\n\n## probabilistic predictions:\ny_prob = predict(mach2, rows=5:6) ## distributions\npdf.(y_prob, \"positive\") ## probabilities for \"positive\"\nlog_loss(y_prob, y[5:6])\n\n## point predictions:\nyhat = mode.(y_prob) ## or `predict_mode(mach2, rows=5:6)`","category":"page"},{"location":"models/MultinomialNBClassifier_NaiveBayes/","page":"MultinomialNBClassifier","title":"MultinomialNBClassifier","text":"See also GaussianNBClassifier","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/#ProbabilisticNuSVC_LIBSVM","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"","category":"section"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"ProbabilisticNuSVC","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"A model type for constructing a probabilistic ν-support vector classifier, based on LIBSVM.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"ProbabilisticNuSVC = @load ProbabilisticNuSVC pkg=LIBSVM","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"Do model = ProbabilisticNuSVC() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in ProbabilisticNuSVC(kernel=...).","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"This model is identical to NuSVC with the exception that it predicts probabilities, instead of actual class labels. Probabilities are computed using Platt scaling, which will add to total computation time.","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): \"LIBSVM: a library for support vector machines.\" ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf. ","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"Platt, John (1999): \"Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods.\"","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/#Training-data","page":"ProbabilisticNuSVC","title":"Training data","text":"","category":"section"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"In MLJ or MLJBase, bind an instance model to data with:","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"where","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)\ny: is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/#Hyper-parameters","page":"ProbabilisticNuSVC","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see \"Examples\" below).\nLIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2\nLIBSVM.Kernel.Polynomial: (x1, x2) -> (gamma*x1'*x2 + coef0)^degree\nLIBSVM.Kernel.RadialBasis: (x1, x2) -> (exp(-gamma*norm(x1 - x2)^2))\nLIBSVM.Kernel.Sigmoid: (x1, x2) -> tanh(gamma*x1'*x2 + coef0)\nHere gamma, coef0, degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions. See LIBSVM.jl issue 91\ngamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. 
Actual value used appears in the report (see below).\ncoef0 = 0.0: kernel parameter (see above)\ndegree::Int32 = Int32(3): degree in polynomial kernel (see above)\nnu=0.5 (range (0, 1]): An upper bound on the fraction of margin errors and a lower bound of the fraction of support vectors. Denoted ν in the cited paper. Changing nu changes the thickness of the margin (a neighborhood of the decision surface) and a margin error is said to have occurred if a training observation lies on the wrong side of the surface or within the margin.\ncachesize=200.0 cache memory size in MB\ntolerance=0.001: tolerance for the stopping criterion\nshrinking=true: whether to use shrinking heuristics","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/#Operations","page":"ProbabilisticNuSVC","title":"Operations","text":"","category":"section"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above.","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/#Fitted-parameters","page":"ProbabilisticNuSVC","title":"Fitted parameters","text":"","category":"section"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"libsvm_model: the trained model object created by the LIBSVM.jl package\nencoding: class encoding used internally by libsvm_model - a dictionary of class labels keyed on the internal integer representation","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/#Report","page":"ProbabilisticNuSVC","title":"Report","text":"","category":"section"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"The fields of report(mach) are:","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"gamma: actual value of the kernel parameter gamma used in training","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/#Examples","page":"ProbabilisticNuSVC","title":"Examples","text":"","category":"section"},{"location":"models/ProbabilisticNuSVC_LIBSVM/#Using-a-built-in-kernel","page":"ProbabilisticNuSVC","title":"Using a built-in kernel","text":"","category":"section"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"using MLJ\nimport LIBSVM\n\nProbabilisticNuSVC = @load ProbabilisticNuSVC pkg=LIBSVM ## model type\nmodel = ProbabilisticNuSVC(kernel=LIBSVM.Kernel.Polynomial) ## instance\n\nX, y = @load_iris ## table, vector\nmach = machine(model, X, y) |> fit!\n\nXnew = (sepal_length = [6.4, 7.2, 7.4],\n sepal_width = [2.8, 3.0, 2.8],\n petal_length = [5.6, 5.8, 6.1],\n petal_width = [2.1, 1.6, 1.9],)\n\njulia> probs = predict(mach, Xnew)\n3-element UnivariateFiniteVector{Multiclass{3}, String, UInt32, Float64}:\n UnivariateFinite{Multiclass{3}}(setosa=>0.00313, versicolor=>0.0247, virginica=>0.972)\n UnivariateFinite{Multiclass{3}}(setosa=>0.000598, versicolor=>0.0155, virginica=>0.984)\n UnivariateFinite{Multiclass{3}}(setosa=>2.27e-6, versicolor=>2.73e-6, virginica=>1.0)\n\njulia> yhat = mode.(probs)\n3-element CategoricalArrays.CategoricalArray{String,1,UInt32}:\n \"virginica\"\n 
\"virginica\"\n \"virginica\"","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/#User-defined-kernels","page":"ProbabilisticNuSVC","title":"User-defined kernels","text":"","category":"section"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"k(x1, x2) = x1'*x2 ## equivalent to `LIBSVM.Kernel.Linear`\nmodel = ProbabilisticNuSVC(kernel=k)\nmach = machine(model, X, y) |> fit!\n\nprobs = predict(mach, Xnew)","category":"page"},{"location":"models/ProbabilisticNuSVC_LIBSVM/","page":"ProbabilisticNuSVC","title":"ProbabilisticNuSVC","text":"See also the classifiers NuSVC, SVC, ProbabilisticSVC and LinearSVC. And see LIBSVM.jl and the original C implementation documentation.","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/#UnivariateFillImputer_MLJModels","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"","category":"section"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"UnivariateFillImputer","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"A model type for constructing a single variable fill imputer, based on MLJModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"UnivariateFillImputer = @load UnivariateFillImputer pkg=MLJModels","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"Do model = UnivariateFillImputer() to construct an instance with default hyper-parameters. 
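For example, here is a minimal, hedged sketch that swaps the default median fill for a mean fill (the choice of mean is illustrative only, and it is assumed the continuous_fill callable receives the training vector with its missing entries still present, hence the explicit skipmissing):\n\nusing MLJ, Statistics\n\nimputer = UnivariateFillImputer(continuous_fill = v -> mean(skipmissing(v)))\nmach = machine(imputer, [1.0, 2.0, missing, 5.0]) |> fit!\ntransform(mach, [missing, 4.0]) ## the missing entry is replaced by the learned mean\n\n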
Provide keyword arguments to override hyper-parameter defaults, as in UnivariateFillImputer(continuous_fill=...).","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"Use this model to impute missing values in a vector with a fixed value learned from the non-missing values of the training vector.","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"For imputing missing values in tabular data, use FillImputer instead.","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/#Training-data","page":"UnivariateFillImputer","title":"Training data","text":"","category":"section"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"mach = machine(model, x)","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"where","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"x: any abstract vector with element scitype Union{Missing, T} where T is a subtype of Continuous, Multiclass, OrderedFactor or Count; check scitype using scitype(x)","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/#Hyper-parameters","page":"UnivariateFillImputer","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"continuous_fill: function or other callable to determine value to be imputed in the case of Continuous (abstract float) data; default is to apply median after skipping missing values\ncount_fill: function or other callable to determine value to be imputed in the case of Count (integer) data; default is to apply rounded median after skipping missing values\nfinite_fill: function or other callable to determine value to be imputed in the case of Multiclass or OrderedFactor data (categorical vectors); default is to apply mode after skipping missing values","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/#Operations","page":"UnivariateFillImputer","title":"Operations","text":"","category":"section"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"transform(mach, xnew): return xnew with missing values imputed with the fill values learned when fitting mach","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/#Fitted-parameters","page":"UnivariateFillImputer","title":"Fitted parameters","text":"","category":"section"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"filler: the 
fill value to be imputed in all new data","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/#Examples","page":"UnivariateFillImputer","title":"Examples","text":"","category":"section"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"using MLJ\nimputer = UnivariateFillImputer()\n\nx_continuous = [1.0, 2.0, missing, 3.0]\nx_multiclass = coerce([\"y\", \"n\", \"y\", missing, \"y\"], Multiclass)\nx_count = [1, 1, 1, 2, missing, 3, 3]\n\nmach = machine(imputer, x_continuous)\nfit!(mach)\n\njulia> fitted_params(mach)\n(filler = 2.0,)\n\njulia> transform(mach, [missing, missing, 101.0])\n3-element Vector{Float64}:\n 2.0\n 2.0\n 101.0\n\nmach2 = machine(imputer, x_multiclass) |> fit!\n\njulia> transform(mach2, x_multiclass)\n5-element CategoricalArray{String,1,UInt32}:\n \"y\"\n \"n\"\n \"y\"\n \"y\"\n \"y\"\n\nmach3 = machine(imputer, x_count) |> fit!\n\njulia> transform(mach3, [missing, missing, 5])\n3-element Vector{Int64}:\n 2\n 2\n 5","category":"page"},{"location":"models/UnivariateFillImputer_MLJModels/","page":"UnivariateFillImputer","title":"UnivariateFillImputer","text":"For imputing tabular data, use FillImputer.","category":"page"},{"location":"models/RandomForestImputer_BetaML/#RandomForestImputer_BetaML","page":"RandomForestImputer","title":"RandomForestImputer","text":"","category":"section"},{"location":"models/RandomForestImputer_BetaML/","page":"RandomForestImputer","title":"RandomForestImputer","text":"mutable struct RandomForestImputer <: MLJModelInterface.Unsupervised","category":"page"},{"location":"models/RandomForestImputer_BetaML/","page":"RandomForestImputer","title":"RandomForestImputer","text":"Impute missing values using Random Forests, from the Beta Machine Learning Toolkit (BetaML).","category":"page"},{"location":"models/RandomForestImputer_BetaML/#Hyperparameters:","page":"RandomForestImputer","title":"Hyperparameters:","text":"","category":"section"},{"location":"models/RandomForestImputer_BetaML/","page":"RandomForestImputer","title":"RandomForestImputer","text":"n_trees::Int64: Number of (decision) trees in the forest [def: 30]\nmax_depth::Union{Nothing, Int64}: The maximum depth the tree is allowed to reach. When this is reached the node is forced to become a leaf [def: nothing, i.e. no limits]\nmin_gain::Float64: The minimum information gain to allow for a node's partition [def: 0]\nmin_records::Int64: The minimum number of records a node must hold to consider for a partition of it [def: 2]\nmax_features::Union{Nothing, Int64}: The maximum number of (random) features to consider at each partitioning [def: nothing, i.e. square root of the data dimension]\nforced_categorical_cols::Vector{Int64}: Specify the positions of the integer columns to treat as categorical instead of cardinal. [Default: empty vector (all numerical cols are treated as cardinal by default and the others as categorical)]\nsplitting_criterion::Union{Nothing, Function}: Either gini, entropy or variance. This is the name of the function to be used to compute the information gain of a specific partition. This is done by measuring the difference between the \"impurity\" of the labels of the parent node and those of the two child nodes, weighted by the respective number of items. [def: nothing, i.e. gini for categorical labels (classification task) and variance for numerical labels (regression task)]. 
It can be an anonymous function.\nrecursive_passages::Int64: Define the times to go through the various columns to impute their data. Useful when there are data to impute on multiple columns. The order of the first passage is given by the decreasing number of missing values per column, the other passages are random [default: 1].\nrng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]","category":"page"},{"location":"models/RandomForestImputer_BetaML/#Example:","page":"RandomForestImputer","title":"Example:","text":"","category":"section"},{"location":"models/RandomForestImputer_BetaML/","page":"RandomForestImputer","title":"RandomForestImputer","text":"julia> using MLJ\n\njulia> X = [1 10.5;1.5 missing; 1.8 8; 1.7 15; 3.2 40; missing missing; 3.3 38; missing -2.3; 5.2 -2.4] |> table ;\n\njulia> modelType = @load RandomForestImputer pkg = \"BetaML\" verbosity=0\nBetaML.Imputation.RandomForestImputer\n\njulia> model = modelType(n_trees=40)\nRandomForestImputer(\n n_trees = 40, \n max_depth = nothing, \n min_gain = 0.0, \n min_records = 2, \n max_features = nothing, \n forced_categorical_cols = Int64[], \n splitting_criterion = nothing, \n recursive_passages = 1, \n rng = Random._GLOBAL_RNG())\n\njulia> mach = machine(model, X);\n\njulia> fit!(mach);\n[ Info: Training machine(RandomForestImputer(n_trees = 40, …), …).\n\njulia> X_full = transform(mach) |> MLJ.matrix\n9×2 Matrix{Float64}:\n 1.0 10.5\n 1.5 10.3909\n 1.8 8.0\n 1.7 15.0\n 3.2 40.0\n 2.88375 8.66125\n 3.3 38.0\n 3.98125 -2.3\n 5.2 -2.4","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/#BayesianSubspaceLDA_MultivariateStats","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"","category":"section"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"BayesianSubspaceLDA","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"A model type for constructing a Bayesian subspace LDA model, based on MultivariateStats.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"BayesianSubspaceLDA = @load BayesianSubspaceLDA pkg=MultivariateStats","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"Do model = BayesianSubspaceLDA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BayesianSubspaceLDA(normalize=...).","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"The Bayesian multiclass subspace linear discriminant analysis algorithm learns a projection matrix as described in SubspaceLDA. 
The posterior class probability distribution is derived as in BayesianLDA.","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/#Training-data","page":"BayesianSubspaceLDA","title":"Training data","text":"","category":"section"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"Here:","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).\ny is the target, which can be any AbstractVector whose element scitype is OrderedFactor or Multiclass; check the scitype with scitype(y).","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/#Hyper-parameters","page":"BayesianSubspaceLDA","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"normalize=true: Option to normalize the between class variance for the number of observations in each class, one of true or false.","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"outdim: the ouput dimension, automatically set to min(indim, nclasses-1) if equal to 0. If a non-zero outdim is passed, then the actual output dimension used is min(rank, outdim) where rank is the rank of the within-class covariance matrix.","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"priors::Union{Nothing, UnivariateFinite{<:Any, <:Any, <:Any, <:Real}, Dict{<:Any, <:Real}} = nothing: For use in prediction with Bayes rule. If priors = nothing then priors are estimated from the class proportions in the training data. Otherwise it requires a Dict or UnivariateFinite object specifying the classes with non-zero probabilities in the training target.","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/#Operations","page":"BayesianSubspaceLDA","title":"Operations","text":"","category":"section"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.\npredict(mach, Xnew): Return predictions of the target given features Xnew, which should have same scitype as X above. 
Predictions are probabilistic but uncalibrated.\npredict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/#Fitted-parameters","page":"BayesianSubspaceLDA","title":"Fitted parameters","text":"","category":"section"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"classes: The classes seen during model fitting.\nprojection_matrix: The learned projection matrix, of size (indim, outdim), where indim and outdim are the input and output dimensions respectively (See Report section below).\npriors: The class priors for classification. As inferred from training target y, if not user-specified. A UnivariateFinite object with levels consistent with levels(y).","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/#Report","page":"BayesianSubspaceLDA","title":"Report","text":"","category":"section"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"The fields of report(mach) are:","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"indim: The dimension of the input space i.e the number of training features.\noutdim: The dimension of the transformed space the model is projected to.\nmean: The overall mean of the training data.\nnclasses: The number of classes directly observed in the training data (which can be less than the total number of classes in the class pool).","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"class_means: The class-specific means of the training data. A matrix of size (indim, nclasses) with the ith column being the class-mean of the ith class in classes (See fitted params section above).","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"class_weights: The weights (class counts) of each class. A vector of length nclasses with the ith element being the class weight of the ith class in classes. (See fitted params section above.)\nexplained_variance_ratio: The ratio of explained variance to total variance. 
Each dimension corresponds to an eigenvalue.","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/#Examples","page":"BayesianSubspaceLDA","title":"Examples","text":"","category":"section"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"using MLJ\n\nBayesianSubspaceLDA = @load BayesianSubspaceLDA pkg=MultivariateStats\n\nX, y = @load_iris ## a table and a vector\n\nmodel = BayesianSubspaceLDA()\nmach = machine(model, X, y) |> fit!\n\nXproj = transform(mach, X)\ny_hat = predict(mach, X)\nlabels = predict_mode(mach, X)","category":"page"},{"location":"models/BayesianSubspaceLDA_MultivariateStats/","page":"BayesianSubspaceLDA","title":"BayesianSubspaceLDA","text":"See also LDA, BayesianLDA, SubspaceLDA","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/#DecisionTreeRegressor_DecisionTree","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"","category":"section"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"DecisionTreeRegressor","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"A model type for constructing a CART decision tree regressor, based on DecisionTree.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"DecisionTreeRegressor = @load DecisionTreeRegressor pkg=DecisionTree","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"Do model = DecisionTreeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in DecisionTreeRegressor(max_depth=...).","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"DecisionTreeRegressor implements the CART algorithm, originally published in Breiman, Leo; Friedman, J. H.; Olshen, R. A.; Stone, C. J. (1984): \"Classification and regression trees\". 
Monterey, CA: Wadsworth & Brooks/Cole Advanced Books & Software.","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/#Training-data","page":"DecisionTreeRegressor","title":"Training data","text":"","category":"section"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"where","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)\ny: the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"Train the machine with fit!(mach, rows=...).","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/#Hyperparameters","page":"DecisionTreeRegressor","title":"Hyperparameters","text":"","category":"section"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"max_depth=-1: max depth of the decision tree (-1=any)\nmin_samples_leaf=1: min number of samples each leaf needs to have\nmin_samples_split=2: min number of samples needed for a split\nmin_purity_increase=0: min purity needed for a split\nn_subfeatures=0: number of features to select at random (0 for all)\npost_prune=false: set to true for post-fit pruning\nmerge_purity_threshold=1.0: (post-pruning) merge leaves having combined purity >= merge_purity_threshold\nfeature_importance: method to use for computing feature importances. 
One of (:impurity, :split)\nrng=Random.GLOBAL_RNG: random number generator or seed","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/#Operations","page":"DecisionTreeRegressor","title":"Operations","text":"","category":"section"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"predict(mach, Xnew): return predictions of the target given new features Xnew having the same scitype as X above.","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/#Fitted-parameters","page":"DecisionTreeRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"tree: the tree or stump object returned by the core DecisionTree.jl algorithm\nfeatures: the names of the features encountered in training","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/#Report","page":"DecisionTreeRegressor","title":"Report","text":"","category":"section"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"The fields of report(mach) are:","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"features: the names of the features encountered in training","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/#Accessor-functions","page":"DecisionTreeRegressor","title":"Accessor functions","text":"","category":"section"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"feature_importances(mach) returns a vector of (feature::Symbol => importance) pairs; the type of importance is determined by the hyperparameter feature_importance (see above)","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/#Examples","page":"DecisionTreeRegressor","title":"Examples","text":"","category":"section"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"using MLJ\nDecisionTreeRegressor = @load DecisionTreeRegressor pkg=DecisionTree\nmodel = DecisionTreeRegressor(max_depth=3, min_samples_split=3)\n\nX, y = make_regression(100, 4; rng=123) ## synthetic data\nmach = machine(model, X, y) |> fit!\n\nXnew, _ = make_regression(3, 2; rng=123)\nyhat = predict(mach, Xnew) ## new predictions\n\njulia> fitted_params(mach).tree\nx1 < 0.2758\n├─ x2 < 0.9137\n│ ├─ x1 < -0.9582\n│ │ ├─ 0.9189256882087312 (0/12)\n│ │ └─ -0.23180616021065256 (0/38)\n│ └─ -1.6461153800037722 (0/9)\n└─ x1 < 1.062\n ├─ x2 < -0.4969\n │ ├─ -0.9330755147107384 (0/5)\n │ └─ -2.3287967825015548 (0/17)\n └─ x2 < 0.4598\n ├─ -2.931299926506291 (0/11)\n └─ -4.726518740473489 (0/8)\n\nfeature_importances(mach) ## get feature importances","category":"page"},{"location":"models/DecisionTreeRegressor_DecisionTree/","page":"DecisionTreeRegressor","title":"DecisionTreeRegressor","text":"See also DecisionTree.jl and the unwrapped model type 
MLJDecisionTreeInterface.DecisionTree.DecisionTreeRegressor.","category":"page"},{"location":"models/IForestDetector_OutlierDetectionPython/#IForestDetector_OutlierDetectionPython","page":"IForestDetector","title":"IForestDetector","text":"","category":"section"},{"location":"models/IForestDetector_OutlierDetectionPython/","page":"IForestDetector","title":"IForestDetector","text":"IForestDetector(n_estimators = 100,\n max_samples = \"auto\",\n max_features = 1.0\n bootstrap = false,\n random_state = nothing,\n verbose = 0,\n n_jobs = 1)","category":"page"},{"location":"models/IForestDetector_OutlierDetectionPython/","page":"IForestDetector","title":"IForestDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.iforest","category":"page"},{"location":"models/RODDetector_OutlierDetectionPython/#RODDetector_OutlierDetectionPython","page":"RODDetector","title":"RODDetector","text":"","category":"section"},{"location":"models/RODDetector_OutlierDetectionPython/","page":"RODDetector","title":"RODDetector","text":"RODDetector(parallel_execution = false)","category":"page"},{"location":"models/RODDetector_OutlierDetectionPython/","page":"RODDetector","title":"RODDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.rod","category":"page"},{"location":"models/RandomForestClassifier_MLJScikitLearnInterface/#RandomForestClassifier_MLJScikitLearnInterface","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"","category":"section"},{"location":"models/RandomForestClassifier_MLJScikitLearnInterface/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"RandomForestClassifier","category":"page"},{"location":"models/RandomForestClassifier_MLJScikitLearnInterface/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"A model type for constructing a random forest classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/RandomForestClassifier_MLJScikitLearnInterface/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/RandomForestClassifier_MLJScikitLearnInterface/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"RandomForestClassifier = @load RandomForestClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/RandomForestClassifier_MLJScikitLearnInterface/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"Do model = RandomForestClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomForestClassifier(n_estimators=...).","category":"page"},{"location":"models/RandomForestClassifier_MLJScikitLearnInterface/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"A random forest is a meta estimator that fits a number of classifying decision trees on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting. 
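A hedged usage sketch (this assumes MLJScikitLearnInterface and its Python dependencies are installed; the hyper-parameter values are illustrative only):\n\nusing MLJ\nRandomForestClassifier = @load RandomForestClassifier pkg=MLJScikitLearnInterface\nmodel = RandomForestClassifier(n_estimators=200, max_depth=3)\nX, y = @load_iris\nmach = machine(model, X, y) |> fit!\nyhat = predict(mach, X) ## predictions on the training table, for illustration only\n\n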
The sub-sample size is controlled with the max_samples parameter if bootstrap=True (default), otherwise the whole dataset is used to build each tree.","category":"page"},{"location":"models/EnsembleModel_MLJEnsembles/#EnsembleModel_MLJEnsembles","page":"EnsembleModel","title":"EnsembleModel","text":"","category":"section"},{"location":"models/EnsembleModel_MLJEnsembles/","page":"EnsembleModel","title":"EnsembleModel","text":"EnsembleModel(model,\n atomic_weights=Float64[],\n bagging_fraction=0.8,\n n=100,\n rng=GLOBAL_RNG,\n acceleration=CPU1(),\n out_of_bag_measure=[])","category":"page"},{"location":"models/EnsembleModel_MLJEnsembles/","page":"EnsembleModel","title":"EnsembleModel","text":"Create a model for training an ensemble of n clones of model, with optional bagging. Ensembling is useful if fit!(machine(atom, data...)) does not create identical models on repeated calls (ie, is a stochastic model, such as a decision tree with randomized node selection criteria), or if bagging_fraction is set to a value less than 1.0, or both.","category":"page"},{"location":"models/EnsembleModel_MLJEnsembles/","page":"EnsembleModel","title":"EnsembleModel","text":"Here the atomic model must support targets with scitype AbstractVector{<:Finite} (single-target classifiers) or AbstractVector{<:Continuous} (single-target regressors).","category":"page"},{"location":"models/EnsembleModel_MLJEnsembles/","page":"EnsembleModel","title":"EnsembleModel","text":"If rng is an integer, then MersenneTwister(rng) is the random number generator used for bagging. Otherwise some AbstractRNG object is expected.","category":"page"},{"location":"models/EnsembleModel_MLJEnsembles/","page":"EnsembleModel","title":"EnsembleModel","text":"The atomic predictions are optionally weighted according to the vector atomic_weights (to allow for external optimization) except in the case that model is a Deterministic classifier, in which case atomic_weights are ignored.","category":"page"},{"location":"models/EnsembleModel_MLJEnsembles/","page":"EnsembleModel","title":"EnsembleModel","text":"The ensemble model is Deterministic or Probabilistic, according to the corresponding supertype of atom. In the case of deterministic classifiers (target_scitype(atom) <: AbstractVector{<:Finite}), the predictions are majority votes, and for regressors (target_scitype(atom) <: AbstractVector{<:Continuous}) they are ordinary averages. 
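As a short, hedged sketch (assuming the DecisionTree.jl interface package is installed; the hyper-parameter values are illustrative only), one might bag a probabilistic tree classifier as follows:\n\nusing MLJ\nDecisionTreeClassifier = @load DecisionTreeClassifier pkg=DecisionTree\nforest = EnsembleModel(model=DecisionTreeClassifier(), n=50, bagging_fraction=0.7)\nX, y = @load_iris\nmach = machine(forest, X, y) |> fit!\npredict(mach, X) ## predictions averaged over the 50 atomic trees\n\n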
Probabilistic predictions are obtained by averaging the atomic probability distribution/mass functions; in particular, for regressors, the ensemble prediction on each input pattern has the type MixtureModel{VF,VS,D} from the Distributions.jl package, where D is the type of predicted distribution for atom.","category":"page"},{"location":"models/EnsembleModel_MLJEnsembles/","page":"EnsembleModel","title":"EnsembleModel","text":"Specify acceleration=CPUProcesses() for distributed computing, or CPUThreads() for multithreading.","category":"page"},{"location":"models/EnsembleModel_MLJEnsembles/","page":"EnsembleModel","title":"EnsembleModel","text":"If a single measure or non-empty vector of measures is specified by out_of_bag_measure, then out-of-bag estimates of performance are written to the training report (call report on the trained machine wrapping the ensemble model).","category":"page"},{"location":"models/EnsembleModel_MLJEnsembles/","page":"EnsembleModel","title":"EnsembleModel","text":"Important: If per-observation or class weights w (not to be confused with atomic weights) are specified when constructing a machine for the ensemble model, as in mach = machine(ensemble_model, X, y, w), then w is used by any measures specified in out_of_bag_measure that support them.","category":"page"},{"location":"about_mlj/#About-MLJ","page":"About MLJ","title":"About MLJ","text":"","category":"section"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"MLJ (Machine Learning in Julia) is a toolbox written in Julia providing a common interface and meta-algorithms for selecting, tuning, evaluating, composing and comparing over 180 machine learning models written in Julia and other languages. In particular MLJ wraps a large number of scikit-learn models.","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"MLJ is released under the MIT license.","category":"page"},{"location":"about_mlj/#Lightning-tour","page":"About MLJ","title":"Lightning tour","text":"","category":"section"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"For help learning to use MLJ, see Learning MLJ.","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"A self-contained notebook and julia script of this demonstration is also available here.","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"The first code snippet below creates a new Julia environment MLJ_tour and installs just those packages needed for the tour. See Installation for more on creating a Julia environment for use with MLJ.","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"Julia installation instructions are here.","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"using Pkg\nPkg.activate(\"MLJ_tour\", shared=true)\nPkg.add(\"MLJ\")\nPkg.add(\"MLJIteration\")\nPkg.add(\"EvoTrees\")","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"In MLJ a model is just a container for hyper-parameters, and that's all. 
Here we will apply several kinds of model composition before binding the resulting \"meta-model\" to data in a machine for evaluation using cross-validation.","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"Loading and instantiating a gradient tree-boosting model:","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"using MLJ\nBooster = @load EvoTreeRegressor # loads code defining a model type\nbooster = Booster(max_depth=2) # specify hyper-parameter at construction\nbooster.nrounds = 50 # or mutate afterwards","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"This model is an example of an iterative model. As it stands, the number of iterations nrounds is fixed.","category":"page"},{"location":"about_mlj/#Composition-1:-Wrapping-the-model-to-make-it-\"self-iterating\"","page":"About MLJ","title":"Composition 1: Wrapping the model to make it \"self-iterating\"","text":"","category":"section"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"Let's create a new model that automatically learns the number of iterations, using the NumberSinceBest(3) criterion, as applied to an out-of-sample l1 loss:","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"using MLJIteration\niterated_booster = IteratedModel(model=booster,\n resampling=Holdout(fraction_train=0.8),\n controls=[Step(2), NumberSinceBest(3), NumberLimit(300)],\n measure=l1,\n retrain=true)","category":"page"},{"location":"about_mlj/#Composition-2:-Preprocess-the-input-features","page":"About MLJ","title":"Composition 2: Preprocess the input features","text":"","category":"section"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"Combining the model with categorical feature encoding:","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"pipe = ContinuousEncoder() |> iterated_booster","category":"page"},{"location":"about_mlj/#Composition-3:-Wrapping-the-model-to-make-it-\"self-tuning\"","page":"About MLJ","title":"Composition 3: Wrapping the model to make it \"self-tuning\"","text":"","category":"section"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"First, we define a hyper-parameter range for optimization of a (nested) hyper-parameter:","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"max_depth_range = range(pipe,\n :(deterministic_iterated_model.model.max_depth),\n lower = 1,\n upper = 10)","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"Now we can wrap the pipeline model in an optimization strategy to make it \"self-tuning\":","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"self_tuning_pipe = TunedModel(model=pipe,\n tuning=RandomSearch(),\n ranges=max_depth_range,\n resampling=CV(nfolds=3, rng=456),\n measure=l1,\n acceleration=CPUThreads(),\n n=50)","category":"page"},{"location":"about_mlj/#Binding-to-data-and-evaluating-performance","page":"About MLJ","title":"Binding to data and evaluating performance","text":"","category":"section"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"Loading a selection of features and labels from the Ames House Price dataset:","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"X, y = 
@load_reduced_ames","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"Evaluating the \"self-tuning\" pipeline model's performance using 5-fold cross-validation (implies multiple layers of nested resampling):","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"julia> evaluate(self_tuning_pipe, X, y,\n measures=[l1, l2],\n resampling=CV(nfolds=5, rng=123),\n acceleration=CPUThreads(),\n verbosity=2)\nPerformanceEvaluation object with these fields:\n measure, measurement, operation, per_fold,\n per_observation, fitted_params_per_fold,\n report_per_fold, train_test_pairs\nExtract:\n┌───────────────┬─────────────┬───────────┬───────────────────────────────────────────────┐\n│ measure │ measurement │ operation │ per_fold │\n├───────────────┼─────────────┼───────────┼───────────────────────────────────────────────┤\n│ LPLoss(p = 1) │ 17200.0 │ predict │ [16500.0, 17100.0, 16300.0, 17500.0, 18900.0] │\n│ LPLoss(p = 2) │ 6.83e8 │ predict │ [6.14e8, 6.64e8, 5.98e8, 6.37e8, 9.03e8] │\n└───────────────┴─────────────┴───────────┴───────────────────────────────────────────────┘","category":"page"},{"location":"about_mlj/#Key-goals","page":"About MLJ","title":"Key goals","text":"","category":"section"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"Offer a consistent way to use, compose and tune machine learning models in Julia,\nPromote the improvement of the Julia ML/Stats ecosystem by making it easier to use models from a wide range of packages,\nUnlock performance gains by exploiting Julia's support for parallelism, automatic differentiation, GPU, optimization etc.","category":"page"},{"location":"about_mlj/#Key-features","page":"About MLJ","title":"Key features","text":"","category":"section"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"Data agnostic, train most models on any data X supported by the Tables.jl interface (needs Tables.istable(X) == true).\nExtensive, state-of-the-art, support for model composition (pipelines, stacks and, more generally, learning networks). 
See more below.\nConvenient syntax to tune and evaluate (composite) models.\nConsistent interface to handle probabilistic predictions.\nExtensible tuning interface, to support a growing number of optimization strategies, and designed to play well with model composition.\nOptions to accelerate model evaluation and tuning with multithreading and/or distributed processing.","category":"page"},{"location":"about_mlj/#Model-composability","page":"About MLJ","title":"Model composability","text":"","category":"section"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"The generic model composition API's provided by other toolboxes we have surveyed share one or more of the following shortcomings, which do not exist in MLJ:","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"Composite models do not inherit all the behavior of ordinary models.\nComposition is limited to linear (non-branching) pipelines.\nSupervised components in a linear pipeline can only occur at the end of the pipeline.\nOnly static (unlearned) target transformations/inverse transformations are supported.\nHyper-parameters in homogeneous model ensembles cannot be coupled.\nModel stacking, with out-of-sample predictions for base learners, cannot be implemented (using the generic API alone).\nHyper-parameters and/or learned parameters of component models are not easily inspected or manipulated (by tuning algorithms, for example)\nComposite models cannot implement multiple operations, for example, both a predict and transform method (as in clustering models) or both a transform and inverse_transform method.","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"Some of these features are demonstrated in this notebook","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"For more information see the MLJ design paper or our detailed paper on the composition interface.","category":"page"},{"location":"about_mlj/#Getting-help-and-reporting-problems","page":"About MLJ","title":"Getting help and reporting problems","text":"","category":"section"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"Users are encouraged to provide feedback on their experience using MLJ and to report issues.","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"For a query to have maximum exposure to maintainers and users, start a discussion thread at Julia Discourse Machine Learning and tag your issue \"mlj\". Queries can also be posted as issues, or on the #mlj slack workspace in the Julia Slack channel.","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"Bugs, suggestions, and feature requests can be posted here.","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"Users are also welcome to join the #mlj Julia slack channel to ask questions and make suggestions.","category":"page"},{"location":"about_mlj/#Installation","page":"About MLJ","title":"Installation","text":"","category":"section"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"Initially, it is recommended that MLJ and associated packages be installed in a new environment to avoid package conflicts. 
You can do this with","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"julia> using Pkg; Pkg.activate(\"my_MLJ_env\", shared=true)","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"Installing MLJ is also done with the package manager:","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"julia> Pkg.add(\"MLJ\")","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"Optional: To test your installation, run","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"julia> Pkg.test(\"MLJ\")","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"It is important to note that MLJ is essentially a big wrapper providing unified access to model-providing packages. For this reason, one generally needs to add further packages to your environment to make model-specific code available. This happens automatically when you use MLJ's interactive load command @iload, as in","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"julia> Tree = @iload DecisionTreeClassifier # load type\njulia> tree = Tree() # instance","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"where you will also be asked to choose a providing package, for more than one provide a DecisionTreeClassifier model. For more on identifying the name of an applicable model, see Model Search. For non-interactive loading of code (e.g., from a module or function) see Loading Model Code.","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"It is recommended that you start with models from more mature packages such as DecisionTree.jl, ScikitLearn.jl or XGBoost.jl.","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"MLJ is supported by several satellite packages (MLJTuning, MLJModelInterface, etc) which the general user is not required to install directly. Developers can learn more about these here.","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"See also the alternative installation instructions for Modifying Behavior.","category":"page"},{"location":"about_mlj/#Funding","page":"About MLJ","title":"Funding","text":"","category":"section"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"MLJ was initially created as a Tools, Practices and Systems project at the Alan Turing Institute in 2019. Current funding is provided by a New Zealand Strategic Science Investment Fund awarded to the University of Auckland.","category":"page"},{"location":"about_mlj/#Citing-MLJ","page":"About MLJ","title":"Citing MLJ","text":"","category":"section"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"An overview of MLJ design:","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"(Image: DOI)","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"@article{Blaom2020,\n doi = {10.21105/joss.02704},\n url = {https://doi.org/10.21105/joss.02704},\n year = {2020},\n publisher = {The Open Journal},\n volume = {5},\n number = {55},\n pages = {2704},\n author = {Anthony D. Blaom and Franz Kiraly and Thibaut Lienart and Yiannis Simillides and Diego Arenas and Sebastian J. 
Vollmer},\n title = {{MLJ}: A Julia package for composable machine learning},\n journal = {Journal of Open Source Software}\n}","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"An in-depth view of MLJ's model composition design:","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"(Image: arXiv)","category":"page"},{"location":"about_mlj/","page":"About MLJ","title":"About MLJ","text":"@misc{blaom2020flexible,\n title={Flexible model composition in machine learning and its implementation in {MLJ}},\n author={Anthony D. Blaom and Sebastian J. Vollmer},\n year={2020},\n eprint={2012.15505},\n archivePrefix={arXiv},\n primaryClass={cs.LG}\n}","category":"page"},{"location":"models/PPCA_MultivariateStats/#PPCA_MultivariateStats","page":"PPCA","title":"PPCA","text":"","category":"section"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"PPCA","category":"page"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"A model type for constructing a probabilistic PCA model, based on MultivariateStats.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"PPCA = @load PPCA pkg=MultivariateStats","category":"page"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"Do model = PPCA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in PPCA(maxoutdim=...).","category":"page"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"Probabilistic principal component analysis is a dimension-reduction algorithm which represents a constrained form of the Gaussian distribution in which the number of free parameters can be restricted while still allowing the model to capture the dominant correlations in a data set. It is expressed as the maximum likelihood solution of a probabilistic latent variable model. For details, see C. M. Bishop (2006): Pattern Recognition and Machine Learning.","category":"page"},{"location":"models/PPCA_MultivariateStats/#Training-data","page":"PPCA","title":"Training data","text":"","category":"section"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"mach = machine(model, X)","category":"page"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"Here:","category":"page"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).","category":"page"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/PPCA_MultivariateStats/#Hyper-parameters","page":"PPCA","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"maxoutdim=0: Controls the dimension (number of columns) of the output, outdim. 
Specifically, outdim = min(n, indim, maxoutdim), where n is the number of observations and indim the input dimension.\nmethod::Symbol=:ml: The method to use to solve the problem, one of :ml, :em, :bayes.\nmaxiter::Int=1000: The maximum number of iterations.\ntol::Real=1e-6: The convergence tolerance.\nmean::Union{Nothing, Real, Vector{Float64}}=nothing: If nothing, centering will be computed and applied; if set to 0 no centering is applied (data is assumed pre-centered); if a vector, the centering is done with that vector.","category":"page"},{"location":"models/PPCA_MultivariateStats/#Operations","page":"PPCA","title":"Operations","text":"","category":"section"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.\ninverse_transform(mach, Xsmall): For a dimension-reduced table Xsmall, such as returned by transform, reconstruct a table, having the same number of columns as the original training data X, that transforms to Xsmall. Mathematically, inverse_transform is a right-inverse for the PCA projection map, whose image is orthogonal to the kernel of that map. In particular, if Xsmall = transform(mach, Xnew), then inverse_transform(Xsmall) is only an approximation to Xnew.","category":"page"},{"location":"models/PPCA_MultivariateStats/#Fitted-parameters","page":"PPCA","title":"Fitted parameters","text":"","category":"section"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"projection: Returns the projection matrix, which has size (indim, outdim), where indim and outdim are the number of features of the input and output respectively. Each column of the projection matrix corresponds to a principal component.","category":"page"},{"location":"models/PPCA_MultivariateStats/#Report","page":"PPCA","title":"Report","text":"","category":"section"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"The fields of report(mach) are:","category":"page"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"indim: Dimension (number of columns) of the training data and new data to be transformed.\noutdim: Dimension of transformed data.\ntvat: The variance of the components.\nloadings: The model's loadings matrix. 
A matrix of size (indim, outdim) where indim and outdim are as defined above.","category":"page"},{"location":"models/PPCA_MultivariateStats/#Examples","page":"PPCA","title":"Examples","text":"","category":"section"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"using MLJ\n\nPPCA = @load PPCA pkg=MultivariateStats\n\nX, y = @load_iris ## a table and a vector\n\nmodel = PPCA(maxoutdim=2)\nmach = machine(model, X) |> fit!\n\nXproj = transform(mach, X)","category":"page"},{"location":"models/PPCA_MultivariateStats/","page":"PPCA","title":"PPCA","text":"See also KernelPCA, ICA, FactorAnalysis, PCA","category":"page"},{"location":"models/BM25Transformer_MLJText/#BM25Transformer_MLJText","page":"BM25Transformer","title":"BM25Transformer","text":"","category":"section"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"BM25Transformer","category":"page"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"A model type for constructing a BM25 transformer, based on MLJText.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"BM25Transformer = @load BM25Transformer pkg=MLJText","category":"page"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"Do model = BM25Transformer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BM25Transformer(max_doc_freq=...).","category":"page"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"The transformer converts a collection of documents, tokenized or pre-parsed as bags of words/ngrams, to a matrix of Okapi BM25 document-word statistics. The BM25 scoring function uses both term frequency (TF) and inverse document frequency (IDF, defined below), as in TfidfTransformer, but additionally adjusts for the probability that a user will consider a search result relevant based on the terms in the search query and those in each document.","category":"page"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"In textbooks and implementations there is variation in the definition of IDF. Here two IDF definitions are available. The default, smoothed option provides the IDF for a term t as log((1 + n)/(1 + df(t))) + 1, where n is the total number of documents and df(t) the number of documents in which t appears. 
Setting smooth_idf = false provides an IDF of log(n/df(t)) + 1.","category":"page"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"References:","category":"page"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"http://ethen8181.github.io/machine-learning/search/bm25_intro.html\nhttps://en.wikipedia.org/wiki/Okapi_BM25\nhttps://nlp.stanford.edu/IR-book/html/htmledition/okapi-bm25-a-non-binary-model-1.html","category":"page"},{"location":"models/BM25Transformer_MLJText/#Training-data","page":"BM25Transformer","title":"Training data","text":"","category":"section"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"mach = machine(model, X)","category":"page"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"Here:","category":"page"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"X is any vector whose elements are either tokenized documents or bags of words/ngrams. Specifically, each element is one of the following:\nA vector of abstract strings (tokens), e.g., [\"I\", \"like\", \"Sam\", \".\", \"Sam\", \"is\", \"nice\", \".\"] (scitype AbstractVector{Textual})\nA dictionary of counts, indexed on abstract strings, e.g., Dict(\"I\"=>1, \"Sam\"=>2, \"Sam is\"=>1) (scitype Multiset{Textual})\nA dictionary of counts, indexed on plain ngrams, e.g., Dict((\"I\",)=>1, (\"Sam\",)=>2, (\"I\", \"Sam\")=>1) (scitype Multiset{<:NTuple{N,Textual} where N}); here a plain ngram is a tuple of abstract strings.","category":"page"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/BM25Transformer_MLJText/#Hyper-parameters","page":"BM25Transformer","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"max_doc_freq=1.0: Restricts the vocabulary that the transformer will consider. Terms that occur in > max_doc_freq documents will not be considered by the transformer. For example, if max_doc_freq is set to 0.9, terms that are in more than 90% of the documents will be removed.\nmin_doc_freq=0.0: Restricts the vocabulary that the transformer will consider. Terms that occur in < min_doc_freq documents will not be considered by the transformer. A value of 0.01 means that only terms that are at least in 1% of the documents will be included.\nκ=2: The term frequency saturation characteristic. Higher values represent slower saturation. What we mean by saturation is the degree to which a term occurring extra times adds to the overall score.\nβ=0.75: Amplifies the particular document length compared to the average length. The bigger β is, the more document length is amplified in terms of the overall score. 
The default value is 0.75, and the bounds are restricted between 0 and 1.\nsmooth_idf=true: Control which definition of IDF to use (see above).","category":"page"},{"location":"models/BM25Transformer_MLJText/#Operations","page":"BM25Transformer","title":"Operations","text":"","category":"section"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"transform(mach, Xnew): Based on the vocabulary, IDF, and mean word counts learned in training, return the matrix of BM25 scores for Xnew, a vector of the same form as X above. The matrix has size (n, p), where n = length(Xnew) and p the size of the vocabulary. Tokens/ngrams not appearing in the learned vocabulary are scored zero.","category":"page"},{"location":"models/BM25Transformer_MLJText/#Fitted-parameters","page":"BM25Transformer","title":"Fitted parameters","text":"","category":"section"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"vocab: A vector containing the string used in the transformer's vocabulary.\nidf_vector: The transformer's calculated IDF vector.\nmean_words_in_docs: The mean number of words in each document.","category":"page"},{"location":"models/BM25Transformer_MLJText/#Examples","page":"BM25Transformer","title":"Examples","text":"","category":"section"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"BM25Transformer accepts a variety of inputs. The example below transforms tokenized documents:","category":"page"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"using MLJ\nimport TextAnalysis\n\nBM25Transformer = @load BM25Transformer pkg=MLJText\n\ndocs = [\"Hi my name is Sam.\", \"How are you today?\"]\nbm25_transformer = BM25Transformer()\n\njulia> tokenized_docs = TextAnalysis.tokenize.(docs)\n2-element Vector{Vector{String}}:\n [\"Hi\", \"my\", \"name\", \"is\", \"Sam\", \".\"]\n [\"How\", \"are\", \"you\", \"today\", \"?\"]\n\nmach = machine(bm25_transformer, tokenized_docs)\nfit!(mach)\n\nfitted_params(mach)\n\ntfidf_mat = transform(mach, tokenized_docs)","category":"page"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"Alternatively, one can provide documents pre-parsed as ngrams counts:","category":"page"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"using MLJ\nimport TextAnalysis\n\ndocs = [\"Hi my name is Sam.\", \"How are you today?\"]\ncorpus = TextAnalysis.Corpus(TextAnalysis.NGramDocument.(docs, 1, 2))\nngram_docs = TextAnalysis.ngrams.(corpus)\n\njulia> ngram_docs[1]\nDict{AbstractString, Int64} with 11 entries:\n \"is\" => 1\n \"my\" => 1\n \"name\" => 1\n \".\" => 1\n \"Hi\" => 1\n \"Sam\" => 1\n \"my name\" => 1\n \"Hi my\" => 1\n \"name is\" => 1\n \"Sam .\" => 1\n \"is Sam\" => 1\n\nbm25_transformer = BM25Transformer()\nmach = machine(bm25_transformer, ngram_docs)\nMLJ.fit!(mach)\nfitted_params(mach)\n\ntfidf_mat = transform(mach, ngram_docs)","category":"page"},{"location":"models/BM25Transformer_MLJText/","page":"BM25Transformer","title":"BM25Transformer","text":"See also TfidfTransformer, 
CountTransformer","category":"page"},{"location":"models/DeterministicConstantClassifier_MLJModels/#DeterministicConstantClassifier_MLJModels","page":"DeterministicConstantClassifier","title":"DeterministicConstantClassifier","text":"","category":"section"},{"location":"models/DeterministicConstantClassifier_MLJModels/","page":"DeterministicConstantClassifier","title":"DeterministicConstantClassifier","text":"DeterministicConstantClassifier","category":"page"},{"location":"models/DeterministicConstantClassifier_MLJModels/","page":"DeterministicConstantClassifier","title":"DeterministicConstantClassifier","text":"A model type for constructing a deterministic constant classifier, based on MLJModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/DeterministicConstantClassifier_MLJModels/","page":"DeterministicConstantClassifier","title":"DeterministicConstantClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/DeterministicConstantClassifier_MLJModels/","page":"DeterministicConstantClassifier","title":"DeterministicConstantClassifier","text":"DeterministicConstantClassifier = @load DeterministicConstantClassifier pkg=MLJModels","category":"page"},{"location":"models/DeterministicConstantClassifier_MLJModels/","page":"DeterministicConstantClassifier","title":"DeterministicConstantClassifier","text":"Do model = DeterministicConstantClassifier() to construct an instance with default hyper-parameters. ","category":"page"},{"location":"models/RidgeRegressor_MLJScikitLearnInterface/#RidgeRegressor_MLJScikitLearnInterface","page":"RidgeRegressor","title":"RidgeRegressor","text":"","category":"section"},{"location":"models/RidgeRegressor_MLJScikitLearnInterface/","page":"RidgeRegressor","title":"RidgeRegressor","text":"RidgeRegressor","category":"page"},{"location":"models/RidgeRegressor_MLJScikitLearnInterface/","page":"RidgeRegressor","title":"RidgeRegressor","text":"A model type for constructing a ridge regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/RidgeRegressor_MLJScikitLearnInterface/","page":"RidgeRegressor","title":"RidgeRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/RidgeRegressor_MLJScikitLearnInterface/","page":"RidgeRegressor","title":"RidgeRegressor","text":"RidgeRegressor = @load RidgeRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/RidgeRegressor_MLJScikitLearnInterface/","page":"RidgeRegressor","title":"RidgeRegressor","text":"Do model = RidgeRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in RidgeRegressor(alpha=...).","category":"page"},{"location":"models/RidgeRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"RidgeRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/RidgeRegressor_MLJScikitLearnInterface/","page":"RidgeRegressor","title":"RidgeRegressor","text":"alpha = 1.0\nfit_intercept = true\ncopy_X = true\nmax_iter = 1000\ntol = 0.0001\nsolver = auto\nrandom_state = nothing","category":"page"},{"location":"models/MultiTaskLassoRegressor_MLJScikitLearnInterface/#MultiTaskLassoRegressor_MLJScikitLearnInterface","page":"MultiTaskLassoRegressor","title":"MultiTaskLassoRegressor","text":"","category":"section"},{"location":"models/MultiTaskLassoRegressor_MLJScikitLearnInterface/","page":"MultiTaskLassoRegressor","title":"MultiTaskLassoRegressor","text":"MultiTaskLassoRegressor","category":"page"},{"location":"models/MultiTaskLassoRegressor_MLJScikitLearnInterface/","page":"MultiTaskLassoRegressor","title":"MultiTaskLassoRegressor","text":"A model type for constructing a multi-target lasso regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/MultiTaskLassoRegressor_MLJScikitLearnInterface/","page":"MultiTaskLassoRegressor","title":"MultiTaskLassoRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/MultiTaskLassoRegressor_MLJScikitLearnInterface/","page":"MultiTaskLassoRegressor","title":"MultiTaskLassoRegressor","text":"MultiTaskLassoRegressor = @load MultiTaskLassoRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/MultiTaskLassoRegressor_MLJScikitLearnInterface/","page":"MultiTaskLassoRegressor","title":"MultiTaskLassoRegressor","text":"Do model = MultiTaskLassoRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in MultiTaskLassoRegressor(alpha=...).","category":"page"},{"location":"models/MultiTaskLassoRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"MultiTaskLassoRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/MultiTaskLassoRegressor_MLJScikitLearnInterface/","page":"MultiTaskLassoRegressor","title":"MultiTaskLassoRegressor","text":"alpha = 1.0\nfit_intercept = true\nmax_iter = 1000\ntol = 0.0001\ncopy_X = true\nrandom_state = nothing\nselection = cyclic","category":"page"},{"location":"models/BaggingClassifier_MLJScikitLearnInterface/#BaggingClassifier_MLJScikitLearnInterface","page":"BaggingClassifier","title":"BaggingClassifier","text":"","category":"section"},{"location":"models/BaggingClassifier_MLJScikitLearnInterface/","page":"BaggingClassifier","title":"BaggingClassifier","text":"BaggingClassifier","category":"page"},{"location":"models/BaggingClassifier_MLJScikitLearnInterface/","page":"BaggingClassifier","title":"BaggingClassifier","text":"A model type for constructing a bagging ensemble classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/BaggingClassifier_MLJScikitLearnInterface/","page":"BaggingClassifier","title":"BaggingClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/BaggingClassifier_MLJScikitLearnInterface/","page":"BaggingClassifier","title":"BaggingClassifier","text":"BaggingClassifier = @load BaggingClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/BaggingClassifier_MLJScikitLearnInterface/","page":"BaggingClassifier","title":"BaggingClassifier","text":"Do model = BaggingClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BaggingClassifier(estimator=...).","category":"page"},{"location":"models/BaggingClassifier_MLJScikitLearnInterface/","page":"BaggingClassifier","title":"BaggingClassifier","text":"A Bagging classifier is an ensemble meta-estimator that fits base classifiers each on random subsets of the original dataset and then aggregate their individual predictions (either by voting or by averaging) to form a final prediction. 
Such a meta-estimator can typically be used as a way to reduce the variance of a black-box estimator (e.g., a decision tree), by introducing randomization into its construction procedure and then making an ensemble out of it.","category":"page"},{"location":"models/GeneralImputer_BetaML/#GeneralImputer_BetaML","page":"GeneralImputer","title":"GeneralImputer","text":"","category":"section"},{"location":"models/GeneralImputer_BetaML/","page":"GeneralImputer","title":"GeneralImputer","text":"mutable struct GeneralImputer <: MLJModelInterface.Unsupervised","category":"page"},{"location":"models/GeneralImputer_BetaML/","page":"GeneralImputer","title":"GeneralImputer","text":"Impute missing values using arbitrary learning models, from the Beta Machine Learning Toolkit (BetaML).","category":"page"},{"location":"models/GeneralImputer_BetaML/","page":"GeneralImputer","title":"GeneralImputer","text":"Impute missing values using a vector (one per column) of arbitrary learning models (classifiers/regressors, not necessarily from BetaML) that implement the interface m = Model([options]), train!(m,X,Y) and predict(m,X).","category":"page"},{"location":"models/GeneralImputer_BetaML/#Hyperparameters:","page":"GeneralImputer","title":"Hyperparameters:","text":"","category":"section"},{"location":"models/GeneralImputer_BetaML/","page":"GeneralImputer","title":"GeneralImputer","text":"cols_to_impute::Union{String, Vector{Int64}}: Columns in the matrix for which to create an imputation model, i.e. to impute. It can be a vector of column IDs (positions), or the keywords \"auto\" (default) or \"all\". With \"auto\" the model automatically detects the columns with missing data and imputes only them. You may manually specify the columns, or use \"all\" if you want to create an imputation model for those columns during training even if all training data are non-missing, so that the trained model can then be applied to further data with possibly missing values.\nestimator::Any: An estimator model (regressor or classifier), possibly with its options (hyper-parameters), to be used to impute the various columns of the matrix. It can also be a cols_to_impute-length vector of different estimators to use a different estimator for each column (dimension) to impute, for example when some columns are categorical (and will hence require a classifier) and some others are numerical (hence requiring a regressor). [default: nothing, i.e. use BetaML random forests, handling classification and regression jobs automatically].\nmissing_supported::Union{Bool, Vector{Bool}}: Whether the estimator(s) used to predict the missing data themselves support missing data in the training features (X). If not, when the model for a certain dimension is fitted, dimensions with missing data in the same rows of those where imputation is needed are dropped and then only non-missing rows in the other remaining dimensions are considered. It can be a vector of boolean values to specify this property for each individual estimator or a single boolean value to apply to all the estimators [default: false]\nfit_function::Union{Function, Vector{Function}}: The function used by the estimator(s) to fit the model. It should take as first argument the model itself, as second argument a matrix representing the features, and as third argument a vector representing the labels. This parameter is mandatory for non-BetaML estimators and can be a single value or a vector (one per estimator) in case different estimator packages are used. 
[default: BetaML.fit!]\npredict_function::Union{Function, Vector{Function}}: The function used by the estimator(s) to predict the labels. It should take as first argument the model itself and as second argument a matrix representing the features. This parameter is mandatory for non-BetaML estimators and can be a single value or a vector (one per estimator) in case different estimator packages are used. [default: BetaML.predict]\nrecursive_passages::Int64: Defines the number of times to go through the various columns to impute their data. Useful when there are data to impute on multiple columns. The order of the first passage is given by the decreasing number of missing values per column; the other passages are random [default: 1].\nrng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]. Note that this influences only the specific GeneralImputer code; the individual estimators may have their own rng (or similar) parameter.","category":"page"},{"location":"models/GeneralImputer_BetaML/#Examples-:","page":"GeneralImputer","title":"Examples :","text":"","category":"section"},{"location":"models/GeneralImputer_BetaML/","page":"GeneralImputer","title":"GeneralImputer","text":"Using BetaML models:","category":"page"},{"location":"models/GeneralImputer_BetaML/","page":"GeneralImputer","title":"GeneralImputer","text":"julia> using MLJ;\njulia> import BetaML ## The library from which to get the individual estimators to be used for each column imputation\njulia> X = [\"a\" 8.2;\n \"a\" missing;\n \"a\" 7.8;\n \"b\" 21;\n \"b\" 18;\n \"c\" -0.9;\n missing 20;\n \"c\" -1.8;\n missing -2.3;\n \"c\" -2.4] |> table ;\njulia> modelType = @load GeneralImputer pkg = \"BetaML\" verbosity=0\nBetaML.Imputation.GeneralImputer\njulia> model = modelType(estimator=BetaML.DecisionTreeEstimator(),recursive_passages=2);\njulia> mach = machine(model, X);\njulia> fit!(mach);\n[ Info: Training machine(GeneralImputer(cols_to_impute = auto, …), …).\njulia> X_full = transform(mach) |> MLJ.matrix\n10×2 Matrix{Any}:\n \"a\" 8.2\n \"a\" 8.0\n \"a\" 7.8\n \"b\" 21\n \"b\" 18\n \"c\" -0.9\n \"b\" 20\n \"c\" -1.8\n \"c\" -2.3\n \"c\" -2.4","category":"page"},{"location":"models/GeneralImputer_BetaML/","page":"GeneralImputer","title":"GeneralImputer","text":"Using third party packages (in this example DecisionTree):","category":"page"},{"location":"models/GeneralImputer_BetaML/","page":"GeneralImputer","title":"GeneralImputer","text":"julia> using MLJ;\njulia> import DecisionTree ## An example of external estimators to be used for each column imputation\njulia> X = [\"a\" 8.2;\n \"a\" missing;\n \"a\" 7.8;\n \"b\" 21;\n \"b\" 18;\n \"c\" -0.9;\n missing 20;\n \"c\" -1.8;\n missing -2.3;\n \"c\" -2.4] |> table ;\njulia> modelType = @load GeneralImputer pkg = \"BetaML\" verbosity=0\nBetaML.Imputation.GeneralImputer\njulia> model = modelType(estimator=[DecisionTree.DecisionTreeClassifier(),DecisionTree.DecisionTreeRegressor()], fit_function=DecisionTree.fit!,predict_function=DecisionTree.predict,recursive_passages=2);\njulia> mach = machine(model, X);\njulia> fit!(mach);\n[ Info: Training machine(GeneralImputer(cols_to_impute = auto, …), …).\njulia> X_full = transform(mach) |> MLJ.matrix\n10×2 Matrix{Any}:\n \"a\" 8.2\n \"a\" 7.51111\n \"a\" 7.8\n \"b\" 21\n \"b\" 18\n \"c\" -0.9\n \"b\" 20\n \"c\" -1.8\n \"c\" -2.3\n \"c\" -2.4","category":"page"},{"location":"third_party_packages/#Third-Party-Packages","page":"Third Party Packages","title":"Third Party 
Packages","text":"","category":"section"},{"location":"third_party_packages/","page":"Third Party Packages","title":"Third Party Packages","text":"A list of third-party packages with integration with MLJ.","category":"page"},{"location":"third_party_packages/","page":"Third Party Packages","title":"Third Party Packages","text":"Last updated December 2020.","category":"page"},{"location":"third_party_packages/","page":"Third Party Packages","title":"Third Party Packages","text":"Pull requests to update this list are very welcome. Otherwise, you may post an issue requesting this here.","category":"page"},{"location":"third_party_packages/#Packages-providing-models-in-the-MLJ-model-registry","page":"Third Party Packages","title":"Packages providing models in the MLJ model registry","text":"","category":"section"},{"location":"third_party_packages/","page":"Third Party Packages","title":"Third Party Packages","text":"See List of Supported Models","category":"page"},{"location":"third_party_packages/#Providing-unregistered-models:","page":"Third Party Packages","title":"Providing unregistered models:","text":"","category":"section"},{"location":"third_party_packages/","page":"Third Party Packages","title":"Third Party Packages","text":"SossMLJ.jl\nTimeSeriesClassification","category":"page"},{"location":"third_party_packages/#Packages-providing-other-kinds-of-functionality:","page":"Third Party Packages","title":"Packages providing other kinds of functionality:","text":"","category":"section"},{"location":"third_party_packages/","page":"Third Party Packages","title":"Third Party Packages","text":"MLJParticleSwarmOptimization.jl (hyper-parameter optimization strategy)\nTreeParzen.jl (hyper-parameter optimization strategy)\nShapley.jl (feature ranking / interpretation)\nShapML.jl (feature ranking / interpretation)\nFairness.jl (FAIRness metrics)\nOutlierDetection.jl (provides the ProbabilisticDetector wrapper and other outlier detection meta-functionality)\nConformalPrediction.jl (predictive uncertainty quantification through conformal prediction)","category":"page"},{"location":"learning_networks/#Learning-Networks","page":"Learning Networks","title":"Learning Networks","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Below is a practical guide to the MLJ implementation of learning networks, which have been described more abstractly in the article:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Anthony D. Blaom and Sebastian J. Voller (2020): Flexible model composition in machine learning and its implementation in MLJ. Preprint, arXiv:2012.15505.","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Learning networks, an advanced but powerful MLJ feature, are \"blueprints\" for combining models in flexible ways, beyond ordinary linear pipelines and simple model ensembles. 
They are simple transformations of your existing workflows which can be \"exported\" to define new, re-usable composite model types (models which typically have other models as hyperparameters).","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Pipeline models (see Pipeline), and model stacks (see Stack) are both implemented internally as exported learning networks.","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"note: Note\nWhile learning networks can be used for complex machine learning workflows, their main purpose is for defining new stand-alone model types, which behave just like any other model type: Instances can be evaluated, tuned, inserted into pipelines, etc. In serious applications, users are encouraged to export their learning networks, as explained under Exporting a learning network as a new model type below, after testing the network, using a small training dataset.","category":"page"},{"location":"learning_networks/#Learning-networks-by-example","page":"Learning Networks","title":"Learning networks by example","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Learning networks are best explained by way of example.","category":"page"},{"location":"learning_networks/#Lazy-computation","page":"Learning Networks","title":"Lazy computation","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"The core idea of a learning network is delayed or lazy computation. Instead of","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"using MLJ\nMLJ.color_off()","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"X = 4\nY = 3\nZ = 2*X\nW = Y + Z\nW","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"we can do","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"using MLJ\n\nX = source(4)\nY = source(3)\nZ = 2*X\nW = Y + Z\nW()","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"In the first computation X, Y, Z and W are all bound to ordinary data. In the second, they are bound to objects called nodes. The special nodes X and Y constitute \"entry points\" for data, and are called source nodes. 
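(Each node is callable: calling it with no arguments triggers evaluation of everything upstream, so with the source values above W() returns 3 + 2*4 = 11, and the intermediate node Z() returns 8.) 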
As the terminology suggests, we can imagine these objects as part of a \"network\" (a directed acyclic graph) which can aid conceptualization (but is less useful in more complicated examples):","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"(Image: )","category":"page"},{"location":"learning_networks/#The-origin-of-a-node","page":"Learning Networks","title":"The origin of a node","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"The source nodes on which a given node depends are called the origins of the node:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"os = origins(W)","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"X in os","category":"page"},{"location":"learning_networks/#Re-using-a-network","page":"Learning Networks","title":"Re-using a network","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"The advantage of lazy evaluation is that we can change data at a source node to repeat the calculation with new data. One way to do this (discouraged in practice) is to use rebind!:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Z()","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"rebind!(X, 6) # demonstration only!\nZ()","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"However, if a node has a unique origin, then one instead calls the node on the new data one would like to rebind to that origin:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"origins(Z)","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Z(6)","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Z(4)","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"This has the advantage that you don't need to locate the origin and rebind data directly, and the unique-origin restriction turns out to be sufficient for the applications to learning we have in mind.","category":"page"},{"location":"learning_networks/#node_overloading","page":"Learning Networks","title":"Overloading functions for use on nodes","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Several built-in function like * and + above are overloaded in MLJBase to work on nodes, as illustrated above. Others that work out-of-the-box include: MLJBase.matrix, MLJBase.table, vcat, hcat, mean, median, mode, first, last, as well as broadcasted versions of log, exp, mean, mode and median. A function like sqrt is not overloaded, so that Q = sqrt(Z) will throw an error. 
Instead, we do","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Q = node(sqrt, Z)\nZ()","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Q()","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"You can learn more about the node function under More on defining new nodes","category":"page"},{"location":"learning_networks/#A-network-that-learns","page":"Learning Networks","title":"A network that learns","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"To incorporate learning in a network of nodes MLJ:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Allows binding of machines to nodes instead of data\nGenerates \"operation\" nodes when calling an operation like predict or transform on a machine and node input data. Such nodes point to both a machine (storing learned parameters) and the node from which to fetch data for applying the operation (which, unlike the nodes seen so far, depend on learned parameters to generate output).","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"For an example of a learning network that actually learns, we first synthesize some training data X, y, and production data Xnew:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"using MLJ\nX, y = make_blobs(cluster_std=10.0, rng=123) # `X` is a table, `y` a vector\nXnew, _ = make_blobs(3) # `Xnew` is a table with the same number of columns\nnothing # hide","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"We choose a model do some dimension reduction, and another to perform classification:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"pca = (@load PCA pkg=MultivariateStats verbosity=0)()\ntree = (@load DecisionTreeClassifier pkg=DecisionTree verbosity=0)()\nnothing # hide","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"To make our learning lazy, we wrap the training data as source nodes:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Xs = source(X)\nys = source(y)\nnothing # hide","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"And, finally, proceed as we would in an ordinary MLJ workflow, with the exception that there is no need to fit! our machines, as training will be carried out lazily later:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"mach1 = machine(pca, Xs)\nx = transform(mach1, Xs) # defines a new node because `Xs` is a node\n\nmach2 = machine(tree, x, ys)\nyhat = predict(mach2, x) # defines a new node because `x` is a node","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Note that mach1 and mach2 are not themselves nodes. They point to the nodes they need to call to get training data and they are in turn pointed to by other nodes. 
In fact, an interesting implementation detail is that an \"ordinary\" machine is not actually bound directly to data, but bound to data wrapped in source nodes.","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"machine(pca, Xnew).args[1] # `Xnew` is ordinary data","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Before calling a node, we need to fit! the node, to trigger training of all the machines on which it depends:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"fit!(yhat) # can include same keyword options for `fit!(::Machine, ...)`\nyhat()[1:2] # or `yhat(rows=2)`","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"This last represents the prediction on the training data, because that's what resides at our source nodes. However, yhat has the unique origin X (because \"training edges\" in the complete associated directed graph are excluded for this purpose). We can therefore call yhat on our production data to get the corresponding predictions:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"yhat(Xnew)","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Training is smart, in the sense that mutating a hyper-parameter of some component model does not force retraining of upstream machines:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"tree.max_depth = 1\nfit!(yhat)\nyhat(Xnew)","category":"page"},{"location":"learning_networks/#Multithreaded-training","page":"Learning Networks","title":"Multithreaded training","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"A more complicated learning network may contain machines that can be trained in parallel. In that case, a call like the following may speed up training:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"tree.max_depth = 2\nfit!(yhat, acceleration=CPUThreads())\nnothing # hide","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Currently, only CPU1() (default) and CPUThreads() are supported here.","category":"page"},{"location":"learning_networks/#Exporting-a-learning-network-as-a-new-model-type","page":"Learning Networks","title":"Exporting a learning network as a new model type","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Once a learning network has been tested, typically on some small dummy data set, it is ready to be exported as a new, stand-alone, re-usable model type (unattached to any data). 
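In every case the pattern has the same two steps: define a new model struct whose supertype is one of the NetworkComposite types, and wrap the (symbol-substituted) network in an MLJBase.prefit method that returns a \"learning network interface\". 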
We demonstrate the process by way of examples of increasing complexity:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Example A - Mini-pipeline\nMore on replacing models with symbols\nExample B - Multiple operations: transform and inverse transform\nExample C - Blending predictions and exposing internal network state in reports\nExample D - Multiple nodes pointing to the same machine\nExample E - Coupling component model hyper-parameters\nMore on defining new nodes\nExample F - Wrapping a model in a data-dependent tuning strategy","category":"page"},{"location":"learning_networks/#Example-A-Mini-pipeline","page":"Learning Networks","title":"Example A - Mini-pipeline","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"First we export the simple learning network defined above. (This is for illustration purposes; in practice, using the pipeline syntax model1 |> model2 is more convenient.)","category":"page"},{"location":"learning_networks/#Step-1-Define-a-new-model-struct","page":"Learning Networks","title":"Step 1 - Define a new model struct","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"We need a type with two fields, one for the preprocessor (pca in the network above) and one for the classifier (tree in the network above).","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"The DecisionTreeClassifier type of tree has supertype Probabilistic, because it makes probabilistic predictions, and we assume any other classifier we want to swap out will be the same.","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"supertype(typeof(tree))","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"In particular, our composite model will also need Probabilistic as supertype. In fact, we must give it the intermediate supertype ProbabilisticNetworkComposite <: Probabilistic, so that we additionally flag it as an exported learning network model type:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"mutable struct CompositeA <: ProbabilisticNetworkComposite\n preprocessor\n classifier\nend","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"The common alternatives are DeterministicNetworkComposite and UnsupervisedNetworkComposite. 
But all options can be viewed as follows:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"using MLJBase\nNetworkComposite","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"We next make our learning network model-generic by substituting each model instance with the corresponding symbol representing a property (field) of the new model struct:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"mach1 = machine(:preprocessor, Xs) # <---- `pca` swapped out for `:preprocessor`\nx = transform(mach1, Xs)\nmach2 = machine(:classifier, x, ys) # <---- `tree` swapped out for `:classifier`\nyhat = predict(mach2, x)","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Incidentally, this network can be used as before except we must provide an instance of CompositeA in our fit! calls, to indicate what actual models the symbols are being substituted with:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"composite_a = CompositeA(pca, ConstantClassifier())\nfit!(yhat, composite=composite_a)\nyhat(Xnew)","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"In this case :preprocessor is being substituted by pca, and :classifier by ConstantClassifier() for training.","category":"page"},{"location":"learning_networks/#Step-2-Wrap-the-learning-network-in-prefit","page":"Learning Networks","title":"Step 2 - Wrap the learning network in prefit","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Literally copy and paste the learning network above into the definition of a method called prefit, as shown below (if you have implemented your own MLJ model, you will notice this has the same signature as MLJModelInterface.fit):","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"import MLJBase\nfunction MLJBase.prefit(composite::CompositeA, verbosity, X, y)\n\n # the learning network from above:\n Xs = source(X)\n ys = source(y)\n mach1 = machine(:preprocessor, Xs)\n x = transform(mach1, Xs)\n mach2 = machine(:classifier, x, ys)\n yhat = predict(mach2, x)\n\n verbosity > 0 && @info \"I'm a noisy fellow!\"\n\n # return \"learning network interface\":\n return (; predict=yhat)\nend","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"That's it.","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Generally, prefit always returns a learning network interface; see MLJBase.prefit for what this means in general. 
In this example, the interface dictates that calling predict(mach, Xnew) on a machine mach bound to some instance of CompositeA should internally call yhat(Xnew).","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Here's our new composite model type CompositeA in action, combining standardization with KNN classification:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"using MLJ\nX, y = @load_iris\n\nknn = (@load KNNClassifier pkg=NearestNeighborModels verbosity=0)()\ncomposite_a = CompositeA(Standardizer(), knn)","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"mach = machine(composite_a, X, y) |> fit!\npredict(mach, X)[1:2]","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"report(mach).preprocessor","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"fitted_params(mach).classifier","category":"page"},{"location":"learning_networks/#More-on-replacing-models-with-symbols","page":"Learning Networks","title":"More on replacing models with symbols","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Only the first argument model in some expression machine(model, ...) can be replaced with a symbol. These replacements function as hooks for exposing reports and fitted parameters of component models in the report and fitted parameters of the composite model, but these replacements are not absolutely necessary. For example, instead of the line mach1 = machine(:preprocessor, Xs) in the prefit definition, we can do mach1 = machine(composite.preprocessor, Xs). However, report and fitted_params will not include items for the :preprocessor component model in that case.","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"If a component model is not explicitly bound to data in a machine (for example, because it is first wrapped in TunedModel) then there are still ways to expose the associated fitted parameters or report items. See Example F below.","category":"page"},{"location":"learning_networks/#Example-B-Multiple-operations:-transform-and-inverse-transform","page":"Learning Networks","title":"Example B - Multiple operations: transform and inverse transform","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Here's a second mini-pipeline example composing two transformers which both implement inverse transform. 
We show how to implement an inverse_transform for the composite model too.","category":"page"},{"location":"learning_networks/#Step-1-Define-a-new-model-struct-2","page":"Learning Networks","title":"Step 1 - Define a new model struct","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"using MLJ\nimport MLJBase\n\nmutable struct CompositeB <: DeterministicNetworkComposite\n transformer1\n transformer2\nend","category":"page"},{"location":"learning_networks/#Step-2-Wrap-the-learning-network-in-prefit-2","page":"Learning Networks","title":"Step 2 - Wrap the learning network in prefit","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"function MLJBase.prefit(composite::CompositeB, verbosity, X)\n Xs = source(X)\n\n mach1 = machine(:transformer1, Xs)\n X1 = transform(mach1, Xs)\n mach2 = machine(:transformer2, X1)\n X2 = transform(mach2, X1)\n\n W1 = inverse_transform(mach2, Xs)\n W2 = inverse_transform(mach1, W1)\n\n # the learning network interface:\n return (; transform=X2, inverse_transform=W2)\nend","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Here's a demonstration:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"X = rand(100)\n\ncomposite_b = CompositeB(UnivariateBoxCoxTransformer(), Standardizer())\nmach = machine(composite_b, X) |> fit!\nW = transform(mach, X)\n@assert inverse_transform(mach, W) ≈ X","category":"page"},{"location":"learning_networks/#Example-C-Blending-predictions-and-exposing-internal-network-state-in-reports","page":"Learning Networks","title":"Example C - Blending predictions and exposing internal network state in reports","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"The code below defines a new composite model type CompositeC that predicts by taking the weighted average of two regressors, and additionally exposes, in the model's report, a measure of disagreement between the two models at time of training. 
In addition to the two regressors, the new model has two other fields:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"mix, controlling the weighting\nacceleration, for the mode of acceleration for training the model (e.g., CPUThreads()).","category":"page"},{"location":"learning_networks/#Step-1-Define-a-new-model-struct-3","page":"Learning Networks","title":"Step 1 - Define a new model struct","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"using MLJ\nimport MLJBase\n\nmutable struct CompositeC <: DeterministicNetworkComposite\n regressor1\n regressor2\n mix::Float64\n acceleration\nend","category":"page"},{"location":"learning_networks/#Step-2-Wrap-the-learning-network-in-prefit-3","page":"Learning Networks","title":"Step 2 - Wrap the learning network in prefit","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"function MLJBase.prefit(composite::CompositeC, verbosity, X, y)\n\n Xs = source(X)\n ys = source(y)\n\n mach1 = machine(:regressor1, Xs, ys)\n mach2 = machine(:regressor2, Xs, ys)\n\n yhat1 = predict(mach1, Xs)\n yhat2 = predict(mach2, Xs)\n\n # node to return disagreement between the regressor predictions:\n disagreement = node((y1, y2) -> l2(y1, y2) |> mean, yhat1, yhat2)\n\n # get the weighted average the predictions of the regressors:\n λ = composite.mix\n yhat = (1 - λ)*yhat1 + λ*yhat2\n\n # the learning network interface:\n return (\n predict = yhat,\n report= (; training_disagreement=disagreement),\n acceleration = composite.acceleration,\n )\n\nend","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Here's a demonstration:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"X, y = make_regression() # a table and a vector\n\nknn = (@load KNNRegressor pkg=NearestNeighborModels verbosity=0)()\ntree = (@load DecisionTreeRegressor pkg=DecisionTree verbosity=0)()\ncomposite_c = CompositeC(knn, tree, 0.2, CPUThreads())\nmach = machine(composite_c, X, y) |> fit!\nXnew, _ = make_regression(3)\npredict(mach, Xnew)","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"report(mach)","category":"page"},{"location":"learning_networks/#Example-D-Multiple-nodes-pointing-to-the-same-machine","page":"Learning Networks","title":"Example D - Multiple nodes pointing to the same machine","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"When incorporating learned target transformations (such as a standardization) in supervised learning, it is desirable to apply the inverse transformation to predictions, to return them to the original scale. This means re-using learned parameters from an earlier part of your workflow. 
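","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"In a learning network, such re-use just means that two different operation nodes dispatch on the same machine. Here is a minimal free-standing sketch of the idea (using made-up toy data, and independent of the composite model defined below), in which a transform node and an inverse_transform node share a single machine:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"using MLJ\n\nXs = source(rand(100))                    # toy data, for illustration only\nmach = machine(UnivariateBoxCoxTransformer(), Xs)\nZ = transform(mach, Xs)                   # first node pointing to mach\nXback = inverse_transform(mach, Z)        # second node pointing to the same mach\nfit!(Xback)                               # trains mach once\n@assert Xback() ≈ Xs()                    # round trip recovers the original data","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"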
This poses no problem here, as the next example demonstrates.","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"The model type CompositeD defined below applies a preprocessing transformation to input data X (e.g., standardization), learns a transformation for the target y (e.g., an optimal Box-Cox transformation), predicts new target values using a regressor (e.g., Ridge regression), and then inverse-transforms those predictions to restore them to the original scale. (This represents a model we could alternatively build using the TransformedTargetModel wrapper and a Pipeline.)","category":"page"},{"location":"learning_networks/#Step-1-Define-a-new-model-struct-4","page":"Learning Networks","title":"Step 1 - Define a new model struct","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"using MLJ\nimport MLJBase\n\nmutable struct CompositeD <: DeterministicNetworkComposite\n preprocessor\n target_transformer\n regressor\n acceleration\nend","category":"page"},{"location":"learning_networks/#Step-2-Wrap-the-learning-network-in-prefit-4","page":"Learning Networks","title":"Step 2 - Wrap the learning network in prefit","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Notice that both of the nodes z and yhat in the wrapped learning network point to the same machine (learned parameters) mach2.","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"function MLJBase.prefit(composite::CompositeD, verbosity, X, y)\n\n Xs = source(X)\n ys = source(y)\n\n mach1 = machine(:preprocessor, Xs)\n W = transform(mach1, Xs)\n\n mach2 = machine(:target_transformer, ys)\n z = transform(mach2, ys)\n\n mach3 =machine(:regressor, W, z)\n zhat = predict(mach3, W)\n\n yhat = inverse_transform(mach2, zhat)\n\n # the learning network interface:\n return (\n predict = yhat,\n acceleration = composite.acceleration,\n )\n\nend","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"The flow of information in the wrapped learning network is visualized below.","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"(Image: )","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Here's an application of our new composite to the Boston dataset:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"X, y = @load_boston\n\nstand = Standardizer()\nbox = UnivariateBoxCoxTransformer()\nridge = (@load RidgeRegressor pkg=MultivariateStats verbosity=0)(lambda=92)\ncomposite_d = CompositeD(stand, box, ridge, CPU1())\nevaluate(composite_d, X, y, resampling=CV(nfolds=5), measure=l2, verbosity=0)","category":"page"},{"location":"learning_networks/#Example-E-Coupling-component-model-hyper-parameters","page":"Learning Networks","title":"Example E - Coupling component model hyper-parameters","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"The composite model in this example combines a clustering model used to reduce the dimension of the feature space (KMeans or KMedoids from Clustering.jl) with ridge regression, but has the following \"coupling\" 
of the hyperparameters: The amount of ridge regularization depends on the number of specified clusters k, with less regularization for a greater number of clusters. It includes a user-specified coupling coefficient c, and exposes the solver hyper-parameter of the ridge regressor. (Neither the clusterer nor ridge regressor are themselves hyperparameters of the composite.)","category":"page"},{"location":"learning_networks/#Step-1-Define-a-new-model-struct-5","page":"Learning Networks","title":"Step 1 - Define a new model struct","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"using MLJ\nimport MLJBase\n\nmutable struct CompositeE <: DeterministicNetworkComposite\n clusterer # `:kmeans` or `:kmedoids`\n k::Int # number of clusters\n solver # a ridge regression parameter we want to expose\n c::Float64 # a \"coupling\" coefficient\nend","category":"page"},{"location":"learning_networks/#Step-2-Wrap-the-learning-network-in-prefit-5","page":"Learning Networks","title":"Step 2 - Wrap the learning network in prefit","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"RidgeRegressor = @load RidgeRegressor pkg=MLJLinearModels verbosity=0\nKMeans = @load KMeans pkg=Clustering verbosity=0\nKMedoids = @load KMedoids pkg=Clustering verbosity=0\n\nfunction MLJBase.prefit(composite::CompositeE, verbosity, X, y)\n\n Xs = source(X)\n ys = source(y)\n\n k = composite.k\n solver = composite.solver\n c = composite.c\n\n clusterer = composite.clusterer == :kmeans ? KMeans(; k) : KMedoids(; k)\n mach1 = machine(clusterer, Xs)\n Xsmall = transform(mach1, Xs)\n\n # the coupling - ridge regularization depends on the number of\n # clusters `k` and the coupling coefficient `c`:\n lambda = exp(-c/k)\n\n ridge = RidgeRegressor(; lambda, solver)\n mach2 = machine(ridge, Xsmall, ys)\n yhat = predict(mach2, Xsmall)\n\n return (predict=yhat,)\nend","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Here's an application to the Boston dataset in which we optimize the coupling coefficient (see Tuning Models for more on hyper-parameter optimization):","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"X, y = @load_boston # a table and a vector\n\ncomposite_e = CompositeE(:kmeans, 3, nothing, 0.5)\nr = range(composite_e, :c, lower = -2, upper=2, scale=x->10^x)\ntuned_composite_e = TunedModel(\n composite_e,\n range=r,\n tuning=RandomSearch(rng=123),\n measure=l2,\n resampling=CV(nfolds=6),\n n=100,\n)\nmach = machine(tuned_composite_e, X, y) |> fit!\nreport(mach).best_model","category":"page"},{"location":"learning_networks/#More-on-defining-new-nodes","page":"Learning Networks","title":"More on defining new nodes","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Overloading ordinary functions for nodes has already been discussed above. 
Here's another example:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"divide(x, y) = x/y\n\nX = source(2)\nY = source(3)\n\nZ = node(divide, X, Y)\nnothing # hide","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"This means Z() returns divide(X(), Y()), which is divide(2, 3) in this case:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Z()","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"We cannot call Z with arguments (e.g., Z(2)) because it does not have a unique origin.","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"In all the node examples so far, the first argument of node is a function, and all other arguments are nodes - one node for each argument of the function. A node constructed in this way is called a static node. A dynamic node, which directly depends on the outcome of a training event, is constructed by giving a machine as the second argument, to be passed as the first argument of the function in a node call. For example, we can do","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Xs = source(rand(4))\nmach = machine(Standardizer(), Xs)\nN = node(transform, mach, Xs) |> fit!\nnothing # hide","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Then N has the following calling properties:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"N() returns transform(mach, Xs())\nN(Xnew) returns transform(mach, Xs(Xnew)); here Xs(Xnew) is just Xnew because Xs is just a source node.","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"N()","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"N(rand(2))","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"In fact, this is precisely how the transform method is internally overloaded to work, when called with a node argument (to return a node instead of data). That is, internally there exists code that amounts to the definition","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"transform(mach, X::AbstractNode) = node(transform, mach, X)","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Here AbstractNode is the common super-type of Node and Source.","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"It is sometimes useful to create dynamic nodes with no node arguments, as in","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Xs = source(rand(10))\nmach = machine(Standardizer(), Xs)\nN = node(fitted_params, mach) |> fit!\nN()","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Static nodes can also have zero node arguments. 
These may be viewed as \"constant\" nodes:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"N = Node(()-> 42)\nN()","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Example F below demonstrates the use of static and dynamic nodes. For more details, see the node docstring.","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"There is also an experimental macro @node. If Z is an AbstractNode (Z = source(16), say) then instead of","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Q = node(sqrt, Z)","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"one can do","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Q = @node sqrt(Z)","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"(so that Q() == 4). Here's a more complicated application of @node to row-shuffle a table:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"using MLJ, Random\nX = (x1 = [1, 2, 3, 4, 5],\n x2 = [:one, :two, :three, :four, :five])\nrows(X) = 1:nrows(X)\n\nXs = source(X)\nrs = @node rows(Xs)\nW = @node selectrows(Xs, @node shuffle(rs))\n\nW()","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Important. An argument not in global scope is assumed by @node to be a node or source.","category":"page"},{"location":"learning_networks/#Example-F-Wrapping-a-model-in-a-data-dependent-tuning-strategy","page":"Learning Networks","title":"Example F - Wrapping a model in a data-dependent tuning strategy","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"When the regularization parameter of a Lasso model is optimized, one commonly searches over a parameter range depending on properties of the training data. Indeed, Lasso (and, more generally, elastic net) implementations commonly provide a method to carry out this data-dependent optimization automatically, using cross-validation. The following example shows how to transform the LassoRegressor model type from MLJLinearModels.jl into a self-tuning model type LassoCVRegressor using the commonly implemented data-dependent tuning strategy. 
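","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"The data-dependent ingredient is the upper bound of the lambda search range. Following a convention adopted by many lasso and elastic net implementations (and used in the prefit definition below), this bound is computed from the training data as the largest absolute inner product between a feature column and the target:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"# computed inside prefit from the training data (X, y):\nλ_max = maximum(abs.(MLJ.matrix(X)'y))\n# lambda is then searched over [epsilon*λ_max, λ_max], on a log scale","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"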
A new dimensionless hyperparameter epsilon controls the lower bound on the parameter range.","category":"page"},{"location":"learning_networks/#Step-1-Define-a-new-model-struct-6","page":"Learning Networks","title":"Step 1 - Define a new model struct","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"using MLJ\nimport MLJBase\n\nmutable struct LassoCVRegressor <: DeterministicNetworkComposite\n lasso # the atomic lasso model (`lasso.lambda` is ignored)\n epsilon::Float64 # controls lower bound of `lasso.lambda` in tuning\n resampling # resampling strategy for optimization of `lambda`\nend\n\n# keyword constructor for convenience:\nLassoRegressor = @load LassoRegressor pkg=MLJLinearModels verbosity=0\nLassoCVRegressor(;\n lasso=LassoRegressor(),\n epsilon=0.001,\n resampling=CV(nfolds=6),\n) = LassoCVRegressor(\n lasso,\n epsilon,\n resampling,\n)\nnothing # hide","category":"page"},{"location":"learning_networks/#Step-2-Wrap-the-learning-network-in-prefit-6","page":"Learning Networks","title":"Step 2 - Wrap the learning network in prefit","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"In this case, there is no model -> :symbol replacement that makes sense here, because the model is getting wrapped by TunedModel before being bound to nodes in a machine. However, we can expose the the learned lasso coefs and intercept using fitted parameter nodes; and expose the optimal lambda, and range searched, using report nodes (as previously demonstrated in Example C).","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"function MLJBase.prefit(composite::LassoCVRegressor, verbosity, X, y)\n\n λ_max = maximum(abs.(MLJ.matrix(X)'y))\n\n Xs = source(X)\n ys = source(y)\n\n r = range(\n composite.lasso,\n :lambda,\n lower=composite.epsilon*λ_max,\n upper=λ_max,\n scale=:log10,\n )\n\n lambda_range = node(()->r) # a \"constant\" report node\n\n tuned_lasso = TunedModel(\n composite.lasso,\n tuning=Grid(shuffle=false),\n range = r,\n measure = l2,\n resampling=composite.resampling,\n )\n mach = machine(tuned_lasso, Xs, ys)\n\n R = node(report, mach) # `R()` returns `report(mach)`\n lambda = node(r -> r.best_model.lambda, R) # a report node\n\n F = node(fitted_params, mach) # `F()` returns `fitted_params(mach)`\n coefs = node(f->f.best_fitted_params.coefs, F) # a fitted params node\n intercept = node(f->f.best_fitted_params.intercept, F) # a fitted params node\n\n yhat = predict(mach, Xs)\n\n return (\n predict=yhat,\n fitted_params=(; coefs, intercept),\n report=(; lambda, lambda_range),\n )\n\nend","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Here's a demonstration:","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"X, _ = make_regression(1000, 3, rng=123)\ny = X.x2 - X.x2 + 0.005*X.x3 + 0.05*rand(1000)\nlasso_cv = LassoCVRegressor(epsilon=1e-5)\nmach = machine(lasso_cv, X, y) |> fit!\nreport(mach)","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"fitted_params(mach)","category":"page"},{"location":"learning_networks/#The-learning-network-API","page":"Learning Networks","title":"The learning network API","text":"","category":"section"},{"location":"learning_networks/","page":"Learning 
Networks","title":"Learning Networks","text":"Two new julia types are part of learning networks: Source and Node, which share a common abstract supertype AbstractNode.","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Formally, a learning network defines two labeled directed acyclic graphs (DAG's) whose nodes are Node or Source objects, and whose labels are Machine objects. We obtain the first DAG from directed edges of the form N1 - N2 whenever N1 is an argument of N2 (see below). Only this DAG is relevant when calling a node, as discussed in the examples above and below. To form the second DAG (relevant when calling or calling fit! on a node) one adds edges for which N1 is training argument of the machine which labels N1. We call the second, larger DAG, the completed learning network (but note only edges of the smaller network are explicitly drawn in diagrams, for simplicity).","category":"page"},{"location":"learning_networks/#Source-nodes","page":"Learning Networks","title":"Source nodes","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Only source nodes can reference concrete data. A Source object has a single field, data.","category":"page"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"MLJBase.Source\nsource(X)\nrebind!\nsources\norigins","category":"page"},{"location":"learning_networks/#MLJBase.Source","page":"Learning Networks","title":"MLJBase.Source","text":"Source\n\nType for a learning network source node. Constructed using source, as in source() or source(rand(2,3)).\n\nSee also source, Node.\n\n\n\n\n\n","category":"type"},{"location":"learning_networks/#MLJBase.source-Tuple{Any}","page":"Learning Networks","title":"MLJBase.source","text":"Xs = source(X=nothing)\n\nDefine, a learning network Source object, wrapping some input data X, which can be nothing for purposes of exporting the network as stand-alone model. For training and testing the unexported network, appropriate vectors, tables, or other data containers are expected.\n\nThe calling behaviour of a Source object is this:\n\nXs() = X\nXs(rows=r) = selectrows(X, r) # eg, X[r,:] for a DataFrame\nXs(Xnew) = Xnew\n\nSee also: MLJBase.prefit, sources, origins, node.\n\n\n\n\n\n","category":"method"},{"location":"learning_networks/#MLJBase.rebind!","page":"Learning Networks","title":"MLJBase.rebind!","text":"rebind!(s, X)\n\nAttach new data X to an existing source node s. Not a public method.\n\n\n\n\n\n","category":"function"},{"location":"learning_networks/#MLJBase.sources","page":"Learning Networks","title":"MLJBase.sources","text":"sources(N::AbstractNode)\n\nA vector of all sources referenced by calls N() and fit!(N). These are the sources of the ancestor graph of N when including training edges.\n\nNot to be confused with origins(N), in which training edges are excluded.\n\nSee also: origins, source.\n\n\n\n\n\n","category":"function"},{"location":"learning_networks/#MLJBase.origins","page":"Learning Networks","title":"MLJBase.origins","text":"origins(N)\n\nReturn a list of all origins of a node N accessed by a call N(). These are the source nodes of ancestor graph of N if edges corresponding to training arguments are excluded. 
A Node object cannot be called on new data unless it has a unique origin.\n\nNot to be confused with sources(N) which refers to the same graph but without the training edge deletions.\n\nSee also: node, source.\n\n\n\n\n\n","category":"function"},{"location":"learning_networks/#Nodes","page":"Learning Networks","title":"Nodes","text":"","category":"section"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"Node\nnode","category":"page"},{"location":"learning_networks/#MLJBase.Node","page":"Learning Networks","title":"MLJBase.Node","text":"Node{T<:Union{Machine,Nothing}}\n\nType for nodes in a learning network that are not Source nodes.\n\nThe key components of a Node are:\n\nAn operation, which will either be static (a fixed function) or dynamic (such as predict or transform).\nA Machine object, on which to dispatch the operation (nothing if the operation is static). The training arguments of the machine are generally other nodes, including Source nodes.\nUpstream connections to other nodes, called its arguments, possibly including Source nodes, one for each data argument of the operation (typically there's just one).\n\nWhen a node N is called, as in N(), it applies the operation on the machine (if there is one) together with the outcome of calls to its node arguments, to compute the return value. For details on a node's calling behavior, see node.\n\nSee also node, Source, origins, sources, fit!.\n\n\n\n\n\n","category":"type"},{"location":"learning_networks/#MLJBase.node","page":"Learning Networks","title":"MLJBase.node","text":"J = node(f, mach::Machine, args...)\n\nDefines a dynamic Node object J wrapping a dynamic operation f (predict, predict_mean, transform, etc), a nodal machine mach and arguments args. Its calling behaviour, which depends on the outcome of training mach (and, implicitly, on training outcomes affecting its arguments) is this:\n\nJ() = f(mach, args[1](), args[2](), ..., args[n]())\nJ(rows=r) = f(mach, args[1](rows=r), args[2](rows=r), ..., args[n](rows=r))\nJ(X) = f(mach, args[1](X), args[2](X), ..., args[n](X))\n\nGenerally n=1 or n=2 in this latter case.\n\npredict(mach, X::AbstractNode, y::AbstractNode)\npredict_mean(mach, X::AbstractNode, y::AbstractNode)\npredict_median(mach, X::AbstractNode, y::AbstractNode)\npredict_mode(mach, X::AbstractNode, y::AbstractNode)\ntransform(mach, X::AbstractNode)\ninverse_transform(mach, X::AbstractNode)\n\nShortcuts for J = node(predict, mach, X, y), etc.\n\nCalling a node is a recursive operation which terminates in the call to a source node (or nodes). Calling nodes on new data X fails unless the number of such nodes is one.\n\nSee also: Node, @node, source, origins.\n\n\n\n\n\n","category":"function"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"@node","category":"page"},{"location":"learning_networks/#MLJBase.@node","page":"Learning Networks","title":"MLJBase.@node","text":"@node f(...)\n\nConstruct a new node that applies the function f to some combination of nodes, sources and other arguments.\n\nImportant. 
An argument not in global scope is assumed to be a node or source.\n\nExamples\n\njulia> X = source(π)\njulia> W = @node sin(X)\njulia> W()\n0\n\njulia> X = source(1:10)\njulia> Y = @node selectrows(X, 3:4)\njulia> Y()\n3:4\n\njulia> Y([\"one\", \"two\", \"three\", \"four\"])\n2-element Array{String,1}:\n \"three\"\n \"four\"\n\njulia> X1 = source(4)\njulia> X2 = source(5)\njulia> add(a, b, c) = a + b + c\njulia> N = @node add(X1, 1, X2)\njulia> N()\n10\n\n\nSee also node\n\n\n\n\n\n","category":"macro"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"MLJBase.prefit","category":"page"},{"location":"learning_networks/#MLJBase.prefit","page":"Learning Networks","title":"MLJBase.prefit","text":"MLJBase.prefit(model, verbosity, data...)\n\nReturns a learning network interface (see below) for a learning network with source nodes that wrap data.\n\nA user overloads MLJBase.prefit when exporting a learning network as a new stand-alone model type, of which model above will be an instance. See the MLJ reference manual for details.\n\nA learning network interface is a named tuple declaring certain interface points in a learning network, to be used when \"exporting\" the network as a new stand-alone model type. Examples are\n\n (predict=yhat,)\n (transform=Xsmall, acceleration=CPUThreads())\n (predict=yhat, transform=W, report=(loss=loss_node,))\n\nHere yhat, Xsmall, W and loss_node are nodes in the network.\n\nThe keys of the learning network interface are always one of the following:\n\nThe name of an operation, such as :predict, :predict_mode, :transform, :inverse_transform. See \"Operation keys\" below.\n:report, for exposing results of calling a node with no arguments in the composite model report. See \"Including report nodes\" below.\n:fitted_params, for exposing results of calling a node with no arguments as fitted parameters of the composite model. See \"Including fitted parameter nodes\" below.\n:acceleration, for articulating acceleration mode for training the network, e.g., CPUThreads(). Corresponding value must be an AbstractResource. If not included, CPU1() is used.\n\nOperation keys\n\nIf the key is an operation, then the value must be a node n in the network with a unique origin (length(origins(n)) === 1). The intention of a declaration such as predict=yhat is that the exported model type implements predict, which, when applied to new data Xnew, should return yhat(Xnew).\n\nIncluding report nodes\n\nIf the key is :report, then the corresponding value must be a named tuple\n\n (k1=n1, k2=n2, ...)\n\nwhose values are all nodes. For each k=n pair, the key k will appear as a key in the composite model report, with a corresponding value of deepcopy(n()), called immediately after training or updating the network. For examples, refer to the \"Learning Networks\" section of the MLJ manual.\n\nIncluding fitted parameter nodes\n\nIf the key is :fitted_params, then the behaviour is as for report nodes but results are exposed as fitted parameters of the composite model instead of the report.\n\n\n\n\n\n","category":"function"},{"location":"learning_networks/","page":"Learning Networks","title":"Learning Networks","text":"See more on fitting nodes at fit! 
and fit_only!.","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/#MultitargetKNNClassifier_NearestNeighborModels","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"","category":"section"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"MultitargetKNNClassifier","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"A model type for constructing a multitarget K-nearest neighbor classifier, based on NearestNeighborModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"MultitargetKNNClassifier = @load MultitargetKNNClassifier pkg=NearestNeighborModels","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"Do model = MultitargetKNNClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetKNNClassifier(K=...).","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"Multi-target K-Nearest Neighbors Classifier (MultitargetKNNClassifier) is a variation of KNNClassifier that assumes the target variable is vector-valued with Multiclass or OrderedFactor components. 
(Target data must be presented as a table, however.)","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/#Training-data","page":"MultitargetKNNClassifier","title":"Training data","text":"","category":"section"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"OR","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"mach = machine(model, X, y, w)","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"Here:","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).\ny is the target, which can be any table of responses whose element scitype is either <:Finite (<:Multiclass or <:OrderedFactor will do); check the column scitypes with schema(y). Each column of y is assumed to belong to a common categorical pool.\nw is the observation weights, which can either be nothing (default) or an AbstractVector whose element scitype is Count or Continuous. This is different from the weights kernel, which is a model hyperparameter; see below.","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/#Hyper-parameters","page":"MultitargetKNNClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"K::Int=5 : number of neighbors\nalgorithm::Symbol = :kdtree : one of (:kdtree, :brutetree, :balltree)\nmetric::Metric = Euclidean() : any Metric from Distances.jl for the distance between points. For algorithm = :kdtree only metrics which are instances of Union{Distances.Chebyshev, Distances.Cityblock, Distances.Euclidean, Distances.Minkowski, Distances.WeightedCityblock, Distances.WeightedEuclidean, Distances.WeightedMinkowski} are supported.\nleafsize::Int = 10 : determines the number of points at which to stop splitting the tree. This option is ignored and always taken as 0 for algorithm = :brutetree, since brutetree isn't actually a tree.\nreorder::Bool = true : if true then points which are close in distance are placed close in memory. In this case, a copy of the original data will be made so that the original data is left unmodified. Setting this to true can significantly improve performance of the specified algorithm (except :brutetree). 
This option is ignored and always taken as false for algorithm = :brutetree.\nweights::KNNKernel=Uniform() : kernel used in assigning weights to the k-nearest neighbors for each observation. An instance of one of the types in list_kernels(). User-defined weighting functions can be passed by wrapping the function in a UserDefinedKernel kernel (do ?NearestNeighborModels.UserDefinedKernel for more info). If observation weights w are passed during machine construction then the weight assigned to each neighbor vote is the product of the kernel generated weight for that neighbor and the corresponding observation weight.\noutput_type::Type{<:MultiUnivariateFinite}=DictTable : One of (ColumnTable, DictTable). The type of table type to use for predictions. Setting to ColumnTable might improve performance for narrow tables while setting to DictTable improves performance for wide tables.","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/#Operations","page":"MultitargetKNNClassifier","title":"Operations","text":"","category":"section"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"predict(mach, Xnew): Return predictions of the target given features Xnew, which should have same scitype as X above. Predictions are either a ColumnTable or DictTable of UnivariateFiniteVector columns depending on the value set for the output_type parameter discussed above. The probabilistic predictions are uncalibrated.\npredict_mode(mach, Xnew): Return the modes of each column of the table of probabilistic predictions returned above.","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/#Fitted-parameters","page":"MultitargetKNNClassifier","title":"Fitted parameters","text":"","category":"section"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"tree: An instance of either KDTree, BruteTree or BallTree depending on the value of the algorithm hyperparameter (See hyper-parameters section above). 
These are data structures that store the training data with a view to making nearest neighbor searches on test data points quicker.","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/#Examples","page":"MultitargetKNNClassifier","title":"Examples","text":"","category":"section"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"using MLJ, StableRNGs\n\n## set rng for reproducibility\nrng = StableRNG(10)\n\n## Dataset generation\nn, p = 10, 3\nX = table(randn(rng, n, p)) ## feature table\nfruit, color = categorical([\"apple\", \"orange\"]), categorical([\"blue\", \"green\"])\ny = [(fruit = rand(rng, fruit), color = rand(rng, color)) for _ in 1:n] ## target_table\n## Each column in y has a common categorical pool as expected\nselectcols(y, :fruit) ## categorical array\nselectcols(y, :color) ## categorical array\n\n## Load MultitargetKNNClassifier\nMultitargetKNNClassifier = @load MultitargetKNNClassifier pkg=NearestNeighborModels\n\n## view possible kernels\nNearestNeighborModels.list_kernels()\n\n## MultitargetKNNClassifier instantiation\nmodel = MultitargetKNNClassifier(K=3, weights = NearestNeighborModels.Inverse())\n\n## wrap model and required data in an MLJ machine and fit\nmach = machine(model, X, y) |> fit!\n\n## predict\ny_hat = predict(mach, X)\nlabels = predict_mode(mach, X)\n","category":"page"},{"location":"models/MultitargetKNNClassifier_NearestNeighborModels/","page":"MultitargetKNNClassifier","title":"MultitargetKNNClassifier","text":"See also KNNClassifier","category":"page"},{"location":"models/AdaBoostRegressor_MLJScikitLearnInterface/#AdaBoostRegressor_MLJScikitLearnInterface","page":"AdaBoostRegressor","title":"AdaBoostRegressor","text":"","category":"section"},{"location":"models/AdaBoostRegressor_MLJScikitLearnInterface/","page":"AdaBoostRegressor","title":"AdaBoostRegressor","text":"AdaBoostRegressor","category":"page"},{"location":"models/AdaBoostRegressor_MLJScikitLearnInterface/","page":"AdaBoostRegressor","title":"AdaBoostRegressor","text":"A model type for constructing an AdaBoost ensemble regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/AdaBoostRegressor_MLJScikitLearnInterface/","page":"AdaBoostRegressor","title":"AdaBoostRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/AdaBoostRegressor_MLJScikitLearnInterface/","page":"AdaBoostRegressor","title":"AdaBoostRegressor","text":"AdaBoostRegressor = @load AdaBoostRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/AdaBoostRegressor_MLJScikitLearnInterface/","page":"AdaBoostRegressor","title":"AdaBoostRegressor","text":"Do model = AdaBoostRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in AdaBoostRegressor(estimator=...).","category":"page"},{"location":"models/AdaBoostRegressor_MLJScikitLearnInterface/","page":"AdaBoostRegressor","title":"AdaBoostRegressor","text":"An AdaBoost regressor is a meta-estimator that begins by fitting a regressor on the original dataset and then fits additional copies of the regressor on the same dataset but where the weights of instances are adjusted according to the error of the current prediction. 
As such, subsequent regressors focus more on difficult cases.","category":"page"},{"location":"models/AdaBoostRegressor_MLJScikitLearnInterface/","page":"AdaBoostRegressor","title":"AdaBoostRegressor","text":"This class implements the algorithm known as AdaBoost.R2.","category":"page"},{"location":"models/KMeans_MLJScikitLearnInterface/#KMeans_MLJScikitLearnInterface","page":"KMeans","title":"KMeans","text":"","category":"section"},{"location":"models/KMeans_MLJScikitLearnInterface/","page":"KMeans","title":"KMeans","text":"KMeans","category":"page"},{"location":"models/KMeans_MLJScikitLearnInterface/","page":"KMeans","title":"KMeans","text":"A model type for constructing a k means, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/KMeans_MLJScikitLearnInterface/","page":"KMeans","title":"KMeans","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/KMeans_MLJScikitLearnInterface/","page":"KMeans","title":"KMeans","text":"KMeans = @load KMeans pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/KMeans_MLJScikitLearnInterface/","page":"KMeans","title":"KMeans","text":"Do model = KMeans() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KMeans(n_clusters=...).","category":"page"},{"location":"models/KMeans_MLJScikitLearnInterface/","page":"KMeans","title":"KMeans","text":"K-Means algorithm: find K centroids corresponding to K clusters in the data.","category":"page"},{"location":"models/UnivariateStandardizer_MLJModels/#UnivariateStandardizer_MLJModels","page":"UnivariateStandardizer","title":"UnivariateStandardizer","text":"","category":"section"},{"location":"models/UnivariateStandardizer_MLJModels/","page":"UnivariateStandardizer","title":"UnivariateStandardizer","text":"UnivariateStandardizer()","category":"page"},{"location":"models/UnivariateStandardizer_MLJModels/","page":"UnivariateStandardizer","title":"UnivariateStandardizer","text":"Transformer type for standardizing (whitening) single variable data.","category":"page"},{"location":"models/UnivariateStandardizer_MLJModels/","page":"UnivariateStandardizer","title":"UnivariateStandardizer","text":"This model may be deprecated in the future. 
Consider using Standardizer, which handles both tabular and univariate data.","category":"page"},{"location":"models/OrthogonalMatchingPursuitRegressor_MLJScikitLearnInterface/#OrthogonalMatchingPursuitRegressor_MLJScikitLearnInterface","page":"OrthogonalMatchingPursuitRegressor","title":"OrthogonalMatchingPursuitRegressor","text":"","category":"section"},{"location":"models/OrthogonalMatchingPursuitRegressor_MLJScikitLearnInterface/","page":"OrthogonalMatchingPursuitRegressor","title":"OrthogonalMatchingPursuitRegressor","text":"OrthogonalMatchingPursuitRegressor","category":"page"},{"location":"models/OrthogonalMatchingPursuitRegressor_MLJScikitLearnInterface/","page":"OrthogonalMatchingPursuitRegressor","title":"OrthogonalMatchingPursuitRegressor","text":"A model type for constructing a orthogonal matching pursuit regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/OrthogonalMatchingPursuitRegressor_MLJScikitLearnInterface/","page":"OrthogonalMatchingPursuitRegressor","title":"OrthogonalMatchingPursuitRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/OrthogonalMatchingPursuitRegressor_MLJScikitLearnInterface/","page":"OrthogonalMatchingPursuitRegressor","title":"OrthogonalMatchingPursuitRegressor","text":"OrthogonalMatchingPursuitRegressor = @load OrthogonalMatchingPursuitRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/OrthogonalMatchingPursuitRegressor_MLJScikitLearnInterface/","page":"OrthogonalMatchingPursuitRegressor","title":"OrthogonalMatchingPursuitRegressor","text":"Do model = OrthogonalMatchingPursuitRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in OrthogonalMatchingPursuitRegressor(n_nonzero_coefs=...).","category":"page"},{"location":"models/OrthogonalMatchingPursuitRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"OrthogonalMatchingPursuitRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/OrthogonalMatchingPursuitRegressor_MLJScikitLearnInterface/","page":"OrthogonalMatchingPursuitRegressor","title":"OrthogonalMatchingPursuitRegressor","text":"n_nonzero_coefs = nothing\ntol = nothing\nfit_intercept = true\nprecompute = auto","category":"page"},{"location":"learning_curves/#Learning-Curves","page":"Learning Curves","title":"Learning Curves","text":"","category":"section"},{"location":"learning_curves/","page":"Learning Curves","title":"Learning Curves","text":"A learning curve in MLJ is a plot of some performance estimate, as a function of some model hyperparameter. This can be useful when tuning a single model hyperparameter, or when deciding how many iterations are required for some iterative model. The learning_curve method does not actually generate a plot but generates the data needed to do so.","category":"page"},{"location":"learning_curves/","page":"Learning Curves","title":"Learning Curves","text":"To generate learning curves you can bind data to a model by instantiating a machine. 
You can choose to supply all available data, as performance estimates are computed using a resampling strategy, defaulting to Holdout(fraction_train=0.7).","category":"page"},{"location":"learning_curves/","page":"Learning Curves","title":"Learning Curves","text":"using MLJ\nX, y = @load_boston;\n\natom = (@load RidgeRegressor pkg=MLJLinearModels)()\nensemble = EnsembleModel(model=atom, n=1000)\nmach = machine(ensemble, X, y)\n\nr_lambda = range(ensemble, :(model.lambda), lower=1e-1, upper=100, scale=:log10)\ncurve = MLJ.learning_curve(mach;\n range=r_lambda,\n resampling=CV(nfolds=3),\n measure=l1)","category":"page"},{"location":"learning_curves/","page":"Learning Curves","title":"Learning Curves","text":"using Plots\nplot(curve.parameter_values,\n curve.measurements,\n xlab=curve.parameter_name,\n xscale=curve.parameter_scale,\n ylab = \"CV estimate of RMS error\")","category":"page"},{"location":"learning_curves/","page":"Learning Curves","title":"Learning Curves","text":"(Image: )","category":"page"},{"location":"learning_curves/","page":"Learning Curves","title":"Learning Curves","text":"If the range hyperparameter is the number of iterations in some iterative model, learning_curve will not restart the training from scratch for each new value, unless a non-holdout resampling strategy is specified (and provided the model implements an appropriate update method). To obtain multiple curves (that are distinct) you will need to pass the name of the model random number generator, rng_name, and specify the random number generators to be used using rngs=... (an integer automatically generates the number specified):","category":"page"},{"location":"learning_curves/","page":"Learning Curves","title":"Learning Curves","text":"atom.lambda = 7.3\nr_n = range(ensemble, :n, lower=1, upper=50)\ncurves = MLJ.learning_curve(mach;\n range=r_n,\n measure=l1,\n verbosity=0,\n rng_name=:rng,\n rngs=4)","category":"page"},{"location":"learning_curves/","page":"Learning Curves","title":"Learning Curves","text":"plot(curves.parameter_values,\n curves.measurements,\n xlab=curves.parameter_name,\n ylab=\"Holdout estimate of RMS error\")","category":"page"},{"location":"learning_curves/","page":"Learning Curves","title":"Learning Curves","text":"(Image: )","category":"page"},{"location":"learning_curves/#API-reference","page":"Learning Curves","title":"API reference","text":"","category":"section"},{"location":"learning_curves/","page":"Learning Curves","title":"Learning Curves","text":"MLJTuning.learning_curve","category":"page"},{"location":"learning_curves/#MLJTuning.learning_curve","page":"Learning Curves","title":"MLJTuning.learning_curve","text":"curve = learning_curve(mach; resolution=30,\n resampling=Holdout(),\n repeats=1,\n measure=default_measure(machine.model),\n rows=nothing,\n weights=nothing,\n operation=nothing,\n range=nothing,\n acceleration=default_resource(),\n acceleration_grid=CPU1(),\n rngs=nothing,\n rng_name=nothing)\n\nGiven a supervised machine mach, returns a named tuple of objects suitable for generating a plot of performance estimates, as a function of the single hyperparameter specified in range. The tuple curve has the following keys: :parameter_name, :parameter_scale, :parameter_values, :measurements.\n\nTo generate multiple curves for a model with a random number generator (RNG) as a hyperparameter, specify the name, rng_name, of the (possibly nested) RNG field, and a vector rngs of RNG's, one for each curve. 
Alternatively, set rngs to the number of curves desired, in which case RNG's are automatically generated. The individual curve computations can be distributed across multiple processes using acceleration=CPUProcesses() or acceleration=CPUThreads(). See the second example below for a demonstration.\n\nX, y = @load_boston;\natom = @load RidgeRegressor pkg=MultivariateStats\nensemble = EnsembleModel(atom=atom, n=1000)\nmach = machine(ensemble, X, y)\nr_lambda = range(ensemble, :(atom.lambda), lower=10, upper=500, scale=:log10)\ncurve = learning_curve(mach; range=r_lambda, resampling=CV(), measure=mav)\nusing Plots\nplot(curve.parameter_values,\n curve.measurements,\n xlab=curve.parameter_name,\n xscale=curve.parameter_scale,\n ylab = \"CV estimate of RMS error\")\n\nIf using a Holdout() resampling strategy (with no shuffling) and if the specified hyperparameter is the number of iterations in some iterative model (and that model has an appropriately overloaded MLJModelInterface.update method) then training is not restarted from scratch for each increment of the parameter, ie the model is trained progressively.\n\natom.lambda=200\nr_n = range(ensemble, :n, lower=1, upper=250)\ncurves = learning_curve(mach; range=r_n, verbosity=0, rng_name=:rng, rngs=3)\nplot!(curves.parameter_values,\n curves.measurements,\n xlab=curves.parameter_name,\n ylab=\"Holdout estimate of RMS error\")\n\n\n\nlearning_curve(model::Supervised, X, y; kwargs...)\nlearning_curve(model::Supervised, X, y, w; kwargs...)\n\nPlot a learning curve (or curves) directly, without first constructing a machine.\n\nSummary of key-word options\n\nresolution - number of points generated from range (number model evaluations); default is 30\nacceleration - parallelization option for passing to evaluate!; an instance of CPU1, CPUProcesses or CPUThreads from the ComputationalResources.jl; default is default_resource()\nacceleration_grid - parallelization option for distributing each performancde evaluation\nrngs - for specifying random number generator(s) to be passed to the model (see above)\nrng_name - name of the model hyper-parameter representing a random number generator (see above); possibly nested\n\nOther key-word options are documented at TunedModel.\n\n\n\n\n\n","category":"function"},{"location":"models/EvoLinearRegressor_EvoLinear/#EvoLinearRegressor_EvoLinear","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"","category":"section"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"EvoLinearRegressor(; kwargs...)","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"A model type for constructing a EvoLinearRegressor, based on EvoLinear.jl, and implementing both an internal API and the MLJ model interface.","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/#Keyword-arguments","page":"EvoLinearRegressor","title":"Keyword arguments","text":"","category":"section"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"loss=:mse: loss function to be minimised. Can be one of:\n:mse\n:logistic\n:poisson\n:gamma\n:tweedie\nnrounds=10: maximum number of training rounds.\neta=1: Learning rate. Typically in the range [1e-2, 1].\nL1=0: Regularization penalty applied by shrinking to 0 weight update if update is < L1. No penalty if update > L1. Results in sparse feature selection. 
Typically in the [0, 1] range on normalized features.\nL2=0: Regularization penalty applied to the square of the weight update value. Restricts large parameter values. Typically in the [0, 1] range on normalized features.\nrng=123: random seed. Not used at the moment.\nupdater=:all: training method. Only :all is supported at the moment. Gradients for each feature are computed simultaneously, then bias is updated based on all features update.\ndevice=:cpu: Only :cpu is supported at the moment.","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/#Internal-API","page":"EvoLinearRegressor","title":"Internal API","text":"","category":"section"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"Do config = EvoLinearRegressor() to construct a hyper-parameter struct with default hyper-parameters. Provide keyword arguments as listed above to override defaults, for example:","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"EvoLinearRegressor(loss=:logistic, L1=1e-3, L2=1e-2, nrounds=100)","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/#Training-model","page":"EvoLinearRegressor","title":"Training model","text":"","category":"section"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"A model is built using fit:","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"config = EvoLinearRegressor()\nm = fit(config; x, y, w)","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/#Inference","page":"EvoLinearRegressor","title":"Inference","text":"","category":"section"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"The fitted result is an EvoLinearModel, which acts as a prediction function when passed a features matrix as argument. ","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"preds = m(x)","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/#MLJ-Interface","page":"EvoLinearRegressor","title":"MLJ Interface","text":"","category":"section"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"From MLJ, the type can be imported using:","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"EvoLinearRegressor = @load EvoLinearRegressor pkg=EvoLinear","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"Do model = EvoLinearRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in EvoLinearRegressor(loss=...).","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/#Training-model-2","page":"EvoLinearRegressor","title":"Training model","text":"","category":"section"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"In MLJ or MLJBase, bind an instance model to data with mach = machine(model, X, y) where: ","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)\ny: is the target, which can be any AbstractVector whose element scitype is <:Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/#Operations","page":"EvoLinearRegressor","title":"Operations","text":"","category":"section"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"predict(mach, Xnew): return predictions of the target given","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"features Xnew having the same scitype as X above. Predictions are deterministic.","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/#Fitted-parameters","page":"EvoLinearRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":":fitresult: the EvoLinearModel object returned by EvoLnear.jl fitting algorithm.","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/#Report","page":"EvoLinearRegressor","title":"Report","text":"","category":"section"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":"The fields of report(mach) are:","category":"page"},{"location":"models/EvoLinearRegressor_EvoLinear/","page":"EvoLinearRegressor","title":"EvoLinearRegressor","text":":coef: Vector of coefficients (βs) associated to each of the features.\n:bias: Value of the bias.\n:names: Names of each of the features.","category":"page"},{"location":"models/KernelPerceptronClassifier_BetaML/#KernelPerceptronClassifier_BetaML","page":"KernelPerceptronClassifier","title":"KernelPerceptronClassifier","text":"","category":"section"},{"location":"models/KernelPerceptronClassifier_BetaML/","page":"KernelPerceptronClassifier","title":"KernelPerceptronClassifier","text":"mutable struct KernelPerceptronClassifier <: MLJModelInterface.Probabilistic","category":"page"},{"location":"models/KernelPerceptronClassifier_BetaML/","page":"KernelPerceptronClassifier","title":"KernelPerceptronClassifier","text":"The kernel perceptron algorithm using one-vs-one for multiclass, from the Beta Machine Learning Toolkit 
(BetaML).","category":"page"},{"location":"models/KernelPerceptronClassifier_BetaML/#Hyperparameters:","page":"KernelPerceptronClassifier","title":"Hyperparameters:","text":"","category":"section"},{"location":"models/KernelPerceptronClassifier_BetaML/","page":"KernelPerceptronClassifier","title":"KernelPerceptronClassifier","text":"kernel::Function: Kernel function to employ. See ?radial_kernel or ?polynomial_kernel (once loaded the BetaML package) for details or check ?BetaML.Utils to verify if other kernels are defined (you can alsways define your own kernel) [def: radial_kernel]\nepochs::Int64: Maximum number of epochs, i.e. passages trough the whole training sample [def: 100]\ninitial_errors::Union{Nothing, Vector{Vector{Int64}}}: Initial distribution of the number of errors errors [def: nothing, i.e. zeros]. If provided, this should be a nModels-lenght vector of nRecords integer values vectors , where nModels is computed as (n_classes * (n_classes - 1)) / 2\nshuffle::Bool: Whether to randomly shuffle the data at each iteration (epoch) [def: true]\nrng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [deafult: Random.GLOBAL_RNG]","category":"page"},{"location":"models/KernelPerceptronClassifier_BetaML/#Example:","page":"KernelPerceptronClassifier","title":"Example:","text":"","category":"section"},{"location":"models/KernelPerceptronClassifier_BetaML/","page":"KernelPerceptronClassifier","title":"KernelPerceptronClassifier","text":"julia> using MLJ\n\njulia> X, y = @load_iris;\n\njulia> modelType = @load KernelPerceptronClassifier pkg = \"BetaML\"\n[ Info: For silent loading, specify `verbosity=0`. \nimport BetaML ✔\nBetaML.Perceptron.KernelPerceptronClassifier\n\njulia> model = modelType()\nKernelPerceptronClassifier(\n kernel = BetaML.Utils.radial_kernel, \n epochs = 100, \n initial_errors = nothing, \n shuffle = true, \n rng = Random._GLOBAL_RNG())\n\njulia> mach = machine(model, X, y);\n\njulia> fit!(mach);\n\njulia> est_classes = predict(mach, X)\n150-element CategoricalDistributions.UnivariateFiniteVector{Multiclass{3}, String, UInt8, Float64}:\n UnivariateFinite{Multiclass{3}}(setosa=>0.665, versicolor=>0.245, virginica=>0.09)\n UnivariateFinite{Multiclass{3}}(setosa=>0.665, versicolor=>0.245, virginica=>0.09)\n ⋮\n UnivariateFinite{Multiclass{3}}(setosa=>0.09, versicolor=>0.245, virginica=>0.665)\n UnivariateFinite{Multiclass{3}}(setosa=>0.09, versicolor=>0.665, virginica=>0.245)","category":"page"},{"location":"model_search/#model_search","page":"Model Search","title":"Model Search","text":"","category":"section"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"MLJ has a model registry, allowing the user to search models and their properties, without loading all the packages containing model code. In turn, this allows one to efficiently find all models solving a given machine learning task. 
The task itself is specified with the help of the matching method, and the search executed with the models methods, as detailed below.","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"For commonly encountered problems with model search, see also Preparing Data.","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"A table of all models is also given at List of Supported Models.","category":"page"},{"location":"model_search/#Model-metadata","page":"Model Search","title":"Model metadata","text":"","category":"section"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"Terminology. In this section the word \"model\" refers to a metadata entry in the model registry, as opposed to an actual model struct that such an entry represents. One can obtain such an entry with the info command:","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"using MLJ\nMLJ.color_off()","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"info(\"PCA\")","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"So a \"model\" in the present context is just a named tuple containing metadata, and not an actual model type or instance. If two models with the same name occur in different packages, the package name must be specified, as in info(\"LinearRegressor\", pkg=\"GLM\").","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"Model document strings can be retreived, without importing the defining code, using the doc function:","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"doc(\"DecisionTreeClassifier\", pkg=\"DecisionTree\")","category":"page"},{"location":"model_search/#General-model-queries","page":"Model Search","title":"General model queries","text":"","category":"section"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"We list all models (named tuples) using models(), and list the models for which code is already loaded with localmodels():","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"localmodels()\nlocalmodels()[2]","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"One can search for models containing specified strings or regular expressions in their docstring attributes, as in","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"models(\"forest\")","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"or by specifying a filter (Bool-valued function):","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"filter(model) = model.is_supervised &&\n model.input_scitype >: MLJ.Table(Continuous) &&\n model.target_scitype >: AbstractVector{<:Multiclass{3}} &&\n model.prediction_type == :deterministic\nmodels(filter)","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"Multiple test arguments may be passed to models, which are applied conjunctively.","category":"page"},{"location":"model_search/#Matching-models-to-data","page":"Model Search","title":"Matching models to 
data","text":"","category":"section"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"Common searches are streamlined with the help of the matching command, defined as follows:","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"matching(model, X, y) == true exactly when model is supervised and admits inputs and targets with the scientific types of X and y, respectively\nmatching(model, X) == true exactly when model is unsupervised and admits inputs with the scientific types of X.","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"So, to search for all supervised probabilistic models handling input X and target y, one can define the testing function task by","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"task(model) = matching(model, X, y) && model.prediction_type == :probabilistic","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"And execute the search with","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"models(task)","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"Also defined are Bool-valued callable objects matching(model), matching(X, y) and matching(X), with obvious behavior. For example, matching(X, y)(model) = matching(model, X, y).","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"So, to search for all models compatible with input X and target y, for example, one executes","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"models(matching(X, y))","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"while the preceding search can also be written","category":"page"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"models() do model\n matching(model, X, y) &&\n model.prediction_type == :probabilistic\nend","category":"page"},{"location":"model_search/#API","page":"Model Search","title":"API","text":"","category":"section"},{"location":"model_search/","page":"Model Search","title":"Model Search","text":"models\nlocalmodels","category":"page"},{"location":"model_search/#MLJModels.models","page":"Model Search","title":"MLJModels.models","text":"models(; wrappers=false)\n\nList all models in the MLJ registry. Here and below model means the registry metadata entry for a genuine model type (a proxy for types whose defining code may not be loaded). 
To include wrappers and other composite models, such as TunedModel and Stack, specify wrappers=true.\n\nmodels(filters...; wrappers=false)\n\nList all models m for which filter(m) is true, for each filter in filters.\n\nmodels(matching(X, y); wrappers=false)\n\nList all supervised models compatible with training data X, y.\n\nmodels(matching(X); wrappers=false)\n\nList all unsupervised models compatible with training data X.\n\nExample\n\nIf\n\ntask(model) = model.is_supervised && model.is_probabilistic\n\nthen models(task) lists all supervised models making probabilistic predictions.\n\nSee also: localmodels.\n\n\n\n\n\nmodels(needle::Union{AbstractString,Regex}; wrappers=false)\n\nList all models whose name or docstring matches a given needle.\n\n\n\n\n\n","category":"function"},{"location":"model_search/#MLJModels.localmodels","page":"Model Search","title":"MLJModels.localmodels","text":"localmodels(; modl=Main, wrappers=false)\nlocalmodels(filters...; modl=Main, wrappers=false)\nlocalmodels(needle::Union{AbstractString,Regex}; modl=Main, wrappers=false)\n\nList all models currently available to the user from the module modl without importing a package, and which additionally pass through the specified filters. Here a filter is a Bool-valued function on models.\n\nUse load_path to get the path to some model returned, as in these examples:\n\nms = localmodels()\nmodel = ms[1]\nload_path(model)\n\nSee also models, load_path.\n\n\n\n\n\n","category":"function"},{"location":"models/HistGradientBoostingClassifier_MLJScikitLearnInterface/#HistGradientBoostingClassifier_MLJScikitLearnInterface","page":"HistGradientBoostingClassifier","title":"HistGradientBoostingClassifier","text":"","category":"section"},{"location":"models/HistGradientBoostingClassifier_MLJScikitLearnInterface/","page":"HistGradientBoostingClassifier","title":"HistGradientBoostingClassifier","text":"HistGradientBoostingClassifier","category":"page"},{"location":"models/HistGradientBoostingClassifier_MLJScikitLearnInterface/","page":"HistGradientBoostingClassifier","title":"HistGradientBoostingClassifier","text":"A model type for constructing a hist gradient boosting classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/HistGradientBoostingClassifier_MLJScikitLearnInterface/","page":"HistGradientBoostingClassifier","title":"HistGradientBoostingClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/HistGradientBoostingClassifier_MLJScikitLearnInterface/","page":"HistGradientBoostingClassifier","title":"HistGradientBoostingClassifier","text":"HistGradientBoostingClassifier = @load HistGradientBoostingClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/HistGradientBoostingClassifier_MLJScikitLearnInterface/","page":"HistGradientBoostingClassifier","title":"HistGradientBoostingClassifier","text":"Do model = HistGradientBoostingClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in HistGradientBoostingClassifier(loss=...).","category":"page"},{"location":"models/HistGradientBoostingClassifier_MLJScikitLearnInterface/","page":"HistGradientBoostingClassifier","title":"HistGradientBoostingClassifier","text":"This algorithm builds an additive model in a forward stage-wise fashion; it allows for the optimization of arbitrary differentiable loss functions. 
In each stage n_classes_ regression trees are fit on the negative gradient of the loss function, e.g. binary or multiclass log loss. Binary classification is a special case where only a single regression tree is induced.","category":"page"},{"location":"models/HistGradientBoostingClassifier_MLJScikitLearnInterface/","page":"HistGradientBoostingClassifier","title":"HistGradientBoostingClassifier","text":"HistGradientBoostingClassifier is a much faster variant of this algorithm for intermediate datasets (n_samples >= 10_000).","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/#LinearBinaryClassifier_GLM","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"","category":"section"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"LinearBinaryClassifier","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"A model type for constructing a linear binary classifier, based on GLM.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"LinearBinaryClassifier = @load LinearBinaryClassifier pkg=GLM","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"Do model = LinearBinaryClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearBinaryClassifier(fit_intercept=...).","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"LinearBinaryClassifier is a generalized linear model, specialised to the case of a binary target variable, with a user-specified link function. 
Options exist to specify an intercept or offset feature.","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/#Training-data","page":"LinearBinaryClassifier","title":"Training data","text":"","category":"section"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"In MLJ or MLJBase, bind an instance model to data with one of:","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"mach = machine(model, X, y)\nmach = machine(model, X, y, w)","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"Here","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"X: is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check the scitype with schema(X)\ny: is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor(2) or <:Multiclass(2); check the scitype with schema(y)\nw: is a vector of Real per-observation weights","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/#Hyper-parameters","page":"LinearBinaryClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"fit_intercept=true: Whether to calculate the intercept for this model. If set to false, no intercept will be calculated (e.g. the data is expected to be centered)\nlink=GLM.LogitLink: The function which links the linear prediction function to the probability of a particular outcome or class. This must have type GLM.Link01. Options include GLM.LogitLink(), GLM.ProbitLink(), CloglogLink(),CauchitLink()`.\noffsetcol=nothing: Name of the column to be used as an offset, if any. An offset is a variable which is known to have a coefficient of 1.\nmaxiter::Integer=30: The maximum number of iterations allowed to achieve convergence.\natol::Real=1e-6: Absolute threshold for convergence. Convergence is achieved when the relative change in deviance is less than `max(rtol*dev, atol). This term exists to avoid failure when deviance is unchanged except for rounding errors.\nrtol::Real=1e-6: Relative threshold for convergence. Convergence is achieved when the relative change in deviance is less than `max(rtol*dev, atol). This term exists to avoid failure when deviance is unchanged except for rounding errors.\nminstepfac::Real=0.001: Minimum step fraction. Must be between 0 and 1. Lower bound for the factor used to update the linear fit.\nreport_keys: Vector of keys for the report. Possible keys are: :deviance, :dof_residual, :stderror, :vcov, :coef_table and :glm_model. By default only :glm_model is excluded.","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/#Operations","page":"LinearBinaryClassifier","title":"Operations","text":"","category":"section"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"predict(mach, Xnew): Return predictions of the target given features Xnew having the same scitype as X above. 
Predictions are probabilistic.\npredict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/#Fitted-parameters","page":"LinearBinaryClassifier","title":"Fitted parameters","text":"","category":"section"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"features: The names of the features used during model fitting.\ncoef: The linear coefficients determined by the model.\nintercept: The intercept determined by the model.","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/#Report","page":"LinearBinaryClassifier","title":"Report","text":"","category":"section"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"The fields of report(mach) are:","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"deviance: Measure of deviance of fitted model with respect to a perfectly fitted model. For a linear model, this is the weighted residual sum of squares\ndof_residual: The degrees of freedom for residuals, when meaningful.\nstderror: The standard errors of the coefficients.\nvcov: The estimated variance-covariance matrix of the coefficient estimates.\ncoef_table: Table which displays coefficients and summarizes their significance and confidence intervals.\nglm_model: The raw fitted model returned by GLM.lm. Note this points to training data. Refer to the GLM.jl documentation for usage.","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/#Examples","page":"LinearBinaryClassifier","title":"Examples","text":"","category":"section"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"using MLJ\nimport GLM ## namespace must be available\n\nLinearBinaryClassifier = @load LinearBinaryClassifier pkg=GLM\nclf = LinearBinaryClassifier(fit_intercept=false, link=GLM.ProbitLink())\n\nX, y = @load_crabs\n\nmach = machine(clf, X, y) |> fit!\n\nXnew = (;FL = [8.1, 24.8, 7.2],\n RW = [5.1, 25.7, 6.4],\n CL = [15.9, 46.7, 14.3],\n CW = [18.7, 59.7, 12.2],\n BD = [6.2, 23.6, 8.4],)\n\nyhat = predict(mach, Xnew) ## probabilistic predictions\npdf(yhat, levels(y)) ## probability matrix\np_B = pdf.(yhat, \"B\")\nclass_labels = predict_mode(mach, Xnew)\n\nfitted_params(mach).features\nfitted_params(mach).coef\nfitted_params(mach).intercept\n\nreport(mach)","category":"page"},{"location":"models/LinearBinaryClassifier_GLM/","page":"LinearBinaryClassifier","title":"LinearBinaryClassifier","text":"See also LinearRegressor, LinearCountRegressor","category":"page"},{"location":"models/SOSDetector_OutlierDetectionPython/#SOSDetector_OutlierDetectionPython","page":"SOSDetector","title":"SOSDetector","text":"","category":"section"},{"location":"models/SOSDetector_OutlierDetectionPython/","page":"SOSDetector","title":"SOSDetector","text":"SOSDetector(perplexity = 4.5,\n metric = \"minkowski\",\n eps = 
1e-5)","category":"page"},{"location":"models/SOSDetector_OutlierDetectionPython/","page":"SOSDetector","title":"SOSDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.sos","category":"page"},{"location":"models/BayesianQDA_MLJScikitLearnInterface/#BayesianQDA_MLJScikitLearnInterface","page":"BayesianQDA","title":"BayesianQDA","text":"","category":"section"},{"location":"models/BayesianQDA_MLJScikitLearnInterface/","page":"BayesianQDA","title":"BayesianQDA","text":"BayesianQDA","category":"page"},{"location":"models/BayesianQDA_MLJScikitLearnInterface/","page":"BayesianQDA","title":"BayesianQDA","text":"A model type for constructing a Bayesian quadratic discriminant analysis, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/BayesianQDA_MLJScikitLearnInterface/","page":"BayesianQDA","title":"BayesianQDA","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/BayesianQDA_MLJScikitLearnInterface/","page":"BayesianQDA","title":"BayesianQDA","text":"BayesianQDA = @load BayesianQDA pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/BayesianQDA_MLJScikitLearnInterface/","page":"BayesianQDA","title":"BayesianQDA","text":"Do model = BayesianQDA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BayesianQDA(priors=...).","category":"page"},{"location":"models/BayesianQDA_MLJScikitLearnInterface/#Hyper-parameters","page":"BayesianQDA","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/BayesianQDA_MLJScikitLearnInterface/","page":"BayesianQDA","title":"BayesianQDA","text":"priors = nothing\nreg_param = 0.0\nstore_covariance = false\ntol = 0.0001","category":"page"},{"location":"models/XGBoostClassifier_XGBoost/#XGBoostClassifier_XGBoost","page":"XGBoostClassifier","title":"XGBoostClassifier","text":"","category":"section"},{"location":"models/XGBoostClassifier_XGBoost/","page":"XGBoostClassifier","title":"XGBoostClassifier","text":"XGBoostClassifier","category":"page"},{"location":"models/XGBoostClassifier_XGBoost/","page":"XGBoostClassifier","title":"XGBoostClassifier","text":"A model type for constructing a eXtreme Gradient Boosting Classifier, based on XGBoost.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/XGBoostClassifier_XGBoost/","page":"XGBoostClassifier","title":"XGBoostClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/XGBoostClassifier_XGBoost/","page":"XGBoostClassifier","title":"XGBoostClassifier","text":"XGBoostClassifier = @load XGBoostClassifier pkg=XGBoost","category":"page"},{"location":"models/XGBoostClassifier_XGBoost/","page":"XGBoostClassifier","title":"XGBoostClassifier","text":"Do model = XGBoostClassifier() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in XGBoostClassifier(test=...).","category":"page"},{"location":"models/XGBoostClassifier_XGBoost/","page":"XGBoostClassifier","title":"XGBoostClassifier","text":"Univariate classification using xgboost.","category":"page"},{"location":"models/XGBoostClassifier_XGBoost/#Training-data","page":"XGBoostClassifier","title":"Training data","text":"","category":"section"},{"location":"models/XGBoostClassifier_XGBoost/","page":"XGBoostClassifier","title":"XGBoostClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/XGBoostClassifier_XGBoost/","page":"XGBoostClassifier","title":"XGBoostClassifier","text":"m = machine(model, X, y)","category":"page"},{"location":"models/XGBoostClassifier_XGBoost/","page":"XGBoostClassifier","title":"XGBoostClassifier","text":"where","category":"page"},{"location":"models/XGBoostClassifier_XGBoost/","page":"XGBoostClassifier","title":"XGBoostClassifier","text":"X: any table of input features, either an AbstractMatrix or Tables.jl-compatible table.\ny: is an AbstractVector Finite target.","category":"page"},{"location":"models/XGBoostClassifier_XGBoost/","page":"XGBoostClassifier","title":"XGBoostClassifier","text":"Train using fit!(m, rows=...).","category":"page"},{"location":"models/XGBoostClassifier_XGBoost/#Hyper-parameters","page":"XGBoostClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/XGBoostClassifier_XGBoost/","page":"XGBoostClassifier","title":"XGBoostClassifier","text":"See https://xgboost.readthedocs.io/en/stable/parameter.html.","category":"page"},{"location":"models/LODADetector_OutlierDetectionPython/#LODADetector_OutlierDetectionPython","page":"LODADetector","title":"LODADetector","text":"","category":"section"},{"location":"models/LODADetector_OutlierDetectionPython/","page":"LODADetector","title":"LODADetector","text":"LODADetector(n_bins = 10,\n n_random_cuts = 100)","category":"page"},{"location":"models/LODADetector_OutlierDetectionPython/","page":"LODADetector","title":"LODADetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.loda","category":"page"},{"location":"models/RandomOversampler_Imbalance/#RandomOversampler_Imbalance","page":"RandomOversampler","title":"RandomOversampler","text":"","category":"section"},{"location":"models/RandomOversampler_Imbalance/","page":"RandomOversampler","title":"RandomOversampler","text":"Initiate a random oversampling model with the given hyper-parameters.","category":"page"},{"location":"models/RandomOversampler_Imbalance/","page":"RandomOversampler","title":"RandomOversampler","text":"RandomOversampler","category":"page"},{"location":"models/RandomOversampler_Imbalance/","page":"RandomOversampler","title":"RandomOversampler","text":"A model type for constructing a random oversampler, based on Imbalance.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/RandomOversampler_Imbalance/","page":"RandomOversampler","title":"RandomOversampler","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/RandomOversampler_Imbalance/","page":"RandomOversampler","title":"RandomOversampler","text":"RandomOversampler = @load RandomOversampler pkg=Imbalance","category":"page"},{"location":"models/RandomOversampler_Imbalance/","page":"RandomOversampler","title":"RandomOversampler","text":"Do model = RandomOversampler() to construct an instance with 
default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomOversampler(ratios=...).","category":"page"},{"location":"models/RandomOversampler_Imbalance/","page":"RandomOversampler","title":"RandomOversampler","text":"RandomOversampler implements naive oversampling by repeating existing observations with replacement.","category":"page"},{"location":"models/RandomOversampler_Imbalance/#Training-data","page":"RandomOversampler","title":"Training data","text":"","category":"section"},{"location":"models/RandomOversampler_Imbalance/","page":"RandomOversampler","title":"RandomOversampler","text":"In MLJ or MLJBase, wrap the model in a machine by mach = machine(model)","category":"page"},{"location":"models/RandomOversampler_Imbalance/","page":"RandomOversampler","title":"RandomOversampler","text":"There is no need to provide any data here because the model is a static transformer.","category":"page"},{"location":"models/RandomOversampler_Imbalance/","page":"RandomOversampler","title":"RandomOversampler","text":"Likewise, there is no need to fit!(mach). ","category":"page"},{"location":"models/RandomOversampler_Imbalance/","page":"RandomOversampler","title":"RandomOversampler","text":"For default values of the hyper-parameters, model can be constructed by model = RandomOverSampler()","category":"page"},{"location":"models/RandomOversampler_Imbalance/#Hyperparameters","page":"RandomOversampler","title":"Hyperparameters","text":"","category":"section"},{"location":"models/RandomOversampler_Imbalance/","page":"RandomOversampler","title":"RandomOversampler","text":"ratios=1.0: A parameter that controls the amount of oversampling to be done for each class\nCan be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class\nCan be a dictionary mapping each class label to the float ratio for that class\nrng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister`.","category":"page"},{"location":"models/RandomOversampler_Imbalance/#Transform-Inputs","page":"RandomOversampler","title":"Transform Inputs","text":"","category":"section"},{"location":"models/RandomOversampler_Imbalance/","page":"RandomOversampler","title":"RandomOversampler","text":"X: A matrix of real numbers or a table with element scitypes that subtype Union{Finite, Infinite}. Elements in nominal columns should subtype Finite (i.e., have scitype OrderedFactor or Multiclass) and elements in continuous columns should subtype Infinite (i.e., have scitype Count or Continuous).\ny: An abstract vector of labels (e.g., strings) that correspond to the observations in X","category":"page"},{"location":"models/RandomOversampler_Imbalance/#Transform-Outputs","page":"RandomOversampler","title":"Transform Outputs","text":"","category":"section"},{"location":"models/RandomOversampler_Imbalance/","page":"RandomOversampler","title":"RandomOversampler","text":"Xover: A matrix or table that includes original data and the new observations due to oversampling. 
depending on whether the input X is a matrix or table respectively\nyover: An abstract vector of labels corresponding to Xover","category":"page"},{"location":"models/RandomOversampler_Imbalance/#Operations","page":"RandomOversampler","title":"Operations","text":"","category":"section"},{"location":"models/RandomOversampler_Imbalance/","page":"RandomOversampler","title":"RandomOversampler","text":"transform(mach, X, y): resample the data X and y using RandomOversampler, returning both the new and original observations","category":"page"},{"location":"models/RandomOversampler_Imbalance/#Example","page":"RandomOversampler","title":"Example","text":"","category":"section"},{"location":"models/RandomOversampler_Imbalance/","page":"RandomOversampler","title":"RandomOversampler","text":"using MLJ\nimport Imbalance\n\n## set probability of each class\nclass_probs = [0.5, 0.2, 0.3] \nnum_rows, num_continuous_feats = 100, 5\n## generate a table and categorical vector accordingly\nX, y = Imbalance.generate_imbalanced_data(num_rows, num_continuous_feats; \n class_probs, rng=42) \n\njulia> Imbalance.checkbalance(y)\n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (39.6%) \n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 33 (68.8%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%) \n\n## load RandomOversampler\nRandomOversampler = @load RandomOversampler pkg=Imbalance\n\n## wrap the model in a machine\noversampler = RandomOversampler(ratios=Dict(0=>1.0, 1=> 0.9, 2=>0.8), rng=42)\nmach = machine(oversampler)\n\n## provide the data to transform (there is nothing to fit)\nXover, yover = transform(mach, X, y)\n\njulia> Imbalance.checkbalance(yover)\n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 38 (79.2%) \n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 43 (89.6%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%) ","category":"page"},{"location":"models/DummyRegressor_MLJScikitLearnInterface/#DummyRegressor_MLJScikitLearnInterface","page":"DummyRegressor","title":"DummyRegressor","text":"","category":"section"},{"location":"models/DummyRegressor_MLJScikitLearnInterface/","page":"DummyRegressor","title":"DummyRegressor","text":"DummyRegressor","category":"page"},{"location":"models/DummyRegressor_MLJScikitLearnInterface/","page":"DummyRegressor","title":"DummyRegressor","text":"A model type for constructing a dummy regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/DummyRegressor_MLJScikitLearnInterface/","page":"DummyRegressor","title":"DummyRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/DummyRegressor_MLJScikitLearnInterface/","page":"DummyRegressor","title":"DummyRegressor","text":"DummyRegressor = @load DummyRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/DummyRegressor_MLJScikitLearnInterface/","page":"DummyRegressor","title":"DummyRegressor","text":"Do model = DummyRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in DummyRegressor(strategy=...).","category":"page"},{"location":"models/DummyRegressor_MLJScikitLearnInterface/","page":"DummyRegressor","title":"DummyRegressor","text":"DummyRegressor is a regressor that makes predictions using simple rules.","category":"page"},{"location":"models/PegasosClassifier_BetaML/#PegasosClassifier_BetaML","page":"PegasosClassifier","title":"PegasosClassifier","text":"","category":"section"},{"location":"models/PegasosClassifier_BetaML/","page":"PegasosClassifier","title":"PegasosClassifier","text":"mutable struct PegasosClassifier <: MLJModelInterface.Probabilistic","category":"page"},{"location":"models/PegasosClassifier_BetaML/","page":"PegasosClassifier","title":"PegasosClassifier","text":"The gradient-based linear \"pegasos\" classifier using one-vs-all for multiclass, from the Beta Machine Learning Toolkit (BetaML).","category":"page"},{"location":"models/PegasosClassifier_BetaML/#Hyperparameters:","page":"PegasosClassifier","title":"Hyperparameters:","text":"","category":"section"},{"location":"models/PegasosClassifier_BetaML/","page":"PegasosClassifier","title":"PegasosClassifier","text":"initial_coefficients::Union{Nothing, Matrix{Float64}}: N-classes by D-dimensions matrix of initial linear coefficients [def: nothing, i.e. zeros]\ninitial_constant::Union{Nothing, Vector{Float64}}: N-classes vector of initial constant terms [def: nothing, i.e. zeros]\nlearning_rate::Function: Learning rate [def: (epoch -> 1/sqrt(epoch))]\nlearning_rate_multiplicative::Float64: Multiplicative term of the learning rate [def: 0.5]\nepochs::Int64: Maximum number of epochs, i.e. passes through the whole training sample [def: 1000]\nshuffle::Bool: Whether to randomly shuffle the data at each iteration (epoch) [def: true]\nforce_origin::Bool: Whether to force the parameter associated with the constant term to remain zero [def: false]\nreturn_mean_hyperplane::Bool: Whether to return the average hyperplane coefficients instead of the final ones [def: false]\nrng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]","category":"page"},{"location":"models/PegasosClassifier_BetaML/#Example:","page":"PegasosClassifier","title":"Example:","text":"","category":"section"},{"location":"models/PegasosClassifier_BetaML/","page":"PegasosClassifier","title":"PegasosClassifier","text":"julia> using MLJ\n\njulia> X, y = @load_iris;\n\njulia> modelType = @load PegasosClassifier pkg = \"BetaML\" verbosity=0\nBetaML.Perceptron.PegasosClassifier\n\njulia> model = modelType()\nPegasosClassifier(\n initial_coefficients = nothing, \n initial_constant = nothing, \n learning_rate = BetaML.Perceptron.var\"#71#73\"(), \n learning_rate_multiplicative = 0.5, \n epochs = 1000, \n shuffle = true, \n force_origin = false, \n return_mean_hyperplane = false, \n rng = Random._GLOBAL_RNG())\n\njulia> mach = machine(model, X, y);\n\njulia> fit!(mach);\n\njulia> est_classes = predict(mach, X)\n150-element CategoricalDistributions.UnivariateFiniteVector{Multiclass{3}, String, UInt8, Float64}:\n UnivariateFinite{Multiclass{3}}(setosa=>0.817, versicolor=>0.153, virginica=>0.0301)\n UnivariateFinite{Multiclass{3}}(setosa=>0.791, versicolor=>0.177, virginica=>0.0318)\n ⋮\n UnivariateFinite{Multiclass{3}}(setosa=>0.254, versicolor=>0.5, virginica=>0.246)\n UnivariateFinite{Multiclass{3}}(setosa=>0.283, versicolor=>0.51, 
virginica=>0.207)","category":"page"},{"location":"models/TheilSenRegressor_MLJScikitLearnInterface/#TheilSenRegressor_MLJScikitLearnInterface","page":"TheilSenRegressor","title":"TheilSenRegressor","text":"","category":"section"},{"location":"models/TheilSenRegressor_MLJScikitLearnInterface/","page":"TheilSenRegressor","title":"TheilSenRegressor","text":"TheilSenRegressor","category":"page"},{"location":"models/TheilSenRegressor_MLJScikitLearnInterface/","page":"TheilSenRegressor","title":"TheilSenRegressor","text":"A model type for constructing a Theil-Sen regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/TheilSenRegressor_MLJScikitLearnInterface/","page":"TheilSenRegressor","title":"TheilSenRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/TheilSenRegressor_MLJScikitLearnInterface/","page":"TheilSenRegressor","title":"TheilSenRegressor","text":"TheilSenRegressor = @load TheilSenRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/TheilSenRegressor_MLJScikitLearnInterface/","page":"TheilSenRegressor","title":"TheilSenRegressor","text":"Do model = TheilSenRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in TheilSenRegressor(fit_intercept=...).","category":"page"},{"location":"models/TheilSenRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"TheilSenRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/TheilSenRegressor_MLJScikitLearnInterface/","page":"TheilSenRegressor","title":"TheilSenRegressor","text":"fit_intercept = true\ncopy_X = true\nmax_subpopulation = 10000\nn_subsamples = nothing\nmax_iter = 300\ntol = 0.001\nrandom_state = nothing\nn_jobs = nothing\nverbose = false","category":"page"},{"location":"models/MultiTaskLassoCVRegressor_MLJScikitLearnInterface/#MultiTaskLassoCVRegressor_MLJScikitLearnInterface","page":"MultiTaskLassoCVRegressor","title":"MultiTaskLassoCVRegressor","text":"","category":"section"},{"location":"models/MultiTaskLassoCVRegressor_MLJScikitLearnInterface/","page":"MultiTaskLassoCVRegressor","title":"MultiTaskLassoCVRegressor","text":"MultiTaskLassoCVRegressor","category":"page"},{"location":"models/MultiTaskLassoCVRegressor_MLJScikitLearnInterface/","page":"MultiTaskLassoCVRegressor","title":"MultiTaskLassoCVRegressor","text":"A model type for constructing a multi-target lasso regressor with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/MultiTaskLassoCVRegressor_MLJScikitLearnInterface/","page":"MultiTaskLassoCVRegressor","title":"MultiTaskLassoCVRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/MultiTaskLassoCVRegressor_MLJScikitLearnInterface/","page":"MultiTaskLassoCVRegressor","title":"MultiTaskLassoCVRegressor","text":"MultiTaskLassoCVRegressor = @load MultiTaskLassoCVRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/MultiTaskLassoCVRegressor_MLJScikitLearnInterface/","page":"MultiTaskLassoCVRegressor","title":"MultiTaskLassoCVRegressor","text":"Do model = MultiTaskLassoCVRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in MultiTaskLassoCVRegressor(eps=...).","category":"page"},{"location":"models/MultiTaskLassoCVRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"MultiTaskLassoCVRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/MultiTaskLassoCVRegressor_MLJScikitLearnInterface/","page":"MultiTaskLassoCVRegressor","title":"MultiTaskLassoCVRegressor","text":"eps = 0.001\nn_alphas = 100\nalphas = nothing\nfit_intercept = true\nmax_iter = 300\ntol = 0.0001\ncopy_X = true\ncv = 5\nverbose = false\nn_jobs = 1\nrandom_state = nothing\nselection = cyclic","category":"page"},{"location":"evaluating_model_performance/#Evaluating-Model-Performance","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"","category":"section"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"MLJ allows quick evaluation of a supervised model's performance against a battery of selected losses or scores. For more on available performance measures, see Performance Measures.","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"In addition to hold-out and cross-validation, the user can specify an explicit list of train/test pairs of row indices for resampling, or define new resampling strategies.","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"For simultaneously evaluating multiple models, see Comparing models of different type and nested cross-validation.","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"For externally logging the outcomes of performance evaluation experiments, see Logging Workflows","category":"page"},{"location":"evaluating_model_performance/#Evaluating-against-a-single-measure","page":"Evaluating Model Performance","title":"Evaluating against a single measure","text":"","category":"section"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"using MLJ\nMLJ.color_off()","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"using MLJ\nX = (a=rand(12), b=rand(12), c=rand(12));\ny = X.a + 2X.b + 0.05*rand(12);\nmodel = (@load RidgeRegressor pkg=MultivariateStats verbosity=0)()\ncv = CV(nfolds=3)\nevaluate(model, X, y, resampling=cv, measure=l2, verbosity=0)","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"Alternatively, instead of applying evaluate to a model + data, one may call evaluate! on an existing machine wrapping the model in data:","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"mach = machine(model, X, y)\nevaluate!(mach, resampling=cv, measure=l2, verbosity=0)","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"(The latter call is a mutating call as the learned parameters stored in the machine potentially change. 
)","category":"page"},{"location":"evaluating_model_performance/#Multiple-measures","page":"Evaluating Model Performance","title":"Multiple measures","text":"","category":"section"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"Multiple measures are specified as a vector:","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"evaluate!(\n mach,\n resampling=cv,\n measures=[l1, rms, rmslp1],\n verbosity=0,\n)","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"Custom measures can also be provided.","category":"page"},{"location":"evaluating_model_performance/#Specifying-weights","page":"Evaluating Model Performance","title":"Specifying weights","text":"","category":"section"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"Per-observation weights can be passed to measures. If a measure does not support weights, the weights are ignored:","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"holdout = Holdout(fraction_train=0.8)\nweights = [1, 1, 2, 1, 1, 2, 3, 1, 1, 2, 3, 1];\nevaluate!(\n mach,\n resampling=CV(nfolds=3),\n measure=[l2, rsquared],\n weights=weights,\n)","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"In classification problems, use class_weights=... to specify a class weight dictionary.","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"MLJBase.evaluate!\nMLJBase.evaluate\nMLJBase.PerformanceEvaluation","category":"page"},{"location":"evaluating_model_performance/#MLJBase.evaluate!","page":"Evaluating Model Performance","title":"MLJBase.evaluate!","text":"evaluate!(mach; resampling=CV(), measure=nothing, options...)\n\nEstimate the performance of a machine mach wrapping a supervised model in data, using the specified resampling strategy (defaulting to 6-fold cross-validation) and measure, which can be a single measure or vector. Returns a PerformanceEvaluation object.\n\nAvailable resampling strategies are CV, Holdout, InSample, StratifiedCV and TimeSeriesCV. If resampling is not an instance of one of these, then a vector of tuples of the form (train_rows, test_rows) is expected. For example, setting\n\nresampling = [((1:100), (101:200)),\n ((101:200), (1:100))]\n\ngives two-fold cross-validation using the first 200 rows of data.\n\nAny measure conforming to the StatisticalMeasuresBase.jl API can be provided, assuming it can consume multiple observations.\n\nAlthough evaluate! is mutating, mach.model and mach.args are not mutated.\n\nAdditional keyword options\n\nrows - vector of observation indices from which both train and test folds are constructed (default is all observations)\noperation/operations=nothing - One of predict, predict_mean, predict_mode, predict_median, or predict_joint, or a vector of these of the same length as measure/measures. Automatically inferred if left unspecified. 
For example, predict_mode will be used for a Multiclass target, if model is a probabilistic predictor, but measure expects literal (point) target predictions. Operations actually applied can be inspected from the operation field of the object returned.\nweights - per-sample Real weights for measures that support them (not to be confused with weights used in training, such as the w in mach = machine(model, X, y, w)).\nclass_weights - dictionary of Real per-class weights for use with measures that support these, in classification problems (not to be confused with weights used in training, such as the w in mach = machine(model, X, y, w)).\nrepeats::Int=1: set to a higher value for repeated (Monte Carlo) resampling. For example, if repeats = 10, then resampling = CV(nfolds=5, shuffle=true) generates a total of 50 (train, test) pairs for evaluation and subsequent aggregation.\nacceleration=CPU1(): acceleration/parallelization option; can be any instance of CPU1 (single-threaded computation), CPUThreads (multi-threaded computation) or CPUProcesses (multi-process computation); default is default_resource(). These types are owned by ComputationalResources.jl.\nforce=false: set to true to force cold-restart of each training event\nverbosity::Int=1: logging level; can be negative\ncheck_measure=true: whether to screen measures for possible incompatibility with the model. Will not catch all incompatibilities.\nper_observation=true: whether to calculate estimates for individual observations; if false the per_observation field of the returned object is populated with missings. Setting to false may reduce compute time and allocations.\nlogger - a logger object (see MLJBase.log_evaluation)\ncompact=false - if true, the returned evaluation object excludes these fields: fitted_params_per_fold, report_per_fold, train_test_rows.\n\nSee also evaluate, PerformanceEvaluation, CompactPerformanceEvaluation.\n\n\n\n\n\n","category":"function"},{"location":"evaluating_model_performance/#MLJModelInterface.evaluate","page":"Evaluating Model Performance","title":"MLJModelInterface.evaluate","text":"some meta-models may choose to implement the evaluate operations\n\n\n\n\n\n","category":"function"},{"location":"evaluating_model_performance/#MLJBase.PerformanceEvaluation","page":"Evaluating Model Performance","title":"MLJBase.PerformanceEvaluation","text":"PerformanceEvaluation <: AbstractPerformanceEvaluation\n\nType of object returned by evaluate (for models plus data) or evaluate! (for machines). Such objects encode estimates of the performance (generalization error) of a supervised model or outlier detection model, and store other information ancillary to the computation.\n\nIf evaluate or evaluate! is called with the compact=true option, then a CompactPerformanceEvaluation object is returned instead.\n\nWhen evaluate/evaluate! is called, a number of train/test pairs (\"folds\") of row indices are generated, according to the options provided, which are discussed in the evaluate! doc-string. Rows correspond to observations. The generated train/test pairs are recorded in the train_test_rows field of the PerformanceEvaluation struct, and the corresponding estimates, aggregated over all train/test pairs, are recorded in measurement, a vector with one entry for each measure (metric) recorded in measure.\n\nWhen displayed, a PerformanceEvaluation object includes a value under the heading 1.96*SE, derived from the standard error of the per_fold entries. 
This value is suitable for constructing a formal 95% confidence interval for the given measurement. Such intervals should be interpreted with caution. See, for example, Bates et al. (2021).\n\nFields\n\nThese fields are part of the public API of the PerformanceEvaluation struct.\n\nmodel: model used to create the performance evaluation. In the case of a tuning model, this is the best model found.\nmeasure: vector of measures (metrics) used to evaluate performance\nmeasurement: vector of measurements - one for each element of measure - aggregating the performance measurements over all train/test pairs (folds). The aggregation method applied for a given measure m is StatisticalMeasuresBase.external_aggregation_mode(m) (commonly Mean() or Sum())\noperation (e.g., predict_mode): the operations applied for each measure to generate predictions to be evaluated. Possibilities are: predict, predict_mean, predict_mode, predict_median, or predict_joint.\nper_fold: a vector of vectors of individual test fold evaluations (one vector per measure). Useful for obtaining a rough estimate of the variance of the performance estimate.\nper_observation: a vector of vectors of vectors containing individual per-observation measurements: for an evaluation e, e.per_observation[m][f][i] is the measurement for the ith observation in the fth test fold, evaluated using the mth measure. Useful for some forms of hyper-parameter optimization. Note that an aggregated measurement for some measure measure is repeated across all observations in a fold if StatisticalMeasures.can_report_unaggregated(measure) == false. If e has been computed with the per_observation=false option, then e.per_observation is a vector of missings.\nfitted_params_per_fold: a vector containing fitted_params(mach) for each machine mach trained during resampling - one machine per train/test pair. 
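For example, for an evaluation e returned by evaluate!, e.fitted_params_per_fold[2] should give the output of fitted_params for the machine trained on the second train/test pair (an illustrative access pattern implied by the field description, not an additional API guarantee). 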
Use this to extract the learned parameters for each individual training event.\nreport_per_fold: a vector containing report(mach) for each machine mach trained in resampling - one machine per train/test pair.\ntrain_test_rows: a vector of tuples, each of the form (train, test), where train and test are vectors of row (observation) indices for training and evaluation respectively.\nresampling: the user-specified resampling strategy to generate the train/test pairs (or literal train/test pairs if that was directly specified).\nrepeats: the number of times the resampling strategy was repeated.\n\nSee also CompactPerformanceEvaluation.\n\n\n\n\n\n","category":"type"},{"location":"evaluating_model_performance/#User-specified-train/test-sets","page":"Evaluating Model Performance","title":"User-specified train/test sets","text":"","category":"section"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"Users can either provide an explicit list of train/test pairs of row indices for resampling, as in this example:","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"fold1 = 1:6; fold2 = 7:12;\nevaluate!(\n mach,\n resampling = [(fold1, fold2), (fold2, fold1)],\n measures=[l1, l2],\n verbosity=0,\n)","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"Or the user can define their own re-usable ResamplingStrategy objects; see Custom resampling strategies below.","category":"page"},{"location":"evaluating_model_performance/#Built-in-resampling-strategies","page":"Evaluating Model Performance","title":"Built-in resampling strategies","text":"","category":"section"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"MLJBase.Holdout","category":"page"},{"location":"evaluating_model_performance/#MLJBase.Holdout","page":"Evaluating Model Performance","title":"MLJBase.Holdout","text":"holdout = Holdout(; fraction_train=0.7, shuffle=nothing, rng=nothing)\n\nInstantiate a Holdout resampling strategy, for use in evaluate!, evaluate and in tuning.\n\ntrain_test_pairs(holdout, rows)\n\nReturns the pair [(train, test)], where train and test are vectors such that rows=vcat(train, test) and length(train)/length(rows) is approximately equal to fraction_train.\n\nPre-shuffling of rows is controlled by rng and shuffle. If rng is an integer, then the Holdout keyword constructor resets it to MersenneTwister(rng). Otherwise some AbstractRNG object is expected.\n\nIf rng is left unspecified, rng is reset to Random.GLOBAL_RNG, in which case rows are only pre-shuffled if shuffle=true is specified.\n\n\n\n\n\n","category":"type"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"MLJBase.CV","category":"page"},{"location":"evaluating_model_performance/#MLJBase.CV","page":"Evaluating Model Performance","title":"MLJBase.CV","text":"cv = CV(; nfolds=6, shuffle=nothing, rng=nothing)\n\nCross-validation resampling strategy, for use in evaluate!, evaluate and tuning.\n\ntrain_test_pairs(cv, rows)\n\nReturns an nfolds-length iterator of (train, test) pairs of vectors (row indices), where each train and test is a sub-vector of rows. The test vectors are mutually exclusive and exhaust rows. 
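For example, assuming no pre-shuffling, train_test_pairs(CV(nfolds=3), 1:5) should yield pairs equivalent to ([3, 4, 5], [1, 2]), ([1, 2, 5], [3, 4]) and ([1, 2, 3, 4], [5]) - an illustrative sketch that follows from the ordering and length rules described next. 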
Each train vector is the complement of the corresponding test vector. With no row pre-shuffling, the order of rows is preserved, in the sense that rows coincides precisely with the concatenation of the test vectors, in the order they are generated. The first r test vectors have length n + 1, where n, r = divrem(length(rows), nfolds), and the remaining test vectors have length n.\n\nPre-shuffling of rows is controlled by rng and shuffle. If rng is an integer, then the CV keyword constructor resets it to MersenneTwister(rng). Otherwise some AbstractRNG object is expected.\n\nIf rng is left unspecified, rng is reset to Random.GLOBAL_RNG, in which case rows are only pre-shuffled if shuffle=true is explicitly specified.\n\n\n\n\n\n","category":"type"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"MLJBase.StratifiedCV","category":"page"},{"location":"evaluating_model_performance/#MLJBase.StratifiedCV","page":"Evaluating Model Performance","title":"MLJBase.StratifiedCV","text":"stratified_cv = StratifiedCV(; nfolds=6,\n shuffle=false,\n rng=Random.GLOBAL_RNG)\n\nStratified cross-validation resampling strategy, for use in evaluate!, evaluate and in tuning. Applies only to classification problems (OrderedFactor or Multiclass targets).\n\ntrain_test_pairs(stratified_cv, rows, y)\n\nReturns an nfolds-length iterator of (train, test) pairs of vectors (row indices) where each train and test is a sub-vector of rows. The test vectors are mutually exclusive and exhaust rows. Each train vector is the complement of the corresponding test vector.\n\nUnlike regular cross-validation, the distribution of the levels of the target y corresponding to each train and test is constrained, as far as possible, to replicate that of y[rows] as a whole.\n\nThe stratified train_test_pairs algorithm is invariant to label renaming. For example, if you run replace!(y, 'a' => 'b', 'b' => 'a') and then re-run train_test_pairs, the returned (train, test) pairs will be the same.\n\nPre-shuffling of rows is controlled by rng and shuffle. If rng is an integer, then the StratifiedCV keyword constructor resets it to MersenneTwister(rng). Otherwise some AbstractRNG object is expected.\n\nIf rng is left unspecified, rng is reset to Random.GLOBAL_RNG, in which case rows are only pre-shuffled if shuffle=true is explicitly specified.\n\n\n\n\n\n","category":"type"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"MLJBase.TimeSeriesCV","category":"page"},{"location":"evaluating_model_performance/#MLJBase.TimeSeriesCV","page":"Evaluating Model Performance","title":"MLJBase.TimeSeriesCV","text":"tscv = TimeSeriesCV(; nfolds=4)\n\nCross-validation resampling strategy, for use in evaluate!, evaluate and tuning, when observations are chronological and not expected to be independent.\n\ntrain_test_pairs(tscv, rows)\n\nReturns an nfolds-length iterator of (train, test) pairs of vectors (row indices), where each train and test is a sub-vector of rows. The rows are partitioned sequentially into nfolds + 1 approximately equal length partitions, where the first partition is the first train set, and the second partition is the first test set. 
The second train set consists of the first two partitions, and the second test set consists of the third partition, and so on for each fold.\n\nThe first partition (which is the first train set) has length n + r, where n, r = divrem(length(rows), nfolds + 1), and the remaining partitions (all of the test folds) have length n.\n\nExamples\n\njulia> MLJBase.train_test_pairs(TimeSeriesCV(nfolds=3), 1:10)\n3-element Vector{Tuple{UnitRange{Int64}, UnitRange{Int64}}}:\n (1:4, 5:6)\n (1:6, 7:8)\n (1:8, 9:10)\n\njulia> model = (@load RidgeRegressor pkg=MultivariateStats verbosity=0)();\n\njulia> data = @load_sunspots;\n\njulia> X = (lag1 = data.sunspot_number[2:end-1],\n lag2 = data.sunspot_number[1:end-2]);\n\njulia> y = data.sunspot_number[3:end];\n\njulia> tscv = TimeSeriesCV(nfolds=3);\n\njulia> evaluate(model, X, y, resampling=tscv, measure=rmse, verbosity=0)\n┌───────────────────────────┬───────────────┬────────────────────┐\n│ _.measure │ _.measurement │ _.per_fold │\n├───────────────────────────┼───────────────┼────────────────────┤\n│ RootMeanSquaredError @753 │ 21.7 │ [25.4, 16.3, 22.4] │\n└───────────────────────────┴───────────────┴────────────────────┘\n_.per_observation = [missing]\n_.fitted_params_per_fold = [ … ]\n_.report_per_fold = [ … ]\n_.train_test_rows = [ … ]\n\n\n\n\n\n","category":"type"},{"location":"evaluating_model_performance/#Custom-resampling-strategies","page":"Evaluating Model Performance","title":"Custom resampling strategies","text":"","category":"section"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"To define a new resampling strategy, make relevant parameters of your strategy the fields of a new type MyResamplingStrategy <: MLJ.ResamplingStrategy, and implement one of the following methods:","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"MLJ.train_test_pairs(my_strategy::MyResamplingStrategy, rows)\nMLJ.train_test_pairs(my_strategy::MyResamplingStrategy, rows, y)\nMLJ.train_test_pairs(my_strategy::MyResamplingStrategy, rows, X, y)","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"Each method takes a vector of indices rows and returns a vector [(t1, e1), (t2, e2), ... (tk, ek)] of train/test pairs of row indices selected from rows. 
Here X, y are the input and target data (ignored in simple strategies, such as Holdout and CV).","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"Here is the code for the Holdout strategy as an example:","category":"page"},{"location":"evaluating_model_performance/","page":"Evaluating Model Performance","title":"Evaluating Model Performance","text":"struct Holdout <: ResamplingStrategy\n fraction_train::Float64\n shuffle::Bool\n rng::Union{Int,AbstractRNG}\n\n function Holdout(fraction_train, shuffle, rng)\n 0 < fraction_train < 1 ||\n error(\"`fraction_train` must be between 0 and 1.\")\n return new(fraction_train, shuffle, rng)\n end\nend\n\n# Keyword Constructor\nfunction Holdout(; fraction_train::Float64=0.7, shuffle=nothing, rng=nothing)\n if rng isa Integer\n rng = MersenneTwister(rng)\n end\n if shuffle === nothing\n shuffle = ifelse(rng===nothing, false, true)\n end\n if rng === nothing\n rng = Random.GLOBAL_RNG\n end\n return Holdout(fraction_train, shuffle, rng)\nend\n\nfunction train_test_pairs(holdout::Holdout, rows)\n train, test = partition(rows, holdout.fraction_train,\n shuffle=holdout.shuffle, rng=holdout.rng)\n return [(train, test),]\nend","category":"page"},{"location":"models/IteratedModel_MLJIteration/#IteratedModel_MLJIteration","page":"IteratedModel","title":"IteratedModel","text":"","category":"section"},{"location":"models/IteratedModel_MLJIteration/","page":"IteratedModel","title":"IteratedModel","text":"IteratedModel(model;\n controls=MLJIteration.DEFAULT_CONTROLS,\n resampling=Holdout(),\n measure=nothing,\n retrain=false,\n advanced_options...,\n)","category":"page"},{"location":"models/IteratedModel_MLJIteration/","page":"IteratedModel","title":"IteratedModel","text":"Wrap the specified supervised model in the specified iteration controls. Here model should support iteration, which is true if (iteration_parameter(model) is different from nothing.","category":"page"},{"location":"models/IteratedModel_MLJIteration/","page":"IteratedModel","title":"IteratedModel","text":"Available controls: Step(), Info(), Warn(), Error(), Callback(), WithLossDo(), WithTrainingLossesDo(), WithNumberDo(), Data(), Disjunction(), GL(), InvalidValue(), Never(), NotANumber(), NumberLimit(), NumberSinceBest(), PQ(), Patience(), Threshold(), TimeLimit(), Warmup(), WithIterationsDo(), WithEvaluationDo(), WithFittedParamsDo(), WithReportDo(), WithMachineDo(), WithModelDo(), CycleLearningRate() and Save().","category":"page"},{"location":"models/IteratedModel_MLJIteration/","page":"IteratedModel","title":"IteratedModel","text":"important: Important\nTo make out-of-sample losses available to the controls, the wrapped model is only trained on part of the data, as iteration proceeds. The user may want to force retraining on all data after controlled iteration has finished by specifying retrain=true. 
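For example (an illustrative sketch only), IteratedModel(model, resampling=Holdout(), retrain=true) trains on a 70% holdout set while the controls are applied and then, once a stop is triggered, retrains a clone of model on all supplied data. 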
See also \"Training\", and the retrain option, under \"Extended help\" below.","category":"page"},{"location":"models/IteratedModel_MLJIteration/#Extended-help","page":"IteratedModel","title":"Extended help","text":"","category":"section"},{"location":"models/IteratedModel_MLJIteration/#Options","page":"IteratedModel","title":"Options","text":"","category":"section"},{"location":"models/IteratedModel_MLJIteration/","page":"IteratedModel","title":"IteratedModel","text":"controls=Any[IterationControl.Step(1), EarlyStopping.Patience(5), EarlyStopping.GL(2.0), EarlyStopping.TimeLimit(Dates.Millisecond(108000)), EarlyStopping.InvalidValue()]: Controls are summarized at https://JuliaAI.github.io/MLJ.jl/dev/getting_started/ but query individual doc-strings for details and advanced options. For creating your own controls, refer to the documentation just cited.\nresampling=Holdout(fraction_train=0.7): The default resampling holds back 30% of data for computing an out-of-sample estimate of performance (the \"loss\") for loss-based controls such as WithLossDo. Specify resampling=nothing if all data is to be used for controlled iteration, with each out-of-sample loss replaced by the most recent training loss, assuming this is made available by the model (supports_training_losses(model) == true). If the model does not report a training loss, you can use resampling=InSample() instead. Otherwise, resampling must have type Holdout or be a vector with one element of the form (train_indices, test_indices).\nmeasure=nothing: StatisticalMeasures.jl compatible measure for estimating model performance (the \"loss\", but the orientation is immaterial - i.e., this could be a score). Inferred by default. Ignored if resampling=nothing.\nretrain=false: If retrain=true or resampling=nothing, iterated_model behaves exactly like the original model but with the iteration parameter automatically selected (\"learned\"). That is, the model is retrained on all available data, using the same number of iterations, once controlled iteration has stopped. This is typically desired if wrapping the iterated model further, or when inserting in a pipeline or other composite model. If retrain=false (default) and resampling isa Holdout, then iterated_model behaves like the original model trained on a subset of the provided data.\nweights=nothing: per-observation weights to be passed to measure where supported; if unspecified, these are understood to be uniform.\nclass_weights=nothing: class-weights to be passed to measure where supported; if unspecified, these are understood to be uniform.\noperation=nothing: Operation, such as predict or predict_mode, for computing target values, or proxy target values, for consumption by measure; automatically inferred by default.\ncheck_measure=true: Specify false to override checks on measure for compatibility with the training data.\niteration_parameter=nothing: A symbol, such as :epochs, naming the iteration parameter of model; inferred by default. 
Note that the actual value of the iteration parameter in the supplied model is ignored; only the value of an internal clone is mutated when training the wrapped model.\ncache=true: Whether or not model-specific representations of data are cached in between iteration parameter increments; specify cache=false to prioritize memory over speed.","category":"page"},{"location":"models/IteratedModel_MLJIteration/#Training","page":"IteratedModel","title":"Training","text":"","category":"section"},{"location":"models/IteratedModel_MLJIteration/","page":"IteratedModel","title":"IteratedModel","text":"Training an instance iterated_model of IteratedModel on some data (by binding to a machine and calling fit!, for example) performs the following actions:","category":"page"},{"location":"models/IteratedModel_MLJIteration/","page":"IteratedModel","title":"IteratedModel","text":"Assuming resampling !== nothing, the data is split into train and test sets, according to the specified resampling strategy.\nA clone of the wrapped model, model, is bound to the train data in an internal machine, train_mach. If resampling === nothing, all data is used instead. This machine is the object to which controls are applied. For example, Callback(fitted_params |> print) will print the value of fitted_params(train_mach).\nThe iteration parameter of the clone is set to 0.\nThe specified controls are repeatedly applied to train_mach in sequence, until one of the controls triggers a stop. Loss-based controls (eg, Patience(), GL(), Threshold(0.001)) use an out-of-sample loss, obtained by applying measure to predictions and the test target values. (Specifically, these predictions are those returned by operation(train_mach).) If resampling === nothing then the most recent training loss is used instead. Some controls require both out-of-sample and training losses (eg, PQ()).\nOnce a stop has been triggered, a clone of model is bound to all data in a machine called mach_production below, unless retrain == false (which holds by default) or resampling === nothing, in which case mach_production coincides with train_mach.","category":"page"},{"location":"models/IteratedModel_MLJIteration/#Prediction","page":"IteratedModel","title":"Prediction","text":"","category":"section"},{"location":"models/IteratedModel_MLJIteration/","page":"IteratedModel","title":"IteratedModel","text":"Calling predict(mach, Xnew) in the example above returns predict(mach_production, Xnew). Similar statements hold for predict_mean, predict_mode, predict_median.","category":"page"},{"location":"models/IteratedModel_MLJIteration/#Controls-that-mutate-parameters","page":"IteratedModel","title":"Controls that mutate parameters","text":"","category":"section"},{"location":"models/IteratedModel_MLJIteration/","page":"IteratedModel","title":"IteratedModel","text":"A control is permitted to mutate the fields (hyper-parameters) of train_mach.model (the clone of model). 
For example, to mutate a learning rate one might use the control","category":"page"},{"location":"models/IteratedModel_MLJIteration/","page":"IteratedModel","title":"IteratedModel","text":"Callback(mach -> mach.model.eta = 1.05*mach.model.eta)","category":"page"},{"location":"models/IteratedModel_MLJIteration/","page":"IteratedModel","title":"IteratedModel","text":"However, unless model supports warm restarts with respect to changes in that parameter, this will trigger retraining of train_mach from scratch, with a different training outcome, which is not recommended.","category":"page"},{"location":"models/IteratedModel_MLJIteration/#Warm-restarts","page":"IteratedModel","title":"Warm restarts","text":"","category":"section"},{"location":"models/IteratedModel_MLJIteration/","page":"IteratedModel","title":"IteratedModel","text":"In the following example, the second fit! call will not restart training of the internal train_mach, assuming model supports warm restarts:","category":"page"},{"location":"models/IteratedModel_MLJIteration/","page":"IteratedModel","title":"IteratedModel","text":"iterated_model = IteratedModel(\n model,\n controls = [Step(1), NumberLimit(100)],\n)\nmach = machine(iterated_model, X, y)\nfit!(mach) ## train for 100 iterations\niterated_model.controls = [Step(1), NumberLimit(50)]\nfit!(mach) ## train for an *extra* 50 iterations","category":"page"},{"location":"models/IteratedModel_MLJIteration/","page":"IteratedModel","title":"IteratedModel","text":"More generally, if iterated_model is mutated and fit!(mach) is called again, then a warm restart is attempted if the only parameters to change are model or controls or both.","category":"page"},{"location":"models/IteratedModel_MLJIteration/","page":"IteratedModel","title":"IteratedModel","text":"Specifically, train_mach.model is mutated to match the current value of iterated_model.model and the iteration parameter of the latter is updated to the last value used in the preceding fit!(mach) call. Then repeated application of the (updated) controls begins anew.","category":"page"},{"location":"common_mlj_workflows/#Common-MLJ-Workflows","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"","category":"section"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"This demo assumes you have certain packages in your active package environment. 
To activate a new environment, \"MyNewEnv\", with just these packages, do this in a new REPL session:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"using Pkg\nPkg.activate(\"MyNewEnv\")\nPkg.add([\"MLJ\", \"RDatasets\", \"DataFrames\", \"MLJDecisionTreeInterface\",\n \"MLJMultivariateStatsInterface\", \"NearestNeighborModels\", \"MLJGLMInterface\",\n \"Plots\"])","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"The following starts MLJ and shows the current version of MLJ (you can also use Pkg.status()):","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"using MLJ\nMLJ_VERSION","category":"page"},{"location":"common_mlj_workflows/#Data-ingestion","page":"Common MLJ Workflows","title":"Data ingestion","text":"","category":"section"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"# to avoid RDatasets as a doc dependency, generate synthetic data with\n# similar parameters, with the first four rows mimicking the original dataset\n# for display purposes\ncolor_off()\nimport DataFrames\nchanning = (Sex = [repeat([\"Male\"], 4)..., rand([\"Male\",\"Female\"], 458)...],\n Entry = Int32[782, 1020, 856, 915, rand(733:1140, 458)...],\n Exit = Int32[909, 1128, 969, 957, rand(777:1207, 458)...],\n Time = Int32[127, 108, 113, 42, rand(0:137, 458)...],\n Cens = Int32[1, 1, 1, 1, rand(0:1, 458)...]) |> DataFrames.DataFrame\ncoerce!(channing, :Sex => Multiclass)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"import RDatasets\nchanning = RDatasets.dataset(\"boot\", \"channing\")","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"first(channing, 4) |> pretty","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Inspecting metadata, including column scientific types:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"schema(channing)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Horizontally splitting data and shuffling rows.","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Here y is the :Exit column and X a table with everything else:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"y, X = unpack(channing, ==(:Exit), rng=123)\nnothing # hide","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Here y is the :Exit column and X everything else except :Time:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"y, X = unpack(channing,\n ==(:Exit),\n !=(:Time);\n rng=123);\nscitype(y)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"schema(X)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Fixing wrong scientific types in 
X:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"X = coerce(X, :Exit=>Continuous, :Entry=>Continuous, :Cens=>Multiclass);\nschema(X)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Loading a built-in supervised dataset:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"table = load_iris();\nschema(table)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Loading a built-in data set already split into X and y:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"X, y = @load_iris;\nselectrows(X, 1:4) # selectrows works whenever `Tables.istable(X)==true`.","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"y[1:4]","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Splitting data vertically after row shuffling:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"channing_train, channing_test = partition(channing, 0.6, rng=123);\nnothing # hide","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Or, if already horizontally split:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"(Xtrain, Xtest), (ytrain, ytest) = partition((X, y), 0.6, multi=true, rng=123)","category":"page"},{"location":"common_mlj_workflows/#Model-Search","page":"Common MLJ Workflows","title":"Model Search","text":"","category":"section"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Reference: Model Search","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Searching for a supervised model:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"X, y = @load_boston\nms = models(matching(X, y))","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"ms[6]","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"models(\"Tree\")","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"A more refined search:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"models() do model\n matching(model, X, y) &&\n model.prediction_type == :deterministic &&\n model.is_pure_julia\nend;\nnothing # hide","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Searching for an unsupervised model:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"models(matching(X))","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Getting the 
metadata entry for a given model type:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"info(\"PCA\")\ninfo(\"RidgeRegressor\", pkg=\"MultivariateStats\") # a model type in multiple packages","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Extracting the model document string (output omitted):","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"doc(\"DecisionTreeClassifier\", pkg=\"DecisionTree\")\nnothing # hide","category":"page"},{"location":"common_mlj_workflows/#Instantiating-a-model","page":"Common MLJ Workflows","title":"Instantiating a model","text":"","category":"section"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Reference: Getting Started, Loading Model Code","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Assumes MLJDecisionTreeInterface.jl is in your environment. Otherwise, try interactive loading with @iload:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Tree = @load DecisionTreeClassifier pkg=DecisionTree\ntree = Tree(min_samples_split=5, max_depth=4)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"or","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"tree = (@load DecisionTreeClassifier)()\ntree.min_samples_split = 5\ntree.max_depth = 4","category":"page"},{"location":"common_mlj_workflows/#Evaluating-a-model","page":"Common MLJ Workflows","title":"Evaluating a model","text":"","category":"section"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Reference: Evaluating Model Performance","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"X, y = @load_boston # a table and a vector\nKNN = @load KNNRegressor\nknn = KNN()\nevaluate(knn, X, y,\n resampling=CV(nfolds=5),\n measure=[RootMeanSquaredError(), LPLoss(1)])","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Note RootMeanSquaredError() has alias rms and LPLoss(1) has aliases l1, mae.","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Do measures() to list all losses and scores and their aliases, or refer to the StatisticalMeasures.jl docs.","category":"page"},{"location":"common_mlj_workflows/#Basic-fit/evaluate/predict-by-hand","page":"Common MLJ Workflows","title":"Basic fit/evaluate/predict by hand","text":"","category":"section"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Reference: Getting Started, Machines, Evaluating Model Performance, Performance Measures","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"crabs = load_crabs() |> DataFrames.DataFrame\nschema(crabs)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"y, X = unpack(crabs, ==(:sp), 
!in([:index, :sex]); rng=123)\n\nTree = @load DecisionTreeClassifier pkg=DecisionTree\ntree = Tree(max_depth=2) # hide","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Bind the model and data together in a machine, which will additionally, store the learned parameters (fitresults) when fit:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"mach = machine(tree, X, y)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Split row indices into training and evaluation rows:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"train, test = partition(eachindex(y), 0.7); # 70:30 split","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Fit on the train data set and evaluate on the test data set:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"fit!(mach, rows=train)\nyhat = predict(mach, X[test,:])\nLogLoss(tol=1e-4)(yhat, y[test])","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Note LogLoss() has aliases log_loss and cross_entropy.","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Predict on the new data set:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Xnew = (FL = rand(3), RW = rand(3), CL = rand(3), CW = rand(3), BD = rand(3))\npredict(mach, Xnew) # a vector of distributions","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"predict_mode(mach, Xnew) # a vector of point-predictions","category":"page"},{"location":"common_mlj_workflows/#More-performance-evaluation-examples","page":"Common MLJ Workflows","title":"More performance evaluation examples","text":"","category":"section"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Evaluating model + data directly:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"evaluate(tree, X, y,\n resampling=Holdout(fraction_train=0.7, shuffle=true, rng=1234),\n measure=[LogLoss(), Accuracy()])","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"If a machine is already defined, as above:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"evaluate!(mach,\n resampling=Holdout(fraction_train=0.7, shuffle=true, rng=1234),\n measure=[LogLoss(), Accuracy()])","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Using cross-validation:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"evaluate!(mach, resampling=CV(nfolds=5, shuffle=true, rng=1234),\n measure=[LogLoss(), Accuracy()])","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"With 
user-specified train/test pairs of row indices:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"f1, f2, f3 = 1:13, 14:26, 27:36\npairs = [(f1, vcat(f2, f3)), (f2, vcat(f3, f1)), (f3, vcat(f1, f2))];\nevaluate!(mach,\n resampling=pairs,\n measure=[LogLoss(), Accuracy()])","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Changing a hyperparameter and re-evaluating:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"tree.max_depth = 3\nevaluate!(mach,\n resampling=CV(nfolds=5, shuffle=true, rng=1234),\n measure=[LogLoss(), Accuracy()])","category":"page"},{"location":"common_mlj_workflows/#Inspecting-training-results","page":"Common MLJ Workflows","title":"Inspecting training results","text":"","category":"section"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Fit an ordinary least square model to some synthetic data:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"x1 = rand(100)\nx2 = rand(100)\n\nX = (x1=x1, x2=x2)\ny = x1 - 2x2 + 0.1*rand(100);\n\nOLS = @load LinearRegressor pkg=GLM\nols = OLS()\nmach = machine(ols, X, y) |> fit!","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Get a named tuple representing the learned parameters, human-readable if appropriate:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"fitted_params(mach)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Get other training-related information:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"report(mach)","category":"page"},{"location":"common_mlj_workflows/#Basic-fit/transform-for-unsupervised-models","page":"Common MLJ Workflows","title":"Basic fit/transform for unsupervised models","text":"","category":"section"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Load data:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"X, y = @load_iris # a table and a vector\ntrain, test = partition(eachindex(y), 0.97, shuffle=true, rng=123)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Instantiate and fit the model/machine:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"PCA = @load PCA\npca = PCA(maxoutdim=2)\nmach = machine(pca, X)\nfit!(mach, rows=train)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Transform selected data bound to the machine:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"transform(mach, rows=test);","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Transform new data:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ 
Workflows","title":"Common MLJ Workflows","text":"Xnew = (sepal_length=rand(3), sepal_width=rand(3),\n petal_length=rand(3), petal_width=rand(3));\ntransform(mach, Xnew)","category":"page"},{"location":"common_mlj_workflows/#Inverting-learned-transformations","page":"Common MLJ Workflows","title":"Inverting learned transformations","text":"","category":"section"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"y = rand(100);\nstand = Standardizer()\nmach = machine(stand, y)\nfit!(mach)\nz = transform(mach, y);\n@assert inverse_transform(mach, z) ≈ y # true","category":"page"},{"location":"common_mlj_workflows/#Nested-hyperparameter-tuning","page":"Common MLJ Workflows","title":"Nested hyperparameter tuning","text":"","category":"section"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Reference: Tuning Models","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"X, y = @load_iris","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Define a model with nested hyperparameters:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Tree = @load DecisionTreeClassifier pkg=DecisionTree\ntree = Tree()\nforest = EnsembleModel(model=tree, n=300)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Define ranges for hyperparameters to be tuned:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"r1 = range(forest, :bagging_fraction, lower=0.5, upper=1.0, scale=:log10)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"r2 = range(forest, :(model.n_subfeatures), lower=1, upper=4) # nested","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Wrap the model in a tuning strategy:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"tuned_forest = TunedModel(model=forest,\n tuning=Grid(resolution=12),\n resampling=CV(nfolds=6),\n ranges=[r1, r2],\n measure=BrierLoss())","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Bound the wrapped model to data:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"mach = machine(tuned_forest, X, y)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Fitting the resultant machine optimizes the hyperparameters specified in range, using the specified tuning and resampling strategies and performance measure (possibly a vector of measures), and retrains on all data bound to the machine:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"fit!(mach)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Inspecting the optimal model:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ 
Workflows","text":"F = fitted_params(mach)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"F.best_model","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Inspecting details of tuning procedure:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"r = report(mach);\nkeys(r)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"r.history[[1,end]]","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Visualizing these results:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"using Plots\nplot(mach)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"(Image: )","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Predicting on new data using the optimized model trained on all data:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"predict(mach, Xnew)","category":"page"},{"location":"common_mlj_workflows/#Constructing-linear-pipelines","page":"Common MLJ Workflows","title":"Constructing linear pipelines","text":"","category":"section"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Reference: Linear Pipelines","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Constructing a linear (unbranching) pipeline with a learned target transformation/inverse transformation:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"X, y = @load_reduced_ames\nKNN = @load KNNRegressor\nknn_with_target = TransformedTargetModel(model=KNN(K=3), transformer=Standardizer())","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"pipe = (X -> coerce(X, :age=>Continuous)) |> OneHotEncoder() |> knn_with_target","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Evaluating the pipeline (just as you would any other model):","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"pipe.one_hot_encoder.drop_last = true # mutate a nested hyper-parameter\nevaluate(pipe, X, y, resampling=Holdout(), measure=RootMeanSquaredError(), verbosity=2)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Inspecting the learned parameters in a pipeline:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"mach = machine(pipe, X, y) |> fit!\nF = fitted_params(mach)\nF.transformed_target_model_deterministic.model","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Constructing a linear (unbranching) pipeline with a static (unlearned) target transformation/inverse 
transformation:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Tree = @load DecisionTreeRegressor pkg=DecisionTree verbosity=0\ntree_with_target = TransformedTargetModel(model=Tree(),\n transformer=y -> log.(y),\n inverse = z -> exp.(z))\npipe2 = (X -> coerce(X, :age=>Continuous)) |> OneHotEncoder() |> tree_with_target\nnothing # hide","category":"page"},{"location":"common_mlj_workflows/#Creating-a-homogeneous-ensemble-of-models","page":"Common MLJ Workflows","title":"Creating a homogeneous ensemble of models","text":"","category":"section"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Reference: Homogeneous Ensembles","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"X, y = @load_iris\nTree = @load DecisionTreeClassifier pkg=DecisionTree\ntree = Tree()\nforest = EnsembleModel(model=tree, bagging_fraction=0.8, n=300)\nmach = machine(forest, X, y)\nevaluate!(mach, measure=LogLoss())","category":"page"},{"location":"common_mlj_workflows/#Performance-curves","page":"Common MLJ Workflows","title":"Performance curves","text":"","category":"section"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Generate a plot of performance, as a function of some hyperparameter (building on the preceding example)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Single performance curve:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"r = range(forest, :n, lower=1, upper=1000, scale=:log10)\ncurve = learning_curve(mach,\n range=r,\n resampling=Holdout(),\n resolution=50,\n measure=LogLoss(),\n verbosity=0)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"using Plots\nplot(curve.parameter_values, curve.measurements,\n xlab=curve.parameter_name, xscale=curve.parameter_scale)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"(Image: )","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"Multiple curves:","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"curve = learning_curve(mach,\n range=r,\n resampling=Holdout(),\n measure=LogLoss(),\n resolution=50,\n rng_name=:rng,\n rngs=4,\n verbosity=0)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"plot(curve.parameter_values, curve.measurements,\n xlab=curve.parameter_name, xscale=curve.parameter_scale)","category":"page"},{"location":"common_mlj_workflows/","page":"Common MLJ Workflows","title":"Common MLJ Workflows","text":"(Image: )","category":"page"},{"location":"models/LinearRegressor_GLM/#LinearRegressor_GLM","page":"LinearRegressor","title":"LinearRegressor","text":"","category":"section"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"LinearRegressor","category":"page"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"A model type for constructing a linear regressor, based 
on GLM.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"LinearRegressor = @load LinearRegressor pkg=GLM","category":"page"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"Do model = LinearRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearRegressor(fit_intercept=...).","category":"page"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"LinearRegressor assumes the target is a continuous variable whose conditional distribution is normal with constant variance, and whose expected value is a linear combination of the features (identity link function). Options exist to specify an intercept or offset feature.","category":"page"},{"location":"models/LinearRegressor_GLM/#Training-data","page":"LinearRegressor","title":"Training data","text":"","category":"section"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"In MLJ or MLJBase, bind an instance model to data with one of:","category":"page"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"mach = machine(model, X, y)\nmach = machine(model, X, y, w)","category":"page"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"Here","category":"page"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"X: is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check the scitype with schema(X)\ny: is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)\nw: is a vector of Real per-observation weights","category":"page"},{"location":"models/LinearRegressor_GLM/#Hyper-parameters","page":"LinearRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"fit_intercept=true: Whether to calculate the intercept for this model. If set to false, no intercept will be calculated (e.g. the data is expected to be centered)\ndropcollinear=false: Whether to drop features in the training data to ensure linear independence. If true , only the first of each set of linearly-dependent features is used. The coefficient for redundant linearly dependent features is 0.0 and all associated statistics are set to NaN.\noffsetcol=nothing: Name of the column to be used as an offset, if any. An offset is a variable which is known to have a coefficient of 1.\nreport_keys: Vector of keys for the report. Possible keys are: :deviance, :dof_residual, :stderror, :vcov, :coef_table and :glm_model. 
By default only :glm_model is excluded.","category":"page"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/LinearRegressor_GLM/#Operations","page":"LinearRegressor","title":"Operations","text":"","category":"section"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"predict(mach, Xnew): return predictions of the target given new features Xnew having the same Scitype as X above. Predictions are probabilistic.\npredict_mean(mach, Xnew): instead return the mean of each prediction above\npredict_median(mach, Xnew): instead return the median of each prediction above.","category":"page"},{"location":"models/LinearRegressor_GLM/#Fitted-parameters","page":"LinearRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"features: The names of the features encountered during model fitting.\ncoef: The linear coefficients determined by the model.\nintercept: The intercept determined by the model.","category":"page"},{"location":"models/LinearRegressor_GLM/#Report","page":"LinearRegressor","title":"Report","text":"","category":"section"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"When all keys are enabled in report_keys, the following fields are available in report(mach):","category":"page"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"deviance: Measure of deviance of fitted model with respect to a perfectly fitted model. For a linear model, this is the weighted residual sum of squares\ndof_residual: The degrees of freedom for residuals, when meaningful.\nstderror: The standard errors of the coefficients.\nvcov: The estimated variance-covariance matrix of the coefficient estimates.\ncoef_table: Table which displays coefficients and summarizes their significance and confidence intervals.\nglm_model: The raw fitted model returned by GLM.lm. Note this points to training data. 
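For example, assuming the model is constructed with report_keys set to include :glm_model (say, LinearRegressor(report_keys=[:glm_model])), the raw object should be retrievable as report(mach).glm_model. 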
Refer to the GLM.jl documentation for usage.","category":"page"},{"location":"models/LinearRegressor_GLM/#Examples","page":"LinearRegressor","title":"Examples","text":"","category":"section"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"using MLJ\nLinearRegressor = @load LinearRegressor pkg=GLM\nglm = LinearRegressor()\n\nX, y = make_regression(100, 2) ## synthetic data\nmach = machine(glm, X, y) |> fit!\n\nXnew, _ = make_regression(3, 2)\nyhat = predict(mach, Xnew) ## new predictions\nyhat_point = predict_mean(mach, Xnew) ## new predictions\n\nfitted_params(mach).features\nfitted_params(mach).coef ## x1, x2, intercept\nfitted_params(mach).intercept\n\nreport(mach)","category":"page"},{"location":"models/LinearRegressor_GLM/","page":"LinearRegressor","title":"LinearRegressor","text":"See also LinearCountRegressor, LinearBinaryClassifier","category":"page"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/#SelfOrganizingMap_SelfOrganizingMaps","page":"SelfOrganizingMap","title":"SelfOrganizingMap","text":"","category":"section"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/","page":"SelfOrganizingMap","title":"SelfOrganizingMap","text":"SelfOrganizingMap","category":"page"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/","page":"SelfOrganizingMap","title":"SelfOrganizingMap","text":"A model type for constructing a self organizing map, based on SelfOrganizingMaps.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/","page":"SelfOrganizingMap","title":"SelfOrganizingMap","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/","page":"SelfOrganizingMap","title":"SelfOrganizingMap","text":"SelfOrganizingMap = @load SelfOrganizingMap pkg=SelfOrganizingMaps","category":"page"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/","page":"SelfOrganizingMap","title":"SelfOrganizingMap","text":"Do model = SelfOrganizingMap() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in SelfOrganizingMap(k=...).","category":"page"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/","page":"SelfOrganizingMap","title":"SelfOrganizingMap","text":"SelfOrganizingMaps implements Kohonen's Self Organizing Map; Kohonen, T. (1990): "The self-organizing map", Proceedings of the IEEE.","category":"page"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/#Training-data","page":"SelfOrganizingMap","title":"Training data","text":"","category":"section"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/","page":"SelfOrganizingMap","title":"SelfOrganizingMap","text":"In MLJ or MLJBase, bind an instance model to data with mach = machine(model, X) where","category":"page"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/","page":"SelfOrganizingMap","title":"SelfOrganizingMap","text":"X: an AbstractMatrix or Table of input features whose columns are of scitype Continuous.","category":"page"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/","page":"SelfOrganizingMap","title":"SelfOrganizingMap","text":"Train the machine with fit!(mach, rows=...).","category":"page"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/#Hyper-parameters","page":"SelfOrganizingMap","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/","page":"SelfOrganizingMap","title":"SelfOrganizingMap","text":"k=10: Number of nodes along one side of the SOM grid. There are k² total nodes.\nη=0.5: Learning rate. Scales the adjustment made to the winning node and its neighbors during each round of training.\nσ²=0.05: The (squared) neighbor radius. Used to determine the scale for neighbor node adjustments.\ngrid_type=:rectangular Node grid geometry. One of (:rectangular, :hexagonal, :spherical).\nη_decay=:exponential Learning rate schedule function. One of (:exponential, :asymptotic).\nσ_decay=:exponential Neighbor radius schedule function. One of (:exponential, :asymptotic, :none).\nneighbor_function=:gaussian Kernel function used to make adjustments to neighbor weights. Scale is set by σ². One of (:gaussian, :mexican_hat).\nmatching_distance=euclidean Distance function from Distances.jl used to determine the winning node.\nNepochs=1 Number of times to repeat training on the shuffled dataset.","category":"page"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/#Operations","page":"SelfOrganizingMap","title":"Operations","text":"","category":"section"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/","page":"SelfOrganizingMap","title":"SelfOrganizingMap","text":"transform(mach, Xnew): returns the coordinates of the winning SOM node for each instance of Xnew. For SOM of gridtype :rectangular and :hexagonal, these are cartesian coordinates. 
For gridtype :spherical, these are the latitude and longitude in radians.","category":"page"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/#Fitted-parameters","page":"SelfOrganizingMap","title":"Fitted parameters","text":"","category":"section"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/","page":"SelfOrganizingMap","title":"SelfOrganizingMap","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/","page":"SelfOrganizingMap","title":"SelfOrganizingMap","text":"coords: The coordinates of each of the SOM nodes (points in the domain of the map) with shape (k², 2)\nweights: Array of weight vectors for the SOM nodes (corresponding points in the map's range) of shape (k², input dimension)","category":"page"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/#Report","page":"SelfOrganizingMap","title":"Report","text":"","category":"section"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/","page":"SelfOrganizingMap","title":"SelfOrganizingMap","text":"The fields of report(mach) are:","category":"page"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/","page":"SelfOrganizingMap","title":"SelfOrganizingMap","text":"classes: the index of the winning node for each instance of the training data X interpreted as a class label","category":"page"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/#Examples","page":"SelfOrganizingMap","title":"Examples","text":"","category":"section"},{"location":"models/SelfOrganizingMap_SelfOrganizingMaps/","page":"SelfOrganizingMap","title":"SelfOrganizingMap","text":"using MLJ\nsom = @load SelfOrganizingMap pkg=SelfOrganizingMaps\nmodel = som()\nX, y = make_regression(50, 3) ## synthetic data\nmach = machine(model, X) |> fit!\nX̃ = transform(mach, X)\n\nrpt = report(mach)\nclasses = rpt.classes","category":"page"},{"location":"models/MultinomialClassifier_MLJLinearModels/#MultinomialClassifier_MLJLinearModels","page":"MultinomialClassifier","title":"MultinomialClassifier","text":"","category":"section"},{"location":"models/MultinomialClassifier_MLJLinearModels/","page":"MultinomialClassifier","title":"MultinomialClassifier","text":"MultinomialClassifier","category":"page"},{"location":"models/MultinomialClassifier_MLJLinearModels/","page":"MultinomialClassifier","title":"MultinomialClassifier","text":"A model type for constructing a multinomial classifier, based on MLJLinearModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/MultinomialClassifier_MLJLinearModels/","page":"MultinomialClassifier","title":"MultinomialClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/MultinomialClassifier_MLJLinearModels/","page":"MultinomialClassifier","title":"MultinomialClassifier","text":"MultinomialClassifier = @load MultinomialClassifier pkg=MLJLinearModels","category":"page"},{"location":"models/MultinomialClassifier_MLJLinearModels/","page":"MultinomialClassifier","title":"MultinomialClassifier","text":"Do model = MultinomialClassifier() to construct an instance with default hyper-parameters.","category":"page"},{"location":"models/MultinomialClassifier_MLJLinearModels/","page":"MultinomialClassifier","title":"MultinomialClassifier","text":"This model coincides with LogisticClassifier, except certain optimizations possible in the special binary case will not be applied. 
Its hyperparameters are identical.","category":"page"},{"location":"models/MultinomialClassifier_MLJLinearModels/#Training-data","page":"MultinomialClassifier","title":"Training data","text":"","category":"section"},{"location":"models/MultinomialClassifier_MLJLinearModels/","page":"MultinomialClassifier","title":"MultinomialClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/MultinomialClassifier_MLJLinearModels/","page":"MultinomialClassifier","title":"MultinomialClassifier","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/MultinomialClassifier_MLJLinearModels/","page":"MultinomialClassifier","title":"MultinomialClassifier","text":"where:","category":"page"},{"location":"models/MultinomialClassifier_MLJLinearModels/","page":"MultinomialClassifier","title":"MultinomialClassifier","text":"X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)\ny is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)","category":"page"},{"location":"models/MultinomialClassifier_MLJLinearModels/","page":"MultinomialClassifier","title":"MultinomialClassifier","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/MultinomialClassifier_MLJLinearModels/#Hyperparameters","page":"MultinomialClassifier","title":"Hyperparameters","text":"","category":"section"},{"location":"models/MultinomialClassifier_MLJLinearModels/","page":"MultinomialClassifier","title":"MultinomialClassifier","text":"lambda::Real: strength of the regularizer if penalty is :l2 or :l1. Strength of the L2 regularizer if penalty is :en. Default: eps()\ngamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0\npenalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2\nfit_intercept::Bool: whether to fit the intercept or not. Default: true\npenalize_intercept::Bool: whether to penalize the intercept. Default: false\nscale_penalty_with_samples::Bool: whether to scale the penalty with the number of samples. Default: true\nsolver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S where S is one of: LBFGS, NewtonCG, ProxGrad; but subject to the following restrictions:\nIf penalty = :l2, ProxGrad is disallowed. Otherwise, ProxGrad is the only option.\nUnless scitype(y) <: Finite{2} (binary target) Newton is disallowed.\nIf solver = nothing (default) then ProxGrad(accel=true) (FISTA) is used, unless gamma = 0, in which case LBFGS() is used.\nSolver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) 
Default: nothing","category":"page"},{"location":"models/MultinomialClassifier_MLJLinearModels/#Example","page":"MultinomialClassifier","title":"Example","text":"","category":"section"},{"location":"models/MultinomialClassifier_MLJLinearModels/","page":"MultinomialClassifier","title":"MultinomialClassifier","text":"using MLJ\nX, y = make_blobs(centers = 3)\nmach = fit!(machine(MultinomialClassifier(), X, y))\npredict(mach, X)\nfitted_params(mach)","category":"page"},{"location":"models/MultinomialClassifier_MLJLinearModels/","page":"MultinomialClassifier","title":"MultinomialClassifier","text":"See also LogisticClassifier.","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/#MultitargetSRRegressor_SymbolicRegression","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"","category":"section"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"MultitargetSRRegressor","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"A model type for constructing a Multi-Target Symbolic Regression via Evolutionary Search, based on SymbolicRegression.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"MultitargetSRRegressor = @load MultitargetSRRegressor pkg=SymbolicRegression","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"Do model = MultitargetSRRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetSRRegressor(binary_operators=...).","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"Multi-target Symbolic Regression regressor (MultitargetSRRegressor) conducts several searches for expressions that predict each target variable from a set of input variables. All data is assumed to be Continuous. The search is performed using an evolutionary algorithm. 
This algorithm is described in the paper https://arxiv.org/abs/2305.01582.","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/#Training-data","page":"MultitargetSRRegressor","title":"Training data","text":"","category":"section"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"OR","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"mach = machine(model, X, y, w)","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"Here:","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"Continuous; check column scitypes with schema(X). Variable names in discovered expressions will be taken from the column names of X, if available. Units in columns of X (use DynamicQuantities for units) will trigger dimensional analysis to be used.","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"y is the target, which can be any table of target variables whose element scitype is Continuous; check the scitype with schema(y). Units in columns of y (use DynamicQuantities for units) will trigger dimensional analysis to be used.\nw is the observation weights, which can either be nothing (default) or an AbstractVector whose element scitype is Count or Continuous. The same weights are used for all targets.","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"Train the machine using fit!(mach), inspect the discovered expressions with report(mach), and predict on new data with predict(mach, Xnew). Note that unlike other regressors, symbolic regression stores a list of lists of trained models. The model chosen from each of these lists is determined by the selection_method keyword argument, which by default balances accuracy and complexity. You can override this at prediction time by passing a named tuple with keys data and idx.","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/#Hyper-parameters","page":"MultitargetSRRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"binary_operators: Vector of binary operators (functions) to use. Each operator should be defined for two input scalars, and one output scalar. 
All operators need to be defined over the entire real line (excluding infinity - these are stopped before they are input), or return NaN where not defined. For speed, define it so it takes two reals of the same type as input, and outputs the same type. For the SymbolicUtils simplification backend, you will need to define a generic method of the operator so it takes arbitrary types.\nunary_operators: Same, but for unary operators (one input scalar, gives an output scalar).\nconstraints: Array of pairs specifying size constraints for each operator. The constraints for a binary operator should be a 2-tuple (e.g., (-1, -1)) and the constraints for a unary operator should be an Int. A size constraint is a limit to the size of the subtree in each argument of an operator. e.g., [(^)=>(-1, 3)] means that the ^ operator can have arbitrary size (-1) in its left argument, but a maximum size of 3 in its right argument. Default is no constraints.\nbatching: Whether to evolve based on small mini-batches of data, rather than the entire dataset.\nbatch_size: What batch size to use if using batching.\nelementwise_loss: What elementwise loss function to use. Can be one of the following losses, or any other loss of type SupervisedLoss. You can also pass a function that takes a scalar target (left argument), and scalar predicted (right argument), and returns a scalar. This will be averaged over the predicted data. If weights are supplied, your function should take a third argument for the weight scalar. Included losses: Regression: - LPDistLoss{P}(), - L1DistLoss(), - L2DistLoss() (mean square), - LogitDistLoss(), - HuberLoss(d), - L1EpsilonInsLoss(ϵ), - L2EpsilonInsLoss(ϵ), - PeriodicLoss(c), - QuantileLoss(τ), Classification: - ZeroOneLoss(), - PerceptronLoss(), - L1HingeLoss(), - SmoothedL1HingeLoss(γ), - ModifiedHuberLoss(), - L2MarginLoss(), - ExpLoss(), - SigmoidLoss(), - DWDMarginLoss(q).\nloss_function: Alternatively, you may redefine the loss used as any function of tree::Node{T}, dataset::Dataset{T}, and options::Options, so long as you output a non-negative scalar of type T. This is useful if you want to use a loss that takes into account derivatives, or correlations across the dataset. This also means you could use a custom evaluation for a particular expression. If you are using batching=true, then your function should accept a fourth argument idx, which is either nothing (indicating that the full dataset should be used), or a vector of indices to use for the batch. For example,\n function my_loss(tree, dataset::Dataset{T,L}, options)::L where {T,L}\n prediction, flag = eval_tree_array(tree, dataset.X, options)\n if !flag\n return L(Inf)\n end\n return sum((prediction .- dataset.y) .^ 2) / dataset.n\n end\npopulations: How many populations of equations to use.\npopulation_size: How many equations in each population.\nncycles_per_iteration: How many generations to consider per iteration.\ntournament_selection_n: Number of expressions considered in each tournament.\ntournament_selection_p: The fittest expression in a tournament is to be selected with probability p, the next fittest with probability p*(1-p), and so forth.\ntopn: Number of equations to return to the host process, and to consider for the hall of fame.\ncomplexity_of_operators: What complexity should be assigned to each operator, and the occurrence of a constant or variable. By default, this is 1 for all operators. Can be a real number as well, in which case the complexity of an expression will be rounded to the nearest integer. 
Input this in the form of, e.g., [(^) => 3, sin => 2].\ncomplexity_of_constants: What complexity should be assigned to use of a constant. By default, this is 1.\ncomplexity_of_variables: What complexity should be assigned to each variable. By default, this is 1.\nalpha: The probability of accepting an equation mutation during regularized evolution is given by exp(-delta_loss/(alpha * T)), where T goes from 1 to 0. Thus, alpha=infinite is the same as no annealing.\nmaxsize: Maximum size of equations during the search.\nmaxdepth: Maximum depth of equations during the search, by default this is set equal to the maxsize.\nparsimony: A multiplicative factor for how much complexity is punished.\ndimensional_constraint_penalty: An additive factor if the dimensional constraint is violated.\nuse_frequency: Whether to use a parsimony that adapts to the relative proportion of equations at each complexity; this will ensure that there are a balanced number of equations considered for every complexity.\nuse_frequency_in_tournament: Whether to use the adaptive parsimony described above inside the score, rather than just at the mutation accept/reject stage.\nadaptive_parsimony_scaling: How much to scale the adaptive parsimony term in the loss. Increase this if the search is spending too much time optimizing the most complex equations.\nturbo: Whether to use LoopVectorization.@turbo to evaluate expressions. This can be significantly faster, but is only compatible with certain operators. Experimental!\nmigration: Whether to migrate equations between processes.\nhof_migration: Whether to migrate equations from the hall of fame to processes.\nfraction_replaced: What fraction of each population to replace with migrated equations at the end of each cycle.\nfraction_replaced_hof: What fraction to replace with hall of fame equations at the end of each cycle.\nshould_simplify: Whether to simplify equations. If you pass a custom objective, this will be set to false.\nshould_optimize_constants: Whether to use an optimization algorithm to periodically optimize constants in equations.\noptimizer_nrestarts: How many different random starting positions to consider for optimization of constants.\noptimizer_algorithm: Select algorithm to use for optimizing constants. Default is \"BFGS\", but \"NelderMead\" is also supported.\noptimizer_options: General options for the constant optimization. For details we refer to the documentation on Optim.Options from the Optim.jl package. Options can be provided here as NamedTuple, e.g. (iterations=16,), as a Dict, e.g. Dict(:x_tol => 1.0e-32,), or as an Optim.Options instance.\noutput_file: What file to store equations to, as a backup.\nperturbation_factor: When mutating a constant, either multiply or divide by (1+perturbation_factor)^(rand()+1).\nprobability_negate_constant: Probability of negating a constant in the equation when mutating it.\nmutation_weights: Relative probabilities of the mutations. The struct MutationWeights should be passed to these options. See its documentation on MutationWeights for the different weights.\ncrossover_probability: Probability of performing crossover.\nannealing: Whether to use simulated annealing.\nwarmup_maxsize_by: Whether to slowly increase the max size from 5 up to maxsize. If nonzero, specifies the fraction through the search at which the maxsize should be reached.\nverbosity: Whether to print debugging statements or not.\nprint_precision: How many digits to print when printing equations. 
By default, this is 5.\nsave_to_file: Whether to save equations to a file during the search.\nbin_constraints: See constraints. This is the same, but specified for binary operators only (for example, if you have an operator that is both a binary and unary operator).\nuna_constraints: Likewise, for unary operators.\nseed: What random seed to use. nothing uses no seed.\nprogress: Whether to use a progress bar output (verbosity will have no effect).\nearly_stop_condition: Float - whether to stop early if the mean loss gets below this value. Function - a function taking (loss, complexity) as arguments and returning true or false.\ntimeout_in_seconds: Float64 - the time in seconds after which to exit (as an alternative to the number of iterations).\nmax_evals: Int (or Nothing) - the maximum number of evaluations of expressions to perform.\nskip_mutation_failures: Whether to simply skip over mutations that fail or are rejected, rather than to replace the mutated expression with the original expression and proceed normally.\nenable_autodiff: Whether to enable automatic differentiation functionality. This is turned off by default. If turned on, this will be turned off if one of the operators does not have well-defined gradients.\nnested_constraints: Specifies how many times a combination of operators can be nested. For example, [sin => [cos => 0], cos => [cos => 2]] specifies that cos may never appear within a sin, but sin can be nested with itself an unlimited number of times. The second term specifies that cos can be nested up to 2 times within a cos, so that cos(cos(cos(x))) is allowed (as well as any combination of + or - within it), but cos(cos(cos(cos(x)))) is not allowed. When an operator is not specified, it is assumed that it can be nested an unlimited number of times. This requires that there is no operator which is used both in the unary operators and the binary operators (e.g., - could be both subtract, and negation). For binary operators, both arguments are treated the same way, and the max of each argument is constrained.\ndeterministic: Use a global counter for the birth time, rather than calls to time(). This gives perfect resolution, and is therefore deterministic. However, it is not thread safe, and must be used in serial mode.\ndefine_helper_functions: Whether to define helper functions for constructing and evaluating trees.\nniterations::Int=10: The number of iterations to perform the search. More iterations will improve the results.\nparallelism=:multithreading: What parallelism mode to use. The options are :multithreading, :multiprocessing, and :serial. By default, multithreading will be used. Multithreading uses less memory, but multiprocessing can handle multi-node compute. If using :multithreading mode, the number of threads available to julia are used. If using :multiprocessing, numprocs processes will be created dynamically if procs is unset. If you have already allocated processes, pass them to the procs argument and they will be used. You may also pass a string instead of a symbol, like \"multithreading\".\nnumprocs::Union{Int, Nothing}=nothing: The number of processes to use, if you want equation_search to set this up automatically. 
By default this will be 4, but can be any number (you should pick a number <= the number of cores available).\nprocs::Union{Vector{Int}, Nothing}=nothing: If you have set up a distributed run manually with procs = addprocs() and @everywhere, pass the procs to this keyword argument.\naddprocs_function::Union{Function, Nothing}=nothing: If using multiprocessing (parallelism=:multiprocessing), and you are not passing procs manually, then they will be allocated dynamically using addprocs. However, you may also pass a custom function to use instead of addprocs. This function should take a single positional argument, which is the number of processes to use, as well as the lazy keyword argument. For example, if set up on a slurm cluster, you could pass addprocs_function = addprocs_slurm, which will set up slurm processes.\nheap_size_hint_in_bytes::Union{Int,Nothing}=nothing: On Julia 1.9+, you may set the --heap-size-hint flag on Julia processes, recommending garbage collection once a process is close to the recommended size. This is important for long-running distributed jobs where each process has an independent memory, and can help avoid out-of-memory errors. By default, this is set to Sys.free_memory() / numprocs.\nruntests::Bool=true: Whether to run (quick) tests before starting the search, to see if there will be any problems during the equation search related to the host environment.\nloss_type::Type=Nothing: If you would like to use a different type for the loss than for the data you passed, specify the type here. Note that if you pass complex data ::Complex{L}, then the loss type will automatically be set to L.\nselection_method::Function: Function to select an expression from the Pareto frontier for use in predict. See SymbolicRegression.MLJInterfaceModule.choose_best for an example. This function should return a single integer specifying the index of the expression to use. By default, this maximizes the score (a pound-for-pound rating) of expressions reaching the threshold of 1.5x the minimum loss. To override this at prediction time, you can pass a named tuple with keys data and idx to predict. See the Operations section for details.\ndimensions_type::AbstractDimensions: The type of dimensions to use when storing the units of the data. By default this is DynamicQuantities.SymbolicDimensions.","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/#Operations","page":"MultitargetSRRegressor","title":"Operations","text":"","category":"section"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"predict(mach, Xnew): Return predictions of the target given features Xnew, which should have the same scitype as X above. The expression used for prediction is defined by the selection_method function, which can be seen by viewing report(mach).best_idx.\npredict(mach, (data=Xnew, idx=i)): Return predictions of the target given features Xnew, which should have the same scitype as X above. 
By passing a named tuple with keys data and idx, you are able to specify the equation you wish to evaluate in idx.","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/#Fitted-parameters","page":"MultitargetSRRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"best_idx::Vector{Int}: The index of the best expression in each Pareto frontier, as determined by the selection_method function. Override in predict by passing a named tuple with keys data and idx.\nequations::Vector{Vector{Node{T}}}: The expressions discovered by the search, represented in a dominating Pareto frontier (i.e., the best expressions found for each complexity). The outer vector is indexed by target variable, and the inner vector is ordered by increasing complexity. T is equal to the element type of the passed data.\nequation_strings::Vector{Vector{String}}: The expressions discovered by the search, represented as strings for easy inspection.","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/#Report","page":"MultitargetSRRegressor","title":"Report","text":"","category":"section"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"The fields of report(mach) are:","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"best_idx::Vector{Int}: The index of the best expression in each Pareto frontier, as determined by the selection_method function. Override in predict by passing a named tuple with keys data and idx.\nequations::Vector{Vector{Node{T}}}: The expressions discovered by the search, represented in a dominating Pareto frontier (i.e., the best expressions found for each complexity). The outer vector is indexed by target variable, and the inner vector is ordered by increasing complexity.\nequation_strings::Vector{Vector{String}}: The expressions discovered by the search, represented as strings for easy inspection.\ncomplexities::Vector{Vector{Int}}: The complexity of each expression in each Pareto frontier.\nlosses::Vector{Vector{L}}: The loss of each expression in each Pareto frontier, according to the loss function specified in the model. The type L is the loss type, which is usually the same as the element type of data passed (i.e., T), but can differ if complex data types are passed.\nscores::Vector{Vector{L}}: A metric which considers both the complexity and loss of an expression, equal to the change in the log-loss divided by the change in complexity, relative to the previous expression along the Pareto frontier. 
A larger score aims to indicate an expression is more likely to be the true expression generating the data, but this is very problem-dependent and generally several other factors should be considered.","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/#Examples","page":"MultitargetSRRegressor","title":"Examples","text":"","category":"section"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"using MLJ\nMultitargetSRRegressor = @load MultitargetSRRegressor pkg=SymbolicRegression\nX = (a=rand(100), b=rand(100), c=rand(100))\nY = (y1=(@. cos(X.c) * 2.1 - 0.9), y2=(@. X.a * X.b + X.c))\nmodel = MultitargetSRRegressor(binary_operators=[+, -, *], unary_operators=[exp], niterations=100)\nmach = machine(model, X, Y)\nfit!(mach)\ny_hat = predict(mach, X)\n## View the equations used:\nr = report(mach)\nfor (output_index, (eq, i)) in enumerate(zip(r.equation_strings, r.best_idx))\n println(\"Equation used for \", output_index, \": \", eq[i])\nend","category":"page"},{"location":"models/MultitargetSRRegressor_SymbolicRegression/","page":"MultitargetSRRegressor","title":"MultitargetSRRegressor","text":"See also SRRegressor.","category":"page"},{"location":"models/PerceptronClassifier_MLJScikitLearnInterface/#PerceptronClassifier_MLJScikitLearnInterface","page":"PerceptronClassifier","title":"PerceptronClassifier","text":"","category":"section"},{"location":"models/PerceptronClassifier_MLJScikitLearnInterface/","page":"PerceptronClassifier","title":"PerceptronClassifier","text":"PerceptronClassifier","category":"page"},{"location":"models/PerceptronClassifier_MLJScikitLearnInterface/","page":"PerceptronClassifier","title":"PerceptronClassifier","text":"A model type for constructing a perceptron classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/PerceptronClassifier_MLJScikitLearnInterface/","page":"PerceptronClassifier","title":"PerceptronClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/PerceptronClassifier_MLJScikitLearnInterface/","page":"PerceptronClassifier","title":"PerceptronClassifier","text":"PerceptronClassifier = @load PerceptronClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/PerceptronClassifier_MLJScikitLearnInterface/","page":"PerceptronClassifier","title":"PerceptronClassifier","text":"Do model = PerceptronClassifier() to construct an instance with default hyper-parameters. 
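For a minimal usage sketch (this example is not part of the original entry; the synthetic make_blobs data and the values shown are illustrative assumptions, using only the generic MLJ machine interface):\n\nusing MLJ\nPerceptronClassifier = @load PerceptronClassifier pkg=MLJScikitLearnInterface\nX, y = make_blobs(100, 3)   ## synthetic data: Continuous features, Multiclass target\nmach = machine(PerceptronClassifier(), X, y) |> fit!\nyhat = predict(mach, X)     ## predicted class labels for the training data\n\n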
Provide keyword arguments to override hyper-parameter defaults, as in PerceptronClassifier(penalty=...).","category":"page"},{"location":"models/PerceptronClassifier_MLJScikitLearnInterface/#Hyper-parameters","page":"PerceptronClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/PerceptronClassifier_MLJScikitLearnInterface/","page":"PerceptronClassifier","title":"PerceptronClassifier","text":"penalty = nothing\nalpha = 0.0001\nfit_intercept = true\nmax_iter = 1000\ntol = 0.001\nshuffle = true\nverbose = 0\neta0 = 1.0\nn_jobs = nothing\nrandom_state = 0\nearly_stopping = false\nvalidation_fraction = 0.1\nn_iter_no_change = 5\nclass_weight = nothing\nwarm_start = false","category":"page"},{"location":"models/KNeighborsRegressor_MLJScikitLearnInterface/#KNeighborsRegressor_MLJScikitLearnInterface","page":"KNeighborsRegressor","title":"KNeighborsRegressor","text":"","category":"section"},{"location":"models/KNeighborsRegressor_MLJScikitLearnInterface/","page":"KNeighborsRegressor","title":"KNeighborsRegressor","text":"KNeighborsRegressor","category":"page"},{"location":"models/KNeighborsRegressor_MLJScikitLearnInterface/","page":"KNeighborsRegressor","title":"KNeighborsRegressor","text":"A model type for constructing a K-nearest neighbors regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/KNeighborsRegressor_MLJScikitLearnInterface/","page":"KNeighborsRegressor","title":"KNeighborsRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/KNeighborsRegressor_MLJScikitLearnInterface/","page":"KNeighborsRegressor","title":"KNeighborsRegressor","text":"KNeighborsRegressor = @load KNeighborsRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/KNeighborsRegressor_MLJScikitLearnInterface/","page":"KNeighborsRegressor","title":"KNeighborsRegressor","text":"Do model = KNeighborsRegressor() to construct an instance with default hyper-parameters. 
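For a minimal usage sketch (this example is not part of the original entry; the synthetic make_regression data is an illustrative assumption, using only the generic MLJ machine interface):\n\nusing MLJ\nKNeighborsRegressor = @load KNeighborsRegressor pkg=MLJScikitLearnInterface\nX, y = make_regression(100, 4)   ## synthetic data with Continuous features and target\nmach = machine(KNeighborsRegressor(), X, y) |> fit!\nyhat = predict(mach, X)          ## point predictions\n\n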
Provide keyword arguments to override hyper-parameter defaults, as in KNeighborsRegressor(n_neighbors=...).","category":"page"},{"location":"models/KNeighborsRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"KNeighborsRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/KNeighborsRegressor_MLJScikitLearnInterface/","page":"KNeighborsRegressor","title":"KNeighborsRegressor","text":"n_neighbors = 5\nweights = uniform\nalgorithm = auto\nleaf_size = 30\np = 2\nmetric = minkowski\nmetric_params = nothing\nn_jobs = nothing","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/#NeuralNetworkRegressor_MLJFlux","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"","category":"section"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"NeuralNetworkRegressor","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"A model type for constructing a neural network regressor, based on MLJFlux.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"NeuralNetworkRegressor = @load NeuralNetworkRegressor pkg=MLJFlux","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"Do model = NeuralNetworkRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in NeuralNetworkRegressor(builder=...).","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"NeuralNetworkRegressor is for training a data-dependent Flux.jl neural network to predict a Continuous target, given a table of Continuous features. Users provide a recipe for constructing the network, based on properties of the data that is encountered, by specifying an appropriate builder. See MLJFlux documentation for more on builders.","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/#Training-data","page":"NeuralNetworkRegressor","title":"Training data","text":"","category":"section"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"Here:","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"X is either a Matrix or any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). 
If X is a Matrix, it is assumed to have columns corresponding to features and rows corresponding to observations.\ny is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"Train the machine with fit!(mach, rows=...).","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/#Hyper-parameters","page":"NeuralNetworkRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"builder=MLJFlux.Linear(σ=Flux.relu): An MLJFlux builder that constructs a neural network. Possible builders include: MLJFlux.Linear, MLJFlux.Short, and MLJFlux.MLP. See MLJFlux documentation for more on builders, and the example below for using the @builder convenience macro.\noptimiser::Flux.Adam(): A Flux.Optimise optimiser. The optimiser performs the updating of the weights of the network. For further reference, see the Flux optimiser documentation. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.\nloss=Flux.mse: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. For a regression task, natural loss functions are:\nFlux.mse\nFlux.mae\nFlux.msle\nFlux.huber_loss\nCurrently MLJ measures are not supported as loss functions here.\nepochs::Int=10: The duration of training, in epochs. Typically, one epoch represents one pass through the complete training dataset.\nbatch_size::Int=1: the batch size to be used for training, representing the number of samples per update of the network weights. Typically, batch size is between 8 and 512. Increasing batch size may accelerate training if acceleration=CUDALibs() and a GPU is available.\nlambda::Float64=0: The strength of the weight regularization penalty. Can be any value in the range [0, ∞).\nalpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.\nrng::Union{AbstractRNG, Int64}: The random number generator or seed used during training.\noptimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.\nacceleration::AbstractResource=CPU1(): Defines on what hardware training is done. 
For training on a GPU, use CUDALibs().","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/#Operations","page":"NeuralNetworkRegressor","title":"Operations","text":"","category":"section"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"predict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above.","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/#Fitted-parameters","page":"NeuralNetworkRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"chain: The trained \"chain\" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network.","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/#Report","page":"NeuralNetworkRegressor","title":"Report","text":"","category":"section"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"The fields of report(mach) are:","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"training_losses: A vector of training losses (penalized if lambda != 0) in historical order, of length epochs + 1. The first element is the pre-training loss.","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/#Examples","page":"NeuralNetworkRegressor","title":"Examples","text":"","category":"section"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"In this example we build a regression model for the Boston house price dataset.","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"using MLJ\nimport MLJFlux\nusing Flux","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"First, we load in the data: the :MEDV column becomes the target vector y, and all remaining columns go into a table X, with the exception of :CHAS:","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"data = OpenML.load(531); ## Loads from https://www.openml.org/d/531\ny, X = unpack(data, ==(:MEDV), !=(:CHAS); rng=123);\n\nscitype(y)\nschema(X)","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"Since MLJFlux models do not handle ordered factors, we'll treat :RAD as Continuous:","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"X = coerce(X, :RAD=>Continuous)","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"Splitting off a test 
set:","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"(X, Xtest), (y, ytest) = partition((X, y), 0.7, multi=true);","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"Next, we can define a builder, making use of a convenience macro to do so. In the following @builder call, n_in is a proxy for the number of input features (which will be known at fit! time) and rng is a proxy for an RNG (which will be passed from the rng field of model defined below). We also have the parameter n_out which is the number of output features. As we are doing single-target regression, the value passed will always be 1, but the builder we define will also work for MultitargetNeuralNetworkRegressor.","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"builder = MLJFlux.@builder begin\n init=Flux.glorot_uniform(rng)\n Chain(\n Dense(n_in, 64, relu, init=init),\n Dense(64, 32, relu, init=init),\n Dense(32, n_out, init=init),\n )\nend","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"Instantiating a model:","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"NeuralNetworkRegressor = @load NeuralNetworkRegressor pkg=MLJFlux\nmodel = NeuralNetworkRegressor(\n builder=builder,\n rng=123,\n epochs=20\n)","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"We arrange for standardization of the target by wrapping our model in TransformedTargetModel, and standardization of the features by inserting the wrapped model in a pipeline:","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"pipe = Standardizer |> TransformedTargetModel(model, target=Standardizer)","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"If we fit with a high verbosity (>1), we will see the losses during training. 
We can also see the losses in the output of report(mach).","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"mach = machine(pipe, X, y)\nfit!(mach, verbosity=2)\n\n## first element initial loss, 2:end per epoch training losses\nreport(mach).transformed_target_model_deterministic.model.training_losses","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/#Experimenting-with-learning-rate","page":"NeuralNetworkRegressor","title":"Experimenting with learning rate","text":"","category":"section"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"We can visually compare how the learning rate affects the predictions:","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"using Plots\n\nrates = [5e-5, 1e-4, 0.005, 0.001, 0.05]\nplt=plot()\n\nforeach(rates) do η\n pipe.transformed_target_model_deterministic.model.optimiser.eta = η\n fit!(mach, force=true, verbosity=0)\n losses =\n report(mach).transformed_target_model_deterministic.model.training_losses[3:end]\n plot!(1:length(losses), losses, label=η)\nend\n\nplt\n\npipe.transformed_target_model_deterministic.model.optimiser.eta = 0.0001","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"With the learning rate fixed, we compute a CV estimate of the performance (using all data bound to mach) and compare this with performance on the test set:","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"## CV estimate, based on `(X, y)`:\nevaluate!(mach, resampling=CV(nfolds=5), measure=l2)\n\n## loss for `(Xtest, ytest)`:\nfit!(mach) ## train on `(X, y)`\nyhat = predict(mach, Xtest)\nl2(yhat, ytest) |> mean","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"These losses, for the pipeline model, refer to the target on the original, unstandardized, scale.","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"For implementing stopping criteria and other iteration controls, refer to examples linked from the MLJFlux documentation.","category":"page"},{"location":"models/NeuralNetworkRegressor_MLJFlux/","page":"NeuralNetworkRegressor","title":"NeuralNetworkRegressor","text":"See also MultitargetNeuralNetworkRegressor","category":"page"},{"location":"models/PassiveAggressiveRegressor_MLJScikitLearnInterface/#PassiveAggressiveRegressor_MLJScikitLearnInterface","page":"PassiveAggressiveRegressor","title":"PassiveAggressiveRegressor","text":"","category":"section"},{"location":"models/PassiveAggressiveRegressor_MLJScikitLearnInterface/","page":"PassiveAggressiveRegressor","title":"PassiveAggressiveRegressor","text":"PassiveAggressiveRegressor","category":"page"},{"location":"models/PassiveAggressiveRegressor_MLJScikitLearnInterface/","page":"PassiveAggressiveRegressor","title":"PassiveAggressiveRegressor","text":"A model type for constructing a passive aggressive regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model 
interface.","category":"page"},{"location":"models/PassiveAggressiveRegressor_MLJScikitLearnInterface/","page":"PassiveAggressiveRegressor","title":"PassiveAggressiveRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/PassiveAggressiveRegressor_MLJScikitLearnInterface/","page":"PassiveAggressiveRegressor","title":"PassiveAggressiveRegressor","text":"PassiveAggressiveRegressor = @load PassiveAggressiveRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/PassiveAggressiveRegressor_MLJScikitLearnInterface/","page":"PassiveAggressiveRegressor","title":"PassiveAggressiveRegressor","text":"Do model = PassiveAggressiveRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in PassiveAggressiveRegressor(C=...).","category":"page"},{"location":"models/PassiveAggressiveRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"PassiveAggressiveRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/PassiveAggressiveRegressor_MLJScikitLearnInterface/","page":"PassiveAggressiveRegressor","title":"PassiveAggressiveRegressor","text":"C = 1.0\nfit_intercept = true\nmax_iter = 1000\ntol = 0.0001\nearly_stopping = false\nvalidation_fraction = 0.1\nn_iter_no_change = 5\nshuffle = true\nverbose = 0\nloss = epsilon_insensitive\nepsilon = 0.1\nrandom_state = nothing\nwarm_start = false\naverage = false","category":"page"},{"location":"models/LOCIDetector_OutlierDetectionPython/#LOCIDetector_OutlierDetectionPython","page":"LOCIDetector","title":"LOCIDetector","text":"","category":"section"},{"location":"models/LOCIDetector_OutlierDetectionPython/","page":"LOCIDetector","title":"LOCIDetector","text":"LOCIDetector(alpha = 0.5,\n k = 3)","category":"page"},{"location":"models/LOCIDetector_OutlierDetectionPython/","page":"LOCIDetector","title":"LOCIDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.loci","category":"page"},{"location":"api/#Index-of-Methods","page":"Index of Methods","title":"Index of Methods","text":"","category":"section"},{"location":"api/","page":"Index of Methods","title":"Index of Methods","text":"","category":"page"},{"location":"models/OCSVMDetector_OutlierDetectionPython/#OCSVMDetector_OutlierDetectionPython","page":"OCSVMDetector","title":"OCSVMDetector","text":"","category":"section"},{"location":"models/OCSVMDetector_OutlierDetectionPython/","page":"OCSVMDetector","title":"OCSVMDetector","text":"OCSVMDetector(kernel = \"rbf\",\n degree = 3,\n gamma = \"auto\",\n coef0 = 0.0,\n tol = 0.001,\n nu = 0.5,\n shrinking = true,\n cache_size = 200,\n verbose = false,\n max_iter = -1)","category":"page"},{"location":"models/OCSVMDetector_OutlierDetectionPython/","page":"OCSVMDetector","title":"OCSVMDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.ocsvm","category":"page"},{"location":"models/ExtraTreesRegressor_MLJScikitLearnInterface/#ExtraTreesRegressor_MLJScikitLearnInterface","page":"ExtraTreesRegressor","title":"ExtraTreesRegressor","text":"","category":"section"},{"location":"models/ExtraTreesRegressor_MLJScikitLearnInterface/","page":"ExtraTreesRegressor","title":"ExtraTreesRegressor","text":"ExtraTreesRegressor","category":"page"},{"location":"models/ExtraTreesRegressor_MLJScikitLearnInterface/","page":"ExtraTreesRegressor","title":"ExtraTreesRegressor","text":"A model type for constructing an extra trees regressor, based 
on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/ExtraTreesRegressor_MLJScikitLearnInterface/","page":"ExtraTreesRegressor","title":"ExtraTreesRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/ExtraTreesRegressor_MLJScikitLearnInterface/","page":"ExtraTreesRegressor","title":"ExtraTreesRegressor","text":"ExtraTreesRegressor = @load ExtraTreesRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/ExtraTreesRegressor_MLJScikitLearnInterface/","page":"ExtraTreesRegressor","title":"ExtraTreesRegressor","text":"Do model = ExtraTreesRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ExtraTreesRegressor(n_estimators=...).","category":"page"},{"location":"models/ExtraTreesRegressor_MLJScikitLearnInterface/","page":"ExtraTreesRegressor","title":"ExtraTreesRegressor","text":"An extra trees regressor fits a number of randomized decision trees on various sub-samples of the dataset and uses averaging to improve predictive accuracy and control over-fitting.","category":"page"},{"location":"models/LOFDetector_OutlierDetectionPython/#LOFDetector_OutlierDetectionPython","page":"LOFDetector","title":"LOFDetector","text":"","category":"section"},{"location":"models/LOFDetector_OutlierDetectionPython/","page":"LOFDetector","title":"LOFDetector","text":"LOFDetector(n_neighbors = 5,\n algorithm = \"auto\",\n leaf_size = 30,\n metric = \"minkowski\",\n p = 2,\n metric_params = nothing,\n n_jobs = 1,\n novelty = true)","category":"page"},{"location":"models/LOFDetector_OutlierDetectionPython/","page":"LOFDetector","title":"LOFDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.lof","category":"page"},{"location":"models/PerceptronClassifier_BetaML/#PerceptronClassifier_BetaML","page":"PerceptronClassifier","title":"PerceptronClassifier","text":"","category":"section"},{"location":"models/PerceptronClassifier_BetaML/","page":"PerceptronClassifier","title":"PerceptronClassifier","text":"mutable struct PerceptronClassifier <: MLJModelInterface.Probabilistic","category":"page"},{"location":"models/PerceptronClassifier_BetaML/","page":"PerceptronClassifier","title":"PerceptronClassifier","text":"The classical perceptron algorithm using one-vs-all for multiclass, from the Beta Machine Learning Toolkit (BetaML).","category":"page"},{"location":"models/PerceptronClassifier_BetaML/#Hyperparameters:","page":"PerceptronClassifier","title":"Hyperparameters:","text":"","category":"section"},{"location":"models/PerceptronClassifier_BetaML/","page":"PerceptronClassifier","title":"PerceptronClassifier","text":"initial_coefficients::Union{Nothing, Matrix{Float64}}: N-classes by D-dimensions matrix of initial linear coefficients [def: nothing, i.e. zeros]\ninitial_constant::Union{Nothing, Vector{Float64}}: N-classes vector of initial constant terms [def: nothing, i.e. zeros]\nepochs::Int64: Maximum number of epochs, i.e. 
passages through the whole training sample [def: 1000]\nshuffle::Bool: Whether to randomly shuffle the data at each iteration (epoch) [def: true]\nforce_origin::Bool: Whether to force the parameter associated with the constant term to remain zero [def: false]\nreturn_mean_hyperplane::Bool: Whether to return the average hyperplane coefficients instead of the final ones [def: false]\nrng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]","category":"page"},{"location":"models/PerceptronClassifier_BetaML/#Example:","page":"PerceptronClassifier","title":"Example:","text":"","category":"section"},{"location":"models/PerceptronClassifier_BetaML/","page":"PerceptronClassifier","title":"PerceptronClassifier","text":"julia> using MLJ\n\njulia> X, y = @load_iris;\n\njulia> modelType = @load PerceptronClassifier pkg = \"BetaML\"\n[ Info: For silent loading, specify `verbosity=0`. \nimport BetaML ✔\nBetaML.Perceptron.PerceptronClassifier\n\njulia> model = modelType()\nPerceptronClassifier(\n initial_coefficients = nothing, \n initial_constant = nothing, \n epochs = 1000, \n shuffle = true, \n force_origin = false, \n return_mean_hyperplane = false, \n rng = Random._GLOBAL_RNG())\n\njulia> mach = machine(model, X, y);\n\njulia> fit!(mach);\n[ Info: Training machine(PerceptronClassifier(initial_coefficients = nothing, …), …).\n*** Avg. error after epoch 2 : 0.0 (all elements of the set has been correctly classified)\njulia> est_classes = predict(mach, X)\n150-element CategoricalDistributions.UnivariateFiniteVector{Multiclass{3}, String, UInt8, Float64}:\n UnivariateFinite{Multiclass{3}}(setosa=>1.0, versicolor=>2.53e-34, virginica=>0.0)\n UnivariateFinite{Multiclass{3}}(setosa=>1.0, versicolor=>1.27e-18, virginica=>1.86e-310)\n ⋮\n UnivariateFinite{Multiclass{3}}(setosa=>2.77e-57, versicolor=>1.1099999999999999e-82, virginica=>1.0)\n UnivariateFinite{Multiclass{3}}(setosa=>3.09e-22, versicolor=>4.03e-25, virginica=>1.0)","category":"page"},{"location":"models/ABODDetector_OutlierDetectionPython/#ABODDetector_OutlierDetectionPython","page":"ABODDetector","title":"ABODDetector","text":"","category":"section"},{"location":"models/ABODDetector_OutlierDetectionPython/","page":"ABODDetector","title":"ABODDetector","text":"ABODDetector(n_neighbors = 5,\n method = \"fast\")","category":"page"},{"location":"models/ABODDetector_OutlierDetectionPython/","page":"ABODDetector","title":"ABODDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.abod","category":"page"},{"location":"models/TransformedTargetModel_MLJBase/#TransformedTargetModel_MLJBase","page":"TransformedTargetModel","title":"TransformedTargetModel","text":"","category":"section"},{"location":"models/TransformedTargetModel_MLJBase/","page":"TransformedTargetModel","title":"TransformedTargetModel","text":"TransformedTargetModel(model; transformer=nothing, inverse=nothing, cache=true)","category":"page"},{"location":"models/TransformedTargetModel_MLJBase/","page":"TransformedTargetModel","title":"TransformedTargetModel","text":"Wrap the supervised or semi-supervised model in a transformation of the target variable.","category":"page"},{"location":"models/TransformedTargetModel_MLJBase/","page":"TransformedTargetModel","title":"TransformedTargetModel","text":"Here transformer is one of the following:","category":"page"},{"location":"models/TransformedTargetModel_MLJBase/","page":"TransformedTargetModel","title":"TransformedTargetModel","text":"The 
Unsupervised model that is to transform the training target. By default (inverse=nothing) the parameters learned by this transformer are also used to inverse-transform the predictions of model, which means transformer must implement the inverse_transform method. If this is not the case, specify inverse=identity to suppress inversion.\nA callable object for transforming the target, such as y -> log.(y). In this case a callable inverse, such as z -> exp.(z), should be specified.","category":"page"},{"location":"models/TransformedTargetModel_MLJBase/","page":"TransformedTargetModel","title":"TransformedTargetModel","text":"Specify cache=false to prioritize memory over speed, or to guarantee data anonymity.","category":"page"},{"location":"models/TransformedTargetModel_MLJBase/","page":"TransformedTargetModel","title":"TransformedTargetModel","text":"Specify inverse=identity if model is a probabilistic predictor, as inverse-transforming sample spaces is not supported. Alternatively, replace model with a deterministic model, such as Pipeline(model, y -> mode.(y)).","category":"page"},{"location":"models/TransformedTargetModel_MLJBase/#Examples","page":"TransformedTargetModel","title":"Examples","text":"","category":"section"},{"location":"models/TransformedTargetModel_MLJBase/","page":"TransformedTargetModel","title":"TransformedTargetModel","text":"A model that normalizes the target before applying ridge regression, with predictions returned on the original scale:","category":"page"},{"location":"models/TransformedTargetModel_MLJBase/","page":"TransformedTargetModel","title":"TransformedTargetModel","text":"@load RidgeRegressor pkg=MLJLinearModels\nmodel = RidgeRegressor()\ntmodel = TransformedTargetModel(model, transformer=Standardizer())","category":"page"},{"location":"models/TransformedTargetModel_MLJBase/","page":"TransformedTargetModel","title":"TransformedTargetModel","text":"A model that applies a static log transformation to the data, again returning predictions to the original scale:","category":"page"},{"location":"models/TransformedTargetModel_MLJBase/","page":"TransformedTargetModel","title":"TransformedTargetModel","text":"tmodel2 = TransformedTargetModel(model, transformer=y->log.(y), inverse=z->exp.(z))","category":"page"},{"location":"preparing_data/#Preparing-Data","page":"Preparing Data","title":"Preparing Data","text":"","category":"section"},{"location":"preparing_data/#Splitting-data","page":"Preparing Data","title":"Splitting data","text":"","category":"section"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"MLJ has two tools for splitting data. To split data vertically (that is, to split by observations) use partition. 
This is commonly applied to a vector of observation indices, but can also be applied to datasets themselves, provided they are vectors, matrices or tables.","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"To split tabular data horizontally (i.e., break up a table based on feature names) use unpack.","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"MLJBase.partition\nMLJBase.unpack","category":"page"},{"location":"preparing_data/#MLJBase.partition","page":"Preparing Data","title":"MLJBase.partition","text":"partition(X, fractions...;\n shuffle=nothing,\n rng=Random.GLOBAL_RNG,\n stratify=nothing,\n multi=false)\n\nSplits the vector, matrix or table X into a tuple of objects of the same type, whose vertical concatenation is X. The number of rows in each component of the return value is determined by the corresponding fractions of length(nrows(X)), where valid fractions are floats between 0 and 1 whose sum is less than one. The last fraction is not provided, as it is inferred from the preceding ones.\n\nFor synchronized partitioning of multiple objects, use the multi=true option.\n\njulia> partition(1:1000, 0.8)\n([1,...,800], [801,...,1000])\n\njulia> partition(1:1000, 0.2, 0.7)\n([1,...,200], [201,...,900], [901,...,1000])\n\njulia> partition(reshape(1:10, 5, 2), 0.2, 0.4)\n([1 6], [2 7; 3 8], [4 9; 5 10])\n\njulia> X, y = make_blobs() # a table and vector\njulia> Xtrain, Xtest = partition(X, 0.8, stratify=y)\n\nHere's an example of synchronized partitioning of multiple objects:\n\njulia> (Xtrain, Xtest), (ytrain, ytest) = partition((X, y), 0.8, rng=123, multi=true)\n\nKeywords\n\nshuffle=nothing: if set to true, shuffles the rows before taking fractions.\nrng=Random.GLOBAL_RNG: specifies the random number generator to be used, can be an integer seed. If specified, and shuffle === nothing is interpreted as true.\nstratify=nothing: if a vector is specified, the partition will match the stratification of the given vector. In that case, shuffle cannot be false.\nmulti=false: if true then X is expected to be a tuple of objects sharing a common length, which are each partitioned separately using the same specified fractions and the same row shuffling. Returns a tuple of partitions (a tuple of tuples).\n\n\n\n\n\n","category":"function"},{"location":"preparing_data/#MLJBase.unpack","page":"Preparing Data","title":"MLJBase.unpack","text":"unpack(table, f1, f2, ... fk;\n wrap_singles=false,\n shuffle=false,\n rng::Union{AbstractRNG,Int,Nothing}=nothing,\n coerce_options...)\n\nHorizontally split any Tables.jl compatible table into smaller tables or vectors by making column selections determined by the predicates f1, f2, ..., fk. Selection from the column names is without replacement. 
A predicate is any object f such that f(name) is true or false for each column name::Symbol of table.\n\nReturns a tuple of tables/vectors with length one greater than the number of supplied predicates, with the last component including all previously unselected columns.\n\njulia> table = DataFrame(x=[1,2], y=['a', 'b'], z=[10.0, 20.0], w=[\"A\", \"B\"])\n2×4 DataFrame\n Row │ x y z w\n │ Int64 Char Float64 String\n─────┼──────────────────────────────\n 1 │ 1 a 10.0 A\n 2 │ 2 b 20.0 B\n\njulia> Z, XY, W = unpack(table, ==(:z), !=(:w));\njulia> Z\n2-element Vector{Float64}:\n 10.0\n 20.0\n\njulia> XY\n2×2 DataFrame\n Row │ x y\n │ Int64 Char\n─────┼─────────────\n 1 │ 1 a\n 2 │ 2 b\n\njulia> W # the column(s) left over\n2-element Vector{String}:\n \"A\"\n \"B\"\n\nWhenever a returned table contains a single column, it is converted to a vector unless wrap_singles=true.\n\nIf coerce_options are specified then table is first replaced with coerce(table, coerce_options). See ScientificTypes.coerce for details.\n\nIf shuffle=true then the rows of table are first shuffled, using the global RNG, unless rng is specified; if rng is an integer, it specifies the seed of an automatically generated Mersenne twister. If rng is specified then shuffle=true is implicit.\n\n\n\n\n\n","category":"function"},{"location":"preparing_data/#Bridging-the-gap-between-data-type-and-model-requirements","page":"Preparing Data","title":"Bridging the gap between data type and model requirements","text":"","category":"section"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"As outlined in Getting Started, it is important that the scientific type of data matches the requirements of the model of interest. For example, while the majority of supervised learning models require input features to be Continuous, newcomers to MLJ are sometimes surprised at the disappointing results of model queries such as this one:","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"using MLJ","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"X = (height = [185, 153, 163, 114, 180],\n time = [2.3, 4.5, 4.2, 1.8, 7.1],\n mark = [\"D\", \"A\", \"C\", \"B\", \"A\"],\n admitted = [\"yes\", \"no\", missing, \"yes\"]);\ny = [12.4, 12.5, 12.0, 31.9, 43.0]\nmodels(matching(X, y))","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"Or are unsure about the source of the following warning:","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"julia> Tree = @load DecisionTreeRegressor pkg=DecisionTree verbosity=0;\njulia> tree = Tree();\n\njulia> machine(tree, X, y)\n┌ Warning: The scitype of `X`, in `machine(model, X, ...)` is incompatible with `model=DecisionTreeRegressor @378`:\n│ scitype(X) = Table{Union{AbstractVector{Continuous}, AbstractVector{Count}, AbstractVector{Textual}, AbstractVector{Union{Missing, Textual}}}}\n│ input_scitype(model) = Table{var\"#s46\"} where var\"#s46\"<:Union{AbstractVector{var\"#s9\"} where var\"#s9\"<:Continuous, AbstractVector{var\"#s9\"} where var\"#s9\"<:Count, AbstractVector{var\"#s9\"} where var\"#s9\"<:OrderedFactor}.\n└ @ MLJBase ~/Dropbox/Julia7/MLJ/MLJBase/src/machines.jl:103\nMachine{DecisionTreeRegressor,…} @198 trained 0 times; caches data\n args:\n 1: Source @628 ⏎ `Table{Union{AbstractVector{Continuous}, AbstractVector{Count}, AbstractVector{Textual}, 
AbstractVector{Union{Missing, Textual}}}}`\n 2: Source @544 ⏎ `AbstractVector{Continuous}`","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"The meaning of the warning is:","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"The input X is a table with column scitypes Continuous, Count, Textual, and Union{Missing, Textual}, which we can also see by inspecting the schema:\nschema(X)\nThe model requires a table whose column element scitypes subtype Continuous, an incompatibility.","category":"page"},{"location":"preparing_data/#Common-data-preprocessing-workflows","page":"Preparing Data","title":"Common data preprocessing workflows","text":"","category":"section"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"There are two tools for addressing data-model type mismatches like the above, with links to further documentation given below:","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"Scientific type coercion: We coerce machine types to obtain the intended scientific interpretation. If height in the above example is intended to be Continuous, mark is supposed to be OrderedFactor, and admitted a (binary) Multiclass, then we can do","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"X_coerced = coerce(X, :height=>Continuous, :mark=>OrderedFactor, :admitted=>Multiclass);\nschema(X_coerced)","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"Data transformations: We carry out conventional data transformations, such as missing value imputation and feature encoding:","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"imputer = FillImputer()\nmach = machine(imputer, X_coerced) |> fit!\nX_imputed = transform(mach, X_coerced);\nschema(X_imputed)","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"encoder = ContinuousEncoder()\nmach = machine(encoder, X_imputed) |> fit!\nX_encoded = transform(mach, X_imputed)","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"schema(X_encoded)","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"Such transformations can also be combined in a pipeline; see Linear Pipelines.","category":"page"},{"location":"preparing_data/#Scientific-type-coercion","page":"Preparing Data","title":"Scientific type coercion","text":"","category":"section"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"Scientific type coercion is documented in detail at ScientificTypesBase.jl. See also the tutorial at this MLJ Workshop (specifically, here) and this Data Science in Julia tutorial.","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"Also relevant is the section, Working with Categorical Data.","category":"page"},{"location":"preparing_data/#Data-transformation","page":"Preparing Data","title":"Data transformation","text":"","category":"section"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"MLJ's Built-in transformers are documented at Transformers and Other Unsupervised Models. 
The most relevant in the present context are: ContinuousEncoder, OneHotEncoder, FeatureSelector and FillImputer. A Gaussian mixture model imputer is provided by BetaML, which can be loaded with","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"MissingImputator = @load MissingImputator pkg=BetaML","category":"page"},{"location":"preparing_data/","page":"Preparing Data","title":"Preparing Data","text":"This MLJ Workshop, and the \"End-to-end examples\" in Data Science in Julia tutorials give further illustrations of data preprocessing in MLJ.","category":"page"},{"location":"models/AgglomerativeClustering_MLJScikitLearnInterface/#AgglomerativeClustering_MLJScikitLearnInterface","page":"AgglomerativeClustering","title":"AgglomerativeClustering","text":"","category":"section"},{"location":"models/AgglomerativeClustering_MLJScikitLearnInterface/","page":"AgglomerativeClustering","title":"AgglomerativeClustering","text":"AgglomerativeClustering","category":"page"},{"location":"models/AgglomerativeClustering_MLJScikitLearnInterface/","page":"AgglomerativeClustering","title":"AgglomerativeClustering","text":"A model type for constructing an agglomerative clustering, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/AgglomerativeClustering_MLJScikitLearnInterface/","page":"AgglomerativeClustering","title":"AgglomerativeClustering","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/AgglomerativeClustering_MLJScikitLearnInterface/","page":"AgglomerativeClustering","title":"AgglomerativeClustering","text":"AgglomerativeClustering = @load AgglomerativeClustering pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/AgglomerativeClustering_MLJScikitLearnInterface/","page":"AgglomerativeClustering","title":"AgglomerativeClustering","text":"Do model = AgglomerativeClustering() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in AgglomerativeClustering(n_clusters=...).","category":"page"},{"location":"models/AgglomerativeClustering_MLJScikitLearnInterface/","page":"AgglomerativeClustering","title":"AgglomerativeClustering","text":"Recursively merges the pair of clusters that minimally increases a given linkage distance. Note: there is no predict or transform. Instead, inspect the fitted_params.","category":"page"},{"location":"","page":"Home","title":"Home","text":"\n\n
\n\nA Machine Learning Framework for Julia","category":"page"},{"location":"","page":"Home","title":"Home","text":"To support MLJ development, please cite these works or star the repo:","category":"page"},{"location":"","page":"Home","title":"Home","text":"(Image: DOI) (Image: arXiv)","category":"page"},{"location":"","page":"Home","title":"Home","text":"\n Star","category":"page"},{"location":"#[Model-Browser](@ref)","page":"Home","title":"Model Browser","text":"","category":"section"},{"location":"#Reference-Manual","page":"Home","title":"Reference Manual","text":"","category":"section"},{"location":"#Basics","page":"Home","title":"Basics","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Getting Started | Working with Categorical Data | Common MLJ Workflows | Machines | MLJ Cheatsheet ","category":"page"},{"location":"#Data","page":"Home","title":"Data","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Working with Categorical Data | Preparing Data | Generating Synthetic Data | OpenML Integration | Correcting Class Imbalance","category":"page"},{"location":"#Models","page":"Home","title":"Models","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Model Search | Loading Model Code | Transformers and Other Unsupervised Models | Simple User Defined Models | List of Supported Models | Third Party Packages ","category":"page"},{"location":"#Meta-algorithms","page":"Home","title":"Meta-algorithms","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Evaluating Model Performance | Tuning Models | Composing Models | Controlling Iterative Models | Learning Curves| Correcting Class Imbalance | Thresholding Probabilistic Predictors","category":"page"},{"location":"#Composition","page":"Home","title":"Composition","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Composing Models | Linear Pipelines | Target Transformations | Homogeneous Ensembles | Model Stacking | Learning Networks| Correcting Class Imbalance","category":"page"},{"location":"#Integration","page":"Home","title":"Integration","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Logging Workflows | OpenML Integration","category":"page"},{"location":"#Customization-and-Extension","page":"Home","title":"Customization and Extension","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Simple User Defined Models | Quick-Start Guide to Adding Models | Adding Models for General Use | Composing Models | Internals | Modifying Behavior","category":"page"},{"location":"#Miscellaneous","page":"Home","title":"Miscellaneous","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Weights | Acceleration and Parallelism | Performance Measures ","category":"page"},{"location":"models/SVMNuClassifier_MLJScikitLearnInterface/#SVMNuClassifier_MLJScikitLearnInterface","page":"SVMNuClassifier","title":"SVMNuClassifier","text":"","category":"section"},{"location":"models/SVMNuClassifier_MLJScikitLearnInterface/","page":"SVMNuClassifier","title":"SVMNuClassifier","text":"SVMNuClassifier","category":"page"},{"location":"models/SVMNuClassifier_MLJScikitLearnInterface/","page":"SVMNuClassifier","title":"SVMNuClassifier","text":"A model type for constructing a nu-support vector classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model 
interface.","category":"page"},{"location":"models/SVMNuClassifier_MLJScikitLearnInterface/","page":"SVMNuClassifier","title":"SVMNuClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/SVMNuClassifier_MLJScikitLearnInterface/","page":"SVMNuClassifier","title":"SVMNuClassifier","text":"SVMNuClassifier = @load SVMNuClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/SVMNuClassifier_MLJScikitLearnInterface/","page":"SVMNuClassifier","title":"SVMNuClassifier","text":"Do model = SVMNuClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SVMNuClassifier(nu=...).","category":"page"},{"location":"models/SVMNuClassifier_MLJScikitLearnInterface/#Hyper-parameters","page":"SVMNuClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/SVMNuClassifier_MLJScikitLearnInterface/","page":"SVMNuClassifier","title":"SVMNuClassifier","text":"nu = 0.5\nkernel = rbf\ndegree = 3\ngamma = scale\ncoef0 = 0.0\nshrinking = true\ntol = 0.001\ncache_size = 200\nmax_iter = -1\ndecision_function_shape = ovr\nrandom_state = nothing","category":"page"},{"location":"models/KernelPCA_MultivariateStats/#KernelPCA_MultivariateStats","page":"KernelPCA","title":"KernelPCA","text":"","category":"section"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"KernelPCA","category":"page"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"A model type for constructing a kernel prinicipal component analysis model, based on MultivariateStats.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"KernelPCA = @load KernelPCA pkg=MultivariateStats","category":"page"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"Do model = KernelPCA() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in KernelPCA(maxoutdim=...).","category":"page"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"In kernel PCA the linear operations of ordinary principal component analysis are performed in a reproducing kernel Hilbert space.","category":"page"},{"location":"models/KernelPCA_MultivariateStats/#Training-data","page":"KernelPCA","title":"Training data","text":"","category":"section"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"mach = machine(model, X)","category":"page"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"Here:","category":"page"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).","category":"page"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/KernelPCA_MultivariateStats/#Hyper-parameters","page":"KernelPCA","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"maxoutdim=0: Controls the dimension (number of columns) of the output, outdim. Specifically, outdim = min(n, indim, maxoutdim), where n is the number of observations and indim the input dimension.\nkernel::Function=(x,y)->x'y: The kernel function, which takes two vector arguments x and y and returns a scalar value. Defaults to the dot product of x and y.\nsolver::Symbol=:eig: solver to use for the eigenvalues, one of :eig(default, uses LinearAlgebra.eigen), :eigs(uses Arpack.eigs).\ninverse::Bool=true: perform calculations needed for inverse transform\nbeta::Real=1.0: strength of the ridge regression that learns the inverse transform when inverse is true.\ntol::Real=0.0: Convergence tolerance for eigenvalue solver.\nmaxiter::Int=300: maximum number of iterations for eigenvalue solver.","category":"page"},{"location":"models/KernelPCA_MultivariateStats/#Operations","page":"KernelPCA","title":"Operations","text":"","category":"section"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.\ninverse_transform(mach, Xsmall): For a dimension-reduced table Xsmall, such as returned by transform, reconstruct a table, having the same number of columns as the original training data X, that transforms to Xsmall. Mathematically, inverse_transform is a right-inverse for the PCA projection map, whose image is orthogonal to the kernel of that map. 
In particular, if Xsmall = transform(mach, Xnew), then inverse_transform(Xsmall) is only an approximation to Xnew.","category":"page"},{"location":"models/KernelPCA_MultivariateStats/#Fitted-parameters","page":"KernelPCA","title":"Fitted parameters","text":"","category":"section"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"projection: Returns the projection matrix, which has size (indim, outdim), where indim and outdim are the number of features of the input and output respectively.","category":"page"},{"location":"models/KernelPCA_MultivariateStats/#Report","page":"KernelPCA","title":"Report","text":"","category":"section"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"The fields of report(mach) are:","category":"page"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"indim: Dimension (number of columns) of the training data and new data to be transformed.\noutdim: Dimension of transformed data.\nprincipalvars: The variance of the principal components.","category":"page"},{"location":"models/KernelPCA_MultivariateStats/#Examples","page":"KernelPCA","title":"Examples","text":"","category":"section"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"using MLJ\nusing LinearAlgebra\n\nKernelPCA = @load KernelPCA pkg=MultivariateStats\n\nX, y = @load_iris ## a table and a vector\n\nfunction rbf_kernel(length_scale)\n return (x,y) -> exp(-norm(x-y)^2 / (2 * length_scale^2)) ## Gaussian (RBF) kernel\nend\n\nmodel = KernelPCA(maxoutdim=2, kernel=rbf_kernel(1))\nmach = machine(model, X) |> fit!\n\nXproj = transform(mach, X)","category":"page"},{"location":"models/KernelPCA_MultivariateStats/","page":"KernelPCA","title":"KernelPCA","text":"See also PCA, ICA, FactorAnalysis, PPCA","category":"page"},{"location":"models/StableRulesClassifier_SIRUS/#StableRulesClassifier_SIRUS","page":"StableRulesClassifier","title":"StableRulesClassifier","text":"","category":"section"},{"location":"models/StableRulesClassifier_SIRUS/","page":"StableRulesClassifier","title":"StableRulesClassifier","text":"StableRulesClassifier","category":"page"},{"location":"models/StableRulesClassifier_SIRUS/","page":"StableRulesClassifier","title":"StableRulesClassifier","text":"A model type for constructing a stable rules classifier, based on SIRUS.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/StableRulesClassifier_SIRUS/","page":"StableRulesClassifier","title":"StableRulesClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/StableRulesClassifier_SIRUS/","page":"StableRulesClassifier","title":"StableRulesClassifier","text":"StableRulesClassifier = @load StableRulesClassifier pkg=SIRUS","category":"page"},{"location":"models/StableRulesClassifier_SIRUS/","page":"StableRulesClassifier","title":"StableRulesClassifier","text":"Do model = StableRulesClassifier() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in StableRulesClassifier(rng=...).","category":"page"},{"location":"models/StableRulesClassifier_SIRUS/","page":"StableRulesClassifier","title":"StableRulesClassifier","text":"StableRulesClassifier implements an explainable rule-based model derived from a random forest.","category":"page"},{"location":"models/StableRulesClassifier_SIRUS/#Training-data","page":"StableRulesClassifier","title":"Training data","text":"","category":"section"},{"location":"models/StableRulesClassifier_SIRUS/","page":"StableRulesClassifier","title":"StableRulesClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/StableRulesClassifier_SIRUS/","page":"StableRulesClassifier","title":"StableRulesClassifier","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/StableRulesClassifier_SIRUS/","page":"StableRulesClassifier","title":"StableRulesClassifier","text":"where","category":"page"},{"location":"models/StableRulesClassifier_SIRUS/","page":"StableRulesClassifier","title":"StableRulesClassifier","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)\ny: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)","category":"page"},{"location":"models/StableRulesClassifier_SIRUS/","page":"StableRulesClassifier","title":"StableRulesClassifier","text":"Train the machine with fit!(mach, rows=...).","category":"page"},{"location":"models/StableRulesClassifier_SIRUS/#Hyperparameters","page":"StableRulesClassifier","title":"Hyperparameters","text":"","category":"section"},{"location":"models/StableRulesClassifier_SIRUS/","page":"StableRulesClassifier","title":"StableRulesClassifier","text":"rng::AbstractRNG=default_rng(): Random number generator. Using a StableRNG from StableRNGs.jl is advised.\npartial_sampling::Float64=0.7: Ratio of samples to use in each subset of the data. The default should be fine for most cases.\nn_trees::Int=1000: The number of trees to use. It is advisable to use at least a thousand trees for better rule selection and, in turn, better predictive performance.\nmax_depth::Int=2: The depth of the tree. A lower depth decreases model complexity and can therefore improve accuracy when the sample size is small (reducing overfitting).\nq::Int=10: Number of cutpoints to use per feature. The default value should be fine for most situations.\nmin_data_in_leaf::Int=5: Minimum number of data points per leaf.\nmax_rules::Int=10: This is the most important hyperparameter after lambda. The more rules, the more accurate the model should be. If this is not the case, tune lambda first. However, more rules will also decrease model interpretability. So, it is important to find a good balance here. In most cases, 10 to 40 rules should provide reasonable accuracy while remaining interpretable.\nlambda::Float64=1.0: The weights of the final rules are determined via a regularized regression over each rule as a binary feature. This hyperparameter specifies the strength of the ridge (L2) regularizer. SIRUS is very sensitive to the choice of this hyperparameter. Ensure that you try the full range from 10^-4 to 10^4 (e.g., 0.001, 0.01, ..., 100). When trying the range, one good check is to verify that an increase in max_rules increases performance. 
If this is not the case, then try a different value for lambda.","category":"page"},{"location":"models/StableRulesClassifier_SIRUS/#Fitted-parameters","page":"StableRulesClassifier","title":"Fitted parameters","text":"","category":"section"},{"location":"models/StableRulesClassifier_SIRUS/","page":"StableRulesClassifier","title":"StableRulesClassifier","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/StableRulesClassifier_SIRUS/","page":"StableRulesClassifier","title":"StableRulesClassifier","text":"fitresult: A StableRules object.","category":"page"},{"location":"models/StableRulesClassifier_SIRUS/#Operations","page":"StableRulesClassifier","title":"Operations","text":"","category":"section"},{"location":"models/StableRulesClassifier_SIRUS/","page":"StableRulesClassifier","title":"StableRulesClassifier","text":"predict(mach, Xnew): Return a vector of predictions for each row of Xnew.","category":"page"},{"location":"quick_start_guide_to_adding_models/#Quick-Start-Guide-to-Adding-Models","page":"Quick-Start Guide to Adding Models","title":"Quick-Start Guide to Adding Models","text":"","category":"section"},{"location":"quick_start_guide_to_adding_models/","page":"Quick-Start Guide to Adding Models","title":"Quick-Start Guide to Adding Models","text":"This guide has moved to this section of the MLJModelInterface.jl documentation.","category":"page"},{"location":"quick_start_guide_to_adding_models/","page":"Quick-Start Guide to Adding Models","title":"Quick-Start Guide to Adding Models","text":"For quick-and-dirty user-defined models, not intended for registering with the MLJ Model Registry, see Simple User Defined Models. ","category":"page"},{"location":"target_transformations/#Target-Transformations","page":"Target Transformations","title":"Target Transformations","text":"","category":"section"},{"location":"target_transformations/","page":"Target Transformations","title":"Target Transformations","text":"Some supervised models work best if the target variable has been standardized, i.e., rescaled to have zero mean and unit variance. Such a target transformation is learned from the values of the training target variable. In particular, one generally learns a different transformation when training on a proper subset of the training data. 
Good data hygiene prescribes that a new transformation should be computed each time the supervised model is trained on new data - for example in cross-validation.","category":"page"},{"location":"target_transformations/","page":"Target Transformations","title":"Target Transformations","text":"Additionally, one generally wants to inverse transform the predictions of the supervised model for the final target predictions to be on the original scale.","category":"page"},{"location":"target_transformations/","page":"Target Transformations","title":"Target Transformations","text":"All these concerns are addressed by wrapping the supervised model using TransformedTargetModel:","category":"page"},{"location":"target_transformations/","page":"Target Transformations","title":"Target Transformations","text":"using MLJ\nMLJ.color_off()","category":"page"},{"location":"target_transformations/","page":"Target Transformations","title":"Target Transformations","text":"Ridge = @load RidgeRegressor pkg=MLJLinearModels verbosity=0\nridge = Ridge(fit_intercept=false)\nridge2 = TransformedTargetModel(ridge, transformer=Standardizer())","category":"page"},{"location":"target_transformations/","page":"Target Transformations","title":"Target Transformations","text":"Note that all the original hyperparameters, as well as those of the Standardizer, are accessible as nested hyper-parameters of the wrapped model, which can be trained or evaluated like any other:","category":"page"},{"location":"target_transformations/","page":"Target Transformations","title":"Target Transformations","text":"X, y = make_regression(rng=1234, intercept=false)\ny = y*10^5\nmach = machine(ridge2, X, y)\nfit!(mach, rows=1:60, verbosity=0)\npredict(mach, rows=61:62)","category":"page"},{"location":"target_transformations/","page":"Target Transformations","title":"Target Transformations","text":"Training and predicting using ridge2 as above means:","category":"page"},{"location":"target_transformations/","page":"Target Transformations","title":"Target Transformations","text":"Standardizing the target y using the first 60 rows to get a new target z\nTraining the original ridge model using the first 60 rows of X and z\nCalling predict on the machine trained in Step 2 on rows 61:62 of X\nApplying the inverse scaling learned in Step 1 to those predictions (to get the final output shown above)","category":"page"},{"location":"target_transformations/","page":"Target Transformations","title":"Target Transformations","text":"Since both ridge and ridge2 return predictions on the original scale, we can meaningfully compare the corresponding mean absolute errors, which are indeed different in this case.","category":"page"},{"location":"target_transformations/","page":"Target Transformations","title":"Target Transformations","text":"evaluate(ridge, X, y, measure=l1)","category":"page"},{"location":"target_transformations/","page":"Target Transformations","title":"Target Transformations","text":"evaluate(ridge2, X, y, measure=l1)","category":"page"},{"location":"target_transformations/","page":"Target Transformations","title":"Target Transformations","text":"Ordinary functions can also be used in target transformations but an inverse must be explicitly specified:","category":"page"},{"location":"target_transformations/","page":"Target Transformations","title":"Target Transformations","text":"ridge3 = TransformedTargetModel(ridge, transformer=y->log.(y), inverse=z->exp.(z))\nX, y = @load_boston\nevaluate(ridge3, X, y, 
measure=l1)","category":"page"},{"location":"target_transformations/","page":"Target Transformations","title":"Target Transformations","text":"Without the log transform (ie, using ridge) we get the poorer mean absolute error, l1, of 3.9.","category":"page"},{"location":"target_transformations/","page":"Target Transformations","title":"Target Transformations","text":"TransformedTargetModel","category":"page"},{"location":"target_transformations/#MLJBase.TransformedTargetModel","page":"Target Transformations","title":"MLJBase.TransformedTargetModel","text":"TransformedTargetModel(model; transformer=nothing, inverse=nothing, cache=true)\n\nWrap the supervised or semi-supervised model in a transformation of the target variable.\n\nHere transformer one of the following:\n\nThe Unsupervised model that is to transform the training target. By default (inverse=nothing) the parameters learned by this transformer are also used to inverse-transform the predictions of model, which means transformer must implement the inverse_transform method. If this is not the case, specify inverse=identity to suppress inversion.\nA callable object for transforming the target, such as y -> log.(y). In this case a callable inverse, such as z -> exp.(z), should be specified.\n\nSpecify cache=false to prioritize memory over speed, or to guarantee data anonymity.\n\nSpecify inverse=identity if model is a probabilistic predictor, as inverse-transforming sample spaces is not supported. Alternatively, replace model with a deterministic model, such as Pipeline(model, y -> mode.(y)).\n\nExamples\n\nA model that normalizes the target before applying ridge regression, with predictions returned on the original scale:\n\n@load RidgeRegressor pkg=MLJLinearModels\nmodel = RidgeRegressor()\ntmodel = TransformedTargetModel(model, transformer=Standardizer())\n\nA model that applies a static log transformation to the data, again returning predictions to the original scale:\n\ntmodel2 = TransformedTargetModel(model, transformer=y->log.(y), inverse=z->exp.(y))\n\n\n\n\n\n","category":"function"},{"location":"models/SVMClassifier_MLJScikitLearnInterface/#SVMClassifier_MLJScikitLearnInterface","page":"SVMClassifier","title":"SVMClassifier","text":"","category":"section"},{"location":"models/SVMClassifier_MLJScikitLearnInterface/","page":"SVMClassifier","title":"SVMClassifier","text":"SVMClassifier","category":"page"},{"location":"models/SVMClassifier_MLJScikitLearnInterface/","page":"SVMClassifier","title":"SVMClassifier","text":"A model type for constructing a C-support vector classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/SVMClassifier_MLJScikitLearnInterface/","page":"SVMClassifier","title":"SVMClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/SVMClassifier_MLJScikitLearnInterface/","page":"SVMClassifier","title":"SVMClassifier","text":"SVMClassifier = @load SVMClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/SVMClassifier_MLJScikitLearnInterface/","page":"SVMClassifier","title":"SVMClassifier","text":"Do model = SVMClassifier() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in SVMClassifier(C=...).","category":"page"},{"location":"models/SVMClassifier_MLJScikitLearnInterface/#Hyper-parameters","page":"SVMClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/SVMClassifier_MLJScikitLearnInterface/","page":"SVMClassifier","title":"SVMClassifier","text":"C = 1.0\nkernel = rbf\ndegree = 3\ngamma = scale\ncoef0 = 0.0\nshrinking = true\ntol = 0.001\ncache_size = 200\nmax_iter = -1\ndecision_function_shape = ovr\nrandom_state = nothing","category":"page"},{"location":"models/PCADetector_OutlierDetectionPython/#PCADetector_OutlierDetectionPython","page":"PCADetector","title":"PCADetector","text":"","category":"section"},{"location":"models/PCADetector_OutlierDetectionPython/","page":"PCADetector","title":"PCADetector","text":"PCADetector(n_components = nothing,\n n_selected_components = nothing,\n copy = true,\n whiten = false,\n svd_solver = \"auto\",\n tol = 0.0,\n iterated_power = \"auto\",\n standardization = true,\n weighted = true,\n random_state = nothing)","category":"page"},{"location":"models/PCADetector_OutlierDetectionPython/","page":"PCADetector","title":"PCADetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.pca","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/#RandomForestClassifier_DecisionTree","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"","category":"section"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"RandomForestClassifier","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"A model type for constructing a CART random forest classifier, based on DecisionTree.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"RandomForestClassifier = @load RandomForestClassifier pkg=DecisionTree","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"Do model = RandomForestClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomForestClassifier(max_depth=...).","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"RandomForestClassifier implements the standard Random Forest algorithm, originally published in Breiman, L. (2001): \"Random Forests.\", Machine Learning, vol. 45, pp. 
5–32.","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/#Training-data","page":"RandomForestClassifier","title":"Training data","text":"","category":"section"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"where","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)\ny: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"Train the machine with fit!(mach, rows=...).","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/#Hyperparameters","page":"RandomForestClassifier","title":"Hyperparameters","text":"","category":"section"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"max_depth=-1: max depth of the decision tree (-1=any)\nmin_samples_leaf=1: min number of samples each leaf needs to have\nmin_samples_split=2: min number of samples needed for a split\nmin_purity_increase=0: min purity needed for a split\nn_subfeatures=-1: number of features to select at random (0 for all, -1 for square root of number of features)\nn_trees=10: number of trees to train\nsampling_fraction=0.7 fraction of samples to train each tree on\nfeature_importance: method to use for computing feature importances. One of (:impurity, :split)\nrng=Random.GLOBAL_RNG: random number generator or seed","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/#Operations","page":"RandomForestClassifier","title":"Operations","text":"","category":"section"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above. 
Predictions are probabilistic, but uncalibrated.\npredict_mode(mach, Xnew): instead return the mode of each prediction above.","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/#Fitted-parameters","page":"RandomForestClassifier","title":"Fitted parameters","text":"","category":"section"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"forest: the Ensemble object returned by the core DecisionTree.jl algorithm","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/#Report","page":"RandomForestClassifier","title":"Report","text":"","category":"section"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"The fields of report(mach) are:","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"features: the names of the features encountered in training","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/#Accessor-functions","page":"RandomForestClassifier","title":"Accessor functions","text":"","category":"section"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"feature_importances(mach) returns a vector of (feature::Symbol => importance) pairs; the type of importance is determined by the hyperparameter feature_importance (see above)","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/#Examples","page":"RandomForestClassifier","title":"Examples","text":"","category":"section"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"using MLJ\nForest = @load RandomForestClassifier pkg=DecisionTree\nforest = Forest(min_samples_split=6, n_subfeatures=3)\n\nX, y = @load_iris\nmach = machine(forest, X, y) |> fit!\n\nXnew = (sepal_length = [6.4, 7.2, 7.4],\n sepal_width = [2.8, 3.0, 2.8],\n petal_length = [5.6, 5.8, 6.1],\n petal_width = [2.1, 1.6, 1.9],)\nyhat = predict(mach, Xnew) ## probabilistic predictions\npredict_mode(mach, Xnew) ## point predictions\npdf.(yhat, \"virginica\") ## probabilities for the \"virginica\" class\n\nfitted_params(mach).forest ## raw `Ensemble` object from DecisionTree.jl\n\nfeature_importances(mach) ## `:impurity` feature importances\nforest.feature_importance = :split\nfeature_importances(mach) ## `:split` feature importances\n","category":"page"},{"location":"models/RandomForestClassifier_DecisionTree/","page":"RandomForestClassifier","title":"RandomForestClassifier","text":"See also DecisionTree.jl and the unwrapped model type MLJDecisionTreeInterface.DecisionTree.RandomForestClassifier.","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/#LADRegressor_MLJLinearModels","page":"LADRegressor","title":"LADRegressor","text":"","category":"section"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"LADRegressor","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"A model type for constructing a lad regressor, based on MLJLinearModels.jl, and 
implementing the MLJ model interface.","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"LADRegressor = @load LADRegressor pkg=MLJLinearModels","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"Do model = LADRegressor() to construct an instance with default hyper-parameters.","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"Least absolute deviation regression is a linear model with objective function","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"$","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"∑ρ(Xθ - y) + n⋅λ|θ|₂² + n⋅γ|θ|₁ $","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"where ρ is the absolute loss and n is the number of observations.","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"If scale_penalty_with_samples = false the objective function is instead","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"$","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"∑ρ(Xθ - y) + λ|θ|₂² + γ|θ|₁ $","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":".","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"Different solver options exist, as indicated under \"Hyperparameters\" below. 
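As a quick illustration, here is a minimal sketch only (synthetic data via make_regression; the hyper-parameter values below are illustrative, not recommendations):\n\nusing MLJ\nLAD = @load LADRegressor pkg=MLJLinearModels verbosity=0\nX, y = make_regression(100, 3)  ## synthetic regression data\nmodel = LAD(penalty=:en, lambda=0.5, gamma=0.1)  ## elastic-net penalty (illustrative values)\nmach = fit!(machine(model, X, y), verbosity=0)\nfitted_params(mach)  ## inspect the learned coefficients and intercept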
","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/#Training-data","page":"LADRegressor","title":"Training data","text":"","category":"section"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"where:","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)\ny is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/#Hyperparameters","page":"LADRegressor","title":"Hyperparameters","text":"","category":"section"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"See also RobustRegressor.","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/#Parameters","page":"LADRegressor","title":"Parameters","text":"","category":"section"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"lambda::Real: strength of the regularizer if penalty is :l2 or :l1. Strength of the L2 regularizer if penalty is :en. Default: 1.0\ngamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0\npenalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2\nfit_intercept::Bool: whether to fit the intercept or not. Default: true\npenalize_intercept::Bool: whether to penalize the intercept. Default: false\nscale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true\nsolver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S where S is one of: LBFGS, IWLSCG, if penalty = :l2, and ProxGrad otherwise.\nIf solver = nothing (default) then LBFGS() is used, if penalty = :l2, and otherwise ProxGrad(accel=true) (FISTA) is used.\nSolver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) 
Default: nothing","category":"page"},{"location":"models/LADRegressor_MLJLinearModels/#Example","page":"LADRegressor","title":"Example","text":"","category":"section"},{"location":"models/LADRegressor_MLJLinearModels/","page":"LADRegressor","title":"LADRegressor","text":"using MLJ\nX, y = make_regression()\nmach = fit!(machine(LADRegressor(), X, y))\npredict(mach, X)\nfitted_params(mach)","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/#RidgeRegressor_MLJLinearModels","page":"RidgeRegressor","title":"RidgeRegressor","text":"","category":"section"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"RidgeRegressor","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"A model type for constructing a ridge regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"RidgeRegressor = @load RidgeRegressor pkg=MLJLinearModels","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"Do model = RidgeRegressor() to construct an instance with default hyper-parameters.","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"Ridge regression is a linear model with objective function","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"$","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"|Xθ - y|₂²/2 + n⋅λ|θ|₂²/2 $","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"where n is the number of observations.","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"If scale_penalty_with_samples = false then the objective function is instead","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"$","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"|Xθ - y|₂²/2 + λ|θ|₂²/2 $","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":".","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"Different solver options exist, as indicated under \"Hyperparameters\" below. 
","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/#Training-data","page":"RidgeRegressor","title":"Training data","text":"","category":"section"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"where:","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)\ny is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/#Hyperparameters","page":"RidgeRegressor","title":"Hyperparameters","text":"","category":"section"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"lambda::Real: strength of the L2 regularization. Default: 1.0\nfit_intercept::Bool: whether to fit the intercept or not. Default: true\npenalize_intercept::Bool: whether to penalize the intercept. Default: false\nscale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true\nsolver::Union{Nothing, MLJLinearModels.Solver}: any instance of MLJLinearModels.Analytical. Use Analytical() for Cholesky and CG()=Analytical(iterative=true) for conjugate-gradient. If solver = nothing (default) then Analytical() is used. 
Default: nothing","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/#Example","page":"RidgeRegressor","title":"Example","text":"","category":"section"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"using MLJ\nX, y = make_regression()\nmach = fit!(machine(RidgeRegressor(), X, y))\npredict(mach, X)\nfitted_params(mach)","category":"page"},{"location":"models/RidgeRegressor_MLJLinearModels/","page":"RidgeRegressor","title":"RidgeRegressor","text":"See also ElasticNetRegressor.","category":"page"},{"location":"models/KNNDetector_OutlierDetectionPython/#KNNDetector_OutlierDetectionPython","page":"KNNDetector","title":"KNNDetector","text":"","category":"section"},{"location":"models/KNNDetector_OutlierDetectionPython/","page":"KNNDetector","title":"KNNDetector","text":"KNNDetector(n_neighbors = 5,\n method = \"largest\",\n radius = 1.0,\n algorithm = \"auto\",\n leaf_size = 30,\n metric = \"minkowski\",\n p = 2,\n metric_params = nothing,\n n_jobs = 1)","category":"page"},{"location":"models/KNNDetector_OutlierDetectionPython/","page":"KNNDetector","title":"KNNDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.knn","category":"page"},{"location":"models/LinearRegressor_MultivariateStats/#LinearRegressor_MultivariateStats","page":"LinearRegressor","title":"LinearRegressor","text":"","category":"section"},{"location":"models/LinearRegressor_MultivariateStats/","page":"LinearRegressor","title":"LinearRegressor","text":"LinearRegressor","category":"page"},{"location":"models/LinearRegressor_MultivariateStats/","page":"LinearRegressor","title":"LinearRegressor","text":"A model type for constructing a linear regressor, based on MultivariateStats.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LinearRegressor_MultivariateStats/","page":"LinearRegressor","title":"LinearRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LinearRegressor_MultivariateStats/","page":"LinearRegressor","title":"LinearRegressor","text":"LinearRegressor = @load LinearRegressor pkg=MultivariateStats","category":"page"},{"location":"models/LinearRegressor_MultivariateStats/","page":"LinearRegressor","title":"LinearRegressor","text":"Do model = LinearRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearRegressor(bias=...).","category":"page"},{"location":"models/LinearRegressor_MultivariateStats/","page":"LinearRegressor","title":"LinearRegressor","text":"LinearRegressor assumes the target is a Continuous variable and trains a linear prediction function using the least squares algorithm. 
Options exist to specify a bias term.","category":"page"},{"location":"models/LinearRegressor_MultivariateStats/#Training-data","page":"LinearRegressor","title":"Training data","text":"","category":"section"},{"location":"models/LinearRegressor_MultivariateStats/","page":"LinearRegressor","title":"LinearRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/LinearRegressor_MultivariateStats/","page":"LinearRegressor","title":"LinearRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/LinearRegressor_MultivariateStats/","page":"LinearRegressor","title":"LinearRegressor","text":"Here:","category":"page"},{"location":"models/LinearRegressor_MultivariateStats/","page":"LinearRegressor","title":"LinearRegressor","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check the column scitypes with schema(X).\ny is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y).","category":"page"},{"location":"models/LinearRegressor_MultivariateStats/","page":"LinearRegressor","title":"LinearRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/LinearRegressor_MultivariateStats/#Hyper-parameters","page":"LinearRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/LinearRegressor_MultivariateStats/","page":"LinearRegressor","title":"LinearRegressor","text":"bias=true: Include the bias term if true, otherwise fit without bias term.","category":"page"},{"location":"models/LinearRegressor_MultivariateStats/#Operations","page":"LinearRegressor","title":"Operations","text":"","category":"section"},{"location":"models/LinearRegressor_MultivariateStats/","page":"LinearRegressor","title":"LinearRegressor","text":"predict(mach, Xnew): Return predictions of the target given new features Xnew, which should have the same scitype as X above.","category":"page"},{"location":"models/LinearRegressor_MultivariateStats/#Fitted-parameters","page":"LinearRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/LinearRegressor_MultivariateStats/","page":"LinearRegressor","title":"LinearRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/LinearRegressor_MultivariateStats/","page":"LinearRegressor","title":"LinearRegressor","text":"coefficients: The linear coefficients determined by the model.\nintercept: The intercept determined by the model.","category":"page"},{"location":"models/LinearRegressor_MultivariateStats/#Examples","page":"LinearRegressor","title":"Examples","text":"","category":"section"},{"location":"models/LinearRegressor_MultivariateStats/","page":"LinearRegressor","title":"LinearRegressor","text":"using MLJ\n\nLinearRegressor = @load LinearRegressor pkg=MultivariateStats\nlinear_regressor = LinearRegressor()\n\nX, y = make_regression(100, 2) ## a table and a vector (synthetic data)\nmach = machine(linear_regressor, X, y) |> fit!\n\nXnew, _ = make_regression(3, 2)\nyhat = predict(mach, Xnew) ## new predictions","category":"page"},{"location":"models/LinearRegressor_MultivariateStats/","page":"LinearRegressor","title":"LinearRegressor","text":"See also MultitargetLinearRegressor, RidgeRegressor, 
MultitargetRidgeRegressor","category":"page"},{"location":"models/QuantileRegressor_MLJLinearModels/#QuantileRegressor_MLJLinearModels","page":"QuantileRegressor","title":"QuantileRegressor","text":"","category":"section"},{"location":"models/QuantileRegressor_MLJLinearModels/","page":"QuantileRegressor","title":"QuantileRegressor","text":"QuantileRegressor","category":"page"},{"location":"models/QuantileRegressor_MLJLinearModels/","page":"QuantileRegressor","title":"QuantileRegressor","text":"A model type for constructing a quantile regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/QuantileRegressor_MLJLinearModels/","page":"QuantileRegressor","title":"QuantileRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/QuantileRegressor_MLJLinearModels/","page":"QuantileRegressor","title":"QuantileRegressor","text":"QuantileRegressor = @load QuantileRegressor pkg=MLJLinearModels","category":"page"},{"location":"models/QuantileRegressor_MLJLinearModels/","page":"QuantileRegressor","title":"QuantileRegressor","text":"Do model = QuantileRegressor() to construct an instance with default hyper-parameters.","category":"page"},{"location":"models/QuantileRegressor_MLJLinearModels/","page":"QuantileRegressor","title":"QuantileRegressor","text":"This model coincides with RobustRegressor, with the exception that the robust loss, rho, is fixed to QuantileRho(delta), where delta is a new hyperparameter.","category":"page"},{"location":"models/QuantileRegressor_MLJLinearModels/","page":"QuantileRegressor","title":"QuantileRegressor","text":"Different solver options exist, as indicated under \"Hyperparameters\" below. ","category":"page"},{"location":"models/QuantileRegressor_MLJLinearModels/#Training-data","page":"QuantileRegressor","title":"Training data","text":"","category":"section"},{"location":"models/QuantileRegressor_MLJLinearModels/","page":"QuantileRegressor","title":"QuantileRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/QuantileRegressor_MLJLinearModels/","page":"QuantileRegressor","title":"QuantileRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/QuantileRegressor_MLJLinearModels/","page":"QuantileRegressor","title":"QuantileRegressor","text":"where:","category":"page"},{"location":"models/QuantileRegressor_MLJLinearModels/","page":"QuantileRegressor","title":"QuantileRegressor","text":"X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)\ny is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/QuantileRegressor_MLJLinearModels/","page":"QuantileRegressor","title":"QuantileRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/QuantileRegressor_MLJLinearModels/#Hyperparameters","page":"QuantileRegressor","title":"Hyperparameters","text":"","category":"section"},{"location":"models/QuantileRegressor_MLJLinearModels/","page":"QuantileRegressor","title":"QuantileRegressor","text":"delta::Real: parameterizes the QuantileRho function (indicating the quantile to use with default 0.5 for the median regression) Default: 0.5\nlambda::Real: strength of the regularizer if penalty is :l2 or :l1. Strength of the L2 regularizer if penalty is :en. 
Default: 1.0\ngamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0\npenalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2\nfit_intercept::Bool: whether to fit the intercept or not. Default: true\npenalize_intercept::Bool: whether to penalize the intercept. Default: false\nscale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true\nsolver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S where S is one of: LBFGS, IWLSCG, if penalty = :l2, and ProxGrad otherwise.\nIf solver = nothing (default) then LBFGS() is used, if penalty = :l2, and otherwise ProxGrad(accel=true) (FISTA) is used.\nSolver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) Default: nothing","category":"page"},{"location":"models/QuantileRegressor_MLJLinearModels/#Example","page":"QuantileRegressor","title":"Example","text":"","category":"section"},{"location":"models/QuantileRegressor_MLJLinearModels/","page":"QuantileRegressor","title":"QuantileRegressor","text":"using MLJ\nX, y = make_regression()\nmach = fit!(machine(QuantileRegressor(), X, y))\npredict(mach, X)\nfitted_params(mach)","category":"page"},{"location":"models/QuantileRegressor_MLJLinearModels/","page":"QuantileRegressor","title":"QuantileRegressor","text":"See also RobustRegressor, HuberRegressor.","category":"page"},{"location":"mlj_cheatsheet/#MLJ-Cheatsheet","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"","category":"section"},{"location":"mlj_cheatsheet/#Starting-an-interactive-MLJ-session","page":"MLJ Cheatsheet","title":"Starting an interactive MLJ session","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"using MLJ\nMLJ_VERSION # version of MLJ for this cheatsheet","category":"page"},{"location":"mlj_cheatsheet/#Model-search-and-code-loading","page":"MLJ Cheatsheet","title":"Model search and code loading","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"info(\"PCA\") retrieves registry metadata for the model called \"PCA\"","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"info(\"RidgeRegressor\", pkg=\"MultivariateStats\") retrieves metadata for \"RidgeRegressor\", which is provided by multiple packages","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"doc(\"DecisionTreeClassifier\", pkg=\"DecisionTree\") retrieves the model document string for the classifier, without loading model code","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"models() lists metadata of every registered model.","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"models(\"Tree\") lists models with \"Tree\" in the model or package name.","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"models(x -> x.is_supervised && x.is_pure_julia) lists all supervised models written in pure julia.","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"models(matching(X)) lists all unsupervised models compatible with input X.","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ 
Cheatsheet","title":"MLJ Cheatsheet","text":"models(matching(X, y)) lists all supervised models compatible with input/target X/y.","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"With additional conditions:","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"models() do model\n matching(model, X, y) &&\n model.prediction_type == :probabilistic &&\n model.is_pure_julia\nend","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Tree = @load DecisionTreeClassifier pkg=DecisionTree","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"imports \"DecisionTreeClassifier\" type and binds it to Tree.","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"tree = Tree() to instantiate a Tree.","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"tree2 = Tree(max_depth=2) instantiates a tree with different hyperparameter","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Ridge = @load RidgeRegressor pkg=MultivariateStats imports a type for a model provided by multiple packages","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"For interactive loading instead, use @iload","category":"page"},{"location":"mlj_cheatsheet/#Scitypes-and-coercion","page":"MLJ Cheatsheet","title":"Scitypes and coercion","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"scitype(x) is the scientific type of x. 
For example scitype(2.4) == Continuous","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"(Image: scitypes_small.png)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"type scitype\nAbstractFloat Continuous\nInteger Count\nCategoricalValue and CategoricalString Multiclass or OrderedFactor\nAbstractString Textual","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Figure and Table for common scalar scitypes","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Use schema(X) to get the column scitypes of a table X","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"To coerce the data into different scitypes, use the coerce function:","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"coerce(y, Multiclass) attempts coercion of all elements of y into scitype Multiclass\ncoerce(X, :x1 => Continuous, :x2 => OrderedFactor) to coerce columns :x1 and :x2 of table X.\ncoerce(X, Count => Continuous) to coerce all columns with Count scitype to Continuous.","category":"page"},{"location":"mlj_cheatsheet/#Ingesting-data","page":"MLJ Cheatsheet","title":"Ingesting data","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Split the table channing into target y (the :Exit column) and features X (everything else), after a seeded row shuffling:","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"using RDatasets\nchanning = dataset(\"boot\", \"channing\")\ny, X = unpack(channing, ==(:Exit); rng=123)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Same as above but exclude :Time column from X:","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"using RDatasets\nchanning = dataset(\"boot\", \"channing\")\ny, X = unpack(channing,\n ==(:Exit),\n !=(:Time);\n rng=123)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Here, y is assigned the :Exit column, and X is assigned the rest, except :Time.","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Splitting row indices into train/validation/test, with seeded shuffling:","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"train, valid, test = partition(eachindex(y), 0.7, 0.2, rng=1234) # for 70:20:10 ratio","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"For a stratified split:","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"train, test = partition(eachindex(y), 0.8, stratify=y)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Split a table or matrix X, instead of indices:","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Xtrain, Xvalid, Xtest = partition(X, 0.5, 0.3, rng=123)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Simultaneous splitting (needs 
multi=true):","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"(Xtrain, Xtest), (ytrain, ytest) = partition((X, y), 0.8, rng=123, multi=true)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Getting data from OpenML:","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"table = OpenML.load(91)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Creating synthetic classification data:","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"X, y = make_blobs(100, 2)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"(also: make_moons, make_circles, make_regression)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Creating synthetic regression data:","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"X, y = make_regression(100, 2)","category":"page"},{"location":"mlj_cheatsheet/#Machine-construction","page":"MLJ Cheatsheet","title":"Machine construction","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Supervised case:","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"model = KNNRegressor(K=1)\nmach = machine(model, X, y)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Unsupervised case:","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"model = OneHotEncoder()\nmach = machine(model, X)","category":"page"},{"location":"mlj_cheatsheet/#Fitting","page":"MLJ Cheatsheet","title":"Fitting","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"The fit! 
function can be used to fit a machine (defaults shown):","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"fit!(mach, rows=1:100, verbosity=1, force=false)","category":"page"},{"location":"mlj_cheatsheet/#Prediction","page":"MLJ Cheatsheet","title":"Prediction","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Supervised case: predict(mach, Xnew) or predict(mach, rows=1:100)\nFor probabilistic models: predict_mode, predict_mean and predict_median.\nUnsupervised case: W = transform(mach, Xnew) or inverse_transform(mach, W), etc.","category":"page"},{"location":"mlj_cheatsheet/#Inspecting-objects","page":"MLJ Cheatsheet","title":"Inspecting objects","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"info(ConstantRegressor()), info(\"PCA\"), info(\"RidgeRegressor\", pkg=\"MultivariateStats\") gets all properties (aka traits) of registered models","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"schema(X) get column names, types and scitypes, and nrows, of a table X","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"scitype(X) gets the scientific type of X","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"fitted_params(mach) gets learned parameters of the fitted machine","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"report(mach) gets other training results (e.g. feature rankings)","category":"page"},{"location":"mlj_cheatsheet/#Saving-and-retrieving-machines-using-Julia-serializer","page":"MLJ Cheatsheet","title":"Saving and retrieving machines using Julia serializer","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"MLJ.save(\"my_machine.jls\", mach) to save machine mach (without data)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"predict_only_mach = machine(\"my_machine.jls\") to deserialize.","category":"page"},{"location":"mlj_cheatsheet/#Performance-estimation","page":"MLJ Cheatsheet","title":"Performance estimation","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"evaluate(model, X, y, resampling=CV(), measure=rms)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"evaluate!(mach, resampling=Holdout(), measure=[rms, mav])","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"evaluate!(mach, resampling=[(fold1, fold2), (fold2, fold1)], measure=rms)","category":"page"},{"location":"mlj_cheatsheet/#Resampling-strategies-(resampling...)","page":"MLJ Cheatsheet","title":"Resampling strategies (resampling=...)","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Holdout(fraction_train=0.7, rng=1234) for simple holdout","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"CV(nfolds=6, rng=1234) for cross-validation","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"StratifiedCV(nfolds=6, rng=1234) for 
stratified cross-validation","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"TimeSeriesSV(nfolds=4) for time-series cross-validation","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"InSample(): test set = train set","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"or a list of pairs of row indices:","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"[(train1, eval1), (train2, eval2), ... (traink, evalk)]","category":"page"},{"location":"mlj_cheatsheet/#Tuning-model-wrapper","page":"MLJ Cheatsheet","title":"Tuning model wrapper","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"tuned_model = TunedModel(model; tuning=RandomSearch(), resampling=Holdout(), measure=…, range=…)","category":"page"},{"location":"mlj_cheatsheet/#Ranges-for-tuning-(range...)","page":"MLJ Cheatsheet","title":"Ranges for tuning (range=...)","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"If r = range(KNNRegressor(), :K, lower=1, upper = 20, scale=:log)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"then Grid() search uses iterator(r, 6) == [1, 2, 3, 6, 11, 20].","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"lower=-Inf and upper=Inf are allowed.","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Non-numeric ranges: r = range(model, :parameter, values=…)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Instead of model, declare type: r = range(Char, :c; values=['a', 'b'])","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Nested ranges: Use dot syntax, as in r = range(EnsembleModel(atom=tree), :(atom.max_depth), ...)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Specify multiple ranges, as in range=[r1, r2, r3]. 
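Putting the wrapper and ranges together, a hedged end-to-end sketch (it assumes KNNRegressor is the model provided by NearestNeighborModels.jl and uses synthetic data purely for illustration):
using MLJ
KNNRegressor = @load KNNRegressor pkg=NearestNeighborModels
X, y = make_regression(100, 3)
knn = KNNRegressor()
r = range(knn, :K, lower=1, upper=20, scale=:log)
tuned_knn = TunedModel(knn; tuning=Grid(resolution=6), resampling=CV(nfolds=5), measure=rms, range=r)
mach = machine(tuned_knn, X, y) |> fit!
fitted_params(mach).best_model  # model with the best K found by the grid search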
For more range options do ?Grid or ?RandomSearch","category":"page"},{"location":"mlj_cheatsheet/#Tuning-strategies","page":"MLJ Cheatsheet","title":"Tuning strategies","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"RandomSearch(rng=1234) for basic random search","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Grid(resolution=10) or Grid(goal=50) for basic grid search","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Also available: LatinHyperCube, Explicit (built-in), MLJTreeParzenTuning, ParticleSwarm, AdaptiveParticleSwarm (3rd-party packages)","category":"page"},{"location":"mlj_cheatsheet/#Learning-curves","page":"MLJ Cheatsheet","title":"Learning curves","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"For generating a plot of performance against parameter specified by range:","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"curve = learning_curve(mach, resolution=30, resampling=Holdout(), measure=…, range=…, n=1)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"curve = learning_curve(model, X, y, resolution=30, resampling=Holdout(), measure=…, range=…, n=1)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"If using Plots.jl:","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"plot(curve.parameter_values, curve.measurements, xlab=curve.parameter_name, xscale=curve.parameter_scale)","category":"page"},{"location":"mlj_cheatsheet/#Controlling-iterative-models","page":"MLJ Cheatsheet","title":"Controlling iterative models","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Requires: using MLJIteration","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"iterated_model = IteratedModel(model=…, resampling=Holdout(), measure=…, controls=…, retrain=false)","category":"page"},{"location":"mlj_cheatsheet/#Controls","page":"MLJ Cheatsheet","title":"Controls","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Increment training: Step(n=1)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Stopping: TimeLimit(t=0.5) (in hours), NumberLimit(n=100), NumberSinceBest(n=6), NotANumber(), Threshold(value=0.0), GL(alpha=2.0), PQ(alpha=0.75, k=5), Patience(n=5)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Logging: Info(f=identity), Warn(f=\"\"), Error(predicate, f=\"\")","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Callbacks: Callback(f=mach->nothing), WithNumberDo(f=n->@info(n)), WithIterationsDo(f=i->@info(\"num iterations: $i\")), WithLossDo(f=x->@info(\"loss: $x\")), WithTrainingLossesDo(f=v->@info(v))","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Snapshots: Save(filename=\"machine.jlso\")","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Wraps: 
MLJIteration.skip(control, predicate=1), IterationControl.with_state_do(control)","category":"page"},{"location":"mlj_cheatsheet/#Performance-measures-(metrics)","page":"MLJ Cheatsheet","title":"Performance measures (metrics)","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Do measures() to get full list.","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Do measures(\"log\") to list measures with \"log\" in doc-string.","category":"page"},{"location":"mlj_cheatsheet/#Transformers","page":"MLJ Cheatsheet","title":"Transformers","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Built-ins include: Standardizer, OneHotEncoder, UnivariateBoxCoxTransformer, FeatureSelector, FillImputer, UnivariateDiscretizer, ContinuousEncoder, UnivariateTimeTypeToContinuous","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Externals include: PCA (in MultivariateStats), KMeans, KMedoids (in Clustering).","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"models(m -> !m.is_supervised) to get full list","category":"page"},{"location":"mlj_cheatsheet/#Ensemble-model-wrapper","page":"MLJ Cheatsheet","title":"Ensemble model wrapper","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"EnsembleModel(model; weights=Float64[], bagging_fraction=0.8, rng=GLOBAL_RNG, n=100, parallel=true, out_of_bag_measure=[])","category":"page"},{"location":"mlj_cheatsheet/#Target-transformation-wrapper","page":"MLJ Cheatsheet","title":"Target transformation wrapper","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"TransformedTargetModel(model; target=Standardizer())","category":"page"},{"location":"mlj_cheatsheet/#Pipelines","page":"MLJ Cheatsheet","title":"Pipelines","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"pipe = (X -> coerce(X, :height=>Continuous)) |> OneHotEncoder |> KNNRegressor(K=3)","category":"page"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"Unsupervised:\npipe = Standardizer |> OneHotEncoder\nConcatenation:\npipe1 |> pipe2 or model |> pipe or pipe |> model, etc.","category":"page"},{"location":"mlj_cheatsheet/#Advanced-model-composition-techniques","page":"MLJ Cheatsheet","title":"Advanced model composition techniques","text":"","category":"section"},{"location":"mlj_cheatsheet/","page":"MLJ Cheatsheet","title":"MLJ Cheatsheet","text":"See the Composing Models section of the MLJ manual.","category":"page"},{"location":"models/ExtraTreesClassifier_MLJScikitLearnInterface/#ExtraTreesClassifier_MLJScikitLearnInterface","page":"ExtraTreesClassifier","title":"ExtraTreesClassifier","text":"","category":"section"},{"location":"models/ExtraTreesClassifier_MLJScikitLearnInterface/","page":"ExtraTreesClassifier","title":"ExtraTreesClassifier","text":"ExtraTreesClassifier","category":"page"},{"location":"models/ExtraTreesClassifier_MLJScikitLearnInterface/","page":"ExtraTreesClassifier","title":"ExtraTreesClassifier","text":"A model type for constructing a extra trees classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model 
interface.","category":"page"},{"location":"models/ExtraTreesClassifier_MLJScikitLearnInterface/","page":"ExtraTreesClassifier","title":"ExtraTreesClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/ExtraTreesClassifier_MLJScikitLearnInterface/","page":"ExtraTreesClassifier","title":"ExtraTreesClassifier","text":"ExtraTreesClassifier = @load ExtraTreesClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/ExtraTreesClassifier_MLJScikitLearnInterface/","page":"ExtraTreesClassifier","title":"ExtraTreesClassifier","text":"Do model = ExtraTreesClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ExtraTreesClassifier(n_estimators=...).","category":"page"},{"location":"models/ExtraTreesClassifier_MLJScikitLearnInterface/","page":"ExtraTreesClassifier","title":"ExtraTreesClassifier","text":"Extra trees classifier, fits a number of randomized decision trees on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting.","category":"page"},{"location":"models/SGDRegressor_MLJScikitLearnInterface/#SGDRegressor_MLJScikitLearnInterface","page":"SGDRegressor","title":"SGDRegressor","text":"","category":"section"},{"location":"models/SGDRegressor_MLJScikitLearnInterface/","page":"SGDRegressor","title":"SGDRegressor","text":"SGDRegressor","category":"page"},{"location":"models/SGDRegressor_MLJScikitLearnInterface/","page":"SGDRegressor","title":"SGDRegressor","text":"A model type for constructing a stochastic gradient descent-based regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/SGDRegressor_MLJScikitLearnInterface/","page":"SGDRegressor","title":"SGDRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/SGDRegressor_MLJScikitLearnInterface/","page":"SGDRegressor","title":"SGDRegressor","text":"SGDRegressor = @load SGDRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/SGDRegressor_MLJScikitLearnInterface/","page":"SGDRegressor","title":"SGDRegressor","text":"Do model = SGDRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in SGDRegressor(loss=...).","category":"page"},{"location":"models/SGDRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"SGDRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/SGDRegressor_MLJScikitLearnInterface/","page":"SGDRegressor","title":"SGDRegressor","text":"loss = squared_error\npenalty = l2\nalpha = 0.0001\nl1_ratio = 0.15\nfit_intercept = true\nmax_iter = 1000\ntol = 0.001\nshuffle = true\nverbose = 0\nepsilon = 0.1\nrandom_state = nothing\nlearning_rate = invscaling\neta0 = 0.01\npower_t = 0.25\nearly_stopping = false\nvalidation_fraction = 0.1\nn_iter_no_change = 5\nwarm_start = false\naverage = false","category":"page"},{"location":"models/LassoCVRegressor_MLJScikitLearnInterface/#LassoCVRegressor_MLJScikitLearnInterface","page":"LassoCVRegressor","title":"LassoCVRegressor","text":"","category":"section"},{"location":"models/LassoCVRegressor_MLJScikitLearnInterface/","page":"LassoCVRegressor","title":"LassoCVRegressor","text":"LassoCVRegressor","category":"page"},{"location":"models/LassoCVRegressor_MLJScikitLearnInterface/","page":"LassoCVRegressor","title":"LassoCVRegressor","text":"A model type for constructing a lasso regressor with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LassoCVRegressor_MLJScikitLearnInterface/","page":"LassoCVRegressor","title":"LassoCVRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LassoCVRegressor_MLJScikitLearnInterface/","page":"LassoCVRegressor","title":"LassoCVRegressor","text":"LassoCVRegressor = @load LassoCVRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/LassoCVRegressor_MLJScikitLearnInterface/","page":"LassoCVRegressor","title":"LassoCVRegressor","text":"Do model = LassoCVRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in LassoCVRegressor(eps=...).","category":"page"},{"location":"models/LassoCVRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"LassoCVRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/LassoCVRegressor_MLJScikitLearnInterface/","page":"LassoCVRegressor","title":"LassoCVRegressor","text":"eps = 0.001\nn_alphas = 100\nalphas = nothing\nfit_intercept = true\nprecompute = auto\nmax_iter = 1000\ntol = 0.0001\ncopy_X = true\ncv = 5\nverbose = false\nn_jobs = nothing\npositive = false\nrandom_state = nothing\nselection = cyclic","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/#BorderlineSMOTE1_Imbalance","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"","category":"section"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"Initiate a BorderlineSMOTE1 model with the given hyper-parameters.","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"BorderlineSMOTE1","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"A model type for constructing a BorderlineSMOTE1 model, based on Imbalance.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"BorderlineSMOTE1 = @load BorderlineSMOTE1 pkg=Imbalance","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"Do model = BorderlineSMOTE1() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BorderlineSMOTE1(m=...).","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"BorderlineSMOTE1 implements the BorderlineSMOTE1 algorithm to correct for class imbalance as in Han, H., Wang, W.-Y., & Mao, B.-H. (2005). Borderline-SMOTE: A new over-sampling method in imbalanced data sets learning. In D.S. Huang, X.-P. Zhang, & G.-B. Huang (Eds.), Advances in Intelligent Computing (pp. 878-887). Springer. 
","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/#Training-data","page":"BorderlineSMOTE1","title":"Training data","text":"","category":"section"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"In MLJ or MLJBase, wrap the model in a machine by","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"mach = machine(model)","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"There is no need to provide any data here because the model is a static transformer.","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"Likewise, there is no need to fit!(mach).","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"For default values of the hyper-parameters, model can be constructed by","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"model = BorderlineSMOTE1()","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/#Hyperparameters","page":"BorderlineSMOTE1","title":"Hyperparameters","text":"","category":"section"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"m::Integer=5: The number of neighbors to consider while checking the BorderlineSMOTE1 condition. Should be within the range 0 < m < N where N is the number of observations in the data. It will be automatically set to N-1 if N ≤ m.\nk::Integer=5: Number of nearest neighbors to consider in the SMOTE part of the algorithm. Should be within the range 0 < k < n where n is the number of observations in the smallest class. It will be automatically set to l-1 for any class with l points where l ≤ k.\nratios=1.0: A parameter that controls the amount of oversampling to be done for each class\nCan be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class\nCan be a dictionary mapping each class label to the float ratio for that class\nrng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister`.\nverbosity::Integer=1: Whenever higher than 0 info regarding the points that will participate in oversampling is logged.","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/#Transform-Inputs","page":"BorderlineSMOTE1","title":"Transform Inputs","text":"","category":"section"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"X: A matrix or table of floats where each row is an observation from the dataset\ny: An abstract vector of labels (e.g., strings) that correspond to the observations in X","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/#Transform-Outputs","page":"BorderlineSMOTE1","title":"Transform Outputs","text":"","category":"section"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"Xover: A matrix or table that includes original data and the new observations due to oversampling. 
depending on whether the input X is a matrix or table respectively\nyover: An abstract vector of labels corresponding to Xover","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/#Operations","page":"BorderlineSMOTE1","title":"Operations","text":"","category":"section"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"transform(mach, X, y): resample the data X and y using BorderlineSMOTE1, returning both the new and original observations","category":"page"},{"location":"models/BorderlineSMOTE1_Imbalance/#Example","page":"BorderlineSMOTE1","title":"Example","text":"","category":"section"},{"location":"models/BorderlineSMOTE1_Imbalance/","page":"BorderlineSMOTE1","title":"BorderlineSMOTE1","text":"using MLJ\nimport Imbalance\n\n## set probability of each class\nclass_probs = [0.5, 0.2, 0.3] \nnum_rows, num_continuous_feats = 1000, 5\n## generate a table and categorical vector accordingly\nX, y = Imbalance.generate_imbalanced_data(num_rows, num_continuous_feats; \n stds=[0.1 0.1 0.1], min_sep=0.01, class_probs, rng=42) \n\njulia> Imbalance.checkbalance(y)\n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 200 (40.8%) \n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 310 (63.3%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 490 (100.0%) \n\n## load BorderlineSMOTE1\nBorderlineSMOTE1 = @load BorderlineSMOTE1 pkg=Imbalance\n\n## wrap the model in a machine\noversampler = BorderlineSMOTE1(m=3, k=5, ratios=Dict(0=>1.0, 1=> 0.9, 2=>0.8), rng=42)\nmach = machine(oversampler)\n\n## provide the data to transform (there is nothing to fit)\nXover, yover = transform(mach, X, y)\n\n\njulia> Imbalance.checkbalance(yover)\n2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 392 (80.0%) \n1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 441 (90.0%) \n0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 490 (100.0%) ","category":"page"},{"location":"models/DecisionTreeClassifier_BetaML/#DecisionTreeClassifier_BetaML","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"","category":"section"},{"location":"models/DecisionTreeClassifier_BetaML/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"mutable struct DecisionTreeClassifier <: MLJModelInterface.Probabilistic","category":"page"},{"location":"models/DecisionTreeClassifier_BetaML/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"A simple Decision Tree model for classification with support for Missing data, from the Beta Machine Learning Toolkit (BetaML).","category":"page"},{"location":"models/DecisionTreeClassifier_BetaML/#Hyperparameters:","page":"DecisionTreeClassifier","title":"Hyperparameters:","text":"","category":"section"},{"location":"models/DecisionTreeClassifier_BetaML/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"max_depth::Int64: The maximum depth the tree is allowed to reach. When this is reached the node is forced to become a leaf [def: 0, i.e. no limits]\nmin_gain::Float64: The minimum information gain to allow for a node's partition [def: 0]\nmin_records::Int64: The minimum number of records a node must holds to consider for a partition of it [def: 2]\nmax_features::Int64: The maximum number of (random) features to consider at each partitioning [def: 0, i.e. look at all features]\nsplitting_criterion::Function: This is the name of the function to be used to compute the information gain of a specific partition. 
This is done by measuring the difference between the \"impurity\" of the labels of the parent node and those of the two child nodes, weighted by the respective number of items. [def: gini]. Either gini, entropy or a custom function. It can also be an anonymous function.\nrng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]","category":"page"},{"location":"models/DecisionTreeClassifier_BetaML/#Example:","page":"DecisionTreeClassifier","title":"Example:","text":"","category":"section"},{"location":"models/DecisionTreeClassifier_BetaML/","page":"DecisionTreeClassifier","title":"DecisionTreeClassifier","text":"julia> using MLJ\n\njulia> X, y = @load_iris;\n\njulia> modelType = @load DecisionTreeClassifier pkg = \"BetaML\" verbosity=0\nBetaML.Trees.DecisionTreeClassifier\n\njulia> model = modelType()\nDecisionTreeClassifier(\n max_depth = 0, \n min_gain = 0.0, \n min_records = 2, \n max_features = 0, \n splitting_criterion = BetaML.Utils.gini, \n rng = Random._GLOBAL_RNG())\n\njulia> mach = machine(model, X, y);\n\njulia> fit!(mach);\n[ Info: Training machine(DecisionTreeClassifier(max_depth = 0, …), …).\n\njulia> cat_est = predict(mach, X)\n150-element CategoricalDistributions.UnivariateFiniteVector{Multiclass{3}, String, UInt32, Float64}:\n UnivariateFinite{Multiclass{3}}(setosa=>1.0, versicolor=>0.0, virginica=>0.0)\n UnivariateFinite{Multiclass{3}}(setosa=>1.0, versicolor=>0.0, virginica=>0.0)\n ⋮\n UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.0, virginica=>1.0)\n UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.0, virginica=>1.0)\n UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.0, virginica=>1.0)","category":"page"},{"location":"loading_model_code/#Loading-Model-Code","page":"Loading Model Code","title":"Loading Model Code","text":"","category":"section"},{"location":"loading_model_code/","page":"Loading Model Code","title":"Loading Model Code","text":"Once the name of a model, and the package providing that model, have been identified (see Model Search) one can either import the model type interactively with @iload, as shown under Installation, or use @load as shown below. The @load macro works from within a module, a package or a function, provided the relevant package providing the MLJ interface has been added to your package environment. It will attempt to load the model type into the global namespace of the module in which @load is invoked (Main if invoked at the REPL).","category":"page"},{"location":"loading_model_code/","page":"Loading Model Code","title":"Loading Model Code","text":"In general, the code providing core functionality for the model (living in a package you should consult for documentation) may be different from the package providing the MLJ interface. 
Since the core package is a dependency of the interface package, only the interface package needs to be added to your environment.","category":"page"},{"location":"loading_model_code/","page":"Loading Model Code","title":"Loading Model Code","text":"For instance, suppose you have activated a Julia package environment my_env that you wish to use for your MLJ project; for example, you have run:","category":"page"},{"location":"loading_model_code/","page":"Loading Model Code","title":"Loading Model Code","text":"using Pkg\nPkg.activate(\"my_env\", shared=true)","category":"page"},{"location":"loading_model_code/","page":"Loading Model Code","title":"Loading Model Code","text":"Furthermore, suppose you want to use DecisionTreeClassifier, provided by the DecisionTree.jl package. Then, to determine which package provides the MLJ interface you call load_path:","category":"page"},{"location":"loading_model_code/","page":"Loading Model Code","title":"Loading Model Code","text":"julia> load_path(\"DecisionTreeClassifier\", pkg=\"DecisionTree\")\n\"MLJDecisionTreeInterface.DecisionTreeClassifier\"","category":"page"},{"location":"loading_model_code/","page":"Loading Model Code","title":"Loading Model Code","text":"In this case, we see that the package required is MLJDecisionTreeInterface.jl. If this package is not in my_env (do Pkg.status() to check) you can add it by running","category":"page"},{"location":"loading_model_code/","page":"Loading Model Code","title":"Loading Model Code","text":"julia> Pkg.add(\"MLJDecisionTreeInterface\")","category":"page"},{"location":"loading_model_code/","page":"Loading Model Code","title":"Loading Model Code","text":"So long as my_env is the active environment, this action need never be repeated (unless you run Pkg.rm(\"MLJDecisionTreeInterface\")). You are now ready to instantiate a decision tree classifier:","category":"page"},{"location":"loading_model_code/","page":"Loading Model Code","title":"Loading Model Code","text":"julia> Tree = @load DecisionTreeClassifier pkg=DecisionTree\njulia> tree = Tree()","category":"page"},{"location":"loading_model_code/","page":"Loading Model Code","title":"Loading Model Code","text":"which is equivalent to","category":"page"},{"location":"loading_model_code/","page":"Loading Model Code","title":"Loading Model Code","text":"julia> import MLJDecisionTreeInterface.DecisionTreeClassifier\njulia> Tree = MLJDecisionTreeInterface.DecisionTreeClassifier\njulia> tree = Tree()","category":"page"},{"location":"loading_model_code/","page":"Loading Model Code","title":"Loading Model Code","text":"Tip. The specification pkg=... above can be dropped for the many models that are provided by only a single package.","category":"page"},{"location":"loading_model_code/#API","page":"Loading Model Code","title":"API","text":"","category":"section"},{"location":"loading_model_code/","page":"Loading Model Code","title":"Loading Model Code","text":"load_path\n@load\n@iload","category":"page"},{"location":"loading_model_code/#StatisticalTraits.load_path","page":"Loading Model Code","title":"StatisticalTraits.load_path","text":"load_path(model_name::String, pkg=nothing)\n\nReturn the load path for the model type with name model_name, specifying the algorithm-providing package name pkg to resolve name conflicts, if necessary.\n\nload_path(proxy::NamedTuple)\n\nReturn the load path for the model whose name is proxy.name and whose algorithm-providing package has name proxy.package_name. 
For example, proxy could be any element of the vector returned by models().\n\nload_path(model)\n\nReturn the load path of a model instance or type. Usually requires necessary model code to have been separately loaded. Supply strings as above if code is not loaded.\n\n\n\n\n\n","category":"function"},{"location":"loading_model_code/#MLJModels.@load","page":"Loading Model Code","title":"MLJModels.@load","text":"@load ModelName pkg=nothing verbosity=0 add=false\n\nImport the model type named in the first argument into the calling module, specifying pkg in the case of an ambiguous name (more than one package providing a model type with the same name). Returns the model type.\n\nWarning In older versions of MLJ/MLJModels, @load returned an instance instead.\n\nTo automatically add required interface packages to the current environment, specify add=true. For interactive loading, use @iload instead.\n\nExamples\n\nTree = @load DecisionTreeRegressor\ntree = Tree()\ntree2 = Tree(min_samples_split=6)\n\nSVM = @load SVC pkg=LIBSVM\nsvm = SVM()\n\nSee also @iload\n\n\n\n\n\n","category":"macro"},{"location":"loading_model_code/#MLJModels.@iload","page":"Loading Model Code","title":"MLJModels.@iload","text":"@iload ModelName\n\nInteractive alternative to @load. Provides the user with an option to install (add) the required interface package to the current environment, and to choose the relevant model-providing package in ambiguous cases. See @load\n\n\n\n\n\n","category":"macro"},{"location":"models/MCDDetector_OutlierDetectionPython/#MCDDetector_OutlierDetectionPython","page":"MCDDetector","title":"MCDDetector","text":"","category":"section"},{"location":"models/MCDDetector_OutlierDetectionPython/","page":"MCDDetector","title":"MCDDetector","text":"MCDDetector(store_precision = true,\n assume_centered = false,\n support_fraction = nothing,\n random_state = nothing)","category":"page"},{"location":"models/MCDDetector_OutlierDetectionPython/","page":"MCDDetector","title":"MCDDetector","text":"https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.mcd","category":"page"},{"location":"models/OneClassSVM_LIBSVM/#OneClassSVM_LIBSVM","page":"OneClassSVM","title":"OneClassSVM","text":"","category":"section"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"OneClassSVM","category":"page"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"A model type for constructing a one-class support vector machine, based on LIBSVM.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"OneClassSVM = @load OneClassSVM pkg=LIBSVM","category":"page"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"Do model = OneClassSVM() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in OneClassSVM(kernel=...).","category":"page"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): \"LIBSVM: a library for support vector machines.\" ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. 
Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf. ","category":"page"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"This model is an outlier detection model delivering raw scores based on the decision function of a support vector machine. Like the NuSVC classifier, it uses the nu re-parameterization of the cost parameter appearing in standard support vector classification SVC.","category":"page"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"To extract normalized scores (\"probabilities\") wrap the model using ProbabilisticDetector from OutlierDetection.jl. For threshold-based classification, wrap the probabilistic model using MLJ's BinaryThresholdPredictor. Examples of wrapping appear below.","category":"page"},{"location":"models/OneClassSVM_LIBSVM/#Training-data","page":"OneClassSVM","title":"Training data","text":"","category":"section"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"In MLJ or MLJBase, bind an instance model to data with:","category":"page"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"where","category":"page"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)","category":"page"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/OneClassSVM_LIBSVM/#Hyper-parameters","page":"OneClassSVM","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see \"Examples\" below).\nLIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2\nLIBSVM.Kernel.Polynomial: (x1, x2) -> gamma*x1'*x2 + coef0)^degree\nLIBSVM.Kernel.RadialBasis: (x1, x2) -> (exp(-gamma*norm(x1 - x2)^2))\nLIBSVM.Kernel.Sigmoid: (x1, x2) - > tanh(gamma*x1'*x2 + coef0)\nHere gamma, coef0, degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions. See LIVSVM.jl issue91\ngamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. Actual value used appears in the report (see below).\ncoef0 = 0.0: kernel parameter (see above)\ndegree::Int32 = Int32(3): degree in polynomial kernel (see above)\nnu=0.5 (range (0, 1]): An upper bound on the fraction of margin errors and a lower bound of the fraction of support vectors. Denoted ν in the cited paper. 
Changing nu changes the thickness of the margin (a neighborhood of the decision surface) and a margin error is said to have occurred if a training observation lies on the wrong side of the surface or within the margin.\ncachesize=200.0 cache memory size in MB\ntolerance=0.001: tolerance for the stopping criterion\nshrinking=true: whether to use shrinking heuristics","category":"page"},{"location":"models/OneClassSVM_LIBSVM/#Operations","page":"OneClassSVM","title":"Operations","text":"","category":"section"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"transform(mach, Xnew): return scores for outlierness, given features Xnew having the same scitype as X above. The greater the score, the more likely it is an outlier. This score is based on the SVM decision function. For normalized scores, wrap model using ProbabilisticDetector from OutlierDetection.jl and call predict instead, and for threshold-based classification, wrap again using BinaryThresholdPredictor. See the examples below.","category":"page"},{"location":"models/OneClassSVM_LIBSVM/#Fitted-parameters","page":"OneClassSVM","title":"Fitted parameters","text":"","category":"section"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"libsvm_model: the trained model object created by the LIBSVM.jl package\norientation: this equals 1 if the decision function for libsvm_model is increasing with increasing outlierness, and -1 if it is decreasing instead. Correspondingly, the libsvm_model attaches true to outliers in the first case, and false in the second. (The scores given in the MLJ report and generated by MLJ.transform already correct for this ambiguity, which is therefore only an issue for users directly accessing libsvm_model.)","category":"page"},{"location":"models/OneClassSVM_LIBSVM/#Report","page":"OneClassSVM","title":"Report","text":"","category":"section"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"The fields of report(mach) are:","category":"page"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"gamma: actual value of the kernel parameter gamma used in training","category":"page"},{"location":"models/OneClassSVM_LIBSVM/#Examples","page":"OneClassSVM","title":"Examples","text":"","category":"section"},{"location":"models/OneClassSVM_LIBSVM/#Generating-raw-scores-for-outlierness","page":"OneClassSVM","title":"Generating raw scores for outlierness","text":"","category":"section"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"using MLJ\nimport LIBSVM\nimport StableRNGs.StableRNG\n\nOneClassSVM = @load OneClassSVM pkg=LIBSVM ## model type\nmodel = OneClassSVM(kernel=LIBSVM.Kernel.Polynomial) ## instance\n\nrng = StableRNG(123)\nXmatrix = randn(rng, 5, 3)\nXmatrix[1, 1] = 100.0\nX = MLJ.table(Xmatrix)\n\nmach = machine(model, X) |> fit!\n\n## training scores (outliers have larger scores):\njulia> report(mach).scores\n5-element Vector{Float64}:\n 6.711689156091755e-7\n -6.740101976655081e-7\n -6.711632439648446e-7\n -6.743015858874887e-7\n -6.745393717880104e-7\n\n## scores for new data:\nXnew = MLJ.table(rand(rng, 2, 3))\n\njulia> transform(mach, rand(rng, 2, 3))\n2-element Vector{Float64}:\n -6.746293022511047e-7\n 
-6.744289265348623e-7","category":"page"},{"location":"models/OneClassSVM_LIBSVM/#Generating-probabilistic-predictions-of-outlierness","page":"OneClassSVM","title":"Generating probabilistic predictions of outlierness","text":"","category":"section"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"Continuing the previous example:","category":"page"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"using OutlierDetection\npmodel = ProbabilisticDetector(model)\npmach = machine(pmodel, X) |> fit!\n\n## probabilistic predictions on new data:\n\njulia> y_prob = predict(pmach, Xnew)\n2-element UnivariateFiniteVector{OrderedFactor{2}, String, UInt8, Float64}:\n UnivariateFinite{OrderedFactor{2}}(normal=>1.0, outlier=>9.57e-5)\n UnivariateFinite{OrderedFactor{2}}(normal=>1.0, outlier=>0.0)\n\n## probabilities for outlierness:\n\njulia> pdf.(y_prob, \"outlier\")\n2-element Vector{Float64}:\n 9.572583265925801e-5\n 0.0\n\n## raw scores are still available using `transform`:\n\njulia> transform(pmach, Xnew)\n2-element Vector{Float64}:\n 9.572583265925801e-5\n 0.0","category":"page"},{"location":"models/OneClassSVM_LIBSVM/#Outlier-classification-using-a-probability-threshold:","page":"OneClassSVM","title":"Outlier classification using a probability threshold:","text":"","category":"section"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"Continuing the previous example:","category":"page"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"dmodel = BinaryThresholdPredictor(pmodel, threshold=0.9)\ndmach = machine(dmodel, X) |> fit!\n\njulia> yhat = predict(dmach, Xnew)\n2-element CategoricalArrays.CategoricalArray{String,1,UInt8}:\n \"normal\"\n \"normal\"","category":"page"},{"location":"models/OneClassSVM_LIBSVM/#User-defined-kernels","page":"OneClassSVM","title":"User-defined kernels","text":"","category":"section"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"Continuing the first example:","category":"page"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"k(x1, x2) = x1'*x2 ## equivalent to `LIBSVM.Kernel.Linear`\nmodel = OneClassSVM(kernel=k)\nmach = machine(model, X) |> fit!\n\njulia> yhat = transform(mach, Xnew)\n2-element Vector{Float64}:\n -0.4825363352732942\n -0.4848772169720227","category":"page"},{"location":"models/OneClassSVM_LIBSVM/","page":"OneClassSVM","title":"OneClassSVM","text":"See also LIVSVM.jl and the original C implementation documentation. 
For an alternative source of outlier detection models with an MLJ interface, see OutlierDetection.jl.","category":"page"},{"location":"models/KNeighborsClassifier_MLJScikitLearnInterface/#KNeighborsClassifier_MLJScikitLearnInterface","page":"KNeighborsClassifier","title":"KNeighborsClassifier","text":"","category":"section"},{"location":"models/KNeighborsClassifier_MLJScikitLearnInterface/","page":"KNeighborsClassifier","title":"KNeighborsClassifier","text":"KNeighborsClassifier","category":"page"},{"location":"models/KNeighborsClassifier_MLJScikitLearnInterface/","page":"KNeighborsClassifier","title":"KNeighborsClassifier","text":"A model type for constructing a K-nearest neighbors classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/KNeighborsClassifier_MLJScikitLearnInterface/","page":"KNeighborsClassifier","title":"KNeighborsClassifier","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/KNeighborsClassifier_MLJScikitLearnInterface/","page":"KNeighborsClassifier","title":"KNeighborsClassifier","text":"KNeighborsClassifier = @load KNeighborsClassifier pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/KNeighborsClassifier_MLJScikitLearnInterface/","page":"KNeighborsClassifier","title":"KNeighborsClassifier","text":"Do model = KNeighborsClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KNeighborsClassifier(n_neighbors=...).","category":"page"},{"location":"models/KNeighborsClassifier_MLJScikitLearnInterface/#Hyper-parameters","page":"KNeighborsClassifier","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/KNeighborsClassifier_MLJScikitLearnInterface/","page":"KNeighborsClassifier","title":"KNeighborsClassifier","text":"n_neighbors = 5\nweights = uniform\nalgorithm = auto\nleaf_size = 30\np = 2\nmetric = minkowski\nmetric_params = nothing\nn_jobs = nothing","category":"page"},{"location":"models/BayesianLDA_MLJScikitLearnInterface/#BayesianLDA_MLJScikitLearnInterface","page":"BayesianLDA","title":"BayesianLDA","text":"","category":"section"},{"location":"models/BayesianLDA_MLJScikitLearnInterface/","page":"BayesianLDA","title":"BayesianLDA","text":"BayesianLDA","category":"page"},{"location":"models/BayesianLDA_MLJScikitLearnInterface/","page":"BayesianLDA","title":"BayesianLDA","text":"A model type for constructing a Bayesian linear discriminant analysis, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/BayesianLDA_MLJScikitLearnInterface/","page":"BayesianLDA","title":"BayesianLDA","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/BayesianLDA_MLJScikitLearnInterface/","page":"BayesianLDA","title":"BayesianLDA","text":"BayesianLDA = @load BayesianLDA pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/BayesianLDA_MLJScikitLearnInterface/","page":"BayesianLDA","title":"BayesianLDA","text":"Do model = BayesianLDA() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in BayesianLDA(solver=...).","category":"page"},{"location":"models/BayesianLDA_MLJScikitLearnInterface/#Hyper-parameters","page":"BayesianLDA","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/BayesianLDA_MLJScikitLearnInterface/","page":"BayesianLDA","title":"BayesianLDA","text":"solver = svd\nshrinkage = nothing\npriors = nothing\nn_components = nothing\nstore_covariance = false\ntol = 0.0001\ncovariance_estimator = nothing","category":"page"},{"location":"models/LassoLarsCVRegressor_MLJScikitLearnInterface/#LassoLarsCVRegressor_MLJScikitLearnInterface","page":"LassoLarsCVRegressor","title":"LassoLarsCVRegressor","text":"","category":"section"},{"location":"models/LassoLarsCVRegressor_MLJScikitLearnInterface/","page":"LassoLarsCVRegressor","title":"LassoLarsCVRegressor","text":"LassoLarsCVRegressor","category":"page"},{"location":"models/LassoLarsCVRegressor_MLJScikitLearnInterface/","page":"LassoLarsCVRegressor","title":"LassoLarsCVRegressor","text":"A model type for constructing a Lasso model fit with least angle regression (LARS) with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LassoLarsCVRegressor_MLJScikitLearnInterface/","page":"LassoLarsCVRegressor","title":"LassoLarsCVRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LassoLarsCVRegressor_MLJScikitLearnInterface/","page":"LassoLarsCVRegressor","title":"LassoLarsCVRegressor","text":"LassoLarsCVRegressor = @load LassoLarsCVRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/LassoLarsCVRegressor_MLJScikitLearnInterface/","page":"LassoLarsCVRegressor","title":"LassoLarsCVRegressor","text":"Do model = LassoLarsCVRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in LassoLarsCVRegressor(fit_intercept=...).","category":"page"},{"location":"models/LassoLarsCVRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"LassoLarsCVRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/LassoLarsCVRegressor_MLJScikitLearnInterface/","page":"LassoLarsCVRegressor","title":"LassoLarsCVRegressor","text":"fit_intercept = true\nverbose = false\nmax_iter = 500\nprecompute = auto\ncv = 5\nmax_n_alphas = 1000\nn_jobs = nothing\neps = 2.220446049250313e-16\ncopy_X = true\npositive = false","category":"page"},{"location":"models/OrthogonalMatchingPursuitCVRegressor_MLJScikitLearnInterface/#OrthogonalMatchingPursuitCVRegressor_MLJScikitLearnInterface","page":"OrthogonalMatchingPursuitCVRegressor","title":"OrthogonalMatchingPursuitCVRegressor","text":"","category":"section"},{"location":"models/OrthogonalMatchingPursuitCVRegressor_MLJScikitLearnInterface/","page":"OrthogonalMatchingPursuitCVRegressor","title":"OrthogonalMatchingPursuitCVRegressor","text":"OrthogonalMatchingPursuitCVRegressor","category":"page"},{"location":"models/OrthogonalMatchingPursuitCVRegressor_MLJScikitLearnInterface/","page":"OrthogonalMatchingPursuitCVRegressor","title":"OrthogonalMatchingPursuitCVRegressor","text":"A model type for constructing an orthogonal matching pursuit (OMP) model with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/OrthogonalMatchingPursuitCVRegressor_MLJScikitLearnInterface/","page":"OrthogonalMatchingPursuitCVRegressor","title":"OrthogonalMatchingPursuitCVRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/OrthogonalMatchingPursuitCVRegressor_MLJScikitLearnInterface/","page":"OrthogonalMatchingPursuitCVRegressor","title":"OrthogonalMatchingPursuitCVRegressor","text":"OrthogonalMatchingPursuitCVRegressor = @load OrthogonalMatchingPursuitCVRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/OrthogonalMatchingPursuitCVRegressor_MLJScikitLearnInterface/","page":"OrthogonalMatchingPursuitCVRegressor","title":"OrthogonalMatchingPursuitCVRegressor","text":"Do model = OrthogonalMatchingPursuitCVRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in OrthogonalMatchingPursuitCVRegressor(copy=...).","category":"page"},{"location":"models/OrthogonalMatchingPursuitCVRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"OrthogonalMatchingPursuitCVRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/OrthogonalMatchingPursuitCVRegressor_MLJScikitLearnInterface/","page":"OrthogonalMatchingPursuitCVRegressor","title":"OrthogonalMatchingPursuitCVRegressor","text":"copy = true\nfit_intercept = true\nmax_iter = nothing\ncv = 5\nn_jobs = 1\nverbose = false","category":"page"},{"location":"models/KMeansClusterer_BetaML/#KMeansClusterer_BetaML","page":"KMeansClusterer","title":"KMeansClusterer","text":"","category":"section"},{"location":"models/KMeansClusterer_BetaML/","page":"KMeansClusterer","title":"KMeansClusterer","text":"mutable struct KMeansClusterer <: MLJModelInterface.Unsupervised","category":"page"},{"location":"models/KMeansClusterer_BetaML/","page":"KMeansClusterer","title":"KMeansClusterer","text":"The classical KMeansClusterer clustering algorithm, from the Beta Machine Learning Toolkit (BetaML).","category":"page"},{"location":"models/KMeansClusterer_BetaML/#Parameters:","page":"KMeansClusterer","title":"Parameters:","text":"","category":"section"},{"location":"models/KMeansClusterer_BetaML/","page":"KMeansClusterer","title":"KMeansClusterer","text":"n_classes::Int64: Number of classes to discriminate the data [def: 3]\ndist::Function: Function to employ as distance. Default to the Euclidean distance. Can be one of the predefined distances (l1_distance, l2_distance, l2squared_distance), cosine_distance), any user defined function accepting two vectors and returning a scalar or an anonymous function with the same characteristics. Attention that, contrary to KMedoidsClusterer, the KMeansClusterer algorithm is not guaranteed to converge with other distances than the Euclidean one.\ninitialisation_strategy::String: The computation method of the vector of the initial representatives. 
One of the following:\n\"random\": randomly in the X space\n\"grid\": using a grid approach\n\"shuffle\": selecting randomly within the available points [default]\n\"given\": using a provided set of initial representatives provided in the initial_representatives parameter\ninitial_representatives::Union{Nothing, Matrix{Float64}}: Provided (K x D) matrix of initial representatives (useful only with initialisation_strategy=\"given\") [default: nothing]\nrng::Random.AbstractRNG: Random Number Generator [deafult: Random.GLOBAL_RNG]","category":"page"},{"location":"models/KMeansClusterer_BetaML/#Notes:","page":"KMeansClusterer","title":"Notes:","text":"","category":"section"},{"location":"models/KMeansClusterer_BetaML/","page":"KMeansClusterer","title":"KMeansClusterer","text":"data must be numerical\nonline fitting (re-fitting with new data) is supported","category":"page"},{"location":"models/KMeansClusterer_BetaML/#Example:","page":"KMeansClusterer","title":"Example:","text":"","category":"section"},{"location":"models/KMeansClusterer_BetaML/","page":"KMeansClusterer","title":"KMeansClusterer","text":"julia> using MLJ\n\njulia> X, y = @load_iris;\n\njulia> modelType = @load KMeansClusterer pkg = \"BetaML\" verbosity=0\nBetaML.Clustering.KMeansClusterer\n\njulia> model = modelType()\nKMeansClusterer(\n n_classes = 3, \n dist = BetaML.Clustering.var\"#34#36\"(), \n initialisation_strategy = \"shuffle\", \n initial_representatives = nothing, \n rng = Random._GLOBAL_RNG())\n\njulia> mach = machine(model, X);\n\njulia> fit!(mach);\n[ Info: Training machine(KMeansClusterer(n_classes = 3, …), …).\n\njulia> classes_est = predict(mach, X);\n\njulia> hcat(y,classes_est)\n150×2 CategoricalArrays.CategoricalArray{Union{Int64, String},2,UInt32}:\n \"setosa\" 2\n \"setosa\" 2\n \"setosa\" 2\n ⋮ \n \"virginica\" 3\n \"virginica\" 3\n \"virginica\" 1","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/#LassoRegressor_MLJLinearModels","page":"LassoRegressor","title":"LassoRegressor","text":"","category":"section"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"LassoRegressor","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"A model type for constructing a lasso regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"LassoRegressor = @load LassoRegressor pkg=MLJLinearModels","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"Do model = LassoRegressor() to construct an instance with default hyper-parameters.","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"Lasso regression is a linear model with objective function","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"$","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"|Xθ - y|₂²/2 + n⋅λ|θ|₁ 
$","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"where n is the number of observations.","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"If scale_penalty_with_samples = false the objective function is","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"$","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"|Xθ - y|₂²/2 + λ|θ|₁ $","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":".","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"Different solver options exist, as indicated under \"Hyperparameters\" below. ","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/#Training-data","page":"LassoRegressor","title":"Training data","text":"","category":"section"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"where:","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)\ny is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/#Hyperparameters","page":"LassoRegressor","title":"Hyperparameters","text":"","category":"section"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"lambda::Real: strength of the L1 regularization. Default: 1.0\nfit_intercept::Bool: whether to fit the intercept or not. Default: true\npenalize_intercept::Bool: whether to penalize the intercept. Default: false\nscale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true\nsolver::Union{Nothing, MLJLinearModels.Solver}: any instance of MLJLinearModels.ProxGrad. If solver=nothing (default) then ProxGrad(accel=true) (FISTA) is used. Solver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...). 
Default: nothing","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/#Example","page":"LassoRegressor","title":"Example","text":"","category":"section"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"using MLJ\nX, y = make_regression()\nmach = fit!(machine(LassoRegressor(), X, y))\npredict(mach, X)\nfitted_params(mach)","category":"page"},{"location":"models/LassoRegressor_MLJLinearModels/","page":"LassoRegressor","title":"LassoRegressor","text":"See also ElasticNetRegressor.","category":"page"},{"location":"model_stacking/#Model-Stacking","page":"Model Stacking","title":"Model Stacking","text":"","category":"section"},{"location":"model_stacking/","page":"Model Stacking","title":"Model Stacking","text":"In a model stack, as introduced by Wolpert (1992), an adjudicating model learns the best way to combine the predictions of multiple base models. In MLJ, such models are constructed using the Stack constructor. To learn more about stacking and to see how to construct a stack \"by hand\" using Learning Networks, see this Data Science in Julia tutorial","category":"page"},{"location":"model_stacking/","page":"Model Stacking","title":"Model Stacking","text":"MLJBase.Stack","category":"page"},{"location":"model_stacking/#MLJBase.Stack","page":"Model Stacking","title":"MLJBase.Stack","text":"Stack(; metalearner=nothing, name1=model1, name2=model2, ..., keyword_options...)\n\nImplements the two-layer generalized stack algorithm introduced by Wolpert (1992) and generalized by Van der Laan et al (2007). Returns an instance of type ProbabilisticStack or DeterministicStack, depending on the prediction type of metalearner.\n\nWhen training a machine bound to such an instance:\n\nThe data is split into training/validation sets according to the specified resampling strategy.\nEach base model model1, model2, ... is trained on each training subset and outputs predictions on the corresponding validation sets. The multi-fold predictions are spliced together into a so-called out-of-sample prediction for each model.\nThe adjudicating model, metalearner, is subsequently trained on the out-of-sample predictions to learn the best combination of base model predictions.\nEach base model is retrained on all supplied data for purposes of passing on new production data onto the adjudicator for making new predictions\n\nArguments\n\nmetalearner::Supervised: The model that will optimize the desired criterion based on its internals. For instance, a LinearRegression model will optimize the squared error.\nresampling: The resampling strategy used to prepare out-of-sample predictions of the base learners.\nmeasures: A measure or iterable over measures, to perform an internal evaluation of the learners in the Stack while training. This is not for the evaluation of the Stack itself.\ncache: Whether machines created in the learning network will cache data or not.\nacceleration: A supported AbstractResource to define the training parallelization mode of the stack.\nname1=model1, name2=model2, ...: the Supervised model instances to be used as base learners. 
The provided names become properties of the instance created to allow hyper-parameter access\n\nExample\n\nThe following code defines a DeterministicStack instance for learning a Continuous target, and demonstrates that:\n\nBase models can be Probabilistic models even if the stack itself is Deterministic (predict_mean is applied in such cases).\nAs an alternative to hyperparameter optimization, one can stack multiple copies of given model, mutating the hyper-parameter used in each copy.\n\nusing MLJ\n\nDecisionTreeRegressor = @load DecisionTreeRegressor pkg=DecisionTree\nEvoTreeRegressor = @load EvoTreeRegressor\nXGBoostRegressor = @load XGBoostRegressor\nKNNRegressor = @load KNNRegressor pkg=NearestNeighborModels\nLinearRegressor = @load LinearRegressor pkg=MLJLinearModels\n\nX, y = make_regression(500, 5)\n\nstack = Stack(;metalearner=LinearRegressor(),\n resampling=CV(),\n measures=rmse,\n constant=ConstantRegressor(),\n tree_2=DecisionTreeRegressor(max_depth=2),\n tree_3=DecisionTreeRegressor(max_depth=3),\n evo=EvoTreeRegressor(),\n knn=KNNRegressor(),\n xgb=XGBoostRegressor())\n\nmach = machine(stack, X, y)\nevaluate!(mach; resampling=Holdout(), measure=rmse)\n\n\nThe internal evaluation report can be accessed like this and provides a PerformanceEvaluation object for each model:\n\nreport(mach).cv_report\n\n\n\n\n\n","category":"type"},{"location":"models/LGBMClassifier_LightGBM/#LGBMClassifier_LightGBM","page":"LGBMClassifier","title":"LGBMClassifier","text":"","category":"section"},{"location":"models/LGBMClassifier_LightGBM/","page":"LGBMClassifier","title":"LGBMClassifier","text":"Microsoft LightGBM FFI wrapper: Classifier","category":"page"},{"location":"models/Birch_MLJScikitLearnInterface/#Birch_MLJScikitLearnInterface","page":"Birch","title":"Birch","text":"","category":"section"},{"location":"models/Birch_MLJScikitLearnInterface/","page":"Birch","title":"Birch","text":"Birch","category":"page"},{"location":"models/Birch_MLJScikitLearnInterface/","page":"Birch","title":"Birch","text":"A model type for constructing a birch, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/Birch_MLJScikitLearnInterface/","page":"Birch","title":"Birch","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/Birch_MLJScikitLearnInterface/","page":"Birch","title":"Birch","text":"Birch = @load Birch pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/Birch_MLJScikitLearnInterface/","page":"Birch","title":"Birch","text":"Do model = Birch() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in Birch(threshold=...).","category":"page"},{"location":"models/Birch_MLJScikitLearnInterface/","page":"Birch","title":"Birch","text":"Memory-efficient, online-learning algorithm provided as an alternative to MiniBatchKMeans. 
Note: noisy samples are given the label -1.","category":"page"},{"location":"models/MultitargetGaussianMixtureRegressor_BetaML/#MultitargetGaussianMixtureRegressor_BetaML","page":"MultitargetGaussianMixtureRegressor","title":"MultitargetGaussianMixtureRegressor","text":"","category":"section"},{"location":"models/MultitargetGaussianMixtureRegressor_BetaML/","page":"MultitargetGaussianMixtureRegressor","title":"MultitargetGaussianMixtureRegressor","text":"mutable struct MultitargetGaussianMixtureRegressor <: MLJModelInterface.Deterministic","category":"page"},{"location":"models/MultitargetGaussianMixtureRegressor_BetaML/","page":"MultitargetGaussianMixtureRegressor","title":"MultitargetGaussianMixtureRegressor","text":"A non-linear regressor derived from fitting the data on a probabilistic model (Gaussian Mixture Model). Relatively fast but generally not very precise, except for data with a structure matching the chosen underlying mixture.","category":"page"},{"location":"models/MultitargetGaussianMixtureRegressor_BetaML/","page":"MultitargetGaussianMixtureRegressor","title":"MultitargetGaussianMixtureRegressor","text":"This is the multi-target version of the model. If you want to predict a single label (y), use the MLJ model GaussianMixtureRegressor.","category":"page"},{"location":"models/MultitargetGaussianMixtureRegressor_BetaML/#Hyperparameters:","page":"MultitargetGaussianMixtureRegressor","title":"Hyperparameters:","text":"","category":"section"},{"location":"models/MultitargetGaussianMixtureRegressor_BetaML/","page":"MultitargetGaussianMixtureRegressor","title":"MultitargetGaussianMixtureRegressor","text":"n_classes::Int64: Number of mixtures (latent classes) to consider [def: 3]\ninitial_probmixtures::Vector{Float64}: Initial probabilities of the categorical distribution (n_classes x 1) [default: []]\nmixtures::Union{Type, Vector{<:BetaML.GMM.AbstractMixture}}: An array (of length n_classes) of the mixtures to employ (see the [?GMM](@ref GMM) module). Each mixture object can be provided with or without its parameters (e.g. mean and variance for the gaussian ones). Fully qualified mixtures are useful only if the initialisation_strategy parameter is set to \"given\". This parameter can also be given simply in terms of a type; in this case it is automatically extended to a vector of n_classes mixtures of the specified type. Note that mixing of different mixture types is not currently supported. [def: [DiagonalGaussian() for i in 1:n_classes]]\ntol::Float64: Tolerance to stop the algorithm [default: 10^(-6)]\nminimum_variance::Float64: Minimum variance for the mixtures [default: 0.05]\nminimum_covariance::Float64: Minimum covariance for the mixtures with full covariance matrix [default: 0]. This should be set differently from minimum_variance (see notes).\ninitialisation_strategy::String: The computation method of the vector of the initial mixtures. One of the following:\n\"grid\": using a grid approach\n\"given\": using the mixture provided in the fully qualified mixtures parameter\n\"kmeans\": first use kmeans (itself initialised with a \"grid\" strategy) to set the initial mixture centers [default]\nNote that currently \"random\" and \"shuffle\" initialisations are not supported in gmm-based algorithms.\nmaximum_iterations::Int64: Maximum number of iterations [def: typemax(Int64), i.e. 
∞]\nrng::Random.AbstractRNG: Random Number Generator [deafult: Random.GLOBAL_RNG]","category":"page"},{"location":"models/MultitargetGaussianMixtureRegressor_BetaML/#Example:","page":"MultitargetGaussianMixtureRegressor","title":"Example:","text":"","category":"section"},{"location":"models/MultitargetGaussianMixtureRegressor_BetaML/","page":"MultitargetGaussianMixtureRegressor","title":"MultitargetGaussianMixtureRegressor","text":"julia> using MLJ\n\njulia> X, y = @load_boston;\n\njulia> ydouble = hcat(y, y .*2 .+5);\n\njulia> modelType = @load MultitargetGaussianMixtureRegressor pkg = \"BetaML\" verbosity=0\nBetaML.GMM.MultitargetGaussianMixtureRegressor\n\njulia> model = modelType()\nMultitargetGaussianMixtureRegressor(\n n_classes = 3, \n initial_probmixtures = Float64[], \n mixtures = BetaML.GMM.DiagonalGaussian{Float64}[BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing), BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing), BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing)], \n tol = 1.0e-6, \n minimum_variance = 0.05, \n minimum_covariance = 0.0, \n initialisation_strategy = \"kmeans\", \n maximum_iterations = 9223372036854775807, \n rng = Random._GLOBAL_RNG())\n\njulia> mach = machine(model, X, ydouble);\n\njulia> fit!(mach);\n[ Info: Training machine(MultitargetGaussianMixtureRegressor(n_classes = 3, …), …).\nIter. 1: Var. of the post 20.46947926187522 Log-likelihood -23662.72770575145\n\njulia> ŷdouble = predict(mach, X)\n506×2 Matrix{Float64}:\n 23.3358 51.6717\n 23.3358 51.6717\n ⋮ \n 16.6843 38.3686\n 16.6843 38.3686","category":"page"},{"location":"models/DBSCAN_MLJScikitLearnInterface/#DBSCAN_MLJScikitLearnInterface","page":"DBSCAN","title":"DBSCAN","text":"","category":"section"},{"location":"models/DBSCAN_MLJScikitLearnInterface/","page":"DBSCAN","title":"DBSCAN","text":"DBSCAN","category":"page"},{"location":"models/DBSCAN_MLJScikitLearnInterface/","page":"DBSCAN","title":"DBSCAN","text":"A model type for constructing a dbscan, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/DBSCAN_MLJScikitLearnInterface/","page":"DBSCAN","title":"DBSCAN","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/DBSCAN_MLJScikitLearnInterface/","page":"DBSCAN","title":"DBSCAN","text":"DBSCAN = @load DBSCAN pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/DBSCAN_MLJScikitLearnInterface/","page":"DBSCAN","title":"DBSCAN","text":"Do model = DBSCAN() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in DBSCAN(eps=...).","category":"page"},{"location":"models/DBSCAN_MLJScikitLearnInterface/","page":"DBSCAN","title":"DBSCAN","text":"Density-Based Spatial Clustering of Applications with Noise. Finds core samples of high density and expands clusters from them. 
Good for data which contains clusters of similar density.","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/#KNNRegressor_NearestNeighborModels","page":"KNNRegressor","title":"KNNRegressor","text":"","category":"section"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"KNNRegressor","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"A model type for constructing a K-nearest neighbor regressor, based on NearestNeighborModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"KNNRegressor = @load KNNRegressor pkg=NearestNeighborModels","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"Do model = KNNRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KNNRegressor(K=...).","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"KNNRegressor implements K-Nearest Neighbors regressor which is non-parametric algorithm that predicts the response associated with a new point by taking an weighted average of the response of the K-nearest points.","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/#Training-data","page":"KNNRegressor","title":"Training data","text":"","category":"section"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"OR","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"mach = machine(model, X, y, w)","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"Here:","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).\ny is the target, which can be any table of responses whose element scitype is Continuous; check the scitype with scitype(y).\nw is the observation weights which can either be nothing(default) or an AbstractVector whoose element scitype is Count or Continuous. 
This is different from weights kernel which is an hyperparameter to the model, see below.","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/#Hyper-parameters","page":"KNNRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"K::Int=5 : number of neighbors\nalgorithm::Symbol = :kdtree : one of (:kdtree, :brutetree, :balltree)\nmetric::Metric = Euclidean() : any Metric from Distances.jl for the distance between points. For algorithm = :kdtree only metrics which are instances of Union{Distances.Chebyshev, Distances.Cityblock, Distances.Euclidean, Distances.Minkowski, Distances.WeightedCityblock, Distances.WeightedEuclidean, Distances.WeightedMinkowski} are supported.\nleafsize::Int = algorithm == 10 : determines the number of points at which to stop splitting the tree. This option is ignored and always taken as 0 for algorithm = :brutetree, since brutetree isn't actually a tree.\nreorder::Bool = true : if true then points which are close in distance are placed close in memory. In this case, a copy of the original data will be made so that the original data is left unmodified. Setting this to true can significantly improve performance of the specified algorithm (except :brutetree). This option is ignored and always taken as false for algorithm = :brutetree.\nweights::KNNKernel=Uniform() : kernel used in assigning weights to the k-nearest neighbors for each observation. An instance of one of the types in list_kernels(). User-defined weighting functions can be passed by wrapping the function in a UserDefinedKernel kernel (do ?NearestNeighborModels.UserDefinedKernel for more info). If observation weights w are passed during machine construction then the weight assigned to each neighbor vote is the product of the kernel generated weight for that neighbor and the corresponding observation weight.","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/#Operations","page":"KNNRegressor","title":"Operations","text":"","category":"section"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"predict(mach, Xnew): Return predictions of the target given features Xnew, which should have same scitype as X above.","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/#Fitted-parameters","page":"KNNRegressor","title":"Fitted parameters","text":"","category":"section"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"The fields of fitted_params(mach) are:","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"tree: An instance of either KDTree, BruteTree or BallTree depending on the value of the algorithm hyperparameter (See hyper-parameters section above). 
These are data structures that store the training data in a way that makes nearest neighbor searches on test data points quicker.","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/#Examples","page":"KNNRegressor","title":"Examples","text":"","category":"section"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"using MLJ\nKNNRegressor = @load KNNRegressor pkg=NearestNeighborModels\nX, y = @load_boston; ## loads the Boston dataset from MLJBase\n## view possible kernels\nNearestNeighborModels.list_kernels()\nmodel = KNNRegressor(weights = NearestNeighborModels.Inverse()) #KNNRegressor instantiation\nmach = machine(model, X, y) |> fit! ## wrap model and required data in an MLJ machine and fit\ny_hat = predict(mach, X)\n","category":"page"},{"location":"models/KNNRegressor_NearestNeighborModels/","page":"KNNRegressor","title":"KNNRegressor","text":"See also MultitargetKNNRegressor","category":"page"},{"location":"models/ARDRegressor_MLJScikitLearnInterface/#ARDRegressor_MLJScikitLearnInterface","page":"ARDRegressor","title":"ARDRegressor","text":"","category":"section"},{"location":"models/ARDRegressor_MLJScikitLearnInterface/","page":"ARDRegressor","title":"ARDRegressor","text":"ARDRegressor","category":"page"},{"location":"models/ARDRegressor_MLJScikitLearnInterface/","page":"ARDRegressor","title":"ARDRegressor","text":"A model type for constructing a Bayesian ARD regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/ARDRegressor_MLJScikitLearnInterface/","page":"ARDRegressor","title":"ARDRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/ARDRegressor_MLJScikitLearnInterface/","page":"ARDRegressor","title":"ARDRegressor","text":"ARDRegressor = @load ARDRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/ARDRegressor_MLJScikitLearnInterface/","page":"ARDRegressor","title":"ARDRegressor","text":"Do model = ARDRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in ARDRegressor(max_iter=...).","category":"page"},{"location":"models/ARDRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"ARDRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/ARDRegressor_MLJScikitLearnInterface/","page":"ARDRegressor","title":"ARDRegressor","text":"max_iter = 300\ntol = 0.001\nalpha_1 = 1.0e-6\nalpha_2 = 1.0e-6\nlambda_1 = 1.0e-6\nlambda_2 = 1.0e-6\ncompute_score = false\nthreshold_lambda = 10000.0\nfit_intercept = true\ncopy_X = true\nverbose = false","category":"page"},{"location":"models/LinearRegressor_MLJScikitLearnInterface/#LinearRegressor_MLJScikitLearnInterface","page":"LinearRegressor","title":"LinearRegressor","text":"","category":"section"},{"location":"models/LinearRegressor_MLJScikitLearnInterface/","page":"LinearRegressor","title":"LinearRegressor","text":"LinearRegressor","category":"page"},{"location":"models/LinearRegressor_MLJScikitLearnInterface/","page":"LinearRegressor","title":"LinearRegressor","text":"A model type for constructing a ordinary least-squares regressor (OLS), based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LinearRegressor_MLJScikitLearnInterface/","page":"LinearRegressor","title":"LinearRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LinearRegressor_MLJScikitLearnInterface/","page":"LinearRegressor","title":"LinearRegressor","text":"LinearRegressor = @load LinearRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/LinearRegressor_MLJScikitLearnInterface/","page":"LinearRegressor","title":"LinearRegressor","text":"Do model = LinearRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearRegressor(fit_intercept=...).","category":"page"},{"location":"models/LinearRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"LinearRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/LinearRegressor_MLJScikitLearnInterface/","page":"LinearRegressor","title":"LinearRegressor","text":"fit_intercept = true\ncopy_X = true\nn_jobs = nothing","category":"page"},{"location":"models/SVMNuRegressor_MLJScikitLearnInterface/#SVMNuRegressor_MLJScikitLearnInterface","page":"SVMNuRegressor","title":"SVMNuRegressor","text":"","category":"section"},{"location":"models/SVMNuRegressor_MLJScikitLearnInterface/","page":"SVMNuRegressor","title":"SVMNuRegressor","text":"SVMNuRegressor","category":"page"},{"location":"models/SVMNuRegressor_MLJScikitLearnInterface/","page":"SVMNuRegressor","title":"SVMNuRegressor","text":"A model type for constructing a nu-support vector regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/SVMNuRegressor_MLJScikitLearnInterface/","page":"SVMNuRegressor","title":"SVMNuRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/SVMNuRegressor_MLJScikitLearnInterface/","page":"SVMNuRegressor","title":"SVMNuRegressor","text":"SVMNuRegressor = @load SVMNuRegressor pkg=MLJScikitLearnInterface","category":"page"},{"location":"models/SVMNuRegressor_MLJScikitLearnInterface/","page":"SVMNuRegressor","title":"SVMNuRegressor","text":"Do model = SVMNuRegressor() to construct an instance with default hyper-parameters. 
Provide keyword arguments to override hyper-parameter defaults, as in SVMNuRegressor(nu=...).","category":"page"},{"location":"models/SVMNuRegressor_MLJScikitLearnInterface/#Hyper-parameters","page":"SVMNuRegressor","title":"Hyper-parameters","text":"","category":"section"},{"location":"models/SVMNuRegressor_MLJScikitLearnInterface/","page":"SVMNuRegressor","title":"SVMNuRegressor","text":"nu = 0.5\nC = 1.0\nkernel = rbf\ndegree = 3\ngamma = scale\ncoef0 = 0.0\nshrinking = true\ntol = 0.001\ncache_size = 200\nmax_iter = -1","category":"page"},{"location":"models/LinearRegressor_MLJLinearModels/#LinearRegressor_MLJLinearModels","page":"LinearRegressor","title":"LinearRegressor","text":"","category":"section"},{"location":"models/LinearRegressor_MLJLinearModels/","page":"LinearRegressor","title":"LinearRegressor","text":"LinearRegressor","category":"page"},{"location":"models/LinearRegressor_MLJLinearModels/","page":"LinearRegressor","title":"LinearRegressor","text":"A model type for constructing a linear regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.","category":"page"},{"location":"models/LinearRegressor_MLJLinearModels/","page":"LinearRegressor","title":"LinearRegressor","text":"From MLJ, the type can be imported using","category":"page"},{"location":"models/LinearRegressor_MLJLinearModels/","page":"LinearRegressor","title":"LinearRegressor","text":"LinearRegressor = @load LinearRegressor pkg=MLJLinearModels","category":"page"},{"location":"models/LinearRegressor_MLJLinearModels/","page":"LinearRegressor","title":"LinearRegressor","text":"Do model = LinearRegressor() to construct an instance with default hyper-parameters.","category":"page"},{"location":"models/LinearRegressor_MLJLinearModels/","page":"LinearRegressor","title":"LinearRegressor","text":"This model provides standard linear regression with objective function","category":"page"},{"location":"models/LinearRegressor_MLJLinearModels/","page":"LinearRegressor","title":"LinearRegressor","text":"$","category":"page"},{"location":"models/LinearRegressor_MLJLinearModels/","page":"LinearRegressor","title":"LinearRegressor","text":"|Xθ - y|₂²/2 $","category":"page"},{"location":"models/LinearRegressor_MLJLinearModels/","page":"LinearRegressor","title":"LinearRegressor","text":"Different solver options exist, as indicated under \"Hyperparameters\" below. 
","category":"page"},{"location":"models/LinearRegressor_MLJLinearModels/#Training-data","page":"LinearRegressor","title":"Training data","text":"","category":"section"},{"location":"models/LinearRegressor_MLJLinearModels/","page":"LinearRegressor","title":"LinearRegressor","text":"In MLJ or MLJBase, bind an instance model to data with","category":"page"},{"location":"models/LinearRegressor_MLJLinearModels/","page":"LinearRegressor","title":"LinearRegressor","text":"mach = machine(model, X, y)","category":"page"},{"location":"models/LinearRegressor_MLJLinearModels/","page":"LinearRegressor","title":"LinearRegressor","text":"where:","category":"page"},{"location":"models/LinearRegressor_MLJLinearModels/","page":"LinearRegressor","title":"LinearRegressor","text":"X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)\ny is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)","category":"page"},{"location":"models/LinearRegressor_MLJLinearModels/","page":"LinearRegressor","title":"LinearRegressor","text":"Train the machine using fit!(mach, rows=...).","category":"page"},{"location":"models/LinearRegressor_MLJLinearModels/#Hyperparameters","page":"LinearRegressor","title":"Hyperparameters","text":"","category":"section"},{"location":"models/LinearRegressor_MLJLinearModels/","page":"LinearRegressor","title":"LinearRegressor","text":"fit_intercept::Bool: whether to fit the intercept or not. Default: true\nsolver::Union{Nothing, MLJLinearModels.Solver}: \"any instance of MLJLinearModels.Analytical. Use Analytical() for Cholesky and CG()=Analytical(iterative=true) for conjugate-gradient.\nIf solver = nothing (default) then Analytical() is used. Default: nothing","category":"page"},{"location":"models/LinearRegressor_MLJLinearModels/#Example","page":"LinearRegressor","title":"Example","text":"","category":"section"},{"location":"models/LinearRegressor_MLJLinearModels/","page":"LinearRegressor","title":"LinearRegressor","text":"using MLJ\nX, y = make_regression()\nmach = fit!(machine(LinearRegressor(), X, y))\npredict(mach, X)\nfitted_params(mach)","category":"page"}] } diff --git a/dev/simple_user_defined_models/index.html b/dev/simple_user_defined_models/index.html index 102e4794a..bca77c7d3 100644 --- a/dev/simple_user_defined_models/index.html +++ b/dev/simple_user_defined_models/index.html @@ -1,5 +1,5 @@ -Simple User Defined Models · MLJ

Simple User Defined Models

To quickly implement a new supervised model in MLJ, it suffices to:

  • Define a mutable struct to store hyperparameters. This is either a subtype of Probabilistic or Deterministic, depending on whether probabilistic or ordinary point predictions are intended. This struct is the model.

  • Define a fit method, dispatched on the model, returning learned parameters, also known as the fitresult.

  • Define a predict method, dispatched on the model, and the fitresult, to return predictions on new patterns.

In the examples below, the training input X of fit, and the new input Xnew passed to predict, are tables. Each training target y is an AbstractVector.

The predictions returned by predict have the same form as y for deterministic models, but are Vectors of distributions for probabilistic models.

Advanced model functionality not addressed here includes: (i) an optional update method to avoid redundant calculations when calling fit! on machines a second time; (ii) reporting extra training-related statistics; (iii) exposing model-specific functionality; (iv) checking the scientific type of data passed to your model in machine construction; and (v) checking the validity of hyperparameter values. All this is described in Adding Models for General Use.

For an unsupervised model, implement transform and, optionally, inverse_transform, using the same signature as predict below.

A simple deterministic regressor

Here's a quick-and-dirty implementation of a ridge regressor with no intercept:

import MLJBase
+Simple User Defined Models · MLJ

Simple User Defined Models

To quickly implement a new supervised model in MLJ, it suffices to:

  • Define a mutable struct to store hyperparameters. This is either a subtype of Probabilistic or Deterministic, depending on whether probabilistic or ordinary point predictions are intended. This struct is the model.

  • Define a fit method, dispatched on the model, returning learned parameters, also known as the fitresult.

  • Define a predict method, dispatched on the model, and the fitresult, to return predictions on new patterns.

In the examples below, the training input X of fit, and the new input Xnew passed to predict, are tables. Each training target y is an AbstractVector.

The predictions returned by predict have the same form as y for deterministic models, but are Vectors of distributions for probabilistic models.

Advanced model functionality not addressed here includes: (i) an optional update method to avoid redundant calculations when calling fit! on machines a second time; (ii) reporting extra training-related statistics; (iii) exposing model-specific functionality; (iv) checking the scientific type of data passed to your model in machine construction; and (v) checking the validity of hyperparameter values. All this is described in Adding Models for General Use.

For an unsupervised model, implement transform and, optionally, inverse_transform, using the same signature as predict below.
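
For illustration, here is a minimal sketch of an unsupervised transformer following the same pattern (the type name MyCenterer and its column-centering behavior are invented for this example):

import MLJBase
using Statistics

mutable struct MyCenterer <: MLJBase.Unsupervised
end

function MLJBase.fit(model::MyCenterer, verbosity, X)
    x = MLJBase.matrix(X)           # convert the input table to a matrix
    fitresult = mean(x, dims=1)     # column means learned from the training data
    return fitresult, nothing, nothing
end

MLJBase.transform(model::MyCenterer, fitresult, Xnew) =
    MLJBase.table(MLJBase.matrix(Xnew) .- fitresult)

MLJBase.inverse_transform(model::MyCenterer, fitresult, Z) =
    MLJBase.table(MLJBase.matrix(Z) .+ fitresult)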

A simple deterministic regressor

Here's a quick-and-dirty implementation of a ridge regressor with no intercept:

import MLJBase
 using LinearAlgebra
 
 mutable struct MyRegressor <: MLJBase.Deterministic
@@ -21,8 +21,8 @@
   lambda = 1.0)
julia> regressor = machine(model, X, y)untrained Machine; caches model-specific representations of data model: MyRegressor(lambda = 1.0) args: - 1: Source @670 ⏎ Table{AbstractVector{Continuous}} - 2: Source @478 ⏎ AbstractVector{Continuous}
julia> evaluate!(regressor, resampling=CV(), measure=rms, verbosity=0)PerformanceEvaluation object with these fields: + 1: Source @422 ⏎ Table{AbstractVector{Continuous}} + 2: Source @260 ⏎ AbstractVector{Continuous}
julia> evaluate!(regressor, resampling=CV(), measure=rms, verbosity=0)PerformanceEvaluation object with these fields: model, measure, operation, measurement, per_fold, per_observation, fitted_params_per_fold, report_per_fold, @@ -56,4 +56,4 @@ MLJBase.predict(model::MyClassifier, fitresult, Xnew) = [fitresult for r in 1:nrows(Xnew)]
julia> X, y = @load_iris;
julia> mach = machine(MyClassifier(), X, y) |> fit!;[ Info: Training machine(MyClassifier(), …).
julia> predict(mach, selectrows(X, 1:2))2-element Vector{UnivariateFinite{Multiclass{3}, String, UInt32, Float64}}: UnivariateFinite{Multiclass{3}}(setosa=>0.333, versicolor=>0.333, virginica=>0.333) - UnivariateFinite{Multiclass{3}}(setosa=>0.333, versicolor=>0.333, virginica=>0.333)
+ UnivariateFinite{Multiclass{3}}(setosa=>0.333, versicolor=>0.333, virginica=>0.333)
diff --git a/dev/target_transformations/index.html b/dev/target_transformations/index.html index 266a6d9c6..309400804 100644 --- a/dev/target_transformations/index.html +++ b/dev/target_transformations/index.html @@ -1,5 +1,5 @@ -Target Transformations · MLJ

Target Transformations

Some supervised models work best if the target variable has been standardized, i.e., rescaled to have zero mean and unit variance. Such a target transformation is learned from the values of the training target variable. In particular, one generally learns a different transformation when training on a proper subset of the training data. Good data hygiene prescribes that a new transformation should be computed each time the supervised model is trained on new data - for example in cross-validation.

Additionally, one generally wants to inverse transform the predictions of the supervised model for the final target predictions to be on the original scale.

All these concerns are addressed by wrapping the supervised model using TransformedTargetModel:

Ridge = @load RidgeRegressor pkg=MLJLinearModels verbosity=0
+Target Transformations · MLJ

Target Transformations

Some supervised models work best if the target variable has been standardized, i.e., rescaled to have zero mean and unit variance. Such a target transformation is learned from the values of the training target variable. In particular, one generally learns a different transformation when training on a proper subset of the training data. Good data hygiene prescribes that a new transformation should be computed each time the supervised model is trained on new data - for example in cross-validation.

Additionally, one generally wants to inverse transform the predictions of the supervised model for the final target predictions to be on the original scale.

All these concerns are addressed by wrapping the supervised model using TransformedTargetModel:

Ridge = @load RidgeRegressor pkg=MLJLinearModels verbosity=0
 ridge = Ridge(fit_intercept=false)
 ridge2 = TransformedTargetModel(ridge, transformer=Standardizer())
TransformedTargetModelDeterministic(
   model = RidgeRegressor(
@@ -75,4 +75,4 @@
 └──────────────────────────────────────┴─────────┘
 

Without the log transform (ie, using ridge) we get the poorer mean absolute error, l1, of 3.9.

MLJBase.TransformedTargetModelFunction
TransformedTargetModel(model; transformer=nothing, inverse=nothing, cache=true)

Wrap the supervised or semi-supervised model in a transformation of the target variable.

Here transformer is one of the following:

  • The Unsupervised model that is to transform the training target. By default (inverse=nothing) the parameters learned by this transformer are also used to inverse-transform the predictions of model, which means transformer must implement the inverse_transform method. If this is not the case, specify inverse=identity to suppress inversion.

  • A callable object for transforming the target, such as y -> log.(y). In this case a callable inverse, such as z -> exp.(z), should be specified.

Specify cache=false to prioritize memory over speed, or to guarantee data anonymity.

Specify inverse=identity if model is a probabilistic predictor, as inverse-transforming sample spaces is not supported. Alternatively, replace model with a deterministic model, such as Pipeline(model, y -> mode.(y)).
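
For instance, writing prob_model for some Probabilistic regressor already instantiated (prob_model, det_model, tmodel3 and tmodel4 being illustrative names only), either of the following would do:

det_model = Pipeline(prob_model, y -> mode.(y))   # point predictions via the mode
tmodel3 = TransformedTargetModel(det_model, transformer=Standardizer())

# or keep the probabilistic predictor but suppress inversion of predictions:
tmodel4 = TransformedTargetModel(prob_model, transformer=Standardizer(), inverse=identity)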

Examples

A model that normalizes the target before applying ridge regression, with predictions returned on the original scale:

@load RidgeRegressor pkg=MLJLinearModels
 model = RidgeRegressor()
-tmodel = TransformedTargetModel(model, transformer=Standardizer())

A model that applies a static log transformation to the data, again returning predictions to the original scale:

tmodel2 = TransformedTargetModel(model, transformer=y->log.(y), inverse=z->exp.(z))
source
+tmodel = TransformedTargetModel(model, transformer=Standardizer())

A model that applies a static log transformation to the data, again returning predictions to the original scale:

tmodel2 = TransformedTargetModel(model, transformer=y->log.(y), inverse=z->exp.(z))
source
diff --git a/dev/third_party_packages/index.html b/dev/third_party_packages/index.html index 3caaa121b..14f9237cd 100644 --- a/dev/third_party_packages/index.html +++ b/dev/third_party_packages/index.html @@ -1,2 +1,2 @@ -Third Party Packages · MLJ

Third Party Packages

A list of third-party packages that integrate with MLJ.

Last updated December 2020.

Pull requests to update this list are very welcome. Otherwise, you may post an issue requesting this here.

Packages providing models in the MLJ model registry

See List of Supported Models

Providing unregistered models:

Packages providing other kinds of functionality:

+Third Party Packages · MLJ

Third Party Packages

A list of third-party packages that integrate with MLJ.

Last updated December 2020.

Pull requests to update this list are very welcome. Otherwise, you may post an issue requesting this here.

Packages providing models in the MLJ model registry

See List of Supported Models

Providing unregistered models:

Packages providing other kinds of functionality:

diff --git a/dev/thresholding_probabilistic_predictors/index.html b/dev/thresholding_probabilistic_predictors/index.html new file mode 100644 index 000000000..b64ec68bf --- /dev/null +++ b/dev/thresholding_probabilistic_predictors/index.html @@ -0,0 +1,26 @@ + +Thresholding Probabilistic Predictors · MLJ

Thresholding Probabilistic Predictors

Although one can call predict_mode on a probabilistic binary classifier to get deterministic predictions, a more flexible strategy is to wrap the model using BinaryThresholdPredictor, as this allows the user to specify the threshold probability for predicting a positive class. This wrapping converts a probabilistic classifier into a deterministic one.

The positive class is always the second class returned when calling levels on the training target y.
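
For example, for the illustrative binary target below, "yes" is the positive class:

using MLJ
y = categorical(["no", "yes", "no", "yes"], ordered=true)
levels(y)   # ["no", "yes"], so "yes" is treated as the positive class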

MLJModels.BinaryThresholdPredictorType
BinaryThresholdPredictor(model; threshold=0.5)

Wrap the Probabilistic model, model, assumed to support binary classification, as a Deterministic model, by applying the specified threshold to the positive class probability. In addition to conventional supervised classifiers, it can also be applied to outlier detection models that predict normalized scores - in the form of appropriate UnivariateFinite distributions - that is, models that subtype AbstractProbabilisticUnsupervisedDetector or AbstractProbabilisticSupervisedDetector.

By convention the positive class is the second class returned by levels(y), where y is the target.

If threshold=0.5 then calling predict on the wrapped model is equivalent to calling predict_mode on the atomic model.

Example

Below is an application to the well-known Pima Indian diabetes dataset, including optimization of the threshold parameter, with a high balanced accuracy as the objective. The target class distribution is 268 positives to 500 negatives.

Loading the data:

using MLJ, Random
+rng = Xoshiro(123)
+
+diabetes = OpenML.load(43582)
+outcome, X = unpack(diabetes, ==(:Outcome), rng=rng);
+y = coerce(Int.(outcome), OrderedFactor);

Choosing a probabilistic classifier:

EvoTreesClassifier = @load EvoTreesClassifier
+prob_predictor = EvoTreesClassifier()

Wrapping in BinaryThresholdPredictor to get a deterministic classifier, with threshold as a new hyperparameter:

point_predictor = BinaryThresholdPredictor(prob_predictor, threshold=0.6)
+Xnew, _ = make_moons(3, rng=rng)
+mach = machine(point_predictor, X, y) |> fit!
+predict(mach, X)[1:3] # [0, 0, 0]

Estimating performance:

balanced = BalancedAccuracy(adjusted=true)
+e = evaluate!(mach, resampling=CV(nfolds=6), measures=[balanced, accuracy])
+e.measurement[1] # 0.405 ± 0.089

Wrapping in a tuning strategy, to learn the threshold that maximizes balanced accuracy:

r = range(point_predictor, :threshold, lower=0.1, upper=0.9)
+tuned_point_predictor = TunedModel(
+    point_predictor,
+    tuning=RandomSearch(rng=rng),
+    resampling=CV(nfolds=6),
+    range = r,
+    measure=balanced,
+    n=30,
+)
+mach2 = machine(tuned_point_predictor, X, y) |> fit!
+optimized_point_predictor = report(mach2).best_model
+optimized_point_predictor.threshold # 0.260
+predict(mach2, X)[1:3] # [1, 1, 0]

Estimating the performance of the auto-thresholding model (nested resampling here):

e = evaluate!(mach2, resampling=CV(nfolds=6), measure=[balanced, accuracy])
+e.measurement[1] # 0.477 ± 0.110
source
diff --git a/dev/transformers/index.html b/dev/transformers/index.html index d4a7d39fa..7acd0868a 100644 --- a/dev/transformers/index.html +++ b/dev/transformers/index.html @@ -1,5 +1,5 @@ -Transformers and Other Unsupervised models · MLJ

Transformers and Other Unsupervised Models

Several unsupervised models used to perform common transformations, such as one-hot encoding, are available in MLJ out-of-the-box. These are detailed in Built-in transformers below.

A transformer is static if it has no learned parameters. While such a transformer is tantamount to an ordinary function, realizing it as an MLJ static transformer (a subtype of Static <: Unsupervised) can be useful, especially if the function depends on parameters the user would like to manipulate (which become hyper-parameters of the model). The necessary syntax for defining your own static transformers is described in Static transformers below.

Some unsupervised models, such as clustering algorithms, have a predict method in addition to a transform method. We give an example of this in Transformers that also predict.

Finally, we note that models that fit a distribution, or more generally a sampler object, to some data, which are sometimes viewed as unsupervised, are treated in MLJ as supervised models. See Models that learn a probability distribution for an example.

Built-in transformers

MLJModels.StandardizerType
Standardizer

A model type for constructing a standardizer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

Standardizer = @load Standardizer pkg=MLJModels

Do model = Standardizer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in Standardizer(features=...).

Use this model to standardize (whiten) a Continuous vector, or relevant columns of a table. The rescalings applied by this transformer to new data are always those learned during the training phase, which are generally different from what would actually standardize the new data.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any Tables.jl compatible table or any abstract vector with Continuous element scitype (any abstract float vector). Only features in a table with Continuous scitype can be standardized; check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • features: one of the following, with the behavior indicated below:

    • [] (empty, the default): standardize all features (columns) having Continuous element scitype

    • non-empty vector of feature names (symbols): standardize only the Continuous features in the vector (if ignore=false) or Continuous features not named in the vector (ignore=true).

    • function or other callable: standardize a feature if the callable returns true on its name. For example, Standardizer(features = name -> name in [:x1, :x3], ignore = true, count=true) has the same effect as Standardizer(features = [:x1, :x3], ignore = true, count=true), namely to standardize all Continuous and Count features, with the exception of :x1 and :x3.

    Note this behavior is further modified if the ordered_factor or count flags are set to true; see below

  • ignore=false: whether to ignore or standardize specified features, as explained above

  • ordered_factor=false: if true, standardize any OrderedFactor feature wherever a Continuous feature would be standardized, as described above

  • count=false: if true, standardize any Count feature wherever a Continuous feature would be standardized, as described above

Operations

  • transform(mach, Xnew): return Xnew with relevant features standardized according to the rescalings learned during fitting of mach.

  • inverse_transform(mach, Z): apply the inverse transformation to Z, so that inverse_transform(mach, transform(mach, Xnew)) is approximately the same as Xnew; unavailable if ordered_factor or count flags were set to true.

Fitted parameters

The fields of fitted_params(mach) are:

  • features_fit - the names of features that will be standardized

  • means - the corresponding untransformed mean values

  • stds - the corresponding untransformed standard deviations

Report

The fields of report(mach) are:

  • features_fit: the names of features that will be standardized

Examples

using MLJ
+Transformers and Other Unsupervised models · MLJ

Transformers and Other Unsupervised Models

Several unsupervised models used to perform common transformations, such as one-hot encoding, are available in MLJ out-of-the-box. These are detailed in Built-in transformers below.

A transformer is static if it has no learned parameters. While such a transformer is tantamount to an ordinary function, realizing it as an MLJ static transformer (a subtype of Static <: Unsupervised) can be useful, especially if the function depends on parameters the user would like to manipulate (which become hyper-parameters of the model). The necessary syntax for defining your own static transformers is described in Static transformers below.

Some unsupervised models, such as clustering algorithms, have a predict method in addition to a transform method. We give an example of this in Transformers that also predict.

Finally, we note that models that fit a distribution, or more generally a sampler object, to some data, which are sometimes viewed as unsupervised, are treated in MLJ as supervised models. See Models that learn a probability distribution for an example.

Built-in transformers

MLJModels.StandardizerType
Standardizer

A model type for constructing a standardizer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

Standardizer = @load Standardizer pkg=MLJModels

Do model = Standardizer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in Standardizer(features=...).

Use this model to standardize (whiten) a Continuous vector, or relevant columns of a table. The rescalings applied by this transformer to new data are always those learned during the training phase, which are generally different from what would actually standardize the new data.
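
For instance (a small made-up illustration of the last point):

using MLJ
X = (x = [0.0, 2.0, 4.0],)        # training column: mean 2.0, standard deviation 2.0
mach = fit!(machine(Standardizer(), X))
Xnew = (x = [10.0, 12.0],)
transform(mach, Xnew)             # x becomes approximately [4.0, 5.0], rescaled using the
                                  # mean and standard deviation learned from X, not Xnew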

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any Tables.jl compatible table or any abstract vector with Continuous element scitype (any abstract float vector). Only features in a table with Continuous scitype can be standardized; check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • features: one of the following, with the behavior indicated below:

    • [] (empty, the default): standardize all features (columns) having Continuous element scitype

    • non-empty vector of feature names (symbols): standardize only the Continuous features in the vector (if ignore=false) or Continuous features not named in the vector (ignore=true).

    • function or other callable: standardize a feature if the callable returns true on its name. For example, Standardizer(features = name -> name in [:x1, :x3], ignore = true, count=true) has the same effect as Standardizer(features = [:x1, :x3], ignore = true, count=true), namely to standardize all Continuous and Count features, with the exception of :x1 and :x3.

    Note this behavior is further modified if the ordered_factor or count flags are set to true; see below

  • ignore=false: whether to ignore or standardize specified features, as explained above

  • ordered_factor=false: if true, standardize any OrderedFactor feature wherever a Continuous feature would be standardized, as described above

  • count=false: if true, standardize any Count feature wherever a Continuous feature would be standardized, as described above

Operations

  • transform(mach, Xnew): return Xnew with relevant features standardized according to the rescalings learned during fitting of mach.

  • inverse_transform(mach, Z): apply the inverse transformation to Z, so that inverse_transform(mach, transform(mach, Xnew)) is approximately the same as Xnew; unavailable if ordered_factor or count flags were set to true.

Fitted parameters

The fields of fitted_params(mach) are:

  • features_fit - the names of features that will be standardized

  • means - the corresponding untransformed mean values

  • stds - the corresponding untransformed standard deviations

Report

The fields of report(mach) are:

  • features_fit: the names of features that will be standardized

Examples

using MLJ
 
 X = (ordinal1 = [1, 2, 3],
      ordinal2 = coerce([:x, :y, :x], OrderedFactor),
@@ -34,7 +34,7 @@
  ordinal2 = CategoricalValue{Symbol,UInt32}[:x, :y, :x],
  ordinal3 = [10.0, 20.0, 30.0],
  ordinal4 = [1.0, 0.0, -1.0],
- nominal = CategoricalValue{String,UInt32}["Your father", "he", "is"],)

See also OneHotEncoder, ContinuousEncoder.

source
MLJModels.OneHotEncoderType
OneHotEncoder

A model type for constructing a one-hot encoder, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

OneHotEncoder = @load OneHotEncoder pkg=MLJModels

Do model = OneHotEncoder() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in OneHotEncoder(features=...).

Use this model to one-hot encode the Multiclass and OrderedFactor features (columns) of some table, leaving other columns unchanged.

New data to be transformed may lack features present in the fit data, but no new features can be present.

Warning: This transformer assumes that levels(col) for any Multiclass or OrderedFactor column, col, is the same for training data and new data to be transformed.

To ensure all features are transformed into Continuous features, or dropped, use ContinuousEncoder instead.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any Tables.jl compatible table. Columns can be of mixed type but only those with element scitype Multiclass or OrderedFactor can be encoded. Check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • features: a vector of symbols (column names). If empty (default) then all Multiclass and OrderedFactor features are encoded. Otherwise, encoding is further restricted to the specified features (ignore=false) or the unspecified features (ignore=true). This default behavior can be modified by the ordered_factor flag.

  • ordered_factor=false: when true, OrderedFactor features are universally excluded

  • drop_last=true: whether to drop the column corresponding to the final class of encoded features. For example, a three-class feature is spawned into three new features if drop_last=false, but just two features otherwise.

Fitted parameters

The fields of fitted_params(mach) are:

  • all_features: names of all features encountered in training

  • fitted_levels_given_feature: dictionary of the levels associated with each feature encoded, keyed on the feature name

  • ref_name_pairs_given_feature: dictionary of pairs r => ftr (such as 0x00000001 => :grad__A) where r is a CategoricalArrays.jl reference integer representing a level, and ftr the corresponding new feature name; the dictionary is keyed on the names of features that are encoded

Report

The fields of report(mach) are:

  • features_to_be_encoded: names of input features to be encoded

  • new_features: names of all output features

Example

using MLJ
+ nominal = CategoricalValue{String,UInt32}["Your father", "he", "is"],)

See also OneHotEncoder, ContinuousEncoder.

source
MLJModels.OneHotEncoderType
OneHotEncoder

A model type for constructing a one-hot encoder, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

OneHotEncoder = @load OneHotEncoder pkg=MLJModels

Do model = OneHotEncoder() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in OneHotEncoder(features=...).

Use this model to one-hot encode the Multiclass and OrderedFactor features (columns) of some table, leaving other columns unchanged.

New data to be transformed may lack features present in the fit data, but no new features can be present.

Warning: This transformer assumes that levels(col) for any Multiclass or OrderedFactor column, col, is the same for training data and new data to be transformed.

To ensure all features are transformed into Continuous features, or dropped, use ContinuousEncoder instead.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any Tables.jl compatible table. Columns can be of mixed type but only those with element scitype Multiclass or OrderedFactor can be encoded. Check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • features: a vector of symbols (column names). If empty (default) then all Multiclass and OrderedFactor features are encoded. Otherwise, encoding is further restricted to the specified features (ignore=false) or the unspecified features (ignore=true). This default behavior can be modified by the ordered_factor flag.

  • ordered_factor=false: when true, OrderedFactor features are universally excluded

  • drop_last=true: whether to drop the column corresponding to the final class of encoded features. For example, a three-class feature is spawned into three new features if drop_last=false, but just two features otherwise.

Fitted parameters

The fields of fitted_params(mach) are:

  • all_features: names of all features encountered in training

  • fitted_levels_given_feature: dictionary of the levels associated with each feature encoded, keyed on the feature name

  • ref_name_pairs_given_feature: dictionary of pairs r => ftr (such as 0x00000001 => :grad__A) where r is a CategoricalArrays.jl reference integer representing a level, and ftr the corresponding new feature name; the dictionary is keyed on the names of features that are encoded

Report

The fields of report(mach) are:

  • features_to_be_encoded: names of input features to be encoded

  • new_features: names of all output features

Example

using MLJ
 
 X = (name=categorical(["Danesh", "Lee", "Mary", "John"]),
      grade=categorical(["A", "B", "A", "C"], ordered=true),
@@ -66,7 +66,7 @@
 │ grade__B     │ Continuous │
 │ height       │ Continuous │
 │ n_devices    │ Count      │
-└──────────────┴────────────┘

See also ContinuousEncoder.

source
MLJModels.ContinuousEncoderType
ContinuousEncoder

A model type for constructing a continuous encoder, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ContinuousEncoder = @load ContinuousEncoder pkg=MLJModels

Do model = ContinuousEncoder() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ContinuousEncoder(drop_last=...).

Use this model to arrange all features (columns) of a table to have Continuous element scitype, by applying the following protocol to each feature ftr:

  • If ftr is already Continuous retain it.

  • If ftr is Multiclass, one-hot encode it.

  • If ftr is OrderedFactor, replace it with coerce(ftr, Continuous) (vector of floating point integers), unless one_hot_ordered_factors=true is specified, in which case one-hot encode it.

  • If ftr is Count, replace it with coerce(ftr, Continuous).

  • If ftr has some other element scitype, or was not observed in fitting the encoder, drop it from the table.

Warning: This transformer assumes that levels(col) for any Multiclass or OrderedFactor column, col, is the same for training data and new data to be transformed.

To selectively one-hot-encode categorical features (without dropping columns) use OneHotEncoder instead.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any Tables.jl compatible table. Columns can be of mixed type but only those with element scitype Multiclass or OrderedFactor can be encoded. Check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • drop_last=true: whether to drop the column corresponding to the final class of one-hot encoded features. For example, a three-class feature is spawned into three new features if drop_last=false, but just two features otherwise.

  • one_hot_ordered_factors=false: whether to one-hot any feature with OrderedFactor element scitype, or to instead coerce it directly to a (single) Continuous feature using the order

Fitted parameters

The fields of fitted_params(mach) are:

  • features_to_keep: names of features that will not be dropped from the table

  • one_hot_encoder: the OneHotEncoder model instance for handling the one-hot encoding

  • one_hot_encoder_fitresult: the fitted parameters of the OneHotEncoder model

Report

  • features_to_keep: names of input features that will not be dropped from the table

  • new_features: names of all output features

Example

X = (name=categorical(["Danesh", "Lee", "Mary", "John"]),
+└──────────────┴────────────┘

See also ContinuousEncoder.

source
MLJModels.ContinuousEncoderType
ContinuousEncoder

A model type for constructing a continuous encoder, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ContinuousEncoder = @load ContinuousEncoder pkg=MLJModels

Do model = ContinuousEncoder() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ContinuousEncoder(drop_last=...).

Use this model to arrange all features (columns) of a table to have Continuous element scitype, by applying the following protocol to each feature ftr:

  • If ftr is already Continuous retain it.

  • If ftr is Multiclass, one-hot encode it.

  • If ftr is OrderedFactor, replace it with coerce(ftr, Continuous) (vector of floating point integers), unless one_hot_ordered_factors=true is specified, in which case one-hot encode it.

  • If ftr is Count, replace it with coerce(ftr, Continuous).

  • If ftr has some other element scitype, or was not observed in fitting the encoder, drop it from the table.

Warning: This transformer assumes that levels(col) for any Multiclass or OrderedFactor column, col, is the same for training data and new data to be transformed.

To selectively one-hot-encode categorical features (without dropping columns) use OneHotEncoder instead.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any Tables.jl compatible table. Columns can be of mixed type but only those with element scitype Multiclass or OrderedFactor can be encoded. Check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • drop_last=true: whether to drop the column corresponding to the final class of one-hot encoded features. For example, a three-class feature is spawned into three new features if drop_last=false, but just two features otherwise.

  • one_hot_ordered_factors=false: whether to one-hot any feature with OrderedFactor element scitype, or to instead coerce it directly to a (single) Continuous feature using the order

Fitted parameters

The fields of fitted_params(mach) are:

  • features_to_keep: names of features that will not be dropped from the table

  • one_hot_encoder: the OneHotEncoder model instance for handling the one-hot encoding

  • one_hot_encoder_fitresult: the fitted parameters of the OneHotEncoder model

Report

  • features_to_keep: names of input features that will not be dropped from the table

  • new_features: names of all output features

Example

X = (name=categorical(["Danesh", "Lee", "Mary", "John"]),
      grade=categorical(["A", "B", "A", "C"], ordered=true),
      height=[1.85, 1.67, 1.5, 1.67],
      n_devices=[3, 2, 4, 3],
@@ -102,7 +102,7 @@
 julia> setdiff(schema(X).names, report(mach).features_to_keep) # dropped features
 1-element Vector{Symbol}:
  :comments
-

See also OneHotEncoder

source
MLJModels.FillImputerType
FillImputer

A model type for constructing a fill imputer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

FillImputer = @load FillImputer pkg=MLJModels

Do model = FillImputer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in FillImputer(features=...).

Use this model to impute missing values in tabular data. A fixed "filler" value is learned from the training data, one for each column of the table.

For imputing missing values in a vector, use UnivariateFillImputer instead.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have element scitypes Union{Missing, T}, where T is a subtype of Continuous, Multiclass, OrderedFactor or Count. Check scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • features: a vector of names of features (symbols) for which imputation is to be attempted; default is empty, which is interpreted as "impute all".

  • continuous_fill: function or other callable to determine value to be imputed in the case of Continuous (abstract float) data; default is to apply median after skipping missing values

  • count_fill: function or other callable to determine value to be imputed in the case of Count (integer) data; default is to apply rounded median after skipping missing values

  • finite_fill: function or other callable to determine value to be imputed in the case of Multiclass or OrderedFactor data (categorical vectors); default is to apply mode after skipping missing values

Operations

  • transform(mach, Xnew): return Xnew with missing values imputed with the fill values learned when fitting mach

Fitted parameters

The fields of fitted_params(mach) are:

  • features_seen_in_fit: the names of features (columns) encountered during training

  • univariate_transformer: the univariate model applied to determine the fillers (its fields contain the functions defining the filler computations)

  • filler_given_feature: dictionary of filler values, keyed on feature (column) names

Examples

using MLJ
+

See also OneHotEncoder

source
MLJModels.FillImputerType
FillImputer

A model type for constructing a fill imputer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

FillImputer = @load FillImputer pkg=MLJModels

Do model = FillImputer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in FillImputer(features=...).

Use this model to impute missing values in tabular data. A fixed "filler" value is learned from the training data, one for each column of the table.

For imputing missing values in a vector, use UnivariateFillImputer instead.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have element scitypes Union{Missing, T}, where T is a subtype of Continuous, Multiclass, OrderedFactor or Count. Check scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • features: a vector of names of features (symbols) for which imputation is to be attempted; default is empty, which is interpreted as "impute all".

  • continuous_fill: function or other callable to determine value to be imputed in the case of Continuous (abstract float) data; default is to apply median after skipping missing values

  • count_fill: function or other callable to determine value to be imputed in the case of Count (integer) data; default is to apply rounded median after skipping missing values

  • finite_fill: function or other callable to determine value to be imputed in the case of Multiclass or OrderedFactor data (categorical vectors); default is to apply mode after skipping missing values

Operations

  • transform(mach, Xnew): return Xnew with missing values imputed with the fill values learned when fitting mach

Fitted parameters

The fields of fitted_params(mach) are:

  • features_seen_in_fit: the names of features (columns) encountered during training

  • univariate_transformer: the univariate model applied to determine the fillers (its fields contain the functions defining the filler computations)

  • filler_given_feature: dictionary of filler values, keyed on feature (column) names

Examples

using MLJ
 imputer = FillImputer()
 
 X = (a = [1.0, 2.0, missing, 3.0, missing],
@@ -134,7 +134,7 @@
 julia> transform(mach, X)
 (a = [1.0, 2.0, 2.0, 3.0, 2.0],
  b = CategoricalValue{String, UInt32}["y", "n", "y", "y", "y"],
- c = [1, 1, 2, 2, 3],)

See also UnivariateFillImputer.

source
MLJModels.UnivariateFillImputerType
UnivariateFillImputer

A model type for constructing a single variable fill imputer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

UnivariateFillImputer = @load UnivariateFillImputer pkg=MLJModels

Do model = UnivariateFillImputer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateFillImputer(continuous_fill=...).

Use this model to impute missing values in a vector, using a fixed value learned from the non-missing values of the training vector.

For imputing missing values in tabular data, use FillImputer instead.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, x)

where

  • x: any abstract vector with element scitype Union{Missing, T} where T is a subtype of Continuous, Multiclass, OrderedFactor or Count; check scitype using scitype(x)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • continuous_fill: function or other callable to determine value to be imputed in the case of Continuous (abstract float) data; default is to apply median after skipping missing values

  • count_fill: function or other callable to determine value to be imputed in the case of Count (integer) data; default is to apply rounded median after skipping missing values

  • finite_fill: function or other callable to determine value to be imputed in the case of Multiclass or OrderedFactor data (categorical vectors); default is to apply mode after skipping missing values

Operations

  • transform(mach, xnew): return xnew with missing values imputed with the fill values learned when fitting mach

Fitted parameters

The fields of fitted_params(mach) are:

  • filler: the fill value to be imputed in all new data

Examples

using MLJ
+ c = [1, 1, 2, 2, 3],)

See also UnivariateFillImputer.

source
MLJModels.UnivariateFillImputerType
UnivariateFillImputer

A model type for constructing a single variable fill imputer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

UnivariateFillImputer = @load UnivariateFillImputer pkg=MLJModels

Do model = UnivariateFillImputer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateFillImputer(continuous_fill=...).

Use this model to impute missing values in a vector, using a fixed value learned from the non-missing values of the training vector.

For imputing missing values in tabular data, use FillImputer instead.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, x)

where

  • x: any abstract vector with element scitype Union{Missing, T} where T is a subtype of Continuous, Multiclass, OrderedFactor or Count; check scitype using scitype(x)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • continuous_fill: function or other callable to determine value to be imputed in the case of Continuous (abstract float) data; default is to apply median after skipping missing values

  • count_fill: function or other callable to determine value to be imputed in the case of Count (integer) data; default is to apply rounded median after skipping missing values

  • finite_fill: function or other callable to determine value to be imputed in the case of Multiclass or OrderedFactor data (categorical vectors); default is to apply mode after skipping missing values

Operations

  • transform(mach, xnew): return xnew with missing values imputed with the fill values learned when fitting mach

Fitted parameters

The fields of fitted_params(mach) are:

  • filler: the fill value to be imputed in all new data

Examples

using MLJ
 imputer = UnivariateFillImputer()
 
 x_continuous = [1.0, 2.0, missing, 3.0]
@@ -169,7 +169,7 @@
 3-element Vector{Int64}:
  2
  2
- 5

For imputing tabular data, use FillImputer.

source
MLJModels.FeatureSelectorType
FeatureSelector

A model type for constructing a feature selector, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

FeatureSelector = @load FeatureSelector pkg=MLJModels

Do model = FeatureSelector() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in FeatureSelector(features=...).

Use this model to select features (columns) of a table, usually as part of a model Pipeline.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any table of input features, where "table" is in the sense of Tables.jl

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • features: one of the following, with the behavior indicated:

    • [] (empty, the default): filter out all features (columns) which were not encountered in training

    • non-empty vector of feature names (symbols): keep only the specified features (ignore=false) or keep only unspecified features (ignore=true)

    • function or other callable: keep a feature if the callable returns true on its name. For example, specifying FeatureSelector(features = name -> name in [:x1, :x3], ignore = true) has the same effect as FeatureSelector(features = [:x1, :x3], ignore = true), namely to select all features, with the exception of :x1 and :x3.

  • ignore: whether to ignore or keep specified features, as explained above

Operations

  • transform(mach, Xnew): select features from the table Xnew as specified by the model, taking features seen during training into account, if relevant

Fitted parameters

The fields of fitted_params(mach) are:

  • features_to_keep: the features that will be selected

Example

using MLJ
+ 5

For imputing tabular data, use FillImputer.

source
FeatureSelection.FeatureSelectorType
FeatureSelector

A model type for constructing a feature selector, based on FeatureSelection.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

FeatureSelector = @load FeatureSelector pkg=FeatureSelection

Do model = FeatureSelector() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in FeatureSelector(features=...).

Use this model to select features (columns) of a table, usually as part of a model Pipeline.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any table of input features, where "table" is in the sense of Tables.jl

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • features: one of the following, with the behavior indicated:

    • [] (empty, the default): filter out all features (columns) which were not encountered in training

    • non-empty vector of feature names (symbols): keep only the specified features (ignore=false) or keep only unspecified features (ignore=true)

    • function or other callable: keep a feature if the callable returns true on its name. For example, specifying FeatureSelector(features = name -> name in [:x1, :x3], ignore = true) has the same effect as FeatureSelector(features = [:x1, :x3], ignore = true), namely to select all features, with the exception of :x1 and :x3.

  • ignore: whether to ignore or keep specified features, as explained above

Operations

  • transform(mach, Xnew): select features from the table Xnew as specified by the model, taking features seen during training into account, if relevant

Fitted parameters

The fields of fitted_params(mach) are:

  • features_to_keep: the features that will be selected

Example

using MLJ
 
 X = (ordinal1 = [1, 2, 3],
      ordinal2 = coerce(["x", "y", "x"], OrderedFactor),
@@ -184,7 +184,7 @@
  ordinal2 = CategoricalValue{Symbol,UInt32}["x", "y", "x"],
  ordinal4 = [-20.0, -30.0, -40.0],
  nominal = CategoricalValue{String,UInt32}["Your father", "he", "is"],)
-
source
MLJModels.UnivariateBoxCoxTransformerType
UnivariateBoxCoxTransformer

A model type for constructing a single variable Box-Cox transformer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

UnivariateBoxCoxTransformer = @load UnivariateBoxCoxTransformer pkg=MLJModels

Do model = UnivariateBoxCoxTransformer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateBoxCoxTransformer(n=...).

Box-Cox transformations attempt to make data look more normally distributed. This can improve performance and assist in the interpretation of models which suppose that data is generated by a normal distribution.

A Box-Cox transformation (with shift) is of the form

x -> ((x + c)^λ - 1)/λ

for some constant c and real λ, unless λ = 0, in which case the above is replaced with

x -> log(x + c)

Given user-specified hyper-parameters n::Integer and shift::Bool, the present implementation learns the parameters c and λ from the training data as follows: If shift=true and zeros are encountered in the data, then c is set to 0.2 times the data mean. If there are no zeros, then no shift is applied. Finally, n different values of λ between -0.4 and 3 are considered, with λ fixed to the value maximizing normality of the transformed data.

Reference: Wikipedia entry for power transform.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, x)

where

  • x: any abstract vector with element scitype Continuous; check the scitype with scitype(x)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • n=171: number of values of the exponent λ to try

  • shift=false: whether to include a preliminary constant translation in transformations, in the presence of zeros

Operations

  • transform(mach, xnew): apply the Box-Cox transformation learned when fitting mach

  • inverse_transform(mach, z): reconstruct the vector x whose transformation learned by mach is z

Fitted parameters

The fields of fitted_params(mach) are:

  • λ: the learned Box-Cox exponent

  • c: the learned shift

Examples

using MLJ
+
source
MLJModels.UnivariateBoxCoxTransformerType
UnivariateBoxCoxTransformer

A model type for constructing a single variable Box-Cox transformer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

UnivariateBoxCoxTransformer = @load UnivariateBoxCoxTransformer pkg=MLJModels

Do model = UnivariateBoxCoxTransformer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateBoxCoxTransformer(n=...).

Box-Cox transformations attempt to make data look more normally distributed. This can improve performance and assist in the interpretation of models which suppose that data is generated by a normal distribution.

A Box-Cox transformation (with shift) is of the form

x -> ((x + c)^λ - 1)/λ

for some constant c and real λ, unless λ = 0, in which case the above is replaced with

x -> log(x + c)

Given user-specified hyper-parameters n::Integer and shift::Bool, the present implementation learns the parameters c and λ from the training data as follows: If shift=true and zeros are encountered in the data, then c is set to 0.2 times the data mean. If there are no zeros, then no shift is applied. Finally, n different values of λ between -0.4 and 3 are considered, with λ fixed to the value maximizing normality of the transformed data.

Reference: Wikipedia entry for power transform.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, x)

where

  • x: any abstract vector with element scitype Continuous; check the scitype with scitype(x)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • n=171: number of values of the exponent λ to try

  • shift=false: whether to include a preliminary constant translation in transformations, in the presence of zeros

Operations

  • transform(mach, xnew): apply the Box-Cox transformation learned when fitting mach

  • inverse_transform(mach, z): reconstruct the vector x whose transformation learned by mach is z

Fitted parameters

The fields of fitted_params(mach) are:

  • λ: the learned Box-Cox exponent

  • c: the learned shift

Examples

using MLJ
 using UnicodePlots
 using Random
 Random.seed!(123)
@@ -223,7 +223,7 @@
    [ 3.0,  4.0) ┤▎ 1
                 └                                        ┘
                                  Frequency
-
source
MLJModels.UnivariateDiscretizerType
UnivariateDiscretizer

A model type for constructing a single variable discretizer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

UnivariateDiscretizer = @load UnivariateDiscretizer pkg=MLJModels

Do model = UnivariateDiscretizer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateDiscretizer(n_classes=...).

Discretization converts a Continuous vector into an OrderedFactor vector. In particular, the output is a CategoricalVector (whose reference type is optimized).

The transformation is chosen so that the vector on which the transformer is fit has, in transformed form, an approximately uniform distribution of values. Specifically, if n_classes is the level of discretization, then 2*n_classes - 1 ordered quantiles are computed, the odd quantiles being used for transforming (discretization) and the even quantiles for inverse transforming.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, x)

where

  • x: any abstract vector with Continuous element scitype; check scitype with scitype(x).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • n_classes: number of discrete classes in the output

Operations

  • transform(mach, xnew): discretize xnew according to the discretization learned when fitting mach

  • inverse_transform(mach, z): attempt to reconstruct from z a vector that transforms to give z

Fitted parameters

The fields of fitted_params(mach).fitresult include:

  • odd_quantiles: quantiles used for transforming (length is n_classes - 1)

  • even_quantiles: quantiles used for inverse transforming (length is n_classes)

Example

using MLJ
+
source
MLJModels.UnivariateDiscretizerType
UnivariateDiscretizer

A model type for constructing a single variable discretizer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

UnivariateDiscretizer = @load UnivariateDiscretizer pkg=MLJModels

Do model = UnivariateDiscretizer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateDiscretizer(n_classes=...).

Discretization converts a Continuous vector into an OrderedFactor vector. In particular, the output is a CategoricalVector (whose reference type is optimized).

The transformation is chosen so that the vector on which the transformer is fit has, in transformed form, an approximately uniform distribution of values. Specifically, if n_classes is the level of discretization, then 2*n_classes - 1 ordered quantiles are computed, the odd quantiles being used for transforming (discretization) and the even quantiles for inverse transforming.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, x)

where

  • x: any abstract vector with Continuous element scitype; check scitype with scitype(x).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • n_classes: number of discrete classes in the output

Operations

  • transform(mach, xnew): discretize xnew according to the discretization learned when fitting mach

  • inverse_transform(mach, z): attempt to reconstruct from z a vector that transforms to give z

Fitted parameters

The fields of fitted_params(mach).fitresult include:

  • odd_quantiles: quantiles used for transforming (length is n_classes - 1)

  • even_quantiles: quantiles used for inverse transforming (length is n_classes)

Example

using MLJ
 using Random
 Random.seed!(123)
 
@@ -254,7 +254,7 @@
  0.012731354778359405
  0.0056265330571125816
  0.005738175684445124
- 0.006835652575801987
source
MLJModels.UnivariateTimeTypeToContinuousType
UnivariateTimeTypeToContinuous

A model type for constructing a single variable transformer that creates continuous representations of temporally typed data, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

UnivariateTimeTypeToContinuous = @load UnivariateTimeTypeToContinuous pkg=MLJModels

Do model = UnivariateTimeTypeToContinuous() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateTimeTypeToContinuous(zero_time=...).

Use this model to convert vectors with a TimeType element type to vectors of Float64 type (Continuous element scitype).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, x)

where

  • x: any abstract vector whose element type is a subtype of Dates.TimeType

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • zero_time: the time that is to correspond to 0.0 under transformations, with the type coinciding with the training data element type. If unspecified, the earliest time encountered in training is used.

  • step::Period=Hour(24): time interval to correspond to one unit under transformation

Operations

  • transform(mach, xnew): apply the encoding inferred when mach was fit

Fitted parameters

fitted_params(mach).fitresult is the tuple (zero_time, step) actually used in transformations, which may differ from the user-specified hyper-parameters.

Example

using MLJ
+ 0.006835652575801987
source
MLJModels.UnivariateTimeTypeToContinuousType
UnivariateTimeTypeToContinuous

A model type for constructing a single variable transformer that creates continuous representations of temporally typed data, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

UnivariateTimeTypeToContinuous = @load UnivariateTimeTypeToContinuous pkg=MLJModels

Do model = UnivariateTimeTypeToContinuous() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateTimeTypeToContinuous(zero_time=...).

Use this model to convert vectors with a TimeType element type to vectors of Float64 type (Continuous element scitype).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, x)

where

  • x: any abstract vector whose element type is a subtype of Dates.TimeType

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • zero_time: the time that is to correspond to 0.0 under transformations, with the type coinciding with the training data element type. If unspecified, the earliest time encountered in training is used.

  • step::Period=Hour(24): time interval to correspond to one unit under transformation

Operations

  • transform(mach, xnew): apply the encoding inferred when mach was fit

Fitted parameters

fitted_params(mach).fitresult is the tuple (zero_time, step) actually used in transformations, which may differ from the user-specified hyper-parameters.
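As a hedged illustration of the encoding these parameters define (dates and step are illustrative, not taken from the example below), a time t maps to the Float64 value (t - zero_time)/step:

using Dates

zero_time = Date(2001, 1, 1)
step = Hour(24)
t = Date(2001, 1, 4)

elapsed = convert(Millisecond, t - zero_time)
unit    = convert(Millisecond, step)
Dates.value(elapsed) / Dates.value(unit)   # 3.0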

Example

using MLJ
 using Dates
 
 x = [Date(2001, 1, 1) + Day(i) for i in 0:4]
@@ -270,7 +270,7 @@
  52.42857142857143
  52.57142857142857
  52.714285714285715
- 52.857142
source

Static transformers

A static transformer is a model for transforming data that does not generalize to new data (does not "learn") but which nevertheless has hyperparameters. For example, the DBSCAN clustering model from Clustering.jl can assign labels to some collection of observations, but cannot directly assign a label to some new observation.

The general user may define their own static models. The main use-case is the insertion into Linear Pipelines of some parameter-dependent transformation. (If a static transformer has no hyper-parameters, it is tantamount to an ordinary function. An ordinary function can be inserted directly into a pipeline; the situation for learning networks is only slightly more complicated.)

The following example defines a new model type Averager to perform the weighted average of two vectors (target predictions, for example). We suppose the weighting is normalized, and therefore controlled by a single hyper-parameter, mix.

mutable struct Averager <: Static
+ 52.857142
source

Static transformers

A static transformer is a model for transforming data that does not generalize to new data (does not "learn") but which nevertheless has hyperparameters. For example, the DBSCAN clustering model from Clustering.jl can assign labels to some collection of observations, but cannot directly assign a label to some new observation.

The general user may define their own static models. The main use-case is the insertion into Linear Pipelines of some parameter-dependent transformation. (If a static transformer has no hyper-parameters, it is tantamount to an ordinary function. An ordinary function can be inserted directly into a pipeline; the situation for learning networks is only slightly more complicated.)

The following example defines a new model type Averager to perform the weighted average of two vectors (target predictions, for example). We suppose the weighting is normalized, and therefore controlled by a single hyper-parameter, mix.

mutable struct Averager <: Static
     mix::Float64
 end
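A hedged sketch of how such a static model can be completed and used, assuming the Averager struct defined above (the transform method follows the Static-model pattern described here; the mix value and input vectors are illustrative):

using MLJ
import MLJBase

MLJBase.transform(a::Averager, _, y1, y2) = a.mix*y2 + (1 - a.mix)*y1

# static models are bound to machines without training data:
mach = machine(Averager(0.5)) |> fit!
transform(mach, [1.0, 2.0], [3.0, 4.0])  # expect [2.0, 3.0]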
 
@@ -434,4 +434,4 @@
  (3, "virginica")
  (3, "virginica")
  (1, "virginica")
- (3, "virginica")
+ (3, "virginica") diff --git a/dev/tuning_models/index.html b/dev/tuning_models/index.html index 37e9d72c3..5a8592899 100644 --- a/dev/tuning_models/index.html +++ b/dev/tuning_models/index.html @@ -1,5 +1,5 @@ -Tuning Models · MLJ

Tuning Models

MLJ provides several built-in and third-party options for optimizing a model's hyper-parameters. The quick-reference table below omits some advanced keyword options.

tuning strategy | notes | package to import | package providing the core algorithm
Grid(goal=nothing, resolution=10) | shuffled by default; goal is upper bound for number of grid points | MLJ.jl or MLJTuning.jl | MLJTuning.jl
RandomSearch(rng=GLOBAL_RNG) | with customizable priors | MLJ.jl or MLJTuning.jl | MLJTuning.jl
LatinHypercube(rng=GLOBAL_RNG) | with discrete parameter support | MLJ.jl or MLJTuning.jl | LatinHypercubeSampling
MLJTreeParzenTuning() | See this example for usage | TreeParzen.jl | TreeParzen.jl (port to Julia of hyperopt)
ParticleSwarm(n_particles=3, rng=GLOBAL_RNG) | Standard Kennedy-Eberhart algorithm, plus discrete parameter support | MLJParticleSwarmOptimization.jl | MLJParticleSwarmOptimization.jl
AdaptiveParticleSwarm(n_particles=3, rng=GLOBAL_RNG) | Zhan et al. variant with automated swarm coefficient updates, plus discrete parameter support | MLJParticleSwarmOptimization.jl | MLJParticleSwarmOptimization.jl
Explicit() | For an explicit list of models of varying type | MLJ.jl or MLJTuning.jl | MLJTuning.jl

Below we illustrate hyperparameter optimization using the Grid, RandomSearch, LatinHypercube and Explicit tuning strategies.

Overview

In MLJ model tuning is implemented as a model wrapper. After wrapping a model in a tuning strategy and binding the wrapped model to data in a machine called mach, calling fit!(mach) instigates a search for optimal model hyperparameters, within a specified range, and then uses all supplied data to train the best model. To predict using that model, one then calls predict(mach, Xnew). In this way, the wrapped model may be viewed as a "self-tuning" version of the unwrapped model. That is, wrapping the model simply transforms certain hyper-parameters into learned parameters.

A corollary of the tuning-as-wrapper approach is that the evaluation of the performance of a TunedModel instance using evaluate! implies nested resampling. This approach is inspired by MLR. See also below.

In MLJ, tuning is an iterative procedure, with an iteration parameter n, the total number of model instances to be evaluated. Accordingly, tuning can be controlled using MLJ's IteratedModel wrapper. After familiarizing oneself with the TunedModel wrapper described below, see Controlling model tuning for more on this advanced feature.

For a more in-depth overview of tuning in MLJ, or for implementation details, see the MLJTuning documentation. For a complete list of options see the TunedModel doc-string below.

Tuning a single hyperparameter using a grid search (regression example)

using MLJ
+Tuning Models · MLJ

Tuning Models

MLJ provides several built-in and third-party options for optimizing a model's hyper-parameters. The quick-reference table below omits some advanced keyword options.

tuning strategy | notes | package to import | package providing the core algorithm
Grid(goal=nothing, resolution=10) | shuffled by default; goal is upper bound for number of grid points | MLJ.jl or MLJTuning.jl | MLJTuning.jl
RandomSearch(rng=GLOBAL_RNG) | with customizable priors | MLJ.jl or MLJTuning.jl | MLJTuning.jl
LatinHypercube(rng=GLOBAL_RNG) | with discrete parameter support | MLJ.jl or MLJTuning.jl | LatinHypercubeSampling
MLJTreeParzenTuning() | See this example for usage | TreeParzen.jl | TreeParzen.jl (port to Julia of hyperopt)
ParticleSwarm(n_particles=3, rng=GLOBAL_RNG) | Standard Kennedy-Eberhart algorithm, plus discrete parameter support | MLJParticleSwarmOptimization.jl | MLJParticleSwarmOptimization.jl
AdaptiveParticleSwarm(n_particles=3, rng=GLOBAL_RNG) | Zhan et al. variant with automated swarm coefficient updates, plus discrete parameter support | MLJParticleSwarmOptimization.jl | MLJParticleSwarmOptimization.jl
Explicit() | For an explicit list of models of varying type | MLJ.jl or MLJTuning.jl | MLJTuning.jl

Below we illustrate hyperparameter optimization using the Grid, RandomSearch, LatinHypercube and Explicit tuning strategies.

Overview

In MLJ model tuning is implemented as a model wrapper. After wrapping a model in a tuning strategy and binding the wrapped model to data in a machine called mach, calling fit!(mach) instigates a search for optimal model hyperparameters, within a specified range, and then uses all supplied data to train the best model. To predict using that model, one then calls predict(mach, Xnew). In this way, the wrapped model may be viewed as a "self-tuning" version of the unwrapped model. That is, wrapping the model simply transforms certain hyper-parameters into learned parameters.

A corollary of the tuning-as-wrapper approach is that the evaluation of the performance of a TunedModel instance using evaluate! implies nested resampling. This approach is inspired by MLR. See also below.

In MLJ, tuning is an iterative procedure, with an iteration parameter n, the total number of model instances to be evaluated. Accordingly, tuning can be controlled using MLJ's IteratedModel wrapper. After familiarizing oneself with the TunedModel wrapper described below, see Controlling model tuning for more on this advanced feature.

For a more in-depth overview of tuning in MLJ, or for implementation details, see the MLJTuning documentation. For a complete list of options see the TunedModel doc-string below.

Tuning a single hyperparameter using a grid search (regression example)

using MLJ
 X = MLJ.table(rand(100, 10));
 y = 2X.x1 - X.x2 + 0.05*rand(100);
 Tree = @load DecisionTreeRegressor pkg=DecisionTree verbosity=0;
@@ -66,8 +66,8 @@
 fit!(mach, verbosity=0)
trained Machine; does not cache data
   model: DeterministicTunedModel(model = DecisionTreeRegressor(max_depth = -1, …), …)
   args: 
-    1:	Source @218 ⏎ Table{AbstractVector{Continuous}}
-    2:	Source @220 ⏎ AbstractVector{Continuous}
+    1:	Source @836 ⏎ Table{AbstractVector{Continuous}}
+    2:	Source @834 ⏎ AbstractVector{Continuous}
 

We can inspect the detailed results of the grid search with report(mach) or just retrieve the optimal model, as here:

fitted_params(mach).best_model
DecisionTreeRegressor(
   max_depth = -1, 
   min_samples_leaf = 5, 
@@ -109,8 +109,8 @@
 fit!(mach, verbosity=0);
trained Machine; does not cache data
   model: ProbabilisticTunedModel(model = KNNClassifier(K = 5, …), …)
   args: 
-    1:	Source @201 ⏎ Table{AbstractVector{Continuous}}
-    2:	Source @663 ⏎ AbstractVector{Multiclass{3}}
+    1:	Source @437 ⏎ Table{AbstractVector{Continuous}}
+    2:	Source @873 ⏎ AbstractVector{Multiclass{3}}
 

Case (ii) - deterministic measure:

self_tuning_knn = TunedModel(
     model=knn,
     resampling = CV(nfolds=4, rng=1234),
@@ -123,8 +123,8 @@
 fit!(mach, verbosity=0);
trained Machine; does not cache data
   model: ProbabilisticTunedModel(model = KNNClassifier(K = 5, …), …)
   args: 
-    1:	Source @037 ⏎ Table{AbstractVector{Continuous}}
-    2:	Source @220 ⏎ AbstractVector{Multiclass{3}}
+    1:	Source @727 ⏎ Table{AbstractVector{Continuous}}
+    2:	Source @740 ⏎ AbstractVector{Multiclass{3}}
 

Let's inspect the best model and corresponding evaluation of the metric in case (ii):

entry = report(mach).best_history_entry
(model = KNNClassifier(K = 9, …),
  measure = StatisticalMeasuresBase.RobustMeasure{StatisticalMeasuresBase.FussyMeasure{StatisticalMeasuresBase.RobustMeasure{StatisticalMeasuresBase.Multimeasure{StatisticalMeasuresBase.SupportsMissingsMeasure{StatisticalMeasures.MisclassificationRateOnScalars}, Nothing, StatisticalMeasuresBase.Mean, typeof(identity)}}, Nothing}}[MisclassificationRate()],
  measurement = [0.02666666666666667],
@@ -181,8 +181,8 @@
 fit!(mach, verbosity=0);
trained Machine; does not cache data
   model: DeterministicTunedModel(model = DeterministicEnsembleModel(model = DecisionTreeRegressor(max_depth = -1, …), …), …)
   args: 
-    1:	Source @708 ⏎ Table{AbstractVector{Continuous}}
-    2:	Source @470 ⏎ AbstractVector{Continuous}
+    1:	Source @023 ⏎ Table{AbstractVector{Continuous}}
+    2:	Source @976 ⏎ AbstractVector{Continuous}
 

We can plot the grid search results:

using Plots
 plot(mach)

Instead of specifying a goal, we can declare a global resolution, which is overridden for a particular parameter by pairing its range with the resolution desired. In the next example, the default resolution=100 is applied to the r2 field, but a resolution of 3 is applied to the r1 field. Additionally, we ask that the grid points be randomly traversed and the total number of evaluations be limited to 25.

tuning = Grid(resolution=100, shuffle=true, rng=1234)
 self_tuning_forest = TunedModel(
@@ -196,8 +196,8 @@
 fit!(machine(self_tuning_forest, X, y), verbosity=0);
trained Machine; does not cache data
   model: DeterministicTunedModel(model = DeterministicEnsembleModel(model = DecisionTreeRegressor(max_depth = -1, …), …), …)
   args: 
-    1:	Source @214 ⏎ Table{AbstractVector{Continuous}}
-    2:	Source @347 ⏎ AbstractVector{Continuous}
+    1:	Source @550 ⏎ Table{AbstractVector{Continuous}}
+    2:	Source @017 ⏎ AbstractVector{Continuous}
 

For more options for a grid search, see Grid below.

Let's attempt to tune the same hyperparameters using a RandomSearch tuning strategy. By default, bounded numeric ranges like r1 and r2 are sampled uniformly (before rounding, in the case of the integer range r1). Positive unbounded ranges are sampled using a Gamma distribution by default, and all others using a (truncated) normal distribution.

self_tuning_forest = TunedModel(
     model=forest,
     tuning=RandomSearch(),
@@ -212,8 +212,8 @@
 fit!(mach, verbosity=0)
trained Machine; does not cache data
   model: DeterministicTunedModel(model = DeterministicEnsembleModel(model = DecisionTreeRegressor(max_depth = -1, …), …), …)
   args: 
-    1:	Source @552 ⏎ Table{AbstractVector{Continuous}}
-    2:	Source @577 ⏎ AbstractVector{Continuous}
+    1:	Source @784 ⏎ Table{AbstractVector{Continuous}}
+    2:	Source @612 ⏎ AbstractVector{Continuous}
 
using Plots
 plot(mach)

The prior distributions used for sampling each hyperparameter can be customized, as can the global fallbacks. See the RandomSearch doc-string below for details.

Tuning using Latin hypercube sampling

One can also tune the hyperparameters using the LatinHypercube tuning strategy. This method uses a genetic-based optimization algorithm based on the inverse of the Audze-Eglais function, using the library LatinHypercubeSampling.jl.

We'll work with the data X, y and ranges r1 and r2 defined above and instantiate a Latin hypercube tuning strategy:

latin = LatinHypercube(gens=2, popsize=120)
LatinHypercube(
   gens = 2, 
@@ -236,8 +236,8 @@
 fit!(mach, verbosity=0)
trained Machine; does not cache data
   model: DeterministicTunedModel(model = DeterministicEnsembleModel(model = DecisionTreeRegressor(max_depth = -1, …), …), …)
   args: 
-    1:	Source @731 ⏎ Table{AbstractVector{Continuous}}
-    2:	Source @995 ⏎ AbstractVector{Continuous}
+    1:	Source @005 ⏎ Table{AbstractVector{Continuous}}
+    2:	Source @803 ⏎ AbstractVector{Continuous}
 
using Plots
 plot(mach)

Comparing models of different type and nested cross-validation

Instead of mutating hyperparameters of a fixed model, one can instead optimise over an explicit list of models, whose types are allowed to vary. As with other tuning strategies, evaluating the resulting TunedModel itself implies nested resampling (e.g., nested cross-validation) which we now examine in a bit more detail.

tree = (@load DecisionTreeClassifier pkg=DecisionTree verbosity=0)()
 knn = (@load KNNClassifier pkg=NearestNeighborModels verbosity=0)()
@@ -281,8 +281,8 @@
  measurement = [0.7208730677823433],
  per_fold = [[2.220446049250313e-16, 2.1202149052421855, 2.220446049250313e-16]],
  evaluation = CompactPerformanceEvaluation(0.721,),)

Reference

Base.range - Function
r = range(model, :hyper; values=nothing)

Define a one-dimensional NominalRange object for a field hyper of model. Note that r is not directly iterable but iterator(r) is.

A nested hyperparameter is specified using dot notation. For example, :(atom.max_depth) specifies the max_depth hyperparameter of the submodel model.atom.

r = range(model, :hyper; upper=nothing, lower=nothing,
-          scale=nothing, values=nothing)

Assuming values is not specified, define a one-dimensional NumericRange object for a Real field hyper of model. Note that r is not directly iterable, but iterator(r, n) is an iterator of length n. To generate random elements from r, instead apply rand methods to sampler(r). The supported scales are :linear, :log, :logminus, :log10, :log10minus, :log2, or a callable object.

Note that r is not directly iterable, but iterator(r, n) is, for given resolution (length) n.

By default, the behaviour of the constructed object depends on the type of the value of the hyperparameter :hyper at model at the time of construction. To override this behaviour (for instance if model is not available) specify a type in place of model so the behaviour is determined by the value of the specified type.

A nested hyperparameter is specified using dot notation (see above).

If scale is unspecified, it is set to :linear, :log, :log10minus, or :linear, according to whether the interval (lower, upper) is bounded, right-unbounded, left-unbounded, or doubly unbounded, respectively. Note upper=Inf and lower=-Inf are allowed.

If values is specified, the other keyword arguments are ignored and a NominalRange object is returned (see above).

See also: iterator, sampler

source
MLJBase.iterator - Function
iterator([rng, ], r::NominalRange, [,n])
-iterator([rng, ], r::NumericRange, n)

Return an iterator (currently a vector) for a ParamRange object r. In the first case iteration is over all values stored in the range (or just the first n, if n is specified). In the second case, the iteration is over approximately n ordered values, generated as follows:

(i) First, exactly n values are generated between U and L, with a spacing determined by r.scale (uniform if scale=:linear) where U and L are given by the following table:

r.lower | r.upper | L | U
finite | finite | r.lower | r.upper
-Inf | finite | r.upper - 2r.unit | r.upper
finite | Inf | r.lower | r.lower + 2r.unit
-Inf | Inf | r.origin - r.unit | r.origin + r.unit

(ii) If a callable f is provided as scale, then a uniform spacing is always applied in (i) but f is broadcast over the results. (Unlike ordinary scales, this alters the effective range of values generated, instead of just altering the spacing.)

(iii) If r is a discrete numeric range (r isa NumericRange{<:Integer}) then the values are additionally rounded, with any duplicate values removed. Otherwise all the values are used (and there are exactly n of them).

(iv) Finally, if a random number generator rng is specified, then the values are returned in random order (sampling without replacement), and otherwise they are returned in numeric order, or in the order provided to the range constructor, in the case of a NominalRange.

source
Distributions.sampler - Function
sampler(r::NominalRange, probs::AbstractVector{<:Real})
+          scale=nothing, values=nothing)

Assuming values is not specified, define a one-dimensional NumericRange object for a Real field hyper of model. Note that r is not directly iterable, but iterator(r, n) is an iterator of length n. To generate random elements from r, instead apply rand methods to sampler(r). The supported scales are :linear, :log, :logminus, :log10, :log10minus, :log2, or a callable object.

Note that r is not directly iterable, but iterator(r, n) is, for given resolution (length) n.

By default, the behaviour of the constructed object depends on the type of the value of the hyperparameter :hyper at model at the time of construction. To override this behaviour (for instance if model is not available) specify a type in place of model so the behaviour is determined by the value of the specified type.

A nested hyperparameter is specified using dot notation (see above).

If scale is unspecified, it is set to :linear, :log, :log10minus, or :linear, according to whether the interval (lower, upper) is bounded, right-unbounded, left-unbounded, or doubly unbounded, respectively. Note upper=Inf and lower=-Inf are allowed.

If values is specified, the other keyword arguments are ignored and a NominalRange object is returned (see above).

See also: iterator, sampler

source
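A hedged usage sketch of constructing and iterating a numeric range (the hyper-parameter name max_depth is hypothetical; a type is passed in place of a model, as permitted above):

using MLJ

r = range(Int, :max_depth, lower=1, upper=16, scale=:log2)   # a NumericRange
iterator(r, 5)    # approximately 5 log2-spaced integers, duplicates removed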
MLJBase.iterator - Function
iterator([rng, ], r::NominalRange, [,n])
+iterator([rng, ], r::NumericRange, n)

Return an iterator (currently a vector) for a ParamRange object r. In the first case iteration is over all values stored in the range (or just the first n, if n is specified). In the second case, the iteration is over approximately n ordered values, generated as follows:

(i) First, exactly n values are generated between U and L, with a spacing determined by r.scale (uniform if scale=:linear) where U and L are given by the following table:

r.lower | r.upper | L | U
finite | finite | r.lower | r.upper
-Inf | finite | r.upper - 2r.unit | r.upper
finite | Inf | r.lower | r.lower + 2r.unit
-Inf | Inf | r.origin - r.unit | r.origin + r.unit

(ii) If a callable f is provided as scale, then a uniform spacing is always applied in (i) but f is broadcast over the results. (Unlike ordinary scales, this alters the effective range of values generated, instead of just altering the spacing.)

(iii) If r is a discrete numeric range (r isa NumericRange{<:Integer}) then the values are additionally rounded, with any duplicate values removed. Otherwise all the values are used (and there are exactly n of them).

(iv) Finally, if a random number generator rng is specified, then the values are returned in random order (sampling without replacement), and otherwise they are returned in numeric order, or in the order provided to the range constructor, in the case of a NominalRange.

source
Distributions.sampler - Function
sampler(r::NominalRange, probs::AbstractVector{<:Real})
 sampler(r::NominalRange)
 sampler(r::NumericRange{T}, d)

Construct an object s which can be used to generate random samples from a ParamRange object r (a one-dimensional range) using one of the following calls:

rand(s)             # for one sample
 rand(s, n)          # for n samples
@@ -309,7 +309,7 @@
 [5.0, 5.5) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 221
 [5.5, 6.0) ┤ 0
 [6.0, 6.5) ┤▇▇▇▇▇▇▇▇▇▇▇ 89
-           └                                        ┘
source
StatsAPI.fit - Method
Distributions.fit(D, r::MLJBase.NumericRange)

Fit and return a distribution d of type D to the one-dimensional range r.

Only types D in the table below are supported.

The distribution d is constructed in two stages. First, a distribution d0, characterized by the conditions in the second column of the table, is fit to r. Then d0 is truncated between r.lower and r.upper to obtain d.

Distribution type D | Characterization of d0
Arcsine, Uniform, Biweight, Cosine, Epanechnikov, SymTriangularDist, Triweight | minimum(d) = r.lower, maximum(d) = r.upper
Normal, Gamma, InverseGaussian, Logistic, LogNormal | mean(d) = r.origin, std(d) = r.unit
Cauchy, Gumbel, Laplace, (Normal) | Dist.location(d) = r.origin, Dist.scale(d) = r.unit
Poisson | Dist.mean(d) = r.unit

Here Dist = Distributions.

source
StatsAPI.fit - Method
Distributions.fit(D, r::MLJBase.NumericRange)

Fit and return a distribution d of type D to the one-dimensional range r.

Only types D in the table below are supported.

The distribution d is constructed in two stages. First, a distribution d0, characterized by the conditions in the second column of the table, is fit to r. Then d0 is truncated between r.lower and r.upper to obtain d.

Distribution type D | Characterization of d0
Arcsine, Uniform, Biweight, Cosine, Epanechnikov, SymTriangularDist, Triweight | minimum(d) = r.lower, maximum(d) = r.upper
Normal, Gamma, InverseGaussian, Logistic, LogNormal | mean(d) = r.origin, std(d) = r.unit
Cauchy, Gumbel, Laplace, (Normal) | Dist.location(d) = r.origin, Dist.scale(d) = r.unit
Poisson | Dist.mean(d) = r.unit

Here Dist = Distributions.

source
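A hedged sketch combining this fit method with sampler above (the field name lambda is hypothetical; assumes Distributions.jl is available in the active environment):

using MLJ
import Distributions

r = range(Float64, :lambda, lower=0.0, upper=1.0)  # origin = 0.5, unit = 0.5
d = Distributions.fit(Distributions.Normal, r)     # Normal(0.5, 0.5), truncated to [0, 1]
s = sampler(r, d)
rand(s, 3)                                         # three samples from the truncated prior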
MLJTuning.TunedModel - Function
tuned_model = TunedModel(; model=<model to be mutated>,
                          tuning=RandomSearch(),
                          resampling=Holdout(),
                          range=nothing,
@@ -321,11 +321,11 @@
                          measure=nothing,
                          n=length(models),
                          operation=nothing,
-                         other_options...)

Construct a wrapper for multiple models, for selection of an optimal one (equivalent to specifying tuning=Explicit() and range=models above). Elements of the iterator models need not have a common type, but they must all be Deterministic or all be Probabilistic and this is not checked but inferred from the first element generated.

See below for a complete list of options.

Training

Calling fit!(mach) on a machine mach=machine(tuned_model, X, y) or mach=machine(tuned_model, X, y, w) will:

  • Instigate a search, over clones of model, with the hyperparameter mutations specified by range, for a model optimizing the specified measure, using performance evaluations carried out using the specified tuning strategy and resampling strategy. In the case that models is explicitly listed, the search is instead over the models generated by the iterator models.

  • Fit an internal machine, based on the optimal model fitted_params(mach).best_model, wrapping the optimal model object in all the provided data X, y(, w). Calling predict(mach, Xnew) then returns predictions on Xnew of this internal machine. The final train can be suppressed by setting train_best=false.

Search space

The range objects supported depend on the tuning strategy specified. Query the strategy docstring for details. To optimize over an explicit list v of models of the same type, use strategy=Explicit() and specify model=v[1] and range=v.

The number of models searched is specified by n. If unspecified, then MLJTuning.default_n(tuning, range) is used. When n is increased and fit!(mach) called again, the old search history is re-instated and the search continues where it left off.

Measures (metrics)

If more than one measure is specified, then only the first is optimized (unless strategy is multi-objective) but the performance against every measure specified will be computed and reported in report(mach).best_performance and other relevant attributes of the generated report. Options exist to pass per-observation weights or class weights to measures; see below.

Important. If a custom measure, my_measure is used, and the measure is a score, rather than a loss, be sure to check that MLJ.orientation(my_measure) == :score to ensure maximization of the measure, rather than minimization. Override an incorrect value with MLJ.orientation(::typeof(my_measure)) = :score.

Accessing the fitted parameters and other training (tuning) outcomes

A Plots.jl plot of performance estimates is returned by plot(mach) or heatmap(mach).

Once a tuning machine mach has been trained as above, then fitted_params(mach) has these keys/values:

key | value
best_model | optimal model instance
best_fitted_params | learned parameters of the optimal model

The named tuple report(mach) includes these keys/values:

key | value
best_model | optimal model instance
best_history_entry | corresponding entry in the history, including performance estimate
best_report | report generated by fitting the optimal model to all data
history | tuning strategy-specific history of all evaluations

plus other key/value pairs specific to the tuning strategy.

Each element of history is a property-accessible object with these properties:

key | value
measure | vector of measures (metrics)
measurement | vector of measurements, one per measure
per_fold | vector of vectors of unaggregated per-fold measurements
evaluation | full PerformanceEvaluation/CompactPerformanceEvaluation object

Complete list of key-word options

  • model: Supervised model prototype that is cloned and mutated to generate models for evaluation

  • models: Alternatively, an iterator of MLJ models to be explicitly evaluated. These may have varying types.

  • tuning=RandomSearch(): tuning strategy to be applied (eg, Grid()). See the Tuning Models section of the MLJ manual for a complete list of options.

  • resampling=Holdout(): resampling strategy (eg, Holdout(), CV(), StratifiedCV()) to be applied in performance evaluations

  • measure: measure or measures to be applied in performance evaluations; only the first used in optimization (unless the strategy is multi-objective) but all reported to the history

  • weights: per-observation weights to be passed to the measure(s) in performance evaluations, where supported. Check support with supports_weights(measure).

  • class_weights: class weights to be passed to the measure(s) in performance evaluations, where supported. Check support with supports_class_weights(measure).

  • repeats=1: for generating train/test sets multiple times in resampling ("Monte Carlo" resampling); see evaluate! for details

  • operation/operations - One of predict, predict_mean, predict_mode, predict_median, or predict_joint, or a vector of these of the same length as measure/measures. Automatically inferred if left unspecified.

  • range: range object; tuning strategy documentation describes supported types

  • selection_heuristic: the rule determining how the best model is decided. According to the default heuristic, NaiveSelection(), measure (or the first element of measure) is evaluated for each resample and these per-fold measurements are aggregated. The model with the lowest (resp. highest) aggregate is chosen if the measure is a :loss (resp. a :score).

  • n: number of iterations (ie, models to be evaluated); set by tuning strategy if left unspecified

  • train_best=true: whether to train the optimal model

  • acceleration=default_resource(): mode of parallelization for tuning strategies that support this

  • acceleration_resampling=CPU1(): mode of parallelization for resampling

  • check_measure=true: whether to check that measure is compatible with the specified model and operation

  • cache=true: whether to cache model-specific representations of user-supplied data; set to false to conserve memory. Speed gains likely limited to the case resampling isa Holdout.

  • compact_history=true: whether to write CompactPerformanceEvaluation or regular PerformanceEvaluation objects to the history (accessed via the :evaluation key); the compact form excludes some fields to conserve memory.

source
MLJTuning.Grid - Type
Grid(goal=nothing, resolution=10, rng=Random.GLOBAL_RNG, shuffle=true)

Instantiate a Cartesian grid-based hyperparameter tuning strategy with a specified number of grid points as goal, or using a specified default resolution in each numeric dimension.

Supported ranges:

A single one-dimensional range or vector of one-dimensional ranges can be specified. Specifically, in Grid search, the range field of a TunedModel instance can be:

  • A single one-dimensional range - ie, ParamRange object - r, or pair of the form (r, res) where res specifies a resolution to override the default resolution.

  • Any vector of objects of the above form

Two elements of a range vector may share the same field attribute, with the effect that their grids are combined, as in Example 3 below.

ParamRange objects are constructed using the range method.

Example 1:

range(model, :hyper1, lower=1, origin=2, unit=1)

Example 2:

[(range(model, :hyper1, lower=1, upper=10), 15),
+                         other_options...)

Construct a wrapper for multiple models, for selection of an optimal one (equivalent to specifying tuning=Explicit() and range=models above). Elements of the iterator models need not have a common type, but they must all be Deterministic or all be Probabilistic and this is not checked but inferred from the first element generated.

See below for a complete list of options.

Training

Calling fit!(mach) on a machine mach=machine(tuned_model, X, y) or mach=machine(tuned_model, X, y, w) will:

  • Instigate a search, over clones of model, with the hyperparameter mutations specified by range, for a model optimizing the specified measure, using performance evaluations carried out using the specified tuning strategy and resampling strategy. In the case that models is explicitly listed, the search is instead over the models generated by the iterator models.

  • Fit an internal machine, based on the optimal model fitted_params(mach).best_model, wrapping the optimal model object in all the provided data X, y(, w). Calling predict(mach, Xnew) then returns predictions on Xnew of this internal machine. The final train can be suppressed by setting train_best=false.

Search space

The range objects supported depend on the tuning strategy specified. Query the strategy docstring for details. To optimize over an explicit list v of models of the same type, use strategy=Explicit() and specify model=v[1] and range=v.

The number of models searched is specified by n. If unspecified, then MLJTuning.default_n(tuning, range) is used. When n is increased and fit!(mach) called again, the old search history is re-instated and the search continues where it left off.

Measures (metrics)

If more than one measure is specified, then only the first is optimized (unless strategy is multi-objective) but the performance against every measure specified will be computed and reported in report(mach).best_performance and other relevant attributes of the generated report. Options exist to pass per-observation weights or class weights to measures; see below.

Important. If a custom measure, my_measure is used, and the measure is a score, rather than a loss, be sure to check that MLJ.orientation(my_measure) == :score to ensure maximization of the measure, rather than minimization. Override an incorrect value with MLJ.orientation(::typeof(my_measure)) = :score.

Accessing the fitted parameters and other training (tuning) outcomes

A Plots.jl plot of performance estimates is returned by plot(mach) or heatmap(mach).

Once a tuning machine mach has been trained as above, then fitted_params(mach) has these keys/values:

key | value
best_model | optimal model instance
best_fitted_params | learned parameters of the optimal model

The named tuple report(mach) includes these keys/values:

key | value
best_model | optimal model instance
best_history_entry | corresponding entry in the history, including performance estimate
best_report | report generated by fitting the optimal model to all data
history | tuning strategy-specific history of all evaluations

plus other key/value pairs specific to the tuning strategy.

Each element of history is a property-accessible object with these properties:

key | value
measure | vector of measures (metrics)
measurement | vector of measurements, one per measure
per_fold | vector of vectors of unaggregated per-fold measurements
evaluation | full PerformanceEvaluation/CompactPerformanceEvaluation object

Complete list of key-word options

  • model: Supervised model prototype that is cloned and mutated to generate models for evaluation

  • models: Alternatively, an iterator of MLJ models to be explicitly evaluated. These may have varying types.

  • tuning=RandomSearch(): tuning strategy to be applied (eg, Grid()). See the Tuning Models section of the MLJ manual for a complete list of options.

  • resampling=Holdout(): resampling strategy (eg, Holdout(), CV(), StratifiedCV()) to be applied in performance evaluations

  • measure: measure or measures to be applied in performance evaluations; only the first used in optimization (unless the strategy is multi-objective) but all reported to the history

  • weights: per-observation weights to be passed to the measure(s) in performance evaluations, where supported. Check support with supports_weights(measure).

  • class_weights: class weights to be passed to the measure(s) in performance evaluations, where supported. Check support with supports_class_weights(measure).

  • repeats=1: for generating train/test sets multiple times in resampling ("Monte Carlo" resampling); see evaluate! for details

  • operation/operations - One of predict, predict_mean, predict_mode, predict_median, or predict_joint, or a vector of these of the same length as measure/measures. Automatically inferred if left unspecified.

  • range: range object; tuning strategy documentation describes supported types

  • selection_heuristic: the rule determining how the best model is decided. According to the default heuristic, NaiveSelection(), measure (or the first element of measure) is evaluated for each resample and these per-fold measurements are aggregated. The model with the lowest (resp. highest) aggregate is chosen if the measure is a :loss (resp. a :score).

  • n: number of iterations (ie, models to be evaluated); set by tuning strategy if left unspecified

  • train_best=true: whether to train the optimal model

  • acceleration=default_resource(): mode of parallelization for tuning strategies that support this

  • acceleration_resampling=CPU1(): mode of parallelization for resampling

  • check_measure=true: whether to check that measure is compatible with the specified model and operation

  • cache=true: whether to cache model-specific representations of user-supplied data; set to false to conserve memory. Speed gains likely limited to the case resampling isa Holdout.

  • compact_history=true: whether to write CompactPerformanceEvaluation or regular PerformanceEvaluation objects to the history (accessed via the :evaluation key); the compact form excludes some fields to conserve memory.

source
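Below is a hedged sketch of the models=... form for selecting among models of different types (the model choices, resampling and measure are illustrative; requires DecisionTree.jl and NearestNeighborModels.jl):

using MLJ

X, y = @load_iris
tree = (@load DecisionTreeClassifier pkg=DecisionTree verbosity=0)()
knn  = (@load KNNClassifier pkg=NearestNeighborModels verbosity=0)()

multi = TunedModel(models=[tree, knn], resampling=CV(nfolds=3), measure=log_loss)
mach = machine(multi, X, y) |> fit!
report(mach).best_model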
MLJTuning.Grid - Type
Grid(goal=nothing, resolution=10, rng=Random.GLOBAL_RNG, shuffle=true)

Instantiate a Cartesian grid-based hyperparameter tuning strategy with a specified number of grid points as goal, or using a specified default resolution in each numeric dimension.

Supported ranges:

A single one-dimensional range or vector of one-dimensional ranges can be specified. Specifically, in Grid search, the range field of a TunedModel instance can be:

  • A single one-dimensional range - ie, ParamRange object - r, or pair of the form (r, res) where res specifies a resolution to override the default resolution.

  • Any vector of objects of the above form

Two elements of a range vector may share the same field attribute, with the effect that their grids are combined, as in Example 3 below.

ParamRange objects are constructed using the range method.

Example 1:

range(model, :hyper1, lower=1, origin=2, unit=1)

Example 2:

[(range(model, :hyper1, lower=1, upper=10), 15),
   range(model, :hyper2, lower=2, upper=4),
   range(model, :hyper3, values=[:ball, :tree])]

Example 3:

# a range generating the grid `[1, 2, 10, 20, 30]` for `:hyper1`:
 [range(model, :hyper1, values=[1, 2]),
- (range(model, :hyper1, lower= 10, upper=30), 3)]

Note: All the field values of the ParamRange objects (:hyper1, :hyper2, :hyper3 in the preceding example) must refer to field names of a single model (the model specified during TunedModel construction).

Algorithm

This is a standard grid search with the following specifics: In all cases all values of each specified NominalRange are exhausted. If goal is specified, then all resolutions are ignored, and a global resolution is applied to the NumericRange objects that maximizes the number of grid points, subject to the restriction that this not exceed goal. (This assumes no field appears twice in the range vector.) Otherwise the default resolution and any parameter-specific resolutions apply.

In all cases the models generated are shuffled using rng, unless shuffle=false.

See also TunedModel, range.

source
MLJTuning.RandomSearch - Type
RandomSearch(bounded=Distributions.Uniform,
+ (range(model, :hyper1, lower= 10, upper=30), 3)]

Note: All the field values of the ParamRange objects (:hyper1, :hyper2, :hyper3 in the preceding example) must refer to field names of a single model (the model specified during TunedModel construction).

Algorithm

This is a standard grid search with the following specifics: In all cases all values of each specified NominalRange are exhausted. If goal is specified, then all resolutions are ignored, and a global resolution is applied to the NumericRange objects that maximizes the number of grid points, subject to the restriction that this not exceed goal. (This assumes no field appears twice in the range vector.) Otherwise the default resolution and any parameter-specific resolutions apply.

In all cases the models generated are shuffled using rng, unless shuffle=false.

See also TunedModel, range.

source
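Below is a hedged sketch of a Grid search in which goal caps the total number of grid points (the model, field names, and the cap of 25 are illustrative; requires DecisionTree.jl):

using MLJ

Tree = @load DecisionTreeClassifier pkg=DecisionTree verbosity=0
tree = Tree()
r1 = range(tree, :max_depth, lower=1, upper=10)
r2 = range(tree, :min_samples_split, lower=2, upper=20)

self_tuning_tree = TunedModel(
    model=tree,
    tuning=Grid(goal=25),
    resampling=CV(nfolds=3),
    range=[r1, r2],
    measure=log_loss,
)

X, y = @load_iris
mach = machine(self_tuning_tree, X, y) |> fit!
fitted_params(mach).best_model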
MLJTuning.RandomSearch - Type
RandomSearch(bounded=Distributions.Uniform,
              positive_unbounded=Distributions.Gamma,
              other=Distributions.Normal,
              rng=Random.GLOBAL_RNG)

Instantiate a random search tuning strategy, for searching over Cartesian hyperparameter domains, with customizable priors in each dimension.

Supported ranges

A single one-dimensional range or vector of one-dimensional ranges can be specified. If not paired with a prior, then one is fitted, according to fallback distribution types specified by the tuning strategy hyperparameters. Specifically, in RandomSearch, the range field of a TunedModel instance can be:

  • a single one-dimensional range (ParamRange object) r

  • a pair of the form (r, d), with r as above and where d is:

    • a probability vector of the same length as r.values (r a NominalRange)

    • any Distributions.UnivariateDistribution instance (r a NumericRange)

    • one of the subtypes of Distributions.UnivariateDistribution listed in the table below, for automatic fitting using Distributions.fit(d, r), a distribution whose support always lies between r.lower and r.upper (r a NumericRange)

  • any pair of the form (field, s), where field is the (possibly nested) name of a field of the model to be tuned, and s an arbitrary sampler object for that field. This means only that rand(rng, s) is defined and returns valid values for the field.

  • any vector of objects of the above form

A range vector may contain multiple entries for the same model field, as in range = [(:lambda, s1), (:alpha, s), (:lambda, s2)]. In that case the entry used in each iteration is random.

distribution types | for fitting to ranges of this type
Arcsine, Uniform, Biweight, Cosine, Epanechnikov, SymTriangularDist, Triweight | bounded
Gamma, InverseGaussian, Poisson | positive (bounded or unbounded)
Normal, Logistic, LogNormal, Cauchy, Gumbel, Laplace | any

ParamRange objects are constructed using the range method.

Examples

using Distributions
@@ -340,7 +340,7 @@
 # uniform sampling of :(atom.λ) from [0, 1] without defining a NumericRange:
 struct MySampler end
 Base.rand(rng::Random.AbstractRNG, ::MySampler) = rand(rng)
-range3 = (:(atom.λ), MySampler())

Algorithm

In each iteration, a model is generated for evaluation by mutating the fields of a deep copy of model. The range vector is shuffled and the fields sampled according to the new order (repeated fields being mutated more than once). For a range entry of the form (field, s) the algorithm calls rand(rng, s) and mutates the field field of the model clone to have this value. For an entry of the form (r, d), s is substituted with sampler(r, d). If no d is specified, then sampling is uniform (with replacement) if r is a NominalRange, and is otherwise given by the defaults specified by the tuning strategy parameters bounded, positive_unbounded, and other, depending on the field values of the NumericRange object r.

See also TunedModel, range, sampler.

source
MLJTuning.LatinHypercube - Type
LatinHypercube(gens = 1,
+range3 = (:(atom.λ), MySampler())

Algorithm

In each iteration, a model is generated for evaluation by mutating the fields of a deep copy of model. The range vector is shuffled and the fields sampled according to the new order (repeated fields being mutated more than once). For a range entry of the form (field, s) the algorithm calls rand(rng, s) and mutates the field field of the model clone to have this value. For an entry of the form (r, d), s is substituted with sampler(r, d). If no d is specified, then sampling is uniform (with replacement) if r is a NominalRange, and is otherwise given by the defaults specified by the tuning strategy parameters bounded, positive_unbounded, and other, depending on the field values of the NumericRange object r.

See also TunedModel, range, sampler.

source
MLJTuning.LatinHypercube - Type
LatinHypercube(gens = 1,
                popsize = 100,
                ntour = 2,
                ptour = 0.8,
@@ -351,4 +351,4 @@
                          tuning=LatinHypercube(...),
                          range=...,
                          measures=...,
-                         n=...)

(See TunedModel for complete options.)

To use a periodic version of the Audze-Eglais function (to reduce clustering along the boundaries) specify periodic_ae = true.

Supported ranges:

A single one-dimensional range or vector of one-dimensional ranges can be specified. Specifically, in LatinHypercubeSampling search, the range field of a TunedModel instance can be:

  • A single one-dimensional range - ie, ParamRange object - r, constructed using the range method.

  • Any vector of objects of the above form

Both NumericRanges and NominalRanges are supported, and hyper-parameter values are sampled on a scale specified by the range (eg, r.scale = :log).

source
+ n=...)

(See TunedModel for complete options.)

To use a periodic version of the Audze-Eglais function (to reduce clustering along the boundaries) specify periodic_ae = true.

Supported ranges:

A single one-dimensional range or vector of one-dimensional ranges can be specified. Specifically, in LatinHypercubeSampling search, the range field of a TunedModel instance can be:

  • A single one-dimensional range - ie, ParamRange object - r, constructed using the range method.

  • Any vector of objects of the above form

Both NumericRanges and NominalRanges are supported, and hyper-parameter values are sampled on a scale specified by the range (eg, r.scale = :log).

source
diff --git a/dev/weights/index.html b/dev/weights/index.html index dedcf8207..09814ad58 100644 --- a/dev/weights/index.html +++ b/dev/weights/index.html @@ -1,5 +1,5 @@ -Weights · MLJ

Weights

In machine learning it is possible to assign each observation an independent significance, or weight, either in training or in performance evaluation, or both.

There are two kinds of weights in use in MLJ:

  • per observation weights (also just called weights) refer to weight vectors of the same length as the number of observations

  • class weights refer to dictionaries keyed on the target classes (levels) for use in classification problems

Specifying weights in training

To specify weights in training you bind the weights to the model along with the data when constructing a machine. For supervised models the weights are specified last:

KNNRegressor = @load KNNRegressor
+Weights · MLJ

Weights

In machine learning it is possible to assign each observation an independent significance, or weight, either in training or in performance evaluation, or both.

There are two kinds of weights in use in MLJ:

  • per observation weights (also just called weights) refer to weight vectors of the same length as the number of observations

  • class weights refer to dictionaries keyed on the target classes (levels) for use in classification problems

Specifying weights in training

To specify weights in training you bind the weights to the model along with the data when constructing a machine. For supervised models the weights are specified last:

KNNRegressor = @load KNNRegressor
 model = KNNRegressor()
 X, y = make_regression(10, 3)
 w = rand(length(y))
@@ -9,4 +9,4 @@
 end

The model model supports class weights if supports_class_weights(model) is true.
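As a hedged illustration of these support queries (the model choice is arbitrary; the returned Bool values are not asserted here):

using MLJ

KNNRegressor = @load KNNRegressor pkg=NearestNeighborModels verbosity=0
model = KNNRegressor()
supports_weights(model)        # true if per-observation weights can be supplied in training
supports_class_weights(model)  # true if class weights can be supplied in training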

Specifying weights in performance evaluation

When calling a measure (metric) that supports weights, provide the weights as the last argument, as in

_, y = @load_iris
 ŷ = shuffle(y)
 w = Dict("versicolor" => 1, "setosa" => 2, "virginica"=> 3)
-macro_f1score(ŷ, y, w)

Some measures also support specification of a class weight dictionary. For details see the StatisticalMeasures.jl tutorial.

To pass weights to all the measures listed in an evaluate!/evaluate call, use the keyword specifiers weights=... or class_weights=.... For details, see Evaluating Model Performance.

+macro_f1score(ŷ, y, w)

Some measures also support specification of a class weight dictionary. For details see the StatisticalMeasures.jl tutorial.

To pass weights to all the measures listed in an evaluate!/evaluate call, use the keyword specifiers weights=... or class_weights=.... For details, see Evaluating Model Performance.

diff --git a/dev/working_with_categorical_data/index.html b/dev/working_with_categorical_data/index.html index fb614ab34..3decb130c 100644 --- a/dev/working_with_categorical_data/index.html +++ b/dev/working_with_categorical_data/index.html @@ -1,5 +1,5 @@ -Working with Categorical Data · MLJ

Working with Categorical Data

Scientific types for discrete data

Recall that models articulate their data requirements using scientific types (see Getting Started or the ScientificTypes.jl documentation). There are three scientific types discrete data can have: Count, OrderedFactor and Multiclass.

Count data

In MLJ you cannot use integers to represent (finite) categorical data. Integers are reserved for discrete data you want interpreted as Count <: Infinite:

scitype([1, 4, 5, 6])
AbstractVector{Count} (alias for AbstractArray{Count, 1})

The Count scientific type includes things like the number of phone calls, or city populations, and other "frequency" data of a generally unbounded nature.

That said, you may have data that is theoretically Count, but which you coerce to OrderedFactor to enable the use of more models, trusting to your knowledge of how those models work to inform an appropriate interpretation.

OrderedFactor and Multiclass data

Other integer data, such as the number of an animal's legs, or number of rooms in homes, are, generally, coerced to OrderedFactor <: Finite. The other categorical scientific type is Multiclass <: Finite, which is for unordered categorical data. Coercing data to one of these two forms is discussed under Detecting and coercing improperly represented categorical data below.

Binary data

There is no separate scientific type for binary data. Binary data is OrderedFactor{2} if ordered, and Multiclass{2} otherwise. Data with type OrderedFactor{2} is considered to have an intrinsic "positive" class, e.g., the outcome of a medical test, and the "pass/fail" outcome of an exam. MLJ measures, such as true_positive, assume the second class in the ordering is the "positive" class. Inspecting and changing order are discussed in the next section.

If data has type Bool it is considered Count data (as Bool <: Integer) and, generally, users will want to coerce such data to Multiclass or OrderedFactor.

Detecting and coercing improperly represented categorical data

One inspects the scientific type of data using scitype as shown above. To inspect all column scientific types in a table simultaneously, use schema. (The scitype(X) of a table X contains a condensed form of this information used in type dispatch; see here.)

import DataFrames: DataFrame
+Working with Categorical Data · MLJ

Working with Categorical Data

Scientific types for discrete data

Recall that models articulate their data requirements using scientific types (see Getting Started or the ScientificTypes.jl documentation). There are three scientific types discrete data can have: Count, OrderedFactor and Multiclass.

Count data

In MLJ you cannot use integers to represent (finite) categorical data. Integers are reserved for discrete data you want interpreted as Count <: Infinite:

scitype([1, 4, 5, 6])
AbstractVector{Count} (alias for AbstractArray{Count, 1})

The Count scientific type includes things like the number of phone calls, or city populations, and other "frequency" data of a generally unbounded nature.

That said, you may have data that is theoretically Count, but which you coerce to OrderedFactor to enable the use of more models, trusting to your knowledge of how those models work to inform an appropriate interpretation.

OrderedFactor and Multiclass data

Other integer data, such as the number of an animal's legs, or number of rooms in homes, are, generally, coerced to OrderedFactor <: Finite. The other categorical scientific type is Multiclass <: Finite, which is for unordered categorical data. Coercing data to one of these two forms is discussed under Detecting and coercing improperly represented categorical data below.

Binary data

There is no separate scientific type for binary data. Binary data is OrderedFactor{2} if ordered, and Multiclass{2} otherwise. Data with type OrderedFactor{2} is considered to have an intrinsic "positive" class, e.g., the outcome of a medical test, and the "pass/fail" outcome of an exam. MLJ measures, such as true_positive, assume the second class in the ordering is the "positive" class. Inspecting and changing order are discussed in the next section.

If data has type Bool it is considered Count data (as Bool <: Integer) and, generally, users will want to coerce such data to Multiclass or OrderedFactor.
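For example (a hedged sketch with illustrative values):

using MLJ

v = [true, false, true, true]
scitype(v)                # AbstractVector{Count}
w = coerce(v, OrderedFactor)
scitype(w)                # AbstractVector{OrderedFactor{2}}
levels(w)                 # [false, true]; true is treated as the "positive" class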

Detecting and coercing improperly represented categorical data

One inspects the scientific type of data using scitype as shown above. To inspect all column scientific types in a table simultaneously, use schema. (The scitype(X) of a table X contains a condensed form of this information used in type dispatch; see here.)

import DataFrames: DataFrame
 X = DataFrame(
     name = ["Siri", "Robo", "Alexa", "Cortana"],
     gender = ["male", "male", "Female", "female"],
@@ -108,4 +108,4 @@
  UnivariateFinite{Multiclass{3}}(no=>0.245, yes=>0.755)
  UnivariateFinite{Multiclass{3}}(no=>0.447, yes=>0.553)
  UnivariateFinite{Multiclass{3}}(no=>0.509, yes=>0.491)
- UnivariateFinite{Multiclass{3}}(no=>0.218, yes=>0.782)

Or, equivalently:

d_vec = UnivariateFinite(["no", "yes"], yes_probs, augment=true, pool=v)

For more options, see UnivariateFinite.

+ UnivariateFinite{Multiclass{3}}(no=>0.218, yes=>0.782)

Or, equivalently:

d_vec = UnivariateFinite(["no", "yes"], yes_probs, augment=true, pool=v)

For more options, see UnivariateFinite.