October 20th, Q&A session: Get you issues solved and questions answered!

GitHub logo
Edit

Stacking

Stacking (sometimes called stacked generalization) involves training a learning algorithm to combine the predictions of several other learning algorithms.

First, all of the other algorithms are trained using the available data, then a combiner algorithm is trained to make a final prediction using all the predictions of the other algorithms as additional inputs. If an arbitrary combiner algorithm is used, then stacking can theoretically represent any of the widely known ensemble techniques, although, in practice, a logistic regression model is often used as the combiner like in the example below.

DecisionTreeClassificationTrainer trainer = new DecisionTreeClassificationTrainer(5, 0);
DecisionTreeClassificationTrainer trainer1 = new DecisionTreeClassificationTrainer(3, 0);
DecisionTreeClassificationTrainer trainer2 = new DecisionTreeClassificationTrainer(4, 0);

LogisticRegressionSGDTrainer aggregator = new LogisticRegressionSGDTrainer()
  .withUpdatesStgy(new UpdatesStrategy<>(new SimpleGDUpdateCalculator(0.2),
                                         SimpleGDParameterUpdate.SUM_LOCAL,
                                         SimpleGDParameterUpdate.AVG));

StackedModel<Vector, Vector, Double, LogisticRegressionModel> mdl = new StackedVectorDatasetTrainer<>(aggregator)
  .addTrainerWithDoubleOutput(trainer)
  .addTrainerWithDoubleOutput(trainer1)
  .addTrainerWithDoubleOutput(trainer2)
  .fit(ignite,
       dataCache,
       vectorizer
      );
Note
The Evaluator works well with the StackedModel

Example

The full example could be found as a part of the Titanic tutorial here.