Bagging stands for bootstrap aggregation. One way to reduce the variance of an estimate is to average together multiple estimates. For example, we can train M different trees on different subsets of the data (chosen randomly with replacement) and compute the ensemble:

f(x) = 1/M * ∑_{m=1}^{M} f_m(x)

where f_m(x) is the prediction of the m-th tree.
Bagging uses bootstrap sampling to obtain the data subsets for training the base learners. To aggregate the outputs of the base learners, bagging uses majority voting for classification and averaging for regression.
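The two building blocks, bootstrap sampling and aggregation, are simple enough to illustrate outside of any framework. Below is a minimal, self-contained Java sketch of both; the helper names (bootstrapSample, majorityVote, average) are illustrative and not part of the Ignite ML API:

import java.util.HashMap;
import java.util.Map;
import java.util.Random;

public class BaggingSketch {
    /** Draws n row indices uniformly at random with replacement (a bootstrap sample). */
    static int[] bootstrapSample(int n, Random rng) {
        int[] idx = new int[n];
        for (int i = 0; i < n; i++)
            idx[i] = rng.nextInt(n);
        return idx;
    }

    /** Aggregates classification outputs by majority vote (ties broken arbitrarily). */
    static double majorityVote(double[] labels) {
        Map<Double, Integer> counts = new HashMap<>();
        for (double l : labels)
            counts.merge(l, 1, Integer::sum);
        return counts.entrySet().stream()
            .max(Map.Entry.comparingByValue())
            .get().getKey();
    }

    /** Aggregates regression outputs by averaging. */
    static double average(double[] predictions) {
        double sum = 0;
        for (double p : predictions)
            sum += p;
        return sum / predictions.length;
    }

    public static void main(String[] args) {
        Random rng = new Random(1);
        // Each of M = 10 base learners would be trained on its own bootstrap sample.
        for (int m = 0; m < 10; m++) {
            int[] sample = bootstrapSample(100, rng);
            // ... train base learner m on the rows indexed by 'sample' ...
        }
        // Aggregate the base learners' outputs on a test point.
        System.out.println(majorityVote(new double[] {1, 0, 1, 1, 0})); // 1.0 (classification)
        System.out.println(average(new double[] {2.0, 3.0, 4.0}));      // 3.0 (regression)
    }
}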
// Define the weak classifier.
DecisionTreeClassificationTrainer trainer = new DecisionTreeClassificationTrainer(5, 0);

// Set up the bagging process.
BaggedTrainer<Double> baggedTrainer = TrainerTransformers.makeBagged(
    trainer,                              // Trainer of the base models in the ensemble.
    10,                                   // Size of the ensemble.
    0.6,                                  // Subsample ratio relative to the whole dataset.
    4,                                    // Feature vector dimensionality.
    3,                                    // Feature subspace dimensionality.
    new OnMajorityPredictionsAggregator() // Aggregate predictions by majority vote.
)
.withEnvironmentBuilder(LearningEnvironmentBuilder
    .defaultBuilder()
    .withRNGSeed(1)
);

// Train the bagged model.
BaggedModel mdl = baggedTrainer.fit(ignite, dataCache, vectorizer);
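Once fitted, the bagged model can be queried like any other Ignite ML model. A minimal sketch, assuming the standard IgniteModel.predict(Vector) contract and the VectorUtils.of factory; the feature values below are invented for illustration:

// Build a 4-dimensional observation, matching the feature vector
// dimensionality configured above (values are made up for illustration).
Vector observation = VectorUtils.of(1.0, 0.0, 25.0, 7.25);

// BaggedModel aggregates the ensemble's votes internally via the
// configured OnMajorityPredictionsAggregator.
double prediction = mdl.predict(observation);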
Forests of randomized trees are a commonly used class of ensemble algorithms.
The full example can be found as part of the Titanic tutorial here.