diff --git a/doc/modules/calibration.rst b/doc/modules/calibration.rst index 0c0af594398a0..9762414ac8cc0 100644 --- a/doc/modules/calibration.rst +++ b/doc/modules/calibration.rst @@ -44,7 +44,7 @@ with different biases per method: * :class:`RandomForestClassifier` shows the opposite behavior: the histograms show peaks at approximately 0.2 and 0.9 probability, while probabilities close to 0 or 1 are very rare. An explanation for this is given by Niculescu-Mizil - and Caruana [4]: "Methods such as bagging and random forests that average + and Caruana [4]_: "Methods such as bagging and random forests that average predictions from a base set of models can have difficulty making predictions near 0 and 1 because variance in the underlying base models will bias predictions that should be near zero or one away from these values. Because @@ -57,7 +57,7 @@ with different biases per method: ensemble away from 0. We observe this effect most strongly with random forests because the base-level trees trained with random forests have relatively high variance due to feature subseting." As a result, the - calibration curve also referred to as the reliability diagram (Wilks 1995[5]) shows a + calibration curve also referred to as the reliability diagram (Wilks 1995 [5]_) shows a characteristic sigmoid shape, indicating that the classifier could trust its "intuition" more and return probabilties closer to 0 or 1 typically. @@ -65,7 +65,7 @@ with different biases per method: * Linear Support Vector Classification (:class:`LinearSVC`) shows an even more sigmoid curve as the RandomForestClassifier, which is typical for maximum-margin methods - (compare Niculescu-Mizil and Caruana [4]), which focus on hard samples + (compare Niculescu-Mizil and Caruana [4]_), which focus on hard samples that are close to the decision boundary (the support vectors). .. currentmodule:: sklearn.calibration @@ -190,18 +190,18 @@ a similar decrease in log-loss. .. topic:: References: - .. 
[1] Obtaining calibrated probability estimates from decision trees - and naive Bayesian classifiers, B. Zadrozny & C. Elkan, ICML 2001 + * Obtaining calibrated probability estimates from decision trees + and naive Bayesian classifiers, B. Zadrozny & C. Elkan, ICML 2001 - .. [2] Transforming Classifier Scores into Accurate Multiclass - Probability Estimates, B. Zadrozny & C. Elkan, (KDD 2002) + * Transforming Classifier Scores into Accurate Multiclass + Probability Estimates, B. Zadrozny & C. Elkan, (KDD 2002) - .. [3] Probabilistic Outputs for Support Vector Machines and Comparisons to - Regularized Likelihood Methods, J. Platt, (1999) + * Probabilistic Outputs for Support Vector Machines and Comparisons to + Regularized Likelihood Methods, J. Platt, (1999) .. [4] Predicting Good Probabilities with Supervised Learning, - A. Niculescu-Mizil & R. Caruana, ICML 2005 + A. Niculescu-Mizil & R. Caruana, ICML 2005 .. [5] On the combination of forecast probabilities for - consecutive precipitation periods. Wea. Forecasting, 5, 640– - 650., Wilks, D. S., 1990a + consecutive precipitation periods. Wea. Forecasting, 5, 640–650., + Wilks, D. S., 1990a diff --git a/doc/modules/classes.rst b/doc/modules/classes.rst index b41de5c108b5c..128f1c85f13e2 100644 --- a/doc/modules/classes.rst +++ b/doc/modules/classes.rst @@ -41,9 +41,34 @@ Functions base.clone config_context - set_config get_config + set_config + +.. _calibration_ref: +:mod:`sklearn.calibration`: Probability Calibration +=================================================== + +.. automodule:: sklearn.calibration + :no-members: + :no-inherited-members: + +**User guide:** See the :ref:`calibration` section for further details. + +.. currentmodule:: sklearn + +.. autosummary:: + :toctree: generated/ + :template: class.rst + + calibration.CalibratedClassifierCV + + +.. autosummary:: + :toctree: generated/ + :template: function.rst + + calibration.calibration_curve ..
_cluster_ref: @@ -80,13 +105,13 @@ Functions :toctree: generated/ :template: function.rst - cluster.estimate_bandwidth - cluster.k_means - cluster.ward_tree cluster.affinity_propagation cluster.dbscan + cluster.estimate_bandwidth + cluster.k_means cluster.mean_shift cluster.spectral_clustering + cluster.ward_tree .. _bicluster_ref: @@ -141,60 +166,21 @@ Classes :template: function.rst covariance.empirical_covariance + covariance.graph_lasso covariance.ledoit_wolf - covariance.shrunk_covariance covariance.oas - covariance.graph_lasso + covariance.shrunk_covariance +.. _cross_decomposition_ref: -:mod:`sklearn.model_selection`: Model Selection -=============================================== +:mod:`sklearn.cross_decomposition`: Cross decomposition +======================================================= -.. automodule:: sklearn.model_selection +.. automodule:: sklearn.cross_decomposition :no-members: :no-inherited-members: -**User guide:** See the :ref:`cross_validation`, :ref:`grid_search` and -:ref:`learning_curve` sections for further details. - -Splitter Classes ----------------- - -.. currentmodule:: sklearn - -.. autosummary:: - :toctree: generated/ - :template: class.rst - - model_selection.KFold - model_selection.GroupKFold - model_selection.StratifiedKFold - model_selection.LeaveOneGroupOut - model_selection.LeavePGroupsOut - model_selection.LeaveOneOut - model_selection.LeavePOut - model_selection.RepeatedKFold - model_selection.RepeatedStratifiedKFold - model_selection.ShuffleSplit - model_selection.GroupShuffleSplit - model_selection.StratifiedShuffleSplit - model_selection.PredefinedSplit - model_selection.TimeSeriesSplit - -Splitter Functions ------------------- - -.. currentmodule:: sklearn - -.. 
autosummary:: - :toctree: generated/ - :template: function.rst - - model_selection.train_test_split - model_selection.check_cv - -Hyper-parameter optimizers --------------------------- +**User guide:** See the :ref:`cross_decomposition` section for further details. .. currentmodule:: sklearn @@ -202,33 +188,10 @@ Hyper-parameter optimizers :toctree: generated/ :template: class.rst - model_selection.GridSearchCV - model_selection.RandomizedSearchCV - model_selection.ParameterGrid - model_selection.ParameterSampler - - -.. autosummary:: - :toctree: generated/ - :template: function.rst - - model_selection.fit_grid_point - -Model validation ----------------- - -.. currentmodule:: sklearn - -.. autosummary:: - :toctree: generated/ - :template: function.rst - - model_selection.cross_validate - model_selection.cross_val_score - model_selection.cross_val_predict - model_selection.permutation_test_score - model_selection.learning_curve - model_selection.validation_curve + cross_decomposition.CCA + cross_decomposition.PLSCanonical + cross_decomposition.PLSRegression + cross_decomposition.PLSSVD .. 
_datasets_ref: @@ -251,33 +214,33 @@ Loaders :template: function.rst datasets.clear_data_home - datasets.get_data_home + datasets.dump_svmlight_file datasets.fetch_20newsgroups datasets.fetch_20newsgroups_vectorized + datasets.fetch_california_housing + datasets.fetch_covtype + datasets.fetch_kddcup99 + datasets.fetch_lfw_pairs + datasets.fetch_lfw_people + datasets.fetch_mldata + datasets.fetch_olivetti_faces + datasets.fetch_rcv1 + datasets.fetch_species_distributions + datasets.get_data_home datasets.load_boston datasets.load_breast_cancer datasets.load_diabetes datasets.load_digits datasets.load_files datasets.load_iris - datasets.load_wine - datasets.fetch_lfw_pairs - datasets.fetch_lfw_people datasets.load_linnerud - datasets.mldata_filename - datasets.fetch_mldata - datasets.fetch_olivetti_faces - datasets.fetch_california_housing - datasets.fetch_covtype - datasets.fetch_kddcup99 - datasets.fetch_rcv1 datasets.load_mlcomp datasets.load_sample_image datasets.load_sample_images - datasets.fetch_species_distributions datasets.load_svmlight_file datasets.load_svmlight_files - datasets.dump_svmlight_file + datasets.load_wine + datasets.mldata_filename Samples generator ----------------- @@ -288,9 +251,11 @@ Samples generator :toctree: generated/ :template: function.rst + datasets.make_biclusters datasets.make_blobs - datasets.make_classification + datasets.make_checkerboard datasets.make_circles + datasets.make_classification datasets.make_friedman1 datasets.make_friedman2 datasets.make_friedman3 @@ -306,8 +271,6 @@ Samples generator datasets.make_sparse_uncorrelated datasets.make_spd_matrix datasets.make_swiss_roll - datasets.make_biclusters - datasets.make_checkerboard .. 
_decomposition_ref: @@ -327,29 +290,49 @@ Samples generator :toctree: generated/ :template: class.rst - decomposition.PCA - decomposition.IncrementalPCA - decomposition.KernelPCA + decomposition.DictionaryLearning decomposition.FactorAnalysis decomposition.FastICA - decomposition.TruncatedSVD + decomposition.IncrementalPCA + decomposition.KernelPCA + decomposition.LatentDirichletAllocation + decomposition.MiniBatchDictionaryLearning + decomposition.MiniBatchSparsePCA decomposition.NMF + decomposition.PCA decomposition.SparsePCA - decomposition.MiniBatchSparsePCA decomposition.SparseCoder - decomposition.DictionaryLearning - decomposition.MiniBatchDictionaryLearning - decomposition.LatentDirichletAllocation + decomposition.TruncatedSVD .. autosummary:: :toctree: generated/ :template: function.rst - decomposition.fastica decomposition.dict_learning decomposition.dict_learning_online + decomposition.fastica decomposition.sparse_encode +.. _lda_ref: + +:mod:`sklearn.discriminant_analysis`: Discriminant Analysis +=========================================================== + +.. automodule:: sklearn.discriminant_analysis + :no-members: + :no-inherited-members: + +**User guide:** See the :ref:`lda_qda` section for further details. + +.. currentmodule:: sklearn + +.. autosummary:: + :toctree: generated + :template: class.rst + + discriminant_analysis.LinearDiscriminantAnalysis + discriminant_analysis.QuadraticDiscriminantAnalysis + .. _dummy_ref: :mod:`sklearn.dummy`: Dummy estimators @@ -401,8 +384,8 @@ Samples generator ensemble.GradientBoostingRegressor ensemble.IsolationForest ensemble.RandomForestClassifier - ensemble.RandomTreesEmbedding ensemble.RandomForestRegressor + ensemble.RandomTreesEmbedding ensemble.VotingClassifier .. 
autosummary:: @@ -442,13 +425,13 @@ partial dependence :toctree: generated/ :template: class_without_init.rst - exceptions.NotFittedError exceptions.ChangedBehaviorWarning exceptions.ConvergenceWarning exceptions.DataConversionWarning exceptions.DataDimensionalityWarning exceptions.EfficiencyWarning exceptions.FitFailedWarning + exceptions.NotFittedError exceptions.NonBLASDotWarning exceptions.UndefinedMetricWarning @@ -485,9 +468,9 @@ From images :toctree: generated/ :template: function.rst - feature_extraction.image.img_to_graph - feature_extraction.image.grid_to_graph feature_extraction.image.extract_patches_2d + feature_extraction.image.grid_to_graph + feature_extraction.image.img_to_graph feature_extraction.image.reconstruct_from_patches_2d :template: class.rst @@ -571,8 +554,8 @@ From text :toctree: generated/ :template: class.rst - gaussian_process.GaussianProcessRegressor gaussian_process.GaussianProcessClassifier + gaussian_process.GaussianProcessRegressor Kernels: @@ -580,20 +563,20 @@ Kernels: :toctree: generated/ :template: class_with_call.rst + gaussian_process.kernels.CompoundKernel + gaussian_process.kernels.ConstantKernel + gaussian_process.kernels.DotProduct + gaussian_process.kernels.ExpSineSquared + gaussian_process.kernels.Exponentiation + gaussian_process.kernels.Hyperparameter gaussian_process.kernels.Kernel - gaussian_process.kernels.Sum + gaussian_process.kernels.Matern + gaussian_process.kernels.PairwiseKernel gaussian_process.kernels.Product - gaussian_process.kernels.Exponentiation - gaussian_process.kernels.ConstantKernel - gaussian_process.kernels.WhiteKernel gaussian_process.kernels.RBF - gaussian_process.kernels.Matern gaussian_process.kernels.RationalQuadratic - gaussian_process.kernels.ExpSineSquared - gaussian_process.kernels.DotProduct - gaussian_process.kernels.PairwiseKernel - gaussian_process.kernels.CompoundKernel - gaussian_process.kernels.Hyperparameter + gaussian_process.kernels.Sum + gaussian_process.kernels.WhiteKernel .. 
_isotonic_ref: @@ -618,8 +601,8 @@ Kernels: :toctree: generated :template: function.rst - isotonic.isotonic_regression isotonic.check_increasing + isotonic.isotonic_regression .. _kernel_approximation_ref: @@ -662,27 +645,6 @@ Kernels: kernel_ridge.KernelRidge -.. _lda_ref: - -:mod:`sklearn.discriminant_analysis`: Discriminant Analysis -=========================================================== - -.. automodule:: sklearn.discriminant_analysis - :no-members: - :no-inherited-members: - -**User guide:** See the :ref:`lda_qda` section for further details. - -.. currentmodule:: sklearn - -.. autosummary:: - :toctree: generated - :template: class.rst - - discriminant_analysis.LinearDiscriminantAnalysis - discriminant_analysis.QuadraticDiscriminantAnalysis - - .. _linear_model_ref: :mod:`sklearn.linear_model`: Generalized Linear Models @@ -763,8 +725,8 @@ Kernels: :toctree: generated :template: class.rst - manifold.LocallyLinearEmbedding manifold.Isomap + manifold.LocallyLinearEmbedding manifold.MDS manifold.SpectralEmbedding manifold.TSNE @@ -774,8 +736,8 @@ Kernels: :template: function.rst manifold.locally_linear_embedding - manifold.spectral_embedding manifold.smacof + manifold.spectral_embedding .. _metrics_ref: @@ -801,8 +763,8 @@ details. :toctree: generated/ :template: function.rst - metrics.make_scorer metrics.get_scorer + metrics.make_scorer Classification metrics ---------------------- @@ -930,9 +892,12 @@ See the :ref:`metrics` section of the user guide for further details. metrics.pairwise.additive_chi2_kernel metrics.pairwise.chi2_kernel + metrics.pairwise.cosine_similarity + metrics.pairwise.cosine_distances metrics.pairwise.distance_metrics metrics.pairwise.euclidean_distances metrics.pairwise.kernel_metrics + metrics.pairwise.laplacian_kernel metrics.pairwise.linear_kernel metrics.pairwise.manhattan_distances metrics.pairwise.pairwise_distances @@ -940,16 +905,13 @@ See the :ref:`metrics` section of the user guide for further details. 
metrics.pairwise.polynomial_kernel metrics.pairwise.rbf_kernel metrics.pairwise.sigmoid_kernel - metrics.pairwise.cosine_similarity - metrics.pairwise.cosine_distances - metrics.pairwise.laplacian_kernel - metrics.pairwise_distances - metrics.pairwise_distances_argmin - metrics.pairwise_distances_argmin_min metrics.pairwise.paired_euclidean_distances metrics.pairwise.paired_manhattan_distances metrics.pairwise.paired_cosine_distances metrics.pairwise.paired_distances + metrics.pairwise_distances + metrics.pairwise_distances_argmin + metrics.pairwise_distances_argmin_min .. _mixture_ref: @@ -969,9 +931,93 @@ See the :ref:`metrics` section of the user guide for further details. :toctree: generated/ :template: class.rst - mixture.GaussianMixture mixture.BayesianGaussianMixture + mixture.GaussianMixture + +.. _modelselection_ref: + +:mod:`sklearn.model_selection`: Model Selection +=============================================== + +.. automodule:: sklearn.model_selection + :no-members: + :no-inherited-members: + +**User guide:** See the :ref:`cross_validation`, :ref:`grid_search` and +:ref:`learning_curve` sections for further details. + +Splitter Classes +---------------- + +.. currentmodule:: sklearn + +.. autosummary:: + :toctree: generated/ + :template: class.rst + + model_selection.GroupKFold + model_selection.GroupShuffleSplit + model_selection.KFold + model_selection.LeaveOneGroupOut + model_selection.LeavePGroupsOut + model_selection.LeaveOneOut + model_selection.LeavePOut + model_selection.PredefinedSplit + model_selection.RepeatedKFold + model_selection.RepeatedStratifiedKFold + model_selection.ShuffleSplit + model_selection.StratifiedKFold + model_selection.StratifiedShuffleSplit + model_selection.TimeSeriesSplit +Splitter Functions +------------------ + +.. currentmodule:: sklearn + +.. 
autosummary:: + :toctree: generated/ + :template: function.rst + + model_selection.check_cv + model_selection.train_test_split + +Hyper-parameter optimizers +-------------------------- + +.. currentmodule:: sklearn + +.. autosummary:: + :toctree: generated/ + :template: class.rst + + model_selection.GridSearchCV + model_selection.ParameterGrid + model_selection.ParameterSampler + model_selection.RandomizedSearchCV + + +.. autosummary:: + :toctree: generated/ + :template: function.rst + + model_selection.fit_grid_point + +Model validation +---------------- + +.. currentmodule:: sklearn + +.. autosummary:: + :toctree: generated/ + :template: function.rst + + model_selection.cross_validate + model_selection.cross_val_predict + model_selection.cross_val_score + model_selection.learning_curve + model_selection.permutation_test_score + model_selection.validation_curve .. _multiclass_ref: @@ -1011,9 +1057,9 @@ See the :ref:`metrics` section of the user guide for further details. :toctree: generated :template: class.rst + multioutput.ClassifierChain multioutput.MultiOutputRegressor multioutput.MultiOutputClassifier - multioutput.ClassifierChain .. _naive_bayes_ref: @@ -1032,9 +1078,9 @@ See the :ref:`metrics` section of the user guide for further details. :toctree: generated/ :template: class.rst + naive_bayes.BernoulliNB naive_bayes.GaussianNB naive_bayes.MultinomialNB - naive_bayes.BernoulliNB .. _neighbors_ref: @@ -1054,17 +1100,17 @@ See the :ref:`metrics` section of the user guide for further details. 
:toctree: generated/ :template: class.rst - neighbors.NearestNeighbors - neighbors.KNeighborsClassifier - neighbors.RadiusNeighborsClassifier - neighbors.KNeighborsRegressor - neighbors.RadiusNeighborsRegressor - neighbors.NearestCentroid neighbors.BallTree - neighbors.KDTree neighbors.DistanceMetric + neighbors.KDTree neighbors.KernelDensity + neighbors.KNeighborsClassifier + neighbors.KNeighborsRegressor neighbors.LocalOutlierFactor + neighbors.RadiusNeighborsClassifier + neighbors.RadiusNeighborsRegressor + neighbors.NearestCentroid + neighbors.NearestNeighbors .. autosummary:: :toctree: generated/ @@ -1094,57 +1140,6 @@ See the :ref:`metrics` section of the user guide for further details. neural_network.MLPClassifier neural_network.MLPRegressor - -.. _calibration_ref: - -:mod:`sklearn.calibration`: Probability Calibration -=================================================== - -.. automodule:: sklearn.calibration - :no-members: - :no-inherited-members: - -**User guide:** See the :ref:`calibration` section for further details. - -.. currentmodule:: sklearn - -.. autosummary:: - :toctree: generated/ - :template: class.rst - - calibration.CalibratedClassifierCV - - -.. autosummary:: - :toctree: generated/ - :template: function.rst - - calibration.calibration_curve - - -.. _cross_decomposition_ref: - -:mod:`sklearn.cross_decomposition`: Cross decomposition -======================================================= - -.. automodule:: sklearn.cross_decomposition - :no-members: - :no-inherited-members: - -**User guide:** See the :ref:`cross_decomposition` section for further details. - -.. currentmodule:: sklearn - -.. autosummary:: - :toctree: generated/ - :template: class.rst - - cross_decomposition.PLSRegression - cross_decomposition.PLSCanonical - cross_decomposition.CCA - cross_decomposition.PLSSVD - - .. _pipeline_ref: :mod:`sklearn.pipeline`: Pipeline @@ -1160,8 +1155,8 @@ See the :ref:`metrics` section of the user guide for further details. 
:toctree: generated/ :template: class.rst - pipeline.Pipeline pipeline.FeatureUnion + pipeline.Pipeline .. autosummary:: :toctree: generated/ @@ -1287,13 +1282,13 @@ Estimators :toctree: generated/ :template: class.rst - svm.SVC svm.LinearSVC - svm.NuSVC - svm.SVR svm.LinearSVR + svm.NuSVC svm.NuSVR svm.OneClassSVM + svm.SVC + svm.SVR .. autosummary:: :toctree: generated/ @@ -1308,11 +1303,11 @@ Low-level methods :toctree: generated :template: function.rst - svm.libsvm.fit + svm.libsvm.cross_validation svm.libsvm.decision_function + svm.libsvm.fit svm.libsvm.predict svm.libsvm.predict_proba - svm.libsvm.cross_validation .. _tree_ref: @@ -1361,26 +1356,26 @@ Low-level methods :toctree: generated/ :template: function.rst - utils.assert_all_finite utils.as_float_array + utils.assert_all_finite utils.check_X_y utils.check_array utils.check_consistent_length utils.check_random_state - utils.indexable utils.class_weight.compute_class_weight utils.class_weight.compute_sample_weight utils.estimator_checks.check_estimator utils.extmath.safe_sparse_dot + utils.indexable utils.resample utils.safe_indexing utils.shuffle - utils.sparsefuncs.mean_variance_axis utils.sparsefuncs.incr_mean_variance_axis utils.sparsefuncs.inplace_column_scale utils.sparsefuncs.inplace_row_scale utils.sparsefuncs.inplace_swap_row utils.sparsefuncs.inplace_swap_column + utils.sparsefuncs.mean_variance_axis utils.validation.check_is_fitted utils.validation.check_symmetric utils.validation.column_or_1d @@ -1409,25 +1404,25 @@ To be removed in 0.20 :toctree: generated/ :template: deprecated_class.rst - grid_search.ParameterGrid - grid_search.ParameterSampler - grid_search.GridSearchCV - grid_search.RandomizedSearchCV - cross_validation.LeaveOneOut - cross_validation.LeavePOut cross_validation.KFold cross_validation.LabelKFold cross_validation.LeaveOneLabelOut + cross_validation.LeaveOneOut + cross_validation.LeavePOut cross_validation.LeavePLabelOut cross_validation.LabelShuffleSplit - 
cross_validation.StratifiedKFold cross_validation.ShuffleSplit + cross_validation.StratifiedKFold cross_validation.StratifiedShuffleSplit cross_validation.PredefinedSplit decomposition.RandomizedPCA gaussian_process.GaussianProcess - mixture.GMM + grid_search.ParameterGrid + grid_search.ParameterSampler + grid_search.GridSearchCV + grid_search.RandomizedSearchCV mixture.DPGMM + mixture.GMM mixture.VBGMM @@ -1435,11 +1430,11 @@ To be removed in 0.20 :toctree: generated/ :template: deprecated_function.rst - grid_search.fit_grid_point - learning_curve.learning_curve - learning_curve.validation_curve + cross_validation.check_cv cross_validation.cross_val_predict cross_validation.cross_val_score - cross_validation.check_cv cross_validation.permutation_test_score cross_validation.train_test_split + grid_search.fit_grid_point + learning_curve.learning_curve + learning_curve.validation_curve diff --git a/doc/modules/clustering.rst b/doc/modules/clustering.rst index f7977845a8ce2..b18cb3a6adcf7 100644 --- a/doc/modules/clustering.rst +++ b/doc/modules/clustering.rst @@ -301,7 +301,9 @@ is given. Affinity Propagation can be interesting as it chooses the number of clusters based on the data provided. For this purpose, the two important parameters are the *preference*, which controls how many exemplars are -used, and the *damping factor*. +used, and the *damping factor* which damps the responsibility and +availability messages to avoid numerical oscillations when updating these +messages. The main drawback of Affinity Propagation is its complexity. The algorithm has a time complexity of the order :math:`O(N^2 T)`, where :math:`N` @@ -350,6 +352,13 @@ to be the exemplar of sample :math:`i` is given by: To begin with, all values for :math:`r` and :math:`a` are set to zero, and the calculation of each iterates until convergence. 
+As discussed above, in order to avoid numerical oscillations when updating the +messages, the damping factor :math:`\lambda` is introduced to iteration process: + +.. math:: r_{t+1}(i, k) = \lambda\cdot r_{t}(i, k) + (1-\lambda)\cdot r_{t+1}(i, k) +.. math:: a_{t+1}(i, k) = \lambda\cdot a_{t}(i, k) + (1-\lambda)\cdot a_{t+1}(i, k) + +where :math:`t` indicates the iteration times. .. _mean_shift: @@ -1334,7 +1343,7 @@ mean of homogeneity and completeness**: .. topic:: References - .. [RH2007] `V-Measure: A conditional entropy-based external cluster evaluation + * `V-Measure: A conditional entropy-based external cluster evaluation measure `_ Andrew Rosenberg and Julia Hirschberg, 2007 diff --git a/doc/modules/covariance.rst b/doc/modules/covariance.rst index 88f40f3896190..2f95051ac9ea3 100644 --- a/doc/modules/covariance.rst +++ b/doc/modules/covariance.rst @@ -95,7 +95,7 @@ bias/variance trade-off, and is discussed below. Ledoit-Wolf shrinkage --------------------- -In their 2004 paper [1], O. Ledoit and M. Wolf propose a formula so as +In their 2004 paper [1]_, O. Ledoit and M. Wolf propose a formula so as to compute the optimal shrinkage coefficient :math:`\alpha` that minimizes the Mean Squared Error between the estimated and the real covariance matrix. @@ -112,10 +112,11 @@ fitting a :class:`LedoitWolf` object to the same sample. for visualizing the performances of the Ledoit-Wolf estimator in terms of likelihood. +.. topic:: References: -[1] O. Ledoit and M. Wolf, "A Well-Conditioned Estimator for Large-Dimensional - Covariance Matrices", Journal of Multivariate Analysis, Volume 88, Issue 2, - February 2004, pages 365-411. + .. [1] O. Ledoit and M. Wolf, "A Well-Conditioned Estimator for Large-Dimensional + Covariance Matrices", Journal of Multivariate Analysis, Volume 88, Issue 2, + February 2004, pages 365-411. .. 
_oracle_approximating_shrinkage: @@ -123,7 +124,7 @@ Oracle Approximating Shrinkage ------------------------------ Under the assumption that the data are Gaussian distributed, Chen et -al. [2] derived a formula aimed at choosing a shrinkage coefficient that +al. [2]_ derived a formula aimed at choosing a shrinkage coefficient that yields a smaller Mean Squared Error than the one given by Ledoit and Wolf's formula. The resulting estimator is known as the Oracle Shrinkage Approximating estimator of the covariance. @@ -141,8 +142,10 @@ object to the same sample. Bias-variance trade-off when setting the shrinkage: comparing the choices of Ledoit-Wolf and OAS estimators -[2] Chen et al., "Shrinkage Algorithms for MMSE Covariance Estimation", - IEEE Trans. on Sign. Proc., Volume 58, Issue 10, October 2010. +.. topic:: References: + + .. [2] Chen et al., "Shrinkage Algorithms for MMSE Covariance Estimation", + IEEE Trans. on Sign. Proc., Volume 58, Issue 10, October 2010. .. topic:: Examples: @@ -266,14 +269,14 @@ perform outlier detection and discard/downweight some observations according to further processing of the data. The ``sklearn.covariance`` package implements a robust estimator of covariance, -the Minimum Covariance Determinant [3]. +the Minimum Covariance Determinant [3]_. Minimum Covariance Determinant ------------------------------ The Minimum Covariance Determinant estimator is a robust estimator of -a data set's covariance introduced by P.J. Rousseeuw in [3]. The idea +a data set's covariance introduced by P.J. Rousseeuw in [3]_. The idea is to find a given proportion (h) of "good" observations which are not outliers and compute their empirical covariance matrix. This empirical covariance matrix is then rescaled to compensate the @@ -283,7 +286,7 @@ weights to observations according to their Mahalanobis distance, leading to a reweighted estimate of the covariance matrix of the data set ("reweighting step"). 
-Rousseeuw and Van Driessen [4] developed the FastMCD algorithm in order +Rousseeuw and Van Driessen [4]_ developed the FastMCD algorithm in order to compute the Minimum Covariance Determinant. This algorithm is used in scikit-learn when fitting an MCD object to data. The FastMCD algorithm also computes a robust estimate of the data set location at @@ -292,11 +295,13 @@ the same time. Raw estimates can be accessed as ``raw_location_`` and ``raw_covariance_`` attributes of a :class:`MinCovDet` robust covariance estimator object. -[3] P. J. Rousseeuw. Least median of squares regression. - J. Am Stat Ass, 79:871, 1984. -[4] A Fast Algorithm for the Minimum Covariance Determinant Estimator, - 1999, American Statistical Association and the American Society - for Quality, TECHNOMETRICS. +.. topic:: References: + + .. [3] P. J. Rousseeuw. Least median of squares regression. + J. Am Stat Ass, 79:871, 1984. + .. [4] A Fast Algorithm for the Minimum Covariance Determinant Estimator, + 1999, American Statistical Association and the American Society + for Quality, TECHNOMETRICS. .. topic:: Examples: diff --git a/doc/modules/cross_validation.rst b/doc/modules/cross_validation.rst index ab7d2227447b1..b47726979351f 100644 --- a/doc/modules/cross_validation.rst +++ b/doc/modules/cross_validation.rst @@ -270,12 +270,12 @@ The following sections list utilities to generate indices that can be used to generate dataset splits according to different cross validation strategies. -.. _iid_cv +.. _iid_cv: Cross-validation iterators for i.i.d. data ========================================== -Assuming that some data is Independent Identically Distributed (i.i.d.) is +Assuming that some data is Independent and Identically Distributed (i.i.d.) is making the assumption that all samples stem from the same generative process and that the generative process is assumed to have no memory of past generated samples. @@ -287,10 +287,10 @@ The following cross-validators can be used in such cases. 
While i.i.d. data is a common assumption in machine learning theory, it rarely holds in practice. If one knows that the samples have been generated using a time-dependent process, it's safer to -use a `time-series aware cross-validation scheme ` +use a :ref:`time-series aware cross-validation scheme ` Similarly if we know that the generative process has a group structure (samples from collected from different subjects, experiments, measurement -devices) it safer to use `group-wise cross-validation `. +devices) it safer to use :ref:`group-wise cross-validation `. K-fold @@ -506,7 +506,7 @@ Stratified Shuffle Split stratified splits, *i.e* which creates splits by preserving the same percentage for each target class as in the complete set. -.. _group_cv +.. _group_cv: Cross-validation iterators for grouped data. ============================================ @@ -532,11 +532,11 @@ parameter. Group k-fold ------------ -class:GroupKFold is a variation of k-fold which ensures that the same group is +:class:`GroupKFold` is a variation of k-fold which ensures that the same group is not represented in both testing and training sets. For example if the data is obtained from different subjects with several samples per-subject and if the model is flexible enough to learn from highly person specific features it -could fail to generalize to new subjects. class:GroupKFold makes it possible +could fail to generalize to new subjects. :class:`GroupKFold` makes it possible to detect this kind of overfitting situations. Imagine you have three subjects, each with an associated number from 1 to 3:: @@ -613,8 +613,6 @@ Example of Leave-2-Group Out:: Group Shuffle Split ------------------- -:class:`GroupShuffleSplit` - The :class:`GroupShuffleSplit` iterator behaves as a combination of :class:`ShuffleSplit` and :class:`LeavePGroupsOut`, and generates a sequence of randomized partitions in which a subset of groups are held @@ -655,7 +653,7 @@ e.g. when searching for hyperparameters. 
For example, when using a validation set, set the ``test_fold`` to 0 for all samples that are part of the validation set, and to -1 for all other samples. -.. _timeseries_cv +.. _timeseries_cv: Cross validation of time series data ==================================== @@ -725,8 +723,7 @@ to shuffle the data indices before splitting them. Note that: shuffling will be different every time ``KFold(..., shuffle=True)`` is iterated. However, ``GridSearchCV`` will use the same shuffling for each set of parameters validated by a single call to its ``fit`` method. -* To ensure results are repeatable (*on the same platform*), use a fixed value - for ``random_state``. +* To get identical results for each split, set ``random_state`` to an integer. Cross validation and model selection ==================================== diff --git a/doc/modules/ensemble.rst b/doc/modules/ensemble.rst index 12a0ff6a74ba0..b766f4dfd4d0c 100644 --- a/doc/modules/ensemble.rst +++ b/doc/modules/ensemble.rst @@ -246,7 +246,7 @@ amount of time (e.g., on large datasets). .. [B1998] L. Breiman, "Arcing Classifiers", Annals of Statistics 1998. - .. [GEW2006] P. Geurts, D. Ernst., and L. Wehenkel, "Extremely randomized + * P. Geurts, D. Ernst., and L. Wehenkel, "Extremely randomized trees", Machine Learning, 63(1), 3-42, 2006. .. _random_forest_feature_importance: @@ -915,10 +915,10 @@ averaged. .. _voting_classifier: -VotingClassifier +Voting Classifier ======================== -The idea behind the voting classifier implementation is to combine +The idea behind the :class:`VotingClassifier` is to combine conceptually different machine learning classifiers and use a majority vote or the average predicted probabilities (soft vote) to predict the class labels. 
Such a classifier can be useful for a set of equally well performing model diff --git a/doc/modules/feature_selection.rst b/doc/modules/feature_selection.rst index 0f0adecdd3cf3..f9b767bd2ae89 100644 --- a/doc/modules/feature_selection.rst +++ b/doc/modules/feature_selection.rst @@ -123,10 +123,11 @@ Given an external estimator that assigns weights to features (e.g., the coefficients of a linear model), recursive feature elimination (:class:`RFE`) is to select features by recursively considering smaller and smaller sets of features. First, the estimator is trained on the initial set of features and -weights are assigned to each one of them. Then, features whose absolute weights -are the smallest are pruned from the current set features. That procedure is -recursively repeated on the pruned set until the desired number of features to -select is eventually reached. +the importance of each feature is obtained either through a ``coef_`` attribute +or through a ``feature_importances_`` attribute. Then, the least important +features are pruned from current set of features.That procedure is recursively +repeated on the pruned set until the desired number of features to select is +eventually reached. :class:`RFECV` performs RFE in a cross-validation loop to find the optimal number of features. diff --git a/doc/modules/gaussian_process.rst b/doc/modules/gaussian_process.rst index 7fae49349f342..94cca8999e489 100644 --- a/doc/modules/gaussian_process.rst +++ b/doc/modules/gaussian_process.rst @@ -601,12 +601,7 @@ shown in the following figure: References ---------- - * `[RW2006] - `_ - **Gaussian Processes for Machine Learning**, - Carl Eduard Rasmussen and Christopher K.I. Williams, MIT Press 2006. - Link to an official complete PDF version of the book - `here `_ . +.. [RW2006] Carl Eduard Rasmussen and Christopher K.I. Williams, "Gaussian Processes for Machine Learning", MIT Press 2006, Link to an official complete PDF version of the book `here `_ . .. 
currentmodule:: sklearn.gaussian_process diff --git a/doc/modules/grid_search.rst b/doc/modules/grid_search.rst index 1867a66594ad4..3851392ed2d88 100644 --- a/doc/modules/grid_search.rst +++ b/doc/modules/grid_search.rst @@ -84,7 +84,7 @@ evaluated and the best combination is retained. dataset. This is the best practice for evaluating the performance of a model with grid search. - - See :ref:`sphx_glr_auto_examples_model_selection_plot_multi_metric_evaluation` + - See :ref:`sphx_glr_auto_examples_model_selection_plot_multi_metric_evaluation.py` for an example of :class:`GridSearchCV` being used to evaluate multiple metrics simultaneously. @@ -183,7 +183,7 @@ the ``best_estimator_`` on the whole dataset. If the search should not be refit, set ``refit=False``. Leaving refit to the default value ``None`` will result in an error when using multiple metrics. -See :ref:`sphx_glr_auto_examples_model_selection_plot_multi_metric_evaluation` +See :ref:`sphx_glr_auto_examples_model_selection_plot_multi_metric_evaluation.py` for an example usage. Composite estimators and parameter spaces diff --git a/doc/modules/linear_model.rst b/doc/modules/linear_model.rst index e6d0ea882f6d3..018ff884c4ae2 100644 --- a/doc/modules/linear_model.rst +++ b/doc/modules/linear_model.rst @@ -1141,7 +1141,7 @@ in the following ways. .. topic:: References: - .. [#f1] Peter J. Huber, Elvezio M. Ronchetti: Robust Statistics, Concomitant scale estimates, pg 172 + * Peter J. Huber, Elvezio M. 
Ronchetti: Robust Statistics, Concomitant scale estimates, pg 172 Also, this estimator is different from the R implementation of Robust Regression (http://www.ats.ucla.edu/stat/r/dae/rreg.htm) because the R implementation does a weighted least diff --git a/doc/modules/model_evaluation.rst b/doc/modules/model_evaluation.rst index 813a39339e848..4800569556758 100644 --- a/doc/modules/model_evaluation.rst +++ b/doc/modules/model_evaluation.rst @@ -81,6 +81,7 @@ Scoring Function 'v_measure_score' :func:`metrics.v_measure_score` **Regression** +'explained_variance' :func:`metrics.explained_variance_score` 'neg_mean_absolute_error' :func:`metrics.mean_absolute_error` 'neg_mean_squared_error' :func:`metrics.mean_squared_error` 'neg_mean_squared_log_error' :func:`metrics.mean_squared_log_error` @@ -101,7 +102,7 @@ Usage examples: >>> model = svm.SVC() >>> cross_val_score(model, X, y, scoring='wrong_choice') Traceback (most recent call last): - ValueError: 'wrong_choice' is not a valid scoring value. Valid options are ['accuracy', 'adjusted_mutual_info_score', 'adjusted_rand_score', 'average_precision', 'completeness_score', 'f1', 'f1_macro', 'f1_micro', 'f1_samples', 'f1_weighted', 'fowlkes_mallows_score', 'homogeneity_score', 'mutual_info_score', 'neg_log_loss', 'neg_mean_absolute_error', 'neg_mean_squared_error', 'neg_mean_squared_log_error', 'neg_median_absolute_error', 'normalized_mutual_info_score', 'precision', 'precision_macro', 'precision_micro', 'precision_samples', 'precision_weighted', 'r2', 'recall', 'recall_macro', 'recall_micro', 'recall_samples', 'recall_weighted', 'roc_auc', 'v_measure_score'] + ValueError: 'wrong_choice' is not a valid scoring value. 
Valid options are ['accuracy', 'adjusted_mutual_info_score', 'adjusted_rand_score', 'average_precision', 'completeness_score', 'explained_variance', 'f1', 'f1_macro', 'f1_micro', 'f1_samples', 'f1_weighted', 'fowlkes_mallows_score', 'homogeneity_score', 'mutual_info_score', 'neg_log_loss', 'neg_mean_absolute_error', 'neg_mean_squared_error', 'neg_mean_squared_log_error', 'neg_median_absolute_error', 'normalized_mutual_info_score', 'precision', 'precision_macro', 'precision_micro', 'precision_samples', 'precision_weighted', 'r2', 'recall', 'recall_macro', 'recall_micro', 'recall_samples', 'recall_weighted', 'roc_auc', 'v_measure_score'] .. note:: @@ -242,14 +243,14 @@ permitted and will require a wrapper to return a single metric:: >>> # A sample toy binary classification dataset >>> X, y = datasets.make_classification(n_classes=2, random_state=0) >>> svm = LinearSVC(random_state=0) - >>> tp = lambda y_true, y_pred: confusion_matrix(y_true, y_pred)[0, 0] - >>> tn = lambda y_true, y_pred: confusion_matrix(y_true, y_pred)[0, 0] - >>> fp = lambda y_true, y_pred: confusion_matrix(y_true, y_pred)[1, 0] - >>> fn = lambda y_true, y_pred: confusion_matrix(y_true, y_pred)[0, 1] + >>> def tp(y_true, y_pred): return confusion_matrix(y_true, y_pred)[0, 0] + >>> def tn(y_true, y_pred): return confusion_matrix(y_true, y_pred)[0, 0] + >>> def fp(y_true, y_pred): return confusion_matrix(y_true, y_pred)[1, 0] + >>> def fn(y_true, y_pred): return confusion_matrix(y_true, y_pred)[0, 1] >>> scoring = {'tp' : make_scorer(tp), 'tn' : make_scorer(tn), ... 'fp' : make_scorer(fp), 'fn' : make_scorer(fn)} >>> cv_results = cross_validate(svm.fit(X, y), X, y, scoring=scoring) - >>> # Getting the test set false positive scores + >>> # Getting the test set true positive scores >>> print(cv_results['test_tp']) # doctest: +NORMALIZE_WHITESPACE [12 13 15] >>> # Getting the test set false negative scores @@ -670,10 +671,6 @@ binary classification and multilabel indicator format. 
for an example of :func:`precision_recall_curve` usage to evaluate classifier output quality. - * See :ref:`sphx_glr_auto_examples_linear_model_plot_sparse_recovery.py` - for an example of :func:`precision_recall_curve` usage to select - features for sparse linear models. - Binary classification ^^^^^^^^^^^^^^^^^^^^^ diff --git a/doc/modules/multiclass.rst b/doc/modules/multiclass.rst index 5094372aca960..2eec94f76b1c2 100644 --- a/doc/modules/multiclass.rst +++ b/doc/modules/multiclass.rst @@ -17,42 +17,42 @@ The :mod:`sklearn.multiclass` module implements *meta-estimators* to solve by decomposing such problems into binary classification problems. Multitarget regression is also supported. - - **Multiclass classification** means a classification task with more than - two classes; e.g., classify a set of images of fruits which may be oranges, - apples, or pears. Multiclass classification makes the assumption that each - sample is assigned to one and only one label: a fruit can be either an - apple or a pear but not both at the same time. - - - **Multilabel classification** assigns to each sample a set of target - labels. This can be thought as predicting properties of a data-point - that are not mutually exclusive, such as topics that are relevant for a - document. A text might be about any of religion, politics, finance or - education at the same time or none of these. - - - **Multioutput regression** assigns each sample a set of target - values. This can be thought of as predicting several properties - for each data-point, such as wind direction and magnitude at a - certain location. - - - **Multioutput-multiclass classification** and **multi-task classification** - means that a single estimator has to handle several joint classification - tasks. This is both a generalization of the multi-label classification - task, which only considers binary classification, as well as a - generalization of the multi-class classification task. 
*The output format - is a 2d numpy array or sparse matrix.* - - The set of labels can be different for each output variable. - For instance, a sample could be assigned "pear" for an output variable that - takes possible values in a finite set of species such as "pear", "apple"; - and "blue" or "green" for a second output variable that takes possible values - in a finite set of colors such as "green", "red", "blue", "yellow"... - - This means that any classifiers handling multi-output - multiclass or multi-task classification tasks, - support the multi-label classification task as a special case. - Multi-task classification is similar to the multi-output - classification task with different model formulations. For - more information, see the relevant estimator documentation. +- **Multiclass classification** means a classification task with more than + two classes; e.g., classify a set of images of fruits which may be oranges, + apples, or pears. Multiclass classification makes the assumption that each + sample is assigned to one and only one label: a fruit can be either an + apple or a pear but not both at the same time. + +- **Multilabel classification** assigns to each sample a set of target + labels. This can be thought as predicting properties of a data-point + that are not mutually exclusive, such as topics that are relevant for a + document. A text might be about any of religion, politics, finance or + education at the same time or none of these. + +- **Multioutput regression** assigns each sample a set of target + values. This can be thought of as predicting several properties + for each data-point, such as wind direction and magnitude at a + certain location. + +- **Multioutput-multiclass classification** and **multi-task classification** + means that a single estimator has to handle several joint classification + tasks. 
This is both a generalization of the multi-label classification + task, which only considers binary classification, as well as a + generalization of the multi-class classification task. *The output format + is a 2d numpy array or sparse matrix.* + + The set of labels can be different for each output variable. + For instance, a sample could be assigned "pear" for an output variable that + takes possible values in a finite set of species such as "pear", "apple"; + and "blue" or "green" for a second output variable that takes possible values + in a finite set of colors such as "green", "red", "blue", "yellow"... + + This means that any classifiers handling multi-output + multiclass or multi-task classification tasks, + support the multi-label classification task as a special case. + Multi-task classification is similar to the multi-output + classification task with different model formulations. For + more information, see the relevant estimator documentation. All scikit-learn classifiers are capable of multiclass classification, but the meta-estimators offered by :mod:`sklearn.multiclass` @@ -64,20 +64,69 @@ Below is a summary of the classifiers supported by scikit-learn grouped by strategy; you don't need the meta-estimators in this class if you're using one of these, unless you want custom multiclass behavior: - - Inherently multiclass: :ref:`Naive Bayes `, - :ref:`LDA and QDA `, - :ref:`Decision Trees `, :ref:`Random Forests `, - :ref:`Nearest Neighbors `, - setting ``multi_class='multinomial'`` in - :class:`sklearn.linear_model.LogisticRegression`. - - Support multilabel: :ref:`Decision Trees `, - :ref:`Random Forests `, :ref:`Nearest Neighbors `. - - One-Vs-One: :class:`sklearn.svm.SVC`. - - One-Vs-All: all linear models except :class:`sklearn.svm.SVC`. - -Some estimators also support multioutput-multiclass classification -tasks :ref:`Decision Trees `, :ref:`Random Forests `, -:ref:`Nearest Neighbors `. 
+- **Inherently multiclass:** + + - :class:`sklearn.naive_bayes.BernoulliNB` + - :class:`sklearn.tree.DecisionTreeClassifier` + - :class:`sklearn.tree.ExtraTreeClassifier` + - :class:`sklearn.ensemble.ExtraTreesClassifier` + - :class:`sklearn.naive_bayes.GaussianNB` + - :class:`sklearn.neighbors.KNeighborsClassifier` + - :class:`sklearn.semi_supervised.LabelPropagation` + - :class:`sklearn.semi_supervised.LabelSpreading` + - :class:`sklearn.discriminant_analysis.LinearDiscriminantAnalysis` + - :class:`sklearn.svm.LinearSVC` (setting multi_class="crammer_singer") + - :class:`sklearn.linear_model.LogisticRegression` (setting multi_class="multinomial") + - :class:`sklearn.linear_model.LogisticRegressionCV` (setting multi_class="multinomial") + - :class:`sklearn.neural_network.MLPClassifier` + - :class:`sklearn.neighbors.NearestCentroid` + - :class:`sklearn.discriminant_analysis.QuadraticDiscriminantAnalysis` + - :class:`sklearn.neighbors.RadiusNeighborsClassifier` + - :class:`sklearn.ensemble.RandomForestClassifier` + - :class:`sklearn.linear_model.RidgeClassifier` + - :class:`sklearn.linear_model.RidgeClassifierCV` + + +- **Multiclass as One-Vs-One:** + + - :class:`sklearn.svm.NuSVC` + - :class:`sklearn.svm.SVC`. 
+ - :class:`sklearn.gaussian_process.GaussianProcessClassifier` (setting multi_class = "one_vs_one") + + +- **Multiclass as One-Vs-All:** + + - :class:`sklearn.ensemble.GradientBoostingClassifier` + - :class:`sklearn.gaussian_process.GaussianProcessClassifier` (setting multi_class = "one_vs_rest") + - :class:`sklearn.svm.LinearSVC` (setting multi_class="ovr") + - :class:`sklearn.linear_model.LogisticRegression` (setting multi_class="ovr") + - :class:`sklearn.linear_model.LogisticRegressionCV` (setting multi_class="ovr") + - :class:`sklearn.linear_model.SGDClassifier` + - :class:`sklearn.linear_model.Perceptron` + - :class:`sklearn.linear_model.PassiveAggressiveClassifier` + + +- **Support multilabel:** + + - :class:`sklearn.tree.DecisionTreeClassifier` + - :class:`sklearn.tree.ExtraTreeClassifier` + - :class:`sklearn.ensemble.ExtraTreesClassifier` + - :class:`sklearn.neighbors.KNeighborsClassifier` + - :class:`sklearn.neural_network.MLPClassifier` + - :class:`sklearn.neighbors.RadiusNeighborsClassifier` + - :class:`sklearn.ensemble.RandomForestClassifier` + - :class:`sklearn.linear_model.RidgeClassifierCV` + + +- **Support multiclass-multioutput:** + + - :class:`sklearn.tree.DecisionTreeClassifier` + - :class:`sklearn.tree.ExtraTreeClassifier` + - :class:`sklearn.ensemble.ExtraTreesClassifier` + - :class:`sklearn.neighbors.KNeighborsClassifier` + - :class:`sklearn.neighbors.RadiusNeighborsClassifier` + - :class:`sklearn.ensemble.RandomForestClassifier` + .. warning:: @@ -202,8 +251,8 @@ Below is an example of multiclass learning using OvO:: .. topic:: References: - .. [1] "Pattern Recognition and Machine Learning. Springer", - Christopher M. Bishop, page 183, (First Edition) + * "Pattern Recognition and Machine Learning. Springer", + Christopher M. Bishop, page 183, (First Edition) .. _ecoc: @@ -266,19 +315,19 @@ Below is an example of multiclass learning using Output-Codes:: .. topic:: References: - .. 
[2] "Solving multiclass learning problems via error-correcting output codes", - Dietterich T., Bakiri G., - Journal of Artificial Intelligence Research 2, - 1995. + * "Solving multiclass learning problems via error-correcting output codes", + Dietterich T., Bakiri G., + Journal of Artificial Intelligence Research 2, + 1995. .. [3] "The error coding method and PICTs", James G., Hastie T., Journal of Computational and Graphical statistics 7, 1998. - .. [4] "The Elements of Statistical Learning", - Hastie T., Tibshirani R., Friedman J., page 606 (second-edition) - 2008. + * "The Elements of Statistical Learning", + Hastie T., Tibshirani R., Friedman J., page 606 (second-edition) + 2008. Multioutput regression ====================== @@ -353,7 +402,7 @@ Classifier Chain Classifier chains (see :class:`ClassifierChain`) are a way of combining a number of binary classifiers into a single multi-label model that is capable - of exploiting correlations among targets. +of exploiting correlations among targets. For a multi-label classification problem with N classes, N binary classifiers are assigned an integer between 0 and N-1. These integers @@ -373,5 +422,6 @@ typically many randomly ordered chains are fit and their predictions are averaged together. .. topic:: References: + Jesse Read, Bernhard Pfahringer, Geoff Holmes, Eibe Frank, - "Classifier Chains for Multi-label Classification", 2009. \ No newline at end of file + "Classifier Chains for Multi-label Classification", 2009. diff --git a/doc/modules/outlier_detection.rst b/doc/modules/outlier_detection.rst index 011bb6ea07889..db130403f9023 100644 --- a/doc/modules/outlier_detection.rst +++ b/doc/modules/outlier_detection.rst @@ -126,8 +126,8 @@ This strategy is illustrated below. .. topic:: References: - .. [RD1999] Rousseeuw, P.J., Van Driessen, K. "A fast algorithm for the minimum - covariance determinant estimator" Technometrics 41(3), 212 (1999) + * Rousseeuw, P.J., Van Driessen, K. 
"A fast algorithm for the minimum + covariance determinant estimator" Technometrics 41(3), 212 (1999) .. _isolation_forest: @@ -172,8 +172,8 @@ This strategy is illustrated below. .. topic:: References: - .. [LTZ2008] Liu, Fei Tony, Ting, Kai Ming and Zhou, Zhi-Hua. "Isolation forest." - Data Mining, 2008. ICDM'08. Eighth IEEE International Conference on. + * Liu, Fei Tony, Ting, Kai Ming and Zhou, Zhi-Hua. "Isolation forest." + Data Mining, 2008. ICDM'08. Eighth IEEE International Conference on. Local Outlier Factor @@ -228,7 +228,7 @@ This strategy is illustrated below. .. topic:: References: - .. [BKNS2000] Breunig, Kriegel, Ng, and Sander (2000) + * Breunig, Kriegel, Ng, and Sander (2000) `LOF: identifying density-based local outliers. `_ Proc. ACM SIGMOD @@ -272,16 +272,16 @@ multiple modes and :class:`ensemble.IsolationForest` and opposite, the decision rule based on fitting an :class:`covariance.EllipticEnvelope` learns an ellipse, which fits well the inlier distribution. The :class:`ensemble.IsolationForest` - and :class:`neighbors.LocalOutlierFactor` perform as well. + and :class:`neighbors.LocalOutlierFactor` perform as well. - |outlier1| * - As the inlier distribution becomes bimodal, the :class:`covariance.EllipticEnvelope` does not fit well the inliers. However, we can see that :class:`ensemble.IsolationForest`, - :class:`svm.OneClassSVM` and :class:`neighbors.LocalOutlierFactor` - have difficulties to detect the two modes, - and that the :class:`svm.OneClassSVM` + :class:`svm.OneClassSVM` and :class:`neighbors.LocalOutlierFactor` + have difficulties to detect the two modes, + and that the :class:`svm.OneClassSVM` tends to overfit: because it has no model of inliers, it interprets a region where, by chance some outliers are clustered, as inliers. 
@@ -292,7 +292,7 @@ multiple modes and :class:`ensemble.IsolationForest` and :class:`svm.OneClassSVM` is able to recover a reasonable approximation as well as :class:`ensemble.IsolationForest` and :class:`neighbors.LocalOutlierFactor`, - whereas the :class:`covariance.EllipticEnvelope` completely fails. + whereas the :class:`covariance.EllipticEnvelope` completely fails. - |outlier3| .. topic:: Examples: diff --git a/doc/modules/preprocessing.rst b/doc/modules/preprocessing.rst index a4e1364a85ae6..18ef7e004c8de 100644 --- a/doc/modules/preprocessing.rst +++ b/doc/modules/preprocessing.rst @@ -199,7 +199,7 @@ matrices as input, as long as ``with_mean=False`` is explicitly passed to the constructor. Otherwise a ``ValueError`` will be raised as silently centering would break the sparsity and would often crash the execution by allocating excessive amounts of memory unintentionally. -:class:`RobustScaler` cannot be fited to sparse inputs, but you can use +:class:`RobustScaler` cannot be fitted to sparse inputs, but you can use the ``transform`` method on sparse inputs. Note that the scalers accept both Compressed Sparse Rows and Compressed diff --git a/doc/modules/tree.rst b/doc/modules/tree.rst index f793c34b7f53d..3f577795e24be 100644 --- a/doc/modules/tree.rst +++ b/doc/modules/tree.rst @@ -481,7 +481,10 @@ Regression criteria If the target is a continuous value, then for node :math:`m`, representing a region :math:`R_m` with :math:`N_m` observations, common -criteria to minimise are +criteria to minimise as for determining locations for future +splits are Mean Squared Error, which minimizes the L2 error +using mean values at terminal nodes, and Mean Absolute Error, which +minimizes the L1 error using median values at terminal nodes. 
Mean Squared Error: diff --git a/doc/related_projects.rst b/doc/related_projects.rst index 877a6beeed60e..70971e934ccac 100644 --- a/doc/related_projects.rst +++ b/doc/related_projects.rst @@ -43,9 +43,6 @@ enhance the functionality of scikit-learn's estimators. **Experimentation frameworks** -- `PyMC `_ Bayesian statistical models and - fitting algorithms. - - `REP `_ Environment for conducting data-driven research in a consistent and reproducible way @@ -222,18 +219,19 @@ Other packages useful for data analysis and machine learning. statistical models. More focused on statistical tests and less on prediction than scikit-learn. +- `PyMC `_ Bayesian statistical models and + fitting algorithms. + - `Sacred `_ Tool to help you configure, organize, log and reproduce experiments -- `gensim `_ A library for topic modelling, - document indexing and similarity retrieval - - `Seaborn `_ Visualization library based on matplotlib. It provides a high-level interface for drawing attractive statistical graphics. - `Deep Learning `_ A curated list of deep learning software libraries. + Domain specific packages ~~~~~~~~~~~~~~~~~~~~~~~~ @@ -243,6 +241,9 @@ Domain specific packages - `Natural language toolkit (nltk) `_ Natural language processing and some machine learning. +- `gensim `_ A library for topic modelling, + document indexing and similarity retrieval + - `NiLearn `_ Machine learning for neuro-imaging. - `AstroML `_ Machine learning for astronomy. diff --git a/doc/themes/scikit-learn/layout.html b/doc/themes/scikit-learn/layout.html index d659b9ce86179..9a2691c6b1fbb 100644 --- a/doc/themes/scikit-learn/layout.html +++ b/doc/themes/scikit-learn/layout.html @@ -85,9 +85,9 @@