RELEASE 0.19 #9502

Merged 88 commits on Aug 11, 2017
Commits
d0a3185
[MRG+1] AffinityPropagation damping factor not explained (#9335)
SebastinSanty Jul 13, 2017
2cc0673
covariance.graph_lasso does not pass eps to linear_model.lars_path (#…
SebastinSanty Jul 13, 2017
66ef768
[MRG+1] Deprecating the use of size_threshold parameter in manhattan_…
pravarmahajan Jul 13, 2017
981af69
[MRG + 1] Too few arguments in formatting call (#9298)
SebastinSanty Jul 13, 2017
1a4e37c
[MRG+1] supress deprecation warnings for non_negative option (#9356)
minghui-liu Jul 15, 2017
bc40aae
Adding note to the docstring that BayesianGaussianMixture parameter w…
melgoetz Jul 15, 2017
7330bde
[MRG + 1] DOC developer quality of life notes (#9082)
vene Jul 15, 2017
f91d226
minor sphinx fixes (#9370)
amueller Jul 15, 2017
8282e4a
[MRG+3] Added examples to RandomForestClassifier and RandomForestRegr…
lodurality Jul 16, 2017
8d6439b
DOC markup fixes and grammar
jnothman Jul 17, 2017
cb35980
added examples to docstrings of PassiveAgressiveClassifier and Passiv…
lodurality Jul 16, 2017
807b526
added examples to docstrings of LinearSVC and LinearSVR (#9375)
lodurality Jul 16, 2017
a92fc7a
[MRG+1] copy not passed from linear_model/base.py:_pre_fit to _prepro…
Jul 16, 2017
b2fd683
misc
agramfort Jul 16, 2017
a19f0df
[MRG] Add few more tests + Documentation for re-entrant cross-validat…
raghavrv Jul 16, 2017
28f5123
DOC Move some things around in related projects
jnothman Jul 17, 2017
4cce374
DOC Use - instead of * for bullets
jnothman Jul 17, 2017
5d0b93e
DOC Fix typos (#9386)
taehoonlee Jul 17, 2017
fb5c85e
MISC: typo in rst
GaelVaroquaux Jul 17, 2017
565bc39
[MRG] Add Explanation of MSE vs Friedman MSE vs MAE criterion in Regr…
warut-vijit Jul 17, 2017
e37504c
use ignore_warnings to catch FutueWarning (#9374)
NickleDave Jul 17, 2017
54232c9
Markup in release notes
jnothman Jul 18, 2017
ecdc76c
DOC reorder what's new paragraphs
jnothman Jul 18, 2017
9aeab46
[MRG+1] - DeprecationWarning for n_components parameter for linkage_t…
sharanry Jul 18, 2017
6dd0ab9
Pass affinity to fix connectivity in linkage tree (#9357)
brenolf Jul 18, 2017
afa26da
[MRG + 1] FIX gil acquisition in dist_metric (#9311)
glemaitre Jul 18, 2017
da16809
[MRG+1] Added examples to docstrings of MinMaxScaler and StandardScal…
lodurality Jul 18, 2017
8c08f58
[MRG + 1] Fix wrong error message in StratifiedKFold (#9396)
Jul 18, 2017
f5c2a83
DOC Fix multi metric link in model selection (#9410)
Jul 19, 2017
4641cfc
[MRG+1] Add links for [RW2006] (#9412)
gamebusterz Jul 19, 2017
1dcaeb8
[MRG] Formatting error in cross_validation.rst (#9415)
SebastinSanty Jul 19, 2017
0d5d315
[MRG+1] Docstring parameters improvements for cross_decomposition and…
clemkoa Jul 19, 2017
d25d8f7
[MRG] DOC Fix known issues link in faq (#9418)
Jul 20, 2017
d4cd401
Note->Notes, fix underline in multioutput examples (#9416)
amueller Jul 20, 2017
6187426
[MRG] DOC add non support of COO safe indexing (#9423)
glemaitre Jul 20, 2017
32f452d
Add download_if_missing argument to fetch_20newsgroups_vectorized (#9…
lesteve Jul 20, 2017
477225e
Fix: typo in DistanceMetric docstring example (#9427)
filipj8 Jul 20, 2017
c1cf87e
[MRG+1] TST Add test coverage for countVectorizer with ngram_range > …
herilalaina Jul 21, 2017
d6ff52d
[MRG+1] - Voting classifier flatten transform (Continuation) (#9188)
herilalaina Jul 21, 2017
7a5da82
[MRG] Add Alfred P. Sloan foundation to sponsors and footer (#9402)
amueller Jul 22, 2017
1577a5e
[MRG] FIX Examples use int / int without __future__.division (#9426)
SebastinSanty Jul 23, 2017
2f8a0da
update grants funding info for CDS, Telecom + Inria (#9436)
agramfort Jul 23, 2017
bcd442a
[MRG] DOC Dedent what's new lists (#9349)
jnothman Jul 24, 2017
7b7cc61
Update partial_dependence.py (#9434)
Jul 24, 2017
3f7095f
remove depreated "plt.hold" that defaults to "on". (#9444)
amueller Jul 24, 2017
2603293
[MRG + 1] Multiclass Documentation update (#9419)
Jul 25, 2017
70cb5a7
[MRG+1] Chassifier chain example fix (#9408)
Jul 25, 2017
19c3ad7
Fixed incorrect docstring (#9446)
while Jul 25, 2017
d4bbadd
[MRG+1] retry mechanism for plot_stock_market.py (#9437)
hakaa1 Jul 25, 2017
6deb844
[MRG+1] BUG Fix the shrinkage implementation in NearestCentroid (#9219)
qinhanmin2014 Jul 25, 2017
b050a2c
[MRG] DOC use def instead of lambda in the multimetric example at mod…
raghavrv Jul 26, 2017
2cc6c52
[MRG+1] Rearrange modules in alphabetical order (#9449)
Jul 27, 2017
bacd7a5
[MRG+1] DOC improve RFE/RFECV estimator docstring (#9233)
qinhanmin2014 Jul 27, 2017
12cd3f7
Increase the max_iter for LabelPropagation. (#9441)
musically-ut Jul 27, 2017
6dbaa51
DOC Explicitly use https in index.rst links (#9462)
alanyee Jul 29, 2017
86893ef
DOC Clarify RobustScaler behavior with sparse input (#8858)
naoyak Jul 29, 2017
5146e88
[MRG + 1] DOC Fix Sphinx errors (#9420)
Jul 30, 2017
dd898b1
DOC Use :class: for first VotingClassifier reference
jnothman Jul 30, 2017
81b7bad
Credit University of Sydney sponsorship (#9466)
jnothman Aug 1, 2017
0465dec
[MRG+1] Added examples to docstrings of ElasticNet and ElasticNetCV (…
lodurality Aug 1, 2017
d012f05
[MRG+1] DOC Simplifying margin plotting in SVM examples (#8501) (#8875)
VathsalaAchar Aug 1, 2017
2edb5cf
[MRG+1] Issue#7998 : Consistent parameters between QDA and LDA (#8130)
mrbeann Aug 1, 2017
e14e313
Fix typos (#9476)
taehoonlee Aug 2, 2017
81e359e
DOC Update classification.py (#9478)
skrish13 Aug 2, 2017
d54815c
PEP8 fix blank line contains whitespace
jnothman Aug 2, 2017
c815caf
DOC Release date
jnothman Aug 6, 2017
3b2ae67
Merge branch '0.19.X' of github.com:scikit-learn/scikit-learn into 0.…
jnothman Aug 6, 2017
96f0857
Update version
jnothman Aug 6, 2017
0a28c6b
Merge branch '0.19.X' of github.com:scikit-learn/scikit-learn into 0.…
jnothman Aug 6, 2017
dc9ab80
DOC fix merge errors in what's new
jnothman Aug 6, 2017
e5b892e
FIX Convergence warning and n_iter_ in LabelPropagation (#5893)
musically-ut Aug 6, 2017
affcff4
DOC remove unnecessary line (#9504)
tobycheese Aug 6, 2017
8d7396c
DOC Correct what's new for #9108 (#9501)
qinhanmin2014 Aug 6, 2017
5fef319
DOC Fixup of linear svm separating hyperplane plot (#9471)
amueller Aug 6, 2017
11c6243
FIX Incorrent implementation of noise_variance_ in PCA._fit_truncated…
qinhanmin2014 Aug 6, 2017
02c496e
FIX Pass sample_weight as kwargs in VotingClassifier (#9493)
jschendel Aug 5, 2017
6ff8790
Bring last code block in line with the image. (#9488)
julesjulian Aug 4, 2017
ac85392
[MRG+1] FIX Add missing mixins to ClassifierChain (#9473)
jnothman Aug 4, 2017
b01e20b
fix wrong assert in test_validation (#9480)
amueller Aug 2, 2017
fd2e0f7
[MRG+1] add scorer based on explained_variance_score (#9259)
qinhanmin2014 Aug 8, 2017
fa794ea
Fix safe_indexing with read-only indices (#9507)
lesteve Aug 8, 2017
6c6d6a2
Use base.is_classifier instead instead of isinstance (#9482)
minghui-liu Aug 8, 2017
2cc156f
Update what's new for recent changes
jnothman Aug 8, 2017
0d6740d
DOC Remove some whitespace from what's new
jnothman Aug 8, 2017
8d36abf
DOC Change release date to Thursday
jnothman Aug 8, 2017
e4274b5
DOC list of contributors to 0.19
jnothman Aug 8, 2017
740d92d
DOC Update news and menu for 0.19 release
jnothman Aug 8, 2017
f2d66b8
DOC set release date to Friday
jnothman Aug 10, 2017
26 changes: 20 additions & 6 deletions doc/about.rst
@@ -67,7 +67,7 @@ Funding

`INRIA <https://www.inria.fr>`_ actively supports this project. It has
provided funding for Fabian Pedregosa (2010-2012), Jaques Grobler
- (2012-2013) and Olivier Grisel (2013-2015) to work on this project
+ (2012-2013) and Olivier Grisel (2013-2017) to work on this project
full-time. It also hosts coding sprints and other events.

.. image:: images/inria-logo.jpg
@@ -77,7 +77,7 @@ full-time. It also hosts coding sprints and other events.

`Paris-Saclay Center for Data Science <http://www.datascience-paris-saclay.fr>`_
funded one year for a developer to work on the project full-time
- (2014-2015).
+ (2014-2015) and 50% of the time of Guillaume Lemaitre (2016-2017).

.. image:: images/cds-logo.png
:width: 200pt
@@ -94,23 +94,37 @@ Environment also funds several students to work on the project part-time.
:target: http://cds.nyu.edu/mooresloan/


- `Télécom Paristech <http://www.telecom-paristech.com>`_ funds Manoj Kumar (2014),
- Tom Dupré la Tour (2015), Raghav RV (2015-2016) and Thierry Guillemot (2016) to
- work on scikit-learn.
+ `Télécom Paristech <http://www.telecom-paristech.com>`_ funded Manoj Kumar (2014),
+ Tom Dupré la Tour (2015), Raghav RV (2015-2017), Thierry Guillemot (2016-2017)
+ and Albert Thomas (2017) to work on scikit-learn.

.. image:: themes/scikit-learn/static/img/telecom.png
:width: 100pt
:align: center
:target: http://www.telecom-paristech.fr/


- `Columbia University <http://columbia.edu>`_ funds Andreas Mueller since 2016.
+ `Columbia University <http://columbia.edu>`_ funds Andreas Müller since 2016.

.. image:: themes/scikit-learn/static/img/columbia.png
:width: 100pt
:align: center
:target: http://www.columbia.edu/

+ Andreas Müller also received a grant to improve scikit-learn from the `Alfred P. Sloan Foundation <https://sloan.org>`_ in 2017.
+
+ .. image:: images/sloan_banner.png
+ :width: 200pt
+ :align: center
+ :target: https://sloan.org/
+
+ `The University of Sydney <http://sydney.edu.au>`_ funds Joel Nothman since July 2017.
+
+ .. image:: themes/scikit-learn/static/img/sydney-primary.jpeg
+ :width: 200pt
+ :align: center
+ :target: http://www.sydney.edu.au/
+
The following students were sponsored by `Google <https://developers.google.com/open-source/>`_
to work on scikit-learn through the
`Google Summer of Code <https://en.wikipedia.org/wiki/Google_Summer_of_Code>`_
14 changes: 7 additions & 7 deletions doc/datasets/index.rst
@@ -252,7 +252,7 @@ features::

.. topic:: Related links:

- _`Public datasets in svmlight / libsvm format`: http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/
+ _`Public datasets in svmlight / libsvm format`: https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets

_`Faster API-compatible implementation`: https://github.com/mblondel/svmlight-loader

@@ -268,15 +268,15 @@ DataFrame are also acceptable.
Here are some recommended ways to load standard columnar data into a
format usable by scikit-learn:

- * `pandas.io <http://pandas.pydata.org/pandas-docs/stable/io.html>`_
+ * `pandas.io <https://pandas.pydata.org/pandas-docs/stable/io.html>`_
provides tools to read data from common formats including CSV, Excel, JSON
and SQL. DataFrames may also be constructed from lists of tuples or dicts.
Pandas handles heterogeneous data smoothly and provides tools for
manipulation and conversion into a numeric array suitable for scikit-learn.
- * `scipy.io <http://docs.scipy.org/doc/scipy/reference/io.html>`_
+ * `scipy.io <https://docs.scipy.org/doc/scipy/reference/io.html>`_
specializes in binary formats often used in scientific computing
context such as .mat and .arff
- * `numpy/routines.io <http://docs.scipy.org/doc/numpy/reference/routines.io.html>`_
+ * `numpy/routines.io <https://docs.scipy.org/doc/numpy/reference/routines.io.html>`_
for standard loading of columnar data into numpy arrays
* scikit-learn's :func:`datasets.load_svmlight_file` for the svmlight or libSVM
sparse format
@@ -288,14 +288,14 @@ For some miscellaneous data such as images, videos, and audio, you may wish to
refer to:

* `skimage.io <http://scikit-image.org/docs/dev/api/skimage.io.html>`_ or
- `Imageio <http://imageio.readthedocs.io/en/latest/userapi.html>`_
+ `Imageio <https://imageio.readthedocs.io/en/latest/userapi.html>`_
for loading images and videos to numpy arrays
- * `scipy.misc.imread <http://docs.scipy.org/doc/scipy/reference/generated/scipy.
+ * `scipy.misc.imread <https://docs.scipy.org/doc/scipy/reference/generated/scipy.
misc.imread.html#scipy.misc.imread>`_ (requires the `Pillow
<https://pypi.python.org/pypi/Pillow>`_ package) to load pixel intensities
data from various image file formats
* `scipy.io.wavfile.read
- <http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.io.wavfile.read.html>`_
+ <https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.io.wavfile.read.html>`_
for reading WAV files into a numpy array

Categorical (or nominal) features stored as strings (common in pandas DataFrames)
51 changes: 0 additions & 51 deletions doc/developers/debugging.rst

This file was deleted.

2 changes: 1 addition & 1 deletion doc/developers/index.rst
@@ -10,7 +10,7 @@ Developer's Guide
.. toctree::

contributing
- debugging
+ tips
utilities
performance
advanced_installation
119 changes: 119 additions & 0 deletions doc/developers/tips.rst
@@ -0,0 +1,119 @@
.. _developers-tips:

===========================
Developers' Tips and Tricks
===========================

Productivity and sanity-preserving tips
=======================================

In this section we gather some useful advice and tools that may increase your
quality-of-life when reviewing pull requests, running unit tests, and so forth.
Some of these tricks consist of userscripts that require a browser extension
such as `TamperMonkey`_ or `GreaseMonkey`_; to set up userscripts you must have
one of these extensions installed, enabled and running. We provide userscripts
as GitHub gists; to install them, click on the "Raw" button on the gist page.

.. _TamperMonkey: https://tampermonkey.net
.. _GreaseMonkey: http://www.greasespot.net

Viewing the rendered HTML documentation for a pull request
----------------------------------------------------------

We use CircleCI to build the HTML documentation for every pull request. To
access that documentation, we provide a redirect as described in the
:ref:`documentation section of the contributor guide
<contribute_documentation>`. Instead of typing the address by hand, we provide a
`userscript <https://gist.github.com/lesteve/470170f288884ec052bcf4bc4ffe958a>`_
that adds a button to every PR. After installing the userscript, navigate to any
GitHub PR; a new button labeled "See CircleCI doc for this PR" should appear in
the top-right area.

Folding and unfolding outdated diffs on pull requests
-----------------------------------------------------

GitHub hides discussions on PRs when the corresponding lines of code have been
changed in the meantime. This `userscript
<https://gist.github.com/lesteve/b4ef29bccd42b354a834>`_ provides a button to
unfold all such hidden discussions at once, so you can catch up.

Checking out pull requests as remote-tracking branches
------------------------------------------------------

In your local fork, add to your ``.git/config``, under the ``[remote
"upstream"]`` heading, the line::

fetch = +refs/pull/*/head:refs/remotes/upstream/pr/*

You may then use ``git checkout pr/PR_NUMBER`` to navigate to the code of the
pull-request with the given number. (`Read more in this gist.
<https://gist.github.com/piscisaureus/3342247>`_)
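
The same refspec can be added from the command line instead of editing ``.git/config`` by hand. The following is a minimal sketch that works in a scratch repository so it never touches a real clone; the scratch directory and the remote URL are assumptions for illustration, and the remote is assumed to be named ``upstream`` as in the text above.

```shell
# Scratch repository so the sketch never touches a real clone (hypothetical setup).
scratch=$(mktemp -d)
git init -q "$scratch"
cd "$scratch" || exit 1

# Assumed remote name and URL; adjust to match your own "upstream" remote.
git remote add upstream https://github.com/scikit-learn/scikit-learn.git

# Append the pull-request refspec rather than overwriting the default one.
git config --add remote.upstream.fetch '+refs/pull/*/head:refs/remotes/upstream/pr/*'

# List every fetch refspec now configured for "upstream".
git config --get-all remote.upstream.fetch
```

In an existing clone only the ``git config --add`` line should be needed; a subsequent ``git fetch upstream`` then downloads the pull-request heads so that ``git checkout pr/PR_NUMBER`` can find them.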

Display code coverage in pull requests
--------------------------------------

To overlay the code coverage reports generated by the CodeCov continuous
integration, consider `this browser extension
<https://github.com/codecov/browser-extension>`_. The coverage of each line
will be displayed as a color background behind the line number.

Useful pytest aliases and flags
-------------------------------

We recommend using pytest to run unit tests. When a unit test fails, the
following tricks can make debugging easier:

1. The command line argument ``pytest -l`` instructs pytest to print the local
variables when a failure occurs.

2. The argument ``pytest --pdb`` drops into the Python debugger on failure. To
instead drop into the rich IPython debugger ``ipdb``, you may set up a
shell alias to::

pytest --pdbcls=IPython.terminal.debugger:TerminalPdb --capture no
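
As a sketch of how the flags above might be combined, the alias below bundles ``-l`` with the IPython debugger options. The alias name ``pytest_ipdb`` is hypothetical, and passing ``--pdb`` explicitly is an assumption: depending on the pytest version, ``--pdbcls`` may only select the debugger class without enabling it, and giving both flags should be harmless. IPython must be installed for the debugger class to resolve.

```shell
# Hypothetical alias name; --pdb is passed explicitly as an assumption (see lead-in).
alias pytest_ipdb='pytest -l --pdb --pdbcls=IPython.terminal.debugger:TerminalPdb --capture no'

# Print the stored definition to confirm the alias is registered.
alias pytest_ipdb
```

Placed in a shell startup file, the alias would be used like ``pytest`` itself, for example ``pytest_ipdb path/to/test_file.py``.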

Debugging memory errors in Cython with valgrind
===============================================

While python/numpy's built-in memory management is relatively robust, it can
lead to performance penalties for some routines. For this reason, much of
the high-performance code in scikit-learn is written in cython. This
performance gain comes with a tradeoff, however: it is very easy for memory
bugs to crop up in cython code, especially in situations where that code
relies heavily on pointer arithmetic.

Memory errors can manifest themselves in a number of ways. The easiest ones to
debug are often segmentation faults and related glibc errors. Uninitialized
variables can lead to unexpected behavior that is difficult to track down.
A very useful tool when debugging these sorts of errors is
valgrind_.


Valgrind is a command-line tool that can trace memory errors in a variety of
code. Follow these steps:

1. Install `valgrind`_ on your system.

2. Download the python valgrind suppression file: `valgrind-python.supp`_.

3. Follow the directions in the `README.valgrind`_ file to customize your
python suppressions. If you don't, you will get spurious output related to
the python interpreter instead of your own code.

4. Run valgrind as follows::

$> valgrind -v --suppressions=valgrind-python.supp python my_test_script.py

.. _valgrind: http://valgrind.org
.. _`README.valgrind`: http://svn.python.org/projects/python/trunk/Misc/README.valgrind
.. _`valgrind-python.supp`: http://svn.python.org/projects/python/trunk/Misc/valgrind-python.supp


The result will be a list of all the memory-related errors, which reference
lines in the C-code generated by cython from your .pyx file. If you examine
the referenced lines in the .c file, you will see comments which indicate the
corresponding location in your .pyx source file. Hopefully the output will
give you clues as to the source of your memory error.

For more information on valgrind and the array of options it has, see the
tutorials and documentation on the `valgrind web site <http://valgrind.org>`_.
6 changes: 3 additions & 3 deletions doc/faq.rst
@@ -24,9 +24,9 @@ Apart from scikit-learn, another popular one is `scikit-image <http://scikit-ima
How can I contribute to scikit-learn?
-----------------------------------------
See :ref:`contributing`. Before wanting to add a new algorithm, which is
- usually a major and lengthy undertaking, it is recommended to start with :ref:`known
- issues <easy_issues>`_. Please do not contact the contributors of scikit-learn directly
- regarding contributing to scikit-learn.
+ usually a major and lengthy undertaking, it is recommended to start with
+ :ref:`known issues <new_contributors>`. Please do not contact the contributors
+ of scikit-learn directly regarding contributing to scikit-learn.

What's the best way to get help on scikit-learn usage?
--------------------------------------------------------------
Binary file added doc/images/sloan_banner.png
11 changes: 7 additions & 4 deletions doc/index.rst
@@ -207,14 +207,16 @@
<li><em>On-going development:</em>
<a href="/dev/whats_new.html"><em>What's new</em> (Changelog)</a>
</li>
+ <li><em>July 2017.</em> scikit-learn 0.19.0 is available for download (<a href="whats_new.html#version-0-19">Changelog</a>).
+ </li>
+ <li><em>June 2017.</em> scikit-learn 0.18.2 is available for download (<a href="whats_new.html#version-0-18-2">Changelog</a>).
+ </li>
<li><em>September 2016.</em> scikit-learn 0.18.0 is available for download (<a href="whats_new.html#version-0-18">Changelog</a>).
</li>
<li><em>November 2015.</em> scikit-learn 0.17.0 is available for download (<a href="whats_new.html#version-0-17">Changelog</a>).
</li>
<li><em>March 2015.</em> scikit-learn 0.16.0 is available for download (<a href="whats_new.html#version-0-16">Changelog</a>).
</li>
- <li><em>July 2014.</em> scikit-learn 0.15.0 is available for download (<a href="whats_new.html#version-0-15">Changelog</a>).
- </li>
<li><em>July 14-20th, 2014: international sprint.</em>
During this week-long sprint, we gathered 18 of the core
contributors in Paris.
@@ -323,14 +325,15 @@
Funding provided by INRIA and others.
</div>
<div class="span6">
- <a class="reference internal" href="about.html#funding" style="text-decoration: none" >
+ <a class="reference internal" href="about.html#funding" style="text-decoration: none; white-space: nowrap" >
<img id="index-funding-logo-big" src="_static/img/inria-small.png" title="INRIA">
<img id="index-funding-logo-small" src="_static/img/google.png" title="Google">
<!--Due to Télécom ParisTech's logo text being smaller, a style has been added to improve readability-->
<img id="index-funding-logo-small" src="_static/img/telecom.png" title="Télécom ParisTech" style="max-height: 36px">
<img id="index-funding-logo-small" src="_static/img/FNRS-logo.png" title="FNRS">
<img id="index-funding-logo-small" src="_static/img/nyu_short_color.png" title="NYU CDS">
<img id="index-funding-logo-small" src="_static/img/sloan_logo.jpg" title="Alfred P. Sloan Foundation" style="max-height: 36px">
<img id="index-funding-logo-small" src="_static/img/columbia.png" title="Columbia University" style="max-height: 36px;">
<img id="index-funding-logo-small" src="_static/img/sydney-stacked.jpeg" title="The University of Sydney" style="max-height: 36px;">
</a>
</div>
<div class="span3">
24 changes: 12 additions & 12 deletions doc/modules/calibration.rst
@@ -44,7 +44,7 @@ with different biases per method:
* :class:`RandomForestClassifier` shows the opposite behavior: the histograms
show peaks at approximately 0.2 and 0.9 probability, while probabilities close to
0 or 1 are very rare. An explanation for this is given by Niculescu-Mizil
- and Caruana [4]: "Methods such as bagging and random forests that average
+ and Caruana [4]_: "Methods such as bagging and random forests that average
predictions from a base set of models can have difficulty making predictions
near 0 and 1 because variance in the underlying base models will bias
predictions that should be near zero or one away from these values. Because
@@ -57,15 +57,15 @@ with different biases per method:
ensemble away from 0. We observe this effect most strongly with random
forests because the base-level trees trained with random forests have
relatively high variance due to feature subseting." As a result, the
- calibration curve also referred to as the reliability diagram (Wilks 1995[5]) shows a
+ calibration curve also referred to as the reliability diagram (Wilks 1995 [5]_) shows a
characteristic sigmoid shape, indicating that the classifier could trust its
"intuition" more and return probabilties closer to 0 or 1 typically.

.. currentmodule:: sklearn.svm

* Linear Support Vector Classification (:class:`LinearSVC`) shows an even more sigmoid curve
as the RandomForestClassifier, which is typical for maximum-margin methods
- (compare Niculescu-Mizil and Caruana [4]), which focus on hard samples
+ (compare Niculescu-Mizil and Caruana [4]_), which focus on hard samples
that are close to the decision boundary (the support vectors).

.. currentmodule:: sklearn.calibration
@@ -190,18 +190,18 @@ a similar decrease in log-loss.

.. topic:: References:

- .. [1] Obtaining calibrated probability estimates from decision trees
- and naive Bayesian classifiers, B. Zadrozny & C. Elkan, ICML 2001
+ * Obtaining calibrated probability estimates from decision trees
+ and naive Bayesian classifiers, B. Zadrozny & C. Elkan, ICML 2001

- .. [2] Transforming Classifier Scores into Accurate Multiclass
- Probability Estimates, B. Zadrozny & C. Elkan, (KDD 2002)
+ * Transforming Classifier Scores into Accurate Multiclass
+ Probability Estimates, B. Zadrozny & C. Elkan, (KDD 2002)

- .. [3] Probabilistic Outputs for Support Vector Machines and Comparisons to
- Regularized Likelihood Methods, J. Platt, (1999)
+ * Probabilistic Outputs for Support Vector Machines and Comparisons to
+ Regularized Likelihood Methods, J. Platt, (1999)

.. [4] Predicting Good Probabilities with Supervised Learning,
- A. Niculescu-Mizil & R. Caruana, ICML 2005
+ A. Niculescu-Mizil & R. Caruana, ICML 2005

.. [5] On the combination of forecast probabilities for
- consecutive precipitation periods. Wea. Forecasting, 5, 640–
- 650., Wilks, D. S., 1990a
+ consecutive precipitation periods. Wea. Forecasting, 5, 640–650.,
+ Wilks, D. S., 1990a