Commit e50beb7

DOC: Update docs for pandas.cut
1 parent 4131149 commit e50beb7

1 file changed, +45 -34 lines changed

pandas/core/reshape/tile.py

Lines changed: 45 additions & 34 deletions
@@ -26,53 +26,64 @@
 def cut(x, bins, right=True, labels=None, retbins=False, precision=3,
         include_lowest=False):
     """
-    Return indices of half-open bins to which each value of `x` belongs.
+    Return indices of half-open `bins` to which each value of `x` belongs.
+
+    Use `cut` when you need to segment and sort data values into bins or
+    buckets of data. This function is also useful for going from a continuous
+    variable to a categorical variable. For example, `cut` could convert ages
+    to groups of age ranges.
 
     Parameters
     ----------
     x : array-like
         Input array to be binned. It has to be 1-dimensional.
-    bins : int, sequence of scalars, or IntervalIndex
-        If `bins` is an int, it defines the number of equal-width bins in the
-        range of `x`. However, in this case, the range of `x` is extended
-        by .1% on each side to include the min or max values of `x`. If
-        `bins` is a sequence it defines the bin edges allowing for
-        non-uniform bin width. No extension of the range of `x` is done in
-        this case.
-    right : bool, optional
-        Indicates whether the bins include the rightmost edge or not. If
-        right == True (the default), then the bins [1,2,3,4] indicate
+    bins : int, sequence of scalars, or pandas.IntervalIndex
+        If `bins` is an int, defines the number of equal-width bins in the
+        range of `x`. The range of `x` is extended by .1% on each side to
+        include the min or max values of `x`.
+        If `bins` is a sequence, defines the bin edges allowing for
+        non-uniform bin width. No extension of the range of `x` is done.
+    right : bool, optional, default 'True'
+        Indicates whether the `bins` include the rightmost edge or not. If
+        `right == True` (the default), then the `bins` [1,2,3,4] indicate
         (1,2], (2,3], (3,4].
-    labels : array or boolean, default None
-        Used as labels for the resulting bins. Must be of the same length as
-        the resulting bins. If False, return only integer indicators of the
-        bins.
-    retbins : bool, optional
-        Whether to return the bins or not. Can be useful if bins is given
+    labels : array or bool, optional
+        Used as labels for the resulting `bins`. Must be of the same length as
+        the resulting `bins`. If False, returns only integer indicators of the
+        `bins`.
+    retbins : bool, optional, default 'False'
+        Whether to return the `bins` or not. Useful when `bins` is provided
         as a scalar.
-    precision : int, optional
-        The precision at which to store and display the bins labels
-    include_lowest : bool, optional
+    precision : int, optional, default '3'
+        The precision at which to store and display the `bins` labels.
+    include_lowest : bool, optional, default 'False'
         Whether the first interval should be left-inclusive or not.
 
     Returns
     -------
-    out : Categorical or Series or array of integers if labels is False
-        The return type (Categorical or Series) depends on the input: a Series
-        of type category if input is a Series else Categorical. Bins are
-        represented as categories when categorical data is returned.
-    bins : ndarray of floats
-        Returned only if `retbins` is True.
+    out : pandas.Categorical or Series, or array of int if `labels` is 'False'
+        The return type depends on the input.
+        If the input is a Series, a Series of type category is returned.
+        Else - pandas.Categorical is returned. `Bins` are represented as
+        categories when categorical data is returned.
+    bins : numpy.ndarray of floats
+        Returned only if `retbins` is 'True'.
+
+    See Also
+    --------
+    qcut : Discretize variable into equal-sized buckets based on rank
+        or based on sample quantiles.
+    pandas.Categorical : Represents a categorical variable in
+        classic R / S-plus fashion.
+    Series : One-dimensional ndarray with axis labels (including time series).
+    pandas.IntervalIndex : Immutable Index implementing an ordered,
+        sliceable set. IntervalIndex represents an Index of intervals that
+        are all closed on the same side.
 
     Notes
     -----
-    The `cut` function can be useful for going from a continuous variable to
-    a categorical variable. For example, `cut` could convert ages to groups
-    of age ranges.
-
-    Any NA values will be NA in the result. Out of bounds values will be NA in
-    the resulting Categorical object
-
+    Any NA values will be NA in the result. Out of bounds values will be NA in
+    the resulting pandas.Categorical object.
 
     Examples
     --------
@@ -88,7 +99,7 @@ def cut(x, bins, right=True, labels=None, retbins=False, precision=3,
     Categories (3, object): [good < medium < bad]
 
     >>> pd.cut(np.ones(5), 4, labels=False)
-    array([1, 1, 1, 1, 1])
+    array([1, 1, 1, 1, 1], dtype=int64)
     """
     # NOTE: this binning code is changed a bit from histogram for var(x) == 0
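
The behaviour described in the updated docstring can be checked with a short sketch. This is not part of the commit. The `ages` and `edges` values and all variable names are made up for illustration, and it assumes a reasonably recent pandas and NumPy are installed.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; 120 lies outside the explicit edges below.
ages = np.array([5, 18, 30, 65, 120])
edges = [0, 18, 65, 100]

# right=True (the default): bins are (0, 18], (18, 65], (65, 100].
codes_right = pd.cut(ages, bins=edges, labels=False)

# right=False: bins are [0, 18), [18, 65), [65, 100).
codes_left = pd.cut(ages, bins=edges, right=False, labels=False)

# Array input with custom labels (one label per bin) gives a Categorical.
as_categorical = pd.cut(ages, bins=edges, labels=["child", "adult", "senior"])

# Series input gives a Series of category dtype.
as_series = pd.cut(pd.Series(ages), bins=edges)

# With an integer `bins`, retbins=True also returns the computed edges.
_, computed_edges = pd.cut(ages, bins=3, retbins=True)

print(codes_right)     # the out-of-bounds value 120 becomes NaN
print(codes_left)      # 18 and 65 move up one bin compared with right=True
print(as_categorical)
print(as_series.dtype)
print(computed_edges)
```

Comparing `codes_right` with `codes_left` shows how the edge values 18 and 65 change bins when `right` is flipped, and the last entry of each shows the out-of-bounds rule from the Notes section.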
