DOC: update the pandas.DataFrame.notna and pandas.Series.notna docstring #20160

Merged
merged 12 commits into from
Mar 13, 2018
107 changes: 103 additions & 4 deletions pandas/core/generic.py
@@ -5458,13 +5458,63 @@ def asof(self, where, subset=None):
# Action Methods

_shared_docs['isna'] = """
Detect missing values.

Return a boolean same-sized object indicating if the values are NA.
NA values, such as None or :attr:`numpy.NaN`, get mapped to True
values.
Everything else get mapped to False values. Characters such as empty
Member

get -> gets ?

Author

Sorry, English is not my first language. Fixed the typos and changed the link as proposed in the .notna docstring.

strings `''` or :attr:`numpy.inf` are not considered NA values
(unless you set :attr:`pandas.options.mode.use_inf_as_na` `= True`).
Contributor

Does this attr work? I don't know if it's in our API docs. I think it'd be OK to just have unless you set ``pandas.options.mode.use_inf_as_na = True``

Author

We have seen :attr: references throughout other docstrings (frame.py). I can change it, thoughts?

Contributor

It's fine.

Member

It's not the attr itself, but the fact that there is nothing to link to, I think?

I would rather make this ``pandas.options.mode.use_inf_as_na = True``
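
The default behaviour this thread is documenting can be checked with a short sketch. It assumes a reasonably recent pandas and NumPy (hence the lowercase `np.nan` spelling) and leaves the `use_inf_as_na` option untouched; the variable names are illustrative and not taken from the PR.

```python
import numpy as np
import pandas as pd

# Two genuine NA markers, followed by an empty string and infinity.
# dtype=object keeps the values exactly as written.
ser = pd.Series([None, np.nan, '', np.inf], dtype=object)

# With default options, only None and NaN are flagged as missing.
mask = ser.isna()
print(mask.tolist())
```

Only the first two entries are reported as missing here, which matches the wording of the docstring under review.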


Returns
-------
bool of type %(klass)s
Contributor

Is this type annotation right? It looks like bool is the return variable name (which is not needed).
https://python-sprints.github.io/pandas/guide/pandas_docstring.html#section-3-parameters

Author

Not the name

Contributor

A complex type then? So it should be something like "dict of int", but it looks like it's inverted.

Author

We get a DataFrame or a Series that contains boolean values.
It feels like writing DataFrame of bool or Series of bool may be harder to understand for new users. Any thoughts?

Contributor

The guidelines specify that complex types should be written like list of bool. Maybe you can just say that the return type is DataFrame and in the explanation specify that its dtype is bool.

Contributor

I think the type should be just %(klass)s

%(klass)s
    Each element of the %(klass)s will be a boolean.

Contributor

@villasv Mar 12, 2018

Agreed, it's the closest to the guidelines.
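
To make the return-type discussion concrete, here is a minimal sketch, assuming a recent pandas and NumPy; the frame contents are a trimmed version of the docstring example and the variable names are illustrative. It shows that the container type is preserved and only the element dtype becomes bool, which is what the `%(klass)s` return annotation is meant to convey.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'age': [5, 6, np.nan],
                   'toy': [None, 'Batmobile', 'Joker']})

df_mask = df.isna()          # DataFrame in, DataFrame out
ser_mask = df['age'].isna()  # Series in, Series out

# The container type is unchanged; every column (or the Series) is bool.
print(type(df_mask).__name__, df_mask.dtypes.tolist())
print(type(ser_mask).__name__, ser_mask.dtype)
```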

Mask of True/False values for each element in %(klass)s that
indicates whether an element is an NA value

See Also
--------
%(klass)s.notna : boolean inverse of isna
%(klass)s.isnull : alias of isna
%(klass)s.notna : boolean inverse of isna
%(klass)s.dropna : omit axes labels with missing values
isna : top-level isna

Examples
--------
Show which entries in a DataFrame are NA.

>>> df = pd.DataFrame({'age': [5, 6, np.NaN],
... 'born': [pd.NaT, pd.Timestamp('1939-05-27'),
... pd.Timestamp('1940-04-25')],
... 'name': ['Alfred', 'Batman', ''],
... 'toy': [None, 'Batmobile', 'Joker']})
>>> df
age born name toy
0 5.0 NaT Alfred None
1 6.0 1939-05-27 Batman Batmobile
2 NaN 1940-04-25 Joker

>>> df.isna()
age born name toy
0 False True False True
1 False False False False
2 True False False False

Show which entries in a Series are NA.

>>> ser = pd.Series([5, 6, np.NaN])
>>> ser
0 5.0
1 6.0
2 NaN
dtype: float64

>>> ser.isna()
0 False
1 False
2 True
dtype: bool
"""

@Appender(_shared_docs['isna'] % _shared_doc_kwargs)
@@ -5476,14 +5526,63 @@ def isnull(self):
return isna(self).__finalize__(self)

_shared_docs['notna'] = """
Return a boolean same-sized object indicating if the values are
not NA.
Detect existing (non-missing) values.

Return a boolean same-sized object indicating if the values are not NA.
Non-missing values get mapped to True. Characters such as empty
strings `''` or :attr:`numpy.inf` are not considered NA values
(unless you set :attr:`pandas.options.mode.use_inf_as_na` `= True`).
Contributor

Same here about the pandas option.

NA values, such as None or :attr:`numpy.NaN`, get mapped to False
values.

Returns
-------
bool of type %(klass)s
Mask of True/False values for each element in %(klass)s that
indicates whether an element is not an NA value

See Also
--------
%(klass)s.isna : boolean inverse of notna
%(klass)s.notnull : alias of notna
%(klass)s.isna : boolean inverse of notna
%(klass)s.dropna : omit axes labels with missing values
notna : top-level notna

Examples
--------
Show which entries in a DataFrame are not NA.

>>> df = pd.DataFrame({'age': [5, 6, np.NaN],
... 'born': [pd.NaT, pd.Timestamp('1939-05-27'),
... pd.Timestamp('1940-04-25')],
... 'name': ['Alfred', 'Batman', ''],
... 'toy': [None, 'Batmobile', 'Joker']})
>>> df
age born name toy
0 5.0 NaT Alfred None
1 6.0 1939-05-27 Batman Batmobile
2 NaN 1940-04-25 Joker

>>> df.notna()
age born name toy
0 True False True False
1 True True True True
2 False True True True

Show which entries in a Series are not NA.

>>> ser = pd.Series([5, 6, np.NaN])
>>> ser
0 5.0
1 6.0
2 NaN
dtype: float64

>>> ser.notna()
0 True
1 True
2 False
dtype: bool
"""

@Appender(_shared_docs['notna'] % _shared_doc_kwargs)
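
The `%(klass)s` placeholders in the shared docstrings are filled by ordinary Python `%` formatting with a dict, as in `_shared_docs['notna'] % _shared_doc_kwargs`. The sketch below illustrates that mechanism in isolation; `shared_docs`, `frame_kwargs` and `series_kwargs` are hypothetical stand-ins, not the actual pandas internals, and the template is abbreviated.

```python
# Hypothetical stand-ins for pandas' internal dictionaries; only the
# %(klass)s placeholder and the See Also line are taken from the diff.
shared_docs = {}
shared_docs['notna'] = """
Detect existing (non-missing) values.

See Also
--------
%(klass)s.isna : boolean inverse of notna
"""

frame_kwargs = {'klass': 'DataFrame'}
series_kwargs = {'klass': 'Series'}

# The % operator with a dict fills every %(name)s placeholder, so one
# template yields a docstring per class.
frame_doc = shared_docs['notna'] % frame_kwargs
series_doc = shared_docs['notna'] % series_kwargs

print(frame_doc)
print(series_doc)
```

One consequence of this design is that any literal percent sign in a shared docstring would have to be written as `%%`.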
36 changes: 33 additions & 3 deletions pandas/core/indexes/base.py
@@ -2022,18 +2022,48 @@ def hasnans(self):

def isna(self):
"""
Detect missing values
Detect missing values.

Return a boolean same-sized object indicating if the values are NA.
NA values, such as None or :attr:`numpy.NaN`, get mapped to True
values.
Everything else gets mapped to False values. Characters such as empty
strings `''` or :attr:`numpy.inf` are not considered NA values
(unless you set :attr:`pandas.options.mode.use_inf_as_na` `= True`).

.. versionadded:: 0.20.0

Returns
-------
a boolean array of whether my values are NA
numpy.ndarray
A boolean array of whether my values are NA

See also
--------
isnull : alias of isna
pandas.Index.isnull : alias of isna
pandas.Index.notna : boolean inverse of isna
pandas.Index.dropna : omit entries with missing values
pandas.isna : top-level isna

Examples
--------
Show which entries in a pandas.Index are NA. The result is an
array.

>>> idx = pd.Index([5.2, 6.0, np.NaN])
>>> idx
Float64Index([5.2, 6.0, nan], dtype='float64')
>>> idx.isna()
array([False, False, True])

Empty strings are not considered NA values. None is considered an NA
value.

>>> idx = pd.Index(['black', '', 'red', None])
>>> idx
Index(['black', '', 'red', None], dtype='object')
>>> idx.isna()
array([False, False, False, True])
"""
return self._isnan
isnull = isna
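
Unlike the DataFrame and Series methods, `Index.isna` is documented above as returning a `numpy.ndarray`. A minimal sketch of that difference, assuming a recent pandas and NumPy and reusing the values from the docstring example (variable names are illustrative):

```python
import numpy as np
import pandas as pd

# dtype=object keeps None and the empty string exactly as written.
idx = pd.Index(['black', '', 'red', None], dtype=object)

# The mask is a plain NumPy boolean array, not an Index.
na_mask = idx.isna()
print(type(na_mask).__name__, na_mask.tolist())

# Index.notna gives the element-wise inverse, also as an ndarray.
print(idx.notna().tolist())
```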