DOC: improved the scatter method #20118

Merged
merged 14 commits into from
Mar 14, 2018
Changes from 5 commits
70 changes: 62 additions & 8 deletions pandas/plotting/_core.py
@@ -2852,22 +2852,76 @@ def pie(self, y=None, **kwds):

def scatter(self, x, y, s=None, c=None, **kwds):
"""
Scatter plot
Create a scatter plot with varying marker point size and color.

The coordinates of each point are defined by two dataframe columns and
Contributor

Muuuuch better!!!!!!

filled circles are used to represent each point. This kind of plot is
useful to see complex correlations between two variables. Points could
be for instance natural 2D coordinates like longitude and latitude in
a map or, in general, any pair of metrics that can be plotted against
each other.

Parameters
----------
x, y : label or position, optional
Coordinates for each point.
s : scalar or array_like, optional
Size of each point.
c : label or position, optional
Color of each point.
`**kwds` : optional
x : int, str
Contributor

@dukebody dukebody Mar 14, 2018

I think this should be int or str instead, according to the guide: "If more than one type is accepted, separate them by commas, except the last two types, that need to be separated by the word ‘or’"

The column name or column position to be used as horizontal
coordinates for each point.
y : int, str
The column name or column position to be used as vertical
coordinates for each point.
s : scalar, array_like, optional
The size of each point. Possible values are:

- A single scalar so all points have the same size.

- A sequence of scalars, which will be used for each point's size
recursively. For intance [2,14] all points will be size 2 or 14,
Contributor

"For instance, using [2, 14] all points will be of size ...". Typo in "instance", the rest is a proposal to make it more prose-English.

alternatively.

c : str, int, array_like, optional
The color of each point. Possible values are:

- A single color string referred to by name, RGB or RGBA code,
Contributor

Have you verified that these bullet points render nicely in the final HTML? I'm not good at reStructuredText, so I tend to be wary about these things. :)

Contributor Author

You need the blank lines to properly render the bullet points ;)
[Screenshot attached: 2018-03-14 at 09:59:01]

for instance 'red' or '#a98d19'.

- A sequence of color strings referred to by name, RGB or RGBA code,
which will be used for each point's color recursively. For intance
['green','yellow'] all points will be filled in green or yellow,
alternatively.

- A column name or position whose values will be used to color the
marker points according to a colormap.

**kwds : optional
Keyword arguments to pass on to :py:meth:`pandas.DataFrame.plot`.

Returns
-------
axes : matplotlib.AxesSubplot or np.array of them

See Also
--------
matplotlib.pyplot.scatter : scatter plot using multiple input data
formats.

Examples
--------
Let's see how to draw a scatter plot using coordinates and color from
the values in three DataFrame columns.

.. plot::
Contributor

Can you add some text above the code explaining what you are doing here?

:context: close-figs

>>> df = pd.DataFrame([[5.1, 3.5, 0], [4.9, 3.0, 0], [7.0, 3.2, 1],
... [6.4, 3.2, 1], [5.9, 3.0, 2]],
... columns = ['length', 'width', 'species'])
Contributor

Can you remove the spaces around the equals sign? columns=['length...

>>> ax1 = df.plot.scatter(x='length',
... y='width',
... c='DarkBlue')
>>> ax2 = df.plot.scatter(x='length',
... y='width',
... c='species',
... colormap='viridis')
"""
return self(kind='scatter', x=x, y=y, c=c, s=s, **kwds)
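
The docstring above says that a short sequence passed as `s` or `c` is reused across the points in turn ([2, 14] gives sizes 2, 14, 2, 14, ...). The following is a minimal sketch of that cycling, written with the standard library only. The helper name `expand_cyclically` is hypothetical and is not part of pandas or matplotlib; it only illustrates the behaviour the docstring describes, not how either library implements it.

```python
from itertools import cycle, islice


def expand_cyclically(values, n_points):
    """Repeat ``values`` in order until there are ``n_points`` entries.

    Hypothetical helper for illustration; not a pandas or matplotlib API.
    """
    values = list(values)
    if not values:
        raise ValueError("values must contain at least one entry")
    # cycle() restarts the sequence when it is exhausted;
    # islice() stops after n_points items.
    return list(islice(cycle(values), n_points))


# Five points, matching the five rows of the example DataFrame.
sizes = expand_cyclically([2, 14], 5)
colors = expand_cyclically(['green', 'yellow'], 5)

print(sizes)   # [2, 14, 2, 14, 2]
print(colors)  # ['green', 'yellow', 'green', 'yellow', 'green']
```

Passing such a full-length list as `s=` or `c=` to `df.plot.scatter` makes the per-point assignment explicit, which may be preferable because the handling of sequences shorter than the data can differ between matplotlib versions.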
