Commit cec30e8

doc improvements, particulary dataset docs
1 parent 2e27b86 commit cec30e8

1 file changed, with 31 additions and 22 deletions

doc/data-structures.rst

Lines changed: 31 additions & 22 deletions
@@ -74,9 +74,11 @@ in index values in the same way.
 Coordinates can take the following forms:
 
 - A list of ``(dim, ticks[, attrs])`` pairs with length equal to the number of dimensions
-- A dictionary of ``{coord_name: coord}`` where the values are scaler values,
-  1D arrays or tuples (tuples in the same form as above). This form lets you supply other
-  coordinates than those corresponding to dimensions (more on these later).
+- A dictionary of ``{coord_name: coord}`` where the values are each a scalar value,
+  a 1D array or a tuple. Tuples are be in the same form as the above, and
+  multiple dimensions can be supplied with the form ``(dims, data[, attrs])``.
+  Supplying as a tuple allows other coordinates than those corresponding to
+  dimensions (more on these later).
 
 As a list of tuples:
 
@@ -92,6 +94,14 @@ As a dictionary:
                                  'ranking': ('space', [1, 2, 3])},
                    dims=['time', 'space'])
 
+As a dictionary with coords across multiple dimensions:
+
+.. ipython:: python
+
+    xray.DataArray(data, coords={'time': times, 'space': locs, 'const': 42,
+                                 'ranking': (('space', 'time'), np.arange(12).reshape(4,3))},
+                   dims=['time', 'space'])
+
 If you create a ``DataArray`` by supplying a pandas
 :py:class:`~pandas.Series`, :py:class:`~pandas.DataFrame` or
 :py:class:`~pandas.Panel`, any non-specified arguments in the
@@ -194,8 +204,7 @@ to access any variable in a dataset, datasets have four key properties:
   each dimension (e.g., ``{'x': 6, 'y': 6, 'time': 8}``)
 - ``data_vars``: a dict-like container of DataArrays corresponding to variables
 - ``coords``: another dict-like container of DataArrays intended to label points
-  used in ``data_vars`` (e.g., 1-dimensional arrays of numbers, datetime
-  objects or strings)
+  used in ``data_vars`` (e.g., arrays of numbers, datetime objects or strings)
 - ``attrs``: an ``OrderedDict`` to hold arbitrary metadata
 
 The distinction between whether a variables falls in data or coordinates
@@ -223,18 +232,16 @@ Creating a Dataset
 ~~~~~~~~~~~~~~~~~~
 
 To make an :py:class:`~xray.Dataset` from scratch, supply dictionaries for any
-variables, coordinates and attributes you would like to insert into the
-dataset.
+variables (``data_vars``), coordinates (``coords``) and attributes (``attrs``).
 
-For the ``data_vars`` and ``coords`` arguments, keys should be the name of the
-variable and values should be scalars, 1d arrays or tuples of the form
-``(dims, data[, attrs])`` sufficient to label each array:
+``data_vars`` are supplied as a dictionary with each key as the name of the variable and each
+value as one of:
+- A :py:class:`~xray.DataArray`
+- A tuple of the form ``(dims, data[, attrs])``
+- A pandas object
 
-- ``dims`` should be a sequence of strings.
-- ``data`` should be a numpy.ndarray (or array-like object) that has a
-  dimensionality equal to the length of ``dims``.
-- ``attrs`` is an arbitrary Python dictionary for storing metadata associated
-  with a particular array.
+``coords`` are supplied as dictionary of ``{coord_name: coord}`` where the values are scalar values,
+arrays or tuples in the form of ``(dims, data[, attrs])``.
 
 Let's create some fake data for the example we show above:
 
@@ -259,8 +266,8 @@ Notice that we did not explicitly include coordinates for the "x" or "y"
 dimensions, so they were filled in array of ascending integers of the proper
 length.
 
-We can also pass :py:class:`xray.DataArray` objects or a pandas object as values
-in the dictionary instead of tuples:
+Here we pass :py:class:`xray.DataArray` objects or a pandas object as values
+in the dictionary:
 
 .. ipython:: python
 
@@ -271,13 +278,15 @@ in the dictionary instead of tuples:
 
     xray.Dataset({'bar': foo.to_pandas()})
 
-Where a pandas object is supplied, the names of its indexes are used as dimension
+Where a pandas object is supplied as a value, the names of its indexes are used as dimension
 names, and its data is aligned to any existing dimensions.
 
-You can also create an dataset from a :py:class:`pandas.DataFrame` with
-:py:meth:`Dataset.from_dataframe <xray.Dataset.from_dataframe>` or from a
-netCDF file on disk with :py:func:`~xray.open_dataset`. See
-:ref:`pandas` and :ref:`io`.
+You can also create an dataset from:
+- A :py:class:`pandas.DataFrame` or :py:class:`pandas.Panel` along its columns and items
+  respectively, by passing it into the :py:class:`xray.Dataset` directly
+- A :py:class:`pandas.DataFrame` with :py:meth:`Dataset.from_dataframe <xray.Dataset.from_dataframe>`,
+  which will additionally handle MultiIndexes See :ref:`pandas`
+- A netCDF file on disk with :py:func:`~xray.open_dataset`. See :ref:`io`.
 
 Dataset contents
 ~~~~~~~~~~~~~~~~
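
As a rough illustration of the constructor forms this diff documents, the sketch below builds a ``DataArray`` with a coordinate spanning two dimensions and then a ``Dataset`` from a ``DataArray`` and a ``(dims, data[, attrs])`` tuple. It is not part of the commit. It assumes the ``xarray`` package (the later name of ``xray``) accepts the same forms, and the values of ``data``, ``times`` and ``locs`` are hypothetical stand-ins for variables defined outside the changed lines. The multi-dimensional coordinate here is declared as ``('time', 'space')`` so that its ``(4, 3)`` shape matches the assumed dimension sizes.

```python
import numpy as np
import pandas as pd
import xarray as xr  # assumption: xarray, the successor to xray, accepts the same forms

# Hypothetical stand-ins for the variables used in the documentation examples.
data = np.random.rand(4, 3)
times = pd.date_range("2000-01-01", periods=4)
locs = ["IA", "IL", "IN"]

# coords as a dictionary: 1D arrays, a scalar, and a tuple of (dims, data)
# whose data spans both dimensions.
arr = xr.DataArray(
    data,
    coords={
        "time": times,
        "space": locs,
        "const": 42,
        "ranking": (("time", "space"), np.arange(12).reshape(4, 3)),
    },
    dims=["time", "space"],
)

# data_vars as a dictionary whose values are a DataArray or a
# (dims, data[, attrs]) tuple; coords may hold scalar values.
ds = xr.Dataset(
    data_vars={
        "foo": arr,
        "bar": (("time", "space"), data * 2, {"units": "hypothetical"}),
    },
    coords={"reference": 0},
)
```

Printing ``arr`` and ``ds`` shows ``ranking`` and ``const`` listed as coordinates that are not dimension indexes, alongside the ``time`` and ``space`` index coordinates.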
