Skip to content

[Do not merge] Adding crossrefs to standard library #7624

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

hoefling
Copy link
Contributor

@hoefling hoefling commented Oct 4, 2019

This is the second part of adding referencing to mypy docs (first part in #7623, this one is based on the first part). Replaced verbatim texts with crossrefs to standard library where applicable. Example: a passage

The ``typing`` module defines various protocol classes that correspond
to common Python protocols, such as ``Iterable[T]``.  If a class
defines a suitable ``__iter__`` method, mypy understands that it
implements the iterable protocol and is compatible with ``Iterable[T]``.
For example, ``IntList`` below is iterable, over ``int`` values:

image
becomes

The :py:mod:`typing` module defines various protocol classes that
correspond to common Python protocols, such as
:py:class:`Iterable[T] <typing.Iterable>`.
If a class defines a suitable :py:meth:`__iter__ <object.__iter__>`
method, mypy understands that it implements the iterable protocol
and is compatible with :py:class:`Iterable[T] <typing.Iterable>`.
For example, ``IntList`` below is iterable, over :py:class:`int`
values:

image
where each "bold verbatim" block references to the relevant spot in the standard library. IMO this has several advantages:

  1. When reading mypy docs, one doesn't need to keep a separate tab with typing docs opened, each term is immediately available for lookup.
  2. Sometimes the highlighting is already enough - I can e.g. hover the link over SupportsRound and immediately get the hint - aha, this is part of the stdlib, typing module, can read it up later. Otherwise, especially an unexperienced reader will first have to google SupportsRound in separate tab.

@hoefling
Copy link
Contributor Author

hoefling commented Oct 4, 2019

Open points:

  1. First of all, I'm not sure whether you find the suggestion in this PR worth it at all. While Replace hardcoded links with intersphinx refs in docs #7623 is IMO important, this one is not. Especially that it will probably become a hassle to review as I have read and edited throughout the whole documentation.

  2. I have not replaced all references to basic stuff like int, float, str, bytes etc as I'm unsure whether this wouldn't be an overkill - however, this can easily be done with sed, e.g. for int:
    find docs/source -name "*.rst" | xargs sed -i 's/\`\`int\`\`/:py:class:`int`/g'

  3. I haven't added references to expressions with types, like Mapping[str, int], would want to ask for permission first. Sometimes a reference is context-dependent, e.g. in the text:

    the parameter type ``slice`` and the return type ``Sequence[T]`` are compatible with ``Union[int, slice]`` and ``Union[T, Sequence]``
    

    linking both Union[int, slice] and Union[T, Sequence] to https://docs.python.org/3/library/typing.html#typing.Union doesn't add to readability IMO.

    However, in another example:

    result of calling an ``async def`` function *without awaiting* will be a value of type ``typing.Coroutine[Any, Any, T]``, which is a subtype of ``Awaitable[T]``
    

    linking both Coroutine and Awaitable would increase the readability.

  4. I have used the Python 3.7 docs (current release) for referencing. With Python 3.8, the typing module will be extended with more features from typing_extensions (TypedDict, Literal, Protocol etc). When using Python 3.8-rc docs already (hope Python 3.8 will be released soon anyway), I could reference those as well. This way, we would have up-to-date docs for 3.8 release already. Also, right now the description of TypedDict, Protocol and Literal suggests to install typing_extensions - I could add notes mentioning their availability in 3.8, similar to the one I found under Simple user-defined protocols.

Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think None should be replaced with a link. It has the same kind of status as True and False, which you don't change.

Also perhaps not every mention of Any needs to be linked? This gets kind of busy, and there are a lot of occurrences of it.

@gvanrossum
Copy link
Member

Also, I merged your first PR, so this one now has conflicts to resolve. (Probably just remove the first commit.)

@hoefling
Copy link
Contributor Author

hoefling commented Oct 4, 2019

@gvanrossum first of all, thank you for reviewing the other PR! I'm not happy with this first iteration myself - played too much with sed at the beginning and often inserted the refs just because they can be inserted, not because they should be inserted. Let me reiterate this until Monday. I would propose the following rules:

  1. Most common terms (int, None, Any etc) are referenced once per document, first occurence only.
  2. Less common terms (list, typing.Dict, third-party libraries etc) are referenced once per chapter/section, first occurence only. Ideally, the same term is referenced once per each screen no matter what page anchor was selected. This way, one can easily spot a needed reference if coming to the docs from google search.

Also, I have discovered doc8 for checking documentation, would fix some errors it emits.

@gvanrossum
Copy link
Member

Thanks, those rules seem reasonable, except I would personally categorize list as "most common terms". (But not typing.Dict.)

@hoefling hoefling force-pushed the stdlib-crossref branch 4 times, most recently from e3d146c to 4618e9e Compare October 5, 2019 20:56
@hoefling
Copy link
Contributor Author

hoefling commented Oct 5, 2019

I think I have it ready for a review now; sadly, these are still a lot of changes. However, all refs are valid as Sphinx doesn't emit any warnings when building docs. To verify locally, install Sphinx via pip install sphinx and issue sphinx-build -Ean docs/source docs/build to force a complete rebuild and run checks in nitpicky mode.

As the overall diff is enormously big, it will probably take a lot of time to verify in one go - I will gladly break the changes up into multiple PRs if you wish (e.g. one PR per document).

I have also discovered and fixed minor issues like using markdown instead of ReST or syntax errors in code examples. This adds up to the total diffs, sadly.

I have decided not to reference None at all, as advised by @gvanrossum; common classes like int, str, Any etc. are referenced once per .rst document. The rest of entities in typing module and third-party entities, if occurring often, are referenced once per chapter or enough to be visible once per screen. Example:

image

Last but not least: refs to py38 features

I have also referenced the new features in Python 3.8 (TypedDict, Protocol, Literal, Final etc) against the 3.8 docs. Thus:

  1. The imports from typing_extensions in examples had to be updated too (example). I have tested the changed examples out, but the error messages often differ from the ones in the examples - I hope this is no big deal. If it is, I would propose to revisit the examples in another PR.
  2. I have added or updated the notes reflecting the 3.8 changes. Relevant diffs:

Is this too soon / should this be rather done in a separate PR?

@hoefling
Copy link
Contributor Author

hoefling commented Oct 5, 2019

Additional ideas:

doctest code examples?

In our company's docs, we require that each example is inside in a .. doctest:: directive (using .. code-block:: python is actually prohibited). This way, we execute doctests with Sphinx on code changes and verify the examples in the docs still work. Marrying doctests with mypy, however, would require some work, probably best would be adding a custom doctest runner that will parse code in doctest blocks and lint them by executing mypy.api.run - a custom runner like this can be added e.g. with Sybil. This may be not worth it, depending on whether you have to edit the examples often when the code changes. #2393 suggests it may be an issue, though.

API docs?

I saw mypy mentions its own code in the Extending and integrating mypy chapter. Sphinx can generate an API docs section from code (Sphinx has a built-in autodoc extension for that), so mypy.api.run or mypy.plugin.Plugin can be referenced to internal API and the docstrings from code will be automatically inserted in the docs. This will e.g. cover typos like mypy.plugins.Plugin in the High-level overview section (fixed that already in this PR). However, this suggestion may be too soon since mypys API is not stable yet.

@hoefling
Copy link
Contributor Author

hoefling commented Oct 6, 2019

Here's an example documentation showing how the rendered API could look like, with a minimal subset of the codebase made available.

@gvanrossum
Copy link
Member

Before I go review this in detail, let me suggest that you try to focus on a single issue per PR. If you have additional ideas on which you'd like to receive feedback (e.g. API docs) please open a new issue. If you have fixes that are unrelated to the core change here (making things link to the Python docs), submit them as a separate PR (we can land those immediately).

I would also skip linking to Python 3.8 -- even after that's released, most people will not immediately upgrade, so typing_extensions still remains relevant.

Finally, you have a few links (e.g. https://github.com/python/mypy/pull/7624/files#diff-6671e4e194a310d82efeca7d2f55670cL426-L429) that don't point anywhere interesting -- this doesn't scroll to anything that looks like L426-L429.

@gvanrossum
Copy link
Member

Another suggestion: don't be too clever with not linking to the docs more than once per page / section / file / whatever. The non-link markup actually looks more hideous (red) than the link markup (black bold). And if you look through Python's own docs, in general when something is linked, there's no fear to link to it multiple times. (Example: in Doc/library/dataclasses.rst, every occurrence of __init__ is spelled as

:meth:`__init__`

I haven't checked if the Python docs are consistent in this (it's all done manually and I'm sure different authors use slightly different conventions) but I'd say stick to the KISS principle (Keep It Simple, Stupid), which in this case I interpret as "for a particular term, either never link or always link".

Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, this is too much to review in one go. Would it make sense to split it up in, say, 5-6 PRs, for groups of 5-6 files each?

That enables this to work:
easier. :py:func:`attr.ib` and :py:class:`attr.Factory` actually
return objects, but the annotation says these return the types that
they expect to be assigned to. That enables this to work:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Diff chunks like this one look worse than they are because the paragraph was reformatted. We don't particularly care about line lengths in the doc sources. Perhaps the diff would be simpler if you just let the lines become longer, and kept the line breaks where they were? (GitHub has a nice feature where it uses two different shares of red and green to indicate text that was just moved vs. text that was actually changed, but reflowing a paragraph is considered to be an actual change. :-( )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes the job a lot easier, knowing this!

As an experimental mypy extension, you can specify
:py:data:`~typing.Callable` types that support keyword arguments,
optional arguments, and more. When you specify the arguments of a
``Callable``, you can choose to supply just the type of a nameless
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an example of the KISS principle I mentioned, this second occurrence of Callable could easily be linked as well, even though it's in the same paragraph as the first.

@@ -22,19 +22,22 @@ Type Description
``Any`` dynamically typed value with an arbitrary type
====================== ===============================

The type ``Any`` and type constructors such as ``List``, ``Dict``,
``Iterable`` and ``Sequence`` are defined in the ``typing`` module.
The type :py:data:`~typing.Any` and type constructors such as :py:class:`~typing.List`,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am thinking that all the types mentioned here are just too common to bother linking to their docs. I would only link the typing module itself.

Another idea: just above here is a table showing the most common built-in types. Maybe the items in the table should be linked instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about linking the table items instead, but the classes listed contain types, e.g. Tuple[int, int]. The typing module docs do not link those, e.g. Optional docs. I guess it counts as an example, which are not linked in the stdlib docs.

@@ -34,8 +34,8 @@ quite understand what is going on.
assert isinstance(o, int)
print(o + 5) # OK: type of 'o' is 'int' here

You don't need a cast for expressions with type ``Any``, or when
assigning to a variable with type ``Any``, as was explained earlier.
You don't need a cast for expressions with type :py:data:`~typing.Any`,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, Any should probably not be linked. It's just too common. Also, the explanation of this type in the Python docs is rather lame. If anything, it should probably link to an explanation of the concept in the mypy docs (if there is one), or perhaps in PEP 483.

@@ -24,8 +25,8 @@ initialized within the class. Mypy infers the types of attributes:
a.y = 3 # Error: 'A' has no attribute 'y'

This is a bit like each class having an implicitly defined
``__slots__`` attribute. This is only enforced during type
checking and not when your program is running.
:py:data:`__slots__ <object.__slots__>` attribute. This is only
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an example of one where I do think the link is useful.

@hoefling
Copy link
Contributor Author

hoefling commented Oct 7, 2019

Thank you for guiding me through this - I have never contributed anything more significant than a bugfix, so I'm still learning to find the right balance between "this change should be included no matter what" and "any sane person should be able to review it in the end". I suggest to keep this PR open while I land more related PRs and once I verified each change has found its place, I will simply close this one (and open new issues for proposals that I'm not sure whether or how to solve).

@hoefling
Copy link
Contributor Author

Closing this since the changes were reworked and landed in linked PRs.

@hoefling hoefling closed this Oct 13, 2019
@hoefling hoefling deleted the stdlib-crossref branch October 13, 2019 09:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants