Commit 863b1e4

mwidjaja authored and ethanfurman committed
bpo-29237: Create enum for pstats sorting options (GH-5103)
1 parent 4666ec5 commit 863b1e4

5 files changed: 144 additions and 61 deletions

Doc/library/profile.rst

Lines changed: 62 additions & 53 deletions
@@ -139,6 +139,7 @@ The :mod:`pstats` module's :class:`~pstats.Stats` class has a variety of methods
 for manipulating and printing the data saved into a profile results file::
 
    import pstats
+   from pstats import SortKey
    p = pstats.Stats('restats')
    p.strip_dirs().sort_stats(-1).print_stats()
 
@@ -148,14 +149,14 @@ entries according to the standard module/line/name string that is printed. The
 :meth:`~pstats.Stats.print_stats` method printed out all the statistics. You
 might try the following sort calls::
 
-   p.sort_stats('name')
+   p.sort_stats(SortKey.NAME)
    p.print_stats()
 
 The first call will actually sort the list by function name, and the second call
 will print out the statistics. The following are some interesting calls to
 experiment with::
 
-   p.sort_stats('cumulative').print_stats(10)
+   p.sort_stats(SortKey.CUMULATIVE).print_stats(10)
 
 This sorts the profile by cumulative time in a function, and then only prints
 the ten most significant lines. If you want to understand what algorithms are
@@ -164,20 +165,20 @@ taking time, the above line is what you would use.
 If you were looking to see what functions were looping a lot, and taking a lot
 of time, you would do::
 
-   p.sort_stats('time').print_stats(10)
+   p.sort_stats(SortKey.TIME).print_stats(10)
 
 to sort according to time spent within each function, and then print the
 statistics for the top ten functions.
 
 You might also try::
 
-   p.sort_stats('file').print_stats('__init__')
+   p.sort_stats(SortKey.FILENAME).print_stats('__init__')
 
 This will sort all the statistics by file name, and then print out statistics
 for only the class init methods (since they are spelled with ``__init__`` in
 them). As one final example, you could try::
 
-   p.sort_stats('time', 'cumulative').print_stats(.5, 'init')
+   p.sort_stats(SortKey.TIME, SortKey.CUMULATIVE).print_stats(.5, 'init')
 
 This line sorts statistics with a primary key of time, and a secondary key of
 cumulative time, and then prints out some of the statistics. To be specific, the
@@ -250,12 +251,13 @@ functions:
    without writing the profile data to a file::
 
       import cProfile, pstats, io
+      from pstats import SortKey
       pr = cProfile.Profile()
       pr.enable()
       # ... do something ...
       pr.disable()
       s = io.StringIO()
-      sortby = 'cumulative'
+      sortby = SortKey.CUMULATIVE
       ps = pstats.Stats(pr, stream=s).sort_stats(sortby)
       ps.print_stats()
       print(s.getvalue())
@@ -361,60 +363,65 @@ Analysis of the profiler data is done using the :class:`~pstats.Stats` class.
    .. method:: sort_stats(*keys)
 
       This method modifies the :class:`Stats` object by sorting it according to
-      the supplied criteria. The argument is typically a string identifying the
-      basis of a sort (example: ``'time'`` or ``'name'``).
+      the supplied criteria. The argument can be either a string or a SortKey
+      enum identifying the basis of a sort (example: ``'time'``, ``'name'``,
+      ``SortKey.TIME`` or ``SortKey.NAME``). The SortKey enum argument has an
+      advantage over the string argument in that it is more robust and less
+      error-prone.
 
       When more than one key is provided, then additional keys are used as
       secondary criteria when there is equality in all keys selected before
-      them. For example, ``sort_stats('name', 'file')`` will sort all the
-      entries according to their function name, and resolve all ties (identical
-      function names) by sorting by file name.
-
-      Abbreviations can be used for any key names, as long as the abbreviation
-      is unambiguous. The following are the keys currently defined:
-
-      +------------------+----------------------+
-      | Valid Arg        | Meaning              |
-      +==================+======================+
-      | ``'calls'``      | call count           |
-      +------------------+----------------------+
-      | ``'cumulative'`` | cumulative time      |
-      +------------------+----------------------+
-      | ``'cumtime'``    | cumulative time      |
-      +------------------+----------------------+
-      | ``'file'``       | file name            |
-      +------------------+----------------------+
-      | ``'filename'``   | file name            |
-      +------------------+----------------------+
-      | ``'module'``     | file name            |
-      +------------------+----------------------+
-      | ``'ncalls'``     | call count           |
-      +------------------+----------------------+
-      | ``'pcalls'``     | primitive call count |
-      +------------------+----------------------+
-      | ``'line'``       | line number          |
-      +------------------+----------------------+
-      | ``'name'``       | function name        |
-      +------------------+----------------------+
-      | ``'nfl'``        | name/file/line       |
-      +------------------+----------------------+
-      | ``'stdname'``    | standard name        |
-      +------------------+----------------------+
-      | ``'time'``       | internal time        |
-      +------------------+----------------------+
-      | ``'tottime'``    | internal time        |
-      +------------------+----------------------+
+      them. For example, ``sort_stats(SortKey.NAME, SortKey.FILENAME)`` will sort
+      all the entries according to their function name, and resolve all ties
+      (identical function names) by sorting by file name.
+
+      For the string argument, abbreviations can be used for any key names, as
+      long as the abbreviation is unambiguous.
+
+      The following are the valid string and SortKey arguments:
+
+      +------------------+---------------------+----------------------+
+      | Valid String Arg | Valid enum Arg      | Meaning              |
+      +==================+=====================+======================+
+      | ``'calls'``      | SortKey.CALLS       | call count           |
+      +------------------+---------------------+----------------------+
+      | ``'cumulative'`` | SortKey.CUMULATIVE  | cumulative time      |
+      +------------------+---------------------+----------------------+
+      | ``'cumtime'``    | N/A                 | cumulative time      |
+      +------------------+---------------------+----------------------+
+      | ``'file'``       | N/A                 | file name            |
+      +------------------+---------------------+----------------------+
+      | ``'filename'``   | SortKey.FILENAME    | file name            |
+      +------------------+---------------------+----------------------+
+      | ``'module'``     | N/A                 | file name            |
+      +------------------+---------------------+----------------------+
+      | ``'ncalls'``     | N/A                 | call count           |
+      +------------------+---------------------+----------------------+
+      | ``'pcalls'``     | SortKey.PCALLS      | primitive call count |
+      +------------------+---------------------+----------------------+
+      | ``'line'``       | SortKey.LINE        | line number          |
+      +------------------+---------------------+----------------------+
+      | ``'name'``       | SortKey.NAME        | function name        |
+      +------------------+---------------------+----------------------+
+      | ``'nfl'``        | SortKey.NFL         | name/file/line       |
+      +------------------+---------------------+----------------------+
+      | ``'stdname'``    | SortKey.STDNAME     | standard name        |
+      +------------------+---------------------+----------------------+
+      | ``'time'``       | SortKey.TIME        | internal time        |
+      +------------------+---------------------+----------------------+
+      | ``'tottime'``    | N/A                 | internal time        |
+      +------------------+---------------------+----------------------+
 
       Note that all sorts on statistics are in descending order (placing most
       time consuming items first), whereas name, file, and line number searches
       are in ascending order (alphabetical). The subtle distinction between
-      ``'nfl'`` and ``'stdname'`` is that the standard name is a sort of the
-      name as printed, which means that the embedded line numbers get compared
-      in an odd way. For example, lines 3, 20, and 40 would (if the file names
-      were the same) appear in the string order 20, 3 and 40. In contrast,
-      ``'nfl'`` does a numeric compare of the line numbers. In fact,
-      ``sort_stats('nfl')`` is the same as ``sort_stats('name', 'file',
-      'line')``.
+      ``SortKey.NFL`` and ``SortKey.STDNAME`` is that the standard name is a
+      sort of the name as printed, which means that the embedded line numbers
+      get compared in an odd way. For example, lines 3, 20, and 40 would (if
+      the file names were the same) appear in the string order 20, 3 and 40.
+      In contrast, ``SortKey.NFL`` does a numeric compare of the line numbers.
+      In fact, ``sort_stats(SortKey.NFL)`` is the same as
+      ``sort_stats(SortKey.NAME, SortKey.FILENAME, SortKey.LINE)``.
 
       For backward-compatibility reasons, the numeric arguments ``-1``, ``0``,
       ``1``, and ``2`` are permitted. They are interpreted as ``'stdname'``,
@@ -424,6 +431,8 @@ Analysis of the profiler data is done using the :class:`~pstats.Stats` class.
 
       .. For compatibility with the old profiler.
 
+      .. versionadded:: 3.7
+         Added the SortKey enum.
 
    .. method:: reverse_order()
 
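The distinction the documentation draws between ``SortKey.STDNAME`` and ``SortKey.NFL`` can be illustrated without a profiler at all. The following is a minimal sketch, not pstats code: the `entries` list and the `std_string` helper are hypothetical, and the `file:line(name)` layout is assumed to match the standard name that pstats prints.

```python
# Hypothetical (file, line, name) entries that share one file name.
entries = [("mod.py", 3, "f"), ("mod.py", 20, "f"), ("mod.py", 40, "f")]


def std_string(entry):
    # Assumed to mirror the "file:line(name)" form printed by pstats.
    filename, line, name = entry
    return "%s:%d(%s)" % (filename, line, name)


# stdname-style ordering: the whole printed name is compared as a string,
# so the embedded line numbers are compared character by character.
stdname_lines = [e[1] for e in sorted(entries, key=std_string)]

# nfl-style ordering: name, then file, then line compared as a number.
nfl_lines = [e[1] for e in sorted(entries, key=lambda e: (e[2], e[0], e[1]))]

print(stdname_lines)  # string order
print(nfl_lines)      # numeric order
```

The string comparison places line 20 before line 3 because ``'2'`` sorts before ``'3'``, which is the "odd way" the text refers to.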

Lib/pstats.py

Lines changed: 38 additions & 8 deletions
@@ -25,9 +25,32 @@
 import time
 import marshal
 import re
+from enum import Enum
 from functools import cmp_to_key
 
-__all__ = ["Stats"]
+__all__ = ["Stats", "SortKey"]
+
+
+class SortKey(str, Enum):
+    CALLS = 'calls', 'ncalls'
+    CUMULATIVE = 'cumulative', 'cumtime'
+    FILENAME = 'filename', 'module'
+    LINE = 'line'
+    NAME = 'name'
+    NFL = 'nfl'
+    PCALLS = 'pcalls'
+    STDNAME = 'stdname'
+    TIME = 'time', 'tottime'
+
+    def __new__(cls, *values):
+        obj = str.__new__(cls, values[0])
+
+        obj._value_ = values[0]
+        for other_value in values[1:]:
+            cls._value2member_map_[other_value] = obj
+        obj._all_values = values
+        return obj
+
 
 class Stats:
     """This class is used for creating reports from data generated by the
@@ -49,13 +72,14 @@ class Stats:
 
     The sort_stats() method now processes some additional options (i.e., in
     addition to the old -1, 0, 1, or 2 that are respectively interpreted as
-    'stdname', 'calls', 'time', and 'cumulative'). It takes an arbitrary number
-    of quoted strings to select the sort order.
+    'stdname', 'calls', 'time', and 'cumulative'). It takes either an
+    arbitrary number of quoted strings or SortKey enums to select the sort
+    order.
 
-    For example sort_stats('time', 'name') sorts on the major key of 'internal
-    function time', and on the minor key of 'the name of the function'. Look at
-    the two tables in sort_stats() and get_sort_arg_defs(self) for more
-    examples.
+    For example sort_stats('time', 'name') or sort_stats(SortKey.TIME,
+    SortKey.NAME) sorts on the major key of 'internal function time', and on
+    the minor key of 'the name of the function'. Look at the two tables in
+    sort_stats() and get_sort_arg_defs(self) for more examples.
 
     All methods return self, so you can string together commands like:
         Stats('foo', 'goo').strip_dirs().sort_stats('calls').\
@@ -161,7 +185,6 @@ def dump_stats(self, filename):
         "ncalls"    : (((1,-1), ), "call count"),
         "cumtime"   : (((3,-1), ), "cumulative time"),
         "cumulative": (((3,-1), ), "cumulative time"),
-        "file"      : (((4, 1), ), "file name"),
         "filename"  : (((4, 1), ), "file name"),
         "line"      : (((5, 1), ), "line number"),
         "module"    : (((4, 1), ), "file name"),
@@ -202,12 +225,19 @@ def sort_stats(self, *field):
                        0: "calls",
                        1: "time",
                        2: "cumulative"}[field[0]] ]
+        elif len(field) >= 2:
+            for arg in field[1:]:
+                if type(arg) != type(field[0]):
+                    raise TypeError("Can't have mixed argument type")
 
         sort_arg_defs = self.get_sort_arg_defs()
+
         sort_tuple = ()
         self.sort_type = ""
         connector = ""
         for word in field:
+            if isinstance(word, SortKey):
+                word = word.value
             sort_tuple = sort_tuple + sort_arg_defs[word][0]
             self.sort_type += connector + sort_arg_defs[word][1]
             connector = ", "
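The aliasing technique used by `SortKey` (several accepted values per member, with the first one treated as canonical) can be tried in isolation. The following is a sketch under stated assumptions: `MiniSortKey` is a hypothetical, trimmed-down stand-in rather than the class from this commit, it passes the canonical value to `str.__new__` so the member also compares equal to that string, and it relies on the private `_value2member_map_` attribute of `enum`, as the diff does.

```python
from enum import Enum


class MiniSortKey(str, Enum):
    # Hypothetical stand-in for pstats.SortKey with only three members.
    CALLS = 'calls', 'ncalls'
    TIME = 'time', 'tottime'
    LINE = 'line'

    def __new__(cls, *values):
        # The first value is canonical and becomes the str payload.
        obj = str.__new__(cls, values[0])
        obj._value_ = values[0]
        # Any remaining values are registered as by-value lookup aliases.
        for other_value in values[1:]:
            cls._value2member_map_[other_value] = obj
        return obj


via_canonical = MiniSortKey('calls')
via_alias = MiniSortKey('ncalls')
print(via_canonical is via_alias)
```

Both lookups resolve to the same member, which is why `sort_stats()` only needs `word.value` to map a member back to a key of the sort table.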

Lib/test/test_pstats.py

Lines changed: 42 additions & 0 deletions
@@ -2,6 +2,7 @@
 from test import support
 from io import StringIO
 import pstats
+from pstats import SortKey
 
 
 
@@ -33,6 +34,47 @@ def test_add(self):
         stats = pstats.Stats(stream=stream)
         stats.add(self.stats, self.stats)
 
+    def test_sort_stats_int(self):
+        valid_args = {-1: 'stdname',
+                      0: 'calls',
+                      1: 'time',
+                      2: 'cumulative'}
+        for arg_int, arg_str in valid_args.items():
+            self.stats.sort_stats(arg_int)
+            self.assertEqual(self.stats.sort_type,
+                             self.stats.sort_arg_dict_default[arg_str][-1])
+
+    def test_sort_stats_string(self):
+        for sort_name in ['calls', 'ncalls', 'cumtime', 'cumulative',
+                          'filename', 'line', 'module', 'name', 'nfl', 'pcalls',
+                          'stdname', 'time', 'tottime']:
+            self.stats.sort_stats(sort_name)
+            self.assertEqual(self.stats.sort_type,
+                             self.stats.sort_arg_dict_default[sort_name][-1])
+
+    def test_sort_stats_partial(self):
+        sortkey = 'filename'
+        for sort_name in ['f', 'fi', 'fil', 'file', 'filen', 'filena',
+                          'filenam', 'filename']:
+            self.stats.sort_stats(sort_name)
+            self.assertEqual(self.stats.sort_type,
+                             self.stats.sort_arg_dict_default[sortkey][-1])
+
+    def test_sort_stats_enum(self):
+        for member in SortKey:
+            self.stats.sort_stats(member)
+            self.assertEqual(
+                    self.stats.sort_type,
+                    self.stats.sort_arg_dict_default[member.value][-1])
+
+    def test_sort_starts_mix(self):
+        self.assertRaises(TypeError, self.stats.sort_stats,
+                          'calls',
+                          SortKey.TIME)
+        self.assertRaises(TypeError, self.stats.sort_stats,
+                          SortKey.TIME,
+                          'calls')
+
 
 if __name__ == "__main__":
     unittest.main()
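The behaviours these tests pin down (prefix expansion of string keys and rejection of mixed argument types) can also be observed outside the test suite. The following is a minimal sketch assuming Python 3.7 or later; the `work` function is a hypothetical workload that exists only to give the profiler something to record.

```python
import cProfile
import pstats
from pstats import SortKey


def work():
    # Hypothetical workload so that the profile is not empty.
    return sorted(range(100), reverse=True)


profiler = cProfile.Profile()
profiler.runcall(work)
stats = pstats.Stats(profiler)

# An unambiguous prefix of a string key is expanded to the full key.
stats.sort_stats('fil')
abbreviated_sort_type = stats.sort_type

# Mixing strings and SortKey members in a single call is rejected.
try:
    stats.sort_stats('calls', SortKey.TIME)
except TypeError as exc:
    mixed_error = str(exc)
else:
    mixed_error = None

print(abbreviated_sort_type)
print(mixed_error)
```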

Misc/ACKS

Lines changed: 1 addition & 0 deletions
@@ -1706,6 +1706,7 @@ Jeff Wheeler
 Christopher White
 David White
 Mats Wichmann
+Marcel Widjaja
 Truida Wiedijk
 Felix Wiemann
 Gerry Wiener
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+Create enum for pstats sorting options
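
As a usage illustration of the change as a whole, the sketch below sorts one in-memory profile first with a `SortKey` member and then with the equivalent string, and finally with two enum keys. It assumes Python 3.7 or later; `busy` is a hypothetical workload and not part of the commit.

```python
import cProfile
import io
import pstats
from pstats import SortKey


def busy(n):
    # Hypothetical workload so that the profile is not empty.
    return sum(i * i for i in range(n))


profiler = cProfile.Profile()
profiler.runcall(busy, 10000)

stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)

# The enum member and the matching string select the same sort.
stats.sort_stats(SortKey.CUMULATIVE)
enum_sort_type = stats.sort_type
stats.sort_stats('cumulative')
string_sort_type = stats.sort_type

# Primary key: internal time; secondary key: cumulative time.
stats.sort_stats(SortKey.TIME, SortKey.CUMULATIVE).print_stats(5)
report = stream.getvalue()
print(report)
```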
