Commit 7a0e8bc

Author: Jim Fulton
Updated documentation to:
- point out the importance of reassigning data members before assigning their values
- correct my misconception about return values from visitprocs. Sigh.
- mention the labor-saving Py_VISIT and Py_CLEAR macros.
1 parent a643b65 commit 7a0e8bc

4 files changed: 201 additions and 45 deletions


Doc/ext/newtypes.tex

Lines changed: 166 additions & 29 deletions
@@ -239,8 +239,8 @@ \section{The Basics
239239
\class{Noddy} instances by calling the \class{Noddy} class:
240240

241241
\begin{verbatim}
242-
import noddy
243-
mynoddy = noddy.Noddy()
242+
>>> import noddy
243+
>>> mynoddy = noddy.Noddy()
244244
\end{verbatim}
245245

246246
That's it! All that remains is to build it; put the above code in a
@@ -382,7 +382,7 @@ \subsection{Adding data and methods to the Basic example}
382382
\member{last} are not \NULL. If we didn't care whether the initial
383383
values were \NULL, we could have used \cfunction{PyType_GenericNew()} as
384384
our new method, as we did before. \cfunction{PyType_GenericNew()}
385-
initializes all of the instance variable members to NULLs.
385+
initializes all of the instance variable members to \NULL.
386386

387387
The new method is a static method that is passed the type being
388388
instantiated and any arguments passed when the type was called,
@@ -407,14 +407,13 @@ \subsection{Adding data and methods to the Basic example}
407407
(Specifically, you may not be able to create instances of
408408
such subclasses without getting a \exception{TypeError}.)}
409409

410-
411410
We provide an initialization function:
412411

413412
\begin{verbatim}
414413
static int
415414
Noddy_init(Noddy *self, PyObject *args, PyObject *kwds)
416415
{
417-
PyObject *first=NULL, *last=NULL;
416+
PyObject *first=NULL, *last=NULL, *tmp;
418417
419418
static char *kwlist[] = {"first", "last", "number", NULL};
420419
@@ -424,15 +423,17 @@ \subsection{Adding data and methods to the Basic example}
424423
return -1;
425424
426425
if (first) {
427-
Py_XDECREF(self->first);
426+
tmp = self->first;
428427
Py_INCREF(first);
429428
self->first = first;
429+
Py_XDECREF(tmp);
430430
}
431431
432432
if (last) {
433-
Py_XDECREF(self->last);
433+
tmp = self->last;
434434
Py_INCREF(last);
435435
self->last = last;
436+
Py_XDECREF(tmp);
436437
}
437438
438439
return 0;
@@ -453,6 +454,44 @@ \subsection{Adding data and methods to the Basic example}
453454
to provide initial values for our instance. Initializers always accept
454455
positional and keyword arguments.
455456

457+
Initializers can be called multiple times. Anyone can call the
458+
\method{__init__()} method on our objects. For this reason, we have
459+
to be extra careful when assigning the new values. We might be
460+
tempted, for example, to assign the \member{first} member like this:
461+
462+
\begin{verbatim}
463+
if (first) {
464+
Py_XDECREF(self->first);
465+
Py_INCREF(first);
466+
self->first = first;
467+
}
468+
\end{verbatim}
469+
470+
But this would be risky. Our type doesn't restrict the type of the
471+
\member{first} member, so it could be any kind of object. It could
472+
have a destructor that causes code to be executed that tries to
473+
access the \member{first} member. To be paranoid and protect
474+
ourselves against this possibility, we almost always reassign members
475+
before decrementing their reference counts. When don't we have to do
476+
this?
477+
\begin{itemize}
478+
\item when we absolutely know that the reference count is greater than
479+
1
480+
\item when we know that deallocation of the object\footnote{This is
481+
true when we know that the object is a basic type, like a string or
482+
a float} will not cause any
483+
calls back into our type's code
484+
\item when decrementing a reference count in a \member{tp_dealloc}
485+
handler when garbage collection is not supported\footnote{We relied
486+
on this in the \member{tp_dealloc} handler in this example, because
487+
our type doesn't support garbage collection. Even if a type supports
488+
garbage collection, there are calls that can be made to ``untrack''
489+
the object from garbage collection, however, these calls are
490+
advanced and not covered here.}
491+
\item
492+
\end{itemize}
493+
494+
456495
We want to expose our instance variables as attributes. There
457496
are a number of ways to do that. The simplest way is to define member
458497
definitions:
@@ -682,6 +721,45 @@ \subsection{Providing finer control over data attributes}
682721
};
683722
\end{verbatim}
684723

724+
We also need to update the \member{tp_init} handler to only allow
725+
strings\footnote{We now know that the first and last members are strings,
726+
so perhaps we could be less careful about decrementing their
727+
reference counts, however, we accept instances of string subclasses.
728+
Even though deallocating normal strings won't call back into our
729+
objects, we can't guarantee that deallocating an instance of a string
730+
subclass won't call back into our objects.} to be passed:
731+
732+
\begin{verbatim}
733+
static int
734+
Noddy_init(Noddy *self, PyObject *args, PyObject *kwds)
735+
{
736+
PyObject *first=NULL, *last=NULL, *tmp;
737+
738+
static char *kwlist[] = {"first", "last", "number", NULL};
739+
740+
if (! PyArg_ParseTupleAndKeywords(args, kwds, "|SSi", kwlist,
741+
&first, &last,
742+
&self->number))
743+
return -1;
744+
745+
if (first) {
746+
tmp = self->first;
747+
Py_INCREF(first);
748+
self->first = first;
749+
Py_DECREF(tmp);
750+
}
751+
752+
if (last) {
753+
tmp = self->last;
754+
Py_INCREF(last);
755+
self->last = last;
756+
Py_DECREF(tmp);
757+
}
758+
759+
return 0;
760+
}
761+
\end{verbatim}
762+
685763
With these changes, we can assure that the \member{first} and
686764
\member{last} members are never NULL so we can remove checks for \NULL
687765
values in almost all cases. This means that most of the
@@ -713,8 +791,10 @@ \subsection{Supporting cyclic garbage collection}
713791

714792
In the second version of the \class{Noddy} example, we allowed any
715793
kind of object to be stored in the \member{first} or \member{last}
716-
attributes. This means that \class{Noddy} objects can participate in
717-
cycles:
794+
attributes\footnote{Even in the third version, we aren't guaranteed to
795+
avoid cycles. Instances of string subclasses are allowed and string
796+
subclasses could allow cycles even if normal strings don't.}. This
797+
means that \class{Noddy} objects can participate in cycles:
718798

719799
\begin{verbatim}
720800
>>> import noddy2
@@ -737,10 +817,18 @@ \subsection{Supporting cyclic garbage collection}
737817
static int
738818
Noddy_traverse(Noddy *self, visitproc visit, void *arg)
739819
{
740-
if (self->first && visit(self->first, arg) < 0)
741-
return -1;
742-
if (self->last && visit(self->last, arg) < 0)
743-
return -1;
820+
int vret;
821+
822+
if (self->first) {
823+
vret = visit(self->first, arg);
824+
if (vret != 0)
825+
return vret;
826+
}
827+
if (self->last) {
828+
vret = visit(self->last, arg);
829+
if (vret != 0)
830+
return vret;
831+
}
744832
745833
return 0;
746834
}
@@ -749,7 +837,24 @@ \subsection{Supporting cyclic garbage collection}
749837
For each subobject that can participate in cycles, we need to call the
750838
\cfunction{visit()} function, which is passed to the traversal method.
751839
The \cfunction{visit()} function takes as arguments the subobject and
752-
the extra argument \var{arg} passed to the traversal method.
840+
the extra argument \var{arg} passed to the traversal method. It
841+
returns an integer value that must be returned if it is non-zero.
842+
843+
844+
Python 2.4 and higher provide a \cfunction{Py_VISIT()} that automates
845+
calling visit functions. With \cfunction{Py_VISIT()}, the
846+
\cfunction{Noddy_traverse()} can be simplified:
847+
848+
849+
\begin{verbatim}
850+
static int
851+
Noddy_traverse(Noddy *self, visitproc visit, void *arg)
852+
{
853+
Py_VISIT(self->first);
854+
Py_VISIT(self->last);
855+
return 0;
856+
}
857+
\end{verbatim}
753858

754859
We also need to provide a method for clearing any subobjects that can
755860
participate in cycles. We implement the method and reimplement the
@@ -759,10 +864,15 @@ \subsection{Supporting cyclic garbage collection}
759864
static int
760865
Noddy_clear(Noddy *self)
761866
{
762-
Py_XDECREF(self->first);
867+
PyObject *tmp;
868+
869+
tmp = self->first;
763870
self->first = NULL;
764-
Py_XDECREF(self->last);
871+
Py_XDECREF(tmp);
872+
873+
tmp = self->last;
765874
self->last = NULL;
875+
Py_XDECREF(tmp);
766876
767877
return 0;
768878
}
@@ -775,6 +885,33 @@ \subsection{Supporting cyclic garbage collection}
775885
}
776886
\end{verbatim}
777887

888+
Notice the use of a temporary variable in \cfunction{Noddy_clear()}.
889+
We use the temporary variable so that we can set each member to \NULL
890+
before decrementing its reference count. We do this because, as was
891+
discussed earlier, if the reference count drops to zero, we might
892+
cause code to run that calls back into the object. In addition,
893+
because we now support garbage collection, we also have to worry about
894+
code being run that triggers garbage collection. If garbage
895+
collection is run, our \member{tp_traverse} handler could get called.
896+
We can't take a chance of having \cfunction{Noddy_traverse()} called
897+
when a member's reference count has dropped to zero and its value
898+
hasn't been set to \NULL.
899+
900+
Python 2.4 and higher provide a \cfunction{Py_CLEAR()} that automates
901+
the careful decrementing of reference counts. With
902+
\cfunction{Py_CLEAR()}, the \cfunction{Noddy_clear()} function can be
903+
simplified:
904+
905+
\begin{verbatim}
906+
static int
907+
Noddy_clear(Noddy *self)
908+
{
909+
Py_CLEAR(self->first);
910+
Py_CLEAR(self->last);
911+
return 0;
912+
}
913+
\end{verbatim}
914+
778915
Finally, we add the \constant{Py_TPFLAGS_HAVE_GC} flag to the class
779916
flags:
780917

@@ -806,7 +943,7 @@ \section{Type Methods
806943
more information about the various handlers. We won't go in the order
807944
they are defined in the structure, because there is a lot of
808945
historical baggage that impacts the ordering of the fields; be sure
809-
your type initializaion keeps the fields in the right order! It's
946+
your type initialization keeps the fields in the right order! It's
810947
often easiest to find an example that includes all the fields you need
811948
(even if they're initialized to \code{0}) and then change the values
812949
to suit your new type.
@@ -824,7 +961,7 @@ \section{Type Methods
824961
\end{verbatim}
825962

826963
These fields tell the runtime how much memory to allocate when new
827-
objects of this type are created. Python has some builtin support
964+
objects of this type are created. Python has some built-in support
828965
for variable length structures (think: strings, lists) which is where
829966
the \member{tp_itemsize} field comes in. This will be dealt with
830967
later.
@@ -835,7 +972,7 @@ \section{Type Methods
835972

836973
Here you can put a string (or its address) that you want returned when
837974
the Python script references \code{obj.__doc__} to retrieve the
838-
docstring.
975+
doc string.
839976

840977
Now we come to the basic type methods---the ones most extension types
841978
will implement.
@@ -915,7 +1052,7 @@ \subsection{Object Presentation}
9151052

9161053
In Python, there are three ways to generate a textual representation
9171054
of an object: the \function{repr()}\bifuncindex{repr} function (or
918-
equivalent backtick syntax), the \function{str()}\bifuncindex{str}
1055+
equivalent back-tick syntax), the \function{str()}\bifuncindex{str}
9191056
function, and the \keyword{print} statement. For most objects, the
9201057
\keyword{print} statement is equivalent to the \function{str()}
9211058
function, but it is possible to special-case printing to a
@@ -983,7 +1120,7 @@ \subsection{Object Presentation}
9831120
The print function receives a file object as an argument. You will
9841121
likely want to write to that file object.
9851122

986-
Here is a sampe print function:
1123+
Here is a sample print function:
9871124

9881125
\begin{verbatim}
9891126
static int
@@ -1138,10 +1275,10 @@ \subsubsection{Generic Attribute Management}
11381275

11391276
An interesting advantage of using the \member{tp_members} table to
11401277
build descriptors that are used at runtime is that any attribute
1141-
defined this way can have an associated docstring simply by providing
1278+
defined this way can have an associated doc string simply by providing
11421279
the text in the table. An application can use the introspection API
11431280
to retrieve the descriptor from the class object, and get the
1144-
docstring using its \member{__doc__} attribute.
1281+
doc string using its \member{__doc__} attribute.
11451282

11461283
As with the \member{tp_methods} table, a sentinel entry with a
11471284
\member{name} value of \NULL{} is required.
@@ -1286,7 +1423,7 @@ \subsection{Abstract Protocol Support}
12861423
additional slots in the main type object, with a flag bit being set to
12871424
indicate that the slots are present and should be checked by the
12881425
interpreter. (The flag bit does not indicate that the slot values are
1289-
non-\NULL. The flag may be set to indicate the presense of a slot,
1426+
non-\NULL. The flag may be set to indicate the presence of a slot,
12901427
but a slot may still be unfilled.)
12911428

12921429
\begin{verbatim}
@@ -1309,7 +1446,7 @@ \subsection{Abstract Protocol Support}
13091446
\end{verbatim}
13101447

13111448
This function, if you choose to provide it, should return a hash
1312-
number for an instance of your datatype. Here is a moderately
1449+
number for an instance of your data type. Here is a moderately
13131450
pointless example:
13141451

13151452
\begin{verbatim}
@@ -1327,16 +1464,16 @@ \subsection{Abstract Protocol Support}
13271464
ternaryfunc tp_call;
13281465
\end{verbatim}
13291466

1330-
This function is called when an instance of your datatype is "called",
1331-
for example, if \code{obj1} is an instance of your datatype and the Python
1467+
This function is called when an instance of your data type is "called",
1468+
for example, if \code{obj1} is an instance of your data type and the Python
13321469
script contains \code{obj1('hello')}, the \member{tp_call} handler is
13331470
invoked.
13341471

13351472
This function takes three arguments:
13361473

13371474
\begin{enumerate}
13381475
\item
1339-
\var{arg1} is the instance of the datatype which is the subject of
1476+
\var{arg1} is the instance of the data type which is the subject of
13401477
the call. If the call is \code{obj1('hello')}, then \var{arg1} is
13411478
\code{obj1}.
13421479

@@ -1430,7 +1567,7 @@ \subsection{More Suggestions}
14301567
Python.
14311568

14321569
In order to learn how to implement any specific method for your new
1433-
datatype, do the following: Download and unpack the Python source
1570+
data type, do the following: Download and unpack the Python source
14341571
distribution. Go to the \file{Objects} directory, then search the
14351572
C source files for \code{tp_} plus the function you want (for
14361573
example, \code{tp_print} or \code{tp_compare}). You will find

Doc/ext/noddy2.c

Lines changed: 5 additions & 3 deletions
@@ -46,7 +46,7 @@ Noddy_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
4646
static int
4747
Noddy_init(Noddy *self, PyObject *args, PyObject *kwds)
4848
{
49-
PyObject *first=NULL, *last=NULL;
49+
PyObject *first=NULL, *last=NULL, *tmp;
5050

5151
static char *kwlist[] = {"first", "last", "number", NULL};
5252

@@ -56,15 +56,17 @@ Noddy_init(Noddy *self, PyObject *args, PyObject *kwds)
5656
return -1;
5757

5858
if (first) {
59-
Py_XDECREF(self->first);
59+
tmp = self->first;
6060
Py_INCREF(first);
6161
self->first = first;
62+
Py_XDECREF(tmp);
6263
}
6364

6465
if (last) {
65-
Py_XDECREF(self->last);
66+
tmp = self->last;
6667
Py_INCREF(last);
6768
self->last = last;
69+
Py_XDECREF(tmp);
6870
}
6971

7072
return 0;
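
The ordering change made in Noddy_init() above (save the old member in tmp, store the new value, then decrement) can be illustrated without the Python headers. The following is only a minimal sketch under stated assumptions: Obj, Holder, obj_incref(), obj_xdecref(), set_first_unsafe(), set_first_safe() and observed_stale_member() are hypothetical stand-ins invented for this illustration and are not part of the CPython API. The simulated "destructor" merely marks the object dead instead of freeing it, so that the stale pointer can be inspected safely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for PyObject and the Noddy struct (not CPython API). */
typedef struct Holder Holder;

typedef struct Obj {
    int refcnt;     /* simulated reference count */
    int alive;      /* cleared when the object is "deallocated" */
    Holder *owner;  /* holder that the simulated destructor looks at */
} Obj;

struct Holder {
    Obj *first;           /* plays the role of self->first */
    int saw_dead_member;  /* set if a destructor saw a dead object in first */
};

/* Simulated destructor: runs arbitrary code that reads owner->first. */
static void obj_dealloc(Obj *o)
{
    o->alive = 0;
    if (o->owner && o->owner->first && !o->owner->first->alive)
        o->owner->saw_dead_member = 1;
}

static void obj_incref(Obj *o) { o->refcnt++; }

static void obj_xdecref(Obj *o)
{
    if (o == NULL)
        return;
    assert(o->refcnt > 0);
    if (--o->refcnt == 0)
        obj_dealloc(o);
}

/* Old ordering: the destructor can run while h->first is still stale. */
static void set_first_unsafe(Holder *h, Obj *value)
{
    obj_xdecref(h->first);
    obj_incref(value);
    h->first = value;
}

/* New ordering: reassign the member first, then drop the old reference. */
static void set_first_safe(Holder *h, Obj *value)
{
    Obj *tmp = h->first;
    obj_incref(value);
    h->first = value;
    obj_xdecref(tmp);
}

/* Returns 1 if the destructor observed a dead object through h->first. */
static int observed_stale_member(void (*setter)(Holder *, Obj *))
{
    Holder h = { NULL, 0 };
    Obj old = { 1, 1, &h };     /* the only reference is held by h.first */
    Obj fresh = { 1, 1, NULL }; /* the caller holds one reference */

    h.first = &old;
    setter(&h, &fresh);
    return h.saw_dead_member;
}
```

Calling observed_stale_member(set_first_unsafe) from a main() of your own reports the hazard, while observed_stale_member(set_first_safe) does not: in the safe ordering the member already refers to the new object by the time the old one's destructor runs.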
