Commit 22d8ef1

Add CONSTANTS.empty_attribute_default, fix #44
1 parent 4016b7a commit 22d8ef1

6 files changed

+66
-27
lines changed

docs/customize.rst

Lines changed: 29 additions & 19 deletions
@@ -6,21 +6,6 @@ matching the lower case characters of a name piece with pre-defined sets
 of strings located in :py:mod:`nameparser.config`. You can adjust
 these predefined sets to help fine tune the parser for your dataset.

-Editable CONSTANTS sets:
-
-* `titles` - Pieces that come before the name. Cannot include things that may be first names
-* `first_name_titles` - Titles that, when followed by a single name, mark that name as a first name, e.g. "King David"
-* `suffix_acronyms` - Pieces that come at the end of the name that may or may not have periods separating the letters, e.g. "m.d."
-* `suffix_not_acronyms` - Pieces that come at the end of the name that never have periods separating the letters, e.g. "Jr."
-* `conjunctions` - Connectors like "and" that join the preceding piece to the following piece.
-* `prefixes` - Connectors like "del" and "bin" that join to the following piece but not the preceding
-* `capitalization_exceptions` - Dictionary of pieces that do not capitalize the first letter, e.g. "Ph.D"
-* `regexes` - Regular expressions used to find words, initials, nicknames, etc.
-
-Each set of constants comes with `add()` and `remove()` methods for tuning
-the constants for your project. These methods automatically lower case and
-remove punctuation to normalize them for comparison.
-
 Changing the Parser Constants
 -----------------------------

@@ -49,14 +34,39 @@ Both places are usually a reference to the same shared module-level
 :py:class:`~nameparser.config.CONSTANTS` instance, depending on how you
 instantiate the :py:class:`~nameparser.parser.HumanName` class (see below).

-Take a look at the :py:mod:`nameparser.config` documentation to see what's
-in the constants. Here's a quick walk through of some examples where you
-might want to adjust them.
+
+
+Editable attributes of nameparser.config.CONSTANTS
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+* :py:attr:`~nameparser.config.Constants.titles` - Pieces that come before the name. Cannot include things that may be first names
+* :py:attr:`~nameparser.config.Constants.first_name_titles` - Titles that, when followed by a single name, mark that name as a first name, e.g. "King David"
+* :py:attr:`~nameparser.config.Constants.suffix_acronyms` - Pieces that come at the end of the name that may or may not have periods separating the letters, e.g. "m.d."
+* :py:attr:`~nameparser.config.Constants.suffix_not_acronyms` - Pieces that come at the end of the name that never have periods separating the letters, e.g. "Jr."
+* :py:attr:`~nameparser.config.Constants.conjunctions` - Connectors like "and" that join the preceding piece to the following piece.
+* :py:attr:`~nameparser.config.Constants.prefixes` - Connectors like "del" and "bin" that join to the following piece but not the preceding
+* :py:attr:`~nameparser.config.Constants.capitalization_exceptions` - Dictionary of pieces that do not capitalize the first letter, e.g. "Ph.D"
+* :py:attr:`~nameparser.config.Constants.regexes` - Regular expressions used to find words, initials, nicknames, etc.
+
+Each set of constants comes with `add()` and `remove()` methods for tuning
+the constants for your project. These methods automatically lower case and
+remove punctuation to normalize them for comparison.
+
+Other editable attributes
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+* :py:attr:`~nameparser.config.Constants.string_format`
+* :py:attr:`~nameparser.config.Constants.empty_attribute_default`
+


 Parser Customization Examples
 -----------------------------

+Take a look at the :py:mod:`nameparser.config` documentation to see what's
+in the constants. Here's a quick walk through of some examples where you
+might want to adjust them.
+
 "Hon" is a common abbreviation for "Honorable", a title used when
 addressing judges, and is included in the default titles constants. This
 means it will never be considered a first name, because titles are the
@@ -99,7 +109,7 @@ constant so that "Hon" can be parsed as a first name.
 constant. But in some contexts it is more common as a title. If you would
 like "Dean" to be parsed as a title, simply add it to the titles constant.

-You can pass multiple strings to both the :py:func:`~nameparser.config.SetManager.add`
+You can pass multiple strings to both the :py:func:`~nameparser.config.SetManager.add`
 and :py:func:`~nameparser.config.SetManager.remove`
 methods and each string will be added or removed. Both functions
 automatically normalize the strings for the parser's comparison method by

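The `add()` and `remove()` behavior described in these docs (multiple arguments, automatic normalization) can be illustrated with a small stand-in. This is a sketch, not the library's `SetManager`: the class name `NormalizedSet` is hypothetical, and it assumes that normalization means lower-casing and dropping periods, and that both methods return the set so calls can be chained.

```python
class NormalizedSet:
    """Hypothetical stand-in for a constants set with add() and remove()."""

    def __init__(self, items=()):
        self.items = set()
        self.add(*items)

    @staticmethod
    def normalize(value):
        # Assumption: normalizing means lower-casing and dropping periods.
        return value.lower().replace(".", "")

    def add(self, *values):
        # Accept any number of strings and store each in normalized form.
        for value in values:
            self.items.add(self.normalize(value))
        return self  # returning self allows chained calls

    def remove(self, *values):
        # discard() ignores values that are not present.
        for value in values:
            self.items.discard(self.normalize(value))
        return self

    def __contains__(self, value):
        return self.normalize(value) in self.items


titles = NormalizedSet(["Dr.", "Hon", "Mr."])
# Treat "Hon" as a possible first name and "Dean" as a title.
titles.remove("hon").add("Dean", "Chemistry")
```

Because lookups normalize as well, `"Dean" in titles` and `"dean" in titles` give the same answer in this sketch.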
docs/release_log.rst

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,7 @@
11
Release Log
22
===========
3+
* 0.3.14 - March 18, 2016
4+
- Add `CONSTANTS.empty_attribute_default` to customize value returned for empty attributes (#44)
35
* 0.3.13 - March 14, 2016
46
- Improve string format handling (#41)
57
* 0.3.12 - March 13, 2016

nameparser/__init__.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
VERSION = (0, 3, 13)
1+
VERSION = (0, 3, 14)
22
__version__ = '.'.join(map(str, VERSION))
33
__author__ = "Derek Gulbranson"
44
__author_email__ = '[email protected]'

nameparser/config/__init__.py

Lines changed: 17 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -150,7 +150,23 @@ class Constants(object):
150150

151151
string_format = "{title} {first} {middle} {last} {suffix} ({nickname})"
152152
"""
153-
The default string format use for all new HumanName instances.
153+
The default string format use for all new `HumanName` instances.
154+
"""
155+
empty_attribute_default = ''
156+
"""
157+
Default return value for empty attributes. Setting this to something other than empty
158+
string will causes :py:attr:`string_format` not to work.
159+
160+
.. doctest::
161+
162+
>>> from nameparser.config import CONSTANTS
163+
>>> CONSTANTS.empty_attribute_default = None
164+
>>> name = HumanName("John Doe")
165+
>>> name.title
166+
None
167+
>>>name.first
168+
'John'
169+
154170
"""
155171

156172

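The warning in the docstring above, that a non-empty default interferes with `string_format`, can be seen with plain `str.format`. The sketch below does not use the library: `render` is a hypothetical helper, and only the format string is taken from the diff. It shows that `None` is rendered as the literal text `None`, while empty strings leave only whitespace and empty parentheses behind.

```python
# Format string copied from the diff; everything else here is illustrative.
string_format = "{title} {first} {middle} {last} {suffix} ({nickname})"
FIELDS = ("title", "first", "middle", "last", "suffix", "nickname")


def render(parts, empty_default):
    # Fill every missing or empty field with the chosen default,
    # then apply the format string as-is.
    values = {key: parts.get(key) or empty_default for key in FIELDS}
    return string_format.format(**values)


parts = {"first": "John", "last": "Doe"}
with_empty = render(parts, "")    # only blanks and "()" are left over
with_none = render(parts, None)   # "None John None Doe None (None)"
```

Leftover blanks and empty parentheses can be tidied after formatting, but the word `None` cannot be told apart from real name text, which is presumably why the docstring recommends keeping the empty string when `string_format` is in use.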
nameparser/parser.py

Lines changed: 6 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -180,31 +180,31 @@ def title(self):
180180
:py:mod:`~nameparser.config.titles` or :py:mod:`~nameparser.config.conjunctions`
181181
at the beginning of :py:attr:`full_name`.
182182
"""
183-
return " ".join(self.title_list)
183+
return " ".join(self.title_list) or self.C.empty_attribute_default
184184

185185
@property
186186
def first(self):
187187
"""
188188
The person's first name. The first name piece after any known
189189
:py:attr:`title` pieces parsed from :py:attr:`full_name`.
190190
"""
191-
return " ".join(self.first_list)
191+
return " ".join(self.first_list) or self.C.empty_attribute_default
192192

193193
@property
194194
def middle(self):
195195
"""
196196
The person's middle names. All name pieces after the first name and before
197197
the last name parsed from :py:attr:`full_name`.
198198
"""
199-
return " ".join(self.middle_list)
199+
return " ".join(self.middle_list) or self.C.empty_attribute_default
200200

201201
@property
202202
def last(self):
203203
"""
204204
The person's last name. The last name piece parsed from
205205
:py:attr:`full_name`.
206206
"""
207-
return " ".join(self.last_list)
207+
return " ".join(self.last_list) or self.C.empty_attribute_default
208208

209209
@property
210210
def suffix(self):
@@ -214,15 +214,15 @@ def suffix(self):
214214
of comma separated formats, e.g. "Lastname, Title Firstname Middle[,] Suffix
215215
[, Suffix]" parsed from :py:attr:`full_name`.
216216
"""
217-
return ", ".join(self.suffix_list)
217+
return ", ".join(self.suffix_list) or self.C.empty_attribute_default
218218

219219
@property
220220
def nickname(self):
221221
"""
222222
The person's nicknames. Any text found inside of quotes (``""``) or
223223
parenthesis (``()``)
224224
"""
225-
return " ".join(self.nickname_list)
225+
return " ".join(self.nickname_list) or self.C.empty_attribute_default
226226

227227
### setter methods
228228

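The `or` fallback used in each property above relies on the fact that joining an empty list gives `""`, which is falsy. A minimal sketch of the same pattern, using hypothetical stand-in classes (`Settings` and `SketchName` are not part of nameparser):

```python
class Settings:
    """Hypothetical stand-in for the shared constants object."""
    empty_attribute_default = ""


class SketchName:
    """Hypothetical stand-in with two of the list-backed properties."""

    def __init__(self, first_list, middle_list, settings):
        self.first_list = first_list
        self.middle_list = middle_list
        self.C = settings

    @property
    def first(self):
        # " ".join([]) is "", so the right-hand side is used when the list is empty.
        return " ".join(self.first_list) or self.C.empty_attribute_default

    @property
    def middle(self):
        return " ".join(self.middle_list) or self.C.empty_attribute_default


settings = Settings()
name = SketchName(["John"], [], settings)
default_middle = name.middle             # "" with the default setting
settings.empty_attribute_default = None
none_middle = name.middle                # None after changing the setting
first = name.first                       # "John", non-empty values are unaffected
```

Since the default is read each time a property is accessed, changing the setting in this sketch also affects objects that were created earlier.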
tests.py

Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1261,6 +1261,17 @@ def test_chain_multiple_arguments(self):
12611261
self.m(hn.middle,"Hon", hn)
12621262
self.m(hn.last,"Solo", hn)
12631263

1264+
def test_empty_attribute_default(self):
1265+
from nameparser.config import CONSTANTS
1266+
_orig = CONSTANTS.empty_attribute_default
1267+
CONSTANTS.empty_attribute_default = None
1268+
hn = HumanName("Benjamin Franklin")
1269+
self.m(hn.first,"Benjamin", hn)
1270+
self.m(hn.middle,None, hn)
1271+
self.m(hn.last,"Franklin", hn)
1272+
CONSTANTS.empty_attribute_default = _orig
1273+
1274+
12641275
class HumanNameNicknameTestCase(HumanNameTestBase):
12651276
# https://code.google.com/p/python-nameparser/issues/detail?id=33
12661277
def test_nickname_in_parenthesis(self):

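The new test saves the shared default, changes it, and restores it on the last line, so a failing assertion in between would leave the changed value in place for later tests. One possible alternative, sketched here with hypothetical names (`override` and the stand-in `Constants` class are not part of nameparser or of this commit), is a context manager that restores the value in a `finally` block:

```python
from contextlib import contextmanager


class Constants:
    """Hypothetical stand-in for the shared module-level constants."""
    empty_attribute_default = ""


CONSTANTS = Constants()


@contextmanager
def override(obj, name, value):
    # Remember the current value, set the new one, and always restore it.
    original = getattr(obj, name)
    setattr(obj, name, value)
    try:
        yield obj
    finally:
        setattr(obj, name, original)


with override(CONSTANTS, "empty_attribute_default", None):
    inside = CONSTANTS.empty_attribute_default   # None inside the block
after = CONSTANTS.empty_attribute_default        # restored to ""

# The value is restored even when the body raises.
try:
    with override(CONSTANTS, "empty_attribute_default", None):
        raise AssertionError("simulated test failure")
except AssertionError:
    pass
after_failure = CONSTANTS.empty_attribute_default
```

In a `unittest` test case, the same effect could be had with `try`/`finally` or `addCleanup`; the context manager is just one way to keep the restore step next to the change.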