Skip to content

feature: underline property on Run #19

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
scanny opened this issue Mar 1, 2014 · 12 comments
Closed

feature: underline property on Run #19

scanny opened this issue Mar 1, 2014 · 12 comments
Milestone

Comments

@scanny
Copy link
Contributor

scanny commented Mar 1, 2014

Provide an underline property on Run with semantics similar to .bold and .italic, allowing simple underline formatting to be applied to a run, but not precluding the broader set of possible underline styles that are possible, such as dashed, wavy, and double-underline.

@onlyjus
Copy link
Contributor

onlyjus commented Mar 24, 2014

Should it be exactly the same as .bold=True and .italic=True? i.e. .underline=True

Or should it be something like .underline('double')

Or .underline = 'double'

@scanny
Copy link
Contributor Author

scanny commented Mar 24, 2014

I'm inclined to make it both, so the protocol would be something like this:

>>> run = paragraph.add_run()
>>> run.underline
False
>>> run.underline = True
>>> run.underline
True
>>> run.underline = WD_UNDERLINE.DOUBLE
>>> run.underline
3

It looks like there are 18 different underline styles, if you include no underline. So no underline would correspond to False, single would correspond to True, and all the others would match their MS API WdUnderline enumeration value, supported by a 'symbolic constant' set named like above.

The rationale here would be that 99% of the time, folks just want to set underline on for a run, meaning a single underline like in HTML. The least surprising API for that would be the same as it is for bold and italic. For the less common cases the same API works, just a broader set of values are used.

The only trickiness I can think of is that when other than single underline is used, is True evaluates False while the value itself evaluates True, which is probably on the subtle side for most folks.

>>> run.underline = WD_UNDERLINE.SINGLE
>>> 'underline evaluates True' if run.underline else 'underline evaluates False'
'underline evaluates True'
>>> run.underline is True
True
>>> run.underline = WD_UNDERLINE.DOUBLE
>>> 'underline evaluates True' if run.underline else 'underline evaluates False'
'underline evaluates True'
>>> run.underline is True
False

I'm inclined to consider that acceptable subtlety though.

@onlyjus
Copy link
Contributor

onlyjus commented Mar 25, 2014

I think that is a good approach. However, why not just use the strings instead of WD_UNDERLINE.SINGLE? The xml is <w:u w:val="single"/>:

single, words, double, thick, dotted, dottedHeavy, dash, dashedHeavy, dashLong, dashLongHeavy, dotDash, dashDotHeavy, dotDotDash, dashDotDotHeavy, wave, wavyHeavy, wavyDouble, none

I am trying to implement it however I am getting lost going from docx/text.py to docx/oxml/text.py.

@scanny
Copy link
Contributor Author

scanny commented Mar 25, 2014

Ah, Grasshopper, you must seek enlightenment at the temple of the DRY principle:
http://en.wikipedia.org/wiki/Don't_repeat_yourself

On the oxml bit, you'll need to understand a bit about lxml before that layer is likely to make sense. If you can formulate a few more specific questions I can probably help you find your way a bit further. The short story is that the docx.text module contains proxy objects that provide API level methods for interacting with document objects, things like paragraphs and runs in this case. The docx.oxml.text module is strictly concerned with manipulating the underlying XML.

You might find it useful to poke around in the analysis documentation here:
http://python-docx.readthedocs.org/en/latest/dev/analysis/index.html

It isn't a direct answer to your question, but might provide you some insight.

See what you can do with forming a specific question on where you're getting stuck understanding what's happening there and we can go from there.

@onlyjus
Copy link
Contributor

onlyjus commented Mar 25, 2014

What is being repeated? The user is supplying the string, the code puts it in the right spot...precisely how the style attributes are used: merge request (the only reason I have the list in that code is for error checking, however word doesn't care what the actual value is, only uses values it recognizes. The only other use of the strings is a convenience for a boolean: True='single', False='none'. str(None) works too.)

As a user, I hate looking up all the WD_UNDERLINE.DOUBLE attributes. I am lazy.

I am starting to connect the dots in the code. Sorry for being a noob. I'm an engineer trying to get some work done that was due yesterday...

@scanny
Copy link
Contributor Author

scanny commented Mar 25, 2014

No problem, we've all been there :)

What would be repeated is the string literal itself. When it's encapsulated by a "constant", what's getting used is a reference to that string, both in the XML and any code. If a string literal is used directly, there's no telling how many times it would be repeated in the code, both in the library and client code.

If you're asking why to use an integer for the actual value of the symbolic constant rather than the string itself, that's a little bit different question. I'm not sure I have a really convincing answer for this case, but here are a few reasons I find convincing enough:

  1. The MS API does it that way, and they might know something I don't know yet. That's happened a surprising number of times along the way during development of this library and its companions :)
  2. The string value, e.g. 'dottedHeavy', is an implementation detail of the Open XML standard, as distinct from an abstract user option. So a mapping to the abstract user option rather than identity with it seems more appropriate to me. In principle at least, there could be other mappings having different values in contexts other than the XML spellings.
  3. The developer shouldn't have to know or be tempted to care what the actual value is. That and I would like to reserve the right to change the actual value of the 'constant' depending on later needs. The API guarantees the constant works. Using anything else, the developer is on their own.
  4. Integers are all that is required to provide the necessary number of distinctions, and may be more efficient to evaluate in certain circumstances; not a big optimization I'm sure, but as long as we're choosing it seems a smart choice.

In summary I'd say it's more about principle rather than real need in this particular case. But following principles makes the design decision-making simpler and faster and holds options open for an uncertain future.

@scanny
Copy link
Contributor Author

scanny commented Mar 26, 2014

I don't see the original comment this is in response to, so figuring you found this out on your own :), but just in case, the enumeration should go in the docx.enum.text module. You can just follow the example for WD_BREAK_TYPE there and add a WD_UNDERLINE class.

I like to use the same integers as the MS API, except for the True and False options in this case for single and no underline respectively. That's probably reflects a perverse sense of consistency, but I somehow think one day I might be glad I did :) Identifiers should be upper snake-cased versions of camel case MS API names, e.g. wdUnderlineDottedHeavy becomes WD_UNDERLINE.DOTTED_HEAVY.
http://msdn.microsoft.com/en-us/library/office/ff822388(v=office.15).aspx

Let me know if you need help finding anything else :)

@onlyjus
Copy link
Contributor

onlyjus commented Mar 26, 2014

Yea I found it and figured it out. I also included a dictionary as part of that class to do the conversion from number to string. I tried giving word the numbers and it seems to have ignored them.

Please let me know if the code I am working on is in the right direction: merge request

I also threw in a class for table cell properties with column span.

I am trying to work on adding the underline to the tests. I am going to beat my head against it a little more and try to figure it out ;)

@scanny
Copy link
Contributor Author

scanny commented Mar 26, 2014

Best place to start with the tests is the acceptance test. It will have to be a distinct test file because it's more than a boolean property, but should look similar to run-bool-props.feature.
https://github.com/python-openxml/python-docx/blob/master/features/run-bool-props.feature

Some folks have found the behave tutorial useful on understanding what's happening there:
http://pythonhosted.org/behave/

I'll take a look at the code this evening when I free up and provide some feedback.

Btw, the acceptance test can and should be submitted as a distinct commit, check out the commit history for items starting with 'acpt:' for examples. Any pull requests should be off of a distinct feature branch, say 'feature/underline', rooted off of the develop branch. The acceptance test should be the first commit. A failing but properly written acceptance test is a contribution all on its own that I will commit. Then we can work through the implementation details one by one after that as needed to get to where the acceptance test passes, supported by unit tests of course :).

@onlyjus
Copy link
Contributor

onlyjus commented Mar 27, 2014

Thanks scanny. I am aware of so called "test driven" development, however I have never been personally involved with a project that functioned on that theory. I need to work on changing my workflow to follow this.

@scanny
Copy link
Contributor Author

scanny commented Mar 28, 2014

No worries, I know it was a big shift for me a couple years ago when I started up with it seriously. Let me know if I can help you sort anything out :) You're doing a great job.

@scanny scanny modified the milestones: v0.6.0, 0.6.1 May 1, 2014
@scanny scanny modified the milestones: v0.5.3, v0.6.0 May 8, 2014
scanny pushed a commit that referenced this issue May 10, 2014
scanny pushed a commit that referenced this issue May 10, 2014
scanny pushed a commit that referenced this issue May 10, 2014
scanny pushed a commit that referenced this issue May 10, 2014
scanny pushed a commit that referenced this issue May 10, 2014
scanny pushed a commit that referenced this issue May 10, 2014
scanny pushed a commit that referenced this issue May 10, 2014
@scanny
Copy link
Contributor Author

scanny commented May 10, 2014

Completed and released as v0.5.3.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants