SLEP013: n_features_out_ #29


Merged: 9 commits merged into scikit-learn:master on Feb 19, 2020

Conversation

adrinjalali (Member):

n_features_out_ is closely related to n_features_in_, which is accepted and close to being merged.

scikit-learn/scikit-learn#14241 by @amueller is the implementation of this SLEP.

This is short, but it makes me feel I'm missing some obvious important issue...
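For concreteness, here is a minimal sketch of what the SLEP proposes, assuming the semantics carry over from the accepted ``n_features_in_`` SLEP: after ``fit``, a transformer exposes the width of the array that ``transform`` will produce. ``TopKSelector`` is a hypothetical example, not an estimator from the implementation PR.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin


class TopKSelector(BaseEstimator, TransformerMixin):
    """Hypothetical transformer keeping the k highest-variance columns."""

    def __init__(self, k=2):
        self.k = k

    def fit(self, X, y=None):
        X = np.asarray(X)
        self.n_features_in_ = X.shape[1]   # from the accepted n_features_in_ SLEP
        self.support_ = np.argsort(X.var(axis=0))[-self.k:]
        self.n_features_out_ = self.k      # the attribute proposed here
        return self

    def transform(self, X):
        return np.asarray(X)[:, self.support_]


est = TopKSelector(k=2).fit(np.random.RandomState(0).rand(10, 5))
print(est.n_features_in_, est.n_features_out_)  # 5 2
```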

NicolasHug (Member) left a comment:

Thanks Adrin.

There should be a link to the ``n_features_in_`` SLEP somewhere, for reference.

What's the interaction with n_features_in_ in pipelines?
Let's say after fit, step 2 of a pipeline has inferred n_features_in_ to be 3 but n_features_out_ of step 1 claims it's 4. One of them is wrong, do we error?

Not convinced about the Mixins yet, but I'll check out the discussions on the PR.
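On the Mixins point: the actual design in scikit-learn/scikit-learn#14241 may differ, but one plausible shape (an assumption, for illustration only) is a pair of mixins that derive ``n_features_out_`` from attributes estimators already store:

```python
class ComponentsMixin:
    """Sketch: output width taken from a fitted ``components_`` attribute,
    as in PCA, whose ``components_`` has shape (n_components, n_features)."""

    @property
    def n_features_out_(self):
        return self.components_.shape[0]


class OneToOneMixin:
    """Sketch: output width equals input width (scalers, imputers, ...)."""

    @property
    def n_features_out_(self):
        return self.n_features_in_
```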

########

The proposed solution is for the ``n_features_out_`` attribute to be set once a
call to ``fit`` is done. In most cases the value of ``n_features_out_`` is the
Member:

"in most cases": is that true? I can think of many transformers where this doesn't apply, like StandardScaler, LogTransformer, OneHotEncoder etc

Member Author:

I got the "most cases" feeling from looking at Andy's PR.

Member:

Maybe change it to "in some cases"?

This seems to apply to the dimensionality reduction estimators and the projectors, but not to most imputers / pre-processors

Member:

Seems nitpicky ;) How about "in many cases"? But any of the three variants seems fine to me. Usually it corresponds to some attribute; the scalers often have ``scale_``, for example. And you're saying "e.g.", not "i.e.", so it's only an example...

Member:

I don't believe this to be nitpicky. "Most" and "some" have significantly different meanings and do not imply the same things.
Either "most" applies, in which case I'm only trying to understand what I'm missing; or it does not, and I'm suggesting an alternative that would avoid the confusion I'm having.

I like "many".

Member Author:

Whatever the reason, such review comments make me want to just close the PR. They may be constructive from an objective point of view, but they're not when it comes to my feelings while working on a PR, and I know I'm not alone.

Member:

I'm sorry to hear that, really. It would help me if you could tell me what you would have expected from me here. I was making a comment where I tried to (i) better understand the SLEP and (ii) potentially improve the document. Would you prefer me not to make such comments? Or that I prefix them to make my intentions clearer? Something else?

Meanwhile, and this is a genuine question, not a taunt: is it possible that you've been reading more into my comments than there is? Occurrences like this one or e.g. #17 (comment) give me the impression that this might be the case.

Member Author:

Basically, none of your comments on their own would be something I'd mind even the slightest. It's the accumulation of them that makes me break.

I may fail at this, but what I try to do when reviewing is to focus on the main issues first, and do my best to understand what the contributor wants to say or do. I always first assume it's on me if I have a hard time understanding somebody's contribution, unless it passes a certain threshold where I can be fairly certain the work would be very hard for most people to understand.

From one perspective, you could say that I'm okay with contributions as long as they improve the status quo, even if they're not perfect.

And when I see reviews on my work, I'm happy to discuss the main ideas, but when it comes to details such as word choices, especially on SLEPs, my position is "who cares" lol. It's a document for us to come to a consensus, and that's its main purpose. As long as it achieves that purpose, it doesn't matter if I include a funny gif in it or if I write poetry. For instance, the comment you refer to came from me playing with the SLEP in my mind. When I'm writing it, I bargain with it, it's something I talk to, and I give life to it. From that perspective, of course it could imagine something, because I'm not the one saying it, it's the SLEP suggesting something :P Or when I use the word "lubricate", of course I know the connotation of the word, but I know (at least in my experience) that most people kinda laugh when they see that, and yet they understand what the text means; again, that's me playing around and having fun writing the SLEP.

When it comes to language issues, I know I'm not a native speaker, and I'm pretty sure that if a native speaker went through any text I write, they'd find tons of minor issues. Take the example of prepositions: I'm sure I have a ton of mistakes just in this single comment. But as long as people understand, we're okay with it as a community, for the sake of not exhausting people's patience; or at least that's what I'd like to believe.

Member:

Thanks for your reply.

I think I understand your point of view about good-enough PRs that aren't absolutely perfect, and for the most part, I share this mindset.

My approach differs in minor aspects, which I'll try to explain. Basically, my line of thought is: if it's likely to improve the document and takes a few seconds to address, it's worth making a comment. It may save time and work for future reviewers and readers (this is even more true for the docs, where a small clarification can save many, many hours once you add up all the users). Of course, I don't apply the same criteria to everyone I review: beginners and non-paid contributors don't have the same resources.

Now, we might disagree on what defines an actual improvement, and that's fine. I often find myself thinking "meh, I don't get it but ok" about some comments made on my PRs, but I address them anyway. And I'm probably pickier than the average reviewer, I'll definitely give you that, especially when it comes to the docs. I believe that typos and language clarifications are worth pointing out.

I just looked back at the comments I made on this PR. One is a legit comment about something that was missing (#29 (comment)), one is a nitpick (#29 (comment)), one is a suggestion about something I didn't get at first and that you clarified (#29 (comment)), and the rest are questions, sometimes followed by a suggestion. These suggestions all follow the approach detailed above: they're low effort to address, and (IMHO) they're likely to improve the doc, hopefully benefiting others.

I believe that what didn't help is that I was reviewing your PRs only a few minutes after they were posted. That might have given you the impression that I was eager to criticize your work, and maybe that I was on a personal mission. As I said in my Gitter message last week: that's not the case. I was reviewing early because I happened to be available at the time, and because I believe timely feedback is good for productivity.

In any case, I'm not saying you're wrong for feeling the way you do. I'm just trying to share my perspective and clarify my intents when I review.

adrinjalali (Member Author):

> What's the interaction with n_features_in_ in pipelines?
> Let's say after fit, step 2 of a pipeline has inferred n_features_in_ to be 3 but n_features_out_ of step 1 claims it's 4. One of them is wrong, do we error?

I think we either try to enforce consistency between ``fit`` and ``transform`` in the estimators themselves, or some operation raises an error. Checking the consistency could be added; I'm not sure we should put that in this SLEP. To me it's kind of a next step.

NicolasHug (Member):

> Checking the consistency could be added; I'm not sure we should put that in this SLEP. To me it's kind of a next step.

OK, agreed. That can get tricky because the discrepancy might come from a user-input issue, or from a wrong computation on our side, in which case we don't want to error.
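For illustration, the kind of consistency check being deferred here might look like the following sketch (hypothetical; ``Pipeline`` does not perform this validation as part of this SLEP):

```python
def check_pipeline_widths(pipeline):
    """Raise if a step's advertised output width disagrees with the
    input width the next step observed during fit (a sketch)."""
    estimators = [est for _, est in pipeline.steps]
    for prev, nxt in zip(estimators, estimators[1:]):
        n_out = getattr(prev, "n_features_out_", None)
        n_in = getattr(nxt, "n_features_in_", None)
        if n_out is not None and n_in is not None and n_out != n_in:
            raise ValueError(
                f"{type(prev).__name__} claims {n_out} output features, "
                f"but {type(nxt).__name__} saw {n_in} input features."
            )
```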

amueller (Member) left a comment:

Looks good, thanks @adrinjalali


adrinjalali (Member Author):

Seems like this one is going to be an easy one? I almost can't believe it :P Should we merge and take a vote then?

NicolasHug (Member) left a comment:

12 vs. 13, but LGTM otherwise; feel free to merge once the numbering is fixed.

########

.. _slep_013:

======================================
SLEP012: ``n_features_out_`` attribute
Member:

SLEP012 but ref and file say 13 ;)

Member Author:

oops, forgot this one.

Member:

(PR title too BTW)

Member Author:

done

@adrinjalali adrinjalali changed the title SLEP012: n_features_out_ SLEP013: n_features_out_ Feb 19, 2020
@NicolasHug NicolasHug merged commit 8445d6c into scikit-learn:master Feb 19, 2020
NicolasHug (Member):

Thanks Adrin. Wanna call for a vote? We can ask for votes to be made on a PR / issue to avoid flooding the mailing list, as suggested.
