Discrepancy in NgramExtractorTransform, NgramExtractingTransformer and NgramExtractingEstimator. #2895

Open
@zeahmed

If you search for NgramExtract in the solution, the following three main classes pop up.

  1. NgramExtractorTransform (in WordBagTransform.cs)
  2. NgramExtractingTransformer (in NgramTransform.cs)
  3. NgramExtractingEstimator (in NgramTransform.cs)

Classes 2 and 3 seem to be where the actual ngram extraction logic is written. Class 1, however, wraps 2 and 3 with a pre-processing step: if the input is text, it is first converted to terms using ValueToKeyMappingTransformer.

First, NgramExtractorTransform does not seem to be in the correct file, i.e. the file name and the class name do not match.
Second, NgramExtractorTransform does not perform ngram extraction itself; instead, it composes two different estimators (NgramExtractingEstimator and ValueToKeyMappingEstimator).
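
To make the layering concrete, here is a rough sketch in plain Python rather than ML.NET code. All function names (`map_values_to_keys`, `extract_ngrams`, `word_bag`) are hypothetical stand-ins invented for illustration, the 1-based keys are an arbitrary choice, and the real transformers do more than this (for example, counting and weighting ngrams). The only point is that the outer function contains no extraction logic of its own; it just chains the two inner steps.

```python
from typing import Dict, List, Tuple


def map_values_to_keys(tokens: List[str]) -> Tuple[List[int], Dict[str, int]]:
    """Hypothetical stand-in for the value-to-key (term) mapping step.

    Each distinct token gets an integer key in order of first appearance.
    """
    vocab: Dict[str, int] = {}
    keys: List[int] = []
    for token in tokens:
        if token not in vocab:
            vocab[token] = len(vocab) + 1  # 1-based keys, chosen arbitrarily here
        keys.append(vocab[token])
    return keys, vocab


def extract_ngrams(keys: List[int], n: int) -> List[Tuple[int, ...]]:
    """Hypothetical stand-in for the ngram extraction step: it only sees keys."""
    return [tuple(keys[i:i + n]) for i in range(len(keys) - n + 1)]


def word_bag(tokens: List[str], n: int) -> List[Tuple[int, ...]]:
    """Hypothetical stand-in for the composing class.

    It does no extraction itself; it maps text to keys and delegates the rest.
    """
    keys, _vocab = map_values_to_keys(tokens)
    return extract_ngrams(keys, n)


if __name__ == "__main__":
    print(word_bag(["the", "cat", "the", "dog"], 2))
```

In this sketch, `word_bag` plays the role that NgramExtractorTransform appears to play today, which is why a name reflecting the composition seems like a better fit than one suggesting it extracts ngrams directly.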

I think NgramExtractorTransform should be renamed to WordBagTransform or something similarly appropriate.

CC: @Ivanidzo4ka, @TomFinley, @sfilipi, @rogancarr.

Labels: API (issues pertaining to the friendly API), P2 (priority of the issue for triage purposes: needs to be fixed at some point)
