FastForest should actually have a Probability column. #7398

Open
@superichmann

Description

Why is there no Probability column?

Polyglot VS Code C# notebook:

#r "nuget:Microsoft.ML"
#r "nuget:Microsoft.ML.LightGbm"
#r "nuget:Microsoft.ML.FastTree"
using System;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

public class ModelInput
{
    public float Feature1 { get; set; }
    public float Feature2 { get; set; }
    public bool Label { get; set; }
}


// Create a new MLContext
var mlContext = new MLContext();

// Define the training data
var data = new[]
{
    new ModelInput { Feature1 = 1f, Feature2 = 2f, Label = true },
    new ModelInput { Feature1 = 3f, Feature2 = 4f, Label = false },
    new ModelInput { Feature1 = 5f, Feature2 = 6f, Label = true },
    new ModelInput { Feature1 = 7f, Feature2 = 8f, Label = false },
    new ModelInput { Feature1 = 9f, Feature2 = 10f, Label = true }
};

// Load the training data
var trainData = mlContext.Data.LoadFromEnumerable(data);

// Define the FastForest binary classification trainer
var trainer = mlContext.BinaryClassification.Trainers.FastForest();

// Build the pipeline and train the model
var pipeline = mlContext.Transforms.Concatenate("Features", nameof(ModelInput.Feature1), nameof(ModelInput.Feature2))
    .Append(trainer);

var model = pipeline.Fit(trainData);

// Define new data points for prediction
var newData = new[]
{
    new ModelInput { Feature1 = 2f, Feature2 = 3f },
    new ModelInput { Feature1 = 4f, Feature2 = 5f },
    new ModelInput { Feature1 = 6f, Feature2 = 7f },
    new ModelInput { Feature1 = 8f, Feature2 = 9f },
    new ModelInput { Feature1 = 10f, Feature2 = 11f }
};

// Load the new data
var newDataView = mlContext.Data.LoadFromEnumerable(newData);

// Make predictions on the new data
var transformedNewData = model.Transform(newDataView);

// Extract the Probability column
var probabilities = transformedNewData.GetColumn<float>("Probability").ToArray();

// Extract the Feature1 and Feature2 columns
var feature1 = newData.Select(x => x.Feature1).ToArray();
var feature2 = newData.Select(x => x.Feature2).ToArray();

// Print the Probability scores for each prediction
for (int i = 0; i < probabilities.Length; i++)
{
    Console.WriteLine($"Feature1: {feature1[i]}, Feature2: {feature2[i]}, Probability: {probabilities[i]}");
}

Error: System.ArgumentOutOfRangeException: Column 'Probability' not found (Parameter 'name')
at Microsoft.ML.DataViewSchema.get_Item(String name)
at Microsoft.ML.Data.ColumnCursorExtensions.GetColumn[T](IDataView data, String columnName)
at Submission#18.<>d__0.MoveNext()
--- End of stack trace from previous location ---
at Microsoft.CodeAnalysis.Scripting.ScriptExecutionState.RunSubmissionsAsync[TResult](ImmutableArray`1 precedingExecutors, Func`2 currentExecutor, StrongBox`1 exceptionHolderOpt, Func`2 catchExceptionOpt, CancellationToken cancellationToken)

The documentation says there is a Probability column. Why is it missing?

Metadata

Assignees

No one assigned

    Labels

    documentation (Related to documentation of ML.NET)
