org.apache.spark.ml.evaluation.Evaluator

org.apache.spark.ml.evaluation.ClusteringEvaluator

All Implemented Interfaces: Serializable, Params, HasFeaturesCol, HasPredictionCol, HasWeightCol, DefaultParamsWritable, Identifiable, MLWritable, scala.Serializable

public class ClusteringEvaluator extends Evaluator implements HasPredictionCol, HasFeaturesCol, HasWeightCol, DefaultParamsWritable

Evaluator for clustering results. The metric computes the Silhouette measure using the specified distance measure.
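
As a rough illustration of what "distance measure" means here, the two values accepted by distanceMeasure, "squaredEuclidean" and "cosine", can be sketched in plain Java. This is a standalone sketch under the usual textbook definitions (cosine distance taken as 1 minus cosine similarity), not Spark's implementation; the class and method names are hypothetical.

```java
/** Plain-Java illustration of the two distance measures; hypothetical helper, not Spark code. */
final class DistanceMeasureSketch {

    /** Squared Euclidean distance: the sum of squared coordinate differences. */
    static double squaredEuclidean(double[] x, double[] y) {
        requireSameLength(x, y);
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            double d = x[i] - y[i];
            sum += d * d;
        }
        return sum;
    }

    /** Cosine distance: 1 minus the cosine similarity of the two vectors. */
    static double cosine(double[] x, double[] y) {
        requireSameLength(x, y);
        double dot = 0.0;
        double normX = 0.0;
        double normY = 0.0;
        for (int i = 0; i < x.length; i++) {
            dot += x[i] * y[i];
            normX += x[i] * x[i];
            normY += y[i] * y[i];
        }
        if (normX == 0.0 || normY == 0.0) {
            // The angle to a zero vector is undefined, so reject it explicitly.
            throw new IllegalArgumentException("cosine distance is undefined for zero vectors");
        }
        return 1.0 - dot / (Math.sqrt(normX) * Math.sqrt(normY));
    }

    private static void requireSameLength(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
    }
}
```

The two measures can disagree: parallel vectors of different lengths have cosine distance 0 but a nonzero squared Euclidean distance, so the choice of measure changes which points count as "close".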

The Silhouette is a measure for validating the consistency within clusters. It ranges from -1 to 1; a value close to 1 means that the points in a cluster are close to the other points in the same cluster and far from the points of the other clusters.
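
To make the definition concrete, the following plain-Java sketch computes the mean Silhouette directly from its pairwise definition with squared Euclidean distance: for each point, a is the mean distance to the other points of its own cluster, b is the smallest mean distance to the points of any other cluster, and the point's score is (b - a) / max(a, b). It is a naive O(n^2) illustration under those assumptions, not the algorithm Spark uses internally; the class and method names are hypothetical, and points in singleton clusters are assumed to score 0.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Naive pairwise Silhouette; hypothetical helper for illustration, not Spark code. */
final class SilhouetteSketch {

    static double squaredEuclidean(double[] x, double[] y) {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            double d = x[i] - y[i];
            sum += d * d;
        }
        return sum;
    }

    /** Mean Silhouette over all points, given one cluster label per point. */
    static double meanSilhouette(double[][] points, int[] labels) {
        int n = points.length;
        if (n == 0 || labels.length != n) {
            throw new IllegalArgumentException("need one label per point and at least one point");
        }
        if (Arrays.stream(labels).distinct().count() < 2) {
            throw new IllegalArgumentException("the Silhouette needs at least two clusters");
        }
        double total = 0.0;
        for (int i = 0; i < n; i++) {
            // Per cluster: [sum of distances from point i, number of points], excluding i itself.
            Map<Integer, double[]> stats = new HashMap<>();
            for (int j = 0; j < n; j++) {
                if (j == i) {
                    continue;
                }
                double[] s = stats.computeIfAbsent(labels[j], k -> new double[2]);
                s[0] += squaredEuclidean(points[i], points[j]);
                s[1] += 1.0;
            }
            double[] own = stats.remove(labels[i]);
            if (own == null) {
                // Point i is alone in its cluster: assumed to contribute a score of 0.
                continue;
            }
            double a = own[0] / own[1];
            double b = Double.POSITIVE_INFINITY;
            for (double[] s : stats.values()) {
                b = Math.min(b, s[0] / s[1]);
            }
            double max = Math.max(a, b);
            total += (max == 0.0) ? 0.0 : (b - a) / max;
        }
        return total / n;
    }
}
```

With two tight, well-separated groups the result approaches 1; assigning the same points to clusters that cut across the groups drives it below 0.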

See Also:

Serialized Form

Constructor Summary

Constructors

| Constructor | Description |
| --- | --- |
| ClusteringEvaluator() | |
| ClusteringEvaluator(String uid) | |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| ClusteringEvaluator | copy(ParamMap pMap) | Creates a copy of this instance with the same UID and some extra params. |
| Param<String> | distanceMeasure() | Param for the distance measure used in evaluation. Supported values are "squaredEuclidean" (default) and "cosine". |
| double | evaluate(Dataset<?> dataset) | Evaluates model output and returns a scalar metric. |
| final Param<String> | featuresCol() | Param for features column name. |
| String | getDistanceMeasure() | |
| String | getMetricName() | |
| ClusteringMetrics | getMetrics(Dataset<?> dataset) | Get a ClusteringMetrics, which can be used to get clustering metrics such as silhouette score. |
| boolean | isLargerBetter() | Indicates whether the metric returned by evaluate should be maximized (true, default) or minimized (false). |
| static ClusteringEvaluator | load(String path) | |
| Param<String> | metricName() | Param for the metric name used in evaluation. The only supported value is "silhouette" (default). |
| final Param<String> | predictionCol() | Param for prediction column name. |
| static MLReader<ClusteringEvaluator> | read() | |
| ClusteringEvaluator | setDistanceMeasure(String value) | |
| ClusteringEvaluator | setFeaturesCol(String value) | |
| ClusteringEvaluator | setMetricName(String value) | |
| ClusteringEvaluator | setPredictionCol(String value) | |
| ClusteringEvaluator | setWeightCol(String value) | |
| String | toString() | |
| String | uid() | An immutable unique ID for the object and its derivatives. |
| final Param<String> | weightCol() | Param for weight column name. |

Methods inherited from class org.apache.spark.ml.evaluation.Evaluator
evaluate, params

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait

Methods inherited from interface org.apache.spark.ml.util.DefaultParamsWritable
write

Methods inherited from interface org.apache.spark.ml.param.shared.HasFeaturesCol
getFeaturesCol

Methods inherited from interface org.apache.spark.ml.param.shared.HasPredictionCol
getPredictionCol

Methods inherited from interface org.apache.spark.ml.param.shared.HasWeightCol
getWeightCol

Methods inherited from interface org.apache.spark.ml.util.MLWritable
save

Methods inherited from interface org.apache.spark.ml.param.Params
clear, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, onParamChange, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn

Constructor Details
- ClusteringEvaluator
  
  public ClusteringEvaluator(String uid)
- ClusteringEvaluator
  
  public ClusteringEvaluator()

Method Details
- load
  
  public static ClusteringEvaluator load(String path)
- read
  
  public static MLReader<ClusteringEvaluator> read()
- weightCol
  
  public final Param<String> weightCol()
  
  Description copied from interface: HasWeightCol
  
  Param for weight column name. If this is not set or empty, we treat all instance weights as 1.0.
  
  Specified by:
  
  weightCol in interface HasWeightCol
  
  Returns:
  
  (undocumented)
- featuresCol
  
  public final Param<String> featuresCol()
  
  Description copied from interface: HasFeaturesCol
  
  Param for features column name.
  
  Specified by:
  
  featuresCol in interface HasFeaturesCol
  
  Returns:
  
  (undocumented)
- predictionCol
  
  public final Param<String> predictionCol()
  
  Description copied from interface: HasPredictionCol
  
  Param for prediction column name.
  
  Specified by:
  
  predictionCol in interface HasPredictionCol
  
  Returns:
  
  (undocumented)
- uid
  
  public String uid()
  
  Description copied from interface: Identifiable
  
  An immutable unique ID for the object and its derivatives.
  
  Specified by:
  
  uid in interface Identifiable
  
  Returns:
  
  (undocumented)
- copy
  
  public ClusteringEvaluator copy(ParamMap pMap)
  
  Description copied from interface: Params
  
  Creates a copy of this instance with the same UID and some extra params. Subclasses should implement this method and set the return type properly. See defaultCopy().
  
  Specified by:
  
  copy in interface Params
  
  Specified by:
  
  copy in class Evaluator
  
  Parameters:
  
  pMap - (undocumented)
  
  Returns:
  
  (undocumented)
- isLargerBetter
  
  public boolean isLargerBetter()
  
  Description copied from class: Evaluator
  
  Indicates whether the metric returned by evaluate should be maximized (true, default) or minimized (false). A given evaluator may support multiple metrics which may be maximized or minimized.
  
  Overrides:
  
  isLargerBetter in class Evaluator
  
  Returns:
  
  (undocumented)
- setPredictionCol
  
  public ClusteringEvaluator setPredictionCol(String value)
- setFeaturesCol
  
  public ClusteringEvaluator setFeaturesCol(String value)
- setWeightCol
  
  public ClusteringEvaluator setWeightCol(String value)
- metricName
  
  public Param<String> metricName()
  
  Param for the metric name used in evaluation. The only supported value is "silhouette" (default).
  
  Returns:
  
  (undocumented)
- getMetricName
  
  public String getMetricName()
- setMetricName
  
  public ClusteringEvaluator setMetricName(String value)
- distanceMeasure
  
  public Param<String> distanceMeasure()
  
  Param for the distance measure used in evaluation. Supported values are "squaredEuclidean" (default) and "cosine".
  
  Returns:
  
  (undocumented)
- getDistanceMeasure
  
  public String getDistanceMeasure()
- setDistanceMeasure
  
  public ClusteringEvaluator setDistanceMeasure(String value)
- evaluate
  
  public double evaluate(Dataset<?> dataset)
  
  Description copied from class: Evaluator
  
  Evaluates model output and returns a scalar metric. The value of Evaluator.isLargerBetter() specifies whether larger values are better.
  
  Specified by:
  
  evaluate in class Evaluator
  
  Parameters:
  
  dataset - a dataset that contains the features and predictions to evaluate (the columns named by featuresCol and predictionCol).
  
  Returns:
  
  metric
- getMetrics
  
  public ClusteringMetrics getMetrics(Dataset<?> dataset)
  
  Get a ClusteringMetrics, which can be used to get clustering metrics such as silhouette score.
  
  Parameters:
  
  dataset - a dataset that contains the features and predictions to evaluate (the columns named by featuresCol and predictionCol).
  
  Returns:
  
  ClusteringMetrics
- toString
  
  public String toString()
  
  Specified by:
  
  toString in interface Identifiable
  
  Overrides:
  
  toString in class Object
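
As noted under isLargerBetter, the flag tells a caller whether the value returned by evaluate should be maximized or minimized. The following plain-Java sketch shows one way a caller might use such a flag to pick the best of several candidate clusterings. It is an illustration only: the class and method names are hypothetical, and in real use the scores would come from evaluate and the flag from isLargerBetter().

```java
import java.util.Map;

/** Picks the best-scoring candidate under a given optimization direction; hypothetical helper. */
final class MetricSelectionSketch {

    /**
     * Returns the key (for example, a candidate number of clusters) whose metric is best.
     * On ties, the first candidate in the map's iteration order wins.
     */
    static int bestCandidate(Map<Integer, Double> metricByCandidate, boolean largerIsBetter) {
        if (metricByCandidate.isEmpty()) {
            throw new IllegalArgumentException("no candidates to compare");
        }
        Integer best = null;
        double bestMetric = 0.0;
        for (Map.Entry<Integer, Double> e : metricByCandidate.entrySet()) {
            double m = e.getValue();
            boolean improves = best == null
                    || (largerIsBetter ? m > bestMetric : m < bestMetric);
            if (improves) {
                best = e.getKey();
                bestMetric = m;
            }
        }
        return best;
    }
}
```

Passing the flag explicitly keeps the selection logic independent of any particular metric, so the same loop works for metrics that should be minimized.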
