Package org.apache.spark.ml.evaluation
Class ClusteringEvaluator
Object
org.apache.spark.ml.evaluation.Evaluator
org.apache.spark.ml.evaluation.ClusteringEvaluator
- All Implemented Interfaces:
Serializable, Params, HasFeaturesCol, HasPredictionCol, HasWeightCol, DefaultParamsWritable, Identifiable, MLWritable, scala.Serializable
public class ClusteringEvaluator
extends Evaluator
implements HasPredictionCol, HasFeaturesCol, HasWeightCol, DefaultParamsWritable
Evaluator for clustering results.
The metric computes the Silhouette measure using the specified distance measure.
The Silhouette is a measure for validating the consistency within clusters. It ranges from -1 to 1, where a value close to 1 means that the points in a cluster are close to the other points in the same cluster and far from the points of the other clusters.
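To make the definition concrete, the following is a minimal plain-Java sketch of the textbook Silhouette computation with squared Euclidean distance. It is not Spark's implementation: the class `SilhouetteSketch` and its methods are hypothetical names introduced for illustration, and the brute-force pairwise loop assumes a small in-memory dataset with at least two clusters.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration class; not part of Spark.
final class SilhouetteSketch {

    // Squared Euclidean distance between two vectors of equal length.
    static double squaredEuclidean(double[] x, double[] y) {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            double d = x[i] - y[i];
            sum += d * d;
        }
        return sum;
    }

    // Mean silhouette over all points: s(i) = (b - a) / max(a, b), where
    // a = mean distance to the other points of the point's own cluster and
    // b = smallest mean distance to the points of any other cluster.
    static double silhouette(double[][] points, int[] labels) {
        if (points.length != labels.length) {
            throw new IllegalArgumentException("points and labels differ in length");
        }
        Map<Integer, Integer> sizes = new HashMap<>();
        for (int label : labels) {
            sizes.merge(label, 1, Integer::sum);
        }
        if (sizes.size() < 2) {
            throw new IllegalArgumentException("need at least two clusters");
        }
        double total = 0.0;
        for (int i = 0; i < points.length; i++) {
            // Sum of distances from point i to the points of each cluster.
            Map<Integer, Double> distanceSums = new HashMap<>();
            for (int j = 0; j < points.length; j++) {
                if (j != i) {
                    distanceSums.merge(labels[j], squaredEuclidean(points[i], points[j]), Double::sum);
                }
            }
            int own = labels[i];
            int ownSize = sizes.get(own);
            if (ownSize == 1) {
                continue; // a singleton cluster contributes a silhouette of 0
            }
            double a = distanceSums.getOrDefault(own, 0.0) / (ownSize - 1);
            double b = Double.POSITIVE_INFINITY;
            for (Map.Entry<Integer, Double> e : distanceSums.entrySet()) {
                int cluster = e.getKey();
                if (cluster != own) {
                    b = Math.min(b, e.getValue() / sizes.get(cluster));
                }
            }
            double max = Math.max(a, b);
            total += (max == 0.0) ? 0.0 : (b - a) / max;
        }
        return total / points.length;
    }
}
```

With two tight, well-separated groups the score approaches 1; assigning the same points to clusters that cut across the groups drives it below 0.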
-
Constructor Summary
-
Method Summary
Modifier and Type | Method | Description
 | copy | Creates a copy of this instance with the same UID and some extra params.
 | distanceMeasure | Param for distance measure to be used in evaluation (supports "squaredEuclidean" (default), "cosine").
double | evaluate | Evaluates model output and returns a scalar metric.
 | featuresCol | Param for features column name.
 | getDistanceMeasure |
 | getMetricName |
 | getMetrics(Dataset<?> dataset) | Get a ClusteringMetrics, which can be used to get clustering metrics such as silhouette score.
boolean | isLargerBetter | Indicates whether the metric returned by evaluate should be maximized (true, default) or minimized (false).
static ClusteringEvaluator | load |
 | metricName | Param for metric name in evaluation (supports "silhouette" (default)).
 | predictionCol | Param for prediction column name.
static MLReader<T> | read() |
 | setDistanceMeasure(String value) |
 | setFeaturesCol(String value) |
 | setMetricName(String value) |
 | setPredictionCol(String value) |
 | setWeightCol(String value) |
 | toString() |
 | uid() | An immutable unique ID for the object and its derivatives.
 | weightCol | Param for weight column name.
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface org.apache.spark.ml.util.DefaultParamsWritable
write
Methods inherited from interface org.apache.spark.ml.param.shared.HasFeaturesCol
getFeaturesCol
Methods inherited from interface org.apache.spark.ml.param.shared.HasPredictionCol
getPredictionCol
Methods inherited from interface org.apache.spark.ml.param.shared.HasWeightCol
getWeightCol
Methods inherited from interface org.apache.spark.ml.util.MLWritable
save
Methods inherited from interface org.apache.spark.ml.param.Params
clear, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, onParamChange, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn
-
Constructor Details
-
ClusteringEvaluator
-
ClusteringEvaluator
public ClusteringEvaluator()
-
-
Method Details
-
load
-
read
-
weightCol
Description copied from interface: HasWeightCol
Param for weight column name. If this is not set or empty, we treat all instance weights as 1.0.
- Specified by:
weightCol in interface HasWeightCol
- Returns:
- (undocumented)
-
featuresCol
Description copied from interface: HasFeaturesCol
Param for features column name.
- Specified by:
featuresCol in interface HasFeaturesCol
- Returns:
- (undocumented)
-
predictionCol
Description copied from interface: HasPredictionCol
Param for prediction column name.
- Specified by:
predictionCol in interface HasPredictionCol
- Returns:
- (undocumented)
-
uid
Description copied from interface: Identifiable
An immutable unique ID for the object and its derivatives.
- Specified by:
uid in interface Identifiable
- Returns:
- (undocumented)
-
copy
Description copied from interface: Params
Creates a copy of this instance with the same UID and some extra params. Subclasses should implement this method and set the return type properly. See defaultCopy().
-
isLargerBetter
public boolean isLargerBetter()
Description copied from class: Evaluator
Indicates whether the metric returned by evaluate should be maximized (true, default) or minimized (false). A given evaluator may support multiple metrics which may be maximized or minimized.
- Overrides:
isLargerBetter in class Evaluator
- Returns:
- (undocumented)
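As an illustration of how such a flag is typically consumed, here is a hedged plain-Java sketch that picks the best of several candidate metric values. The class `MetricSelectionSketch` and the method `bestIndex` are hypothetical and not part of Spark; the metric values would come from separate calls to evaluate, for example one per candidate number of clusters.

```java
// Hypothetical illustration class; not part of Spark.
final class MetricSelectionSketch {

    // Returns the index of the best metric value: the largest one when
    // isLargerBetter is true, the smallest one otherwise.
    static int bestIndex(double[] metrics, boolean isLargerBetter) {
        if (metrics.length == 0) {
            throw new IllegalArgumentException("no metrics to compare");
        }
        int best = 0;
        for (int i = 1; i < metrics.length; i++) {
            boolean improves = isLargerBetter
                    ? metrics[i] > metrics[best]
                    : metrics[i] < metrics[best];
            if (improves) {
                best = i;
            }
        }
        return best;
    }
}
```

For a metric such as the Silhouette, where values close to 1 indicate better clusterings, the flag would be true and the largest score wins.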
-
setPredictionCol
-
setFeaturesCol
-
setWeightCol
-
metricName
Param for metric name in evaluation (supports "silhouette" (default)).
- Returns:
- (undocumented)
-
getMetricName
-
setMetricName
-
distanceMeasure
Param for distance measure to be used in evaluation (supports "squaredEuclidean" (default), "cosine").
- Returns:
- (undocumented)
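The two supported names can be read as the following distance functions. This is a plain-Java sketch under the assumption that "cosine" means cosine distance, that is, 1 minus the cosine similarity; the class `DistanceMeasureSketch` and its methods are hypothetical names, not Spark API.

```java
// Hypothetical illustration class; not part of Spark.
final class DistanceMeasureSketch {

    // Sum of squared coordinate differences.
    static double squaredEuclidean(double[] x, double[] y) {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            double d = x[i] - y[i];
            sum += d * d;
        }
        return sum;
    }

    // Assumed definition: 1 - (x . y) / (|x| * |y|); undefined for zero vectors.
    static double cosine(double[] x, double[] y) {
        double dot = 0.0;
        double normX = 0.0;
        double normY = 0.0;
        for (int i = 0; i < x.length; i++) {
            dot += x[i] * y[i];
            normX += x[i] * x[i];
            normY += y[i] * y[i];
        }
        if (normX == 0.0 || normY == 0.0) {
            throw new IllegalArgumentException("cosine distance is undefined for zero vectors");
        }
        return 1.0 - dot / (Math.sqrt(normX) * Math.sqrt(normY));
    }

    // Dispatches on the same strings the param accepts.
    static double distance(String measure, double[] x, double[] y) {
        switch (measure) {
            case "squaredEuclidean":
                return squaredEuclidean(x, y);
            case "cosine":
                return cosine(x, y);
            default:
                throw new IllegalArgumentException("unsupported distance measure: " + measure);
        }
    }
}
```

Squared Euclidean distance is sensitive to vector magnitude, while cosine distance depends only on direction, so two vectors pointing the same way have cosine distance 0 regardless of their lengths.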
-
getDistanceMeasure
-
setDistanceMeasure
-
evaluate
Description copied from class: Evaluator
Evaluates model output and returns a scalar metric. The value of Evaluator.isLargerBetter() specifies whether larger values are better.
-
getMetrics
Get a ClusteringMetrics, which can be used to get clustering metrics such as silhouette score.
- Parameters:
dataset - a dataset that contains labels/observations and predictions.
- Returns:
- ClusteringMetrics
-
toString
- Specified by:
toString in interface Identifiable
- Overrides:
toString in class Object
-