public class CrossValidator extends Estimator<CrossValidatorModel> implements CrossValidatorParams, HasParallelism, HasCollectSubModels, MLWritable, org.apache.spark.internal.Logging
Constructor and Description |
---|
CrossValidator() |
CrossValidator(String uid) |
Modifier and Type | Method and Description |
---|---|
BooleanParam | collectSubModels(): Param for whether to collect a list of sub-models trained during tuning. |
CrossValidator | copy(ParamMap extra): Creates a copy of this instance with the same UID and some extra params. |
Param<Estimator<?>> | estimator(): param for the estimator to be validated |
Param<ParamMap[]> | estimatorParamMaps(): param for estimator param maps |
Param<Evaluator> | evaluator(): param for the evaluator used to select hyper-parameters that maximize the validated metric |
CrossValidatorModel | fit(Dataset<?> dataset): Fits a model to the input data. |
Param<String> | foldCol(): Param for the column name of user specified fold number. |
static CrossValidator | load(String path) |
IntParam | numFolds(): Param for number of folds for cross validation. |
IntParam | parallelism(): The number of threads to use when running parallel algorithms. |
static MLReader<CrossValidator> | read() |
LongParam | seed(): Param for random seed. |
CrossValidator | setCollectSubModels(boolean value): Whether to collect submodels when fitting. |
CrossValidator | setEstimator(Estimator<?> value) |
CrossValidator | setEstimatorParamMaps(ParamMap[] value) |
CrossValidator | setEvaluator(Evaluator value) |
CrossValidator | setFoldCol(String value) |
CrossValidator | setNumFolds(int value) |
CrossValidator | setParallelism(int value): Set the maximum level of parallelism to evaluate models in parallel. |
CrossValidator | setSeed(long value) |
StructType | transformSchema(StructType schema): Check transform validity and derive the output schema from the input schema. |
String | uid(): An immutable unique ID for the object and its derivatives. |
MLWriter | write(): Returns an MLWriter instance for this ML instance. |
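Methods inherited from class PipelineStage: params
Methods inherited from class java.lang.Object: equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface CrossValidatorParams: getFoldCol, getNumFolds
Methods inherited from interface ValidatorParams: getEstimator, getEstimatorParamMaps, getEvaluator, logTuningParams, transformSchemaImpl
Methods inherited from interface Params: clear, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn
Methods inherited from interface Identifiable: toString
Methods inherited from interface HasParallelism: getExecutionContext, getParallelism
Methods inherited from interface HasCollectSubModels: getCollectSubModels
Methods inherited from interface MLWritable: save
Methods inherited from interface org.apache.spark.internal.Logging: $init$, initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, initLock, isTraceEnabled, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning, org$apache$spark$internal$Logging$$log__$eq, org$apache$spark$internal$Logging$$log_, uninitialize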
public CrossValidator(String uid)
public CrossValidator()
public static MLReader<CrossValidator> read()
public static CrossValidator load(String path)
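public final BooleanParam collectSubModels()
Description copied from interface: HasCollectSubModels
Param for whether to collect a list of sub-models trained during tuning.
Specified by: collectSubModels in interface HasCollectSubModels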
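public IntParam parallelism()
Description copied from interface: HasParallelism
The number of threads to use when running parallel algorithms.
Specified by: parallelism in interface HasParallelism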
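The effect of a thread-count cap like parallelism can be illustrated outside Spark. The following is a minimal sketch in plain Java, not Spark's implementation: the class `BoundedParallelEvaluation` and its method `evaluateAll` are hypothetical names, and the scoring function stands in for fitting and evaluating one candidate model.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.ToDoubleFunction;

// Hypothetical helper for illustration only; not part of the Spark API.
final class BoundedParallelEvaluation {

    // Scores every candidate using at most `parallelism` threads and
    // returns the metrics in the same order as the candidates.
    static <C> double[] evaluateAll(List<C> candidates,
                                    ToDoubleFunction<C> score,
                                    int parallelism) {
        if (parallelism < 1) {
            throw new IllegalArgumentException("parallelism must be >= 1");
        }
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            // Submit one task per candidate; the pool size bounds concurrency.
            List<Future<Double>> futures = new ArrayList<>();
            for (C candidate : candidates) {
                Callable<Double> task = () -> score.applyAsDouble(candidate);
                futures.add(pool.submit(task));
            }
            // Collect results in submission order.
            double[] metrics = new double[candidates.size()];
            for (int i = 0; i < metrics.length; i++) {
                metrics[i] = futures.get(i).get();
            }
            return metrics;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("evaluation interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("evaluation failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

With a parallelism of 1 the sketch degenerates to serial evaluation, since the pool has a single worker thread.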
public IntParam numFolds()
Description copied from interface: CrossValidatorParams
Param for number of folds for cross validation.
Specified by: numFolds in interface CrossValidatorParams
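public Param<String> foldCol()
Description copied from interface: CrossValidatorParams
Param for the column name of user-specified fold numbers. Once this is specified, CrossValidator won't do a random k-fold split. Note that this column should be of integer type with values in the range [0, numFolds); Spark throws an exception on out-of-range fold numbers.
Specified by: foldCol in interface CrossValidatorParams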
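The fold-number contract can be sketched in plain Java. This is an illustration under stated assumptions, not Spark code: the class `FoldColumnSplit`, its nested `Split` type, and the method `splitByFold` are hypothetical, and rows are represented only by their indices. For each fold, rows carrying that fold number form the validation set and all other rows form the training set; a fold number outside [0, numFolds) is rejected.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of user-specified folds; not part of the Spark API.
final class FoldColumnSplit {

    // Row indices of the training and validation sets for one fold.
    static final class Split {
        final List<Integer> training = new ArrayList<>();
        final List<Integer> validation = new ArrayList<>();
    }

    // foldOfRow[i] is the fold number assigned to row i.
    static List<Split> splitByFold(int[] foldOfRow, int numFolds) {
        List<Split> splits = new ArrayList<>();
        for (int f = 0; f < numFolds; f++) {
            splits.add(new Split());
        }
        for (int row = 0; row < foldOfRow.length; row++) {
            int fold = foldOfRow[row];
            // Fold numbers must lie in [0, numFolds).
            if (fold < 0 || fold >= numFolds) {
                throw new IllegalArgumentException(
                    "Fold number " + fold + " is outside [0, " + numFolds + ")");
            }
            // The row validates its own fold and trains every other fold.
            for (int f = 0; f < numFolds; f++) {
                if (f == fold) {
                    splits.get(f).validation.add(row);
                } else {
                    splits.get(f).training.add(row);
                }
            }
        }
        return splits;
    }
}
```

Because the assignment is read from the data rather than drawn at random, repeated runs see identical splits regardless of the seed.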
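public Param<Estimator<?>> estimator()
Description copied from interface: ValidatorParams
Param for the estimator to be validated.
Specified by: estimator in interface ValidatorParams
public Param<ParamMap[]> estimatorParamMaps()
Description copied from interface: ValidatorParams
Param for estimator param maps.
Specified by: estimatorParamMaps in interface ValidatorParams
public Param<Evaluator> evaluator()
Description copied from interface: ValidatorParams
Param for the evaluator used to select hyper-parameters that maximize the validated metric.
Specified by: evaluator in interface ValidatorParams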
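How the three params above fit together can be sketched in plain Java: each param map is scored on every fold, the scores are averaged per param map, and the param map with the highest average wins. This is a simplified illustration, not Spark's implementation. The class `BestCandidateSelection` and its methods are hypothetical, and the sketch assumes a metric where larger is better, matching the "maximize" wording above.

```java
// Hypothetical helper for illustration only; not part of the Spark API.
final class BestCandidateSelection {

    // metrics[fold][candidate] holds the validation metric of one
    // candidate param map on one fold. Returns the per-candidate mean.
    static double[] averageAcrossFolds(double[][] metrics) {
        if (metrics.length == 0) {
            throw new IllegalArgumentException("at least one fold is required");
        }
        double[] avg = new double[metrics[0].length];
        for (double[] foldMetrics : metrics) {
            for (int c = 0; c < avg.length; c++) {
                avg[c] += foldMetrics[c];
            }
        }
        for (int c = 0; c < avg.length; c++) {
            avg[c] /= metrics.length;
        }
        return avg;
    }

    // Index of the candidate with the largest average metric
    // (assumption: larger metric values are better).
    static int indexOfBest(double[] avgMetrics) {
        if (avgMetrics.length == 0) {
            throw new IllegalArgumentException("at least one candidate is required");
        }
        int best = 0;
        for (int c = 1; c < avgMetrics.length; c++) {
            if (avgMetrics[c] > avgMetrics[best]) {
                best = c;
            }
        }
        return best;
    }
}
```

For a metric where smaller is better, such as an error measure, the comparison in `indexOfBest` would need to be reversed.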
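public final LongParam seed()
Description copied from interface: HasSeed
Param for random seed.
public String uid()
Description copied from interface: Identifiable
An immutable unique ID for the object and its derivatives.
Specified by: uid in interface Identifiable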
public CrossValidator setEstimator(Estimator<?> value)
public CrossValidator setEstimatorParamMaps(ParamMap[] value)
public CrossValidator setEvaluator(Evaluator value)
public CrossValidator setNumFolds(int value)
public CrossValidator setSeed(long value)
public CrossValidator setFoldCol(String value)
public CrossValidator setParallelism(int value)
Set the maximum level of parallelism to evaluate models in parallel.
Parameters: value - (undocumented)
public CrossValidator setCollectSubModels(boolean value)
Whether to collect submodels when fitting.
Note: If this param is set, you can set the option "persistSubModels" to "true" before saving the returned model in order to save these submodels as well. See the documentation of CrossValidatorModel.CrossValidatorModelWriter for more information.
Parameters: value - (undocumented)
public CrossValidatorModel fit(Dataset<?> dataset)
Description copied from class: Estimator
Fits a model to the input data.
Specified by: fit in class Estimator<CrossValidatorModel>
Parameters: dataset - (undocumented)
public StructType transformSchema(StructType schema)
Description copied from class: PipelineStage
Check transform validity and derive the output schema from the input schema.
We check validity for interactions between parameters during transformSchema and raise an exception if any parameter value is invalid. Parameter value checks which do not depend on other parameters are handled by Param.validate().
A typical implementation should first verify schema changes and parameter validity, including complex parameter interaction checks.
Specified by: transformSchema in class PipelineStage
Parameters: schema - (undocumented)
public CrossValidator copy(ParamMap extra)
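Description copied from interface: Params
Creates a copy of this instance with the same UID and some extra params. See defaultCopy().
Specified by: copy in interface Params
Specified by: copy in class Estimator<CrossValidatorModel>
Parameters: extra - (undocumented)
public MLWriter write()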
Description copied from interface: MLWritable
Returns an MLWriter instance for this ML instance.
Specified by: write in interface MLWritable