Package org.apache.spark.ml.regression
Interface GeneralizedLinearRegressionBase
- All Superinterfaces:
HasAggregationDepth, HasFeaturesCol, HasFitIntercept, HasLabelCol, HasMaxIter, HasPredictionCol, HasRegParam, HasSolver, HasTol, HasWeightCol, Identifiable, org.apache.spark.internal.Logging, Params, PredictorParams, Serializable, scala.Serializable
- All Known Implementing Classes:
GeneralizedLinearRegression, GeneralizedLinearRegressionModel
public interface GeneralizedLinearRegressionBase
extends PredictorParams, HasFitIntercept, HasMaxIter, HasTol, HasRegParam, HasWeightCol, HasSolver, HasAggregationDepth, org.apache.spark.internal.Logging
Params for Generalized Linear Regression.
-
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging
org.apache.spark.internal.Logging.SparkShellLoggingFilter
-
Method Summary
Modifier and Type | Method | Description
Param<String> | family() | Param for the name of family which is a description of the error distribution to be used in the model.
String | getFamily() |
String | getLink() |
double | getLinkPower() |
String | getLinkPredictionCol() |
String | getOffsetCol() |
double | getVariancePower() |
boolean | hasLinkPredictionCol() | Checks whether we should output link prediction.
boolean | hasOffsetCol() | Checks whether offset column is set and nonempty.
boolean | hasWeightCol() | Checks whether weight column is set and nonempty.
Param<String> | link() | Param for the name of link function which provides the relationship between the linear predictor and the mean of the distribution function.
DoubleParam | linkPower() | Param for the index in the power link function.
Param<String> | linkPredictionCol() | Param for link prediction (linear predictor) column name.
Param<String> | offsetCol() | Param for offset column name.
Param<String> | solver() | The solver algorithm for optimization.
StructType | validateAndTransformSchema(StructType schema, boolean fitting, DataType featuresDataType) | Validates and transforms the input schema with the provided param map.
DoubleParam | variancePower() | Param for the power in the variance function of the Tweedie distribution which provides the relationship between the variance and mean of the distribution.
Methods inherited from interface org.apache.spark.ml.param.shared.HasAggregationDepth
aggregationDepth, getAggregationDepth
Methods inherited from interface org.apache.spark.ml.param.shared.HasFeaturesCol
featuresCol, getFeaturesCol
Methods inherited from interface org.apache.spark.ml.param.shared.HasFitIntercept
fitIntercept, getFitIntercept
Methods inherited from interface org.apache.spark.ml.param.shared.HasLabelCol
getLabelCol, labelCol
Methods inherited from interface org.apache.spark.ml.param.shared.HasMaxIter
getMaxIter, maxIter
Methods inherited from interface org.apache.spark.ml.param.shared.HasPredictionCol
getPredictionCol, predictionCol
Methods inherited from interface org.apache.spark.ml.param.shared.HasRegParam
getRegParam, regParam
Methods inherited from interface org.apache.spark.ml.param.shared.HasWeightCol
getWeightCol, weightCol
Methods inherited from interface org.apache.spark.ml.util.Identifiable
toString, uid
Methods inherited from interface org.apache.spark.internal.Logging
initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq
Methods inherited from interface org.apache.spark.ml.param.Params
clear, copy, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, onParamChange, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn
-
Method Details
-
family
Param<String> family()
Param for the name of the family, which describes the error distribution to be used in the model. Supported options: "gaussian", "binomial", "poisson", "gamma" and "tweedie". Default is "gaussian".
- Returns:
- (undocumented)
-
getFamily
String getFamily()
-
getLink
String getLink()
-
getLinkPower
double getLinkPower()
-
getLinkPredictionCol
String getLinkPredictionCol()
-
getOffsetCol
String getOffsetCol()
-
getVariancePower
double getVariancePower()
-
hasLinkPredictionCol
boolean hasLinkPredictionCol()
Checks whether we should output link prediction.
-
hasOffsetCol
boolean hasOffsetCol()
Checks whether offset column is set and nonempty.
-
hasWeightCol
boolean hasWeightCol()
Checks whether weight column is set and nonempty.
-
link
Param<String> link()
Param for the name of the link function, which provides the relationship between the linear predictor and the mean of the distribution function. Supported options: "identity", "log", "inverse", "logit", "probit", "cloglog" and "sqrt". This is used only when family is not "tweedie". The link function for the "tweedie" family must be specified through linkPower().
- Returns:
- (undocumented)
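For orientation, the link names above correspond to standard GLM link functions. The following is a minimal plain-Java sketch of those formulas, not Spark's implementation; the class name `LinkSketch` and its methods are hypothetical, and "probit" is deliberately left out because it needs the inverse standard normal CDF, which the JDK does not provide.

```java
/** Illustrative formulas for the named link functions (hypothetical helper, not Spark code). */
final class LinkSketch {
    private LinkSketch() {}

    /** Maps a mean mu to the linear predictor eta = g(mu). */
    static double link(String name, double mu) {
        switch (name) {
            case "identity": return mu;
            case "log":      return Math.log(mu);
            case "inverse":  return 1.0 / mu;
            case "logit":    return Math.log(mu / (1.0 - mu));
            case "cloglog":  return Math.log(-Math.log(1.0 - mu));
            case "sqrt":     return Math.sqrt(mu);
            default:
                // "probit" (and any unknown name) is not covered by this sketch.
                throw new IllegalArgumentException("Link not covered by this sketch: " + name);
        }
    }

    /** Maps a linear predictor eta back to the mean, mu = g^{-1}(eta). */
    static double unlink(String name, double eta) {
        switch (name) {
            case "identity": return eta;
            case "log":      return Math.exp(eta);
            case "inverse":  return 1.0 / eta;
            case "logit":    return 1.0 / (1.0 + Math.exp(-eta));
            case "cloglog":  return 1.0 - Math.exp(-Math.exp(eta));
            case "sqrt":     return eta * eta;
            default:
                throw new IllegalArgumentException("Link not covered by this sketch: " + name);
        }
    }
}
```

For example, `LinkSketch.unlink("log", eta)` turns a linear predictor into a positive mean, which is why the log link pairs naturally with count-like families.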
-
linkPower
DoubleParam linkPower()
Param for the index in the power link function. Only applicable to the Tweedie family. Note that link power 0, 1, -1 or 0.5 corresponds to the Log, Identity, Inverse or Sqrt link, respectively. When not set, this value defaults to 1 - variancePower(), which matches the R "statmod" package.
- Returns:
- (undocumented)
-
linkPredictionCol
Param<String> linkPredictionCol()
Param for link prediction (linear predictor) column name. By default this is not set, which means the link prediction is not output.
- Returns:
- (undocumented)
-
offsetCol
Param<String> offsetCol()
Param for offset column name. If this is not set or is empty, all instance offsets are treated as 0.0. The feature specified as offset has a constant coefficient of 1.0.
- Returns:
- (undocumented)
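To make the "constant coefficient of 1.0" concrete: the offset is added to the linear predictor as is, without a fitted coefficient. The sketch below shows this in plain Java. It is not Spark code; `OffsetSketch` and its methods are hypothetical, and the Poisson/log-link helper is just one common use (an offset of log(exposure) scales the predicted mean by the exposure).

```java
/** Illustrative use of an offset in the linear predictor (hypothetical helper, not Spark code). */
final class OffsetSketch {
    private OffsetSketch() {}

    /** eta = intercept + sum_j coefficients[j] * features[j] + 1.0 * offset. */
    static double linearPredictor(double[] coefficients, double intercept,
                                  double[] features, double offset) {
        if (coefficients.length != features.length) {
            throw new IllegalArgumentException("coefficients and features must have equal length");
        }
        // The offset enters with a fixed coefficient of 1.0; an unset offset is 0.0.
        double eta = intercept + offset;
        for (int j = 0; j < features.length; j++) {
            eta += coefficients[j] * features[j];
        }
        return eta;
    }

    /** Mean under a log link, mu = exp(eta), as used with a Poisson-style family. */
    static double logLinkMean(double[] coefficients, double intercept,
                              double[] features, double offset) {
        return Math.exp(linearPredictor(coefficients, intercept, features, offset));
    }
}
```

Passing `Math.log(10.0)` as the offset to `logLinkMean` multiplies the predicted mean by 10 relative to an offset of 0.0, which is how exposure is typically handled in rate models.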
-
solver
Param<String> solver()
The solver algorithm for optimization. Supported options: "irls" (iteratively reweighted least squares). Default: "irls".
-
validateAndTransformSchema
StructType validateAndTransformSchema(StructType schema, boolean fitting, DataType featuresDataType)
Description copied from interface: PredictorParams
Validates and transforms the input schema with the provided param map.
- Specified by:
- validateAndTransformSchema in interface PredictorParams
- Parameters:
- schema - input schema
- fitting - whether this is called during fitting
- featuresDataType - SQL DataType for FeaturesType. E.g., VectorUDT for vector features.
- Returns:
- output schema
-
variancePower
DoubleParam variancePower()
Param for the power in the variance function of the Tweedie distribution, which provides the relationship between the variance and mean of the distribution. Only applicable to the Tweedie family (see Tweedie Distribution (Wikipedia)). Supported values: 0 and [1, Inf). Note that variance power 0, 1, or 2 corresponds to the Gaussian, Poisson or Gamma family, respectively.
- Returns:
- (undocumented)
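The variance function referred to above is, up to a dispersion factor, V(mu) = mu^variancePower. A minimal plain-Java sketch of that relationship and of the supported range follows. It is not Spark's implementation; `TweedieVarianceSketch` and its method names are hypothetical.

```java
/** Illustrative Tweedie variance function (hypothetical helper, not Spark code). */
final class TweedieVarianceSketch {
    private TweedieVarianceSketch() {}

    /** Supported values per the description above: 0 and [1, Inf). */
    static boolean isSupportedVariancePower(double variancePower) {
        return variancePower == 0.0
            || (variancePower >= 1.0 && !Double.isInfinite(variancePower));
    }

    /** V(mu) = mu^variancePower, ignoring the dispersion factor. */
    static double variance(double variancePower, double mu) {
        if (!isSupportedVariancePower(variancePower)) {
            throw new IllegalArgumentException("Unsupported variance power: " + variancePower);
        }
        return Math.pow(mu, variancePower);
    }
}
```

At mu = 5, a variance power of 0 gives a constant variance of 1 (Gaussian-like), 1 gives 5 (Poisson-like, variance equal to the mean), and 2 gives 25 (Gamma-like, variance equal to the squared mean).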
-