public class LinearRegression extends Predictor<Vector,LinearRegression,LinearRegressionModel> implements DefaultParamsWritable, Logging
The learning objective is to minimize the squared error, with regularization. The specific squared error loss function used is:
$$ L = \frac{1}{2n} \left\| A \cdot \mathrm{coefficients} - y \right\|^2 $$
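As a small worked illustration of this loss, the following sketch evaluates it for a dense matrix in plain Java. The class and method names (`SquaredErrorLoss`, `loss`) are hypothetical and are not part of Spark; the code only mirrors the formula and says nothing about how Spark computes it internally.

```java
// Sketch: evaluates L = 1/(2n) * ||A * coefficients - y||^2 for a dense matrix.
// SquaredErrorLoss and loss are hypothetical names, not part of the Spark API.
final class SquaredErrorLoss {
    static double loss(double[][] a, double[] coefficients, double[] y) {
        int n = a.length;
        double sumSquares = 0.0;
        for (int i = 0; i < n; i++) {
            // Prediction for row i is the dot product of the row with the coefficients.
            double prediction = 0.0;
            for (int j = 0; j < coefficients.length; j++) {
                prediction += a[i][j] * coefficients[j];
            }
            double residual = prediction - y[i];
            sumSquares += residual * residual;
        }
        return sumSquares / (2.0 * n);
    }

    public static void main(String[] args) {
        double[][] a = {{1.0, 2.0}, {3.0, 4.0}};
        double[] coefficients = {1.0, 1.0};
        double[] y = {3.0, 6.0};
        // Residuals are 0 and 1, so the loss is (0 + 1) / (2 * 2).
        System.out.println(loss(a, coefficients, y));
    }
}
```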
This supports multiple types of regularization:
- none (a.k.a. ordinary least squares)
- L2 (ridge regression)
- L1 (Lasso)
- L2 + L1 (elastic net)
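The four regularization types can be viewed as special cases of one penalty term controlled by `regParam` and `elasticNetParam`. The sketch below assumes the usual elastic net form, `regParam * (elasticNetParam * ||w||_1 + (1 - elasticNetParam) / 2 * ||w||_2^2)`; that formula is not stated on this page, so treat it as an assumption. The class name `ElasticNetPenalty` is hypothetical.

```java
// Sketch: elastic net penalty under the assumed form
//   regParam * (elasticNetParam * ||w||_1 + (1 - elasticNetParam) / 2 * ||w||_2^2).
// ElasticNetPenalty is a hypothetical name, not part of the Spark API.
final class ElasticNetPenalty {
    static double penalty(double[] w, double regParam, double elasticNetParam) {
        double l1 = 0.0;
        double l2Squared = 0.0;
        for (double c : w) {
            l1 += Math.abs(c);
            l2Squared += c * c;
        }
        return regParam * (elasticNetParam * l1 + (1.0 - elasticNetParam) / 2.0 * l2Squared);
    }

    public static void main(String[] args) {
        double[] w = {3.0, -4.0};
        System.out.println(penalty(w, 0.0, 0.5)); // regParam = 0: no regularization
        System.out.println(penalty(w, 0.1, 0.0)); // elasticNetParam = 0: pure L2 (ridge)
        System.out.println(penalty(w, 0.1, 1.0)); // elasticNetParam = 1: pure L1 (Lasso)
        System.out.println(penalty(w, 0.1, 0.5)); // in between: elastic net
    }
}
```

Under this assumption, setting `regParam` to 0 removes the penalty entirely, which corresponds to ordinary least squares.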
Constructor and Description |
---|
LinearRegression() |
LinearRegression(String uid) |
Modifier and Type | Method and Description |
---|---|
static IntParam | aggregationDepth() |
static Params | clear(Param<?> param) |
LinearRegression | copy(ParamMap extra) Creates a copy of this instance with the same UID and some extra params. |
static DoubleParam | elasticNetParam() |
static String | explainParam(Param<?> param) |
static String | explainParams() |
static ParamMap | extractParamMap() |
static ParamMap | extractParamMap(ParamMap extra) |
static Param<String> | featuresCol() |
Param<String> | featuresCol() Param for features column name. |
static M | fit(Dataset<?> dataset) |
static M | fit(Dataset<?> dataset, ParamMap paramMap) |
static scala.collection.Seq<M> | fit(Dataset<?> dataset, ParamMap[] paramMaps) |
static M | fit(Dataset<?> dataset, ParamPair<?> firstParamPair, ParamPair<?>... otherParamPairs) |
static M | fit(Dataset<?> dataset, ParamPair<?> firstParamPair, scala.collection.Seq<ParamPair<?>> otherParamPairs) |
static BooleanParam | fitIntercept() |
static <T> scala.Option<T> | get(Param<T> param) |
static int | getAggregationDepth() |
static <T> scala.Option<T> | getDefault(Param<T> param) |
static double | getElasticNetParam() |
static String | getFeaturesCol() |
String | getFeaturesCol() |
static boolean | getFitIntercept() |
static String | getLabelCol() |
String | getLabelCol() |
static int | getMaxIter() |
static <T> T | getOrDefault(Param<T> param) |
static Param<Object> | getParam(String paramName) |
static String | getPredictionCol() |
String | getPredictionCol() |
static double | getRegParam() |
static String | getSolver() |
static boolean | getStandardization() |
static double | getTol() |
static String | getWeightCol() |
static <T> boolean | hasDefault(Param<T> param) |
static boolean | hasParam(String paramName) |
static boolean | isDefined(Param<?> param) |
static boolean | isSet(Param<?> param) |
static Param<String> | labelCol() |
Param<String> | labelCol() Param for label column name. |
static LinearRegression | load(String path) |
static int | MAX_FEATURES_FOR_NORMAL_SOLVER() When using LinearRegression.solver == "normal", the solver must limit the number of features to at most this number. |
static IntParam | maxIter() |
static Param<?>[] | params() |
static Param<String> | predictionCol() |
Param<String> | predictionCol() Param for prediction column name. |
static DoubleParam | regParam() |
static void | save(String path) |
static <T> Params | set(Param<T> param, T value) |
LinearRegression | setAggregationDepth(int value) Suggested depth for treeAggregate (greater than or equal to 2). |
LinearRegression | setElasticNetParam(double value) Set the ElasticNet mixing parameter. |
static Learner | setFeaturesCol(String value) |
LinearRegression | setFitIntercept(boolean value) Set if we should fit the intercept. |
static Learner | setLabelCol(String value) |
LinearRegression | setMaxIter(int value) Set the maximum number of iterations. |
static Learner | setPredictionCol(String value) |
LinearRegression | setRegParam(double value) Set the regularization parameter. |
LinearRegression | setSolver(String value) Set the solver algorithm used for optimization. |
LinearRegression | setStandardization(boolean value) Whether to standardize the training features before fitting the model. |
LinearRegression | setTol(double value) Set the convergence tolerance of iterations. |
LinearRegression | setWeightCol(String value) Whether to over-/under-sample training instances according to the given weights in weightCol. |
static Param<String> | solver() |
static BooleanParam | standardization() |
static DoubleParam | tol() |
static String | toString() |
static StructType | transformSchema(StructType schema) |
String | uid() An immutable unique ID for the object and its derivatives. |
StructType | validateAndTransformSchema(StructType schema, boolean fitting, DataType featuresDataType) Validates and transforms the input schema with the provided param map. |
static Param<String> | weightCol() |
static MLWriter | write() |
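Methods inherited from class Predictor: fit, setFeaturesCol, setLabelCol, setPredictionCol, transformSchema
Methods inherited from class Object: equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface DefaultParamsWritable: write
Methods inherited from interface MLWritable: save
Methods inherited from interface Logging: initializeLogging, initializeLogIfNecessary, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning
Methods inherited from interface Params: clear, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn
Methods inherited from interface Identifiable: toString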
public LinearRegression(String uid)
public LinearRegression()
public static LinearRegression load(String path)
public static int MAX_FEATURES_FOR_NORMAL_SOLVER()
When using LinearRegression.solver == "normal", the solver must limit the number of features to at most this number. The entire covariance matrix X^T X will be collected to the driver. This limit helps prevent memory overflow errors.
public static String toString()
public static Param<?>[] params()
public static String explainParam(Param<?> param)
public static String explainParams()
public static final boolean isSet(Param<?> param)
public static final boolean isDefined(Param<?> param)
public static boolean hasParam(String paramName)
public static Param<Object> getParam(String paramName)
public static final <T> scala.Option<T> get(Param<T> param)
public static final <T> T getOrDefault(Param<T> param)
public static final <T> scala.Option<T> getDefault(Param<T> param)
public static final <T> boolean hasDefault(Param<T> param)
public static final ParamMap extractParamMap()
public static M fit(Dataset<?> dataset, ParamPair<?> firstParamPair, scala.collection.Seq<ParamPair<?>> otherParamPairs)
public static M fit(Dataset<?> dataset, ParamPair<?> firstParamPair, ParamPair<?>... otherParamPairs)
public static final Param<String> labelCol()
public static final String getLabelCol()
public static final Param<String> featuresCol()
public static final String getFeaturesCol()
public static final Param<String> predictionCol()
public static final String getPredictionCol()
public static Learner setLabelCol(String value)
public static Learner setFeaturesCol(String value)
public static Learner setPredictionCol(String value)
public static M fit(Dataset<?> dataset)
public static StructType transformSchema(StructType schema)
public static final DoubleParam regParam()
public static final double getRegParam()
public static final DoubleParam elasticNetParam()
public static final double getElasticNetParam()
public static final IntParam maxIter()
public static final int getMaxIter()
public static final DoubleParam tol()
public static final double getTol()
public static final BooleanParam fitIntercept()
public static final boolean getFitIntercept()
public static final BooleanParam standardization()
public static final boolean getStandardization()
public static final Param<String> weightCol()
public static final String getWeightCol()
public static final Param<String> solver()
public static final String getSolver()
public static final IntParam aggregationDepth()
public static final int getAggregationDepth()
public static void save(String path) throws java.io.IOException
public static MLWriter write()
public String uid()
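Description copied from interface: Identifiable
An immutable unique ID for the object and its derivatives.
Specified by: uid in interface Identifiable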
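public LinearRegression setRegParam(double value)
Set the regularization parameter.
value - (undocumented)
public LinearRegression setFitIntercept(boolean value)
Set if we should fit the intercept.
value - (undocumented)
public LinearRegression setStandardization(boolean value)
Whether to standardize the training features before fitting the model.
value - (undocumented)
public LinearRegression setElasticNetParam(double value)
Set the ElasticNet mixing parameter.
value - (undocumented)
public LinearRegression setMaxIter(int value)
Set the maximum number of iterations.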
value - (undocumented)
public LinearRegression setTol(double value)
Set the convergence tolerance of iterations.
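How a maximum iteration count and a convergence tolerance interact in an iterative solver can be sketched as follows. This is not Spark's optimizer: plain gradient descent on a one-feature problem is swapped in for brevity, and the stopping rule (absolute change in loss below `tol`) is an assumption. `IterativeFitSketch`, `Result`, and `stepSize` are hypothetical names.

```java
// Sketch only: plain gradient descent on a one-feature least-squares problem,
// used to show the roles of maxIter and tol. Not Spark's actual solver.
final class IterativeFitSketch {
    static final class Result {
        final double slope;
        final int iterations;
        Result(double slope, int iterations) {
            this.slope = slope;
            this.iterations = iterations;
        }
    }

    // Loss 1/(2n) * sum((slope * x_i - y_i)^2) for the model y = slope * x.
    static double loss(double[] x, double[] y, double slope) {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            double residual = slope * x[i] - y[i];
            sum += residual * residual;
        }
        return sum / (2.0 * x.length);
    }

    static Result fit(double[] x, double[] y, int maxIter, double tol, double stepSize) {
        double slope = 0.0;
        double previousLoss = loss(x, y, slope);
        int iterations = 0;
        while (iterations < maxIter) { // maxIter caps the number of updates
            double gradient = 0.0;
            for (int i = 0; i < x.length; i++) {
                gradient += (slope * x[i] - y[i]) * x[i];
            }
            gradient /= x.length;
            slope -= stepSize * gradient;
            iterations++;
            double currentLoss = loss(x, y, slope);
            // Assumed stopping rule: stop once the loss changes by less than tol.
            if (Math.abs(previousLoss - currentLoss) < tol) {
                break;
            }
            previousLoss = currentLoss;
        }
        return new Result(slope, iterations);
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0, 3.0};
        double[] y = {2.0, 4.0, 6.0};
        Result r = fit(x, y, 100, 1e-12, 0.1);
        System.out.println(r.slope + " after " + r.iterations + " iterations");
    }
}
```

A smaller tolerance generally means more iterations and a more precise result, while a small `maxIter` can stop the loop before the tolerance is reached.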
value - (undocumented)
public LinearRegression setWeightCol(String value)
Whether to over-/under-sample training instances according to the given weights in weightCol.
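One way to read "over-/under-sample according to the given weights" is that each squared residual is multiplied by its instance weight, so that an integer weight of 2 behaves like a duplicated row. The sketch below illustrates that reading. The normalization by total weight is an assumption, not something this page states, and `WeightedLossSketch` is a hypothetical name.

```java
// Sketch: weighted squared error, assuming normalization by the total weight.
// WeightedLossSketch is a hypothetical name, not part of the Spark API.
final class WeightedLossSketch {
    static double weightedLoss(double[][] a, double[] coefficients, double[] y, double[] weights) {
        double weightedSum = 0.0;
        double weightTotal = 0.0;
        for (int i = 0; i < a.length; i++) {
            double prediction = 0.0;
            for (int j = 0; j < coefficients.length; j++) {
                prediction += a[i][j] * coefficients[j];
            }
            double residual = prediction - y[i];
            weightedSum += weights[i] * residual * residual;
            weightTotal += weights[i];
        }
        return weightedSum / (2.0 * weightTotal);
    }

    public static void main(String[] args) {
        double[] coefficients = {1.0};
        // Two rows, the first with weight 2 ...
        double weighted = weightedLoss(
            new double[][]{{1.0}, {2.0}}, coefficients, new double[]{0.0, 0.0}, new double[]{2.0, 1.0});
        // ... versus three rows with the first one duplicated and unit weights.
        double duplicated = weightedLoss(
            new double[][]{{1.0}, {1.0}, {2.0}}, coefficients, new double[]{0.0, 0.0, 0.0},
            new double[]{1.0, 1.0, 1.0});
        System.out.println(weighted + " " + duplicated);
    }
}
```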
value - (undocumented)
public LinearRegression setSolver(String value)
Set the solver algorithm used for optimization.
- "normal": the number of features is limited to LinearRegression.MAX_FEATURES_FOR_NORMAL_SOLVER.
- "auto" (default) means that the solver algorithm is selected automatically. The Normal Equations solver will be used when possible, but this will automatically fall back to iterative optimization methods when needed.
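To make the Normal Equations approach concrete, the following sketch solves (X^T X) c = X^T y for a small dense problem with Gaussian elimination. It is an illustration under simplifying assumptions (no regularization, no weights, a non-singular matrix) and is not Spark's implementation. `NormalEquationSketch` is a hypothetical name. The d-by-d size of X^T X, where d is the number of features, suggests why a feature limit matters when that matrix is held on one machine.

```java
// Sketch: solve the normal equations (X^T X) c = X^T y for a small dense problem.
// Assumes X^T X is non-singular; NormalEquationSketch is a hypothetical name.
final class NormalEquationSketch {
    static double[] solve(double[][] x, double[] y) {
        int n = x.length;
        int d = x[0].length;
        // Augmented d x (d + 1) system [X^T X | X^T y]; its size depends on d, not n.
        double[][] m = new double[d][d + 1];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < d; j++) {
                for (int k = 0; k < d; k++) {
                    m[j][k] += x[i][j] * x[i][k];
                }
                m[j][d] += x[i][j] * y[i];
            }
        }
        // Forward elimination with partial pivoting.
        for (int col = 0; col < d; col++) {
            int pivot = col;
            for (int r = col + 1; r < d; r++) {
                if (Math.abs(m[r][col]) > Math.abs(m[pivot][col])) {
                    pivot = r;
                }
            }
            double[] tmp = m[col];
            m[col] = m[pivot];
            m[pivot] = tmp;
            for (int r = col + 1; r < d; r++) {
                double factor = m[r][col] / m[col][col];
                for (int k = col; k <= d; k++) {
                    m[r][k] -= factor * m[col][k];
                }
            }
        }
        // Back substitution.
        double[] c = new double[d];
        for (int r = d - 1; r >= 0; r--) {
            double sum = m[r][d];
            for (int k = r + 1; k < d; k++) {
                sum -= m[r][k] * c[k];
            }
            c[r] = sum / m[r][r];
        }
        return c;
    }

    public static void main(String[] args) {
        // First column is a constant 1 acting as the intercept; data follow y = 1 + 2 * x.
        double[][] x = {{1.0, 1.0}, {1.0, 2.0}, {1.0, 3.0}};
        double[] y = {3.0, 5.0, 7.0};
        double[] c = solve(x, y);
        System.out.println(c[0] + " " + c[1]);
    }
}
```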
value - (undocumented)
public LinearRegression setAggregationDepth(int value)
Suggested depth for treeAggregate (greater than or equal to 2).
value - (undocumented)
public LinearRegression copy(ParamMap extra)
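Description copied from interface: Params
Creates a copy of this instance with the same UID and some extra params. See defaultCopy().
Specified by: copy in interface Params
Specified by: copy in class Predictor<Vector,LinearRegression,LinearRegressionModel>
extra - (undocumented)
public StructType validateAndTransformSchema(StructType schema, boolean fitting, DataType featuresDataType)
Validates and transforms the input schema with the provided param map.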
schema - input schema
fitting - whether this is in fitting
featuresDataType - SQL DataType for FeaturesType. E.g., VectorUDT for vector features.
public Param<String> labelCol()
public String getLabelCol()
public Param<String> featuresCol()
public String getFeaturesCol()
public Param<String> predictionCol()
public String getPredictionCol()