public class NaiveBayes extends ProbabilisticClassifier<Vector,NaiveBayes,NaiveBayesModel>
Naive Bayes classifier. It supports Multinomial NB (http://nlp.stanford.edu/IR-book/html/htmledition/naive-bayes-text-classification-1.html), which handles finitely supported discrete data. For example, converting documents into TF-IDF vectors lets it be used for document classification. Converting every feature vector to binary (0/1) values lets it also be used as Bernoulli NB (http://nlp.stanford.edu/IR-book/html/htmledition/the-bernoulli-model-1.html).
The input feature values must be nonnegative.

Constructor and Description |
---|
NaiveBayes() |
NaiveBayes(java.lang.String uid) |
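
The class description above puts two requirements on feature vectors: values must be nonnegative, and Bernoulli NB expects binary (0/1) values. The following is a minimal plain-Java sketch of both preparation steps. The class `FeaturePrep` and its methods are hypothetical helpers written for illustration; they are not part of this API and do not show how the library itself validates input.

```java
import java.util.Arrays;

public class FeaturePrep {
    /** Hypothetical helper: rejects negative (or NaN) feature values. */
    public static void requireNonnegative(double[] features) {
        for (int i = 0; i < features.length; i++) {
            // The negated comparison also catches NaN.
            if (!(features[i] >= 0.0)) {
                throw new IllegalArgumentException(
                    "Feature " + i + " must be nonnegative but was " + features[i]);
            }
        }
    }

    /** Hypothetical helper: maps every positive value to 1.0 and every zero to 0.0. */
    public static double[] binarize(double[] features) {
        requireNonnegative(features);
        double[] out = new double[features.length];
        for (int i = 0; i < features.length; i++) {
            out[i] = features[i] > 0.0 ? 1.0 : 0.0;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] tfidf = {0.0, 2.3, 0.7, 0.0};
        // prints [0.0, 1.0, 1.0, 0.0]
        System.out.println(Arrays.toString(binarize(tfidf)));
    }
}
```

Binarizing discards term weights, so it is only appropriate when presence or absence of a feature is the signal of interest.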

Modifier and Type | Method and Description |
---|---|
NaiveBayes | copy(ParamMap extra) Creates a copy of this instance with the same UID and some extra params. |
Param<java.lang.String> | featuresCol() Param for features column name. |
java.lang.String | getFeaturesCol() |
java.lang.String | getLabelCol() |
java.lang.String | getModelType() |
java.lang.String | getPredictionCol() |
java.lang.String | getRawPredictionCol() |
double | getSmoothing() |
Param<java.lang.String> | labelCol() Param for label column name. |
static NaiveBayes | load(java.lang.String path) |
Param<java.lang.String> | modelType() The model type which is a string (case-sensitive). |
Param<java.lang.String> | predictionCol() Param for prediction column name. |
Param<java.lang.String> | rawPredictionCol() Param for raw prediction (a.k.a. confidence) column name. |
NaiveBayes | setModelType(java.lang.String value) Set the model type using a string (case-sensitive). |
NaiveBayes | setSmoothing(double value) Set the smoothing parameter. |
DoubleParam | smoothing() The smoothing parameter. |
protected NaiveBayesModel | train(DataFrame dataset) Train a model using the given dataset and parameters. |
java.lang.String | uid() An immutable unique ID for the object and its derivatives. |
StructType | validateAndTransformSchema(StructType schema, boolean fitting, DataType featuresDataType) |
StructType | validateAndTransformSchema(StructType schema, boolean fitting, DataType featuresDataType) Validates and transforms the input schema with the provided param map. |
MLWriter | write() Returns an MLWriter instance for this ML instance. |
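
The summary above lists a smoothing parameter and a train method without showing what smoothing does to the fitted probabilities. As a rough illustration, here is a self-contained plain-Java sketch of multinomial Naive Bayes with additive smoothing. The class `MultinomialNBSketch` is hypothetical and written only for this example; it assumes labels are integers in `[0, numClasses)` and is not the implementation behind `train`.

```java
import java.util.Arrays;

public class MultinomialNBSketch {
    public final double[] logPrior;        // log P(class)
    public final double[][] logLikelihood; // log P(feature | class)

    /** Fits the sketch model; smoothing is the additive constant added to every count. */
    public MultinomialNBSketch(double[][] x, int[] y, int numClasses, double smoothing) {
        int numFeatures = x[0].length;
        double[] classCount = new double[numClasses];
        double[][] featureSum = new double[numClasses][numFeatures];
        for (int i = 0; i < x.length; i++) {
            classCount[y[i]] += 1.0;
            for (int j = 0; j < numFeatures; j++) {
                if (x[i][j] < 0.0) {
                    throw new IllegalArgumentException("feature values must be nonnegative");
                }
                featureSum[y[i]][j] += x[i][j];
            }
        }
        logPrior = new double[numClasses];
        logLikelihood = new double[numClasses][numFeatures];
        for (int c = 0; c < numClasses; c++) {
            logPrior[c] = Math.log(
                (classCount[c] + smoothing) / (x.length + smoothing * numClasses));
            double total = Arrays.stream(featureSum[c]).sum();
            for (int j = 0; j < numFeatures; j++) {
                logLikelihood[c][j] = Math.log(
                    (featureSum[c][j] + smoothing) / (total + smoothing * numFeatures));
            }
        }
    }

    /** Returns the class with the highest log posterior score. */
    public int predict(double[] features) {
        int best = 0;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int c = 0; c < logPrior.length; c++) {
            double score = logPrior[c];
            for (int j = 0; j < features.length; j++) {
                // Skip zero-valued features so 0 * log(0) never produces NaN.
                if (features[j] != 0.0) {
                    score += features[j] * logLikelihood[c][j];
                }
            }
            if (score > bestScore) {
                bestScore = score;
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] x = {{3, 0, 1}, {2, 0, 0}, {0, 4, 1}, {0, 3, 0}};
        int[] y = {0, 0, 1, 1};
        MultinomialNBSketch model = new MultinomialNBSketch(x, y, 2, 1.0);
        System.out.println(model.predict(new double[]{1, 0, 0})); // prints 0
        System.out.println(model.predict(new double[]{0, 2, 0})); // prints 1
    }
}
```

With a smoothing value of 1.0, feature 1 never occurs in class 0 yet still receives probability 1/9 instead of zero, so a single unseen feature cannot rule out a class on its own.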
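Methods inherited from class ProbabilisticClassifier: setProbabilityCol, setThresholds
Methods inherited from class Classifier: setRawPredictionCol
Methods inherited from class Predictor: extractLabeledPoints, fit, setFeaturesCol, setLabelCol, setPredictionCol, transformSchema
Methods inherited from class PipelineStage: transformSchema
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface MLWritable: save
Methods inherited from interface Params: clear, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn, validateParams
Methods inherited from interface Identifiable: toString
Methods inherited from interface Logging: initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning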
public NaiveBayes(java.lang.String uid)
public NaiveBayes()
public static NaiveBayes load(java.lang.String path)
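public java.lang.String uid()
Description copied from interface: Identifiable
An immutable unique ID for the object and its derivatives.
Specified by: uid in interface Identifiable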
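public NaiveBayes setSmoothing(double value)
Set the smoothing parameter.
Parameters: value - (undocumented)

public NaiveBayes setModelType(java.lang.String value)
Set the model type using a string (case-sensitive).
Parameters: value - (undocumented)

protected NaiveBayesModel train(DataFrame dataset)
Description copied from class: Predictor
Train a model using the given dataset and parameters. Developers can implement this instead of fit() to avoid dealing with schema validation and copying parameters into the model.
Specified by: train in class Predictor<Vector,NaiveBayes,NaiveBayesModel>
Parameters: dataset - Training dataset

public NaiveBayes copy(ParamMap extra)
Description copied from interface: Params
Creates a copy of this instance with the same UID and some extra params.
Specified by: copy in interface Params
Specified by: copy in class Predictor<Vector,NaiveBayes,NaiveBayesModel>
Parameters: extra - (undocumented)
See also: defaultCopy()
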
public DoubleParam smoothing()
public double getSmoothing()
public Param<java.lang.String> modelType()
public java.lang.String getModelType()
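
For orientation, the smoothing parameter is commonly read as the constant in additive (Laplace or Lidstone) smoothing. The following is a hedged sketch of the usual multinomial estimate; the symbols are introduced here for illustration and do not appear on this page, and the exact estimator used by train is not stated here.

```latex
\hat{\theta}_{c,j} = \frac{N_{c,j} + \lambda}{N_{c} + \lambda\, n},
\qquad
N_{c} = \sum_{j=1}^{n} N_{c,j}
```

Here $N_{c,j}$ is the sum of feature $j$ over training rows with label $c$, $n$ is the number of features, and $\lambda$ is the smoothing value. Any $\lambda > 0$ keeps every estimate strictly positive.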
public MLWriter write()
Description copied from interface: MLWritable
Returns an MLWriter instance for this ML instance.
Specified by: write in interface MLWritable
public StructType validateAndTransformSchema(StructType schema, boolean fitting, DataType featuresDataType)
public Param<java.lang.String> rawPredictionCol()
public java.lang.String getRawPredictionCol()
public StructType validateAndTransformSchema(StructType schema, boolean fitting, DataType featuresDataType)
Validates and transforms the input schema with the provided param map.
Parameters:
schema - input schema
fitting - whether this is called during fitting
featuresDataType - SQL DataType for FeaturesType. E.g., VectorUDT for vector features.

public Param<java.lang.String> labelCol()
public java.lang.String getLabelCol()
public Param<java.lang.String> featuresCol()
public java.lang.String getFeaturesCol()
public Param<java.lang.String> predictionCol()
public java.lang.String getPredictionCol()