public class VectorIndexerModel extends Model<VectorIndexerModel> implements MLWritable
This maintains vector sparsity.
param: numFeatures Number of features, i.e., the length of the Vectors that this model transforms.
param: categoryMaps Feature value index. Keys are categorical feature indices (column indices). Values are maps from original feature values to 0-based category indices. If a feature is not in this map, it is treated as continuous.
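The categoryMaps description above can be illustrated with a small plain-Java sketch. It is not Spark's implementation: `CategoryMapSketch` and `applyCategoryMaps` are hypothetical names, the input is a plain dense `double[]` rather than a Spark Vector, and throwing on an unseen categorical value is an assumption of this sketch. The map has the same shape as the return type of `javaCategoryMaps()`.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class CategoryMapSketch {

    // Hypothetical helper (not part of Spark): replaces each categorical
    // feature value with its 0-based category index. Features whose index
    // is not a key of categoryMaps are treated as continuous and copied as-is.
    static double[] applyCategoryMaps(
            double[] features,
            Map<Integer, Map<Double, Integer>> categoryMaps) {
        double[] out = features.clone();
        for (Map.Entry<Integer, Map<Double, Integer>> entry : categoryMaps.entrySet()) {
            int featureIndex = entry.getKey();
            Integer categoryIndex = entry.getValue().get(features[featureIndex]);
            if (categoryIndex == null) {
                // Assumption of this sketch: an unseen categorical value is an error.
                throw new IllegalArgumentException(
                        "Unseen value " + features[featureIndex]
                                + " for categorical feature " + featureIndex);
            }
            out[featureIndex] = categoryIndex;
        }
        return out;
    }

    public static void main(String[] args) {
        // Feature 0 is categorical with three known values; feature 1 is continuous.
        Map<Double, Integer> feature0 = new HashMap<>();
        feature0.put(-1.0, 0);
        feature0.put(5.0, 1);
        feature0.put(10.0, 2);

        Map<Integer, Map<Double, Integer>> categoryMaps = new HashMap<>();
        categoryMaps.put(0, feature0);

        double[] result = applyCategoryMaps(new double[] {10.0, 3.7}, categoryMaps);
        System.out.println(Arrays.toString(result)); // prints [2.0, 3.7]
    }
}
```

Feature 0 is replaced by its category index, while feature 1, which has no entry in the map, passes through unchanged.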
Modifier and Type | Method and Description |
---|---|
scala.collection.immutable.Map<java.lang.Object,scala.collection.immutable.Map<java.lang.Object,java.lang.Object>> | categoryMaps() |
VectorIndexerModel | copy(ParamMap extra): Creates a copy of this instance with the same UID and some extra params. |
int | getMaxCategories() |
java.util.Map<java.lang.Integer,java.util.Map<java.lang.Double,java.lang.Integer>> | javaCategoryMaps(): Java-friendly version of categoryMaps. |
static VectorIndexerModel | load(java.lang.String path) |
IntParam | maxCategories(): Threshold for the number of values a categorical feature can take. |
int | numFeatures() |
static MLReader<VectorIndexerModel> | read() |
VectorIndexerModel | setInputCol(java.lang.String value) |
VectorIndexerModel | setOutputCol(java.lang.String value) |
DataFrame | transform(DataFrame dataset): Transforms the input dataset. |
StructType | transformSchema(StructType schema): :: DeveloperApi :: |
java.lang.String | uid(): An immutable unique ID for the object and its derivatives. |
MLWriter | write(): Returns an MLWriter instance for this ML instance. |
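Methods inherited from class Transformer: transform, transform, transform
Methods inherited from class PipelineStage: transformSchema
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface Params: clear, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn, validateParams
Methods inherited from interface Identifiable: toString
Methods inherited from interface MLWritable: save
Methods inherited from interface Logging: initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning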
public static MLReader<VectorIndexerModel> read()
public static VectorIndexerModel load(java.lang.String path)
public java.lang.String uid()
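Description copied from interface: Identifiable
An immutable unique ID for the object and its derivatives.
Specified by: uid in interface Identifiable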
public int numFeatures()
public scala.collection.immutable.Map<java.lang.Object,scala.collection.immutable.Map<java.lang.Object,java.lang.Object>> categoryMaps()
public java.util.Map<java.lang.Integer,java.util.Map<java.lang.Double,java.lang.Integer>> javaCategoryMaps()
Java-friendly version of categoryMaps.
public VectorIndexerModel setInputCol(java.lang.String value)
public VectorIndexerModel setOutputCol(java.lang.String value)
public DataFrame transform(DataFrame dataset)
Description copied from class: Transformer
Transforms the input dataset.
Specified by: transform in class Transformer
Parameters: dataset - (undocumented)
public StructType transformSchema(StructType schema)
Description copied from class: PipelineStage
Derives the output schema from the input schema.
Specified by: transformSchema in class PipelineStage
Parameters: schema - (undocumented)
public VectorIndexerModel copy(ParamMap extra)
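Description copied from interface: Params
Creates a copy of this instance with the same UID and some extra params.
Specified by: copy in interface Params
Specified by: copy in class Model<VectorIndexerModel>
Parameters: extra - (undocumented)
See Also: defaultCopy()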
public MLWriter write()
Description copied from interface: MLWritable
Returns an MLWriter instance for this ML instance.
Specified by: write in interface MLWritable
public IntParam maxCategories()
Threshold for the number of values a categorical feature can take. (default = 20)
public int getMaxCategories()
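To make the role of the maxCategories threshold concrete, here is a minimal plain-Java sketch of how category maps could be derived from data. It is not Spark's code: `MaxCategoriesSketch` and `buildCategoryMaps` are hypothetical names. It assumes that a feature with at most maxCategories distinct values is treated as categorical and any other feature as continuous, and that category indices follow ascending value order; the page above does not spell out either rule.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

public class MaxCategoriesSketch {

    // Hypothetical helper (not part of Spark): scans dense rows column by column
    // and builds a category map for every column that has at most
    // maxCategories distinct values. Other columns are left out (continuous).
    static Map<Integer, Map<Double, Integer>> buildCategoryMaps(
            double[][] rows, int maxCategories) {
        Map<Integer, Map<Double, Integer>> categoryMaps = new TreeMap<>();
        if (rows.length == 0) {
            return categoryMaps;
        }
        int numFeatures = rows[0].length;
        for (int j = 0; j < numFeatures; j++) {
            // Collect the distinct values of column j in ascending order.
            TreeSet<Double> distinct = new TreeSet<>();
            for (double[] row : rows) {
                distinct.add(row[j]);
            }
            if (distinct.size() <= maxCategories) {
                // Assumption of this sketch: indices are assigned in ascending value order.
                Map<Double, Integer> valueToIndex = new TreeMap<>();
                int nextIndex = 0;
                for (double value : distinct) {
                    valueToIndex.put(value, nextIndex++);
                }
                categoryMaps.put(j, valueToIndex);
            }
        }
        return categoryMaps;
    }

    public static void main(String[] args) {
        double[][] rows = {
            {0.0, 1.5},
            {1.0, 2.5},
            {0.0, 3.5},
        };
        // Column 0 has 2 distinct values (categorical); column 1 has 3 (continuous).
        System.out.println(buildCategoryMaps(rows, 2)); // prints {0={0.0=0, 1.0=1}}
    }
}
```

With the default threshold of 20, both columns in this tiny dataset would be indexed as categorical; lowering the threshold to 2 leaves column 1 continuous.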