public class HashingTF
extends java.lang.Object
implements scala.Serializable
param: numFeatures number of features (default: 2^20)
Modifier and Type | Method and Description |
---|---|
int | indexOf(java.lang.Object term) Returns the index of the input term. |
int | numFeatures() |
Vector | transform(java.lang.Iterable<?> document) Transforms the input document into a sparse term frequency vector (Java version). |
Vector | transform(scala.collection.Iterable<java.lang.Object> document) Transforms the input document into a sparse term frequency vector. |
<D extends java.lang.Iterable<?>> JavaRDD<Vector> | transform(JavaRDD<D> dataset) Transforms the input documents to term frequency vectors (Java version). |
<D extends scala.collection.Iterable<java.lang.Object>> RDD<Vector> | transform(RDD<D> dataset) Transforms the input documents to term frequency vectors. |
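
The summary above describes the hashing trick: each term is mapped to an index in the range [0, numFeatures), and a document becomes a sparse vector of per-index counts. The following is a minimal plain-Java sketch of that idea, not the Spark implementation. The class name `HashingTFSketch` is hypothetical, the sparse vector is modelled as a sorted map from index to count, and the index is assumed to be the term's `hashCode()` reduced modulo `numFeatures`; the hash function actually used by the library may differ.

```java
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

// Hypothetical stand-in for illustration only; this is not the Spark class.
final class HashingTFSketch {
    private final int numFeatures;

    HashingTFSketch(int numFeatures) {
        if (numFeatures <= 0) {
            throw new IllegalArgumentException("numFeatures must be positive");
        }
        this.numFeatures = numFeatures;
    }

    int numFeatures() {
        return numFeatures;
    }

    // Assumption: the index is the term's hash code reduced to [0, numFeatures).
    // Math.floorMod keeps the result non-negative even for negative hash codes.
    int indexOf(Object term) {
        return Math.floorMod(Objects.hashCode(term), numFeatures);
    }

    // Builds a sparse term-frequency vector as a map from index to count.
    // Terms that hash to the same index share one entry (a collision).
    Map<Integer, Double> transform(Iterable<?> document) {
        Map<Integer, Double> termFrequencies = new TreeMap<>();
        for (Object term : document) {
            termFrequencies.merge(indexOf(term), 1.0, Double::sum);
        }
        return termFrequencies;
    }

    public static void main(String[] args) {
        HashingTFSketch tf = new HashingTFSketch(1 << 20); // 2^20 features
        System.out.println(tf.transform(java.util.Arrays.asList("a", "b", "a")));
    }
}
```

A larger `numFeatures` lowers the chance that two distinct terms collide on the same index, at the cost of a wider (but still sparse) vector.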
public int numFeatures()

public int indexOf(java.lang.Object term)
term - (undocumented)

public Vector transform(scala.collection.Iterable<java.lang.Object> document)
document - (undocumented)

public Vector transform(java.lang.Iterable<?> document)
document - (undocumented)

public <D extends scala.collection.Iterable<java.lang.Object>> RDD<Vector> transform(RDD<D> dataset)
dataset - (undocumented)
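
The dataset overloads can be read as applying the single-document transform to every element of the dataset. The sketch below illustrates that reading in plain Java, using a `List` as a stand-in for an RDD. The names `DatasetTransformSketch` and `transformAll` are hypothetical, and the modulo-of-`hashCode()` indexing is an assumption rather than the library's documented behaviour.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

// Hypothetical illustration; a List stands in for the distributed dataset.
final class DatasetTransformSketch {

    // Single-document transform: counts terms per hashed index.
    // Assumption: index = hashCode reduced to [0, numFeatures).
    static Map<Integer, Double> transform(Iterable<?> document, int numFeatures) {
        Map<Integer, Double> termFrequencies = new TreeMap<>();
        for (Object term : document) {
            int index = Math.floorMod(Objects.hashCode(term), numFeatures);
            termFrequencies.merge(index, 1.0, Double::sum);
        }
        return termFrequencies;
    }

    // Dataset-level transform: one output vector per input document,
    // in the same order as the input.
    static <D extends Iterable<?>> List<Map<Integer, Double>> transformAll(
            List<D> dataset, int numFeatures) {
        List<Map<Integer, Double>> vectors = new ArrayList<>(dataset.size());
        for (D document : dataset) {
            vectors.add(transform(document, numFeatures));
        }
        return vectors;
    }
}
```

Because each document is transformed independently, the work parallelizes naturally across the partitions of a distributed dataset.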