- Accumulable<R,T> - Class in org.apache.spark
-
A data type that can be accumulated, ie has an commutative and associative "add" operation,
but where the result type, R
, may be different from the element type being added, T
.
- Accumulable(R, AccumulableParam<R, T>, Option<String>) - Constructor for class org.apache.spark.Accumulable
-
- Accumulable(R, AccumulableParam<R, T>) - Constructor for class org.apache.spark.Accumulable
-
- accumulable(T, AccumulableParam<T, R>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an
Accumulable
shared variable of the given type, to which tasks
can "add" values with
add
.
- accumulable(T, String, AccumulableParam<T, R>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an
Accumulable
shared variable of the given type, to which tasks
can "add" values with
add
.
- accumulable(T, AccumulableParam<T, R>) - Method in class org.apache.spark.SparkContext
-
Create an
Accumulable
shared variable, to which tasks can add values
with
+=
.
- accumulable(T, String, AccumulableParam<T, R>) - Method in class org.apache.spark.SparkContext
-
Create an
Accumulable
shared variable, with a name for display in the
Spark UI.
- accumulableCollection(R, Function1<R, Growable<T>>, ClassTag<R>) - Method in class org.apache.spark.SparkContext
-
Create an accumulator from a "mutable collection" type.
- AccumulableInfo - Class in org.apache.spark.scheduler
-
:: DeveloperApi ::
Information about an
Accumulable
modified during a task or stage.
- AccumulableInfo(long, String, Option<String>, String) - Constructor for class org.apache.spark.scheduler.AccumulableInfo
-
- AccumulableParam<R,T> - Interface in org.apache.spark
-
Helper object defining how to accumulate values of a particular type.
- accumulables() - Method in class org.apache.spark.scheduler.StageInfo
-
Terminal values of accumulables updated during this stage.
- accumulables() - Method in class org.apache.spark.scheduler.TaskInfo
-
Intermediate updates to accumulables during this task.
- Accumulator<T> - Class in org.apache.spark
-
A simpler value of
Accumulable
where the result type being accumulated is the same
as the types of elements being merged, i.e.
- Accumulator(T, AccumulatorParam<T>, Option<String>) - Constructor for class org.apache.spark.Accumulator
-
- Accumulator(T, AccumulatorParam<T>) - Constructor for class org.apache.spark.Accumulator
-
- accumulator(int) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an
Accumulator
integer variable, which tasks can "add" values
to using the
add
method.
- accumulator(int, String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an
Accumulator
integer variable, which tasks can "add" values
to using the
add
method.
- accumulator(double) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an
Accumulator
double variable, which tasks can "add" values
to using the
add
method.
- accumulator(double, String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an
Accumulator
double variable, which tasks can "add" values
to using the
add
method.
- accumulator(T, AccumulatorParam<T>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an
Accumulator
variable of a given type, which tasks can "add"
values to using the
add
method.
- accumulator(T, String, AccumulatorParam<T>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an
Accumulator
variable of a given type, which tasks can "add"
values to using the
add
method.
- accumulator(T, AccumulatorParam<T>) - Method in class org.apache.spark.SparkContext
-
Create an
Accumulator
variable of a given type, which tasks can "add"
values to using the
+=
method.
- accumulator(T, String, AccumulatorParam<T>) - Method in class org.apache.spark.SparkContext
-
Create an
Accumulator
variable of a given type, with a name for display
in the Spark UI.
- AccumulatorParam<T> - Interface in org.apache.spark
-
A simpler version of
AccumulableParam
where the only data type you can add
in is the same type as the accumulated value.
- active() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
-
- activeStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- actor() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
-
- ActorHelper - Interface in org.apache.spark.streaming.receiver
-
:: DeveloperApi ::
A receiver trait to be mixed in with your Actor to gain access to
the API for pushing received data into Spark Streaming for being processed.
- actorStream(Props, String, StorageLevel, SupervisorStrategy) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream with any arbitrary user implemented actor receiver.
- actorStream(Props, String, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream with any arbitrary user implemented actor receiver.
- actorStream(Props, String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream with any arbitrary user implemented actor receiver.
- actorStream(Props, String, StorageLevel, SupervisorStrategy, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
-
Create an input stream with any arbitrary user implemented actor receiver.
- ActorSupervisorStrategy - Class in org.apache.spark.streaming.receiver
-
:: DeveloperApi ::
A helper with set of defaults for supervisor strategy
- ActorSupervisorStrategy() - Constructor for class org.apache.spark.streaming.receiver.ActorSupervisorStrategy
-
- actorSystem() - Method in class org.apache.spark.SparkEnv
-
- add(T) - Method in class org.apache.spark.Accumulable
-
Add more data to this accumulator / accumulable
- add(Vector) - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
-
Adds a new document.
- add(Vector) - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
-
Add a new sample to this summarizer, and update the statistical summary.
- add(Vector) - Method in class org.apache.spark.util.Vector
-
- addAccumulator(R, T) - Method in interface org.apache.spark.AccumulableParam
-
Add additional data to the accumulator value.
- addAccumulator(T, T) - Method in interface org.apache.spark.AccumulatorParam
-
- addedFiles() - Method in class org.apache.spark.SparkContext
-
- addedJars() - Method in class org.apache.spark.SparkContext
-
- addFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Add a file to be downloaded with this Spark job on every node.
- addFile(String) - Method in class org.apache.spark.SparkContext
-
Add a file to be downloaded with this Spark job on every node.
- addInPlace(R, R) - Method in interface org.apache.spark.AccumulableParam
-
Merge two accumulated values together.
- addInPlace(double, double) - Method in class org.apache.spark.SparkContext.DoubleAccumulatorParam$
-
- addInPlace(float, float) - Method in class org.apache.spark.SparkContext.FloatAccumulatorParam$
-
- addInPlace(int, int) - Method in class org.apache.spark.SparkContext.IntAccumulatorParam$
-
- addInPlace(long, long) - Method in class org.apache.spark.SparkContext.LongAccumulatorParam$
-
- addInPlace(Vector) - Method in class org.apache.spark.util.Vector
-
- addInPlace(Vector, Vector) - Method in class org.apache.spark.util.Vector.VectorAccumParam$
-
- addJar(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Adds a JAR dependency for all tasks to be executed on this SparkContext in the future.
- addJar(String) - Method in class org.apache.spark.SparkContext
-
Adds a JAR dependency for all tasks to be executed on this SparkContext in the future.
- addLocalConfiguration(String, int, int, int, JobConf) - Static method in class org.apache.spark.rdd.HadoopRDD
-
Add Hadoop configuration specific to a single partition and attempt.
- addOnCompleteCallback(Function0<BoxedUnit>) - Method in class org.apache.spark.TaskContext
-
Add a callback function to be executed on task completion.
- addSparkListener(SparkListener) - Method in class org.apache.spark.SparkContext
-
:: DeveloperApi ::
Register a listener to receive up-calls from events that happen during execution.
- addStreamingListener(StreamingListener) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
- addStreamingListener(StreamingListener) - Method in class org.apache.spark.streaming.StreamingContext
-
- addTaskCompletionListener(TaskCompletionListener) - Method in class org.apache.spark.TaskContext
-
Add a (Java friendly) listener to be executed on task completion.
- addTaskCompletionListener(Function1<TaskContext, BoxedUnit>) - Method in class org.apache.spark.TaskContext
-
Add a listener in the form of a Scala closure to be executed on task completion.
- aggregate(U, Function2<U, T, U>, Function2<U, U, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Aggregate the elements of each partition, and then the results for all the partitions, using
given combine functions and a neutral "zero value".
- aggregate(U, Function2<U, T, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Aggregate the elements of each partition, and then the results for all the partitions, using
given combine functions and a neutral "zero value".
- Aggregate - Class in org.apache.spark.sql.execution
-
:: DeveloperApi ::
Groups input data by groupingExpressions
and computes the aggregateExpressions
for each
group.
- Aggregate(boolean, Seq<Expression>, Seq<NamedExpression>, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Aggregate
-
- aggregate() - Method in class org.apache.spark.sql.execution.Aggregate.ComputedAggregate
-
- aggregate(Seq<Expression>) - Method in class org.apache.spark.sql.SchemaRDD
-
Performs an aggregation over all Rows in this RDD.
- Aggregate.ComputedAggregate - Class in org.apache.spark.sql.execution
-
An aggregate that needs to be computed for each row in a group.
- Aggregate.ComputedAggregate(AggregateExpression, AggregateExpression, AttributeReference) - Constructor for class org.apache.spark.sql.execution.Aggregate.ComputedAggregate
-
- Aggregate.ComputedAggregate$ - Class in org.apache.spark.sql.execution
-
- Aggregate.ComputedAggregate$() - Constructor for class org.apache.spark.sql.execution.Aggregate.ComputedAggregate$
-
- aggregateByKey(U, Partitioner, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
- aggregateByKey(U, int, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
- aggregateByKey(U, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
- aggregateByKey(U, Partitioner, Function2<U, V, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
- aggregateByKey(U, int, Function2<U, V, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
- aggregateByKey(U, Function2<U, V, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
- AggregateEvaluation - Class in org.apache.spark.sql.execution
-
- AggregateEvaluation(Seq<Attribute>, Seq<Expression>, Seq<Expression>, Expression) - Constructor for class org.apache.spark.sql.execution.AggregateEvaluation
-
- aggregateExpressions() - Method in class org.apache.spark.sql.execution.Aggregate
-
- aggregateExpressions() - Method in class org.apache.spark.sql.execution.GeneratedAggregate
-
- Aggregator<K,V,C> - Class in org.apache.spark
-
:: DeveloperApi ::
A set of functions used to aggregate data.
- Aggregator(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Constructor for class org.apache.spark.Aggregator
-
- aggregator() - Method in class org.apache.spark.ShuffleDependency
-
- Algo - Class in org.apache.spark.mllib.tree.configuration
-
:: Experimental ::
Enum to select the algorithm for the decision tree
- Algo() - Constructor for class org.apache.spark.mllib.tree.configuration.Algo
-
- algo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- algo() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
- algorithm() - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
-
- ALL_COMPRESSION_CODECS() - Method in interface org.apache.spark.io.CompressionCodec
-
- AlphaComponent - Annotation Type in org.apache.spark.annotation
-
A new component of Spark which may have unstable API's.
- alreadyPlanned() - Method in class org.apache.spark.sql.execution.SparkLogicalPlan
-
- ALS - Class in org.apache.spark.mllib.recommendation
-
Alternating Least Squares matrix factorization.
- ALS() - Constructor for class org.apache.spark.mllib.recommendation.ALS
-
Constructs an ALS instance with default parameters: {numBlocks: -1, rank: 10, iterations: 10,
lambda: 0.01, implicitPrefs: false, alpha: 1.0}.
- ALS.BlockStats - Class in org.apache.spark.mllib.recommendation
-
:: DeveloperApi ::
Statistics of a block in ALS computation.
- ALS.BlockStats(String, int, long, long, long, long) - Constructor for class org.apache.spark.mllib.recommendation.ALS.BlockStats
-
- ALS.BlockStats$ - Class in org.apache.spark.mllib.recommendation
-
- ALS.BlockStats$() - Constructor for class org.apache.spark.mllib.recommendation.ALS.BlockStats$
-
- analyze(String) - Method in class org.apache.spark.sql.hive.HiveContext
-
Analyzes the given table in the current database to generate statistics, which will be
used in query optimizations.
- analyzeBlocks(RDD<Rating>, int, int) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
:: DeveloperApi ::
Given an RDD of ratings, number of user blocks, and number of product blocks, computes the
statistics of each block in ALS computation.
- analyzed() - Method in class org.apache.spark.sql.hive.test.TestHiveContext.QueryExecution
-
- AnalyzeTable - Class in org.apache.spark.sql.hive.execution
-
:: DeveloperApi ::
Analyzes the given table in the current database to generate statistics, which will be
used in query optimizations.
- AnalyzeTable(String) - Constructor for class org.apache.spark.sql.hive.execution.AnalyzeTable
-
- ANY() - Static method in class org.apache.spark.scheduler.TaskLocality
-
- appendBias(Vector) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Returns a new vector with 1.0
(bias) appended to the input vector.
- apply(int) - Method in class org.apache.spark.mllib.linalg.DenseVector
-
- apply(int, int) - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Gets the (i, j)-th element.
- apply(int) - Method in interface org.apache.spark.mllib.linalg.Vector
-
Gets the value of the ith element.
- apply(long, String, Option<String>, String) - Static method in class org.apache.spark.scheduler.AccumulableInfo
-
- apply(long, String, String) - Static method in class org.apache.spark.scheduler.AccumulableInfo
-
- apply(String) - Static method in class org.apache.spark.storage.BlockId
-
Converts a BlockId "name" String back into a BlockId.
- apply(String, String, int, int) - Static method in class org.apache.spark.storage.BlockManagerId
-
- apply(ObjectInput) - Static method in class org.apache.spark.storage.BlockManagerId
-
- apply(boolean, boolean, boolean, boolean, int) - Static method in class org.apache.spark.storage.StorageLevel
-
:: DeveloperApi ::
Create a new StorageLevel object without setting useOffHeap.
- apply(boolean, boolean, boolean, int) - Static method in class org.apache.spark.storage.StorageLevel
-
:: DeveloperApi ::
Create a new StorageLevel object.
- apply(int, int) - Static method in class org.apache.spark.storage.StorageLevel
-
:: DeveloperApi ::
Create a new StorageLevel object from its integer representation.
- apply(ObjectInput) - Static method in class org.apache.spark.storage.StorageLevel
-
:: DeveloperApi ::
Read StorageLevel object from ObjectInput stream.
- apply(long) - Static method in class org.apache.spark.streaming.Milliseconds
-
- apply(long) - Static method in class org.apache.spark.streaming.Minutes
-
- apply(long) - Static method in class org.apache.spark.streaming.Seconds
-
- apply(TraversableOnce<Object>) - Static method in class org.apache.spark.util.StatCounter
-
Build a StatCounter from a list of values.
- apply(Seq<Object>) - Static method in class org.apache.spark.util.StatCounter
-
Build a StatCounter from a list of values passed as variable-length arguments.
- apply(int) - Method in class org.apache.spark.util.Vector
-
- applySchema(JavaRDD<?>, Class<?>) - Method in class org.apache.spark.sql.api.java.JavaSQLContext
-
Applies a schema to an RDD of Java Beans.
- applySchema(JavaRDD<Row>, StructType) - Method in class org.apache.spark.sql.api.java.JavaSQLContext
-
:: DeveloperApi ::
Creates a JavaSchemaRDD from an RDD containing Rows by applying a schema to this RDD.
- applySchema(RDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
-
:: DeveloperApi ::
Creates a
SchemaRDD
from an
RDD
containing
Row
s by applying a schema to this RDD.
- appName() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- appName() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
-
- appName() - Method in class org.apache.spark.SparkContext
-
- ApproxHist() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
-
- areaUnderPR() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Computes the area under the precision-recall curve.
- areaUnderROC() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Computes the area under the receiver operating characteristic (ROC) curve.
- ArrayType - Class in org.apache.spark.sql.api.java
-
The data type representing Lists.
- as(Symbol) - Method in class org.apache.spark.sql.SchemaRDD
-
Applies a qualifier to the attributes of this relation.
- asIterator() - Method in class org.apache.spark.serializer.DeserializationStream
-
Read the elements of this stream through an iterator.
- asRDDId() - Method in class org.apache.spark.storage.BlockId
-
- AsyncRDDActions<T> - Class in org.apache.spark.rdd
-
:: Experimental ::
A set of asynchronous RDD actions available through an implicit conversion.
- AsyncRDDActions(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.AsyncRDDActions
-
- attempt() - Method in class org.apache.spark.scheduler.TaskInfo
-
- attemptId() - Method in class org.apache.spark.scheduler.StageInfo
-
- attemptId() - Method in class org.apache.spark.TaskContext
-
- attributes() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
-
- attributes() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
-
- awaitTermination() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Wait for the execution to stop.
- awaitTermination(long) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Wait for the execution to stop.
- awaitTermination() - Method in class org.apache.spark.streaming.StreamingContext
-
Wait for the execution to stop.
- awaitTermination(long) - Method in class org.apache.spark.streaming.StreamingContext
-
Wait for the execution to stop.
- cache() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Persist this RDD with the default storage level (`MEMORY_ONLY`).
- cache() - Method in class org.apache.spark.api.java.JavaPairRDD
-
Persist this RDD with the default storage level (`MEMORY_ONLY`).
- cache() - Method in class org.apache.spark.api.java.JavaRDD
-
Persist this RDD with the default storage level (`MEMORY_ONLY`).
- cache() - Method in class org.apache.spark.rdd.RDD
-
Persist this RDD with the default storage level (`MEMORY_ONLY`).
- cache() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
-
Persist this RDD with the default storage level (`MEMORY_ONLY`).
- cache() - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
- cache() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
- cache() - Method in class org.apache.spark.streaming.dstream.DStream
-
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
- CacheCommand - Class in org.apache.spark.sql.execution
-
:: DeveloperApi ::
- CacheCommand(String, boolean, SQLContext) - Constructor for class org.apache.spark.sql.execution.CacheCommand
-
- cacheManager() - Method in class org.apache.spark.SparkEnv
-
- cacheTable(String) - Method in class org.apache.spark.sql.SQLContext
-
Caches the specified table in-memory.
- cacheTables() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
-
- calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy
-
:: DeveloperApi ::
information calculation for multiclass classification
- calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy
-
:: DeveloperApi ::
variance calculation
- calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Gini
-
:: DeveloperApi ::
information calculation for multiclass classification
- calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Gini
-
:: DeveloperApi ::
variance calculation
- calculate(double[], double) - Method in interface org.apache.spark.mllib.tree.impurity.Impurity
-
:: DeveloperApi ::
information calculation for multiclass classification
- calculate(double, double, double) - Method in interface org.apache.spark.mllib.tree.impurity.Impurity
-
:: DeveloperApi ::
information calculation for regression
- calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Variance
-
:: DeveloperApi ::
information calculation for multiclass classification
- calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Variance
-
:: DeveloperApi ::
variance calculation
- call(T) - Method in interface org.apache.spark.api.java.function.DoubleFlatMapFunction
-
- call(T) - Method in interface org.apache.spark.api.java.function.DoubleFunction
-
- call(T) - Method in interface org.apache.spark.api.java.function.FlatMapFunction
-
- call(T1, T2) - Method in interface org.apache.spark.api.java.function.FlatMapFunction2
-
- call(T1) - Method in interface org.apache.spark.api.java.function.Function
-
- call(T1, T2) - Method in interface org.apache.spark.api.java.function.Function2
-
- call(T1, T2, T3) - Method in interface org.apache.spark.api.java.function.Function3
-
- call(T) - Method in interface org.apache.spark.api.java.function.PairFlatMapFunction
-
- call(T) - Method in interface org.apache.spark.api.java.function.PairFunction
-
- call(T) - Method in interface org.apache.spark.api.java.function.VoidFunction
-
- call(T1) - Method in interface org.apache.spark.sql.api.java.UDF1
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10) - Method in interface org.apache.spark.sql.api.java.UDF10
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11) - Method in interface org.apache.spark.sql.api.java.UDF11
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12) - Method in interface org.apache.spark.sql.api.java.UDF12
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13) - Method in interface org.apache.spark.sql.api.java.UDF13
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14) - Method in interface org.apache.spark.sql.api.java.UDF14
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15) - Method in interface org.apache.spark.sql.api.java.UDF15
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16) - Method in interface org.apache.spark.sql.api.java.UDF16
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17) - Method in interface org.apache.spark.sql.api.java.UDF17
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18) - Method in interface org.apache.spark.sql.api.java.UDF18
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19) - Method in interface org.apache.spark.sql.api.java.UDF19
-
- call(T1, T2) - Method in interface org.apache.spark.sql.api.java.UDF2
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20) - Method in interface org.apache.spark.sql.api.java.UDF20
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20, T21) - Method in interface org.apache.spark.sql.api.java.UDF21
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20, T21, T22) - Method in interface org.apache.spark.sql.api.java.UDF22
-
- call(T1, T2, T3) - Method in interface org.apache.spark.sql.api.java.UDF3
-
- call(T1, T2, T3, T4) - Method in interface org.apache.spark.sql.api.java.UDF4
-
- call(T1, T2, T3, T4, T5) - Method in interface org.apache.spark.sql.api.java.UDF5
-
- call(T1, T2, T3, T4, T5, T6) - Method in interface org.apache.spark.sql.api.java.UDF6
-
- call(T1, T2, T3, T4, T5, T6, T7) - Method in interface org.apache.spark.sql.api.java.UDF7
-
- call(T1, T2, T3, T4, T5, T6, T7, T8) - Method in interface org.apache.spark.sql.api.java.UDF8
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9) - Method in interface org.apache.spark.sql.api.java.UDF9
-
- cancel() - Method in class org.apache.spark.ComplexFutureAction
-
- cancel() - Method in interface org.apache.spark.FutureAction
-
Cancels the execution of this action.
- cancel() - Method in class org.apache.spark.SimpleFutureAction
-
- cancelAllJobs() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Cancel all jobs that have been scheduled or are running.
- cancelAllJobs() - Method in class org.apache.spark.SparkContext
-
Cancel all jobs that have been scheduled or are running.
- cancelJobGroup(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Cancel active jobs for the specified group.
- cancelJobGroup(String) - Method in class org.apache.spark.SparkContext
-
Cancel active jobs for the specified group.
- cancelled() - Method in class org.apache.spark.ComplexFutureAction
-
Returns whether the promise has been cancelled.
- canEqual(Object) - Method in class org.apache.spark.sql.api.java.Row
-
- canEqual(Object) - Method in class org.apache.spark.util.MutablePair
-
- cartesian(JavaRDDLike<U, ?>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of
elements (a, b) where a is in this
and b is in other
.
- cartesian(RDD<U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of
elements (a, b) where a is in this
and b is in other
.
- CartesianProduct - Class in org.apache.spark.sql.execution
-
:: DeveloperApi ::
- CartesianProduct(SparkPlan, SparkPlan) - Constructor for class org.apache.spark.sql.execution.CartesianProduct
-
- Categorical() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
-
- categoricalFeaturesInfo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- categories() - Method in class org.apache.spark.mllib.tree.model.Split
-
- category() - Method in class org.apache.spark.mllib.recommendation.ALS.BlockStats
-
- checkpoint() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Mark this RDD for checkpointing.
- checkpoint() - Method in class org.apache.spark.rdd.HadoopRDD
-
- checkpoint() - Method in class org.apache.spark.rdd.RDD
-
Mark this RDD for checkpointing.
- checkpoint(Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Enable periodic checkpointing of RDDs of this DStream.
- checkpoint(String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Sets the context to periodically checkpoint the DStream operations for master
fault-tolerance.
- checkpoint(Duration) - Method in class org.apache.spark.streaming.dstream.DStream
-
Enable periodic checkpointing of RDDs of this DStream
- checkpoint(String) - Method in class org.apache.spark.streaming.StreamingContext
-
Set the context to periodically checkpoint the DStream operations for driver
fault-tolerance.
- checkpointData() - Method in class org.apache.spark.rdd.RDD
-
- checkpointData() - Method in class org.apache.spark.streaming.dstream.DStream
-
- checkpointDir() - Method in class org.apache.spark.SparkContext
-
- checkpointDir() - Method in class org.apache.spark.streaming.StreamingContext
-
- checkpointDuration() - Method in class org.apache.spark.streaming.dstream.DStream
-
- checkpointDuration() - Method in class org.apache.spark.streaming.StreamingContext
-
- child() - Method in class org.apache.spark.sql.execution.Aggregate
-
- child() - Method in class org.apache.spark.sql.execution.BatchPythonEvaluation
-
- child() - Method in class org.apache.spark.sql.execution.DescribeCommand
-
- child() - Method in class org.apache.spark.sql.execution.Distinct
-
- child() - Method in class org.apache.spark.sql.execution.EvaluatePython
-
- child() - Method in class org.apache.spark.sql.execution.Exchange
-
- child() - Method in class org.apache.spark.sql.execution.Filter
-
- child() - Method in class org.apache.spark.sql.execution.Generate
-
- child() - Method in class org.apache.spark.sql.execution.GeneratedAggregate
-
- child() - Method in class org.apache.spark.sql.execution.Limit
-
- child() - Method in class org.apache.spark.sql.execution.OutputFaker
-
- child() - Method in class org.apache.spark.sql.execution.Project
-
- child() - Method in class org.apache.spark.sql.execution.Sample
-
- child() - Method in class org.apache.spark.sql.execution.Sort
-
- child() - Method in class org.apache.spark.sql.execution.TakeOrdered
-
- child() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
-
- child() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
-
- child() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable
-
- children() - Method in class org.apache.spark.sql.execution.BatchPythonEvaluation
-
- children() - Method in class org.apache.spark.sql.execution.OutputFaker
-
- children() - Method in class org.apache.spark.sql.execution.SparkLogicalPlan
-
- children() - Method in class org.apache.spark.sql.execution.Union
-
- chiSqTest(Vector, Vector) - Static method in class org.apache.spark.mllib.stat.Statistics
-
:: Experimental ::
Conduct Pearson's chi-squared goodness of fit test of the observed data against the
expected distribution.
- chiSqTest(Vector) - Static method in class org.apache.spark.mllib.stat.Statistics
-
:: Experimental ::
Conduct Pearson's chi-squared goodness of fit test of the observed data against the uniform
distribution, with each category having an expected frequency of 1 / observed.size
.
- chiSqTest(Matrix) - Static method in class org.apache.spark.mllib.stat.Statistics
-
:: Experimental ::
Conduct Pearson's independence test on the input contingency matrix, which cannot contain
negative entries or columns or rows that sum up to 0.
- chiSqTest(RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.stat.Statistics
-
:: Experimental ::
Conduct Pearson's independence test for every feature against the label across the input RDD.
- ChiSqTestResult - Class in org.apache.spark.mllib.stat.test
-
:: Experimental ::
Object containing the test results for the chi-squared hypothesis test.
- Classification() - Static method in class org.apache.spark.mllib.tree.configuration.Algo
-
- ClassificationModel - Interface in org.apache.spark.mllib.classification
-
:: Experimental ::
Represents a classification model that predicts to which of a set of categories an example
belongs.
- className() - Method in class org.apache.spark.ExceptionFailure
-
- classpathEntries() - Method in class org.apache.spark.ui.env.EnvironmentListener
-
- classTag() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
- classTag() - Method in class org.apache.spark.api.java.JavaPairRDD
-
- classTag() - Method in class org.apache.spark.api.java.JavaRDD
-
- classTag() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- classTag() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
-
- classTag() - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
- classTag() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
- classTag() - Method in class org.apache.spark.streaming.api.java.JavaInputDStream
-
- classTag() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
- classTag() - Method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
-
- cleaner() - Method in class org.apache.spark.SparkContext
-
- clearCallSite() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Pass-through to SparkContext.setCallSite.
- clearCallSite() - Method in class org.apache.spark.SparkContext
-
Clear the thread-local property for overriding the call sites
of actions and RDDs.
- clearDependencies() - Method in class org.apache.spark.rdd.CoGroupedRDD
-
- clearDependencies() - Method in class org.apache.spark.rdd.ShuffledRDD
-
- clearDependencies() - Method in class org.apache.spark.rdd.UnionRDD
-
- clearFiles() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Clear the job's list of files added by addFile
so that they do not get downloaded to
any new nodes.
- clearFiles() - Method in class org.apache.spark.SparkContext
-
Clear the job's list of files added by addFile
so that they do not get downloaded to
any new nodes.
- clearJars() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Clear the job's list of JARs added by addJar
so that they do not get downloaded to
any new nodes.
- clearJars() - Method in class org.apache.spark.SparkContext
-
Clear the job's list of JARs added by addJar
so that they do not get downloaded to
any new nodes.
- clearJobGroup() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Clear the current thread's job group ID and its description.
- clearJobGroup() - Method in class org.apache.spark.SparkContext
-
Clear the current thread's job group ID and its description.
- clearThreshold() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
-
:: Experimental ::
Clears the threshold so that predict
will output raw prediction scores.
- clearThreshold() - Method in class org.apache.spark.mllib.classification.SVMModel
-
:: Experimental ::
Clears the threshold so that predict
will output raw prediction scores.
- clone() - Method in class org.apache.spark.SparkConf
-
Copy this object
- clone() - Method in class org.apache.spark.storage.StorageLevel
-
- clone() - Method in class org.apache.spark.util.random.BernoulliSampler
-
- clone() - Method in class org.apache.spark.util.random.PoissonSampler
-
- clone() - Method in interface org.apache.spark.util.random.RandomSampler
-
- cloneComplement() - Method in class org.apache.spark.util.random.BernoulliSampler
-
Return a sampler that is the complement of the range specified of the current sampler.
- close() - Method in class org.apache.spark.serializer.DeserializationStream
-
- close() - Method in class org.apache.spark.serializer.SerializationStream
-
- closureSerializer() - Method in class org.apache.spark.SparkEnv
-
- clusterCenters() - Method in class org.apache.spark.mllib.clustering.KMeansModel
-
- coalesce(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a new RDD that is reduced into numPartitions
partitions.
- coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a new RDD that is reduced into numPartitions
partitions.
- coalesce(int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a new RDD that is reduced into numPartitions
partitions.
- coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a new RDD that is reduced into numPartitions
partitions.
- coalesce(int) - Method in class org.apache.spark.api.java.JavaRDD
-
Return a new RDD that is reduced into numPartitions
partitions.
- coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaRDD
-
Return a new RDD that is reduced into numPartitions
partitions.
- coalesce(int, boolean, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD that is reduced into numPartitions
partitions.
- coalesce(int, boolean) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
-
Return a new RDD that is reduced into numPartitions
partitions.
- coalesce(int, boolean, Ordering<Row>) - Method in class org.apache.spark.sql.SchemaRDD
-
- codegenEnabled() - Method in class org.apache.spark.sql.execution.SparkPlan
-
- cogroup(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this
or other
, return a resulting RDD that contains a tuple with the
list of values for that key in this
as well as other
.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this
or other1
or other2
, return a resulting RDD that contains a
tuple with the list of values for that key in this
, other1
and other2
.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this
or other1
or other2
or other3
,
return a resulting RDD that contains a tuple with the list of values
for that key in this
, other1
, other2
and other3
.
- cogroup(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this
or other
, return a resulting RDD that contains a tuple with the
list of values for that key in this
as well as other
.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this
or other1
or other2
, return a resulting RDD that contains a
tuple with the list of values for that key in this
, other1
and other2
.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this
or other1
or other2
or other3
,
return a resulting RDD that contains a tuple with the list of values
for that key in this
, other1
, other2
and other3
.
- cogroup(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this
or other
, return a resulting RDD that contains a tuple with the
list of values for that key in this
as well as other
.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this
or other1
or other2
, return a resulting RDD that contains a
tuple with the list of values for that key in this
, other1
and other2
.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this
or other1
or other2
or other3
,
return a resulting RDD that contains a tuple with the list of values
for that key in this
, other1
, other2
and other3
.
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in this
or other1
or other2
or other3
,
return a resulting RDD that contains a tuple with the list of values
for that key in this
, other1
, other2
and other3
.
- cogroup(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
- cogroup(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in this
or other
, return a resulting RDD that contains a tuple with the
list of values for that key in this
as well as other
.
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in this
or other1
or other2
, return a resulting RDD that contains a
tuple with the list of values for that key in this
, other1
and other2
.
- cogroup(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in this
or other
, return a resulting RDD that contains a tuple with the
list of values for that key in this
as well as other
.
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in this
or other1
or other2
, return a resulting RDD that contains a
tuple with the list of values for that key in this
, other1
and other2
.
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in this
or other1
or other2
or other3
,
return a resulting RDD that contains a tuple with the list of values
for that key in this
, other1
, other2
and other3
.
- cogroup(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'cogroup' between RDDs of this
DStream and other
DStream.
- cogroup(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'cogroup' between RDDs of this
DStream and other
DStream.
- cogroup(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'cogroup' between RDDs of this
DStream and other
DStream.
- cogroup(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'cogroup' between RDDs of this
DStream and other
DStream.
- cogroup(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'cogroup' between RDDs of this
DStream and other
DStream.
- cogroup(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'cogroup' between RDDs of this
DStream and other
DStream.
- CoGroupedRDD<K> - Class in org.apache.spark.rdd
-
:: DeveloperApi ::
A RDD that cogroups its parents.
- CoGroupedRDD(Seq<RDD<? extends Product2<K, ?>>>, Partitioner) - Constructor for class org.apache.spark.rdd.CoGroupedRDD
-
- collect() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an array that contains all of the elements in this RDD.
- collect() - Method in class org.apache.spark.rdd.RDD
-
Return an array that contains all of the elements in this RDD.
- collect(PartialFunction<T, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD that contains all matching values by applying f
.
- collect() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
-
- collect() - Method in class org.apache.spark.sql.SchemaRDD
-
- collectAsMap() - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return the key-value pairs in this RDD to the master as a Map.
- collectAsMap() - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return the key-value pairs in this RDD to the master as a Map.
- collectAsync() - Method in class org.apache.spark.rdd.AsyncRDDActions
-
Returns a future for retrieving all elements of this RDD.
- collectPartitions(int[]) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an array that contains all of the elements in a specific partition of this RDD.
- colStats(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.Statistics
-
:: Experimental ::
Computes column-wise summary statistics for the input RDD[Vector].
- columnPruningPred() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
-
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Generic function to combine the elements for each key using a custom set of aggregation
functions.
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Simplified version of combineByKey that hash-partitions the output RDD.
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Simplified version of combineByKey that hash-partitions the resulting RDD using the existing
partitioner/parallelism level.
- combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, Serializer) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Generic function to combine the elements for each key using a custom set of aggregation
functions.
- combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Simplified version of combineByKey that hash-partitions the output RDD.
- combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Simplified version of combineByKey that hash-partitions the resulting RDD using the
existing partitioner/parallelism level.
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Combine elements of each key in DStream's RDDs using custom function.
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Combine elements of each key in DStream's RDDs using custom function.
- combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, ClassTag<C>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Combine elements of each key in DStream's RDDs using custom functions.
- combineCombinersByKey(Iterator<Product2<K, C>>) - Method in class org.apache.spark.Aggregator
-
- combineCombinersByKey(Iterator<Product2<K, C>>, TaskContext) - Method in class org.apache.spark.Aggregator
-
- combineValuesByKey(Iterator<Product2<K, V>>) - Method in class org.apache.spark.Aggregator
-
- combineValuesByKey(Iterator<Product2<K, V>>, TaskContext) - Method in class org.apache.spark.Aggregator
-
- Command - Interface in org.apache.spark.sql.execution
-
- commands() - Method in class org.apache.spark.sql.hive.test.TestHiveContext.TestTable
-
- compare(RDDInfo) - Method in class org.apache.spark.storage.RDDInfo
-
- completedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- completionTime() - Method in class org.apache.spark.scheduler.StageInfo
-
Time when all tasks in the stage completed or when the stage was cancelled.
- ComplexFutureAction<T> - Class in org.apache.spark
-
:: Experimental ::
A
FutureAction
for actions that could trigger multiple Spark jobs.
- ComplexFutureAction() - Constructor for class org.apache.spark.ComplexFutureAction
-
- compressedInputStream(InputStream) - Method in interface org.apache.spark.io.CompressionCodec
-
- compressedInputStream(InputStream) - Method in class org.apache.spark.io.LZ4CompressionCodec
-
- compressedInputStream(InputStream) - Method in class org.apache.spark.io.LZFCompressionCodec
-
- compressedInputStream(InputStream) - Method in class org.apache.spark.io.SnappyCompressionCodec
-
- compressedOutputStream(OutputStream) - Method in interface org.apache.spark.io.CompressionCodec
-
- compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.LZ4CompressionCodec
-
- compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.LZFCompressionCodec
-
- compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.SnappyCompressionCodec
-
- CompressionCodec - Interface in org.apache.spark.io
-
:: DeveloperApi ::
CompressionCodec allows the customization of choosing different compression implementations
to be used in block storage.
- compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.Gradient
-
Compute the gradient and loss given the features of a single data point.
- compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.Gradient
-
Compute the gradient and loss given the features of a single data point,
add the gradient to a provided vector to avoid creating new objects, and return loss.
- compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.HingeGradient
-
- compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.HingeGradient
-
- compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.L1Updater
-
- compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.LeastSquaresGradient
-
- compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.LeastSquaresGradient
-
- compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.LogisticGradient
-
- compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.LogisticGradient
-
- compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.SimpleUpdater
-
- compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.SquaredL2Updater
-
- compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.Updater
-
Compute an updated value for weights given the gradient, stepSize, iteration number and
regularization parameter.
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.CoGroupedRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.HadoopRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.JdbcRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.NewHadoopRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.PartitionPruningRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.RDD
-
:: DeveloperApi ::
Implemented by subclasses to compute a given partition.
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ShuffledRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.UnionRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.sql.SchemaRDD
-
- compute(Time) - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
Generate an RDD for the given duration
- compute(Time) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Method that generates a RDD for the given Duration
- compute(Time) - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.DStream
-
Method that generates a RDD for the given time
- compute(Time) - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
-
Ask ReceiverInputTracker for received data blocks and generates RDDs with them.
- computeColumnSummaryStatistics() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Computes column-wise summary statistics.
- computeCost(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel
-
Return the K-means cost (sum of squared distances of points to their nearest center) for this
model on the given data.
- computeCovariance() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Computes the covariance matrix, treating each row as an observation.
- computeGramianMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
Computes the Gramian matrix A^T A
.
- computeGramianMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Computes the Gramian matrix A^T A
.
- computePreferredLocations(Seq<InputFormatInfo>) - Static method in class org.apache.spark.scheduler.InputFormatInfo
-
Computes the preferred locations based on input(s) and returned a location to block map.
- computePrincipalComponents(int) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Computes the top k principal components.
- computeSVD(int, boolean, double) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
Computes the singular value decomposition of this matrix.
- computeSVD(int, boolean, double) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Computes singular value decomposition of this matrix.
- condition() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
-
- condition() - Method in class org.apache.spark.sql.execution.Filter
-
- condition() - Method in class org.apache.spark.sql.execution.HashOuterJoin
-
- condition() - Method in class org.apache.spark.sql.execution.LeftSemiJoinBNL
-
- conditionEvaluator() - Method in class org.apache.spark.sql.execution.Filter
-
- conf() - Method in class org.apache.spark.SparkContext
-
- conf() - Method in class org.apache.spark.SparkEnv
-
- conf() - Method in class org.apache.spark.streaming.StreamingContext
-
- confidence() - Method in class org.apache.spark.partial.BoundedDouble
-
- configuration() - Method in class org.apache.spark.scheduler.InputFormatInfo
-
- CONFIGURATION_INSTANTIATION_LOCK() - Static method in class org.apache.spark.rdd.HadoopRDD
-
Configuration's constructor is not threadsafe (see SPARK-1097 and HADOOP-10456).
- confusionMatrix() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns confusion matrix:
predicted classes are in columns,
they are ordered by class label ascending,
as in "labels"
- connectionManager() - Method in class org.apache.spark.SparkEnv
-
- ConstantInputDStream<T> - Class in org.apache.spark.streaming.dstream
-
An input stream that always returns the same RDD on each timestep.
- ConstantInputDStream(StreamingContext, RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.ConstantInputDStream
-
- contains(String) - Method in class org.apache.spark.SparkConf
-
Does the configuration contain a given parameter?
- containsBlock(BlockId) - Method in class org.apache.spark.storage.StorageStatus
-
Return whether the given block is stored in this block manager in O(1) time.
- containsCachedMetadata(String) - Static method in class org.apache.spark.rdd.HadoopRDD
-
- context() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- context() - Method in class org.apache.spark.InterruptibleIterator
-
- context() - Method in class org.apache.spark.rdd.RDD
-
- context() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
-
- context() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
- context() - Method in class org.apache.spark.streaming.dstream.DStream
-
Return the StreamingContext associated with this DStream
- Continuous() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
-
- CoordinateMatrix - Class in org.apache.spark.mllib.linalg.distributed
-
:: Experimental ::
Represents a matrix in coordinate format.
- CoordinateMatrix(RDD<MatrixEntry>, long, long) - Constructor for class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-
- CoordinateMatrix(RDD<MatrixEntry>) - Constructor for class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-
Alternative constructor leaving matrix dimensions to be determined automatically.
- copy() - Method in class org.apache.spark.mllib.linalg.DenseVector
-
- copy() - Method in class org.apache.spark.mllib.linalg.SparseVector
-
- copy() - Method in interface org.apache.spark.mllib.linalg.Vector
-
Makes a deep copy of this vector.
- copy() - Method in class org.apache.spark.mllib.random.PoissonGenerator
-
- copy() - Method in interface org.apache.spark.mllib.random.RandomDataGenerator
-
Returns a copy of the RandomDataGenerator with a new instance of the rng object used in the
class when applicable for non-locking concurrent usage.
- copy() - Method in class org.apache.spark.mllib.random.StandardNormalGenerator
-
- copy() - Method in class org.apache.spark.mllib.random.UniformGenerator
-
- copy() - Method in class org.apache.spark.util.StatCounter
-
Clone this StatCounter
- corr(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.Statistics
-
:: Experimental ::
Compute the Pearson correlation matrix for the input RDD of Vectors.
- corr(RDD<Vector>, String) - Static method in class org.apache.spark.mllib.stat.Statistics
-
:: Experimental ::
Compute the correlation matrix for the input RDD of Vectors using the specified method.
- corr(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.Statistics
-
:: Experimental ::
Compute the Pearson correlation for the input RDDs.
- corr(RDD<Object>, RDD<Object>, String) - Static method in class org.apache.spark.mllib.stat.Statistics
-
:: Experimental ::
Compute the correlation for the input RDDs using the specified method.
- count() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return the number of elements in the RDD.
- count() - Method in class org.apache.spark.mllib.recommendation.ALS.BlockStats
-
- count() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
-
- count() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
-
Sample size.
- count() - Method in class org.apache.spark.rdd.RDD
-
Return the number of elements in the RDD.
- count() - Method in class org.apache.spark.sql.SchemaRDD
-
:: Experimental ::
Return the number of elements in the RDD.
- count() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD has a single element generated by counting each RDD
of this DStream.
- count() - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD has a single element generated by counting each RDD
of this DStream.
- count() - Method in class org.apache.spark.util.StatCounter
-
- countApprox(long, double) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
:: Experimental ::
Approximate version of count() that returns a potentially incomplete result
within a timeout, even if not all tasks have finished.
- countApprox(long) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
:: Experimental ::
Approximate version of count() that returns a potentially incomplete result
within a timeout, even if not all tasks have finished.
- countApprox(long, double) - Method in class org.apache.spark.rdd.RDD
-
:: Experimental ::
Approximate version of count() that returns a potentially incomplete result
within a timeout, even if not all tasks have finished.
- countApproxDistinct(double) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return approximate number of distinct elements in the RDD.
- countApproxDistinct(int, int) - Method in class org.apache.spark.rdd.RDD
-
:: Experimental ::
Return approximate number of distinct elements in the RDD.
- countApproxDistinct(double) - Method in class org.apache.spark.rdd.RDD
-
Return approximate number of distinct elements in the RDD.
- countApproxDistinctByKey(double, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return approximate number of distinct values for each key in this RDD.
- countApproxDistinctByKey(double, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return approximate number of distinct values for each key in this RDD.
- countApproxDistinctByKey(double) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return approximate number of distinct values for each key in this RDD.
- countApproxDistinctByKey(int, int, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
:: Experimental ::
- countApproxDistinctByKey(double, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return approximate number of distinct values for each key in this RDD.
- countApproxDistinctByKey(double, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return approximate number of distinct values for each key in this RDD.
- countApproxDistinctByKey(double) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return approximate number of distinct values for each key in this RDD.
- countAsync() - Method in class org.apache.spark.rdd.AsyncRDDActions
-
Returns a future for counting the number of elements in the RDD.
- countByKey() - Method in class org.apache.spark.api.java.JavaPairRDD
-
Count the number of elements for each key, and return the result to the master as a Map.
- countByKey() - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Count the number of elements for each key, and return the result to the master as a Map.
- countByKeyApprox(long) - Method in class org.apache.spark.api.java.JavaPairRDD
-
:: Experimental ::
Approximate version of countByKey that can return a partial result if it does
not finish within a timeout.
- countByKeyApprox(long, double) - Method in class org.apache.spark.api.java.JavaPairRDD
-
:: Experimental ::
Approximate version of countByKey that can return a partial result if it does
not finish within a timeout.
- countByKeyApprox(long, double) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
:: Experimental ::
Approximate version of countByKey that can return a partial result if it does
not finish within a timeout.
- countByValue() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return the count of each unique value in this RDD as a map of (value, count) pairs.
- countByValue(Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Return the count of each unique value in this RDD as a map of (value, count) pairs.
- countByValue() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD contains the counts of each distinct value in
each RDD of this DStream.
- countByValue(int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD contains the counts of each distinct value in
each RDD of this DStream.
- countByValue(int, Ordering<T>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD contains the counts of each distinct value in
each RDD of this DStream.
- countByValueAndWindow(Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD contains the count of distinct elements in
RDDs in a sliding window over this DStream.
- countByValueAndWindow(Duration, Duration, int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD contains the count of distinct elements in
RDDs in a sliding window over this DStream.
- countByValueAndWindow(Duration, Duration, int, Ordering<T>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD contains the count of distinct elements in
RDDs in a sliding window over this DStream.
- countByValueApprox(long, double) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
(Experimental) Approximate version of countByValue().
- countByValueApprox(long) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
(Experimental) Approximate version of countByValue().
- countByValueApprox(long, double, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
:: Experimental ::
Approximate version of countByValue().
- countByWindow(Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD has a single element generated by counting the number
of elements in a window over this DStream.
- countByWindow(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD has a single element generated by counting the number
of elements in a sliding window over this DStream.
- create(boolean, boolean, boolean, int) - Static method in class org.apache.spark.api.java.StorageLevels
-
Deprecated.
- create(boolean, boolean, boolean, boolean, int) - Static method in class org.apache.spark.api.java.StorageLevels
-
Create a new StorageLevel object.
- create(RDD<T>, Function1<Object, Object>) - Static method in class org.apache.spark.rdd.PartitionPruningRDD
-
Create a PartitionPruningRDD.
- create(Object...) - Static method in class org.apache.spark.sql.api.java.Row
-
Creates a Row with the given values.
- create(Seq<Object>) - Static method in class org.apache.spark.sql.api.java.Row
-
Creates a Row with the given values.
- create() - Method in interface org.apache.spark.streaming.api.java.JavaStreamingContextFactory
-
- createArrayType(DataType) - Static method in class org.apache.spark.sql.api.java.DataType
-
Creates an ArrayType by specifying the data type of elements (elementType
).
- createArrayType(DataType, boolean) - Static method in class org.apache.spark.sql.api.java.DataType
-
Creates an ArrayType by specifying the data type of elements (elementType
) and
whether the array contains null values (containsNull
).
- createCodec(SparkConf) - Method in interface org.apache.spark.io.CompressionCodec
-
- createCodec(SparkConf, String) - Method in interface org.apache.spark.io.CompressionCodec
-
- createCombiner() - Method in class org.apache.spark.Aggregator
-
- createMapType(DataType, DataType) - Static method in class org.apache.spark.sql.api.java.DataType
-
Creates a MapType by specifying the data type of keys (keyType
) and values
(keyType
).
- createMapType(DataType, DataType, boolean) - Static method in class org.apache.spark.sql.api.java.DataType
-
Creates a MapType by specifying the data type of keys (keyType
), the data type of
values (keyType
), and whether values contain any null value
(valueContainsNull
).
- createParquetFile(Class<?>, String, boolean, Configuration) - Method in class org.apache.spark.sql.api.java.JavaSQLContext
-
:: Experimental ::
Creates an empty parquet file with the schema of class beanClass
, which can be registered as
a table.
- createParquetFile(String, boolean, Configuration, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext
-
:: Experimental ::
Creates an empty parquet file with the schema of class A
, which can be registered as a table.
- createPollingStream(StreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
- createPollingStream(StreamingContext, Seq<InetSocketAddress>, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
- createPollingStream(StreamingContext, Seq<InetSocketAddress>, StorageLevel, int, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
- createPollingStream(JavaStreamingContext, String, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
- createPollingStream(JavaStreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
- createPollingStream(JavaStreamingContext, InetSocketAddress[], StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
- createPollingStream(JavaStreamingContext, InetSocketAddress[], StorageLevel, int, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
- createSchemaRDD(RDD<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext
-
Creates a SchemaRDD from an RDD of case classes.
- createStream(StreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Create a input stream from a Flume source.
- createStream(StreamingContext, String, int, StorageLevel, boolean) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Create a input stream from a Flume source.
- createStream(JavaStreamingContext, String, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates a input stream from a Flume source.
- createStream(JavaStreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates a input stream from a Flume source.
- createStream(JavaStreamingContext, String, int, StorageLevel, boolean) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates a input stream from a Flume source.
- createStream(StreamingContext, String, String, Map<String, Object>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
Create an input stream that pulls messages from a Kafka Broker.
- createStream(StreamingContext, Map<String, String>, Map<String, Object>, StorageLevel, ClassTag<K>, ClassTag<V>, ClassTag<U>, ClassTag<T>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
Create an input stream that pulls messages from a Kafka Broker.
- createStream(JavaStreamingContext, String, String, Map<String, Integer>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
Create an input stream that pulls messages form a Kafka Broker.
- createStream(JavaStreamingContext, String, String, Map<String, Integer>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
Create an input stream that pulls messages form a Kafka Broker.
- createStream(JavaStreamingContext, Class<K>, Class<V>, Class<U>, Class<T>, Map<String, String>, Map<String, Integer>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
Create an input stream that pulls messages form a Kafka Broker.
- createStream(StreamingContext, String, String, Duration, InitialPositionInStream, StorageLevel) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
-
Create an InputDStream that pulls messages from a Kinesis stream.
- createStream(JavaStreamingContext, String, String, Duration, InitialPositionInStream, StorageLevel) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
-
Create a Java-friendly InputDStream that pulls messages from a Kinesis stream.
- createStream(StreamingContext, String, String, StorageLevel) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils
-
Create an input stream that receives messages pushed by a MQTT publisher.
- createStream(JavaStreamingContext, String, String) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils
-
Create an input stream that receives messages pushed by a MQTT publisher.
- createStream(JavaStreamingContext, String, String, StorageLevel) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils
-
Create an input stream that receives messages pushed by a MQTT publisher.
- createStream(StreamingContext, Option<Authorization>, Seq<String>, StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create a input stream that returns tweets received from Twitter.
- createStream(JavaStreamingContext) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create a input stream that returns tweets received from Twitter using Twitter4J's default
OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey,
twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and
twitter4j.oauth.accessTokenSecret.
- createStream(JavaStreamingContext, String[]) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create a input stream that returns tweets received from Twitter using Twitter4J's default
OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey,
twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and
twitter4j.oauth.accessTokenSecret.
- createStream(JavaStreamingContext, String[], StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create a input stream that returns tweets received from Twitter using Twitter4J's default
OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey,
twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and
twitter4j.oauth.accessTokenSecret.
- createStream(JavaStreamingContext, Authorization) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create a input stream that returns tweets received from Twitter.
- createStream(JavaStreamingContext, Authorization, String[]) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create a input stream that returns tweets received from Twitter.
- createStream(JavaStreamingContext, Authorization, String[], StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create a input stream that returns tweets received from Twitter.
- createStream(StreamingContext, String, Subscribe, Function1<Seq<ByteString>, Iterator<T>>, StorageLevel, SupervisorStrategy, ClassTag<T>) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
-
Create an input stream that receives messages pushed by a zeromq publisher.
- createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>, StorageLevel, SupervisorStrategy) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
-
Create an input stream that receives messages pushed by a zeromq publisher.
- createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>, StorageLevel) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
-
Create an input stream that receives messages pushed by a zeromq publisher.
- createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
-
Create an input stream that receives messages pushed by a zeromq publisher.
- createStructField(String, DataType, boolean) - Static method in class org.apache.spark.sql.api.java.DataType
-
Creates a StructField by specifying the name (name
), data type (dataType
) and
whether values of this field can be null values (nullable
).
- createStructType(List<StructField>) - Static method in class org.apache.spark.sql.api.java.DataType
-
Creates a StructType with the given list of StructFields (fields
).
- createStructType(StructField[]) - Static method in class org.apache.spark.sql.api.java.DataType
-
Creates a StructType with the given StructField array (fields
).
- createTable(String, boolean, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.hive.HiveContext
-
Creates a table using the schema of the given class.
- creationSite() - Method in class org.apache.spark.rdd.RDD
-
User code that created this RDD (e.g.
- creationSite() - Method in class org.apache.spark.streaming.dstream.DStream
-
- failed() - Method in class org.apache.spark.scheduler.TaskInfo
-
- failedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- failureReason() - Method in class org.apache.spark.scheduler.StageInfo
-
If the stage failed, the reason why.
- FAIR() - Static method in class org.apache.spark.scheduler.SchedulingMode
-
- FakeParquetSerDe - Class in org.apache.spark.sql.hive.parquet
-
A placeholder that allows SparkSQL users to create metastore tables that are stored as
parquet files.
- FakeParquetSerDe() - Constructor for class org.apache.spark.sql.hive.parquet.FakeParquetSerDe
-
- falsePositiveRate(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns false positive rate for a given label (category)
- feature() - Method in class org.apache.spark.mllib.tree.model.Split
-
- features() - Method in class org.apache.spark.mllib.regression.LabeledPoint
-
- FeatureType - Class in org.apache.spark.mllib.tree.configuration
-
:: Experimental ::
Enum to describe whether a feature is "continuous" or "categorical"
- FeatureType() - Constructor for class org.apache.spark.mllib.tree.configuration.FeatureType
-
- featureType() - Method in class org.apache.spark.mllib.tree.model.Split
-
- FetchFailed - Class in org.apache.spark
-
:: DeveloperApi ::
Task failed to fetch shuffle data from a remote node.
- FetchFailed(BlockManagerId, int, int, int) - Constructor for class org.apache.spark.FetchFailed
-
- field() - Method in class org.apache.spark.storage.BroadcastBlockId
-
- FIFO() - Static method in class org.apache.spark.scheduler.SchedulingMode
-
- files() - Method in class org.apache.spark.SparkContext
-
- fileStream(String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream that monitors a Hadoop-compatible filesystem
for new files and reads them using the given key-value types and input format.
- fileStream(String, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.streaming.StreamingContext
-
Create a input stream that monitors a Hadoop-compatible filesystem
for new files and reads them using the given key-value types and input format.
- fileStream(String, Function1<Path, Object>, boolean, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.streaming.StreamingContext
-
Create a input stream that monitors a Hadoop-compatible filesystem
for new files and reads them using the given key-value types and input format.
- filter(Function<Double, Boolean>) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a new RDD containing only the elements that satisfy a predicate.
- filter(Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a new RDD containing only the elements that satisfy a predicate.
- filter(Function<T, Boolean>) - Method in class org.apache.spark.api.java.JavaRDD
-
Return a new RDD containing only the elements that satisfy a predicate.
- filter(Function1<T, Object>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD containing only the elements that satisfy a predicate.
- filter(Function<Row, Boolean>) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
-
Return a new RDD containing only the elements that satisfy a predicate.
- Filter - Class in org.apache.spark.sql.execution
-
:: DeveloperApi ::
- Filter(Expression, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Filter
-
- filter(Function1<Row, Object>) - Method in class org.apache.spark.sql.SchemaRDD
-
- filter(Function<T, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
Return a new DStream containing only the elements that satisfy a predicate.
- filter(Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream containing only the elements that satisfy a predicate.
- filter(Function1<T, Object>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream containing only the elements that satisfy a predicate.
- filterWith(Function1<Object, A>, Function2<T, A, Object>) - Method in class org.apache.spark.rdd.RDD
-
Filters this RDD with p, where p takes an additional parameter of type A.
- findSynonyms(String, int) - Method in class org.apache.spark.mllib.feature.Word2VecModel
-
Find synonyms of a word
- findSynonyms(Vector, int) - Method in class org.apache.spark.mllib.feature.Word2VecModel
-
Find synonyms of the vector representation of a word
- finished() - Method in class org.apache.spark.scheduler.TaskInfo
-
- finishTime() - Method in class org.apache.spark.scheduler.TaskInfo
-
The time when the task has completed successfully (including the time to remotely fetch
results, if necessary).
- first() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
- first() - Method in class org.apache.spark.api.java.JavaPairRDD
-
- first() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return the first element in this RDD.
- first() - Method in class org.apache.spark.rdd.RDD
-
Return the first element in this RDD.
- fit(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDF
-
Computes the inverse document frequency.
- fit(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDF
-
Computes the inverse document frequency.
- fit(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.StandardScaler
-
Computes the mean and variance and stores as a model to be used for later scaling.
- fit(RDD<S>) - Method in class org.apache.spark.mllib.feature.Word2Vec
-
Computes the vector representation of each word in vocabulary.
- fit(JavaRDD<S>) - Method in class org.apache.spark.mllib.feature.Word2Vec
-
Computes the vector representation of each word in vocabulary (Java version).
- flatMap(FlatMapFunction<T, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by first applying a function to all elements of this
RDD, and then flattening the results.
- flatMap(Function1<T, TraversableOnce<U>>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD by first applying a function to all elements of this
RDD, and then flattening the results.
- flatMap(FlatMapFunction<T, U>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream by applying a function to all elements of this DStream,
and then flattening the results
- flatMap(Function1<T, Traversable<U>>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream by applying a function to all elements of this DStream,
and then flattening the results
- FlatMapFunction<T,R> - Interface in org.apache.spark.api.java.function
-
A function that returns zero or more output records from each input record.
- FlatMapFunction2<T1,T2,R> - Interface in org.apache.spark.api.java.function
-
A function that takes two inputs and returns zero or more output records.
- flatMapToDouble(DoubleFlatMapFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by first applying a function to all elements of this
RDD, and then flattening the results.
- flatMapToPair(PairFlatMapFunction<T, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by first applying a function to all elements of this
RDD, and then flattening the results.
- flatMapToPair(PairFlatMapFunction<T, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream by applying a function to all elements of this DStream,
and then flattening the results
- flatMapValues(Function<V, Iterable<U>>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Pass each value in the key-value pair RDD through a flatMap function without changing the
keys; this also retains the original RDD's partitioning.
- flatMapValues(Function1<V, TraversableOnce<U>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Pass each value in the key-value pair RDD through a flatMap function without changing the
keys; this also retains the original RDD's partitioning.
- flatMapValues(Function<V, Iterable<U>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying a flatmap function to the value of each key-value pairs in
'this' DStream without changing the key.
- flatMapValues(Function1<V, TraversableOnce<U>>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying a flatmap function to the value of each key-value pairs in
'this' DStream without changing the key.
- flatMapWith(Function1<Object, A>, boolean, Function2<T, A, Seq<U>>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
FlatMaps f over this RDD, where f takes an additional parameter of type A.
- floatToFloatWritable(float) - Static method in class org.apache.spark.SparkContext
-
- FloatType - Static variable in class org.apache.spark.sql.api.java.DataType
-
Gets the FloatType object.
- FloatType - Class in org.apache.spark.sql.api.java
-
The data type representing float and Float values.
- floatWritableConverter() - Static method in class org.apache.spark.SparkContext
-
- floor(Duration) - Method in class org.apache.spark.streaming.Time
-
- FlumeUtils - Class in org.apache.spark.streaming.flume
-
- FlumeUtils() - Constructor for class org.apache.spark.streaming.flume.FlumeUtils
-
- flush() - Method in class org.apache.spark.serializer.SerializationStream
-
- fMeasure(double, double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns f-measure for a given label (category)
- fMeasure(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns f1-measure for a given label (category)
- fMeasure() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns f-measure
(equals to precision and recall because precision equals recall)
- fMeasureByThreshold(double) - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Returns the (threshold, F-Measure) curve.
- fMeasureByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Returns the (threshold, F-Measure) curve with beta = 1.0.
- fold(T, Function2<T, T, T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Aggregate the elements of each partition, and then the results for all the partitions, using a
given associative function and a neutral "zero value".
- fold(T, Function2<T, T, T>) - Method in class org.apache.spark.rdd.RDD
-
Aggregate the elements of each partition, and then the results for all the partitions, using a
given associative function and a neutral "zero value".
- foldByKey(V, Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Merge the values for each key using an associative function and a neutral "zero value" which
may be added to the result an arbitrary number of times, and must not change the result
(e.g ., Nil for list concatenation, 0 for addition, or 1 for multiplication.).
- foldByKey(V, int, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Merge the values for each key using an associative function and a neutral "zero value" which
may be added to the result an arbitrary number of times, and must not change the result
(e.g ., Nil for list concatenation, 0 for addition, or 1 for multiplication.).
- foldByKey(V, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Merge the values for each key using an associative function and a neutral "zero value"
which may be added to the result an arbitrary number of times, and must not change the result
(e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication.).
- foldByKey(V, Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Merge the values for each key using an associative function and a neutral "zero value" which
may be added to the result an arbitrary number of times, and must not change the result
(e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication.).
- foldByKey(V, int, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Merge the values for each key using an associative function and a neutral "zero value" which
may be added to the result an arbitrary number of times, and must not change the result
(e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication.).
- foldByKey(V, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Merge the values for each key using an associative function and a neutral "zero value" which
may be added to the result an arbitrary number of times, and must not change the result
(e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication.).
- foreach(VoidFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Applies a function f to all elements of this RDD.
- foreach(Function1<T, BoxedUnit>) - Method in class org.apache.spark.rdd.RDD
-
Applies a function f to all elements of this RDD.
- foreach(Function<R, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Deprecated.
As of release 0.9.0, replaced by foreachRDD
- foreach(Function2<R, Time, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Deprecated.
As of release 0.9.0, replaced by foreachRDD
- foreach(Function1<RDD<T>, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Apply a function to each RDD in this DStream.
- foreach(Function2<RDD<T>, Time, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Apply a function to each RDD in this DStream.
- foreachAsync(Function1<T, BoxedUnit>) - Method in class org.apache.spark.rdd.AsyncRDDActions
-
Applies a function f to all elements of this RDD.
- foreachPartition(VoidFunction<Iterator<T>>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Applies a function f to each partition of this RDD.
- foreachPartition(Function1<Iterator<T>, BoxedUnit>) - Method in class org.apache.spark.rdd.RDD
-
Applies a function f to each partition of this RDD.
- foreachPartitionAsync(Function1<Iterator<T>, BoxedUnit>) - Method in class org.apache.spark.rdd.AsyncRDDActions
-
Applies a function f to each partition of this RDD.
- foreachRDD(Function<R, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Apply a function to each RDD in this DStream.
- foreachRDD(Function2<R, Time, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Apply a function to each RDD in this DStream.
- foreachRDD(Function1<RDD<T>, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Apply a function to each RDD in this DStream.
- foreachRDD(Function2<RDD<T>, Time, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Apply a function to each RDD in this DStream.
- foreachWith(Function1<Object, A>, Function2<T, A, BoxedUnit>) - Method in class org.apache.spark.rdd.RDD
-
Applies f to each element of this RDD, where f takes an additional parameter of type A.
- formatExecutorId(String) - Method in class org.apache.spark.storage.StorageStatusListener
-
In the local mode, there is a discrepancy between the executor ID according to the
task ("localhost") and that according to SparkEnv ("").
- fraction() - Method in class org.apache.spark.sql.execution.Sample
-
- fromAvroFlumeEvent(AvroFlumeEvent) - Static method in class org.apache.spark.streaming.flume.SparkFlumeEvent
-
- fromDStream(DStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaDStream
-
- fromInputDStream(InputDStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaInputDStream
-
- fromInputDStream(InputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
-
- fromJavaDStream(JavaDStream<Tuple2<K, V>>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
- fromJavaRDD(JavaRDD<Tuple2<K, V>>) - Static method in class org.apache.spark.api.java.JavaPairRDD
-
Convert a JavaRDD of key-value pairs to JavaPairRDD.
- fromPairDStream(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
- fromProductRdd(RDD<A>, TypeTags.TypeTag<A>) - Static method in class org.apache.spark.sql.execution.ExistingRdd
-
- fromRDD(RDD<Object>) - Static method in class org.apache.spark.api.java.JavaDoubleRDD
-
- fromRDD(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.api.java.JavaPairRDD
-
- fromRDD(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.api.java.JavaRDD
-
- fromRdd(RDD<?>) - Static method in class org.apache.spark.storage.RDDInfo
-
- fromReceiverInputDStream(ReceiverInputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
-
- fromReceiverInputDStream(ReceiverInputDStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
-
- fromSparkContext(SparkContext) - Static method in class org.apache.spark.api.java.JavaSparkContext
-
- fromStage(Stage, Option<Object>) - Static method in class org.apache.spark.scheduler.StageInfo
-
Construct a StageInfo from a Stage.
- fromString(String) - Static method in class org.apache.spark.storage.StorageLevel
-
:: DeveloperApi ::
Return the StorageLevel object with the specified name.
- Function<T1,R> - Interface in org.apache.spark.api.java.function
-
Base interface for functions whose return types do not create special RDDs.
- Function2<T1,T2,R> - Interface in org.apache.spark.api.java.function
-
A two-argument function that takes arguments of type T1 and T2 and returns an R.
- Function3<T1,T2,T3,R> - Interface in org.apache.spark.api.java.function
-
A three-argument function that takes arguments of type T1, T2 and T3 and returns an R.
- FutureAction<T> - Interface in org.apache.spark
-
:: Experimental ::
A future for the result of an action to support cancellation.
- gain() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
-
- GeneralizedLinearAlgorithm<M extends GeneralizedLinearModel> - Class in org.apache.spark.mllib.regression
-
:: DeveloperApi ::
GeneralizedLinearAlgorithm implements methods to train a Generalized Linear Model (GLM).
- GeneralizedLinearAlgorithm() - Constructor for class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
-
- GeneralizedLinearModel - Class in org.apache.spark.mllib.regression
-
:: DeveloperApi ::
GeneralizedLinearModel (GLM) represents a model trained using
GeneralizedLinearAlgorithm.
- GeneralizedLinearModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.GeneralizedLinearModel
-
- Generate - Class in org.apache.spark.sql.execution
-
:: DeveloperApi ::
Applies a Generator
to a stream of input rows, combining the
output of each into a new stream of rows.
- Generate(Generator, boolean, boolean, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Generate
-
- generate(Generator, boolean, boolean, Option<String>) - Method in class org.apache.spark.sql.SchemaRDD
-
:: Experimental ::
Applies the given Generator, or table generating function, to this relation.
- GeneratedAggregate - Class in org.apache.spark.sql.execution
-
:: DeveloperApi ::
Alternate version of aggregation that leverages projection and thus code generation.
- GeneratedAggregate(boolean, Seq<Expression>, Seq<NamedExpression>, SparkPlan) - Constructor for class org.apache.spark.sql.execution.GeneratedAggregate
-
- generatedRDDs() - Method in class org.apache.spark.streaming.dstream.DStream
-
- generateKMeansRDD(SparkContext, int, int, int, double, int) - Static method in class org.apache.spark.mllib.util.KMeansDataGenerator
-
Generate an RDD containing test data for KMeans.
- generateLinearInput(double, double[], int, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
-
- generateLinearInputAsList(double, double[], int, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
-
Return a Java List of synthetic data randomly generated according to a multi
collinear model.
- generateLinearRDD(SparkContext, int, int, double, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
-
Generate an RDD containing sample data for Linear Regression models - including Ridge, Lasso,
and uregularized variants.
- generateLogisticRDD(SparkContext, int, int, double, int, double) - Static method in class org.apache.spark.mllib.util.LogisticRegressionDataGenerator
-
Generate an RDD containing test data for LogisticRegression.
- generator() - Method in class org.apache.spark.sql.execution.Generate
-
- get() - Method in interface org.apache.spark.FutureAction
-
Blocks and returns the result of this job.
- get(String) - Method in class org.apache.spark.SparkConf
-
Get a parameter; throws a NoSuchElementException if it's not set
- get(String, String) - Method in class org.apache.spark.SparkConf
-
Get a parameter, falling back to a default if not set
- get() - Static method in class org.apache.spark.SparkEnv
-
Returns the ThreadLocal SparkEnv, if non-null.
- get(String) - Static method in class org.apache.spark.SparkFiles
-
Get the absolute path of a file added through SparkContext.addFile()
.
- get(int) - Method in class org.apache.spark.sql.api.java.Row
-
Returns the value of column `i`.
- getAkkaConf() - Method in class org.apache.spark.SparkConf
-
Get all akka conf variables set on this SparkConf
- getAll() - Method in class org.apache.spark.SparkConf
-
Get all parameters as a list of pairs
- getAllPools() - Method in class org.apache.spark.SparkContext
-
:: DeveloperApi ::
Return pools for fair scheduler
- getBlock(BlockId) - Method in class org.apache.spark.storage.StorageStatus
-
Return the given block stored in this block manager in O(1) time.
- getBoolean(String, boolean) - Method in class org.apache.spark.SparkConf
-
Get a parameter as a boolean, falling back to a default if not set
- getBoolean(int) - Method in class org.apache.spark.sql.api.java.Row
-
Returns the value of column i
as a bool.
- getByte(int) - Method in class org.apache.spark.sql.api.java.Row
-
Returns the value of column i
as a byte.
- getCachedBlockManagerId(BlockManagerId) - Static method in class org.apache.spark.storage.BlockManagerId
-
- getCachedMetadata(String) - Static method in class org.apache.spark.rdd.HadoopRDD
-
The three methods below are helpers for accessing the local map, a property of the SparkEnv of
the local process.
- getCheckpointDir() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- getCheckpointDir() - Method in class org.apache.spark.SparkContext
-
- getCheckpointFile() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Gets the name of the file to which this RDD was checkpointed
- getCheckpointFile() - Method in class org.apache.spark.rdd.RDD
-
Gets the name of the file to which this RDD was checkpointed
- getConf() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Return a copy of this JavaSparkContext's configuration.
- getConf() - Method in class org.apache.spark.rdd.HadoopRDD
-
- getConf() - Method in class org.apache.spark.rdd.NewHadoopRDD
-
- getConf() - Method in class org.apache.spark.SparkContext
-
Return a copy of this SparkContext's configuration.
- getCreationSite() - Static method in class org.apache.spark.streaming.dstream.DStream
-
Get the creation site of a DStream from the stack trace of when the DStream is created.
- getDataType() - Method in class org.apache.spark.sql.api.java.StructField
-
- getDependencies() - Method in class org.apache.spark.rdd.CoGroupedRDD
-
- getDependencies() - Method in class org.apache.spark.rdd.ShuffledRDD
-
- getDependencies() - Method in class org.apache.spark.rdd.UnionRDD
-
- getDouble(String, double) - Method in class org.apache.spark.SparkConf
-
Get a parameter as a double, falling back to a default if not set
- getDouble(int) - Method in class org.apache.spark.sql.api.java.Row
-
Returns the value of column i
as a double.
- getElementType() - Method in class org.apache.spark.sql.api.java.ArrayType
-
- getExecutorEnv() - Method in class org.apache.spark.SparkConf
-
Get all executor environment variables set on this SparkConf
- getExecutorMemoryStatus() - Method in class org.apache.spark.SparkContext
-
Return a map from the slave to the max memory available for caching and the remaining
memory available for caching.
- getExecutorStorageStatus() - Method in class org.apache.spark.SparkContext
-
:: DeveloperApi ::
Return information about blocks stored in all of the slaves
- getFields() - Method in class org.apache.spark.sql.api.java.StructType
-
- getFinalValue() - Method in class org.apache.spark.partial.PartialResult
-
Blocking method to wait for and return the final value.
- getFloat(int) - Method in class org.apache.spark.sql.api.java.Row
-
Returns the value of column i
as a float.
- getHiveFile(String) - Method in class org.apache.spark.sql.hive.test.TestHiveContext
-
- getInt(String, int) - Method in class org.apache.spark.SparkConf
-
Get a parameter as an integer, falling back to a default if not set
- getInt(int) - Method in class org.apache.spark.sql.api.java.Row
-
Returns the value of column i
as an int.
- getKeyType() - Method in class org.apache.spark.sql.api.java.MapType
-
- getLocalProperty(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Get a local property set in this thread, or null if it is missing.
- getLocalProperty(String) - Method in class org.apache.spark.SparkContext
-
Get a local property set in this thread, or null if it is missing.
- getLong(String, long) - Method in class org.apache.spark.SparkConf
-
Get a parameter as a long, falling back to a default if not set
- getLong(int) - Method in class org.apache.spark.sql.api.java.Row
-
Returns the value of column i
as a long.
- getName() - Method in class org.apache.spark.sql.api.java.StructField
-
- getObjectInspector() - Method in class org.apache.spark.sql.hive.parquet.FakeParquetSerDe
-
- getOption(String) - Method in class org.apache.spark.SparkConf
-
Get a parameter as an Option
- getOrCreate(String, JavaStreamingContextFactory) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
- getOrCreate(String, Configuration, JavaStreamingContextFactory) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
- getOrCreate(String, Configuration, JavaStreamingContextFactory, boolean) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
- getOrCreate(String, Function0<StreamingContext>, Configuration, boolean) - Static method in class org.apache.spark.streaming.StreamingContext
-
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
- getParents(int) - Method in class org.apache.spark.NarrowDependency
-
Get the parent partitions for a child partition.
- getParents(int) - Method in class org.apache.spark.OneToOneDependency
-
- getParents(int) - Method in class org.apache.spark.RangeDependency
-
- getPartition(Object) - Method in class org.apache.spark.HashPartitioner
-
- getPartition(Object) - Method in class org.apache.spark.Partitioner
-
- getPartition(Object) - Method in class org.apache.spark.RangePartitioner
-
- getPartitions() - Method in class org.apache.spark.rdd.CoGroupedRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.HadoopRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.JdbcRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.NewHadoopRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.ShuffledRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.UnionRDD
-
- getPartitions() - Method in class org.apache.spark.sql.SchemaRDD
-
- getPersistentRDDs() - Method in class org.apache.spark.SparkContext
-
Returns an immutable map of RDDs that have marked themselves as persistent via cache() call.
- getPoolForName(String) - Method in class org.apache.spark.SparkContext
-
:: DeveloperApi ::
Return the pool associated with the given name, if one exists
- getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.HadoopRDD
-
- getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.NewHadoopRDD
-
- getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.UnionRDD
-
- getRDDStorageInfo() - Method in class org.apache.spark.SparkContext
-
:: DeveloperApi ::
Return information about what RDDs are cached, if they are in mem or on disk, how much space
they take, etc.
- getReceiver() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
-
Gets the receiver object that will be sent to the worker nodes
to receive data.
- getRootDirectory() - Static method in class org.apache.spark.SparkFiles
-
Get the root directory that contains files added through SparkContext.addFile()
.
- getSchedulingMode() - Method in class org.apache.spark.SparkContext
-
Return current scheduling mode
- getSerDeStats() - Method in class org.apache.spark.sql.hive.parquet.FakeParquetSerDe
-
- getSerializedClass() - Method in class org.apache.spark.sql.hive.parquet.FakeParquetSerDe
-
- getSerializer(Serializer) - Static method in class org.apache.spark.serializer.Serializer
-
- getSerializer(Option<Serializer>) - Static method in class org.apache.spark.serializer.Serializer
-
- getShort(int) - Method in class org.apache.spark.sql.api.java.Row
-
Returns the value of column i
as a short.
- getSparkHome() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Get Spark's home location from either a value set through the constructor,
or the spark.home Java property, or the SPARK_HOME environment variable
(in that order of preference).
- getStorageLevel() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Get the RDD's current storage level, or StorageLevel.NONE if none is set.
- getStorageLevel() - Method in class org.apache.spark.rdd.RDD
-
Get the RDD's current storage level, or StorageLevel.NONE if none is set.
- getString(int) - Method in class org.apache.spark.sql.api.java.Row
-
Returns the value of column i
as a String.
- getThreadLocal() - Static method in class org.apache.spark.SparkEnv
-
Returns the ThreadLocal SparkEnv.
- gettingResult() - Method in class org.apache.spark.scheduler.TaskInfo
-
- gettingResultTime() - Method in class org.apache.spark.scheduler.TaskInfo
-
The time when the task started remotely getting the result.
- getValueType() - Method in class org.apache.spark.sql.api.java.MapType
-
- Gini - Class in org.apache.spark.mllib.tree.impurity
-
:: Experimental ::
Class for calculating the
Gini impurity
during binary classification.
- Gini() - Constructor for class org.apache.spark.mllib.tree.impurity.Gini
-
- global() - Method in class org.apache.spark.sql.execution.Sort
-
- glom() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an RDD created by coalescing all elements within each partition into an array.
- glom() - Method in class org.apache.spark.rdd.RDD
-
Return an RDD created by coalescing all elements within each partition into an array.
- glom() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying glom() to each RDD of
this DStream.
- glom() - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD is generated by applying glom() to each RDD of
this DStream.
- Gradient - Class in org.apache.spark.mllib.optimization
-
:: DeveloperApi ::
Class used to compute the gradient for a loss function, given a single data point.
- Gradient() - Constructor for class org.apache.spark.mllib.optimization.Gradient
-
- GradientDescent - Class in org.apache.spark.mllib.optimization
-
Class used to solve an optimization problem using Gradient Descent.
- graph() - Method in class org.apache.spark.streaming.dstream.DStream
-
- graph() - Method in class org.apache.spark.streaming.StreamingContext
-
- groupBy(Function<T, K>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an RDD of grouped elements.
- groupBy(Function<T, K>, int) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an RDD of grouped elements.
- groupBy(Function1<T, K>, ClassTag<K>) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD of grouped items.
- groupBy(Function1<T, K>, int, ClassTag<K>) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD of grouped elements.
- groupBy(Function1<T, K>, Partitioner, ClassTag<K>, Ordering<K>) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD of grouped items.
- groupBy(Seq<Expression>, Seq<Expression>) - Method in class org.apache.spark.sql.SchemaRDD
-
Performs a grouping followed by an aggregation.
- groupByKey(Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Group the values for each key in the RDD into a single sequence.
- groupByKey(int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Group the values for each key in the RDD into a single sequence.
- groupByKey() - Method in class org.apache.spark.api.java.JavaPairRDD
-
Group the values for each key in the RDD into a single sequence.
- groupByKey(Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Group the values for each key in the RDD into a single sequence.
- groupByKey(int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Group the values for each key in the RDD into a single sequence.
- groupByKey() - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Group the values for each key in the RDD into a single sequence.
- groupByKey() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying groupByKey
to each RDD.
- groupByKey(int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying groupByKey
to each RDD.
- groupByKey(Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying groupByKey
on each RDD of this
DStream.
- groupByKey() - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying groupByKey
to each RDD.
- groupByKey(int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying groupByKey
to each RDD.
- groupByKey(Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying groupByKey
on each RDD.
- groupByKeyAndWindow(Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying groupByKey
over a sliding window.
- groupByKeyAndWindow(Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying groupByKey
over a sliding window.
- groupByKeyAndWindow(Duration, Duration, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying groupByKey
over a sliding window on this
DStream.
- groupByKeyAndWindow(Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying groupByKey
over a sliding window on this
DStream.
- groupByKeyAndWindow(Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying groupByKey
over a sliding window.
- groupByKeyAndWindow(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying groupByKey
over a sliding window.
- groupByKeyAndWindow(Duration, Duration, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying groupByKey
over a sliding window on this
DStream.
- groupByKeyAndWindow(Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Create a new DStream by applying groupByKey
over a sliding window on this
DStream.
- groupingExpressions() - Method in class org.apache.spark.sql.execution.Aggregate
-
- groupingExpressions() - Method in class org.apache.spark.sql.execution.GeneratedAggregate
-
- groupWith(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Alias for cogroup.
- groupWith(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Alias for cogroup.
- groupWith(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Alias for cogroup.
- groupWith(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Alias for cogroup.
- groupWith(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Alias for cogroup.
- groupWith(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Alias for cogroup.
- main(String[]) - Static method in class org.apache.spark.examples.streaming.JavaKinesisWordCountASL
-
- main(String[]) - Static method in class org.apache.spark.mllib.util.KMeansDataGenerator
-
- main(String[]) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
-
- main(String[]) - Static method in class org.apache.spark.mllib.util.LogisticRegressionDataGenerator
-
- main(String[]) - Static method in class org.apache.spark.mllib.util.MFDataGenerator
-
- main(String[]) - Static method in class org.apache.spark.mllib.util.SVMDataGenerator
-
- makeCopy(Object[]) - Method in class org.apache.spark.sql.execution.SparkPlan
-
Overridden make copy also propogates sqlContext to copied plan.
- makeRDD(Seq<T>, int, ClassTag<T>) - Method in class org.apache.spark.SparkContext
-
Distribute a local Scala collection to form an RDD.
- makeRDD(Seq<Tuple2<T, Seq<String>>>, ClassTag<T>) - Method in class org.apache.spark.SparkContext
-
Distribute a local Scala collection to form an RDD, with one or more
location preferences (hostnames of Spark nodes) for each object.
- map(Function<T, R>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to all elements of this RDD.
- map(Function1<R, T>) - Method in class org.apache.spark.partial.PartialResult
-
Transform this PartialResult into a PartialResult of type T.
- map(Function1<T, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD by applying a function to all elements of this RDD.
- map(Function<T, R>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream by applying a function to all elements of this DStream.
- map(Function1<T, U>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream by applying a function to all elements of this DStream.
- mapId() - Method in class org.apache.spark.FetchFailed
-
- mapId() - Method in class org.apache.spark.storage.ShuffleBlockId
-
- mapId() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
-
- mapOutputTracker() - Method in class org.apache.spark.SparkEnv
-
- mapPartitions(FlatMapFunction<Iterator<T>, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to each partition of this RDD.
- mapPartitions(FlatMapFunction<Iterator<T>, U>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to each partition of this RDD.
- mapPartitions(Function1<Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD by applying a function to each partition of this RDD.
- mapPartitions(FlatMapFunction<Iterator<T>, U>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDDs
of this DStream.
- mapPartitions(Function1<Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDDs
of this DStream.
- mapPartitionsToDouble(DoubleFlatMapFunction<Iterator<T>>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to each partition of this RDD.
- mapPartitionsToDouble(DoubleFlatMapFunction<Iterator<T>>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to each partition of this RDD.
- mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to each partition of this RDD.
- mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to each partition of this RDD.
- mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDDs
of this DStream.
- mapPartitionsWithContext(Function2<TaskContext, Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
:: DeveloperApi ::
Return a new RDD by applying a function to each partition of this RDD.
- mapPartitionsWithIndex(Function2<Integer, Iterator<T>, Iterator<R>>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to each partition of this RDD, while tracking the index
of the original partition.
- mapPartitionsWithIndex(Function2<Object, Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD by applying a function to each partition of this RDD, while tracking the index
of the original partition.
- mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator<R>>, boolean) - Method in class org.apache.spark.api.java.JavaHadoopRDD
-
Maps over a partition, providing the InputSplit that was used as the base of the partition.
- mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator<R>>, boolean) - Method in class org.apache.spark.api.java.JavaNewHadoopRDD
-
Maps over a partition, providing the InputSplit that was used as the base of the partition.
- mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.HadoopRDD
-
Maps over a partition, providing the InputSplit that was used as the base of the partition.
- mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.NewHadoopRDD
-
Maps over a partition, providing the InputSplit that was used as the base of the partition.
- mapPartitionsWithSplit(Function2<Object, Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD by applying a function to each partition of this RDD, while tracking the index
of the original partition.
- mapredInputFormat() - Method in class org.apache.spark.scheduler.InputFormatInfo
-
- mapreduceInputFormat() - Method in class org.apache.spark.scheduler.InputFormatInfo
-
- mapSideCombine() - Method in class org.apache.spark.ShuffleDependency
-
- mapToDouble(DoubleFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to all elements of this RDD.
- mapToPair(PairFunction<T, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to all elements of this RDD.
- mapToPair(PairFunction<T, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream by applying a function to all elements of this DStream.
- MapType - Class in org.apache.spark.sql.api.java
-
The data type representing Maps.
- mapValues(Function<V, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Pass each value in the key-value pair RDD through a map function without changing the keys;
this also retains the original RDD's partitioning.
- mapValues(Function1<V, U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Pass each value in the key-value pair RDD through a map function without changing the keys;
this also retains the original RDD's partitioning.
- mapValues(Function<V, U>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying a map function to the value of each key-value pairs in
'this' DStream without changing the key.
- mapValues(Function1<V, U>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying a map function to the value of each key-value pairs in
'this' DStream without changing the key.
- mapWith(Function1<Object, A>, boolean, Function2<T, A, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Maps f over this RDD, where f takes an additional parameter of type A.
- master() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- master() - Method in class org.apache.spark.SparkContext
-
- Matrices - Class in org.apache.spark.mllib.linalg
-
- Matrices() - Constructor for class org.apache.spark.mllib.linalg.Matrices
-
- Matrix - Interface in org.apache.spark.mllib.linalg
-
Trait for a local matrix.
- MatrixEntry - Class in org.apache.spark.mllib.linalg.distributed
-
:: Experimental ::
Represents an entry in an distributed matrix.
- MatrixEntry(long, long, double) - Constructor for class org.apache.spark.mllib.linalg.distributed.MatrixEntry
-
- MatrixFactorizationModel - Class in org.apache.spark.mllib.recommendation
-
Model representing the result of matrix factorization.
- max(Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Returns the maximum element from this RDD as defined by the specified
Comparator[T].
- max() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
-
- max() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
-
Maximum value of each column.
- max(Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Returns the max of this RDD as defined by the implicit Ordering[T].
- max(Duration) - Method in class org.apache.spark.streaming.Duration
-
- max(Time) - Method in class org.apache.spark.streaming.Time
-
- max() - Method in class org.apache.spark.util.StatCounter
-
- maxBins() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- maxDepth() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- maxMem() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
-
- maxMem() - Method in class org.apache.spark.storage.StorageStatus
-
- maxMemoryInMB() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- mean() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Compute the mean of this RDD's elements.
- mean() - Method in class org.apache.spark.mllib.feature.StandardScalerModel
-
- mean() - Method in class org.apache.spark.mllib.random.PoissonGenerator
-
- mean() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
-
- mean() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
-
Sample mean vector.
- mean() - Method in class org.apache.spark.partial.BoundedDouble
-
- mean() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
-
Compute the mean of this RDD's elements.
- mean() - Method in class org.apache.spark.util.StatCounter
-
- meanApprox(long, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return the approximate mean of the elements in this RDD.
- meanApprox(long) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
:: Experimental ::
Approximate operation to return the mean within a timeout.
- meanApprox(long, double) - Method in class org.apache.spark.rdd.DoubleRDDFunctions
-
:: Experimental ::
Approximate operation to return the mean within a timeout.
- MEMORY_AND_DISK - Static variable in class org.apache.spark.api.java.StorageLevels
-
- MEMORY_AND_DISK() - Static method in class org.apache.spark.storage.StorageLevel
-
- MEMORY_AND_DISK_2 - Static variable in class org.apache.spark.api.java.StorageLevels
-
- MEMORY_AND_DISK_2() - Static method in class org.apache.spark.storage.StorageLevel
-
- MEMORY_AND_DISK_SER - Static variable in class org.apache.spark.api.java.StorageLevels
-
- MEMORY_AND_DISK_SER() - Static method in class org.apache.spark.storage.StorageLevel
-
- MEMORY_AND_DISK_SER_2 - Static variable in class org.apache.spark.api.java.StorageLevels
-
- MEMORY_AND_DISK_SER_2() - Static method in class org.apache.spark.storage.StorageLevel
-
- MEMORY_ONLY - Static variable in class org.apache.spark.api.java.StorageLevels
-
- MEMORY_ONLY() - Static method in class org.apache.spark.storage.StorageLevel
-
- MEMORY_ONLY_2 - Static variable in class org.apache.spark.api.java.StorageLevels
-
- MEMORY_ONLY_2() - Static method in class org.apache.spark.storage.StorageLevel
-
- MEMORY_ONLY_SER - Static variable in class org.apache.spark.api.java.StorageLevels
-
- MEMORY_ONLY_SER() - Static method in class org.apache.spark.storage.StorageLevel
-
- MEMORY_ONLY_SER_2 - Static variable in class org.apache.spark.api.java.StorageLevels
-
- MEMORY_ONLY_SER_2() - Static method in class org.apache.spark.storage.StorageLevel
-
- memRemaining() - Method in class org.apache.spark.storage.StorageStatus
-
Return the memory remaining in this block manager.
- memSize() - Method in class org.apache.spark.storage.BlockStatus
-
- memSize() - Method in class org.apache.spark.storage.RDDInfo
-
- memUsed() - Method in class org.apache.spark.storage.StorageStatus
-
Return the memory used by this block manager.
- memUsedByRdd(int) - Method in class org.apache.spark.storage.StorageStatus
-
Return the memory used by the given RDD in this block manager in O(1) time.
- merge(R) - Method in class org.apache.spark.Accumulable
-
Merge two accumulable objects together
- merge(IDF.DocumentFrequencyAggregator) - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
-
Merges another.
- merge(MultivariateOnlineSummarizer) - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
-
Merge another MultivariateOnlineSummarizer, and update the statistical summary.
- merge(double) - Method in class org.apache.spark.util.StatCounter
-
Add a value into this StatCounter, updating the internal statistics.
- merge(TraversableOnce<Object>) - Method in class org.apache.spark.util.StatCounter
-
Add multiple values into this StatCounter, updating the internal statistics.
- merge(StatCounter) - Method in class org.apache.spark.util.StatCounter
-
Merge another StatCounter into this one, adding up the internal statistics.
- mergeCombiners() - Method in class org.apache.spark.Aggregator
-
- mergeValue() - Method in class org.apache.spark.Aggregator
-
- metadataCleaner() - Method in class org.apache.spark.SparkContext
-
- metastorePath() - Method in class org.apache.spark.sql.hive.LocalHiveContext
-
- metastorePath() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
-
- method() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
-
- metrics() - Method in class org.apache.spark.ExceptionFailure
-
- metricsSystem() - Method in class org.apache.spark.SparkEnv
-
- MFDataGenerator - Class in org.apache.spark.mllib.util
-
:: DeveloperApi ::
Generate RDD(s) containing data for Matrix Factorization.
- MFDataGenerator() - Constructor for class org.apache.spark.mllib.util.MFDataGenerator
-
- milliseconds() - Method in class org.apache.spark.streaming.Duration
-
- Milliseconds - Class in org.apache.spark.streaming
-
Helper object that creates instance of
Duration
representing
a given number of milliseconds.
- Milliseconds() - Constructor for class org.apache.spark.streaming.Milliseconds
-
- milliseconds() - Method in class org.apache.spark.streaming.Time
-
- millisToString(long) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
Reformat a time interval in milliseconds to a prettier format for output
- min(Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Returns the minimum element from this RDD as defined by the specified
Comparator[T].
- min() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
-
- min() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
-
Minimum value of each column.
- min(Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Returns the min of this RDD as defined by the implicit Ordering[T].
- min(Duration) - Method in class org.apache.spark.streaming.Duration
-
- min(Time) - Method in class org.apache.spark.streaming.Time
-
- min() - Method in class org.apache.spark.util.StatCounter
-
- MinMax() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
-
- minutes() - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- Minutes - Class in org.apache.spark.streaming
-
Helper object that creates instance of
Duration
representing
a given number of minutes.
- Minutes() - Constructor for class org.apache.spark.streaming.Minutes
-
- MLUtils - Class in org.apache.spark.mllib.util
-
Helper methods to load, save and pre-process data used in ML Lib.
- MLUtils() - Constructor for class org.apache.spark.mllib.util.MLUtils
-
- model() - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
-
- MQTTUtils - Class in org.apache.spark.streaming.mqtt
-
- MQTTUtils() - Constructor for class org.apache.spark.streaming.mqtt.MQTTUtils
-
- MulticlassMetrics - Class in org.apache.spark.mllib.evaluation
-
::Experimental::
Evaluator for multiclass classification.
- MulticlassMetrics(RDD<Tuple2<Object, Object>>) - Constructor for class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
- multiply(Matrix) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
Multiply this matrix by a local matrix on the right.
- multiply(Matrix) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Multiply this matrix by a local matrix on the right.
- multiply(double) - Method in class org.apache.spark.util.Vector
-
- MultivariateOnlineSummarizer - Class in org.apache.spark.mllib.stat
-
:: DeveloperApi ::
MultivariateOnlineSummarizer implements
MultivariateStatisticalSummary
to compute the mean,
variance, minimum, maximum, counts, and nonzero counts for samples in sparse or dense vector
format in a online fashion.
- MultivariateOnlineSummarizer() - Constructor for class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
-
- MultivariateStatisticalSummary - Interface in org.apache.spark.mllib.stat
-
Trait for multivariate statistical summary of a data matrix.
- mustCheckpoint() - Method in class org.apache.spark.streaming.dstream.DStream
-
- MutablePair<T1,T2> - Class in org.apache.spark.util
-
:: DeveloperApi ::
A tuple of 2 elements.
- MutablePair(T1, T2) - Constructor for class org.apache.spark.util.MutablePair
-
- MutablePair() - Constructor for class org.apache.spark.util.MutablePair
-
No-arg constructor for serialization
- PairDStreamFunctions<K,V> - Class in org.apache.spark.streaming.dstream
-
Extra functions available on DStream of (key, value) pairs through an implicit conversion.
- PairDStreamFunctions(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Constructor for class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
- PairFlatMapFunction<T,K,V> - Interface in org.apache.spark.api.java.function
-
A function that returns zero or more key-value pair records from each input record.
- PairFunction<T,K,V> - Interface in org.apache.spark.api.java.function
-
A function that returns key-value pairs (Tuple2), and can be used to construct PairRDDs.
- PairRDDFunctions<K,V> - Class in org.apache.spark.rdd
-
Extra functions available on RDDs of (key, value) pairs through an implicit conversion.
- PairRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Constructor for class org.apache.spark.rdd.PairRDDFunctions
-
- parallelize(List<T>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Distribute a local Scala collection to form an RDD.
- parallelize(List<T>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Distribute a local Scala collection to form an RDD.
- parallelize(Seq<T>, int, ClassTag<T>) - Method in class org.apache.spark.SparkContext
-
Distribute a local Scala collection to form an RDD.
- parallelizeDoubles(List<Double>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Distribute a local Scala collection to form an RDD.
- parallelizeDoubles(List<Double>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Distribute a local Scala collection to form an RDD.
- parallelizePairs(List<Tuple2<K, V>>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Distribute a local Scala collection to form an RDD.
- parallelizePairs(List<Tuple2<K, V>>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Distribute a local Scala collection to form an RDD.
- parquetFile(String) - Method in class org.apache.spark.sql.api.java.JavaSQLContext
-
- parquetFile(String) - Method in class org.apache.spark.sql.SQLContext
-
Loads a Parquet file, returning the result as a
SchemaRDD
.
- ParquetTableScan - Class in org.apache.spark.sql.parquet
-
Parquet table scan operator.
- ParquetTableScan(Seq<Attribute>, ParquetRelation, Seq<Expression>) - Constructor for class org.apache.spark.sql.parquet.ParquetTableScan
-
- parse(String) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Parses a string resulted from
Vector#toString
into
an
Vector
.
- parse(String) - Static method in class org.apache.spark.mllib.regression.LabeledPoint
-
Parses a string resulted from
LabeledPoint#toString
into
an
LabeledPoint
.
- partial() - Method in class org.apache.spark.sql.execution.Aggregate
-
- partial() - Method in class org.apache.spark.sql.execution.Distinct
-
- partial() - Method in class org.apache.spark.sql.execution.GeneratedAggregate
-
- PartialResult<R> - Class in org.apache.spark.partial
-
- PartialResult(R, boolean) - Constructor for class org.apache.spark.partial.PartialResult
-
- Partition - Interface in org.apache.spark
-
A partition of an RDD.
- partition() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
-
- partitionBy(Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a copy of the RDD partitioned using the specified partitioner.
- partitionBy(Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return a copy of the RDD partitioned using the specified partitioner.
- Partitioner - Class in org.apache.spark
-
An object that defines how the elements in a key-value pair RDD are partitioned by key.
- Partitioner() - Constructor for class org.apache.spark.Partitioner
-
- partitioner() - Method in class org.apache.spark.rdd.CoGroupedRDD
-
- partitioner() - Method in class org.apache.spark.rdd.RDD
-
Optionally overridden by subclasses to specify how they are partitioned.
- partitioner() - Method in class org.apache.spark.rdd.ShuffledRDD
-
- partitioner() - Method in class org.apache.spark.ShuffleDependency
-
- partitionId() - Method in class org.apache.spark.TaskContext
-
- partitionPruningPred() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
-
- PartitionPruningRDD<T> - Class in org.apache.spark.rdd
-
:: DeveloperApi ::
A RDD used to prune RDD partitions/partitions so we can avoid launching tasks on
all partitions.
- PartitionPruningRDD(RDD<T>, Function1<Object, Object>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.PartitionPruningRDD
-
- partitions() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Set of partitions in this RDD.
- partitions() - Method in class org.apache.spark.rdd.RDD
-
Get the array of partitions of this RDD, taking into account whether the
RDD is checkpointed or not.
- partOutput() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
-
- path() - Method in class org.apache.spark.scheduler.InputFormatInfo
-
- path() - Method in class org.apache.spark.scheduler.SplitInfo
-
- percentiles() - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- percentilesHeader() - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Set this RDD's storage level to persist its values across operations after the first time
it is computed.
- persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Set this RDD's storage level to persist its values across operations after the first time
it is computed.
- persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaRDD
-
Set this RDD's storage level to persist its values across operations after the first time
it is computed.
- persist(StorageLevel) - Method in class org.apache.spark.rdd.RDD
-
Set this RDD's storage level to persist its values across operations after the first time
it is computed.
- persist() - Method in class org.apache.spark.rdd.RDD
-
Persist this RDD with the default storage level (`MEMORY_ONLY`).
- persist() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
-
Persist this RDD with the default storage level (`MEMORY_ONLY`).
- persist(StorageLevel) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
-
Set this RDD's storage level to persist its values across operations after the first time
it is computed.
- persist() - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
- persist(StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
Persist the RDDs of this DStream with the given storage level
- persist() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
- persist(StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Persist the RDDs of this DStream with the given storage level
- persist(StorageLevel) - Method in class org.apache.spark.streaming.dstream.DStream
-
Persist the RDDs of this DStream with the given storage level
- persist() - Method in class org.apache.spark.streaming.dstream.DStream
-
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
- persistentRdds() - Method in class org.apache.spark.SparkContext
-
- pi() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
-
- pipe(String) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an RDD created by piping elements to a forked external process.
- pipe(List<String>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an RDD created by piping elements to a forked external process.
- pipe(List<String>, Map<String, String>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an RDD created by piping elements to a forked external process.
- pipe(String) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD created by piping elements to a forked external process.
- pipe(String, Map<String, String>) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD created by piping elements to a forked external process.
- pipe(Seq<String>, Map<String, String>, Function1<Function1<String, BoxedUnit>, BoxedUnit>, Function2<T, Function1<String, BoxedUnit>, BoxedUnit>, boolean) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD created by piping elements to a forked external process.
- plusDot(Vector, Vector) - Method in class org.apache.spark.util.Vector
-
return (this + plus) dot other, but without creating any intermediate storage
- PoissonGenerator - Class in org.apache.spark.mllib.random
-
:: DeveloperApi ::
Generates i.i.d.
- PoissonGenerator(double) - Constructor for class org.apache.spark.mllib.random.PoissonGenerator
-
- poissonJavaRDD(JavaSparkContext, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- poissonJavaRDD(JavaSparkContext, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- poissonJavaRDD(JavaSparkContext, double, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- poissonJavaVectorRDD(JavaSparkContext, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- poissonJavaVectorRDD(JavaSparkContext, double, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- poissonJavaVectorRDD(JavaSparkContext, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- poissonRDD(SparkContext, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Generates an RDD comprised of i.i.d.
- PoissonSampler<T> - Class in org.apache.spark.util.random
-
:: DeveloperApi ::
A sampler based on values drawn from Poisson distribution.
- PoissonSampler(double) - Constructor for class org.apache.spark.util.random.PoissonSampler
-
- poissonVectorRDD(SparkContext, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Generates an RDD[Vector] with vectors containing i.i.d.
- poolToActiveStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- port() - Method in class org.apache.spark.storage.BlockManagerId
-
- pr() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Returns the precision-recall curve, which is an RDD of (recall, precision),
NOT (precision, recall), with (0.0, 1.0) prepended to it.
- precision(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns precision for a given label (category)
- precision() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns precision
- precisionByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Returns the (threshold, precision) curve.
- predict(RDD<Vector>) - Method in interface org.apache.spark.mllib.classification.ClassificationModel
-
Predict values for the given data set using the model trained.
- predict(Vector) - Method in interface org.apache.spark.mllib.classification.ClassificationModel
-
Predict values for a single data point using the model trained.
- predict(JavaRDD<Vector>) - Method in interface org.apache.spark.mllib.classification.ClassificationModel
-
Predict values for examples stored in a JavaRDD.
- predict(RDD<Vector>) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
-
- predict(Vector) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
-
- predict(Vector) - Method in class org.apache.spark.mllib.clustering.KMeansModel
-
Returns the cluster index that a given point belongs to.
- predict(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel
-
Maps given points to their cluster indices.
- predict(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel
-
Maps given points to their cluster indices.
- predict(int, int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
Predict the rating of one user for one product.
- predict(RDD<Tuple2<Object, Object>>) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
Predict the rating of many users for many products.
- predict(JavaRDD<byte[]>) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
:: DeveloperApi ::
Predict the rating of many users for many products.
- predict(RDD<Vector>) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
-
Predict values for the given data set using the model trained.
- predict(Vector) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
-
Predict values for a single data point using the model trained.
- predict(RDD<Vector>) - Method in interface org.apache.spark.mllib.regression.RegressionModel
-
Predict values for the given data set using the model trained.
- predict(Vector) - Method in interface org.apache.spark.mllib.regression.RegressionModel
-
Predict values for a single data point using the model trained.
- predict(JavaRDD<Vector>) - Method in interface org.apache.spark.mllib.regression.RegressionModel
-
Predict values for examples stored in a JavaRDD.
- predict(Vector) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
Predict values for a single data point using the model trained.
- predict(RDD<Vector>) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
Predict values for the given data set using the model trained.
- predict() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
-
- predict() - Method in class org.apache.spark.mllib.tree.model.Node
-
- predict(Vector) - Method in class org.apache.spark.mllib.tree.model.Node
-
predict value if node is not leaf
- predictOn(DStream<Vector>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
-
Use the model to make predictions on batches of data from a DStream
- predictOnValues(DStream<Tuple2<K, Vector>>, ClassTag<K>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
-
Use the model to make predictions on the values of a DStream and carry over its keys.
- preferredLocation() - Method in class org.apache.spark.streaming.receiver.Receiver
-
Override this to specify a preferred location (hostname).
- preferredLocations(Partition) - Method in class org.apache.spark.rdd.RDD
-
Get the preferred locations of a partition (as hostnames), taking into account whether the
RDD is checkpointed.
- preferredNodeLocationData() - Method in class org.apache.spark.SparkContext
-
- prettyPrint() - Method in class org.apache.spark.streaming.Duration
-
- prev() - Method in class org.apache.spark.rdd.ShuffledRDD
-
- print() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Print the first ten elements of each RDD generated in this DStream.
- print() - Method in class org.apache.spark.streaming.dstream.DStream
-
Print the first ten elements of each RDD generated in this DStream.
- printStats() - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
-
- prob() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
-
- probabilities() - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- PROCESS_LOCAL() - Static method in class org.apache.spark.scheduler.TaskLocality
-
- processingDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
-
Time taken for the all jobs of this batch to finish processing from the time they started
processing.
- processingEndTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
-
- processingStartTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
-
- product() - Method in class org.apache.spark.mllib.recommendation.Rating
-
- productFeatures() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
- productToRowRdd(RDD<A>) - Static method in class org.apache.spark.sql.execution.ExistingRdd
-
- progressListener() - Method in class org.apache.spark.streaming.StreamingContext
-
- Project - Class in org.apache.spark.sql.execution
-
:: DeveloperApi ::
- Project(Seq<NamedExpression>, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Project
-
- projectList() - Method in class org.apache.spark.sql.execution.Project
-
- properties() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
-
- properties() - Method in class org.apache.spark.scheduler.SparkListenerStageSubmitted
-
- pruneColumns(Seq<Attribute>) - Method in class org.apache.spark.sql.parquet.ParquetTableScan
-
- Pseudorandom - Interface in org.apache.spark.util.random
-
:: DeveloperApi ::
A class with pseudorandom behavior.
- putCachedMetadata(String, Object) - Static method in class org.apache.spark.rdd.HadoopRDD
-
- pValue() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
-
- pValue() - Method in interface org.apache.spark.mllib.stat.test.TestResult
-
The probability of obtaining a test statistic result at least as extreme as the one that was
actually observed, assuming that the null hypothesis is true.
- RACK_LOCAL() - Static method in class org.apache.spark.scheduler.TaskLocality
-
- RANDOM() - Static method in class org.apache.spark.mllib.clustering.KMeans
-
- random(int, Random) - Static method in class org.apache.spark.util.Vector
-
Creates this
Vector
of given length containing random numbers
between 0.0 and 1.0.
- RandomDataGenerator<T> - Interface in org.apache.spark.mllib.random
-
:: DeveloperApi ::
Trait for random data generators that generate i.i.d.
- randomRDD(SparkContext, RandomDataGenerator<T>, long, int, long, ClassTag<T>) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
:: DeveloperApi ::
Generates an RDD comprised of i.i.d.
- RandomRDDs - Class in org.apache.spark.mllib.random
-
:: Experimental ::
Generator methods for creating RDDs comprised of i.i.d.
- RandomRDDs() - Constructor for class org.apache.spark.mllib.random.RandomRDDs
-
- RandomSampler<T,U> - Interface in org.apache.spark.util.random
-
:: DeveloperApi ::
A pseudorandom sampler.
- randomSplit(double[]) - Method in class org.apache.spark.api.java.JavaRDD
-
Randomly splits this RDD with the provided weights.
- randomSplit(double[], long) - Method in class org.apache.spark.api.java.JavaRDD
-
Randomly splits this RDD with the provided weights.
- randomSplit(double[], long) - Method in class org.apache.spark.rdd.RDD
-
Randomly splits this RDD with the provided weights.
- randomVectorRDD(SparkContext, RandomDataGenerator<Object>, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
:: DeveloperApi ::
Generates an RDD[Vector] with vectors containing i.i.d.
- RangeDependency<T> - Class in org.apache.spark
-
:: DeveloperApi ::
Represents a one-to-one dependency between ranges of partitions in the parent and child RDDs.
- RangeDependency(RDD<T>, int, int, int) - Constructor for class org.apache.spark.RangeDependency
-
- RangePartitioner<K,V> - Class in org.apache.spark
-
A
Partitioner
that partitions sortable records by range into roughly
equal ranges.
- RangePartitioner(int, RDD<? extends Product2<K, V>>, boolean, Ordering<K>, ClassTag<K>) - Constructor for class org.apache.spark.RangePartitioner
-
- rank() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
- Rating - Class in org.apache.spark.mllib.recommendation
-
:: Experimental ::
A more compact class to represent a rating than Tuple3[Int, Int, Double].
- Rating(int, int, double) - Constructor for class org.apache.spark.mllib.recommendation.Rating
-
- rating() - Method in class org.apache.spark.mllib.recommendation.Rating
-
- rawSocketStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream from network source hostname:port, where data is received
as serialized blocks (serialized using the Spark's serializer) that can be directly
pushed into the block manager without deserializing them.
- rawSocketStream(String, int) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream from network source hostname:port, where data is received
as serialized blocks (serialized using the Spark's serializer) that can be directly
pushed into the block manager without deserializing them.
- rawSocketStream(String, int, StorageLevel, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
-
Create a input stream from network source hostname:port, where data is received
as serialized blocks (serialized using the Spark's serializer) that can be directly
pushed into the block manager without deserializing them.
- rdd() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
- rdd() - Method in class org.apache.spark.api.java.JavaPairRDD
-
- rdd() - Method in class org.apache.spark.api.java.JavaRDD
-
- rdd() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- rdd() - Method in class org.apache.spark.Dependency
-
- rdd() - Method in class org.apache.spark.NarrowDependency
-
- RDD<T> - Class in org.apache.spark.rdd
-
A Resilient Distributed Dataset (RDD), the basic abstraction in Spark.
- RDD(SparkContext, Seq<Dependency<?>>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.RDD
-
- RDD(RDD<?>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.RDD
-
Construct an RDD with just a one-to-one dependency on one parent
- rdd() - Method in class org.apache.spark.ShuffleDependency
-
- rdd() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
-
- rdd() - Method in class org.apache.spark.sql.execution.ExistingRdd
-
- RDD() - Static method in class org.apache.spark.storage.BlockId
-
- RDDBlockId - Class in org.apache.spark.storage
-
- RDDBlockId(int, int) - Constructor for class org.apache.spark.storage.RDDBlockId
-
- rddBlocks() - Method in class org.apache.spark.storage.StorageStatus
-
Return the RDD blocks stored in this block manager.
- rddBlocksById(int) - Method in class org.apache.spark.storage.StorageStatus
-
Return the blocks that belong to the given RDD stored in this block manager.
- rddId() - Method in class org.apache.spark.scheduler.SparkListenerUnpersistRDD
-
- rddId() - Method in class org.apache.spark.storage.RDDBlockId
-
- RDDInfo - Class in org.apache.spark.storage
-
- RDDInfo(int, String, int, StorageLevel) - Constructor for class org.apache.spark.storage.RDDInfo
-
- rddInfoList() - Method in class org.apache.spark.ui.storage.StorageListener
-
Filter RDD info to include only those with cached partitions
- rddInfos() - Method in class org.apache.spark.scheduler.StageInfo
-
- rdds() - Method in class org.apache.spark.rdd.CoGroupedRDD
-
- rdds() - Method in class org.apache.spark.rdd.UnionRDD
-
- rddStorageLevel(int) - Method in class org.apache.spark.storage.StorageStatus
-
Return the storage level, if any, used by the given RDD in this block manager.
- rddToAsyncRDDActions(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.SparkContext
-
- rddToOrderedRDDFunctions(RDD<Tuple2<K, V>>, Ordering<K>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.SparkContext
-
- rddToPairRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Static method in class org.apache.spark.SparkContext
-
- rddToSequenceFileRDDFunctions(RDD<Tuple2<K, V>>, Function1<K, Writable>, ClassTag<K>, Function1<V, Writable>, ClassTag<V>) - Static method in class org.apache.spark.SparkContext
-
- readExternal(ObjectInput) - Method in class org.apache.spark.serializer.JavaSerializer
-
- readExternal(ObjectInput) - Method in class org.apache.spark.storage.BlockManagerId
-
- readExternal(ObjectInput) - Method in class org.apache.spark.storage.StorageLevel
-
- readExternal(ObjectInput) - Method in class org.apache.spark.streaming.flume.SparkFlumeEvent
-
- readObject(ClassTag<T>) - Method in class org.apache.spark.serializer.DeserializationStream
-
- ready(Duration, CanAwait) - Method in class org.apache.spark.ComplexFutureAction
-
- ready(Duration, CanAwait) - Method in interface org.apache.spark.FutureAction
-
Blocks until this action completes.
- ready(Duration, CanAwait) - Method in class org.apache.spark.SimpleFutureAction
-
- reason() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
-
- recall(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns recall for a given label (category)
- recall() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns recall
(equals to precision for multiclass classifier
because sum of all false positives is equal to sum
of all false negatives)
- recallByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Returns the (threshold, recall) curve.
- receivedBlockInfo() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
-
- Receiver<T> - Class in org.apache.spark.streaming.receiver
-
:: DeveloperApi ::
Abstract class of a receiver that can be run on worker nodes to receive external data.
- Receiver(StorageLevel) - Constructor for class org.apache.spark.streaming.receiver.Receiver
-
- ReceiverInfo - Class in org.apache.spark.streaming.scheduler
-
:: DeveloperApi ::
Class having information about a receiver
- ReceiverInfo(int, String, ActorRef, boolean, String, String, String) - Constructor for class org.apache.spark.streaming.scheduler.ReceiverInfo
-
- receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverError
-
- receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStarted
-
- receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStopped
-
- receiverInputDStream() - Method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
-
- receiverInputDStream() - Method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
-
- ReceiverInputDStream<T> - Class in org.apache.spark.streaming.dstream
-
Abstract class for defining any
InputDStream
that has to start a receiver on worker nodes to receive external data.
- ReceiverInputDStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.ReceiverInputDStream
-
- receiverStream(Receiver<T>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream with any arbitrary user implemented receiver.
- receiverStream(Receiver<T>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
-
Create an input stream with any arbitrary user implemented receiver.
- recommendProducts(int, int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
Recommends products to a user.
- recommendUsers(int, int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
Recommends users to a product.
- reduce(Function2<T, T, T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Reduces the elements of this RDD using the specified commutative and associative binary
operator.
- reduce(Function2<T, T, T>) - Method in class org.apache.spark.rdd.RDD
-
Reduces the elements of this RDD using the specified commutative and
associative binary operator.
- reduce(Function2<T, T, T>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD has a single element generated by reducing each RDD
of this DStream.
- reduce(Function2<T, T, T>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD has a single element generated by reducing each RDD
of this DStream.
- reduceByKey(Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Merge the values for each key using an associative reduce function.
- reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Merge the values for each key using an associative reduce function.
- reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Merge the values for each key using an associative reduce function.
- reduceByKey(Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Merge the values for each key using an associative reduce function.
- reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Merge the values for each key using an associative reduce function.
- reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Merge the values for each key using an associative reduce function.
- reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying reduceByKey
to each RDD.
- reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying reduceByKey
to each RDD.
- reduceByKey(Function2<V, V, V>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying reduceByKey
to each RDD.
- reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying reduceByKey
to each RDD.
- reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying reduceByKey
to each RDD.
- reduceByKey(Function2<V, V, V>, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying reduceByKey
to each RDD.
- reduceByKeyAndWindow(Function2<V, V, V>, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Create a new DStream by applying reduceByKey
over a sliding window on this
DStream.
- reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying reduceByKey
over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying reduceByKey
over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying reduceByKey
over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by reducing over a using incremental computation.
- reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, int, Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying incremental reduceByKey
over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, Partitioner, Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying incremental reduceByKey
over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying reduceByKey
over a sliding window on this
DStream.
- reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying reduceByKey
over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying reduceByKey
over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying reduceByKey
over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, int, Function1<Tuple2<K, V>, Object>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying incremental reduceByKey
over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, Partitioner, Function1<Tuple2<K, V>, Object>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying incremental reduceByKey
over a sliding window.
- reduceByKeyLocally(Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Merge the values for each key using an associative reduce function, but return the results
immediately to the master as a Map.
- reduceByKeyLocally(Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Merge the values for each key using an associative reduce function, but return the results
immediately to the master as a Map.
- reduceByKeyToDriver(Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Alias for reduceByKeyLocally
- reduceByWindow(Function2<T, T, T>, Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD has a single element generated by reducing all
elements in a sliding window over this DStream.
- reduceByWindow(Function2<T, T, T>, Function2<T, T, T>, Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD has a single element generated by reducing all
elements in a sliding window over this DStream.
- reduceByWindow(Function2<T, T, T>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD has a single element generated by reducing all
elements in a sliding window over this DStream.
- reduceByWindow(Function2<T, T, T>, Function2<T, T, T>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD has a single element generated by reducing all
elements in a sliding window over this DStream.
- reduceId() - Method in class org.apache.spark.FetchFailed
-
- reduceId() - Method in class org.apache.spark.storage.ShuffleBlockId
-
- reduceId() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
-
- registerClasses(Kryo) - Method in interface org.apache.spark.serializer.KryoRegistrator
-
- registerRDDAsTable(JavaSchemaRDD, String) - Method in class org.apache.spark.sql.api.java.JavaSQLContext
-
Registers the given RDD as a temporary table in the catalog.
- registerRDDAsTable(SchemaRDD, String) - Method in class org.apache.spark.sql.SQLContext
-
Registers the given RDD as a temporary table in the catalog.
- registerTestTable(TestHiveContext.TestTable) - Method in class org.apache.spark.sql.hive.test.TestHiveContext
-
- Regression() - Static method in class org.apache.spark.mllib.tree.configuration.Algo
-
- RegressionModel - Interface in org.apache.spark.mllib.regression
-
- relation() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
-
- relation() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable
-
- relation() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
-
- remember(Duration) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Sets each DStreams in this context to remember RDDs it generated in the last given duration.
- remember(Duration) - Method in class org.apache.spark.streaming.StreamingContext
-
Set each DStreams in this context to remember RDDs it generated in the last given duration.
- rememberDuration() - Method in class org.apache.spark.streaming.dstream.DStream
-
- remove(String) - Method in class org.apache.spark.SparkConf
-
Remove a parameter from the configuration
- repartition(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a new RDD that has exactly numPartitions partitions.
- repartition(int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a new RDD that has exactly numPartitions partitions.
- repartition(int) - Method in class org.apache.spark.api.java.JavaRDD
-
Return a new RDD that has exactly numPartitions partitions.
- repartition(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD that has exactly numPartitions partitions.
- repartition(int) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
-
Return a new RDD that has exactly numPartitions
partitions.
- repartition(int, Ordering<Row>) - Method in class org.apache.spark.sql.SchemaRDD
-
- repartition(int) - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
Return a new DStream with an increased or decreased level of parallelism.
- repartition(int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream with an increased or decreased level of parallelism.
- repartition(int) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream with an increased or decreased level of parallelism.
- replication() - Method in class org.apache.spark.storage.StorageLevel
-
- reportError(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Report exceptions in receiving data.
- requiredChildDistribution() - Method in class org.apache.spark.sql.execution.Aggregate
-
- requiredChildDistribution() - Method in class org.apache.spark.sql.execution.BroadcastHashJoin
-
- requiredChildDistribution() - Method in class org.apache.spark.sql.execution.Distinct
-
- requiredChildDistribution() - Method in class org.apache.spark.sql.execution.GeneratedAggregate
-
- requiredChildDistribution() - Method in class org.apache.spark.sql.execution.HashOuterJoin
-
- requiredChildDistribution() - Method in class org.apache.spark.sql.execution.LeftSemiJoinHash
-
- requiredChildDistribution() - Method in class org.apache.spark.sql.execution.ShuffledHashJoin
-
- requiredChildDistribution() - Method in class org.apache.spark.sql.execution.Sort
-
- requiredChildDistribution() - Method in class org.apache.spark.sql.execution.SparkPlan
-
Specifies any partition requirements on the input data for this operator.
- reset() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
-
Resets the test instance by deleting any tables that have been created.
- restart(String) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Restart the receiver.
- restart(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Restart the receiver.
- restart(String, Throwable, int) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Restart the receiver.
- Resubmitted - Class in org.apache.spark
-
:: DeveloperApi ::
A ShuffleMapTask
that completed successfully earlier, but we
lost the executor before the stage completed.
- Resubmitted() - Constructor for class org.apache.spark.Resubmitted
-
- result(Duration, CanAwait) - Method in class org.apache.spark.ComplexFutureAction
-
- result(Duration, CanAwait) - Method in interface org.apache.spark.FutureAction
-
Awaits and returns the result (of type T) of this action.
- result(Duration, CanAwait) - Method in class org.apache.spark.SimpleFutureAction
-
- result() - Method in class org.apache.spark.sql.execution.AggregateEvaluation
-
- resultAttribute() - Method in class org.apache.spark.sql.execution.Aggregate.ComputedAggregate
-
- resultAttribute() - Method in class org.apache.spark.sql.execution.EvaluatePython
-
- resultSetToObjectArray(ResultSet) - Static method in class org.apache.spark.rdd.JdbcRDD
-
- retainedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- RidgeRegressionModel - Class in org.apache.spark.mllib.regression
-
Regression model trained using RidgeRegression.
- RidgeRegressionModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.RidgeRegressionModel
-
- RidgeRegressionWithSGD - Class in org.apache.spark.mllib.regression
-
Train a regression model with L2-regularization using Stochastic Gradient Descent.
- RidgeRegressionWithSGD() - Constructor for class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
-
Construct a RidgeRegression object with default parameters: {stepSize: 1.0, numIterations: 100,
regParam: 1.0, miniBatchFraction: 1.0}.
- right() - Method in class org.apache.spark.sql.execution.BroadcastHashJoin
-
- right() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
-
- right() - Method in class org.apache.spark.sql.execution.CartesianProduct
-
- right() - Method in class org.apache.spark.sql.execution.Except
-
- right() - Method in interface org.apache.spark.sql.execution.HashJoin
-
- right() - Method in class org.apache.spark.sql.execution.HashOuterJoin
-
- right() - Method in class org.apache.spark.sql.execution.Intersect
-
- right() - Method in class org.apache.spark.sql.execution.LeftSemiJoinBNL
-
The Broadcast relation
- right() - Method in class org.apache.spark.sql.execution.LeftSemiJoinHash
-
- right() - Method in class org.apache.spark.sql.execution.ShuffledHashJoin
-
- rightImpurity() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
-
- rightKeys() - Method in class org.apache.spark.sql.execution.BroadcastHashJoin
-
- rightKeys() - Method in interface org.apache.spark.sql.execution.HashJoin
-
- rightKeys() - Method in class org.apache.spark.sql.execution.HashOuterJoin
-
- rightKeys() - Method in class org.apache.spark.sql.execution.LeftSemiJoinHash
-
- rightKeys() - Method in class org.apache.spark.sql.execution.ShuffledHashJoin
-
- rightNode() - Method in class org.apache.spark.mllib.tree.model.Node
-
- rightOuterJoin(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Perform a right outer join of this
and other
.
- rightOuterJoin(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Perform a right outer join of this
and other
.
- rightOuterJoin(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Perform a right outer join of this
and other
.
- rightOuterJoin(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Perform a right outer join of this
and other
.
- rightOuterJoin(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Perform a right outer join of this
and other
.
- rightOuterJoin(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Perform a right outer join of this
and other
.
- rightOuterJoin(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'right outer join' between RDDs of this
DStream and
other
DStream.
- rightOuterJoin(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'right outer join' between RDDs of this
DStream and
other
DStream.
- rightOuterJoin(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'right outer join' between RDDs of this
DStream and
other
DStream.
- rightOuterJoin(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'right outer join' between RDDs of this
DStream and
other
DStream.
- rightOuterJoin(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'right outer join' between RDDs of this
DStream and
other
DStream.
- rightOuterJoin(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'right outer join' between RDDs of this
DStream and
other
DStream.
- rng() - Method in class org.apache.spark.util.random.BernoulliSampler
-
- rng() - Method in class org.apache.spark.util.random.PoissonSampler
-
- roc() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Returns the receiver operating characteristic (ROC) curve,
which is an RDD of (false positive rate, true positive rate)
with (0.0, 0.0) prepended and (1.0, 1.0) appended to it.
- Row - Class in org.apache.spark.sql.api.java
-
A result row from a SparkSQL query.
- Row(Row) - Constructor for class org.apache.spark.sql.api.java.Row
-
- row() - Method in class org.apache.spark.sql.api.java.Row
-
- RowMatrix - Class in org.apache.spark.mllib.linalg.distributed
-
:: Experimental ::
Represents a row-oriented distributed Matrix with no meaningful row indices.
- RowMatrix(RDD<Vector>, long, int) - Constructor for class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
- RowMatrix(RDD<Vector>) - Constructor for class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Alternative constructor leaving matrix dimensions to be determined automatically.
- rows() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
- rows() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
- run(Function0<T>, ExecutionContext) - Method in class org.apache.spark.ComplexFutureAction
-
Executes some action enclosed in the closure.
- run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.classification.NaiveBayes
-
Run the algorithm with the configured parameters on an input RDD of LabeledPoint entries.
- run(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeans
-
Train a K-means model on the given set of points; data
should be cached for high
performance, because this is an iterative algorithm.
- run(RDD<Rating>) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Run ALS with the configured parameters on an input RDD of (user, product, rating) triples.
- run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
-
Run the algorithm with the configured parameters on an input
RDD of LabeledPoint entries.
- run(RDD<LabeledPoint>, Vector) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
-
Run the algorithm with the configured parameters on an input RDD
of LabeledPoint entries starting from the initial weights provided.
- runApproximateJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, <any>, long) - Method in class org.apache.spark.SparkContext
-
:: DeveloperApi ::
Run a job that can return approximate results.
- runJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, Function2<Object, U, BoxedUnit>, Function0<R>) - Method in class org.apache.spark.ComplexFutureAction
-
Runs a Spark job.
- runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Seq<Object>, boolean, Function2<Object, U, BoxedUnit>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
-
Run a function on a given set of partitions in an RDD and pass the results to the given
handler function.
- runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Seq<Object>, boolean, ClassTag<U>) - Method in class org.apache.spark.SparkContext
-
Run a function on a given set of partitions in an RDD and return the results as an array.
- runJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, boolean, ClassTag<U>) - Method in class org.apache.spark.SparkContext
-
Run a job on a given set of partitions of an RDD, but take a function of type
Iterator[T] => U
instead of (TaskContext, Iterator[T]) => U
.
- runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
-
Run a job on all partitions in an RDD and return the results in an array.
- runJob(RDD<T>, Function1<Iterator<T>, U>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
-
Run a job on all partitions in an RDD and return the results in an array.
- runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Function2<Object, U, BoxedUnit>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
-
Run a job on all partitions in an RDD and pass the results to a handler function.
- runJob(RDD<T>, Function1<Iterator<T>, U>, Function2<Object, U, BoxedUnit>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
-
Run a job on all partitions in an RDD and pass the results to a handler function.
- runLBFGS(RDD<Tuple2<Object, Vector>>, Gradient, Updater, int, double, int, double, Vector) - Static method in class org.apache.spark.mllib.optimization.LBFGS
-
Run Limited-memory BFGS (L-BFGS) in parallel.
- runMiniBatchSGD(RDD<Tuple2<Object, Vector>>, Gradient, Updater, double, int, double, double, Vector) - Static method in class org.apache.spark.mllib.optimization.GradientDescent
-
Run stochastic gradient descent (SGD) in parallel using mini batches.
- running() - Method in class org.apache.spark.scheduler.TaskInfo
-
- runningLocally() - Method in class org.apache.spark.TaskContext
-
- runSqlHive(String) - Method in class org.apache.spark.sql.hive.test.TestHiveContext
-
- s() - Method in class org.apache.spark.mllib.linalg.SingularValueDecomposition
-
- sample(boolean, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a sampled subset of this RDD.
- sample(boolean, Double, long) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a sampled subset of this RDD.
- sample(boolean, double) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a sampled subset of this RDD.
- sample(boolean, double, long) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a sampled subset of this RDD.
- sample(boolean, double) - Method in class org.apache.spark.api.java.JavaRDD
-
Return a sampled subset of this RDD.
- sample(boolean, double, long) - Method in class org.apache.spark.api.java.JavaRDD
-
Return a sampled subset of this RDD.
- sample(boolean, double, long) - Method in class org.apache.spark.rdd.RDD
-
Return a sampled subset of this RDD.
- Sample - Class in org.apache.spark.sql.execution
-
:: DeveloperApi ::
- Sample(double, boolean, long, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Sample
-
- sample(boolean, double, long) - Method in class org.apache.spark.sql.SchemaRDD
-
:: Experimental ::
Returns a sampled version of the underlying dataset.
- sample(Iterator<T>) - Method in class org.apache.spark.util.random.BernoulliSampler
-
- sample(Iterator<T>) - Method in class org.apache.spark.util.random.PoissonSampler
-
- sample(Iterator<T>) - Method in interface org.apache.spark.util.random.RandomSampler
-
take a random sample
- sampleByKey(boolean, Map<K, Object>, long) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a subset of this RDD sampled by key (via stratified sampling).
- sampleByKey(boolean, Map<K, Object>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a subset of this RDD sampled by key (via stratified sampling).
- sampleByKey(boolean, Map<K, Object>, long) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return a subset of this RDD sampled by key (via stratified sampling).
- sampleByKeyExact(boolean, Map<K, Object>, long) - Method in class org.apache.spark.api.java.JavaPairRDD
-
::Experimental::
Return a subset of this RDD sampled by key (via stratified sampling) containing exactly
math.ceil(numItems * samplingRate) for each stratum (group of pairs with the same key).
- sampleByKeyExact(boolean, Map<K, Object>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
::Experimental::
Return a subset of this RDD sampled by key (via stratified sampling) containing exactly
math.ceil(numItems * samplingRate) for each stratum (group of pairs with the same key).
- sampleByKeyExact(boolean, Map<K, Object>, long) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
::Experimental::
Return a subset of this RDD sampled by key (via stratified sampling) containing exactly
math.ceil(numItems * samplingRate) for each stratum (group of pairs with the same key).
- sampleStdev() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Compute the sample standard deviation of this RDD's elements (which corrects for bias in
estimating the standard deviation by dividing by N-1 instead of N).
- sampleStdev() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
-
Compute the sample standard deviation of this RDD's elements (which corrects for bias in
estimating the standard deviation by dividing by N-1 instead of N).
- sampleStdev() - Method in class org.apache.spark.util.StatCounter
-
Return the sample standard deviation of the values, which corrects for bias in estimating the
variance by dividing by N-1 instead of N.
- sampleVariance() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Compute the sample variance of this RDD's elements (which corrects for bias in
estimating the standard variance by dividing by N-1 instead of N).
- sampleVariance() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
-
Compute the sample variance of this RDD's elements (which corrects for bias in
estimating the variance by dividing by N-1 instead of N).
- sampleVariance() - Method in class org.apache.spark.util.StatCounter
-
Return the sample variance, which corrects for bias in estimating the variance by dividing
by N-1 instead of N.
- saveAsHadoopDataset(JobConf) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Output the RDD to any Hadoop-supported storage system, using a Hadoop JobConf object for
that storage system.
- saveAsHadoopDataset(JobConf) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Output the RDD to any Hadoop-supported storage system, using a Hadoop JobConf object for
that storage system.
- saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>, JobConf) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Output the RDD to any Hadoop-supported file system.
- saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Output the RDD to any Hadoop-supported file system.
- saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>, Class<? extends CompressionCodec>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Output the RDD to any Hadoop-supported file system, compressing with the supplied codec.
- saveAsHadoopFile(String, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat
class
supporting the key and value types K and V in this RDD.
- saveAsHadoopFile(String, Class<? extends CompressionCodec>, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat
class
supporting the key and value types K and V in this RDD.
- saveAsHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Class<? extends CompressionCodec>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat
class
supporting the key and value types K and V in this RDD.
- saveAsHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, JobConf, Option<Class<? extends CompressionCodec>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat
class
supporting the key and value types K and V in this RDD.
- saveAsHadoopFiles(String, String) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Save each RDD in this
DStream as a Hadoop file.
- saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Save each RDD in this
DStream as a Hadoop file.
- saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, JobConf) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Save each RDD in this
DStream as a Hadoop file.
- saveAsHadoopFiles(String, String, ClassTag<F>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Save each RDD in this
DStream as a Hadoop file.
- saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, JobConf) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Save each RDD in this
DStream as a Hadoop file.
- saveAsHiveFile(RDD<Writable>, Class<?>, FileSinkDesc, JobConf, boolean) - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
-
- saveAsLibSVMFile(RDD<LabeledPoint>, String) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Save labeled data in LIBSVM format.
- saveAsNewAPIHadoopDataset(Configuration) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Output the RDD to any Hadoop-supported storage system, using
a Configuration object for that storage system.
- saveAsNewAPIHadoopDataset(Configuration) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Output the RDD to any Hadoop-supported storage system with new Hadoop API, using a Hadoop
Configuration object for that storage system.
- saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<F>, Configuration) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Output the RDD to any Hadoop-supported file system.
- saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Output the RDD to any Hadoop-supported file system.
- saveAsNewAPIHadoopFile(String, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Output the RDD to any Hadoop-supported file system, using a new Hadoop API OutputFormat
(mapreduce.OutputFormat) object supporting the key and value types K and V in this RDD.
- saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Configuration) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Output the RDD to any Hadoop-supported file system, using a new Hadoop API OutputFormat
(mapreduce.OutputFormat) object supporting the key and value types K and V in this RDD.
- saveAsNewAPIHadoopFiles(String, String) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Save each RDD in this
DStream as a Hadoop file.
- saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Save each RDD in this
DStream as a Hadoop file.
- saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Configuration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Save each RDD in this
DStream as a Hadoop file.
- saveAsNewAPIHadoopFiles(String, String, ClassTag<F>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Save each RDD in this
DStream as a Hadoop file.
- saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Configuration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Save each RDD in this
DStream as a Hadoop file.
- saveAsObjectFile(String) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Save this RDD as a SequenceFile of serialized objects.
- saveAsObjectFile(String) - Method in class org.apache.spark.rdd.RDD
-
Save this RDD as a SequenceFile of serialized objects.
- saveAsObjectFiles(String, String) - Method in class org.apache.spark.streaming.dstream.DStream
-
Save each RDD in this DStream as a Sequence file of serialized objects.
- saveAsSequenceFile(String, Option<Class<? extends CompressionCodec>>) - Method in class org.apache.spark.rdd.SequenceFileRDDFunctions
-
Output the RDD as a Hadoop SequenceFile using the Writable types we infer from the RDD's key
and value types.
- saveAsTextFile(String) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Save this RDD as a text file, using string representations of elements.
- saveAsTextFile(String, Class<? extends CompressionCodec>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Save this RDD as a compressed text file, using string representations of elements.
- saveAsTextFile(String) - Method in class org.apache.spark.rdd.RDD
-
Save this RDD as a text file, using string representations of elements.
- saveAsTextFile(String, Class<? extends CompressionCodec>) - Method in class org.apache.spark.rdd.RDD
-
Save this RDD as a compressed text file, using string representations of elements.
- saveAsTextFiles(String, String) - Method in class org.apache.spark.streaming.dstream.DStream
-
Save each RDD in this DStream as at text file, using string representation
of elements.
- saveLabeledData(RDD<LabeledPoint>, String) - Static method in class org.apache.spark.mllib.util.MLUtils
-
- sc() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- sc() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
- sc() - Method in class org.apache.spark.streaming.StreamingContext
-
- scalaIntToJavaLong(DStream<Object>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
- scalaToJavaLong(JavaPairDStream<K, Object>, ClassTag<K>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
- scheduler() - Method in class org.apache.spark.streaming.StreamingContext
-
- schedulingDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
-
Time taken for the first job of this batch to start processing from the time this batch
was submitted to the streaming scheduler.
- SchedulingMode - Class in org.apache.spark.scheduler
-
"FAIR" and "FIFO" determines which policy is used
to order tasks amongst a Schedulable's sub-queues
"NONE" is used when the a Schedulable has no sub-queues.
- SchedulingMode() - Constructor for class org.apache.spark.scheduler.SchedulingMode
-
- schedulingMode() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- schema() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
-
Returns the schema of this JavaSchemaRDD (represented by a StructType).
- schema() - Method in class org.apache.spark.sql.execution.AggregateEvaluation
-
- schema() - Method in class org.apache.spark.sql.SchemaRDD
-
Returns the schema of this SchemaRDD (represented by a StructType
).
- SchemaRDD - Class in org.apache.spark.sql
-
:: AlphaComponent ::
An RDD of Row
objects that has an associated schema.
- SchemaRDD(SQLContext, LogicalPlan) - Constructor for class org.apache.spark.sql.SchemaRDD
-
- script() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
-
- ScriptTransformation - Class in org.apache.spark.sql.hive.execution
-
:: DeveloperApi ::
Transforms the input by forking and running the specified script.
- ScriptTransformation(Seq<Expression>, String, Seq<Attribute>, SparkPlan, HiveContext) - Constructor for class org.apache.spark.sql.hive.execution.ScriptTransformation
-
- seconds() - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- Seconds - Class in org.apache.spark.streaming
-
Helper object that creates instance of
Duration
representing
a given number of seconds.
- Seconds() - Constructor for class org.apache.spark.streaming.Seconds
-
- securityManager() - Method in class org.apache.spark.SparkEnv
-
- seed() - Method in class org.apache.spark.sql.execution.Sample
-
- select(Seq<Expression>) - Method in class org.apache.spark.sql.SchemaRDD
-
Changes the output of this relation to the given expressions, similar to the SELECT
clause
in SQL.
- sequenceFile(String, Class<K>, Class<V>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Get an RDD for a Hadoop SequenceFile with given key and value types.
- sequenceFile(String, Class<K>, Class<V>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Get an RDD for a Hadoop SequenceFile.
- sequenceFile(String, Class<K>, Class<V>, int) - Method in class org.apache.spark.SparkContext
-
Get an RDD for a Hadoop SequenceFile with given key and value types.
- sequenceFile(String, Class<K>, Class<V>) - Method in class org.apache.spark.SparkContext
-
Get an RDD for a Hadoop SequenceFile with given key and value types.
- sequenceFile(String, int, ClassTag<K>, ClassTag<V>, Function0<WritableConverter<K>>, Function0<WritableConverter<V>>) - Method in class org.apache.spark.SparkContext
-
Version of sequenceFile() for types implicitly convertible to Writables through a
WritableConverter.
- SequenceFileRDDFunctions<K,V> - Class in org.apache.spark.rdd
-
Extra functions available on RDDs of (key, value) pairs to create a Hadoop SequenceFile,
through an implicit conversion.
- SequenceFileRDDFunctions(RDD<Tuple2<K, V>>, Function1<K, Writable>, ClassTag<K>, Function1<V, Writable>, ClassTag<V>) - Constructor for class org.apache.spark.rdd.SequenceFileRDDFunctions
-
- SerializableWritable<T extends org.apache.hadoop.io.Writable> - Class in org.apache.spark
-
- SerializableWritable(T) - Constructor for class org.apache.spark.SerializableWritable
-
- SerializationStream - Class in org.apache.spark.serializer
-
:: DeveloperApi ::
A stream for writing serialized objects.
- SerializationStream() - Constructor for class org.apache.spark.serializer.SerializationStream
-
- serialize(T, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializerInstance
-
- serialize(Object, ObjectInspector) - Method in class org.apache.spark.sql.hive.parquet.FakeParquetSerDe
-
- Serializer - Class in org.apache.spark.serializer
-
:: DeveloperApi ::
A serializer.
- Serializer() - Constructor for class org.apache.spark.serializer.Serializer
-
- serializer() - Method in class org.apache.spark.ShuffleDependency
-
- serializer() - Method in class org.apache.spark.SparkEnv
-
- SerializerInstance - Class in org.apache.spark.serializer
-
:: DeveloperApi ::
An instance of a serializer, for use by one thread at a time.
- SerializerInstance() - Constructor for class org.apache.spark.serializer.SerializerInstance
-
- serializeStream(OutputStream) - Method in class org.apache.spark.serializer.SerializerInstance
-
- set(String, String) - Method in class org.apache.spark.SparkConf
-
Set a configuration variable.
- set(SparkEnv) - Static method in class org.apache.spark.SparkEnv
-
- setAggregator(Aggregator<K, V, C>) - Method in class org.apache.spark.rdd.ShuffledRDD
-
Set aggregator for RDD's shuffle.
- setAll(Traversable<Tuple2<String, String>>) - Method in class org.apache.spark.SparkConf
-
Set multiple parameters together
- setAlpha(double) - Method in class org.apache.spark.mllib.recommendation.ALS
-
:: Experimental ::
Sets the constant used in computing confidence in implicit ALS.
- setAppName(String) - Method in class org.apache.spark.SparkConf
-
Set a name for your application.
- setBlocks(int) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Set the number of blocks for both user blocks and product blocks to parallelize the computation
into; pass -1 for an auto-configured number of blocks.
- setCallSite(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Pass-through to SparkContext.setCallSite.
- setCallSite(String) - Method in class org.apache.spark.SparkContext
-
Set the thread-local property for overriding the call sites
of actions and RDDs.
- setCheckpointDir(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Set the directory under which RDDs are going to be checkpointed.
- setCheckpointDir(String) - Method in class org.apache.spark.SparkContext
-
Set the directory under which RDDs are going to be checkpointed.
- SetCommand - Class in org.apache.spark.sql.execution
-
:: DeveloperApi ::
- SetCommand(Option<String>, Option<String>, Seq<Attribute>, SQLContext) - Constructor for class org.apache.spark.sql.execution.SetCommand
-
- setConf(String, String) - Method in class org.apache.spark.sql.hive.HiveContext
-
- setConvergenceTol(double) - Method in class org.apache.spark.mllib.optimization.LBFGS
-
Set the convergence tolerance of iterations for L-BFGS.
- setDefaultClassLoader(ClassLoader) - Method in class org.apache.spark.serializer.Serializer
-
Sets a class loader for the serializer to use in deserialization.
- setEpsilon(double) - Method in class org.apache.spark.mllib.clustering.KMeans
-
Set the distance threshold within which we've consider centers to have converged.
- setExecutorEnv(String, String) - Method in class org.apache.spark.SparkConf
-
Set an environment variable to be used when launching executors for this application.
- setExecutorEnv(Seq<Tuple2<String, String>>) - Method in class org.apache.spark.SparkConf
-
Set multiple environment variables to be used when launching executors.
- setExecutorEnv(Tuple2<String, String>[]) - Method in class org.apache.spark.SparkConf
-
Set multiple environment variables to be used when launching executors.
- setGradient(Gradient) - Method in class org.apache.spark.mllib.optimization.GradientDescent
-
Set the gradient function (of the loss function of one single data example)
to be used for SGD.
- setGradient(Gradient) - Method in class org.apache.spark.mllib.optimization.LBFGS
-
Set the gradient function (of the loss function of one single data example)
to be used for L-BFGS.
- setIfMissing(String, String) - Method in class org.apache.spark.SparkConf
-
Set a parameter if it isn't already configured
- setImplicitPrefs(boolean) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Sets whether to use implicit preference.
- setInitializationMode(String) - Method in class org.apache.spark.mllib.clustering.KMeans
-
Set the initialization algorithm.
- setInitializationSteps(int) - Method in class org.apache.spark.mllib.clustering.KMeans
-
Set the number of steps for the k-means|| initialization mode.
- setInitialWeights(Vector) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
-
Set the initial weights.
- setIntercept(boolean) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
-
Set if the algorithm should add an intercept.
- setIntermediateRDDStorageLevel(StorageLevel) - Method in class org.apache.spark.mllib.recommendation.ALS
-
:: DeveloperApi ::
Sets storage level for intermediate RDDs (user/product in/out links).
- setIterations(int) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Set the number of iterations to run.
- setJars(Seq<String>) - Method in class org.apache.spark.SparkConf
-
Set JAR files to distribute to the cluster.
- setJars(String[]) - Method in class org.apache.spark.SparkConf
-
Set JAR files to distribute to the cluster.
- setJobDescription(String) - Method in class org.apache.spark.SparkContext
-
Set a human readable description of the current job.
- setJobGroup(String, String, boolean) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Assigns a group ID to all the jobs started by this thread until the group ID is set to a
different value or cleared.
- setJobGroup(String, String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Assigns a group ID to all the jobs started by this thread until the group ID is set to a
different value or cleared.
- setJobGroup(String, String, boolean) - Method in class org.apache.spark.SparkContext
-
Assigns a group ID to all the jobs started by this thread until the group ID is set to a
different value or cleared.
- setK(int) - Method in class org.apache.spark.mllib.clustering.KMeans
-
Set the number of clusters to create (k).
- setKeyOrdering(Ordering<K>) - Method in class org.apache.spark.rdd.ShuffledRDD
-
Set key ordering for RDD's shuffle.
- setLambda(double) - Method in class org.apache.spark.mllib.classification.NaiveBayes
-
Set the smoothing parameter.
- setLambda(double) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Set the regularization parameter, lambda.
- setLearningRate(double) - Method in class org.apache.spark.mllib.feature.Word2Vec
-
Sets initial learning rate (default: 0.025).
- setLocalProperty(String, String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Set a local property that affects jobs submitted from this thread, such as the
Spark fair scheduler pool.
- setLocalProperty(String, String) - Method in class org.apache.spark.SparkContext
-
Set a local property that affects jobs submitted from this thread, such as the
Spark fair scheduler pool.
- setMapSideCombine(boolean) - Method in class org.apache.spark.rdd.ShuffledRDD
-
Set mapSideCombine flag for RDD's shuffle.
- setMaster(String) - Method in class org.apache.spark.SparkConf
-
The master URL to connect to, such as "local" to run locally with one thread, "local[4]" to
run locally with 4 cores, or "spark://master:7077" to run on a Spark standalone cluster.
- setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.KMeans
-
Set maximum number of iterations to run.
- setMaxNumIterations(int) - Method in class org.apache.spark.mllib.optimization.LBFGS
-
- setMiniBatchFraction(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent
-
:: Experimental ::
Set fraction of data to be used for each SGD iteration.
- setMiniBatchFraction(double) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
-
Set the fraction of each batch to use for updates.
- setName(String) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Assign a name to this RDD
- setName(String) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Assign a name to this RDD
- setName(String) - Method in class org.apache.spark.api.java.JavaRDD
-
Assign a name to this RDD
- setName(String) - Method in class org.apache.spark.rdd.RDD
-
Assign a name to this RDD
- setName(String) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
-
Assign a name to this RDD
- setNonnegative(boolean) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Set whether the least-squares problems solved at each iteration should have
nonnegativity constraints.
- setNumCorrections(int) - Method in class org.apache.spark.mllib.optimization.LBFGS
-
Set the number of corrections used in the LBFGS update.
- setNumIterations(int) - Method in class org.apache.spark.mllib.feature.Word2Vec
-
Sets number of iterations (default: 1), which should be smaller than or equal to number of
partitions.
- setNumIterations(int) - Method in class org.apache.spark.mllib.optimization.GradientDescent
-
Set the number of iterations for SGD.
- setNumIterations(int) - Method in class org.apache.spark.mllib.optimization.LBFGS
-
Set the maximal number of iterations for L-BFGS.
- setNumIterations(int) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
-
Set the number of iterations of gradient descent to run per update.
- setNumPartitions(int) - Method in class org.apache.spark.mllib.feature.Word2Vec
-
Sets number of partitions (default: 1).
- setProductBlocks(int) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Set the number of product blocks to parallelize the computation.
- setRank(int) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Set the rank of the feature matrices computed (number of features).
- setRegParam(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent
-
Set the regularization parameter.
- setRegParam(double) - Method in class org.apache.spark.mllib.optimization.LBFGS
-
Set the regularization parameter.
- setRuns(int) - Method in class org.apache.spark.mllib.clustering.KMeans
-
:: Experimental ::
Set the number of runs of the algorithm to execute in parallel.
- setSeed(long) - Method in class org.apache.spark.mllib.feature.Word2Vec
-
Sets random seed (default: a random long integer).
- setSeed(long) - Method in class org.apache.spark.mllib.random.PoissonGenerator
-
- setSeed(long) - Method in class org.apache.spark.mllib.random.StandardNormalGenerator
-
- setSeed(long) - Method in class org.apache.spark.mllib.random.UniformGenerator
-
- setSeed(long) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Sets a random seed to have deterministic results.
- setSeed(long) - Method in class org.apache.spark.util.random.BernoulliSampler
-
- setSeed(long) - Method in class org.apache.spark.util.random.PoissonSampler
-
- setSeed(long) - Method in interface org.apache.spark.util.random.Pseudorandom
-
Set random seed.
- setSerializer(Serializer) - Method in class org.apache.spark.rdd.CoGroupedRDD
-
Set a serializer for this RDD's shuffle, or null to use the default (spark.serializer)
- setSerializer(Serializer) - Method in class org.apache.spark.rdd.ShuffledRDD
-
Set a serializer for this RDD's shuffle, or null to use the default (spark.serializer)
- setSparkHome(String) - Method in class org.apache.spark.SparkConf
-
Set the location where Spark is installed on worker nodes.
- setStepSize(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent
-
Set the initial step size of SGD for the first step.
- setStepSize(double) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
-
Set the step size for gradient descent.
- setThreshold(double) - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
-
:: Experimental ::
Sets the threshold that separates positive predictions from negative predictions.
- setThreshold(double) - Method in class org.apache.spark.mllib.classification.SVMModel
-
:: Experimental ::
Sets the threshold that separates positive predictions from negative predictions.
- settings() - Method in class org.apache.spark.SparkConf
-
- setUpdater(Updater) - Method in class org.apache.spark.mllib.optimization.GradientDescent
-
Set the updater function to actually perform a gradient step in a given direction.
- setUpdater(Updater) - Method in class org.apache.spark.mllib.optimization.LBFGS
-
Set the updater function to actually perform a gradient step in a given direction.
- setUserBlocks(int) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Set the number of user blocks to parallelize the computation.
- setValidateData(boolean) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
-
Set if the algorithm should validate data before training.
- setValue(R) - Method in class org.apache.spark.Accumulable
-
Set the accumulator's value; only allowed on master
- setVectorSize(int) - Method in class org.apache.spark.mllib.feature.Word2Vec
-
Sets vector size (default: 100).
- shortCompressionCodecNames() - Method in interface org.apache.spark.io.CompressionCodec
-
- ShortType - Static variable in class org.apache.spark.sql.api.java.DataType
-
Gets the ShortType object.
- ShortType - Class in org.apache.spark.sql.api.java
-
The data type representing short and Short values.
- showBytesDistribution(String, Function2<TaskInfo, TaskMetrics, Option<Object>>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showBytesDistribution(String, Option<org.apache.spark.util.Distribution>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showBytesDistribution(String, org.apache.spark.util.Distribution) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showDistribution(String, org.apache.spark.util.Distribution, Function1<Object, String>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showDistribution(String, Option<org.apache.spark.util.Distribution>, Function1<Object, String>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showDistribution(String, Option<org.apache.spark.util.Distribution>, String) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showDistribution(String, String, Function2<TaskInfo, TaskMetrics, Option<Object>>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showMillisDistribution(String, Option<org.apache.spark.util.Distribution>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showMillisDistribution(String, Function2<TaskInfo, TaskMetrics, Option<Object>>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showMillisDistribution(String, Function1<BatchInfo, Option<Object>>) - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
-
- SHUFFLE() - Static method in class org.apache.spark.storage.BlockId
-
- SHUFFLE_INDEX() - Static method in class org.apache.spark.storage.BlockId
-
- ShuffleBlockId - Class in org.apache.spark.storage
-
- ShuffleBlockId(int, int, int) - Constructor for class org.apache.spark.storage.ShuffleBlockId
-
- ShuffleDependency<K,V,C> - Class in org.apache.spark
-
:: DeveloperApi ::
Represents a dependency on the output of a shuffle stage.
- ShuffleDependency(RDD<? extends Product2<K, V>>, Partitioner, Option<Serializer>, Option<Ordering<K>>, Option<Aggregator<K, V, C>>, boolean) - Constructor for class org.apache.spark.ShuffleDependency
-
- ShuffledHashJoin - Class in org.apache.spark.sql.execution
-
:: DeveloperApi ::
Performs an inner hash join of two child relations by first shuffling the data using the join
keys.
- ShuffledHashJoin(Seq<Expression>, Seq<Expression>, BuildSide, SparkPlan, SparkPlan) - Constructor for class org.apache.spark.sql.execution.ShuffledHashJoin
-
- ShuffledRDD<K,V,C> - Class in org.apache.spark.rdd
-
:: DeveloperApi ::
The resulting RDD from a shuffle (e.g.
- ShuffledRDD(RDD<? extends Product2<K, V>>, Partitioner) - Constructor for class org.apache.spark.rdd.ShuffledRDD
-
- shuffleHandle() - Method in class org.apache.spark.ShuffleDependency
-
- shuffleId() - Method in class org.apache.spark.FetchFailed
-
- shuffleId() - Method in class org.apache.spark.ShuffleDependency
-
- shuffleId() - Method in class org.apache.spark.storage.ShuffleBlockId
-
- shuffleId() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
-
- ShuffleIndexBlockId - Class in org.apache.spark.storage
-
- ShuffleIndexBlockId(int, int, int) - Constructor for class org.apache.spark.storage.ShuffleIndexBlockId
-
- shuffleManager() - Method in class org.apache.spark.SparkEnv
-
- shuffleMemoryManager() - Method in class org.apache.spark.SparkEnv
-
- sideEffectResult() - Method in interface org.apache.spark.sql.execution.Command
-
A concrete command should override this lazy field to wrap up any side effects caused by the
command or any other computation that should be evaluated exactly once.
- SimpleFutureAction<T> - Class in org.apache.spark
-
:: Experimental ::
A
FutureAction
holding the result of an action that triggers a single job.
- SimpleUpdater - Class in org.apache.spark.mllib.optimization
-
:: DeveloperApi ::
A simple updater for gradient descent *without* any regularization.
- SimpleUpdater() - Constructor for class org.apache.spark.mllib.optimization.SimpleUpdater
-
- SingularValueDecomposition<UType,VType> - Class in org.apache.spark.mllib.linalg
-
:: Experimental ::
Represents singular value decomposition (SVD) factors.
- SingularValueDecomposition(UType, Vector, VType) - Constructor for class org.apache.spark.mllib.linalg.SingularValueDecomposition
-
- size() - Method in class org.apache.spark.mllib.linalg.DenseVector
-
- size() - Method in class org.apache.spark.mllib.linalg.SparseVector
-
- size() - Method in interface org.apache.spark.mllib.linalg.Vector
-
Size of the vector.
- sketch(RDD<K>, int, ClassTag<K>) - Static method in class org.apache.spark.RangePartitioner
-
Sketches the input RDD via reservoir sampling on each partition.
- slice(Time, Time) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return all the RDDs between 'fromDuration' to 'toDuration' (both included)
- slice(org.apache.spark.streaming.Interval) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return all the RDDs defined by the Interval object (both end times included)
- slice(Time, Time) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return all the RDDs between 'fromTime' to 'toTime' (both included)
- slideDuration() - Method in class org.apache.spark.streaming.dstream.DStream
-
Time interval after which the DStream generates a RDD
- slideDuration() - Method in class org.apache.spark.streaming.dstream.InputDStream
-
- SnappyCompressionCodec - Class in org.apache.spark.io
-
- SnappyCompressionCodec(SparkConf) - Constructor for class org.apache.spark.io.SnappyCompressionCodec
-
- socketStream(String, int, Function<InputStream, Iterable<T>>, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream from network source hostname:port.
- socketStream(String, int, Function1<InputStream, Iterator<T>>, StorageLevel, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
-
Create a input stream from TCP source hostname:port.
- socketTextStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream from network source hostname:port.
- socketTextStream(String, int) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream from network source hostname:port.
- socketTextStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.StreamingContext
-
Create a input stream from TCP source hostname:port.
- solveLeastSquares(DoubleMatrix, DoubleMatrix, org.apache.spark.mllib.optimization.NNLS.Workspace) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Given A^T A and A^T b, find the x minimising ||Ax - b||_2, possibly subject
to nonnegativity constraints if nonnegative
is true.
- Sort() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
-
- Sort - Class in org.apache.spark.sql.execution
-
:: DeveloperApi ::
- Sort(Seq<SortOrder>, boolean, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Sort
-
- sortBy(Function<T, S>, boolean, int) - Method in class org.apache.spark.api.java.JavaRDD
-
Return this RDD sorted by the given key function.
- sortBy(Function1<T, K>, boolean, int, Ordering<K>, ClassTag<K>) - Method in class org.apache.spark.rdd.RDD
-
Return this RDD sorted by the given key function.
- sortByKey() - Method in class org.apache.spark.api.java.JavaPairRDD
-
Sort the RDD by key, so that each partition contains a sorted range of the elements in
ascending order.
- sortByKey(boolean) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Sort the RDD by key, so that each partition contains a sorted range of the elements.
- sortByKey(boolean, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Sort the RDD by key, so that each partition contains a sorted range of the elements.
- sortByKey(Comparator<K>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Sort the RDD by key, so that each partition contains a sorted range of the elements.
- sortByKey(Comparator<K>, boolean) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Sort the RDD by key, so that each partition contains a sorted range of the elements.
- sortByKey(Comparator<K>, boolean, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Sort the RDD by key, so that each partition contains a sorted range of the elements.
- sortByKey(boolean, int) - Method in class org.apache.spark.rdd.OrderedRDDFunctions
-
Sort the RDD by key, so that each partition contains a sorted range of the elements.
- sortOrder() - Method in class org.apache.spark.sql.execution.Sort
-
- sortOrder() - Method in class org.apache.spark.sql.execution.TakeOrdered
-
- SPARK_JOB_DESCRIPTION() - Static method in class org.apache.spark.SparkContext
-
- SPARK_JOB_GROUP_ID() - Static method in class org.apache.spark.SparkContext
-
- SPARK_JOB_INTERRUPT_ON_CANCEL() - Static method in class org.apache.spark.SparkContext
-
- SPARK_UNKNOWN_USER() - Static method in class org.apache.spark.SparkContext
-
- SPARK_VERSION() - Static method in class org.apache.spark.SparkContext
-
- SparkConf - Class in org.apache.spark
-
Configuration for a Spark application.
- SparkConf(boolean) - Constructor for class org.apache.spark.SparkConf
-
- SparkConf() - Constructor for class org.apache.spark.SparkConf
-
Create a SparkConf that loads defaults from system properties and the classpath
- sparkContext() - Method in class org.apache.spark.rdd.RDD
-
The SparkContext that created this RDD.
- SparkContext - Class in org.apache.spark
-
Main entry point for Spark functionality.
- SparkContext(SparkConf) - Constructor for class org.apache.spark.SparkContext
-
- SparkContext() - Constructor for class org.apache.spark.SparkContext
-
Create a SparkContext that loads settings from system properties (for instance, when
launching with ./bin/spark-submit).
- SparkContext(SparkConf, Map<String, Set<SplitInfo>>) - Constructor for class org.apache.spark.SparkContext
-
:: DeveloperApi ::
Alternative constructor for setting preferred locations where Spark will create executors.
- SparkContext(String, String, SparkConf) - Constructor for class org.apache.spark.SparkContext
-
Alternative constructor that allows setting common Spark properties directly
- SparkContext(String, String, String, Seq<String>, Map<String, String>, Map<String, Set<SplitInfo>>) - Constructor for class org.apache.spark.SparkContext
-
Alternative constructor that allows setting common Spark properties directly
- sparkContext() - Method in class org.apache.spark.sql.SQLContext
-
- sparkContext() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
The underlying SparkContext
- sparkContext() - Method in class org.apache.spark.streaming.StreamingContext
-
Return the associated Spark context
- SparkContext.DoubleAccumulatorParam$ - Class in org.apache.spark
-
- SparkContext.DoubleAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.DoubleAccumulatorParam$
-
- SparkContext.FloatAccumulatorParam$ - Class in org.apache.spark
-
- SparkContext.FloatAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.FloatAccumulatorParam$
-
- SparkContext.IntAccumulatorParam$ - Class in org.apache.spark
-
- SparkContext.IntAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.IntAccumulatorParam$
-
- SparkContext.LongAccumulatorParam$ - Class in org.apache.spark
-
- SparkContext.LongAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.LongAccumulatorParam$
-
- SparkEnv - Class in org.apache.spark
-
:: DeveloperApi ::
Holds all the runtime environment objects for a running Spark instance (either master or worker),
including the serializer, Akka actor system, block manager, map output tracker, etc.
- SparkEnv(String, ActorSystem, Serializer, Serializer, CacheManager, MapOutputTracker, ShuffleManager, org.apache.spark.broadcast.BroadcastManager, org.apache.spark.storage.BlockManager, ConnectionManager, SecurityManager, HttpFileServer, String, org.apache.spark.metrics.MetricsSystem, ShuffleMemoryManager, SparkConf) - Constructor for class org.apache.spark.SparkEnv
-
- SparkException - Exception in org.apache.spark
-
- SparkException(String, Throwable) - Constructor for exception org.apache.spark.SparkException
-
- SparkException(String) - Constructor for exception org.apache.spark.SparkException
-
- SparkFiles - Class in org.apache.spark
-
Resolves paths to files added through SparkContext.addFile()
.
- SparkFiles() - Constructor for class org.apache.spark.SparkFiles
-
- sparkFilesDir() - Method in class org.apache.spark.SparkEnv
-
- SparkFlumeEvent - Class in org.apache.spark.streaming.flume
-
A wrapper class for AvroFlumeEvent's with a custom serialization format.
- SparkFlumeEvent() - Constructor for class org.apache.spark.streaming.flume.SparkFlumeEvent
-
- SparkListener - Interface in org.apache.spark.scheduler
-
:: DeveloperApi ::
Interface for listening to events from the Spark scheduler.
- SparkListenerApplicationEnd - Class in org.apache.spark.scheduler
-
- SparkListenerApplicationEnd(long) - Constructor for class org.apache.spark.scheduler.SparkListenerApplicationEnd
-
- SparkListenerApplicationStart - Class in org.apache.spark.scheduler
-
- SparkListenerApplicationStart(String, long, String) - Constructor for class org.apache.spark.scheduler.SparkListenerApplicationStart
-
- SparkListenerBlockManagerAdded - Class in org.apache.spark.scheduler
-
- SparkListenerBlockManagerAdded(BlockManagerId, long) - Constructor for class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
-
- SparkListenerBlockManagerRemoved - Class in org.apache.spark.scheduler
-
- SparkListenerBlockManagerRemoved(BlockManagerId) - Constructor for class org.apache.spark.scheduler.SparkListenerBlockManagerRemoved
-
- SparkListenerEnvironmentUpdate - Class in org.apache.spark.scheduler
-
- SparkListenerEnvironmentUpdate(Map<String, Seq<Tuple2<String, String>>>) - Constructor for class org.apache.spark.scheduler.SparkListenerEnvironmentUpdate
-
- SparkListenerEvent - Interface in org.apache.spark.scheduler
-
- SparkListenerExecutorMetricsUpdate - Class in org.apache.spark.scheduler
-
Periodic updates from executors.
- SparkListenerExecutorMetricsUpdate(String, Seq<Tuple4<Object, Object, Object, TaskMetrics>>) - Constructor for class org.apache.spark.scheduler.SparkListenerExecutorMetricsUpdate
-
- SparkListenerJobEnd - Class in org.apache.spark.scheduler
-
- SparkListenerJobEnd(int, JobResult) - Constructor for class org.apache.spark.scheduler.SparkListenerJobEnd
-
- SparkListenerJobStart - Class in org.apache.spark.scheduler
-
- SparkListenerJobStart(int, Seq<Object>, Properties) - Constructor for class org.apache.spark.scheduler.SparkListenerJobStart
-
- SparkListenerStageCompleted - Class in org.apache.spark.scheduler
-
- SparkListenerStageCompleted(StageInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerStageCompleted
-
- SparkListenerStageSubmitted - Class in org.apache.spark.scheduler
-
- SparkListenerStageSubmitted(StageInfo, Properties) - Constructor for class org.apache.spark.scheduler.SparkListenerStageSubmitted
-
- SparkListenerTaskEnd - Class in org.apache.spark.scheduler
-
- SparkListenerTaskEnd(int, int, String, TaskEndReason, TaskInfo, TaskMetrics) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskEnd
-
- SparkListenerTaskGettingResult - Class in org.apache.spark.scheduler
-
- SparkListenerTaskGettingResult(TaskInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskGettingResult
-
- SparkListenerTaskStart - Class in org.apache.spark.scheduler
-
- SparkListenerTaskStart(int, int, TaskInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskStart
-
- SparkListenerUnpersistRDD - Class in org.apache.spark.scheduler
-
- SparkListenerUnpersistRDD(int) - Constructor for class org.apache.spark.scheduler.SparkListenerUnpersistRDD
-
- SparkLogicalPlan - Class in org.apache.spark.sql.execution
-
:: DeveloperApi ::
Allows already planned SparkQueries to be linked into logical query plans.
- SparkLogicalPlan(SparkPlan, SQLContext) - Constructor for class org.apache.spark.sql.execution.SparkLogicalPlan
-
- SparkPlan - Class in org.apache.spark.sql.execution
-
:: DeveloperApi ::
- SparkPlan() - Constructor for class org.apache.spark.sql.execution.SparkPlan
-
- sparkProperties() - Method in class org.apache.spark.ui.env.EnvironmentListener
-
- sparkUser() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- sparkUser() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
-
- sparkUser() - Method in class org.apache.spark.SparkContext
-
- sparse(int, int[], double[]) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Creates a sparse vector providing its index array and value array.
- sparse(int, Seq<Tuple2<Object, Object>>) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Creates a sparse vector using unordered (index, value) pairs.
- sparse(int, Iterable<Tuple2<Integer, Double>>) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Creates a sparse vector using unordered (index, value) pairs in a Java friendly way.
- SparseVector - Class in org.apache.spark.mllib.linalg
-
A sparse vector represented by an index array and an value array.
- SparseVector(int, int[], double[]) - Constructor for class org.apache.spark.mllib.linalg.SparseVector
-
- speculative() - Method in class org.apache.spark.scheduler.TaskInfo
-
- split() - Method in class org.apache.spark.mllib.tree.model.Node
-
- Split - Class in org.apache.spark.mllib.tree.model
-
:: DeveloperApi ::
Split applied to a feature
- Split(int, double, Enumeration.Value, List<Object>) - Constructor for class org.apache.spark.mllib.tree.model.Split
-
- splitId() - Method in class org.apache.spark.TaskContext
-
- splitIndex() - Method in class org.apache.spark.storage.RDDBlockId
-
- SplitInfo - Class in org.apache.spark.scheduler
-
- SplitInfo(Class<?>, String, String, long, Object) - Constructor for class org.apache.spark.scheduler.SplitInfo
-
- splits() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- sql(String) - Method in class org.apache.spark.sql.api.java.JavaSQLContext
-
Executes a SQL query using Spark, returning the result as a SchemaRDD.
- sql(String) - Method in class org.apache.spark.sql.hive.api.java.JavaHiveContext
-
- sql() - Method in class org.apache.spark.sql.hive.execution.NativeCommand
-
- sql(String) - Method in class org.apache.spark.sql.hive.HiveContext
-
- sql(String) - Method in class org.apache.spark.sql.SQLContext
-
Executes a SQL query using Spark, returning the result as a SchemaRDD.
- sqlContext() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
-
- sqlContext() - Method in class org.apache.spark.sql.api.java.JavaSQLContext
-
- sqlContext() - Method in class org.apache.spark.sql.hive.api.java.JavaHiveContext
-
- sqlContext() - Method in class org.apache.spark.sql.SchemaRDD
-
- SQLContext - Class in org.apache.spark.sql
-
:: AlphaComponent ::
The entry point for running relational queries using Spark.
- SQLContext(SparkContext) - Constructor for class org.apache.spark.sql.SQLContext
-
- squaredDist(Vector) - Method in class org.apache.spark.util.Vector
-
- SquaredL2Updater - Class in org.apache.spark.mllib.optimization
-
:: DeveloperApi ::
Updater for L2 regularized problems.
- SquaredL2Updater() - Constructor for class org.apache.spark.mllib.optimization.SquaredL2Updater
-
- srdd() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
- ssc() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
- ssc() - Method in class org.apache.spark.streaming.dstream.DStream
-
- stackTrace() - Method in class org.apache.spark.ExceptionFailure
-
- stageAttemptId() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
-
- stageAttemptId() - Method in class org.apache.spark.scheduler.SparkListenerTaskStart
-
- stageFailed(String) - Method in class org.apache.spark.scheduler.StageInfo
-
- stageId() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
-
- stageId() - Method in class org.apache.spark.scheduler.SparkListenerTaskStart
-
- stageId() - Method in class org.apache.spark.scheduler.StageInfo
-
- stageId() - Method in class org.apache.spark.TaskContext
-
- stageIds() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
-
- stageIdToData() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- stageInfo() - Method in class org.apache.spark.scheduler.SparkListenerStageCompleted
-
- stageInfo() - Method in class org.apache.spark.scheduler.SparkListenerStageSubmitted
-
- StageInfo - Class in org.apache.spark.scheduler
-
:: DeveloperApi ::
Stores information about a stage to pass from the scheduler to SparkListeners.
- StageInfo(int, int, String, int, Seq<RDDInfo>, String) - Constructor for class org.apache.spark.scheduler.StageInfo
-
- StandardNormalGenerator - Class in org.apache.spark.mllib.random
-
:: DeveloperApi ::
Generates i.i.d.
- StandardNormalGenerator() - Constructor for class org.apache.spark.mllib.random.StandardNormalGenerator
-
- StandardScaler - Class in org.apache.spark.mllib.feature
-
:: Experimental ::
Standardizes features by removing the mean and scaling to unit variance using column summary
statistics on the samples in the training set.
- StandardScaler(boolean, boolean) - Constructor for class org.apache.spark.mllib.feature.StandardScaler
-
- StandardScaler() - Constructor for class org.apache.spark.mllib.feature.StandardScaler
-
- StandardScalerModel - Class in org.apache.spark.mllib.feature
-
:: Experimental ::
Represents a StandardScaler model that can transform vectors.
- start() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Start the execution of the streams.
- start() - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
-
- start() - Method in class org.apache.spark.streaming.dstream.InputDStream
-
Method called to start receiving data.
- start() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
-
- start() - Method in class org.apache.spark.streaming.StreamingContext
-
Start the execution of the streams.
- startTime() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- startTime() - Method in class org.apache.spark.SparkContext
-
- StatCounter - Class in org.apache.spark.util
-
A class for tracking the statistics of a set of numbers (count, mean and variance) in a
numerically robust way.
- StatCounter(TraversableOnce<Object>) - Constructor for class org.apache.spark.util.StatCounter
-
- StatCounter() - Constructor for class org.apache.spark.util.StatCounter
-
Initialize the StatCounter with no values.
- state() - Method in class org.apache.spark.streaming.StreamingContext
-
- statistic() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
-
- statistic() - Method in interface org.apache.spark.mllib.stat.test.TestResult
-
Test statistic.
- Statistics - Class in org.apache.spark.mllib.stat
-
API for statistical functions in MLlib.
- Statistics() - Constructor for class org.apache.spark.mllib.stat.Statistics
-
- statistics() - Method in class org.apache.spark.sql.execution.SparkLogicalPlan
-
- Statistics - Class in org.apache.spark.streaming.receiver
-
:: DeveloperApi ::
Statistics for querying the supervisor about state of workers.
- Statistics(int, int, int, String) - Constructor for class org.apache.spark.streaming.receiver.Statistics
-
- stats() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a
StatCounter
object that captures the mean, variance and
count of the RDD's elements in one operation.
- stats() - Method in class org.apache.spark.mllib.tree.model.Node
-
- stats() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
-
Return a
StatCounter
object that captures the mean, variance and
count of the RDD's elements in one operation.
- StatsReportListener - Class in org.apache.spark.scheduler
-
:: DeveloperApi ::
Simple SparkListener that logs a few summary statistics when each stage completes
- StatsReportListener() - Constructor for class org.apache.spark.scheduler.StatsReportListener
-
- StatsReportListener - Class in org.apache.spark.streaming.scheduler
-
:: DeveloperApi ::
A simple StreamingListener that logs summary statistics across Spark Streaming batches
- StatsReportListener(int) - Constructor for class org.apache.spark.streaming.scheduler.StatsReportListener
-
- status() - Method in class org.apache.spark.scheduler.TaskInfo
-
- stdev() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Compute the standard deviation of this RDD's elements.
- stdev() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
-
Compute the standard deviation of this RDD's elements.
- stdev() - Method in class org.apache.spark.util.StatCounter
-
Return the standard deviation of the values.
- stop() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Shut down the SparkContext.
- stop() - Method in interface org.apache.spark.broadcast.BroadcastFactory
-
- stop() - Method in class org.apache.spark.broadcast.HttpBroadcastFactory
-
- stop() - Method in class org.apache.spark.broadcast.TorrentBroadcastFactory
-
- stop() - Method in class org.apache.spark.SparkContext
-
Shut down the SparkContext.
- stop() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Stop the execution of the streams.
- stop(boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Stop the execution of the streams.
- stop(boolean, boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Stop the execution of the streams.
- stop() - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
-
- stop() - Method in class org.apache.spark.streaming.dstream.InputDStream
-
Method called to stop receiving data.
- stop() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
-
- stop(String) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Stop the receiver completely.
- stop(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Stop the receiver completely due to an exception
- stop(boolean) - Method in class org.apache.spark.streaming.StreamingContext
-
Stop the execution of the streams immediately (does not wait for all received data
to be processed).
- stop(boolean, boolean) - Method in class org.apache.spark.streaming.StreamingContext
-
Stop the execution of the streams, with option of ensuring all received data
has been processed.
- storageLevel() - Method in class org.apache.spark.storage.BlockStatus
-
- storageLevel() - Method in class org.apache.spark.storage.RDDInfo
-
- StorageLevel - Class in org.apache.spark.storage
-
:: DeveloperApi ::
Flags for controlling the storage of an RDD.
- StorageLevel() - Constructor for class org.apache.spark.storage.StorageLevel
-
- storageLevel() - Method in class org.apache.spark.streaming.dstream.DStream
-
- storageLevel() - Method in class org.apache.spark.streaming.receiver.Receiver
-
- storageLevelCache() - Static method in class org.apache.spark.storage.StorageLevel
-
:: DeveloperApi ::
Read StorageLevel object from ObjectInput stream.
- StorageLevels - Class in org.apache.spark.api.java
-
Expose some commonly useful storage level constants.
- StorageLevels() - Constructor for class org.apache.spark.api.java.StorageLevels
-
- StorageListener - Class in org.apache.spark.ui.storage
-
:: DeveloperApi ::
A SparkListener that prepares information to be displayed on the BlockManagerUI.
- StorageListener(StorageStatusListener) - Constructor for class org.apache.spark.ui.storage.StorageListener
-
- StorageStatus - Class in org.apache.spark.storage
-
:: DeveloperApi ::
Storage information for each BlockManager.
- StorageStatus(BlockManagerId, long) - Constructor for class org.apache.spark.storage.StorageStatus
-
- StorageStatus(BlockManagerId, long, Map<BlockId, BlockStatus>) - Constructor for class org.apache.spark.storage.StorageStatus
-
Create a storage status with an initial set of blocks, leaving the source unmodified.
- storageStatusList() - Method in class org.apache.spark.storage.StorageStatusListener
-
- storageStatusList() - Method in class org.apache.spark.ui.exec.ExecutorsListener
-
- storageStatusList() - Method in class org.apache.spark.ui.storage.StorageListener
-
- StorageStatusListener - Class in org.apache.spark.storage
-
:: DeveloperApi ::
A SparkListener that maintains executor storage status.
- StorageStatusListener() - Constructor for class org.apache.spark.storage.StorageStatusListener
-
- store(Iterator<T>) - Method in interface org.apache.spark.streaming.receiver.ActorHelper
-
Store an iterator of received data as a data block into Spark's memory.
- store(ByteBuffer) - Method in interface org.apache.spark.streaming.receiver.ActorHelper
-
Store the bytes of received data as a data block into Spark's memory.
- store(T) - Method in interface org.apache.spark.streaming.receiver.ActorHelper
-
Store a single item of received data to Spark's memory.
- store(T) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store a single item of received data to Spark's memory.
- store(ArrayBuffer<T>) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store an ArrayBuffer of received data as a data block into Spark's memory.
- store(ArrayBuffer<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store an ArrayBuffer of received data as a data block into Spark's memory.
- store(Iterator<T>) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store an iterator of received data as a data block into Spark's memory.
- store(Iterator<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store an iterator of received data as a data block into Spark's memory.
- store(Iterator<T>) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store an iterator of received data as a data block into Spark's memory.
- store(Iterator<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store an iterator of received data as a data block into Spark's memory.
- store(ByteBuffer) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store the bytes of received data as a data block into Spark's memory.
- store(ByteBuffer, Object) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store the bytes of received data as a data block into Spark's memory.
- Strategy - Class in org.apache.spark.mllib.tree.configuration
-
:: Experimental ::
Stores all the configuration options for tree construction
- Strategy(Enumeration.Value, Impurity, int, int, int, Enumeration.Value, Map<Object, Object>, int) - Constructor for class org.apache.spark.mllib.tree.configuration.Strategy
-
- Strategy(Enumeration.Value, Impurity, int, int, int, Map<Integer, Integer>) - Constructor for class org.apache.spark.mllib.tree.configuration.Strategy
-
- STREAM() - Static method in class org.apache.spark.storage.BlockId
-
- StreamBlockId - Class in org.apache.spark.storage
-
- StreamBlockId(int, long) - Constructor for class org.apache.spark.storage.StreamBlockId
-
- streamed() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
-
BuildRight means the right relation <=> the broadcast relation.
- streamed() - Method in class org.apache.spark.sql.execution.LeftSemiJoinBNL
-
- streamedKeys() - Method in interface org.apache.spark.sql.execution.HashJoin
-
- streamedPlan() - Method in interface org.apache.spark.sql.execution.HashJoin
-
- streamId() - Method in class org.apache.spark.storage.StreamBlockId
-
- streamId() - Method in class org.apache.spark.streaming.receiver.Receiver
-
Get the unique identifier the receiver input stream that this
receiver is associated with.
- streamId() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
-
- StreamingContext - Class in org.apache.spark.streaming
-
Main entry point for Spark Streaming functionality.
- StreamingContext(SparkContext, Duration) - Constructor for class org.apache.spark.streaming.StreamingContext
-
Create a StreamingContext using an existing SparkContext.
- StreamingContext(SparkConf, Duration) - Constructor for class org.apache.spark.streaming.StreamingContext
-
Create a StreamingContext by providing the configuration necessary for a new SparkContext.
- StreamingContext(String, String, Duration, String, Seq<String>, Map<String, String>) - Constructor for class org.apache.spark.streaming.StreamingContext
-
Create a StreamingContext by providing the details necessary for creating a new SparkContext.
- StreamingContext(String, Configuration) - Constructor for class org.apache.spark.streaming.StreamingContext
-
Recreate a StreamingContext from a checkpoint file.
- StreamingContext(String) - Constructor for class org.apache.spark.streaming.StreamingContext
-
Recreate a StreamingContext from a checkpoint file.
- StreamingContextState() - Method in class org.apache.spark.streaming.StreamingContext
-
Accessor for nested Scala object
- StreamingLinearAlgorithm<M extends GeneralizedLinearModel,A extends GeneralizedLinearAlgorithm<M>> - Class in org.apache.spark.mllib.regression
-
:: DeveloperApi ::
StreamingLinearAlgorithm implements methods for continuously
training a generalized linear model model on streaming data,
and using it for prediction on (possibly different) streaming data.
- StreamingLinearAlgorithm() - Constructor for class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
-
- StreamingLinearRegressionWithSGD - Class in org.apache.spark.mllib.regression
-
Train or predict a linear regression model on streaming data.
- StreamingLinearRegressionWithSGD(double, int, double, Vector) - Constructor for class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
-
- StreamingLinearRegressionWithSGD() - Constructor for class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
-
Construct a StreamingLinearRegression object with default parameters:
{stepSize: 0.1, numIterations: 50, miniBatchFraction: 1.0}.
- StreamingListener - Interface in org.apache.spark.streaming.scheduler
-
:: DeveloperApi ::
A listener interface for receiving information about an ongoing streaming
computation.
- StreamingListenerBatchCompleted - Class in org.apache.spark.streaming.scheduler
-
- StreamingListenerBatchCompleted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchCompleted
-
- StreamingListenerBatchStarted - Class in org.apache.spark.streaming.scheduler
-
- StreamingListenerBatchStarted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchStarted
-
- StreamingListenerBatchSubmitted - Class in org.apache.spark.streaming.scheduler
-
- StreamingListenerBatchSubmitted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchSubmitted
-
- StreamingListenerEvent - Interface in org.apache.spark.streaming.scheduler
-
:: DeveloperApi ::
Base trait for events related to StreamingListener
- StreamingListenerReceiverError - Class in org.apache.spark.streaming.scheduler
-
- StreamingListenerReceiverError(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverError
-
- StreamingListenerReceiverStarted - Class in org.apache.spark.streaming.scheduler
-
- StreamingListenerReceiverStarted(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStarted
-
- StreamingListenerReceiverStopped - Class in org.apache.spark.streaming.scheduler
-
- StreamingListenerReceiverStopped(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStopped
-
- streamSideKeyGenerator() - Method in interface org.apache.spark.sql.execution.HashJoin
-
- stringToText(String) - Static method in class org.apache.spark.SparkContext
-
- StringType - Static variable in class org.apache.spark.sql.api.java.DataType
-
Gets the StringType object.
- StringType - Class in org.apache.spark.sql.api.java
-
The data type representing String values.
- stringWritableConverter() - Static method in class org.apache.spark.SparkContext
-
- StructField - Class in org.apache.spark.sql.api.java
-
A StructField object represents a field in a StructType object.
- StructType - Class in org.apache.spark.sql.api.java
-
The data type representing Rows.
- submissionTime() - Method in class org.apache.spark.scheduler.StageInfo
-
When this stage was submitted from the DAGScheduler to a TaskScheduler.
- submissionTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
-
- submitJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, Function2<Object, U, BoxedUnit>, Function0<R>) - Method in class org.apache.spark.SparkContext
-
:: Experimental ::
Submit a job for execution and return a FutureJob holding the result.
- subtract(JavaDoubleRDD) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return an RDD with the elements from this
that are not in other
.
- subtract(JavaDoubleRDD, int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return an RDD with the elements from this
that are not in other
.
- subtract(JavaDoubleRDD, Partitioner) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return an RDD with the elements from this
that are not in other
.
- subtract(JavaPairRDD<K, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return an RDD with the elements from this
that are not in other
.
- subtract(JavaPairRDD<K, V>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return an RDD with the elements from this
that are not in other
.
- subtract(JavaPairRDD<K, V>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return an RDD with the elements from this
that are not in other
.
- subtract(JavaRDD<T>) - Method in class org.apache.spark.api.java.JavaRDD
-
Return an RDD with the elements from this
that are not in other
.
- subtract(JavaRDD<T>, int) - Method in class org.apache.spark.api.java.JavaRDD
-
Return an RDD with the elements from this
that are not in other
.
- subtract(JavaRDD<T>, Partitioner) - Method in class org.apache.spark.api.java.JavaRDD
-
Return an RDD with the elements from this
that are not in other
.
- subtract(RDD<T>) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD with the elements from this
that are not in other
.
- subtract(RDD<T>, int) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD with the elements from this
that are not in other
.
- subtract(RDD<T>, Partitioner, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD with the elements from this
that are not in other
.
- subtract(JavaSchemaRDD) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
-
Return an RDD with the elements from this
that are not in other
.
- subtract(JavaSchemaRDD, int) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
-
Return an RDD with the elements from this
that are not in other
.
- subtract(JavaSchemaRDD, Partitioner) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
-
Return an RDD with the elements from this
that are not in other
.
- subtract(RDD<Row>) - Method in class org.apache.spark.sql.SchemaRDD
-
- subtract(RDD<Row>, int) - Method in class org.apache.spark.sql.SchemaRDD
-
- subtract(RDD<Row>, Partitioner, Ordering<Row>) - Method in class org.apache.spark.sql.SchemaRDD
-
- subtract(Vector) - Method in class org.apache.spark.util.Vector
-
- subtractByKey(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return an RDD with the pairs from this
whose keys are not in other
.
- subtractByKey(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return an RDD with the pairs from `this` whose keys are not in `other`.
- subtractByKey(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return an RDD with the pairs from `this` whose keys are not in `other`.
- subtractByKey(RDD<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return an RDD with the pairs from this
whose keys are not in other
.
- subtractByKey(RDD<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return an RDD with the pairs from `this` whose keys are not in `other`.
- subtractByKey(RDD<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return an RDD with the pairs from `this` whose keys are not in `other`.
- Success - Class in org.apache.spark
-
:: DeveloperApi ::
Task succeeded.
- Success() - Constructor for class org.apache.spark.Success
-
- successful() - Method in class org.apache.spark.scheduler.TaskInfo
-
- sum() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Add up the elements in this RDD.
- sum() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
-
Add up the elements in this RDD.
- sum() - Method in class org.apache.spark.util.StatCounter
-
- sum() - Method in class org.apache.spark.util.Vector
-
- sumApprox(long, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
:: Experimental ::
Approximate operation to return the sum within a timeout.
- sumApprox(long) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
:: Experimental ::
Approximate operation to return the sum within a timeout.
- sumApprox(long, double) - Method in class org.apache.spark.rdd.DoubleRDDFunctions
-
:: Experimental ::
Approximate operation to return the sum within a timeout.
- SVMDataGenerator - Class in org.apache.spark.mllib.util
-
:: DeveloperApi ::
Generate sample data used for SVM.
- SVMDataGenerator() - Constructor for class org.apache.spark.mllib.util.SVMDataGenerator
-
- SVMModel - Class in org.apache.spark.mllib.classification
-
Model for Support Vector Machines (SVMs).
- SVMModel(Vector, double) - Constructor for class org.apache.spark.mllib.classification.SVMModel
-
- SVMWithSGD - Class in org.apache.spark.mllib.classification
-
Train a Support Vector Machine (SVM) using Stochastic Gradient Descent.
- SVMWithSGD() - Constructor for class org.apache.spark.mllib.classification.SVMWithSGD
-
Construct a SVM object with default parameters
- systemProperties() - Method in class org.apache.spark.ui.env.EnvironmentListener
-
- t() - Method in class org.apache.spark.SerializableWritable
-
- table() - Method in class org.apache.spark.sql.hive.execution.DescribeHiveTableCommand
-
- table() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
-
- table(String) - Method in class org.apache.spark.sql.SQLContext
-
Returns the specified table as a SchemaRDD
- tableName() - Method in class org.apache.spark.sql.execution.CacheCommand
-
- tableName() - Method in class org.apache.spark.sql.hive.execution.AnalyzeTable
-
- tableName() - Method in class org.apache.spark.sql.hive.execution.DropTable
-
- tachyonFolderName() - Method in class org.apache.spark.SparkContext
-
- tachyonSize() - Method in class org.apache.spark.storage.BlockStatus
-
- tachyonSize() - Method in class org.apache.spark.storage.RDDInfo
-
- take(int) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Take the first num elements of the RDD.
- take(int) - Method in class org.apache.spark.rdd.RDD
-
Take the first num elements of the RDD.
- take(int) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
-
- take(int) - Method in class org.apache.spark.sql.SchemaRDD
-
- takeAsync(int) - Method in class org.apache.spark.rdd.AsyncRDDActions
-
Returns a future for retrieving the first num elements of the RDD.
- takeOrdered(int, Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Returns the first K elements from this RDD as defined by
the specified Comparator[T] and maintains the order.
- takeOrdered(int) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Returns the first K elements from this RDD using the
natural ordering for T while maintain the order.
- takeOrdered(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Returns the first K (smallest) elements from this RDD as defined by the specified
implicit Ordering[T] and maintains the ordering.
- TakeOrdered - Class in org.apache.spark.sql.execution
-
:: DeveloperApi ::
Take the first limit elements as defined by the sortOrder.
- TakeOrdered(int, Seq<SortOrder>, SparkPlan) - Constructor for class org.apache.spark.sql.execution.TakeOrdered
-
- takeSample(boolean, int) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- takeSample(boolean, int, long) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- takeSample(boolean, int, long) - Method in class org.apache.spark.rdd.RDD
-
Return a fixed-size sampled subset of this RDD in an array
- TaskCompletionListener - Interface in org.apache.spark.util
-
:: DeveloperApi ::
- TaskContext - Class in org.apache.spark
-
:: DeveloperApi ::
Contextual information about a task which can be read or mutated during execution.
- TaskContext(int, int, long, boolean, TaskMetrics) - Constructor for class org.apache.spark.TaskContext
-
- TaskEndReason - Interface in org.apache.spark
-
:: DeveloperApi ::
Various possible reasons why a task ended.
- TaskFailedReason - Interface in org.apache.spark
-
:: DeveloperApi ::
Various possible reasons why a task failed.
- taskId() - Method in class org.apache.spark.scheduler.TaskInfo
-
- taskId() - Method in class org.apache.spark.storage.TaskResultBlockId
-
- taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
-
- taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskGettingResult
-
- taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskStart
-
- TaskInfo - Class in org.apache.spark.scheduler
-
:: DeveloperApi ::
Information about a running task attempt inside a TaskSet.
- TaskInfo(long, int, int, long, String, String, Enumeration.Value, boolean) - Constructor for class org.apache.spark.scheduler.TaskInfo
-
- TaskKilled - Class in org.apache.spark
-
:: DeveloperApi ::
Task was killed intentionally and needs to be rescheduled.
- TaskKilled() - Constructor for class org.apache.spark.TaskKilled
-
- TaskKilledException - Exception in org.apache.spark
-
:: DeveloperApi ::
Exception thrown when a task is explicitly killed (i.e., task failure is expected).
- TaskKilledException() - Constructor for exception org.apache.spark.TaskKilledException
-
- taskLocality() - Method in class org.apache.spark.scheduler.TaskInfo
-
- TaskLocality - Class in org.apache.spark.scheduler
-
- TaskLocality() - Constructor for class org.apache.spark.scheduler.TaskLocality
-
- taskMetrics() - Method in class org.apache.spark.scheduler.SparkListenerExecutorMetricsUpdate
-
- taskMetrics() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
-
- taskMetrics() - Method in class org.apache.spark.TaskContext
-
- TASKRESULT() - Static method in class org.apache.spark.storage.BlockId
-
- TaskResultBlockId - Class in org.apache.spark.storage
-
- TaskResultBlockId(long) - Constructor for class org.apache.spark.storage.TaskResultBlockId
-
- TaskResultLost - Class in org.apache.spark
-
:: DeveloperApi ::
The task finished successfully, but the result was lost from the executor's block manager before
it was fetched.
- TaskResultLost() - Constructor for class org.apache.spark.TaskResultLost
-
- taskScheduler() - Method in class org.apache.spark.SparkContext
-
- taskType() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
-
- TEST() - Static method in class org.apache.spark.storage.BlockId
-
- TestHive - Class in org.apache.spark.sql.hive.test
-
- TestHive() - Constructor for class org.apache.spark.sql.hive.test.TestHive
-
- TestHiveContext - Class in org.apache.spark.sql.hive.test
-
A locally running test instance of Spark's Hive execution engine.
- TestHiveContext(SparkContext) - Constructor for class org.apache.spark.sql.hive.test.TestHiveContext
-
- TestHiveContext.QueryExecution - Class in org.apache.spark.sql.hive.test
-
Override QueryExecution with special debug workflow.
- TestHiveContext.QueryExecution() - Constructor for class org.apache.spark.sql.hive.test.TestHiveContext.QueryExecution
-
- TestHiveContext.TestTable - Class in org.apache.spark.sql.hive.test
-
- TestHiveContext.TestTable(String, Seq<Function0<BoxedUnit>>) - Constructor for class org.apache.spark.sql.hive.test.TestHiveContext.TestTable
-
- TestResult<DF> - Interface in org.apache.spark.mllib.stat.test
-
:: Experimental ::
Trait for hypothesis test results.
- TestSQLContext - Class in org.apache.spark.sql.test
-
A SQLContext that can be used for local testing.
- TestSQLContext() - Constructor for class org.apache.spark.sql.test.TestSQLContext
-
- testTables() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
-
A list of test tables and the DDL required to initialize them.
- testTempDir() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
-
- textFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Read a text file from HDFS, a local file system (available on all nodes), or any
Hadoop-supported file system URI, and return it as an RDD of Strings.
- textFile(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Read a text file from HDFS, a local file system (available on all nodes), or any
Hadoop-supported file system URI, and return it as an RDD of Strings.
- textFile(String, int) - Method in class org.apache.spark.SparkContext
-
Read a text file from HDFS, a local file system (available on all nodes), or any
Hadoop-supported file system URI, and return it as an RDD of Strings.
- textFileStream(String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream that monitors a Hadoop-compatible filesystem
for new files and reads them as text files (using key as LongWritable, value
as Text and input format as TextInputFormat).
- textFileStream(String) - Method in class org.apache.spark.streaming.StreamingContext
-
Create a input stream that monitors a Hadoop-compatible filesystem
for new files and reads them as text files (using key as LongWritable, value
as Text and input format as TextInputFormat).
- theta() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
-
- threshold() - Method in class org.apache.spark.mllib.tree.model.Split
-
- thresholds() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Returns thresholds in descending order.
- time() - Method in class org.apache.spark.scheduler.SparkListenerApplicationEnd
-
- time() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
-
- Time - Class in org.apache.spark.streaming
-
This is a simple class that represents an absolute instant of time.
- Time(long) - Constructor for class org.apache.spark.streaming.Time
-
- TimestampType - Static variable in class org.apache.spark.sql.api.java.DataType
-
Gets the TimestampType object.
- TimestampType - Class in org.apache.spark.sql.api.java
-
The data type representing java.sql.Timestamp values.
- to(Time, Duration) - Method in class org.apache.spark.streaming.Time
-
- toArray() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- toArray() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
-
- toArray() - Method in class org.apache.spark.mllib.linalg.DenseVector
-
- toArray() - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Converts to a dense array in column major.
- toArray() - Method in class org.apache.spark.mllib.linalg.SparseVector
-
- toArray() - Method in interface org.apache.spark.mllib.linalg.Vector
-
Converts the instance to a double array.
- toArray() - Method in class org.apache.spark.rdd.RDD
-
Return an array that contains all of the elements in this RDD.
- toBreeze() - Method in interface org.apache.spark.mllib.linalg.distributed.DistributedMatrix
-
Collects data and assembles a local dense breeze matrix (for test only).
- toBreeze() - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Converts to a breeze matrix.
- toBreeze() - Method in interface org.apache.spark.mllib.linalg.Vector
-
Converts the instance to a breeze vector.
- toDataType(String) - Static method in class org.apache.spark.sql.hive.HiveMetastoreTypes
-
- toDebugString() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
A description of this RDD and its recursive dependencies for debugging.
- toDebugString() - Method in class org.apache.spark.rdd.RDD
-
A description of this RDD and its recursive dependencies for debugging.
- toDebugString() - Method in class org.apache.spark.SparkConf
-
Return a string listing all keys and values, one per line.
- toErrorString() - Method in class org.apache.spark.ExceptionFailure
-
- toErrorString() - Static method in class org.apache.spark.ExecutorLostFailure
-
- toErrorString() - Method in class org.apache.spark.FetchFailed
-
- toErrorString() - Static method in class org.apache.spark.Resubmitted
-
- toErrorString() - Method in interface org.apache.spark.TaskFailedReason
-
Error message displayed in the web UI.
- toErrorString() - Static method in class org.apache.spark.TaskKilled
-
- toErrorString() - Static method in class org.apache.spark.TaskResultLost
-
- toErrorString() - Static method in class org.apache.spark.UnknownReason
-
- toFormattedString() - Method in class org.apache.spark.streaming.Duration
-
- toIndexedRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-
Converts to IndexedRowMatrix.
- toInt() - Method in class org.apache.spark.storage.StorageLevel
-
- toJavaDStream() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Convert to a JavaDStream
- toJavaRDD() - Method in class org.apache.spark.rdd.RDD
-
- toJavaSchemaRDD() - Method in class org.apache.spark.sql.SchemaRDD
-
Returns this RDD as a JavaSchemaRDD.
- toLocalIterator() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an iterator that contains all of the elements in this RDD.
- toLocalIterator() - Method in class org.apache.spark.rdd.RDD
-
Return an iterator that contains all of the elements in this RDD.
- toMetastoreType(DataType) - Static method in class org.apache.spark.sql.hive.HiveMetastoreTypes
-
- top(int, Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Returns the top K elements from this RDD as defined by
the specified Comparator[T].
- top(int) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Returns the top K elements from this RDD using the
natural ordering for T.
- top(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
- toPairDStreamFunctions(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Static method in class org.apache.spark.streaming.StreamingContext
-
- topNode() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
- toRDD(JavaDoubleRDD) - Static method in class org.apache.spark.api.java.JavaDoubleRDD
-
- toRDD(JavaPairRDD<K, V>) - Static method in class org.apache.spark.api.java.JavaPairRDD
-
- toRDD(JavaRDD<T>) - Static method in class org.apache.spark.api.java.JavaRDD
-
- toRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-
Converts to RowMatrix, dropping row indices after grouping by row index.
- toRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
Drops row indices and converts this matrix to a
RowMatrix
.
- TorrentBroadcastFactory - Class in org.apache.spark.broadcast
-
A
Broadcast
implementation that uses a BitTorrent-like
protocol to do a distributed transfer of the broadcasted data to the executors.
- TorrentBroadcastFactory() - Constructor for class org.apache.spark.broadcast.TorrentBroadcastFactory
-
- toSchemaRDD() - Method in class org.apache.spark.sql.SchemaRDD
-
Returns this RDD as a SchemaRDD.
- toSparkContext(JavaSparkContext) - Static method in class org.apache.spark.api.java.JavaSparkContext
-
- toSplitInfo(Class<?>, String, InputSplit) - Static method in class org.apache.spark.scheduler.SplitInfo
-
- toSplitInfo(Class<?>, String, InputSplit) - Static method in class org.apache.spark.scheduler.SplitInfo
-
- toString() - Method in class org.apache.spark.Accumulable
-
- toString() - Method in class org.apache.spark.api.java.JavaRDD
-
- toString() - Method in class org.apache.spark.broadcast.Broadcast
-
- toString() - Method in class org.apache.spark.mllib.linalg.DenseVector
-
- toString() - Method in interface org.apache.spark.mllib.linalg.Matrix
-
- toString() - Method in class org.apache.spark.mllib.linalg.SparseVector
-
- toString() - Method in class org.apache.spark.mllib.regression.LabeledPoint
-
- toString() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
-
- toString() - Method in interface org.apache.spark.mllib.stat.test.TestResult
-
String explaining the hypothesis test result.
- toString() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
Print full model.
- toString() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
-
- toString() - Method in class org.apache.spark.mllib.tree.model.Node
-
- toString() - Method in class org.apache.spark.mllib.tree.model.Split
-
- toString() - Method in class org.apache.spark.partial.BoundedDouble
-
- toString() - Method in class org.apache.spark.partial.PartialResult
-
- toString() - Method in class org.apache.spark.rdd.RDD
-
- toString() - Method in class org.apache.spark.scheduler.InputFormatInfo
-
- toString() - Method in class org.apache.spark.scheduler.SplitInfo
-
- toString() - Method in class org.apache.spark.SerializableWritable
-
- toString() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
-
- toString() - Method in class org.apache.spark.storage.BlockId
-
- toString() - Method in class org.apache.spark.storage.BlockManagerId
-
- toString() - Method in class org.apache.spark.storage.RDDInfo
-
- toString() - Method in class org.apache.spark.storage.StorageLevel
-
- toString() - Method in class org.apache.spark.streaming.Duration
-
- toString() - Method in class org.apache.spark.streaming.Time
-
- toString() - Method in class org.apache.spark.util.MutablePair
-
- toString() - Method in class org.apache.spark.util.StatCounter
-
- toString() - Method in class org.apache.spark.util.Vector
-
- totalDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
-
Time taken for all the jobs of this batch to finish processing from the time they
were submitted.
- train(RDD<LabeledPoint>, int, double, double, Vector) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
-
Train a logistic regression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
-
Train a logistic regression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
-
Train a logistic regression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
-
Train a logistic regression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.classification.NaiveBayes
-
Trains a Naive Bayes model given an RDD of (label, features)
pairs.
- train(RDD<LabeledPoint>, double) - Static method in class org.apache.spark.mllib.classification.NaiveBayes
-
Trains a Naive Bayes model given an RDD of (label, features)
pairs.
- train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD
-
Train a SVM model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD
-
Train a SVM model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD
-
Train a SVM model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD
-
Train a SVM model given an RDD of (label, features) pairs.
- train(RDD<Vector>, int, int, int, String) - Static method in class org.apache.spark.mllib.clustering.KMeans
-
Trains a k-means model using the given set of parameters.
- train(RDD<Vector>, int, int) - Static method in class org.apache.spark.mllib.clustering.KMeans
-
Trains a k-means model using specified parameters and the default values for unspecified.
- train(RDD<Vector>, int, int, int) - Static method in class org.apache.spark.mllib.clustering.KMeans
-
Trains a k-means model using specified parameters and the default values for unspecified.
- train(RDD<Rating>, int, int, double, int, long) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
Train a matrix factorization model given an RDD of ratings given by users to some products,
in the form of (userID, productID, rating) pairs.
- train(RDD<Rating>, int, int, double, int) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
Train a matrix factorization model given an RDD of ratings given by users to some products,
in the form of (userID, productID, rating) pairs.
- train(RDD<Rating>, int, int, double) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
Train a matrix factorization model given an RDD of ratings given by users to some products,
in the form of (userID, productID, rating) pairs.
- train(RDD<Rating>, int, int) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
Train a matrix factorization model given an RDD of ratings given by users to some products,
in the form of (userID, productID, rating) pairs.
- train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD
-
Train a Lasso model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD
-
Train a Lasso model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD
-
Train a Lasso model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD
-
Train a Lasso model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
-
Train a Linear Regression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
-
Train a LinearRegression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
-
Train a LinearRegression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
-
Train a LinearRegression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
-
Train a RidgeRegression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
-
Train a RidgeRegression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
-
Train a RidgeRegression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
-
Train a RidgeRegression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.DecisionTree
-
Method to train a decision tree model over an RDD
- trainClassifier(RDD<LabeledPoint>, int, Map<Object, Object>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
-
Method to train a decision tree model for binary or multiclass classification.
- trainClassifier(JavaRDD<LabeledPoint>, int, Map<Integer, Integer>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
-
Java-friendly API for DecisionTree$.trainClassifier(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, int, scala.collection.immutable.Map<java.lang.Object, java.lang.Object>, java.lang.String, int, int)
- trainImplicit(RDD<Rating>, int, int, double, int, double, long) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
Train a matrix factorization model given an RDD of 'implicit preferences' given by users
to some products, in the form of (userID, productID, preference) pairs.
- trainImplicit(RDD<Rating>, int, int, double, int, double) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
Train a matrix factorization model given an RDD of 'implicit preferences' given by users
to some products, in the form of (userID, productID, preference) pairs.
- trainImplicit(RDD<Rating>, int, int, double, double) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
Train a matrix factorization model given an RDD of 'implicit preferences' given by users to
some products, in the form of (userID, productID, preference) pairs.
- trainImplicit(RDD<Rating>, int, int) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
Train a matrix factorization model given an RDD of 'implicit preferences' ratings given by
users to some products, in the form of (userID, productID, rating) pairs.
- trainOn(DStream<LabeledPoint>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
-
Update the model by training on batches of data from a DStream.
- trainRegressor(RDD<LabeledPoint>, Map<Object, Object>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
-
Method to train a decision tree model for regression.
- trainRegressor(JavaRDD<LabeledPoint>, Map<Integer, Integer>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
-
Java-friendly API for DecisionTree$.trainRegressor(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, scala.collection.immutable.Map<java.lang.Object, java.lang.Object>, java.lang.String, int, int)
- transform(Iterable<Object>) - Method in class org.apache.spark.mllib.feature.HashingTF
-
Transforms the input document into a sparse term frequency vector.
- transform(Iterable<?>) - Method in class org.apache.spark.mllib.feature.HashingTF
-
Transforms the input document into a sparse term frequency vector (Java version).
- transform(RDD<D>) - Method in class org.apache.spark.mllib.feature.HashingTF
-
Transforms the input document to term frequency vectors.
- transform(JavaRDD<D>) - Method in class org.apache.spark.mllib.feature.HashingTF
-
Transforms the input document to term frequency vectors (Java version).
- transform(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDFModel
-
Transforms term frequency (TF) vectors to TF-IDF vectors.
- transform(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDFModel
-
Transforms term frequency (TF) vectors to TF-IDF vectors (Java version).
- transform(Vector) - Method in class org.apache.spark.mllib.feature.Normalizer
-
Applies unit length normalization on a vector.
- transform(Vector) - Method in class org.apache.spark.mllib.feature.StandardScalerModel
-
Applies standardization transformation on a vector.
- transform(Vector) - Method in interface org.apache.spark.mllib.feature.VectorTransformer
-
Applies transformation on a vector.
- transform(RDD<Vector>) - Method in interface org.apache.spark.mllib.feature.VectorTransformer
-
Applies transformation on an RDD[Vector].
- transform(String) - Method in class org.apache.spark.mllib.feature.Word2VecModel
-
Transforms a word to its vector representation
- transform(Function<R, JavaRDD<U>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream.
- transform(Function2<R, Time, JavaRDD<U>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream.
- transform(List<JavaDStream<?>>, Function2<List<JavaRDD<?>>, Time, JavaRDD<T>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create a new DStream in which each RDD is generated by applying a function on RDDs of
the DStreams.
- transform(Function1<RDD<T>, RDD<U>>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream.
- transform(Function2<RDD<T>, Time, RDD<U>>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream.
- transform(Seq<DStream<?>>, Function2<Seq<RDD<?>>, Time, RDD<T>>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
-
Create a new DStream in which each RDD is generated by applying a function on RDDs of
the DStreams.
- transformToPair(Function<R, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream.
- transformToPair(Function2<R, Time, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream.
- transformToPair(List<JavaDStream<?>>, Function2<List<JavaRDD<?>>, Time, JavaPairRDD<K, V>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create a new DStream in which each RDD is generated by applying a function on RDDs of
the DStreams.
- transformWith(JavaDStream<U>, Function3<R, JavaRDD<U>, Time, JavaRDD<W>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream and 'other' DStream.
- transformWith(JavaPairDStream<K2, V2>, Function3<R, JavaPairRDD<K2, V2>, Time, JavaRDD<W>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream and 'other' DStream.
- transformWith(DStream<U>, Function2<RDD<T>, RDD<U>, RDD<V>>, ClassTag<U>, ClassTag<V>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream and 'other' DStream.
- transformWith(DStream<U>, Function3<RDD<T>, RDD<U>, Time, RDD<V>>, ClassTag<U>, ClassTag<V>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream and 'other' DStream.
- transformWithToPair(JavaDStream<U>, Function3<R, JavaRDD<U>, Time, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream and 'other' DStream.
- transformWithToPair(JavaPairDStream<K2, V2>, Function3<R, JavaPairRDD<K2, V2>, Time, JavaPairRDD<K3, V3>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream and 'other' DStream.
- truePositiveRate(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns true positive rate for a given label (category)
- TwitterUtils - Class in org.apache.spark.streaming.twitter
-
- TwitterUtils() - Constructor for class org.apache.spark.streaming.twitter.TwitterUtils
-
- U() - Method in class org.apache.spark.mllib.linalg.SingularValueDecomposition
-
- udf() - Method in class org.apache.spark.sql.execution.BatchPythonEvaluation
-
- udf() - Method in class org.apache.spark.sql.execution.EvaluatePython
-
- UDF1<T1,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 1 arguments.
- UDF10<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 10 arguments.
- UDF11<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 11 arguments.
- UDF12<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 12 arguments.
- UDF13<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 13 arguments.
- UDF14<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 14 arguments.
- UDF15<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 15 arguments.
- UDF16<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 16 arguments.
- UDF17<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 17 arguments.
- UDF18<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 18 arguments.
- UDF19<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 19 arguments.
- UDF2<T1,T2,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 2 arguments.
- UDF20<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,T20,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 20 arguments.
- UDF21<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,T20,T21,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 21 arguments.
- UDF22<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,T20,T21,T22,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 22 arguments.
- UDF3<T1,T2,T3,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 3 arguments.
- UDF4<T1,T2,T3,T4,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 4 arguments.
- UDF5<T1,T2,T3,T4,T5,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 5 arguments.
- UDF6<T1,T2,T3,T4,T5,T6,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 6 arguments.
- UDF7<T1,T2,T3,T4,T5,T6,T7,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 7 arguments.
- UDF8<T1,T2,T3,T4,T5,T6,T7,T8,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 8 arguments.
- UDF9<T1,T2,T3,T4,T5,T6,T7,T8,T9,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 9 arguments.
- ui() - Method in class org.apache.spark.SparkContext
-
- uiTab() - Method in class org.apache.spark.streaming.StreamingContext
-
- unbound() - Method in class org.apache.spark.sql.execution.Aggregate.ComputedAggregate
-
- unbroadcast(long, boolean, boolean) - Method in interface org.apache.spark.broadcast.BroadcastFactory
-
- unbroadcast(long, boolean, boolean) - Method in class org.apache.spark.broadcast.HttpBroadcastFactory
-
Remove all persisted state associated with the HTTP broadcast with the given ID.
- unbroadcast(long, boolean, boolean) - Method in class org.apache.spark.broadcast.TorrentBroadcastFactory
-
Remove all persisted state associated with the torrent broadcast with the given ID.
- uncacheTable(String) - Method in class org.apache.spark.sql.SQLContext
-
Removes the specified table from the in-memory cache.
- underlyingSplit() - Method in class org.apache.spark.scheduler.SplitInfo
-
- UniformGenerator - Class in org.apache.spark.mllib.random
-
:: DeveloperApi ::
Generates i.i.d.
- UniformGenerator() - Constructor for class org.apache.spark.mllib.random.UniformGenerator
-
- uniformJavaRDD(JavaSparkContext, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- uniformJavaRDD(JavaSparkContext, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- uniformJavaRDD(JavaSparkContext, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- uniformJavaVectorRDD(JavaSparkContext, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- uniformJavaVectorRDD(JavaSparkContext, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- uniformJavaVectorRDD(JavaSparkContext, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- uniformRDD(SparkContext, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Generates an RDD comprised of i.i.d.
- uniformVectorRDD(SparkContext, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Generates an RDD[Vector] with vectors containing i.i.d.
- union(JavaDoubleRDD) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return the union of this RDD and another one.
- union(JavaPairRDD<K, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return the union of this RDD and another one.
- union(JavaRDD<T>) - Method in class org.apache.spark.api.java.JavaRDD
-
Return the union of this RDD and another one.
- union(JavaRDD<T>, List<JavaRDD<T>>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Build the union of two or more RDDs.
- union(JavaPairRDD<K, V>, List<JavaPairRDD<K, V>>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Build the union of two or more RDDs.
- union(JavaDoubleRDD, List<JavaDoubleRDD>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Build the union of two or more RDDs.
- union(RDD<T>) - Method in class org.apache.spark.rdd.RDD
-
Return the union of this RDD and another one.
- union(Seq<RDD<T>>, ClassTag<T>) - Method in class org.apache.spark.SparkContext
-
Build the union of a list of RDDs.
- union(RDD<T>, Seq<RDD<T>>, ClassTag<T>) - Method in class org.apache.spark.SparkContext
-
Build the union of a list of RDDs passed as variable-length arguments.
- Union - Class in org.apache.spark.sql.execution
-
:: DeveloperApi ::
- Union(Seq<SparkPlan>) - Constructor for class org.apache.spark.sql.execution.Union
-
- union(JavaDStream<T>) - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
Return a new DStream by unifying data of another DStream with this DStream.
- union(JavaPairDStream<K, V>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by unifying data of another DStream with this DStream.
- union(JavaDStream<T>, List<JavaDStream<T>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create a unified DStream from multiple DStreams of the same type and same slide duration.
- union(JavaPairDStream<K, V>, List<JavaPairDStream<K, V>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create a unified DStream from multiple DStreams of the same type and same slide duration.
- union(DStream<T>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream by unifying data of another DStream with this DStream.
- union(Seq<DStream<T>>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
-
Create a unified DStream from multiple DStreams of the same type and same slide duration.
- unionAll(SchemaRDD) - Method in class org.apache.spark.sql.SchemaRDD
-
Combines the tuples of two RDDs with the same schema, keeping duplicates.
- UnionRDD<T> - Class in org.apache.spark.rdd
-
- UnionRDD(SparkContext, Seq<RDD<T>>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.UnionRDD
-
- uniqueId() - Method in class org.apache.spark.storage.StreamBlockId
-
- UnknownReason - Class in org.apache.spark
-
:: DeveloperApi ::
We don't know why the task ended -- for example, because of a ClassNotFound exception when
deserializing the task result.
- UnknownReason() - Constructor for class org.apache.spark.UnknownReason
-
- unpersist() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
- unpersist(boolean) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
- unpersist() - Method in class org.apache.spark.api.java.JavaPairRDD
-
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
- unpersist(boolean) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
- unpersist() - Method in class org.apache.spark.api.java.JavaRDD
-
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
- unpersist(boolean) - Method in class org.apache.spark.api.java.JavaRDD
-
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
- unpersist() - Method in class org.apache.spark.broadcast.Broadcast
-
Asynchronously delete cached copies of this broadcast on the executors.
- unpersist(boolean) - Method in class org.apache.spark.broadcast.Broadcast
-
Delete cached copies of this broadcast on the executors.
- unpersist() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Unpersist intermediate RDDs used in the computation.
- unpersist(boolean) - Method in class org.apache.spark.rdd.RDD
-
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
- unpersist(boolean) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
-
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
- until(Time, Duration) - Method in class org.apache.spark.streaming.Time
-
- update() - Method in class org.apache.spark.scheduler.AccumulableInfo
-
- update() - Method in class org.apache.spark.sql.execution.AggregateEvaluation
-
- update(T1, T2) - Method in class org.apache.spark.util.MutablePair
-
Updates this pair with new values and returns itself
- updateAggregateMetrics(UIData.StageUIData, String, TaskMetrics, Option<TaskMetrics>) - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
Upon receiving new metrics for a task, updates the per-stage and per-executor-per-stage
aggregate metrics by calculating deltas between the currently recorded metrics and the new
metrics.
- Updater - Class in org.apache.spark.mllib.optimization
-
:: DeveloperApi ::
Class used to perform steps (weight update) using Gradient Descent methods.
- Updater() - Constructor for class org.apache.spark.mllib.optimization.Updater
-
- updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new "state" DStream where the state for each key is updated by applying
the given function on the previous state of the key and the new values of each key.
- updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new "state" DStream where the state for each key is updated by applying
the given function on the previous state of the key and the new values of each key.
- updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new "state" DStream where the state for each key is updated by applying
the given function on the previous state of the key and the new values of the key.
- updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new "state" DStream where the state for each key is updated by applying
the given function on the previous state of the key and the new values of each key.
- updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, int, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new "state" DStream where the state for each key is updated by applying
the given function on the previous state of the key and the new values of each key.
- updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, Partitioner, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new "state" DStream where the state for each key is updated by applying
the given function on the previous state of the key and the new values of the key.
- updateStateByKey(Function1<Iterator<Tuple3<K, Seq<V>, Option<S>>>, Iterator<Tuple2<K, S>>>, Partitioner, boolean, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new "state" DStream where the state for each key is updated by applying
the given function on the previous state of the key and the new values of each key.
- useDisk() - Method in class org.apache.spark.storage.StorageLevel
-
- useMemory() - Method in class org.apache.spark.storage.StorageLevel
-
- useOffHeap() - Method in class org.apache.spark.storage.StorageLevel
-
- user() - Method in class org.apache.spark.mllib.recommendation.Rating
-
- user() - Method in class org.apache.spark.scheduler.JobLogger
-
- userFeatures() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-