Index (Spark 1.1.1 JavaDoc)

A B C D E F G H I J K L M N O P Q R S T U V W Z _

A

Accumulable<R,T> - Class in org.apache.spark: A data type that can be accumulated, i.e., has a commutative and associative "add" operation, but where the result type, R, may be different from the element type being added, T.
Accumulable(R, AccumulableParam<R, T>, Option<String>) - Constructor for class org.apache.spark.Accumulable
Accumulable(R, AccumulableParam<R, T>) - Constructor for class org.apache.spark.Accumulable
accumulable(T, AccumulableParam<T, R>) - Method in class org.apache.spark.api.java.JavaSparkContext: Create an Accumulable shared variable of the given type, to which tasks can "add" values with add.
accumulable(T, String, AccumulableParam<T, R>) - Method in class org.apache.spark.api.java.JavaSparkContext: Create an Accumulable shared variable of the given type, to which tasks can "add" values with add.
accumulable(T, AccumulableParam<T, R>) - Method in class org.apache.spark.SparkContext: Create an Accumulable shared variable, to which tasks can add values with +=.
accumulable(T, String, AccumulableParam<T, R>) - Method in class org.apache.spark.SparkContext: Create an Accumulable shared variable, with a name for display in the Spark UI.
accumulableCollection(R, Function1<R, Growable<T>>, ClassTag<R>) - Method in class org.apache.spark.SparkContext: Create an accumulator from a "mutable collection" type.
AccumulableInfo - Class in org.apache.spark.scheduler: :: DeveloperApi :: Information about an Accumulable modified during a task or stage.
AccumulableInfo(long, String, Option<String>, String) - Constructor for class org.apache.spark.scheduler.AccumulableInfo
AccumulableParam<R,T> - Interface in org.apache.spark: Helper object defining how to accumulate values of a particular type.
accumulables() - Method in class org.apache.spark.scheduler.StageInfo: Terminal values of accumulables updated during this stage.
accumulables() - Method in class org.apache.spark.scheduler.TaskInfo: Intermediate updates to accumulables during this task.
Accumulator<T> - Class in org.apache.spark: A simpler value of Accumulable where the result type being accumulated is the same as the type of the elements being merged.
Accumulator(T, AccumulatorParam<T>, Option<String>) - Constructor for class org.apache.spark.Accumulator
Accumulator(T, AccumulatorParam<T>) - Constructor for class org.apache.spark.Accumulator
accumulator(int) - Method in class org.apache.spark.api.java.JavaSparkContext: Create an Accumulator integer variable, which tasks can "add" values to using the add method.
accumulator(int, String) - Method in class org.apache.spark.api.java.JavaSparkContext: Create an Accumulator integer variable, which tasks can "add" values to using the add method.
accumulator(double) - Method in class org.apache.spark.api.java.JavaSparkContext: Create an Accumulator double variable, which tasks can "add" values to using the add method.
accumulator(double, String) - Method in class org.apache.spark.api.java.JavaSparkContext: Create an Accumulator double variable, which tasks can "add" values to using the add method.
accumulator(T, AccumulatorParam<T>) - Method in class org.apache.spark.api.java.JavaSparkContext: Create an Accumulator variable of a given type, which tasks can "add" values to using the add method.
accumulator(T, String, AccumulatorParam<T>) - Method in class org.apache.spark.api.java.JavaSparkContext: Create an Accumulator variable of a given type, which tasks can "add" values to using the add method.
accumulator(T, AccumulatorParam<T>) - Method in class org.apache.spark.SparkContext: Create an Accumulator variable of a given type, which tasks can "add" values to using the += method.
accumulator(T, String, AccumulatorParam<T>) - Method in class org.apache.spark.SparkContext: Create an Accumulator variable of a given type, with a name for display in the Spark UI.
AccumulatorParam<T> - Interface in org.apache.spark: A simpler version of AccumulableParam where the only data type you can add in is the same type as the accumulated value.
active() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
activeStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
actor() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
ActorHelper - Interface in org.apache.spark.streaming.receiver: :: DeveloperApi :: A receiver trait to be mixed in with your Actor to gain access to the API for pushing received data into Spark Streaming for processing.
actorStream(Props, String, StorageLevel, SupervisorStrategy) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream with an arbitrary user-implemented actor receiver.
actorStream(Props, String, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream with an arbitrary user-implemented actor receiver.
actorStream(Props, String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream with an arbitrary user-implemented actor receiver.
actorStream(Props, String, StorageLevel, SupervisorStrategy, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream with an arbitrary user-implemented actor receiver.
ActorSupervisorStrategy - Class in org.apache.spark.streaming.receiver: :: DeveloperApi :: A helper with a set of defaults for the supervisor strategy.
ActorSupervisorStrategy() - Constructor for class org.apache.spark.streaming.receiver.ActorSupervisorStrategy
actorSystem() - Method in class org.apache.spark.SparkEnv
add(T) - Method in class org.apache.spark.Accumulable: Add more data to this accumulator / accumulable
add(Vector) - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator: Adds a new document.
add(Vector) - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer: Add a new sample to this summarizer, and update the statistical summary.
add(Vector) - Method in class org.apache.spark.util.Vector
addAccumulator(R, T) - Method in interface org.apache.spark.AccumulableParam: Add additional data to the accumulator value.
addAccumulator(T, T) - Method in interface org.apache.spark.AccumulatorParam
addedFiles() - Method in class org.apache.spark.SparkContext
addedJars() - Method in class org.apache.spark.SparkContext
addFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Add a file to be downloaded with this Spark job on every node.
addFile(String) - Method in class org.apache.spark.SparkContext: Add a file to be downloaded with this Spark job on every node.
addInPlace(R, R) - Method in interface org.apache.spark.AccumulableParam: Merge two accumulated values together.
addInPlace(double, double) - Method in class org.apache.spark.SparkContext.DoubleAccumulatorParam$
addInPlace(float, float) - Method in class org.apache.spark.SparkContext.FloatAccumulatorParam$
addInPlace(int, int) - Method in class org.apache.spark.SparkContext.IntAccumulatorParam$
addInPlace(long, long) - Method in class org.apache.spark.SparkContext.LongAccumulatorParam$
addInPlace(Vector) - Method in class org.apache.spark.util.Vector
addInPlace(Vector, Vector) - Method in class org.apache.spark.util.Vector.VectorAccumParam$
addJar(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Adds a JAR dependency for all tasks to be executed on this SparkContext in the future.
addJar(String) - Method in class org.apache.spark.SparkContext: Adds a JAR dependency for all tasks to be executed on this SparkContext in the future.
addLocalConfiguration(String, int, int, int, JobConf) - Static method in class org.apache.spark.rdd.HadoopRDD: Add Hadoop configuration specific to a single partition and attempt.
addOnCompleteCallback(Function0<BoxedUnit>) - Method in class org.apache.spark.TaskContext: Add a callback function to be executed on task completion.
addSparkListener(SparkListener) - Method in class org.apache.spark.SparkContext: :: DeveloperApi :: Register a listener to receive up-calls from events that happen during execution.
addStreamingListener(StreamingListener) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Add a StreamingListener object for receiving system events related to streaming.
addStreamingListener(StreamingListener) - Method in class org.apache.spark.streaming.StreamingContext: Add a StreamingListener object for receiving system events related to streaming.
addTaskCompletionListener(TaskCompletionListener) - Method in class org.apache.spark.TaskContext: Add a (Java friendly) listener to be executed on task completion.
addTaskCompletionListener(Function1<TaskContext, BoxedUnit>) - Method in class org.apache.spark.TaskContext: Add a listener in the form of a Scala closure to be executed on task completion.
aggregate(U, Function2<U, T, U>, Function2<U, U, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Aggregate the elements of each partition, and then the results for all the partitions, using given combine functions and a neutral "zero value".
aggregate(U, Function2<U, T, U>, Function2<U, U, U>, ClassTag) - Method in class org.apache.spark.rdd.RDD: Aggregate the elements of each partition, and then the results for all the partitions, using given combine functions and a neutral "zero value".
Aggregate - Class in org.apache.spark.sql.execution: :: DeveloperApi :: Groups input data by groupingExpressions and computes the aggregateExpressions for each group.
Aggregate(boolean, Seq<Expression>, Seq<NamedExpression>, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Aggregate
aggregate() - Method in class org.apache.spark.sql.execution.Aggregate.ComputedAggregate
aggregate(Seq<Expression>) - Method in class org.apache.spark.sql.SchemaRDD: Performs an aggregation over all Rows in this RDD.
Aggregate.ComputedAggregate - Class in org.apache.spark.sql.execution: An aggregate that needs to be computed for each row in a group.
Aggregate.ComputedAggregate(AggregateExpression, AggregateExpression, AttributeReference) - Constructor for class org.apache.spark.sql.execution.Aggregate.ComputedAggregate
Aggregate.ComputedAggregate$ - Class in org.apache.spark.sql.execution
Aggregate.ComputedAggregate$() - Constructor for class org.apache.spark.sql.execution.Aggregate.ComputedAggregate$
aggregateByKey(U, Partitioner, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD: Aggregate the values of each key, using given combine functions and a neutral "zero value".
aggregateByKey(U, int, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD: Aggregate the values of each key, using given combine functions and a neutral "zero value".
aggregateByKey(U, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD: Aggregate the values of each key, using given combine functions and a neutral "zero value".
aggregateByKey(U, Partitioner, Function2<U, V, U>, Function2<U, U, U>, ClassTag) - Method in class org.apache.spark.rdd.PairRDDFunctions: Aggregate the values of each key, using given combine functions and a neutral "zero value".
aggregateByKey(U, int, Function2<U, V, U>, Function2<U, U, U>, ClassTag) - Method in class org.apache.spark.rdd.PairRDDFunctions: Aggregate the values of each key, using given combine functions and a neutral "zero value".
aggregateByKey(U, Function2<U, V, U>, Function2<U, U, U>, ClassTag) - Method in class org.apache.spark.rdd.PairRDDFunctions: Aggregate the values of each key, using given combine functions and a neutral "zero value".
AggregateEvaluation - Class in org.apache.spark.sql.execution
AggregateEvaluation(Seq<Attribute>, Seq<Expression>, Seq<Expression>, Expression) - Constructor for class org.apache.spark.sql.execution.AggregateEvaluation
aggregateExpressions() - Method in class org.apache.spark.sql.execution.Aggregate
aggregateExpressions() - Method in class org.apache.spark.sql.execution.GeneratedAggregate
Aggregator<K,V,C> - Class in org.apache.spark: :: DeveloperApi :: A set of functions used to aggregate data.
Aggregator(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Constructor for class org.apache.spark.Aggregator
aggregator() - Method in class org.apache.spark.ShuffleDependency
Algo - Class in org.apache.spark.mllib.tree.configuration: :: Experimental :: Enum to select the algorithm for the decision tree
Algo() - Constructor for class org.apache.spark.mllib.tree.configuration.Algo
algo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
algo() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
algorithm() - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
ALL_COMPRESSION_CODECS() - Method in interface org.apache.spark.io.CompressionCodec
AlphaComponent - Annotation Type in org.apache.spark.annotation: A new component of Spark which may have unstable APIs.
alreadyPlanned() - Method in class org.apache.spark.sql.execution.SparkLogicalPlan
ALS - Class in org.apache.spark.mllib.recommendation: Alternating Least Squares matrix factorization.
ALS() - Constructor for class org.apache.spark.mllib.recommendation.ALS: Constructs an ALS instance with default parameters: {numBlocks: -1, rank: 10, iterations: 10, lambda: 0.01, implicitPrefs: false, alpha: 1.0}.
ALS.BlockStats - Class in org.apache.spark.mllib.recommendation: :: DeveloperApi :: Statistics of a block in ALS computation.
ALS.BlockStats(String, int, long, long, long, long) - Constructor for class org.apache.spark.mllib.recommendation.ALS.BlockStats
ALS.BlockStats$ - Class in org.apache.spark.mllib.recommendation
ALS.BlockStats$() - Constructor for class org.apache.spark.mllib.recommendation.ALS.BlockStats$
analyze(String) - Method in class org.apache.spark.sql.hive.HiveContext: Analyzes the given table in the current database to generate statistics, which will be used in query optimizations.
analyzeBlocks(RDD<Rating>, int, int) - Static method in class org.apache.spark.mllib.recommendation.ALS: :: DeveloperApi :: Given an RDD of ratings, number of user blocks, and number of product blocks, computes the statistics of each block in ALS computation.
analyzed() - Method in class org.apache.spark.sql.hive.test.TestHiveContext.QueryExecution
AnalyzeTable - Class in org.apache.spark.sql.hive.execution: :: DeveloperApi :: Analyzes the given table in the current database to generate statistics, which will be used in query optimizations.
AnalyzeTable(String) - Constructor for class org.apache.spark.sql.hive.execution.AnalyzeTable
ANY() - Static method in class org.apache.spark.scheduler.TaskLocality
appendBias(Vector) - Static method in class org.apache.spark.mllib.util.MLUtils: Returns a new vector with 1.0 (bias) appended to the input vector.
apply(int) - Method in class org.apache.spark.mllib.linalg.DenseVector
apply(int, int) - Method in interface org.apache.spark.mllib.linalg.Matrix: Gets the (i, j)-th element.
apply(int) - Method in interface org.apache.spark.mllib.linalg.Vector: Gets the value of the ith element.
apply(long, String, Option<String>, String) - Static method in class org.apache.spark.scheduler.AccumulableInfo
apply(long, String, String) - Static method in class org.apache.spark.scheduler.AccumulableInfo
apply(String) - Static method in class org.apache.spark.storage.BlockId: Converts a BlockId "name" String back into a BlockId.
apply(String, String, int, int) - Static method in class org.apache.spark.storage.BlockManagerId: Returns a BlockManagerId for the given configuration.
apply(ObjectInput) - Static method in class org.apache.spark.storage.BlockManagerId
apply(boolean, boolean, boolean, boolean, int) - Static method in class org.apache.spark.storage.StorageLevel: :: DeveloperApi :: Create a new StorageLevel object.
apply(boolean, boolean, boolean, int) - Static method in class org.apache.spark.storage.StorageLevel: :: DeveloperApi :: Create a new StorageLevel object without setting useOffHeap.
apply(int, int) - Static method in class org.apache.spark.storage.StorageLevel: :: DeveloperApi :: Create a new StorageLevel object from its integer representation.
apply(ObjectInput) - Static method in class org.apache.spark.storage.StorageLevel: :: DeveloperApi :: Read StorageLevel object from ObjectInput stream.
apply(long) - Static method in class org.apache.spark.streaming.Milliseconds
apply(long) - Static method in class org.apache.spark.streaming.Minutes
apply(long) - Static method in class org.apache.spark.streaming.Seconds
apply(TraversableOnce<Object>) - Static method in class org.apache.spark.util.StatCounter: Build a StatCounter from a list of values.
apply(Seq<Object>) - Static method in class org.apache.spark.util.StatCounter: Build a StatCounter from a list of values passed as variable-length arguments.
apply(int) - Method in class org.apache.spark.util.Vector
applySchema(JavaRDD<?>, Class<?>) - Method in class org.apache.spark.sql.api.java.JavaSQLContext: Applies a schema to an RDD of Java Beans.
applySchema(JavaRDD<Row>, StructType) - Method in class org.apache.spark.sql.api.java.JavaSQLContext: :: DeveloperApi :: Creates a JavaSchemaRDD from an RDD containing Rows by applying a schema to this RDD.
applySchema(RDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext: :: DeveloperApi :: Creates a SchemaRDD from an RDD containing Rows by applying a schema to this RDD.
appName() - Method in class org.apache.spark.api.java.JavaSparkContext
appName() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
appName() - Method in class org.apache.spark.SparkContext
ApproxHist() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
areaUnderPR() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Computes the area under the precision-recall curve.
areaUnderROC() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Computes the area under the receiver operating characteristic (ROC) curve.
ArrayType - Class in org.apache.spark.sql.api.java: The data type representing Lists.
as(Symbol) - Method in class org.apache.spark.sql.SchemaRDD: Applies a qualifier to the attributes of this relation.
asIterator() - Method in class org.apache.spark.serializer.DeserializationStream: Read the elements of this stream through an iterator.
asRDDId() - Method in class org.apache.spark.storage.BlockId
AsyncRDDActions<T> - Class in org.apache.spark.rdd: :: Experimental :: A set of asynchronous RDD actions available through an implicit conversion.
AsyncRDDActions(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.AsyncRDDActions
attempt() - Method in class org.apache.spark.scheduler.TaskInfo
attemptId() - Method in class org.apache.spark.scheduler.StageInfo
attemptId() - Method in class org.apache.spark.TaskContext
attributes() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
attributes() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
awaitTermination() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Wait for the execution to stop.
awaitTermination(long) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Wait for the execution to stop.
awaitTermination() - Method in class org.apache.spark.streaming.StreamingContext: Wait for the execution to stop.
awaitTermination(long) - Method in class org.apache.spark.streaming.StreamingContext: Wait for the execution to stop.
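
The aggregate and aggregateByKey entries above all take a neutral "zero value" plus two combine functions: one that folds an element into a running result within a partition, and one that merges results across partitions. The following is a minimal plain-Java sketch of that contract, not Spark's implementation. It uses no Spark classes; the class name AggregateSketch, the nested-list stand-in for partitions, and the parameter names seqOp and combOp are assumptions made for illustration, and it assumes an immutable zero value so that reusing it for every partition is safe.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;

/** Plain-Java illustration of the aggregate contract; no Spark classes are used. */
final class AggregateSketch {

    /**
     * Folds each partition with seqOp starting from zero, then merges the
     * per-partition results with combOp, also starting from zero.
     * Assumes zero is immutable (hypothetical simplification).
     */
    static <T, U> U aggregate(List<List<T>> partitions, U zero,
                              BiFunction<U, T, U> seqOp, BinaryOperator<U> combOp) {
        U result = zero;
        for (List<T> partition : partitions) {
            U partial = zero;
            for (T element : partition) {
                // Fold one element into this partition's running result.
                partial = seqOp.apply(partial, element);
            }
            // Merge this partition's result into the overall result.
            result = combOp.apply(result, partial);
        }
        return result;
    }

    public static void main(String[] args) {
        // Three hypothetical partitions holding the integers 1..5.
        List<List<Integer>> partitions = Arrays.asList(
                Arrays.asList(1, 2), Arrays.asList(3), Arrays.asList(4, 5));
        int sum = aggregate(partitions, 0, (acc, x) -> acc + x, Integer::sum);
        System.out.println(sum); // prints 15
    }
}
```

Because the zero value is applied once per partition and once more when merging, it should be a true identity for both functions (0 for sums, an empty collection for concatenation); otherwise the result would depend on how the data happens to be partitioned.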

B

baseLogicalPlan() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
baseLogicalPlan() - Method in class org.apache.spark.sql.SchemaRDD
baseSchemaRDD() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
baseSchemaRDD() - Method in class org.apache.spark.sql.SchemaRDD
BatchInfo - Class in org.apache.spark.streaming.scheduler: :: DeveloperApi :: Class having information on completed batches.
BatchInfo(Time, Map<Object, ReceivedBlockInfo[]>, long, Option<Object>, Option<Object>) - Constructor for class org.apache.spark.streaming.scheduler.BatchInfo
batchInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchCompleted
batchInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchStarted
batchInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchSubmitted
batchInfos() - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
BatchPythonEvaluation - Class in org.apache.spark.sql.execution: :: DeveloperApi :: Uses PythonRDD to evaluate a PythonUDF, one partition of tuples at a time.
BatchPythonEvaluation(PythonUDF, Seq<Attribute>, SparkPlan) - Constructor for class org.apache.spark.sql.execution.BatchPythonEvaluation
batchTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
BernoulliSampler<T> - Class in org.apache.spark.util.random: :: DeveloperApi :: A sampler based on Bernoulli trials.
BernoulliSampler(double, double, boolean) - Constructor for class org.apache.spark.util.random.BernoulliSampler
BernoulliSampler(double) - Constructor for class org.apache.spark.util.random.BernoulliSampler
BinaryClassificationMetrics - Class in org.apache.spark.mllib.evaluation: :: Experimental :: Evaluator for binary classification.
BinaryClassificationMetrics(RDD<Tuple2<Object, Object>>) - Constructor for class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
binaryLabelValidator() - Static method in class org.apache.spark.mllib.util.DataValidators: Function to check if labels used for classification are either zero or one.
BinaryType - Class in org.apache.spark.sql.api.java: The data type representing byte[] values.
BinaryType - Static variable in class org.apache.spark.sql.api.java.DataType: Gets the BinaryType object.
BlockId - Class in org.apache.spark.storage: :: DeveloperApi :: Identifies a particular Block of data, usually associated with a single file.
BlockId() - Constructor for class org.apache.spark.storage.BlockId
blockManager() - Method in class org.apache.spark.SparkEnv
blockManagerId() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
blockManagerId() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerRemoved
BlockManagerId - Class in org.apache.spark.storage: :: DeveloperApi :: This class represents a unique identifier for a BlockManager.
blockManagerId() - Method in class org.apache.spark.storage.StorageStatus
blockManagerIdCache() - Static method in class org.apache.spark.storage.BlockManagerId
blockManagerIds() - Method in class org.apache.spark.ui.jobs.JobProgressListener
BlockNotFoundException - Exception in org.apache.spark.storage
BlockNotFoundException(String) - Constructor for exception org.apache.spark.storage.BlockNotFoundException
blocks() - Method in class org.apache.spark.storage.StorageStatus: Return the blocks stored in this block manager.
BlockStatus - Class in org.apache.spark.storage
BlockStatus(StorageLevel, long, long, long) - Constructor for class org.apache.spark.storage.BlockStatus
bmAddress() - Method in class org.apache.spark.FetchFailed
BooleanType - Class in org.apache.spark.sql.api.java: The data type representing boolean and Boolean values.
BooleanType - Static variable in class org.apache.spark.sql.api.java.DataType: Gets the BooleanType object.
booleanWritableConverter() - Static method in class org.apache.spark.SparkContext
boolToBoolWritable(boolean) - Static method in class org.apache.spark.SparkContext
boundCondition() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
boundCondition() - Method in class org.apache.spark.sql.execution.LeftSemiJoinBNL
BoundedDouble - Class in org.apache.spark.partial: :: Experimental :: A Double value with error bars and associated confidence.
BoundedDouble(double, double, double, double) - Constructor for class org.apache.spark.partial.BoundedDouble
boundGenerator() - Method in class org.apache.spark.sql.execution.Generate
broadcast(T) - Method in class org.apache.spark.api.java.JavaSparkContext: Broadcast a read-only variable to the cluster, returning a Broadcast object for reading it in distributed functions.
Broadcast<T> - Class in org.apache.spark.broadcast: A broadcast variable.
Broadcast(long, ClassTag<T>) - Constructor for class org.apache.spark.broadcast.Broadcast
broadcast(T, ClassTag<T>) - Method in class org.apache.spark.SparkContext: Broadcast a read-only variable to the cluster, returning a Broadcast object for reading it in distributed functions.
broadcast() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
broadcast() - Method in class org.apache.spark.sql.execution.LeftSemiJoinBNL
BROADCAST() - Static method in class org.apache.spark.storage.BlockId
BroadcastBlockId - Class in org.apache.spark.storage
BroadcastBlockId(long, String) - Constructor for class org.apache.spark.storage.BroadcastBlockId
BroadcastFactory - Interface in org.apache.spark.broadcast: :: DeveloperApi :: An interface for all the broadcast implementations in Spark (to allow multiple broadcast implementations).
broadcastFuture() - Method in class org.apache.spark.sql.execution.BroadcastHashJoin
BroadcastHashJoin - Class in org.apache.spark.sql.execution: :: DeveloperApi :: Performs an inner hash join of two child relations.
BroadcastHashJoin(Seq<Expression>, Seq<Expression>, BuildSide, SparkPlan, SparkPlan) - Constructor for class org.apache.spark.sql.execution.BroadcastHashJoin
broadcastId() - Method in class org.apache.spark.storage.BroadcastBlockId
broadcastManager() - Method in class org.apache.spark.SparkEnv
BroadcastNestedLoopJoin - Class in org.apache.spark.sql.execution: :: DeveloperApi ::
BroadcastNestedLoopJoin(SparkPlan, SparkPlan, BuildSide, JoinType, Option<Expression>) - Constructor for class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
build(Node[]) - Method in class org.apache.spark.mllib.tree.model.Node: Build the left and right nodes if this node is not a leaf.
buildKeys() - Method in interface org.apache.spark.sql.execution.HashJoin
BuildLeft - Class in org.apache.spark.sql.execution
BuildLeft() - Constructor for class org.apache.spark.sql.execution.BuildLeft
buildPlan() - Method in interface org.apache.spark.sql.execution.HashJoin
buildProjection() - Method in class org.apache.spark.sql.execution.Project
BuildRight - Class in org.apache.spark.sql.execution
BuildRight() - Constructor for class org.apache.spark.sql.execution.BuildRight
buildSide() - Method in class org.apache.spark.sql.execution.BroadcastHashJoin
buildSide() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
BuildSide - Class in org.apache.spark.sql.execution
BuildSide() - Constructor for class org.apache.spark.sql.execution.BuildSide
buildSide() - Method in interface org.apache.spark.sql.execution.HashJoin
buildSide() - Method in class org.apache.spark.sql.execution.LeftSemiJoinHash
buildSide() - Method in class org.apache.spark.sql.execution.ShuffledHashJoin
buildSideKeyGenerator() - Method in interface org.apache.spark.sql.execution.HashJoin
bytesToBytesWritable(byte[]) - Static method in class org.apache.spark.SparkContext
bytesWritableConverter() - Static method in class org.apache.spark.SparkContext
ByteType - Class in org.apache.spark.sql.api.java: The data type representing byte and Byte values.
ByteType - Static variable in class org.apache.spark.sql.api.java.DataType: Gets the ByteType object.
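
The BernoulliSampler entries above describe "a sampler based on Bernoulli trials". As a rough illustration of that idea only, the sketch below keeps each element independently with a fixed probability. It is plain Java and does not reproduce Spark's class; the name BernoulliSketch, the sample method, and its fraction and seed parameters are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Plain-Java illustration of Bernoulli sampling; not Spark's BernoulliSampler. */
final class BernoulliSketch {

    /**
     * Runs one Bernoulli trial per element: each item is kept independently
     * with probability fraction. The seed makes the result reproducible.
     */
    static <T> List<T> sample(List<T> items, double fraction, long seed) {
        Random rng = new Random(seed);
        List<T> kept = new ArrayList<>();
        for (T item : items) {
            // nextDouble() is uniform in [0.0, 1.0), so this succeeds
            // with probability fraction.
            if (rng.nextDouble() < fraction) {
                kept.add(item);
            }
        }
        return kept;
    }
}
```

With this formulation a fraction of 0.0 keeps nothing and a fraction of 1.0 keeps everything; for values in between, the size of the sample is itself random, and only its expected value equals fraction times the input size.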

C

cache() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Persist this RDD with the default storage level (`MEMORY_ONLY`).
cache() - Method in class org.apache.spark.api.java.JavaPairRDD: Persist this RDD with the default storage level (`MEMORY_ONLY`).
cache() - Method in class org.apache.spark.api.java.JavaRDD: Persist this RDD with the default storage level (`MEMORY_ONLY`).
cache() - Method in class org.apache.spark.rdd.RDD: Persist this RDD with the default storage level (`MEMORY_ONLY`).
cache() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD: Persist this RDD with the default storage level (`MEMORY_ONLY`).
cache() - Method in class org.apache.spark.streaming.api.java.JavaDStream: Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
cache() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
cache() - Method in class org.apache.spark.streaming.dstream.DStream: Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
CacheCommand - Class in org.apache.spark.sql.execution: :: DeveloperApi ::
CacheCommand(String, boolean, SQLContext) - Constructor for class org.apache.spark.sql.execution.CacheCommand
cacheManager() - Method in class org.apache.spark.SparkEnv
cacheTable(String) - Method in class org.apache.spark.sql.SQLContext: Caches the specified table in-memory.
cacheTables() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy: :: DeveloperApi :: information calculation for multiclass classification
calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy: :: DeveloperApi :: variance calculation
calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Gini: :: DeveloperApi :: information calculation for multiclass classification
calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Gini: :: DeveloperApi :: variance calculation
calculate(double[], double) - Method in interface org.apache.spark.mllib.tree.impurity.Impurity: :: DeveloperApi :: information calculation for multiclass classification
calculate(double, double, double) - Method in interface org.apache.spark.mllib.tree.impurity.Impurity: :: DeveloperApi :: information calculation for regression
calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Variance: :: DeveloperApi :: information calculation for multiclass classification
calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Variance: :: DeveloperApi :: variance calculation
call(T) - Method in interface org.apache.spark.api.java.function.DoubleFlatMapFunction
call(T) - Method in interface org.apache.spark.api.java.function.DoubleFunction
call(T) - Method in interface org.apache.spark.api.java.function.FlatMapFunction
call(T1, T2) - Method in interface org.apache.spark.api.java.function.FlatMapFunction2
call(T1) - Method in interface org.apache.spark.api.java.function.Function
call(T1, T2) - Method in interface org.apache.spark.api.java.function.Function2
call(T1, T2, T3) - Method in interface org.apache.spark.api.java.function.Function3
call(T) - Method in interface org.apache.spark.api.java.function.PairFlatMapFunction
call(T) - Method in interface org.apache.spark.api.java.function.PairFunction
call(T) - Method in interface org.apache.spark.api.java.function.VoidFunction
call(T1) - Method in interface org.apache.spark.sql.api.java.UDF1
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10) - Method in interface org.apache.spark.sql.api.java.UDF10
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11) - Method in interface org.apache.spark.sql.api.java.UDF11
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12) - Method in interface org.apache.spark.sql.api.java.UDF12
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13) - Method in interface org.apache.spark.sql.api.java.UDF13
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14) - Method in interface org.apache.spark.sql.api.java.UDF14
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15) - Method in interface org.apache.spark.sql.api.java.UDF15
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16) - Method in interface org.apache.spark.sql.api.java.UDF16
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17) - Method in interface org.apache.spark.sql.api.java.UDF17
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18) - Method in interface org.apache.spark.sql.api.java.UDF18
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19) - Method in interface org.apache.spark.sql.api.java.UDF19
call(T1, T2) - Method in interface org.apache.spark.sql.api.java.UDF2
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20) - Method in interface org.apache.spark.sql.api.java.UDF20
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20, T21) - Method in interface org.apache.spark.sql.api.java.UDF21
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20, T21, T22) - Method in interface org.apache.spark.sql.api.java.UDF22
call(T1, T2, T3) - Method in interface org.apache.spark.sql.api.java.UDF3
call(T1, T2, T3, T4) - Method in interface org.apache.spark.sql.api.java.UDF4
call(T1, T2, T3, T4, T5) - Method in interface org.apache.spark.sql.api.java.UDF5
call(T1, T2, T3, T4, T5, T6) - Method in interface org.apache.spark.sql.api.java.UDF6
call(T1, T2, T3, T4, T5, T6, T7) - Method in interface org.apache.spark.sql.api.java.UDF7
call(T1, T2, T3, T4, T5, T6, T7, T8) - Method in interface org.apache.spark.sql.api.java.UDF8
call(T1, T2, T3, T4, T5, T6, T7, T8, T9) - Method in interface org.apache.spark.sql.api.java.UDF9
cancel() - Method in class org.apache.spark.ComplexFutureAction
cancel() - Method in interface org.apache.spark.FutureAction: Cancels the execution of this action.
cancel() - Method in class org.apache.spark.SimpleFutureAction
cancelAllJobs() - Method in class org.apache.spark.api.java.JavaSparkContext: Cancel all jobs that have been scheduled or are running.
cancelAllJobs() - Method in class org.apache.spark.SparkContext: Cancel all jobs that have been scheduled or are running.
cancelJobGroup(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Cancel active jobs for the specified group.
cancelJobGroup(String) - Method in class org.apache.spark.SparkContext: Cancel active jobs for the specified group.
cancelled() - Method in class org.apache.spark.ComplexFutureAction: Returns whether the promise has been cancelled.
canEqual(Object) - Method in class org.apache.spark.sql.api.java.Row
canEqual(Object) - Method in class org.apache.spark.util.MutablePair
cartesian(JavaRDDLike<U, ?>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of elements (a, b) where a is in this and b is in other.
cartesian(RDD, ClassTag) - Method in class org.apache.spark.rdd.RDD: Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of elements (a, b) where a is in this and b is in other.
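As a rough illustration of the "all pairs (a, b)" semantics described in the two cartesian entries above, here is a minimal plain-Java sketch that models the operation on in-memory lists rather than RDDs. The class `CartesianSketch` and its method are hypothetical names for illustration only and are not part of the Spark API.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Spark: models cartesian() on in-memory lists.
final class CartesianSketch {
    // Pairs every element a of `left` with every element b of `right`.
    static <A, B> List<Map.Entry<A, B>> cartesian(List<A> left, List<B> right) {
        List<Map.Entry<A, B>> pairs = new ArrayList<>();
        for (A a : left) {
            for (B b : right) {
                pairs.add(new SimpleEntry<>(a, b));
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // A 2-element input and a 3-element input produce 2 * 3 pairs.
        System.out.println(cartesian(Arrays.asList(1, 2), Arrays.asList("x", "y", "z")).size());
    }
}
```

The result size is the product of the input sizes, which is why a Cartesian product of two large datasets can be expensive.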
CartesianProduct - Class in org.apache.spark.sql.execution: :: DeveloperApi ::
CartesianProduct(SparkPlan, SparkPlan) - Constructor for class org.apache.spark.sql.execution.CartesianProduct
Categorical() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
categoricalFeaturesInfo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
categories() - Method in class org.apache.spark.mllib.tree.model.Split
category() - Method in class org.apache.spark.mllib.recommendation.ALS.BlockStats
checkpoint() - Method in interface org.apache.spark.api.java.JavaRDDLike: Mark this RDD for checkpointing.
checkpoint() - Method in class org.apache.spark.rdd.HadoopRDD
checkpoint() - Method in class org.apache.spark.rdd.RDD: Mark this RDD for checkpointing.
checkpoint(Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Enable periodic checkpointing of RDDs of this DStream.
checkpoint(String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Sets the context to periodically checkpoint the DStream operations for master fault-tolerance.
checkpoint(Duration) - Method in class org.apache.spark.streaming.dstream.DStream: Enable periodic checkpointing of RDDs of this DStream
checkpoint(String) - Method in class org.apache.spark.streaming.StreamingContext: Set the context to periodically checkpoint the DStream operations for driver fault-tolerance.
checkpointData() - Method in class org.apache.spark.rdd.RDD
checkpointData() - Method in class org.apache.spark.streaming.dstream.DStream
checkpointDir() - Method in class org.apache.spark.SparkContext
checkpointDir() - Method in class org.apache.spark.streaming.StreamingContext
checkpointDuration() - Method in class org.apache.spark.streaming.dstream.DStream
checkpointDuration() - Method in class org.apache.spark.streaming.StreamingContext
child() - Method in class org.apache.spark.sql.execution.Aggregate
child() - Method in class org.apache.spark.sql.execution.BatchPythonEvaluation
child() - Method in class org.apache.spark.sql.execution.DescribeCommand
child() - Method in class org.apache.spark.sql.execution.Distinct
child() - Method in class org.apache.spark.sql.execution.EvaluatePython
child() - Method in class org.apache.spark.sql.execution.Exchange
child() - Method in class org.apache.spark.sql.execution.Filter
child() - Method in class org.apache.spark.sql.execution.Generate
child() - Method in class org.apache.spark.sql.execution.GeneratedAggregate
child() - Method in class org.apache.spark.sql.execution.Limit
child() - Method in class org.apache.spark.sql.execution.OutputFaker
child() - Method in class org.apache.spark.sql.execution.Project
child() - Method in class org.apache.spark.sql.execution.Sample
child() - Method in class org.apache.spark.sql.execution.Sort
child() - Method in class org.apache.spark.sql.execution.TakeOrdered
child() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
child() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
child() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable
children() - Method in class org.apache.spark.sql.execution.BatchPythonEvaluation
children() - Method in class org.apache.spark.sql.execution.OutputFaker
children() - Method in class org.apache.spark.sql.execution.SparkLogicalPlan
children() - Method in class org.apache.spark.sql.execution.Union
chiSqTest(Vector, Vector) - Static method in class org.apache.spark.mllib.stat.Statistics: :: Experimental :: Conduct Pearson's chi-squared goodness of fit test of the observed data against the expected distribution.
chiSqTest(Vector) - Static method in class org.apache.spark.mllib.stat.Statistics: :: Experimental :: Conduct Pearson's chi-squared goodness of fit test of the observed data against the uniform distribution, with each category having an expected frequency of 1 / observed.size.
chiSqTest(Matrix) - Static method in class org.apache.spark.mllib.stat.Statistics: :: Experimental :: Conduct Pearson's independence test on the input contingency matrix, which cannot contain negative entries or columns or rows that sum up to 0.
chiSqTest(RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.stat.Statistics: :: Experimental :: Conduct Pearson's independence test for every feature against the label across the input RDD.
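The goodness-of-fit entries above rest on Pearson's statistic, the sum over categories of (observed - expected)^2 / expected. The sketch below computes only that statistic in plain Java, assuming the observed and expected counts have the same total. It does not compute a p-value and is not how MLlib implements the test; `ChiSquaredSketch` and its methods are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical helper, not part of MLlib: computes Pearson's chi-squared statistic only.
final class ChiSquaredSketch {
    // Sum of (observed - expected)^2 / expected over all categories.
    static double statistic(double[] observed, double[] expected) {
        if (observed.length != expected.length) {
            throw new IllegalArgumentException("observed and expected must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < observed.length; i++) {
            double diff = observed[i] - expected[i];
            sum += diff * diff / expected[i];
        }
        return sum;
    }

    // Uniform case: every category is expected to hold total / n of the counts.
    static double statisticAgainstUniform(double[] observed) {
        double total = 0.0;
        for (double o : observed) {
            total += o;
        }
        double[] expected = new double[observed.length];
        Arrays.fill(expected, total / observed.length);
        return statistic(observed, expected);
    }
}
```

For example, observed counts of 10, 20 and 30 against a uniform expectation of 20 each give (100 + 0 + 100) / 20 = 10.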
ChiSqTestResult - Class in org.apache.spark.mllib.stat.test: :: Experimental :: Object containing the test results for the chi-squared hypothesis test.
Classification() - Static method in class org.apache.spark.mllib.tree.configuration.Algo
ClassificationModel - Interface in org.apache.spark.mllib.classification: :: Experimental :: Represents a classification model that predicts to which of a set of categories an example belongs.
className() - Method in class org.apache.spark.ExceptionFailure
classpathEntries() - Method in class org.apache.spark.ui.env.EnvironmentListener
classTag() - Method in class org.apache.spark.api.java.JavaDoubleRDD
classTag() - Method in class org.apache.spark.api.java.JavaPairRDD
classTag() - Method in class org.apache.spark.api.java.JavaRDD
classTag() - Method in interface org.apache.spark.api.java.JavaRDDLike
classTag() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
classTag() - Method in class org.apache.spark.streaming.api.java.JavaDStream
classTag() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
classTag() - Method in class org.apache.spark.streaming.api.java.JavaInputDStream
classTag() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
classTag() - Method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
cleaner() - Method in class org.apache.spark.SparkContext
clearCallSite() - Method in class org.apache.spark.api.java.JavaSparkContext: Pass-through to SparkContext.clearCallSite.
clearCallSite() - Method in class org.apache.spark.SparkContext: Clear the thread-local property for overriding the call sites of actions and RDDs.
clearDependencies() - Method in class org.apache.spark.rdd.CoGroupedRDD
clearDependencies() - Method in class org.apache.spark.rdd.ShuffledRDD
clearDependencies() - Method in class org.apache.spark.rdd.UnionRDD
clearFiles() - Method in class org.apache.spark.api.java.JavaSparkContext: Clear the job's list of files added by addFile so that they do not get downloaded to any new nodes.
clearFiles() - Method in class org.apache.spark.SparkContext: Clear the job's list of files added by addFile so that they do not get downloaded to any new nodes.
clearJars() - Method in class org.apache.spark.api.java.JavaSparkContext: Clear the job's list of JARs added by addJar so that they do not get downloaded to any new nodes.
clearJars() - Method in class org.apache.spark.SparkContext: Clear the job's list of JARs added by addJar so that they do not get downloaded to any new nodes.
clearJobGroup() - Method in class org.apache.spark.api.java.JavaSparkContext: Clear the current thread's job group ID and its description.
clearJobGroup() - Method in class org.apache.spark.SparkContext: Clear the current thread's job group ID and its description.
clearThreshold() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel: :: Experimental :: Clears the threshold so that predict will output raw prediction scores.
clearThreshold() - Method in class org.apache.spark.mllib.classification.SVMModel: :: Experimental :: Clears the threshold so that predict will output raw prediction scores.
clone() - Method in class org.apache.spark.SparkConf: Copy this object
clone() - Method in class org.apache.spark.storage.StorageLevel
clone() - Method in class org.apache.spark.util.random.BernoulliSampler
clone() - Method in class org.apache.spark.util.random.PoissonSampler
clone() - Method in interface org.apache.spark.util.random.RandomSampler
cloneComplement() - Method in class org.apache.spark.util.random.BernoulliSampler: Return a sampler that is the complement of the range specified by the current sampler.
close() - Method in class org.apache.spark.serializer.DeserializationStream
close() - Method in class org.apache.spark.serializer.SerializationStream
closureSerializer() - Method in class org.apache.spark.SparkEnv
clusterCenters() - Method in class org.apache.spark.mllib.clustering.KMeansModel
coalesce(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a new RDD that is reduced into numPartitions partitions.
coalesce(int) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a new RDD that is reduced into numPartitions partitions.
coalesce(int) - Method in class org.apache.spark.api.java.JavaRDD: Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaRDD: Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean, Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD: Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean, Ordering<Row>) - Method in class org.apache.spark.sql.SchemaRDD
codegenEnabled() - Method in class org.apache.spark.sql.execution.SparkPlan
cogroup(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
cogroup(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD: For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>) - Method in class org.apache.spark.api.java.JavaPairRDD: For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>) - Method in class org.apache.spark.api.java.JavaPairRDD: For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
cogroup(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
cogroup(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
cogroup(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
cogroup(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
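The two-input cogroup entries above all share the same per-key semantics: every key present in either input maps to a pair of value lists, one list per input, with an empty list where a key is missing. A minimal plain-Java sketch of that behaviour on in-memory lists follows. `CogroupSketch` is a hypothetical name, and the use of `SimpleEntry` as a pair type is an assumption made for illustration, not Spark's representation.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Spark: models two-input cogroup on in-memory lists.
final class CogroupSketch {
    // A fresh (left values, right values) pair with both lists empty.
    private static <V, W> SimpleEntry<List<V>, List<W>> emptyGroup() {
        return new SimpleEntry<List<V>, List<W>>(new ArrayList<V>(), new ArrayList<W>());
    }

    static <K, V, W> Map<K, SimpleEntry<List<V>, List<W>>> cogroup(
            List<SimpleEntry<K, V>> left, List<SimpleEntry<K, W>> right) {
        Map<K, SimpleEntry<List<V>, List<W>>> groups = new LinkedHashMap<>();
        // Values from the left input go into the first list of the key's group.
        for (SimpleEntry<K, V> e : left) {
            groups.computeIfAbsent(e.getKey(), k -> CogroupSketch.<V, W>emptyGroup())
                  .getKey().add(e.getValue());
        }
        // Values from the right input go into the second list of the key's group.
        for (SimpleEntry<K, W> e : right) {
            groups.computeIfAbsent(e.getKey(), k -> CogroupSketch.<V, W>emptyGroup())
                  .getValue().add(e.getValue());
        }
        return groups;
    }
}
```

Unlike a join, a key that appears in only one input is still kept, paired with an empty list for the other side.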
CoGroupedRDD<K> - Class in org.apache.spark.rdd: :: DeveloperApi :: An RDD that cogroups its parents.
CoGroupedRDD(Seq<RDD<? extends Product2<K, ?>>>, Partitioner) - Constructor for class org.apache.spark.rdd.CoGroupedRDD
collect() - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an array that contains all of the elements in this RDD.
collect() - Method in class org.apache.spark.rdd.RDD: Return an array that contains all of the elements in this RDD.
collect(PartialFunction<T, U>, ClassTag) - Method in class org.apache.spark.rdd.RDD: Return an RDD that contains all matching values by applying f.
collect() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
collect() - Method in class org.apache.spark.sql.SchemaRDD
collectAsMap() - Method in class org.apache.spark.api.java.JavaPairRDD: Return the key-value pairs in this RDD to the master as a Map.
collectAsMap() - Method in class org.apache.spark.rdd.PairRDDFunctions: Return the key-value pairs in this RDD to the master as a Map.
collectAsync() - Method in class org.apache.spark.rdd.AsyncRDDActions: Returns a future for retrieving all elements of this RDD.
collectPartitions(int[]) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an array that contains all of the elements in a specific partition of this RDD.
colStats(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.Statistics: :: Experimental :: Computes column-wise summary statistics for the input RDD[Vector].
columnPruningPred() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Generic function to combine the elements for each key using a custom set of aggregation functions.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Simplified version of combineByKey that hash-partitions the output RDD.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Method in class org.apache.spark.api.java.JavaPairRDD: Simplified version of combineByKey that hash-partitions the resulting RDD using the existing partitioner/parallelism level.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, Serializer) - Method in class org.apache.spark.rdd.PairRDDFunctions: Generic function to combine the elements for each key using a custom set of aggregation functions.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: Simplified version of combineByKey that hash-partitions the output RDD.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Simplified version of combineByKey that hash-partitions the resulting RDD using the existing partitioner/parallelism level.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Combine elements of each key in DStream's RDDs using custom function.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Combine elements of each key in DStream's RDDs using custom function.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, ClassTag<C>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Combine elements of each key in DStream's RDDs using custom functions.
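The combineByKey entries above take three functions; in the Spark API these are commonly named createCombiner, mergeValue and mergeCombiners. The plain-Java sketch below shows one way those three roles can fit together: values are folded into a combiner within each partition, then per-partition combiners are merged per key. It models partitions as nested lists, assumes combiners are never null, and is not Spark's implementation; `CombineByKeySketch` is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Function;

// Hypothetical helper, not part of Spark: models combineByKey over in-memory "partitions".
final class CombineByKeySketch {
    static <K, V, C> Map<K, C> combineByKey(
            List<List<Map.Entry<K, V>>> partitions,
            Function<V, C> createCombiner,
            BiFunction<C, V, C> mergeValue,
            BiFunction<C, C, C> mergeCombiners) {
        Map<K, C> result = new LinkedHashMap<>();
        for (List<Map.Entry<K, V>> partition : partitions) {
            // Step 1: within one partition, fold each value into that key's combiner.
            Map<K, C> local = new LinkedHashMap<>();
            for (Map.Entry<K, V> e : partition) {
                K key = e.getKey();
                if (local.containsKey(key)) {
                    local.put(key, mergeValue.apply(local.get(key), e.getValue()));
                } else {
                    local.put(key, createCombiner.apply(e.getValue()));
                }
            }
            // Step 2: merge this partition's combiners into the overall result.
            for (Map.Entry<K, C> e : local.entrySet()) {
                result.merge(e.getKey(), e.getValue(), mergeCombiners);
            }
        }
        return result;
    }
}
```

A per-key sum, for instance, would pass `v -> v` as the first function and addition for the other two.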
combineCombinersByKey(Iterator<Product2<K, C>>) - Method in class org.apache.spark.Aggregator
combineCombinersByKey(Iterator<Product2<K, C>>, TaskContext) - Method in class org.apache.spark.Aggregator
combineValuesByKey(Iterator<Product2<K, V>>) - Method in class org.apache.spark.Aggregator
combineValuesByKey(Iterator<Product2<K, V>>, TaskContext) - Method in class org.apache.spark.Aggregator
Command - Interface in org.apache.spark.sql.execution
commands() - Method in class org.apache.spark.sql.hive.test.TestHiveContext.TestTable
compare(RDDInfo) - Method in class org.apache.spark.storage.RDDInfo
completedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
completionTime() - Method in class org.apache.spark.scheduler.StageInfo: Time when all tasks in the stage completed or when the stage was cancelled.
ComplexFutureAction<T> - Class in org.apache.spark: :: Experimental :: A FutureAction for actions that could trigger multiple Spark jobs.
ComplexFutureAction() - Constructor for class org.apache.spark.ComplexFutureAction
compressedInputStream(InputStream) - Method in interface org.apache.spark.io.CompressionCodec
compressedInputStream(InputStream) - Method in class org.apache.spark.io.LZ4CompressionCodec
compressedInputStream(InputStream) - Method in class org.apache.spark.io.LZFCompressionCodec
compressedInputStream(InputStream) - Method in class org.apache.spark.io.SnappyCompressionCodec
compressedOutputStream(OutputStream) - Method in interface org.apache.spark.io.CompressionCodec
compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.LZ4CompressionCodec
compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.LZFCompressionCodec
compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.SnappyCompressionCodec
CompressionCodec - Interface in org.apache.spark.io: :: DeveloperApi :: CompressionCodec allows different compression implementations to be chosen for use in block storage.
compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.Gradient: Compute the gradient and loss given the features of a single data point.
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.Gradient: Compute the gradient and loss given the features of a single data point, add the gradient to a provided vector to avoid creating new objects, and return loss.
compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.HingeGradient
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.HingeGradient
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.L1Updater
compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.LeastSquaresGradient
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.LeastSquaresGradient
compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.LogisticGradient
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.LogisticGradient
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.SimpleUpdater
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.SquaredL2Updater
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.Updater: Compute an updated value for weights given the gradient, stepSize, iteration number and regularization parameter.
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.CoGroupedRDD
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.HadoopRDD
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.JdbcRDD
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.NewHadoopRDD
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.PartitionPruningRDD
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.RDD: :: DeveloperApi :: Implemented by subclasses to compute a given partition.
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ShuffledRDD
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.UnionRDD
compute(Partition, TaskContext) - Method in class org.apache.spark.sql.SchemaRDD
compute(Time) - Method in class org.apache.spark.streaming.api.java.JavaDStream: Generate an RDD for the given duration
compute(Time) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Method that generates an RDD for the given Duration
compute(Time) - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
compute(Time) - Method in class org.apache.spark.streaming.dstream.DStream: Method that generates an RDD for the given time
compute(Time) - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream: Asks ReceiverInputTracker for received data blocks and generates RDDs with them.
computeColumnSummaryStatistics() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Computes column-wise summary statistics.
computeCost(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel: Return the K-means cost (sum of squared distances of points to their nearest center) for this model on the given data.
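The K-means cost described in the entry above, the sum of squared distances from each point to its nearest center, can be sketched in plain Java on arrays as follows. This only illustrates the formula; it is not MLlib's distributed implementation, and `KMeansCostSketch` and its method names are hypothetical.

```java
// Hypothetical helper, not part of MLlib: evaluates the K-means cost on in-memory arrays.
final class KMeansCostSketch {
    // Squared Euclidean distance between two points of equal dimension.
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    // For each point, take the squared distance to its nearest center, then add them up.
    static double cost(double[][] points, double[][] centers) {
        double total = 0.0;
        for (double[] point : points) {
            double nearest = Double.POSITIVE_INFINITY;
            for (double[] center : centers) {
                nearest = Math.min(nearest, squaredDistance(point, center));
            }
            total += nearest;
        }
        return total;
    }
}
```

With centers at (0, 0) and (10, 0), the points (0, 0), (1, 0), (8, 0) and (10, 0) contribute 0, 1, 4 and 0, for a cost of 5.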
computeCovariance() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Computes the covariance matrix, treating each row as an observation.
computeGramianMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix: Computes the Gramian matrix A^T A.
computeGramianMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Computes the Gramian matrix A^T A.
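For reference, entry (i, j) of the Gramian matrix A^T A is the sum over rows r of A[r][i] * A[r][j], so the result can be accumulated one row at a time. The plain-Java sketch below shows that accumulation on a dense local array, assuming all rows have the same length. It is an illustration of the definition only, not MLlib's implementation; `GramianSketch` is a hypothetical name.

```java
// Hypothetical helper, not part of MLlib: computes A^T A for a dense local matrix.
final class GramianSketch {
    static double[][] gramian(double[][] a) {
        int cols = a.length == 0 ? 0 : a[0].length;
        double[][] g = new double[cols][cols];
        // Each row contributes its outer product with itself to the Gramian.
        for (double[] row : a) {
            for (int i = 0; i < cols; i++) {
                for (int j = 0; j < cols; j++) {
                    g[i][j] += row[i] * row[j];
                }
            }
        }
        return g;
    }
}
```

The result is always a square, symmetric matrix whose side equals the number of columns of A.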
computePreferredLocations(Seq<InputFormatInfo>) - Static method in class org.apache.spark.scheduler.InputFormatInfo: Computes the preferred locations based on input(s) and returns a location-to-block map.
computePrincipalComponents(int) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Computes the top k principal components.
computeSVD(int, boolean, double) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix: Computes the singular value decomposition of this matrix.
computeSVD(int, boolean, double) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Computes the singular value decomposition of this matrix.
condition() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
condition() - Method in class org.apache.spark.sql.execution.Filter
condition() - Method in class org.apache.spark.sql.execution.HashOuterJoin
condition() - Method in class org.apache.spark.sql.execution.LeftSemiJoinBNL
conditionEvaluator() - Method in class org.apache.spark.sql.execution.Filter
conf() - Method in class org.apache.spark.SparkContext
conf() - Method in class org.apache.spark.SparkEnv
conf() - Method in class org.apache.spark.streaming.StreamingContext
confidence() - Method in class org.apache.spark.partial.BoundedDouble
configuration() - Method in class org.apache.spark.scheduler.InputFormatInfo
CONFIGURATION_INSTANTIATION_LOCK() - Static method in class org.apache.spark.rdd.HadoopRDD: Configuration's constructor is not threadsafe (see SPARK-1097 and HADOOP-10456).
confusionMatrix() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns the confusion matrix: predicted classes are in columns, ordered by ascending class label, as in "labels".
connectionManager() - Method in class org.apache.spark.SparkEnv
ConstantInputDStream<T> - Class in org.apache.spark.streaming.dstream: An input stream that always returns the same RDD on each timestep.
ConstantInputDStream(StreamingContext, RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.ConstantInputDStream
contains(String) - Method in class org.apache.spark.SparkConf: Does the configuration contain a given parameter?
containsBlock(BlockId) - Method in class org.apache.spark.storage.StorageStatus: Return whether the given block is stored in this block manager in O(1) time.
containsCachedMetadata(String) - Static method in class org.apache.spark.rdd.HadoopRDD
context() - Method in interface org.apache.spark.api.java.JavaRDDLike: The SparkContext that this RDD was created on.
context() - Method in class org.apache.spark.InterruptibleIterator
context() - Method in class org.apache.spark.rdd.RDD: The SparkContext that this RDD was created on.
context() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
context() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return the StreamingContext associated with this DStream
context() - Method in class org.apache.spark.streaming.dstream.DStream: Return the StreamingContext associated with this DStream
Continuous() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
CoordinateMatrix - Class in org.apache.spark.mllib.linalg.distributed: :: Experimental :: Represents a matrix in coordinate format.
CoordinateMatrix(RDD<MatrixEntry>, long, long) - Constructor for class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
CoordinateMatrix(RDD<MatrixEntry>) - Constructor for class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix: Alternative constructor leaving matrix dimensions to be determined automatically.
copy() - Method in class org.apache.spark.mllib.linalg.DenseVector
copy() - Method in class org.apache.spark.mllib.linalg.SparseVector
copy() - Method in interface org.apache.spark.mllib.linalg.Vector: Makes a deep copy of this vector.
copy() - Method in class org.apache.spark.mllib.random.PoissonGenerator
copy() - Method in interface org.apache.spark.mllib.random.RandomDataGenerator: Returns a copy of the RandomDataGenerator with a new instance of the rng object used in the class when applicable for non-locking concurrent usage.
copy() - Method in class org.apache.spark.mllib.random.StandardNormalGenerator
copy() - Method in class org.apache.spark.mllib.random.UniformGenerator
copy() - Method in class org.apache.spark.util.StatCounter: Clone this StatCounter
corr(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.Statistics: :: Experimental :: Compute the Pearson correlation matrix for the input RDD of Vectors.
corr(RDD<Vector>, String) - Static method in class org.apache.spark.mllib.stat.Statistics: :: Experimental :: Compute the correlation matrix for the input RDD of Vectors using the specified method.
corr(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.Statistics: :: Experimental :: Compute the Pearson correlation for the input RDDs.
corr(RDD<Object>, RDD<Object>, String) - Static method in class org.apache.spark.mllib.stat.Statistics: :: Experimental :: Compute the correlation for the input RDDs using the specified method.
count() - Method in interface org.apache.spark.api.java.JavaRDDLike: Return the number of elements in the RDD.
count() - Method in class org.apache.spark.mllib.recommendation.ALS.BlockStats
count() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
count() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary: Sample size.
count() - Method in class org.apache.spark.rdd.RDD: Return the number of elements in the RDD.
count() - Method in class org.apache.spark.sql.SchemaRDD: :: Experimental :: Return the number of elements in the RDD.
count() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD has a single element generated by counting each RDD of this DStream.
count() - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD has a single element generated by counting each RDD of this DStream.
count() - Method in class org.apache.spark.util.StatCounter
countApprox(long, double) - Method in interface org.apache.spark.api.java.JavaRDDLike: :: Experimental :: Approximate version of count() that returns a potentially incomplete result within a timeout, even if not all tasks have finished.
countApprox(long) - Method in interface org.apache.spark.api.java.JavaRDDLike: :: Experimental :: Approximate version of count() that returns a potentially incomplete result within a timeout, even if not all tasks have finished.
countApprox(long, double) - Method in class org.apache.spark.rdd.RDD: :: Experimental :: Approximate version of count() that returns a potentially incomplete result within a timeout, even if not all tasks have finished.
countApproxDistinct(double) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return approximate number of distinct elements in the RDD.
countApproxDistinct(int, int) - Method in class org.apache.spark.rdd.RDD: :: Experimental :: Return approximate number of distinct elements in the RDD.
countApproxDistinct(double) - Method in class org.apache.spark.rdd.RDD: Return approximate number of distinct elements in the RDD.
countApproxDistinctByKey(double, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double) - Method in class org.apache.spark.api.java.JavaPairRDD: Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(int, int, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: :: Experimental ::
countApproxDistinctByKey(double, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return approximate number of distinct values for each key in this RDD.
countAsync() - Method in class org.apache.spark.rdd.AsyncRDDActions: Returns a future for counting the number of elements in the RDD.
countByKey() - Method in class org.apache.spark.api.java.JavaPairRDD: Count the number of elements for each key, and return the result to the master as a Map.
countByKey() - Method in class org.apache.spark.rdd.PairRDDFunctions: Count the number of elements for each key, and return the result to the master as a Map.
countByKeyApprox(long) - Method in class org.apache.spark.api.java.JavaPairRDD: :: Experimental :: Approximate version of countByKey that can return a partial result if it does not finish within a timeout.
countByKeyApprox(long, double) - Method in class org.apache.spark.api.java.JavaPairRDD: :: Experimental :: Approximate version of countByKey that can return a partial result if it does not finish within a timeout.
countByKeyApprox(long, double) - Method in class org.apache.spark.rdd.PairRDDFunctions: :: Experimental :: Approximate version of countByKey that can return a partial result if it does not finish within a timeout.
countByValue() - Method in interface org.apache.spark.api.java.JavaRDDLike: Return the count of each unique value in this RDD as a map of (value, count) pairs.
countByValue(Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Return the count of each unique value in this RDD as a map of (value, count) pairs.
countByValue() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD contains the counts of each distinct value in each RDD of this DStream.
countByValue(int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD contains the counts of each distinct value in each RDD of this DStream.
countByValue(int, Ordering<T>) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD contains the counts of each distinct value in each RDD of this DStream.
countByValueAndWindow(Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD contains the count of distinct elements in RDDs in a sliding window over this DStream.
countByValueAndWindow(Duration, Duration, int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD contains the count of distinct elements in RDDs in a sliding window over this DStream.
countByValueAndWindow(Duration, Duration, int, Ordering<T>) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD contains the count of distinct elements in RDDs in a sliding window over this DStream.
countByValueApprox(long, double) - Method in interface org.apache.spark.api.java.JavaRDDLike: (Experimental) Approximate version of countByValue().
countByValueApprox(long) - Method in interface org.apache.spark.api.java.JavaRDDLike: (Experimental) Approximate version of countByValue().
countByValueApprox(long, double, Ordering<T>) - Method in class org.apache.spark.rdd.RDD: :: Experimental :: Approximate version of countByValue().
countByWindow(Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD has a single element generated by counting the number of elements in a window over this DStream.
countByWindow(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD has a single element generated by counting the number of elements in a sliding window over this DStream.
create(boolean, boolean, boolean, int) - Static method in class org.apache.spark.api.java.StorageLevels: Deprecated.
create(boolean, boolean, boolean, boolean, int) - Static method in class org.apache.spark.api.java.StorageLevels: Create a new StorageLevel object.
create(RDD<T>, Function1<Object, Object>) - Static method in class org.apache.spark.rdd.PartitionPruningRDD: Create a PartitionPruningRDD.
create(Object...) - Static method in class org.apache.spark.sql.api.java.Row: Creates a Row with the given values.
create(Seq<Object>) - Static method in class org.apache.spark.sql.api.java.Row: Creates a Row with the given values.
create() - Method in interface org.apache.spark.streaming.api.java.JavaStreamingContextFactory
createArrayType(DataType) - Static method in class org.apache.spark.sql.api.java.DataType: Creates an ArrayType by specifying the data type of elements (elementType).
createArrayType(DataType, boolean) - Static method in class org.apache.spark.sql.api.java.DataType: Creates an ArrayType by specifying the data type of elements (elementType) and whether the array contains null values (containsNull).
createCodec(SparkConf) - Method in interface org.apache.spark.io.CompressionCodec
createCodec(SparkConf, String) - Method in interface org.apache.spark.io.CompressionCodec
createCombiner() - Method in class org.apache.spark.Aggregator
createMapType(DataType, DataType) - Static method in class org.apache.spark.sql.api.java.DataType: Creates a MapType by specifying the data type of keys (keyType) and values (valueType).
createMapType(DataType, DataType, boolean) - Static method in class org.apache.spark.sql.api.java.DataType: Creates a MapType by specifying the data type of keys (keyType), the data type of values (valueType), and whether values contain any null value (valueContainsNull).
createParquetFile(Class<?>, String, boolean, Configuration) - Method in class org.apache.spark.sql.api.java.JavaSQLContext: :: Experimental :: Creates an empty parquet file with the schema of class beanClass, which can be registered as a table.
createParquetFile(String, boolean, Configuration, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext: :: Experimental :: Creates an empty parquet file with the schema of class A, which can be registered as a table.
createPollingStream(StreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils: Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(StreamingContext, Seq<InetSocketAddress>, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils: Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(StreamingContext, Seq<InetSocketAddress>, StorageLevel, int, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils: Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(JavaStreamingContext, String, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils: Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(JavaStreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils: Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(JavaStreamingContext, InetSocketAddress[], StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils: Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(JavaStreamingContext, InetSocketAddress[], StorageLevel, int, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils: Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createSchemaRDD(RDD<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext: Creates a SchemaRDD from an RDD of case classes.
createStream(StreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils: Create an input stream from a Flume source.
createStream(StreamingContext, String, int, StorageLevel, boolean) - Static method in class org.apache.spark.streaming.flume.FlumeUtils: Create an input stream from a Flume source.
createStream(JavaStreamingContext, String, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils: Creates an input stream from a Flume source.
createStream(JavaStreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils: Creates an input stream from a Flume source.
createStream(JavaStreamingContext, String, int, StorageLevel, boolean) - Static method in class org.apache.spark.streaming.flume.FlumeUtils: Creates an input stream from a Flume source.
createStream(StreamingContext, String, String, Map<String, Object>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils: Create an input stream that pulls messages from a Kafka Broker.
createStream(StreamingContext, Map<String, String>, Map<String, Object>, StorageLevel, ClassTag<K>, ClassTag<V>, ClassTag, ClassTag<T>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils: Create an input stream that pulls messages from a Kafka Broker.
createStream(JavaStreamingContext, String, String, Map<String, Integer>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils: Create an input stream that pulls messages from a Kafka Broker.
createStream(JavaStreamingContext, String, String, Map<String, Integer>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils: Create an input stream that pulls messages from a Kafka Broker.
createStream(JavaStreamingContext, Class<K>, Class<V>, Class, Class<T>, Map<String, String>, Map<String, Integer>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils: Create an input stream that pulls messages from a Kafka Broker.
createStream(StreamingContext, String, String, Duration, InitialPositionInStream, StorageLevel) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils: Create an InputDStream that pulls messages from a Kinesis stream.
createStream(JavaStreamingContext, String, String, Duration, InitialPositionInStream, StorageLevel) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils: Create a Java-friendly InputDStream that pulls messages from a Kinesis stream.
createStream(StreamingContext, String, String, StorageLevel) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils: Create an input stream that receives messages pushed by an MQTT publisher.
createStream(JavaStreamingContext, String, String) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils: Create an input stream that receives messages pushed by an MQTT publisher.
createStream(JavaStreamingContext, String, String, StorageLevel) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils: Create an input stream that receives messages pushed by an MQTT publisher.
createStream(StreamingContext, Option<Authorization>, Seq<String>, StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils: Create an input stream that returns tweets received from Twitter.
createStream(JavaStreamingContext) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils: Create an input stream that returns tweets received from Twitter using Twitter4J's default OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey, twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and twitter4j.oauth.accessTokenSecret.
createStream(JavaStreamingContext, String[]) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils: Create an input stream that returns tweets received from Twitter using Twitter4J's default OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey, twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and twitter4j.oauth.accessTokenSecret.
createStream(JavaStreamingContext, String[], StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils: Create an input stream that returns tweets received from Twitter using Twitter4J's default OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey, twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and twitter4j.oauth.accessTokenSecret.
createStream(JavaStreamingContext, Authorization) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils: Create an input stream that returns tweets received from Twitter.
createStream(JavaStreamingContext, Authorization, String[]) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils: Create an input stream that returns tweets received from Twitter.
createStream(JavaStreamingContext, Authorization, String[], StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils: Create an input stream that returns tweets received from Twitter.
createStream(StreamingContext, String, Subscribe, Function1<Seq<ByteString>, Iterator<T>>, StorageLevel, SupervisorStrategy, ClassTag<T>) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils: Create an input stream that receives messages pushed by a zeromq publisher.
createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>, StorageLevel, SupervisorStrategy) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils: Create an input stream that receives messages pushed by a zeromq publisher.
createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>, StorageLevel) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils: Create an input stream that receives messages pushed by a zeromq publisher.
createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils: Create an input stream that receives messages pushed by a zeromq publisher.
createStructField(String, DataType, boolean) - Static method in class org.apache.spark.sql.api.java.DataType: Creates a StructField by specifying the name (name), data type (dataType) and whether values of this field can be null values (nullable).
createStructType(List<StructField>) - Static method in class org.apache.spark.sql.api.java.DataType: Creates a StructType with the given list of StructFields (fields).
createStructType(StructField[]) - Static method in class org.apache.spark.sql.api.java.DataType: Creates a StructType with the given StructField array (fields).
createTable(String, boolean, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.hive.HiveContext: Creates a table using the schema of the given class.
creationSite() - Method in class org.apache.spark.rdd.RDD: User code that created this RDD (e.g.
creationSite() - Method in class org.apache.spark.streaming.dstream.DStream
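
The countByValue() entries above describe a result shaped as a map of (value, count) pairs. A minimal local sketch of that result shape follows, using only java.util and no Spark classes. The class CountByValueSketch and its helper method are hypothetical names chosen for illustration; they are not part of the Spark API, and the real methods operate on distributed RDDs rather than on a local list.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of Spark.
class CountByValueSketch {

    // Counts how often each distinct value occurs in a local list,
    // producing the same (value, count) map shape that countByValue() documents.
    static <T> Map<T, Long> countByValue(List<T> values) {
        Map<T, Long> counts = new HashMap<>();
        for (T value : values) {
            // Start at 1 for a new value, otherwise add 1 to the existing count.
            counts.merge(value, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "a", "c", "a");
        Map<String, Long> counts = countByValue(words);
        System.out.println("a occurs " + counts.get("a") + " times");
    }
}
```

In Spark itself the equivalent call would be made on an RDD, and the resulting map is returned to the driver, so this shape is only practical when the number of distinct values is small.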

D

dagScheduler() - Method in class org.apache.spark.SparkContext
DataType - Class in org.apache.spark.sql.api.java: The base type of all Spark SQL data types.
DataType() - Constructor for class org.apache.spark.sql.api.java.DataType
DataValidators - Class in org.apache.spark.mllib.util: :: DeveloperApi :: A collection of methods used to validate data before applying ML algorithms.
DataValidators() - Constructor for class org.apache.spark.mllib.util.DataValidators
DecimalType - Static variable in class org.apache.spark.sql.api.java.DataType: Gets the DecimalType object.
DecimalType - Class in org.apache.spark.sql.api.java: The data type representing java.math.BigDecimal values.
DecisionTree - Class in org.apache.spark.mllib.tree: :: Experimental :: A class which implements a decision tree learning algorithm for classification and regression.
DecisionTree(Strategy) - Constructor for class org.apache.spark.mllib.tree.DecisionTree
DecisionTreeModel - Class in org.apache.spark.mllib.tree.model: :: Experimental :: Decision tree model for classification or regression.
DecisionTreeModel(Node, Enumeration.Value) - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel
DEFAULT_CLEANER_TTL() - Static method in class org.apache.spark.streaming.StreamingContext
DEFAULT_COMPRESSION_CODEC() - Method in interface org.apache.spark.io.CompressionCodec
DEFAULT_POOL_NAME() - Static method in class org.apache.spark.ui.jobs.JobProgressListener
DEFAULT_RETAINED_STAGES() - Static method in class org.apache.spark.ui.jobs.JobProgressListener
defaultMinPartitions() - Method in class org.apache.spark.api.java.JavaSparkContext: Default min number of partitions for Hadoop RDDs when not given by user
defaultMinPartitions() - Method in class org.apache.spark.SparkContext: Default min number of partitions for Hadoop RDDs when not given by user
defaultMinSplits() - Method in class org.apache.spark.api.java.JavaSparkContext: Deprecated. As of Spark 1.0.0, defaultMinSplits is deprecated; use JavaSparkContext.defaultMinPartitions() instead.
defaultMinSplits() - Method in class org.apache.spark.SparkContext: Default min number of partitions for Hadoop RDDs when not given by user
defaultParallelism() - Method in class org.apache.spark.api.java.JavaSparkContext: Default level of parallelism to use when not given by user (e.g.
defaultParallelism() - Method in class org.apache.spark.SparkContext: Default level of parallelism to use when not given by user (e.g.
defaultPartitioner(RDD<?>, Seq<RDD<?>>) - Static method in class org.apache.spark.Partitioner: Choose a partitioner to use for a cogroup-like operation between a number of RDDs.
defaultStrategy() - Static method in class org.apache.spark.streaming.receiver.ActorSupervisorStrategy
degreesOfFreedom() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
degreesOfFreedom() - Method in interface org.apache.spark.mllib.stat.test.TestResult: Returns the degree(s) of freedom of the hypothesis test.
delegate() - Method in class org.apache.spark.InterruptibleIterator
dense(int, int, double[]) - Static method in class org.apache.spark.mllib.linalg.Matrices: Creates a column-major dense matrix.
dense(double, double...) - Static method in class org.apache.spark.mllib.linalg.Vectors: Creates a dense vector from its values.
dense(double, Seq<Object>) - Static method in class org.apache.spark.mllib.linalg.Vectors: Creates a dense vector from its values.
dense(double[]) - Static method in class org.apache.spark.mllib.linalg.Vectors: Creates a dense vector from a double array.
DenseMatrix - Class in org.apache.spark.mllib.linalg: Column-major dense matrix.
DenseMatrix(int, int, double[]) - Constructor for class org.apache.spark.mllib.linalg.DenseMatrix
DenseVector - Class in org.apache.spark.mllib.linalg: A dense vector represented by a value array.
DenseVector(double[]) - Constructor for class org.apache.spark.mllib.linalg.DenseVector
dependencies() - Method in class org.apache.spark.rdd.RDD: Get the list of dependencies of this RDD, taking into account whether the RDD is checkpointed or not.
dependencies() - Method in class org.apache.spark.streaming.dstream.DStream: List of parent DStreams on which this DStream depends.
dependencies() - Method in class org.apache.spark.streaming.dstream.InputDStream
Dependency<T> - Class in org.apache.spark: :: DeveloperApi :: Base class for dependencies.
Dependency() - Constructor for class org.apache.spark.Dependency
depth() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel: Get depth of tree.
DescribeCommand - Class in org.apache.spark.sql.execution: :: DeveloperApi ::
DescribeCommand(SparkPlan, Seq<Attribute>, SQLContext) - Constructor for class org.apache.spark.sql.execution.DescribeCommand
describedTable() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
DescribeHiveTableCommand - Class in org.apache.spark.sql.hive.execution: Implementation for "describe [extended] table".
DescribeHiveTableCommand(org.apache.spark.sql.hive.MetastoreRelation, Seq<Attribute>, boolean, HiveContext) - Constructor for class org.apache.spark.sql.hive.execution.DescribeHiveTableCommand
description() - Method in class org.apache.spark.ExceptionFailure
description() - Method in class org.apache.spark.storage.StorageLevel
DeserializationStream - Class in org.apache.spark.serializer: :: DeveloperApi :: A stream for reading serialized objects.
DeserializationStream() - Constructor for class org.apache.spark.serializer.DeserializationStream
deserialize(ByteBuffer, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializerInstance
deserialize(ByteBuffer, ClassLoader, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializerInstance
deserialize(Writable) - Method in class org.apache.spark.sql.hive.parquet.FakeParquetSerDe
deserialized() - Method in class org.apache.spark.storage.StorageLevel
deserializeStream(InputStream) - Method in class org.apache.spark.serializer.SerializerInstance
details() - Method in class org.apache.spark.scheduler.StageInfo
determineBounds(ArrayBuffer<Tuple2<K, Object>>, int, Ordering<K>, ClassTag<K>) - Static method in class org.apache.spark.RangePartitioner: Determines the bounds for range partitioning from candidates with weights indicating how many items each represents.
DeveloperApi - Annotation Type in org.apache.spark.annotation: A lower-level, unstable API intended for developers.
DISK_ONLY - Static variable in class org.apache.spark.api.java.StorageLevels
DISK_ONLY() - Static method in class org.apache.spark.storage.StorageLevel
DISK_ONLY_2 - Static variable in class org.apache.spark.api.java.StorageLevels
DISK_ONLY_2() - Static method in class org.apache.spark.storage.StorageLevel
diskSize() - Method in class org.apache.spark.storage.BlockStatus
diskSize() - Method in class org.apache.spark.storage.RDDInfo
diskUsed() - Method in class org.apache.spark.storage.StorageStatus: Return the disk space used by this block manager.
diskUsedByRdd(int) - Method in class org.apache.spark.storage.StorageStatus: Return the disk space used by the given RDD in this block manager in O(1) time.
dist(Vector) - Method in class org.apache.spark.util.Vector
distinct() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a new RDD containing the distinct elements in this RDD.
distinct(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a new RDD containing the distinct elements in this RDD.
distinct() - Method in class org.apache.spark.api.java.JavaPairRDD: Return a new RDD containing the distinct elements in this RDD.
distinct(int) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a new RDD containing the distinct elements in this RDD.
distinct() - Method in class org.apache.spark.api.java.JavaRDD: Return a new RDD containing the distinct elements in this RDD.
distinct(int) - Method in class org.apache.spark.api.java.JavaRDD: Return a new RDD containing the distinct elements in this RDD.
distinct(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Return a new RDD containing the distinct elements in this RDD.
distinct() - Method in class org.apache.spark.rdd.RDD: Return a new RDD containing the distinct elements in this RDD.
distinct() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD: Return a new RDD containing the distinct elements in this RDD.
distinct(int) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD: Return a new RDD containing the distinct elements in this RDD.
Distinct - Class in org.apache.spark.sql.execution: :: DeveloperApi :: Computes the set of distinct input rows using a HashSet.
Distinct(boolean, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Distinct
distinct() - Method in class org.apache.spark.sql.SchemaRDD
distinct(int, Ordering<Row>) - Method in class org.apache.spark.sql.SchemaRDD
DistributedMatrix - Interface in org.apache.spark.mllib.linalg.distributed: Represents a distributively stored matrix backed by one or more RDDs.
divide(double) - Method in class org.apache.spark.util.Vector
doCache() - Method in class org.apache.spark.sql.execution.CacheCommand
dot(Vector) - Method in class org.apache.spark.util.Vector
doubleAccumulator(double) - Method in class org.apache.spark.api.java.JavaSparkContext: Create an Accumulator double variable, which tasks can "add" values to using the add method.
doubleAccumulator(double, String) - Method in class org.apache.spark.api.java.JavaSparkContext: Create an Accumulator double variable, which tasks can "add" values to using the add method.
DoubleFlatMapFunction<T> - Interface in org.apache.spark.api.java.function: A function that returns zero or more records of type Double from each input record.
DoubleFunction<T> - Interface in org.apache.spark.api.java.function: A function that returns Doubles, and can be used to construct DoubleRDDs.
DoubleRDDFunctions - Class in org.apache.spark.rdd: Extra functions available on RDDs of Doubles through an implicit conversion.
DoubleRDDFunctions(RDD<Object>) - Constructor for class org.apache.spark.rdd.DoubleRDDFunctions
doubleRDDToDoubleRDDFunctions(RDD<Object>) - Static method in class org.apache.spark.SparkContext
doubleToDoubleWritable(double) - Static method in class org.apache.spark.SparkContext
doubleToMultiplier(double) - Static method in class org.apache.spark.util.Vector
DoubleType - Static variable in class org.apache.spark.sql.api.java.DataType: Gets the DoubleType object.
DoubleType - Class in org.apache.spark.sql.api.java: The data type representing double and Double values.
doubleWritableConverter() - Static method in class org.apache.spark.SparkContext
driverActorSystemName() - Static method in class org.apache.spark.SparkEnv
DropTable - Class in org.apache.spark.sql.hive.execution: :: DeveloperApi :: Drops a table from the metastore and removes it if it is cached.
DropTable(String, boolean) - Constructor for class org.apache.spark.sql.hive.execution.DropTable
dstream() - Method in class org.apache.spark.streaming.api.java.JavaDStream
dstream() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
dstream() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
DStream<T> - Class in org.apache.spark.streaming.dstream: A Discretized Stream (DStream), the basic abstraction in Spark Streaming, is a continuous sequence of RDDs (of the same type) representing a continuous stream of data (see org.apache.spark.rdd.RDD in the Spark core documentation for more details on RDDs).
DStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.DStream
duration() - Method in class org.apache.spark.scheduler.TaskInfo
Duration - Class in org.apache.spark.streaming
Duration(long) - Constructor for class org.apache.spark.streaming.Duration
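
The dense(int, int, double[]) and DenseMatrix entries above refer to a column-major layout, in which the value array lists the first column top to bottom, then the second column, and so on. The sketch below shows the index arithmetic this implies, in plain Java with no Spark dependency. The class ColumnMajorSketch and its get method are hypothetical helpers for illustration, not Spark API.

```java
// Hypothetical illustration class; not part of Spark.
class ColumnMajorSketch {

    // Returns entry (i, j) of a numRows x numCols matrix whose values are
    // stored column by column in a flat array.
    static double get(int numRows, int numCols, double[] values, int i, int j) {
        if (values.length != numRows * numCols) {
            throw new IllegalArgumentException("values.length must equal numRows * numCols");
        }
        if (i < 0 || i >= numRows || j < 0 || j >= numCols) {
            throw new IndexOutOfBoundsException("(" + i + ", " + j + ") is outside the matrix");
        }
        // Column j starts at offset numRows * j; row i is the i-th slot within it.
        return values[i + numRows * j];
    }

    public static void main(String[] args) {
        // The 3 x 2 matrix
        //   1.0 2.0
        //   3.0 4.0
        //   5.0 6.0
        // is laid out column by column as follows.
        double[] values = {1.0, 3.0, 5.0, 2.0, 4.0, 6.0};
        System.out.println(get(3, 2, values, 0, 1)); // row 0, column 1
        System.out.println(get(3, 2, values, 2, 0)); // row 2, column 0
    }
}
```

Passing a row-major array by mistake yields a silently transposed or scrambled matrix, so it is worth checking a couple of entries after construction.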

E

elements() - Method in class org.apache.spark.util.Vector
empty() - Static method in class org.apache.spark.storage.BlockStatus
emptyRDD() - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD that has no partitions or elements.
emptyRDD(ClassTag<T>) - Method in class org.apache.spark.SparkContext: Get an RDD that has no partitions or elements.
entries() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
Entropy - Class in org.apache.spark.mllib.tree.impurity: :: Experimental :: Class for calculating entropy during binary classification.
Entropy() - Constructor for class org.apache.spark.mllib.tree.impurity.Entropy
env() - Method in class org.apache.spark.api.java.JavaSparkContext
env() - Method in class org.apache.spark.SparkContext
env() - Method in class org.apache.spark.streaming.StreamingContext
environmentDetails() - Method in class org.apache.spark.scheduler.SparkListenerEnvironmentUpdate
EnvironmentListener - Class in org.apache.spark.ui.env: :: DeveloperApi :: A SparkListener that prepares information to be displayed on the EnvironmentTab
EnvironmentListener() - Constructor for class org.apache.spark.ui.env.EnvironmentListener
EPSILON() - Static method in class org.apache.spark.mllib.util.MLUtils
equals(Object) - Method in class org.apache.spark.HashPartitioner
equals(Object) - Method in interface org.apache.spark.mllib.linalg.Vector
equals(Object) - Method in class org.apache.spark.RangePartitioner
equals(Object) - Method in class org.apache.spark.scheduler.AccumulableInfo
equals(Object) - Method in class org.apache.spark.scheduler.InputFormatInfo
equals(Object) - Method in class org.apache.spark.scheduler.SplitInfo
equals(Object) - Method in class org.apache.spark.sql.api.java.ArrayType
equals(Object) - Method in class org.apache.spark.sql.api.java.MapType
equals(Object) - Method in class org.apache.spark.sql.api.java.Row
equals(Object) - Method in class org.apache.spark.sql.api.java.StructField
equals(Object) - Method in class org.apache.spark.sql.api.java.StructType
equals(Object) - Method in class org.apache.spark.storage.BlockId
equals(Object) - Method in class org.apache.spark.storage.BlockManagerId
equals(Object) - Method in class org.apache.spark.storage.StorageLevel
EvaluatePython - Class in org.apache.spark.sql.execution: :: DeveloperApi :: Evaluates a PythonUDF, appending the result to the end of the input tuple.
EvaluatePython(PythonUDF, LogicalPlan) - Constructor for class org.apache.spark.sql.execution.EvaluatePython
event() - Method in class org.apache.spark.streaming.flume.SparkFlumeEvent
eventLogger() - Method in class org.apache.spark.SparkContext
Except - Class in org.apache.spark.sql.execution: :: DeveloperApi :: Returns a table with the elements from left that are not in right using the built-in Spark subtract function.
Except(SparkPlan, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Except
except(SchemaRDD) - Method in class org.apache.spark.sql.SchemaRDD: Performs a relational except on two SchemaRDDs
ExceptionFailure - Class in org.apache.spark: :: DeveloperApi :: Task failed due to a runtime exception.
ExceptionFailure(String, String, StackTraceElement[], Option<TaskMetrics>) - Constructor for class org.apache.spark.ExceptionFailure
Exchange - Class in org.apache.spark.sql.execution: :: DeveloperApi ::
Exchange(Partitioning, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Exchange
execId() - Method in class org.apache.spark.scheduler.SparkListenerExecutorMetricsUpdate
execute() - Method in class org.apache.spark.sql.execution.Aggregate: Substituted version of aggregateExpressions expressions which are used to compute final output rows given a group and the result of all aggregate computations.
execute() - Method in class org.apache.spark.sql.execution.BatchPythonEvaluation
execute() - Method in class org.apache.spark.sql.execution.BroadcastHashJoin
execute() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
execute() - Method in class org.apache.spark.sql.execution.CacheCommand
execute() - Method in class org.apache.spark.sql.execution.CartesianProduct
execute() - Method in class org.apache.spark.sql.execution.DescribeCommand
execute() - Method in class org.apache.spark.sql.execution.Distinct
execute() - Method in class org.apache.spark.sql.execution.Except
execute() - Method in class org.apache.spark.sql.execution.Exchange
execute() - Method in class org.apache.spark.sql.execution.ExistingRdd
execute() - Method in class org.apache.spark.sql.execution.ExplainCommand
execute() - Method in class org.apache.spark.sql.execution.Filter
execute() - Method in class org.apache.spark.sql.execution.Generate
execute() - Method in class org.apache.spark.sql.execution.GeneratedAggregate
execute() - Method in class org.apache.spark.sql.execution.HashOuterJoin
execute() - Method in class org.apache.spark.sql.execution.Intersect
execute() - Method in class org.apache.spark.sql.execution.LeftSemiJoinBNL
execute() - Method in class org.apache.spark.sql.execution.LeftSemiJoinHash
execute() - Method in class org.apache.spark.sql.execution.Limit
execute() - Method in class org.apache.spark.sql.execution.OutputFaker
execute() - Method in class org.apache.spark.sql.execution.Project
execute() - Method in class org.apache.spark.sql.execution.Sample
execute() - Method in class org.apache.spark.sql.execution.SetCommand
execute() - Method in class org.apache.spark.sql.execution.ShuffledHashJoin
execute() - Method in class org.apache.spark.sql.execution.Sort
execute() - Method in class org.apache.spark.sql.execution.SparkPlan: Runs this query returning the result as an RDD.
execute() - Method in class org.apache.spark.sql.execution.TakeOrdered
execute() - Method in class org.apache.spark.sql.execution.Union
execute() - Method in class org.apache.spark.sql.hive.execution.AnalyzeTable
execute() - Method in class org.apache.spark.sql.hive.execution.DescribeHiveTableCommand
execute() - Method in class org.apache.spark.sql.hive.execution.DropTable
execute() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
execute() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
execute() - Method in class org.apache.spark.sql.hive.execution.NativeCommand
execute() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
execute() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable: Inserts all rows into the Parquet file.
execute() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
executeCollect() - Method in class org.apache.spark.sql.execution.Limit: A custom implementation modeled after the take function on RDDs but which never runs any job locally.
executeCollect() - Method in class org.apache.spark.sql.execution.SparkPlan: Runs this query returning the result as an array.
executeCollect() - Method in class org.apache.spark.sql.execution.TakeOrdered
executePlan(LogicalPlan) - Method in class org.apache.spark.sql.hive.test.TestHiveContext
executor_() - Method in class org.apache.spark.streaming.receiver.Receiver: Handler object that runs the receiver.
executorActorSystemName() - Static method in class org.apache.spark.SparkEnv
executorEnvs() - Method in class org.apache.spark.SparkContext
executorId() - Method in class org.apache.spark.scheduler.TaskInfo
executorId() - Method in class org.apache.spark.SparkEnv
executorId() - Method in class org.apache.spark.storage.BlockManagerId
executorIdToBlockManagerId() - Method in class org.apache.spark.ui.jobs.JobProgressListener
executorIdToStorageStatus() - Method in class org.apache.spark.storage.StorageStatusListener
ExecutorLostFailure - Class in org.apache.spark: :: DeveloperApi :: The task failed because the executor that it was running on was lost.
ExecutorLostFailure() - Constructor for class org.apache.spark.ExecutorLostFailure
executorMemory() - Method in class org.apache.spark.SparkContext
ExecutorsListener - Class in org.apache.spark.ui.exec: :: DeveloperApi :: A SparkListener that prepares information to be displayed on the ExecutorsTab
ExecutorsListener(StorageStatusListener) - Constructor for class org.apache.spark.ui.exec.ExecutorsListener
executorToDuration() - Method in class org.apache.spark.ui.exec.ExecutorsListener
executorToInputBytes() - Method in class org.apache.spark.ui.exec.ExecutorsListener
executorToShuffleRead() - Method in class org.apache.spark.ui.exec.ExecutorsListener
executorToShuffleWrite() - Method in class org.apache.spark.ui.exec.ExecutorsListener
executorToTasksActive() - Method in class org.apache.spark.ui.exec.ExecutorsListener
executorToTasksComplete() - Method in class org.apache.spark.ui.exec.ExecutorsListener
executorToTasksFailed() - Method in class org.apache.spark.ui.exec.ExecutorsListener
ExistingRdd - Class in org.apache.spark.sql.execution: :: DeveloperApi ::
ExistingRdd(Seq<Attribute>, RDD<Row>) - Constructor for class org.apache.spark.sql.execution.ExistingRdd
Experimental - Annotation Type in org.apache.spark.annotation: An experimental user-facing API.
ExplainCommand - Class in org.apache.spark.sql.execution: An explain command for users to see how a command will be executed.
ExplainCommand(LogicalPlan, Seq<Attribute>, boolean, SQLContext) - Constructor for class org.apache.spark.sql.execution.ExplainCommand
extended() - Method in class org.apache.spark.sql.execution.ExplainCommand
extractDistribution(Function1<BatchInfo, Option<Object>>) - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
extractDoubleDistribution(Seq<Tuple2<TaskInfo, TaskMetrics>>, Function2<TaskInfo, TaskMetrics, Option<Object>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
extractLongDistribution(Seq<Tuple2<TaskInfo, TaskMetrics>>, Function2<TaskInfo, TaskMetrics, Option<Object>>) - Static method in class org.apache.spark.scheduler.StatsReportListener

F

failed() - Method in class org.apache.spark.scheduler.TaskInfo
failedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
failureReason() - Method in class org.apache.spark.scheduler.StageInfo: If the stage failed, the reason why.
FAIR() - Static method in class org.apache.spark.scheduler.SchedulingMode
FakeParquetSerDe - Class in org.apache.spark.sql.hive.parquet: A placeholder that allows SparkSQL users to create metastore tables that are stored as parquet files.
FakeParquetSerDe() - Constructor for class org.apache.spark.sql.hive.parquet.FakeParquetSerDe
falsePositiveRate(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns false positive rate for a given label (category)
feature() - Method in class org.apache.spark.mllib.tree.model.Split
features() - Method in class org.apache.spark.mllib.regression.LabeledPoint
FeatureType - Class in org.apache.spark.mllib.tree.configuration: :: Experimental :: Enum to describe whether a feature is "continuous" or "categorical"
FeatureType() - Constructor for class org.apache.spark.mllib.tree.configuration.FeatureType
featureType() - Method in class org.apache.spark.mllib.tree.model.Split
FetchFailed - Class in org.apache.spark: :: DeveloperApi :: Task failed to fetch shuffle data from a remote node.
FetchFailed(BlockManagerId, int, int, int) - Constructor for class org.apache.spark.FetchFailed
field() - Method in class org.apache.spark.storage.BroadcastBlockId
FIFO() - Static method in class org.apache.spark.scheduler.SchedulingMode
files() - Method in class org.apache.spark.SparkContext
fileStream(String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format.
fileStream(String, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format.
fileStream(String, Function1<Path, Object>, boolean, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format.
filter(Function<Double, Boolean>) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a new RDD containing only the elements that satisfy a predicate.
filter(Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a new RDD containing only the elements that satisfy a predicate.
filter(Function<T, Boolean>) - Method in class org.apache.spark.api.java.JavaRDD: Return a new RDD containing only the elements that satisfy a predicate.
filter(Function1<T, Object>) - Method in class org.apache.spark.rdd.RDD: Return a new RDD containing only the elements that satisfy a predicate.
filter(Function<Row, Boolean>) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD: Return a new RDD containing only the elements that satisfy a predicate.
Filter - Class in org.apache.spark.sql.execution: :: DeveloperApi ::
Filter(Expression, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Filter
filter(Function1<Row, Object>) - Method in class org.apache.spark.sql.SchemaRDD
filter(Function<T, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaDStream: Return a new DStream containing only the elements that satisfy a predicate.
filter(Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream containing only the elements that satisfy a predicate.
filter(Function1<T, Object>) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream containing only the elements that satisfy a predicate.
filterWith(Function1<Object, A>, Function2<T, A, Object>) - Method in class org.apache.spark.rdd.RDD: Filters this RDD with p, where p takes an additional parameter of type A.
findSynonyms(String, int) - Method in class org.apache.spark.mllib.feature.Word2VecModel: Find synonyms of a word
findSynonyms(Vector, int) - Method in class org.apache.spark.mllib.feature.Word2VecModel: Find synonyms of the vector representation of a word
finished() - Method in class org.apache.spark.scheduler.TaskInfo
finishTime() - Method in class org.apache.spark.scheduler.TaskInfo: The time when the task completed successfully (including the time to remotely fetch results, if necessary).
first() - Method in class org.apache.spark.api.java.JavaDoubleRDD
first() - Method in class org.apache.spark.api.java.JavaPairRDD
first() - Method in interface org.apache.spark.api.java.JavaRDDLike: Return the first element in this RDD.
first() - Method in class org.apache.spark.rdd.RDD: Return the first element in this RDD.
fit(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDF: Computes the inverse document frequency.
fit(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDF: Computes the inverse document frequency.
fit(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.StandardScaler: Computes the mean and variance and stores as a model to be used for later scaling.
fit(RDD<S>) - Method in class org.apache.spark.mllib.feature.Word2Vec: Computes the vector representation of each word in vocabulary.
fit(JavaRDD<S>) - Method in class org.apache.spark.mllib.feature.Word2Vec: Computes the vector representation of each word in vocabulary (Java version).
flatMap(FlatMapFunction<T, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by first applying a function to all elements of this RDD, and then flattening the results.
flatMap(Function1<T, TraversableOnce>, ClassTag) - Method in class org.apache.spark.rdd.RDD: Return a new RDD by first applying a function to all elements of this RDD, and then flattening the results.
flatMap(FlatMapFunction<T, U>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream by applying a function to all elements of this DStream, and then flattening the results
flatMap(Function1<T, Traversable>, ClassTag) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream by applying a function to all elements of this DStream, and then flattening the results
FlatMapFunction<T,R> - Interface in org.apache.spark.api.java.function: A function that returns zero or more output records from each input record.
FlatMapFunction2<T1,T2,R> - Interface in org.apache.spark.api.java.function: A function that takes two inputs and returns zero or more output records.
flatMapToDouble(DoubleFlatMapFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by first applying a function to all elements of this RDD, and then flattening the results.
flatMapToPair(PairFlatMapFunction<T, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by first applying a function to all elements of this RDD, and then flattening the results.
flatMapToPair(PairFlatMapFunction<T, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream by applying a function to all elements of this DStream, and then flattening the results
flatMapValues(Function<V, Iterable>) - Method in class org.apache.spark.api.java.JavaPairRDD: Pass each value in the key-value pair RDD through a flatMap function without changing the keys; this also retains the original RDD's partitioning.
flatMapValues(Function1<V, TraversableOnce>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Pass each value in the key-value pair RDD through a flatMap function without changing the keys; this also retains the original RDD's partitioning.
flatMapValues(Function<V, Iterable>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying a flatMap function to the value of each key-value pair in this DStream without changing the key.
flatMapValues(Function1<V, TraversableOnce>, ClassTag) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying a flatMap function to the value of each key-value pair in this DStream without changing the key.
flatMapWith(Function1<Object, A>, boolean, Function2<T, A, Seq>, ClassTag) - Method in class org.apache.spark.rdd.RDD: FlatMaps f over this RDD, where f takes an additional parameter of type A.
floatToFloatWritable(float) - Static method in class org.apache.spark.SparkContext
FloatType - Static variable in class org.apache.spark.sql.api.java.DataType: Gets the FloatType object.
FloatType - Class in org.apache.spark.sql.api.java: The data type representing float and Float values.
floatWritableConverter() - Static method in class org.apache.spark.SparkContext
floor(Duration) - Method in class org.apache.spark.streaming.Time
FlumeUtils - Class in org.apache.spark.streaming.flume
FlumeUtils() - Constructor for class org.apache.spark.streaming.flume.FlumeUtils
flush() - Method in class org.apache.spark.serializer.SerializationStream
fMeasure(double, double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns f-measure for a given label (category)
fMeasure(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns f1-measure for a given label (category)
fMeasure() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns f-measure (equal to precision and recall, because precision equals recall)
fMeasureByThreshold(double) - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Returns the (threshold, F-Measure) curve.
fMeasureByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Returns the (threshold, F-Measure) curve with beta = 1.0.
fold(T, Function2<T, T, T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Aggregate the elements of each partition, and then the results for all the partitions, using a given associative function and a neutral "zero value".
fold(T, Function2<T, T, T>) - Method in class org.apache.spark.rdd.RDD: Aggregate the elements of each partition, and then the results for all the partitions, using a given associative function and a neutral "zero value".
foldByKey(V, Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
foldByKey(V, int, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
foldByKey(V, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
foldByKey(V, Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
foldByKey(V, int, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
foldByKey(V, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
foreach(VoidFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Applies a function f to all elements of this RDD.
foreach(Function1<T, BoxedUnit>) - Method in class org.apache.spark.rdd.RDD: Applies a function f to all elements of this RDD.
foreach(Function<R, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Deprecated. As of release 0.9.0, replaced by foreachRDD
foreach(Function2<R, Time, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Deprecated. As of release 0.9.0, replaced by foreachRDD
foreach(Function1<RDD<T>, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream: Apply a function to each RDD in this DStream.
foreach(Function2<RDD<T>, Time, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream: Apply a function to each RDD in this DStream.
foreachAsync(Function1<T, BoxedUnit>) - Method in class org.apache.spark.rdd.AsyncRDDActions: Applies a function f to all elements of this RDD.
foreachPartition(VoidFunction<Iterator<T>>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Applies a function f to each partition of this RDD.
foreachPartition(Function1<Iterator<T>, BoxedUnit>) - Method in class org.apache.spark.rdd.RDD: Applies a function f to each partition of this RDD.
foreachPartitionAsync(Function1<Iterator<T>, BoxedUnit>) - Method in class org.apache.spark.rdd.AsyncRDDActions: Applies a function f to each partition of this RDD.
foreachRDD(Function<R, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Apply a function to each RDD in this DStream.
foreachRDD(Function2<R, Time, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Apply a function to each RDD in this DStream.
foreachRDD(Function1<RDD<T>, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream: Apply a function to each RDD in this DStream.
foreachRDD(Function2<RDD<T>, Time, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream: Apply a function to each RDD in this DStream.
foreachWith(Function1<Object, A>, Function2<T, A, BoxedUnit>) - Method in class org.apache.spark.rdd.RDD: Applies f to each element of this RDD, where f takes an additional parameter of type A.
formatExecutorId(String) - Method in class org.apache.spark.storage.StorageStatusListener: In the local mode, there is a discrepancy between the executor ID according to the task ("localhost") and that according to SparkEnv ("").
fraction() - Method in class org.apache.spark.sql.execution.Sample
fromAvroFlumeEvent(AvroFlumeEvent) - Static method in class org.apache.spark.streaming.flume.SparkFlumeEvent
fromDStream(DStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaDStream: Convert a Scala DStream to a Java-friendly JavaDStream.
fromInputDStream(InputDStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaInputDStream: Convert a Scala InputDStream to a Java-friendly JavaInputDStream.
fromInputDStream(InputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairInputDStream: Convert a Scala InputDStream of pairs to a Java-friendly JavaPairInputDStream.
fromJavaDStream(JavaDStream<Tuple2<K, V>>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
fromJavaRDD(JavaRDD<Tuple2<K, V>>) - Static method in class org.apache.spark.api.java.JavaPairRDD: Convert a JavaRDD of key-value pairs to JavaPairRDD.
fromPairDStream(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
fromProductRdd(RDD<A>, TypeTags.TypeTag<A>) - Static method in class org.apache.spark.sql.execution.ExistingRdd
fromRDD(RDD<Object>) - Static method in class org.apache.spark.api.java.JavaDoubleRDD
fromRDD(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.api.java.JavaPairRDD
fromRDD(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.api.java.JavaRDD
fromRdd(RDD<?>) - Static method in class org.apache.spark.storage.RDDInfo
fromReceiverInputDStream(ReceiverInputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream: Convert a Scala ReceiverInputDStream of pairs to a Java-friendly JavaPairReceiverInputDStream.
fromReceiverInputDStream(ReceiverInputDStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream: Convert a Scala ReceiverInputDStream to a Java-friendly JavaReceiverInputDStream.
fromSparkContext(SparkContext) - Static method in class org.apache.spark.api.java.JavaSparkContext
fromStage(Stage, Option<Object>) - Static method in class org.apache.spark.scheduler.StageInfo: Construct a StageInfo from a Stage.
fromString(String) - Static method in class org.apache.spark.storage.StorageLevel: :: DeveloperApi :: Return the StorageLevel object with the specified name.
Function<T1,R> - Interface in org.apache.spark.api.java.function: Base interface for functions whose return types do not create special RDDs.
Function2<T1,T2,R> - Interface in org.apache.spark.api.java.function: A two-argument function that takes arguments of type T1 and T2 and returns an R.
Function3<T1,T2,T3,R> - Interface in org.apache.spark.api.java.function: A three-argument function that takes arguments of type T1, T2 and T3 and returns an R.
FutureAction<T> - Interface in org.apache.spark: :: Experimental :: A future for the result of an action to support cancellation.
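
Note on the fold and foldByKey entries above: the requirement that the "zero value" be neutral can be illustrated with a minimal plain-Java sketch. It does not use Spark; FoldSketch and foldPartitions are hypothetical names introduced only for this illustration, and the assumption that the zero value seeds every partition and the final merge is a simplification of the behaviour those entries describe.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;

final class FoldSketch {
    // Hypothetical helper, not a Spark API: folds each partition starting
    // from zero, then merges the per-partition results, again starting from zero.
    static <T> T foldPartitions(List<List<T>> partitions, T zero, BinaryOperator<T> op) {
        T result = zero;
        for (List<T> partition : partitions) {
            T partial = zero;
            for (T element : partition) {
                partial = op.apply(partial, element);
            }
            result = op.apply(result, partial);
        }
        return result;
    }

    public static void main(String[] args) {
        // Two "partitions" holding the values 1..4.
        List<List<Integer>> partitions = Arrays.asList(
            Arrays.asList(1, 2), Arrays.asList(3, 4));

        // 0 is neutral for addition, so the result is the plain sum.
        System.out.println(foldPartitions(partitions, 0, Integer::sum)); // prints 10

        // 1 is not neutral for addition: it is added once per partition
        // and once in the merge, so the result is off by 3.
        System.out.println(foldPartitions(partitions, 1, Integer::sum)); // prints 13
    }
}
```

With a non-neutral zero value the result depends on the number of partitions, which is why the entries above ask for values such as 0 for addition or 1 for multiplication.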

G

gain() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
GeneralizedLinearAlgorithm<M extends GeneralizedLinearModel> - Class in org.apache.spark.mllib.regression: :: DeveloperApi :: GeneralizedLinearAlgorithm implements methods to train a Generalized Linear Model (GLM).
GeneralizedLinearAlgorithm() - Constructor for class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
GeneralizedLinearModel - Class in org.apache.spark.mllib.regression: :: DeveloperApi :: GeneralizedLinearModel (GLM) represents a model trained using GeneralizedLinearAlgorithm.
GeneralizedLinearModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.GeneralizedLinearModel
Generate - Class in org.apache.spark.sql.execution: :: DeveloperApi :: Applies a Generator to a stream of input rows, combining the output of each into a new stream of rows.
Generate(Generator, boolean, boolean, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Generate
generate(Generator, boolean, boolean, Option<String>) - Method in class org.apache.spark.sql.SchemaRDD: :: Experimental :: Applies the given Generator, or table generating function, to this relation.
GeneratedAggregate - Class in org.apache.spark.sql.execution: :: DeveloperApi :: Alternate version of aggregation that leverages projection and thus code generation.
GeneratedAggregate(boolean, Seq<Expression>, Seq<NamedExpression>, SparkPlan) - Constructor for class org.apache.spark.sql.execution.GeneratedAggregate
generatedRDDs() - Method in class org.apache.spark.streaming.dstream.DStream
generateKMeansRDD(SparkContext, int, int, int, double, int) - Static method in class org.apache.spark.mllib.util.KMeansDataGenerator: Generate an RDD containing test data for KMeans.
generateLinearInput(double, double[], int, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
generateLinearInputAsList(double, double[], int, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator: Return a Java List of synthetic data randomly generated according to a multicollinear model.
generateLinearRDD(SparkContext, int, int, double, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator: Generate an RDD containing sample data for Linear Regression models - including Ridge, Lasso, and unregularized variants.
generateLogisticRDD(SparkContext, int, int, double, int, double) - Static method in class org.apache.spark.mllib.util.LogisticRegressionDataGenerator: Generate an RDD containing test data for LogisticRegression.
generator() - Method in class org.apache.spark.sql.execution.Generate
get() - Method in interface org.apache.spark.FutureAction: Blocks and returns the result of this job.
get(String) - Method in class org.apache.spark.SparkConf: Get a parameter; throws a NoSuchElementException if it's not set
get(String, String) - Method in class org.apache.spark.SparkConf: Get a parameter, falling back to a default if not set
get() - Static method in class org.apache.spark.SparkEnv: Returns the ThreadLocal SparkEnv, if non-null.
get(String) - Static method in class org.apache.spark.SparkFiles: Get the absolute path of a file added through SparkContext.addFile().
get(int) - Method in class org.apache.spark.sql.api.java.Row: Returns the value of column i.
getAkkaConf() - Method in class org.apache.spark.SparkConf: Get all akka conf variables set on this SparkConf
getAll() - Method in class org.apache.spark.SparkConf: Get all parameters as a list of pairs
getAllPools() - Method in class org.apache.spark.SparkContext: :: DeveloperApi :: Return pools for fair scheduler
getBlock(BlockId) - Method in class org.apache.spark.storage.StorageStatus: Return the given block stored in this block manager in O(1) time.
getBoolean(String, boolean) - Method in class org.apache.spark.SparkConf: Get a parameter as a boolean, falling back to a default if not set
getBoolean(int) - Method in class org.apache.spark.sql.api.java.Row: Returns the value of column i as a boolean.
getByte(int) - Method in class org.apache.spark.sql.api.java.Row: Returns the value of column i as a byte.
getCachedBlockManagerId(BlockManagerId) - Static method in class org.apache.spark.storage.BlockManagerId
getCachedMetadata(String) - Static method in class org.apache.spark.rdd.HadoopRDD: The three methods below are helpers for accessing the local map, a property of the SparkEnv of the local process.
getCheckpointDir() - Method in class org.apache.spark.api.java.JavaSparkContext
getCheckpointDir() - Method in class org.apache.spark.SparkContext
getCheckpointFile() - Method in interface org.apache.spark.api.java.JavaRDDLike: Gets the name of the file to which this RDD was checkpointed
getCheckpointFile() - Method in class org.apache.spark.rdd.RDD: Gets the name of the file to which this RDD was checkpointed
getConf() - Method in class org.apache.spark.api.java.JavaSparkContext: Return a copy of this JavaSparkContext's configuration.
getConf() - Method in class org.apache.spark.rdd.HadoopRDD
getConf() - Method in class org.apache.spark.rdd.NewHadoopRDD
getConf() - Method in class org.apache.spark.SparkContext: Return a copy of this SparkContext's configuration.
getCreationSite() - Static method in class org.apache.spark.streaming.dstream.DStream: Get the creation site of a DStream from the stack trace of when the DStream is created.
getDataType() - Method in class org.apache.spark.sql.api.java.StructField
getDependencies() - Method in class org.apache.spark.rdd.CoGroupedRDD
getDependencies() - Method in class org.apache.spark.rdd.ShuffledRDD
getDependencies() - Method in class org.apache.spark.rdd.UnionRDD
getDouble(String, double) - Method in class org.apache.spark.SparkConf: Get a parameter as a double, falling back to a default if not set
getDouble(int) - Method in class org.apache.spark.sql.api.java.Row: Returns the value of column i as a double.
getElementType() - Method in class org.apache.spark.sql.api.java.ArrayType
getExecutorEnv() - Method in class org.apache.spark.SparkConf: Get all executor environment variables set on this SparkConf
getExecutorMemoryStatus() - Method in class org.apache.spark.SparkContext: Return a map from the slave to the max memory available for caching and the remaining memory available for caching.
getExecutorStorageStatus() - Method in class org.apache.spark.SparkContext: :: DeveloperApi :: Return information about blocks stored in all of the slaves
getFields() - Method in class org.apache.spark.sql.api.java.StructType
getFinalValue() - Method in class org.apache.spark.partial.PartialResult: Blocking method to wait for and return the final value.
getFloat(int) - Method in class org.apache.spark.sql.api.java.Row: Returns the value of column i as a float.
getHiveFile(String) - Method in class org.apache.spark.sql.hive.test.TestHiveContext
getInt(String, int) - Method in class org.apache.spark.SparkConf: Get a parameter as an integer, falling back to a default if not set
getInt(int) - Method in class org.apache.spark.sql.api.java.Row: Returns the value of column i as an int.
getKeyType() - Method in class org.apache.spark.sql.api.java.MapType
getLocalProperty(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Get a local property set in this thread, or null if it is missing.
getLocalProperty(String) - Method in class org.apache.spark.SparkContext: Get a local property set in this thread, or null if it is missing.
getLong(String, long) - Method in class org.apache.spark.SparkConf: Get a parameter as a long, falling back to a default if not set
getLong(int) - Method in class org.apache.spark.sql.api.java.Row: Returns the value of column i as a long.
getName() - Method in class org.apache.spark.sql.api.java.StructField
getObjectInspector() - Method in class org.apache.spark.sql.hive.parquet.FakeParquetSerDe
getOption(String) - Method in class org.apache.spark.SparkConf: Get a parameter as an Option
getOrCreate(String, JavaStreamingContextFactory) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
getOrCreate(String, Configuration, JavaStreamingContextFactory) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
getOrCreate(String, Configuration, JavaStreamingContextFactory, boolean) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
getOrCreate(String, Function0<StreamingContext>, Configuration, boolean) - Static method in class org.apache.spark.streaming.StreamingContext: Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
getParents(int) - Method in class org.apache.spark.NarrowDependency: Get the parent partitions for a child partition.
getParents(int) - Method in class org.apache.spark.OneToOneDependency
getParents(int) - Method in class org.apache.spark.RangeDependency
getPartition(Object) - Method in class org.apache.spark.HashPartitioner
getPartition(Object) - Method in class org.apache.spark.Partitioner
getPartition(Object) - Method in class org.apache.spark.RangePartitioner
getPartitions() - Method in class org.apache.spark.rdd.CoGroupedRDD
getPartitions() - Method in class org.apache.spark.rdd.HadoopRDD
getPartitions() - Method in class org.apache.spark.rdd.JdbcRDD
getPartitions() - Method in class org.apache.spark.rdd.NewHadoopRDD
getPartitions() - Method in class org.apache.spark.rdd.ShuffledRDD
getPartitions() - Method in class org.apache.spark.rdd.UnionRDD
getPartitions() - Method in class org.apache.spark.sql.SchemaRDD
getPersistentRDDs() - Method in class org.apache.spark.SparkContext: Returns an immutable map of RDDs that have marked themselves as persistent via a cache() call.
getPoolForName(String) - Method in class org.apache.spark.SparkContext: :: DeveloperApi :: Return the pool associated with the given name, if one exists
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.HadoopRDD
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.NewHadoopRDD
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.UnionRDD
getRDDStorageInfo() - Method in class org.apache.spark.SparkContext: :: DeveloperApi :: Return information about which RDDs are cached, whether they are in memory or on disk, how much space they take, etc.
getReceiver() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream: Gets the receiver object that will be sent to the worker nodes to receive data.
getRootDirectory() - Static method in class org.apache.spark.SparkFiles: Get the root directory that contains files added through SparkContext.addFile().
getSchedulingMode() - Method in class org.apache.spark.SparkContext: Return current scheduling mode
getSerDeStats() - Method in class org.apache.spark.sql.hive.parquet.FakeParquetSerDe
getSerializedClass() - Method in class org.apache.spark.sql.hive.parquet.FakeParquetSerDe
getSerializer(Serializer) - Static method in class org.apache.spark.serializer.Serializer
getSerializer(Option<Serializer>) - Static method in class org.apache.spark.serializer.Serializer
getShort(int) - Method in class org.apache.spark.sql.api.java.Row: Returns the value of column i as a short.
getSparkHome() - Method in class org.apache.spark.api.java.JavaSparkContext: Get Spark's home location from either a value set through the constructor, or the spark.home Java property, or the SPARK_HOME environment variable (in that order of preference).
getStorageLevel() - Method in interface org.apache.spark.api.java.JavaRDDLike: Get the RDD's current storage level, or StorageLevel.NONE if none is set.
getStorageLevel() - Method in class org.apache.spark.rdd.RDD: Get the RDD's current storage level, or StorageLevel.NONE if none is set.
getString(int) - Method in class org.apache.spark.sql.api.java.Row: Returns the value of column i as a String.
getThreadLocal() - Static method in class org.apache.spark.SparkEnv: Returns the ThreadLocal SparkEnv.
gettingResult() - Method in class org.apache.spark.scheduler.TaskInfo
gettingResultTime() - Method in class org.apache.spark.scheduler.TaskInfo: The time when the task started remotely getting the result.
getValueType() - Method in class org.apache.spark.sql.api.java.MapType
Gini - Class in org.apache.spark.mllib.tree.impurity: :: Experimental :: Class for calculating the Gini impurity during binary classification.
Gini() - Constructor for class org.apache.spark.mllib.tree.impurity.Gini
global() - Method in class org.apache.spark.sql.execution.Sort
glom() - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an RDD created by coalescing all elements within each partition into an array.
glom() - Method in class org.apache.spark.rdd.RDD: Return an RDD created by coalescing all elements within each partition into an array.
glom() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying glom() to each RDD of this DStream.
glom() - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD is generated by applying glom() to each RDD of this DStream.
Gradient - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: Class used to compute the gradient for a loss function, given a single data point.
Gradient() - Constructor for class org.apache.spark.mllib.optimization.Gradient
GradientDescent - Class in org.apache.spark.mllib.optimization: Class used to solve an optimization problem using Gradient Descent.
graph() - Method in class org.apache.spark.streaming.dstream.DStream
graph() - Method in class org.apache.spark.streaming.StreamingContext
groupBy(Function<T, K>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an RDD of grouped elements.
groupBy(Function<T, K>, int) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an RDD of grouped elements.
groupBy(Function1<T, K>, ClassTag<K>) - Method in class org.apache.spark.rdd.RDD: Return an RDD of grouped items.
groupBy(Function1<T, K>, int, ClassTag<K>) - Method in class org.apache.spark.rdd.RDD: Return an RDD of grouped elements.
groupBy(Function1<T, K>, Partitioner, ClassTag<K>, Ordering<K>) - Method in class org.apache.spark.rdd.RDD: Return an RDD of grouped items.
groupBy(Seq<Expression>, Seq<Expression>) - Method in class org.apache.spark.sql.SchemaRDD: Performs a grouping followed by an aggregation.
groupByKey(Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Group the values for each key in the RDD into a single sequence.
groupByKey(int) - Method in class org.apache.spark.api.java.JavaPairRDD: Group the values for each key in the RDD into a single sequence.
groupByKey() - Method in class org.apache.spark.api.java.JavaPairRDD: Group the values for each key in the RDD into a single sequence.
groupByKey(Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: Group the values for each key in the RDD into a single sequence.
groupByKey(int) - Method in class org.apache.spark.rdd.PairRDDFunctions: Group the values for each key in the RDD into a single sequence.
groupByKey() - Method in class org.apache.spark.rdd.PairRDDFunctions: Group the values for each key in the RDD into a single sequence.
groupByKey() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying groupByKey to each RDD.
groupByKey(int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying groupByKey to each RDD.
groupByKey(Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying groupByKey on each RDD of this DStream.
groupByKey() - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying groupByKey to each RDD.
groupByKey(int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying groupByKey to each RDD.
groupByKey(Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying groupByKey on each RDD.
groupByKeyAndWindow(Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying groupByKey over a sliding window.
groupByKeyAndWindow(Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying groupByKey over a sliding window.
groupByKeyAndWindow(Duration, Duration, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying groupByKey over a sliding window on this DStream.
groupByKeyAndWindow(Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying groupByKey over a sliding window on this DStream.
groupByKeyAndWindow(Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying groupByKey over a sliding window.
groupByKeyAndWindow(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying groupByKey over a sliding window.
groupByKeyAndWindow(Duration, Duration, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying groupByKey over a sliding window on this DStream.
groupByKeyAndWindow(Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Create a new DStream by applying groupByKey over a sliding window on this DStream.
groupingExpressions() - Method in class org.apache.spark.sql.execution.Aggregate
groupingExpressions() - Method in class org.apache.spark.sql.execution.GeneratedAggregate
groupWith(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD: Alias for cogroup.
groupWith(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>) - Method in class org.apache.spark.api.java.JavaPairRDD: Alias for cogroup.
groupWith(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>) - Method in class org.apache.spark.api.java.JavaPairRDD: Alias for cogroup.
groupWith(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Alias for cogroup.
groupWith(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Alias for cogroup.
groupWith(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Alias for cogroup.
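
Note on groupByKey: the entries above describe it as grouping the values for each key into a single sequence. The plain-Java sketch below mimics that behaviour on an in-memory list of pairs. It does not call Spark; the class GroupByKeySketch and its helper pair are hypothetical names used only for illustration, and the real method operates on distributed RDDs rather than a local list.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, Spark-free illustration of "group the values for each key".
class GroupByKeySketch {

    // Convenience helper (an assumption of this sketch) to build a key/value pair.
    public static <K, V> Map.Entry<K, V> pair(K key, V value) {
        return new AbstractMap.SimpleEntry<K, V>(key, value);
    }

    // Collect all values that share a key into one list, keeping first-seen key order.
    public static <K, V> Map<K, List<V>> groupByKey(List<Map.Entry<K, V>> pairs) {
        Map<K, List<V>> grouped = new LinkedHashMap<K, List<V>>();
        for (Map.Entry<K, V> p : pairs) {
            List<V> values = grouped.get(p.getKey());
            if (values == null) {
                values = new ArrayList<V>();
                grouped.put(p.getKey(), values);
            }
            values.add(p.getValue());
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs =
            Arrays.asList(pair("a", 1), pair("b", 2), pair("a", 3));
        // Key "a" ends up with the values 1 and 3, key "b" with the value 2.
        System.out.println(groupByKey(pairs));
    }
}
```

The local version keeps values in arrival order; no such ordering guarantee is claimed here for the distributed method.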

H

hadoopConfiguration() - Method in class org.apache.spark.api.java.JavaSparkContext: Returns the Hadoop configuration used for the Hadoop code (e.g.
hadoopConfiguration() - Method in class org.apache.spark.SparkContext: A default Hadoop Configuration for the Hadoop code (e.g.
hadoopFile(String, Class<F>, Class<K>, Class<V>, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD for a Hadoop file with an arbitrary InputFormat.
hadoopFile(String, Class<F>, Class<K>, Class<V>) - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD for a Hadoop file with an arbitrary InputFormat
hadoopFile(String, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, int) - Method in class org.apache.spark.SparkContext: Get an RDD for a Hadoop file with an arbitrary InputFormat
hadoopFile(String, int, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.SparkContext: Smarter version of hadoopFile() that uses class tags to figure out the classes of keys, values and the InputFormat so that users don't need to pass them directly.
hadoopFile(String, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.SparkContext: Smarter version of hadoopFile() that uses class tags to figure out the classes of keys, values and the InputFormat so that users don't need to pass them directly.
hadoopJobMetadata() - Method in class org.apache.spark.SparkEnv
hadoopRDD(JobConf, Class<F>, Class<K>, Class<V>, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD for a Hadoop-readable dataset from a Hadoop JobConf giving its InputFormat and any other necessary info (e.g.
hadoopRDD(JobConf, Class<F>, Class<K>, Class<V>) - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD for a Hadoop-readable dataset from a Hadoop JobConf giving its InputFormat and any other necessary info (e.g.
HadoopRDD<K,V> - Class in org.apache.spark.rdd: :: DeveloperApi :: An RDD that provides core functionality for reading data stored in Hadoop (e.g., files in HDFS, sources in HBase, or S3), using the older MapReduce API (org.apache.hadoop.mapred).
HadoopRDD(SparkContext, Broadcast<SerializableWritable<Configuration>>, Option<Function1<JobConf, BoxedUnit>>, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, int) - Constructor for class org.apache.spark.rdd.HadoopRDD
HadoopRDD(SparkContext, JobConf, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, int) - Constructor for class org.apache.spark.rdd.HadoopRDD
hadoopRDD(JobConf, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, int) - Method in class org.apache.spark.SparkContext: Get an RDD for a Hadoop-readable dataset from a Hadoop JobConf given its InputFormat and other necessary info (e.g.
hashCode() - Method in class org.apache.spark.HashPartitioner
hashCode() - Method in interface org.apache.spark.mllib.linalg.Vector
hashCode() - Method in interface org.apache.spark.Partition
hashCode() - Method in class org.apache.spark.RangePartitioner
hashCode() - Method in class org.apache.spark.scheduler.InputFormatInfo
hashCode() - Method in class org.apache.spark.scheduler.SplitInfo
hashCode() - Method in class org.apache.spark.sql.api.java.ArrayType
hashCode() - Method in class org.apache.spark.sql.api.java.MapType
hashCode() - Method in class org.apache.spark.sql.api.java.Row
hashCode() - Method in class org.apache.spark.sql.api.java.StructField
hashCode() - Method in class org.apache.spark.sql.api.java.StructType
hashCode() - Method in class org.apache.spark.storage.BlockId
hashCode() - Method in class org.apache.spark.storage.BlockManagerId
hashCode() - Method in class org.apache.spark.storage.StorageLevel
HashingTF - Class in org.apache.spark.mllib.feature: :: Experimental :: Maps a sequence of terms to their term frequencies using the hashing trick.
HashingTF(int) - Constructor for class org.apache.spark.mllib.feature.HashingTF
HashingTF() - Constructor for class org.apache.spark.mllib.feature.HashingTF
HashJoin - Interface in org.apache.spark.sql.execution
HashOuterJoin - Class in org.apache.spark.sql.execution: :: DeveloperApi :: Performs a hash based outer join for two child relations by shuffling the data using the join keys.
HashOuterJoin(Seq<Expression>, Seq<Expression>, JoinType, Option<Expression>, SparkPlan, SparkPlan) - Constructor for class org.apache.spark.sql.execution.HashOuterJoin
HashPartitioner - Class in org.apache.spark: A Partitioner that implements hash-based partitioning using Java's Object.hashCode.
HashPartitioner(int) - Constructor for class org.apache.spark.HashPartitioner
hasNext() - Method in class org.apache.spark.InterruptibleIterator
high() - Method in class org.apache.spark.partial.BoundedDouble
HingeGradient - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: Compute gradient and loss for a Hinge loss function, as used in SVM binary classification.
HingeGradient() - Constructor for class org.apache.spark.mllib.optimization.HingeGradient
histogram(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Compute a histogram of the data using bucketCount number of buckets evenly spaced between the minimum and maximum of the RDD.
histogram(double[]) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Compute a histogram using the provided buckets.
histogram(Double[], boolean) - Method in class org.apache.spark.api.java.JavaDoubleRDD
histogram(int) - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Compute a histogram of the data using bucketCount number of buckets evenly spaced between the minimum and maximum of the RDD.
histogram(double[], boolean) - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Compute a histogram using the provided buckets.
hiveContext() - Method in class org.apache.spark.sql.hive.execution.AnalyzeTable
hiveContext() - Method in class org.apache.spark.sql.hive.execution.DropTable
HiveContext - Class in org.apache.spark.sql.hive: An instance of the Spark SQL execution engine that integrates with data stored in Hive.
HiveContext(SparkContext) - Constructor for class org.apache.spark.sql.hive.HiveContext
hiveDevHome() - Method in class org.apache.spark.sql.hive.test.TestHiveContext: The location of the hive source code.
hiveFilesTemp() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
hiveHome() - Method in class org.apache.spark.sql.hive.test.TestHiveContext: The location of the compiled hive distribution
HiveMetastoreTypes - Class in org.apache.spark.sql.hive: :: DeveloperApi :: Provides conversions between Spark SQL data types and Hive Metastore types.
HiveMetastoreTypes() - Constructor for class org.apache.spark.sql.hive.HiveMetastoreTypes
hivePlanner() - Method in class org.apache.spark.sql.hive.HiveContext
hiveql(String) - Method in class org.apache.spark.sql.hive.HiveContext
hiveQTestUtilTables() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
hiveString() - Method in class org.apache.spark.sql.hive.execution.DescribeHiveTableCommand
HiveTableScan - Class in org.apache.spark.sql.hive.execution: :: DeveloperApi :: The Hive table scan operator.
HiveTableScan(Seq<Attribute>, org.apache.spark.sql.hive.MetastoreRelation, Option<Expression>, HiveContext) - Constructor for class org.apache.spark.sql.hive.execution.HiveTableScan
host() - Method in class org.apache.spark.scheduler.TaskInfo
host() - Method in class org.apache.spark.storage.BlockManagerId
hostLocation() - Method in class org.apache.spark.scheduler.SplitInfo
hostPort() - Method in class org.apache.spark.storage.BlockManagerId
hours() - Static method in class org.apache.spark.scheduler.StatsReportListener
hql(String) - Method in class org.apache.spark.sql.hive.api.java.JavaHiveContext: DEPRECATED: Use sql(...) instead.
hql(String) - Method in class org.apache.spark.sql.hive.HiveContext
HttpBroadcastFactory - Class in org.apache.spark.broadcast: A BroadcastFactory implementation that uses an HTTP server as the broadcast mechanism.
HttpBroadcastFactory() - Constructor for class org.apache.spark.broadcast.HttpBroadcastFactory
httpFileServer() - Method in class org.apache.spark.SparkEnv
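
Note on HashPartitioner: the entry above describes it as hash-based partitioning using Java's Object.hashCode. A minimal sketch of that idea follows, in plain Java without Spark. The class HashPartitionSketch and its methods are hypothetical names; mapping a null key to partition 0 and using a non-negative modulo are assumptions of this sketch, not statements taken from this index.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, Spark-free illustration of hash-based partitioning.
class HashPartitionSketch {

    // Java's % operator can return a negative result for a negative hash code,
    // so shift negative remainders back into the range [0, mod).
    public static int nonNegativeMod(int x, int mod) {
        int raw = x % mod;
        return raw < 0 ? raw + mod : raw;
    }

    // Choose a partition index in [0, numPartitions) from the key's hashCode.
    public static int partitionFor(Object key, int numPartitions) {
        if (key == null) {
            return 0; // assumption of this sketch: null keys go to partition 0
        }
        return nonNegativeMod(key.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        List<Object> keys = Arrays.<Object>asList("spark", 42, -7, null);
        for (Object key : keys) {
            // With 4 partitions, 42 maps to 2 and -7 maps to 1.
            System.out.println(key + " -> " + partitionFor(key, 4));
        }
    }
}
```

Equal keys always hash to the same partition index, which is what lets per-key operations find all values for a key in one place.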

I

i() - Method in class org.apache.spark.mllib.linalg.distributed.MatrixEntry
id() - Method in class org.apache.spark.Accumulable
id() - Method in interface org.apache.spark.api.java.JavaRDDLike: A unique ID for this RDD (within its SparkContext).
id() - Method in class org.apache.spark.broadcast.Broadcast
id() - Method in class org.apache.spark.mllib.tree.model.Node
id() - Method in class org.apache.spark.rdd.RDD: A unique ID for this RDD (within its SparkContext).
id() - Method in class org.apache.spark.scheduler.AccumulableInfo
id() - Method in class org.apache.spark.scheduler.TaskInfo
id() - Method in class org.apache.spark.storage.RDDInfo
id() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream: This is a unique identifier for the network input stream.
IDF - Class in org.apache.spark.mllib.feature: :: Experimental :: Inverse document frequency (IDF).
IDF() - Constructor for class org.apache.spark.mllib.feature.IDF
idf() - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator: Returns the current IDF vector.
idf() - Method in class org.apache.spark.mllib.feature.IDFModel
IDF.DocumentFrequencyAggregator - Class in org.apache.spark.mllib.feature: Document frequency aggregator.
IDF.DocumentFrequencyAggregator() - Constructor for class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
IDFModel - Class in org.apache.spark.mllib.feature: :: Experimental :: Represents an IDF model that can transform term frequency vectors.
ifExists() - Method in class org.apache.spark.sql.hive.execution.DropTable
impurity() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
Impurity - Interface in org.apache.spark.mllib.tree.impurity: :: Experimental :: Trait for calculating information gain.
impurity() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
index() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRow
index() - Method in class org.apache.spark.mllib.recommendation.ALS.BlockStats
index() - Method in interface org.apache.spark.Partition: Get the split's index within its parent RDD
index() - Method in class org.apache.spark.scheduler.TaskInfo
IndexedRow - Class in org.apache.spark.mllib.linalg.distributed: :: Experimental :: Represents a row of IndexedRowMatrix.
IndexedRow(long, Vector) - Constructor for class org.apache.spark.mllib.linalg.distributed.IndexedRow
IndexedRowMatrix - Class in org.apache.spark.mllib.linalg.distributed: :: Experimental :: Represents a row-oriented DistributedMatrix with indexed rows.
IndexedRowMatrix(RDD<IndexedRow>, long, int) - Constructor for class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
IndexedRowMatrix(RDD<IndexedRow>) - Constructor for class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix: Alternative constructor leaving matrix dimensions to be determined automatically.
indexOf(Object) - Method in class org.apache.spark.mllib.feature.HashingTF: Returns the index of the input term.
indices() - Method in class org.apache.spark.mllib.linalg.SparseVector
InformationGainStats - Class in org.apache.spark.mllib.tree.model: :: DeveloperApi :: Information gain statistics for each split
InformationGainStats(double, double, double, double, double, double) - Constructor for class org.apache.spark.mllib.tree.model.InformationGainStats
initialize(boolean, SparkConf, org.apache.spark.SecurityManager) - Method in interface org.apache.spark.broadcast.BroadcastFactory
initialize(boolean, SparkConf, org.apache.spark.SecurityManager) - Method in class org.apache.spark.broadcast.HttpBroadcastFactory
initialize(boolean, SparkConf, org.apache.spark.SecurityManager) - Method in class org.apache.spark.broadcast.TorrentBroadcastFactory
initialize(Configuration, Properties) - Method in class org.apache.spark.sql.hive.parquet.FakeParquetSerDe
initialized() - Method in interface org.apache.spark.Logging
initializeIfNecessary() - Method in interface org.apache.spark.Logging
initializeLogging() - Method in interface org.apache.spark.Logging
initialValue() - Method in class org.apache.spark.partial.PartialResult
initialValues() - Method in class org.apache.spark.sql.execution.AggregateEvaluation
initLocalProperties() - Method in class org.apache.spark.SparkContext
initLock() - Method in interface org.apache.spark.Logging
input() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
inputDStream() - Method in class org.apache.spark.streaming.api.java.JavaInputDStream
inputDStream() - Method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
InputDStream<T> - Class in org.apache.spark.streaming.dstream: This is the abstract base class for all input streams.
InputDStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.InputDStream
inputFormatClazz() - Method in class org.apache.spark.scheduler.InputFormatInfo
inputFormatClazz() - Method in class org.apache.spark.scheduler.SplitInfo
InputFormatInfo - Class in org.apache.spark.scheduler: :: DeveloperApi :: Parses and holds information about inputFormat (and files) specified as a parameter.
InputFormatInfo(Configuration, Class<?>, String) - Constructor for class org.apache.spark.scheduler.InputFormatInfo
inRepoTests() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
InsertIntoHiveTable - Class in org.apache.spark.sql.hive.execution: :: DeveloperApi ::
InsertIntoHiveTable(org.apache.spark.sql.hive.MetastoreRelation, Map<String, Option<String>>, SparkPlan, boolean, HiveContext) - Constructor for class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
InsertIntoParquetTable - Class in org.apache.spark.sql.parquet: Operator that acts as a sink for queries on RDDs and can be used to store the output inside a directory of Parquet files.
InsertIntoParquetTable(ParquetRelation, SparkPlan, boolean) - Constructor for class org.apache.spark.sql.parquet.InsertIntoParquetTable
instance() - Static method in class org.apache.spark.mllib.tree.impurity.Entropy: Get this impurity instance.
instance() - Static method in class org.apache.spark.mllib.tree.impurity.Gini: Get this impurity instance.
instance() - Static method in class org.apache.spark.mllib.tree.impurity.Variance: Get this impurity instance.
intAccumulator(int) - Method in class org.apache.spark.api.java.JavaSparkContext: Create an Accumulator integer variable, which tasks can "add" values to using the add method.
intAccumulator(int, String) - Method in class org.apache.spark.api.java.JavaSparkContext: Create an Accumulator integer variable, which tasks can "add" values to using the add method.
IntegerType - Static variable in class org.apache.spark.sql.api.java.DataType: Gets the IntegerType object.
IntegerType - Class in org.apache.spark.sql.api.java: The data type representing int and Integer values.
intercept() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
intercept() - Method in class org.apache.spark.mllib.classification.SVMModel
intercept() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
intercept() - Method in class org.apache.spark.mllib.regression.LassoModel
intercept() - Method in class org.apache.spark.mllib.regression.LinearRegressionModel
intercept() - Method in class org.apache.spark.mllib.regression.RidgeRegressionModel
InterruptibleIterator<T> - Class in org.apache.spark: :: DeveloperApi :: An iterator that wraps around an existing iterator to provide task killing functionality.
InterruptibleIterator(TaskContext, Iterator<T>) - Constructor for class org.apache.spark.InterruptibleIterator
Intersect - Class in org.apache.spark.sql.execution: :: DeveloperApi :: Returns the rows in left that also appear in right using the built-in Spark intersection function.
Intersect(SparkPlan, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Intersect
intersect(SchemaRDD) - Method in class org.apache.spark.sql.SchemaRDD: Performs a relational intersect on two SchemaRDDs
intersection(JavaDoubleRDD) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return the intersection of this RDD and another one.
intersection(JavaPairRDD<K, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Return the intersection of this RDD and another one.
intersection(JavaRDD<T>) - Method in class org.apache.spark.api.java.JavaRDD: Return the intersection of this RDD and another one.
intersection(RDD<T>) - Method in class org.apache.spark.rdd.RDD: Return the intersection of this RDD and another one.
intersection(RDD<T>, Partitioner, Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Return the intersection of this RDD and another one.
intersection(RDD<T>, int) - Method in class org.apache.spark.rdd.RDD: Return the intersection of this RDD and another one.
intersection(JavaSchemaRDD) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD: Return the intersection of this RDD and another one.
intersection(JavaSchemaRDD, Partitioner) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD: Return the intersection of this RDD and another one.
intersection(JavaSchemaRDD, int) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD: Return the intersection of this RDD and another one.
intersection(RDD<Row>) - Method in class org.apache.spark.sql.SchemaRDD
intersection(RDD<Row>, Partitioner, Ordering<Row>) - Method in class org.apache.spark.sql.SchemaRDD
intersection(RDD<Row>, int) - Method in class org.apache.spark.sql.SchemaRDD
intToIntWritable(int) - Static method in class org.apache.spark.SparkContext
intWritableConverter() - Static method in class org.apache.spark.SparkContext
isAkkaConf(String) - Static method in class org.apache.spark.SparkConf: Return whether the given config is an akka config (e.g.
isAllowed(Enumeration.Value, Enumeration.Value) - Static method in class org.apache.spark.scheduler.TaskLocality
isBroadcast() - Method in class org.apache.spark.storage.BlockId
isCached(String) - Method in class org.apache.spark.sql.SQLContext: Returns true if the table is currently cached in-memory.
isCached() - Method in class org.apache.spark.storage.BlockStatus
isCached() - Method in class org.apache.spark.storage.RDDInfo
isCheckpointed() - Method in interface org.apache.spark.api.java.JavaRDDLike: Return whether this RDD has been checkpointed or not
isCheckpointed() - Method in class org.apache.spark.rdd.RDD: Return whether this RDD has been checkpointed or not
isCheckpointPresent() - Method in class org.apache.spark.streaming.StreamingContext
isCompleted() - Method in class org.apache.spark.ComplexFutureAction
isCompleted() - Method in interface org.apache.spark.FutureAction: Returns whether the action has already been completed with a value or an exception.
isCompleted() - Method in class org.apache.spark.SimpleFutureAction
isCompleted() - Method in class org.apache.spark.TaskContext: Checks whether the task has completed.
isContainsNull() - Method in class org.apache.spark.sql.api.java.ArrayType
isDriver() - Method in class org.apache.spark.storage.BlockManagerId
isExecutorStartupConf(String) - Static method in class org.apache.spark.SparkConf: Return whether the given config should be passed to an executor on start-up.
isExtended() - Method in class org.apache.spark.sql.hive.execution.DescribeHiveTableCommand
isInitialValueFinal() - Method in class org.apache.spark.partial.PartialResult
isInterrupted() - Method in class org.apache.spark.TaskContext: Checks whether the task has been killed.
isLeaf() - Method in class org.apache.spark.mllib.tree.model.Node
isLocal() - Method in class org.apache.spark.api.java.JavaSparkContext
isLocal() - Method in class org.apache.spark.SparkContext
isMulticlassClassification() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
isMulticlassWithCategoricalFeatures() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
isMultipleOf(Duration) - Method in class org.apache.spark.streaming.Duration
isMultipleOf(Duration) - Method in class org.apache.spark.streaming.Time
isNullable() - Method in class org.apache.spark.sql.api.java.StructField
isNullAt(int) - Method in class org.apache.spark.sql.api.java.Row: Returns true if value at column `i` is NULL.
isRDD() - Method in class org.apache.spark.storage.BlockId
isShuffle() - Method in class org.apache.spark.storage.BlockId
isSparkPortConf(String) - Static method in class org.apache.spark.SparkConf: Return whether the given config is a Spark port config.
isStarted() - Method in class org.apache.spark.streaming.receiver.Receiver: Check if the receiver has started or not.
isStopped() - Method in class org.apache.spark.streaming.receiver.Receiver: Check if receiver has been marked for stopping.
isTraceEnabled() - Method in interface org.apache.spark.Logging
isValid() - Method in class org.apache.spark.storage.StorageLevel
isValueContainsNull() - Method in class org.apache.spark.sql.api.java.MapType
isZero() - Method in class org.apache.spark.streaming.Duration
iterator(Partition, TaskContext) - Method in interface org.apache.spark.api.java.JavaRDDLike: Internal method to this RDD; will read from cache if applicable, or otherwise compute it.
iterator(Partition, TaskContext) - Method in class org.apache.spark.rdd.RDD: Internal method to this RDD; will read from cache if applicable, or otherwise compute it.

J

j() - Method in class org.apache.spark.mllib.linalg.distributed.MatrixEntry
jarOfClass(Class<?>) - Static method in class org.apache.spark.api.java.JavaSparkContext: Find the JAR from which a given class was loaded, to make it easy for users to pass their JARs to SparkContext.
jarOfClass(Class<?>) - Static method in class org.apache.spark.SparkContext: Find the JAR from which a given class was loaded, to make it easy for users to pass their JARs to SparkContext.
jarOfClass(Class<?>) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Find the JAR from which a given class was loaded, to make it easy for users to pass their JARs to StreamingContext.
jarOfClass(Class<?>) - Static method in class org.apache.spark.streaming.StreamingContext: Find the JAR from which a given class was loaded, to make it easy for users to pass their JARs to StreamingContext.
jarOfObject(Object) - Static method in class org.apache.spark.api.java.JavaSparkContext: Find the JAR that contains the class of a particular object, to make it easy for users to pass their JARs to SparkContext.
jarOfObject(Object) - Static method in class org.apache.spark.SparkContext: Find the JAR that contains the class of a particular object, to make it easy for users to pass their JARs to SparkContext.
jars() - Method in class org.apache.spark.api.java.JavaSparkContext
jars() - Method in class org.apache.spark.SparkContext
JavaDoubleRDD - Class in org.apache.spark.api.java
JavaDoubleRDD(RDD<Object>) - Constructor for class org.apache.spark.api.java.JavaDoubleRDD
JavaDStream<T> - Class in org.apache.spark.streaming.api.java: A Java-friendly interface to DStream, the basic abstraction in Spark Streaming that represents a continuous stream of data.
JavaDStream(DStream<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.api.java.JavaDStream
JavaDStreamLike<T,This extends JavaDStreamLike<T,This,R>,R extends JavaRDDLike<T,R>> - Interface in org.apache.spark.streaming.api.java
JavaHadoopRDD<K,V> - Class in org.apache.spark.api.java
JavaHadoopRDD(HadoopRDD<K, V>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.api.java.JavaHadoopRDD
JavaHiveContext - Class in org.apache.spark.sql.hive.api.java: The entry point for executing Spark SQL queries from a Java program.
JavaHiveContext(JavaSparkContext) - Constructor for class org.apache.spark.sql.hive.api.java.JavaHiveContext
JavaInputDStream<T> - Class in org.apache.spark.streaming.api.java: A Java-friendly interface to InputDStream.
JavaInputDStream(InputDStream<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.api.java.JavaInputDStream
JavaKinesisWordCountASL - Class in org.apache.spark.examples.streaming: Java-friendly Kinesis Spark Streaming WordCount example. See http://spark.apache.org/docs/latest/streaming-kinesis.html for more details on the Kinesis Spark Streaming integration.
JavaNewHadoopRDD<K,V> - Class in org.apache.spark.api.java
JavaNewHadoopRDD(NewHadoopRDD<K, V>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.api.java.JavaNewHadoopRDD
JavaPairDStream<K,V> - Class in org.apache.spark.streaming.api.java: A Java-friendly interface to a DStream of key-value pairs, which provides extra methods like reduceByKey and join.
JavaPairDStream(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.streaming.api.java.JavaPairDStream
JavaPairInputDStream<K,V> - Class in org.apache.spark.streaming.api.java: A Java-friendly interface to InputDStream of key-value pairs.
JavaPairInputDStream(InputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.streaming.api.java.JavaPairInputDStream
JavaPairRDD<K,V> - Class in org.apache.spark.api.java
JavaPairRDD(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.api.java.JavaPairRDD
JavaPairReceiverInputDStream<K,V> - Class in org.apache.spark.streaming.api.java: A Java-friendly interface to ReceiverInputDStream, the abstract class for defining any input stream that receives data over the network.
JavaPairReceiverInputDStream(ReceiverInputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
JavaRDD<T> - Class in org.apache.spark.api.java
JavaRDD(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.api.java.JavaRDD
JavaRDDLike<T,This extends JavaRDDLike<T,This>> - Interface in org.apache.spark.api.java
JavaReceiverInputDStream<T> - Class in org.apache.spark.streaming.api.java: A Java-friendly interface to ReceiverInputDStream, the abstract class for defining any input stream that receives data over the network.
JavaReceiverInputDStream(ReceiverInputDStream<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
JavaSchemaRDD - Class in org.apache.spark.sql.api.java: An RDD of Row objects that is returned as the result of a Spark SQL query.
JavaSchemaRDD(SQLContext, LogicalPlan) - Constructor for class org.apache.spark.sql.api.java.JavaSchemaRDD
JavaSerializer - Class in org.apache.spark.serializer: :: DeveloperApi :: A Spark serializer that uses Java's built-in serialization.
JavaSerializer(SparkConf) - Constructor for class org.apache.spark.serializer.JavaSerializer
JavaSparkContext - Class in org.apache.spark.api.java: A Java-friendly version of SparkContext that returns JavaRDDs and works with Java collections instead of Scala ones.
JavaSparkContext(SparkContext) - Constructor for class org.apache.spark.api.java.JavaSparkContext
JavaSparkContext() - Constructor for class org.apache.spark.api.java.JavaSparkContext: Create a JavaSparkContext that loads settings from system properties (for instance, when launching with ./bin/spark-submit).
JavaSparkContext(SparkConf) - Constructor for class org.apache.spark.api.java.JavaSparkContext
JavaSparkContext(String, String) - Constructor for class org.apache.spark.api.java.JavaSparkContext
JavaSparkContext(String, String, SparkConf) - Constructor for class org.apache.spark.api.java.JavaSparkContext
JavaSparkContext(String, String, String, String) - Constructor for class org.apache.spark.api.java.JavaSparkContext
JavaSparkContext(String, String, String, String[]) - Constructor for class org.apache.spark.api.java.JavaSparkContext
JavaSparkContext(String, String, String, String[], Map<String, String>) - Constructor for class org.apache.spark.api.java.JavaSparkContext
JavaSQLContext - Class in org.apache.spark.sql.api.java: The entry point for executing Spark SQL queries from a Java program.
JavaSQLContext(SQLContext) - Constructor for class org.apache.spark.sql.api.java.JavaSQLContext
JavaSQLContext(JavaSparkContext) - Constructor for class org.apache.spark.sql.api.java.JavaSQLContext
JavaStreamingContext - Class in org.apache.spark.streaming.api.java: A Java-friendly version of StreamingContext which is the main entry point for Spark Streaming functionality.
JavaStreamingContext(StreamingContext) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext
JavaStreamingContext(String, String, Duration) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a StreamingContext.
JavaStreamingContext(String, String, Duration, String, String) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a StreamingContext.
JavaStreamingContext(String, String, Duration, String, String[]) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a StreamingContext.
JavaStreamingContext(String, String, Duration, String, String[], Map<String, String>) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a StreamingContext.
JavaStreamingContext(JavaSparkContext, Duration) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a JavaStreamingContext using an existing JavaSparkContext.
JavaStreamingContext(SparkConf, Duration) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a JavaStreamingContext using a SparkConf configuration.
JavaStreamingContext(String) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext: Re-creates a JavaStreamingContext from a checkpoint file.
JavaStreamingContext(String, Configuration) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext: Re-creates a JavaStreamingContext from a checkpoint file.
JavaStreamingContextFactory - Interface in org.apache.spark.streaming.api.java: Factory interface for creating a new JavaStreamingContext
JdbcRDD<T> - Class in org.apache.spark.rdd: An RDD that executes an SQL query on a JDBC connection and reads results.
JdbcRDD(SparkContext, Function0<Connection>, String, long, long, int, Function1<ResultSet, T>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.JdbcRDD
jobId() - Method in class org.apache.spark.scheduler.SparkListenerJobEnd
jobId() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
JobLogger - Class in org.apache.spark.scheduler: :: DeveloperApi :: A logger class to record runtime information for jobs in Spark.
JobLogger(String, String) - Constructor for class org.apache.spark.scheduler.JobLogger
JobLogger() - Constructor for class org.apache.spark.scheduler.JobLogger
JobProgressListener - Class in org.apache.spark.ui.jobs: :: DeveloperApi :: Tracks task-level information to be displayed in the UI.
JobProgressListener(SparkConf) - Constructor for class org.apache.spark.ui.jobs.JobProgressListener
JobResult - Interface in org.apache.spark.scheduler: :: DeveloperApi :: A result of a job in the DAGScheduler.
jobResult() - Method in class org.apache.spark.scheduler.SparkListenerJobEnd
JobSucceeded - Class in org.apache.spark.scheduler
JobSucceeded() - Constructor for class org.apache.spark.scheduler.JobSucceeded
join(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD containing all pairs of elements with matching keys in this and other.
join(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD containing all pairs of elements with matching keys in this and other.
join(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD containing all pairs of elements with matching keys in this and other.
join(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return an RDD containing all pairs of elements with matching keys in this and other.
join(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return an RDD containing all pairs of elements with matching keys in this and other.
join(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return an RDD containing all pairs of elements with matching keys in this and other.
join() - Method in class org.apache.spark.sql.execution.Generate
join(SchemaRDD, JoinType, Option<Expression>) - Method in class org.apache.spark.sql.SchemaRDD: Performs a relational join on two SchemaRDDs
join(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
joinIterators(Iterator<Row>, Iterator<Row>) - Method in interface org.apache.spark.sql.execution.HashJoin
joinType() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
joinType() - Method in class org.apache.spark.sql.execution.HashOuterJoin
jsonFile(String) - Method in class org.apache.spark.sql.api.java.JavaSQLContext: Loads a JSON file (one object per line), returning the result as a JavaSchemaRDD.
jsonFile(String, StructType) - Method in class org.apache.spark.sql.api.java.JavaSQLContext: :: Experimental :: Loads a JSON file (one object per line) and applies the given schema, returning the result as a JavaSchemaRDD.
jsonFile(String) - Method in class org.apache.spark.sql.SQLContext: Loads a JSON file (one object per line), returning the result as a SchemaRDD.
jsonFile(String, StructType) - Method in class org.apache.spark.sql.SQLContext: :: Experimental :: Loads a JSON file (one object per line) and applies the given schema, returning the result as a SchemaRDD.
jsonFile(String, double) - Method in class org.apache.spark.sql.SQLContext: :: Experimental ::
jsonRDD(JavaRDD<String>) - Method in class org.apache.spark.sql.api.java.JavaSQLContext: Loads an RDD[String] storing JSON objects (one object per record), returning the result as a JavaSchemaRDD.
jsonRDD(JavaRDD<String>, StructType) - Method in class org.apache.spark.sql.api.java.JavaSQLContext: :: Experimental :: Loads an RDD[String] storing JSON objects (one object per record) and applies the given schema, returning the result as a JavaSchemaRDD.
jsonRDD(RDD<String>) - Method in class org.apache.spark.sql.SQLContext: Loads an RDD[String] storing JSON objects (one object per record), returning the result as a SchemaRDD.
jsonRDD(RDD<String>, StructType) - Method in class org.apache.spark.sql.SQLContext: :: Experimental :: Loads an RDD[String] storing JSON objects (one object per record) and applies the given schema, returning the result as a SchemaRDD.
jsonRDD(RDD<String>, double) - Method in class org.apache.spark.sql.SQLContext: :: Experimental ::
jvmInformation() - Method in class org.apache.spark.ui.env.EnvironmentListener

K

k() - Method in class org.apache.spark.mllib.clustering.KMeansModel: Total number of clusters.
K_MEANS_PARALLEL() - Static method in class org.apache.spark.mllib.clustering.KMeans
KafkaUtils - Class in org.apache.spark.streaming.kafka
KafkaUtils() - Constructor for class org.apache.spark.streaming.kafka.KafkaUtils
kClassTag() - Method in class org.apache.spark.api.java.JavaHadoopRDD
kClassTag() - Method in class org.apache.spark.api.java.JavaNewHadoopRDD
kClassTag() - Method in class org.apache.spark.api.java.JavaPairRDD
kClassTag() - Method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
kClassTag() - Method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
key() - Method in class org.apache.spark.sql.execution.SetCommand
keyBy(Function<T, K>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Creates tuples of the elements in this RDD by applying f.
keyBy(Function1<T, K>) - Method in class org.apache.spark.rdd.RDD: Creates tuples of the elements in this RDD by applying f.
keyOrdering() - Method in class org.apache.spark.ShuffleDependency
keys() - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD with the keys of each tuple.
keys() - Method in class org.apache.spark.rdd.PairRDDFunctions: Return an RDD with the keys of each tuple.
kFold(RDD<T>, int, int, ClassTag<T>) - Static method in class org.apache.spark.mllib.util.MLUtils: :: Experimental :: Return a k-element array of pairs of RDDs, where the first element of each pair is the training data (the complement of the validation data) and the second element is the validation data, containing a unique 1/kth of the data.
KinesisUtils - Class in org.apache.spark.streaming.kinesis: :: Experimental :: Helper class to create Amazon Kinesis Input Stream.
KinesisUtils() - Constructor for class org.apache.spark.streaming.kinesis.KinesisUtils
kManifest() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
KMeans - Class in org.apache.spark.mllib.clustering: K-means clustering with support for multiple parallel runs and a k-means++ like initialization mode (the k-means|| algorithm by Bahmani et al.).
KMeans() - Constructor for class org.apache.spark.mllib.clustering.KMeans: Constructs a KMeans instance with default parameters: {k: 2, maxIterations: 20, runs: 1, initializationMode: "k-means||", initializationSteps: 5, epsilon: 1e-4}.
KMeansDataGenerator - Class in org.apache.spark.mllib.util: :: DeveloperApi :: Generate test data for KMeans.
KMeansDataGenerator() - Constructor for class org.apache.spark.mllib.util.KMeansDataGenerator
KMeansModel - Class in org.apache.spark.mllib.clustering: A clustering model for K-means.
KMeansModel(Vector[]) - Constructor for class org.apache.spark.mllib.clustering.KMeansModel
KryoRegistrator - Interface in org.apache.spark.serializer: Interface implemented by clients to register their classes with Kryo when using Kryo serialization.
KryoSerializer - Class in org.apache.spark.serializer: A Spark serializer that uses the Kryo serialization library.
KryoSerializer(SparkConf) - Constructor for class org.apache.spark.serializer.KryoSerializer

L

L1Updater - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: Updater for L1 regularized problems.
L1Updater() - Constructor for class org.apache.spark.mllib.optimization.L1Updater
label() - Method in class org.apache.spark.mllib.regression.LabeledPoint
LabeledPoint - Class in org.apache.spark.mllib.regression: Class that represents the features and labels of a data point.
LabeledPoint(double, Vector) - Constructor for class org.apache.spark.mllib.regression.LabeledPoint
labels() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
labels() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns the sequence of labels in ascending order
LassoModel - Class in org.apache.spark.mllib.regression: Regression model trained using Lasso.
LassoModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.LassoModel
LassoWithSGD - Class in org.apache.spark.mllib.regression: Train a regression model with L1-regularization using Stochastic Gradient Descent.
LassoWithSGD() - Constructor for class org.apache.spark.mllib.regression.LassoWithSGD: Construct a Lasso object with default parameters: {stepSize: 1.0, numIterations: 100, regParam: 1.0, miniBatchFraction: 1.0}.
lastError() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
lastErrorMessage() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
lastValidTime() - Method in class org.apache.spark.streaming.dstream.InputDStream
latestModel() - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm: Return the latest model.
launchTime() - Method in class org.apache.spark.scheduler.TaskInfo
LBFGS - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: Class used to solve an optimization problem using Limited-memory BFGS.
LBFGS(Gradient, Updater) - Constructor for class org.apache.spark.mllib.optimization.LBFGS
LeastSquaresGradient - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: Compute gradient and loss for a least-squares loss function, as used in linear regression.
LeastSquaresGradient() - Constructor for class org.apache.spark.mllib.optimization.LeastSquaresGradient
left() - Method in class org.apache.spark.sql.execution.BroadcastHashJoin
left() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
left() - Method in class org.apache.spark.sql.execution.CartesianProduct
left() - Method in class org.apache.spark.sql.execution.Except
left() - Method in interface org.apache.spark.sql.execution.HashJoin
left() - Method in class org.apache.spark.sql.execution.HashOuterJoin
left() - Method in class org.apache.spark.sql.execution.Intersect
left() - Method in class org.apache.spark.sql.execution.LeftSemiJoinBNL: The Streamed Relation
left() - Method in class org.apache.spark.sql.execution.LeftSemiJoinHash
left() - Method in class org.apache.spark.sql.execution.ShuffledHashJoin
leftImpurity() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
leftKeys() - Method in class org.apache.spark.sql.execution.BroadcastHashJoin
leftKeys() - Method in interface org.apache.spark.sql.execution.HashJoin
leftKeys() - Method in class org.apache.spark.sql.execution.HashOuterJoin
leftKeys() - Method in class org.apache.spark.sql.execution.LeftSemiJoinHash
leftKeys() - Method in class org.apache.spark.sql.execution.ShuffledHashJoin
leftNode() - Method in class org.apache.spark.mllib.tree.model.Node
leftOuterJoin(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Perform a left outer join of this and other.
leftOuterJoin(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD: Perform a left outer join of this and other.
leftOuterJoin(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Perform a left outer join of this and other.
leftOuterJoin(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: Perform a left outer join of this and other.
leftOuterJoin(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Perform a left outer join of this and other.
leftOuterJoin(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: Perform a left outer join of this and other.
leftOuterJoin(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftOuterJoin(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftOuterJoin(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftOuterJoin(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftOuterJoin(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftOuterJoin(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
LeftSemiJoinBNL - Class in org.apache.spark.sql.execution: :: DeveloperApi :: Uses BroadcastNestedLoopJoin to calculate the left semi join result when there are no join keys for a hash join.
LeftSemiJoinBNL(SparkPlan, SparkPlan, Option<Expression>) - Constructor for class org.apache.spark.sql.execution.LeftSemiJoinBNL
LeftSemiJoinHash - Class in org.apache.spark.sql.execution: :: DeveloperApi :: Build the right table's join keys into a HashSet, then iterate through the left table to find whether its join keys are in the HashSet.
LeftSemiJoinHash(Seq<Expression>, Seq<Expression>, SparkPlan, SparkPlan) - Constructor for class org.apache.spark.sql.execution.LeftSemiJoinHash
length() - Method in class org.apache.spark.scheduler.SplitInfo
length() - Method in class org.apache.spark.sql.api.java.Row: Returns the number of columns present in this Row.
length() - Method in class org.apache.spark.util.Vector
Limit - Class in org.apache.spark.sql.execution: :: DeveloperApi :: Take the first limit elements.
Limit(int, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Limit
limit() - Method in class org.apache.spark.sql.execution.Limit
limit() - Method in class org.apache.spark.sql.execution.TakeOrdered
limit(Expression) - Method in class org.apache.spark.sql.SchemaRDD
limit(int) - Method in class org.apache.spark.sql.SchemaRDD: Limits the results by the given integer.
LinearDataGenerator - Class in org.apache.spark.mllib.util: :: DeveloperApi :: Generate sample data used for Linear Data.
LinearDataGenerator() - Constructor for class org.apache.spark.mllib.util.LinearDataGenerator
LinearRegressionModel - Class in org.apache.spark.mllib.regression: Regression model trained using LinearRegression.
LinearRegressionModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.LinearRegressionModel
LinearRegressionWithSGD - Class in org.apache.spark.mllib.regression: Train a linear regression model with no regularization using Stochastic Gradient Descent.
LinearRegressionWithSGD() - Constructor for class org.apache.spark.mllib.regression.LinearRegressionWithSGD: Construct a LinearRegression object with default parameters: {stepSize: 1.0, numIterations: 100, miniBatchFraction: 1.0}.
listenerBus() - Method in class org.apache.spark.SparkContext
loadLabeledData(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils: Deprecated. Should use RDD.saveAsTextFile(java.lang.String) for saving and MLUtils.loadLabeledPoints(org.apache.spark.SparkContext, java.lang.String, int) for loading.
loadLabeledPoints(SparkContext, String, int) - Static method in class org.apache.spark.mllib.util.MLUtils: Loads labeled points saved using RDD[LabeledPoint].saveAsTextFile.
loadLabeledPoints(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils: Loads labeled points saved using RDD[LabeledPoint].saveAsTextFile with the default number of partitions.
loadLibSVMFile(SparkContext, String, int, int) - Static method in class org.apache.spark.mllib.util.MLUtils: Loads labeled data in the LIBSVM format into an RDD[LabeledPoint].
loadLibSVMFile(SparkContext, String, boolean, int, int) - Static method in class org.apache.spark.mllib.util.MLUtils
loadLibSVMFile(SparkContext, String, int) - Static method in class org.apache.spark.mllib.util.MLUtils: Loads labeled data in the LIBSVM format into an RDD[LabeledPoint], with the default number of partitions.
loadLibSVMFile(SparkContext, String, boolean, int) - Static method in class org.apache.spark.mllib.util.MLUtils
loadLibSVMFile(SparkContext, String, boolean) - Static method in class org.apache.spark.mllib.util.MLUtils
loadLibSVMFile(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils: Loads binary labeled data in the LIBSVM format into an RDD[LabeledPoint], with number of features determined automatically and the default number of partitions.
loadTestTable(String) - Method in class org.apache.spark.sql.hive.test.TestHiveContext
loadVectors(SparkContext, String, int) - Static method in class org.apache.spark.mllib.util.MLUtils: Loads vectors saved using RDD[Vector].saveAsTextFile.
loadVectors(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils: Loads vectors saved using RDD[Vector].saveAsTextFile with the default number of partitions.
LocalHiveContext - Class in org.apache.spark.sql.hive: DEPRECATED: Use HiveContext instead.
LocalHiveContext(SparkContext) - Constructor for class org.apache.spark.sql.hive.LocalHiveContext
localValue() - Method in class org.apache.spark.Accumulable: Get the current value of this accumulator from within a task.
location() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
log() - Method in interface org.apache.spark.Logging
log_() - Method in interface org.apache.spark.Logging
logDebug(Function0<String>) - Method in interface org.apache.spark.Logging
logDebug(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
logDirName() - Method in class org.apache.spark.scheduler.JobLogger
logError(Function0<String>) - Method in interface org.apache.spark.Logging
logError(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
Logging - Interface in org.apache.spark: :: DeveloperApi :: Utility trait for classes that want to log data.
logicalPlan() - Method in class org.apache.spark.sql.execution.ExplainCommand
logicalPlanToSparkQuery(LogicalPlan) - Method in class org.apache.spark.sql.SQLContext: :: DeveloperApi :: Allows catalyst LogicalPlans to be executed as a SchemaRDD.
logInfo(Function0<String>) - Method in interface org.apache.spark.Logging
logInfo(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
LogisticGradient - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: Compute gradient and loss for a logistic loss function, as used in binary classification.
LogisticGradient() - Constructor for class org.apache.spark.mllib.optimization.LogisticGradient
LogisticRegressionDataGenerator - Class in org.apache.spark.mllib.util: :: DeveloperApi :: Generate test data for LogisticRegression.
LogisticRegressionDataGenerator() - Constructor for class org.apache.spark.mllib.util.LogisticRegressionDataGenerator
LogisticRegressionModel - Class in org.apache.spark.mllib.classification: Classification model trained using Logistic Regression.
LogisticRegressionModel(Vector, double) - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionModel
LogisticRegressionWithLBFGS - Class in org.apache.spark.mllib.classification: Train a classification model for Logistic Regression using Limited-memory BFGS.
LogisticRegressionWithLBFGS() - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
LogisticRegressionWithSGD - Class in org.apache.spark.mllib.classification: Train a classification model for Logistic Regression using Stochastic Gradient Descent.
LogisticRegressionWithSGD() - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionWithSGD: Construct a LogisticRegression object with default parameters
logName() - Method in interface org.apache.spark.Logging
logTrace(Function0<String>) - Method in interface org.apache.spark.Logging
logTrace(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
logWarning(Function0<String>) - Method in interface org.apache.spark.Logging
logWarning(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
longToLongWritable(long) - Static method in class org.apache.spark.SparkContext
LongType - Static variable in class org.apache.spark.sql.api.java.DataType: Gets the LongType object.
LongType - Class in org.apache.spark.sql.api.java: The data type representing long and Long values.
longWritableConverter() - Static method in class org.apache.spark.SparkContext
lookup(K) - Method in class org.apache.spark.api.java.JavaPairRDD: Return the list of values in the RDD for the given key.
lookup(K) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return the list of values in the RDD for the given key.
low() - Method in class org.apache.spark.partial.BoundedDouble
LZ4CompressionCodec - Class in org.apache.spark.io: :: DeveloperApi :: LZ4 implementation of CompressionCodec.
LZ4CompressionCodec(SparkConf) - Constructor for class org.apache.spark.io.LZ4CompressionCodec
LZFCompressionCodec - Class in org.apache.spark.io: :: DeveloperApi :: LZF implementation of CompressionCodec.
LZFCompressionCodec(SparkConf) - Constructor for class org.apache.spark.io.LZFCompressionCodec
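
The loadLibSVMFile entries above read text in the LIBSVM format, where each line holds a label followed by space-separated index:value pairs with one-based indices. The sketch below illustrates only that line layout in plain Java; it is not the MLUtils implementation. The class LibSvmLineSketch, its fields, and the numFeatures parameter are hypothetical names chosen for this example.

```java
import java.util.Arrays;

/** Hypothetical helper: parses one LIBSVM-formatted line into a label and a dense feature array. */
final class LibSvmLineSketch {
    final double label;
    final double[] features;

    private LibSvmLineSketch(double label, double[] features) {
        this.label = label;
        this.features = features;
    }

    /**
     * Parses a line of the form "label index1:value1 index2:value2 ...".
     * Indices in the text are one-based; they are shifted to zero-based array positions.
     * Features that do not appear on the line stay at 0.0.
     */
    static LibSvmLineSketch parse(String line, int numFeatures) {
        String[] tokens = line.trim().split("\\s+");
        double label = Double.parseDouble(tokens[0]);
        double[] features = new double[numFeatures];
        for (int i = 1; i < tokens.length; i++) {
            String[] pair = tokens[i].split(":");
            int index = Integer.parseInt(pair[0]) - 1; // one-based to zero-based
            features[index] = Double.parseDouble(pair[1]);
        }
        return new LibSvmLineSketch(label, features);
    }

    public static void main(String[] args) {
        LibSvmLineSketch point = parse("1 1:0.5 3:2.0", 4);
        System.out.println(point.label + " " + Arrays.toString(point.features));
    }
}
```

The sketch takes the feature count as an argument. The two-argument loadLibSVMFile(SparkContext, String) entry instead describes the number of features as determined automatically.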

M

main(String[]) - Static method in class org.apache.spark.examples.streaming.JavaKinesisWordCountASL
main(String[]) - Static method in class org.apache.spark.mllib.util.KMeansDataGenerator
main(String[]) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
main(String[]) - Static method in class org.apache.spark.mllib.util.LogisticRegressionDataGenerator
main(String[]) - Static method in class org.apache.spark.mllib.util.MFDataGenerator
main(String[]) - Static method in class org.apache.spark.mllib.util.SVMDataGenerator
makeCopy(Object[]) - Method in class org.apache.spark.sql.execution.SparkPlan: Overridden makeCopy that also propagates sqlContext to the copied plan.
makeRDD(Seq<T>, int, ClassTag<T>) - Method in class org.apache.spark.SparkContext: Distribute a local Scala collection to form an RDD.
makeRDD(Seq<Tuple2<T, Seq<String>>>, ClassTag<T>) - Method in class org.apache.spark.SparkContext: Distribute a local Scala collection to form an RDD, with one or more location preferences (hostnames of Spark nodes) for each object.
map(Function<T, R>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to all elements of this RDD.
map(Function1<R, T>) - Method in class org.apache.spark.partial.PartialResult: Transform this PartialResult into a PartialResult of type T.
map(Function1<T, U>, ClassTag) - Method in class org.apache.spark.rdd.RDD: Return a new RDD by applying a function to all elements of this RDD.
map(Function<T, R>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream by applying a function to all elements of this DStream.
map(Function1<T, U>, ClassTag) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream by applying a function to all elements of this DStream.
mapId() - Method in class org.apache.spark.FetchFailed
mapId() - Method in class org.apache.spark.storage.ShuffleBlockId
mapId() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
mapOutputTracker() - Method in class org.apache.spark.SparkEnv
mapPartitions(FlatMapFunction<Iterator<T>, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to each partition of this RDD.
mapPartitions(FlatMapFunction<Iterator<T>, U>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to each partition of this RDD.
mapPartitions(Function1<Iterator<T>, Iterator>, boolean, ClassTag) - Method in class org.apache.spark.rdd.RDD: Return a new RDD by applying a function to each partition of this RDD.
mapPartitions(FlatMapFunction<Iterator<T>, U>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDD of this DStream.
mapPartitions(Function1<Iterator<T>, Iterator>, boolean, ClassTag) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDD of this DStream.
mapPartitionsToDouble(DoubleFlatMapFunction<Iterator<T>>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to each partition of this RDD.
mapPartitionsToDouble(DoubleFlatMapFunction<Iterator<T>>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to each partition of this RDD.
mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to each partition of this RDD.
mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to each partition of this RDD.
mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDD of this DStream.
mapPartitionsWithContext(Function2<TaskContext, Iterator<T>, Iterator>, boolean, ClassTag) - Method in class org.apache.spark.rdd.RDD: :: DeveloperApi :: Return a new RDD by applying a function to each partition of this RDD.
mapPartitionsWithIndex(Function2<Integer, Iterator<T>, Iterator<R>>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to each partition of this RDD, while tracking the index of the original partition.
mapPartitionsWithIndex(Function2<Object, Iterator<T>, Iterator>, boolean, ClassTag) - Method in class org.apache.spark.rdd.RDD: Return a new RDD by applying a function to each partition of this RDD, while tracking the index of the original partition.
mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator<R>>, boolean) - Method in class org.apache.spark.api.java.JavaHadoopRDD: Maps over a partition, providing the InputSplit that was used as the base of the partition.
mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator<R>>, boolean) - Method in class org.apache.spark.api.java.JavaNewHadoopRDD: Maps over a partition, providing the InputSplit that was used as the base of the partition.
mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator>, boolean, ClassTag) - Method in class org.apache.spark.rdd.HadoopRDD: Maps over a partition, providing the InputSplit that was used as the base of the partition.
mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator>, boolean, ClassTag) - Method in class org.apache.spark.rdd.NewHadoopRDD: Maps over a partition, providing the InputSplit that was used as the base of the partition.
mapPartitionsWithSplit(Function2<Object, Iterator<T>, Iterator>, boolean, ClassTag) - Method in class org.apache.spark.rdd.RDD: Return a new RDD by applying a function to each partition of this RDD, while tracking the index of the original partition.
mapredInputFormat() - Method in class org.apache.spark.scheduler.InputFormatInfo
mapreduceInputFormat() - Method in class org.apache.spark.scheduler.InputFormatInfo
mapSideCombine() - Method in class org.apache.spark.ShuffleDependency
mapToDouble(DoubleFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to all elements of this RDD.
mapToPair(PairFunction<T, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to all elements of this RDD.
mapToPair(PairFunction<T, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream by applying a function to all elements of this DStream.
MapType - Class in org.apache.spark.sql.api.java: The data type representing Maps.
mapValues(Function<V, U>) - Method in class org.apache.spark.api.java.JavaPairRDD: Pass each value in the key-value pair RDD through a map function without changing the keys; this also retains the original RDD's partitioning.
mapValues(Function1<V, U>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Pass each value in the key-value pair RDD through a map function without changing the keys; this also retains the original RDD's partitioning.
mapValues(Function<V, U>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying a map function to the value of each key-value pair in 'this' DStream without changing the key.
mapValues(Function1<V, U>, ClassTag) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying a map function to the value of each key-value pair in 'this' DStream without changing the key.
mapWith(Function1<Object, A>, boolean, Function2<T, A, U>, ClassTag) - Method in class org.apache.spark.rdd.RDD: Maps f over this RDD, where f takes an additional parameter of type A.
master() - Method in class org.apache.spark.api.java.JavaSparkContext
master() - Method in class org.apache.spark.SparkContext
Matrices - Class in org.apache.spark.mllib.linalg: Factory methods for Matrix.
Matrices() - Constructor for class org.apache.spark.mllib.linalg.Matrices
Matrix - Interface in org.apache.spark.mllib.linalg: Trait for a local matrix.
MatrixEntry - Class in org.apache.spark.mllib.linalg.distributed: :: Experimental :: Represents an entry in a distributed matrix.
MatrixEntry(long, long, double) - Constructor for class org.apache.spark.mllib.linalg.distributed.MatrixEntry
MatrixFactorizationModel - Class in org.apache.spark.mllib.recommendation: Model representing the result of matrix factorization.
max(Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Returns the maximum element from this RDD as defined by the specified Comparator[T].
max() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
max() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary: Maximum value of each column.
max(Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Returns the max of this RDD as defined by the implicit Ordering[T].
max(Duration) - Method in class org.apache.spark.streaming.Duration
max(Time) - Method in class org.apache.spark.streaming.Time
max() - Method in class org.apache.spark.util.StatCounter
maxBins() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
maxDepth() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
maxMem() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
maxMem() - Method in class org.apache.spark.storage.StorageStatus
maxMemoryInMB() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
mean() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Compute the mean of this RDD's elements.
mean() - Method in class org.apache.spark.mllib.feature.StandardScalerModel
mean() - Method in class org.apache.spark.mllib.random.PoissonGenerator
mean() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
mean() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary: Sample mean vector.
mean() - Method in class org.apache.spark.partial.BoundedDouble
mean() - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Compute the mean of this RDD's elements.
mean() - Method in class org.apache.spark.util.StatCounter
meanApprox(long, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return the approximate mean of the elements in this RDD.
meanApprox(long) - Method in class org.apache.spark.api.java.JavaDoubleRDD: :: Experimental :: Approximate operation to return the mean within a timeout.
meanApprox(long, double) - Method in class org.apache.spark.rdd.DoubleRDDFunctions: :: Experimental :: Approximate operation to return the mean within a timeout.
MEMORY_AND_DISK - Static variable in class org.apache.spark.api.java.StorageLevels
MEMORY_AND_DISK() - Static method in class org.apache.spark.storage.StorageLevel
MEMORY_AND_DISK_2 - Static variable in class org.apache.spark.api.java.StorageLevels
MEMORY_AND_DISK_2() - Static method in class org.apache.spark.storage.StorageLevel
MEMORY_AND_DISK_SER - Static variable in class org.apache.spark.api.java.StorageLevels
MEMORY_AND_DISK_SER() - Static method in class org.apache.spark.storage.StorageLevel
MEMORY_AND_DISK_SER_2 - Static variable in class org.apache.spark.api.java.StorageLevels
MEMORY_AND_DISK_SER_2() - Static method in class org.apache.spark.storage.StorageLevel
MEMORY_ONLY - Static variable in class org.apache.spark.api.java.StorageLevels
MEMORY_ONLY() - Static method in class org.apache.spark.storage.StorageLevel
MEMORY_ONLY_2 - Static variable in class org.apache.spark.api.java.StorageLevels
MEMORY_ONLY_2() - Static method in class org.apache.spark.storage.StorageLevel
MEMORY_ONLY_SER - Static variable in class org.apache.spark.api.java.StorageLevels
MEMORY_ONLY_SER() - Static method in class org.apache.spark.storage.StorageLevel
MEMORY_ONLY_SER_2 - Static variable in class org.apache.spark.api.java.StorageLevels
MEMORY_ONLY_SER_2() - Static method in class org.apache.spark.storage.StorageLevel
memRemaining() - Method in class org.apache.spark.storage.StorageStatus: Return the memory remaining in this block manager.
memSize() - Method in class org.apache.spark.storage.BlockStatus
memSize() - Method in class org.apache.spark.storage.RDDInfo
memUsed() - Method in class org.apache.spark.storage.StorageStatus: Return the memory used by this block manager.
memUsedByRdd(int) - Method in class org.apache.spark.storage.StorageStatus: Return the memory used by the given RDD in this block manager in O(1) time.
merge(R) - Method in class org.apache.spark.Accumulable: Merge two accumulable objects together
merge(IDF.DocumentFrequencyAggregator) - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator: Merges another.
merge(MultivariateOnlineSummarizer) - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer: Merge another MultivariateOnlineSummarizer, and update the statistical summary.
merge(double) - Method in class org.apache.spark.util.StatCounter: Add a value into this StatCounter, updating the internal statistics.
merge(TraversableOnce<Object>) - Method in class org.apache.spark.util.StatCounter: Add multiple values into this StatCounter, updating the internal statistics.
merge(StatCounter) - Method in class org.apache.spark.util.StatCounter: Merge another StatCounter into this one, adding up the internal statistics.
mergeCombiners() - Method in class org.apache.spark.Aggregator
mergeValue() - Method in class org.apache.spark.Aggregator
metadataCleaner() - Method in class org.apache.spark.SparkContext
metastorePath() - Method in class org.apache.spark.sql.hive.LocalHiveContext
metastorePath() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
method() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
metrics() - Method in class org.apache.spark.ExceptionFailure
metricsSystem() - Method in class org.apache.spark.SparkEnv
MFDataGenerator - Class in org.apache.spark.mllib.util: :: DeveloperApi :: Generate RDD(s) containing data for Matrix Factorization.
MFDataGenerator() - Constructor for class org.apache.spark.mllib.util.MFDataGenerator
milliseconds() - Method in class org.apache.spark.streaming.Duration
Milliseconds - Class in org.apache.spark.streaming: Helper object that creates instance of Duration representing a given number of milliseconds.
Milliseconds() - Constructor for class org.apache.spark.streaming.Milliseconds
milliseconds() - Method in class org.apache.spark.streaming.Time
millisToString(long) - Static method in class org.apache.spark.scheduler.StatsReportListener: Reformat a time interval in milliseconds to a prettier format for output
min(Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Returns the minimum element from this RDD as defined by the specified Comparator[T].
min() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
min() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary: Minimum value of each column.
min(Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Returns the min of this RDD as defined by the implicit Ordering[T].
min(Duration) - Method in class org.apache.spark.streaming.Duration
min(Time) - Method in class org.apache.spark.streaming.Time
min() - Method in class org.apache.spark.util.StatCounter
MinMax() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
minutes() - Static method in class org.apache.spark.scheduler.StatsReportListener
Minutes - Class in org.apache.spark.streaming: Helper object that creates instance of Duration representing a given number of minutes.
Minutes() - Constructor for class org.apache.spark.streaming.Minutes
MLUtils - Class in org.apache.spark.mllib.util: Helper methods to load, save and pre-process data used in MLlib.
MLUtils() - Constructor for class org.apache.spark.mllib.util.MLUtils
model() - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
MQTTUtils - Class in org.apache.spark.streaming.mqtt
MQTTUtils() - Constructor for class org.apache.spark.streaming.mqtt.MQTTUtils
MulticlassMetrics - Class in org.apache.spark.mllib.evaluation: :: Experimental :: Evaluator for multiclass classification.
MulticlassMetrics(RDD<Tuple2<Object, Object>>) - Constructor for class org.apache.spark.mllib.evaluation.MulticlassMetrics
multiply(Matrix) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix: Multiply this matrix by a local matrix on the right.
multiply(Matrix) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Multiply this matrix by a local matrix on the right.
multiply(double) - Method in class org.apache.spark.util.Vector
MultivariateOnlineSummarizer - Class in org.apache.spark.mllib.stat: :: DeveloperApi :: MultivariateOnlineSummarizer implements MultivariateStatisticalSummary to compute the mean, variance, minimum, maximum, counts, and nonzero counts for samples in sparse or dense vector format in an online fashion.
MultivariateOnlineSummarizer() - Constructor for class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
MultivariateStatisticalSummary - Interface in org.apache.spark.mllib.stat: Trait for multivariate statistical summary of a data matrix.
mustCheckpoint() - Method in class org.apache.spark.streaming.dstream.DStream
MutablePair<T1,T2> - Class in org.apache.spark.util: :: DeveloperApi :: A tuple of 2 elements.
MutablePair(T1, T2) - Constructor for class org.apache.spark.util.MutablePair
MutablePair() - Constructor for class org.apache.spark.util.MutablePair: No-arg constructor for serialization
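
The three StatCounter merge entries above describe adding a single value, adding several values, and combining two counters. One common way to support all of these without storing the data is to keep a count, a running mean, and a sum of squared deviations, then combine those three numbers pairwise. The plain-Java sketch below shows that technique under those assumptions; RunningStatsSketch and its members are hypothetical names, and the code is not taken from Spark's StatCounter.

```java
/** Hypothetical running-statistics accumulator illustrating mergeable count, mean and variance. */
final class RunningStatsSketch {
    private long n = 0;
    private double mu = 0.0;
    private double m2 = 0.0; // sum of squared deviations from the current mean

    /** Adds one value using an incremental (Welford-style) update. */
    RunningStatsSketch merge(double value) {
        double delta = value - mu;
        n += 1;
        mu += delta / n;
        m2 += delta * (value - mu);
        return this;
    }

    /** Merges another accumulator into this one by combining count, mean and m2. */
    RunningStatsSketch merge(RunningStatsSketch other) {
        if (other.n == 0) {
            return this;
        }
        if (n == 0) {
            n = other.n;
            mu = other.mu;
            m2 = other.m2;
            return this;
        }
        double delta = other.mu - mu;
        long total = n + other.n;
        // m2 must be updated before n is overwritten, because it uses the old count.
        m2 += other.m2 + delta * delta * n * other.n / total;
        mu += delta * other.n / total;
        n = total;
        return this;
    }

    long count() { return n; }

    double mean() { return mu; }

    /** Population variance; NaN when no values have been added. */
    double variance() { return n == 0 ? Double.NaN : m2 / n; }

    public static void main(String[] args) {
        RunningStatsSketch left = new RunningStatsSketch().merge(1.0).merge(2.0);
        RunningStatsSketch right = new RunningStatsSketch().merge(3.0).merge(4.0);
        left.merge(right);
        System.out.println(left.count() + " " + left.mean() + " " + left.variance());
    }
}
```

Because the merged state depends only on the three summary numbers, partial results computed on separate partitions can be combined in any order, which is the property a distributed summary needs.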

N

NaiveBayes - Class in org.apache.spark.mllib.classification: Trains a Naive Bayes model given an RDD of (label, features) pairs.
NaiveBayes() - Constructor for class org.apache.spark.mllib.classification.NaiveBayes
NaiveBayesModel - Class in org.apache.spark.mllib.classification: Model for Naive Bayes Classifiers.
name() - Method in class org.apache.spark.Accumulable
name() - Method in interface org.apache.spark.api.java.JavaRDDLike
name() - Method in class org.apache.spark.rdd.RDD: A friendly name for this RDD
name() - Method in class org.apache.spark.scheduler.AccumulableInfo
name() - Method in class org.apache.spark.scheduler.StageInfo
name() - Method in class org.apache.spark.sql.hive.test.TestHiveContext.TestTable
name() - Method in class org.apache.spark.storage.BlockId: A globally unique identifier for this Block.
name() - Method in class org.apache.spark.storage.BroadcastBlockId
name() - Method in class org.apache.spark.storage.RDDBlockId
name() - Method in class org.apache.spark.storage.RDDInfo
name() - Method in class org.apache.spark.storage.ShuffleBlockId
name() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
name() - Method in class org.apache.spark.storage.StreamBlockId
name() - Method in class org.apache.spark.storage.TaskResultBlockId
name() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
NarrowDependency<T> - Class in org.apache.spark: :: DeveloperApi :: Base class for dependencies where each partition of the child RDD depends on a small number of partitions of the parent RDD.
NarrowDependency(RDD<T>) - Constructor for class org.apache.spark.NarrowDependency
NativeCommand - Class in org.apache.spark.sql.hive.execution: :: DeveloperApi ::
NativeCommand(String, Seq<Attribute>, HiveContext) - Constructor for class org.apache.spark.sql.hive.execution.NativeCommand
nettyPort() - Method in class org.apache.spark.storage.BlockManagerId
networkStream(Receiver<T>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream with any arbitrary user implemented receiver.
newAPIHadoopFile(String, Class<F>, Class<K>, Class<V>, Configuration) - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD for a given Hadoop file with an arbitrary new API InputFormat and extra configuration options to pass to the input format.
newAPIHadoopFile(String, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.SparkContext: Get an RDD for a Hadoop file with an arbitrary new API InputFormat.
newAPIHadoopFile(String, Class<F>, Class<K>, Class<V>, Configuration) - Method in class org.apache.spark.SparkContext: Get an RDD for a given Hadoop file with an arbitrary new API InputFormat and extra configuration options to pass to the input format.
newAPIHadoopRDD(Configuration, Class<F>, Class<K>, Class<V>) - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD for a given Hadoop file with an arbitrary new API InputFormat and extra configuration options to pass to the input format.
newAPIHadoopRDD(Configuration, Class<F>, Class<K>, Class<V>) - Method in class org.apache.spark.SparkContext: Get an RDD for a given Hadoop file with an arbitrary new API InputFormat and extra configuration options to pass to the input format.
newBroadcast(T, boolean, long, ClassTag<T>) - Method in interface org.apache.spark.broadcast.BroadcastFactory: Creates a new broadcast variable.
newBroadcast(T, boolean, long, ClassTag<T>) - Method in class org.apache.spark.broadcast.HttpBroadcastFactory
newBroadcast(T, boolean, long, ClassTag<T>) - Method in class org.apache.spark.broadcast.TorrentBroadcastFactory
NewHadoopRDD<K,V> - Class in org.apache.spark.rdd: :: DeveloperApi :: An RDD that provides core functionality for reading data stored in Hadoop (e.g., files in HDFS, sources in HBase, or S3), using the new MapReduce API (org.apache.hadoop.mapreduce).
NewHadoopRDD(SparkContext, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, Configuration) - Constructor for class org.apache.spark.rdd.NewHadoopRDD
newInstance() - Method in class org.apache.spark.serializer.JavaSerializer
newInstance() - Method in class org.apache.spark.serializer.KryoSerializer
newInstance() - Method in class org.apache.spark.serializer.Serializer: Creates a new SerializerInstance.
newInstance() - Method in class org.apache.spark.sql.execution.SparkLogicalPlan
newKryo() - Method in class org.apache.spark.serializer.KryoSerializer
newKryoOutput() - Method in class org.apache.spark.serializer.KryoSerializer
newPartitioning() - Method in class org.apache.spark.sql.execution.Exchange
next() - Method in class org.apache.spark.InterruptibleIterator
nextValue() - Method in class org.apache.spark.mllib.random.PoissonGenerator
nextValue() - Method in interface org.apache.spark.mllib.random.RandomDataGenerator: Returns an i.i.d. sample from an underlying distribution.
nextValue() - Method in class org.apache.spark.mllib.random.StandardNormalGenerator
nextValue() - Method in class org.apache.spark.mllib.random.UniformGenerator
NO_PREF() - Static method in class org.apache.spark.scheduler.TaskLocality
Node - Class in org.apache.spark.mllib.tree.model: :: DeveloperApi :: Node in a decision tree
Node(int, double, boolean, Option<Split>, Option<Node>, Option<Node>, Option<InformationGainStats>) - Constructor for class org.apache.spark.mllib.tree.model.Node
NODE_LOCAL() - Static method in class org.apache.spark.scheduler.TaskLocality
NONE - Static variable in class org.apache.spark.api.java.StorageLevels
NONE() - Static method in class org.apache.spark.scheduler.SchedulingMode
NONE() - Static method in class org.apache.spark.storage.StorageLevel
Normalizer - Class in org.apache.spark.mllib.feature: :: Experimental :: Normalizes samples individually to unit L^p norm
Normalizer(double) - Constructor for class org.apache.spark.mllib.feature.Normalizer
Normalizer() - Constructor for class org.apache.spark.mllib.feature.Normalizer
normalJavaRDD(JavaSparkContext, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Java-friendly version of RandomRDDs.normalRDD(org.apache.spark.SparkContext, long, int, long).
normalJavaRDD(JavaSparkContext, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.normalJavaRDD(org.apache.spark.api.java.JavaSparkContext, long, int, long) with the default seed.
normalJavaRDD(JavaSparkContext, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.normalJavaRDD(org.apache.spark.api.java.JavaSparkContext, long, int, long) with the default number of partitions and the default seed.
normalJavaVectorRDD(JavaSparkContext, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Java-friendly version of RandomRDDs.normalVectorRDD(org.apache.spark.SparkContext, long, int, int, long).
normalJavaVectorRDD(JavaSparkContext, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.normalJavaVectorRDD(org.apache.spark.api.java.JavaSparkContext, long, int, int, long) with the default seed.
normalJavaVectorRDD(JavaSparkContext, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.normalJavaVectorRDD(org.apache.spark.api.java.JavaSparkContext, long, int, int, long) with the default number of partitions and the default seed.
normalOutput() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
normalRDD(SparkContext, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Generates an RDD comprised of i.i.d. samples from the standard normal distribution.
normalVectorRDD(SparkContext, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the standard normal distribution.
nullHypothesis() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
nullHypothesis() - Method in interface org.apache.spark.mllib.stat.test.TestResult: Null hypothesis of the test.
numberOfHiccups() - Method in class org.apache.spark.streaming.receiver.Statistics
numberOfMsgs() - Method in class org.apache.spark.streaming.receiver.Statistics
numberOfWorkers() - Method in class org.apache.spark.streaming.receiver.Statistics
numBlocks() - Method in class org.apache.spark.storage.StorageStatus: Return the number of blocks stored in this block manager in O(RDDs) time.
numCachedPartitions() - Method in class org.apache.spark.storage.RDDInfo
numClassesForClassification() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
numCols() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
numCols() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix: Gets or computes the number of columns.
numCols() - Method in interface org.apache.spark.mllib.linalg.distributed.DistributedMatrix: Gets or computes the number of columns.
numCols() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
numCols() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Gets or computes the number of columns.
numCols() - Method in interface org.apache.spark.mllib.linalg.Matrix: Number of columns.
numericRDDToDoubleRDDFunctions(RDD<T>, Numeric<T>) - Static method in class org.apache.spark.SparkContext
numFeatures() - Method in class org.apache.spark.mllib.feature.HashingTF
numInLinks() - Method in class org.apache.spark.mllib.recommendation.ALS.BlockStats
numNodes() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel: Get number of nodes in tree, including leaf nodes.
numNonzeros() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
numNonzeros() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary: Number of nonzero elements (including explicitly presented zero values) in each column.
numOutLinks() - Method in class org.apache.spark.mllib.recommendation.ALS.BlockStats
numPartitions() - Method in class org.apache.spark.HashPartitioner
numPartitions() - Method in class org.apache.spark.Partitioner
numPartitions() - Method in class org.apache.spark.RangePartitioner
numPartitions() - Method in class org.apache.spark.storage.RDDInfo
numRatings() - Method in class org.apache.spark.mllib.recommendation.ALS.BlockStats
numRddBlocks() - Method in class org.apache.spark.storage.StorageStatus: Return the number of RDD blocks stored in this block manager in O(RDDs) time.
numRddBlocksById(int) - Method in class org.apache.spark.storage.StorageStatus: Return the number of blocks that belong to the given RDD in O(1) time.
numRows() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
numRows() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix: Gets or computes the number of rows.
numRows() - Method in interface org.apache.spark.mllib.linalg.distributed.DistributedMatrix: Gets or computes the number of rows.
numRows() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
numRows() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Gets or computes the number of rows.
numRows() - Method in interface org.apache.spark.mllib.linalg.Matrix: Number of rows.
numTasks() - Method in class org.apache.spark.scheduler.StageInfo

O

objectFile(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Load an RDD saved as a SequenceFile containing serialized objects, with NullWritable keys and BytesWritable values that contain a serialized partition.
objectFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Load an RDD saved as a SequenceFile containing serialized objects, with NullWritable keys and BytesWritable values that contain a serialized partition.
objectFile(String, int, ClassTag<T>) - Method in class org.apache.spark.SparkContext: Load an RDD saved as a SequenceFile containing serialized objects, with NullWritable keys and BytesWritable values that contain a serialized partition.
OFF_HEAP - Static variable in class org.apache.spark.api.java.StorageLevels
OFF_HEAP() - Static method in class org.apache.spark.storage.StorageLevel
offHeapUsed() - Method in class org.apache.spark.storage.StorageStatus: Return the off-heap space used by this block manager.
offHeapUsedByRdd(int) - Method in class org.apache.spark.storage.StorageStatus: Return the off-heap space used by the given RDD in this block manager in O(1) time.
onApplicationEnd(SparkListenerApplicationEnd) - Method in interface org.apache.spark.scheduler.SparkListener: Called when the application ends
onApplicationStart(SparkListenerApplicationStart) - Method in interface org.apache.spark.scheduler.SparkListener: Called when the application starts
onBatchCompleted(StreamingListenerBatchCompleted) - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
onBatchCompleted(StreamingListenerBatchCompleted) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener: Called when processing of a batch of jobs has completed.
onBatchStarted(StreamingListenerBatchStarted) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener: Called when processing of a batch of jobs has started.
onBatchSubmitted(StreamingListenerBatchSubmitted) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener: Called when a batch of jobs has been submitted for processing.
onBlockManagerAdded(SparkListenerBlockManagerAdded) - Method in interface org.apache.spark.scheduler.SparkListener: Called when a new block manager has joined
onBlockManagerAdded(SparkListenerBlockManagerAdded) - Method in class org.apache.spark.storage.StorageStatusListener
onBlockManagerAdded(SparkListenerBlockManagerAdded) - Method in class org.apache.spark.ui.jobs.JobProgressListener
onBlockManagerRemoved(SparkListenerBlockManagerRemoved) - Method in interface org.apache.spark.scheduler.SparkListener: Called when an existing block manager has been removed
onBlockManagerRemoved(SparkListenerBlockManagerRemoved) - Method in class org.apache.spark.storage.StorageStatusListener
onBlockManagerRemoved(SparkListenerBlockManagerRemoved) - Method in class org.apache.spark.ui.jobs.JobProgressListener
onComplete(Function1<Try<T>, U>, ExecutionContext) - Method in class org.apache.spark.ComplexFutureAction
onComplete(Function1<Try<T>, U>, ExecutionContext) - Method in interface org.apache.spark.FutureAction: When this action is completed, either through an exception, or a value, applies the provided function.
onComplete(Function1<R, BoxedUnit>) - Method in class org.apache.spark.partial.PartialResult: Set a handler to be called when this PartialResult completes.
onComplete(Function1<Try<T>, U>, ExecutionContext) - Method in class org.apache.spark.SimpleFutureAction
onEnvironmentUpdate(SparkListenerEnvironmentUpdate) - Method in interface org.apache.spark.scheduler.SparkListener: Called when environment properties have been updated
onEnvironmentUpdate(SparkListenerEnvironmentUpdate) - Method in class org.apache.spark.ui.env.EnvironmentListener
onEnvironmentUpdate(SparkListenerEnvironmentUpdate) - Method in class org.apache.spark.ui.jobs.JobProgressListener
ones(int) - Static method in class org.apache.spark.util.Vector
OneToOneDependency<T> - Class in org.apache.spark: :: DeveloperApi :: Represents a one-to-one dependency between partitions of the parent and child RDDs.
OneToOneDependency(RDD<T>) - Constructor for class org.apache.spark.OneToOneDependency
onExecutorMetricsUpdate(SparkListenerExecutorMetricsUpdate) - Method in interface org.apache.spark.scheduler.SparkListener: Called when the driver receives task metrics from an executor in a heartbeat.
onExecutorMetricsUpdate(SparkListenerExecutorMetricsUpdate) - Method in class org.apache.spark.ui.jobs.JobProgressListener
onFail(Function1<Exception, BoxedUnit>) - Method in class org.apache.spark.partial.PartialResult: Set a handler to be called if this PartialResult's job fails.
onJobEnd(SparkListenerJobEnd) - Method in class org.apache.spark.scheduler.JobLogger: When job ends, record job completion status and close log file
onJobEnd(SparkListenerJobEnd) - Method in interface org.apache.spark.scheduler.SparkListener: Called when a job ends
onJobStart(SparkListenerJobStart) - Method in class org.apache.spark.scheduler.JobLogger: When job starts, record job property and stage graph
onJobStart(SparkListenerJobStart) - Method in interface org.apache.spark.scheduler.SparkListener: Called when a job starts
onReceiverError(StreamingListenerReceiverError) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener: Called when a receiver has reported an error
onReceiverStarted(StreamingListenerReceiverStarted) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener: Called when a receiver has been started
onReceiverStopped(StreamingListenerReceiverStopped) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener: Called when a receiver has been stopped
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.scheduler.JobLogger: When stage is completed, record stage completion status
onStageCompleted(SparkListenerStageCompleted) - Method in interface org.apache.spark.scheduler.SparkListener: Called when a stage completes successfully or fails, with information on the completed stage.
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.scheduler.StatsReportListener
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.ui.jobs.JobProgressListener
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.ui.storage.StorageListener
onStageSubmitted(SparkListenerStageSubmitted) - Method in class org.apache.spark.scheduler.JobLogger: When stage is submitted, record stage submit info
onStageSubmitted(SparkListenerStageSubmitted) - Method in interface org.apache.spark.scheduler.SparkListener: Called when a stage is submitted
onStageSubmitted(SparkListenerStageSubmitted) - Method in class org.apache.spark.ui.jobs.JobProgressListener: For FIFO, all stages are contained by "default" pool but "default" pool here is meaningless
onStageSubmitted(SparkListenerStageSubmitted) - Method in class org.apache.spark.ui.storage.StorageListener
onStart() - Method in class org.apache.spark.streaming.receiver.Receiver: This method is called by the system when the receiver is started.
onStop() - Method in class org.apache.spark.streaming.receiver.Receiver: This method is called by the system when the receiver is stopped.
onTaskCompletion(TaskContext) - Method in interface org.apache.spark.util.TaskCompletionListener
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.scheduler.JobLogger: When task ends, record task completion status and metrics
onTaskEnd(SparkListenerTaskEnd) - Method in interface org.apache.spark.scheduler.SparkListener: Called when a task ends
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.scheduler.StatsReportListener
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.storage.StorageStatusListener
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.ui.exec.ExecutorsListener
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.ui.jobs.JobProgressListener
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.ui.storage.StorageListener: Assumes the storage status list is fully up-to-date.
onTaskGettingResult(SparkListenerTaskGettingResult) - Method in interface org.apache.spark.scheduler.SparkListener: Called when a task begins remotely fetching its result (will not be called for tasks that do not need to fetch the result remotely).
onTaskGettingResult(SparkListenerTaskGettingResult) - Method in class org.apache.spark.ui.jobs.JobProgressListener
onTaskStart(SparkListenerTaskStart) - Method in interface org.apache.spark.scheduler.SparkListener: Called when a task starts
onTaskStart(SparkListenerTaskStart) - Method in class org.apache.spark.ui.exec.ExecutorsListener
onTaskStart(SparkListenerTaskStart) - Method in class org.apache.spark.ui.jobs.JobProgressListener
onUnpersistRDD(SparkListenerUnpersistRDD) - Method in interface org.apache.spark.scheduler.SparkListener: Called when an RDD is manually unpersisted by the application
onUnpersistRDD(SparkListenerUnpersistRDD) - Method in class org.apache.spark.storage.StorageStatusListener
onUnpersistRDD(SparkListenerUnpersistRDD) - Method in class org.apache.spark.ui.storage.StorageListener
optimize(RDD<Tuple2<Object, Vector>>, Vector) - Method in class org.apache.spark.mllib.optimization.GradientDescent: :: DeveloperApi :: Runs gradient descent on the given training data.
optimize(RDD<Tuple2<Object, Vector>>, Vector) - Method in class org.apache.spark.mllib.optimization.LBFGS
optimize(RDD<Tuple2<Object, Vector>>, Vector) - Method in interface org.apache.spark.mllib.optimization.Optimizer: Solve the provided convex optimization problem.
optimizer() - Method in class org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
optimizer() - Method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
optimizer() - Method in class org.apache.spark.mllib.classification.SVMWithSGD
Optimizer - Interface in org.apache.spark.mllib.optimization: :: DeveloperApi :: Trait for optimization problem solvers.
optimizer() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm: The optimizer to solve the problem.
optimizer() - Method in class org.apache.spark.mllib.regression.LassoWithSGD
optimizer() - Method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
optimizer() - Method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
orderBy(Seq<SortOrder>) - Method in class org.apache.spark.sql.SchemaRDD: Sorts the results by the given expressions.
OrderedRDDFunctions<K,V,P extends scala.Product2<K,V>> - Class in org.apache.spark.rdd: Extra functions available on RDDs of (key, value) pairs where the key is sortable through an implicit conversion.
OrderedRDDFunctions(RDD, Ordering<K>, ClassTag<K>, ClassTag<V>, ClassTag) - Constructor for class org.apache.spark.rdd.OrderedRDDFunctions
ordering() - Method in class org.apache.spark.sql.execution.TakeOrdered
ordering() - Static method in class org.apache.spark.streaming.Time
org.apache.spark - package org.apache.spark: Core Spark classes in Scala.
org.apache.spark.annotation - package org.apache.spark.annotation: Spark annotations to mark an API experimental or intended only for advanced usages by developers.
org.apache.spark.api.java - package org.apache.spark.api.java: Spark Java programming APIs.
org.apache.spark.api.java.function - package org.apache.spark.api.java.function: Set of interfaces to represent functions in Spark's Java API.
org.apache.spark.broadcast - package org.apache.spark.broadcast: Spark's broadcast variables, used to broadcast immutable datasets to all nodes.
org.apache.spark.examples.streaming - package org.apache.spark.examples.streaming
org.apache.spark.io - package org.apache.spark.io: IO codecs used for compression.
org.apache.spark.mllib.classification - package org.apache.spark.mllib.classification
org.apache.spark.mllib.clustering - package org.apache.spark.mllib.clustering
org.apache.spark.mllib.evaluation - package org.apache.spark.mllib.evaluation
org.apache.spark.mllib.feature - package org.apache.spark.mllib.feature
org.apache.spark.mllib.linalg - package org.apache.spark.mllib.linalg
org.apache.spark.mllib.linalg.distributed - package org.apache.spark.mllib.linalg.distributed
org.apache.spark.mllib.optimization - package org.apache.spark.mllib.optimization
org.apache.spark.mllib.random - package org.apache.spark.mllib.random
org.apache.spark.mllib.recommendation - package org.apache.spark.mllib.recommendation
org.apache.spark.mllib.regression - package org.apache.spark.mllib.regression
org.apache.spark.mllib.stat - package org.apache.spark.mllib.stat
org.apache.spark.mllib.stat.test - package org.apache.spark.mllib.stat.test
org.apache.spark.mllib.tree - package org.apache.spark.mllib.tree
org.apache.spark.mllib.tree.configuration - package org.apache.spark.mllib.tree.configuration
org.apache.spark.mllib.tree.impurity - package org.apache.spark.mllib.tree.impurity
org.apache.spark.mllib.tree.model - package org.apache.spark.mllib.tree.model
org.apache.spark.mllib.util - package org.apache.spark.mllib.util
org.apache.spark.partial - package org.apache.spark.partial
org.apache.spark.rdd - package org.apache.spark.rdd: Provides implementations of various RDDs.
org.apache.spark.scheduler - package org.apache.spark.scheduler: Spark's DAG scheduler.
org.apache.spark.serializer - package org.apache.spark.serializer: Pluggable serializers for RDD and shuffle data.
org.apache.spark.sql - package org.apache.spark.sql
org.apache.spark.sql.api.java - package org.apache.spark.sql.api.java: Allows the execution of relational queries, including those expressed in SQL using Spark.
org.apache.spark.sql.execution - package org.apache.spark.sql.execution
org.apache.spark.sql.hive - package org.apache.spark.sql.hive
org.apache.spark.sql.hive.api.java - package org.apache.spark.sql.hive.api.java
org.apache.spark.sql.hive.execution - package org.apache.spark.sql.hive.execution
org.apache.spark.sql.hive.parquet - package org.apache.spark.sql.hive.parquet
org.apache.spark.sql.hive.test - package org.apache.spark.sql.hive.test
org.apache.spark.sql.parquet - package org.apache.spark.sql.parquet
org.apache.spark.sql.test - package org.apache.spark.sql.test
org.apache.spark.storage - package org.apache.spark.storage
org.apache.spark.streaming - package org.apache.spark.streaming
org.apache.spark.streaming.api.java - package org.apache.spark.streaming.api.java: Java APIs for Spark Streaming.
org.apache.spark.streaming.dstream - package org.apache.spark.streaming.dstream: Various implementations of DStreams.
org.apache.spark.streaming.flume - package org.apache.spark.streaming.flume: Spark Streaming receiver for Flume.
org.apache.spark.streaming.kafka - package org.apache.spark.streaming.kafka: Kafka receiver for Spark Streaming.
org.apache.spark.streaming.kinesis - package org.apache.spark.streaming.kinesis
org.apache.spark.streaming.mqtt - package org.apache.spark.streaming.mqtt: MQTT receiver for Spark Streaming.
org.apache.spark.streaming.receiver - package org.apache.spark.streaming.receiver
org.apache.spark.streaming.scheduler - package org.apache.spark.streaming.scheduler
org.apache.spark.streaming.twitter - package org.apache.spark.streaming.twitter: Twitter feed receiver for Spark Streaming.
org.apache.spark.streaming.zeromq - package org.apache.spark.streaming.zeromq: ZeroMQ receiver for Spark Streaming.
org.apache.spark.ui.env - package org.apache.spark.ui.env
org.apache.spark.ui.exec - package org.apache.spark.ui.exec
org.apache.spark.ui.jobs - package org.apache.spark.ui.jobs
org.apache.spark.ui.storage - package org.apache.spark.ui.storage
org.apache.spark.util - package org.apache.spark.util: Spark utilities.
org.apache.spark.util.random - package org.apache.spark.util.random: Utilities for random number generation.
otherCopyArgs() - Method in class org.apache.spark.sql.execution.ExplainCommand
otherCopyArgs() - Method in class org.apache.spark.sql.execution.SetCommand
otherCopyArgs() - Method in class org.apache.spark.sql.hive.execution.DescribeHiveTableCommand
otherCopyArgs() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
otherCopyArgs() - Method in class org.apache.spark.sql.hive.execution.NativeCommand
otherCopyArgs() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
otherInfo() - Method in class org.apache.spark.streaming.receiver.Statistics
outer() - Method in class org.apache.spark.sql.execution.Generate
output() - Method in class org.apache.spark.sql.execution.Aggregate
output() - Method in class org.apache.spark.sql.execution.BatchPythonEvaluation
output() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
output() - Method in class org.apache.spark.sql.execution.CacheCommand
output() - Method in class org.apache.spark.sql.execution.CartesianProduct
output() - Method in class org.apache.spark.sql.execution.DescribeCommand
output() - Method in class org.apache.spark.sql.execution.Distinct
output() - Method in class org.apache.spark.sql.execution.EvaluatePython
output() - Method in class org.apache.spark.sql.execution.Except
output() - Method in class org.apache.spark.sql.execution.Exchange
output() - Method in class org.apache.spark.sql.execution.ExistingRdd
output() - Method in class org.apache.spark.sql.execution.ExplainCommand
output() - Method in class org.apache.spark.sql.execution.Filter
output() - Method in class org.apache.spark.sql.execution.Generate
output() - Method in class org.apache.spark.sql.execution.GeneratedAggregate
output() - Method in interface org.apache.spark.sql.execution.HashJoin
output() - Method in class org.apache.spark.sql.execution.HashOuterJoin
output() - Method in class org.apache.spark.sql.execution.Intersect
output() - Method in class org.apache.spark.sql.execution.LeftSemiJoinBNL
output() - Method in class org.apache.spark.sql.execution.LeftSemiJoinHash
output() - Method in class org.apache.spark.sql.execution.Limit
output() - Method in class org.apache.spark.sql.execution.OutputFaker
output() - Method in class org.apache.spark.sql.execution.Project
output() - Method in class org.apache.spark.sql.execution.Sample
output() - Method in class org.apache.spark.sql.execution.SetCommand
output() - Method in class org.apache.spark.sql.execution.Sort
output() - Method in class org.apache.spark.sql.execution.SparkLogicalPlan
output() - Method in class org.apache.spark.sql.execution.TakeOrdered
output() - Method in class org.apache.spark.sql.execution.Union
output() - Method in class org.apache.spark.sql.hive.execution.AnalyzeTable
output() - Method in class org.apache.spark.sql.hive.execution.DescribeHiveTableCommand
output() - Method in class org.apache.spark.sql.hive.execution.DropTable
output() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
output() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
output() - Method in class org.apache.spark.sql.hive.execution.NativeCommand
output() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
output() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable
output() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
outputClass() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
OutputFaker - Class in org.apache.spark.sql.execution: :: DeveloperApi :: A plan node that does nothing but lie about the output of its child.
OutputFaker(Seq<Attribute>, SparkPlan) - Constructor for class org.apache.spark.sql.execution.OutputFaker
outputPartitioning() - Method in class org.apache.spark.sql.execution.BroadcastHashJoin
outputPartitioning() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
outputPartitioning() - Method in class org.apache.spark.sql.execution.Exchange
outputPartitioning() - Method in class org.apache.spark.sql.execution.HashOuterJoin
outputPartitioning() - Method in class org.apache.spark.sql.execution.LeftSemiJoinBNL
outputPartitioning() - Method in class org.apache.spark.sql.execution.ShuffledHashJoin
outputPartitioning() - Method in class org.apache.spark.sql.execution.SparkPlan: Specifies how data is partitioned across different nodes in the cluster.
overwrite() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
overwrite() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable

P

PairDStreamFunctions<K,V> - Class in org.apache.spark.streaming.dstream: Extra functions available on DStream of (key, value) pairs through an implicit conversion.
PairDStreamFunctions(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Constructor for class org.apache.spark.streaming.dstream.PairDStreamFunctions
PairFlatMapFunction<T,K,V> - Interface in org.apache.spark.api.java.function: A function that returns zero or more key-value pair records from each input record.
PairFunction<T,K,V> - Interface in org.apache.spark.api.java.function: A function that returns key-value pairs (Tuple2), and can be used to construct PairRDDs.
PairRDDFunctions<K,V> - Class in org.apache.spark.rdd: Extra functions available on RDDs of (key, value) pairs through an implicit conversion.
PairRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Constructor for class org.apache.spark.rdd.PairRDDFunctions
parallelize(List<T>, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Distribute a local Scala collection to form an RDD.
parallelize(List<T>) - Method in class org.apache.spark.api.java.JavaSparkContext: Distribute a local Scala collection to form an RDD.
parallelize(Seq<T>, int, ClassTag<T>) - Method in class org.apache.spark.SparkContext: Distribute a local Scala collection to form an RDD.
parallelizeDoubles(List<Double>, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Distribute a local Scala collection to form an RDD.
parallelizeDoubles(List<Double>) - Method in class org.apache.spark.api.java.JavaSparkContext: Distribute a local Scala collection to form an RDD.
parallelizePairs(List<Tuple2<K, V>>, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Distribute a local Scala collection to form an RDD.
parallelizePairs(List<Tuple2<K, V>>) - Method in class org.apache.spark.api.java.JavaSparkContext: Distribute a local Scala collection to form an RDD.
parquetFile(String) - Method in class org.apache.spark.sql.api.java.JavaSQLContext: Loads a Parquet file, returning the result as a JavaSchemaRDD.
parquetFile(String) - Method in class org.apache.spark.sql.SQLContext: Loads a Parquet file, returning the result as a SchemaRDD.
ParquetTableScan - Class in org.apache.spark.sql.parquet: Parquet table scan operator.
ParquetTableScan(Seq<Attribute>, ParquetRelation, Seq<Expression>) - Constructor for class org.apache.spark.sql.parquet.ParquetTableScan
parse(String) - Static method in class org.apache.spark.mllib.linalg.Vectors: Parses a string resulting from Vector#toString into a Vector.
parse(String) - Static method in class org.apache.spark.mllib.regression.LabeledPoint: Parses a string resulting from LabeledPoint#toString into a LabeledPoint.
partial() - Method in class org.apache.spark.sql.execution.Aggregate
partial() - Method in class org.apache.spark.sql.execution.Distinct
partial() - Method in class org.apache.spark.sql.execution.GeneratedAggregate
PartialResult<R> - Class in org.apache.spark.partial
PartialResult(R, boolean) - Constructor for class org.apache.spark.partial.PartialResult
Partition - Interface in org.apache.spark: A partition of an RDD.
partition() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
partitionBy(Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a copy of the RDD partitioned using the specified partitioner.
partitionBy(Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return a copy of the RDD partitioned using the specified partitioner.
Partitioner - Class in org.apache.spark: An object that defines how the elements in a key-value pair RDD are partitioned by key.
Partitioner() - Constructor for class org.apache.spark.Partitioner
partitioner() - Method in class org.apache.spark.rdd.CoGroupedRDD
partitioner() - Method in class org.apache.spark.rdd.RDD: Optionally overridden by subclasses to specify how they are partitioned.
partitioner() - Method in class org.apache.spark.rdd.ShuffledRDD
partitioner() - Method in class org.apache.spark.ShuffleDependency
partitionId() - Method in class org.apache.spark.TaskContext
partitionPruningPred() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
PartitionPruningRDD<T> - Class in org.apache.spark.rdd: :: DeveloperApi :: An RDD used to prune RDD partitions so we can avoid launching tasks on all partitions.
PartitionPruningRDD(RDD<T>, Function1<Object, Object>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.PartitionPruningRDD
partitions() - Method in interface org.apache.spark.api.java.JavaRDDLike: Set of partitions in this RDD.
partitions() - Method in class org.apache.spark.rdd.RDD: Get the array of partitions of this RDD, taking into account whether the RDD is checkpointed or not.
partOutput() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
path() - Method in class org.apache.spark.scheduler.InputFormatInfo
path() - Method in class org.apache.spark.scheduler.SplitInfo
percentiles() - Static method in class org.apache.spark.scheduler.StatsReportListener
percentilesHeader() - Static method in class org.apache.spark.scheduler.StatsReportListener
persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Set this RDD's storage level to persist its values across operations after the first time it is computed.
persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaPairRDD: Set this RDD's storage level to persist its values across operations after the first time it is computed.
persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaRDD: Set this RDD's storage level to persist its values across operations after the first time it is computed.
persist(StorageLevel) - Method in class org.apache.spark.rdd.RDD: Set this RDD's storage level to persist its values across operations after the first time it is computed.
persist() - Method in class org.apache.spark.rdd.RDD: Persist this RDD with the default storage level (MEMORY_ONLY).
persist() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD: Persist this RDD with the default storage level (MEMORY_ONLY).
persist(StorageLevel) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD: Set this RDD's storage level to persist its values across operations after the first time it is computed.
persist() - Method in class org.apache.spark.streaming.api.java.JavaDStream: Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
persist(StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaDStream: Persist the RDDs of this DStream with the given storage level
persist() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
persist(StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Persist the RDDs of this DStream with the given storage level
persist(StorageLevel) - Method in class org.apache.spark.streaming.dstream.DStream: Persist the RDDs of this DStream with the given storage level
persist() - Method in class org.apache.spark.streaming.dstream.DStream: Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
persistentRdds() - Method in class org.apache.spark.SparkContext
pi() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
pipe(String) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an RDD created by piping elements to a forked external process.
pipe(List<String>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an RDD created by piping elements to a forked external process.
pipe(List<String>, Map<String, String>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an RDD created by piping elements to a forked external process.
pipe(String) - Method in class org.apache.spark.rdd.RDD: Return an RDD created by piping elements to a forked external process.
pipe(String, Map<String, String>) - Method in class org.apache.spark.rdd.RDD: Return an RDD created by piping elements to a forked external process.
pipe(Seq<String>, Map<String, String>, Function1<Function1<String, BoxedUnit>, BoxedUnit>, Function2<T, Function1<String, BoxedUnit>, BoxedUnit>, boolean) - Method in class org.apache.spark.rdd.RDD: Return an RDD created by piping elements to a forked external process.
plusDot(Vector, Vector) - Method in class org.apache.spark.util.Vector: Returns (this + plus) dot other, without creating any intermediate storage.
PoissonGenerator - Class in org.apache.spark.mllib.random: :: DeveloperApi :: Generates i.i.d. samples from the Poisson distribution.
PoissonGenerator(double) - Constructor for class org.apache.spark.mllib.random.PoissonGenerator
poissonJavaRDD(JavaSparkContext, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Java-friendly version of RandomRDDs.poissonRDD(org.apache.spark.SparkContext, double, long, int, long).
poissonJavaRDD(JavaSparkContext, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.poissonJavaRDD(org.apache.spark.api.java.JavaSparkContext, double, long, int, long) with the default seed.
poissonJavaRDD(JavaSparkContext, double, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.poissonJavaRDD(org.apache.spark.api.java.JavaSparkContext, double, long, int, long) with the default number of partitions and the default seed.
poissonJavaVectorRDD(JavaSparkContext, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Java-friendly version of RandomRDDs.poissonVectorRDD(org.apache.spark.SparkContext, double, long, int, int, long).
poissonJavaVectorRDD(JavaSparkContext, double, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.poissonJavaVectorRDD(org.apache.spark.api.java.JavaSparkContext, double, long, int, int, long) with the default seed.
poissonJavaVectorRDD(JavaSparkContext, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.poissonJavaVectorRDD(org.apache.spark.api.java.JavaSparkContext, double, long, int, int, long) with the default number of partitions and the default seed.
poissonRDD(SparkContext, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Generates an RDD comprised of i.i.d. samples from the Poisson distribution.
PoissonSampler<T> - Class in org.apache.spark.util.random: :: DeveloperApi :: A sampler based on values drawn from Poisson distribution.
PoissonSampler(double) - Constructor for class org.apache.spark.util.random.PoissonSampler
poissonVectorRDD(SparkContext, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the Poisson distribution.
poolToActiveStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
port() - Method in class org.apache.spark.storage.BlockManagerId
pr() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Returns the precision-recall curve, which is an RDD of (recall, precision), NOT (precision, recall), with (0.0, 1.0) prepended to it.
precision(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns precision for a given label (category)
precision() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns precision
precisionByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Returns the (threshold, precision) curve.
predict(RDD<Vector>) - Method in interface org.apache.spark.mllib.classification.ClassificationModel: Predict values for the given data set using the model trained.
predict(Vector) - Method in interface org.apache.spark.mllib.classification.ClassificationModel: Predict values for a single data point using the model trained.
predict(JavaRDD<Vector>) - Method in interface org.apache.spark.mllib.classification.ClassificationModel: Predict values for examples stored in a JavaRDD.
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
predict(Vector) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
predict(Vector) - Method in class org.apache.spark.mllib.clustering.KMeansModel: Returns the cluster index that a given point belongs to.
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel: Maps given points to their cluster indices.
predict(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel: Maps given points to their cluster indices.
predict(int, int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel: Predict the rating of one user for one product.
predict(RDD<Tuple2<Object, Object>>) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel: Predict the rating of many users for many products.
predict(JavaRDD<byte[]>) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel: :: DeveloperApi :: Predict the rating of many users for many products.
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel: Predict values for the given data set using the model trained.
predict(Vector) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel: Predict values for a single data point using the model trained.
predict(RDD<Vector>) - Method in interface org.apache.spark.mllib.regression.RegressionModel: Predict values for the given data set using the model trained.
predict(Vector) - Method in interface org.apache.spark.mllib.regression.RegressionModel: Predict values for a single data point using the model trained.
predict(JavaRDD<Vector>) - Method in interface org.apache.spark.mllib.regression.RegressionModel: Predict values for examples stored in a JavaRDD.
predict(Vector) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel: Predict values for a single data point using the model trained.
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel: Predict values for the given data set using the model trained.
predict() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
predict() - Method in class org.apache.spark.mllib.tree.model.Node
predict(Vector) - Method in class org.apache.spark.mllib.tree.model.Node: predict value if node is not leaf
predictOn(DStream<Vector>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm: Use the model to make predictions on batches of data from a DStream
predictOnValues(DStream<Tuple2<K, Vector>>, ClassTag<K>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm: Use the model to make predictions on the values of a DStream and carry over its keys.
preferredLocation() - Method in class org.apache.spark.streaming.receiver.Receiver: Override this to specify a preferred location (hostname).
preferredLocations(Partition) - Method in class org.apache.spark.rdd.RDD: Get the preferred locations of a partition (as hostnames), taking into account whether the RDD is checkpointed.
preferredNodeLocationData() - Method in class org.apache.spark.SparkContext
prettyPrint() - Method in class org.apache.spark.streaming.Duration
prev() - Method in class org.apache.spark.rdd.ShuffledRDD
print() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Print the first ten elements of each RDD generated in this DStream.
print() - Method in class org.apache.spark.streaming.dstream.DStream: Print the first ten elements of each RDD generated in this DStream.
printStats() - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
prob() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
probabilities() - Static method in class org.apache.spark.scheduler.StatsReportListener
PROCESS_LOCAL() - Static method in class org.apache.spark.scheduler.TaskLocality
processingDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo: Time taken for all the jobs of this batch to finish processing, measured from the time they started processing.
processingEndTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
processingStartTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
product() - Method in class org.apache.spark.mllib.recommendation.Rating
productFeatures() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
productToRowRdd(RDD<A>) - Static method in class org.apache.spark.sql.execution.ExistingRdd
progressListener() - Method in class org.apache.spark.streaming.StreamingContext
Project - Class in org.apache.spark.sql.execution: :: DeveloperApi ::
Project(Seq<NamedExpression>, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Project
projectList() - Method in class org.apache.spark.sql.execution.Project
properties() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
properties() - Method in class org.apache.spark.scheduler.SparkListenerStageSubmitted
pruneColumns(Seq<Attribute>) - Method in class org.apache.spark.sql.parquet.ParquetTableScan
Pseudorandom - Interface in org.apache.spark.util.random: :: DeveloperApi :: A class with pseudorandom behavior.
putCachedMetadata(String, Object) - Static method in class org.apache.spark.rdd.HadoopRDD
pValue() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
pValue() - Method in interface org.apache.spark.mllib.stat.test.TestResult: The probability of obtaining a test statistic result at least as extreme as the one that was actually observed, assuming that the null hypothesis is true.

Q

quantileCalculationStrategy() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
QuantileStrategy - Class in org.apache.spark.mllib.tree.configuration: :: Experimental :: Enum for selecting the quantile calculation strategy
QuantileStrategy() - Constructor for class org.apache.spark.mllib.tree.configuration.QuantileStrategy
QueryExecutionException - Exception in org.apache.spark.sql.execution
QueryExecutionException(String) - Constructor for exception org.apache.spark.sql.execution.QueryExecutionException
queueStream(Queue<JavaRDD<T>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream from a queue of RDDs.
queueStream(Queue<JavaRDD<T>>, boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream from a queue of RDDs.
queueStream(Queue<JavaRDD<T>>, boolean, JavaRDD<T>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream from a queue of RDDs.
queueStream(Queue<RDD<T>>, boolean, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream from a queue of RDDs.
queueStream(Queue<RDD<T>>, boolean, RDD<T>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream from a queue of RDDs.

R

RACK_LOCAL() - Static method in class org.apache.spark.scheduler.TaskLocality
RANDOM() - Static method in class org.apache.spark.mllib.clustering.KMeans
random(int, Random) - Static method in class org.apache.spark.util.Vector: Creates a Vector of the given length containing random numbers between 0.0 and 1.0.
RandomDataGenerator<T> - Interface in org.apache.spark.mllib.random: :: DeveloperApi :: Trait for random data generators that generate i.i.d. data.
randomRDD(SparkContext, RandomDataGenerator<T>, long, int, long, ClassTag<T>) - Static method in class org.apache.spark.mllib.random.RandomRDDs: :: DeveloperApi :: Generates an RDD comprised of i.i.d. samples produced by the input RandomDataGenerator.
RandomRDDs - Class in org.apache.spark.mllib.random: :: Experimental :: Generator methods for creating RDDs comprised of i.i.d. samples from some distribution.
RandomRDDs() - Constructor for class org.apache.spark.mllib.random.RandomRDDs
RandomSampler<T,U> - Interface in org.apache.spark.util.random: :: DeveloperApi :: A pseudorandom sampler.
randomSplit(double[]) - Method in class org.apache.spark.api.java.JavaRDD: Randomly splits this RDD with the provided weights.
randomSplit(double[], long) - Method in class org.apache.spark.api.java.JavaRDD: Randomly splits this RDD with the provided weights.
randomSplit(double[], long) - Method in class org.apache.spark.rdd.RDD: Randomly splits this RDD with the provided weights.
randomVectorRDD(SparkContext, RandomDataGenerator<Object>, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: :: DeveloperApi :: Generates an RDD[Vector] with vectors containing i.i.d. samples produced by the input RandomDataGenerator.
RangeDependency<T> - Class in org.apache.spark: :: DeveloperApi :: Represents a one-to-one dependency between ranges of partitions in the parent and child RDDs.
RangeDependency(RDD<T>, int, int, int) - Constructor for class org.apache.spark.RangeDependency
RangePartitioner<K,V> - Class in org.apache.spark: A Partitioner that partitions sortable records by range into roughly equal ranges.
RangePartitioner(int, RDD<? extends Product2<K, V>>, boolean, Ordering<K>, ClassTag<K>) - Constructor for class org.apache.spark.RangePartitioner
rank() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
Rating - Class in org.apache.spark.mllib.recommendation: :: Experimental :: A more compact class to represent a rating than Tuple3[Int, Int, Double].
Rating(int, int, double) - Constructor for class org.apache.spark.mllib.recommendation.Rating
rating() - Method in class org.apache.spark.mllib.recommendation.Rating
rawSocketStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream from network source hostname:port, where data is received as serialized blocks (serialized using Spark's serializer) that can be pushed directly into the block manager without deserializing them.
rawSocketStream(String, int) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream from network source hostname:port, where data is received as serialized blocks (serialized using Spark's serializer) that can be pushed directly into the block manager without deserializing them.
rawSocketStream(String, int, StorageLevel, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream from network source hostname:port, where data is received as serialized blocks (serialized using Spark's serializer) that can be pushed directly into the block manager without deserializing them.
rdd() - Method in class org.apache.spark.api.java.JavaDoubleRDD
rdd() - Method in class org.apache.spark.api.java.JavaPairRDD
rdd() - Method in class org.apache.spark.api.java.JavaRDD
rdd() - Method in interface org.apache.spark.api.java.JavaRDDLike
rdd() - Method in class org.apache.spark.Dependency
rdd() - Method in class org.apache.spark.NarrowDependency
RDD<T> - Class in org.apache.spark.rdd: A Resilient Distributed Dataset (RDD), the basic abstraction in Spark.
RDD(SparkContext, Seq<Dependency<?>>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.RDD
RDD(RDD<?>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.RDD: Construct an RDD with just a one-to-one dependency on one parent
rdd() - Method in class org.apache.spark.ShuffleDependency
rdd() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
rdd() - Method in class org.apache.spark.sql.execution.ExistingRdd
RDD() - Static method in class org.apache.spark.storage.BlockId
RDDBlockId - Class in org.apache.spark.storage
RDDBlockId(int, int) - Constructor for class org.apache.spark.storage.RDDBlockId
rddBlocks() - Method in class org.apache.spark.storage.StorageStatus: Return the RDD blocks stored in this block manager.
rddBlocksById(int) - Method in class org.apache.spark.storage.StorageStatus: Return the blocks that belong to the given RDD stored in this block manager.
rddId() - Method in class org.apache.spark.scheduler.SparkListenerUnpersistRDD
rddId() - Method in class org.apache.spark.storage.RDDBlockId
RDDInfo - Class in org.apache.spark.storage
RDDInfo(int, String, int, StorageLevel) - Constructor for class org.apache.spark.storage.RDDInfo
rddInfoList() - Method in class org.apache.spark.ui.storage.StorageListener: Filter RDD info to include only those with cached partitions
rddInfos() - Method in class org.apache.spark.scheduler.StageInfo
rdds() - Method in class org.apache.spark.rdd.CoGroupedRDD
rdds() - Method in class org.apache.spark.rdd.UnionRDD
rddStorageLevel(int) - Method in class org.apache.spark.storage.StorageStatus: Return the storage level, if any, used by the given RDD in this block manager.
rddToAsyncRDDActions(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.SparkContext
rddToOrderedRDDFunctions(RDD<Tuple2<K, V>>, Ordering<K>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.SparkContext
rddToPairRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Static method in class org.apache.spark.SparkContext
rddToSequenceFileRDDFunctions(RDD<Tuple2<K, V>>, Function1<K, Writable>, ClassTag<K>, Function1<V, Writable>, ClassTag<V>) - Static method in class org.apache.spark.SparkContext
readExternal(ObjectInput) - Method in class org.apache.spark.serializer.JavaSerializer
readExternal(ObjectInput) - Method in class org.apache.spark.storage.BlockManagerId
readExternal(ObjectInput) - Method in class org.apache.spark.storage.StorageLevel
readExternal(ObjectInput) - Method in class org.apache.spark.streaming.flume.SparkFlumeEvent
readObject(ClassTag<T>) - Method in class org.apache.spark.serializer.DeserializationStream
ready(Duration, CanAwait) - Method in class org.apache.spark.ComplexFutureAction
ready(Duration, CanAwait) - Method in interface org.apache.spark.FutureAction: Blocks until this action completes.
ready(Duration, CanAwait) - Method in class org.apache.spark.SimpleFutureAction
reason() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
recall(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns recall for a given label (category)
recall() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns recall (equal to precision for a multiclass classifier, because the sum of all false positives equals the sum of all false negatives)
recallByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Returns the (threshold, recall) curve.
receivedBlockInfo() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
Receiver<T> - Class in org.apache.spark.streaming.receiver: :: DeveloperApi :: Abstract class of a receiver that can be run on worker nodes to receive external data.
Receiver(StorageLevel) - Constructor for class org.apache.spark.streaming.receiver.Receiver
ReceiverInfo - Class in org.apache.spark.streaming.scheduler: :: DeveloperApi :: Class having information about a receiver
ReceiverInfo(int, String, ActorRef, boolean, String, String, String) - Constructor for class org.apache.spark.streaming.scheduler.ReceiverInfo
receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverError
receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStarted
receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStopped
receiverInputDStream() - Method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
receiverInputDStream() - Method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
ReceiverInputDStream<T> - Class in org.apache.spark.streaming.dstream: Abstract class for defining any InputDStream that has to start a receiver on worker nodes to receive external data.
ReceiverInputDStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.ReceiverInputDStream
receiverStream(Receiver<T>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream with an arbitrary user-implemented receiver.
receiverStream(Receiver<T>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream with an arbitrary user-implemented receiver.
recommendProducts(int, int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel: Recommends products to a user.
recommendUsers(int, int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel: Recommends users to a product.
reduce(Function2<T, T, T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Reduces the elements of this RDD using the specified commutative and associative binary operator.
reduce(Function2<T, T, T>) - Method in class org.apache.spark.rdd.RDD: Reduces the elements of this RDD using the specified commutative and associative binary operator.
reduce(Function2<T, T, T>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD has a single element generated by reducing each RDD of this DStream.
reduce(Function2<T, T, T>) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD has a single element generated by reducing each RDD of this DStream.
reduceByKey(Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Merge the values for each key using an associative reduce function.
reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Merge the values for each key using an associative reduce function.
reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Merge the values for each key using an associative reduce function.
reduceByKey(Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Merge the values for each key using an associative reduce function.
reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: Merge the values for each key using an associative reduce function.
reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Merge the values for each key using an associative reduce function.
reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying reduceByKey to each RDD.
reduceByKeyAndWindow(Function2<V, V, V>, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Create a new DStream by applying reduceByKey over a sliding window on this DStream.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying incremental reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, int, Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying incremental reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, Partitioner, Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying incremental reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying reduceByKey over a sliding window on this DStream.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, int, Function1<Tuple2<K, V>, Object>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying incremental reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, Partitioner, Function1<Tuple2<K, V>, Object>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying incremental reduceByKey over a sliding window.
reduceByKeyLocally(Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Merge the values for each key using an associative reduce function, but return the results immediately to the master as a Map.
reduceByKeyLocally(Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Merge the values for each key using an associative reduce function, but return the results immediately to the master as a Map.
reduceByKeyToDriver(Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Alias for reduceByKeyLocally
reduceByWindow(Function2<T, T, T>, Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD has a single element generated by reducing all elements in a sliding window over this DStream.
reduceByWindow(Function2<T, T, T>, Function2<T, T, T>, Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD has a single element generated by reducing all elements in a sliding window over this DStream.
reduceByWindow(Function2<T, T, T>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD has a single element generated by reducing all elements in a sliding window over this DStream.
reduceByWindow(Function2<T, T, T>, Function2<T, T, T>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD has a single element generated by reducing all elements in a sliding window over this DStream.
reduceId() - Method in class org.apache.spark.FetchFailed
reduceId() - Method in class org.apache.spark.storage.ShuffleBlockId
reduceId() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
registerClasses(Kryo) - Method in interface org.apache.spark.serializer.KryoRegistrator
registerRDDAsTable(JavaSchemaRDD, String) - Method in class org.apache.spark.sql.api.java.JavaSQLContext: Registers the given RDD as a temporary table in the catalog.
registerRDDAsTable(SchemaRDD, String) - Method in class org.apache.spark.sql.SQLContext: Registers the given RDD as a temporary table in the catalog.
registerTestTable(TestHiveContext.TestTable) - Method in class org.apache.spark.sql.hive.test.TestHiveContext
Regression() - Static method in class org.apache.spark.mllib.tree.configuration.Algo
RegressionModel - Interface in org.apache.spark.mllib.regression
relation() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
relation() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable
relation() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
remember(Duration) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Sets each DStream in this context to remember the RDDs it generated in the last given duration.
remember(Duration) - Method in class org.apache.spark.streaming.StreamingContext: Sets each DStream in this context to remember the RDDs it generated in the last given duration.
rememberDuration() - Method in class org.apache.spark.streaming.dstream.DStream
remove(String) - Method in class org.apache.spark.SparkConf: Remove a parameter from the configuration
repartition(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a new RDD that has exactly numPartitions partitions.
repartition(int) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a new RDD that has exactly numPartitions partitions.
repartition(int) - Method in class org.apache.spark.api.java.JavaRDD: Return a new RDD that has exactly numPartitions partitions.
repartition(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Return a new RDD that has exactly numPartitions partitions.
repartition(int) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD: Return a new RDD that has exactly numPartitions partitions.
repartition(int, Ordering<Row>) - Method in class org.apache.spark.sql.SchemaRDD
repartition(int) - Method in class org.apache.spark.streaming.api.java.JavaDStream: Return a new DStream with an increased or decreased level of parallelism.
repartition(int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream with an increased or decreased level of parallelism.
repartition(int) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream with an increased or decreased level of parallelism.
replication() - Method in class org.apache.spark.storage.StorageLevel
reportError(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver: Report exceptions in receiving data.
requiredChildDistribution() - Method in class org.apache.spark.sql.execution.Aggregate
requiredChildDistribution() - Method in class org.apache.spark.sql.execution.BroadcastHashJoin
requiredChildDistribution() - Method in class org.apache.spark.sql.execution.Distinct
requiredChildDistribution() - Method in class org.apache.spark.sql.execution.GeneratedAggregate
requiredChildDistribution() - Method in class org.apache.spark.sql.execution.HashOuterJoin
requiredChildDistribution() - Method in class org.apache.spark.sql.execution.LeftSemiJoinHash
requiredChildDistribution() - Method in class org.apache.spark.sql.execution.ShuffledHashJoin
requiredChildDistribution() - Method in class org.apache.spark.sql.execution.Sort
requiredChildDistribution() - Method in class org.apache.spark.sql.execution.SparkPlan: Specifies any partition requirements on the input data for this operator.
reset() - Method in class org.apache.spark.sql.hive.test.TestHiveContext: Resets the test instance by deleting any tables that have been created.
restart(String) - Method in class org.apache.spark.streaming.receiver.Receiver: Restart the receiver.
restart(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver: Restart the receiver.
restart(String, Throwable, int) - Method in class org.apache.spark.streaming.receiver.Receiver: Restart the receiver.
Resubmitted - Class in org.apache.spark: :: DeveloperApi :: A ShuffleMapTask that completed successfully earlier, but we lost the executor before the stage completed.
Resubmitted() - Constructor for class org.apache.spark.Resubmitted
result(Duration, CanAwait) - Method in class org.apache.spark.ComplexFutureAction
result(Duration, CanAwait) - Method in interface org.apache.spark.FutureAction: Awaits and returns the result (of type T) of this action.
result(Duration, CanAwait) - Method in class org.apache.spark.SimpleFutureAction
result() - Method in class org.apache.spark.sql.execution.AggregateEvaluation
resultAttribute() - Method in class org.apache.spark.sql.execution.Aggregate.ComputedAggregate
resultAttribute() - Method in class org.apache.spark.sql.execution.EvaluatePython
resultSetToObjectArray(ResultSet) - Static method in class org.apache.spark.rdd.JdbcRDD
retainedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
RidgeRegressionModel - Class in org.apache.spark.mllib.regression: Regression model trained using RidgeRegression.
RidgeRegressionModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.RidgeRegressionModel
RidgeRegressionWithSGD - Class in org.apache.spark.mllib.regression: Train a regression model with L2-regularization using Stochastic Gradient Descent.
RidgeRegressionWithSGD() - Constructor for class org.apache.spark.mllib.regression.RidgeRegressionWithSGD: Construct a RidgeRegression object with default parameters: {stepSize: 1.0, numIterations: 100, regParam: 1.0, miniBatchFraction: 1.0}.
right() - Method in class org.apache.spark.sql.execution.BroadcastHashJoin
right() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
right() - Method in class org.apache.spark.sql.execution.CartesianProduct
right() - Method in class org.apache.spark.sql.execution.Except
right() - Method in interface org.apache.spark.sql.execution.HashJoin
right() - Method in class org.apache.spark.sql.execution.HashOuterJoin
right() - Method in class org.apache.spark.sql.execution.Intersect
right() - Method in class org.apache.spark.sql.execution.LeftSemiJoinBNL: The Broadcast relation
right() - Method in class org.apache.spark.sql.execution.LeftSemiJoinHash
right() - Method in class org.apache.spark.sql.execution.ShuffledHashJoin
rightImpurity() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
rightKeys() - Method in class org.apache.spark.sql.execution.BroadcastHashJoin
rightKeys() - Method in interface org.apache.spark.sql.execution.HashJoin
rightKeys() - Method in class org.apache.spark.sql.execution.HashOuterJoin
rightKeys() - Method in class org.apache.spark.sql.execution.LeftSemiJoinHash
rightKeys() - Method in class org.apache.spark.sql.execution.ShuffledHashJoin
rightNode() - Method in class org.apache.spark.mllib.tree.model.Node
rightOuterJoin(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Perform a right outer join of this and other.
rightOuterJoin(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD: Perform a right outer join of this and other.
rightOuterJoin(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Perform a right outer join of this and other.
rightOuterJoin(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: Perform a right outer join of this and other.
rightOuterJoin(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Perform a right outer join of this and other.
rightOuterJoin(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: Perform a right outer join of this and other.
rightOuterJoin(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rng() - Method in class org.apache.spark.util.random.BernoulliSampler
rng() - Method in class org.apache.spark.util.random.PoissonSampler
roc() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Returns the receiver operating characteristic (ROC) curve, which is an RDD of (false positive rate, true positive rate) with (0.0, 0.0) prepended and (1.0, 1.0) appended to it.
Row - Class in org.apache.spark.sql.api.java: A result row from a SparkSQL query.
Row(Row) - Constructor for class org.apache.spark.sql.api.java.Row
row() - Method in class org.apache.spark.sql.api.java.Row
RowMatrix - Class in org.apache.spark.mllib.linalg.distributed: :: Experimental :: Represents a row-oriented distributed Matrix with no meaningful row indices.
RowMatrix(RDD<Vector>, long, int) - Constructor for class org.apache.spark.mllib.linalg.distributed.RowMatrix
RowMatrix(RDD<Vector>) - Constructor for class org.apache.spark.mllib.linalg.distributed.RowMatrix: Alternative constructor leaving matrix dimensions to be determined automatically.
rows() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
rows() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
run(Function0<T>, ExecutionContext) - Method in class org.apache.spark.ComplexFutureAction: Executes some action enclosed in the closure.
run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.classification.NaiveBayes: Run the algorithm with the configured parameters on an input RDD of LabeledPoint entries.
run(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeans: Train a K-means model on the given set of points; data should be cached for high performance, because this is an iterative algorithm.
run(RDD<Rating>) - Method in class org.apache.spark.mllib.recommendation.ALS: Run ALS with the configured parameters on an input RDD of (user, product, rating) triples.
run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm: Run the algorithm with the configured parameters on an input RDD of LabeledPoint entries.
run(RDD<LabeledPoint>, Vector) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm: Run the algorithm with the configured parameters on an input RDD of LabeledPoint entries starting from the initial weights provided.
runApproximateJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, <any>, long) - Method in class org.apache.spark.SparkContext: :: DeveloperApi :: Run a job that can return approximate results.
runJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, Function2<Object, U, BoxedUnit>, Function0<R>) - Method in class org.apache.spark.ComplexFutureAction: Runs a Spark job.
runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Seq<Object>, boolean, Function2<Object, U, BoxedUnit>, ClassTag) - Method in class org.apache.spark.SparkContext: Run a function on a given set of partitions in an RDD and pass the results to the given handler function.
runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Seq<Object>, boolean, ClassTag) - Method in class org.apache.spark.SparkContext: Run a function on a given set of partitions in an RDD and return the results as an array.
runJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, boolean, ClassTag) - Method in class org.apache.spark.SparkContext: Run a job on a given set of partitions of an RDD, but take a function of type Iterator[T] => U instead of (TaskContext, Iterator[T]) => U.
runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, ClassTag) - Method in class org.apache.spark.SparkContext: Run a job on all partitions in an RDD and return the results in an array.
runJob(RDD<T>, Function1<Iterator<T>, U>, ClassTag) - Method in class org.apache.spark.SparkContext: Run a job on all partitions in an RDD and return the results in an array.
runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Function2<Object, U, BoxedUnit>, ClassTag) - Method in class org.apache.spark.SparkContext: Run a job on all partitions in an RDD and pass the results to a handler function.
runJob(RDD<T>, Function1<Iterator<T>, U>, Function2<Object, U, BoxedUnit>, ClassTag) - Method in class org.apache.spark.SparkContext: Run a job on all partitions in an RDD and pass the results to a handler function.
runLBFGS(RDD<Tuple2<Object, Vector>>, Gradient, Updater, int, double, int, double, Vector) - Static method in class org.apache.spark.mllib.optimization.LBFGS: Run Limited-memory BFGS (L-BFGS) in parallel.
runMiniBatchSGD(RDD<Tuple2<Object, Vector>>, Gradient, Updater, double, int, double, double, Vector) - Static method in class org.apache.spark.mllib.optimization.GradientDescent: Run stochastic gradient descent (SGD) in parallel using mini batches.
running() - Method in class org.apache.spark.scheduler.TaskInfo
runningLocally() - Method in class org.apache.spark.TaskContext
runSqlHive(String) - Method in class org.apache.spark.sql.hive.test.TestHiveContext
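
The rightOuterJoin entries above only say "Perform a right outer join of this and other." As a rough illustration of what that means for key-value pairs, here is a minimal plain-Java sketch that does not use Spark at all. The class RightOuterJoinSketch and its helpers pair and rightOuterJoin are hypothetical names introduced for this example, and java.util.Optional stands in for whatever optional type the real API returns. Every pair from the right-hand side is kept; the left-hand value is present only when the key also occurs on the left.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration class; not part of Spark.
public class RightOuterJoinSketch {

    // Small helper to build an immutable-looking (key, value) pair.
    static <K, V> Map.Entry<K, V> pair(K key, V value) {
        return new SimpleEntry<>(key, value);
    }

    // In-memory right outer join: one output row per (left match, right row),
    // or a single row with an empty left value when the key has no left match.
    static <K, V, W> List<Map.Entry<K, Map.Entry<Optional<V>, W>>> rightOuterJoin(
            List<Map.Entry<K, V>> left, List<Map.Entry<K, W>> right) {
        // Group left-hand values by key so lookups are cheap.
        Map<K, List<V>> leftByKey = new HashMap<>();
        for (Map.Entry<K, V> l : left) {
            leftByKey.computeIfAbsent(l.getKey(), k -> new ArrayList<>()).add(l.getValue());
        }

        List<Map.Entry<K, Map.Entry<Optional<V>, W>>> out = new ArrayList<>();
        for (Map.Entry<K, W> r : right) {
            List<V> matches = leftByKey.get(r.getKey());
            if (matches == null) {
                // Key exists only on the right: keep the row, left side is empty.
                Map.Entry<Optional<V>, W> value = pair(Optional.<V>empty(), r.getValue());
                out.add(pair(r.getKey(), value));
            } else {
                for (V v : matches) {
                    Map.Entry<Optional<V>, W> value = pair(Optional.of(v), r.getValue());
                    out.add(pair(r.getKey(), value));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> left = Arrays.asList(pair("a", 1), pair("b", 2));
        List<Map.Entry<String, String>> right = Arrays.asList(pair("b", "x"), pair("c", "y"));
        // Key "a" is dropped (left only); "b" matches; "c" is kept with an empty left value.
        for (Map.Entry<String, Map.Entry<Optional<Integer>, String>> row : rightOuterJoin(left, right)) {
            System.out.println(row.getKey() + " -> " + row.getValue().getKey() + ", " + row.getValue().getValue());
        }
    }
}
```

The overloads listed above differ only in how the result is partitioned (default, a partition count, or an explicit Partitioner); this sketch ignores partitioning entirely.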

S

s() - Method in class org.apache.spark.mllib.linalg.SingularValueDecomposition
sample(boolean, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a sampled subset of this RDD.
sample(boolean, Double, long) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a sampled subset of this RDD.
sample(boolean, double) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a sampled subset of this RDD.
sample(boolean, double, long) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a sampled subset of this RDD.
sample(boolean, double) - Method in class org.apache.spark.api.java.JavaRDD: Return a sampled subset of this RDD.
sample(boolean, double, long) - Method in class org.apache.spark.api.java.JavaRDD: Return a sampled subset of this RDD.
sample(boolean, double, long) - Method in class org.apache.spark.rdd.RDD: Return a sampled subset of this RDD.
Sample - Class in org.apache.spark.sql.execution: :: DeveloperApi ::
Sample(double, boolean, long, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Sample
sample(boolean, double, long) - Method in class org.apache.spark.sql.SchemaRDD: :: Experimental :: Returns a sampled version of the underlying dataset.
sample(Iterator<T>) - Method in class org.apache.spark.util.random.BernoulliSampler
sample(Iterator<T>) - Method in class org.apache.spark.util.random.PoissonSampler
sample(Iterator<T>) - Method in interface org.apache.spark.util.random.RandomSampler: Take a random sample.
sampleByKey(boolean, Map<K, Object>, long) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a subset of this RDD sampled by key (via stratified sampling).
sampleByKey(boolean, Map<K, Object>) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a subset of this RDD sampled by key (via stratified sampling).
sampleByKey(boolean, Map<K, Object>, long) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return a subset of this RDD sampled by key (via stratified sampling).
sampleByKeyExact(boolean, Map<K, Object>, long) - Method in class org.apache.spark.api.java.JavaPairRDD: ::Experimental:: Return a subset of this RDD sampled by key (via stratified sampling) containing exactly math.ceil(numItems * samplingRate) for each stratum (group of pairs with the same key).
sampleByKeyExact(boolean, Map<K, Object>) - Method in class org.apache.spark.api.java.JavaPairRDD: ::Experimental:: Return a subset of this RDD sampled by key (via stratified sampling) containing exactly math.ceil(numItems * samplingRate) for each stratum (group of pairs with the same key).
sampleByKeyExact(boolean, Map<K, Object>, long) - Method in class org.apache.spark.rdd.PairRDDFunctions: ::Experimental:: Return a subset of this RDD sampled by key (via stratified sampling) containing exactly math.ceil(numItems * samplingRate) for each stratum (group of pairs with the same key).
sampleStdev() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Compute the sample standard deviation of this RDD's elements (which corrects for bias in estimating the standard deviation by dividing by N-1 instead of N).
sampleStdev() - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Compute the sample standard deviation of this RDD's elements (which corrects for bias in estimating the standard deviation by dividing by N-1 instead of N).
sampleStdev() - Method in class org.apache.spark.util.StatCounter: Return the sample standard deviation of the values, which corrects for bias in estimating the variance by dividing by N-1 instead of N.
sampleVariance() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Compute the sample variance of this RDD's elements (which corrects for bias in estimating the variance by dividing by N-1 instead of N).
sampleVariance() - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Compute the sample variance of this RDD's elements (which corrects for bias in estimating the variance by dividing by N-1 instead of N).
sampleVariance() - Method in class org.apache.spark.util.StatCounter: Return the sample variance, which corrects for bias in estimating the variance by dividing by N-1 instead of N.
saveAsHadoopDataset(JobConf) - Method in class org.apache.spark.api.java.JavaPairRDD: Output the RDD to any Hadoop-supported storage system, using a Hadoop JobConf object for that storage system.
saveAsHadoopDataset(JobConf) - Method in class org.apache.spark.rdd.PairRDDFunctions: Output the RDD to any Hadoop-supported storage system, using a Hadoop JobConf object for that storage system.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>, JobConf) - Method in class org.apache.spark.api.java.JavaPairRDD: Output the RDD to any Hadoop-supported file system.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.api.java.JavaPairRDD: Output the RDD to any Hadoop-supported file system.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>, Class<? extends CompressionCodec>) - Method in class org.apache.spark.api.java.JavaPairRDD: Output the RDD to any Hadoop-supported file system, compressing with the supplied codec.
saveAsHadoopFile(String, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class supporting the key and value types K and V in this RDD.
saveAsHadoopFile(String, Class<? extends CompressionCodec>, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class supporting the key and value types K and V in this RDD.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Class<? extends CompressionCodec>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class supporting the key and value types K and V in this RDD.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, JobConf, Option<Class<? extends CompressionCodec>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class supporting the key and value types K and V in this RDD.
saveAsHadoopFiles(String, String) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Save each RDD in this DStream as a Hadoop file.
saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Save each RDD in this DStream as a Hadoop file.
saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, JobConf) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Save each RDD in this DStream as a Hadoop file.
saveAsHadoopFiles(String, String, ClassTag<F>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Save each RDD in this DStream as a Hadoop file.
saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, JobConf) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Save each RDD in this DStream as a Hadoop file.
saveAsHiveFile(RDD<Writable>, Class<?>, FileSinkDesc, JobConf, boolean) - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
saveAsLibSVMFile(RDD<LabeledPoint>, String) - Static method in class org.apache.spark.mllib.util.MLUtils: Save labeled data in LIBSVM format.
saveAsNewAPIHadoopDataset(Configuration) - Method in class org.apache.spark.api.java.JavaPairRDD: Output the RDD to any Hadoop-supported storage system, using a Configuration object for that storage system.
saveAsNewAPIHadoopDataset(Configuration) - Method in class org.apache.spark.rdd.PairRDDFunctions: Output the RDD to any Hadoop-supported storage system with new Hadoop API, using a Hadoop Configuration object for that storage system.
saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<F>, Configuration) - Method in class org.apache.spark.api.java.JavaPairRDD: Output the RDD to any Hadoop-supported file system.
saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.api.java.JavaPairRDD: Output the RDD to any Hadoop-supported file system.
saveAsNewAPIHadoopFile(String, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Output the RDD to any Hadoop-supported file system, using a new Hadoop API OutputFormat (mapreduce.OutputFormat) object supporting the key and value types K and V in this RDD.
saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Configuration) - Method in class org.apache.spark.rdd.PairRDDFunctions: Output the RDD to any Hadoop-supported file system, using a new Hadoop API OutputFormat (mapreduce.OutputFormat) object supporting the key and value types K and V in this RDD.
saveAsNewAPIHadoopFiles(String, String) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Save each RDD in this DStream as a Hadoop file.
saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Save each RDD in this DStream as a Hadoop file.
saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Configuration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Save each RDD in this DStream as a Hadoop file.
saveAsNewAPIHadoopFiles(String, String, ClassTag<F>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Save each RDD in this DStream as a Hadoop file.
saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Configuration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Save each RDD in this DStream as a Hadoop file.
saveAsObjectFile(String) - Method in interface org.apache.spark.api.java.JavaRDDLike: Save this RDD as a SequenceFile of serialized objects.
saveAsObjectFile(String) - Method in class org.apache.spark.rdd.RDD: Save this RDD as a SequenceFile of serialized objects.
saveAsObjectFiles(String, String) - Method in class org.apache.spark.streaming.dstream.DStream: Save each RDD in this DStream as a Sequence file of serialized objects.
saveAsSequenceFile(String, Option<Class<? extends CompressionCodec>>) - Method in class org.apache.spark.rdd.SequenceFileRDDFunctions: Output the RDD as a Hadoop SequenceFile using the Writable types we infer from the RDD's key and value types.
saveAsTextFile(String) - Method in interface org.apache.spark.api.java.JavaRDDLike: Save this RDD as a text file, using string representations of elements.
saveAsTextFile(String, Class<? extends CompressionCodec>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Save this RDD as a compressed text file, using string representations of elements.
saveAsTextFile(String) - Method in class org.apache.spark.rdd.RDD: Save this RDD as a text file, using string representations of elements.
saveAsTextFile(String, Class<? extends CompressionCodec>) - Method in class org.apache.spark.rdd.RDD: Save this RDD as a compressed text file, using string representations of elements.
saveAsTextFiles(String, String) - Method in class org.apache.spark.streaming.dstream.DStream: Save each RDD in this DStream as a text file, using string representations of elements.
saveLabeledData(RDD<LabeledPoint>, String) - Static method in class org.apache.spark.mllib.util.MLUtils: Deprecated. Use RDD.saveAsTextFile(java.lang.String) for saving and MLUtils.loadLabeledPoints(org.apache.spark.SparkContext, java.lang.String, int) for loading.
sc() - Method in class org.apache.spark.api.java.JavaSparkContext
sc() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
sc() - Method in class org.apache.spark.streaming.StreamingContext
scalaIntToJavaLong(DStream<Object>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
scalaToJavaLong(JavaPairDStream<K, Object>, ClassTag<K>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
scheduler() - Method in class org.apache.spark.streaming.StreamingContext
schedulingDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo: Time taken for the first job of this batch to start processing from the time this batch was submitted to the streaming scheduler.
SchedulingMode - Class in org.apache.spark.scheduler: "FAIR" and "FIFO" determine which policy is used to order tasks amongst a Schedulable's sub-queues; "NONE" is used when a Schedulable has no sub-queues.
SchedulingMode() - Constructor for class org.apache.spark.scheduler.SchedulingMode
schedulingMode() - Method in class org.apache.spark.ui.jobs.JobProgressListener
schema() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD: Returns the schema of this JavaSchemaRDD (represented by a StructType).
schema() - Method in class org.apache.spark.sql.execution.AggregateEvaluation
schema() - Method in class org.apache.spark.sql.SchemaRDD: Returns the schema of this SchemaRDD (represented by a StructType).
SchemaRDD - Class in org.apache.spark.sql: :: AlphaComponent :: An RDD of Row objects that has an associated schema.
SchemaRDD(SQLContext, LogicalPlan) - Constructor for class org.apache.spark.sql.SchemaRDD
script() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
ScriptTransformation - Class in org.apache.spark.sql.hive.execution: :: DeveloperApi :: Transforms the input by forking and running the specified script.
ScriptTransformation(Seq<Expression>, String, Seq<Attribute>, SparkPlan, HiveContext) - Constructor for class org.apache.spark.sql.hive.execution.ScriptTransformation
seconds() - Static method in class org.apache.spark.scheduler.StatsReportListener
Seconds - Class in org.apache.spark.streaming: Helper object that creates an instance of Duration representing a given number of seconds.
Seconds() - Constructor for class org.apache.spark.streaming.Seconds
securityManager() - Method in class org.apache.spark.SparkEnv
seed() - Method in class org.apache.spark.sql.execution.Sample
select(Seq<Expression>) - Method in class org.apache.spark.sql.SchemaRDD: Changes the output of this relation to the given expressions, similar to the SELECT clause in SQL.
sequenceFile(String, Class<K>, Class<V>, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD for a Hadoop SequenceFile with given key and value types.
sequenceFile(String, Class<K>, Class<V>) - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD for a Hadoop SequenceFile.
sequenceFile(String, Class<K>, Class<V>, int) - Method in class org.apache.spark.SparkContext: Get an RDD for a Hadoop SequenceFile with given key and value types.
sequenceFile(String, Class<K>, Class<V>) - Method in class org.apache.spark.SparkContext: Get an RDD for a Hadoop SequenceFile with given key and value types.
sequenceFile(String, int, ClassTag<K>, ClassTag<V>, Function0<WritableConverter<K>>, Function0<WritableConverter<V>>) - Method in class org.apache.spark.SparkContext: Version of sequenceFile() for types implicitly convertible to Writables through a WritableConverter.
SequenceFileRDDFunctions<K,V> - Class in org.apache.spark.rdd: Extra functions available on RDDs of (key, value) pairs to create a Hadoop SequenceFile, through an implicit conversion.
SequenceFileRDDFunctions(RDD<Tuple2<K, V>>, Function1<K, Writable>, ClassTag<K>, Function1<V, Writable>, ClassTag<V>) - Constructor for class org.apache.spark.rdd.SequenceFileRDDFunctions
SerializableWritable<T extends org.apache.hadoop.io.Writable> - Class in org.apache.spark
SerializableWritable(T) - Constructor for class org.apache.spark.SerializableWritable
SerializationStream - Class in org.apache.spark.serializer: :: DeveloperApi :: A stream for writing serialized objects.
SerializationStream() - Constructor for class org.apache.spark.serializer.SerializationStream
serialize(T, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializerInstance
serialize(Object, ObjectInspector) - Method in class org.apache.spark.sql.hive.parquet.FakeParquetSerDe
Serializer - Class in org.apache.spark.serializer: :: DeveloperApi :: A serializer.
Serializer() - Constructor for class org.apache.spark.serializer.Serializer
serializer() - Method in class org.apache.spark.ShuffleDependency
serializer() - Method in class org.apache.spark.SparkEnv
SerializerInstance - Class in org.apache.spark.serializer: :: DeveloperApi :: An instance of a serializer, for use by one thread at a time.
SerializerInstance() - Constructor for class org.apache.spark.serializer.SerializerInstance
serializeStream(OutputStream) - Method in class org.apache.spark.serializer.SerializerInstance
set(String, String) - Method in class org.apache.spark.SparkConf: Set a configuration variable.
set(SparkEnv) - Static method in class org.apache.spark.SparkEnv
setAggregator(Aggregator<K, V, C>) - Method in class org.apache.spark.rdd.ShuffledRDD: Set aggregator for RDD's shuffle.
setAll(Traversable<Tuple2<String, String>>) - Method in class org.apache.spark.SparkConf: Set multiple parameters together
setAlpha(double) - Method in class org.apache.spark.mllib.recommendation.ALS: :: Experimental :: Sets the constant used in computing confidence in implicit ALS.
setAppName(String) - Method in class org.apache.spark.SparkConf: Set a name for your application.
setBlocks(int) - Method in class org.apache.spark.mllib.recommendation.ALS: Set the number of blocks for both user blocks and product blocks to parallelize the computation into; pass -1 for an auto-configured number of blocks.
setCallSite(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Pass-through to SparkContext.setCallSite.
setCallSite(String) - Method in class org.apache.spark.SparkContext: Set the thread-local property for overriding the call sites of actions and RDDs.
setCheckpointDir(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Set the directory under which RDDs are going to be checkpointed.
setCheckpointDir(String) - Method in class org.apache.spark.SparkContext: Set the directory under which RDDs are going to be checkpointed.
SetCommand - Class in org.apache.spark.sql.execution: :: DeveloperApi ::
SetCommand(Option<String>, Option<String>, Seq<Attribute>, SQLContext) - Constructor for class org.apache.spark.sql.execution.SetCommand
setConf(String, String) - Method in class org.apache.spark.sql.hive.HiveContext
setConvergenceTol(double) - Method in class org.apache.spark.mllib.optimization.LBFGS: Set the convergence tolerance of iterations for L-BFGS.
setDefaultClassLoader(ClassLoader) - Method in class org.apache.spark.serializer.Serializer: Sets a class loader for the serializer to use in deserialization.
setEpsilon(double) - Method in class org.apache.spark.mllib.clustering.KMeans: Set the distance threshold within which we consider centers to have converged.
setExecutorEnv(String, String) - Method in class org.apache.spark.SparkConf: Set an environment variable to be used when launching executors for this application.
setExecutorEnv(Seq<Tuple2<String, String>>) - Method in class org.apache.spark.SparkConf: Set multiple environment variables to be used when launching executors.
setExecutorEnv(Tuple2<String, String>[]) - Method in class org.apache.spark.SparkConf: Set multiple environment variables to be used when launching executors.
setGradient(Gradient) - Method in class org.apache.spark.mllib.optimization.GradientDescent: Set the gradient function (of the loss function of one single data example) to be used for SGD.
setGradient(Gradient) - Method in class org.apache.spark.mllib.optimization.LBFGS: Set the gradient function (of the loss function of one single data example) to be used for L-BFGS.
setIfMissing(String, String) - Method in class org.apache.spark.SparkConf: Set a parameter if it isn't already configured
setImplicitPrefs(boolean) - Method in class org.apache.spark.mllib.recommendation.ALS: Sets whether to use implicit preference.
setInitializationMode(String) - Method in class org.apache.spark.mllib.clustering.KMeans: Set the initialization algorithm.
setInitializationSteps(int) - Method in class org.apache.spark.mllib.clustering.KMeans: Set the number of steps for the k-means|| initialization mode.
setInitialWeights(Vector) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD: Set the initial weights.
setIntercept(boolean) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm: Set if the algorithm should add an intercept.
setIntermediateRDDStorageLevel(StorageLevel) - Method in class org.apache.spark.mllib.recommendation.ALS: :: DeveloperApi :: Sets storage level for intermediate RDDs (user/product in/out links).
setIterations(int) - Method in class org.apache.spark.mllib.recommendation.ALS: Set the number of iterations to run.
setJars(Seq<String>) - Method in class org.apache.spark.SparkConf: Set JAR files to distribute to the cluster.
setJars(String[]) - Method in class org.apache.spark.SparkConf: Set JAR files to distribute to the cluster.
setJobDescription(String) - Method in class org.apache.spark.SparkContext: Set a human readable description of the current job.
setJobGroup(String, String, boolean) - Method in class org.apache.spark.api.java.JavaSparkContext: Assigns a group ID to all the jobs started by this thread until the group ID is set to a different value or cleared.
setJobGroup(String, String) - Method in class org.apache.spark.api.java.JavaSparkContext: Assigns a group ID to all the jobs started by this thread until the group ID is set to a different value or cleared.
setJobGroup(String, String, boolean) - Method in class org.apache.spark.SparkContext: Assigns a group ID to all the jobs started by this thread until the group ID is set to a different value or cleared.
setK(int) - Method in class org.apache.spark.mllib.clustering.KMeans: Set the number of clusters to create (k).
setKeyOrdering(Ordering<K>) - Method in class org.apache.spark.rdd.ShuffledRDD: Set key ordering for RDD's shuffle.
setLambda(double) - Method in class org.apache.spark.mllib.classification.NaiveBayes: Set the smoothing parameter.
setLambda(double) - Method in class org.apache.spark.mllib.recommendation.ALS: Set the regularization parameter, lambda.
setLearningRate(double) - Method in class org.apache.spark.mllib.feature.Word2Vec: Sets initial learning rate (default: 0.025).
setLocalProperty(String, String) - Method in class org.apache.spark.api.java.JavaSparkContext: Set a local property that affects jobs submitted from this thread, such as the Spark fair scheduler pool.
setLocalProperty(String, String) - Method in class org.apache.spark.SparkContext: Set a local property that affects jobs submitted from this thread, such as the Spark fair scheduler pool.
setMapSideCombine(boolean) - Method in class org.apache.spark.rdd.ShuffledRDD: Set mapSideCombine flag for RDD's shuffle.
setMaster(String) - Method in class org.apache.spark.SparkConf: The master URL to connect to, such as "local" to run locally with one thread, "local[4]" to run locally with 4 cores, or "spark://master:7077" to run on a Spark standalone cluster.
setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.KMeans: Set maximum number of iterations to run.
setMaxNumIterations(int) - Method in class org.apache.spark.mllib.optimization.LBFGS: Deprecated. Use LBFGS.setNumIterations(int) instead.
setMiniBatchFraction(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent: :: Experimental :: Set fraction of data to be used for each SGD iteration.
setMiniBatchFraction(double) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD: Set the fraction of each batch to use for updates.
setName(String) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Assign a name to this RDD
setName(String) - Method in class org.apache.spark.api.java.JavaPairRDD: Assign a name to this RDD
setName(String) - Method in class org.apache.spark.api.java.JavaRDD: Assign a name to this RDD
setName(String) - Method in class org.apache.spark.rdd.RDD: Assign a name to this RDD
setName(String) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD: Assign a name to this RDD
setNonnegative(boolean) - Method in class org.apache.spark.mllib.recommendation.ALS: Set whether the least-squares problems solved at each iteration should have nonnegativity constraints.
setNumCorrections(int) - Method in class org.apache.spark.mllib.optimization.LBFGS: Set the number of corrections used in the LBFGS update.
setNumIterations(int) - Method in class org.apache.spark.mllib.feature.Word2Vec: Sets the number of iterations (default: 1), which should be smaller than or equal to the number of partitions.
setNumIterations(int) - Method in class org.apache.spark.mllib.optimization.GradientDescent: Set the number of iterations for SGD.
setNumIterations(int) - Method in class org.apache.spark.mllib.optimization.LBFGS: Set the maximal number of iterations for L-BFGS.
setNumIterations(int) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD: Set the number of iterations of gradient descent to run per update.
setNumPartitions(int) - Method in class org.apache.spark.mllib.feature.Word2Vec: Sets number of partitions (default: 1).
setProductBlocks(int) - Method in class org.apache.spark.mllib.recommendation.ALS: Set the number of product blocks to parallelize the computation.
setRank(int) - Method in class org.apache.spark.mllib.recommendation.ALS: Set the rank of the feature matrices computed (number of features).
setRegParam(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent: Set the regularization parameter.
setRegParam(double) - Method in class org.apache.spark.mllib.optimization.LBFGS: Set the regularization parameter.
setRuns(int) - Method in class org.apache.spark.mllib.clustering.KMeans: :: Experimental :: Set the number of runs of the algorithm to execute in parallel.
setSeed(long) - Method in class org.apache.spark.mllib.feature.Word2Vec: Sets random seed (default: a random long integer).
setSeed(long) - Method in class org.apache.spark.mllib.random.PoissonGenerator
setSeed(long) - Method in class org.apache.spark.mllib.random.StandardNormalGenerator
setSeed(long) - Method in class org.apache.spark.mllib.random.UniformGenerator
setSeed(long) - Method in class org.apache.spark.mllib.recommendation.ALS: Sets a random seed to have deterministic results.
setSeed(long) - Method in class org.apache.spark.util.random.BernoulliSampler
setSeed(long) - Method in class org.apache.spark.util.random.PoissonSampler
setSeed(long) - Method in interface org.apache.spark.util.random.Pseudorandom: Set random seed.
setSerializer(Serializer) - Method in class org.apache.spark.rdd.CoGroupedRDD: Set a serializer for this RDD's shuffle, or null to use the default (spark.serializer)
setSerializer(Serializer) - Method in class org.apache.spark.rdd.ShuffledRDD: Set a serializer for this RDD's shuffle, or null to use the default (spark.serializer)
setSparkHome(String) - Method in class org.apache.spark.SparkConf: Set the location where Spark is installed on worker nodes.
setStepSize(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent: Set the initial step size of SGD for the first step.
setStepSize(double) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD: Set the step size for gradient descent.
setThreshold(double) - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel: :: Experimental :: Sets the threshold that separates positive predictions from negative predictions.
setThreshold(double) - Method in class org.apache.spark.mllib.classification.SVMModel: :: Experimental :: Sets the threshold that separates positive predictions from negative predictions.
settings() - Method in class org.apache.spark.SparkConf
setUpdater(Updater) - Method in class org.apache.spark.mllib.optimization.GradientDescent: Set the updater function to actually perform a gradient step in a given direction.
setUpdater(Updater) - Method in class org.apache.spark.mllib.optimization.LBFGS: Set the updater function to actually perform a gradient step in a given direction.
setUserBlocks(int) - Method in class org.apache.spark.mllib.recommendation.ALS: Set the number of user blocks to parallelize the computation.
setValidateData(boolean) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm: Set if the algorithm should validate data before training.
setValue(R) - Method in class org.apache.spark.Accumulable: Set the accumulator's value; only allowed on master
setVectorSize(int) - Method in class org.apache.spark.mllib.feature.Word2Vec: Sets vector size (default: 100).
shortCompressionCodecNames() - Method in interface org.apache.spark.io.CompressionCodec
ShortType - Static variable in class org.apache.spark.sql.api.java.DataType: Gets the ShortType object.
ShortType - Class in org.apache.spark.sql.api.java: The data type representing short and Short values.
showBytesDistribution(String, Function2<TaskInfo, TaskMetrics, Option<Object>>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
showBytesDistribution(String, Option<org.apache.spark.util.Distribution>) - Static method in class org.apache.spark.scheduler.StatsReportListener
showBytesDistribution(String, org.apache.spark.util.Distribution) - Static method in class org.apache.spark.scheduler.StatsReportListener
showDistribution(String, org.apache.spark.util.Distribution, Function1<Object, String>) - Static method in class org.apache.spark.scheduler.StatsReportListener
showDistribution(String, Option<org.apache.spark.util.Distribution>, Function1<Object, String>) - Static method in class org.apache.spark.scheduler.StatsReportListener
showDistribution(String, Option<org.apache.spark.util.Distribution>, String) - Static method in class org.apache.spark.scheduler.StatsReportListener
showDistribution(String, String, Function2<TaskInfo, TaskMetrics, Option<Object>>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
showMillisDistribution(String, Option<org.apache.spark.util.Distribution>) - Static method in class org.apache.spark.scheduler.StatsReportListener
showMillisDistribution(String, Function2<TaskInfo, TaskMetrics, Option<Object>>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
showMillisDistribution(String, Function1<BatchInfo, Option<Object>>) - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
SHUFFLE() - Static method in class org.apache.spark.storage.BlockId
SHUFFLE_INDEX() - Static method in class org.apache.spark.storage.BlockId
ShuffleBlockId - Class in org.apache.spark.storage
ShuffleBlockId(int, int, int) - Constructor for class org.apache.spark.storage.ShuffleBlockId
ShuffleDependency<K,V,C> - Class in org.apache.spark: :: DeveloperApi :: Represents a dependency on the output of a shuffle stage.
ShuffleDependency(RDD<? extends Product2<K, V>>, Partitioner, Option<Serializer>, Option<Ordering<K>>, Option<Aggregator<K, V, C>>, boolean) - Constructor for class org.apache.spark.ShuffleDependency
ShuffledHashJoin - Class in org.apache.spark.sql.execution: :: DeveloperApi :: Performs an inner hash join of two child relations by first shuffling the data using the join keys.
ShuffledHashJoin(Seq<Expression>, Seq<Expression>, BuildSide, SparkPlan, SparkPlan) - Constructor for class org.apache.spark.sql.execution.ShuffledHashJoin
ShuffledRDD<K,V,C> - Class in org.apache.spark.rdd: :: DeveloperApi :: The resulting RDD from a shuffle (e.g.
ShuffledRDD(RDD<? extends Product2<K, V>>, Partitioner) - Constructor for class org.apache.spark.rdd.ShuffledRDD
shuffleHandle() - Method in class org.apache.spark.ShuffleDependency
shuffleId() - Method in class org.apache.spark.FetchFailed
shuffleId() - Method in class org.apache.spark.ShuffleDependency
shuffleId() - Method in class org.apache.spark.storage.ShuffleBlockId
shuffleId() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
ShuffleIndexBlockId - Class in org.apache.spark.storage
ShuffleIndexBlockId(int, int, int) - Constructor for class org.apache.spark.storage.ShuffleIndexBlockId
shuffleManager() - Method in class org.apache.spark.SparkEnv
shuffleMemoryManager() - Method in class org.apache.spark.SparkEnv
sideEffectResult() - Method in interface org.apache.spark.sql.execution.Command: A concrete command should override this lazy field to wrap up any side effects caused by the command or any other computation that should be evaluated exactly once.
SimpleFutureAction<T> - Class in org.apache.spark: :: Experimental :: A FutureAction holding the result of an action that triggers a single job.
SimpleUpdater - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: A simple updater for gradient descent *without* any regularization.
SimpleUpdater() - Constructor for class org.apache.spark.mllib.optimization.SimpleUpdater
SingularValueDecomposition<UType,VType> - Class in org.apache.spark.mllib.linalg: :: Experimental :: Represents singular value decomposition (SVD) factors.
SingularValueDecomposition(UType, Vector, VType) - Constructor for class org.apache.spark.mllib.linalg.SingularValueDecomposition
size() - Method in class org.apache.spark.mllib.linalg.DenseVector
size() - Method in class org.apache.spark.mllib.linalg.SparseVector
size() - Method in interface org.apache.spark.mllib.linalg.Vector: Size of the vector.
sketch(RDD<K>, int, ClassTag<K>) - Static method in class org.apache.spark.RangePartitioner: Sketches the input RDD via reservoir sampling on each partition.
slice(Time, Time) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return all the RDDs between 'fromDuration' and 'toDuration' (both included)
slice(org.apache.spark.streaming.Interval) - Method in class org.apache.spark.streaming.dstream.DStream: Return all the RDDs defined by the Interval object (both end times included)
slice(Time, Time) - Method in class org.apache.spark.streaming.dstream.DStream: Return all the RDDs between 'fromTime' and 'toTime' (both included)
slideDuration() - Method in class org.apache.spark.streaming.dstream.DStream: Time interval after which the DStream generates an RDD
slideDuration() - Method in class org.apache.spark.streaming.dstream.InputDStream
SnappyCompressionCodec - Class in org.apache.spark.io: :: DeveloperApi :: Snappy implementation of CompressionCodec.
SnappyCompressionCodec(SparkConf) - Constructor for class org.apache.spark.io.SnappyCompressionCodec
socketStream(String, int, Function<InputStream, Iterable<T>>, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream from network source hostname:port.
socketStream(String, int, Function1<InputStream, Iterator<T>>, StorageLevel, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream from TCP source hostname:port.
socketTextStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream from network source hostname:port.
socketTextStream(String, int) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream from network source hostname:port.
socketTextStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream from TCP source hostname:port.
solveLeastSquares(DoubleMatrix, DoubleMatrix, org.apache.spark.mllib.optimization.NNLS.Workspace) - Method in class org.apache.spark.mllib.recommendation.ALS: Given A^T A and A^T b, find the x minimising ||Ax - b||_2, possibly subject to nonnegativity constraints if nonnegative is true.
Sort() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
Sort - Class in org.apache.spark.sql.execution: :: DeveloperApi ::
Sort(Seq<SortOrder>, boolean, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Sort
sortBy(Function<T, S>, boolean, int) - Method in class org.apache.spark.api.java.JavaRDD: Return this RDD sorted by the given key function.
sortBy(Function1<T, K>, boolean, int, Ordering<K>, ClassTag<K>) - Method in class org.apache.spark.rdd.RDD: Return this RDD sorted by the given key function.
sortByKey() - Method in class org.apache.spark.api.java.JavaPairRDD: Sort the RDD by key, so that each partition contains a sorted range of the elements in ascending order.
sortByKey(boolean) - Method in class org.apache.spark.api.java.JavaPairRDD: Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortByKey(boolean, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortByKey(Comparator<K>) - Method in class org.apache.spark.api.java.JavaPairRDD: Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortByKey(Comparator<K>, boolean) - Method in class org.apache.spark.api.java.JavaPairRDD: Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortByKey(Comparator<K>, boolean, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortByKey(boolean, int) - Method in class org.apache.spark.rdd.OrderedRDDFunctions: Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortOrder() - Method in class org.apache.spark.sql.execution.Sort
sortOrder() - Method in class org.apache.spark.sql.execution.TakeOrdered
SPARK_JOB_DESCRIPTION() - Static method in class org.apache.spark.SparkContext
SPARK_JOB_GROUP_ID() - Static method in class org.apache.spark.SparkContext
SPARK_JOB_INTERRUPT_ON_CANCEL() - Static method in class org.apache.spark.SparkContext
SPARK_UNKNOWN_USER() - Static method in class org.apache.spark.SparkContext
SPARK_VERSION() - Static method in class org.apache.spark.SparkContext
SparkConf - Class in org.apache.spark: Configuration for a Spark application.
SparkConf(boolean) - Constructor for class org.apache.spark.SparkConf
SparkConf() - Constructor for class org.apache.spark.SparkConf: Create a SparkConf that loads defaults from system properties and the classpath
sparkContext() - Method in class org.apache.spark.rdd.RDD: The SparkContext that created this RDD.
SparkContext - Class in org.apache.spark: Main entry point for Spark functionality.
SparkContext(SparkConf) - Constructor for class org.apache.spark.SparkContext
SparkContext() - Constructor for class org.apache.spark.SparkContext: Create a SparkContext that loads settings from system properties (for instance, when launching with ./bin/spark-submit).
SparkContext(SparkConf, Map<String, Set<SplitInfo>>) - Constructor for class org.apache.spark.SparkContext: :: DeveloperApi :: Alternative constructor for setting preferred locations where Spark will create executors.
SparkContext(String, String, SparkConf) - Constructor for class org.apache.spark.SparkContext: Alternative constructor that allows setting common Spark properties directly
SparkContext(String, String, String, Seq<String>, Map<String, String>, Map<String, Set<SplitInfo>>) - Constructor for class org.apache.spark.SparkContext: Alternative constructor that allows setting common Spark properties directly
sparkContext() - Method in class org.apache.spark.sql.SQLContext
sparkContext() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: The underlying SparkContext
sparkContext() - Method in class org.apache.spark.streaming.StreamingContext: Return the associated Spark context
SparkContext.DoubleAccumulatorParam$ - Class in org.apache.spark
SparkContext.DoubleAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.DoubleAccumulatorParam$
SparkContext.FloatAccumulatorParam$ - Class in org.apache.spark
SparkContext.FloatAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.FloatAccumulatorParam$
SparkContext.IntAccumulatorParam$ - Class in org.apache.spark
SparkContext.IntAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.IntAccumulatorParam$
SparkContext.LongAccumulatorParam$ - Class in org.apache.spark
SparkContext.LongAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.LongAccumulatorParam$
SparkEnv - Class in org.apache.spark: :: DeveloperApi :: Holds all the runtime environment objects for a running Spark instance (either master or worker), including the serializer, Akka actor system, block manager, map output tracker, etc.
SparkEnv(String, ActorSystem, Serializer, Serializer, CacheManager, MapOutputTracker, ShuffleManager, org.apache.spark.broadcast.BroadcastManager, org.apache.spark.storage.BlockManager, ConnectionManager, SecurityManager, HttpFileServer, String, org.apache.spark.metrics.MetricsSystem, ShuffleMemoryManager, SparkConf) - Constructor for class org.apache.spark.SparkEnv
SparkException - Exception in org.apache.spark
SparkException(String, Throwable) - Constructor for exception org.apache.spark.SparkException
SparkException(String) - Constructor for exception org.apache.spark.SparkException
SparkFiles - Class in org.apache.spark: Resolves paths to files added through SparkContext.addFile().
SparkFiles() - Constructor for class org.apache.spark.SparkFiles
sparkFilesDir() - Method in class org.apache.spark.SparkEnv
SparkFlumeEvent - Class in org.apache.spark.streaming.flume: A wrapper class for AvroFlumeEvents with a custom serialization format.
SparkFlumeEvent() - Constructor for class org.apache.spark.streaming.flume.SparkFlumeEvent
SparkListener - Interface in org.apache.spark.scheduler: :: DeveloperApi :: Interface for listening to events from the Spark scheduler.
SparkListenerApplicationEnd - Class in org.apache.spark.scheduler
SparkListenerApplicationEnd(long) - Constructor for class org.apache.spark.scheduler.SparkListenerApplicationEnd
SparkListenerApplicationStart - Class in org.apache.spark.scheduler
SparkListenerApplicationStart(String, long, String) - Constructor for class org.apache.spark.scheduler.SparkListenerApplicationStart
SparkListenerBlockManagerAdded - Class in org.apache.spark.scheduler
SparkListenerBlockManagerAdded(BlockManagerId, long) - Constructor for class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
SparkListenerBlockManagerRemoved - Class in org.apache.spark.scheduler
SparkListenerBlockManagerRemoved(BlockManagerId) - Constructor for class org.apache.spark.scheduler.SparkListenerBlockManagerRemoved
SparkListenerEnvironmentUpdate - Class in org.apache.spark.scheduler
SparkListenerEnvironmentUpdate(Map<String, Seq<Tuple2<String, String>>>) - Constructor for class org.apache.spark.scheduler.SparkListenerEnvironmentUpdate
SparkListenerEvent - Interface in org.apache.spark.scheduler
SparkListenerExecutorMetricsUpdate - Class in org.apache.spark.scheduler: Periodic updates from executors.
SparkListenerExecutorMetricsUpdate(String, Seq<Tuple4<Object, Object, Object, TaskMetrics>>) - Constructor for class org.apache.spark.scheduler.SparkListenerExecutorMetricsUpdate
SparkListenerJobEnd - Class in org.apache.spark.scheduler
SparkListenerJobEnd(int, JobResult) - Constructor for class org.apache.spark.scheduler.SparkListenerJobEnd
SparkListenerJobStart - Class in org.apache.spark.scheduler
SparkListenerJobStart(int, Seq<Object>, Properties) - Constructor for class org.apache.spark.scheduler.SparkListenerJobStart
SparkListenerStageCompleted - Class in org.apache.spark.scheduler
SparkListenerStageCompleted(StageInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerStageCompleted
SparkListenerStageSubmitted - Class in org.apache.spark.scheduler
SparkListenerStageSubmitted(StageInfo, Properties) - Constructor for class org.apache.spark.scheduler.SparkListenerStageSubmitted
SparkListenerTaskEnd - Class in org.apache.spark.scheduler
SparkListenerTaskEnd(int, int, String, TaskEndReason, TaskInfo, TaskMetrics) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskEnd
SparkListenerTaskGettingResult - Class in org.apache.spark.scheduler
SparkListenerTaskGettingResult(TaskInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskGettingResult
SparkListenerTaskStart - Class in org.apache.spark.scheduler
SparkListenerTaskStart(int, int, TaskInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskStart
SparkListenerUnpersistRDD - Class in org.apache.spark.scheduler
SparkListenerUnpersistRDD(int) - Constructor for class org.apache.spark.scheduler.SparkListenerUnpersistRDD
SparkLogicalPlan - Class in org.apache.spark.sql.execution: :: DeveloperApi :: Allows already planned SparkQueries to be linked into logical query plans.
SparkLogicalPlan(SparkPlan, SQLContext) - Constructor for class org.apache.spark.sql.execution.SparkLogicalPlan
SparkPlan - Class in org.apache.spark.sql.execution: :: DeveloperApi ::
SparkPlan() - Constructor for class org.apache.spark.sql.execution.SparkPlan
sparkProperties() - Method in class org.apache.spark.ui.env.EnvironmentListener
sparkUser() - Method in class org.apache.spark.api.java.JavaSparkContext
sparkUser() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
sparkUser() - Method in class org.apache.spark.SparkContext
sparse(int, int[], double[]) - Static method in class org.apache.spark.mllib.linalg.Vectors: Creates a sparse vector providing its index array and value array.
sparse(int, Seq<Tuple2<Object, Object>>) - Static method in class org.apache.spark.mllib.linalg.Vectors: Creates a sparse vector using unordered (index, value) pairs.
sparse(int, Iterable<Tuple2<Integer, Double>>) - Static method in class org.apache.spark.mllib.linalg.Vectors: Creates a sparse vector using unordered (index, value) pairs in a Java friendly way.
SparseVector - Class in org.apache.spark.mllib.linalg: A sparse vector represented by an index array and a value array.
SparseVector(int, int[], double[]) - Constructor for class org.apache.spark.mllib.linalg.SparseVector
speculative() - Method in class org.apache.spark.scheduler.TaskInfo
split() - Method in class org.apache.spark.mllib.tree.model.Node
Split - Class in org.apache.spark.mllib.tree.model: :: DeveloperApi :: Split applied to a feature
Split(int, double, Enumeration.Value, List<Object>) - Constructor for class org.apache.spark.mllib.tree.model.Split
splitId() - Method in class org.apache.spark.TaskContext
splitIndex() - Method in class org.apache.spark.storage.RDDBlockId
SplitInfo - Class in org.apache.spark.scheduler
SplitInfo(Class<?>, String, String, long, Object) - Constructor for class org.apache.spark.scheduler.SplitInfo
splits() - Method in interface org.apache.spark.api.java.JavaRDDLike
sql(String) - Method in class org.apache.spark.sql.api.java.JavaSQLContext: Executes a SQL query using Spark, returning the result as a SchemaRDD.
sql(String) - Method in class org.apache.spark.sql.hive.api.java.JavaHiveContext
sql() - Method in class org.apache.spark.sql.hive.execution.NativeCommand
sql(String) - Method in class org.apache.spark.sql.hive.HiveContext
sql(String) - Method in class org.apache.spark.sql.SQLContext: Executes a SQL query using Spark, returning the result as a SchemaRDD.
sqlContext() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
sqlContext() - Method in class org.apache.spark.sql.api.java.JavaSQLContext
sqlContext() - Method in class org.apache.spark.sql.hive.api.java.JavaHiveContext
sqlContext() - Method in class org.apache.spark.sql.SchemaRDD
SQLContext - Class in org.apache.spark.sql: :: AlphaComponent :: The entry point for running relational queries using Spark.
SQLContext(SparkContext) - Constructor for class org.apache.spark.sql.SQLContext
squaredDist(Vector) - Method in class org.apache.spark.util.Vector
SquaredL2Updater - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: Updater for L2 regularized problems.
SquaredL2Updater() - Constructor for class org.apache.spark.mllib.optimization.SquaredL2Updater
srdd() - Method in class org.apache.spark.api.java.JavaDoubleRDD
ssc() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
ssc() - Method in class org.apache.spark.streaming.dstream.DStream
stackTrace() - Method in class org.apache.spark.ExceptionFailure
stageAttemptId() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
stageAttemptId() - Method in class org.apache.spark.scheduler.SparkListenerTaskStart
stageFailed(String) - Method in class org.apache.spark.scheduler.StageInfo
stageId() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
stageId() - Method in class org.apache.spark.scheduler.SparkListenerTaskStart
stageId() - Method in class org.apache.spark.scheduler.StageInfo
stageId() - Method in class org.apache.spark.TaskContext
stageIds() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
stageIdToData() - Method in class org.apache.spark.ui.jobs.JobProgressListener
stageInfo() - Method in class org.apache.spark.scheduler.SparkListenerStageCompleted
stageInfo() - Method in class org.apache.spark.scheduler.SparkListenerStageSubmitted
StageInfo - Class in org.apache.spark.scheduler: :: DeveloperApi :: Stores information about a stage to pass from the scheduler to SparkListeners.
StageInfo(int, int, String, int, Seq<RDDInfo>, String) - Constructor for class org.apache.spark.scheduler.StageInfo
StandardNormalGenerator - Class in org.apache.spark.mllib.random: :: DeveloperApi :: Generates i.i.d.
StandardNormalGenerator() - Constructor for class org.apache.spark.mllib.random.StandardNormalGenerator
StandardScaler - Class in org.apache.spark.mllib.feature: :: Experimental :: Standardizes features by removing the mean and scaling to unit variance using column summary statistics on the samples in the training set.
StandardScaler(boolean, boolean) - Constructor for class org.apache.spark.mllib.feature.StandardScaler
StandardScaler() - Constructor for class org.apache.spark.mllib.feature.StandardScaler
StandardScalerModel - Class in org.apache.spark.mllib.feature: :: Experimental :: Represents a StandardScaler model that can transform vectors.
start() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Start the execution of the streams.
start() - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
start() - Method in class org.apache.spark.streaming.dstream.InputDStream: Method called to start receiving data.
start() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
start() - Method in class org.apache.spark.streaming.StreamingContext: Start the execution of the streams.
startTime() - Method in class org.apache.spark.api.java.JavaSparkContext
startTime() - Method in class org.apache.spark.SparkContext
StatCounter - Class in org.apache.spark.util: A class for tracking the statistics of a set of numbers (count, mean and variance) in a numerically robust way.
StatCounter(TraversableOnce<Object>) - Constructor for class org.apache.spark.util.StatCounter
StatCounter() - Constructor for class org.apache.spark.util.StatCounter: Initialize the StatCounter with no values.
state() - Method in class org.apache.spark.streaming.StreamingContext
statistic() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
statistic() - Method in interface org.apache.spark.mllib.stat.test.TestResult: Test statistic.
Statistics - Class in org.apache.spark.mllib.stat: API for statistical functions in MLlib.
Statistics() - Constructor for class org.apache.spark.mllib.stat.Statistics
statistics() - Method in class org.apache.spark.sql.execution.SparkLogicalPlan
Statistics - Class in org.apache.spark.streaming.receiver: :: DeveloperApi :: Statistics for querying the supervisor about state of workers.
Statistics(int, int, int, String) - Constructor for class org.apache.spark.streaming.receiver.Statistics
stats() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a StatCounter object that captures the mean, variance and count of the RDD's elements in one operation.
stats() - Method in class org.apache.spark.mllib.tree.model.Node
stats() - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Return a StatCounter object that captures the mean, variance and count of the RDD's elements in one operation.
StatsReportListener - Class in org.apache.spark.scheduler: :: DeveloperApi :: Simple SparkListener that logs a few summary statistics when each stage completes
StatsReportListener() - Constructor for class org.apache.spark.scheduler.StatsReportListener
StatsReportListener - Class in org.apache.spark.streaming.scheduler: :: DeveloperApi :: A simple StreamingListener that logs summary statistics across Spark Streaming batches
StatsReportListener(int) - Constructor for class org.apache.spark.streaming.scheduler.StatsReportListener
status() - Method in class org.apache.spark.scheduler.TaskInfo
stdev() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Compute the standard deviation of this RDD's elements.
stdev() - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Compute the standard deviation of this RDD's elements.
stdev() - Method in class org.apache.spark.util.StatCounter: Return the standard deviation of the values.
stop() - Method in class org.apache.spark.api.java.JavaSparkContext: Shut down the SparkContext.
stop() - Method in interface org.apache.spark.broadcast.BroadcastFactory
stop() - Method in class org.apache.spark.broadcast.HttpBroadcastFactory
stop() - Method in class org.apache.spark.broadcast.TorrentBroadcastFactory
stop() - Method in class org.apache.spark.SparkContext: Shut down the SparkContext.
stop() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Stop the execution of the streams.
stop(boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Stop the execution of the streams.
stop(boolean, boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Stop the execution of the streams.
stop() - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
stop() - Method in class org.apache.spark.streaming.dstream.InputDStream: Method called to stop receiving data.
stop() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
stop(String) - Method in class org.apache.spark.streaming.receiver.Receiver: Stop the receiver completely.
stop(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver: Stop the receiver completely due to an exception
stop(boolean) - Method in class org.apache.spark.streaming.StreamingContext: Stop the execution of the streams immediately (does not wait for all received data to be processed).
stop(boolean, boolean) - Method in class org.apache.spark.streaming.StreamingContext: Stop the execution of the streams, with the option of ensuring all received data has been processed.
storageLevel() - Method in class org.apache.spark.storage.BlockStatus
storageLevel() - Method in class org.apache.spark.storage.RDDInfo
StorageLevel - Class in org.apache.spark.storage: :: DeveloperApi :: Flags for controlling the storage of an RDD.
StorageLevel() - Constructor for class org.apache.spark.storage.StorageLevel
storageLevel() - Method in class org.apache.spark.streaming.dstream.DStream
storageLevel() - Method in class org.apache.spark.streaming.receiver.Receiver
storageLevelCache() - Static method in class org.apache.spark.storage.StorageLevel: :: DeveloperApi :: Read StorageLevel object from ObjectInput stream.
StorageLevels - Class in org.apache.spark.api.java: Expose some commonly useful storage level constants.
StorageLevels() - Constructor for class org.apache.spark.api.java.StorageLevels
StorageListener - Class in org.apache.spark.ui.storage: :: DeveloperApi :: A SparkListener that prepares information to be displayed on the BlockManagerUI.
StorageListener(StorageStatusListener) - Constructor for class org.apache.spark.ui.storage.StorageListener
StorageStatus - Class in org.apache.spark.storage: :: DeveloperApi :: Storage information for each BlockManager.
StorageStatus(BlockManagerId, long) - Constructor for class org.apache.spark.storage.StorageStatus
StorageStatus(BlockManagerId, long, Map<BlockId, BlockStatus>) - Constructor for class org.apache.spark.storage.StorageStatus: Create a storage status with an initial set of blocks, leaving the source unmodified.
storageStatusList() - Method in class org.apache.spark.storage.StorageStatusListener
storageStatusList() - Method in class org.apache.spark.ui.exec.ExecutorsListener
storageStatusList() - Method in class org.apache.spark.ui.storage.StorageListener
StorageStatusListener - Class in org.apache.spark.storage: :: DeveloperApi :: A SparkListener that maintains executor storage status.
StorageStatusListener() - Constructor for class org.apache.spark.storage.StorageStatusListener
store(Iterator<T>) - Method in interface org.apache.spark.streaming.receiver.ActorHelper: Store an iterator of received data as a data block into Spark's memory.
store(ByteBuffer) - Method in interface org.apache.spark.streaming.receiver.ActorHelper: Store the bytes of received data as a data block into Spark's memory.
store(T) - Method in interface org.apache.spark.streaming.receiver.ActorHelper: Store a single item of received data to Spark's memory.
store(T) - Method in class org.apache.spark.streaming.receiver.Receiver: Store a single item of received data to Spark's memory.
store(ArrayBuffer<T>) - Method in class org.apache.spark.streaming.receiver.Receiver: Store an ArrayBuffer of received data as a data block into Spark's memory.
store(ArrayBuffer<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver: Store an ArrayBuffer of received data as a data block into Spark's memory.
store(Iterator<T>) - Method in class org.apache.spark.streaming.receiver.Receiver: Store an iterator of received data as a data block into Spark's memory.
store(Iterator<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver: Store an iterator of received data as a data block into Spark's memory.
store(Iterator<T>) - Method in class org.apache.spark.streaming.receiver.Receiver: Store an iterator of received data as a data block into Spark's memory.
store(Iterator<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver: Store an iterator of received data as a data block into Spark's memory.
store(ByteBuffer) - Method in class org.apache.spark.streaming.receiver.Receiver: Store the bytes of received data as a data block into Spark's memory.
store(ByteBuffer, Object) - Method in class org.apache.spark.streaming.receiver.Receiver: Store the bytes of received data as a data block into Spark's memory.
Strategy - Class in org.apache.spark.mllib.tree.configuration: :: Experimental :: Stores all the configuration options for tree construction
Strategy(Enumeration.Value, Impurity, int, int, int, Enumeration.Value, Map<Object, Object>, int) - Constructor for class org.apache.spark.mllib.tree.configuration.Strategy
Strategy(Enumeration.Value, Impurity, int, int, int, Map<Integer, Integer>) - Constructor for class org.apache.spark.mllib.tree.configuration.Strategy: Java-friendly constructor for Strategy
STREAM() - Static method in class org.apache.spark.storage.BlockId
StreamBlockId - Class in org.apache.spark.storage
StreamBlockId(int, long) - Constructor for class org.apache.spark.storage.StreamBlockId
streamed() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin: BuildRight means the right relation <=> the broadcast relation.
streamed() - Method in class org.apache.spark.sql.execution.LeftSemiJoinBNL
streamedKeys() - Method in interface org.apache.spark.sql.execution.HashJoin
streamedPlan() - Method in interface org.apache.spark.sql.execution.HashJoin
streamId() - Method in class org.apache.spark.storage.StreamBlockId
streamId() - Method in class org.apache.spark.streaming.receiver.Receiver: Get the unique identifier of the receiver input stream that this receiver is associated with.
streamId() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
StreamingContext - Class in org.apache.spark.streaming: Main entry point for Spark Streaming functionality.
StreamingContext(SparkContext, Duration) - Constructor for class org.apache.spark.streaming.StreamingContext: Create a StreamingContext using an existing SparkContext.
StreamingContext(SparkConf, Duration) - Constructor for class org.apache.spark.streaming.StreamingContext: Create a StreamingContext by providing the configuration necessary for a new SparkContext.
StreamingContext(String, String, Duration, String, Seq<String>, Map<String, String>) - Constructor for class org.apache.spark.streaming.StreamingContext: Create a StreamingContext by providing the details necessary for creating a new SparkContext.
StreamingContext(String, Configuration) - Constructor for class org.apache.spark.streaming.StreamingContext: Recreate a StreamingContext from a checkpoint file.
StreamingContext(String) - Constructor for class org.apache.spark.streaming.StreamingContext: Recreate a StreamingContext from a checkpoint file.
StreamingContextState() - Method in class org.apache.spark.streaming.StreamingContext: Accessor for nested Scala object
StreamingLinearAlgorithm<M extends GeneralizedLinearModel,A extends GeneralizedLinearAlgorithm<M>> - Class in org.apache.spark.mllib.regression: :: DeveloperApi :: StreamingLinearAlgorithm implements methods for continuously training a generalized linear model on streaming data, and using it for prediction on (possibly different) streaming data.
StreamingLinearAlgorithm() - Constructor for class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
StreamingLinearRegressionWithSGD - Class in org.apache.spark.mllib.regression: Train or predict a linear regression model on streaming data.
StreamingLinearRegressionWithSGD(double, int, double, Vector) - Constructor for class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
StreamingLinearRegressionWithSGD() - Constructor for class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD: Construct a StreamingLinearRegression object with default parameters: {stepSize: 0.1, numIterations: 50, miniBatchFraction: 1.0}.
StreamingListener - Interface in org.apache.spark.streaming.scheduler: :: DeveloperApi :: A listener interface for receiving information about an ongoing streaming computation.
StreamingListenerBatchCompleted - Class in org.apache.spark.streaming.scheduler
StreamingListenerBatchCompleted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchCompleted
StreamingListenerBatchStarted - Class in org.apache.spark.streaming.scheduler
StreamingListenerBatchStarted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchStarted
StreamingListenerBatchSubmitted - Class in org.apache.spark.streaming.scheduler
StreamingListenerBatchSubmitted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchSubmitted
StreamingListenerEvent - Interface in org.apache.spark.streaming.scheduler: :: DeveloperApi :: Base trait for events related to StreamingListener
StreamingListenerReceiverError - Class in org.apache.spark.streaming.scheduler
StreamingListenerReceiverError(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverError
StreamingListenerReceiverStarted - Class in org.apache.spark.streaming.scheduler
StreamingListenerReceiverStarted(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStarted
StreamingListenerReceiverStopped - Class in org.apache.spark.streaming.scheduler
StreamingListenerReceiverStopped(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStopped
streamSideKeyGenerator() - Method in interface org.apache.spark.sql.execution.HashJoin
stringToText(String) - Static method in class org.apache.spark.SparkContext
StringType - Static variable in class org.apache.spark.sql.api.java.DataType: Gets the StringType object.
StringType - Class in org.apache.spark.sql.api.java: The data type representing String values.
stringWritableConverter() - Static method in class org.apache.spark.SparkContext
StructField - Class in org.apache.spark.sql.api.java: A StructField object represents a field in a StructType object.
StructType - Class in org.apache.spark.sql.api.java: The data type representing Rows.
submissionTime() - Method in class org.apache.spark.scheduler.StageInfo: When this stage was submitted from the DAGScheduler to a TaskScheduler.
submissionTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
submitJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, Function2<Object, U, BoxedUnit>, Function0<R>) - Method in class org.apache.spark.SparkContext: :: Experimental :: Submit a job for execution and return a FutureAction holding the result.
subtract(JavaDoubleRDD) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaDoubleRDD, int) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaDoubleRDD, Partitioner) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaPairRDD<K, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaPairRDD<K, V>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaPairRDD<K, V>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaRDD<T>) - Method in class org.apache.spark.api.java.JavaRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaRDD<T>, int) - Method in class org.apache.spark.api.java.JavaRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaRDD<T>, Partitioner) - Method in class org.apache.spark.api.java.JavaRDD: Return an RDD with the elements from this that are not in other.
subtract(RDD<T>) - Method in class org.apache.spark.rdd.RDD: Return an RDD with the elements from this that are not in other.
subtract(RDD<T>, int) - Method in class org.apache.spark.rdd.RDD: Return an RDD with the elements from this that are not in other.
subtract(RDD<T>, Partitioner, Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Return an RDD with the elements from this that are not in other.
subtract(JavaSchemaRDD) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaSchemaRDD, int) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaSchemaRDD, Partitioner) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD: Return an RDD with the elements from this that are not in other.
subtract(RDD<Row>) - Method in class org.apache.spark.sql.SchemaRDD
subtract(RDD<Row>, int) - Method in class org.apache.spark.sql.SchemaRDD
subtract(RDD<Row>, Partitioner, Ordering<Row>) - Method in class org.apache.spark.sql.SchemaRDD
subtract(Vector) - Method in class org.apache.spark.util.Vector
subtractByKey(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD with the pairs from this whose keys are not in other.
subtractByKey(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD with the pairs from `this` whose keys are not in `other`.
subtractByKey(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD with the pairs from `this` whose keys are not in `other`.
subtractByKey(RDD<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return an RDD with the pairs from this whose keys are not in other.
subtractByKey(RDD<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return an RDD with the pairs from `this` whose keys are not in `other`.
subtractByKey(RDD<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return an RDD with the pairs from `this` whose keys are not in `other`.
Success - Class in org.apache.spark: :: DeveloperApi :: Task succeeded.
Success() - Constructor for class org.apache.spark.Success
successful() - Method in class org.apache.spark.scheduler.TaskInfo
sum() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Add up the elements in this RDD.
sum() - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Add up the elements in this RDD.
sum() - Method in class org.apache.spark.util.StatCounter
sum() - Method in class org.apache.spark.util.Vector
sumApprox(long, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD: :: Experimental :: Approximate operation to return the sum within a timeout.
sumApprox(long) - Method in class org.apache.spark.api.java.JavaDoubleRDD: :: Experimental :: Approximate operation to return the sum within a timeout.
sumApprox(long, double) - Method in class org.apache.spark.rdd.DoubleRDDFunctions: :: Experimental :: Approximate operation to return the sum within a timeout.
SVMDataGenerator - Class in org.apache.spark.mllib.util: :: DeveloperApi :: Generate sample data used for SVM.
SVMDataGenerator() - Constructor for class org.apache.spark.mllib.util.SVMDataGenerator
SVMModel - Class in org.apache.spark.mllib.classification: Model for Support Vector Machines (SVMs).
SVMModel(Vector, double) - Constructor for class org.apache.spark.mllib.classification.SVMModel
SVMWithSGD - Class in org.apache.spark.mllib.classification: Train a Support Vector Machine (SVM) using Stochastic Gradient Descent.
SVMWithSGD() - Constructor for class org.apache.spark.mllib.classification.SVMWithSGD: Construct an SVM object with default parameters.
systemProperties() - Method in class org.apache.spark.ui.env.EnvironmentListener

T

t() - Method in class org.apache.spark.SerializableWritable
table() - Method in class org.apache.spark.sql.hive.execution.DescribeHiveTableCommand
table() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
table(String) - Method in class org.apache.spark.sql.SQLContext: Returns the specified table as a SchemaRDD
tableName() - Method in class org.apache.spark.sql.execution.CacheCommand
tableName() - Method in class org.apache.spark.sql.hive.execution.AnalyzeTable
tableName() - Method in class org.apache.spark.sql.hive.execution.DropTable
tachyonFolderName() - Method in class org.apache.spark.SparkContext
tachyonSize() - Method in class org.apache.spark.storage.BlockStatus
tachyonSize() - Method in class org.apache.spark.storage.RDDInfo
take(int) - Method in interface org.apache.spark.api.java.JavaRDDLike: Take the first num elements of the RDD.
take(int) - Method in class org.apache.spark.rdd.RDD: Take the first num elements of the RDD.
take(int) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
take(int) - Method in class org.apache.spark.sql.SchemaRDD
takeAsync(int) - Method in class org.apache.spark.rdd.AsyncRDDActions: Returns a future for retrieving the first num elements of the RDD.
takeOrdered(int, Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Returns the first K elements from this RDD as defined by the specified Comparator[T] and maintains the order.
takeOrdered(int) - Method in interface org.apache.spark.api.java.JavaRDDLike: Returns the first K elements from this RDD using the natural ordering for T while maintaining the order.
takeOrdered(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Returns the first K (smallest) elements from this RDD as defined by the specified implicit Ordering[T] and maintains the ordering.
TakeOrdered - Class in org.apache.spark.sql.execution: :: DeveloperApi :: Take the first limit elements as defined by the sortOrder.
TakeOrdered(int, Seq<SortOrder>, SparkPlan) - Constructor for class org.apache.spark.sql.execution.TakeOrdered
takeSample(boolean, int) - Method in interface org.apache.spark.api.java.JavaRDDLike
takeSample(boolean, int, long) - Method in interface org.apache.spark.api.java.JavaRDDLike
takeSample(boolean, int, long) - Method in class org.apache.spark.rdd.RDD: Return a fixed-size sampled subset of this RDD in an array
TaskCompletionListener - Interface in org.apache.spark.util: :: DeveloperApi ::
TaskContext - Class in org.apache.spark: :: DeveloperApi :: Contextual information about a task which can be read or mutated during execution.
TaskContext(int, int, long, boolean, TaskMetrics) - Constructor for class org.apache.spark.TaskContext
TaskEndReason - Interface in org.apache.spark: :: DeveloperApi :: Various possible reasons why a task ended.
TaskFailedReason - Interface in org.apache.spark: :: DeveloperApi :: Various possible reasons why a task failed.
taskId() - Method in class org.apache.spark.scheduler.TaskInfo
taskId() - Method in class org.apache.spark.storage.TaskResultBlockId
taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskGettingResult
taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskStart
TaskInfo - Class in org.apache.spark.scheduler: :: DeveloperApi :: Information about a running task attempt inside a TaskSet.
TaskInfo(long, int, int, long, String, String, Enumeration.Value, boolean) - Constructor for class org.apache.spark.scheduler.TaskInfo
TaskKilled - Class in org.apache.spark: :: DeveloperApi :: Task was killed intentionally and needs to be rescheduled.
TaskKilled() - Constructor for class org.apache.spark.TaskKilled
TaskKilledException - Exception in org.apache.spark: :: DeveloperApi :: Exception thrown when a task is explicitly killed (i.e., task failure is expected).
TaskKilledException() - Constructor for exception org.apache.spark.TaskKilledException
taskLocality() - Method in class org.apache.spark.scheduler.TaskInfo
TaskLocality - Class in org.apache.spark.scheduler
TaskLocality() - Constructor for class org.apache.spark.scheduler.TaskLocality
taskMetrics() - Method in class org.apache.spark.scheduler.SparkListenerExecutorMetricsUpdate
taskMetrics() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
taskMetrics() - Method in class org.apache.spark.TaskContext
TASKRESULT() - Static method in class org.apache.spark.storage.BlockId
TaskResultBlockId - Class in org.apache.spark.storage
TaskResultBlockId(long) - Constructor for class org.apache.spark.storage.TaskResultBlockId
TaskResultLost - Class in org.apache.spark: :: DeveloperApi :: The task finished successfully, but the result was lost from the executor's block manager before it was fetched.
TaskResultLost() - Constructor for class org.apache.spark.TaskResultLost
taskScheduler() - Method in class org.apache.spark.SparkContext
taskType() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
TEST() - Static method in class org.apache.spark.storage.BlockId
TestHive - Class in org.apache.spark.sql.hive.test
TestHive() - Constructor for class org.apache.spark.sql.hive.test.TestHive
TestHiveContext - Class in org.apache.spark.sql.hive.test: A locally running test instance of Spark's Hive execution engine.
TestHiveContext(SparkContext) - Constructor for class org.apache.spark.sql.hive.test.TestHiveContext
TestHiveContext.QueryExecution - Class in org.apache.spark.sql.hive.test: Override QueryExecution with special debug workflow.
TestHiveContext.QueryExecution() - Constructor for class org.apache.spark.sql.hive.test.TestHiveContext.QueryExecution
TestHiveContext.TestTable - Class in org.apache.spark.sql.hive.test
TestHiveContext.TestTable(String, Seq<Function0<BoxedUnit>>) - Constructor for class org.apache.spark.sql.hive.test.TestHiveContext.TestTable
TestResult<DF> - Interface in org.apache.spark.mllib.stat.test: :: Experimental :: Trait for hypothesis test results.
TestSQLContext - Class in org.apache.spark.sql.test: A SQLContext that can be used for local testing.
TestSQLContext() - Constructor for class org.apache.spark.sql.test.TestSQLContext
testTables() - Method in class org.apache.spark.sql.hive.test.TestHiveContext: A list of test tables and the DDL required to initialize them.
testTempDir() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
textFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Read a text file from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI, and return it as an RDD of Strings.
textFile(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Read a text file from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI, and return it as an RDD of Strings.
textFile(String, int) - Method in class org.apache.spark.SparkContext: Read a text file from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI, and return it as an RDD of Strings.
textFileStream(String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them as text files (using key as LongWritable, value as Text and input format as TextInputFormat).
textFileStream(String) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them as text files (using key as LongWritable, value as Text and input format as TextInputFormat).
theta() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
threshold() - Method in class org.apache.spark.mllib.tree.model.Split
thresholds() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Returns thresholds in descending order.
time() - Method in class org.apache.spark.scheduler.SparkListenerApplicationEnd
time() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
Time - Class in org.apache.spark.streaming: This is a simple class that represents an absolute instant of time.
Time(long) - Constructor for class org.apache.spark.streaming.Time
TimestampType - Static variable in class org.apache.spark.sql.api.java.DataType: Gets the TimestampType object.
TimestampType - Class in org.apache.spark.sql.api.java: The data type representing java.sql.Timestamp values.
to(Time, Duration) - Method in class org.apache.spark.streaming.Time
toArray() - Method in interface org.apache.spark.api.java.JavaRDDLike: Deprecated. As of Spark 1.0.0, toArray() is deprecated; use JavaRDDLike.collect() instead.
toArray() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
toArray() - Method in class org.apache.spark.mllib.linalg.DenseVector
toArray() - Method in interface org.apache.spark.mllib.linalg.Matrix: Converts to a dense array in column-major order.
toArray() - Method in class org.apache.spark.mllib.linalg.SparseVector
toArray() - Method in interface org.apache.spark.mllib.linalg.Vector: Converts the instance to a double array.
toArray() - Method in class org.apache.spark.rdd.RDD: Return an array that contains all of the elements in this RDD.
toBreeze() - Method in interface org.apache.spark.mllib.linalg.distributed.DistributedMatrix: Collects data and assembles a local dense breeze matrix (for test only).
toBreeze() - Method in interface org.apache.spark.mllib.linalg.Matrix: Converts to a breeze matrix.
toBreeze() - Method in interface org.apache.spark.mllib.linalg.Vector: Converts the instance to a breeze vector.
toDataType(String) - Static method in class org.apache.spark.sql.hive.HiveMetastoreTypes
toDebugString() - Method in interface org.apache.spark.api.java.JavaRDDLike: A description of this RDD and its recursive dependencies for debugging.
toDebugString() - Method in class org.apache.spark.rdd.RDD: A description of this RDD and its recursive dependencies for debugging.
toDebugString() - Method in class org.apache.spark.SparkConf: Return a string listing all keys and values, one per line.
toErrorString() - Method in class org.apache.spark.ExceptionFailure
toErrorString() - Static method in class org.apache.spark.ExecutorLostFailure
toErrorString() - Method in class org.apache.spark.FetchFailed
toErrorString() - Static method in class org.apache.spark.Resubmitted
toErrorString() - Method in interface org.apache.spark.TaskFailedReason: Error message displayed in the web UI.
toErrorString() - Static method in class org.apache.spark.TaskKilled
toErrorString() - Static method in class org.apache.spark.TaskResultLost
toErrorString() - Static method in class org.apache.spark.UnknownReason
toFormattedString() - Method in class org.apache.spark.streaming.Duration
toIndexedRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix: Converts to IndexedRowMatrix.
toInt() - Method in class org.apache.spark.storage.StorageLevel
toJavaDStream() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Convert to a JavaDStream
toJavaRDD() - Method in class org.apache.spark.rdd.RDD
toJavaSchemaRDD() - Method in class org.apache.spark.sql.SchemaRDD: Returns this RDD as a JavaSchemaRDD.
toLocalIterator() - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an iterator that contains all of the elements in this RDD.
toLocalIterator() - Method in class org.apache.spark.rdd.RDD: Return an iterator that contains all of the elements in this RDD.
toMetastoreType(DataType) - Static method in class org.apache.spark.sql.hive.HiveMetastoreTypes
top(int, Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Returns the top K elements from this RDD as defined by the specified Comparator[T].
top(int) - Method in interface org.apache.spark.api.java.JavaRDDLike: Returns the top K elements from this RDD using the natural ordering for T.
top(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
toPairDStreamFunctions(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Static method in class org.apache.spark.streaming.StreamingContext
topNode() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
toRDD(JavaDoubleRDD) - Static method in class org.apache.spark.api.java.JavaDoubleRDD
toRDD(JavaPairRDD<K, V>) - Static method in class org.apache.spark.api.java.JavaPairRDD
toRDD(JavaRDD<T>) - Static method in class org.apache.spark.api.java.JavaRDD
toRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix: Converts to RowMatrix, dropping row indices after grouping by row index.
toRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix: Drops row indices and converts this matrix to a RowMatrix.
TorrentBroadcastFactory - Class in org.apache.spark.broadcast: A Broadcast implementation that uses a BitTorrent-like protocol to do a distributed transfer of the broadcasted data to the executors.
TorrentBroadcastFactory() - Constructor for class org.apache.spark.broadcast.TorrentBroadcastFactory
toSchemaRDD() - Method in class org.apache.spark.sql.SchemaRDD: Returns this RDD as a SchemaRDD.
toSparkContext(JavaSparkContext) - Static method in class org.apache.spark.api.java.JavaSparkContext
toSplitInfo(Class<?>, String, InputSplit) - Static method in class org.apache.spark.scheduler.SplitInfo
toSplitInfo(Class<?>, String, InputSplit) - Static method in class org.apache.spark.scheduler.SplitInfo
toString() - Method in class org.apache.spark.Accumulable
toString() - Method in class org.apache.spark.api.java.JavaRDD
toString() - Method in class org.apache.spark.broadcast.Broadcast
toString() - Method in class org.apache.spark.mllib.linalg.DenseVector
toString() - Method in interface org.apache.spark.mllib.linalg.Matrix
toString() - Method in class org.apache.spark.mllib.linalg.SparseVector
toString() - Method in class org.apache.spark.mllib.regression.LabeledPoint
toString() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
toString() - Method in interface org.apache.spark.mllib.stat.test.TestResult: String explaining the hypothesis test result.
toString() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel: Print full model.
toString() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
toString() - Method in class org.apache.spark.mllib.tree.model.Node
toString() - Method in class org.apache.spark.mllib.tree.model.Split
toString() - Method in class org.apache.spark.partial.BoundedDouble
toString() - Method in class org.apache.spark.partial.PartialResult
toString() - Method in class org.apache.spark.rdd.RDD
toString() - Method in class org.apache.spark.scheduler.InputFormatInfo
toString() - Method in class org.apache.spark.scheduler.SplitInfo
toString() - Method in class org.apache.spark.SerializableWritable
toString() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
toString() - Method in class org.apache.spark.storage.BlockId
toString() - Method in class org.apache.spark.storage.BlockManagerId
toString() - Method in class org.apache.spark.storage.RDDInfo
toString() - Method in class org.apache.spark.storage.StorageLevel
toString() - Method in class org.apache.spark.streaming.Duration
toString() - Method in class org.apache.spark.streaming.Time
toString() - Method in class org.apache.spark.util.MutablePair
toString() - Method in class org.apache.spark.util.StatCounter
toString() - Method in class org.apache.spark.util.Vector
totalDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo: Time taken for all the jobs of this batch to finish processing from the time they were submitted.
train(RDD<LabeledPoint>, int, double, double, Vector) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD: Train a logistic regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD: Train a logistic regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD: Train a logistic regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD: Train a logistic regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.classification.NaiveBayes: Trains a Naive Bayes model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, double) - Static method in class org.apache.spark.mllib.classification.NaiveBayes: Trains a Naive Bayes model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD: Train an SVM model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD: Train an SVM model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD: Train an SVM model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD: Train an SVM model given an RDD of (label, features) pairs.
train(RDD<Vector>, int, int, int, String) - Static method in class org.apache.spark.mllib.clustering.KMeans: Trains a k-means model using the given set of parameters.
train(RDD<Vector>, int, int) - Static method in class org.apache.spark.mllib.clustering.KMeans: Trains a k-means model using the specified parameters and default values for the unspecified ones.
train(RDD<Vector>, int, int, int) - Static method in class org.apache.spark.mllib.clustering.KMeans: Trains a k-means model using the specified parameters and default values for the unspecified ones.
train(RDD<Rating>, int, int, double, int, long) - Static method in class org.apache.spark.mllib.recommendation.ALS: Train a matrix factorization model given an RDD of ratings given by users to some products, in the form of (userID, productID, rating) pairs.
train(RDD<Rating>, int, int, double, int) - Static method in class org.apache.spark.mllib.recommendation.ALS: Train a matrix factorization model given an RDD of ratings given by users to some products, in the form of (userID, productID, rating) pairs.
train(RDD<Rating>, int, int, double) - Static method in class org.apache.spark.mllib.recommendation.ALS: Train a matrix factorization model given an RDD of ratings given by users to some products, in the form of (userID, productID, rating) pairs.
train(RDD<Rating>, int, int) - Static method in class org.apache.spark.mllib.recommendation.ALS: Train a matrix factorization model given an RDD of ratings given by users to some products, in the form of (userID, productID, rating) pairs.
train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD: Train a Lasso model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD: Train a Lasso model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD: Train a Lasso model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD: Train a Lasso model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD: Train a LinearRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD: Train a LinearRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD: Train a LinearRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD: Train a LinearRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD: Train a RidgeRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD: Train a RidgeRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD: Train a RidgeRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD: Train a RidgeRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.DecisionTree: Method to train a decision tree model over an RDD
trainClassifier(RDD<LabeledPoint>, int, Map<Object, Object>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree: Method to train a decision tree model for binary or multiclass classification.
trainClassifier(JavaRDD<LabeledPoint>, int, Map<Integer, Integer>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree: Java-friendly API for DecisionTree$.trainClassifier(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, int, scala.collection.immutable.Map<java.lang.Object, java.lang.Object>, java.lang.String, int, int)
trainImplicit(RDD<Rating>, int, int, double, int, double, long) - Static method in class org.apache.spark.mllib.recommendation.ALS: Train a matrix factorization model given an RDD of 'implicit preferences' given by users to some products, in the form of (userID, productID, preference) pairs.
trainImplicit(RDD<Rating>, int, int, double, int, double) - Static method in class org.apache.spark.mllib.recommendation.ALS: Train a matrix factorization model given an RDD of 'implicit preferences' given by users to some products, in the form of (userID, productID, preference) pairs.
trainImplicit(RDD<Rating>, int, int, double, double) - Static method in class org.apache.spark.mllib.recommendation.ALS: Train a matrix factorization model given an RDD of 'implicit preferences' given by users to some products, in the form of (userID, productID, preference) pairs.
trainImplicit(RDD<Rating>, int, int) - Static method in class org.apache.spark.mllib.recommendation.ALS: Train a matrix factorization model given an RDD of 'implicit preferences' given by users to some products, in the form of (userID, productID, preference) pairs.
trainOn(DStream<LabeledPoint>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm: Update the model by training on batches of data from a DStream.
trainRegressor(RDD<LabeledPoint>, Map<Object, Object>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree: Method to train a decision tree model for regression.
trainRegressor(JavaRDD<LabeledPoint>, Map<Integer, Integer>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree: Java-friendly API for DecisionTree$.trainRegressor(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, scala.collection.immutable.Map<java.lang.Object, java.lang.Object>, java.lang.String, int, int)
transform(Iterable<Object>) - Method in class org.apache.spark.mllib.feature.HashingTF: Transforms the input document into a sparse term frequency vector.
transform(Iterable<?>) - Method in class org.apache.spark.mllib.feature.HashingTF: Transforms the input document into a sparse term frequency vector (Java version).
transform(RDD<D>) - Method in class org.apache.spark.mllib.feature.HashingTF: Transforms the input document to term frequency vectors.
transform(JavaRDD<D>) - Method in class org.apache.spark.mllib.feature.HashingTF: Transforms the input document to term frequency vectors (Java version).
transform(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDFModel: Transforms term frequency (TF) vectors to TF-IDF vectors.
transform(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDFModel: Transforms term frequency (TF) vectors to TF-IDF vectors (Java version).
transform(Vector) - Method in class org.apache.spark.mllib.feature.Normalizer: Applies unit length normalization on a vector.
transform(Vector) - Method in class org.apache.spark.mllib.feature.StandardScalerModel: Applies standardization transformation on a vector.
transform(Vector) - Method in interface org.apache.spark.mllib.feature.VectorTransformer: Applies transformation on a vector.
transform(RDD<Vector>) - Method in interface org.apache.spark.mllib.feature.VectorTransformer: Applies transformation on an RDD[Vector].
transform(String) - Method in class org.apache.spark.mllib.feature.Word2VecModel: Transforms a word to its vector representation
transform(Function<R, JavaRDD>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transform(Function2<R, Time, JavaRDD>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transform(List<JavaDStream<?>>, Function2<List<JavaRDD<?>>, Time, JavaRDD<T>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a new DStream in which each RDD is generated by applying a function on RDDs of the DStreams.
transform(Function1<RDD<T>, RDD>, ClassTag) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transform(Function2<RDD<T>, Time, RDD>, ClassTag) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transform(Seq<DStream<?>>, Function2<Seq<RDD<?>>, Time, RDD<T>>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext: Create a new DStream in which each RDD is generated by applying a function on RDDs of the DStreams.
transformToPair(Function<R, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transformToPair(Function2<R, Time, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transformToPair(List<JavaDStream<?>>, Function2<List<JavaRDD<?>>, Time, JavaPairRDD<K, V>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a new DStream in which each RDD is generated by applying a function on RDDs of the DStreams.
transformWith(JavaDStream, Function3<R, JavaRDD, Time, JavaRDD<W>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWith(JavaPairDStream<K2, V2>, Function3<R, JavaPairRDD<K2, V2>, Time, JavaRDD<W>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWith(DStream, Function2<RDD<T>, RDD, RDD<V>>, ClassTag, ClassTag<V>) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWith(DStream, Function3<RDD<T>, RDD, Time, RDD<V>>, ClassTag, ClassTag<V>) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWithToPair(JavaDStream, Function3<R, JavaRDD, Time, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWithToPair(JavaPairDStream<K2, V2>, Function3<R, JavaPairRDD<K2, V2>, Time, JavaPairRDD<K3, V3>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
truePositiveRate(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns true positive rate for a given label (category)
TwitterUtils - Class in org.apache.spark.streaming.twitter
TwitterUtils() - Constructor for class org.apache.spark.streaming.twitter.TwitterUtils
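
The HashingTF transform entries above describe turning a document into a sparse term frequency vector. The following is a minimal plain-Java sketch of that idea, not Spark's implementation: it assumes each term is mapped to the index given by the non-negative remainder of its hashCode modulo the number of features, and the class and method names (HashingTfSketch, indexOf, transform) are hypothetical. A sorted map stands in for the sparse vector.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of the hashing trick; not part of Spark.
final class HashingTfSketch {

    // Assumed mapping: non-negative remainder of the term's hash code.
    static int indexOf(Object term, int numFeatures) {
        return Math.floorMod(term.hashCode(), numFeatures);
    }

    // Counts how often each hashed index occurs in the document.
    // Only indices that actually occur are stored (sparse representation).
    static Map<Integer, Double> transform(Iterable<?> document, int numFeatures) {
        if (numFeatures <= 0) {
            throw new IllegalArgumentException("numFeatures must be positive");
        }
        Map<Integer, Double> termFrequencies = new TreeMap<>();
        for (Object term : document) {
            termFrequencies.merge(indexOf(term, numFeatures), 1.0, Double::sum);
        }
        return termFrequencies;
    }

    public static void main(String[] args) {
        List<String> document = Arrays.asList("a", "b", "a", "c");
        // Prints the index -> count map for a 16-dimensional feature space.
        System.out.println(transform(document, 16));
    }
}
```

Distinct terms can collide on the same index, so a larger feature count trades memory for fewer collisions.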

U

U() - Method in class org.apache.spark.mllib.linalg.SingularValueDecomposition
udf() - Method in class org.apache.spark.sql.execution.BatchPythonEvaluation
udf() - Method in class org.apache.spark.sql.execution.EvaluatePython
UDF1<T1,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 1 argument.
UDF10<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 10 arguments.
UDF11<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 11 arguments.
UDF12<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 12 arguments.
UDF13<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 13 arguments.
UDF14<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 14 arguments.
UDF15<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 15 arguments.
UDF16<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 16 arguments.
UDF17<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 17 arguments.
UDF18<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 18 arguments.
UDF19<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 19 arguments.
UDF2<T1,T2,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 2 arguments.
UDF20<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,T20,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 20 arguments.
UDF21<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,T20,T21,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 21 arguments.
UDF22<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,T20,T21,T22,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 22 arguments.
UDF3<T1,T2,T3,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 3 arguments.
UDF4<T1,T2,T3,T4,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 4 arguments.
UDF5<T1,T2,T3,T4,T5,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 5 arguments.
UDF6<T1,T2,T3,T4,T5,T6,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 6 arguments.
UDF7<T1,T2,T3,T4,T5,T6,T7,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 7 arguments.
UDF8<T1,T2,T3,T4,T5,T6,T7,T8,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 8 arguments.
UDF9<T1,T2,T3,T4,T5,T6,T7,T8,T9,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 9 arguments.
ui() - Method in class org.apache.spark.SparkContext
uiTab() - Method in class org.apache.spark.streaming.StreamingContext
unbound() - Method in class org.apache.spark.sql.execution.Aggregate.ComputedAggregate
unbroadcast(long, boolean, boolean) - Method in interface org.apache.spark.broadcast.BroadcastFactory
unbroadcast(long, boolean, boolean) - Method in class org.apache.spark.broadcast.HttpBroadcastFactory: Remove all persisted state associated with the HTTP broadcast with the given ID.
unbroadcast(long, boolean, boolean) - Method in class org.apache.spark.broadcast.TorrentBroadcastFactory: Remove all persisted state associated with the torrent broadcast with the given ID.
uncacheTable(String) - Method in class org.apache.spark.sql.SQLContext: Removes the specified table from the in-memory cache.
underlyingSplit() - Method in class org.apache.spark.scheduler.SplitInfo
UniformGenerator - Class in org.apache.spark.mllib.random: :: DeveloperApi :: Generates i.i.d. samples from U[0.0, 1.0].
UniformGenerator() - Constructor for class org.apache.spark.mllib.random.UniformGenerator
uniformJavaRDD(JavaSparkContext, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Java-friendly version of RandomRDDs.uniformRDD(org.apache.spark.SparkContext, long, int, long).
uniformJavaRDD(JavaSparkContext, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.uniformJavaRDD(org.apache.spark.api.java.JavaSparkContext, long, int, long) with the default seed.
uniformJavaRDD(JavaSparkContext, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.uniformJavaRDD(org.apache.spark.api.java.JavaSparkContext, long, int, long) with the default number of partitions and the default seed.
uniformJavaVectorRDD(JavaSparkContext, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Java-friendly version of RandomRDDs.uniformVectorRDD(org.apache.spark.SparkContext, long, int, int, long).
uniformJavaVectorRDD(JavaSparkContext, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.uniformJavaVectorRDD(org.apache.spark.api.java.JavaSparkContext, long, int, int, long) with the default seed.
uniformJavaVectorRDD(JavaSparkContext, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.uniformJavaVectorRDD(org.apache.spark.api.java.JavaSparkContext, long, int, int, long) with the default number of partitions and the default seed.
uniformRDD(SparkContext, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Generates an RDD comprised of i.i.d. samples from the uniform distribution on [0.0, 1.0].
uniformVectorRDD(SparkContext, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the uniform distribution on [0.0, 1.0].
union(JavaDoubleRDD) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return the union of this RDD and another one.
union(JavaPairRDD<K, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Return the union of this RDD and another one.
union(JavaRDD<T>) - Method in class org.apache.spark.api.java.JavaRDD: Return the union of this RDD and another one.
union(JavaRDD<T>, List<JavaRDD<T>>) - Method in class org.apache.spark.api.java.JavaSparkContext: Build the union of two or more RDDs.
union(JavaPairRDD<K, V>, List<JavaPairRDD<K, V>>) - Method in class org.apache.spark.api.java.JavaSparkContext: Build the union of two or more RDDs.
union(JavaDoubleRDD, List<JavaDoubleRDD>) - Method in class org.apache.spark.api.java.JavaSparkContext: Build the union of two or more RDDs.
union(RDD<T>) - Method in class org.apache.spark.rdd.RDD: Return the union of this RDD and another one.
union(Seq<RDD<T>>, ClassTag<T>) - Method in class org.apache.spark.SparkContext: Build the union of a list of RDDs.
union(RDD<T>, Seq<RDD<T>>, ClassTag<T>) - Method in class org.apache.spark.SparkContext: Build the union of a list of RDDs passed as variable-length arguments.
Union - Class in org.apache.spark.sql.execution: :: DeveloperApi ::
Union(Seq<SparkPlan>) - Constructor for class org.apache.spark.sql.execution.Union
union(JavaDStream<T>) - Method in class org.apache.spark.streaming.api.java.JavaDStream: Return a new DStream by unifying data of another DStream with this DStream.
union(JavaPairDStream<K, V>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by unifying data of another DStream with this DStream.
union(JavaDStream<T>, List<JavaDStream<T>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a unified DStream from multiple DStreams of the same type and same slide duration.
union(JavaPairDStream<K, V>, List<JavaPairDStream<K, V>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a unified DStream from multiple DStreams of the same type and same slide duration.
union(DStream<T>) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream by unifying data of another DStream with this DStream.
union(Seq<DStream<T>>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext: Create a unified DStream from multiple DStreams of the same type and same slide duration.
unionAll(SchemaRDD) - Method in class org.apache.spark.sql.SchemaRDD: Combines the tuples of two RDDs with the same schema, keeping duplicates.
UnionRDD<T> - Class in org.apache.spark.rdd
UnionRDD(SparkContext, Seq<RDD<T>>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.UnionRDD
uniqueId() - Method in class org.apache.spark.storage.StreamBlockId
UnknownReason - Class in org.apache.spark: :: DeveloperApi :: We don't know why the task ended -- for example, because of a ClassNotFound exception when deserializing the task result.
UnknownReason() - Constructor for class org.apache.spark.UnknownReason
unpersist() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist(boolean) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist() - Method in class org.apache.spark.api.java.JavaPairRDD: Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist(boolean) - Method in class org.apache.spark.api.java.JavaPairRDD: Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist() - Method in class org.apache.spark.api.java.JavaRDD: Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist(boolean) - Method in class org.apache.spark.api.java.JavaRDD: Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist() - Method in class org.apache.spark.broadcast.Broadcast: Asynchronously delete cached copies of this broadcast on the executors.
unpersist(boolean) - Method in class org.apache.spark.broadcast.Broadcast: Delete cached copies of this broadcast on the executors.
unpersist() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Unpersist intermediate RDDs used in the computation.
unpersist(boolean) - Method in class org.apache.spark.rdd.RDD: Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist(boolean) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD: Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
until(Time, Duration) - Method in class org.apache.spark.streaming.Time
update() - Method in class org.apache.spark.scheduler.AccumulableInfo
update() - Method in class org.apache.spark.sql.execution.AggregateEvaluation
update(T1, T2) - Method in class org.apache.spark.util.MutablePair: Updates this pair with new values and returns itself
updateAggregateMetrics(UIData.StageUIData, String, TaskMetrics, Option<TaskMetrics>) - Method in class org.apache.spark.ui.jobs.JobProgressListener: Upon receiving new metrics for a task, updates the per-stage and per-executor-per-stage aggregate metrics by calculating deltas between the currently recorded metrics and the new metrics.
Updater - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: Class used to perform steps (weight update) using Gradient Descent methods.
Updater() - Constructor for class org.apache.spark.mllib.optimization.Updater
updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of the key.
updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, int, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, Partitioner, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of the key.
updateStateByKey(Function1<Iterator<Tuple3<K, Seq<V>, Option<S>>>, Iterator<Tuple2<K, S>>>, Partitioner, boolean, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
useDisk() - Method in class org.apache.spark.storage.StorageLevel
useMemory() - Method in class org.apache.spark.storage.StorageLevel
useOffHeap() - Method in class org.apache.spark.storage.StorageLevel
user() - Method in class org.apache.spark.mllib.recommendation.Rating
user() - Method in class org.apache.spark.scheduler.JobLogger
userFeatures() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
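
The updateStateByKey entries above describe a function that receives the new values for a key plus the key's previous state and returns the new state. The sketch below models one batch step in plain Java, without Spark. It assumes the update function is applied to every key that has either previous state or new values, and that returning an empty Optional drops the key. It uses java.util.Optional rather than the Optional type in the Spark Java signatures, and the names UpdateStateSketch and step are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.function.BiFunction;

// Hypothetical single-machine model of one stateful update step; not Spark code.
final class UpdateStateSketch {

    static <K extends Comparable<K>, V, S> Map<K, S> step(
            Map<K, S> previous,
            Map<K, List<V>> batch,
            BiFunction<List<V>, Optional<S>, Optional<S>> update) {
        // Visit every key that has old state or new values.
        TreeSet<K> keys = new TreeSet<>(previous.keySet());
        keys.addAll(batch.keySet());

        Map<K, S> next = new TreeMap<>();
        for (K key : keys) {
            List<V> values = batch.getOrDefault(key, Collections.emptyList());
            Optional<S> oldState = Optional.ofNullable(previous.get(key));
            // An empty result means the key carries no state forward.
            update.apply(values, oldState).ifPresent(s -> next.put(key, s));
        }
        return next;
    }

    public static void main(String[] args) {
        // Running count of values seen per key.
        BiFunction<List<Integer>, Optional<Integer>, Optional<Integer>> runningCount =
                (values, state) -> Optional.of(state.orElse(0) + values.size());

        Map<String, List<Integer>> firstBatch = new TreeMap<>();
        firstBatch.put("a", Arrays.asList(1, 1));
        firstBatch.put("b", Arrays.asList(1));
        Map<String, Integer> state = step(new TreeMap<String, Integer>(), firstBatch, runningCount);

        Map<String, List<Integer>> secondBatch = new TreeMap<>();
        secondBatch.put("a", Arrays.asList(1));
        state = step(state, secondBatch, runningCount);
        System.out.println(state);
    }
}
```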

V

V() - Method in class org.apache.spark.mllib.linalg.SingularValueDecomposition
value() - Method in class org.apache.spark.Accumulable: Access the accumulator's current value; only allowed on master.
value() - Method in class org.apache.spark.broadcast.Broadcast: Get the broadcasted value.
value() - Method in class org.apache.spark.ComplexFutureAction
value() - Method in interface org.apache.spark.FutureAction: The value of this Future.
value() - Method in class org.apache.spark.mllib.linalg.distributed.MatrixEntry
value() - Method in class org.apache.spark.scheduler.AccumulableInfo
value() - Method in class org.apache.spark.SerializableWritable
value() - Method in class org.apache.spark.SimpleFutureAction
value() - Method in class org.apache.spark.sql.execution.SetCommand
values() - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD with the values of each tuple.
values() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
values() - Method in class org.apache.spark.mllib.linalg.DenseVector
values() - Method in class org.apache.spark.mllib.linalg.SparseVector
values() - Method in class org.apache.spark.rdd.PairRDDFunctions: Return an RDD with the values of each tuple.
variance() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Compute the variance of this RDD's elements.
variance() - Method in class org.apache.spark.mllib.feature.StandardScalerModel
variance() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
variance() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary: Sample variance vector.
Variance - Class in org.apache.spark.mllib.tree.impurity: :: Experimental :: Class for calculating variance during regression
Variance() - Constructor for class org.apache.spark.mllib.tree.impurity.Variance
variance() - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Compute the variance of this RDD's elements.
variance() - Method in class org.apache.spark.util.StatCounter: Return the variance of the values.
vClassTag() - Method in class org.apache.spark.api.java.JavaHadoopRDD
vClassTag() - Method in class org.apache.spark.api.java.JavaNewHadoopRDD
vClassTag() - Method in class org.apache.spark.api.java.JavaPairRDD
vClassTag() - Method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
vClassTag() - Method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
vector() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRow
Vector - Interface in org.apache.spark.mllib.linalg: Represents a numeric vector, whose index type is Int and value type is Double.
Vector - Class in org.apache.spark.util
Vector(double[]) - Constructor for class org.apache.spark.util.Vector
Vector.Multiplier - Class in org.apache.spark.util
Vector.Multiplier(double) - Constructor for class org.apache.spark.util.Vector.Multiplier
Vector.VectorAccumParam$ - Class in org.apache.spark.util
Vector.VectorAccumParam$() - Constructor for class org.apache.spark.util.Vector.VectorAccumParam$
Vectors - Class in org.apache.spark.mllib.linalg
Vectors() - Constructor for class org.apache.spark.mllib.linalg.Vectors
VectorTransformer - Interface in org.apache.spark.mllib.feature: :: DeveloperApi :: Trait for transformation of a vector
version() - Method in class org.apache.spark.api.java.JavaSparkContext: The version of Spark on which this application is running.
version() - Method in class org.apache.spark.SparkContext: The version of Spark on which this application is running.
vManifest() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
VoidFunction<T> - Interface in org.apache.spark.api.java.function: A function with no return value.
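
Several variance() entries above do not say which estimator they return, while MultivariateStatisticalSummary is described as the sample variance. As a hedged illustration of the difference, the plain-Java sketch below accumulates values in one pass (Welford's method) and exposes both the population form (divide by n) and the sample form (divide by n - 1). The class RunningVarianceSketch and its methods are hypothetical and do not reproduce any Spark class; which form a given entry returns should be checked against that class's own documentation.

```java
// Hypothetical one-pass variance accumulator; not a Spark class.
final class RunningVarianceSketch {
    private long n;       // number of values seen
    private double mean;  // running mean
    private double m2;    // running sum of squared deviations from the mean

    // Welford's update: numerically stabler than summing squares directly.
    void add(double x) {
        n++;
        double delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean);
    }

    // Population variance: divides by n.
    double populationVariance() {
        return n == 0 ? Double.NaN : m2 / n;
    }

    // Sample variance: divides by n - 1 (undefined for fewer than two values).
    double sampleVariance() {
        return n <= 1 ? Double.NaN : m2 / (n - 1);
    }

    static RunningVarianceSketch of(double... values) {
        RunningVarianceSketch stats = new RunningVarianceSketch();
        for (double v : values) {
            stats.add(v);
        }
        return stats;
    }

    public static void main(String[] args) {
        RunningVarianceSketch stats = of(1.0, 2.0, 3.0, 4.0);
        System.out.println("population = " + stats.populationVariance());
        System.out.println("sample     = " + stats.sampleVariance());
    }
}
```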

W

waiter() - Method in class org.apache.spark.streaming.StreamingContext
warehousePath() - Method in class org.apache.spark.sql.hive.LocalHiveContext
warehousePath() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
weightedFalsePositiveRate() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns weighted false positive rate
weightedFMeasure(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns weighted averaged f-measure
weightedFMeasure() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns weighted averaged f1-measure
weightedPrecision() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns weighted averaged precision
weightedRecall() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns weighted averaged recall (equal to precision, recall and f-measure)
weightedTruePositiveRate() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns weighted true positive rate (equal to precision, recall and f-measure)
weights() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
weights() - Method in class org.apache.spark.mllib.classification.SVMModel
weights() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
weights() - Method in class org.apache.spark.mllib.regression.LassoModel
weights() - Method in class org.apache.spark.mllib.regression.LinearRegressionModel
weights() - Method in class org.apache.spark.mllib.regression.RidgeRegressionModel
where(Expression) - Method in class org.apache.spark.sql.SchemaRDD: Filters the output, only returning those rows where condition evaluates to true.
where(Symbol, Function1<T1, Object>) - Method in class org.apache.spark.sql.SchemaRDD: Filters tuples using a function over the value of the specified column.
where(Function1<DynamicRow, Object>) - Method in class org.apache.spark.sql.SchemaRDD: :: Experimental :: Filters tuples using a function over a Dynamic version of a given Row.
wholeTextFiles(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Read a directory of text files from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI.
wholeTextFiles(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Read a directory of text files from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI.
wholeTextFiles(String, int) - Method in class org.apache.spark.SparkContext: Read a directory of text files from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI.
window(Duration) - Method in class org.apache.spark.streaming.api.java.JavaDStream: Return a new DStream in which each RDD contains all the elements seen in a sliding window of time over this DStream.
window(Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaDStream: Return a new DStream in which each RDD contains all the elements seen in a sliding window of time over this DStream.
window(Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream which is computed based on windowed batches of this DStream.
window(Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream which is computed based on windowed batches of this DStream.
window(Duration) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD contains all the elements seen in a sliding window of time over this DStream.
window(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD contains all the elements seen in a sliding window of time over this DStream.
withMean() - Method in class org.apache.spark.mllib.feature.StandardScalerModel
withReplacement() - Method in class org.apache.spark.sql.execution.Sample
withStd() - Method in class org.apache.spark.mllib.feature.StandardScalerModel
Word2Vec - Class in org.apache.spark.mllib.feature: :: Experimental :: Word2Vec creates vector representations of words in a text corpus.
Word2Vec() - Constructor for class org.apache.spark.mllib.feature.Word2Vec
Word2VecModel - Class in org.apache.spark.mllib.feature: :: Experimental :: Word2Vec model
wrapRDD(RDD<Double>) - Method in class org.apache.spark.api.java.JavaDoubleRDD
wrapRDD(RDD<Tuple2<K, V>>) - Method in class org.apache.spark.api.java.JavaPairRDD
wrapRDD(RDD<T>) - Method in class org.apache.spark.api.java.JavaRDD
wrapRDD(RDD<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
wrapRDD(RDD<Row>) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
wrapRDD(RDD<T>) - Method in class org.apache.spark.streaming.api.java.JavaDStream
wrapRDD(RDD<T>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
wrapRDD(RDD<Tuple2<K, V>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
writableWritableConverter() - Static method in class org.apache.spark.SparkContext
writeAll(Iterator<T>, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializationStream
writeExternal(ObjectOutput) - Method in class org.apache.spark.serializer.JavaSerializer
writeExternal(ObjectOutput) - Method in class org.apache.spark.storage.BlockManagerId
writeExternal(ObjectOutput) - Method in class org.apache.spark.storage.StorageLevel
writeExternal(ObjectOutput) - Method in class org.apache.spark.streaming.flume.SparkFlumeEvent
writeObject(T, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializationStream

Z

zero() - Method in class org.apache.spark.Accumulable
zero(R) - Method in interface org.apache.spark.AccumulableParam: Return the "zero" (identity) value for an accumulator type, given its initial value.
zero(double) - Method in class org.apache.spark.SparkContext.DoubleAccumulatorParam$
zero(float) - Method in class org.apache.spark.SparkContext.FloatAccumulatorParam$
zero(int) - Method in class org.apache.spark.SparkContext.IntAccumulatorParam$
zero(long) - Method in class org.apache.spark.SparkContext.LongAccumulatorParam$
zero(Vector) - Method in class org.apache.spark.util.Vector.VectorAccumParam$
ZeroMQUtils - Class in org.apache.spark.streaming.zeromq
ZeroMQUtils() - Constructor for class org.apache.spark.streaming.zeromq.ZeroMQUtils
zeros(int) - Static method in class org.apache.spark.mllib.linalg.Vectors: Creates a dense vector of all zeros.
zeros(int) - Static method in class org.apache.spark.util.Vector
zeroTime() - Method in class org.apache.spark.streaming.dstream.DStream
zip(JavaRDDLike<U, ?>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Zips this RDD with another one, returning key-value pairs with the first element in each RDD, second element in each RDD, etc.
zip(RDD, ClassTag) - Method in class org.apache.spark.rdd.RDD: Zips this RDD with another one, returning key-value pairs with the first element in each RDD, second element in each RDD, etc.
zipPartitions(JavaRDDLike<U, ?>, FlatMapFunction2<Iterator<T>, Iterator, V>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Zip this RDD's partitions with one (or more) RDD(s) and return a new RDD by applying a function to the zipped partitions.
zipPartitions(RDD, boolean, Function2<Iterator<T>, Iterator, Iterator<V>>, ClassTag, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD: Zip this RDD's partitions with one (or more) RDD(s) and return a new RDD by applying a function to the zipped partitions.
zipPartitions(RDD, Function2<Iterator<T>, Iterator, Iterator<V>>, ClassTag, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
zipPartitions(RDD, RDD<C>, boolean, Function3<Iterator<T>, Iterator, Iterator<C>, Iterator<V>>, ClassTag, ClassTag<C>, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
zipPartitions(RDD, RDD<C>, Function3<Iterator<T>, Iterator, Iterator<C>, Iterator<V>>, ClassTag, ClassTag<C>, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
zipPartitions(RDD, RDD<C>, RDD<D>, boolean, Function4<Iterator<T>, Iterator, Iterator<C>, Iterator<D>, Iterator<V>>, ClassTag, ClassTag<C>, ClassTag<D>, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
zipPartitions(RDD, RDD<C>, RDD<D>, Function4<Iterator<T>, Iterator, Iterator<C>, Iterator<D>, Iterator<V>>, ClassTag, ClassTag<C>, ClassTag<D>, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
zipWithIndex() - Method in interface org.apache.spark.api.java.JavaRDDLike: Zips this RDD with its element indices.
zipWithIndex() - Method in class org.apache.spark.rdd.RDD: Zips this RDD with its element indices.
zipWithUniqueId() - Method in interface org.apache.spark.api.java.JavaRDDLike: Zips this RDD with generated unique Long ids.
zipWithUniqueId() - Method in class org.apache.spark.rdd.RDD: Zips this RDD with generated unique Long ids.

_

_1() - Method in class org.apache.spark.util.MutablePair
_2() - Method in class org.apache.spark.util.MutablePair
_rddInfoMap() - Method in class org.apache.spark.ui.storage.StorageListener
