spark

Core Spark functionality. org.apache.spark.SparkContext serves as the main entry point to Spark, while org.apache.spark.rdd.RDD is the data type representing a distributed collection, and provides most parallel operations.

In addition, org.apache.spark.rdd.PairRDDFunctions contains operations available only on RDDs of key-value pairs, such as groupByKey and join; org.apache.spark.rdd.DoubleRDDFunctions contains operations available only on RDDs of Doubles; and org.apache.spark.rdd.SequenceFileRDDFunctions contains operations available on RDDs that can be saved as SequenceFiles. These operations are automatically available on any RDD of the right type (e.g. RDD[(Int, Int)]) through implicit conversions when you import org.apache.spark.SparkContext._.
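
As a rough illustration of the implicit conversions described above, the sketch below calls join on an RDD of pairs after importing org.apache.spark.SparkContext._. The object name PairRddSketch, the helper joinWithLabels, the "local" master, and the sample data are assumptions made for this example; it presumes a Spark release matching this documentation is on the classpath.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // brings the implicit RDD conversions into scope

// Hypothetical example object; not part of Spark.
object PairRddSketch {
  // Joins two small key-value RDDs. join is not defined on RDD itself;
  // it becomes available through the implicit conversion to PairRDDFunctions.
  def joinWithLabels(sc: SparkContext): Array[(Int, (Int, String))] = {
    val scores = sc.parallelize(Seq((1, 10), (2, 20), (1, 30)))
    val labels = sc.parallelize(Seq((1, "a"), (2, "b")))
    scores.join(labels).collect()
  }

  def main(args: Array[String]): Unit = {
    // The "local" master and the application name are placeholders.
    val conf = new SparkConf().setMaster("local").setAppName("pair-rdd-sketch")
    val sc = new SparkContext(conf)
    try joinWithLabels(sc).foreach(println)
    finally sc.stop()
  }
}
```

The order of the collected pairs is not guaranteed, so callers that compare results may want to sort them or convert them to a Set first.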

Java programmers should reference the org.apache.spark.api.java package for Spark programming APIs in Java.

Linear Supertypes

AnyRef, Any

Type Members

class Accumulable[R, T] extends Serializable

A data type that can be accumulated, i.e. has a commutative and associative "add" operation, but where the result type, R, may be different from the element type being added, T.
trait AccumulableParam[R, T] extends Serializable

Helper object defining how to accumulate values of a particular type.
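
To make the split between the result type R and the element type T more concrete, here is a minimal sketch in plain Scala that does not depend on Spark. LocalAccumulableParam is a local stand-in whose method names are modeled on the Spark trait, and AccumulableSketch with its sample data is purely illustrative: Int elements (T) are accumulated into a Set[Int] (R), first within each simulated task and then across tasks.

```scala
// Local stand-in modeled on AccumulableParam; this is not the Spark trait.
trait LocalAccumulableParam[R, T] {
  def addAccumulator(r: R, t: T): R // add one element to a partial result
  def addInPlace(r1: R, r2: R): R   // merge two partial results
  def zero(initialValue: R): R      // identity value for a partial result
}

// Hypothetical example object.
object AccumulableSketch {
  // Collects distinct Ints: the element type is Int, the result type is Set[Int].
  val distinctInts: LocalAccumulableParam[Set[Int], Int] =
    new LocalAccumulableParam[Set[Int], Int] {
      def addAccumulator(r: Set[Int], t: Int): Set[Int] = r + t
      def addInPlace(r1: Set[Int], r2: Set[Int]): Set[Int] = r1 ++ r2
      def zero(initialValue: Set[Int]): Set[Int] = Set.empty[Int]
    }

  // Each inner Seq plays the role of the elements seen by one task.
  def accumulate(tasks: Seq[Seq[Int]]): Set[Int] = {
    val start = distinctInts.zero(Set.empty[Int])
    // Fold the elements of each task into its own partial result.
    val partials = tasks.map(_.foldLeft(start)((r, t) => distinctInts.addAccumulator(r, t)))
    // Merge the partial results into the final value.
    partials.foldLeft(start)((a, b) => distinctInts.addInPlace(a, b))
  }
}
```

Because set insertion and set union are commutative and associative, the final value does not depend on the order in which tasks or elements are processed.
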
class Accumulator[T] extends Accumulable[T, T]

A simpler form of Accumulable where the result type being accumulated is the same as the type of the elements being merged.
trait AccumulatorParam[T] extends AccumulableParam[T, T]

A simpler version of org.apache.spark.AccumulableParam where the only data type you can add in is the same type as the accumulated value.
case class Aggregator[K, V, C](createCombiner: (V) ⇒ C, mergeValue: (C, V) ⇒ C, mergeCombiners: (C, C) ⇒ C) extends Product with Serializable

A set of functions used to aggregate data.
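
The roles of the three functions can be sketched in plain Scala without Spark. LocalAggregator below is a local stand-in that only mirrors the three constructor fields shown in the signature; its methods combinePartition and mergePartitions, the object AggregatorSketch, and the sample data are assumptions for illustration and say nothing about how Spark applies an Aggregator internally.

```scala
// Local stand-in with the same three fields as the Aggregator signature; not the Spark class.
case class LocalAggregator[K, V, C](
    createCombiner: V => C,
    mergeValue: (C, V) => C,
    mergeCombiners: (C, C) => C) {

  // Combine the records of one simulated partition by key.
  def combinePartition(records: Seq[(K, V)]): Map[K, C] =
    records.foldLeft(Map.empty[K, C]) { case (acc, (k, v)) =>
      val combined = acc.get(k) match {
        case Some(existing) => mergeValue(existing, v) // key already seen
        case None           => createCombiner(v)       // first value for this key
      }
      acc.updated(k, combined)
    }

  // Merge the per-partition combiners into one result.
  def mergePartitions(parts: Seq[Map[K, C]]): Map[K, C] =
    parts.foldLeft(Map.empty[K, C]) { (acc, part) =>
      part.foldLeft(acc) { case (m, (k, c)) =>
        m.updated(k, m.get(k).map(prev => mergeCombiners(prev, c)).getOrElse(c))
      }
    }
}

// Hypothetical example object.
object AggregatorSketch {
  // Tracks a (sum, count) pair per key.
  val sumCount = LocalAggregator[String, Int, (Int, Int)](
    v => (v, 1),
    (c, v) => (c._1 + v, c._2 + 1),
    (a, b) => (a._1 + b._1, a._2 + b._2))

  def run(): Map[String, (Int, Int)] = {
    val p1 = Seq("a" -> 1, "b" -> 2, "a" -> 3)
    val p2 = Seq("a" -> 5, "b" -> 4)
    sumCount.mergePartitions(Seq(sumCount.combinePartition(p1), sumCount.combinePartition(p2)))
  }
}
```

Here the combiner type C, a (sum, count) pair, differs from the value type V, which is why a separate createCombiner step is needed for the first value of each key.
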
class ComplexFutureAction[T] extends FutureAction[T]

A FutureAction for actions that could trigger multiple Spark jobs.
abstract class Dependency[T] extends Serializable

Base class for dependencies.
trait FutureAction[T] extends Future[T]

A future for the result of an action to support cancellation.
class HashPartitioner extends Partitioner

An org.apache.spark.Partitioner that implements hash-based partitioning using Java's Object.hashCode.
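
A minimal sketch of hash-based partitioning in plain Scala follows. It is a stand-in written for this page rather than Spark's implementation: the object HashPartitionSketch and the function partitionFor are hypothetical names, and sending null keys to partition 0 is an assumption of the sketch.

```scala
// Hypothetical stand-in for illustration; not the Spark class.
object HashPartitionSketch {
  // Maps a key to a partition index in the range [0, numPartitions).
  def partitionFor(key: Any, numPartitions: Int): Int = {
    require(numPartitions > 0, "numPartitions must be positive")
    if (key == null) 0 // assumption of this sketch: null keys go to partition 0
    else {
      val raw = key.hashCode % numPartitions
      // hashCode may be negative, so shift the remainder into the valid range.
      if (raw < 0) raw + numPartitions else raw
    }
  }
}
```

Equal keys always produce the same index, so all records with a given key land in the same partition. Keys whose hashCode does not reflect their contents, such as Java arrays, would not partition as intended under this scheme.
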
class InterruptibleIterator[+T] extends Iterator[T]

An iterator that wraps around an existing iterator to provide task killing functionality.
trait Logging extends AnyRef

Utility trait for classes that want to log data.
abstract class NarrowDependency[T] extends Dependency[T]

Base class for dependencies where each partition of the parent RDD is used by at most one partition of the child RDD.
class OneToOneDependency[T] extends NarrowDependency[T]

Represents a one-to-one dependency between partitions of the parent and child RDDs.
trait Partition extends Serializable

A partition of an RDD.
abstract class Partitioner extends Serializable

An object that defines how the elements in a key-value pair RDD are partitioned by key.
class RangeDependency[T] extends NarrowDependency[T]

Represents a one-to-one dependency between ranges of partitions in the parent and child RDDs.
class RangePartitioner[K, V] extends Partitioner

An org.apache.spark.Partitioner that partitions sortable records by range into roughly equal ranges.
class SerializableWritable[T <: Writable] extends Serializable
class ShuffleDependency[K, V] extends Dependency[Product2[K, V]]

Represents a dependency on the output of a shuffle stage.
class SimpleFutureAction[T] extends FutureAction[T]

A FutureAction holding the result of an action that triggers a single job.
class SparkConf extends Cloneable with Logging

Configuration for a Spark application.
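
A configuration is typically assembled by chaining setters, as in the sketch below. The object SparkConfSketch, the helper build, and every value passed in (master URL, application name, executor memory) are placeholders chosen for this example; it presumes Spark is on the classpath.

```scala
import org.apache.spark.SparkConf

// Hypothetical example object; not part of Spark.
object SparkConfSketch {
  // Builds a configuration. Every value here is a placeholder.
  def build(): SparkConf =
    new SparkConf()
      .setMaster("local[2]")              // stored under spark.master
      .setAppName("conf-sketch")          // stored under spark.app.name
      .set("spark.executor.memory", "1g") // arbitrary key-value setting

  def main(args: Array[String]): Unit = {
    val conf = build()
    println(conf.get("spark.app.name"))
    println(conf.get("spark.master"))
  }
}
```

The resulting object can then be passed to the SparkContext constructor.
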
class SparkContext extends Logging

Main entry point for Spark functionality.
class SparkEnv extends Logging

Holds all the runtime environment objects for a running Spark instance (either master or worker), including the serializer, Akka actor system, block manager, map output tracker, etc.
class SparkException extends Exception
class SparkFiles extends AnyRef
class TaskContext extends Serializable

Value Members

object Partitioner extends Serializable
object SparkContext

The SparkContext object contains a number of implicit conversions and parameters for use with various Spark features.
object SparkEnv extends Logging
package api
package broadcast

Package for broadcast variables.
package executor
package io
package metrics
package partial
package rdd
package scheduler
package serializer
package storage
package util
