Package

org.apache.spark.sql

expressions

package expressions

Type Members

  1. abstract class Aggregator[-IN, BUF, OUT] extends Serializable

    :: Experimental :: A base class for user-defined aggregations, which can be used in Dataset operations to take all of the elements of a group and reduce them to a single value.

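    For example, the following aggregator extracts the Int field from each Data element and sums the values:

    import org.apache.spark.sql.{Dataset, Encoder, Encoders}
    import org.apache.spark.sql.expressions.Aggregator

    case class Data(i: Int)

    val customSummer = new Aggregator[Data, Int, Int] {
      def zero: Int = 0                           // initial buffer value
      def reduce(b: Int, a: Data): Int = b + a.i  // fold one input into the buffer
      def merge(b1: Int, b2: Int): Int = b1 + b2  // combine two partial buffers
      def finish(r: Int): Int = r                 // turn the final buffer into the result
      // Encoders for the buffer and output types (required in Spark 2.0 and later)
      def bufferEncoder: Encoder[Int] = Encoders.scalaInt
      def outputEncoder: Encoder[Int] = Encoders.scalaInt
    }.toColumn

    val ds: Dataset[Data] = ...
    val aggregated = ds.select(customSummer)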

    Based loosely on Aggregator from Algebird: https://github.com/twitter/algebird

    IN

    The input type for the aggregation.

    BUF

    The type of the intermediate value of the reduction.

    OUT

    The type of the final output result.
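
    The roles of IN, BUF and OUT can be illustrated without Spark. The following is only a sketch of how the four methods might compose when a group is spread over several partitions: each partition is reduced from zero, the partial buffers are merged, and finish produces the result. SimpleAggregator, Summer and runPartitioned are hypothetical names used for illustration and are not part of the Spark API.

```scala
// Spark-free model of the aggregation contract. SimpleAggregator, Summer and
// runPartitioned are hypothetical names for illustration, not Spark classes.
trait SimpleAggregator[IN, BUF, OUT] {
  def zero: BUF                     // empty buffer; merging it changes nothing
  def reduce(b: BUF, a: IN): BUF    // fold one input element into a buffer
  def merge(b1: BUF, b2: BUF): BUF  // combine two partial buffers
  def finish(r: BUF): OUT           // convert the final buffer to the output
}

case class Data(i: Int)

object Summer extends SimpleAggregator[Data, Int, Int] {
  def zero: Int = 0
  def reduce(b: Int, a: Data): Int = b + a.i
  def merge(b1: Int, b2: Int): Int = b1 + b2
  def finish(r: Int): Int = r
}

object AggregatorSketch {
  // Reduce every partition independently starting from `zero`,
  // then merge the partial buffers and finish.
  def runPartitioned[IN, BUF, OUT](
      agg: SimpleAggregator[IN, BUF, OUT],
      partitions: Seq[Seq[IN]]): OUT = {
    val partials = partitions.map(_.foldLeft(agg.zero)(agg.reduce))
    agg.finish(partials.foldLeft(agg.zero)(agg.merge))
  }

  def main(args: Array[String]): Unit = {
    val partitions = Seq(Seq(Data(1), Data(2)), Seq(Data(3)), Seq.empty[Data])
    println(runPartitioned(Summer, partitions))
  }
}
```

    Because partitions may be reduced in any order, this sketch only gives a stable answer when merge is associative and zero acts as its identity, as is the case for integer addition here.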

    Annotations
    @Experimental() @Evolving()
    Since

    1.6.0

  2. abstract class MutableAggregationBuffer extends Row

    A Row representing a mutable aggregation buffer.

    This is not meant to be extended outside of Spark.

    Annotations
    @Stable()
    Since

    1.5.0

  3. abstract class UserDefinedAggregateFunction extends Serializable

    The base class for implementing user-defined aggregate functions (UDAF).

    Annotations
    @Stable()
    Since

    1.5.0

  4. case class UserDefinedFunction(f: AnyRef, dataType: DataType, inputTypes: Option[Seq[DataType]]) extends Product with Serializable

    A user-defined function. To create one, use the udf functions in org.apache.spark.sql.functions.

    As an example:

    import org.apache.spark.sql.functions.udf

    // Define a UDF that returns true or false based on a numeric score.
    val predict = udf((score: Double) => score > 0.5)

    // Project a prediction column computed from the score column.
    df.select(predict(df("score")))

    Annotations
    @Stable()
    Since

    1.3.0

  5. class Window extends AnyRef

    Utility functions for defining windows in DataFrames.

    // PARTITION BY country ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    Window.partitionBy("country").orderBy("date")
      .rowsBetween(Window.unboundedPreceding, Window.currentRow)
    
    // PARTITION BY country ORDER BY date ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING
    Window.partitionBy("country").orderBy("date").rowsBetween(-3, 3)
    Annotations
    @Stable()
    Since

    1.4.0
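
    To make the row offsets in the examples above concrete, here is a Spark-free sketch of which rows a ROWS frame covers within one sorted partition. RowFrameSketch and rowFrame are hypothetical names, and Int.MinValue stands in for Window.unboundedPreceding purely as an assumption of this sketch.

```scala
object RowFrameSketch {
  // Rows visible from position `idx` for a frame like rowsBetween(start, end).
  // Offsets are relative to the current row; the frame is clipped to the
  // partition bounds. Long arithmetic avoids overflow for extreme offsets.
  def rowFrame[A](rows: IndexedSeq[A], idx: Int, start: Int, end: Int): IndexedSeq[A] = {
    val from = math.max(0L, idx.toLong + start).toInt
    val until = math.min(rows.length.toLong, idx.toLong + end + 1L).toInt
    rows.slice(from, until)
  }

  def main(args: Array[String]): Unit = {
    val amounts = Vector(10, 20, 30, 40, 50) // one partition, already ordered

    // Like rowsBetween(-1, 1): previous row, current row, next row.
    val centered = amounts.indices.map(i => rowFrame(amounts, i, -1, 1).sum)
    println(centered)

    // Like rowsBetween(unboundedPreceding, currentRow): a running total.
    val running = amounts.indices.map(i => rowFrame(amounts, i, Int.MinValue, 0).sum)
    println(running)
  }
}
```

    Rows near the edges of the partition simply see a shorter frame, which is why the first and last centered sums cover only two rows.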

  6. class WindowSpec extends AnyRef

    A window specification that defines the partitioning, ordering, and frame boundaries.

    Use the static methods in Window to create a WindowSpec.

    Annotations
    @Stable()
    Since

    1.4.0

Value Members

  1. object Window

    Utility functions for defining windows in DataFrames.

    // PARTITION BY country ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    Window.partitionBy("country").orderBy("date")
      .rowsBetween(Window.unboundedPreceding, Window.currentRow)
    
    // PARTITION BY country ORDER BY date ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING
    Window.partitionBy("country").orderBy("date").rowsBetween(-3, 3)
    Annotations
    @Stable()
    Since

    1.4.0

    Note

    When ordering is not defined, an unbounded window frame (rowFrame, unboundedPreceding, unboundedFollowing) is used by default. When ordering is defined, a growing window frame (rangeFrame, unboundedPreceding, currentRow) is used by default.
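
    The difference between the two default frames can be sketched without Spark. In the illustration below, a sum over one partition is computed once with no ordering (every row sees the whole partition) and once with ordering by date (a range frame from the start of the partition to the current row, which also includes rows that tie with the current row on the ordering key). DefaultFrameSketch, Sale, sumUnordered and sumOrdered are hypothetical names, not Spark APIs.

```scala
object DefaultFrameSketch {
  case class Sale(date: Int, amount: Int)

  // No ordering: the frame is the whole partition, so every row gets the total.
  def sumUnordered(rows: Seq[Sale]): Seq[Int] = {
    val total = rows.map(_.amount).sum
    rows.map(_ => total)
  }

  // Ordering by date with the default range frame: each row sees all rows
  // whose date is <= its own, so rows with equal dates get the same value.
  def sumOrdered(rows: Seq[Sale]): Seq[Int] = {
    val sorted = rows.sortBy(_.date)
    sorted.map(r => sorted.filter(_.date <= r.date).map(_.amount).sum)
  }

  def main(args: Array[String]): Unit = {
    val sales = Seq(Sale(1, 10), Sale(2, 20), Sale(2, 30), Sale(3, 40))
    println(sumUnordered(sales))
    println(sumOrdered(sales))
  }
}
```

    The two rows dated 2 receive the same running sum in the ordered case. A running total that advances row by row would need an explicit row frame such as rowsBetween(Window.unboundedPreceding, Window.currentRow).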

  2. package javalang

  3. package scalalang

