:: Experimental :: A base class for user-defined aggregations, which can be used in Dataset operations to take all of the elements of a group and reduce them to a single value.
A Row representing a mutable aggregation buffer.
This is not meant to be extended outside of Spark.
Since: 1.5.0
The base class for implementing user-defined aggregate functions (UDAF).
Since: 1.5.0
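As a rough illustration of how such a function and its mutable aggregation buffer fit together, here is a minimal sketch of a UDAF that sums a long column. It is written script-style (for example for spark-shell). The names LongSum, value and udafTotal are hypothetical, and a Spark 2.x-style API is assumed (the class is deprecated from Spark 3.0 onward in favor of Aggregator).

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.expressions.{MutableAggregationBuffer, UserDefinedAggregateFunction}
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.{DataType, LongType, StructField, StructType}

// Hypothetical UDAF: sums a single long column, ignoring nulls.
object LongSum extends UserDefinedAggregateFunction {
  // One input column of type long.
  def inputSchema: StructType = StructType(StructField("value", LongType) :: Nil)
  // The aggregation buffer holds one running sum.
  def bufferSchema: StructType = StructType(StructField("sum", LongType) :: Nil)
  def dataType: DataType = LongType
  def deterministic: Boolean = true

  // Start every buffer at zero.
  def initialize(buffer: MutableAggregationBuffer): Unit = {
    buffer(0) = 0L
  }

  // Fold one input row into the buffer.
  def update(buffer: MutableAggregationBuffer, input: Row): Unit = {
    if (!input.isNullAt(0)) {
      buffer(0) = buffer.getLong(0) + input.getLong(0)
    }
  }

  // Combine two partial buffers.
  def merge(buffer1: MutableAggregationBuffer, buffer2: Row): Unit = {
    buffer1(0) = buffer1.getLong(0) + buffer2.getLong(0)
  }

  // Produce the final result from the buffer.
  def evaluate(buffer: Row): Any = buffer.getLong(0)
}

// Assumption: a local session is acceptable for this sketch.
val spark = SparkSession.builder().master("local[1]").appName("udaf-sketch").getOrCreate()
import spark.implicits._

val df = Seq(1L, 2L, 3L).toDF("value")
val udafTotal: Long = df.agg(LongSum(col("value"))).head().getLong(0)
```

The buffer is updated in place through its update method, which is why it is passed as a MutableAggregationBuffer in initialize, update and merge, but as a plain Row in evaluate.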
A user-defined function. To create one, use the udf functions in org.apache.spark.sql.functions.
As an example:
  import org.apache.spark.sql.functions.udf

  // Define a UDF that returns true or false based on some numeric score.
  val predict = udf((score: Double) => score > 0.5)

  // Project a prediction column computed from the score column.
  df.select(predict(df("score")))
Since: 1.3.0
Utility functions for defining windows in DataFrames.
  // PARTITION BY country ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  Window.partitionBy("country").orderBy("date")
    .rowsBetween(Window.unboundedPreceding, Window.currentRow)

  // PARTITION BY country ORDER BY date ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING
  Window.partitionBy("country").orderBy("date").rowsBetween(-3, 3)
Since: 1.4.0
A window specification that defines the partitioning, ordering, and frame boundaries.
Use the static methods in Window to create a WindowSpec.
Since: 1.4.0
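To show how a WindowSpec is typically applied once it has been built, the sketch below sums each row together with its immediate neighbours. It is script-style and assumes a local session. The sample data and the names sales, neighbours and windowSums are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, sum}

// Assumption: a local session is acceptable for this sketch.
val spark = SparkSession.builder().master("local[1]").appName("windowspec-sketch").getOrCreate()
import spark.implicits._

// Hypothetical sample data: (country, date, amount).
val sales = Seq(
  ("DE", "2024-01-01", 10),
  ("DE", "2024-01-02", 20),
  ("DE", "2024-01-03", 30),
  ("FR", "2024-01-01", 5)
).toDF("country", "date", "amount")

// PARTITION BY country ORDER BY date ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
val neighbours = Window.partitionBy("country").orderBy("date").rowsBetween(-1, 1)

// Apply an aggregate function over the window specification.
val withWindowSum = sales.withColumn("window_sum", sum(col("amount")).over(neighbours))

val windowSums: Seq[Long] = withWindowSum
  .orderBy("country", "date")
  .collect()
  .map(_.getAs[Long]("window_sum"))
  .toSeq
```

The WindowSpec itself computes nothing; it only takes effect when a column expression is evaluated over it.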
When ordering is not defined, an unbounded window frame (rowFrame, unboundedPreceding, unboundedFollowing) is used by default. When ordering is defined, a growing window frame (rangeFrame, unboundedPreceding, currentRow) is used by default.
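The difference between the two default frames can be seen in a small sketch. It is script-style and assumes a local session. The data and the names rows, unordered, ordered, partitionTotals and growingTotals are hypothetical. Two rows share the same date so that the range-based default frame, which includes all peers of the current row, is visible.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, sum}

// Assumption: a local session is acceptable for this sketch.
val spark = SparkSession.builder().master("local[1]").appName("default-frame-sketch").getOrCreate()
import spark.implicits._

// Hypothetical sample data; the last two rows tie on date.
val rows = Seq(
  ("DE", "2024-01-01", 10),
  ("DE", "2024-01-02", 20),
  ("DE", "2024-01-02", 30)
).toDF("country", "date", "amount")

// No ordering: the default frame spans the whole partition.
val unordered = Window.partitionBy("country")
// With ordering: the default frame grows from the start up to the current row's peers.
val ordered = Window.partitionBy("country").orderBy("date")

val result = rows
  .withColumn("partition_total", sum(col("amount")).over(unordered))
  .withColumn("growing_total", sum(col("amount")).over(ordered))
  .orderBy("date", "amount")
  .collect()

val partitionTotals: Seq[Long] = result.map(_.getAs[Long]("partition_total")).toSeq
val growingTotals: Seq[Long] = result.map(_.getAs[Long]("growing_total")).toSeq
```

Because the ordered default is a range frame rather than a row frame, both rows dated 2024-01-02 receive the same growing total. An explicit rowsBetween call would be needed for a strict row-by-row running sum.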
:: Experimental :: A base class for user-defined aggregations, which can be used in Dataset operations to take all of the elements of a group and reduce them to a single value.
For example, an aggregator can extract an int from a specific class and add the values up.
Based loosely on Aggregator from Algebird: https://github.com/twitter/algebird
The input type for the aggregation.
The type of the intermediate value of the reduction.
The type of the final output result.
Since: 1.6.0
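The kind of aggregator described above (extract an int from a class and sum it) might look like the following sketch. It is script-style and assumes the Spark 2.x-or-later form of the API, in which an Aggregator also declares its buffer and output encoders. The names Data, SumData and total are hypothetical.

```scala
import org.apache.spark.sql.{Encoder, Encoders, SparkSession}
import org.apache.spark.sql.expressions.Aggregator

// Hypothetical input class for illustration.
case class Data(i: Int)

// Input type Data, intermediate type Int, output type Int.
object SumData extends Aggregator[Data, Int, Int] {
  // Neutral starting value of the reduction.
  def zero: Int = 0
  // Fold one input element into the intermediate value.
  def reduce(b: Int, a: Data): Int = b + a.i
  // Combine two intermediate values.
  def merge(b1: Int, b2: Int): Int = b1 + b2
  // Turn the final intermediate value into the output.
  def finish(r: Int): Int = r
  def bufferEncoder: Encoder[Int] = Encoders.scalaInt
  def outputEncoder: Encoder[Int] = Encoders.scalaInt
}

// Assumption: a local session is acceptable for this sketch.
val spark = SparkSession.builder().master("local[1]").appName("aggregator-sketch").getOrCreate()
import spark.implicits._

val ds = Seq(Data(1), Data(2), Data(3)).toDS()

// toColumn turns the aggregator into a typed column usable in Dataset operations.
val total: Int = ds.select(SumData.toColumn).head()
```

The three type parameters map directly onto the method signatures: reduce consumes the input type, merge works only on the intermediate type, and finish converts the intermediate type into the output type.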