pyspark.sql.functions.bucket(numBuckets, col)
Partition transform function: A transform for any type that partitions by a hash of the input column.
New in version 3.1.0.
Changed in version 3.4.0: Supports Spark Connect.
Parameters
numBuckets : int or Column
    the number of buckets to hash into.
col : Column or str
    target column to work on.
Returns
Column
    data partitioned by given columns.
Notes
This function can be used only in combination with the partitionedBy() method of DataFrameWriterV2.
Examples
>>> df.writeTo("catalog.db.table").partitionedBy(
...     bucket(42, "ts")
... ).createOrReplace()
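Conceptually, the transform hashes each value of the input column and takes the result modulo the bucket count, so rows with equal values always land in the same bucket. A minimal pure-Python sketch of that idea (the `bucket_of` helper is hypothetical and uses CRC32 as a stand-in; Spark's actual implementation uses its own internal hash, not CRC32):

```python
import zlib

def bucket_of(num_buckets, value):
    # Hypothetical analogue of the bucket transform: hash the value,
    # then reduce it modulo the number of buckets. Spark hashes the
    # column's internal representation; here we hash the string form.
    h = zlib.crc32(str(value).encode("utf-8"))
    return h % num_buckets

# Equal values map to the same bucket; every bucket id is in [0, 42).
for ts in ["2023-01-01", "2023-01-02", "2023-01-01"]:
    print(ts, "-> bucket", bucket_of(42, ts))
```

This is only an illustration of hash bucketing; in Spark the assignment happens inside the table's partitioning layer, and you never call the hash yourself.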