pyspark.sql.functions.array_intersect
pyspark.sql.functions.array_intersect(col1: ColumnOrName, col2: ColumnOrName) → pyspark.sql.column.Column
Collection function: returns an array of the elements in the intersection of col1 and col2, without duplicates.
New in version 2.4.0.
Changed in version 3.4.0: Supports Spark Connect.
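Parameters
col1 : Column or str
name of column containing array
col2 : Column or str
name of column containing array
Returns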
Column
an array of values in the intersection of two arrays.
Examples
>>> from pyspark.sql import Row
>>> from pyspark.sql.functions import array_intersect
>>> df = spark.createDataFrame([Row(c1=["b", "a", "c"], c2=["c", "d", "a", "f"])])
>>> df.select(array_intersect(df.c1, df.c2)).collect()
[Row(array_intersect(c1, c2)=['a', 'c'])]
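To make the "intersection without duplicates" behavior concrete, the following plain-Python sketch models the result for a single row. It is only an illustration, not Spark's implementation: the helper name `array_intersect_model` is hypothetical, and keeping elements in the order of the first array is an assumption of this sketch (it agrees with the example above, but the description only promises that duplicates are removed).

```python
def array_intersect_model(a, b):
    """Plain-Python model (hypothetical helper, not part of PySpark).

    Returns the elements of `a` that also occur in `b`, each kept once.
    Assumption: results follow the order in which elements appear in `a`.
    """
    in_b = set(b)      # fast membership test against the second array
    seen = set()       # elements already emitted, used to drop duplicates
    result = []
    for item in a:
        if item in in_b and item not in seen:
            seen.add(item)
            result.append(item)
    return result


# Same inputs as the documented example.
print(array_intersect_model(["b", "a", "c"], ["c", "d", "a", "f"]))  # ['a', 'c']

# Duplicates on either side appear only once in the result.
print(array_intersect_model(["a", "a", "b"], ["a", "b", "b"]))  # ['a', 'b']
```

Applied row by row, this is roughly what the column expression computes for each pair of arrays; the model does not cover null elements or null arrays.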