public interface Encoder<T>
extends scala.Serializable
Used to convert a JVM object of type T to and from the internal Spark SQL representation.
== Scala ==
Encoders are generally created automatically through implicits from a SparkSession, or can be explicitly created by calling static methods on Encoders.
import spark.implicits._
val ds = Seq(1, 2, 3).toDS() // implicitly provided (spark.implicits.newIntEncoder)
== Java ==
Encoders are specified by calling static methods on Encoders.
List<String> data = Arrays.asList("abc", "abc", "xyz");
Dataset<String> ds = context.createDataset(data, Encoders.STRING());
Encoders can be composed into tuples:
Encoder<Tuple2<Integer, String>> encoder2 = Encoders.tuple(Encoders.INT(), Encoders.STRING());
List<Tuple2<Integer, String>> data2 = Arrays.asList(new scala.Tuple2<>(1, "a"));
Dataset<Tuple2<Integer, String>> ds2 = context.createDataset(data2, encoder2);
Or constructed from Java Beans:
Encoders.bean(MyClass.class);
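The bean form relies on `MyClass` following JavaBeans conventions: a public class with a public no-argument constructor and a getter/setter pair per field. The sketch below does not call Spark at all. It uses the JDK's `java.beans.Introspector` to list the properties such a class exposes, which is assumed to approximate what a bean-based encoder would see. The `MyClass` fields (`name`, `age`) and the `BeanSketch` helper are hypothetical names chosen for illustration.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class BeanSketch {
    // Hypothetical bean standing in for MyClass; the fields are illustrative only.
    public static class MyClass implements Serializable {
        private String name;
        private int age;

        public MyClass() {} // public no-argument constructor

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }

        public int getAge() { return age; }
        public void setAge(int age) { this.age = age; }
    }

    // Lists properties that have both a getter and a setter, as "name: Type".
    static List<String> properties(Class<?> beanClass) {
        try {
            // Stop at Object so the synthetic "class" property is excluded.
            BeanInfo info = Introspector.getBeanInfo(beanClass, Object.class);
            List<String> names = new ArrayList<>();
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                if (pd.getReadMethod() != null && pd.getWriteMethod() != null) {
                    names.add(pd.getName() + ": " + pd.getPropertyType().getSimpleName());
                }
            }
            return names;
        } catch (IntrospectionException e) {
            throw new IllegalStateException("Could not introspect " + beanClass, e);
        }
    }
}
```

Calling `BeanSketch.properties(BeanSketch.MyClass.class)` reports the `age` and `name` properties. A field without a matching getter and setter would not appear in that list.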
== Implementation ==
- Encoders are not required to be thread-safe, and thus they do not need to use locks to guard against concurrent access if they reuse internal buffers to improve performance.
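To make the implementation note concrete, the sketch below shows one way code that shares a buffer-reusing, lock-free object across threads might confine it: give each thread its own instance through `ThreadLocal`. `ReusingIntCodec` and `ThreadConfinedCodecSketch` are hypothetical classes, not part of Spark, and this is not a description of how Spark itself manages encoders.

```java
import java.nio.ByteBuffer;

class ThreadConfinedCodecSketch {
    // Hypothetical codec: reuses one internal buffer and takes no locks,
    // so a single instance must not be used by two threads at once.
    static final class ReusingIntCodec {
        private final ByteBuffer buffer = ByteBuffer.allocate(Integer.BYTES);

        byte[] encode(int value) {
            buffer.clear();
            buffer.putInt(value);
            // Copy out so the next call cannot overwrite the returned bytes.
            return buffer.array().clone();
        }

        int decode(byte[] bytes) {
            buffer.clear();
            buffer.put(bytes, 0, Integer.BYTES);
            buffer.flip();
            return buffer.getInt();
        }
    }

    // Each thread lazily gets its own codec, so the buffer is never shared.
    private static final ThreadLocal<ReusingIntCodec> CODEC =
            ThreadLocal.withInitial(ReusingIntCodec::new);

    static int roundTrip(int value) {
        ReusingIntCodec codec = CODEC.get();
        return codec.decode(codec.encode(value));
    }
}
```

The trade-off is one buffer per thread instead of lock contention on a shared one, which matches the performance motivation stated above.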
Modifier and Type | Method and Description |
---|---|
scala.reflect.ClassTag<T> | clsTag(): A ClassTag that can be used to construct an Array to contain a collection of `T`. |
StructType | schema(): Returns the schema of encoding this type of object as a Row. |
StructType schema()
scala.reflect.ClassTag<T> clsTag()