public abstract class BaseRelation
extends Object

Represents a collection of tuples with a known schema. Classes that extend BaseRelation must be able to produce the schema of their data in the form of a StructType. Concrete implementations should inherit from one of the descendant Scan classes, which define various abstract methods for execution.

BaseRelations must also define an equality function that only returns true when the two instances will return the same data. This equality function is used when determining when it is safe to substitute cached results for a given relation.
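As a point of reference, the following is a minimal sketch in Scala of a concrete relation built against this interface. Only BaseRelation, TableScan, and the referenced Spark types are real API; the class name, the in-memory data, and the two-column schema are illustrative assumptions.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.sql.sources.{BaseRelation, TableScan}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

// Hypothetical in-memory relation; everything except the Spark types is illustrative.
class ExampleRelation(val data: Seq[(Int, String)])(@transient val sqlContext: SQLContext)
  extends BaseRelation with TableScan {

  // The schema Spark uses to interpret the rows produced by buildScan().
  override def schema: StructType =
    StructType(Seq(
      StructField("id", IntegerType, nullable = false),
      StructField("name", StringType, nullable = true)))

  // TableScan supplies the execution method: a full scan returned as an RDD of Rows.
  override def buildScan(): RDD[Row] =
    sqlContext.sparkContext.parallelize(data.map { case (id, name) => Row(id, name) })

  // Equality is based on the underlying data, so cached results can be
  // substituted safely for an equivalent relation.
  override def equals(other: Any): Boolean = other match {
    case that: ExampleRelation => that.data == data
    case _ => false
  }

  override def hashCode(): Int = data.hashCode()
}
```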
| Constructor and Description |
|---|
| BaseRelation() |
| Modifier and Type | Method and Description |
|---|---|
| boolean | needConversion() Whether it needs to convert the objects in Row to the internal representation, for example: java.lang.String -> UTF8String, java.math.BigDecimal -> Decimal |
| abstract StructType | schema() |
| long | sizeInBytes() Returns an estimated size of this relation in bytes. |
| abstract SQLContext | sqlContext() |
| Filter[] | unhandledFilters(Filter[] filters) Returns the list of Filters that this data source may not be able to handle. |
public abstract SQLContext sqlContext()
public abstract StructType schema()
public long sizeInBytes()
Note that it is always better to overestimate size than underestimate, because underestimation could lead to execution plans that are suboptimal (e.g. broadcasting a very large table).
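A hedged sketch of how a source that knows its on-disk footprint might override this method; the file-based estimate and the single-column schema below are assumptions, not part of the Spark API.

```scala
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.sources.BaseRelation
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Sketch only: the path-based size estimate stands in for whatever metadata a real source has.
class SizedRelation(path: String)(@transient val sqlContext: SQLContext)
  extends BaseRelation {

  override def schema: StructType =
    StructType(Seq(StructField("value", StringType, nullable = true)))

  // Report a known on-disk size so the planner can decide whether a broadcast
  // join is safe; when in doubt, overestimate rather than underestimate.
  override def sizeInBytes: Long = {
    val files = Option(new java.io.File(path).listFiles()).getOrElse(Array.empty[java.io.File])
    if (files.isEmpty) Long.MaxValue else files.map(_.length()).sum
  }
}
```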
public boolean needConversion()
If needConversion is false, buildScan() should return an RDD of InternalRow.
Note: The internal representation is not stable across releases and thus data sources outside of Spark SQL should leave this as true.
public Filter[] unhandledFilters(Filter[] filters)
Returns the list of Filters that this data source may not be able to handle. These returned Filters will be evaluated by Spark SQL after data is output by a scan.

By default, this function will return all filters, as it is always safe to double evaluate a Filter. However, specific implementations can override this function to avoid double filtering when they are capable of processing a filter internally.

Parameters:
filters - (undocumented)
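As a sketch, a relation backed by an index on an id column might claim equality filters on that column and hand everything else back to Spark SQL; the class, the column name, and the filtering policy below are assumptions.

```scala
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.sources.{BaseRelation, EqualTo, Filter}
import org.apache.spark.sql.types.{IntegerType, StructField, StructType}

// Illustrative relation that claims to handle equality filters on "id"
// internally (e.g. via an index); only the Spark types are real API.
class IndexedRelation(@transient val sqlContext: SQLContext) extends BaseRelation {

  override def schema: StructType =
    StructType(Seq(StructField("id", IntegerType, nullable = false)))

  // Everything except EqualTo("id", _) is returned, so Spark SQL will
  // re-evaluate those remaining filters after the scan.
  override def unhandledFilters(filters: Array[Filter]): Array[Filter] =
    filters.filterNot {
      case EqualTo("id", _) => true
      case _ => false
    }
}
```

In practice such a relation would also mix in PrunedFilteredScan and apply the claimed EqualTo filters inside buildScan(), so that the rows it produces actually satisfy them.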