public class InsertIntoHiveTable extends org.apache.spark.sql.catalyst.plans.logical.LogicalPlan implements SaveAsHiveFile, scala.Product, scala.Serializable
This class is mostly a mess, for legacy reasons: it evolved in organic ways and had to closely follow Hive's internal implementations, which were themselves a mess. Please don't blame Reynold for this! He was just moving code around!
In the future we should converge the write path for Hive with the normal data source write path, as defined in org.apache.spark.sql.execution.datasources.FileFormatWriter.
param: table the metadata of the table.
param: partition a map from the partition key to the partition value (optional). If a partition value is None, a dynamic partition insert is performed for that column. For example, INSERT INTO tbl PARTITION (a=1, b=2) AS ... would have Map('a' -> Some('1'), 'b' -> Some('2')), and INSERT INTO tbl PARTITION (a=1, b) AS ... would have Map('a' -> Some('1'), 'b' -> None).
param: query the logical plan representing the data to write.
param: overwrite whether to overwrite the existing table or partitions.
param: ifPartitionNotExists if true, only write if the partition does not exist. Only valid for static partitions.
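For illustration, the two partition specifications above correspond to the following Scala values; this is only a sketch of the mapping described in the param docs, not part of this class's API.

```scala
// INSERT INTO tbl PARTITION (a=1, b=2) ...  -- fully static specification
val staticSpec: Map[String, Option[String]] = Map("a" -> Some("1"), "b" -> Some("2"))

// INSERT INTO tbl PARTITION (a=1, b) ...    -- `b` has no value, so it is dynamic
val dynamicSpec: Map[String, Option[String]] = Map("a" -> Some("1"), "b" -> None)
```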
| Constructor and Description |
|---|
| `InsertIntoHiveTable(org.apache.spark.sql.catalyst.catalog.CatalogTable table, scala.collection.immutable.Map<String,scala.Option<String>> partition, org.apache.spark.sql.catalyst.plans.logical.LogicalPlan query, boolean overwrite, boolean ifPartitionNotExists, scala.collection.Seq<String> outputColumnNames)` |
| Modifier and Type | Method and Description |
|---|---|
| `abstract static R` | `apply(T1 v1, T2 v2, T3 v3, T4 v4, T5 v5, T6 v6)` |
| `boolean` | `ifPartitionNotExists()` |
| `scala.collection.Seq<String>` | `outputColumnNames()` |
| `boolean` | `overwrite()` |
| `scala.collection.immutable.Map<String,scala.Option<String>>` | `partition()` |
| `org.apache.spark.sql.catalyst.plans.logical.LogicalPlan` | `query()` |
| `scala.collection.Seq<Row>` | `run(SparkSession sparkSession, org.apache.spark.sql.execution.SparkPlan child)` Inserts all the rows in the table into Hive. |
| `org.apache.spark.sql.catalyst.catalog.CatalogTable` | `table()` |
| `static String` | `toString()` |
Methods inherited from class org.apache.spark.sql.catalyst.plans.logical.LogicalPlan:
analyzed, assertNotAnalysisRule, childrenResolved, constraints, constructIsNotNullConstraints, inferAdditionalConstraints, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, invalidateStatsCache, isStreaming, isTraceEnabled, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning, maxRows, maxRowsPerPartition, org$apache$spark$internal$Logging$$log__$eq, org$apache$spark$internal$Logging$$log_, org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$_analyzed_$eq, org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$_analyzed, outputOrdering, refresh, resolve, resolve, resolveChildren, resolved, resolveExpressions, resolveOperators, resolveOperatorsDown, resolveOperatorsUp, resolveQuoted, setAnalyzed, statePrefix, stats, statsCache_$eq, statsCache, transformAllExpressions, transformDown, transformUp, validConstraints, verboseStringWithSuffix

Methods inherited from class org.apache.spark.sql.catalyst.plans.QueryPlan:
allAttributes, canEvaluate, canEvaluateWithinJoin, canonicalized, conf, doCanonicalize, expressions, innerChildren, inputSet, isCanonicalizedPlan, mapExpressions, missingInput, normalizeExprId, normalizePredicates, org$apache$spark$sql$catalyst$plans$QueryPlan$$recursiveTransform$1, org$apache$spark$sql$catalyst$plans$QueryPlan$$seqToExpressions$1, output, outputSet, printSchema, producedAttributes, references, replaceAlias, sameResult, schema, schemaString, semanticHash, simpleString, splitConjunctivePredicates, splitDisjunctivePredicates, subqueries, transformExpressions, transformExpressionsDown, transformExpressionsUp, verboseString

Methods inherited from class org.apache.spark.sql.catalyst.trees.TreeNode:
apply, argString, asCode, children, collect, collectFirst, collectLeaves, containsChild, fastEquals, find, flatMap, foreach, foreachUp, generateTreeString, generateTreeString$default$5, generateTreeString$default$6, hashCode, jsonFields, makeCopy, map, mapChildren, mapProductIterator, nodeName, numberedTreeString, org$apache$spark$sql$catalyst$trees$TreeNode$$allChildren, org$apache$spark$sql$catalyst$trees$TreeNode$$collectJsonValue$1, org$apache$spark$sql$catalyst$trees$TreeNode$$getNodeNumbered, org$apache$spark$sql$catalyst$trees$TreeNode$$mapChild$1, org$apache$spark$sql$catalyst$trees$TreeNode$$mapChild$2, org$apache$spark$sql$catalyst$trees$TreeNode$$mapTreeNode$1, org$apache$spark$sql$catalyst$trees$TreeNode$$parseToJson, org$apache$spark$sql$catalyst$trees$TreeNode$$simpleClassName, origin, otherCopyArgs, p, prettyJson, productIterator, productPrefix, stringArgs, toJSON, toString, transform, treeString, treeString, treeString$default$2, withNewChildren

Methods inherited from interface org.apache.spark.sql.hive.execution.SaveAsHiveFile:
createdTempDir, deleteExternalTmpPath, executionId, getExternalScratchDir, getExternalTmpPath, getExtTmpPathRelTo, getStagingDir, newVersionExternalTempPath, oldVersionExternalTempPath, saveAsHiveFile

Methods inherited from interface org.apache.spark.sql.execution.command.DataWritingCommand:
basicWriteJobStatsTracker, children, metrics, outputColumns

Methods inherited from interface scala.Product:
productArity, productElement, productIterator, productPrefix
Methods inherited from interface org.apache.spark.internal.Logging:
initializeLogging, log_
public InsertIntoHiveTable(org.apache.spark.sql.catalyst.catalog.CatalogTable table, scala.collection.immutable.Map<String,scala.Option<String>> partition, org.apache.spark.sql.catalyst.plans.logical.LogicalPlan query, boolean overwrite, boolean ifPartitionNotExists, scala.collection.Seq<String> outputColumnNames)
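As a rough illustration of the constructor's arguments, a sketch only: in practice this command is produced by Spark's analyzer when it plans an INSERT into a Hive table, and `catalogTable` and `queryPlan` below are assumed to come from the session catalog and a resolved query.

```scala
import org.apache.spark.sql.catalyst.catalog.CatalogTable
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.hive.execution.InsertIntoHiveTable

// Sketch only: arguments mirror the constructor summary above.
def buildHiveInsert(catalogTable: CatalogTable, queryPlan: LogicalPlan): InsertIntoHiveTable =
  InsertIntoHiveTable(
    table = catalogTable,
    partition = Map("a" -> Some("1"), "b" -> None), // static a=1, dynamic b
    query = queryPlan,
    overwrite = false,
    ifPartitionNotExists = false,
    outputColumnNames = queryPlan.output.map(_.name)
  )
```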
public abstract static R apply(T1 v1, T2 v2, T3 v3, T4 v4, T5 v5, T6 v6)
public static String toString()
public org.apache.spark.sql.catalyst.catalog.CatalogTable table()
public scala.collection.immutable.Map<String,scala.Option<String>> partition()
public org.apache.spark.sql.catalyst.plans.logical.LogicalPlan query()
Specified by: query in interface org.apache.spark.sql.execution.command.DataWritingCommand
public boolean overwrite()
public boolean ifPartitionNotExists()
public scala.collection.Seq<String> outputColumnNames()
Specified by: outputColumnNames in interface org.apache.spark.sql.execution.command.DataWritingCommand
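Because the class is a Product (a Scala case class), its accessors can be read off a resolved plan, for example in a test or a custom inspection rule. A hedged sketch follows; the function name and usage are illustrative only.

```scala
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.hive.execution.InsertIntoHiveTable

// Pattern match on a plan and summarize the insert command's fields via its accessors.
def describeHiveInsert(plan: LogicalPlan): Option[String] = plan match {
  case insert: InsertIntoHiveTable =>
    Some(
      s"table=${insert.table.identifier}, partition=${insert.partition}, " +
        s"overwrite=${insert.overwrite}, ifPartitionNotExists=${insert.ifPartitionNotExists}, " +
        s"outputColumnNames=${insert.outputColumnNames.mkString(",")}"
    )
  case _ => None
}
```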
public scala.collection.Seq<Row> run(SparkSession sparkSession, org.apache.spark.sql.execution.SparkPlan child)
Inserts all the rows in the table into Hive, using the org.apache.hadoop.hive.serde2.SerDe and the org.apache.hadoop.mapred.OutputFormat provided by the table definition.
Specified by: run in interface org.apache.spark.sql.execution.command.DataWritingCommand
Parameters: sparkSession - (undocumented), child - (undocumented)
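For context, a minimal sketch of the user-facing path that Spark plans as this command; the table and column names are illustrative, and Hive support must be enabled on the session.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("hive-insert-example")
  .enableHiveSupport() // the Hive write path, and hence this command, requires Hive support
  .getOrCreate()

// Static partition insert: both partition values are literals.
spark.sql("INSERT INTO TABLE tbl PARTITION (a=1, b=2) SELECT col FROM src")

// Dynamic partition insert: `b` has no value and is resolved per row at write time.
spark.sql("INSERT INTO TABLE tbl PARTITION (a=1, b) SELECT col, b FROM src")
```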