pyspark.SparkConf¶
-
class
pyspark.
SparkConf
(loadDefaults=True, _jvm=None, _jconf=None)[source]¶ Configuration for a Spark application. Used to set various Spark parameters as key-value pairs.
Most of the time, you would create a SparkConf object with
SparkConf()
, which will load values from spark.* Java system properties as well. In this case, any parameters you set directly on theSparkConf
object take priority over system properties.For unit tests, you can also call
SparkConf(false)
to skip loading external settings and get the same configuration no matter what the system properties are.All setter methods in this class support chaining. For example, you can write
conf.setMaster("local").setAppName("My app")
.- Parameters
- loadDefaultsbool
whether to load values from Java system properties (True by default)
- _jvmclass:py4j.java_gateway.JVMView
internal parameter used to pass a handle to the Java VM; does not need to be set by users
- _jconfclass:py4j.java_gateway.JavaObject
Optionally pass in an existing SparkConf handle to use its parameters
Notes
Once a SparkConf object is passed to Spark, it is cloned and can no longer be modified by the user.
Examples
>>> from pyspark.conf import SparkConf >>> from pyspark.context import SparkContext >>> conf = SparkConf() >>> conf.setMaster("local").setAppName("My app") <pyspark.conf.SparkConf object at ...> >>> conf.get("spark.master") 'local' >>> conf.get("spark.app.name") 'My app' >>> sc = SparkContext(conf=conf) >>> sc.master 'local' >>> sc.appName 'My app' >>> sc.sparkHome is None True
>>> conf = SparkConf(loadDefaults=False) >>> conf.setSparkHome("/path") <pyspark.conf.SparkConf object at ...> >>> conf.get("spark.home") '/path' >>> conf.setExecutorEnv("VAR1", "value1") <pyspark.conf.SparkConf object at ...> >>> conf.setExecutorEnv(pairs = [("VAR3", "value3"), ("VAR4", "value4")]) <pyspark.conf.SparkConf object at ...> >>> conf.get("spark.executorEnv.VAR1") 'value1' >>> print(conf.toDebugString()) spark.executorEnv.VAR1=value1 spark.executorEnv.VAR3=value3 spark.executorEnv.VAR4=value4 spark.home=/path >>> for p in sorted(conf.getAll(), key=lambda p: p[0]): ... print(p) ('spark.executorEnv.VAR1', 'value1') ('spark.executorEnv.VAR3', 'value3') ('spark.executorEnv.VAR4', 'value4') ('spark.home', '/path') >>> conf._jconf.setExecutorEnv("VAR5", "value5") JavaObject id... >>> print(conf.toDebugString()) spark.executorEnv.VAR1=value1 spark.executorEnv.VAR3=value3 spark.executorEnv.VAR4=value4 spark.executorEnv.VAR5=value5 spark.home=/path
Methods
contains
(key)Does this configuration contain a given key?
get
(key[, defaultValue])Get the configured value for some key, or return a default otherwise.
getAll
()Get all values as a list of key-value pairs.
set
(key, value)Set a configuration property.
setAll
(pairs)Set multiple parameters, passed as a list of key-value pairs.
setAppName
(value)Set application name.
setExecutorEnv
([key, value, pairs])Set an environment variable to be passed to executors.
setIfMissing
(key, value)Set a configuration property, if not already set.
setMaster
(value)Set master URL to connect to.
setSparkHome
(value)Set path where Spark is installed on worker nodes.
Returns a printable version of the configuration, as a list of key=value pairs, one per line.