Interface | Description |
---|---|
SparkAppHandle | A handle to a running Spark application. |
SparkAppHandle.Listener | Listener for updates to a handle's state. |
Class | Description |
---|---|
AbstractLauncher<T extends AbstractLauncher> | Base class for launcher implementations. |
InProcessLauncher | In-process launcher for Spark applications. |
SparkLauncher | Launcher for Spark applications. |
SparkLauncher |
Launcher for Spark applications.
|
Enum | Description |
---|---|
SparkAppHandle.State | Represents the application's state. |
There are two ways to start applications with this library: as a child process, using `SparkLauncher`, or in-process, using `InProcessLauncher`. The `AbstractLauncher.startApplication(SparkAppHandle.Listener...)` method can be used to start Spark and provide a handle to monitor and control the running application:
```java
import org.apache.spark.launcher.SparkAppHandle;
import org.apache.spark.launcher.SparkLauncher;

public class MyLauncher {
  public static void main(String[] args) throws Exception {
    SparkAppHandle handle = new SparkLauncher()
        .setAppResource("/my/app.jar")
        .setMainClass("my.spark.app.Main")
        .setMaster("local")
        .setConf(SparkLauncher.DRIVER_MEMORY, "2g")
        .startApplication();
    // Use handle API to monitor / control application.
  }
}
```
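For example, a listener can be used to block until the application reaches a terminal state. The sketch below assumes a Spark installation is available; `stateChanged` and `infoChanged` are the two callbacks defined by `SparkAppHandle.Listener`, and `SparkAppHandle.State.isFinal()` reports whether a state is terminal:

```java
import java.util.concurrent.CountDownLatch;
import org.apache.spark.launcher.SparkAppHandle;
import org.apache.spark.launcher.SparkLauncher;

public class MyListeningLauncher {
  public static void main(String[] args) throws Exception {
    CountDownLatch done = new CountDownLatch(1);
    SparkAppHandle handle = new SparkLauncher()
        .setAppResource("/my/app.jar")
        .setMainClass("my.spark.app.Main")
        .setMaster("local")
        .startApplication(new SparkAppHandle.Listener() {
          @Override
          public void stateChanged(SparkAppHandle h) {
            // Called whenever the application transitions state
            // (e.g. CONNECTED, RUNNING, FINISHED, FAILED, KILLED).
            if (h.getState().isFinal()) {
              done.countDown();
            }
          }

          @Override
          public void infoChanged(SparkAppHandle h) {
            // Called when other information (such as the app ID) changes.
          }
        });
    done.await();  // block until the application reaches a final state
  }
}
```

The same handle also exposes `stop()` and `kill()` for controlling the application directly.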
Launching applications as a child process requires a full Spark installation. The installation directory can be provided to the launcher explicitly in the launcher's configuration, or by setting the SPARK_HOME environment variable.
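A brief sketch of the explicit option (the `/opt/spark` path is only an illustration); a directory set via `setSparkHome` is used in place of the `SPARK_HOME` environment variable:

```java
import org.apache.spark.launcher.SparkAppHandle;
import org.apache.spark.launcher.SparkLauncher;

public class MyLauncher {
  public static void main(String[] args) throws Exception {
    SparkAppHandle handle = new SparkLauncher()
        .setSparkHome("/opt/spark")  // explicit Spark installation directory
        .setAppResource("/my/app.jar")
        .setMainClass("my.spark.app.Main")
        .setMaster("local")
        .startApplication();
  }
}
```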
Launching applications in-process is only recommended in cluster mode, since Spark cannot run multiple client-mode applications concurrently in the same process. The in-process launcher requires the necessary Spark dependencies (such as spark-core and cluster manager-specific modules) to be present in the caller thread's class loader.
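A sketch of the in-process variant, assuming the application is submitted in cluster mode to YARN and the required Spark modules are on the caller's classpath (the resource path and class name are placeholders):

```java
import org.apache.spark.launcher.InProcessLauncher;
import org.apache.spark.launcher.SparkAppHandle;

public class MyInProcessLauncher {
  public static void main(String[] args) throws Exception {
    SparkAppHandle handle = new InProcessLauncher()
        .setAppResource("/my/app.jar")
        .setMainClass("my.spark.app.Main")
        .setMaster("yarn")
        .setDeployMode("cluster")  // in-process launching is only recommended in cluster mode
        .startApplication();
    // The handle API is the same as for the child-process launcher.
  }
}
```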
It's also possible to launch a raw child process, without the extra monitoring, using the `SparkLauncher.launch()` method:
```java
import org.apache.spark.launcher.SparkLauncher;

public class MyLauncher {
  public static void main(String[] args) throws Exception {
    Process spark = new SparkLauncher()
        .setAppResource("/my/app.jar")
        .setMainClass("my.spark.app.Main")
        .setMaster("local")
        .setConf(SparkLauncher.DRIVER_MEMORY, "2g")
        .launch();
    spark.waitFor();
  }
}
```
This method requires the calling code to manually manage the child process, including its output streams (to avoid possible deadlocks). It's recommended that `SparkLauncher.startApplication(SparkAppHandle.Listener...)` be used instead.
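If `launch()` is used anyway, the child's output must be consumed or redirected. A minimal sketch under those assumptions, draining the output on a separate thread with the standard `Process` API (an undrained pipe can fill up and block the child process):

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.spark.launcher.SparkLauncher;

public class MyRawLauncher {
  public static void main(String[] args) throws Exception {
    Process spark = new SparkLauncher()
        .setAppResource("/my/app.jar")
        .setMainClass("my.spark.app.Main")
        .setMaster("local")
        .redirectError()  // merge spark-submit's stderr into stdout
        .launch();

    // Drain the child's merged output so its pipe buffer never fills up.
    Thread drainer = new Thread(() -> {
      try (BufferedReader in =
          new BufferedReader(new InputStreamReader(spark.getInputStream()))) {
        String line;
        while ((line = in.readLine()) != null) {
          System.out.println(line);
        }
      } catch (Exception e) {
        // Stream closes when the child exits.
      }
    });
    drainer.start();
    int exitCode = spark.waitFor();
    drainer.join();
  }
}
```

Alternatively, `redirectOutput(ProcessBuilder.Redirect)` or `redirectToLog(String)` on the launcher can hand the streams off without a manual drain thread.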