Tuning Spark Cyclone

Tuning a Spark job improves performance in most cases. That said, finding a good configuration is rarely easy.

Number of Executors

To maximize job performance, set --num-executors equal to the number of Vector Engine cores.
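As a minimal sketch of the setting above, assuming a Vector Engine with 8 cores: the core count, the application jar path, and the variable names are placeholders chosen for illustration, not values defined by Spark Cyclone. The script only prints the resulting command so that it can be checked before it is run.

```shell
#!/bin/sh
# Assumption: the Vector Engine has 8 cores; adjust to match your hardware.
VE_CORES=8
# Hypothetical application jar; replace with your own.
APP_JAR=/path/to/app.jar

# One executor per Vector Engine core.
EXECUTOR_ARGS="--num-executors ${VE_CORES}"

# Print the command instead of running it.
echo "spark-submit ${EXECUTOR_ARGS} ${APP_JAR}"
```

Deriving the flag from a single variable keeps the executor count in step with the hardware if the job later moves to a machine with a different number of cores.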

Preventing OpenMP from Overloading the Vector Engine

If --num-executors is set to the number of Vector Engine cores, OpenMP might overload the Vector Engine by starting additional threads in each executor process. To prevent this, specify --conf spark.executorEnv.VE_OMP_NUM_THREADS=1, which limits each process to a single thread.
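The two settings in this section and the previous one are meant to be used together. The sketch below combines them, again assuming 8 Vector Engine cores and a placeholder jar path; only the option names come from this page, and the variable names are illustrative.

```shell
#!/bin/sh
# Assumption: 8 Vector Engine cores and a hypothetical application jar.
VE_CORES=8
APP_JAR=/path/to/app.jar

# Limit OpenMP to one thread in each executor process.
OMP_ARGS="--conf spark.executorEnv.VE_OMP_NUM_THREADS=1"

# Print the combined command instead of running it.
echo "spark-submit --num-executors ${VE_CORES} ${OMP_ARGS} ${APP_JAR}"
```

With one executor per core and one thread per executor, the number of threads competing for the Vector Engine should match the number of cores.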

Using a Precompiled Directory

Using a precompiled directory improves performance. On the first run, specify --conf spark.com.nec.spark.kernel.directory=/path/to/precompiled to set the kernel directory. On subsequent runs, specify --conf spark.com.nec.spark.kernel.precompiled=/path/to/precompiled to use the precompiled directory.
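The two-step procedure above might look like the following sketch. The directory is the same placeholder path used in the text, the jar path is hypothetical, and the variable names are illustrative; the script prints both commands rather than running them.

```shell
#!/bin/sh
# Placeholder directory from the text and a hypothetical application jar.
KERNEL_DIR=/path/to/precompiled
APP_JAR=/path/to/app.jar

# First run: point the kernel directory at KERNEL_DIR.
FIRST_RUN_ARGS="--conf spark.com.nec.spark.kernel.directory=${KERNEL_DIR}"

# Later runs: use the same directory as the precompiled directory.
LATER_RUN_ARGS="--conf spark.com.nec.spark.kernel.precompiled=${KERNEL_DIR}"

# Print both commands instead of running them.
echo "first run:  spark-submit ${FIRST_RUN_ARGS} ${APP_JAR}"
echo "later runs: spark-submit ${LATER_RUN_ARGS} ${APP_JAR}"
```

Keeping the path in one variable makes it harder for the two runs to point at different directories by accident.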