3.1.6.4 Configure Pyspark Interpreter

You must write the connection code either in the Initialization section or in a notebook paragraph. The pyspark interpreter is used to write PySpark code that queries and analyzes data stored in a big data environment. It requires additional configuration, which must be performed either as a prerequisite or after installation by manually changing the interpreter settings.

In the pyspark interpreter, you can configure the Python binary executable that PySpark uses on both the driver and the workers, and choose whether to use IPython: set the option to 'True' to use IPython, or to 'False' otherwise.

To configure the pyspark interpreter variant, follow these steps:
  1. On the left-hand side (LHS) menu of the Interpreter page, select pyspark. The pyspark interpreter pane is displayed.
  2. On the Interpreter Settings page, expand Interpreter Client Configurations and click the Edit icon for <Class Name> (zeppelin). The Interpreter Client Configurations window is displayed.
  3. Enter the following information in the pyspark interpreter variant pane, as described in Table 3-5.

    Table 3-5 pyspark interpreter

    Field                              Description
    zeppelin.pyspark.python            Enter the Python binary executable for PySpark on both the
                                       driver and the workers. The default value is python.
                                       For example, python
    zeppelin.pyspark.useIPython        Set to 'True' to use IPython; otherwise, set to 'False'.
    zeppelin.interpreter.output.limit  Output messages from the interpreter that exceed this limit
                                       are truncated.
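
Before entering the values in the pane, it can help to sanity-check them. The following is a minimal sketch, not part of the product: validate_pyspark_settings is a hypothetical helper, the output limit of 102400 is an illustrative value only, and the check for the Python executable runs on the local machine, so it does not confirm that the same binary exists on the driver and on every worker. Only the three property names are taken from Table 3-5.

```python
import shutil
import sys


def validate_pyspark_settings(settings):
    """Return a list of problems found in a dict of interpreter properties.

    Hypothetical helper for illustration; the keys match Table 3-5.
    """
    problems = []

    # The documented default for the Python binary is "python".
    python_bin = settings.get("zeppelin.pyspark.python", "python")
    if shutil.which(python_bin) is None:
        problems.append("Python executable not found locally: %s" % python_bin)

    # The table specifies exactly 'True' or 'False' for this field.
    use_ipython = settings.get("zeppelin.pyspark.useIPython")
    if use_ipython not in ("True", "False"):
        problems.append("zeppelin.pyspark.useIPython must be 'True' or 'False'")

    # Assumption: the output limit is a positive whole number when it is set.
    limit = settings.get("zeppelin.interpreter.output.limit")
    if limit is not None:
        try:
            if int(limit) <= 0:
                problems.append("zeppelin.interpreter.output.limit must be positive")
        except (TypeError, ValueError):
            problems.append("zeppelin.interpreter.output.limit must be an integer")

    return problems


# Example values only; replace them with the values for your environment.
example = {
    "zeppelin.pyspark.python": sys.executable,
    "zeppelin.pyspark.useIPython": "False",
    "zeppelin.interpreter.output.limit": "102400",
}

for problem in validate_pyspark_settings(example):
    print(problem)
```

An empty result means that none of the checks above failed; it does not guarantee that the interpreter starts successfully with those settings.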