
How To Run Scala In the Jupyter Notebook

I recently came across the need to run Scala programs in a notebook. Azure Notebooks is readily available for this, but it is a costly option for individuals who just want to experiment. The Jupyter notebook is one of the most widely used tools in data science projects. It is a great tool for developing software in Python, which it supports very well, and with the spylon-kernel it can also be used for Scala development.

I am writing this post for everyone who needs to run Scala programs in a Jupyter notebook.

There is a utility called spylon-kernel that lets Scala run on Jupyter.

Prerequisites:

Software –

  1. Spark (http://spark.apache.org/downloads.html)
  2. Hadoop (http://media.sundog-soft.com/Udemy/winutils.exe)
  3. JDK

Once you have downloaded all the software listed above, you will need to make a few modifications, listed below:

Spark:

  1. Create a folder for Spark on the C drive (e.g. C:\Spark) and copy all the contents of the downloaded .tar archive into the newly created folder.
  2. Rename log4j.properties.template to log4j.properties.
  3. Edit the same file, replace log4j.rootCategory=INFO, console with log4j.rootCategory=ERROR, console, and then save and close the file.
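After the edit in step 3, the relevant line in log4j.properties should read as follows (this silences Spark's verbose INFO logging on the console):

```properties
# conf/log4j.properties — log only errors to the console
log4j.rootCategory=ERROR, console
```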

Hadoop:

We perform the steps below so that Spark programs can run on the local machine.

  1. After you have successfully downloaded winutils.exe, create the folders C:\winutils\bin and C:\tmp\hive.
  2. Paste winutils.exe into the bin folder.
  3. Open a command prompt and run the following commands:

    – cd c:\winutils\bin

    – winutils.exe chmod 777 \tmp\hive

The command should complete successfully.

Environment variables:

  1. SPARK_HOME: e.g. “C:\Spark”
  2. HADOOP_HOME: e.g. “C:\winutils”
  3. JAVA_HOME: the path to your JDK installation

(Each variable should point to the root folder, not to its bin subfolder.)

After you have set the environment variables, check whether Spark runs:

– cd c:\spark

– pyspark

If everything is OK, you should see output like the image below.

For Scala on Jupyter, open an Anaconda prompt and run the following commands:

– pip install spylon-kernel

– python -m spylon_kernel install

– jupyter notebook

Once the installation is complete, you will see spylon-kernel in the New file dropdown.
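To confirm the kernel works, you can paste a quick sanity check into a new spylon-kernel notebook cell. This is just plain Scala (nothing spylon-specific), so if it runs, the kernel itself is wired up correctly:

```scala
// Quick sanity check for a new spylon-kernel notebook cell.
val nums = (1 to 10).toList          // a simple Scala collection
val squares = nums.map(n => n * n)   // squares of 1..10
println(squares.sum)                 // prints 385
```

If this cell prints 385, Scala is running; Spark-backed cells should work next, since spylon-kernel initializes a Spark session behind the scenes.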

If everything goes well, your Scala snippets should run like Usain Bolt (pun intended). If they do not, you will need to perform some additional steps, as follows.

You need to copy the contents of the following zip files

“C:\Spark\python\lib\py4j-0.10.8.1-src.zip”

“C:\Spark\python\lib\pyspark.zip”

to

\anaconda\Lib\site-packages

That’s it! Enjoy Scala with Jupyter.



Author: Tanvir
An aspiring digital marketer, a passionate singer, a guitarist and a mechanical engineer by degree. It would be so cool if I had lots of fans but the ceiling space is limited. You can find me on LinkedIn.
