- October 30, 2020
- Posted by: Swati.patel
- Category: Blogs
I recently came across the need to run Scala programs in a notebook. Azure Notebooks is readily available for this, but it is a costly solution for individuals who just want to experiment. The Jupyter Notebook is one of the most used tools in data science projects. It is a great tool for developing software in Python and has excellent support for it. With the spylon-kernel, it can also be used for Scala development.
I am writing this blog for anyone who needs to run Scala programs in a Jupyter Notebook.
There is a utility called spylon-kernel that lets Scala run on Jupyter.
Prerequisites:
Software –
- Spark (http://spark.apache.org/downloads.html)
- Hadoop (http://media.sundog-soft.com/Udemy/winutils.exe)
- JDK (https://www.oracle.com/in/java/technologies/downloads/)
Once you have downloaded all the software listed above, you will need to make a few modifications, listed below:
Spark:
- Create a folder for Spark on the C drive (e.g. C:\Spark) and copy all the content of the downloaded .tar into it (an example extraction command follows this list).
- Rename log4j.properties.template to log4j.properties in the conf folder.
- Edit the same file, replace log4j.rootCategory=INFO, console with log4j.rootCategory=ERROR, console, then save and close the file.
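For reference, here is a minimal sketch of the extraction from a command prompt (Windows 10 and later ship with tar; the archive name is a placeholder for whatever version you downloaded):
– tar -xvzf spark-<version>-bin-hadoop<version>.tgz -C C:\Spark
The archive extracts into its own subfolder, so move that subfolder’s contents up into C:\Spark afterwards.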
Hadoop:
The steps below are needed to execute Spark programs on a local Windows machine.
- After you have successfully downloaded winutils.exe, create the folders C:\winutils\bin and C:\tmp\hive.
- Paste winutils.exe into the bin folder.
- Open a command prompt and run the following commands:
– cd c:\winutils\bin
– winutils.exe chmod 777 \tmp\hive
- It should run successfully.
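If you want to double-check that the permissions were applied, winutils also has an ls command (the exact output format is an assumption on my part; what matters is seeing rwx for everyone):
– winutils.exe ls \tmp\hive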
Environment variables:
- SPARK_HOME: eg: “C:\Spark”
- HADOOP_HOME: “C:\winutils”
- JAVA_HOME
(all paths should point to the folder itself, not to its bin subfolder)
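If you prefer the command prompt over the System Properties dialog, setx can set these (a sketch; the JDK path is a placeholder for wherever your JDK actually lives):
– setx SPARK_HOME "C:\Spark"
– setx HADOOP_HOME "C:\winutils"
– setx JAVA_HOME "C:\path\to\your\jdk"
Note that setx only takes effect in newly opened command prompts.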
After you have set the environment variables, open a new command prompt and check whether Spark runs:
– cd c:\spark\bin
– pyspark
If everything is OK, you should see the Spark welcome banner followed by a Python >>> prompt.
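Since the end goal here is Scala, it is worth smoke-testing the Scala shell too (spark is the session object spark-shell predefines; the snippet just counts a generated range):
– spark-shell
scala> spark.range(100).count()
This should print res0: Long = 100.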
For Jupyter Scala, open the Anaconda prompt and run the following commands.
– pip install spylon-kernel
– python -m spylon_kernel install
– jupyter notebook
Once the installation is complete, you will see spylon-kernel in the New file dropdown.
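To verify the kernel, open a new spylon-kernel notebook and run a small Scala snippet, for example (a minimal sketch; spylon-kernel launches Spark on the first cell and predefines spark and sc, so the first run takes a while):
val nums = spark.range(1, 101) // Dataset of the numbers 1 to 100
nums.count() // should evaluate to 100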
If everything goes well, the Scala snippets should run like Usain Bolt (pun intended). If they do not, you will need to perform some additional steps, as follows.
You need to copy the contents of the following zip files:
“C:\Spark\python\lib\py4j-0.10.8.1-src.zip”
“C:\Spark\python\lib\pyspark.zip”
TO
\anaconda\Lib\site-packages
That’s it! Enjoy Scala with Jupyter.