This post covers how to install Jupyter Notebook on an AWS EC2 instance for machine learning and Python scripting.
We will also install boto3, the AWS SDK for Python, so that AWS services can be used from the notebook.
In real life, machine learning and deep learning are memory- and compute-intensive, and a standalone desktop often cannot keep up. In these cases we can use the cloud and choose a server with the right CPU/GPU for processing large data sets. For this post I have chosen a micro instance so that anyone can launch it and learn; it also costs little, and is free for those still within the one-year free tier eligibility.
Note: AWS provides ML AMIs and AWS SageMaker for ML development. The approach below is another way to develop ML, and it gives us more flexibility to install software as needed, with full root access.
What we will do:
1) Launch an EC2 instance (micro).
2) Make a small change to the default security group so the Jupyter Notebook on the EC2 instance can be reached from a browser.
3) Install Anaconda.
4) Configure Jupyter.
5) Install boto3, the AWS package.
6) Launch the Jupyter Notebook from a browser by connecting to the EC2 server.
I assume you already know how to launch a new EC2 instance. If not, please check the link below, and remember to change the security group as described here.
Launching EC2 server: https://awsontop.com/index.php/how-to-start-my-first-ec2-instance-part1/
During the EC2 security group configuration, add an inbound rule for TCP port 8888 and open it to the public. This allows the Jupyter Notebook, once configured, to be opened from a browser.
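The post does this step in the AWS console. For anyone who prefers to script it, here is a minimal sketch of the same rule using boto3. The function names, the group ID and the region in the usage comment are hypothetical placeholders, and the second function assumes boto3 is installed and AWS credentials are configured on the machine running it.

```python
def build_jupyter_ingress_rule(port=8888, cidr="0.0.0.0/0"):
    """Return an IpPermissions entry allowing inbound TCP on `port` from `cidr`."""
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [{"CidrIp": cidr, "Description": "Jupyter Notebook"}],
    }


def open_jupyter_port(group_id, region_name, port=8888, cidr="0.0.0.0/0"):
    """Add the rule to an existing security group.

    Assumes boto3 is installed and AWS credentials are available.
    """
    import boto3  # imported here so the helper above works without boto3

    ec2 = boto3.client("ec2", region_name=region_name)
    return ec2.authorize_security_group_ingress(
        GroupId=group_id,
        IpPermissions=[build_jupyter_ingress_rule(port, cidr)],
    )


# Hypothetical usage (replace with your own security group ID and region):
#   open_jupyter_port("sg-0123456789abcdef0", "us-east-1")
```

The default CIDR 0.0.0.0/0 matches the "open to public" setting in this post. Passing your own IP as a /32 CIDR is a narrower option if the notebook does not need to be public.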
- SSH into the EC2 instance you just launched. If you are not sure how to SSH, please check https://awsontop.com/index.php/how-to-start-my-first-ec2-instance-part2/.
- Run ‘sudo su’ to get root access, then ‘yum update -y’ to make sure all the latest packages are installed.
- Check https://repo.continuum.io/archive/ for the latest available Anaconda installer. I have used ‘https://repo.continuum.io/archive/Anaconda3-4.4.0-Linux-x86_64.sh’.
- As root, enter the command below to download Anaconda,
- wget https://repo.continuum.io/archive/Anaconda3-4.4.0-Linux-x86_64.sh
- Run the downloaded installer with ‘bash Anaconda3-4.4.0-Linux-x86_64.sh’ and keep pressing ‘Enter’ to page through the license agreement, then accept it.
- The installer also proposes a default install location. If you want to change it, provide your own location; otherwise press ‘Enter’ again.
- Note the path where it is installed, in this case /root/anaconda3.
- That’s it, the Anaconda installation is complete.
- You may notice an issue when you look up the python location with the ‘which python’ command. For me it pointed to the preinstalled python (i.e. ‘/bin/python’), while the new installation went into ‘/root/anaconda3’. So ‘which python’ resolved to the old interpreter rather than the new one.
- You may also get another familiar error, ‘bash: ipython: command not found’, a little later when we try to set the password for the Jupyter notebook. The cause is the same: the shell is not looking in the Anaconda bin directory.
- If you face the above problems follow this step, else skip it. Fixing both issues is simple: edit the bash profile as below and start a new SSH session so that the new profile takes effect.
Go to the home directory with ‘cd ~’ and edit .bash_profile to add the lines below,
export IPYTHON_HOME=/root/anaconda3
export PATH=$IPYTHON_HOME/bin:$PATH
save and exit.
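Once a new session is open, a quick way to confirm that the shell now resolves the Anaconda tools is a small Python check like the sketch below. It assumes the install path /root/anaconda3 from this post. The helper names are mine and purely illustrative.

```python
import os
import shutil
import sys

ANACONDA_HOME = "/root/anaconda3"  # install path noted earlier in this post


def is_inside(path, prefix):
    """True if `path` is located under the directory `prefix`."""
    if not path:
        return False
    path = os.path.realpath(path)
    prefix = os.path.realpath(prefix)
    return os.path.commonpath([path, prefix]) == prefix


def report(anaconda_home=ANACONDA_HOME):
    """Print where python, ipython and jupyter are found on the PATH."""
    for tool in ("python", "ipython", "jupyter"):
        found = shutil.which(tool)  # same lookup the shell's `which` performs
        status = "OK" if is_inside(found, anaconda_home) else "NOT from Anaconda"
        print(f"{tool}: {found} -> {status}")
    print("running interpreter:", sys.executable)


# Usage: save as check_path.py and run `python check_path.py` after adding
# a call to report() at the bottom, or call report() from a Python prompt.
```

If any tool is reported as not coming from Anaconda, the profile change has not taken effect in the current session.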
- If you faced the above problem, launch a new session; otherwise continue with the same session. Then check your python location and version using ‘which python’ and ‘python --version’.
- Create a password for accessing Jupyter from the browser using the ipython library. IMPORTANT: after verifying the password, copy the generated hash (it starts with ‘sha1:’); it will be used as input when editing the Jupyter config file later.
- Configure the installed Anaconda/Jupyter by generating a config file with ‘jupyter notebook --generate-config’ and providing the password hash generated above.
- Create a certificate by generating a pem file. This is a different file from your AWS pem file.
- Create a directory to store the certificate (here /home/ec2-user/certs) and run the ssl command to generate the pem file as below
mkdir certs
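A minimal sketch of the certificate step, assuming the usual self-signed `openssl req` form. The exact options used for this post are not shown here, so the validity period and key size below are my assumptions. The file name jupyterpython1.pem is the one referenced in the Jupyter config later in this post. Key and certificate go into the same pem file, because the config only sets a certfile.

```python
import subprocess


def build_cert_command(pem_path, days=365, key_bits=2048):
    """Build an `openssl req` command for a self-signed certificate.

    The private key and the certificate are written to the same .pem file.
    """
    return [
        "openssl", "req", "-x509", "-nodes",
        "-days", str(days),
        "-newkey", f"rsa:{key_bits}",
        "-keyout", pem_path,
        "-out", pem_path,
    ]


def generate_cert(pem_path):
    """Run openssl interactively; it prompts for country, state, city, etc."""
    subprocess.run(build_cert_command(pem_path), check=True)


# Usage, from inside the certs directory:
#   generate_cert("jupyterpython1.pem")
# Or print the command and paste it into the shell:
#   print(" ".join(build_cert_command("jupyterpython1.pem")))
```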
- It will ask for a few inputs such as country, state, city and others. Provide them; they do not need to be perfect.
- Now edit the already generated ‘.jupyter/jupyter_notebook_config.py’ config file to add the above certificate and password.
vi ~/.jupyter/jupyter_notebook_config.py
- Go to the last line and add the below code,
c = get_config()
# Kernel config
c.IPKernelApp.pylab = 'inline'
# Notebook config
c.NotebookApp.certfile = u'/home/ec2-user/certs/jupyterpython1.pem'
c.NotebookApp.ip = '*'
c.NotebookApp.open_browser = False
c.NotebookApp.password = u'sha1:<<the password generated as above>>'
c.NotebookApp.port = 8888  # same port number as configured in SG setup.
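To make the ‘sha1:...’ password value less mysterious, here is a small stdlib-only sketch of how such a string can be built and checked. To my understanding it mirrors the salted SHA-1 layout (algorithm, salt, digest, joined by colons) that the ipython/notebook password helper produces, but treat that as an assumption. The function names are mine, and for the real config you should paste the hash printed by the helper itself.

```python
import hashlib
import secrets


def make_password_hash(passphrase, salt_len=12):
    """Return 'sha1:<salt>:<digest>' for the given passphrase."""
    salt = secrets.token_hex(salt_len // 2)  # salt_len hex characters
    digest = hashlib.sha1((passphrase + salt).encode("utf-8")).hexdigest()
    return ":".join(("sha1", salt, digest))


def check_password(stored, passphrase):
    """True if `passphrase` matches a stored 'sha1:<salt>:<digest>' string."""
    try:
        algorithm, salt, digest = stored.split(":", 2)
    except ValueError:
        return False
    if algorithm != "sha1":
        return False
    candidate = hashlib.sha1((passphrase + salt).encode("utf-8")).hexdigest()
    return secrets.compare_digest(candidate, digest)


# Usage:
#   stored = make_password_hash("my notebook password")
#   check_password(stored, "my notebook password")
```

The point of the salt is that the same password gives a different string every time, so the value in the config file does not reveal whether two servers share a password.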
Now create a working directory for the Jupyter notebooks
- Create a new working folder for storing all the files, and change into it.
- Start the Jupyter service by running ‘jupyter notebook --allow-root’.
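If the browser cannot reach the server, it helps to know whether port 8888 is reachable at all. Below is a small sketch you could run from your own machine. The function names are mine, and the IP address in the usage comment is a documentation placeholder, not a real server.

```python
import socket


def notebook_url(host, port=8888):
    """Build the HTTPS URL for the notebook server."""
    return f"https://{host}:{port}"


def port_is_open(host, port=8888, timeout=3.0):
    """True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Usage (replace the placeholder with your EC2 public IP):
#   host = "203.0.113.10"
#   if port_is_open(host):
#       print("Open this in the browser:", notebook_url(host))
#   else:
#       print("Port 8888 not reachable; check the security group and the service")
```

A failed connection usually points at either the security group rule for port 8888 or the Jupyter service not running.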
That’s it!!! The Jupyter service is up and running. Open the browser and enter the EC2 public IP address with port 8888 (https://<EC2 public IP>:8888) to open the console.
- To access AWS libraries, open another SSH session and install boto3 with ‘pip install boto3’.
- Note: You may get a ‘Your connection is not secure’ warning because the certificate is self-signed; add an exception and proceed.
- Enter the password which was created earlier.
- Create a new Python 3 notebook to start writing your code.
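As a first cell to try, here is a minimal sketch that lists S3 bucket names with boto3. It assumes boto3 is installed and that the instance has AWS credentials available, for example through an attached IAM role; neither the role nor any bucket is set up in this post. The helper name is mine.

```python
def list_bucket_names(s3_client):
    """Return the names of all S3 buckets visible to the given client."""
    response = s3_client.list_buckets()
    return [bucket["Name"] for bucket in response.get("Buckets", [])]


# In a notebook cell (assumes boto3 is installed and credentials are set up):
#
#   import boto3
#   print(list_bucket_names(boto3.client("s3")))
```

Passing the client in as an argument keeps the helper easy to try with a stand-in object before any real credentials are in place.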
Now you have your cloud environment ready to develop compute-intensive machine learning models.
Happy Learning !!!