School of Computer Science Dr. Ying Zhou
COMP5349: Cloud Computing 2.Sem./2020
Week 2: AWS EC2 Tutorial
Objectives
In this tutorial, we will practice the basic steps of requesting and using an AWS EC2 instance. The practice aims to highlight:
• the difference between a virtual machine (EC2 instance) and a machine image (AMI)
• various networking options, including internal/external DNS and firewalling through security group rules
• the benefit of using a separate storage mechanism (EBS)
Requesting/Setting Up AWS Account
In this section, we set up an AWS account and then apply for AWS Education credit. If you already have an AWS account, you can apply for the AWS Education credit directly. If you have already done that as well, you can start working on Question 1.
First, go to https://aws.amazon.com/ and follow the instructions to create an AWS account. You need to provide your email address, mobile phone number, and credit card information (debit cards also work in most cases). We recommend using your university email address. Note that AWS might “hold” 1 USD for a short period in order to verify the credit card information.
WARNING: It is your responsibility to shut down all AWS resources once you have finished using them. Otherwise, if your spending exceeds the Education credit limit, AWS will charge the excess to your credit card.
Second, go to https://aws.amazon.com/education/awseducate/ and join AWS Education as a “student”. AWS will ask for your AWS account number, which can be found under AWS Console – My Account. After everything has been set up, you should receive a promotion code by email. Apply the promotion code under AWS Console – My Account – Credit. If you have not received the promotion code yet, you can still proceed with this week's exercises: AWS only charges you at the end of each month, so there are almost 3 weeks to sort it out.
05.03.2020
You may also have received an email inviting you to join AWS Classroom. AWS Classroom is a separate account for you to use, and it contains 50 dollars of credit. Note that a large number of AWS functions are not available in AWS Classroom, e.g. the Deep Learning AMI used in Question 2. So, in this lab, we will use a normal AWS account.
Question 1: Basic Steps of Launching Linux EC2 Instance
Follow the AWS tutorial to learn the basic steps of requesting and connecting to an EC2 instance. If you have used AWS before, you may skip this step. If you take the tutorial, remember to terminate the instance afterwards. https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EC2_GetStarted.html
Question 2: Requesting an Instance with a Deep Learning Environment
In this section, we launch another instance with a pre-built deep learning environment. We will use the AWS Deep Learning AMI (Ubuntu) to launch the instance. It can be found under Amazon Marketplace on the AMI selection page by searching for ‘Deep Learning AMI (Ubuntu)’ (see Figure 1). After selecting the AMI, you will see a pop-up screen showing useful information about it: the instance types this AMI can run on and their respective prices, basic information about the AMI, and a few links to tutorials and usage guides. In this section, we will use the cheapest instance type, t2.small.
Figure 1: AWS Deep Learning AMI
Click Continue to go to Step 2: Choose an Instance Type and choose t2.small. We do not need any further configuration at the moment. Clicking Review and Launch will bring you to Step 7: Review Instance Launch. Click Launch to launch your instance. In the launch pop-up window, select the key pair you created in Question 1 and click Launch Instances.
Question 3: Connecting to the Instance and Configuring Jupyter Notebook
Once your instance has been launched, clicking View Instances will bring you to the EC2 instance window. Select the instance you just created and click Connect to view the instructions for connecting to your instance (Figure 2). You can copy the SSH command and paste it into your shell window to make the connection.
Figure 2: Connection instruction
This AMI contains a list of popular systems and packages for developing and running deep learning applications. We are going to use Jupyter notebook to execute some sample code on this instance.
By default, the Jupyter notebook server listens only on the local interface, on port 8888. It can be configured to listen on a public interface, either at the command line or by updating the configuration file ~/.jupyter/jupyter_notebook_config.py.
Use a text editor to open this file. Uncomment the line c.NotebookApp.ip = 'localhost' and change it to c.NotebookApp.ip = '0.0.0.0'. Save the change. Note that securing a Jupyter notebook server running on a public interface needs many more configuration changes, but in this lab we will simply run it in unsecured mode. The same setting can also be given directly on the command line as jupyter notebook --ip=0.0.0.0
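The manual edit described above can also be scripted. The following is a minimal sketch, assuming the configuration file already exists and contains a (possibly commented-out) c.NotebookApp.ip line; the function name set_notebook_ip is hypothetical and not part of Jupyter.

```shell
# Hypothetical helper: set c.NotebookApp.ip in a Jupyter config file.
# $1: path to the config file   $2: IP address the server should listen on
# The line is rewritten whether or not it is currently commented out.
# sed keeps a backup of the original file next to it with a .bak suffix.
set_notebook_ip() {
  sed -i.bak -E \
    "s/^#?[[:space:]]*c\.NotebookApp\.ip[[:space:]]*=.*/c.NotebookApp.ip = '$2'/" \
    "$1"
}

# Usage (path assumed to be the default config location):
#   set_notebook_ip ~/.jupyter/jupyter_notebook_config.py 0.0.0.0
```

All other lines of the file are left untouched, so the change is easy to revert from the .bak copy.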
Question 4: Configuring Security Group Rules
We need to update the firewall settings to enable external access to the Jupyter notebook server. The default security group only allows incoming SSH connections on port 22. All other incoming connections are blocked.
We need to open port 8888 on the instance so that a remote client can talk to the Jupyter notebook server. This can be done by adding a rule to the instance's security group. In the AWS management console, go to the EC2 instance window and select the instance you just created. Click the security group associated with this instance. It should be called something like Deep Learning AMI (Ubuntu) Version 21.2 …. You can find it in the instance row (you need to scroll to the second-last column) or in the Description tab at the bottom. Once you are on the security group page, click Edit to add a new inbound rule for port 8888, as shown in Figure 3, and save it.
Note: The security group can also be configured before launching the instance. Remember that we skipped a few configuration steps by jumping from Step 2: Choose an Instance Type directly to Step 7: Review Instance Launch. The steps in between allow you to configure various properties, including the security group.
Figure 3: Security Group Setting
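The same inbound rule can be added from a terminal with the AWS CLI instead of the console. This is a sketch only, assuming the AWS CLI is installed and configured with credentials; the helper names, the security group ID, and the client address are made up for illustration. Since the notebook server runs unsecured in this lab, the sketch opens the port to a single client address rather than to everyone.

```shell
# Hypothetical helper: turn one IPv4 address into a /32 CIDR block (a single host).
to_host_cidr() {
  if printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'; then
    printf '%s/32\n' "$1"
  else
    echo "not an IPv4 address: $1" >&2
    return 1
  fi
}

# Hypothetical helper: allow inbound TCP traffic on port 8888 from one client.
# $1: security group ID
# $2: public IPv4 address of the machine running your browser
open_jupyter_port() {
  cidr="$(to_host_cidr "$2")" || return 1
  aws ec2 authorize-security-group-ingress \
    --group-id "$1" --protocol tcp --port 8888 --cidr "$cidr"
}

# Usage (both values are placeholders):
#   open_jupyter_port sg-0123456789abcdef0 203.0.113.25
```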
Question 5: Starting and Connecting to Jupyter Notebook Server
On your instance’s SSH window, start Jupyter notebook from the directory ~/tutorials with the following command:
jupyter notebook --no-browser
The notebook may take a while to start. When it finishes, you will see something like:
Figure 4: Jupyter Starting Logs
Copy the given URL and paste it into your browser. This URL uses the EC2 internal DNS name, with a format like ip-xxx-xxx-xxx-xxx, as the host address. You need to replace the internal name with the external DNS name, which you can find in the instance window. The external DNS name has a format like ec2-xxx-xxx-xxx-xxx.compute-1.amazonaws.com
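The host replacement is mechanical, so it can be done with a one-line substitution. The sketch below assumes the URL has the usual scheme://host:port/... shape; the function name to_public_url and the example values are hypothetical.

```shell
# Hypothetical helper: swap the host in a Jupyter URL for the public DNS name.
# $1: URL printed by Jupyter   $2: external DNS name of the instance
# Everything after the host (port, path, token) is kept unchanged.
to_public_url() {
  printf '%s\n' "$1" | sed -E "s#^(https?://)[^:/]+#\1$2#"
}

# Usage (URL and DNS name are made-up examples):
#   to_public_url "http://ip-172-31-20-15:8888/?token=abc123" \
#                 "ec2-54-12-34-56.compute-1.amazonaws.com"
```

The port and the token in the query string must be kept as they are, since the server uses the token to authenticate the browser.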
The tutorials directory contains deep learning tutorials written in different frameworks. You may try to run the simple ones. Avoid running more complicated examples, which might take a long time to execute.
Question 6: Homework: Attaching an EBS volume
When starting the instance, we used the default storage setting with a boot EBS volume of 75 GB. This volume will be deleted when you terminate the instance. A typical way to persist data and/or code developed on EC2 is to put them on a separate EBS volume that can be unmounted and detached from the instance. EBS volumes have their own life cycle, independent of any instance: they can be detached from one instance and reattached to another. An instance can specify multiple EBS volumes at request time. It is also possible to attach an EBS volume to a running instance. Attaching a new EBS volume to a running instance involves the following steps:
1. Create a volume with the desired capacity from the AWS management console.
2. Attach it to an instance.
3. Inside the instance, format the volume with an appropriate file system format.
4. Mount it to a directory in the instance’s file system.
In this section, we will go through the above steps.
a) Creating an EBS volume
From the AWS EC2 dashboard, click Volumes under the EBS heading (Figure 5).
Figure 5: EBS Volumes in EC2 Dashboard
Click the Create Volume button to create a new EBS volume. The volume creation page (Figure 6) allows you to configure various properties of the new volume. In this lab, we use default values for most properties. However, there is one property for which we cannot rely on the default value: the availability zone. An EBS volume can only be seen by and attached to instances in the same availability zone. You need to make sure the newly created EBS volume is in the same availability zone as your instance. The availability zone of an instance can be found in the EC2 instances section. You may also change the default volume size to a different number.
Figure 6: EBS Creation Screen
After updating the necessary information, click Create Volume to create the volume. This should not take long, and you will be brought back to the main page. It is good practice to give your volume a meaningful name for easy reference later. We will call this newly created volume “Projects”, as shown in Figure 7. Notice also that the state of this volume is available, meaning it can be attached to an instance. A volume already attached to an instance has the state in-use.
Figure 7: EBS Volume Status
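Because the availability zone must match, it can help to look it up rather than type it in. The following sketch does this with the AWS CLI, assuming the CLI is installed and configured; the function names, the instance ID, and the choice of the gp2 volume type are assumptions made for illustration.

```shell
# Hypothetical helper: print the availability zone of an instance.
# $1: instance ID
instance_zone() {
  aws ec2 describe-instances --instance-ids "$1" \
    --query 'Reservations[0].Instances[0].Placement.AvailabilityZone' \
    --output text
}

# Hypothetical helper: create a gp2 volume in the same zone as an instance.
# $1: instance ID   $2: volume size in GiB
create_volume_in_instance_zone() {
  zone="$(instance_zone "$1")" || return 1
  aws ec2 create-volume --availability-zone "$zone" --size "$2" --volume-type gp2
}

# Usage (the instance ID is a placeholder):
#   create_volume_in_instance_zone i-0123456789abcdef0 10
```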
b) Attaching a Volume to an Instance
Select the new EBS volume and click Attach Volume from the Actions drop-down menu (Figure 8a). In the following window, select the instance ID or name (if you have given your instance a name) to specify the target instance for this volume (Figure 8b).
After attaching the volume, you will be able to see it from the EC2 instance description tab as a block device.
c) Format and Mount the EBS Volume
After attaching the volume, we need to mount it into the file system. A new volume must be formatted before it can be mounted. This needs to be done inside the instance (e.g. in the SSH login window).
We specified the name “/dev/sdf” while attaching the volume, but that might not be the name the OS gives to the device. We need to check the actual device name inside the instance using the lsblk command. Figure 9 shows sample output from an Ubuntu instance, where the device has been renamed “/dev/xvdf”.
(a) Attach Volume Menu Item (b) Identify Instance as EBS Volume Target
Figure 8: Attach EBS Volume to an Instance
Figure 9: Command lsblk Sample Output
The EBS volume needs to be formatted before we can use it. The formatting command is: sudo mkfs.ext4 /dev/xvdf. It formats the device as ext4, a standard Linux file system format.
d) Mount the Volume to File System
The following commands create a directory called “Projects” under the current user’s home directory and mount device /dev/xvdf to that directory.
sudo mkdir ~/Projects
sudo mount /dev/xvdf ~/Projects/
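Neither command prints anything if successful. If in doubt, change the current working directory to ~/Projects and check its size with the command df -lh . (note the trailing dot). The command should show approximately 10G of available space. The root directory of a freshly formatted volume is owned by root, so give your login user ownership before writing to it, for example with sudo chown $USER ~/Projects.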
e) Using the New Volume
Once mounted, you can treat the directory in the same way as any other directory in the file system.
To test that it works, let’s clone the official TensorFlow documentation repository into the directory:
git clone https://github.com/tensorflow/docs.git tf-tutorials
This creates a directory “tf-tutorials” under “~/Projects”.
Start Jupyter notebook from this directory. The official tutorial is located at: site/en/tutorials/kera
You can run both the image classification and text classification notebooks. Remember to set the kernel to conda_tensorflow_p36 before running either notebook. It is relatively fast to train both models on this small instance. Save the notebooks you have run and shut down the notebook server by pressing Ctrl+C in the SSH window.
Question 7: Homework: Detaching an EBS volume
An instance can be terminated or stopped. Terminating an instance deletes its root volume, along with any other attached volume that is marked to be deleted on termination. Stopping an instance, by contrast, does not delete its volumes: all volumes are kept and shown as in-use by the stopped instance. If we want to keep only some volumes, we can detach them from the instance before terminating it. A volume must be unmounted from the file system before it is detached from the instance. The command to unmount it is: sudo umount ~/Projects.
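Unmounting fails with a "target is busy" error if any process, including your own shell, still has its working directory inside the mount, so change to another directory first. The sketch below wraps the unmount in a check; the function name unmount_if_mounted is hypothetical, and it assumes the mountpoint utility from util-linux is available, as it is on Ubuntu.

```shell
# Hypothetical helper: unmount a directory only if something is mounted there.
# $1: directory that may be a mount point
unmount_if_mounted() {
  if mountpoint -q "$1"; then
    sudo umount "$1" && echo "unmounted $1"
  else
    echo "$1 is not a mount point; nothing to unmount"
  fi
}

# Usage (run from outside the directory, e.g. after "cd ~"):
#   unmount_if_mounted ~/Projects
```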
To detach the volume, go back to the EC2 dashboard and open the Volumes page. This shows all volumes in the currently selected region (Figure 10). Notice that the volume is still shown as “in-use”. Click Actions and select Detach Volume to detach it from the instance. The status should change to “available”.
Figure 10: All EBS volumes
Stop the instance by clicking the Actions button at the top of the page and choosing the Stop menu item. A stopped instance can be restarted, and its EBS volumes are kept.
Question 8: Homework: Re-attach the EBS volume to another instance
Now try to request a bigger instance, e.g. t3.medium, with the same AWS Deep Learning AMI (Ubuntu) image. The new instance must be in the same availability zone as the volume. Attach the “Projects” volume to this instance and mount it under a directory in the file system. You should see that the git repository containing the official TensorFlow tutorials is still there and that the notebooks have kept the output you produced on the previous instance. You can run the notebooks again to compare the training time on the two instances.
Remember to stop or terminate your instance once you have finished working on it.
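When the volume is attached to the second instance it already holds an ext4 file system, and running mkfs on it again would erase the repository and notebooks. A guarded version of the formatting step might look like the sketch below. The function name format_if_empty is hypothetical, and the device name is an assumption: always confirm it with lsblk, because on newer instance types such as t3 an EBS volume typically appears under an NVMe name like /dev/nvme1n1 rather than /dev/xvdf.

```shell
# Hypothetical helper: format a device as ext4 only if it holds no file system.
# $1: block device, as reported by lsblk
format_if_empty() {
  dev="$1"
  # blkid prints the file system type, or nothing (with a non-zero exit
  # status) when the device has no recognisable file system.
  fstype="$(sudo blkid -o value -s TYPE "$dev" || true)"
  if [ -n "$fstype" ]; then
    echo "$dev already contains $fstype; skipping mkfs"
  else
    sudo mkfs.ext4 "$dev"
  fi
}

# Usage (device name is a placeholder, check lsblk first):
#   format_if_empty /dev/xvdf
```

With this guard, the same sequence of commands can be used on both the first and the second instance.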
Question 9: Stop/Terminate the instance(s)
Before leaving the lab, always remember to stop or terminate all instance(s) you have started in the lab! If you want to continue the work at home, you can stop your instance and restart it later. If you have finished everything, you can terminate the instance. Stopping an instance without terminating it still incurs storage costs for its EBS volumes.
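A quick way to check that nothing has been left running is to list the running instances from a terminal. This is a sketch assuming the AWS CLI is installed and configured; the function names are hypothetical, and the listing only covers the region the CLI is configured for.

```shell
# Hypothetical helper: print the IDs of all running instances in the current region.
list_running_instances() {
  aws ec2 describe-instances \
    --filters "Name=instance-state-name,Values=running" \
    --query 'Reservations[].Instances[].InstanceId' \
    --output text
}

# Hypothetical helper: stop every instance whose ID is passed as an argument.
stop_instances() {
  [ "$#" -gt 0 ] || { echo "no instances to stop"; return 0; }
  aws ec2 stop-instances --instance-ids "$@"
}

# Usage (the instance ID is a placeholder):
#   list_running_instances
#   stop_instances i-0123456789abcdef0
```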