School of Computer Science Dr. Ying Zhou
COMP5349: Cloud Computing Sem. 1/2020
Week 3: Docker Tutorial
12.03.2020
Learning Objectives
• Familiarize yourself with Docker's key components, such as the docker daemon, registry, containers and images, by going through the official tutorial.
• Understand Docker networking and storage options using standard and customized Jupyter Notebook containers.
• Briefly explore the Linux kernel's namespace feature.
• Compare virtualization technology with container technology.
Please note that for security reasons, students are not allowed to run Docker on shared lab workstations. We practice Docker on an AWS EC2 instance, which is a case of running containers on a VM. You can use either Windows or Linux on the lab workstation; Linux is more convenient since most of the time we are working on the command line.
Lab Exercises
Question 1: Start an EC2 Instance and Connect to it
Start an EC2 instance using the AMI amzn-ami-2016.09.g-amazon-ecs-optimized-30G. You can find the AMI in the Community AMIs tab by searching for its name. Note that you have to choose the region N. Virginia. The AMI is available for AWS Educate starter and classroom accounts. It is a Linux-based AMI with Docker set up and ready to use: the docker daemon is started at boot time and the default ec2-user has been added to the docker group.
This AMI can run on a range of instance types. You may use a very small instance, but to get reasonable performance we suggest a t2.small or t2.medium instance.
In this lab, the instance is expected to receive requests on the following TCP ports in various exercise steps:
• port 4000 for a simple Python web server
• ports 8888 and 8899 for Jupyter notebooks
• port 6006 for TensorBoard
Make sure you include security rules to allow TCP connections from any node to those ports on the instance.
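If you prefer the command line over the console, the security rules can also be added with the AWS CLI. The sketch below only echoes the commands as a dry run; the security group id is a placeholder you must replace with your instance's real one, and `aws` must be installed and configured before removing the `echo`:

```shell
# Dry-run sketch: print the AWS CLI calls that would open the lab's TCP ports.
# SG_ID is a placeholder -- substitute your instance's actual security group id.
SG_ID="sg-0123456789abcdef0"
for port in 4000 8888 8899 6006; do
  echo aws ec2 authorize-security-group-ingress \
    --group-id "$SG_ID" --protocol tcp --port "$port" --cidr 0.0.0.0/0
done
```

Remove the `echo` to actually execute the calls; `0.0.0.0/0` allows connections from any node, matching the rule described above.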
In the launching step, remember to use the key pair you generated in week 2 lab.
After the instance is launched, use a terminal window to connect to it as ec2-user. There is no password for this user. You can verify that your current user belongs to the docker group with the groups command.
Question 2: Basic Docker Usage
Please follow the official docker tutorial to get familiar with basic docker usage.
In the last step, you may run the curl command to visit your web server from another SSH window. Alternatively, you can start a browser on the lab workstation and replace the host name part of the URL with your EC2 instance's public DNS address.
Test your understanding
The components and applications involved in this question are shown in Figure 1.
Figure 1: Lab workstation, EC2 instance and the Applications Running in them in Q2
• Describe the network traffic flow after a request is sent from your lab workstation’s browser window to the web server running inside the container.
• How can we run two web servers on the same host?
• What other configuration do we need to update to enable running two servers on this instance?
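As a hint for the first question, recall that only one process can listen on a given host port at a time, which is why two containers must map their internal ports to different host ports (for example -p 4000:80 for one and -p 4001:80 for the other). The Python sketch below demonstrates the underlying constraint:

```python
import socket

# Two servers cannot listen on the same host address/port: the second
# bind fails with EADDRINUSE. This is why a second container must map
# its internal port to a different port on the host.
s1 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s1.bind(("127.0.0.1", 0))          # let the OS pick a free port
port = s1.getsockname()[1]
s1.listen()

s2 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    s2.bind(("127.0.0.1", port))   # try to claim the same port again
    print("second bind succeeded")  # not expected
except OSError:
    print("second bind failed: port already in use")
finally:
    s1.close()
    s2.close()
```

Docker's -p host:container mapping sidesteps this by giving each container its own host-side port while the servers inside the containers can all use the same internal port.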

Question 3: Using Volumes in Docker
The basic Docker tutorial demonstrates the use case of deploying an application through Docker. We can also use Docker as a development environment. This and all following exercises use Docker to set up a development environment, highlighting various Docker features.
This exercise demonstrates the use of volumes through a Jupyter Notebook container. Project Jupyter maintains "a set of ready-to-run Docker images containing Jupyter applications and interactive computing tools" in its git repo. The description and user guide of the whole docker stack can be found in the user guide document. In this lab, we will use the tensorflow-notebook image from the core stack. The full name of the image is jupyter/tensorflow-notebook. An alternative way to find the exact name of a docker image is to perform a docker search command. For instance, docker search tensorflow would search Docker Hub for images whose name contains the string tensorflow.
a) Starting the Tensorflow Notebook Container
In one SSH window, run the following command to pull the tensorflow notebook docker image into the local registry:
docker pull jupyter/tensorflow-notebook
This will take a few minutes as the image size is 5.24GB. After the download finishes, you can inspect the image's size with the docker images command, which should show something similar to:
jupyter/tensorflow-notebook latest e34690d08026 2 days ago 5.24GB
To inspect each layer and the commands used to build those layers, use:
docker history jupyter/tensorflow-notebook
You will see a list of commands, each of which adds a layer to the image. Alternatively, you can view the actual Dockerfile in the container's GitHub repository.
The basic command to start this container is as follows:
docker run --rm -p 8888:8888 jupyter/tensorflow-notebook
The command starts a container with the Notebook server listening for HTTP connections on port 8888. Once the container has started successfully, you will see log messages similar to those you saw when starting a Jupyter notebook directly, e.g. in the week 2 lab.
You can access the server from a browser on the lab workstation; just remember to change the host name from localhost to the instance's public DNS. The container has a user account called "jovyan" and the notebook server runs from this user's home directory /home/jovyan. A sub-folder work has been created under the home directory. When you access the notebook server from a browser, you will see this folder, which is empty.

Create a new notebook inside this folder. You can write any simple program you like; below is an example that prints out some system information. You can give the notebook the name 'get sysinfo'. Type the following in the first code cell and run the cell to see the output:
import platform

print("Operating System:")
print(platform.uname())
print("Processors: ")
with open("/proc/cpuinfo", "r") as f:
    info = f.readlines()
    for info_line in info:
        print(info_line)
Save the notebook and go back to the home screen; you will see a new file get sysinfo.ipynb under work. This file is currently stored in the thin writable layer of this container. After you exit the container, the thin writable layer will be deleted, and the Jupyter notebook we created will be deleted with it.
Exit the container by pressing CTRL+C in the terminal where you started it. Use the docker run command to start the container again, and you will find the work directory is empty. Kill this container by pressing CTRL+C again.
Figure 2 shows the components involved in this exercise.
Figure 2: Lab workstation, EC2 instance and the Applications Running in them in Q3.a
b) Mount a Host Directory inside Container
There are various ways to instruct a container to use the host file system as its storage. All of these can be specified as options when starting the container. For instance, the following option mounts the present working directory on the host to /home/jovyan/work in the container file system:
--mount type=bind,source="$(pwd)",target=/home/jovyan/work
Below is the complete command:
docker run -it --rm -p 8888:8888 \
  --mount type=bind,source="$(pwd)",target=/home/jovyan/work \
  jupyter/tensorflow-notebook
As usual, the command starts a container with the tensorflow-notebook image.
To check whether the mount is successful, start another SSH window and run docker container ls to find your container's ID. The ID value is in the first column (see a sample output in Figure 3). You may occasionally see two entries; the other one is a cluster management service automatically started by Amazon. You can identify the one you just started by its name tensorflow-container.
Figure 3: Sample output of docker container ls command
The docker inspect command prints information about the container specified by its ID:
docker inspect --format '{{json .Mounts}}' <container-id>
Your output will be similar to the following, but on a single line:
[
  {
    "Type": "bind",
    "Source": "/home/ec2-user",
    "Destination": "/home/jovyan/work",
    "Mode": "",
    "RW": true,
    "Propagation": "rprivate"
  }
]
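Because the inspect output is JSON, it is easy to post-process with a script instead of reading the single-line dump by eye. The sketch below parses a sample of the output shown above (in practice you would feed it the real output of the docker inspect command):

```python
import json

# Sample single-line output of:
#   docker inspect --format '{{json .Mounts}}' <container-id>
# (values taken from the example above)
mounts_json = ('[{"Type":"bind","Source":"/home/ec2-user",'
               '"Destination":"/home/jovyan/work","Mode":"",'
               '"RW":true,"Propagation":"rprivate"}]')

mounts = json.loads(mounts_json)
for m in mounts:
    print(f'{m["Type"]}: {m["Source"]} -> {m["Destination"]} (rw={m["RW"]})')
# prints: bind: /home/ec2-user -> /home/jovyan/work (rw=True)
```

The "RW": true field confirms the bind mount is writable, which is what allows the notebook saved inside the container to appear in (and persist on) the host directory.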
Now repeat what you did in the previous exercise to create and save a very simple notebook under the work directory. You should be able to see the notebook in the host file system. After you have saved the notebook, exit and restart the container with the same volume attached; you will see the notebook is still there.
Figure 4 shows the components involved in this exercise.

Figure 4: Lab workstation, EC2 instance and the Applications Running in them in Q3.b
Question 4: Create Your Own Notebook Container
In this exercise, we will create a customized tensorflow container with a few official tutorials in it. We will use jupyter/tensorflow-notebook as the parent image, then clone the official tensorflow documentation repo to get the official tutorials.
The AMI we use for the instance does not have git installed. In an SSH window, run sudo yum install git to install it.
As you did in the week 2 lab, clone the official tensorflow documentation into a directory:
git clone https://github.com/tensorflow/docs.git tf-doc
The repository contains many documents in various folders. We are only interested in the ones under site/en/tutorials/keras.
Create a directory to store all files needed for your new image; let's call it kr-tutorials-src. Your directory structure should look like this:
Figure 5: ec2-user home directory after cloning
Copy all files in site/en/tutorials/keras from the tf-doc directory into the kr-tutorials-src directory using the following command (assuming your present working directory is ec2-user's home directory):
cp -r tf-doc/site/en/tutorials/keras/ ~/kr-tutorials-src/tutorials/
This creates a tutorials folder under kr-tutorials-src and copies all files in tf-doc/site/en/tutorials/keras there.
Building this image involves only two steps:

1. Grab a Jupyter notebook container
2. Copy the files to the working directory
These are translated into two commands in the Dockerfile:
FROM jupyter/tensorflow-notebook
COPY tutorials /home/$NB_USER/work
Create a text file named Dockerfile in the kr-tutorials-src directory and paste the above two commands into it. If you are not familiar with any command line editor, just run the following two echo commands (note the single quotes, which stop the host shell from expanding $NB_USER; the variable is only defined inside the image):
echo 'FROM jupyter/tensorflow-notebook' > Dockerfile
echo 'COPY tutorials/ /home/$NB_USER/work' >> Dockerfile
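The quoting matters here. $NB_USER is an ARG/ENV defined in the jupyter base image, not a variable on the EC2 host, so it must reach the Dockerfile unexpanded. The short sketch below (run in a throwaway temp directory) shows that single quotes preserve it literally; with double quotes the host shell would expand it to an empty string, producing the wrong path /home//work:

```shell
# Demonstrate why single quotes are needed: $NB_USER must survive
# into the Dockerfile literally, to be resolved later by docker build.
dir=$(mktemp -d)
cd "$dir"
echo 'FROM jupyter/tensorflow-notebook'    >  Dockerfile
echo 'COPY tutorials/ /home/$NB_USER/work' >> Dockerfile
cat Dockerfile   # the $NB_USER placeholder appears unexpanded
```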
Your kr-tutorials-src directory should now contain a text file Dockerfile and a subdirectory tutorials with a few notebook files in it (see Figure 6).
Figure 6: Docker image source directory
Change your present working directory to kr-tutorials-src and build the image with the following command (don't forget the trailing dot representing the current directory):
docker build -t kr_tutorail_notebook .
After the image is built, you can find the image ID using docker image ls. The docker history command will show that the image has one extra layer, created by the COPY command in the Dockerfile. The size of this layer is 139KB.
You can start a container with this newly created image using the following command:
docker run -it --rm -p 8899:8888 -p 6006:6006 kr_tutorail_notebook
Again, you can find the Jupyter notebook's URL in the output messages and access it from the browser on your local lab workstation. This time you will see many notebooks inside the work folder, and you should be able to run any of them. To prepare for the next question, we suggest that you open the notebook "text classification.ipynb". Find the cell under the heading "Train the model" and add the following statement at the top of the cell:

tb_callback = keras.callbacks.TensorBoard(log_dir='./logs',
    histogram_freq=0, write_graph=True, write_images=True)
Then add the following parameter to the model.fit function call: callbacks=[tb_callback]
The updated cell should look like Figure 7. Run all cells in this notebook. If you encounter a "tensorflow datasets not found" message when executing the cell with the import tensorflow_datasets as tfds statement, add the command !pip install tensorflow-datasets at the top of that cell.
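If you want the cell to keep working whether or not the package is present, a common notebook pattern is to check for the module before importing it. The sketch below is one way to do this (the !pip line only works inside a notebook, so here it is shown as a printed hint):

```python
import importlib.util

# Check for the optional tensorflow_datasets package before importing it,
# instead of letting the cell fail with ModuleNotFoundError.
if importlib.util.find_spec("tensorflow_datasets") is None:
    # In a notebook cell you would run:  !pip install tensorflow-datasets
    print("tensorflow_datasets missing: run 'pip install tensorflow-datasets'")
else:
    import tensorflow_datasets as tfds
    print("tensorflow_datasets available")
print("check complete")
```

Note the naming mismatch: the pip package is tensorflow-datasets (hyphen) while the importable module is tensorflow_datasets (underscore).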
The callback we just created will log various information during training, to be visualized later in TensorBoard.
Figure 7: Creating Log Data for Tensorboard
Figure 8 shows the components involved in this exercise. Note that port 6006 in the container is mapped to port 6006 on the host. At the moment, there is no process listening on that port inside the container, and no connection to that port from the client (local lab workstation) either.
Figure 8: Lab workstation, EC2 instance and the Applications Running in them in Q4
Question 5: Container and Host Machine (Homework)
Containers are isolated environments, and there is relatively strong isolation among containers running on the same host. However, a container and the processes running inside it are visible to the host machine and can be manipulated from it. In this exercise, we will use the same tensorflow container to illustrate the process "transparency" between container and host.
Start another terminal and SSH to your EC2 instance, so that you have two SSH connections to the instance. The first one is running the jupyter container; we will use the second one to inspect and control the container.
a) Process Management
In the second SSH window, find out the running container's ID with the docker container ls command. Then run the following command to inspect the processes running inside the container:
docker container exec <container-id> ps -ef | grep jupyter
The above command runs the Linux process listing command ps -ef | grep jupyter against the container specified by the container ID, listing all processes whose entry contains the string "jupyter". You may see output similar to Figure 9, which shows two processes, both started by user "jovyan". The one with process id 6 is the jupyter notebook server itself; the other, with process id 20, is the ipython kernel started to run the notebook "basic text classification.ipynb". You can see that the parent of process 20 is process 6.
Figure 9: Processes Seen from the Container
Now run the command ps -ef | grep jupyter on the host machine. You will see the same processes listed, but with a different user name and different process ids. Figure 10 shows an example output, in which both the jupyter notebook server and the ipython kernel are started by user "ec2-user", the user who started the container. They also have very different process ids: the notebook server has id 25648 and is the parent of process 19471, which represents the ipython kernel.
Figure 10: Processes Seen from the Host
Figure 11 shows the components involved in this exercise. Note that we used two SSH windows: one for starting the container, the other for inspecting and manipulating it.

Figure 11: Lab workstation, EC2 instance and the Applications Running in them in Q5.a
b) Start a Long Running Process Inside Container
The docker container exec command is very powerful and can be used to manipulate a running container from the host machine. In this exercise, we will start a tensorboard server inside the container to visualize various training information.
Run the following command in the second SSH window:
docker container exec <container-id> tensorboard --logdir=/home/jovyan/work/logs
This starts a tensorboard server inside the container. The server uses the data in /home/jovyan/work/logs to generate various visualizations of the model and the training process. The data were written to that directory in Question 4 by adding tb_callback to the training process. We mapped network port 6006 when starting the container, so you should be able to access tensorboard from your local lab machine's browser with a URL like http://<ec2-public-dns>:6006.
Figure 12 shows the components involved in this exercise. Note that we now have tensorboard listening on port 6006 inside the container, and there is a client connection from the local lab workstation to port 6006 of the EC2 instance.
Figure 12: Lab workstation, EC2 instance and the Applications Running in them in Q5.b

Question 6: Container vs VM
In the week 2 tutorial, we started an EC2 instance with a deep learning image. This instance contains a comprehensive deep learning development environment, making it very convenient for developers to build and train various deep learning models.
This week, we started a relatively simple EC2 instance without any deep learning packages, then ran a tensorflow container containing the full-stack tensorflow development environment.
The two options provide nearly identical capability for building and testing tensorflow models, but the development experience can differ slightly. You should have noticed certain differences in the two lab exercises. Now use your experience and your general understanding of VM and container technology to:
• identify the differences between the two options;
• identify a few scenarios in which developers would prefer the VM option (as in week 2);
• identify a few scenarios in which developers would prefer the container option (as in week 3). This includes using containers directly on a host machine, or on a virtual machine.
References
• Docker Getting Started. https://docs.docker.com/get-started/
• Manage data in Docker. https://docs.docker.com/storage/
• Using Volumes. https://docs.docker.com/storage/volumes/
• Jupyter Docker Stacks. https://jupyter-docker-stacks.readthedocs.io/en/latest/