DSCC 201/401
Tools and Infrastructure for Data Science
February 10, 2021
TAs and Blackboard
• Teaching Assistants
• Alex Crystal (acrystal@u.rochester.edu)
• Siyu Xue (sxue3@u.rochester.edu)
• Senqi Zhang (szhang71@u.rochester.edu)
• Quick Review of Blackboard
• HW#1 was due at 9 a.m. EST on Wednesday
2
Hardware Resources for Data Science
• Supercomputers
• Cluster Computing
• Virtualization and Cloud Computing
3
Cluster Computing
4
Linux Clusters – Major Components
• Computing
• Storage
• Network
• Software
5
Linux Cluster Hardware – Compute
• Compute nodes are typically dense servers that are responsible for processing in a Linux cluster
• Compute nodes are stacked and placed in standard 19-inch wide 42U high racks
• Each server often has 2 sockets with RAM and local disk
• Each socket has 1 CPU and that CPU has many cores
• At least 1 node in the cluster is a login node and 1 node is a service node
• Login node is where users log in to system and submit jobs
• Service node controls the system (not user accessible)
6
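A quick way to see the socket, core, and memory layout described above on a Linux compute node (a sketch using standard Linux tools):
# Show sockets, cores per socket, threads, and CPU model
lscpu | egrep 'Socket|Core|Thread|Model name'
# Show installed and available RAM
free -h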
Linux Cluster Hardware – Compute
7
Linux Cluster Hardware – Compute
e.g. Dell C6300: 2 sockets, RAM, hard drive
8
Linux Cluster Hardware – Compute
Diagram: 2U chassis (~3.5 in high) containing power supplies (PS) and compute nodes (CN) – e.g. Dell C6300: 4 compute nodes in 2U
9
Linux Cluster Hardware – Compute
Diagram: 2U chassis (power supplies and compute nodes) stacked in a 42U rack (~6.5 ft tall) with 19-inch-wide rails
10
Linux Cluster Hardware – Storage
• Individual hard drives on compute nodes – BUT typically not enough!
• How do we make sure all files are accessible by any one of the compute nodes?
• Clustered file system allows a file system to be mounted on several nodes
• Network attached storage (NAS) can be mounted on nodes using NFS (network file system) – similar to “network drive”
• Better performance and redundancy are achieved through parallel file systems, which are a special type of clustered file system
• Parallel file systems provide concurrent high-speed file access to applications executing on multiple nodes of a cluster
• Efficiency and redundancy are improved by allowing nodes to access storage at the block level (a lower level than the file level)
11
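A minimal sketch of how NAS storage is typically attached over NFS (the server name nas01 and the paths below are hypothetical):
# List mounted file systems with their types; NFS mounts show up as nfs or nfs4
df -hT
# Mount an NFS export by hand (hypothetical server and paths)
sudo mount -t nfs nas01:/export/home /mnt/home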
Parallel File Systems
• Lustre
• Developed by Cluster File Systems, Inc. (but now open source)
• Uses metadata servers, object storage servers, and clients for the file system
• Spectrum Scale (i.e. GPFS – General Parallel File System)
• Developed by IBM (proprietary)
• Parallel file system that provides access to block level storage on multiple nodes
• Blocks are distributed across multiple disk arrays with redundancy (declustered RAID – Redundant Array of Independent Disks)
• Metadata (i.e. information about the files and layout) are distributed across the multiple disk arrays
12
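If a Lustre or Spectrum Scale file system is mounted, it can be inspected from a client node (a sketch assuming the standard client tools are installed; the /scratch mount point is hypothetical):
# Lustre: show capacity of the metadata and object storage targets backing a mount
lfs df -h /scratch
# Spectrum Scale (GPFS): list the GPFS file systems known to the cluster
/usr/lpp/mmfs/bin/mmlsfs all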
Spectrum Scale (GPFS)
Diagram: NSD servers attached to multiple JBOD enclosures in a 42U rack
• JBOD (Just a Bunch of Disks) provides the actual storage – typically 60 disks
• NSD (Network Shared Disk) Servers share out the storage to the clients through a network connection
• GPFS server and client software
13
Linux Clusters – Networking
• Ethernet can be used but has high latency (e.g. 50-125 μs) due to the TCP/IP protocol
• Small packets of information take a long time to reach destination
• InfiniBand has been designed for low latency (and high bandwidth) – typically less than 5 μs
• FDR10 (10 Gb/s), FDR (14 Gb/s), EDR (25 Gb/s), and HDR (50 Gb/s) are commonly used today
• Links can be aggregated for extra bandwidth (e.g. 4X, 8X, etc.)
• BlueHive uses 4X aggregation of EDR (i.e. 100 Gb/s) and 4X aggregation of FDR10 (i.e. 40 Gb/s bandwidth)
• Copper cables can be used for InfiniBand lengths less than 10 meters (otherwise optical fiber cables are used)
• Network switches are used to link all components together
14
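On a node with an InfiniBand adapter, the link state and rate (e.g. 100 Gb/s for 4X EDR) can be checked with the standard diagnostics (a sketch assuming the infiniband-diags tools are installed):
# Show local InfiniBand adapters, their port state, and link rate
ibstat
# Brief per-port summary (state, physical state, rate)
ibstatus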
BlueHive
15
16
Linux Cluster Hardware
• Necessary hardware components: Compute Nodes, Storage, Networking
• Also need a login node to provide a place for users to log in and interact with the system
• Also need a service node to provide a place to run and manage the control and monitoring software for the system
17
Linux Clusters – Software
• Operating system
• Management and monitoring software
• Job scheduling and launching software
• User software
22
Operating System
• >99% are based on Linux
• Beneficial for multi-user management in a shared environment
• Great for security and permissions (file, executable, etc.)
• Support for open source codes
• Excellent community support
23
Management and Monitoring Software
• xCAT (Extreme Cluster Administration Toolkit)
• Provision OS and node images
• Remotely manage systems – reboot and distributed shell
• Ganglia
• Node use and monitoring system
• Shows CPU, RAM, and I/O usage of nodes
• Can be user accessible
• Zabbix
• Server monitoring software
• Shows node stability and resource usage (CPU, RAM, and I/O)
• Typically not user accessible
24
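As an illustration, xCAT's remote power control and distributed shell mentioned above are driven from the service node (a sketch; the node range node01-node04 is hypothetical):
# Query the power state of a range of nodes
rpower node01-node04 stat
# Reboot one node remotely
rpower node01 reset
# Run a command across many nodes with the distributed shell
xdsh node01-node04 "uptime"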
Job Scheduling and Managing Software
• Torque/Maui
• SLURM
25
Job Scheduling and Managing
• Torque/Maui
• Torque manages the resources
• Maui is the scheduler
Diagram: users (User 1, User 2, User 3) submit jobs from the login node to the PBS server (Torque) and Maui scheduler running on the service node, which launch run scripts on the compute nodes (Node 0 through Node N)
26
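A minimal Torque/Maui usage sketch (the resource request and the script name myjob.sh are illustrative):
# Submit a batch script requesting 1 node, 4 cores, and 1 hour of walltime
qsub -l nodes=1:ppn=4,walltime=01:00:00 myjob.sh
# Check the status of your queued and running jobs
qstat -u $USER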
Job Scheduling and Managing
• Slurm (Simple Linux Utility for Resource Management)
• Very similar to Torque/Maui
• Provides necessary software for starting, stopping, and monitoring compute jobs and resources
• Allocates and deallocates computing resources on nodes and partitions
• Manages queues of jobs waiting for resources
27
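A minimal Slurm usage sketch (the partition name standard, the script myjob.sh, and the job ID are hypothetical):
# Submit a batch job: 1 node, 4 CPU cores, 8 GB of RAM, 1 hour time limit
sbatch -p standard -N 1 -c 4 --mem=8G -t 1:00:00 myjob.sh
# Monitor your jobs, and cancel one by its job ID if needed
squeue -u $USER
scancel 123456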
Partitions
• Most multi-user clusters are divided into partitions or subunits that allow jobs with similar characteristics to run in a common environment
• BlueHive is divided into partitions
• Similar hardware attributes and limitations of running jobs
• User selects which partition to run the job in
28
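Slurm's sinfo and scontrol show the partitions available and their limits (the partition name standard is hypothetical):
# List partitions with their time limits, node counts, and node states
sinfo
# Show the detailed configuration of one partition
scontrol show partition standard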
Virtualization and Cloud Computing
29
Virtualization
• Virtualization is the process of creating a software-based representation of computing, storage, and network resources rather than a physical one
• Key advantage: Multiple virtual computing, storage, and network resources can be created with limited hardware
• Key disadvantage: Multiple virtual computing, storage, and network resources compete for underlying hardware resources
Diagram: multiple VMs sharing a single server's CPU, RAM, storage, and network
30
Paravirtualization
• Paravirtualization: virtual machine does not simulate hardware, but instead offers a special programming interface (or API) that can only be used by modifying the guest operating system
• Mostly historical
• Example: Xen
Diagram: modified guest VMs (RHEL, Ubuntu, Windows) running on a VM host (hypervisor) that controls the CPU, RAM, storage, and network
31
Full-Virtualization
• Full-virtualization: the virtual machine simulates enough hardware to allow an unmodified guest operating system (one designed for the same instruction set) to run in isolation
• Often uses hardware acceleration (extensions in CPU)
• Examples: Parallels for Mac OS X, VMWare
Diagram: unmodified guest VMs (RHEL, Ubuntu, Windows) running on a VM host that manages the CPU, RAM, storage, and network
32
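On Linux, the hardware acceleration mentioned above shows up as CPU flags that can be checked directly (a quick sketch):
# Count occurrences of Intel VT-x (vmx) or AMD-V (svm) flags;
# a nonzero result means the CPU advertises hardware virtualization extensions
egrep -c '(vmx|svm)' /proc/cpuinfo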
Containers (Docker)
• A kernel is the software that controls a computer's hardware components and manages data transfers between them; all input/output actions (system and user) are handled by the kernel
• Containers are like “operating system virtualization” – Use the same kernel but isolate libraries, tools, software, etc. into compartmentalized containers
Diagram: multiple containers sharing the host system's operating system kernel, CPU, RAM, storage, and network
33
Containers (Docker)
• Useful for consistent environments and application dependencies
• Docker images are instantiated to create a container (class vs. object)
• Can run on virtualized systems
Diagram: multiple containers sharing the host system's operating system kernel, CPU, RAM, storage, and network
34
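A minimal Docker sketch of the image-versus-container (class vs. object) idea (ubuntu:20.04 is a public Docker Hub image used here only as an example):
# Pull an image (the template) and instantiate it as a container (the instance)
docker pull ubuntu:20.04
docker run --rm -it ubuntu:20.04 bash
# List local images and running containers
docker images
docker ps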
Cloud
• Cloud: A series of systems offered up for computing, storage, and network resources
• Mostly applications and virtual machines (but can also be bare metal)
• Cloud can mean any machine that is not local to you and whose underlying components you do not administer
• Three models: SaaS, PaaS, and IaaS
• Largest providers: Amazon AWS (Amazon Web Services), Microsoft, and Google
35
Cloud Service Models – SaaS
• SaaS (Software as a Service) – NIST definition:
• Applications run on a cloud infrastructure and are available remotely to the end user, typically through a web browser
• The user does not manage or control the underlying cloud infrastructure including network, servers, operating systems, storage, or even individual application capabilities, with the possible exception of limited user-specific application configuration settings
• Examples: Gmail, Google Docs, Microsoft Office 365, Blackboard
36
Cloud Service Models – PaaS
• PaaS (Platform as a Service) – NIST definition:
• The capability to deploy onto the cloud infrastructure user-created or acquired applications created using programming languages, libraries, services, and tools supported by the provider
• The user does not manage or control the underlying cloud infrastructure including network, servers, operating systems, or storage, but has control over the deployed applications and possibly configuration settings for the application-hosting environment
• Examples: WordPress hosting sites, SalesForce, etc.
37
PaaS: Amazon Machine Learning
38
Cloud Service Models – IaaS
• IaaS (Infrastructure as a Service) – NIST definition:
• End user can provision processing, storage, networks, and other fundamental computing resources
• User can deploy and run arbitrary software, which can include operating systems and applications
• User does not manage or control the underlying cloud infrastructure but has control over operating systems, storage, and deployed applications (often through virtual machines)
• Examples: Amazon AWS, Microsoft Azure
39
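As a sketch of IaaS provisioning, assuming the AWS CLI is installed and configured (the AMI ID, instance type, and key name below are placeholders):
# Launch a virtual machine (EC2 instance) from a chosen OS image
aws ec2 run-instances --image-id ami-12345678 --instance-type t3.micro --key-name my-key
# List the instances in the account
aws ec2 describe-instances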
Using BlueHive
40
Logging on to BlueHive
• You must be connected to the UR_Connected wireless network or using the VPN client (See: http://tech.rochester.edu/remote-access-vpn-tutorials/ for help with VPN)
• Detailed instructions and links to learn how to connect are at:
http://info.circ.rochester.edu
• 2 Ways to Connect:
• Log in with FastX for a Graphical User Interface (GUI):
https://bluehive.circ.rochester.edu
• Connect through Terminal (Mac) or PuTTY (Windows) (No GUI)
41
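For the terminal route, the connection is a standard ssh login (replace NetID with your own University NetID):
# Connect to the BlueHive login node from Terminal on macOS or Linux
ssh NetID@bluehive.circ.rochester.edu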
Running a Program on BlueHive
• Running jobs interactively with FastX (GUI)
• Default FastX partition
• Custom resources
• Running jobs interactively with a terminal session (no GUI)
• Login node
• Compute node
42
BlueHive Resource Allocation
Diagram: users reach BlueHive through a web browser (FastX) or ssh (terminal session on a login node); a default FastX session runs on a FastX node (1 core, 2 GB RAM), while interactive requests go through the Slurm server to a compute node with the requested resources
FastX
• Graphical connection to https://bluehive.circ.rochester.edu
• Login with NetID username and password
• DUO authentication
• Default Session
• 1 CPU
• 2 GB RAM
• “Unlimited” time
• Interactive Session
• User selects resources
• Up to 12 hours
44
Terminal Session
• Connect to bluehive.circ.rochester.edu with a terminal application (e.g. Terminal on Mac OS X or PuTTY on Windows)
• Login with NetID username and password
• DUO authentication
• Log in to shared login node (Do not do calculations on this node)
• Interactive Session
• User selects resources
interactive -p interactive -t 1:00:00 -c 1 --mem-per-cpu=2GB
• Up to 12 hours
• Note: Interactive session with a terminal is also available from FastX
45
info.circ.rochester.edu
46