CMPSC 450 definitions
CMPSC 450
What is a ‘parallel computer’?
• A parallel computer consists of a number of tightly-coupled compute elements that cooperatively solve a problem.
• Examples of ‘tight coupling’: shared caches, shared main memory, shared file system, high-speed access to data, high-speed network connecting compute nodes.
• Cooperatively solving implies manual or automated work partitioning, load balancing, synchronization.
• Examples of ‘compute elements’: NVIDIA CUDA cores in NVIDIA GeForce GTX 1080 (2560), x86 cores in Intel Core i7-7700 processor (4), processors in Sunway TaihuLight supercomputer (40,960), Qualcomm Snapdragon processor in a Samsung Galaxy S7 Edge smartphone (1). There is no common definition for a compute element.
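The three ingredients of cooperative solving named above (work partitioning, load balancing, synchronization) can be illustrated with a small sketch. This is not from the course material: the helper names `partition` and `parallel_sum` are hypothetical, and threads stand in for compute elements only to show the structure. In CPython, threads generally do not speed up CPU-bound work such as this sum.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(n_items, n_workers):
    """Work partitioning with simple load balancing: split range(n_items)
    into n_workers contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(n_items, n_workers)
    chunks = []
    start = 0
    for w in range(n_workers):
        size = base + (1 if w < extra else 0)  # first `extra` workers get one more item
        chunks.append((start, start + size))
        start += size
    return chunks


def parallel_sum(values, n_workers=4):
    """Each worker sums its own chunk; the partial sums are combined at the end."""
    chunks = partition(len(values), n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Synchronization: collecting the results of pool.map waits for every worker.
        partials = list(pool.map(lambda c: sum(values[c[0]:c[1]]), chunks))
    return sum(partials)


if __name__ == "__main__":
    print(partition(10, 4))
    print(parallel_sum(list(range(1000))))
```

Here the partitioning is manual; a runtime or compiler that chooses the chunks would be the automated case.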
What is ‘high-performance computing’ (HPC)?
• Any computation that is not ‘low performance’?
• HPC is used synonymously with parallel computing and supercomputing. However, achieving high serial performance can also be considered an example of HPC.
• A performance model and performance metrics let us quantify efficiency.
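As one illustration of such metrics, the sketch below uses the common textbook definitions of speedup (serial time divided by parallel time) and parallel efficiency (speedup divided by the number of compute elements). The slide does not name specific metrics, so these definitions and the timings in the example are assumptions chosen for illustration.

```python
def speedup(t_serial, t_parallel):
    """Speedup: how many times faster the parallel run is than the serial run."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_elements):
    """Parallel efficiency: speedup per compute element (1.0 is ideal)."""
    return speedup(t_serial, t_parallel) / n_elements


if __name__ == "__main__":
    # Hypothetical timings: 100 s serially, 30 s on 4 compute elements.
    print(round(speedup(100.0, 30.0), 2))
    print(round(efficiency(100.0, 30.0, 4), 2))
```

With these made-up timings the speedup is about 3.33 and the efficiency about 0.83, i.e. roughly 17% of the available capacity is lost to overheads such as communication or load imbalance.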
What is ‘supercomputing’?
• Anything to do with supercomputers: applications, algorithms, hardware design, software and middleware, performance analysis.
• What is a supercomputer? A system that appears in the Top 500 list of supercomputers.
• NVIDIA calls DGX-1 an ‘AI Supercomputer’. What does that mean? Marketing.
• Are the Penn State ACI clusters supercomputers? No.
• A reasonable ordering of computational capabilities:
phone < laptop < workstation < cluster << supercomputer
What is the Top 500 list of supercomputers?
• A ranking of supercomputers updated twice a year (by a small group of researchers).
• A benchmark called High Performance LINPACK (HPL) is executed (by owners of supercomputers interested in having their system on the list).
• LINPACK solves a large linear system, Ax = b, where A is a dense matrix. A performance rate Rmax (units: FLOPS) is used for the ranking.
• Rpeak is the theoretical peak performance.
• FLOPS: (double-precision) floating-point operations per second.
• How is theoretical peak computed? See Wikipedia page for FLOPS.
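A minimal sketch of the usual way theoretical peak is estimated: multiply the number of nodes, cores per node, clock rate, and floating-point operations each core can complete per cycle. The function names and the machine parameters below are hypothetical and do not describe any real system; the ratio Rmax/Rpeak is included as a simple measure of how much of the peak the HPL run achieved.

```python
def theoretical_peak_flops(n_nodes, cores_per_node, clock_hz, flops_per_cycle):
    """Rpeak estimate in FLOPS: every core issues flops_per_cycle
    double-precision operations on every clock cycle."""
    return n_nodes * cores_per_node * clock_hz * flops_per_cycle


def hpl_efficiency(rmax, rpeak):
    """Fraction of theoretical peak achieved by the benchmark run."""
    return rmax / rpeak


if __name__ == "__main__":
    # Hypothetical cluster: 100 nodes, 16 cores per node, 2.5 GHz,
    # 16 double-precision flops per core per cycle.
    rpeak = theoretical_peak_flops(100, 16, 2.5e9, 16)
    print(rpeak / 1e12, "TFLOPS")
    # Hypothetical measured Rmax of 48 TFLOPS.
    print(hpl_efficiency(48e12, rpeak))
```

For this made-up configuration Rpeak is 64 TFLOPS, and a measured Rmax of 48 TFLOPS would correspond to 75% of peak.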
What is the difference between supercomputing and 'distributed computing'?
• There is no tight coupling of compute elements in a distributed system.
• Compute elements may be heterogeneous and geographically dispersed.
• Inter-node communication latency in distributed systems is significantly higher (3-4 orders of magnitude) than in (centralized) clusters or supercomputers.
• Multiple jobs may be concurrently (simultaneously) running on a distributed system.
• Distributed computing has an entirely different set of design and research challenges.
• Volunteer computing? A type of distributed computing where volunteers donate compute cycles for a common cause (e.g., SETI@home).
What is 'cloud computing'?
• Internet-based computing.
• On-demand access to compute resources.
• Service models: IaaS, PaaS, SaaS.
• Deployment models: public, private, hybrid.
• Closer to distributed computing than parallel computing.
What is the difference between concurrency and parallelism?
• Concurrency: simultaneous execution of computational tasks/units/components.
• Parallelism: using multiple compute elements efficiently to solve a single task.
• Concurrency is a more general concept than parallelism.
• Concurrency issues can arise in distributed systems as well.
• In this class, we will synonymously use concurrency and parallelism in the context of tightly-coupled compute systems.
What is ‘scientific computing’?
• Using computers to solve scientific problems. Examples: understanding the origins of the Universe, understanding functioning of the human brain, clean energy sources, finding the cure to cancer, personalized medicine, etc.
• Examples of non-scientific computing applications of HPC: computational finance (Wall Street), manufacturing, oil exploration, intelligence and surveillance, nuclear security.
• Key Observation: Common computational building blocks/structural patterns/motifs in diverse application domains.
What are ‘accelerators’?
• Non-general purpose compute elements
• GPUs, FPGAs
• Specialized and custom-built designs are typically more efficient (in terms of various performance metrics) than general-purpose hardware.
• HPC and Supercomputing enthusiasts are early adopters of accelerators.
• Cool features of accelerators eventually trickle down to general-purpose hardware.
Why all this information about supercomputers?
• Because it is important to use these expensive and incredibly powerful systems effectively.
• Some challenges only arise at a very large scale (say, million-way concurrency vs 4-way concurrency). It is good to develop an appreciation of these challenges.
• Fun and exciting to learn about the latest advances in high-end computing!
What are the main challenges in building an exascale (10^18 operations per second) system?
• Energy efficiency: 6 GFlops/Watt (today) to 50 GFlops/Watt (target for future exascale system)
• Cost: “A million US dollars in annual operating costs per MegaWatt of system power.”
• Machine balance
• Processor, Memory, Interconnect, I/O design
• General purpose vs Specialization
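The energy-efficiency and cost figures above can be combined into a back-of-the-envelope estimate. The sketch below is a worked arithmetic example, not part of the original slides. It assumes the 10^18 operations per second are floating-point operations, so that GFlops/Watt applies directly, and it uses the quoted one million US dollars per megawatt per year as the cost rate. The function names are hypothetical.

```python
def system_power_megawatts(ops_per_second, gflops_per_watt):
    """Power draw in MW for a system sustaining ops_per_second
    at the given energy efficiency (assumes operations are flops)."""
    watts = ops_per_second / (gflops_per_watt * 1e9)
    return watts / 1e6


def annual_operating_cost_usd(power_mw, usd_per_mw_year=1e6):
    """Annual operating cost, using the quoted rate of $1M per MW per year."""
    return power_mw * usd_per_mw_year


if __name__ == "__main__":
    exaflop = 1e18
    for efficiency in (6, 50):  # GFlops/Watt: "today" vs. exascale target
        mw = system_power_megawatts(exaflop, efficiency)
        cost = annual_operating_cost_usd(mw)
        print(f"{efficiency} GFlops/Watt: {mw:.1f} MW, ${cost / 1e6:.1f}M per year")
```

Under these assumptions, an exascale system at 6 GFlops/Watt would draw roughly 167 MW (about $167M per year), while reaching 50 GFlops/Watt would bring that down to 20 MW (about $20M per year).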
What is a manycore processor?
• Awesome.
• Multicore processor: 2, 4, 8 cores
• Manycore processor: far more cores than a multicore processor
• GPUs can be considered manycore platforms
What is a Xeon Phi processor?
• Brand name for a new line of microprocessors from Intel
• Designed to compete with GPUs
• Now in fourth generation: Knights Mill platform
• Xeon Phi processors are used in several top supercomputers
Source: https://en.wikipedia.org/wiki/Xeon_Phi#Knights_Mil
What’s Next?
• https://www.nextplatform.com/2018/05/24/a-peek-inside-that-intel-xeon-fpga-hybrid-chip/