CMPSC 450 definitions
CMPSC 450

What is a ‘parallel computer’?
• A parallel computer consists of a number of tightly-coupled compute elements that cooperatively solve a problem.
• Examples of ‘tight coupling’: shared caches, shared main memory, shared file system, high-speed access to data, high-speed network connecting compute nodes.
• Solving a problem cooperatively implies manual or automated work partitioning, load balancing, and synchronization.
• Examples of ‘compute elements’: NVIDIA CUDA cores in an NVIDIA GeForce GTX 1080 (2560), x86 cores in an Intel Core i7-7700 processor (4), processors in the Sunway TaihuLight supercomputer (40,960), the Qualcomm Snapdragon processor in a Samsung Galaxy S7 Edge smartphone (1). There is no common definition of a compute element.
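The three ingredients of cooperative solving named above can be sketched in a few lines. This is only an illustration, not course material: the helper names `partition` and `parallel_sum` are hypothetical, and Python threads are used purely to show the structure (in CPython they do not speed up CPU-bound work).

```python
from concurrent.futures import ThreadPoolExecutor


def partition(n_items, n_workers):
    """Work partitioning with static load balancing: split range(n_items)
    into n_workers contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(n_items, n_workers)
    bounds = []
    start = 0
    for w in range(n_workers):
        size = base + (1 if w < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


def parallel_sum(data, n_workers=4):
    """Sum `data` by giving each worker one chunk, then combining partial sums."""
    chunks = partition(len(data), n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Collecting all results is the synchronization point: list() blocks
        # until every worker has finished its chunk.
        partials = list(pool.map(lambda b: sum(data[b[0]:b[1]]), chunks))
    return sum(partials)


if __name__ == "__main__":
    print(partition(10, 4))
    print(parallel_sum(list(range(1000)), n_workers=4))
```

Each worker gets a nearly equal share of the data (load balancing), and the final reduction only runs after all partial sums are available (synchronization).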

What is ‘high-performance computing’ (HPC)?
• Any computation that is not ‘low performance’?
• HPC is used synonymously with parallel computing and supercomputing. However, achieving high serial performance can also be considered an example of HPC.
• A performance model and performance metrics let us quantify efficiency.
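As one possible illustration of such metrics, the sketch below computes speedup and parallel efficiency. These two metrics are standard but are an assumption here, since the slide does not name specific ones; the function names and the timings are hypothetical.

```python
def speedup(t_serial, t_parallel):
    """Speedup: how many times faster the parallel run is than the serial run."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_elements):
    """Parallel efficiency: speedup divided by the number of compute elements.
    A value of 1.0 would mean every element is fully utilized."""
    return speedup(t_serial, t_parallel) / n_elements


if __name__ == "__main__":
    # Hypothetical timings: 100 s serially, 30 s on 4 compute elements.
    s = speedup(100.0, 30.0)
    e = efficiency(100.0, 30.0, 4)
    print(f"speedup = {s:.2f}, efficiency = {e:.2f}")
```

With these made-up timings the speedup is about 3.33 on 4 elements, so roughly 83% of the available compute capability is put to use.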

What is ‘supercomputing’?
• Anything to do with supercomputers: applications, algorithms, hardware design, software and middleware, performance analysis.
• What is a supercomputer? A system that appears in the Top 500 list of supercomputers.
• NVIDIA calls DGX-1 an ‘AI Supercomputer’. What does that mean? Marketing.
• Are the Penn State ACI clusters supercomputers? No.
• A reasonable ordering of computational capabilities:
phone < laptop < workstation < cluster << supercomputer

What is the Top 500 list of supercomputers?
• A ranking of supercomputers updated twice a year (by a small group of researchers).
• A benchmark called High Performance LINPACK (HPL) is executed (by owners of supercomputers interested in having their system on the list).
• LINPACK solves a large linear system, Ax = b, where A is a dense matrix. A performance rate Rmax (units: FLOPS) is used for the ranking.
• Rpeak is the theoretical peak performance.
• FLOPS: (double-precision) floating point operations per second.
• How is theoretical peak computed? See the Wikipedia page for FLOPS.

What is the difference between supercomputing and ‘distributed computing’?
• There is no tight coupling of compute elements in a distributed system.
• Compute elements may be heterogeneous and geographically dispersed.
• Inter-node communication latency in distributed systems is significantly higher (3-4 orders of magnitude) than in (centralized) clusters or supercomputers.
• Multiple jobs may be concurrently (simultaneously) running on a distributed system.
• Distributed computing has an entirely different set of design and research challenges.
• Volunteer computing? A type of distributed computing where volunteers donate compute cycles for a common cause (e.g., SETI@home).

What is ‘cloud computing’?
• Internet-based computing.
• On-demand access to compute resources.
• Service models: IaaS, PaaS, SaaS.
• Deployment models: public, private, hybrid.
• Closer to distributed computing than parallel computing.

What is the difference between concurrency and parallelism?
• Concurrency: simultaneous execution of computational tasks/units/components.
• Parallelism: using multiple compute elements efficiently to solve a single task.
• Concurrency is a more general concept than parallelism.
• Concurrency issues can arise in distributed systems as well.
• In this class, we will use concurrency and parallelism synonymously in the context of tightly-coupled compute systems.

What is ‘scientific computing’?
• Using computers to solve scientific problems. Examples: understanding the origins of the Universe, understanding the functioning of the human brain, developing clean energy sources, finding a cure for cancer, personalized medicine, etc.
• What are examples of non-scientific computing applications of HPC? Use on Wall Street (computational finance), use cases in manufacturing, oil exploration, intelligence and surveillance, nuclear security.
• Key observation: common computational building blocks/structural patterns/motifs recur in diverse application domains.

What are ‘accelerators’?
• Non-general-purpose compute elements.
• GPUs, FPGAs.
• Specialized and custom-built designs are typically more efficient (in terms of various performance metrics) than general-purpose hardware.
• HPC and supercomputing enthusiasts are early adopters of accelerators.
• Cool features of accelerators eventually trickle down to general-purpose hardware.

Why all this information about supercomputers?
• Because it is important to use these expensive and incredibly powerful systems effectively.
• Some challenges only arise at a very large scale (say, million-way concurrency vs 4-way concurrency). It is good to develop an appreciation of these challenges.
• It is fun and exciting to learn about the latest advances in high-end computing!

What are the main challenges in building an exascale (10^18 operations per second) system?
• Energy efficiency: 6 GFlops/Watt (today) to 50 GFlops/Watt (target for a future exascale system).
• Cost: “A million US dollars in annual operating costs per MegaWatt of system power.”
• Machine balance.
• Processor, memory, interconnect, I/O design.
• General purpose vs specialization.

What is a manycore processor?
• Awesome.
• Multicore processor: 2, 4, 8 cores.
• Manycore processor: # of cores >> multicore.
• GPUs can be considered manycore platforms
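Core count is one factor in the theoretical peak performance (Rpeak) mentioned for the Top 500 list. A rough sketch of the usual estimate, cores × clock rate × floating point operations per cycle per core, is shown below. The function name `rpeak_gflops` and both processor configurations are hypothetical and do not describe any real product.

```python
def rpeak_gflops(n_cores, clock_ghz, flops_per_cycle_per_core):
    """Theoretical peak in GFLOPS for one processor:
    cores x clock rate (GHz) x floating point operations per cycle per core."""
    return n_cores * clock_ghz * flops_per_cycle_per_core


if __name__ == "__main__":
    # Hypothetical multicore part: few cores, high clock rate.
    multicore = rpeak_gflops(n_cores=4, clock_ghz=3.0, flops_per_cycle_per_core=16)
    # Hypothetical manycore part: many cores, lower clock rate, wider vector units.
    manycore = rpeak_gflops(n_cores=64, clock_ghz=1.5, flops_per_cycle_per_core=32)
    print(f"multicore Rpeak: {multicore} GFLOPS")
    print(f"manycore  Rpeak: {manycore} GFLOPS")
```

The estimate is an upper bound only. Measured rates such as Rmax from HPL are lower, because real codes rarely keep every floating point unit busy on every cycle.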

What is a Xeon Phi processor?
• Brand name for a new line of microprocessors from Intel
• Designed to compete with GPUs
• Now in its fourth generation: Knights Mill platform
• Xeon Phi processors are used in several top supercomputers
Source: https://en.wikipedia.org/wiki/Xeon_Phi#Knights_Mil

What’s Next?
• https://www.nextplatform.com/2018/05/24/a-peek-inside-that-intel-xeon-fpga-hybrid-chip/