MEC302 Embedded Computer Systems
Embedded Processor
Dr. Sanghyuk Lee
Email: Dept. Mechatronics and Robotics
Embedded Computer Systems
Embedded Processors and Parallelism
Types of Processors
• Microprocessors and Microcontrollers
• DSP Processors
• Graphics Processors
Parallelism
• Parallelism vs Concurrency
• Pipelining
• Instruction-Level Parallelism
• Multicore Architectures
• In general-purpose computing, the variety of instruction set architectures today is limited, with the Intel x86 architecture overwhelmingly dominant. There is no such dominance in embedded computing.
• On the contrary, the variety of processors can be daunting to a system designer. The goal here is to understand the options and to critically evaluate the properties of processors, with particular focus on the mechanisms that provide concurrency and control over timing, because these issues loom large in the design of cyber-physical systems.
• Embedded processors typically have a dedicated function. They control an automotive engine or measure ice thickness in the Arctic. They are not asked to perform arbitrary functions with user-defined software.
• Consequently, the processors can be more specialized. Making them more specialized can bring enormous benefits; they may consume far less energy, and consequently be usable with small batteries for long periods of time. Or they may include specialized hardware to perform operations that would be costly to perform on general-purpose hardware, such as image analysis.
• When evaluating processors, it is important to understand the difference between an instruction set architecture (ISA) and a processor realization or a chip. The latter is a piece of silicon sold by a semiconductor vendor. The former is a definition of the instructions that the processor can execute and certain structural constraints (such as word size) that realizations must share. x86 is an ISA.
• There are many realizations. An ISA is an abstraction shared by many realizations. A single ISA may appear in many different chips, often made by different manufacturers, and often having widely varying performance profiles.
• The advantage of sharing an ISA in a family of processors is that software tools, which are costly to develop, may be shared, and (sometimes) the same programs may run correctly on multiple realizations. This latter property, however, is rather treacherous, since an ISA does not normally include any constraints on timing. Hence, although a program may execute logically the same way on multiple chips, the system behavior may be radically different when the processor is embedded in a cyber-physical system.
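To make the timing point concrete, here is a minimal sketch of how the same busy-wait loop takes different amounts of wall-clock time on two realizations of one ISA. The function name, the cycles-per-iteration figure, and the clock rates are hypothetical values chosen for illustration; they do not describe any real chip.

```c
/* Wall-clock duration of a fixed-length busy-wait loop.
 * iterations           : loop count, identical on every realization of the ISA
 * cycles_per_iteration : hypothetical cost of one loop iteration on a given chip
 * clock_hz             : hypothetical clock frequency of that chip
 * The ISA fixes what the loop computes, but not these last two numbers. */
double busy_wait_seconds(unsigned long iterations,
                         unsigned long cycles_per_iteration,
                         double clock_hz)
{
    return (double)iterations * (double)cycles_per_iteration / clock_hz;
}
```

Under these assumed numbers, `busy_wait_seconds(1000000, 4, 8e6)` gives 0.5 s while `busy_wait_seconds(1000000, 4, 16e6)` gives 0.25 s: the program is logically identical, but a control loop that relied on the delay would behave differently.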
Introduction
• In general-purpose computing the variety of instruction set architectures is limited
• Intel x86 architecture is dominant
• There is no such dominance in embedded computing
• When deployed in a product embedded processors have a dedicated function
− Making them more specialized has many benefits
• Lower power consumption
• Include specialist hardware to perform operations too costly on a general purpose computer (image analysis)
• Small size
• Overall reduction in cost
Types of Processors
• As a consequence of the huge variety of embedded applications, a huge variety of processors is in use.
• They range from small, slow, inexpensive, low power devices to high performance, specialist devices.
• Microcontrollers
• DSP Processors
• Graphics Processors
Microcontrollers
• A microcontroller (μC) is a small computer on a single integrated circuit.
• It consists of a relatively simple CPU combined with peripheral devices (memories, I/O devices, timers, etc).
− More than half the CPUs sold in the world are microcontrollers.
• A microcontroller is a small, low-cost computer built for the purpose of dealing with specific tasks.
• Microcontrollers are mainly used in products that require a degree of control to be exerted by the user.
Microprocessor vs microcontroller
• A microprocessor is an IC that contains only the CPU,
• i.e. only the processing power. Examples include Intel's Pentium series (Pentium through Pentium 4), Core 2 Duo, i3, i5, etc.
• These microprocessors don't have RAM, ROM, or other peripherals on the chip.
• A system designer has to add them externally to make them functional.
• Applications of microprocessors include desktop PCs, laptops, notepads, etc.
• But this is not the case with microcontrollers.
• A microcontroller has a CPU together with a fixed amount of RAM, ROM, and other peripherals, all embedded on a single chip.
• It is sometimes called a mini computer or a computer on a single chip.
• Today different manufacturers produce microcontrollers with a wide range of features available in different versions.
What is a microprocessor system?
Evolution of microprocessor
Programming language in computers
Formal language designed to communicate with the computer
Two types: low-level and high-level languages
Low-level language
− machine oriented; requires extensive knowledge of computer hardware
− 1) machine language: directly understood by the computer, no translation needed (0s and 1s)
− 2) assembly language: a set of symbols and letters (mnemonics such as MOVE, ADD, SUB, END). An assembler is required to translate assembly language to machine language.
High-level language – the kind most people use
− uses English words and mathematical symbols (+, -, %, etc.) in its instructions
− Ex: C, C++, Fortran, Java, Python
Example of ATM Machine withdrawal
Levels of representation in computers
A bit more detail:
Input unit: accepts instructions and data from the outside world, for instance from a keyboard or mouse.
Output unit: supplies information and the results of computation to the outside world. Since results are produced in binary form, they must be converted to a human-readable form.
Memory: data and instructions that enter the computer through the input unit have to be stored inside the computer before the actual processing starts. The results produced are likewise stored in memory.
Central processing unit: the main unit. It controls all internal and external devices and performs arithmetic and logical operations.
Can they be differentiated on cost?
• Comparing microcontroller and microprocessor in terms of cost is not justified.
− Undoubtedly a microcontroller is far cheaper than a microprocessor.
− However, a microcontroller cannot be used in place of a microprocessor, and using a microprocessor in place of a microcontroller is not advised, as it makes the application quite costly.
− A microprocessor cannot be used stand-alone.
• It needs other peripherals such as RAM, ROM, buffers, and I/O ports, and hence a system designed around a microprocessor is quite costly.
The die from an Intel 8742, 8-bit microcontroller that includes a CPU running at 12 MHz, 128 bytes of RAM, 2048 bytes of EPROM, and I/O in the same chip.
DSP Processors
• Many embedded applications may be required to do a lot of signal processing
• Signals of physical systems are collected at a sample rate
− A motion control application may read position or location information from sensors at sample rates ranging from a few Hz to hundreds of Hz.
− Audio signals: 8 kHz (telephony) to 44.1 kHz (audio CDs)
− Ultrasonics (medical imaging) may sample sound signals at higher rates
− Video: 25 or 30 Hz frame rates for consumer devices (each sample is a frame made up of many pixels)
− Software-defined radio applications – 100 kHz to several GHz
DSP Characteristics
• Signal processing applications all share certain characteristics
− Deal with large amounts of data
− The data may represent a signal in time (wireless radio signal), space (images), or both (video and radar).
− Typically, perform sophisticated mathematical operations (filtering, machine learning, feature extraction)
• Processors that are designed to specifically support numerically intensive signal processing applications are called Digital Signal Processors (DSPs).
FIR filtering
• To gain some insight into the structure of a DSP we shall consider the structure of a typical signal processing algorithm.
• A canonical signal processing algorithm used in most applications is finite impulse response (FIR) filtering.
− In its simplest form, we can consider an input signal x that is an infinite sequence of numerical values (by infinite we mean very long!).
− For each input value x(n), the FIR filter must compute an output value y(n) according to the formula:
$$y(n) = \sum_{i=0}^{N-1} a_i \, x(n-i)$$
where N is the length of the FIR filter and the coefficients $a_i$ are called the tap values.
Example: Tapped delay line implementation
Structure of the tapped delay line implementation of the FIR filter. For each $n \in \mathbb{N}$, each component consumes one input value from each input path and produces one output value on each output path.
The boxes $z^{-1}$ are unit delays and the triangles multiply their inputs by a constant (0.25 in this case).
Suppose $N = 4$ and $a_0 = a_1 = a_2 = a_3 = 1/4$. Then for all $n \in \mathbb{N}$:
$$y(n) = \big(x(n) + x(n-1) + x(n-2) + x(n-3)\big)/4$$
Each output is the average of the four most recent inputs. Input values come in from the left and propagate down the delay line, which is tapped after each delay element.
The rate at which the input values x(n) are provided and must be processed is the sample rate. You must know the sample rate and N to determine the number of arithmetic operations to be computed per second.
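The FIR formula can be sketched directly in C. This is a minimal direct-form version, not an optimized DSP implementation. The function name `fir_filter` and the convention that samples before the start of the input are treated as zero are assumptions made for illustration.

```c
#include <stddef.h>

/* Direct-form FIR filter: y(n) = sum over i = 0 .. N-1 of a[i] * x(n - i).
 * Assumption: samples before the start of the input (n - i < 0) are zero,
 * so the first N-1 outputs are computed from fewer than N samples. */
void fir_filter(const double *a, size_t num_taps,
                const double *x, double *y, size_t num_samples)
{
    for (size_t n = 0; n < num_samples; n++) {
        double acc = 0.0;
        /* Stop at i == n so that x[n - i] never indexes before x[0]. */
        for (size_t i = 0; i < num_taps && i <= n; i++) {
            acc += a[i] * x[n - i];   /* one multiply and one add per tap */
        }
        y[n] = acc;
    }
}
```

With four taps all equal to 0.25 and the input 4, 8, 12, 16, 20, 24, the fourth output is (16 + 12 + 8 + 4)/4 = 10, the average of the four most recent inputs. The inner loop is the multiply-accumulate pattern that DSP hardware is designed to execute quickly.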
• Suppose an FIR filter is provided with samples at a rate of 1 MHz and that N = 32.
• At what rate must the outputs be computed?
− 1 MHz
− Each output requires N = 32 multiplications and N - 1 = 31 additions
• How many arithmetic operations are required per second to implement this application?
− (32 + 31) × 1,000,000 = 63 million arithmetic operations/sec
• To sustain this computation rate, the arithmetic hardware must be fast enough, and so must the mechanisms for moving data in and out of memory.
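The operation count generalizes to any sample rate and filter length. The helper below is a small sketch of that arithmetic; the function name is hypothetical, and it assumes at least one tap.

```c
#include <stdint.h>

/* Arithmetic operations per second for an N-tap FIR filter.
 * Each output needs N multiplications and N - 1 additions, and one
 * output is produced per input sample. Assumes num_taps >= 1. */
uint64_t fir_ops_per_second(uint64_t sample_rate_hz, uint64_t num_taps)
{
    uint64_t ops_per_output = num_taps + (num_taps - 1);
    return sample_rate_hz * ops_per_output;
}
```

For a 1 MHz sample rate and N = 32, `fir_ops_per_second(1000000, 32)` evaluates to 63,000,000.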
• Consider the two-dimensional FIR filter, in which each output pixel is a weighted sum over a (2N+1) × (2M+1) window of input pixels.
• N = 1 and M = 1 are the minimal values for useful filters.
• Each pixel of the output image will then need 9 multiplications and 8 additions.
• Consider a color image with 3 channels (RGB) and a resolution of 1080×1920. How many multiplications and additions are required per image?
− 1080 × 1920 × 3 × 9 = 55,987,200 multiplications
− 1080 × 1920 × 3 × 8 = 49,766,400 additions
• How many per second if there are 30 frames per second?
− 55,987,200 × 30 = 1,679,616,000 multiplications/sec
− 49,766,400 × 30 = 1,492,992,000 additions/sec
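The per-image counts can be checked with a short calculation. The sketch below assumes the filter window covers (2N+1) × (2M+1) pixels, which is consistent with 9 multiplications per pixel at N = M = 1, and that every output pixel of every channel is computed with a full window. The type and function names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    uint64_t multiplications;
    uint64_t additions;
} op_count;

/* Operations needed to filter one frame with a 2D FIR filter.
 * Assumed window size: (2n + 1) * (2m + 1) input pixels per output pixel.
 * Each output needs one multiplication per window pixel and one fewer
 * addition to sum the products. */
op_count fir2d_ops_per_frame(uint64_t height, uint64_t width,
                             uint64_t channels, uint64_t n, uint64_t m)
{
    uint64_t window  = (2 * n + 1) * (2 * m + 1);
    uint64_t outputs = height * width * channels;
    op_count count = { outputs * window, outputs * (window - 1) };
    return count;
}
```

For a 1080×1920 RGB frame with N = M = 1 this gives 55,987,200 multiplications and 49,766,400 additions per frame; multiplying by 30 frames per second gives 1,679,616,000 multiplications and 1,492,992,000 additions per second.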
Graphics Processors
• A graphics processing unit (GPU) is a specialized processor designed to perform the calculations required for graphics rendering.
• GPUs date back to the 1970s.
• Modern GPUs support 3D graphics, shading, and digital video.
• Vendors include Intel, NVIDIA, and AMD.
• Important in high-end gaming laptops!
• Some embedded applications (games) are a good match for GPUs, but GPUs have evolved towards more general programming machines. They are often used in instrumentation.
• Very power hungry.
Parallelism
Parallelism vs Concurrency
• Most processors today provide various forms of parallelism
− These mechanisms affect the timing of the execution of a program
− Embedded system designers have to understand them.
• Concurrent
− A computer program is concurrent if different parts of the program conceptually execute simultaneously.
• Parallel
− A program is parallel if different parts of the program physically execute simultaneously on distinct hardware.
Parallelism vs Concurrency
Concurrency is when certain tasks can start, run, and complete in overlapping time periods.
Parallelism is when tasks literally run at the same time.
Non-concurrent programs
• Specify a sequence of instructions to execute.
• Imperative language: a programming language that expresses a computation as a sequence of operations (C is an imperative language)
− to write concurrent programs in C, use a thread library
− a thread library uses facilities provided by the OS and/or the hardware
− Java is an imperative language that directly supports threads
• The correct execution of a program requires that the instructions are executed in the specified sequence.
− even so, it is often possible to execute instructions in parallel when they do not depend on each other
double x, xSquared, xCubed;
x = 3;
xSquared = x * x;
xCubed = xSquared * x;
In this example each statement must be completed in sequence.
We must know x to find xSquared, and we cannot find xCubed until we know xSquared.
double x, xSquared, xCubed;
x = 3;
xSquared = x * x;
xCubed = x * x * x;
Do you need to know xSquared to find xCubed?
What does this mean?
The last two statements are independent of each other and can be executed in parallel.
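As an illustration of writing this with a thread library in C, here is a minimal sketch using POSIX threads. The choice of pthreads is an assumption, since the slides do not name a library, and the struct and function names are hypothetical. The two computations depend only on x, so each can run in its own thread. Whether they physically run at the same time depends on the hardware and the OS scheduler.

```c
#include <pthread.h>
#include <stddef.h>

/* Input and output of one independent computation. */
typedef struct {
    double x;
    double result;
} power_task;

static void *square(void *arg)
{
    power_task *t = arg;
    t->result = t->x * t->x;
    return NULL;
}

static void *cube(void *arg)
{
    power_task *t = arg;
    t->result = t->x * t->x * t->x;   /* uses x only, not the square */
    return NULL;
}

/* Computes x*x and x*x*x in two threads.
 * Returns 0 on success, -1 if a thread could not be created. */
int compute_powers(double x, double *x_squared, double *x_cubed)
{
    power_task sq = { x, 0.0 };
    power_task cu = { x, 0.0 };
    pthread_t sq_thread, cu_thread;

    if (pthread_create(&sq_thread, NULL, square, &sq) != 0) {
        return -1;
    }
    if (pthread_create(&cu_thread, NULL, cube, &cu) != 0) {
        pthread_join(sq_thread, NULL);   /* do not leak the first thread */
        return -1;
    }
    /* Wait for both threads before reading their results. */
    pthread_join(sq_thread, NULL);
    pthread_join(cu_thread, NULL);

    *x_squared = sq.result;
    *x_cubed = cu.result;
    return 0;
}
```

Calling `compute_powers(3.0, &s, &c)` sets s to 9 and c to 27. For work this small the cost of creating threads far exceeds the benefit; the sketch only shows that independent statements permit parallel execution. On some toolchains the program must be built with the `-pthread` option.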
Concurrency in Embedded Systems
• Embedded programs interact with physical processes and many activities progress at the same time.
• An embedded program often needs to monitor and react to multiple concurrent sources of stimulus and simultaneously control multiple output devices.
• Embedded programs are almost always concurrent programs.
− Timeliness matters: actions in the physical world should be performed at the right time
• Both imperative (non-concurrent) and concurrent programs can be executed either sequentially or in parallel
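One way a concurrent program can be executed sequentially is a cooperative round-robin loop, where each conceptually simultaneous activity is written as a short step function and a single processor runs the steps in turn. The sketch below is a simplified illustration under that assumption; the type names, `run_round_robin`, and the counting task are hypothetical.

```c
#include <stddef.h>

/* One step of a task: does a small amount of work and returns. */
typedef void (*task_step)(void *state);

typedef struct {
    task_step step;
    void *state;
} task;

/* Runs one step of every task in turn, for the given number of rounds.
 * The tasks are conceptually concurrent but execute sequentially. */
void run_round_robin(task *tasks, size_t num_tasks, size_t rounds)
{
    for (size_t r = 0; r < rounds; r++) {
        for (size_t i = 0; i < num_tasks; i++) {
            tasks[i].step(tasks[i].state);
        }
    }
}

/* Hypothetical task body: counts how many times it has been run.
 * A real task might poll a sensor or update an actuator here. */
void count_step(void *state)
{
    int *counter = state;
    (*counter)++;
}
```

With two tasks that both use `count_step` and five rounds, each counter ends at 5. The timing consequence is that a step which takes too long delays every other task, which is one reason timeliness is hard to guarantee in this style.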
Parallelism in Hardware
• When parallelism is used to improve performance, the application does not (necessarily) demand that multiple activities execute simultaneously
− it demands that things be done very quickly
• Of course, many other applications will combine both forms of concurrency, arising from parallelism and from application requirements.
• Here we will focus on hardware approaches to deliver parallelism
− Pipelining
− Instruction-level parallelism
− Multicore architectures
• Later we will look at memory systems. These strongly influence how parallelism is handled.
The End of the Lecture