程序代写代做代考 kernel compiler Haskell database Java concurrency C x86 assembly AI computer architecture GPU cache cuda flex graph clock chain file system game distributed system Systems, Networks & Concurrency 2019

Systems, Networks & Concurrency 2019
Architectures9 Uwe R. Zimmer – The Australian National University

[Bacon98]
J. Bacon
Concurrent Systems
1998 (2nd Edition) Addison Wesley Longman Ltd, ISBN 0-201-17767-6
[Stallings2001]
Stallings, William
Operating Systems
Prentice Hall, 2001
[Intel2010]
Intel® 64 and IA-32 Architectures Optimization Reference Manual http://www.intel.com/products/processor/manuals/
Architectures
References
© 2019 Uwe R. Zimmer, The Australian National University page 642 of 757 (chapter 9: “Architectures” up to page 745)

Hardware architectures:
☞ From simple logic to multi-core CPUs
☞ Concurrency on different levels Software architectures:
Architectures
In this chapter
☞ Languages of Concurrency ☞ Operating systems and libraries
© 2019 Uwe R. Zimmer, The Australian National University page 643 of 757 (chapter 9: “Architectures” up to page 745)

Abstraction Layer Application level
Form of concurrency
(user interface, specific functionality…)
Distributed systems, servers, web services, “multitasking” (popular understanding)
Language level
Process libraries, tasks/threads (language), syn- chronisation, message passing, intrinsic, …
Operating system
OS processes/threads, signals, events, multitasking, SMP, virtual parallel machines,…
(data types, tasks, classes, API, …)
(HAL, processes, virtual memory)
CPU / instruction level
Logically sequential: pipelines, out-of-order, etc. logically concurrent: multicores, interrupts, etc.
Device / register level
Parallel adders, SIMD, multiple execution units, caches, prefetch, branch prediction, etc.
Logic gates
Inherently massively parallel,
synchronised by clock; or: asynchronous logic
Digital circuitry
Multiple clocks, peripheral hardware, memory, …
Analog circuitry
Continuous time and inherently concurrent
© 2019 Uwe R. Zimmer, The Australian National University
page 644 of 757 (chapter 9: “Architectures” up to page 745)
(assembly instructions)
(arithmetic units, registers,…)
(‘and’, ‘or’, ‘not’, flip-flop, etc.)
(gates, buses, clocks, etc.)
(transistors, capacitors, …)
Architectures
Layers of abstraction

Strandbeest Theo Jansen 1990
Antikythera Mechanism Greek 150-100 BC- Credit: Wikipedia
Difference Engine Charles Babbage 1822
First transistor
John Bardeen and Walter Brattain 1947
Controllable Switches & Ratios as transistors, relays, vacuum tubes, valves, etc.
Architectures
Logic – the basic building blocks
© 2019 Uwe R. Zimmer, The Australian National University page 645 of 757 (chapter 9: “Architectures” up to page 745)

Constructing logic gates – for instance NAND in CMOS:
AB&Q 00&1
01&1 10&1 11&0
PMOS
A B
Q NMOS
© 2019 Uwe R. Zimmer, The Australian National University
page 646 of 757 (chapter 9: “Architectures” up to page 745)
A B
NAND Q
Architectures
Logic – the basic building blocks for digital computers

AB&Q 00&1
01&1 PMOS 10&1
ANOTQA NAND
Q
11&0
A
Q NMOS
A
B OR Q
NAND Q
B
© 2019 Uwe R. Zimmer, The Australian National University
page 647 of 757 (chapter 9: “Architectures” up to page 745)
A B
NAND
Q
… and subsequently all other logic gates:
Architectures
Logic – the basic building blocks for digital computers
Constructing logic gates – for instance NAND in CMOS:
AND NAND ABQAB
NAND
Q
A XOR Q
B
NAND
A

A NAND
B NAND
B
NAND
NAND
NAND Q

Half adder: A
Full adder:
Ripple carry adder:
Si
A0 B 0
A1 B1
A2 B2
XOR
XOR XOR AND AND
XOR
XOR
AND
AND
AND
XOR
S
XOR
Architectures
Logic – the basic building blocks
BC CANDAND
Ci
AND
i-1
OR
OR
C
S0 S1 S2
© 2019 Uwe R. Zimmer, The Australian National University page 648 of 757 (chapter 9: “Architectures” up to page 745)
Ai Bi
XOR
OR

Basic Flip-Flops
S RQ
NAND Q
S NAND
NAND Q
© 2019 Uwe R. Zimmer, The Australian National University
page 649 of 757 (chapter 9: “Architectures” up to page 745)
S
Q
NAND
NAND Q
KJS Q KJ R
NAND
NAND NAND
NAND
Q
C NOT
Architectures
Logic – the basic building blocks
RDQ CC
NAND
NAND NAND
NAND
Q
S
R
Q
NAND R
Q
D
NAND
NAND

Counting register:
Q0 Q1 Q2 Q3 Q4 Q5 Q6 Q7 S0 S1 S2 S3 S4 S5 S6 S7
© 2019 Uwe R. Zimmer, The Australian National University
page 650 of 757 (chapter 9: “Architectures” up to page 745)
Architectures
Logic – the basic building blocks
SQ≡SJSQDQ≡D JSQTSQ≡TJSQ CC
RQKRQQKRQRQKRQ R
JK- and D- Flip-Flops as universal Flip-Flops
T Q≡T XOR D Q J Q≡K
R
AND C AND
OR D Q QQKQCQ
C R
11111111
TSTSTSTSTSTSTSTS RRRRRRRR
J
S

Memory
IP
• Execution Unit / Arithmetic-Logic-Unit (ALU) A collection of transformational logic.
Code management
A simple CPU
Registers SP
Flags
• Memory
Decoder Sequencer
• Decoder/Sequencer
Can be a machine in itself which breaks CPU instructions into concurrent micro code.
Data management
• Code/Data management Fetching, Caching, Storing
© 2019 Uwe R. Zimmer, The Australian National University
page 651 of 757 (chapter 9: “Architectures” up to page 745)
ALU
• Registers
Instruction pointer, stack pointer,
general purpose and specialized registers
Architectures
Processor Architectures
• Flags
Indicating the states of the latest calculations.

Memory
Int.
Decoder Sequencer
• One or multiple lines wired directly into the sequencer
IP
☞ Required for:
Pre-emptive scheduling, Timer driven actions, Transient hardware interactions, …
Code management
Interrupts
Registers SP
Flags
☞ Usually preceded by an external logic (“interrupt controller”) which accumu- lates and encodes all external requests.
Data management
• CPU stops normal sequencer flow.
• Lookup of interrupt handler’s address
• Current IP and state pushed onto stack.
• IP set to interrupt handler.
© 2019 Uwe R. Zimmer, The Australian National University
page 652 of 757 (chapter 9: “Architectures” up to page 745)
ALU
On interrupt (if unmasked):
Architectures
Processor Architectures

We successfully interrupted a sequence of operations …
Ba©hi2a0H1o9nUdwaeRaRi.lZBirmidmger,(CThreaAtiuvsetrCaloiamnmNoatniosnAatltrUibnuivteiorsnit-yShareAlike 3.0, Photography by MrX at English Wikipedia) page 653 of 757 (chapter 9: “Architectures” up to page 745)

Program
Stack
Code
PC
………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………
© 2019 Uwe R. Zimmer, The Australian National University
page 654 of 757 (chapter 9: “Architectures” up to page 745)
SP
FP
Return address Context
Base
Local variables
Parameters
Local variables
Return address Context
Parameters Global variables
Architectures
Interrupt processing
Interrupt handler

Program
Stack
PC

Code
………
………………
………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
SP
© 2019 Uwe R. Zimmer, The Australian National University
page 655 of 757 (chapter 9: “Architectures” up to page 745)
FP
Return address Context
Base
Local variables
Parameters
Local variables
Return address Context
Parameters Global variables
Architectures
Interrupt processing
Interrupt handler

Program
Stack
PC
Push registers
Declare local variables
Code
………
………………
………
SP
Local
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
Registers
© 2019 Uwe R. Zimmer, The Australian National University
page 656 of 757 (chapter 9: “Architectures” up to page 745)
Base
variables
Local variables
Return address Context
Parameters
Local variables
Return address Context
Parameters Global variables
Interrupt handler
Architectures
Interrupt processing

Program
Push registers
Declare local variables
Run handler code
Code
PC
.. do some I/O ..
.. or run some time
critical code ..
………
………………
………
SP
Local
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
Registers
Stack
© 2019 Uwe R. Zimmer, The Australian National University
page 657 of 757 (chapter 9: “Architectures” up to page 745)
Base
variables
Local variables
Return address Context
Parameters
Local variables
Return address Context
Parameters Global variables
Interrupt handler
Architectures
Interrupt processing

Program
Push registers
Declare local variables
Run handler code
Code
.. do some I/O ..
.. or run some time
critical code ..
Remove local variables
………
………………
………
PC Registers
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
SP
Stack
© 2019 Uwe R. Zimmer, The Australian National University
page 658 of 757 (chapter 9: “Architectures” up to page 745)
Base
Local variables
Return address Context
Parameters
Local variables
Return address Context
Parameters Global variables
Interrupt handler
Architectures
Interrupt processing

Program
Push registers
Declare local variables
Run handler code
Code
.. do some I/O ..
.. or run some time
critical code ..
Remove local variables
………
………………
………
PC
Pop registers
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
SP
Stack
© 2019 Uwe R. Zimmer, The Australian National University
page 659 of 757 (chapter 9: “Architectures” up to page 745)
FP
Return address Context
Base
Local variables
Parameters
Local variables
Return address Context
Parameters Global variables
Interrupt handler
Architectures
Interrupt processing

Program
Push registers
Declare local variables
Run handler code
Code
.. do some I/O ..
.. or run some time
critical code ..
Remove local variables
………
………………
………
PC
Pop registers
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
SP
Stack
© 2019 Uwe R. Zimmer, The Australian National University
page 660 of 757 (chapter 9: “Architectures” up to page 745)
FP
Return address Context
Base
Bahia Honda Rail Bridge (Creative Commons Attribution- ShareAlike 3.0, Photography by MrX at English Wikipedia)
Local variables
Parameters
Local variables
Return address Context
Parameters Global variables
Interrupt handler
Architectures
Interrupt processing

We successfully interrupted a sequence of operations …
… and now the trick to get to the other side.
Ba©hi2a0H1o9nUdwaeRaRi.lZBirmidmger,(CThreaAtiuvsetrCaloiamnmNoatniosnAatltrUibnuivteiorsnit-yShareAlike 3.0, Photography by MrX at English Wikipedia) page 661 of 757 (chapter 9: “Architectures” up to page 745)

Program
Stack
Code
PC
………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………
© 2019 Uwe R. Zimmer, The Australian National University
page 662 of 757 (chapter 9: “Architectures” up to page 745)
SP
FP
Return address Context
Base
Local variables
Parameters
Local variables
Return address Context
Parameters Global variables
Architectures
Interrupt processing
Interrupt handler

Program
Stack
PC

Code
………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………
SP
PC
© 2019 Uwe R. Zimmer, The Australian National University
page 663 of 757 (chapter 9: “Architectures” up to page 745)
FP
Return address Context
Base
Flags
Local variables
Parameters
Local variables
Return address Context
Parameters Global variables
Architectures
Interrupt processing
Interrupt handler

The CPU hardware (!) did that, before anything was changed
Program
Stack
PC

Code
………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………
SP
PC
© 2019 Uwe R. Zimmer, The Australian National University
page 664 of 757 (chapter 9: “Architectures” up to page 745)
FP
Return address Context
Base
Flags
Local variables
Parameters
Local variables
Return address Context
Parameters Global variables
Architectures
Interrupt processing
Interrupt handler

Program
Stack
PC
Push registers
Declare local variables
Code
………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………
SP
Local
© 2019 Uwe R. Zimmer, The Australian National University
page 665 of 757 (chapter 9: “Architectures” up to page 745)
Base
variables
Registers
Flags PC
Local variables
Return address Context
Parameters
Local variables
Return address Context
Parameters Global variables
Interrupt handler
Architectures
Interrupt processing

Program
Push registers
Declare local variables
Run handler code
Code
PC
.. do some I/O ..
.. or run some time
critical code ..
………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………
SP
Local
Stack
© 2019 Uwe R. Zimmer, The Australian National University
page 666 of 757 (chapter 9: “Architectures” up to page 745)
Base
variables
Registers
Flags PC
Local variables
Return address Context
Parameters
Local variables
Return address Context
Parameters Global variables
Interrupt handler
Architectures
Interrupt processing

Program
Push registers
Declare local variables
Run handler code
Code
.. do some I/O ..
.. or run some time
critical code ..
Remove local variables
………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………
PC
Stack
© 2019 Uwe R. Zimmer, The Australian National University
page 667 of 757 (chapter 9: “Architectures” up to page 745)
SP
Registers
Base
Flags PC
Local variables
Return address Context
Parameters
Local variables
Return address Context
Parameters Global variables
Interrupt handler
Architectures
Interrupt processing

Program
Push registers
Declare local variables
Run handler code
Code
.. do some I/O ..
.. or run some time
critical code ..
Remove local variables
………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………
PC
Pop registers
Stack
© 2019 Uwe R. Zimmer, The Australian National University
page 668 of 757 (chapter 9: “Architectures” up to page 745)
SP
PC
FP
Return address Context
Base
Flags
Local variables
Parameters
Local variables
Return address Context
Parameters Global variables
Interrupt handler
Architectures
Interrupt processing

Program
Push registers
Declare local variables
Run handler code
Code
.. do some I/O ..
.. or run some time
critical code ..
Remove local variables
PC
………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………
Pop registers
Return from interrupt
Stack
© 2019 Uwe R. Zimmer, The Australian National University
page 669 of 757 (chapter 9: “Architectures” up to page 745)
SP
FP
Return address Context
Base
Local variables
Parameters
Local variables
Return address Context
Parameters Global variables
Interrupt handler
Architectures
Interrupt processing

Program
Stack
Code
PC
………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………
© 2019 Uwe R. Zimmer, The Australian National University
page 670 of 757 (chapter 9: “Architectures” up to page 745)
SP
FP
Return address Context
Base
Local variables
Parameters
Local variables
Return address Context
Parameters Global variables
Architectures
Interrupt processing
Interrupt handler

LR is loaded with a special value
Program
Stack
PC

Code
………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………
SP
Scratch
registers
© 2019 Uwe R. Zimmer, The Australian National University
page 671 of 757 (chapter 9: “Architectures” up to page 745)
FP
Return address Context
Base
Flags PC
Local variables
Parameters
Local variables
Return address Context
Parameters Global variables
Architectures
Interrupt processing
Interrupt handler

Program
PC
Clear interrupt flag
(Adjust priorities)
(Re-enable interrupt)
Code
………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………
SP
Scratch
registers
Stack
© 2019 Uwe R. Zimmer, The Australian National University
page 672 of 757 (chapter 9: “Architectures” up to page 745)
FP
Return address Context
Base
Flags PC
Local variables
Parameters
Local variables
Return address Context
Parameters Global variables
Interrupt handler
Architectures
Interrupt processing

Program
Clear interrupt flag
(Adjust priorities)
(Re-enable interrupt)
Push other registers
Code
SP
Local
variables
PC Declare local variables
………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………
Registers
Stack
© 2019 Uwe R. Zimmer, The Australian National University
page 673 of 757 (chapter 9: “Architectures” up to page 745)
Base
Scratch
registers
Flags PC
Local variables
Return address Context
Parameters
Local variables
Return address Context
Parameters Global variables
Interrupt handler
Architectures
Interrupt processing

Program
Clear interrupt flag
(Adjust priorities)
(Re-enable interrupt)
Push other registers
Declare local variables
Code
SP
Local
variables
………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………
Registers
PC Run handler code
.. do some I/O ..
Stack
© 2019 Uwe R. Zimmer, The Australian National University
page 674 of 757 (chapter 9: “Architectures” up to page 745)
Base
Scratch
registers
.. or run some time
critical code ..
Flags PC
Local variables
Return address Context
Parameters
Local variables
Return address Context
Parameters Global variables
Interrupt handler
Architectures
Interrupt processing

Program
Clear interrupt flag
(Adjust priorities)
(Re-enable interrupt)
Push other registers
Declare local variables
Run handler code
Code
………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………
SP
Scratch
registers
.. do some I/O ..
.. or run some time
critical code ..
Remove local variables
Stack
© 2019 Uwe R. Zimmer, The Australian National University
page 675 of 757 (chapter 9: “Architectures” up to page 745)
Base
Flags PC
PC
Pop other registers
Local variables
Return address Context
Parameters
Local variables
Return address Context
Parameters Global variables
Interrupt handler
Architectures
Interrupt processing

Program
Clear interrupt flag
(Adjust priorities)
(Re-enable interrupt)
Push other registers
Declare local variables
Run handler code
Code
………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………
SP
Scratch
registers
.. do some I/O ..
.. or run some time
critical code ..
Remove local variables
Stack
© 2019 Uwe R. Zimmer, The Australian National University
page 676 of 757 (chapter 9: “Architectures” up to page 745)
Base
Flags PC
PC
Pop other registers
Return (“bx lr”)
Local variables
Return address Context
Parameters
Local variables
Return address Context
Parameters Global variables
Interrupt handler
Architectures
Interrupt processing

Program
Clear interrupt flag
(Adjust priorities)
(Re-enable interrupt)
Push other registers
Declare local variables
Run handler code
Code
PC
……… ……… ……… ……… ……………… ……… ……… ……… ……………… ……… ……… ……… ……………… ……… ……… ……… ……………… ……… ……… ……… ……………… ……… ……… ……… ……………… ……… ……… ……… ……………… ……… ……… ……… ……………… ……… ……… ……… ……………… ……… ……… ……… ……………… ……… ……… ……… ……………… ……… ……… ……… ………
.. do some I/O ..
.. or run some time
critical code ..
Remove local variables
Stack
© 2019 Uwe R. Zimmer, The Australian National University
page 677 of 757 (chapter 9: “Architectures” up to page 745)
SP
Pop other registers
Return (“bx lr”)
FP
Return address Context
Base
Local variables
Parameters
Local variables
Return address Context
Parameters Global variables
Interrupt handler
Architectures
Interrupt processing

☞ Interrupt handler code can be interrupted as well.
☞ Are you allowing to interrupt an interrupt handler with an
Architectures
Interrupt handler
Things to consider
interrupt on the same priority level (e.g. the same interrupt)? ☞ Can you overrun a stack with interrupt handlers?
© 2019 Uwe R. Zimmer, The Australian National University page 678 of 757 (chapter 9: “Architectures” up to page 745)

Busy!
Do Not Disturb!
☞ Interrupt handler code can be interrupted as well.
☞ Are you allowing to interrupt an interrupt handler with an
Architectures
Interrupt handler
Things to consider
interrupt on the same priority level (e.g. the same interrupt)? ☞ Can you overrun a stack with interrupt handlers?
☞ Can we have one of those?
© 2019 Uwe R. Zimmer, The Australian National University page 679 of 757 (chapter 9: “Architectures” up to page 745)

If we can execute interrupt handler code “concurrently” to our “main” program:
Architectures
Multiple programs
☞ Can we then also have multiple “main” programs?
© 2019 Uwe R. Zimmer, The Australian National University page 680 of 757 (chapter 9: “Architectures” up to page 745)

Process 1
Process 2
Code
Code
Stack
PC
Registers
PCB PID
… … … Stack
PCB
PID SP … … …
………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………
………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………
Context- switch- variables
© 2019 Uwe R. Zimmer, The Australian National University
page 681 of 757 (chapter 9: “Architectures” up to page 745)
SP FP
Flags PC
Base
Base
Local variables
Local variables
Return address Context
Return address Context
Parameters
Parameters
Local variables
Local variables
Return address Context
Return address Context
Parameters Global variables
Parameters Global variables
Dispatcher
Architectures
Context switch

Process 1
PC

Process 2
Code
Code
Stack
PCB PID
… … … Stack
PCB
PID SP … … …
………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………
………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………
Context- switch- variables
© 2019 Uwe R. Zimmer, The Australian National University
page 682 of 757 (chapter 9: “Architectures” up to page 745)
SP
Registers
FP
Return address Context
Return address Context
Base
Base
Flags PC
Flags PC
Local variables
Local variables
Parameters
Parameters
Local variables
Local variables
Return address Context
Return address Context
Parameters Global variables
Parameters Global variables
Dispatcher
Architectures
Context switch

Process 1
Push registers
Declare local variables
Process 2 PCB
Code
Code
Stack
PCB PID
PC
………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………
Registers
Registers
… … … Stack
PID SP … … …
© 2019 Uwe R. Zimmer, The Australian National University
page 683 of 757 (chapter 9: “Architectures” up to page 745)
SP
Context- switch- variables
………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………
Context- switch- variables
Base
Base
Flags PC
Flags PC
Local variables
Local variables
Return address Context
Return address Context
Parameters
Parameters
Local variables
Local variables
Return address Context
Return address Context
Parameters Global variables
Parameters Global variables
Dispatcher
Architectures
Context switch

Process 1 PCB
Push registers
Process 2 PCB
Code
Stack SP
Code
Stack
PID SP … …

Declare local variables PC Store SP to PCB 1
PID SP …


………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………
Context- switch- variables
………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………
Context- switch- variables
© 2019 Uwe R. Zimmer, The Australian National University
page 684 of 757 (chapter 9: “Architectures” up to page 745)
Base
Base
Registers
Registers
Flags PC
Flags PC
Local variables
Local variables
Return address Context
Return address Context
Parameters
Parameters
Local variables
Local variables
Return address Context
Return address Context
Parameters Global variables
Parameters Global variables
Dispatcher
Architectures
Context switch

Process 1 PCB
Push registers
Declare local variables
Store SP to PCB 1
Scheduler
Process 2 PCB
Code
Stack SP
Code
Stack
PID SP … … …
PC
PID SP …


………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………
Context- switch- variables
………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………
Context- switch- variables
© 2019 Uwe R. Zimmer, The Australian National University
page 685 of 757 (chapter 9: “Architectures” up to page 745)
Base
Base
Registers
Registers
Flags PC
Flags PC
Local variables
Local variables
Return address Context
Return address Context
Parameters
Parameters
Local variables
Local variables
Return address Context
Return address Context
Parameters Global variables
Parameters Global variables
Dispatcher
Architectures
Context switch

Process 1 PCB
Push registers
Declare local variables
Store SP to PCB 1
Scheduler
Load SP from PCB 2
Process 2
PID SP … … … Code Stack
PCB PID
… … Stack

………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
……… Local variables ………………
Registers
………
………
……… Return address ………………
………
………
………
………………
………
………
………
………………
………
………
……… Global variables ………
Return address Context
© 2019 Uwe R. Zimmer, The Australian National University
page 686 of 757 (chapter 9: “Architectures” up to page 745)
Base
Base
Context- switch- variables
………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………
SP
Context- switch- variables
Registers
Flags PC
Flags PC
Local variables
Local variables
Return address Context
Return address Context
Parameters
Parameters
Context Parameters
Parameters Global variables
PC
Code
Dispatcher
Architectures
Context switch
Local variables

Process 1 PCB
Push registers
Declare local variables
Store SP to PCB 1
Scheduler
Load SP from PCB 2
Remove local variables
Process 2
Code
Stack
Code
PID SP … …

PCB PID
… … Stack

………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………
Registers
SP
Registers
© 2019 Uwe R. Zimmer, The Australian National University
page 687 of 757 (chapter 9: “Architectures” up to page 745)
Base
Base
Context- switch- variables
PC
………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………
Flags PC
Flags PC
Local variables
Local variables
Return address Context
Return address Context
Parameters
Parameters
Local variables
Local variables
Return address Context
Return address Context
Parameters Global variables
Parameters Global variables
Dispatcher
Architectures
Context switch

Process 1 PCB
Push registers
Declare local variables
Store SP to PCB 1
Scheduler
Load SP from PCB 2
Remove local variables
Pop registers
Process 2
Code
Stack
Code
PID SP … … …
PCB PID
… … Stack

………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………
Context- switch- PC
………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………
© 2019 Uwe R. Zimmer, The Australian National University
page 688 of 757 (chapter 9: “Architectures” up to page 745)
Base
Base
variables Registers Flags
SP
PC
PC
Local variables
Local variables
Return address Context
FP
Return address Context
Parameters
Parameters
Local variables
Local variables
Return address Context
Return address Context
Parameters Global variables
Parameters Global variables
Dispatcher
Architectures
Context switch
Flags

Process 1 PCB
Push registers Process 2 Declare local variables PCB Store SP to PCB 1 PID Scheduler
Code
Stack
Load SP from PCB 2 Code Remove local variables
Pop registers
Return from interrupt
PID SP … …

… … Stack

………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………
Context- switch- variables
………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………………
………
………
………
………
© 2019 Uwe R. Zimmer, The Australian National University
page 689 of 757 (chapter 9: “Architectures” up to page 745)
Base
Base
Registers
Flags PC
SP
Local variables
Local variables
Return address Context
PC
FP
Return address Context
Parameters
Parameters
Local variables
Local variables
Return address Context
Return address Context
Parameters Global variables
Parameters Global variables
Dispatcher
Architectures
Context switch

Memory
Int.
SeqSueeqnuceenrcer
Some CPU actions are naturally sequential (e.g. instructions need to be first loaded, then decoded before they can be executed).
CoCdeodmeamnagneamgemntent Decoder
Pipeline
IP
More fine grained sequences can be introduced by breaking CPU instructions into micro code.
Registers SP
Flags
D a tD Da a amt t a aa nm ma a ag n ne a amg g e e e nm mt e e n n t t
☞ (Conditional) branches
might break the pipelines
☞ Branch predictors become essential.
© 2019 Uwe R. Zimmer, The Australian National University
page 690 of 757 (chapter 9: “Architectures” up to page 745)
ALU
☞ Same latency, yet higher throughput.
Architectures
Processor Architectures
☞ Overlapping those sequences in time will lead to the concept of pipelines.

Memory
Int.
SeqSueeqnuceenrcer
Filling parallel pipelines
(by alternating incoming commands between pipelines) may employ multiple ALU’s.
CoCdeodmeamnagneamgemntent Decoder
Processor Architectures
Parallel pipelines
IP
☞ (Conditional) branches might again break the pipelines.
Registers SP
Flags
D D a a t tD D Da a a a am mt t t a a aa a n nm m ma a a a ag g n n ne e a a am mg g g e e e e e n nm m mt t e e e n n n t t t
© 2019 Uwe R. Zimmer, The Australian National University
page 691 of 757 (chapter 9: “Architectures” up to page 745)
ALU ALU
☞ Same latency, yet even higher throughput.
☞ Compilers need to be aware of the options.
Architectures
☞ Interdependencies might limit the degree of concurrency.

Memory
Int.
Breaking the sequence inside each pipe- line leads to ‘out of order’ CPU designs.
CoCdeodmeamnagneamgemntent Decoder
Processor Architectures
Out of order execution
IP
☞ Replace pipelines with hardware scheduler. ☞ Results need to be
Registers SP
Flags
“re-sequentialized” or possibly discarded.
SeqSueeqnuceenrcer
D D a a t t D D D a a a a a m m t t t a a a a a n n m m m a a a a a g g n n n e e a a a m m g g g e e e e e n mn m m t t e e e n n n t t t
☞ This hardware will require (extensive) code optimization to be fully utilized.
© 2019 Uwe R. Zimmer, The Australian National University
page 692 of 757 (chapter 9: “Architectures” up to page 745)
ALU ALU
☞ Works better if the presented code sequence has more independent instructions and fewer conditional branches.
Architectures
☞ “Conditional branch prediction” executes the most likely branch or multiple branches.

Memory
Int.
Decoder Sequencer
Provides the facility to apply the same in- struction to multiple data concurrently. Also referred to as “vector units”.
IP
Examples: Altivec, MMX, SSE[2|3|4], …
Code management
SIMD ALU units
Registers SP
Flags
☞ Requires specialized compilers or programming languages with implicit concurrency.
Data management
© 2019 Uwe R. Zimmer, The Australian National University
☞ Unifying architecture languages are used (OpenCL, CUDA, GPGPU).
ALU ALU
GPU processing
ALU ALU
Architectures
Processor Architectures
Graphics processor as a vector unit.
page 693 of 757 (chapter 9: “Architectures” up to page 745)

Memory
Int.
Decoder
Emulates multiple virtual CPU cores by means of replication of:
IP IP
Code management Decoder
Processor Architectures Hyper-threading
Registers Registers
Flags Flags
SP SP
while keeping the “expensive” resources like the ALU central yet accessible by multiple hyper-threads concurrently.
Data management
© 2019 Uwe R. Zimmer, The Australian National University
Examples: Intel Pentium 4, Core i5/i7, Xeon, Atom, Sun UltraSPARC T2 (8 threads per core)
Sequencer Sequencer
• Registersets
• Sequencer
• Flags
• Interruptlogic
ALU
Architectures
☞ Requires programming languages with implicit or explicit concurrency.
page 694 of 757 (chapter 9: “Architectures” up to page 745)

Memory
Int.
Decoder
Full replication of multiple CPU cores on the same chip package.
IP IP
Code management Decoder
Processor Architectures
Multi-core CPUs
Registers Registers
Flags Flags
• Cleanest and most explicit implementation of concurrency on the CPU level.
SP SP
Data management
© 2019 Uwe R. Zimmer, The Australian National University
Historically the introduction of multi-core CPUs ended the “GHz race” in the early 2000’s.
Sequencer Sequencer
• Often combined with hyper-thread- ing and/or multiple other means (as introduced above) on each core.
ALU
☞ Requires synchronized atomic operations.
☞ Requires programming languages with
Architectures
implicit or explicit concurrency.
page 695 of 757 (chapter 9: “Architectures” up to page 745)

VirtMuaelmoermyory Physical memory
Int.
Decoder Sequencer
Translates logical memory addresses
into physical memory addresses
and provides memory protection features.
IP
• Does not introduce concurrency by itself.
Code management
Virtual memory
Registers SP
Flags
☞ Is still essential for concurrent programming as hardware memory protection
guarantees memory integrity for
individual processes / threads.
Data management
© 2019 Uwe R. Zimmer, The Australian National University
page 696 of 757 (chapter 9: “Architectures” up to page 745)
ALU
Architectures
Processor Architectures

Architectures
Alternative Processor Architectures: Parallax Propeller
© 2019 Uwe R. Zimmer, The Australian National University page 697 of 757 (chapter 9: “Architectures” up to page 745)

8 semaphores
No interrupts!
8 cores with 2 kB local memory
40 kB shared memory
Low cost 32 bit processor ($8)
Architectures
Alternative Processor Architectures: Parallax Propeller (2006)
© 2019 Uwe R. Zimmer, The Australian National University page 698 of 757 (chapter 9: “Architectures” up to page 745)

theoretical 25.6 GFLOPS at 3.2 GHz
8 cores for specialized high- bandwidth floating point operations and 128 bit registers
Multiple interconnect topologies
64 bit PowerPC core
Cache
Architectures
Alternative Processor Architectures: IBM Cell processor (2001)
© 2019 Uwe R. Zimmer, The Australian National University page 699 of 757 (chapter 9: “Architectures” up to page 745)

Scaling up:
• Multi-CPU with high-speed interconnects various supercomputer architectures, e.g. Cray XE6:
• 12-core AMD Opteron, up to 192 per cabinet (2304 cores)
• 3D torus interconnect (160GB/sec capacity, 48 ports per node)
• Cluster computer (Multi-CPU over network) multiple computers connected by network interface,
e.g. Sun Constellation Cluster at ANU:
• 1492 nodes, each: 2x Quad core Intel Nehalem, 24 GB RAM • QDR Infiniband network, 2.6 GB/sec
Architectures
Multi-CPU systems
• Multi-CPU on the same memory
multiple CPUs on same motherboard and memory bus, e.g. servers, workstations
© 2019 Uwe R. Zimmer, The Australian National University page 700 of 757 (chapter 9: “Architectures” up to page 745)

Translates into
CPU-level vector operations
Buzzword collection: AltiVec, SPE, MMX, SSE, NEON, SPU, AVX, …
Combined with
in-lining, loop unrolling and caching this is as fast as a single CPU will get.
a$v=a$ y = a$y fxp fa$xp
z a$z type Real is digits 15;
type Vectors is array (Positive range <>) of Real;
function Scale (Scalar : Real; Vector : Vectors) return Vectors is
Scaled_Vector : Vectors (Vector’Range);
begin
for i in Vector’Range loop
Scaled_Vector (i) := Scalar * Vector (i); end loop;
return Scaled_Vector; end Scale;
© 2019 Uwe R. Zimmer, The Australian National University
page 701 of 757 (chapter 9: “Architectures” up to page 745)
Architectures
Vector Machines
Vectorization

Translates into CPU-level vector operations as well as multi-core or
fully distributed operations
Function is
“promoted”
a$v=a$ y = a$y fxp fa$xp
z a$z const Index = {1 .. 100000000},
Vector_1 : [Index] real = 1.0,
Scale : real = 5.1,
Scaled : [Vector] real = Scale * Vector_1;
© 2019 Uwe R. Zimmer, The Australian National University
page 702 of 757 (chapter 9: “Architectures” up to page 745)
Architectures
Vector Machines
Vectorization

Translates into
CPU-level vector operations
/-chain is evaluated lazy sequentially.
type Real is digits 15;
z1 z2
type Vectors is array (Positive range <>) of Real;
function ”=” (Vector_1, Vector_2 : Vectors) return Boolean is
(for all i in Vector_1’Range => Vector_1 (i) = Vector_2 (i));
Architectures
Vector Machines
Reduction
v=v&y=y&x=x/y=y/z=z
1 2 fx1p fx2p ^1 2h ^1 2h ^1 2h
© 2019 Uwe R. Zimmer, The Australian National University page 703 of 757 (chapter 9: “Architectures” up to page 745)

/-operations are evaluated in a concurrent divide-and-conquer (binary tree) structure.
Translates into CPU-level vector operations as well as multi-core or
fully distributed operations
Function is
“promoted”
z1 z2
const Index = {1 .. 100000000},
Vector_1, Vector_2 : [Index] real = 1.0; proc Equal (v1, v2) : bool
{return && reduce (v1 == v2);}
© 2019 Uwe R. Zimmer, The Australian National University
page 704 of 757 (chapter 9: “Architectures” up to page 745)
Architectures
Vector Machines
Reduction
x1 x2
v1 = v2 & fy1p= fy2p& ^x1 = x2h/^y1 = y2h/^z1 = z2h

Translates into CPU-level vector operations as well as multi-core or
fully distributed operations
Architectures
Vector Machines
General Data-parallelism
R $ 6px”S-1
– 1 :V 5 -1W
const Mask : [1 .. 3, 1 .. 3] real = ((0, -1, 0), (-1, 5, -1), (0, -1, 0)); proc Unsharp_Mask (P, (i, j) : index (Image)) : real
{return + reduce (Mask * P [i – 1 .. i + 1, j – 1 .. j + 1]);}
const Sharpened_Picture = forall px in Image do Unsharp_Mask (Picture, px);
S S T
:
– 1
W :W X
© 2019 Uwe R. Zimmer, The Australian National University page 705 of 757 (chapter 9: “Architectures” up to page 745)

John Conway’s Game of Life rule:
proc Rule (S, (i, j) : index (World)) : Cell {
Vector Machines
General Data-parallelism

Architectures
Cellular automaton transitions from a state S into the next state Sl: S ” Sl + 6c ! S: c ” cl = r^S,ch, i.e. all cells of a state transition concurrently into new cells by following a rule r.
Next_State = forall World_Indices in World do Rule (State, World_Indices);
const Population : index ({0 .. 9}) =
+ reduce Count (Cell.Alive, S [i – 1 .. i + 1, j – 1 .. j + 1]);
return (if Population == 3
|| (Population == 4 && S [i, j] == Cell.Alive) then Cell.Alive
else Cell.Dead);
© 2019 Uwe R. Zimmer, The Australian National University page 706 of 757 (chapter 9: “Architectures” up to page 745)
}

Architectures
Operating Systems
What is an operating system?
© 2019 Uwe R. Zimmer, The Australian National University page 707 of 757 (chapter 9: “Architectures” up to page 745)

Architectures
What is an operating system?
1. A virtual machine!
… offering a more comfortable and safer environment
(e.g. memory protection, hardware abstraction, multitasking, …)
© 2019 Uwe R. Zimmer, The Australian National University page 708 of 757 (chapter 9: “Architectures” up to page 745)

Tasks
OS Hardware
Tasks
Tasks
Typ. general OS
Typ. real-time system
© 2019 Uwe R. Zimmer, The Australian National University
page 709 of 757 (chapter 9: “Architectures” up to page 745)
Architectures
What is an operating system?
1. A virtual machine!
… offering a more comfortable and safer environment
RT-OS Hardware
Hardware
Typ. embedded system
run-time environment

Architectures
What is an operating system?
2. A resource manager!
… coordinating access to hardware resources
© 2019 Uwe R. Zimmer, The Australian National University page 710 of 757 (chapter 9: “Architectures” up to page 745)

… coordinating access to hardware resources Operating systems deal with
• processors
• memory
• massstorage
• communicationchannels
• devices (timers, special purpose processors, peripheral hardware, …
Architectures
What is an operating system?
2. A resource manager!
☞ and tasks/processes/programs which are applying for access to these resources!
© 2019 Uwe R. Zimmer, The Australian National University page 711 of 757 (chapter 9: “Architectures” up to page 745)

• early 60s: Multiprogramming systems:
☞ employ the long device I/O delays for switches to other, runable programs
• early 60s: Multiprogramming, time-sharing systems:
☞ assign time-slices to each program and switch regularly
Architectures
The evolution of operating systems
• in the beginning: single user, single program, single task, serial processing – no OS
• 50s: System monitors / batch processing
☞ the monitor ordered the sequence of jobs and triggered their sequential execution
• 50s-60s: Advanced system monitors / batch processing:
☞ the monitor is handling interrupts and timers
☞ first support for memory protection
☞ firstimplementationsofprivilegedinstructions(accessiblebythemonitoronly).
• early 70s: Multitasking systems – multiple developments resulting in UNIX (besides others)
• early 80s: single user, single tasking systems, with emphasis on user interface or APIs. MS-DOS, CP/M, MacOS and others first employed ‘small scale’ CPUs (personal computers).
• mid-80s: Distributed/multiprocessor operating systems – modern UNIX systems (SYSV, BSD)
© 2019 Uwe R. Zimmer, The Australian National University page 712 of 757 (chapter 9: “Architectures” up to page 745)

• 80s: introduction of fast local networks (LANs): ethernet, token-ring
• 90s: mass introduction of wireless networks (LAN and WAN)
Current standard consumer computers might come with:
• High speed network connectors (e.g. GB-Ethernet)
• Wireless LAN (e.g. IEEE802.11g, …)
• Local device bus-system (e.g. Firewire 800, Fibre Channel or USB 3.0)
• Wireless local device network (e.g. Bluetooth)
• Infrared communication (e.g. IrDA)
• Modem/ADSL
Architectures
The evolution of communication systems
• 1901: first wireless data transmission (Morse-code from ships to shore)
• ‘56: first transmission of data through phone-lines
• ‘62: first transmission of data via satellites (Telstar)
• ‘69: ARPA-net (predecessor of the current internet)
© 2019 Uwe R. Zimmer, The Australian National University page 713 of 757 (chapter 9: “Architectures” up to page 745)

• 80s: PCs starting with almost none of the classical OS-features and services, but with an user-interface (MacOS) and simple device drivers (MS-DOS)
Architectures
Types of current operating systems
Personal computing systems, workstations, and workgroup servers:
• late 70s: Workstations starting by porting UNIX or VMS to ‘smaller’ computers.
☞ last 20 years: evolving and expanding into current general purpose OSs, like for instace:
• Solaris (based on SVR4, BSD, and SunOS)
• LINUX (open source UNIX re-implementation for x86 processors and others)
• current Windows (proprietary, partly based on Windows NT, which is ‘related’ to VMS)
• MacOS X (Mach kernel with BSD Unix and a proprietary user-interface)
• Multiprocessing is supported by all these OSs to some extent.
• None of these OSs are suitable for embedded systems, although trials have been performed.
• None of these OSs are suitable for distributed or real-time systems.
© 2019 Uwe R. Zimmer, The Australian National University page 714 of 757 (chapter 9: “Architectures” up to page 745)

Parallel operating systems
• support for a large number of processors, either:
• symmetrical: each CPU has a full copy of the operating system or
Architectures
Types of current operating systems
• asymmetrical: only one CPU carries the full operating system, the others are operated by small operating system stubs to transfer code or tasks.
© 2019 Uwe R. Zimmer, The Australian National University page 715 of 757 (chapter 9: “Architectures” up to page 745)

Distributed operating systems
• all CPUs carry a small kernel operating system for communication services.
• all other OS-services are distributed over available CPUs
• services may migrate
• services can be multiplied in order to
• guarantee availability (hot stand-by)
• or to increase throughput (heavy duty servers)
© 2019 Uwe R. Zimmer, The Australian National University
page 716 of 757 (chapter 9: “Architectures” up to page 745)
Architectures
Types of current operating systems

Real-time operating systems
• Fast context switches?
• Smallsize?
• Quick response to external interrupts?
• Multitasking?
• ‘low level’ programming interfaces?
• Interprocess communication tools?
• High processor utilization?
© 2019 Uwe R. Zimmer, The Australian National University
page 717 of 757 (chapter 9: “Architectures” up to page 745)
Architectures
Types of current operating systems

Real-time operating systems
• Fast context switches?
• Small size?
• Quick response to external interrupts?
• Multitasking?
• ‘low level’ programming interfaces?
• Interprocess communication tools?
• High processor utilization?
should be fast anyway
should be small anyway
not ‘quick’, but predictable
often, not always
needed in many operating systems needed in almost all operating systems fault tolerance builds on redundancy!
© 2019 Uwe R. Zimmer, The Australian National University
page 718 of 757 (chapter 9: “Architectures” up to page 745)
Architectures
Types of current operating systems

☞ the correctness of the time, when the results are delivered
☞ Predictability! (not performance!) ☞ All results are to be delivered just-in-time – not too early, not too late.
© 2019 Uwe R. Zimmer, The Australian National University
page 719 of 757 (chapter 9: “Architectures” up to page 745)
Architectures
Types of current operating systems
Real-time operating systems need to provide… ☞ the logical correctness of the results as well as
Timing constraints are specified in many different ways … … often as a response to ‘external’ events
☞ reactive systems

Embedded operating systems
• usually real-time systems, often hard real-time systems
• very small footprint (often a few KBs)
• none or limited user-interaction
☞ 90-95% of all processors are working here!
© 2019 Uwe R. Zimmer, The Australian National University
page 720 of 757 (chapter 9: “Architectures” up to page 745)
Architectures
Types of current operating systems

Architectures
What is an operating system?
Is there a standard set of features for operating systems?
© 2019 Uwe R. Zimmer, The Australian National University page 721 of 757 (chapter 9: “Architectures” up to page 745)

Architectures
What is an operating system?
Is there a standard set of features for operating systems?
☞ no:
the term ‘operating system’ covers 4 kB microkernels,
as well as > 1 GB installations of desktop general purpose operating systems.
© 2019 Uwe R. Zimmer, The Australian National University page 722 of 757 (chapter 9: “Architectures” up to page 745)

☞ no:
the term ‘operating system’ covers 4 kB microkernels,
as well as > 1 GB installations of desktop general purpose operating systems.
Architectures
What is an operating system?
Is there a standard set of features for operating systems?
Is there a minimal set of features?
© 2019 Uwe R. Zimmer, The Australian National University page 723 of 757 (chapter 9: “Architectures” up to page 745)

☞ no:
the term ‘operating system’ covers 4 kB microkernels,
as well as > 1 GB installations of desktop general purpose operating systems.
Architectures
What is an operating system?
Is there a standard set of features for operating systems?
Is there a minimal set of features?
☞ almost:
memory management, process management and inter-process communication/synchronisation will be considered essential in most systems
© 2019 Uwe R. Zimmer, The Australian National University page 724 of 757 (chapter 9: “Architectures” up to page 745)

Architectures
What is an operating system?
Is there a standard set of features for operating systems?
☞ no:
the term ‘operating system’ covers 4 kB microkernels,
as well as > 1 GB installations of desktop general purpose operating systems.
Is there a minimal set of features?
☞ almost:
memory management, process management and inter-process communication/synchronisation will be considered essential in most systems
Is there always an explicit operating system?
© 2019 Uwe R. Zimmer, The Australian National University page 725 of 757 (chapter 9: “Architectures” up to page 745)

Architectures
What is an operating system?
Is there a standard set of features for operating systems?
☞ no:
the term ‘operating system’ covers 4 kB microkernels,
as well as > 1 GB installations of desktop general purpose operating systems.
Is there a minimal set of features?
☞ almost:
memory management, process management and inter-process communication/synchronisation will be considered essential in most systems
Is there always an explicit operating system?
☞ no:
some languages and development systems operate with standalone runtime environments
© 2019 Uwe R. Zimmer, The Australian National University page 726 of 757 (chapter 9: “Architectures” up to page 745)

Process management:
• Contextswitch
• Scheduling
• Book keeping (creation, states, cleanup)
☞ context switch:
☞ needs to…
• ‘remove’ one process from the CPU while preserving its state
• choose another process (scheduling)
• ‘insert’ the new process into the CPU, restoring the CPU state
Some CPUs have hardware support for context switching, otherwise: ☞ use interrupt mechanism
Architectures
Typical features of operating systems
© 2019 Uwe R. Zimmer, The Australian National University page 727 of 757 (chapter 9: “Architectures” up to page 745)

Memory management:
☞ tightly coupled to scheduling / task switching! Hardware abstraction
• Devicedrivers
• API
• Protocols, file systems, networking, everything else…
© 2019 Uwe R. Zimmer, The Australian National University
page 728 of 757 (chapter 9: “Architectures” up to page 745)
Architectures
Typical features of operating systems
• Allocation / Deallocation
• Virtual memory: logical vs. physical addresses, segments, paging, swapping, etc.
• Memory protection (privilege levels, separate virtual memory segments, …)
• Sharedmemory
Synchronisation / Inter-process communication
• semaphores, mutexes, cond. variables, channels, mailboxes, MPI, etc. (chapter 4)

• non-portable
• hard to maintain
• lacksreliability
• all services are in the kernel (on the same privilege level)
Tasks
☞ but: may reach high efficiency
Hardware Monolithic
e.g. most early UNIX systems,
MS-DOS (80s), Windows (all non-NT based versions) MacOS (until version 9), and many others…
© 2019 Uwe R. Zimmer, The Australian National University
page 729 of 757 (chapter 9: “Architectures” up to page 745)
Architectures
Typical structures of operating systems
Monolithic (or ‘the big mess…’ )
APIs
OS

☞ may reach high efficiency
Hardware Modular
e.g. current Linux versions
© 2019 Uwe R. Zimmer, The Australian National University
page 730 of 757 (chapter 9: “Architectures” up to page 745)
Architectures
Typical structures of operating systems
Monolithic & Modular
• Modules can be platform independent
• Easier to maintain and to develop
• Reliability is increased
• all services are still in the kernel (on the same privilege level)
Tasks
M1 M1… Mn
OS
APIs

• easily portable
• significantlyeasiertomaintain
• crashing layers do not necessarily stop the whole OS
• possibly reduced efficiency through many interfaces
• rigorous implementation of the stacked virtual machine
Tasks
perspective on OSs
Hardware Layered
e.g. some current UNIX implementations (e.g. Solaris) to a certain de- gree, many research OSs (e.g. ‘THE system’, Dijkstra ‘68)
Architectures
Typical structures of operating systems
Monolithic & layered
© 2019 Uwe R. Zimmer, The Australian National University page 731 of 757 (chapter 9: “Architectures” up to page 745)
Mn
… OS
M1 layers M0
APIs

memory, and message handling
• all ‘higher’ services are dealt with outside the
Tasks Tasks APIs APIs
Tasks
• multiple OSs can be executed at the same time
OS 1 1 … n OS M1 M0
layers
• μkernel is highly hardware dependent ☞ only the μkernel needs to be ported.
Hardware μkernel, virtual machine
• possibly reduced efficiency through increased communications
e.g. wide spread concept: as early as the CP/M, VM/370 (‘79) or as recent as MacOS X (mach kernel + BSD unix), …
Architectures
Typical structures of operating systems
μKernels & virtual machines • μkernel implements essential process,
kernel ☞ no threat for the kernel stability
• significantlyeasiertomaintain Mn
© 2019 Uwe R. Zimmer, The Australian National University page 732 of 757 (chapter 9: “Architectures” up to page 745)
APIs MMM … OS
μkernel

• μkernel implements essential process, memory, and message handling
• all ‘higher’ services are user level servers
task 1
task n
service 1
service m
• significantlyeasiertomaintain
μkernel
• kernel ensures reliable message passing between clients and servers
Hardware
μkernel, client server structure
• highly modular and flexible
• servers can be redundant and easily replaced
• possibly reduced efficiency through increased communications
e.g. current research projects, L4, etc.
© 2019 Uwe R. Zimmer, The Australian National University
page 733 of 757 (chapter 9: “Architectures” up to page 745)
Architectures
Typical structures of operating systems
μKernels & client-server models

μKernels & client-server models • μkernel implements essential process,
memory, and message handling
• all ‘higher’ services are user level servers
• significantlyeasiertomaintain
task 1
task n
service 1
service m
• kernel ensures reliable message passing between clients and servers:
locally and through a network
Hardware
• highly modular and flexible
• servers can be redundant and easily replaced
• possibly reduced efficiency through increased communications
e.g. Java engines,
distributed real-time operating systems, current distributed OSs research projects
Architectures
Typical structures of operating systems
© 2019 Uwe R. Zimmer, The Australian National University page 734 of 757 (chapter 9: “Architectures” up to page 745)
μkernel
μkernel
μkernel
μkernel, distributed systems
Network

• Hierarchical file-system (maintained via ‘mount’ and ‘unmount’)
• Universal file-interface applied to files, devices (I/O), as well as IPC
• Dynamic process creation via duplication
• Choice of shells
• Internal structure as well as all APIs are based on ‘C’
• Relatively high degree of portability
☞ UNICS, UNIX, BSD, XENIX, System V, QNX, IRIX, SunOS, Ultrix, Sinix, Mach, Plan 9, NeXTSTEP, AIX, HP-UX, Solaris, NetBSD, FreeBSD, Linux, OPEN- STEP, OpenBSD, Darwin, QNX/Neutrino, OS X, QNX RTOS, … … .
Architectures
UNIX
UNIX features
© 2019 Uwe R. Zimmer, The Australian National University page 735 of 757 (chapter 9: “Architectures” up to page 745)

pid = fork ();
resulting a duplication of the current process
• returning 0 to the newly created process
Architectures
UNIX
Dynamic process creation
• returning the process id of the child process to the creating process (the ‘parent’ process) or -1 for a failure
© 2019 Uwe R. Zimmer, The Australian National University page 736 of 757 (chapter 9: “Architectures” up to page 745)

pid = fork ();
resulting a duplication of the current process
• returning 0 to the newly created process
Architectures
UNIX
Dynamic process creation
• returning the process id of the child process to the creating process (the ‘parent’ process) or -1 for a failure
Frequent usage:
if (fork () == 0) {
// … the child’s task … often implemented as: exec (“absolute path to executable file“, “args“); exit (0); /* terminate child process */
} else {
//… the parent’s task …
pid = wait (); /* wait for the termination of one child process */
}
© 2019 Uwe R. Zimmer, The Australian National University page 737 of 757 (chapter 9: “Architectures” up to page 745)

#include
#include
#include
if (id == 0) {
signal (SIGSTOP, catch_stop); pause ();
exit (0);
pid_t id;
void catch_stop (int sig_num)
{
} else {
kill (id, SIGSTOP); pid = wait ();
/* do something with the signal */
}
© 2019 Uwe R. Zimmer, The Australian National University
page 738 of 757 (chapter 9: “Architectures” up to page 745)
Architectures
UNIX
Synchronization in UNIX ☞ Signals id = fork ();
}

int data_pipe [2], c, rc;
if (pipe (data_pipe) == -1) {
} else { // parent
close (data_pipe [0]);
while ((c = getchar ()) > 0) {
perror (“no pipe“); exit (1); }
if (write
(data_pipe[1], &c, 1) == -1) { perror (“pipe broken“);
close (data_pipe [1]);
exit (1);
if (fork () == 0) { // child close (data_pipe [1]); while ((rc = read
(data_pipe [0], &c, 1)) >0) {
putchar (c);
}
}; }
if (rc == -1) {
perror (“pipe broken“);
close (data_pipe [0]); exit (1);}
close (data_pipe [1]);
close (data_pipe [0]); exit (0); © 2019 Uwe R. Zimmer, The Australian National University
pid = wait ();
}
Architectures
UNIX
Message passing in UNIX ☞ Pipes
page 739 of 757 (chapter 9: “Architectures” up to page 745)

Architectures
☞ inefficient, but can generate new tasks out of any user process – no shared memory! Signals:
UNIX
Processes & IPC in UNIX
• Process creation results in a duplication of address space (‘copy-on-write’ becomes necessary)
Processes:
• limited information content, no buffering, no timing assurances (signals are not interrupts!) ☞ very basic, yet not very powerful form of synchronisation
Pipes:
• unstructured byte-stream communication, access is identical to file operations
☞ not sufficient to design client-server architectures or network communications
© 2019 Uwe R. Zimmer, The Australian National University page 740 of 757 (chapter 9: “Architectures” up to page 745)

Connectionless interfaces (e.g. UDP/IP): • Serverside:socket➠bind➠recvfrom➠close
• Client side: socket ➠ sendto ➠ close
Connection oriented interfaces (e.g. TCP/IP):
• Server side: socket ➠ bind ➠ {select} [connect | listen ➠
accept ➠ read | write ➠ [close | shutdown]
Architectures
UNIX
Sockets in BSD UNIX
Sockets try to keep the paradigm of a universal file interface for everything and introduce:
• Client side: socket ➠ bind ➠ connect ➠ write | read ➠ [close | shutdown]
© 2019 Uwe R. Zimmer, The Australian National University page 741 of 757 (chapter 9: “Architectures” up to page 745)

Architectures
Portable Operating System Interface for Unix • IEEE/ANSI Std 1003.1 and following.
POSIX
• LibraryInterface(API)
[C Language calling conventions – types exit mostly in terms of (open) lists of pointers and integers with overloaded meanings].
• More than 30 different POSIX standards (and growing / changing). ☞ a system is ‘POSIX compliant’, if it implements parts of one of them!
☞ a system is ‘100% POSIX compliant’, if it implements one of them!
© 2019 Uwe R. Zimmer, The Australian National University page 742 of 757 (chapter 9: “Architectures” up to page 745)

1003.1 12/01
OS Definition
single process, multi process, job control, signals, user groups, file system, file at- tributes, file device management, file locking, device I/O, device-specific control, system database, pipes, FIFO, …
1003.1b 10/93
Real-time Extensions
real-time signals, priority scheduling, timers, asynchronous I/O, prioritized I/O, syn- chronized I/O, file sync, mapped files, memory locking, memory protection, mes- sage passing, semaphore, …
1003.1c 6/95
Threads
multiple threads within a process; includes support for: thread control, thread attrib- utes, priority scheduling, mutexes, mutex priority inheritance, mutex priority ceiling, and condition variables
1003.1d 10/99
Additional Real- time Extensions
new process create semantics (spawn), sporadic server scheduling, execution time monitoring of processes and threads, I/O advisory information, timeouts on block- ing functions, device control, and interrupt control
1003.1j 1/00
Advanced Real- time Extensions
typed memory, nanosleep improvements, barrier synchronization, reader/writer locks, spin locks, and persistent notification for message queues
1003.21 -/-
Distributed Real-time
buffer management, send control blocks, asynchronous and synchronous oper- ations, bounded blocking, message priorities, message labels, and implementation protocols
Architectures
POSIX – some of the relevant standards…
© 2019 Uwe R. Zimmer, The Australian National University page 743 of 757 (chapter 9: “Architectures” up to page 745)

Architectures
POSIX – 1003.1b/c
Frequently employed POSIX features include:
• Threads: a common interface to threading – differences to ‘classical UNIX processes’ • Timers: delivery is accomplished using POSIX signals
• Priority scheduling: fixed priority, 32 priority levels
• Real-time signals: signals with multiple levels of priority
• Semaphore: named semaphore
• Memory queues: message passing using named queues
• Shared memory: memory regions shared between multiple processes
• Memory locking: no virtual memory swapping of physical memory pages
© 2019 Uwe R. Zimmer, The Australian National University page 744 of 757 (chapter 9: “Architectures” up to page 745)

• Data-Parallelism
• Vectorization, Reduction, General data-parallelism
• Concurrency in languages
• Some examples: Haskell, Occam, Chapel
• Operating systems
• Structures: monolithic, modular, layered, μkernels • UNIX,POSIX
© 2019 Uwe R. Zimmer, The Australian National University
page 745 of 757 (chapter 9: “Architectures” up to page 745)
Architectures
Summary
Architectures
• Hardware architectures – from simple logic to supercomputers • logic, CPU architecture, pipelines, out-of-order execution, multithreading, …