Computer Systems Organization
Session 1 Main Theme Overview
Dr. Jean-Claude Franchitti
New York University
Computer Science Department Courant Institute of Mathematical Sciences
Presentation material partially based on textbook slides Ce Se: A Pae Peece
b Rada Ba ad Dad OHaa
Slides copyright © 2020
1
Agenda
1
2
3
Instructor and Course Introduction Computer Systems Organization Overview Summary and Conclusion
2
Who am I?
– Profile –
37 years of experience in the Information Technology Industry, including 20 years of experience working for leading business technology consulting firms such as Computer Sciences Corporation
CEO, Archemy Inc. and Professor of Computer Science at New York University
PhD in Computer Science from University of Colorado at Boulder (UCB)
Held senior management and technical leadership (SVP, CTO, CIO) roles in many large business technology strategy and modernization projects for fortune 500 corporations in the insurance, banking, investment banking, pharmaceutical, retail, and information management industries
Contributed to several high-profile ARPA and NSF research projects
Played an active role as a member of the OMG, ODMG, and X3H2 standards committees and as a
Professor of Computer Science at Columbia initially and New York University since 1997
Proven record of delivering business solutions on time and on budget
Original designer and developer of jcrew.com and the suite of products now known as IBM InfoSphere DataStage
Creator of the Enterprise Architecture Management Framework (EAMF) and main contributor to the creation of various maturity assessment methodologies
Developed partnerships between several companies and New York University to incubate new methodologies (e.g., EA maturity assessment methodology developed in Fall 2008), develop proof of concept software, recruit skilled graduates, and increase the caie visibility
3
How to reach me?
Cme nha ele did you expect?
Cell
(212) 203-5004
Email
jcf@cs.nyu.edu, jcf@archemy.com
Facebook
Jean-Claude Franchitti
LinkedIn
http://www.linkedin.com/in/jcfranchitti
Twitter
Tweets by jcfranchitti
Skype
jcfranchitti
WeChat
metacomp
4
What is the class about?
Course description and syllabus:
» http://www.nyu.edu/classes/jcf/CSCI-UA.0201-005/
» http://www.cs.nyu.edu/courses/fall20/CSCI-UA.0201-005/ » Accessible via NYU Classes
Textbooks:
» ComperSsems:AProgrammersPerspecie
Rada Ba ad Daid OHaa Pearson
ISBN-10: 013409266X, ISBN-13: 978-0134092669, 3rd Edition (03/2015) » Recommended:
» The C Programmning Language, Brian W Kernighan and Dennis Ritchie
Prentice Hall; ISBN-10: 0131103628; ISBN-13: 978-01311103627, 2nd Edition (04/88)
5
Icons / Metaphors
Information
Common Realization Knowledge/Competency Pattern Governance
Alignment
Solution Approach
6
6
Course topics overview
C Programming Language
x86_64 Assembly Language
Linking
Architecture
Memory System
Dynamic Memory Allocation
Processes and Concurrency (time permitting)
7
What you will learn in this course
What happens under the hood when you boot your computer and start running applications?
How do software and hardware interact?
This course is programmer-centric
» Understanding of underlying system makes you a more effective programmer and helps you find hidden bugs!
» Bring out the hidden hacker in everyone » Be way more efficient debugger
» Tune your programs for performance
8
But e also ant ou to
Use what you have learned in MANY different contexts
Start your research project if you want Know the big picture of the whole
computing stack.
Enjoy the course!
9
Course components
Lectures
» Higher level concepts
» slides + reading material from the textbook
Programming labs (~3-4 of them 20%)
» 1-2 weeks each
» Provide in-depth understanding of some aspect of systems
Homework assignments (~2-3 of them10%) » Labs do not cover all the material we will study!
» For theoretical knowledge
Midterm exam (25%) Final exam (35%) Recitation activities (5%) Quizzes (5%)
10
Exams
Closed books but cheat sheets allowed » Midterm: one cheat sheets (i.e. 2 A4 pages) » Final two cheat sheets (i.e. 4 A4 pages)
No electronic devices allowed Cheat sheet:
» Feel free to write anything on it (front and back), before the exam.
» Printed or hand-written
» You are allowed to take it to the exam.
11
Rules for working on assignments
All assignments must be done individually (see Cheating Policy and consequences next)
» Post all questions on the forums on NYU classes
» Y ae ecaged ae he ei, b efai f
explicitly giving away solutions
Unless stated otherwise in the assignment, all writing and coding must be original
All assignments must be emailed to the appropriate grader
T aid be ih “ eai” (e.g., he Iee ae homework”), you should save a copy of your EMAILs (not simply the assignment itself)
12
Cheating policy summarized
Please do NOT:
Copy any part of another student’s homework answers
Allow another student to copy your homework
Copy any part of code found in a book, magazine, the Internet, or
other resource
Present the work of another as your own
If you use the idea of another in your work, you MUST provide appropriate attribution (that is, cite the work and the author).
The penalty for first cheating offense will be a grade of F for the course
Computer Science Department Academic Integrity Policy
13
Integrity and collaboration
What is cheating?
» Sharing code: by copying, retyping, looking at, or
supplying a file
» Describing code: verbal description of code from one person to another.
» Coaching: helping your friend to write a lab, line by line
» Searching the Web for solutions
» Copying code from a previous course or online solution You are only allowed to use code we supply
What is NOT cheating?
» Explaining how to use systems or tools
» Helping others with high-level design issues
Ignorance is not an excuse
Sophisticated tools are used to detect code plagiarism
14
What can positively affect your grade (i.e. help boost your final grade)?
Participate in the forums at NYU classes, by asking questions or answering questions of other
Submit your assignments on time and in the correct format
15
What can negatively affect your grade ?
Coming up with a lot of excuses to get extensions (except documented health problems), or higher grades, examples:
» My machine crashed just before the deadline
you are better off submitting a version each time you complete part of the lab as you are allowed to submit several times; the last submission will be graded
» I have many assignments on other courses so please give me extension
» I submitted one minute after the deadline but the server did not accept it
» I submitted an older version of the lab
Asking questions on the forums that have been asked before
» Yes, you better read the previous questions and answers on the forums
It is a good way of studying because people may ask good questions that did not come to your mind
Asking about a rule that has already been mentioned in an announcement on NYU classes, slides, or assignment description
» You simply need to pay attention
16
Web Resources
Course webpage:
» http://cs.nyu.edu/courses/spring20/CSCI-UA.0201-005/
» For slides, assignments, syllabus, and useful links » Announcements related to material postings
NYU Classes
» Course material mirrored there
» Announcements
» Google groups mailing list
» Forums (arranged by topics)
» Labs and assignments submissions » Grades
17
What to study to get a good grade?
The slides for each lecture
The reading material for each lecture, if any
The questions and answers in the forums and Google Groups mailing list
Ad Attend the recitations
18
Important!!
Attending the lectures helps you a lot when it comes to understanding the material covered in this course
Do not rely on just reading the slides, the reading material, and/or material available via standard web searching
19
Computer Accounts
Students that do not already have a CIMS network account will be provided with one
20
Software Requirements
Software tools will be available on the CIMS cluster or, in some cases, on the course Web site de de ad a a chice f feeae commercial tools
Microsoft Windows 10 and Mac OS/X will be used at times when discussing code portability and tools compatibility
References will be provided on the course Web site
21
Class Mailing List
All students should register themselves with the NYU Google Groups list, which is used for all technical discussions concerning the course
You will be notified in return that you are a list participant
Please send all of your questions to this list (not to the instructor) so that everyone can participate
22
Handing in Assignments and Assignments Format
Hand-ins
Labs/homework assignments due as per the due date
(unless otherwise) indicated in NYU Classes
Submitted through NYU classes
Please include: Your name
Your SID
Aige be (1, 2, ) i bec
Eae, aige 1
NO credit will be given for ANY assignment after
the due date
23
Agenda
1
2
3
Instructor and Course Introduction Computer Systems Organization Overview Summary and Conclusion
24
Course Theme: Abstraction Is Good But Dont Forget Realit
Most CS courses emphasize abstraction » e.g. data types
Abstraction is good but has limitations » Especially in the presence of bugs!
This class helps you:
» peek under-the-hood
» become more effective programmers Debug problems
Tune performance
» prepare for later courses in CS
Compilers, Operating Systems, Networks, Computer Architecture, Embedded Systems, Distributed Systems, Parallel Cig, Sage Se,
25
Course Perspective (1/2)
Most Systems Courses are Builder-Centric » Computer Architecture
Design pipelined processor in Verilog
» Operating Systems
Implement sample portions of operating system
» Compilers
Write compiler for simple language
» Networking
Implement and simulate network protocols
26
Course Perspective (2/2)
This Course is Programmer-Centric
» Purpose is to show that by knowing more about the underlying system, one can be more effective as a programmer
» Enable you to
Write programs that are more reliable and efficient
Incorporate features that require hooks into OS E.g., concurrency, signal handlers
» Ce aeia i hi ce ha see elsewhere
» Not just a course for dedicated hackers
We bring out the hidden hacker in everyone!
27
Reality #1: Ints are not Integers and Floats are not Reals
x2 0?
» 400002 = 1600000000
» 500002 = ?
(x + y) + z = x + (y + z)?
» Unsigned and signed Ints: Yes
Ok for floats but Ints may overflow!!
» (1020+- 1020 )+3.14 =3.14 but 1020+(- 1020 +3.14) !=3.14
Source: xkcd.com/571
28
Computer Arithmetic
Does not generate random values
» Arithmetic operations have important mathematical properties
Cannot assume all usual mathematical properties
» Due to finiteness of representations
» Iege eai aif ig eie Commutativity, associativity, distributivity
» Faig i eai aif deig eie Monotonicity, values of signs
Observation
» Need to understand which abstractions apply in which
contexts
» Important issues for compiler writers and serious application programmers
29
Realit #2: Youve Got to Kno Assembl
No need to program in assembly
Knowledge of assembly helps one understand
machine-level execution
» Debugging
High-level language models break down
» Performance tuning
Understand optimizations done/not done by compiler
» Writing system software (e.g. compilers , OS)
» Implementing system software
Operating systems must manage process state Compiler has machine code as target
» Creating / fighting malware
x86 assembly is the language of choice!
30
Reality #3: Memory Matters
Memory is not unbounded
» It must be allocated and managed
Memory referencing bugs are very tricky » Effects are distant in both time and space
Memory performance is not uniform
» Cache and virtual memory effects can greatly
affect performance
» Adapting programs to characteristics of memory system can lead to major speed improvements
31
Memory Referencing Errors
C/C++ let programmers make memory errors » Out of bounds array references
» Invalid pointer values
» Abuse of malloc / free
Errors can lead to nasty bugs
» Corrupt program objects logically unrelated to one
being accessed
» Effect of bug observed long after the corruption
Need tools to detect referencing errors
32
Memory Referencing Bug Example
double fun(int i)
{
int a[2];
double d[1] = {3.14};
a[i] = 1073741824; /* Possibly out of bounds */
return d[0];
}
fun(0) = 3.14 fun(1) = 3.14 fun(2) = ? fun(3) = ? fun(4) = ?
33
Memory Referencing Bug Example
typedef struct { int a[2]; double d;
} struct_t;
double fun(int i) {
volatile struct_t s;
s.d = 3.14;
s.a[i] = 1073741824; /* Possibly out of bounds */ return s.d;
}
fun(0) fun(1) fun(2) fun(3) fun(4) fun(6)
3.14
3.14 3.1399998664856 2.00000061035156 3.14
Segmentation fault
» Result is system specific
34
Memory Referencing Bug Example
typedef struct {
int a[2];
double d;
} struct_t;
Explanation:
struct_t
fun(0)
fun(1)
fun(2)
fun(3)
fun(4)
fun(6)
6 5 4 3 2 1 0
3.14
3.14 3.1399998664856 2.00000061035156 3.14
Segmentation fault
Critical State
?
?
d7 … d4
d3 … d0
a[1]
a[0]
Location accessed by
fun(i)
35
Code Security Example
/* Kernel memory region holding user-accessible data */
#define KSIZE 1024
char kbuf[KSIZE];
/* Copy at most maxlen bytes from kernel region to user buffer */
int copy_from_kernel(void *user_dest, int maxlen) {
/* Byte count len is minimum of buffer size and maxlen */
int len = KSIZE < maxlen ? KSIZE : maxlen; memcpy(user_dest, kbuf, len);
return len;
}
Similar to code found in FreeBSD
There are legions of smart people trying to find vulnerabilities in programs
36
Typical Usage
/* Kernel memory region holding user-accessible data */
#define KSIZE 1024
char kbuf[KSIZE];
/* Copy at most maxlen bytes from kernel region to user buffer */
int copy_from_kernel(void *user_dest, int maxlen) {
/* Byte count len is minimum of buffer size and maxlen */
int len = KSIZE < maxlen ? KSIZE : maxlen; memcpy(user_dest, kbuf, len);
return len;
}
#define MSIZE 528
void getstuff() {
char mybuf[MSIZE];
copy_from_kernel(mybuf, MSIZE);
printf(%s\n, mbuf);
}
37
Malicious Usage
/* Kernel memory region holding user-accessible data */
#define KSIZE 1024
char kbuf[KSIZE];
/* Copy at most maxlen bytes from kernel region to user buffer */
int copy_from_kernel(void *user_dest, int maxlen) {
/* Byte count len is minimum of buffer size and maxlen */
int len = KSIZE < maxlen ? KSIZE : maxlen; memcpy(user_dest, kbuf, len);
return len;
}
#define MSIZE 528
void getstuff() {
char mybuf[MSIZE]; copy_from_kernel(mybuf, -MSIZE); ...
}
38
Reality #4: Asymptotic performance is not always sufficient
Factors like memory access, communication, etc. matter
Even operation count might not predict performance
» Easily see 10:1 performance range depending on how
code written
» Must optimize at multiple levels: algorithm, data representations, procedures, and loops
Must understand system to optimize performance
» How are programs compiled and executed?
» How to measure performance and identify bottlenecks?
» How to improve performance without destroying code modularity and generality?
39
Memory System Performance Example
void copyij(int src[2048][2048], int dst[2048][2048])
{
int i,j;
for (i = 0; i < 2048; i++)
for (j = 0; j < 2048; j++)
dst[i][j] = src[i][j]; }
void copyji(int src[2048][2048], int dst[2048][2048])
{
int i,j;
for (j = 0; j < 2048; j++)
for (i = 0; i < 2048; i++)
dst[i][j] = src[i][j]; }
~21 times slower
4.3ms 81.8ms 2.0 GHz Intel Core i7 Haswell
Hierarchical memory organization
Performance depends on access patterns
» Including how step through multi-dimensional array
40
Why the performance differs
16000
14000
12000
10000
8000
6000
4000 2000 0
copyij
copyji
s1
s3
Stride (x8 bytes)
32k 128k
s5
s7
8m 32m
512k 2m
Size (bytes)
s9
s11 128m
41
Read throughput (MB/s)
Example Matrix Multiplication
Matrix-Matrix Multiplication (MMM) on 2 x Core 2 Duo 3 GHz (double precision) Gflop/s
Best code (K. Goto)
160x
Triple loop
Standard desktop computer and compiler
Both implementations have exactly the same operations count (2n3)
42
MMM Plot: Analysis
Matrix-Matrix Multiplication (MMM) on 2 x Core 2 Duo 3 GHz Gflop/s
Multiple threads: 4x
Vector instructions: 4x
Memory hierarchy and other optimizations: 20x
Reason for 20x: Blocking or tiling, loop unrolling, array scalarization Effect: fewer register spills, L1/L2 cache misses, and TLB misses
43
Reality #5: Computers include more than a CPU
They need to do I/O (to get data in and out)
They communicate with each other over networks
Concurrent operations by autonomous processes Coping with unreliable media
Cross platform compatibility
Complex performance issues
44
Sample Computer System Architecture
45
A Little Bit of History (1/4)
Eckert and Mauchly
1st working electronic computer (1946)
18,000 Vacuum tubes
1,800 instructions/sec
3,000 ft3
46
A Little Bit of History (2/4)
EDSAC 1 (1949)
http://www.cl.cam.ac.uk/UoCCL/misc/EDSAC99/
Maurice Wilkes
1st stored program computer
650 instructions/sec 1,400 ft3
47
A Little Bit of History (3/4)
1954: IBM developed 704
All programming done in assembly
Software costs exceed hardware costs!
48
A Little Bit of History (4/4)
Fortran I (project 1954- 57)
The main idea is to translate high level language to assembly
Many thought this was impossible!
In 1958 more than 50% of software in assembly!
Development time halved!
John Backus
(December 3, 1924 March 17, 2007)
49
A Little Bit of History - Early Programming Tools
50
A Little Bit of History - First Popular PCs
51
A Little Bit of History - Early PCs
Intel 8086 processor
768KB memory
20MB disk
Dot-Matrix printer (9-pin)
52
ProblemAlgorithm DevelopmentProgrammer
High Level Language
Compiler (translator)
Assembly Language
Assembler (translator)
Machine Language
Control Unit (Interpreter)
Microarchitecture
Microsequencer (Interpreter)
Logic Level
Device LevelSemiconductorsQuantum
computers levels of abstraction
High-Level Language
Assembly Language
Operating System
Instruction Set Architecture
Microarchitecture
Digital Logic
Level 5 Level 4
Level 3
Level 2 Level 1
Level 0
53
Source code to execution
C source Assembly
Assembly
Object File Object File
Assembly
Compiler
Assembly Assembly
Assembler
Object File Linker
Library Library
Library Loader
Executable
DLL DLL
DLL
54
Conclusions
This first lecture was just an overview. More fun is yet to come!
Computer system can be viewed as layers of abstractionsknowing these layers helps us see the big picture
Wece Ad Ej
55
Agenda
1
2
3
Instructor and Course Introduction Computer Systems Organization Overview Summary and Conclusion
56
Assignments & Readings
Readings
» Slides and Handouts posted on the course web site
» Syllabus, make sure you understand the rules associated with grading, due dates, lateness, etc.
» Textbook: Chapter 1
» Dive into Systems: chapter 1 (recommended)
» C references (The C Programming Language or any other references)
Attend Recitation
Share pointers to C references on the Google Class mailing list or
forum under NYU Classes
Assignment #1 / Lab #1 » TBD in later sessions
Development environment » TBD in session 2
57
Next Session: Data Presentation - Bits and Bytes
58
Questions, Comments, Discussions ?
59